Troubleshooting cases: large slave replication lag
Published: 2019-05-11


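Case 0: With binlog_format set to MIXED, statement logging works very well for bulk updates such as update xx set name='xxx'. Under certain conditions, however, MySQL decides that statement mode cannot be used safely and switches to row mode automatically. If a single statement on the master does a full table scan, it can turn into N full table scans on the slave, one per changed row, which typically happens when the table has no primary key or other usable index for locating each row. This is the usual explanation when users ask: "Why does this take 1 minute on the master but more than an hour on the slave? Even with a single thread the slave should not be that slow." The official manual lists the following conditions under which MIXED mode switches from statement to row logging: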

When running in MIXED logging format, the server automatically switches from statement-based to row-based logging under the following conditions:

• When a function contains UUID().
• When one or more tables with AUTO_INCREMENT columns are updated and a trigger or stored function is invoked. Like all other unsafe statements, this generates a warning if binlog_format = STATEMENT.
• When the body of a view requires row-based replication, the statement creating the view also uses it. For example, this occurs when the statement creating a view uses the UUID() function.
• When a call to a UDF is involved.
• When any INSERT DELAYED is executed for a nontransactional table.
• If a statement is logged by row and the session that executed the statement has any temporary tables, logging by row is used for all subsequent statements (except for those accessing temporary tables) until all temporary tables in use by that session are dropped.
This is true whether or not any temporary tables are actually logged.
Temporary tables cannot be logged using row-based format; thus, once row-based logging is used, all subsequent statements using that table are unsafe. The server approximates this condition by treating all statements executed during the session as unsafe until the session no longer holds any temporary tables.
• When FOUND_ROWS() or ROW_COUNT() is used. (Bug #12092, Bug #30244)
• When USER(), CURRENT_USER(), or CURRENT_USER is used. (Bug #28086)
• When a statement refers to one or more system variables. (Bug #31168)
Exception. The following system variables, when used with session scope (only), do not cause the logging format to switch:
auto_increment_increment
auto_increment_offset
character_set_client
character_set_connection
character_set_database
character_set_server
collation_connection
collation_database
collation_server
foreign_key_checks
identity
last_insert_id
lc_time_names
pseudo_thread_id
sql_auto_is_null
time_zone
timestamp
unique_checks
For information about determining system variable scope, see Section 5.1.5, “Using System Variables”.
For information about how replication treats sql_mode, see Section 17.4.1.34, “Replication and Variables”.
• When one of the tables involved is a log table in the mysql database.
• When the LOAD_FILE() function is used. (Bug #39701)
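
To make the cost described in Case 0 concrete, here is a rough back-of-the-envelope sketch in Python. It is a simplified model, not how the server actually accounts for work: the function names are hypothetical, and it assumes the worst case in which every row event on a table without a usable key is located by scanning the whole table, ignoring any optimisation the server may apply.

```python
def rows_examined_statement(table_rows: int) -> int:
    """Statement format: the slave replays the statement once, one full scan."""
    return table_rows


def rows_examined_row_format(table_rows: int, changed_rows: int,
                             has_usable_key: bool) -> int:
    """Row format: one lookup per changed row.

    Assumption (upper bound): without a primary key or other usable index,
    each row image is found by a full table scan.
    """
    if has_usable_key:
        # Each changed row is found directly through the key.
        return changed_rows
    # N changed rows, each costing one full scan.
    return table_rows * changed_rows


if __name__ == "__main__":
    n = 100_000
    print("statement format:", rows_examined_statement(n))
    print("row format, with key:", rows_examined_row_format(n, n, True))
    print("row format, no key:", rows_examined_row_format(n, n, False))
```

Under these assumptions, a bulk update touching every row of a 100,000-row table without a key goes from 100,000 rows examined to 10,000,000,000, which is the kind of gap behind "1 minute on the master, more than an hour on the slave".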

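Case 1: LVM-based machine. After the slave was created, the master ran at about 20,000 QPS and the slave at a little over 6,000. From a certain point the lag began to grow slowly and had reached more than 70,000 seconds before anyone noticed and dealt with it. Disk I/O util on the slave was very high.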

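Cause: The configuration showed that log_slave_updates was enabled on this slave, so it was writing binlog continuously. As soon as that parameter was disabled, or sync_binlog was set to 0, util dropped immediately and the lag gradually shrank to 0. We had already seen several times that UDB instances on machines using LVM volumes or SSDs show very high disk util whenever sync_binlog=1 is enabled. It later turned out to be the RAID controller setting: it must be set to write-back mode, otherwise performance is very poor.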
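
As a small illustration of the check described in Case 1, the following Python sketch flags a slave whose settings match that pattern. The function name is hypothetical, and the input is assumed to be a plain dictionary of variable names to string values, for example collected beforehand from SHOW VARIABLES; no database connection is made here.

```python
from typing import Mapping


def slave_syncs_binlog_every_commit(variables: Mapping[str, str]) -> bool:
    """Return True when the slave writes its own binlog for replicated
    changes (log_slave_updates) and syncs it to disk on every commit
    (sync_binlog=1), the combination that coincided with high I/O util.

    Assumption: values are strings as printed by SHOW VARIABLES.
    """
    logs_updates = variables.get("log_slave_updates", "OFF").upper() in ("ON", "1")
    syncs_each_commit = variables.get("sync_binlog", "0").strip() == "1"
    return logs_updates and syncs_each_commit


if __name__ == "__main__":
    slave = {"log_slave_updates": "ON", "sync_binlog": "1"}
    print(slave_syncs_binlog_every_commit(slave))
```

A True result only means the configuration matches the pattern from this case; whether it actually hurts depends on the storage, as the RAID write-back finding shows.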

Case 2: Slave lag keeps growing. QPS on both slave and master is low, I/O is low on both, but the slave's CPU is at 100%.

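Cause: The slave applies SQL in a single thread, so it can only use one CPU core. Once that core reaches 100%, the whole instance stalls and every operation is slow. The slave turned out to be executing nothing but a simple update by primary key, so it was puzzling why a primary-key update would be this slow: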

     ID: 6933

   USER: system user
   HOST: 
     DB: gangao
COMMAND: Connect
   TIME: 457156
  STATE: Updating
   INFO: UPDATE`STK_DAILYQUOTEFA`SET`ID`=0x3631323538393739303330,`SECUCODE`=0x3237353233,`TRADINGDAY`='2003-03-18 00:00:00',`TRADINGSTATE`=0x31,`PREVCLOSINGPRICEBA`=0x312E363238,`OPENINGPRICEBA`=0x312E36323633,`HIGHESTPRICEBA`=0x312E36333332,`LOWESTPRICEBA`=0x312E36303734,`CLOSINGPRICEBA`=0x312E363136,`ENTRYTIME`='2015-07-24 15:41:13',`UPDATETIME`='2015-07-24 15:41:13',`GROUNDTIME`='2015-07-24 15:41:13',`UPDATEID`=0x3631323538393739303330,`RESOURCEID`=0x43616C63,`RECORDID`=NULL WHERE`ID`=0x3631323538393739303330

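It later turned out that the ID here, 0x3631323538393739303330, is the hexadecimal encoding of a character string (the ASCII bytes of '61258979030'), not of an integer.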

Query plan when the ID is given as a plain decimal integer:

mysql> explain select * from gangao.STK_DAILYQUOTEFA where id = 60737669922;

+----+-------------+------------------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table            | type  | possible_keys | key     | key_len | ref   | rows | Extra |
+----+-------------+------------------+-------+---------------+---------+---------+-------+------+-------+
|  1 | SIMPLE      | STK_DAILYQUOTEFA | const | PRIMARY       | PRIMARY | 8       | const |    1 |       |
+----+-------------+------------------+-------+---------------+---------+---------+-------+------+-------+
1 row in set (2.84 sec)
Note that the conversion used here actually produces a character string, and this is exactly where the problem lies:
mysql> select hex('60737669922');
+------------------------+
| hex('60737669922')     |
+------------------------+
| 3630373337363639393232 |
+------------------------+
1 row in set (0.00 sec)
mysql> select 0x3630373337363639393232;
+--------------------------+
| 0x3630373337363639393232 |
+--------------------------+
| 60737669922              |
+--------------------------+
1 row in set (0.00 sec)
Running EXPLAIN with that literal shows that the row cannot be found:
mysql> explain select * from gangao.STK_DAILYQUOTEFA where id = 0x3630373337363639393232;
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                               |
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Impossible WHERE noticed after reading const tables |
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+
1 row in set (3.10 sec)
Yet the row does exist:
mysql> select * from gangao.STK_DAILYQUOTEFA where id = 60737669922;
+-------------+----------+---------------------+--------------+--------------------+----------------+----------------+---------------+----------------+---------------------+---------------------+---------------------+-------------+------------+----------+
| ID          | SECUCODE | TRADINGDAY          | TRADINGSTATE | PREVCLOSINGPRICEBA | OPENINGPRICEBA | HIGHESTPRICEBA | LOWESTPRICEBA | CLOSINGPRICEBA | ENTRYTIME           | UPDATETIME          | GROUNDTIME          | UPDATEID    | RESOURCEID | RECORDID |
+-------------+----------+---------------------+--------------+--------------------+----------------+----------------+---------------+----------------+---------------------+---------------------+---------------------+-------------+------------+----------+
| 60737669922 |    20041 | 2009-06-05 00:00:00 | 1            |            33.9315 |        33.8687 |        35.8897 |       33.7851 |        35.2277 | 2015-07-13 15:40:01 | 2015-07-13 15:40:01 | 2015-07-13 15:40:01 | 60737669922 | Calc       | NULL     |
+-------------+----------+---------------------+--------------+--------------------+----------------+----------------+---------------+----------------+---------------------+---------------------+---------------------+-------------+------------+----------+
1 row in set (1.94 sec)
But with the hex-encoded string form, no result comes back:
mysql> select * from gangao.STK_DAILYQUOTEFA where id = 0x3630373337363639393232;
Empty set (0.40 sec)

The true hexadecimal value of this number is:

mysql> select hex(60737669922);

+------------------+
| hex(60737669922) |
+------------------+
| E243F4B22        |
+------------------+
1 row in set (0.00 sec)

Here 0 has to be added so that the hex literal is evaluated as a number rather than as a string:

mysql> select 0xE243F4B22+0;

+---------------+
| 0xE243F4B22+0 |
+---------------+
|   60737669922 |
+---------------+
1 row in set (0.00 sec)

Running EXPLAIN again with this form:

mysql> explain select * from gangao.STK_DAILYQUOTEFA where id = 0xE243F4B22+0;

+----+-------------+------------------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table            | type  | possible_keys | key     | key_len | ref   | rows | Extra |
+----+-------------+------------------+-------+---------------+---------+---------+-------+------+-------+
|  1 | SIMPLE      | STK_DAILYQUOTEFA | const | PRIMARY       | PRIMARY | 8       | const |    1 |       |
+----+-------------+------------------+-------+---------------+---------+---------+-------+------+-------+
1 row in set (2.83 sec)

mysql> select * from gangao.STK_DAILYQUOTEFA where id = 0xE243F4B22+0;

+-------------+----------+---------------------+--------------+--------------------+----------------+----------------+---------------+----------------+---------------------+---------------------+---------------------+-------------+------------+----------+
| ID          | SECUCODE | TRADINGDAY          | TRADINGSTATE | PREVCLOSINGPRICEBA | OPENINGPRICEBA | HIGHESTPRICEBA | LOWESTPRICEBA | CLOSINGPRICEBA | ENTRYTIME           | UPDATETIME          | GROUNDTIME          | UPDATEID    | RESOURCEID | RECORDID |
+-------------+----------+---------------------+--------------+--------------------+----------------+----------------+---------------+----------------+---------------------+---------------------+---------------------+-------------+------------+----------+
| 60737669922 |    20041 | 2009-06-05 00:00:00 | 1            |            33.9315 |        33.8687 |        35.8897 |       33.7851 |        35.2277 | 2015-07-13 15:40:01 | 2015-07-13 15:40:01 | 2015-07-13 15:40:01 | 60737669922 | Calc       | NULL     |
+-------------+----------+---------------------+--------------+--------------------+----------------+----------------+---------------+----------------+---------------------+---------------------+---------------------+-------------+------------+----------+
1 row in set (4.25 sec)

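So the problem was that the customer took a decimal integer such as 123456, converted it to the string '123456', and sent the hexadecimal form of that string. As a result the index was not used at all and the queries were slow; the slave was even doing useless work, because those values do not exist in the table.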
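
The difference between the two encodings in Case 2 can be reproduced outside MySQL. The Python sketch below is only an illustration; the helper names are hypothetical, and it assumes the IDs are plain ASCII digits as in the statements above.

```python
def hex_of_decimal_string(n: int) -> str:
    """Hex of the ASCII bytes of the decimal text, like hex('60737669922')."""
    return str(n).encode("ascii").hex().upper()


def hex_of_integer(n: int) -> str:
    """Hex of the integer value itself, like hex(60737669922)."""
    return format(n, "X")


def decode_hex_string_literal(literal: str) -> str:
    """Turn a literal such as 0x3631... back into the text it encodes."""
    digits = literal[2:] if literal.lower().startswith("0x") else literal
    return bytes.fromhex(digits).decode("ascii")


if __name__ == "__main__":
    print(hex_of_decimal_string(60737669922))  # 3630373337363639393232
    print(hex_of_integer(60737669922))         # E243F4B22
    print(decode_hex_string_literal("0x3631323538393739303330"))  # 61258979030
```

The first value matches what the client was sending, the second is what the integer comparison on the primary key needs, and the third decodes the ID from the replicated UPDATE statement back into its digits.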

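Case 3: The master handles only writes. I/O on the master is fairly low while I/O on the slave is very high, and there are no load peaks on the master.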

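Analysis: Comparing the master and slave configurations showed that innodb_flush_log_at_trx_commit was set to 2 on the master but 1 on the slave. After the slave was also set to 2 or 0, the problem went away. I will not go into what this parameter means in detail. The original figure here compared master and slave I/O before the adjustment.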
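
The comparison that uncovered the mismatch in Case 3 can be sketched as a small diff over two sets of settings. This is a minimal Python sketch with a hypothetical function name; it assumes both inputs are dictionaries of variable names to values gathered separately from each server, for example from SHOW VARIABLES.

```python
from typing import Dict, Iterable, Mapping, Optional, Tuple


def diff_variables(master: Mapping[str, str],
                   slave: Mapping[str, str],
                   names: Iterable[str]) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {name: (master_value, slave_value)} for every listed variable
    whose value differs between the two servers. Missing keys count as None."""
    return {
        name: (master.get(name), slave.get(name))
        for name in names
        if master.get(name) != slave.get(name)
    }


if __name__ == "__main__":
    master = {"innodb_flush_log_at_trx_commit": "2", "sync_binlog": "0"}
    slave = {"innodb_flush_log_at_trx_commit": "1", "sync_binlog": "0"}
    watched = ["innodb_flush_log_at_trx_commit", "sync_binlog"]
    print(diff_variables(master, slave, watched))
```

With the values from this case, only innodb_flush_log_at_trx_commit is reported, as the pair ('2', '1').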

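Case 4: A trigger had been created on the slave. Whenever it detected a write to a certain table, it computed summary statistics and wrote them to a new table, which caused enormous slave lag.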

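Cause: The slave is single-threaded, and for every write it applies it has to stop and do the extra trigger work. The CPU was already saturating one core, so the lag kept growing.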

Resolution: drop the trigger.
