CRASH with error- last checkpoint exceed log group capacity

今天碰到了一次MYSQL CRASH,不像以前是BUG所致,这次是人为搞死的,又是一次血的教训
直接触发原因:
 在白天业务期间执行了200W的DELETE语句(自动 提交);
以下是当时的日志和状态信息:

100811 17:06:06 InnoDB: ERROR: the age of the last checkpoint is 1887429810,
InnoDB: which exceeds the log group capacity 1887429428.
InnoDB: If you are using big BLOB or TEXT rows, you must set the
InnoDB: combined size of log files at least 10 times bigger than the
InnoDB: largest such row.

[@more@]

100811 17:06:22 InnoDB: ERROR: the age of the last checkpoint is 1928074432,
InnoDB: which exceeds the log group capacity 1887429428.
InnoDB: If you are using big BLOB or TEXT rows, you must set the
InnoDB: combined size of log files at least 10 times bigger than the
InnoDB: largest such row.
100811 17:08:41 InnoDB: ERROR: the age of the last checkpoint is 1955265899,
InnoDB: which exceeds the log group capacity 1887429428.
InnoDB: If you are using big BLOB or TEXT rows, you must set the
InnoDB: combined size of log files at least 10 times bigger than the
InnoDB: largest such row.
100811 17:08:59 InnoDB: ERROR: the age of the last checkpoint is 1955266310,
InnoDB: which exceeds the log group capacity 1887429428.
InnoDB: If you are using big BLOB or TEXT rows, you must set the
InnoDB: combined size of log files at least 10 times bigger than the
InnoDB: largest such row.

show Innodb status
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 98390, free list len 294851, seg size 393242,
1962952796 inserts, 1960901438 merged recs, 126408533 merges
Hash table size 17700857, used cells 6710294, node heap has 12444 buffer(s)
0.16 hash searches/s, 9.00 non-hash searches/s
---
LOG
---
Log sequence number 2017 503698566
Log flushed up to 2017 503698566
Last checkpoint at 2016 3207393311
0 pending log writes, 0 pending chkp writes
3659386378 log i/o's done, 0.05 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 10367101167; in additional pool allocated 281326592
Buffer pool size 524288
Free buffers 0
Database pages 511844
Modified db pages 159931
Pending reads 1
Pending writes: LRU 0, flush list 129, single page 0
Pages read 226580777, created 88398135, written 1231690245
11.26 reads/s, 0.00 creates/s, 161.68 writes/s
Buffer pool hit rate 750 / 1000


现象 : INSERT进程无法写入,被HANG住;
直接用KILL方式杀掉MYSQL进程, 并重启MYSQL:
开始恢复,日志信息如下 :

InnoDB: Log scan progressed past the checkpoint lsn 2016 3207393311
100811 17:21:50 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 2016 3212636160
InnoDB: Doing recovery: scanned up to log sequence number 2016 3217879040
InnoDB: Doing recovery: scanned up to log sequence number 2016 3223121920
InnoDB: Doing recovery: scanned up to log sequence number 2016 3228364800
InnoDB: Doing recovery: scanned up to log sequence number 2016 3233607680

...

InnoDB: Doing recovery: scanned up to log sequence number 2016 4271697920
InnoDB: Doing recovery: scanned up to log sequence number 2016 4276940800
InnoDB: Doing recovery: scanned up to log sequence number 2016 4282183680
InnoDB: Doing recovery: scanned up to log sequence number 2016 4287426560
InnoDB: Doing recovery: scanned up to log sequence number 2016 4292669440
InnoDB: Doing recovery: scanned up to log sequence number 2017 2945024
InnoDB: Doing recovery: scanned up to log sequence number 2017 8187904
InnoDB: Doing recovery: scanned up to log sequence number 2017 13430784
InnoDB: Doing recovery: scanned up to log sequence number 2017 18673664
InnoDB: Doing recovery: scanned up to log sequence number 2017 23916544
InnoDB: Doing recovery: scanned up to log sequence number 2017 29159424
InnoDB: Doing recovery: scanned up to log sequence number 2017 34402304
....

InnoDB: Doing recovery: scanned up to log sequence number 2017 322760704
InnoDB: Doing recovery: scanned up to log sequence number 2017 328003584
InnoDB: Doing recovery: scanned up to log sequence number 2017 333246464
InnoDB: Doing recovery: scanned up to log sequence number 2017 338489344
InnoDB: Doing recovery: scanned up to log sequence number 2017 343732224
到这里,恢复以失败告终, 没有任何其他信息;多次尝试均未果。


出现这个现象是由于手工以自动提交的方式,执行200W条DELETE语句导致;(正常应用的操作主要是INSERT,没有UPDATE,1000QPS,有TEXT字段)

有故障发生后我们需要分析原因:

在这里首先有几个配置需要列出:
OS :RHEL 4u7
MYSQL: 5.0.51a
HD :146G*15 , RAID 5
MEM :16G
/etc/my.cnf:
innodb_log_buffer_size = 4M
innodb_log_file_size = 500M
innodb_log_files_in_group = 4
innodb_max_dirty_pages_pct = 5
innodb_flush_log_at_trx_commit = 2
innodb_buffer_pool_size = 8G
innodb_data_file_path = ibdata1:500M:autoextend
innodb_file_per_table


我们能提供的信息有:
1) 在执行200W删除操作时候出现问题,
2) 有报错:
100811 17:06:06 InnoDB: ERROR: the age of the last checkpoint is 1887429810,
InnoDB: which exceeds the log group capacity 1887429428.
这里已经表明,日志已经全部用完,需要FLUSH 脏数据块;

3) 从show innodb status 的LOG部分看:
Log flushed up to 2017 503698566
Last checkpoint at 2016 3207393311

脏数据的范围从 2016 3207393311 --&gt 2017 503698566 ,
脏数据范围:(2016 4292669440)-(2016 3207393311)+(2017 503698566) = 1588974695 ,也就是需要RECOVER的范围。

为什么这样算呢:
这里的2016,2017代表的是日志轮循写的次数(我猜的,因为我看到INNODb在恢复时,ERROR LOG里:
InnoDB: Doing recovery: scanned up to log sequence number 2016 4287426560
InnoDB: Doing recovery: scanned up to log sequence number 2016 4292669440 --最大值
InnoDB: Doing recovery: scanned up to log sequence number 2017 2945024
InnoDB: Doing recovery: scanned up to log sequence number 2017 8187904

第2016轮循读到4292669440,接着是下一个2017 2945024 ;
((( 但这里的4292669440 = 4G, 却与我的日志大小不匹配(500M*4),这里有点矛盾,先跳过。)))

4) 脏数据块的量已经明显超过:(8G*0.05)400M, 为什么MYSQL来不及刷数据到硬盘? 一种可能是COMMIT太频繁导致MYSQL频繁写BINLOG日志,太多占用了IOPS,

5) 我们从innodb_log信息来看
-rw-rw---- 1 mysql mysql 524288000 Aug 11 19:57 ib_logfile0
-rw-rw---- 1 mysql mysql 524288000 Aug 11 16:50 ib_logfile1
-rw-rw---- 1 mysql mysql 524288000 Aug 11 16:58 ib_logfile2
-rw-rw---- 1 mysql mysql 524288000 Aug 11 17:04 ib_logfile3


小结:
1, commit 过于频繁,导致产生大量BINLOG日志IO请求,而导致MYSQL变得缓慢(脏块刷不下去),进而导致HANG住;
2, 大量的删除,产生大量脏数据块,超过(8G*0.05=400M)CHECKPOINT threshold, 触发CHECKPOINT,从而使得MYSQL需要刷脏数据到块到硬盘;
3, 大量的删除,产生大量的REDO(innodb_logs),超过当前(500M*4)可以承受的量,强制MYSQL需要CHECKPOINT,触发MYSQL 刷BUFFER脏块到硬盘;


经验与教训:
 1,在业务繁忙期间,绝对不允许做如此大的操作。比如大量的DELETE、INSERT,并且是频繁的提交;
2,大操作需要批量提交,减少日志IO请求;


在这里,我们还有不少的知识点需要补充:

MYSQL INNODB CHECKPOINT:
--------------------------------------
参考: http://www.mysqlperformanceblog.com/2008/11/13/adaptive-checkpointing/
http://dev.mysql.com/doc/refman/5.0/en/innodb-checkpoints.html

CHECKPOINT 信息被存储在第一个INNODB LOG FILE中。

INNODB CRUSH 后,需要进行恢复(前滚,后滚),从哪个点开始恢复,就是CHECKPOINT来控制的;
我们在show innodb status 后可以找到以下信息:
---
LOG
---
Log sequence number 2017 503698566
Log flushed up to 2017 503698566
Last checkpoint at 2016 3207393311
------------------

Last checkpoint 指向的值,就是我们需要恢复的起始点. 要恢复到“Log flushed“,数据才是完整的;

那么什么时候来做checkpoint呢?

由两个点来控制: 脏数据块数量 && INNODB LOG FILE 大小

在INNODB BUFFER中,允许的脏数据量= innodb_buffer_size * innodb_max_dirty_pages_pct /100
如果脏数据块超过这个值,MYSQL后台进程就会需要将脏数据块向磁盘刷新,并刷新CHECKPOINT值 ;

另外一个,INNODB_LOG 里的内容里用来恢复数据的, 类似于ORACLE的REDO日志,只要内存的脏数据块没有被刷到硬盘,那么这些相应的日志也不能丢失,
否则数据恢复就会失败;INNODBLOG是循环使用的;当日志一次循环使用到最后一个日志也写满了,接着需要写第一个日志的时候,
如果第一个日志的内容还未被标志成”无用“状态,那么MYSQL就会触发CHECKPOINT,刷脏数据 ;


如果确定INNODB_LOG_FILE大小
--------------------------------------
参考: http://www.mysqlperformanceblog.com/2008/11/21/how-to-calculate-a-good-innodb-log-file-size/
INNODB_LOG_FILE 过于小,会直接触发CHECKPOINT,导致频繁IO请求; 多大是合适的?
========================================
root@127.0.0.1 : (none) 16:13:13> pager grep sequence
PAGER set to 'grep sequence'
root@127.0.0.1 : (none) 16:13:14>
root@127.0.0.1 : (none) 16:13:15> SHOW engine innodb STATUSG SELECT sleep(60); SHOW engine innodb STATUSG
Log sequence number 1450 485101299
1 row in set (0.09 sec)


1 row in set (1 min 0.01 sec)

Log sequence number 1450 505024667
1 row in set (0.00 sec)

root@127.0.0.1 : (none) 16:14:37> nopager
PAGER set to stdout
root@127.0.0.1 : (none) 16:14:43> select (505024667-485101299)/1024/1024;
+---------------------------------+
| (505024667-485101299)/1024/1024 |
+---------------------------------+
| 19.00040436 |
+---------------------------------+
1 row in set (0.00 sec)
========================================
Notice the log sequence number. That's the total number of bytes written to the transaction log.
我们在高峰期间采样可以得到,1分钟产生19M的日志; 我觉得这个INNODB LOG大小设成 19M*60=1140M 已经足够了;
60分钟是一个经验值, 你也可以适当调大,比如 500M,3个文件 ;这相对来说是安全的;
当然你也可以用以下命令来查看日志产生的大小:
show status like 'Innodb_os_log_written'; select sleep(60); show status like 'Innodb_os_log_written';


关于innodb_log_buffer_size
-----------------------
InnoDB用来往磁盘上的日志文件写操作的缓冲区的大小。明智的值是从1MB到8MB。默认的是1MB。一个大的日志缓冲允许大型事务运行而不需要在事务提交之前往磁盘写日志。因此,如果你有大型事务,使日志缓冲区更大以节约磁盘I/O。
InnoDB flushes the buffer to the log files on disk
when the buffer gets full,
when a transaction commits,
when once per second

你可以查看变量Innodb_os_log_written来估算每秒有多少日志写到磁盘.
if you see a peak of 100 KB written to the log per second,
a 1 MB log buffer is probably plenty.
a 256 MB log file is enough to store at least 2,560 seconds of log entries

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/703656/viewspace-1036698/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/703656/viewspace-1036698/

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值