1. Environment
Operating system: Ubuntu 16.04
Database: MySQL 5.7.11
Storage engine: InnoDB
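2. Symptoms
The server hosts one working database, referred to here as abc, which contains many tables: table1, table2, ..., tablen.
After connecting from a terminal and running use abc, running show tables;
fails with the following error: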
2006,MySQL Server has gone away.
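3. Troubleshooting
The ps command showed that the database service was still running, and it could be started and stopped normally.
The only entry in the application log was 2003, Cannot Connect to MySQL Server on 'xxx'. The application connects over an internal IP, every internal IP had already been granted access, and clients on localhost hit the same error, so a permissions problem was ruled out. The application also reported the connection failure only intermittently.
The 2003 and 2006 errors reported by the application and the terminal are too generic to narrow anything down. Most Google results suggest adjusting parameters such as max_allowed_packet in my.cnf, which had no bearing on the problem here.
At this point it seemed likely that the MySQL error log held more useful information, so I checked it and found the following: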
2017-07-03T03:37:45.826669Z 10 [Note] InnoDB: Uncompressed page, stored checksum in field1 3016282142, calcksum in field2 2124946097, calculated checksums for field2: crc32 860595562/1814632262, innodb 1314564691, red to page already) 1, space id (if created with >= MySQL-4.1.1 and stored already) 8292
InnoDB: Page may be an insert buffer bitmap page
2017-07-03T03:37:45.826689Z 10 [Note] InnoDB: It is also possible that your operating system has corrupted You can also try to fix the corruption by dumping, dropping, and reimporting the corrupt table. You can us/forcing-innodb-recovery.html for information about forcing recovery.
2017-07-03T03:37:45.826728Z 10 [ERROR] InnoDB: Database page corruption on disk or a failed file read of pa
2017-07-03T03:37:45.826740Z 10 [Note] InnoDB: Page dump in ascii and hex (16384 bytes):
@ @ @ @ @ @ @ @ @ @ @ @ @
2017-07-03T03:37:45.691584Z 10 [Note] InnoDB: Page dump in ascii and hex (16384 bytes):
InnoDB: End of page dump
InnoDB: Page may be an insert buffer bitmap page
2017-07-03T03:37:30.574822Z 8 [Note] InnoDB: It is also possible that your operating system has corrupted its own file cache and rebooting your computer removes the error. If the corrupt page is an index page. You can also try to fix the corruption by dumping, dropping, and reimporting the corrupt table. You can use CHECK TABLE to scan your table
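
The excerpt above carries two details worth pulling out: the [ERROR] line reporting page corruption, and the space id (8292) printed at the end of the checksum line. The following is a minimal Python sketch, not the method used in the original investigation, that scans an error log for those two things and builds a lookup query for each space id. The function names scan_error_log and tablespace_lookup_sql are hypothetical, the patterns are assumptions based only on the excerpt above, and the query assumes the INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES table as it exists in MySQL 5.7 (it was renamed in 8.0).

```python
import re

# Patterns derived from the log excerpt above; they are assumptions,
# not an exhaustive list of InnoDB corruption messages.
CORRUPTION_RE = re.compile(r"\[ERROR\] InnoDB: Database page corruption")
SPACE_ID_RE = re.compile(
    r"space id \(if created with >= MySQL-4\.1\.1 and stored already\) (\d+)"
)


def scan_error_log(lines):
    """Return (corruption_count, sorted_space_ids) for an iterable of log lines."""
    corruption_count = 0
    space_ids = set()
    for line in lines:
        if CORRUPTION_RE.search(line):
            corruption_count += 1
        match = SPACE_ID_RE.search(line)
        if match:
            space_ids.add(int(match.group(1)))
    return corruption_count, sorted(space_ids)


def tablespace_lookup_sql(space_id):
    """Build a query that maps a space id to a tablespace name (MySQL 5.7)."""
    return (
        "SELECT NAME FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES "
        "WHERE SPACE = %d;" % int(space_id)
    )


def scan_file(path):
    """Scan a log file on disk; the path is supplied by the caller."""
    with open(path, errors="replace") as fh:
        return scan_error_log(fh)
```

Calling scan_file with the path of the server's error log returns the number of corruption errors and the affected space ids. Running the generated query in the mysql client should show which table (in the form database/table) owns that tablespace, which is the table to target with CHECK TABLE or with the dump, drop, and reimport procedure that the log message itself suggests.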