【Hands-on】 Recovering a MySQL Server After an Unexpected Power Loss

db_murphy

Article link: https://blog.csdn.net/db_murphy/article/details/102747573


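【Introduction】
One of my MySQL test environments runs inside a VMware virtual machine. After I accidentally force-closed the laptop, the VM came back up but MySQL could no longer be logged into. The situation is equivalent to recovering a database server after a power loss. This article describes one way to bring the database back quickly.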

Environment:
Linux version:
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.6 (Santiago)
MySQL version:
mysql> select @@version;
+-------------------+
| @@version         |
+-------------------+
| 8.0.17-commercial |
+-------------------+
1 row in set (0.00 sec)

Error log:
2019-10-24T09:12:42.171191Z 0 InnoDB: Completed initialization of buffer pool
InnoDB: Error: log file ./ib_logfile0 is of different size 0 1679818752 bytes
InnoDB: than specified in the .cnf file 0 268435456 bytes!
2019-10-24T09:12:42.171191Z 0 [ERROR] Plugin 'InnoDB' init function returned error.
2019-10-24T09:12:42.171191Z 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2019-10-24T09:12:42.171191Z 0 [Note] Plugin 'FEEDBACK' is disabled.
2019-10-24T09:12:42.171191Z 0 [ERROR] Unknown/unsupported storage engine: InnoDB
2019-10-24T09:12:42.171191Z 0 [ERROR] Aborting
2019-10-24T09:12:42.171191Z 0 [Note] /usr/libexec/mysqld: Shutdown complete
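To make the size mismatch in this log easier to read, here is a minimal Python sketch that pulls the two sizes out of the message. It assumes the two numbers printed before "bytes" are the high and low 32-bit words of the file size, which is how this style of InnoDB message appears to be laid out. The function name and the regular expression are illustrative and not part of MySQL.

```python
import re

# Assumption: "0 1679818752 bytes" is (high 32 bits, low 32 bits) of the size.
SIZE_RE = re.compile(
    r"log file (?P<name>\S+) is of different size (?P<hi>\d+) (?P<lo>\d+) bytes\s+"
    r"InnoDB: than specified in the \.cnf file (?P<chi>\d+) (?P<clo>\d+) bytes"
)


def parse_redo_size_mismatch(log_text):
    """Return (file_name, actual_bytes, configured_bytes), or None if absent."""
    m = SIZE_RE.search(log_text)
    if m is None:
        return None
    actual = (int(m["hi"]) << 32) + int(m["lo"])
    configured = (int(m["chi"]) << 32) + int(m["clo"])
    return m["name"], actual, configured


# The two relevant lines copied from the error log above.
SAMPLE = (
    "InnoDB: Error: log file ./ib_logfile0 is of different size 0 1679818752 bytes\n"
    "InnoDB: than specified in the .cnf file 0 268435456 bytes!\n"
)

if __name__ == "__main__":
    name, actual, configured = parse_redo_size_mismatch(SAMPLE)
    print(name, actual // 2**20, "MiB on disk,", configured // 2**20, "MiB configured")
```

Under that assumption, the file on disk is 1602 MiB while the configuration asks for 256 MiB.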

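Judging from the log, the problem lies with ib_logfile0, that is, with the redo log. Readers familiar with how Oracle redo logs work will find this easy to relate to. For more about ib_logfile0, see an earlier article on my WeChat public account "一森咖记":
《【生产篇】_mysqlbackup备份异常遇到的ib_logfile问题》 (an ib_logfile problem hit during a failed mysqlbackup run)
A short summary of ib_logfile follows.
What is ib_logfile?
ib_logfile is InnoDB's transaction log, also called the redo log. By default the files are named ib_logfile0 and ib_logfile1, and the number of log files serving the instance can be changed through a parameter. InnoDB writes them sequentially and in a circular fashion. Whenever a transaction starts, related information is recorded in the transaction log (the physical location, or offset, of the changes made to the data files).
What ib_logfile is for
When the system restarts after a crash, the log is used to redo transactions. During normal operation, at each checkpoint the changes logged before that point are applied to the data files.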

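Solution:
Rename the ib_logfile files so they are preserved, then restart the database and let it create and use new redo logs. Note: I used this approach only because this is a test environment where data loss does not matter. Do not try it casually in production; there, restore from backup or use another proven method. This article only offers a quick way to try to get the instance back up.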
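The rename step can be sketched in Python as below. This is a minimal illustration, not the exact commands used in this recovery: the function name, the backup suffix, and the assumption that the redo logs sit directly in the data directory as ib_logfile0, ib_logfile1, and so on are all mine. It should only ever be run while mysqld is stopped.

```python
import re
import time
from pathlib import Path

# Matches ib_logfile0, ib_logfile1, ... but not files that were already renamed.
REDO_NAME = re.compile(r"ib_logfile\d+")


def set_aside_redo_logs(datadir, suffix=None):
    """Rename each redo log in datadir by appending a suffix; return new paths.

    The suffix format is a hypothetical naming choice for this sketch.
    """
    suffix = suffix or time.strftime(".bak-%Y%m%d%H%M%S")
    renamed = []
    for path in sorted(Path(datadir).iterdir()):
        if not (path.is_file() and REDO_NAME.fullmatch(path.name)):
            continue
        target = path.with_name(path.name + suffix)
        if target.exists():
            raise FileExistsError(f"{target} already exists; refusing to overwrite")
        path.rename(target)  # keep the old log instead of deleting it
        renamed.append(target)
    return renamed
```

Renaming rather than deleting keeps the original files available in case they are needed for later analysis.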

However, the restart still failed, with the following errors:
InnoDB: Page lsn 38 1354583575, low 4 bytes of lsn at page end 1354583575
InnoDB: Page number (if stored to page already) 535,
InnoDB: Page may be a BLOB page
191024 09:39:30 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
To report this bug, see http://kb.askmonty.org/en/reporting-bugs
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

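According to the output above, once the new ib_logfile files had been created on this restart, the server began restoring data, that is, crash recovery. Recovery failed, and the server failed to start as well. As the message itself states, the cause may be a bug, a corrupt binary or library, or malfunctioning hardware. I then decided to force InnoDB recovery and tried starting the database with innodb_force_recovery.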

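The official documentation describes the innodb_force_recovery parameter as follows (translated):
[mysqld]
innodb_force_recovery = 1
Warning
Only set innodb_force_recovery to a value greater than 0 in an emergency, so that you can start InnoDB and dump your tables. Before doing so, make sure you have a backup copy of the database in case you need to rebuild it. Values of 4 or greater can permanently corrupt data files. Only use an innodb_force_recovery setting of 4 or greater on a production server instance after you have successfully tested the setting on a separate physical copy of the database. When forcing InnoDB recovery, always start with innodb_force_recovery=1 and increase the value step by step only as necessary.

innodb_force_recovery is 0 by default (normal startup without forced recovery). The permissible nonzero values are 1 to 6. A larger value includes the functionality of the smaller values. For example, a value of 3 includes all of the functionality of values 1 and 2.

If you can dump your tables with an innodb_force_recovery value of 3 or less, you are relatively safe, because only some data on individual corrupt pages is lost. A value of 4 or greater is considered dangerous because data files can be permanently corrupted. A value of 6 is considered drastic because database pages are left in an obsolete state, which in turn may introduce more corruption into B-trees and other database structures.

As a safety measure, InnoDB prevents INSERT, UPDATE, and DELETE operations when innodb_force_recovery is greater than 0. A setting of 4 or greater places InnoDB in read-only mode.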

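The official documentation also describes each innodb_force_recovery level from 1 to 6. In summary:
innodb_force_recovery can be set to 1-6, and each larger number includes the effects of all the smaller ones.

1 (SRV_FORCE_IGNORE_CORRUPT): lets the server run even if it detects a corrupt page.
2 (SRV_FORCE_NO_BACKGROUND): prevents the master thread from running; if a crash would occur during a purge operation, this prevents it.
3 (SRV_FORCE_NO_TRX_UNDO): does not run transaction rollbacks after recovery.
4 (SRV_FORCE_NO_IBUF_MERGE): does not perform insert buffer merge operations.
5 (SRV_FORCE_NO_UNDO_LOG_SCAN): does not look at undo logs when starting the database; InnoDB treats even incomplete transactions as committed.
6 (SRV_FORCE_NO_LOG_REDO): does not perform the redo log roll-forward.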
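Because each level includes all lower ones, it can help to list the combined effect of a given setting. The following Python sketch encodes the six levels listed above; the class, dictionary, and function names are illustrative, and the short effect descriptions are paraphrases of the list rather than official wording.

```python
from enum import IntEnum


class ForceRecovery(IntEnum):
    SRV_FORCE_IGNORE_CORRUPT = 1
    SRV_FORCE_NO_BACKGROUND = 2
    SRV_FORCE_NO_TRX_UNDO = 3
    SRV_FORCE_NO_IBUF_MERGE = 4
    SRV_FORCE_NO_UNDO_LOG_SCAN = 5
    SRV_FORCE_NO_LOG_REDO = 6


# Paraphrased from the list above.
EFFECTS = {
    ForceRecovery.SRV_FORCE_IGNORE_CORRUPT: "run even if a corrupt page is detected",
    ForceRecovery.SRV_FORCE_NO_BACKGROUND: "do not run the master thread or purge",
    ForceRecovery.SRV_FORCE_NO_TRX_UNDO: "do not roll back transactions after recovery",
    ForceRecovery.SRV_FORCE_NO_IBUF_MERGE: "do not merge the insert buffer",
    ForceRecovery.SRV_FORCE_NO_UNDO_LOG_SCAN: "do not look at undo logs at startup",
    ForceRecovery.SRV_FORCE_NO_LOG_REDO: "do not roll the redo log forward",
}


def cumulative_effects(level):
    """Return the effects active at a level, including all lower levels."""
    if not 0 <= level <= 6:
        raise ValueError("innodb_force_recovery must be between 0 and 6")
    return [EFFECTS[item] for item in ForceRecovery if item <= level]


def is_read_only(level):
    """Per the documentation quoted above, 4 or greater means read-only."""
    return level >= 4


if __name__ == "__main__":
    for line in cumulative_effects(2):
        print(line)
```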

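Recovery steps:
Following the official advice, I first tried the lowest recovery level, innodb_force_recovery=1, which starts the database while skipping corrupt pages.
In the my.cnf parameter file, I also added the innodb_purge_threads parameter: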
[mysqld]
innodb_force_recovery=1
innodb_purge_threads = 0

MySQL did not start; the error log showed the same errors as before.

Next, set innodb_force_recovery=2:
[mysqld]
innodb_force_recovery=2
innodb_purge_threads = 0

The database restarted successfully:
150125 17:10:47 [Note] Crash recovery finished.
150125 17:10:47 [Note] Server socket created on IP: '0.0.0.0'.
150125 17:10:47 [Note] Event Scheduler: Loaded 0 events
150125 17:10:47 [Note] /vdata/webserver/mysql/bin/mysqld: ready for connections.
Version: '8.0.17-enterprise-commercial-advanced-log' socket: '/tmp/mysql.sock' port: 3306 MySQL Enterprise Server - Advanced Edition (Commercial)

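Immediately take a logical export of the database with mysqldump. Once that is done, set innodb_force_recovery back to 0 and innodb_purge_threads back to 1, then rebuild a fresh database from the dump.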
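The "start at 1 and raise the level only when needed" procedure followed above can be expressed as a small loop. This Python sketch is only a model of the decision logic: `try_start` is a hypothetical callable that would write the level into my.cnf, start mysqld, and report whether the server came up. The cap of 3 reflects the documentation's statement that levels of 4 or greater can permanently corrupt data files.

```python
def find_lowest_working_level(try_start, max_level=3):
    """Return the lowest innodb_force_recovery level at which startup succeeds.

    try_start(level) -> bool is a hypothetical hook supplied by the caller.
    Returns None if no level up to max_level works.
    """
    if not 1 <= max_level <= 6:
        raise ValueError("max_level must be between 1 and 6")
    for level in range(1, max_level + 1):
        if try_start(level):
            return level
    return None


if __name__ == "__main__":
    # Simulate the run described in this article: level 1 fails, level 2 works.
    print(find_lowest_working_level(lambda level: level >= 2))
```

If the function returns None, the safer route is to stop and restore from backup rather than to keep raising the level.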

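Also note: innodb_force_recovery must be used together with innodb_purge_threads=0. Why?

Because when innodb_purge_threads=1 is combined with innodb_force_recovery at 2 or higher (see the source analysis further down), startup gets stuck in the following loop: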
191024 10:07:42 InnoDB: Waiting for the background threads to start
191024 10:07:43 InnoDB: Waiting for the background threads to start
191024 10:07:44 InnoDB: Waiting for the background threads to start
191024 10:07:45 InnoDB: Waiting for the background threads to start
…

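The purge thread periodically reclaims undo pages that have been allocated and used. By default the purge operation is performed inside the master thread. To reduce the load on the master thread, improve CPU utilisation, and boost storage engine performance, you can set innodb_purge_threads=1 in the my.cnf parameter file to start a separate purge thread.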
[mysqld]
innodb_purge_threads=1
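Starting with InnoDB 1.2, multiple innodb_purge_threads can be specified to further speed up undo reclamation.

Note: innodb_purge_threads defaults to 1 in MySQL 5.6. The MySQL 5.7 and 8.0 manuals list a default of 4 with a minimum of 1, so check the documentation for your exact version before relying on a value of 0.
Note:
If innodb_purge_threads and innodb_force_recovery are set together, innodb_purge_threads needs to be set to 0.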
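A quick pre-flight check for this rule could look like the Python sketch below. It reads a my.cnf-style text with the standard configparser module, which is a simplification: real option files may contain `!include` directives or dash-spelled option names that this parser does not handle. The function name and the warning text are illustrative.

```python
import configparser


def check_recovery_settings(cnf_text):
    """Return warnings about the [mysqld] recovery settings in cnf_text.

    Rule taken from this article: if innodb_force_recovery and
    innodb_purge_threads are both set, innodb_purge_threads should be 0.
    """
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return []
    section = parser["mysqld"]
    force = int(section.get("innodb_force_recovery") or "0")
    purge = section.get("innodb_purge_threads")
    warnings = []
    if force > 0 and purge is not None and int(purge) != 0:
        warnings.append(
            "innodb_force_recovery=%d is set together with "
            "innodb_purge_threads=%s; set innodb_purge_threads to 0" % (force, purge)
        )
    return warnings


if __name__ == "__main__":
    sample = "[mysqld]\ninnodb_force_recovery=2\ninnodb_purge_threads = 1\n"
    for message in check_recovery_settings(sample):
        print(message)
```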

The following is taken from [Reference 2] and explains how innodb_force_recovery and innodb_purge_threads interact.

Inside srv_purge_thread, the code checks whether the server was started in recovery mode:
/* Check for shutdown and whether we should do purge at all. */
if (srv_force_recovery >= SRV_FORCE_NO_BACKGROUND
    || srv_shutdown_state != 0
    || srv_fast_shutdown) {

        break;
}

SRV_FORCE_NO_BACKGROUND has the value 2. Here is what each recovery level stands for:
[cpp]
enum {
        SRV_FORCE_IGNORE_CORRUPT = 1,   /*!< let the server run even if it
                                        detects a corrupt page */
        SRV_FORCE_NO_BACKGROUND = 2,    /*!< prevent the main thread from
                                        running: if a crash would occur
                                        in purge, this prevents it */
        SRV_FORCE_NO_TRX_UNDO = 3,      /*!< do not run trx rollback after
                                        recovery */
        SRV_FORCE_NO_IBUF_MERGE = 4,    /*!< prevent also ibuf operations:
                                        if they would cause a crash, better
                                        not do them */
        SRV_FORCE_NO_UNDO_LOG_SCAN = 5, /*!< do not look at undo logs when
                                        starting the database: InnoDB will
                                        treat even incomplete transactions
                                        as committed */
        SRV_FORCE_NO_LOG_REDO = 6       /*!< do not do the log roll-forward
                                        in connection with recovery */
};
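In other words, when the recovery level is 2 or higher, the purge thread breaks out of its while loop and exits (os_thread_exit).
In innobase_start_or_create_for_mysql, however, innodb_purge_threads is set to 1, so startup waits there for the purge thread to come up.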

[cpp]
while (srv_shutdown_state == SRV_SHUTDOWN_NONE) {
        if (srv_thread_has_reserved_slot(SRV_MASTER) == ULINT_UNDEFINED
            || (srv_n_purge_threads == 1
                && srv_thread_has_reserved_slot(SRV_WORKER)
                == ULINT_UNDEFINED)) {

                ut_print_timestamp(stderr);
                fprintf(stderr, " InnoDB: "
                        "Waiting for the background threads to "
                        "start\n");
                os_thread_sleep(1000000);
        } else {
                break;
        }
}
FIX:
Before creating the purge thread, also check the recovery level. When it is >= 2, force innodb_purge_threads to 0 to prevent the infinite loop.
[cpp]
diff -ur Percona-Server-5.5.18.stock/storage/innobase/srv/srv0start.c Percona-Server-5.5.18.fix-purge/storage/innobase/srv/srv0start.c
--- Percona-Server-5.5.18.stock/storage/innobase/srv/srv0start.c 2012-01-07 16:38:37.000000000 +0800
+++ Percona-Server-5.5.18.fix-purge/storage/innobase/srv/srv0start.c 2012-01-29 11:34:09.000000000 +0800
@@ -2019,7 +2019,14 @@
        /* If the user has requested a separate purge thread then
        start the purge thread. */
        if (srv_n_purge_threads == 1) {
-               os_thread_create(&srv_purge_thread, NULL, NULL);
+               if (srv_force_recovery < SRV_FORCE_NO_BACKGROUND) {
+                       os_thread_create(&srv_purge_thread, NULL, NULL);
+               } else {
+                       fprintf(stderr, " InnoDB: "
+                               "we will force innodb_purge_thread to 0 "
+                               "becanse force recovery is larger than 1\n");
+                       srv_n_purge_threads = 0;
+               }
        }
 
        /* Wait for the purge and master thread to startup. */
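To see why the patch removes the endless wait, the interaction can be modelled in a few lines of Python. This is a deliberately simplified model and not InnoDB code: it assumes the master thread is already up, treats a purge thread that exits immediately as never holding its slot, and uses a bounded number of rounds as a stand-in for "forever". All names are illustrative.

```python
SRV_FORCE_NO_BACKGROUND = 2


def startup_wait_rounds(force_recovery, purge_threads, apply_fix, max_rounds=5):
    """Return how many one-second waits startup performs.

    Returns None if startup is still waiting after max_rounds, which stands in
    for the endless "Waiting for the background threads to start" loop.
    """
    purge_slot_reserved = False
    if purge_threads == 1:
        if apply_fix and force_recovery >= SRV_FORCE_NO_BACKGROUND:
            # Patched behaviour: do not create the thread, do not wait for it.
            purge_threads = 0
        else:
            # The thread is created, but at level >= 2 it exits at once,
            # so the startup code never observes a reserved worker slot.
            purge_slot_reserved = force_recovery < SRV_FORCE_NO_BACKGROUND
    for rounds in range(max_rounds):
        still_waiting = purge_threads == 1 and not purge_slot_reserved
        if not still_waiting:
            return rounds
    return None


if __name__ == "__main__":
    print("unpatched, level 2:", startup_wait_rounds(2, 1, apply_fix=False))
    print("patched, level 2:", startup_wait_rounds(2, 1, apply_fix=True))
```

In this model, setting innodb_purge_threads to 0 by hand has the same effect as the patch, which matches the workaround used in this article.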

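【Conclusion】
1. This article walked through a recovery drill for a MySQL database after the server was forcibly powered off. It used the innodb_purge_threads and innodb_force_recovery parameters together and explained both.
2. Note: when innodb_force_recovery is set to >= 2 while innodb_purge_threads is 1, startup waits for a purge thread that never comes up, so recovery stalls. innodb_purge_threads therefore needs to be set to 0.
3. Once the MySQL instance is up, immediately take a logical export with mysqldump. Afterwards, remember to set innodb_force_recovery back to 0 and innodb_purge_threads back to 1, then rebuild a fresh database.

Feel free to follow my personal WeChat public account: "一森咖记"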
