一旦Docker出现宕机,运行其中的Oracle数据库如何快速恢复?

本文详细描述了在Docker容器中的Oracle数据库因服务异常宕机后,如何处理两种不同场景:一是docker服务能重新拉起但容器无法启动,通过新建容器映射数据文件恢复;二是docker服务无法启动时,进行异机恢复并调整参数。
摘要由CSDN通过智能技术生成
背景描述:数据库运行中,docker服务异常宕机,如何快速恢复oracle数据库服务
场景一:docker服务能够重新拉起,但是容器无法拉起
场景二:docker服务无法拉起,

一、准备工作

禁止 Docker 的自动拉起

修改为no
vi /usr/lib/systemd/system/docker.service


Restart=no

重启生效
[root@e3qiutldor05 system]# systemctl daemon-reload
[root@e3qiutldor05 system]# systemctl restart docker.service

通过存储过程模拟生产场景

1、建表
CREATE TABLE test(
    id NUMBER,
    name VARCHAR2(255)
);
2、对test表插入三百万次,每插入一次就提交一次
CREATE OR REPLACE PROCEDURE insert_data_multiple_times
IS
BEGIN
    FOR i IN 1..3000000 LOOP
        INSERT INTO test(id, name) VALUES(i, 'Test' || i);
        COMMIT;
    END LOOP;
END;
/

执行

BEGIN
    insert_data_multiple_times;
END;
/
3、模拟宕机

在数据写入过程中,直接kill docker服务


[root@e3qiutldor05 _data]# ps -ef |grep docker
root     119205      1  0 15:40 ?        00:00:01 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
root     125981 119205  0 15:55 ?        00:00:00 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 1521 -container-ip 172.17.0.2 -container-port 1521
root     125986 119205  0 15:55 ?        00:00:00 /usr/bin/docker-proxy -proto tcp -host-ip :: -host-port 1521 -container-ip 172.17.0.2 -container-port 1521
root     126381 122460  0 15:55 pts/1    00:00:00 grep --color=auto docker
[root@e3qiutldor05 _data]# kill -9 125981
[root@e3qiutldor05 _data]# kill -9 125986
[root@e3qiutldor05 _data]# kill -9 119205
[root@e3qiutldor05 _data]# ps -ef |grep docker
root     127502 122460  0 15:57 pts/1    00:00:00 grep --color=auto docker

docker宕机

[root@e3qiutldor05 volumes]# systemctl stop docker
Warning: Stopping docker.service, but it can still be activated by:
  docker.socket
[root@e3qiutldor05 volumes]# systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
   Active: inactive (dead) since Thu 2024-02-22 16:03:06 CST; 2s ago
     Docs: https://docs.docker.com
  Process: 129914 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, status=0/SUCCESS)
 Main PID: 129914 (code=exited, status=0/SUCCESS)



二、开始恢复

场景一:docker服务可以拉起,但是数据库容器无法拉起

恢复方式:通过镜像新建容器,并将原数据文件目录映射到新的容器

原数据库容器状态为exited

[root@e3qiutldor05 helowin]# docker ps -a
CONTAINER ID   IMAGE                                   COMMAND                  CREATED             STATUS                        PORTS                                       NAMES
7fa87488e3fb   harbor.supcon.com/database/oracle_11g   "/bin/sh -c '/home/o…"   About an hour ago   Exited (137) 12 minutes ago   0.0.0.0:1521->1521/tcp, :::1521->1521/tcp   oracle11g

1、确认宕机数据库的数据文件所在宿主机的路径
docker inspect oracle11g


   "Source": "/var/lib/docker/volumes/031b54580261fe56faf694dc862d9614ad63d4d8b820537795aa81a2eb0a2834/_data",
                "Destination": "/home/oracle/app/oracle/oradata",
2、利用镜像,新建一台容器

确认新容器数据库的数据文件路径

docker inspect harbor.supcon.com/database/oracle_11g



Volumes": {
                "/home/oracle/app/oracle/oradata": {}


利用原有镜像,创建容器,将老容器的路径映射到新容器里的地址

docker run -d -p 1522:1521 --name oracle2 -v /var/lib/docker/volumes/813255bf2505a3ea227dfad957cc965da51b2a48d44f11896b5bae88c9613f4c/_data:/home/oracle/app/oracle/oradata  harbor.supcon.com/database/oracle_11g 
3、拉起数据库

新容器数据库启动失败,两个控制文件的version不一致

SQL> alter database mount;
alter database mount
*
ERROR at line 1:
ORA-00214: control file '/home/oracle/app/oracle/oradata/helowin/control01.ctl'
version 896 inconsistent with file
'/home/oracle/app/oracle/flash_recovery_area/helowin/control02.ctl' version 841

修改参数,只指定一个控制文件

SQL>  alter system set control_files='/home/oracle/app/oracle/oradata/helowin/control01.ctl' scope=spfile;

数据库重启成功,完成数据验证,未提交的数据将会丢失

SQL> startup
ORACLE instance started.

Total System Global Area 1603411968 bytes
Fixed Size                  2213776 bytes
Variable Size             402655344 bytes
Database Buffers         1191182336 bytes
Redo Buffers                7360512 bytes
Database mounted.
Database opened.

SQL> select count(*) from test;

  COUNT(*)
----------
    951317


4、数据比对
新容器
SQL> select count(*) from test;

  COUNT(*)
----------
    951317

原生产容器
SQL> select count(*) from test;

  COUNT(*)
----------
    951317

数据一致,但是由于是异常宕机,未提交的数据新老容器都是会丢失的

三、开始恢复

场景二:docker服务无法启动

恢复方式:异机恢复
进入到宿主机目录下,将数据文件压缩拷贝,并传输到恢复机器
tar -cf LHR11G.tar LHR11G/

创建新的参数文件
vi  initlhr11g.ora



*.audit_file_dest='/test/adump'
*.audit_trail='db'
*.compatible='11.2.0.4.0'
*.control_files='/test/LHR11G/control01.ctl'
*.db_block_size=8192
*.db_domain=''
*.db_name='lhr11g'
*.diagnostic_dest='/test/adump'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=orclXDB)'
*.open_cursors=300
*.pga_aggregate_target=183500800
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.sga_target=550502400
*.undo_tablespace='UNDOTBS1'
启动数据库到nomount状态
[oracle@qiuyang1 test]$ export ORACLE_SID=lhr11g
[oracle@qiuyang1 test]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Mon Feb 26 10:28:12 2024

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount pfile='/oracle/app/product/11.2.0.4/db_1/dbs/initlhr11g.ora';
ORACLE instance started.

Total System Global Area  551165952 bytes
Fixed Size                  2255112 bytes
Variable Size             218105592 bytes
Database Buffers          322961408 bytes
Redo Buffers                7843840 bytes

SQL> alter database mount;

通过拼接语句生成修改控制文件中数据文件路径的sql
SQL> SELECT 'ALTER DATABASE RENAME FILE ''' || name || ''' TO ''/test/LHR11G/' || SUBSTR(name, INSTR(name, '/', -1) + 1) || '''' || ';' AS command FROM v$datafile;

COMMAND
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ALTER DATABASE RENAME FILE '/u01/app/oracle/oradata/LHR11G/system01.dbf' TO '/test/LHR11G/system01.dbf';
ALTER DATABASE RENAME FILE '/u01/app/oracle/oradata/LHR11G/sysaux01.dbf' TO '/test/LHR11G/sysaux01.dbf';
ALTER DATABASE RENAME FILE '/u01/app/oracle/oradata/LHR11G/undotbs01.dbf' TO '/test/LHR11G/undotbs01.dbf';
ALTER DATABASE RENAME FILE '/u01/app/oracle/oradata/LHR11G/users01.dbf' TO '/test/LHR11G/users01.dbf';
ALTER DATABASE RENAME FILE '/u01/app/oracle/oradata/LHR11G/example01.dbf' TO '/test/LHR11G/example01.dbf';


通过拼接语句生成修改控制文件中redolog路径的sql
SQL> SELECT 'ALTER DATABASE RENAME FILE ''' || member || ''' TO ''/test/LHR11G/' || SUBSTR(member, INSTR(member, '/', -1) + 1) || '''' || ';' AS command FROM v$logfile;

COMMAND
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ALTER DATABASE RENAME FILE '/u01/app/oracle/oradata/LHR11G/redo03.log' TO '/test/LHR11G/redo03.log';
ALTER DATABASE RENAME FILE '/u01/app/oracle/oradata/LHR11G/redo02.log' TO '/test/LHR11G/redo02.log';
ALTER DATABASE RENAME FILE '/u01/app/oracle/oradata/LHR11G/redo01.log' TO '/test/LHR11G/redo01.log';

打开数据库
SQL> alter database open;

Database altered.

比对数据

老库

SYS@LHR11G> select count(*) from test;

  COUNT(*)
----------
   1139633

异地恢复库:

SQL> select count(*) from test;

  COUNT(*)
----------
   1139633

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值