从alert日志看Oracle 11g Datagurad日志传输（下）

最新推荐文章于 2024-08-21 23:41:45 发布

ciqu9915

最新推荐文章于 2024-08-21 23:41:45 发布

阅读量245

点赞数

文章标签：数据库操作系统网络

3、Standby端配置

启动监听器，建立联系。但是这个时候，我们不启动standby数据库。

[oracle@SimpleLinux ~]$ lsnrctl start

LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 27-APR-2014 14:32:11

Starting /u01/app/oracle/bin/tnslsnr: please wait...

TNSLSNR for Linux: Version 11.2.0.4.0 - Production

System parameter file is /u01/app/oracle/network/admin/listener.ora

Log messages written to /u01/app/diag/tnslsnr/SimpleLinux/listener/alert/log.xml

Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=SimpleLinux)(PORT=1521)))

Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=SimpleLinux)(PORT=1521)))

STATUS of the LISTENER

------------------------

Alias LISTENER

Version TNSLSNR for Linux: Version 11.2.0.4.0 - Production

Start Date 27-APR-2014 14:32:13

Uptime 0 days 0 hr. 0 min. 0 sec

Trace Level off

Security ON: Local OS Authentication

SNMP OFF

Listener Parameter File /u01/app/oracle/network/admin/listener.ora

Listener Log File /u01/app/diag/tnslsnr/SimpleLinux/listener/alert/log.xml

Listening Endpoints Summary...

(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=SimpleLinux)(PORT=1521)))

(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))

Services Summary...

Service "ora11g" has 1 instance(s).

Instance "ora11g", status UNKNOWN, has 1 handler(s) for this service...

Service "ora11gsy" has 1 instance(s).

Instance "ora11gsy", status UNKNOWN, has 1 handler(s) for this service...

The command completed successfully

监听器启动之后，Primary端日志循环报错信息变化，减少到心跳heartbeat错误。

Sun Apr 27 14:33:02 2014

Error 1034 received logging on to the standby

PING[ARC1]: Heartbeat failed to connect to standby 'ora11gsy'. Error is 1034.

错误代码从原来的12541，变化为1034。

[oracle@SimpleLinux ~]$ oerr ora 1034

01034, 00000, "ORACLE not available"

// *Cause: Oracle was not started up. Possible causes include the following:

// - The SGA requires more space than was allocated for it.

// - The operating-system variable pointing to the instance is

// improperly defined.

// *Action: Refer to accompanying messages for possible causes and correct

// the problem mentioned in the other messages.

// If Oracle has been initialized, then on some operating systems,

// verify that Oracle was linked correctly. See the platform

// specific Oracle documentation.

对于传递日志的ARCH进程而言，在传递日志的过程中，其实就是一个一般的client process。在访问过程中，遇到的问题和一般连接问题没有太多的差异。

Standby端启动过程。

[oracle@SimpleLinux ~]$ export ORACLE_SID=ora11gsy

[oracle@SimpleLinux ~]$ sqlplus /nolog

SQL*Plus: Release 11.2.0.4.0 Production on Sun Apr 27 14:39:18 2014

SQL> conn / as sysdba

Connected to an idle instance.

SQL> startup nomount

ORACLE instance started.

Total System Global Area 372449280 bytes

Fixed Size 1364732 bytes

Variable Size 331353348 bytes

Database Buffers 33554432 bytes

Redo Buffers 6176768 bytes

Standby端alert log信息。

Sun Apr 27 14:39:32 2014

MMNL started with pid=16, OS id=2339

starting up 1 shared server(s) ...

ORACLE_BASE from environment = /u01/app

Sun Apr 27 14:40:09 2014

Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST

destination database instance is 'started' not 'mounted'

这个时候，Primary日志也发生变化，heartbeat错误依然，错误编号变化。

Sun Apr 27 14:39:07 2014

Error 1034 received logging on to the standby

PING[ARC1]: Heartbeat failed to connect to standby 'ora11gsy'. Error is 1034.

Sun Apr 27 14:40:10 2014

PING[ARC1]: Heartbeat failed to connect to standby 'ora11gsy'. Error is 16058.

Sun Apr 27 14:41:10 2014

PING[ARC1]: Heartbeat failed to connect to standby 'ora11gsy'. Error is 16058.

错误含义：

[oracle@SimpleLinux ~]$ oerr ora 16058

16058, 00000, "standby database instance is not mounted"

// *Cause: The Remote File Server (RFS) process on the standby database

// received an internal error.

// *Action: Check the standby alert log and RFS trace files for more

// information.

RFS进程是Oracle DG传递日志的一个重要进程。错误里面，也指出standby要启动到mount才可以。

这个时候，standby端也“感觉”到传递的问题，也在不断报错。

Sun Apr 27 14:41:10 2014

destination database instance is 'started' not 'mounted'

Sun Apr 27 14:42:10 2014

destination database instance is 'started' not 'mounted'

启动数据库到mount状态。

SQL> alter database mount;

Database altered.

Standby端日志变化：

Sun Apr 27 14:45:04 2014

alter database mount

ARCH: STARTING ARCH PROCESSES

Sun Apr 27 14:45:09 2014

ARC0 started with pid=20, OS id=2525

ARC0: Archival started

ARCH: STARTING ARCH PROCESSES COMPLETE

ARC0: STARTING ARCH PROCESSES

Sun Apr 27 14:45:10 2014

Successful mount of redo thread 1, with mount id 4242154160

Physical Standby Database mounted.

Lost write protection disabled

Sun Apr 27 14:45:11 2014

ARC1 started with pid=21, OS id=2527

Completed: alter database mount --mount结束，下面就进行接收动作。

Sun Apr 27 14:45:11 2014

ARC2 started with pid=22, OS id=2529

Sun Apr 27 14:45:11 2014

ARC3 started with pid=23, OS id=2531

ARC1: Archival started

ARC2: Archival started

ARC3: Archival started

ARC1: Becoming the 'no FAL' ARCH

Sun Apr 27 14:45:12 2014

ARC4 started with pid=24, OS id=2535

ARC2: Becoming the heartbeat ARCH

ARC2: Becoming the active heartbeat ARCH

ARC4: Archival started

ARC0: STARTING ARCH PROCESSES COMPLETE

--RFS工作

Sun Apr 27 14:45:14 2014

RFS[1]: Assigned to RFS process 2539

RFS[1]: Selected log 5 for thread 1 sequence 32 dbid -55025450 branch 843741722

Sun Apr 27 14:45:14 2014

RFS[2]: Assigned to RFS process 2541

RFS[2]: Opened log for thread 1 sequence 33 dbid -55025450 branch 843741722

Archived Log entry 30 added for thread 1 sequence 33 rlc 843741722 ID 0xfcd0f891 dest 2:

Sun Apr 27 14:45:17 2014

Primary database is in MAXIMUM PERFORMANCE mode

Archived Log entry 31 added for thread 1 sequence 32 ID 0xfcd0f891 dest 1:

RFS[3]: Assigned to RFS process 2547

RFS[3]: Selected log 4 for thread 1 sequence 35 dbid -55025450 branch 843741722

Sun Apr 27 14:45:24 2014

RFS[4]: Assigned to RFS process 2553

RFS[4]: Selected log 5 for thread 1 sequence 34 dbid -55025450 branch 843741722

Sun Apr 27 14:45:35 2014

Archived Log entry 32 added for thread 1 sequence 34 ID 0xfcd0f891 dest 1:

RFS是一个服务，从日志看，是Primary端的进程来负责完成的。对应进程也的确印证。

oracle 2539 1 0 14:45 ? 00:00:00 oracleora11gsy (LOCAL=NO)

oracle 2547 1 0 14:45 ? 00:00:00 oracleora11gsy (LOCAL=NO)

oracle 2553 1 0 14:45 ? 00:00:00 oracleora11gsy (LOCAL=NO)

oracle 2578 1 0 14:50 ? 00:00:00 oracleora11gsy (LOCAL=NO)

Primary端情况如下：

Sun Apr 27 14:45:15 2014

Thread 1 advanced to log sequence 35 (LGWR switch)

Current log# 1 seq# 35 mem# 0: /u01/app/oradata/ORA11G/onlinelog/o1_mf_1_9mnjwtj9_.log

Current log# 1 seq# 35 mem# 1: /u01/app/fast_recovery_area/ORA11G/onlinelog/o1_mf_1_9mnjwvdm_.log

Sun Apr 27 14:45:15 2014

******************************************************************

LGWR: Setting 'active' archival for destination LOG_ARCHIVE_DEST_2

******************************************************************

Sun Apr 27 14:45:15 2014

ARC0: Standby redo logfile selected for thread 1 sequence 32 for destination LOG_ARCHIVE_DEST_2

LNS: Standby redo logfile selected for thread 1 sequence 35 for destination LOG_ARCHIVE_DEST_2

Sun Apr 27 14:45:23 2014

Archived Log entry 54 added for thread 1 sequence 34 ID 0xfcd0f891 dest 1:

Sun Apr 27 14:45:23 2014

Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST

ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=MEMORY SID='*';

Sun Apr 27 14:45:24 2014

ARC3: Standby redo logfile selected for thread 1 sequence 34 for destination LOG_ARCHIVE_DEST_2

在ping通standby端之后，Oracle Primary就会将之前invalid状态的日志归档数据通道打通启动（自动）。日志传输开始。

4、日志Apply过程

在standby端启动应用动作。

SQL> alter database recover managed standby database using current logfile disconnect from session;

Database altered.

日志对应：

Sun Apr 27 14:55:32 2014

alter database recover managed standby database using current logfile disconnect from session

Attempt to start background Managed Standby Recovery process (ora11gsy)

Sun Apr 27 14:55:33 2014

MRP0 started with pid=29, OS id=2620

MRP0: Background Managed Standby Recovery process started (ora11gsy)

Serial Media Recovery started

Managed Standby Recovery starting Real Time Apply

Waiting for all non-current ORLs to be archived...

All non-current ORLs have been archived.

Media Recovery Log /u01/app/fast_recovery_area/ORA11GSY/archivelog/2014_04_27/o1_mf_1_32_9os9vxko_.arc

Completed: alter database recover managed standby database using current logfile disconnect from session

Media Recovery Log /u01/app/fast_recovery_area/ORA11GSY/archivelog/2014_04_27/o1_mf_1_33_9os9vtp5_.arc

Media Recovery Log /u01/app/fast_recovery_area/ORA11GSY/archivelog/2014_04_27/o1_mf_1_34_9os9w9xf_.arc

Sun Apr 27 14:56:12 2014

Media Recovery Waiting for thread 1 sequence 35 (in transit)

Recovery of Online Redo Log: Thread 1 Group 4 Seq 35 Reading mem 0

Mem# 0: /u01/app/oradata/ORA11GSY/onlinelog/o1_mf_4_9nn8pckn_.log

Mem# 1: /u01/app/fast_recovery_area/ORA11GSY/onlinelog/o1_mf_4_9nn8pjjx_.log

如此，启动完成。

5、结论

从上面的实验，我们可以了解Oracle DG的一些特性。

ü DG的重要组件是监听器。一般常见的配置策略，是关闭动态注册方法，对Primary和Standby采用静态注册手段。Oracle Primary在从mount到open过程中，就会异步的进行日志传递，检查连接到standby的网络通路。而且，这个检测过程是贯彻始终的。如果数据库处在最大保护状态，相信这个检测不通过的话，Primary一定会受到影响；

ü 除了监听器检查，还会有heartbeat检查过程。相信在全过程，Primary都在进行heartbeat检查，判断Redo Log是否可以正常传递。我们配置service只是指定了服务名，至于standby log位置信息，还要保证standby处在mount状态下，因为只有这个状态，实例才知道standby log的位置在哪里；