Redhat Linux 5.3环境实施DB2 V9.7 HADR

最新推荐文章于 2023-05-09 12:36:42 发布

chenqunan3231

最新推荐文章于 2023-05-09 12:36:42 发布

阅读量117

点赞数

文章标签：数据库网络操作系统

原文链接：https://my.oschina.net/chen106106/blog/45185

版权

数据库主机：使用VMware workstation 虚拟两台机器

操作系统：Red Hat EL 5.3

数据库软件：DB2 V9.7

虚拟机，操作系统环境安装过程省去。

2.2逻辑拓扑结构

2.3网络配置 2.3.1主机的IP配置

Linux1 130.30.3.252（255.255.255.0）

Linux2 130.30.3.136（255.255.255.0）

2.3.2/etc/hosts文件配置

Linux1的/etc/hosts

130.30.3.252 linux1

130.30.3.136 linux2

Linux2的/etc/hosts

130.30.3.252 linux1

130.30.3.136 linux2

2.3.3验证网络连通性

验证，确保网络连通性是否正常及配置正确

Ping linux1

Ping linux2

Ping 130.30.3.252

Ping 130.30.3.136

3DB2软件安装 3.1系统内核参数调整

用root用户登录，编辑/etc/sysctl.conf文件，修改需调整的内核参数，增加如下：

kernel.sem=250 256000 32 1024

kernel.shmmax=268435456

kernel.shmall=8388608

kernel.msgmax=65535

kernel.msgmnb=65535

kernel.shmall = 2097152
       kernel.shmmax = 2147483648
       kernel.shmmni = 4096
       # semaphores: semmsl, semmns, semopm, semmni
       kernel.sem = 250 32000 100 128
       fs.file-max = 65536
       net.ipv4.ip_local_port_range = 1024 65000
       net.core.rmem_default=262144
       net.core.rmem_max=262144
       net.core.wmem_default=262144
       net.core.wmem_max=262144

执行sysctl –p，从/etc/sysctl.conf文件装入sysctl设置。

运行ipcs –l命令显示验证当前的内核参数设置

# ipcs -l

------ Shared Memory Limits --------

max number of segments = 4096 // SHMMNI

max seg size (kbytes) = 32768 // SHMMAX

max total shared memory (kbytes) = 8388608 // SHMALL

min seg size (bytes) = 1

------ Semaphore Limits --------

max number of arrays = 1024 // SEMMNI

max semaphores per array = 250 // SEMMSL

max semaphores system wide = 256000 // SEMMNS

max ops per semop call = 32 // SEMOPM

semaphore max value = 32767

------ Messages: Limits --------

max queues system wide = 1024 // MSGMNI

max size of message (bytes) = 65536 // MSGMAX

default max size of queue (bytes) = 65536 // MSGMNB

3.2用户资源限制调整

设置DB2实例用户Data，nofiles，fsize资源的操作系统硬限制为无限制。

可通过修改文件/etc/security/limits.conf设置

* soft nproc 3000

* hard nproc 16384

* soft nofile 65536

* hard nofile 65536

可通过ulimit –Hdnf命令查询值的限制

3.3创建文件系统

在主节点创建以下文件系统：

创建PV，卷组

[root@linux1 db2]# pvcreate /dev/sdb1

[root@linux1 db2]# vgcreate datavg /dev/sdb1

新建LV，文件系统

[root@linux1 ~]# lvcreate -n db2homelv -L 1G /dev/datavg

Logical volume "db2homelv" created

[root@linux1 ~]# lvcreate -n hafs01lv -L 2G /dev/datavg

Logical volume "hafs01lv" created

# mkfs.ext3 /dev/datavg/db2homelv

# mkfs.ext3 /dev/datavg/hafs01lv

# mkdir /db2home

# mkdir /db2bak

挂载文件系统

# mount /dev/datavg/db2homelv /db2home

# mount /dev/datavg/hafs01lv /db2bak

增加挂载点条目

# vi /etc/fstab

/dev/datavg/db2homelv /db2home ext3 defaults 1 2

/dev/datavg/hafs01lv /db2bak ext3 defaults 1 2

在备用主机节点创建以下文件系统

[root@linux2 ~]# vgcreate datavg /dev/sdb1

Volume group "datavg" successfully created

[root@linux2 ~]# lvcreate -n db2homelv -L 1G /dev/datavg

Logical volume "db2homelv" created

[root@linux2 ~]# mkfs.ext3 /dev/datavg/db2homelv

[root@linux2 ~]# mkdir /db2home

[root@linux2 ~]# mount /dev/datavg/db2homelv /db2home

[root@linux2 ~]# vi /etc/fstab

/dev/datavg/db2homelv /db2home ext3 defaults 1 2

3.4创建用户及组

分别在主、备节点机器上创建以下用户组及用户，确保两台机器使用完全相同的UID,GID。

新建用户组

通过输入下列命令，为实例所有者创建一个组（例如，db2iadm1），为将要执行 UDF或存储过程的用户创建一个组（例如，db2fadm1），并为管理服务器创建一个组（例如，dasadm1）：

groupadd -g 999 db2iadm1

groupadd -g 998 db2fadm1

groupadd -g 997 dasadm1

新建用户

通过使用下列命令，为前一步骤中创建的每个组创建一个用户。每个用户的主目录将是先前创建的共享DB2主目录（db2home）。

useradd -u 999 -g db2iadm1 -m -d /db2home/db2inst1 db2inst1

useradd -u 998 -g db2fadm1 -m -d /db2home/db2fenc1 db2fenc1

useradd -u 997 -g dasadm1 -m -d /db2home/dasusr1 dasusr1

设置新建用户的初始密码

# passwd db2inst1

# passwd db2fenc1

# passwd dasusr1

修改文件系统属主

# chown db2inst1:db2iadm1 /db2home

# chown db2inst1:db2iadm1 /db2bak

3.5安装DB2软件

依次在主、备节点分别完成DB2数据库软件的安装，一般建议使用root安装，非root用户安装会有一些限制。

解压安装下载的安装介质

[root@linux1 db2]# tar zxvf v9.7_linuxia32_server.tar.gz

使用db2_install命令执行安装

[root@linux1 server]# ./db2_install

选择DB2安装目录，可以更改安装目录，linux下缺省安装目录为/opt/ibm/db2/V9.7

用于安装产品的缺省目录－ /opt/ibm/db2/V9.7

***********************************************************

要选择另一个目录用于安装吗？[是/否]

否

安装DB2产品选择ESE

指定下列其中一个关键字以安装 DB2产品。

ESE

CONSV

WSE

EXP

CLIENT

RTCL

按“帮助”以重新显示产品名称。

按“退出”以退出。

***********************************************************

ESE

等待安装执行过程，完成后会提示安装成功信息

正在初始化 DB2安装。

要执行的任务总数为：46

要执行的所有任务的总估计时间为：1876

任务 #1启动

描述：正在检查许可协议的接受情况

估计时间 1秒

任务 #1结束

…………………………………

…………………………………S

已成功完成执行。

有关更多信息，请参阅 "/tmp/db2_install.log.18940"上的 DB2安装日志。

3.6配置NTP网络时间同步

建议同步集群节点上的时间和日期（但不是必须的）。

4配置NFS文件系统 4.1配置NFS服务器

在主服务器上通过在 /etc/exports文件中添加以下条目，可以在启动时通过一个 NFS服务导出这个文件系统

[root@linux1 db2bak]# vi /etc/exports

/db2bak *(rw,no_root_squash)

启动NFS服务

[root@linux1 db2bak]#service nfs start

[root@linux1 db2bak]#service portmap start

执行 exportfs命令，使将挂载的 NFS客户机能使用配置的/db2bak共享目录

/usr/sbin/exportfs -a

其中选项 a 用于导出 /etc/exports文件中列出的所有目录。

4.2配置NFS客户机

用以下命令在在备机上创建共享目录：

[root@linux2 ~]# mkdir /db2bak

添加一个条目到 /etc/fstab文件，使 NFS在启动时自动挂载文件系统：

130.30.3.252:/db2bak /db2bak nfs rw,timeo=300,retrans=5,hard,intr,bg,suid

用以下命令在备机数据库服务器上挂载导出的文件系统：

[root@linux2 ~]# mount -t nfs 130.30.3.252:/db2bak /db2bak

或[root@linux2 server]# mount /db2bak

5创建示例数据库 5.1创建数据库实例

分别在主、备节点完成数据库实例创建，修改参数

[root@linux1 instance]cd /opt/ibm/db2/V9.7/instance

[root@linux1 instance]# ./db2icrt -u db2fenc1 db2inst1

DBI1070I Program db2icrt completed successfully.

[root@linux1 instance]# ./db2ilist

db2inst1

[root@linux1 instance]# su - db2inst1

[db2inst1@linux1 ~]$ db2start

01/27/2011 17:37:46 0 0 SQL1063N DB2START processing was successful.

SQL1063N DB2START processing was successful.

[db2inst1@linux1 ~]$ db2set db2comm=tcpip

[db2inst1@linux1 ~]$ db2 update dbm cfg using svcename DB2_db2inst1

DB20000I The UPDATE DATABASE MANAGER CONFIGURATION command completed successfully.

[db2inst1@linux1 ~]$ cat /etc/services |grep DB2_

DB2_db2inst1 60000/tcp

DB2_db2inst1_1 60001/tcp

DB2_db2inst1_2 60002/tcp

DB2_db2inst1_END 60003/tcp

5.2创建管理服务器

分别在主、备节点创建数据库管理服务器

[root@linux1 ~]# cd /opt/ibm/db2/V9.7/instance

[root@linux1 instance]# ./dascrt -u dasusr1

SQL4406W The DB2 Administration Server was started successfully.

DBI1070I Program dascrt completed successfully.

[root@linux1 instance]# su - dasusr1

[dasusr1@linux1 ~]$ db2admin start

SQL4409W The DB2 Administration Server is already active.

5.3创建示例数据库

在主节点创建sample数据库

[db2inst1@linux1 ~]$ db2sampl -dbpath /db2home/db2inst1/sample

6HADR数据库配置 6.1修改主数据库参数

修改以下数据库参数：

db2 UPDATE DB CFG FOR sample USING INDEXREC RESTART LOGINDEXBUILD ON LOGARCHMETH1 DISK:/db2bak/logs LOGPRIMARY 20 LOGSECOND 20 AUTORESTART OFF TRACKMOD ON

[db2inst1@linux1 ~]$ db2 UPDATE DB CFG FOR sample USING INDEXREC RESTART LOGINDEXBUILD ON LOGARCHMETH1 DISK:/db2bak/logs LOGPRIMARY 20 LOGSECOND 20 AUTORESTART OFF

DB20000I The UPDATE DATABASE CONFIGURATION command completed successfully.

6.2在主服务器上备份数据库

在主机上执行脱机备份数据库

[db2inst1@linux1 db2bak]$ db2 deactivate db sample

DB20000I The DEACTIVATE DATABASE command completed successfully.

[db2inst1@linux1 db2bak]$ db2 backup db sample to /db2bak/ with 2 buffers buffer 1024 parallelism 4 without prompting

Backup successful. The timestamp for this backup image is : 20110309104324

[db2inst1@linux1 db2bak]$ ls

logs lost+found SAMPLE.0.db2inst1.NODE0000.CATN0000.20110309104324.001

[db2inst1@linux1 db2bak]$ db2ckbkp SAMPLE.0.db2inst1.NODE0000.CATN0000.20110309104324.001

[1] Buffers processed: ###################################

Image Verification Complete - successful.

6.3在备用服务器上恢复数据库

通过主数据库备份进行完全恢复，完成备用数据库的创建

[root@linux2 instance]# su - db2inst1

[db2inst1@linux2 ~]$ cd /db2bak

[db2inst1@linux2 ~]$ mkdir /db2home/db2inst1/sample

[db2inst1@linux2 ~]$ ls

sample sqllib

[db2inst1@linux2 ~]$ cd /db2bak

[db2inst1@linux2 db2bak]$ ls

logs lost+found SAMPLE.0.db2inst1.NODE0000.CATN0000.20110309104324.001 SAMPLE.0.db2inst1.NODE0000.CATN0000.20110310105829.001

[db2inst1@linux2 db2bak]$ db2 restore db sample from /db2bak taken at 20110310105829 replace history file

DB20000I The RESTORE DATABASE command completed successfully.

6.4修改/etc/services配置文件

分别在主、备服务器上修改服务配置文件，增加用于HADR通讯使用的服务名称和端口。

– Service name: DB2_HADR_1

– Port number: 55001

– Service name: DB2_HADR_2

– Port number: 55002

在主服务器linux1上添加如下服务

[root@linux1 ~]# cat /etc/services|grep DB2_

DB2_db2inst1 60000/tcp

DB2_db2inst1_1 60001/tcp

DB2_db2inst1_2 60002/tcp

DB2_db2inst1_END 60003/tcp

DB2_HADR_1 55001/tcp

DB2_HADR_2 55002/tcp

在备服务器linux2上添加如下服务

[root@linux2 ~]# cat /etc/services|grep DB2_

DB2_db2inst1 60000/tcp

DB2_db2inst1_1 60001/tcp

DB2_db2inst1_2 60002/tcp

DB2_db2inst1_END 60003/tcp

DB2_HADR_1 55001/tcp

DB2_HADR_2 55002/tcp

注：这一步不是必须的，因为在下面配置HADR_LOCAL_SVC和HADR_REMOTE_SVC数据库参数的时候您可以直接使用端口号来替代服务名。

6.5修改主服务器的HADR参数

db2 update db cfg for sample using HADR_LOCAL_HOST 130.30.3.252

db2 update db cfg for sample using HADR_LOCAL_SVC DB2_HADR_1

db2 update db cfg for sample using HADR_REMOTE_HOST 130.30.3.136

db2 update db cfg for sample using HADR_REMOTE_SVC DB2_HADR_2

db2 update db cfg for sample using HADR_REMOTE_INST DB2INST1

db2 update db cfg for sample using HADR_SYNCMODE NEARSYNC

db2 update db cfg for sample using HADR_TIMEOUT 10

db2 update db cfg for sample using HADR_PEER_WINDOW 120

db2 connect to sample

db2 quiesce database immediate force connections

db2 unquiesce database

db2 connect reset

6.6修改备用服务器的HADR参数

db2 update db cfg for sample using HADR_LOCAL_HOST 130.30.3.136

db2 update db cfg for sample using HADR_LOCAL_SVC DB2_HADR_2

db2 update db cfg for sample using HADR_REMOTE_HOST 130.30.3.252

db2 update db cfg for sample using HADR_REMOTE_SVC DB2_HADR_1

db2 update db cfg for sample using HADR_REMOTE_INST db2inst1

db2 update db cfg for sample using HADR_SYNCMODE NEARSYNC

db2 update db cfg for sample using HADR_TIMEOUT 10

db2 update db cfg for sample using HADR_PEER_WINDOW 120

6.7启动HADR

在备用服务器上启动HADR备用数据库

db2 deactivate database sample

db2 start hadr on database sample as standby

[db2inst1@linux2 db2bak]$ db2 deactivate database sample

SQL1496W Deactivate database is successful, but the database was not activated.

[db2inst1@linux2 db2bak]$ db2 start hadr on database sample as standby

DB20000I The START HADR ON DATABASE command completed successfully.

在主服务器上启动HADR主数据库

db2 deactivate database sample

db2 start hadr on database sample as primary

[db2inst1@linux1 db2bak]$ db2 deactivate database sample

SQL1496W Deactivate database is successful, but the database was not activated.

[db2inst1@linux1 db2bak]$ db2 start hadr on database sample as primary

DB20000I The START HADR ON DATABASE command completed successfully.

注意：

在启动 HADR时，一定要先在备用服务器上启动 HADR服务，然后在主服务器上启动。同样，在停止 HADR时，要先在主服务器上停止服务，然后是备用服务器。

如果你先启动主数据库服务器HADR，那么你必须保证在HADR_TIMEOUT参数指定的时间内（单位为秒）启动备用数据库服务器HADR。否则将启动失败。

6.8查看HADR运行状态

现在，已经完成了 HADR设置，就应该检查它是否能够正常工作。

在主服务器linux1上执行以下命令：

db2 get snapshot for db on sample

输出如下：

HADR Status

Role = Primary

State = Peer

Synchronization mode = Nearsync

Connection status = Connected, 2011-03-09 11:01:55.144160

Peer window end = 2011-03-09 11:06:34.000000 (1299639994)

Peer window (seconds) = 120

Heartbeats missed = 0

Local host = 130.30.3.252

Local service = DB2_HADR_1

Remote host = 130.30.3.136

Remote service = DB2_HADR_2

Remote instance = DB2INST1

timeout(seconds) = 3

Primary log position(file, page, LSN) = S0000000.LOG, 0, 00000000045F81C1

Standby log position(file, page, LSN) = S0000000.LOG, 0, 00000000045F81C1

Log gap running average(bytes) = 0

在主服务器linux1上执行以下命令：

db2 get snapshot for db on sample

输出如下：

HADR Status

Role = Standby

State = Peer

Synchronization mode = Nearsync

Connection status = Connected, 2011-03-09 11:01:54.978888

Peer window end = 2011-03-09 11:08:24.000000 (1299640104)

Peer window (seconds) = 120

Heartbeats missed = 0

Local host = 130.30.3.136

Local service = DB2_HADR_2

Remote host = 130.30.3.252

Remote service = DB2_HADR_1

Remote instance = db2inst1

timeout(seconds) = 3

Primary log position(file, page, LSN) = S0000000.LOG, 0, 00000000045F81C1

Standby log position(file, page, LSN) = S0000000.LOG, 0, 00000000045F81C1

Log gap running average(bytes) = 0

也可以通过db2pd –db sample –hadr命令来查看HADR运行状态。

7HADR接管测试

设置过程的最后一步是测试 HADR的故障恢复功能。可以手工在备用服务器上执行普通接管命令

db2 takeover hadr on database sample

如果一般的接管不起作用，就需要指定 BY FORCE选项，强迫 db2切换到备用服务器上的HADR（如主数据库主机异常情况下启用备用服务器接管数据库）。

7.1配置客户端重新路由

在主服务器linux1上，执行以下命令启用 HADR的自动客户机重路由特性：

db2 update alternate server for database sample using hostname 130.30.3.136 port 60000

[db2inst1@linux1 db2bak]$ db2 update alternate server for database sample using hostname 130.30.3.136 port 60000

DB20000I The UPDATE ALTERNATE SERVER FOR DATABASE command completed

successfully.

DB21056W Directory changes may not be effective until the directory cache is

refreshed.

[db2inst1@linux1 db2bak]$ db2 list db directory

System Database Directory

Number of entries in the directory = 1

Database 1 entry:

Database alias = SAMPLE

Database name = SAMPLE

Local database directory = /db2home/db2inst1/sample

Database release level = d.00

Comment =

Directory entry type = Indirect

Catalog database partition number = 0

Alternate server hostname = 130.30.3.136

Alternate server port number = 60000

在备用服务器linux2上，执行以下命令启用 HADR的自动客户机重路由特性：

db2 update alternate server for database sample using hostname 130.30.3.252 port 60000

[db2inst1@linux2 db2bak]$ db2 update alternate server for database sample using hostname 130.30.3.252 port 60000

DB20000I The UPDATE ALTERNATE SERVER FOR DATABASE command completed

successfully.

DB21056W Directory changes may not be effective until the directory cache is

refreshed.

[db2inst1@linux2 db2bak]$ db2 list db directory

System Database Directory

Number of entries in the directory = 1

Database 1 entry:

Database alias = SAMPLE

Database name = SAMPLE

Local database directory = /db2home/db2inst1

Database release level = d.00

Comment =

Directory entry type = Indirect

Catalog database partition number = 0

Alternate server hostname = 130.30.3.252

Alternate server port number = 60000

7.2在客户端机器编目

db2 catalog tcpip node testhadr remote 130.30.3.252 server 60000

db2 catalog db sample as testhadr at node testhadr

7.3备用服务器正常接管

1.在主数据库上创建测试表

C:\Documents and Settings\fengsh>db2 connect to testhadr user db2inst1 using db2inst1

db2 => CREATE TABLE HADRTEST(ID INTEGER NOT NULL WITH DEFAULT,NAME VARCHAR(10),PRIMARY KEY (ID))

DB20000I SQL命令成功完成。

db2 => INSERT INTO HADRTEST (ID,NAME) VALUES (1,'张三')

DB20000I SQL命令成功完成。

db2 => INSERT INTO HADRTEST (ID,NAME) VALUES (2,'李四')

DB20000I SQL命令成功完成。

db2 => select * from hadrtest

ID NAME

----------- ----------

1张三

2李四

2条记录已选择。

db2 =>

2.在备用服务器上接管主数据库

[db2inst1@linux2 db2bak]$ db2 takeover hadr on database sample

DB20000I The TAKEOVER HADR ON DATABASE command completed successfully.

获取快照发现，备用服务器上HADR已经接管，成为primay数据库

[db2inst1@linux2 db2bak]$ db2 get snapshot for db on sample|more

HADR Status

Role = Primary

State = Peer

Synchronization mode = Nearsync

Connection status = Connected, 2011-03-09 14:55:55.852957

Peer window end = 2011-03-09 15:01:08.000000 (1299654068)

Peer window (seconds) = 120

Heartbeats missed = 0

Local host = 130.30.3.136

Local service = DB2_HADR_2

Remote host = 130.30.3.252

Remote service = DB2_HADR_1

Remote instance = db2inst1

timeout(seconds) = 3

Primary log position(file, page, LSN) = S0000001.LOG, 0, 00000000049DB3A1

Standby log position(file, page, LSN) = S0000001.LOG, 0, 00000000049DB3A1

Log gap running average(bytes) = 0

而对应原来主服务器上数据库成为standby数据库

[db2inst1@linux1 ~]$ db2 get snapshot for database on sample

HADR Status

Role = Standby

State = Peer

Synchronization mode = Nearsync

Connection status = Connected, 2011-03-09 14:55:56.222980

Peer window end = 2011-03-09 15:03:16.000000 (1299654196)

Peer window (seconds) = 120

Heartbeats missed = 0

Local host = 130.30.3.252

Local service = DB2_HADR_1

Remote host = 130.30.3.136

Remote service = DB2_HADR_2

Remote instance = DB2INST1

timeout(seconds) = 3

Primary log position(file, page, LSN) = S0000001.LOG, 0, 00000000049DB3A1

Standby log position(file, page, LSN) = S0000001.LOG, 0, 00000000049DB3A1

Log gap running average(bytes) = 0

3.客户端应用程序验证

客户端程序无法连原主数据库后，会自动重新路由连接到备用数据库，不用客户端重新连库。

C:\Documents and Settings\fengsh>db2 "select * from hadrtest"

SQL30108N 连接失败，但是已经重新建立了连接。主机名或 IP地址为

"130.30.3.136"，服务名称或端口号为

"60000"。可能会也可能不会重新尝试专用寄存器（原因码 = "1"）。 SQLSTATE=08506

C:\Documents and Settings\fengsh>db2 "select * from hadrtest"

ID NAME

----------- ----------

1张三

2李四

2条记录已选择。

重新在原主服务器上接管数据库，恢复最初主备数据库服务器状态，验证客户程序的连接，也自动完成重新路由。

C:\Documents and Settings\fengsh>db2 "select * from hadrtest"

SQL30108N 连接失败，但是已经重新建立了连接。主机名或 IP地址为

"130.30.3.252"，服务名称或端口号为

"60000"。可能会也可能不会重新尝试专用寄存器（原因码 = "1"）。 SQLSTATE=08506

C:\Documents and Settings\fengsh>db2 "select * from hadrtest"

ID NAME

----------- ----------

1张三

2李四

2条记录已选择。

7.4主数据库异常时接管

在主数据库宕库，主数据库主机故障，网络故障等异常情况下导致主数据库问题是的接管测试。

1. 用db2_kill命令手工关闭主服务器

[db2inst1@linux1 db2dump]$ db2_kill

ipclean: Removing DB2 engine and client's IPC resources for db2inst1.

在备库服务器上执行db2 get snapshot for db on sample获取快照信息

HADR Status

Role = Standby

State = Disconnected peer

Synchronization mode = Nearsync

Connection status = Disconnected, 2011-03-10 16:54:28.400277

Peer window end = 2011-03-10 17:00:29.000000 (1299747629)

Peer window (seconds) = 120

Heartbeats missed = 0

Local host = 130.30.3.136

Local service = DB2_HADR_2

Remote host = 130.30.3.252

Remote service = DB2_HADR_1

Remote instance = db2inst1

timeout(seconds) = 10

Primary log position(file, page, LSN) = S0000006.LOG, 59, 0000000005D860C4

Standby log position(file, page, LSN) = S0000006.LOG, 59, 0000000005D860C4

Log gap running average(bytes) = 0

2.在备用服务器上接管HADR主数据库

[db2inst1@linux2 sample]$ db2 takeover hadr on database sample

SQL1770N Takeover HADR cannot complete. Reason code = "1".

[db2inst1@linux2 sample]$ db2 takeover hadr on database sample by force

DB20000I The TAKEOVER HADR ON DATABASE command completed successfully.

备用机器HADR接管为primary数据库

HADR Status

Role = Primary

State = Disconnected

Synchronization mode = Nearsync

Connection status = Disconnected, 2011-03-10 16:54:28.400277

Peer window end = Null (0)

Peer window (seconds) = 120

Heartbeats missed = 0

Local host = 130.30.3.136

Local service = DB2_HADR_2

Remote host = 130.30.3.252

Remote service = DB2_HADR_1

Remote instance = db2inst1

timeout(seconds) = 10

Primary log position(file, page, LSN) = S0000006.LOG, 59, 0000000005D860C4

Standby log position(file, page, LSN) = S0000000.LOG, 0, 0000000000000000

Log gap running average(bytes) = 0

3.客户端应用程序验证

客户端程序无法连原主数据库后，会自动重新路由连接到备用数据库，不用客户端重新连库。

C:\Documents and Settings\fengsh>db2 select * from hadrtest

ID NAME

----------- ----------

1张三

2李四

2条记录已选择。

C:\Documents and Settings\fengsh>db2 select * from hadrtest

SQL30108N 连接失败，但是已经重新建立了连接。主机名或 IP地址为

"130.30.3.136"，服务名称或端口号为

"60000"。可能会也可能不会重新尝试专用寄存器（原因码 = "1"）。 SQLSTATE=08506

C:\Documents and Settings\fengsh>db2 select * from hadrtest

ID NAME

----------- ----------

1张三

2李四

2条记录已选择。

C:\Documents and Settings\fengsh>db2 select * from hadrtest

SQL30108N 连接失败，但是已经重新建立了连接。主机名或 IP地址为

"130.30.3.252"，服务名称或端口号为

"60000"。可能会也可能不会重新尝试专用寄存器（原因码 = "1"）。 SQLSTATE=08506

C:\Documents and Settings\fengsh>db2 select * from hadrtest

ID NAME

----------- ----------

1张三

2李四

2条记录已选择。

7.5原主服务器恢复接管

在主数据库宕库，主数据库主机故障，网络故障等异常情况下导致HADR数据库故障接管；待主服务器恢复后，重新进行hadr主数据库接管，恢复最初主备状态。

1.启动实例

[db2inst1@linux1 db2dump]$ db2start

03/10/2011 17:05:54 0 0 SQL1063N DB2START processing was successful.

SQL1063N DB2START processing was successful.

2.启动hadr到standby数据库

[db2inst1@linux1 db2dump]$ db2 start hadr on database sample as standby

DB20000I The START HADR ON DATABASE command completed successfully.

3.重新接管恢复为primary数据库

[db2inst1@linux1 db2dump]$ db2 takeover hadr on db sample

DB20000I The TAKEOVER HADR ON DATABASE command completed successfully.

4．客户端应用程序验证

客户端程序自动重新路由连接到主服务器的primary数据库。

8 其他问题 8.1HADR设计限制

1. 仅在 DB2 UDB企业服务器版本（ESE）上支持 HADR。但是，当 ESE上有多个数据库分区时，不支持 HADR。

2. 主数据库和备用数据库主机的操作系统、db2软件的版本，级别必须相同，交替卷动升级过程中较短时间除外。

3. 主数据库和备用数据库上的 DB2 UDB发行版必须具有相同的位大小（32位或 64位）。

4. 日志要使用独立的、高性能的磁盘或文件系统归档日志的位置，主从节点必须都能访问的到

5. HADR通讯最好使用独立的网络

6. 考虑使用多块网卡

7. 考虑提供一个虚拟IP访问数据库服务器

8. 考虑使用客户端自动路由――虚拟IP和客户端自动路由二选一即可，不可同时使用。

客户端自动路由与虚拟IP相比，它是DB2软件集成的，不需其他硬件或软件支持。

9. 选择合适的hadr_syncmode模式

10. 在V9.7前版本不支持备用数据库上的读操作，客户机无法与备用数据库连接，V9.7支持备用数据库查询模式打开。

8.2基本维护操作 8.2.1启动HADR

1. 在备用服务器上启动备用数据库上的HADR

db2 deactivate database sample

db2 start hadr on database sample as standby

2. 在主服务器上启动主数据库上的HADR

db2 deactivate database sample

db2 start hadr on database sample as primary

8.2.2停止HADR

1.在主服务器上停止主数据库的HADR

db2 deactivate database sample

db2 stop hadr on database sample

2. 在备用服务器上停止备用数据库的HADR

db2 deactivate database sample

db2 stop hadr on database sample

8.2.3HADR接管

在备用服务器上执行数据库普通接管命令

db2 takeover hadr on database sample

如果普通接管失败，或者主数据库故障时，需执行强制接管命令

db2 takeover hadr on database sample by force

8.3HADR复制的操作

以下操作将会通过HADR从主数据库复制到standby库

Data Definition Language (DDL):

– Includes create and drop table, index, and more

Data Manipulation Language (DML):

– Insert, update, or delete

Buffer pool:

– Create and drop

Table space:

– Create, drop, or alter.

– Container sizes, file types (raw device or file system), and paths must be identical on the primary and the standby.

– Table space types (DMS or SMS) must be identical on both the servers.

– If the database is enabled for automatic storage, then the storage paths must be identical.

Online REORG:

– Replicated. The standby’s Peer state with the primary is rarely impacted

by this, although it can impact the system because of the increase in the number of log records generated.

Offline REORG:

- Replicated.

Stored procedures and user defined functions (UDF):

– Creation statements are replicated.

– HADR does not replicate the external code for stored procedures or UDF object library files. In order to keep these in synchronization between the primary and the standby, it is the user’s responsibility to set these up on identical paths on both the primary and the standby servers. If the paths are not identical, invocation fails on the standby.

8.4HADR不复制的操作

以下操作HADR不会复制standby数据库：

_NOT LOGGED INITIALLY tables:

– Tables created or altered with the NOT LOGGED INITIALLY option

specified are not replicated. The tables are invalid on the standby and

after a takeover, attempts to access these tables result in an error.

_BLOBs and CLOBs:

– Non-logged Large Objects (LOBs) are not replicated, but the space is

allocated and LOB data is set to binary zeroes on the standby.

_Data links:

– NOT SUPPORTED in HADR.

_Database configuration changes:

– UPDATE DB CFG is not replicated.

– Dynamic database configuration parameters can be updated to both the

primary and the standby without interruption. For the database

configuration parameters that need a recycle of the database we

recommend you follow the rolling upgrade steps. See 7.1, “DB2 fix pack

rolling upgrades” on page 177.

_Recovery history file:

– NOT automatically shipped to standby.

– You can place an initial copy of the history file by using a primary backup

image to restore the history file to the standby using the RESTORE

command with the REPLACE HISTORY FILE option for example:

db2 restore database sample replace history file

– You can also update the history file on the standby from a primary backup

image by the RESTORE command for example:

db2 restore database sample history file

– If a takeover operation occurs and the standby database has an

up-to-date history file, backup and restore operations on the new primary

generate new records in the history file and blend seamlessly with the

records generated on the original primary. If the history file is out-of-date

or has missing entries, an automatic incremental restore might not be

possible; instead, a manual incremental restore operation is required.

We think that the recovery history file for the primary database is not

relevant to the standby database, because it contains DB2 recovery

information specific to that primary server; DB2 keeps records of which

backup file to use for recovery to prior points in time, among other things.

102 High Availability and Disaster Recovery Options for DB2 on Linux, UNIX, and Windows

In most HADR implementations, the standby system is only used in the

primary mode as a temporary measure, while the old primary server is

being fixed, making the primary version of the recovery history file

irrelevant; also, because of potentially different log file numbers and

locations between the primary and the standby, incorrect for use on the

standby server.

原文链接： http://blog.csdn.net/jaminwm/article/details/7259739

转载于:https://my.oschina.net/chen106106/blog/45185

chenqunan3231

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
Redhat Linux 5.3环境实施DB2 V9.7 HADR

数据库主机：使用VMwareworkstation 虚拟两台机器操作系统：RedHatEL5.3 数据库软件：DB2V9.7 虚拟机，操作系统环境安装过程省去。 2.2逻辑拓扑结构 2.3网络配置2.3.1主机的IP配置 Linux...
复制链接

扫一扫