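Recently, the DB tier of the cloud environment I maintain was scheduled for maintenance, so I needed to quickly build an identical simulated environment locally. The DB tier of the cloud environment is architected as follows:
Two nodes + MySQL + HA + DRBD
To get the test environment up quickly, I did not build it from scratch. Instead, I took a full backup of the Linux system on each of the two cloud nodes and restored it onto the test machines, which effectively clones the environment. The steps are as follows:
1) Back up the existing environment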
tar cvpzf backup.tgz / --exclude=/DB --exclude=/boot --exclude=/proc --exclude=/dev --exclude=/mnt --exclude=/media --exclude=/lost+found --exclude=/backup.tgz --exclude=/sys
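The exclude list in the tar command above is long enough that it is easy to mistype. A minimal sketch that assembles the same command from a list and prints it for review is shown here. The function name `build_backup_cmd` and the `EXCLUDES` variable are my own illustration, not part of the original procedure. It also places the `--exclude` options before the source path because, as far as I know, some GNU tar releases treat options that follow the path as position-sensitive.

```shell
# Directories left out of the archive, taken from the tar command above.
EXCLUDES="/DB /boot /proc /dev /mnt /media /lost+found /sys /backup.tgz"

# Print (not run) the tar invocation, with every --exclude before the source path "/".
build_backup_cmd() {
    archive=$1
    cmd="tar cvpzf $archive"
    for dir in $EXCLUDES; do
        cmd="$cmd --exclude=$dir"
    done
    echo "$cmd /"
}

build_backup_cmd backup.tgz
```

Review the printed command, then run it from / so that backup.tgz lands at /backup.tgz and matches its own exclude entry.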
2) Copy the backup to the new machine and extract it over the root filesystem
tar xvpfz backup.tgz -C /
3) Change the root account password
#passwd root
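4) Update the network configuration, then restart the network so it takes effect
5) Temporarily keep drbd and heartbeat from starting at boot; re-enable them once everything is configured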
#chkconfig drbd off
#chkconfig heartbeat off
6) Create the DRBD resource on the test machines
#lvremove /dev/VG001/LV_DB
#lvcreate -l 1308 -n LV_DB VG001
#drbdadm create-md all
#/etc/init.d/drbd start
#drbdadm -- --overwrite-data-of-peer primary r0 (run on the primary node only)
or: #drbdsetup /dev/drbd0 primary -o
#mkfs.ext3 /dev/drbd0
#mount /dev/drbd0 /DB
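The mkfs and mount commands above only succeed on the node where the DRBD device is in the Primary role, so it can help to check the role first. The sketch below reads `/proc/drbd`-style text and prints the local role. It assumes the status line carries a `ro:Local/Peer` field (older 8.x releases use `st:` instead, as far as I know); the helper name `drbd_local_role` is hypothetical.

```shell
# Read /proc/drbd-style text on stdin and print the local role of the first device.
# Assumes a field of the form ro:Primary/Secondary (or st:... on older DRBD releases).
drbd_local_role() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^(ro|st):/) {
                split($i, kv, ":")        # kv[2] is "Local/Peer"
                split(kv[2], roles, "/")  # roles[1] is the local role
                print roles[1]
                exit
            }
        }
    }'
}

# On a node with DRBD loaded:
#   [ "$(drbd_local_role < /proc/drbd)" = "Primary" ] && mkfs.ext3 /dev/drbd0
```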
7) Configure MySQL
mysqladmin -h localhost -u root password Trend#100
mysql -hlocalhost -uroot -pTrend#100
mysql> GRANT ALL PRIVILEGES ON *.* TO root@"%" IDENTIFIED BY 'Trend#100' WITH GRANT OPTION;
mysql> FLUSH PRIVILEGES;
Replace the mysqld file under /etc/ha.d/resource.d with the file of the same name from the folder (this changes the start section of /etc/init.d/mysqld)
/etc/init.d/heartbeat start
Re-enable start at boot
#chkconfig --add drbd
#chkconfig drbd on
#chkconfig --add heartbeat
#chkconfig heartbeat on
Errors encountered along the way, and how I handled them
1.
[root@JDC2-TMSPSQL02 ~]# mkfs.ext3 /dev/drbd0
mke2fs 1.39 (29-May-2006)
mkfs.ext3: Wrong medium type while trying to determine filesystem size
or
[root@JDC2-TMSPSQL02 ~]# mount /dev/drbd0 /DB
mount: block device /dev/drbd0 is write-protected, mounting read-only
mount: Wrong medium type
Solution: the device is not yet Primary on this node, so promote it first: #drbdadm -- --overwrite-data-of-peer primary r0
2.
[root@JDC2-TMSPSQL02 ~]# /etc/init.d/drbd start
Starting DRBD resources: [ d(r0) s(r0) n(r0) ]..........
***************************************************************
DRBD's startup script waits for the peer node(s) to appear.
- In case this node was already a degraded cluster before the
reboot the timeout is 120 seconds. [degr-wfc-timeout]
- If the peer was available before the reboot the timeout will
expire after 0 seconds. [wfc-timeout]
(These values are for resource 'r0'; 0 sec -> wait forever)
To abort waiting enter 'yes' [ -- ]:[ 10]:[ 11]:[ 12]:[ 13]:ye
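Solution: the two nodes cannot reach each other, so either disable the firewall with the setup command or open the required ports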
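If you would rather open the ports than switch the firewall off, the rules can be generated and reviewed before applying them. This is only a sketch under assumptions: the DRBD TCP port must be read from the `address` line of the resource in drbd.conf, the Heartbeat UDP port is assumed to be 694 unless `udpport` in ha.cf says otherwise, and the peer address and function name below are hypothetical.

```shell
# Print (not run) iptables rules that let the peer node reach DRBD and Heartbeat.
# $1 = peer IP address
# $2 = DRBD TCP port, from the "address" line of the resource in drbd.conf
# $3 = Heartbeat UDP port (assumed 694 when omitted)
print_cluster_fw_rules() {
    peer=$1
    drbd_port=$2
    hb_port=${3:-694}
    echo "iptables -I INPUT -p tcp -s $peer --dport $drbd_port -j ACCEPT"
    echo "iptables -I INPUT -p udp -s $peer --dport $hb_port -j ACCEPT"
}

# Hypothetical peer address and DRBD port; substitute the values from your own configuration.
print_cluster_fw_rules 192.0.2.12 7788
```

Rules added this way last only until the firewall is reloaded, so they still need to be saved in whatever way your distribution persists iptables rules.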
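3. mysqld is not started by heartbeat
Solution: the cause was that the service listed in /etc/ha.d/haresources was not being started correctly; linking the init script into resource.d fixed it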
#ln -s /etc/init.d/mysqld /etc/ha.d/resource.d/mysqld
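To confirm that a resource name used in haresources actually resolves to an executable script, a small lookup helper can be used. This is a sketch, and it assumes Heartbeat searches /etc/ha.d/resource.d and /etc/init.d for resource scripts; the function name `find_resource_script` is hypothetical.

```shell
# Print the first executable script named $1 found in the directories that follow.
# Returns non-zero and prints a message on stderr when nothing is found.
find_resource_script() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -x "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    echo "no executable script named $name" >&2
    return 1
}

# Example:
#   find_resource_script mysqld /etc/ha.d/resource.d /etc/init.d
```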
4.[root@JDC2-TMSPSQL01 data]# /etc/init.d/mysqld start
Timeout error occurred trying to start MySQL Daemon.
Starting MySQL: [FAILED]
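Solution: on the test machine, remove the data directory and reinitialize it (this deletes all existing databases under /DB/data):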
#cd /DB/
#rm -rf data
#/usr/bin/mysql_install_db
Then, instead of running service mysqld start or service mysqld restart, try running:
# service mysqld stop; mysqld_safe &
5.[root@JDC2-TMSPSQL02 resource.d]# /etc/init.d/mysqld status
mysqld dead but subsys locked
Solution: run manually: #service mysqld stop; mysqld_safe &
6. The following error is reported at login
Unable to get valid context for root
Last login: Wed Jul 24 02:06:01 2013 from 10.64.41.3
After booting into single-user mode: #vi /var/log/secure
Jul 24 02:20:46 JDC2-TMSP-SQL1-NEW sshd[4372]: Accepted password for root from 10.64.41.3 port 60506 ssh2
Jul 24 02:20:46 JDC2-TMSP-SQL1-NEW sshd[4372]: pam_selinux(sshd:session): Security context unconfined_u:system_r:abrt_helper_t:s0-s0:c0.c1023 is not allowed for unconfined_u:system_r:abrt_helper_t:s0-s0:c0.c1023
Jul 24 02:20:46 JDC2-TMSP-SQL1-NEW sshd[4372]: pam_selinux(sshd:session): Unable to get valid context for root
Jul 24 02:20:46 JDC2-TMSP-SQL1-NEW sshd[4372]: pam_unix(sshd:session): session opened for user root by (uid=0)
Jul 24 02:20:47 JDC2-TMSP-SQL1-NEW sshd[4372]: error: ssh_selinux_setup_pty: security_compute_relabel: Invalid argument
Solution: boot into single-user mode and disable SELinux
How to disable SELinux, with or without a reboot
On Red Hat systems, edit /etc/sysconfig/selinux:
#SELINUX=enforcing
SELINUX=disabled
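The change takes effect after a reboot. If you do not want to reboot, run setenforce 0, which switches the running system to permissive mode until the next boot.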
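The edit to the SELinux configuration file can also be scripted. The sketch below rewrites only the `SELINUX=` line and leaves commented lines and `SELINUXTYPE=` untouched. The function name `set_selinux_mode` is hypothetical, and the file is written back through redirection rather than replaced, on the assumption that /etc/sysconfig/selinux may be a symlink that should be preserved.

```shell
# Set the SELINUX= line in the given config file to the given mode.
# $1 = config file path, $2 = mode (enforcing, permissive or disabled)
set_selinux_mode() {
    cfg=$1
    mode=$2
    tmp=$(mktemp) || return 1
    # Only an uncommented line starting with "SELINUX=" is rewritten.
    sed "s/^SELINUX=.*/SELINUX=$mode/" "$cfg" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write the result back into the original file so a symlink stays a symlink.
    cat "$tmp" > "$cfg"
    rm -f "$tmp"
}

# Example, as root:
#   set_selinux_mode /etc/sysconfig/selinux disabled
#   setenforce 0
```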