- Background
A physical EDR server on site, running CentOS 7.4, was damaged during a customer OS-level backup-and-restore test and could not be restored in place. The operating system has since been reinstalled: the kernel parameters were configured per the integration standardization document, the required filesystems were partitioned and mounted per the carrier's requirements, and the carrier's OS security-hardening items were applied. What remains is to restore the MySQL PXC cluster on this node. The cluster consists of three nodes including the one to be recovered; the recovery node has a freshly installed OS and must rejoin the cluster, while PXC is running normally on the other two nodes. The three node IPs:

| Node IP | Description |
| --- | --- |
| 10.xxx.xxx.12 | node to be recovered |
| 10.xxx.xxx.13 | PXC running normally |
| 10.xxx.xxx.14 | PXC running normally |
- Approach
    - Configure the yum repositories needed for installation: local, PXC, and openjdk.
    - Check the firewall and SELinux.
    - Configure passwordless SSH.
    - Check and configure /etc/hosts.
    - Check and configure time synchronization.
    - Install PXC.
    - Join the new node to the PXC cluster.
- Execution steps
3.1. Configure the yum repositories
3.1.1. Install httpd [recovery node]
#yum -y install httpd
#systemctl restart httpd
#systemctl enable httpd
3.1.2. Create the directories [recovery node]
#mkdir /var/www/html/openjdk
#mkdir /var/www/html/PXC
3.1.3. Upload the RPM packages [recovery node]
3.1.3.1. Upload the openjdk packages [recovery node]
Upload the RPMs to /var/www/html/openjdk:
root@ithedrrk1app1[/var/www/html/openjdk]#ll
total 46448
-rw-r--r-- 1 root root 19196 Jan 23 2018 copy-jdk-configs-2.2-5.el7_4.noarch.rpm
-rw-r--r-- 1 root root 40964 Jan 23 2018 giflib-4.1.6-9.el7.x86_64.rpm
-rw-r--r-- 1 root root 247124 Jan 4 2018 java-1.8.0-openjdk-1.8.0.151-5.b12.el7_4.x86_64.rpm
-rw-r--r-- 1 root root 10248476 Jan 4 2018 java-1.8.0-openjdk-devel-1.8.0.151-5.b12.el7_4.x86_64.rpm
-rw-r--r-- 1 root root 33138868 Jan 4 2018 java-1.8.0-openjdk-headless-1.8.0.151-5.b12.el7_4.x86_64.rpm
-rw-r--r-- 1 root root 74556 Jan 23 2018 javapackages-tools-3.4.1-11.el7.noarch.rpm
-rw-r--r-- 1 root root 31564 Jan 23 2018 libfontenc-1.1.3-3.el7.x86_64.rpm
-rw-r--r-- 1 root root 67720 Jan 23 2018 libICE-1.0.9-9.el7.x86_64.rpm
-rw-r--r-- 1 root root 40160 Jan 23 2018 libSM-1.2.2-2.el7.x86_64.rpm
-rw-r--r-- 1 root root 155992 Jan 23 2018 libXfont-1.5.2-1.el7.x86_64.rpm
-rw-r--r-- 1 root root 89660 Jan 23 2018 lksctp-tools-1.0.17-2.el7.x86_64.rpm
-rw-r--r-- 1 root root 129156 Jan 23 2018 nspr-4.13.1-1.0.el7_3.x86_64.rpm
-rw-r--r-- 1 root root 869480 Jan 23 2018 nss-3.28.4-15.el7_4.x86_64.rpm
-rw-r--r-- 1 root root 74644 Jan 23 2018 nss-pem-1.0.3-4.el7.x86_64.rpm
-rw-r--r-- 1 root root 316940 Jan 23 2018 nss-softokn-3.28.3-8.el7_4.x86_64.rpm
-rw-r--r-- 1 root root 203968 Jan 23 2018 nss-softokn-freebl-3.28.3-8.el7_4.i686.rpm
-rw-r--r-- 1 root root 218664 Jan 23 2018 nss-softokn-freebl-3.28.3-8.el7_4.x86_64.rpm
-rw-r--r-- 1 root root 61592 Jan 23 2018 nss-sysinit-3.28.4-15.el7_4.x86_64.rpm
-rw-r--r-- 1 root root 513080 Jan 23 2018 nss-tools-3.28.4-15.el7_4.x86_64.rpm
-rw-r--r-- 1 root root 75236 Jan 23 2018 nss-util-3.28.4-3.el7.x86_64.rpm
-rw-r--r-- 1 root root 31836 Jan 23 2018 python-javapackages-3.4.1-11.el7.noarch.rpm
-rw-r--r-- 1 root root 48676 Jan 23 2018 ttmkfdir-3.0.9-42.el7.x86_64.rpm
-rw-r--r-- 1 root root 187852 Jan 23 2018 tzdata-java-2017c-1.el7.noarch.rpm
-rw-r--r-- 1 root root 533720 Jan 23 2018 xorg-x11-fonts-Type1-7.5-9.el7.noarch.rpm
-rw-r--r-- 1 root root 89400 Jan 23 2018 xorg-x11-font-utils-7.5-20.el7.x86_64.rpm
3.1.3.2. Upload the PXC packages [recovery node]
Upload the RPMs to /var/www/html/PXC/:
root@ithedrrk1app1[/var/www/html/PXC]#ll
total 257460
-rwxr-xr-x 1 root root 107040 Oct 22 2019 jemalloc-3.6.0-1.el7.x86_64.rpm
-rwxr-xr-x 1 root root 23804 Oct 22 2019 jemalloc-devel-3.6.0-1.el7.x86_64.rpm
-rwxr-xr-x 1 root root 45072 Oct 22 2019 libev-4.15-7.el7.x86_64.rpm
-rwxr-xr-x 1 root root 6555 Oct 29 2019 percona-release-0.1-4.noarch.rpm
-rwxr-xr-x 1 root root 7893808 Jul 5 2019 percona-xtrabackup-2.4.15-1.el7.x86_64.rpm
-rwxr-xr-x 1 root root 39551472 Jul 5 2019 percona-xtrabackup-debuginfo-2.4.15-1.el7.x86_64.rpm
-rwxr-xr-x 1 root root 13657136 Jul 5 2019 percona-xtrabackup-test-2.4.15-1.el7.x86_64.rpm
-rwxr-xr-x 1 root root 28436 Jun 20 2019 Percona-XtraDB-Cluster-57-5.7.26-31.37.1.el7.x86_64.rpm
-rwxr-xr-x 1 root root 107916124 Jun 20 2019 Percona-XtraDB-Cluster-57-debuginfo-5.7.26-31.37.1.el7.x86_64.rpm
-rwxr-xr-x 1 root root 7509116 Jun 20 2019 Percona-XtraDB-Cluster-client-57-5.7.26-31.37.1.el7.x86_64.rpm
-rwxr-xr-x 1 root root 1138232 Jun 20 2019 Percona-XtraDB-Cluster-devel-57-5.7.26-31.37.1.el7.x86_64.rpm
-rwxr-xr-x 1 root root 27940 Jun 20 2019 Percona-XtraDB-Cluster-full-57-5.7.26-31.37.1.el7.x86_64.rpm
-rwxr-xr-x 1 root root 702688 Jun 20 2019 Percona-XtraDB-Cluster-garbd-57-5.7.26-31.37.1.el7.x86_64.rpm
-rwxr-xr-x 1 root root 53109808 Jun 20 2019 Percona-XtraDB-Cluster-server-57-5.7.26-31.37.1.el7.x86_64.rpm
-rwxr-xr-x 1 root root 755196 Jun 20 2019 Percona-XtraDB-Cluster-shared-57-5.7.26-31.37.1.el7.x86_64.rpm
-rwxr-xr-x 1 root root 1185104 Jun 20 2019 Percona-XtraDB-Cluster-shared-compat-57-5.7.26-31.37.1.el7.x86_64.rpm
-rwxr-xr-x 1 root root 29910840 Jun 20 2019 Percona-XtraDB-Cluster-test-57-5.7.26-31.37.1.el7.x86_64.rpm
-rwxr-xr-x 1 root root 32624 Oct 29 2019 qpress-11-1.el7.x86_64.rpm
3.1.4. Install createrepo [recovery node]
#yum -y install createrepo
3.1.5. Build the local repos [recovery node]
#createrepo -v /var/www/html/openjdk/
#createrepo -v /var/www/html/PXC/
#chmod -R 755 /var/www/html/openjdk
#chmod -R 755 /var/www/html/PXC
3.1.6. Configure the repo files [recovery node]
#cat openjdk.repo
[openjdk1.8]
name=openjdk1.8
baseurl=http://10.xxx.xxx.12/openjdk/
gpgcheck=0
enabled=1
priority=1
#cat PXC.repo
[PXC5.7]
name=PXC5.7
baseurl=http://10.xxx.xxx.12/PXC/
gpgcheck=0
enabled=1
priority=1
10.xxx.xxx.12 is the PXC node being recovered, referred to here as edr1.
#cat file.repo
[centos7-Server]
name=centos7-Server
baseurl=http://10.xxx.xxx.12/centos7.4/
enabled=1
gpgcheck=0
#cd /etc/yum.repos.d
#ll
total 12
drwxr-xr-x. 2 root root 204 May 14 07:07 backup
-rw-r----- 1 root root 98 May 15 09:26 file.repo
-rw-r----- 1 root root 99 May 15 09:27 openjdk.repo
-rw-r----- 1 root root 87 May 15 09:28 PXC.repo
file.repo is the local OS media repo; mount the install media yourself.
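The three repo files all have the same shape, so they can be generated from a single host variable to keep the baseurls consistent. This is a sketch: `REPO_HOST` and `REPO_DIR` are placeholders (on the recovery node use `REPO_HOST=10.xxx.xxx.12` and `REPO_DIR=/etc/yum.repos.d`), and the files here are named after the repo ids rather than openjdk.repo/PXC.repo/file.repo as in the listing above.

```shell
#!/bin/sh
# Sketch: generate the repo files from one host variable so every baseurl
# points at the same httpd server. REPO_HOST and REPO_DIR are assumptions;
# on the recovery node set REPO_HOST=10.xxx.xxx.12 REPO_DIR=/etc/yum.repos.d.
REPO_HOST=${REPO_HOST:-10.0.0.12}
REPO_DIR=${REPO_DIR:-$(mktemp -d)}

write_repo() {   # $1 = repo id (also used as the file name), $2 = path on httpd
    cat > "$REPO_DIR/$1.repo" <<EOF
[$1]
name=$1
baseurl=http://$REPO_HOST/$2/
gpgcheck=0
enabled=1
priority=1
EOF
}

write_repo openjdk1.8 openjdk
write_repo PXC5.7 PXC
write_repo centos7-Server centos7.4
```

Rename the generated files to match the listing above if your site tooling expects those names.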
3.1.7. Rebuild the yum cache [recovery node]
#yum clean all
#yum makecache
3.2. Check the kernel parameters [recovery node]
#cat /etc/sysctl.conf
3.3. Check the firewall and SELinux [recovery node]
#getenforce
Disabled
#systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:firewalld(1)
3.4. Check /etc/hosts [recovery node]
Make sure this node is present in the hosts file, and that the rest of the file matches the two surviving nodes.
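The hosts check above can be scripted. A minimal sketch, assuming the ip/hostname pairs come from the node table in the background section (`HOSTS_FILE` and the example pair are placeholders):

```shell
#!/bin/sh
# Sketch: verify every cluster ip/hostname pair appears in the hosts file.
# HOSTS_FILE is an assumption; it defaults to /etc/hosts.
HOSTS_FILE=${HOSTS_FILE:-/etc/hosts}

check_hosts() {   # reads "ip hostname" pairs on stdin; returns 1 if any are missing
    rc=0
    while read -r ip host; do
        [ -z "$ip" ] && continue
        if ! grep -qwF "$ip" "$HOSTS_FILE" || ! grep -qwF "$host" "$HOSTS_FILE"; then
            echo "missing from $HOSTS_FILE: $ip $host"
            rc=1
        fi
    done
    return $rc
}

# Example (placeholder hostname; on site use the real node table):
# check_hosts <<EOF
# 10.xxx.xxx.12 ithedrrk1app1
# EOF
```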
3.5. Configure SSH mutual trust
3.5.1. On the current edr1 node: [recovery node]
# ssh-keygen -t rsa
3.5.2. scp /root/.ssh/id_rsa.pub from the other two nodes to edr1 [healthy nodes]
Node 10.xxx.xxx.13:
#scp /root/.ssh/id_rsa.pub edr1:/root/.ssh/id_rsa_13.pub
Node 10.xxx.xxx.14:
#scp /root/.ssh/id_rsa.pub edr1:/root/.ssh/id_rsa_14.pub
3.5.3. Append the public keys to the trust file on edr1 [recovery node]
#cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
#cat /root/.ssh/id_rsa_13.pub >> /root/.ssh/authorized_keys
#cat /root/.ssh/id_rsa_14.pub >> /root/.ssh/authorized_keys
3.5.4. scp the trust file to the other two nodes [recovery node]
#scp /root/.ssh/authorized_keys 10.xxx.xxx.13:/root/.ssh/authorized_keys
#scp /root/.ssh/authorized_keys 10.xxx.xxx.14:/root/.ssh/authorized_keys
3.5.5. Verify passwordless SSH [all nodes]
#ssh 10.xxx.xxx.12 date
#ssh 10.xxx.xxx.13 date
#ssh 10.xxx.xxx.14 date
On the first connection you may have to type yes once; if no password is requested, the verification passes.
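Step 3.5.3 can be made idempotent so rerunning the procedure never duplicates entries in authorized_keys. A sketch, with the key file names matching the scp targets above and `KEY_DIR` as an assumption (on the real node it is /root/.ssh):

```shell
#!/bin/sh
# Sketch of step 3.5.3: append public keys to authorized_keys, skipping
# keys that are already present so the step can be re-run safely.
# KEY_DIR is an assumption; on the real node it is /root/.ssh.
KEY_DIR=${KEY_DIR:-/root/.ssh}

merge_keys() {   # $@ = public key files to merge
    out="$KEY_DIR/authorized_keys"
    touch "$out"
    for f in "$@"; do
        while IFS= read -r key; do
            # append only keys not already present (exact full-line match)
            grep -qxF "$key" "$out" || printf '%s\n' "$key" >> "$out"
        done < "$f"
    done
    chmod 600 "$out"
}

# Example (run on edr1 after the scp steps above):
# merge_keys "$KEY_DIR/id_rsa.pub" "$KEY_DIR/id_rsa_13.pub" "$KEY_DIR/id_rsa_14.pub"
```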
3.6. Check time synchronization
This example uses ntp for time synchronization: [recovery node]
#cat /etc/ntp.conf
#cat /etc/sysconfig/ntpd
3.7. Install PXC
3.7.1. Install the OS dependency packages [recovery node]
# yum -y install perl-DBI rsync perl-DBD-mysql perl-Digest-MD5 lsof socat
3.7.2. Install PXC [recovery node]
#yum -y install Percona-XtraDB* percona-xtrabackup-24 jemalloc jemalloc-devel libev
3.8. Check the mysql user [recovery node]
The uid/gid are changed to 512 below to match the surviving nodes, so that file ownership stays consistent across the cluster.
#cat /etc/group|grep -i mysql
mysql:x:992:
# groupmod -g 512 mysql
#usermod -u 512 -d /mysql -s /bin/bash -c '' mysql
# chown -R mysql:mysql /var/lib/mysql
#passwd mysql
3.9. Create the mysql directories [recovery node]
#mkdir -p /mysql/bin_log
#mkdir -p /mysql/log
#mkdir -p /mysql/relay_log
#mkdir -p /mydata/data
#chown -R mysql:mysql /mysql
#chown -R mysql:mysql /mydata
3.10. Copy my.cnf from another node [recovery node]
#scp 10.xxx.xxx.13:/etc/my.cnf /etc/my.cnf
Copying from either healthy node works.
3.11. Edit /etc/my.cnf [recovery node]
Adjust the following items for this node:
| Parameter | Note |
| --- | --- |
| log-bin | adjust to this node's hostname |
| relay_log | adjust to this node's hostname |
| wsrep_cluster_address | unchanged, because this node keeps its original IP |
| wsrep_node_address | set to this node's IP |
| wsrep_node_name | set to this node's hostname |

A sample /etc/my.cnf:
#cat /etc/my.cnf
[client]
socket=/var/lib/mysql/mysql.sock
[mysqld_safe]
malloc-lib=/usr/lib64/libjemalloc.so
[mysqld]
skip-grant-tables
large_pages
innodb_numa_interleave=1
thread_handling=pool-of-threads
wsrep_provider_options = "gcs.fc_limit = 256; gcs.fc_factor = 0.99; gcs.fc_master_slave = yes;gcache.size =5G"
explicit_defaults_for_timestamp
log_bin_trust_function_creators = 1
transaction-isolation = READ-COMMITTED
server-id=1
datadir=/mydata/data
socket=/var/lib/mysql/mysql.sock
log-error=/mysql/log/mysqld.log
pid-file=/var/lib/mysql/mysqld.pid
log-bin=/mysql/bin_log/edr-app1-binlog
relay_log=/mysql/relay_log/edr-app1-relaylog
slow_query_log_file=/mysql/log/slow.log
general_log_file=/mysql/log/general.log
log_slow_admin_statements=1
slow_query_log=1
long_query_time=2
log_slave_updates
expire_logs_days=7
performance_schema = ON
character-set-server=utf8
collation-server=utf8_bin
skip-name-resolve
lower_case_table_names
skip-external-locking
max_allowed_packet = 16M
table_open_cache = 4096
table_open_cache_instances=16
sort_buffer_size = 524288
max_connections = 4096
query_cache_size = 0
query_cache_type = 0
tmp_table_size = 1024M
max_heap_table_size = 1024M
innodb_log_files_in_group = 4
innodb_log_file_size = 1024M
innodb_flush_method= O_DIRECT
innodb_buffer_pool_size = 10240M
innodb_log_buffer_size = 64M
innodb_file_per_table
innodb_data_file_path=ibdata1:10M:autoextend
innodb_flush_log_at_trx_commit = 2
sync_binlog = 1
innodb_lock_wait_timeout = 50
innodb_rollback_on_timeout = ON
innodb_io_capacity = 2000
innodb_read_io_threads = 8
innodb_write_io_threads = 8
innodb_thread_concurrency = 0
innodb_sync_spin_loops = 200
innodb_spin_wait_delay = 6
innodb_status_file = 1
innodb_max_dirty_pages_pct = 75
innodb_support_xa=1
innodb_doublewrite=1
innodb_checksum_algorithm=crc32
innodb_adaptive_hash_index = 1
innodb_purge_threads=8
innodb_use_native_aio = 1
innodb_autoinc_lock_mode = 2
innodb_change_buffering = all
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
slave_parallel_workers=4
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so
# Cluster connection URL contains IPs of nodes
#If no IP is found, this implies that a new cluster needs to be created,
#in order to do that you need to bootstrap this node
wsrep_cluster_address=gcomm://10.xxx.xxx.12,10.xxx.xxx.13,10.xxx.xxx.14
# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW
# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB
# Slave thread to use
wsrep_slave_threads= 8
wsrep_log_conflicts
# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2
# Node IP address
wsrep_node_address=10.xxx.xxx.12
# Cluster name
wsrep_cluster_name=PXC-1
#If wsrep_node_name is not specified, then system hostname will be used
wsrep_node_name=ithedrrk1app1
# SST method
wsrep_sst_method=xtrabackup-v2
#Authentication for SST method
wsrep_sst_auth="sstuser:sstpasswd"
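The node-specific items from the table in step 3.11 can be patched with sed after the copy. A sketch: `NODE_IP` and `NODE_NAME` are placeholders for the recovery node, `CNF` defaults to an overridable path so it can be rehearsed on a scratch copy before touching /etc/my.cnf, and the binlog/relaylog names are derived from the hostname here, which you should adjust to your site's naming convention.

```shell
#!/bin/sh
# Sketch of step 3.11: rewrite the node-specific entries in a copied my.cnf.
# NODE_IP and NODE_NAME are placeholders; point CNF at /etc/my.cnf on the
# real host once the edit has been rehearsed.
CNF=${CNF:-/etc/my.cnf}
NODE_IP=${NODE_IP:-10.0.0.12}
NODE_NAME=${NODE_NAME:-ithedrrk1app1}

patch_cnf() {
    sed -i \
        -e "s|^wsrep_node_address=.*|wsrep_node_address=$NODE_IP|" \
        -e "s|^wsrep_node_name=.*|wsrep_node_name=$NODE_NAME|" \
        -e "s|^log-bin=.*|log-bin=/mysql/bin_log/${NODE_NAME}-binlog|" \
        -e "s|^relay_log=.*|relay_log=/mysql/relay_log/${NODE_NAME}-relaylog|" \
        "$CNF"
}
```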
3.12. Configure hugepages [recovery node]
#vi /etc/sysctl.conf
##########HugePages###################
vm.nr_hugepages = 7000
vm.hugetlb_shm_group=512
#sysctl -p
#cat /proc/meminfo |grep -i hugepages
AnonHugePages: 0 kB
HugePages_Total: 7000
HugePages_Free: 7000
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
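The arithmetic behind vm.nr_hugepages = 7000: with a 2048 kB page size, the 10240M innodb_buffer_pool_size alone needs 10240/2 = 5120 pages, so 7000 leaves headroom for other large allocations. A quick sanity check using the values from this document:

```shell
#!/bin/sh
# Sanity check: vm.nr_hugepages must at least cover innodb_buffer_pool_size.
# The values below are the ones used in this document.
BUFFER_POOL_MB=10240     # innodb_buffer_pool_size from /etc/my.cnf
HUGEPAGE_KB=2048         # Hugepagesize from /proc/meminfo
NR_HUGEPAGES=7000        # vm.nr_hugepages from /etc/sysctl.conf

needed=$(( BUFFER_POOL_MB * 1024 / HUGEPAGE_KB ))
echo "pages needed for the buffer pool: $needed"
if [ "$NR_HUGEPAGES" -ge "$needed" ]; then
    echo "vm.nr_hugepages=$NR_HUGEPAGES is sufficient"
else
    echo "vm.nr_hugepages=$NR_HUGEPAGES is too small" >&2
fi
```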
3.13. Configure limits [recovery node]
#echo "mysql soft memlock unlimited" >> /etc/security/limits.conf
# echo "mysql hard memlock unlimited" >> /etc/security/limits.conf
3.14. Start mysql [recovery node]
#systemctl start mysql
This step is slow. For a new node, first make sure the /mysql/bin_log, /mysql/log, /mysql/relay_log and /mydata/data directories are all empty (they are here, since this node was just rebuilt). When systemctl start mysql runs, the node starts an SST process that automatically pulls a full copy of the data from a live node until it catches up with the surviving nodes. While the transfer runs, a marker file is visible in the mysql data directory and disappears when the transfer finishes; see the sst_in_progress file in the listing below.
[/mydata/data]#ll
total 5242920
-rw------- 1 mysql mysql 1680 May 15 10:54 ca-key.pem
-rw-r--r-- 1 mysql mysql 1120 May 15 10:54 ca.pem
-rw-r--r-- 1 mysql mysql 1120 May 15 10:54 client-cert.pem
-rw------- 1 mysql mysql 1680 May 15 10:54 client-key.pem
-rw-r----- 1 mysql mysql 5368710440 May 15 10:54 galera.cache
-rw-r----- 1 mysql mysql 0 May 15 10:54 grastate.dat
-rw-r----- 1 mysql mysql 265 May 15 10:54 gvwstate.dat
-rw------- 1 mysql mysql 1676 May 15 10:54 private_key.pem
-rw-r--r-- 1 mysql mysql 452 May 15 10:54 public_key.pem
-rw-r--r-- 1 mysql mysql 1120 May 15 10:54 server-cert.pem
-rw------- 1 mysql mysql 1676 May 15 10:54 server-key.pem
-rw-r----- 1 mysql mysql 0 May 15 10:54 sst_in_progress
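Rather than re-running ll by hand, the end of the SST can be polled for by watching the marker file above. A sketch; `DATADIR`, the poll interval, and the timeout are assumptions (on this cluster the datadir is /mydata/data):

```shell
#!/bin/sh
# Sketch: wait until the sst_in_progress marker disappears, i.e. the
# joiner has finished pulling the full dataset from a donor node.
DATADIR=${DATADIR:-/mydata/data}

wait_for_sst() {   # $1 = max seconds to wait (default 3600)
    max=${1:-3600}
    waited=0
    while [ -e "$DATADIR/sst_in_progress" ]; do
        if [ "$waited" -ge "$max" ]; then
            echo "SST still running after ${max}s" >&2
            return 1
        fi
        sleep 5
        waited=$(( waited + 5 ))
    done
    echo "SST finished"
    return 0
}
```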
3.15. Verify the data
3.15.1. On a surviving node, connect to the local mysql instance [healthy node]
#mysql -uroot -p
mysql> use mpttest;
mysql> show tables ;
mysql> select * from tmp_list_001;
+----+-------+
| id | name |
+----+-------+
| 1 | wufan |
+----+-------+
3.15.2. Insert a test row [healthy node]
mysql> insert into tmp_list_001 values(2,'test');
mysql> select * from tmp_list_001;
+----+-------+
| id | name |
+----+-------+
| 1 | wufan |
| 2 | test |
+----+-------+
3.15.3. Verify the data on the new node [recovery node]
#mysql -uroot -p
mysql> use mpttest;
mysql> select * from tmp_list_001;
+----+-------+
| id | name |
+----+-------+
| 1 | wufan |
| 2 | test |
+----+-------+
The data has been replicated to the new node.
4. Impact notes
1. Recovering the new node does not require restarting any of the running nodes, so there is no impact on the business.
2. The recovered node is the backup node among the three PXC nodes and did not have keepalived installed, so keepalived does not need to be installed or restored.