根据下面pg_basebackup在线搭建从节点,在Centos7.9下碰到下列问题记录下供参考:
问题1:pg_basebackup复制完成后拉起从库事报块大小不一致
/usr/pgsql-12/bin/pg_ctl start -D /data/pgdata
waiting for server to start....2023-12-18 14:53:50.748 EST [635707] FATAL: database files are incompatible with server
2023-12-18 14:53:50.748 EST [635707] DETAIL: The database cluster was initialized with BLCKSZ 32768, but the server was compiled with BLCKSZ 8192.
2023-12-18 14:53:50.748 EST [635707] HINT: It looks like you need to recompile or initdb.
2023-12-18 14:53:50.748 EST [635707] LOG: database system is shut down
原因:主库是源码编译安装,编译时指定的'--with-wal-blocksize=64' '--with-blocksize=32' 。新搭建从库安装PG时用yum源安装的(默认是8kb)(主库已经运行了1年多才搭建从库并且不同人搭建,导致一些信息不同步。)
处理方法:
1.卸载从库yum安装的pg(yum remove postgresql*)
2.使用源码编译安装PG12,编译安装时使用同主节点相同的参数
#主节点执行,查看编译时参数信息
/usr/local/pgsql12/bin/pg_config --configure
[root@PG-Master ~]# /usr/local/pgsql12/bin/pg_config --configure
'--prefix=/usr/local/pgsql12' '--with-pgport=5432' '--with-perl' '--with-python' '--with-tcl' '--with-openssl' '--with-pam' '--without-ldap' '--with-libxml' '--with-libxslt' '--enable-thread-safety' '--with-wal-blocksize=64' '--with-blocksize=32' '--enable-dtrace' '--enable-debug' '--with-zlib'
#从节点编译安装PG时使用相同参数
./configure '--prefix=/usr/local/pgsql12' '--with-pgport=5432' '--with-perl' '--with-python' '--with-tcl' '--with-openssl' '--with-pam' '--without-ldap' '--with-libxml' '--with-libxslt' '--enable-thread-safety' '--with-wal-blocksize=64' '--with-blocksize=32' '--enable-dtrace' '--enable-debug' '--with-zlib'
3.从节点按上面参数编译时报错,因为缺失相关库,需要安装。
遇到的相关问题如下:
【问题1】:
checking for dtrace... no
configure: error: dtrace not found
解决方法:
yum search dtrace
Loaded plugins: fastestmirror, refresh-packagekit, security
Loading mirror speeds from cached hostfile
* base: mirrors.163.com
* extras: mirrors.163.com
* updates: mirrors.163.com
=============================================================================================== Matched: dtrace ===============================================================================================
systemtap-sdt-devel.i686 : Static probe support tools
systemtap-sdt-devel.x86_64 : Static probe support tools
找到了,就安装,我是64位的,安装第二个
yum install -y systemtap-sdt-devel.x86_64
【问题2】:
checking for flags to link embedded Perl... Can't locate ExtUtils/Embed.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .).
BEGIN failed--compilation aborted.
no
configure: error: could not determine flags for linking embedded Perl.
This probably means that ExtUtils::Embed or ExtUtils::MakeMaker is not
installed.
解决方法:
yum install perl-ExtUtils-Embed -y
【问题4】:
configure: error: readline library not found
If you have readline already installed, see config.log for details on the
failure. It is possible the compiler isn't looking in the proper directory.
Use --without-readline to disable readline support.
解决方法:
yum install readline readline-devel
【问题5】:
checking for inflate in -lz... no
configure: error: zlib library not found
If you have zlib already installed, see config.log for details on the
failure. It is possible the compiler isn't looking in the proper directory.
Use --without-zlib to disable zlib support.
解决方法:
yum install zlib zlib-devel
【问题6】:
checking for CRYPTO_new_ex_data in -lcrypto... no
configure: error: library 'crypto' is required for OpenSSL
解决方法:
yum install openssl openssl-devel
【问题7】:
checking for pam_start in -lpam... no
configure: error: library 'pam' is required for PAM
解决方法:
yum install pam pam-devel
【问题8】:
checking for xmlSaveToBuffer in -lxml2... no
configure: error: library 'xml2' (version >= 2.6.23) is required for XML support
解决方法:
yum install libxml2 libxml2-devel
【问题9】:
checking for xsltCleanupGlobals in -lxslt... no
configure: error: library 'xslt' is required for XSLT support
解决方法:
yum install libxslt libxslt-devel
【问题10】:
configure: error: Tcl shell not found
解决方法:
yum install tcl tcl-devel
【问题11】:
checking for ldap.h... no
configure: error: header file is required for LDAP
解决方法:
yum install openldap openldap-devel
【问题12】:
checking for Python.h... no
configure: error: header file <Python.h> is required for Python
解决方法:
yum install python python-devel (注意有的机器上需要指定python版本,例如python2-devel或python3.6-devel)
【问题13】:
Error when bootstrapping CMake:
Cannot find appropriate C++ compiler on this system.
Please specify one using environment variable CXX.
See cmake_bootstrap.log for compilers attempted.
解决方法:
yum install gcc-c++
【问题14】:
configure: error: Package requirements (liblz4) were not met:
No package 'liblz4' found
原因是缺少liblz4依赖包。
解决办法:
安装liblz4相关包
sudo yum install epel-release
yum install lz4
sudo yum install lz4-devel
若上面方法失败,也可手动安装,可以参考本人其他博客,有专门介绍手动安装lz4,本人亲测可行。
上面错误解决来源 blog地址,笔者参照该Blog完美解决,14个错误一模一样:
https://www.cnblogs.com/coreLeo/p/16170123.html
4.编译安装完从节点,检查从节点插件是否和主节点一致,缺少的需要安装。
笔者这里启用了pgautid和pg_stat_statements两个插件,都需要在从节点配置上否则从节点启动时会报错。查看postgresql.cnf配置文件shared_preload_libraries参数。
问题2:前面问题处理好后再次拉起从节点报需要的wal日志已被清理。
could not receive data from WAL stream: ERROR: requested WAL segment 0000000100004F810000005E has already been removed
笔者参数配置如下:
archive_command = '/bin/bash /data/archive_wals/pg_archivelog.sh %p %f'
restore_command = ''
#只保留近3天的归档日志,每天的wal日志放到当天的日期目录内。
#!/bin/bash
#source /home/postgres/.bash_profile
DATE=`date +%Y%m%d`
DIR="/data/archive_wals/$DATE"
BACK="/data/archive_wals/"`date -d '-2 day' +%Y%m%d`
if [ -d "$BACK" ]; then
rm -rf $BACK
echo `date "+%F %T"`" success rm $BACK" > /data/archive_wals/sh.log
fi
(test -d $DIR || mkdir -p $DIR) && cp $1 $DIR/$2
解决方法1:
(如果是测试库或者库很小建议直接重新使用pg_basebackup重新拷贝,笔者主库有4T拷贝太慢才使用下面方法)
1.修改从库参数:restore_command = 'cp /data/archive_wals/%f %p'
2.拷贝主节点的归档wal日志到从节点/data/archive_wals目录
rsync -avz root@192.168.2.249:/data/archive_wals/20231228/ /data/archive_wals/
3.重启从节点,查看pg日志cat postgresql-2023-12-19_083203.log在restore 日志。
拷贝主节点归档日志和重启从节点可能重复几次(最好在业务低谷,产生的wal日志少),笔者试了4次接上主节点的日志。
解决方法2:
后面生产环境运行过程中通过告警也经常碰到wal已被清理错误,导致从节点同步失败。
调整下面几个参数:
max_wal_size = 50GB
min_wal_size = 4GB
wal_keep_segments = 3000
checkpoint_timeout = 10min
cat postgresql-2023-12-19_083203.log
2023-12-19 08:32:03.826 EST [710735] LOG: database system was shut down in recovery at 2023-12-19 08:31:18 EST
cp: cannot stat '/data/archive_wals/00000002.history': No such file or directory
2023-12-19 08:32:03.829 EST [710735] LOG: entering standby mode
2023-12-19 08:32:03.843 EST [710735] LOG: restored log file "0000000100004F81000000AF" from archive
2023-12-19 08:32:03.866 EST [710735] LOG: restored log file "0000000100004F81000000AD" from archive
2023-12-19 08:32:03.872 EST [710735] LOG: redo starts at 4F81/AD880D60
2023-12-19 08:32:03.942 EST [710735] LOG: restored log file "0000000100004F81000000AE" from archive
2023-12-19 08:32:04.081 EST [710735] LOG: restored log file "0000000100004F81000000AF" from archive
2023-12-19 08:32:04.223 EST [710735] LOG: restored log file "0000000100004F81000000B0" from archive
2023-12-19 08:32:04.362 EST [710735] LOG: restored log file "0000000100004F81000000B1" from archive
2023-12-19 08:32:04.488 EST [710735] LOG: consistent recovery state reached at 4F81/B1FFFF78
2023-12-19 08:32:04.488 EST [710733] LOG: database system is ready to accept read only connections
2023-12-19 08:32:04.516 EST [710735] LOG: restored log file "0000000100004F81000000B2" from archive
2023-12-19 08:32:04.676 EST [710735] LOG: restored log file "0000000100004F81000000B3" from archive