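Today the project went live, and I deployed an LNMP stack on a new VPS. Deploying PHP projects had long been routine for me, so I did not expect it to go wrong. In the end I had to pull out gdb to track down a bug where php-fpm workers exited abnormally and nginx returned 502. From what I found, this bug reportedly only appears in PHP 7.3 and later, though I have not verified that. Here is a record of how I handled it.
Error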
[07-Jul-2020 18:05:35] WARNING: [pool www] child 25159 exited on signal 11 (SIGSEGV) after 99.606752 seconds from start
[07-Jul-2020 18:05:35] NOTICE: [pool www] child 25408 started
[07-Jul-2020 18:05:36] WARNING: [pool www] child 25155 exited on signal 11 (SIGSEGV) after 100.420225 seconds from start
[07-Jul-2020 18:05:36] NOTICE: [pool www] child 25410 started
[07-Jul-2020 18:05:36] WARNING: [pool www] child 25156 exited on signal 11 (SIGSEGV) after 101.007471 seconds from start
[07-Jul-2020 18:05:36] NOTICE: [pool www] child 25411 started
[07-Jul-2020 18:05:37] WARNING: [pool www] child 25160 exited on signal 11 (SIGSEGV) after 101.356937 seconds from start
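To get a feel for how often workers are crashing, the SIGSEGV lines in the php-fpm log can be counted. This is a minimal sketch: the function name `count_segv` and the log path in the usage comment are assumptions, not something taken from this setup.

```shell
#!/bin/sh
# Count php-fpm children that exited on SIGSEGV in a given log file.
# $1 = path to the php-fpm error log (supplied by the caller).
count_segv() {
    grep -c 'exited on signal 11 (SIGSEGV)' "$1"
}

# Usage (hypothetical log path, adjust to your install):
# count_segv /usr/local/php/var/log/php-fpm.log
```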
Set the path where core dumps are written (requires root; %p expands to the PID of the crashed process)
echo "/tmp/core.%p" > /proc/sys/kernel/core_pattern
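Raise the ulimit so that core dumps can be generated (this applies to the current shell and the processes started from it)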
ulimit -c unlimited
Restart php-fpm
service php-fpm restart
Request the endpoint that returned 502 earlier
Look in the /tmp directory: there are now many core.xxx files
Disable core dumps again
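When /tmp fills up with core files, it helps to pick the most recent one for analysis. A small sketch, assuming the core.<pid> naming configured in core_pattern earlier; the helper name `latest_core` is made up for illustration.

```shell
#!/bin/sh
# Print the most recently modified core file in a directory.
# $1 = directory to search; prints nothing if no core.* file exists.
latest_core() {
    ls -t "$1"/core.* 2>/dev/null | head -n 1
}

# Usage:
# latest_core /tmp
```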
ulimit -c 0
Load one of the core files into gdb together with the php-fpm binary
gdb /usr/local/php/sbin/php-fpm -c /tmp/core.28826
Program terminated with signal 11, Segmentation fault.
#0 0x00007fe0a2a8d4a1 in redis_spprintf (redis_sock=redis_sock@entry=0x7fe0a927d240, slot=slot@entry=0x0, ret=ret@entry=0x7ffe4b02da20, kw=kw@entry=0x7fe0a2ab8220 "AUTH",
fmt=fmt@entry=0x7fe0a2ab7790 "S") at /tmp/pear/temp/redis/library.c:855
855 /tmp/pear/temp/redis/library.c: No such file or directory.
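The output above shows only the innermost frame. To see the full call stack without entering an interactive session, gdb can be run in batch mode. This is a sketch: the wrapper name `print_backtrace` is hypothetical, while `-batch` and `-ex` are standard gdb options.

```shell
#!/bin/sh
# Print the backtrace of a core file non-interactively.
# $1 = path to the binary that crashed, $2 = path to the core file.
print_backtrace() {
    gdb -batch -ex 'bt' "$1" "$2"
}

# Usage with the paths from this post:
# print_backtrace /usr/local/php/sbin/php-fpm /tmp/core.28826
```

Frames from the extension will only carry file and line information if the extension was built with debug symbols, as it apparently was here.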
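A quick glance was enough to pin down the problem: the big AUTH among the frame arguments. My guess was that Redis authentication was going wrong, and when I checked, sure enough, the Redis password was not configured correctly. After setting the correct password for the Redis connection, the php-fpm processes stopped exiting.
I owe some technical debt here. Development has been too busy lately, so the real cause of this crash will have to wait for the next post.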
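Before touching the PHP configuration, the password can be checked directly against Redis with redis-cli. A minimal sketch: the function name `redis_auth_ok`, the `REDIS_CLI` override variable, and the host, port, and password values are all placeholders; it assumes a server that replies PONG to PING once authenticated.

```shell
#!/bin/sh
# Return 0 if Redis answers PONG to PING with the given password.
# $1 = host, $2 = port, $3 = password (all supplied by the caller).
# REDIS_CLI may be set to override the client binary (assumption, for testing).
redis_auth_ok() {
    host=$1
    port=$2
    password=$3
    # stderr is dropped because redis-cli may warn about -a on the command line
    reply=$("${REDIS_CLI:-redis-cli}" -h "$host" -p "$port" -a "$password" ping 2>/dev/null)
    [ "$reply" = "PONG" ]
}

# Usage (placeholder values):
# redis_auth_ok 127.0.0.1 6379 'your-password' && echo "password accepted"
```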