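In the middle of a meal, a production alert suddenly came in. The first round of troubleshooting pointed to database connection timeouts.
I contacted the DBA, and after some digging we found that the host machine of the database Master node had failed.
After the master-slave switchover the alert still had not recovered and the service was still unavailable, so I went on to the distributed tracing logs.
Looking at the node with the longest latency in the call chain, I found the following error log: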
    org.springframework.dao.RecoverableDataAccessException:
    ### Error querying database. Cause: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
    The last packet successfully received from the server was 3,123,734 milliseconds ago. The last packet sent successfully to the server was 3,101,763 milliseconds ago.
    ### The error may exist in URL [jar:file:/home/service/app/XXX/sgy4utdvesvp/XXX/lib/XXX-dao-1.0.0.jar!/META-INF/mybatis/XXXInfoMapper.xml]
    ### The error may involve com.AA.BB.CC.dd.dao.mapper.XXXInfoMapper.selectByExample-Inline
    ### The error occurred while setting parameters
    ### SQL: select 'true' as QUERYID, id, AA, XX, create_time, update_time from XX_info WHERE ( AA in ( ? ,
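The millisecond counters in this message are easier to judge once converted to minutes. Below is a minimal sketch of that conversion; the class name `LastPacketAge` and the regular expression are my own assumptions for illustration, not part of the driver or of the service.

```java
import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LastPacketAge {
    // Hypothetical pattern: matches fragments such as "was 3,123,734 milliseconds ago".
    private static final Pattern AGE = Pattern.compile("was ([\\d,]+) milliseconds ago");

    /** Returns the first packet age reported in the message, or null if there is none. */
    static Duration firstAge(String message) {
        Matcher m = AGE.matcher(message);
        if (!m.find()) {
            return null;
        }
        // Strip the thousands separators before parsing.
        long millis = Long.parseLong(m.group(1).replace(",", ""));
        return Duration.ofMillis(millis);
    }

    public static void main(String[] args) {
        String msg = "The last packet successfully received from the server "
                + "was 3,123,734 milliseconds ago.";
        Duration age = firstAge(msg);
        // 3,123,734 ms is 3123 whole seconds, i.e. 52 whole minutes.
        System.out.println(age.getSeconds() + " s, " + age.toMinutes() + " min");
    }
}
```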
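Surprisingly, a connection had gone more than 3,000 seconds, roughly 52 minutes, without receiving a packet from the server. A look at the database configuration showed that only the database URL had been set.
None of the timeouts were configured.
We restarted the service as an emergency fix first. As a follow-up, we added the missing timeout settings to the database connection pool: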
mysql.minIdle | 5 |
mysql.initialSize | 5 |
mysql.maxActive | 20 |
mysql.maxWait | 2000 |
mysql.testOnBorrow | FALSE |
mysql.testWhileIdle | TRUE |
mysql.testOnReturn | FALSE |
keepAlive | TRUE |
mysql.timeBetweenEvictionRunsMillis | 60000 |
mysql.minEvictableIdleTimeMillis | 300000 |
mysql.maxEvictableIdleTimeMillis | 420000 |
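The names above look like the properties of Alibaba Druid's connection pool, although that is my assumption and the pool is not named here. As a hedged sketch, the values can be loaded into `java.util.Properties` and checked for a few relations one would generally expect to hold. The class `PoolSettingsCheck` and its checks are hypothetical helpers, not the pool's own validation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class PoolSettingsCheck {
    /** Builds the numeric settings listed in the table above. */
    static Properties fromPost() {
        Properties p = new Properties();
        p.setProperty("mysql.minIdle", "5");
        p.setProperty("mysql.initialSize", "5");
        p.setProperty("mysql.maxActive", "20");
        p.setProperty("mysql.maxWait", "2000");
        p.setProperty("mysql.timeBetweenEvictionRunsMillis", "60000");
        p.setProperty("mysql.minEvictableIdleTimeMillis", "300000");
        p.setProperty("mysql.maxEvictableIdleTimeMillis", "420000");
        return p;
    }

    // Throws NumberFormatException if the key is missing or not a number.
    private static long num(Properties p, String key) {
        return Long.parseLong(p.getProperty(key));
    }

    /** Returns a list of human-readable problems; an empty list means none were found. */
    static List<String> problems(Properties p) {
        List<String> out = new ArrayList<>();
        if (num(p, "mysql.minIdle") > num(p, "mysql.maxActive")) {
            out.add("minIdle exceeds maxActive");
        }
        if (num(p, "mysql.initialSize") > num(p, "mysql.maxActive")) {
            out.add("initialSize exceeds maxActive");
        }
        if (num(p, "mysql.maxWait") <= 0) {
            out.add("maxWait is not positive, so borrowers may wait without a bound");
        }
        if (num(p, "mysql.minEvictableIdleTimeMillis")
                >= num(p, "mysql.maxEvictableIdleTimeMillis")) {
            out.add("minEvictableIdleTimeMillis is not below maxEvictableIdleTimeMillis");
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(problems(fromPost())); // prints []
    }
}
```

A check like this could run at startup, so that a deployment with only the URL configured fails fast instead of surfacing later as a 50-minute stale connection.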
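We also reviewed the Redis timeout settings, since missing timeouts there can cause the same kind of connection-timeout failure:
connectionTimeout(connection.timeout) | Y | 200ms | connection timeout |
soTimeout(read.timeout) | Y | 200ms | read/write timeout |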
poolConfig.maxTotal | Y | 50 |
poolConfig.maxIdle | Y | maxTotal=maxIdle |
poolConfig.minIdle | Y | 10 |
poolConfig.blockWhenExhausted | N | TRUE |
poolConfig.maxWaitMillis | Y | 100 |
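To make the difference between `connectionTimeout` and `soTimeout` concrete, here is a small sketch using only `java.net.Socket`. It assumes a Redis client ends up applying these two values as a socket connect timeout and a socket read timeout; the class `SocketTimeoutDemo` and the silent local server are illustrative and do not come from any Redis library.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class SocketTimeoutDemo {
    /**
     * Connects with a connect timeout, then waits for one byte with a read timeout.
     * Returns true if the read timed out, false if data or end-of-stream arrived first.
     */
    static boolean readTimesOut(String host, int port, int connectTimeoutMs, int soTimeoutMs)
            throws IOException {
        try (Socket socket = new Socket()) {
            // Bounds how long establishing the TCP connection may take.
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            // Bounds how long each blocking read may wait for data.
            socket.setSoTimeout(soTimeoutMs);
            try {
                socket.getInputStream().read();
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // A local server that never calls accept() or writes anything: the OS backlog
        // completes the handshake, but no reply ever arrives.
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            String host = silent.getInetAddress().getHostAddress();
            boolean timedOut = readTimesOut(host, silent.getLocalPort(), 200, 200);
            System.out.println("read timed out: " + timedOut);
        }
    }
}
```

Without the `setSoTimeout` call the read would block until the peer answers or the connection is torn down, which is the same failure shape as the stale MySQL connection above.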