Recently our system has been frequently throwing java.sql.SQLException: connection closed
message:com.noahgroup.framework.smart.admin.common.exception.AdminExceptionHandler.handleException:76 - 【exception】:
org.springframework.jdbc.UncategorizedSQLException:
Error querying database. Cause: java.sql.SQLException: connection closed
The error may exist in URL [jar:file:/usr/local/smart-admin/smart-admin.jar!/BOOT-INF/lib/smart-admin-service-0.0.1-SNAPSHOT.jar!/mapper/MmChanceMapper.xml]
The error may involve com.noahgroup.framework.smart.admin.service.mapper.MmChanceMapper.selectChanceReport_COUNT
The error occurred while executing a query
SQL: SELECT count(0) FROM mm_chance WHERE state BETWEEN 0 AND 1 AND is_deleted = 0
Cause: java.sql.SQLException: connection closed
; uncategorized SQLException; SQL state [null]; error code [0]; connection closed; nested exception is java.sql.SQLException: connection closed
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:89)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)…
Caused by: java.sql.SQLException: connection closed
at com.alibaba.druid.pool.DruidPooledConnection.checkStateInternal(DruidPooledConnection.java:1163)
at com.alibaba.druid.pool.DruidPooledConnection.checkState(DruidPooledConnection.java:1154)
at com.alibaba.druid.pool.DruidPooledConnection.prepareStatement(DruidPooledConnection.java:337)
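The project uses Druid 2.1.8, a fairly recent version, so the pool itself should normally be fine. Going by the stack trace above, I checked the Druid configuration first.
Following some configurations found online, the two parameters to look at are testOnBorrow and timeBetweenEvictionRunsMillis:
testOnBorrow: false. Enabling it costs performance, so don't turn it on in production.
timeBetweenEvictionRunsMillis: 60000. It must be shorter than the database's maximum connection idle timeout (MySQL's wait_timeout defaults to 8 hours).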
<!-- Alibaba Druid database connection pool -->
<bean id="dataSource" class="com.alibaba.druid.pool.DruidDataSource" init-method="init" destroy-method="close">
<!-- Basic properties: url, user, password -->
<property name="url" value="${url}" />
<property name="username" value="${username}" />
<property name="password" value="${password}" />
<!-- Initial, minimum, and maximum pool size -->
<property name="initialSize" value="1" />
<property name="minIdle" value="1" />
<property name="maxActive" value="20" />
<!-- Maximum time to wait when acquiring a connection, in milliseconds -->
<property name="maxWait" value="60000" />
<!-- Interval between checks for idle connections that should be closed, in milliseconds -->
<property name="timeBetweenEvictionRunsMillis" value="60000" />
<!-- Minimum time a connection may sit idle in the pool before it can be evicted, in milliseconds; must be shorter than the database's maximum connection timeout -->
<property name="minEvictableIdleTimeMillis" value="300000" />
<!-- For MySQL use select 1; for Oracle use select 1 from dual -->
<property name="validationQuery" value="SELECT 1" />
<property name="testWhileIdle" value="true" />
<property name="testOnBorrow" value="false" /> <!-- Enabling this costs performance; do not enable in production -->
<property name="testOnReturn" value="false" />
<!-- PSCache: leave it off for MySQL; recommended for Oracle -->
<property name="poolPreparedStatements" value="false" />
<property name="maxPoolPreparedStatementPerConnectionSize" value="20" />
<!-- Filters for monitoring and statistics; without this the monitoring page cannot collect SQL statistics -->
<property name="filters" value="stat" />
</bean>
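To make the interplay of testOnBorrow, testWhileIdle, and timeBetweenEvictionRunsMillis concrete, here is a minimal sketch of the borrow-time decision as I understand it from Druid's documentation. It is a simplified model, not Druid's source; the class and method names are made up for illustration.

```java
public class IdleCheckSketch {
    /**
     * Returns true when a pool configured like the bean above would run the
     * validation query before handing a connection to the caller.
     * Simplified model; names and structure are assumptions, not Druid code.
     */
    static boolean shouldValidateOnBorrow(boolean testOnBorrow,
                                          boolean testWhileIdle,
                                          long idleMillis,
                                          long timeBetweenEvictionRunsMillis) {
        if (testOnBorrow) {
            return true; // every borrow pays for a validation round trip
        }
        // testWhileIdle: only connections that sat idle for at least one
        // eviction interval are checked, so busy connections skip the cost
        return testWhileIdle && idleMillis >= timeBetweenEvictionRunsMillis;
    }

    public static void main(String[] args) {
        // idle for 5 s with a 60 s interval: handed out without a check
        System.out.println(shouldValidateOnBorrow(false, true, 5_000, 60_000));
        // idle for 120 s: validated with "SELECT 1" first
        System.out.println(shouldValidateOnBorrow(false, true, 120_000, 60_000));
    }
}
```

With the values from the bean above, a connection that has been idle for more than a minute is validated before use, which is why testOnBorrow can stay off.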
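With this configuration in place the problem was still not solved. Digging further through the logs, I found a transaction lock wait timeout error: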
Lock wait timeout exceeded; try restarting transaction
### Error updating database. Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException:
Lock wait timeout exceeded; try restarting transaction
### The error may involve com.example.demo.mapper.AdminMapper.updateById-Inline
### The error occurred while setting parameters
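The project's legacy framework uses manual (programmatic) transactions. MySQL's default transaction isolation level is Repeatable Read. Under concurrent requests, the transactions read the same row version and then compete for the transaction lock: one transaction waits on another, its commit fails, and then its rollback fails as well, so the connection is never closed.
The threads queued behind it never get a chance to run and just keep waiting.
Finally, once a transaction has been waiting longer than MySQL's default lock wait timeout (innodb_lock_wait_timeout, 50 s), MySQL reports: Lock wait timeout exceeded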
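The waiting behavior described above can be imitated in plain Java. This is only an analogy under stated assumptions: a ReentrantLock stands in for the row lock, the holder thread stands in for a transaction that never commits, and the short wait stands in for the 50 s timeout. InnoDB's real locking is more involved, and all names here are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class LockWaitSketch {
    /** Stand-in for the row lock on mm_chance; an assumption for illustration. */
    private static final ReentrantLock rowLock = new ReentrantLock();

    /** Tries to "update the row", giving up after waitMillis like a lock wait timeout. */
    static String update(long waitMillis) {
        boolean locked;
        try {
            locked = rowLock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
        if (!locked) {
            return "Lock wait timeout exceeded";
        }
        try {
            return "updated";
        } finally {
            rowLock.unlock();
        }
    }

    /** Runs one update while another thread holds the lock and does not release it. */
    static String updateWhileBlocked(long waitMillis) {
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            rowLock.lock();        // the first "transaction" takes the lock
            held.countDown();
            try {
                release.await();   // and neither commits nor rolls back
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                rowLock.unlock();
            }
        });
        holder.start();
        try {
            held.await();
            return update(waitMillis); // the second "transaction" has to wait
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            release.countDown();
            try {
                holder.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(updateWhileBlocked(100)); // Lock wait timeout exceeded
        System.out.println(update(100));             // updated
    }
}
```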
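// Legacy manual-transaction pattern (simplified pseudocode)
public int updateTableXxx() {
    try {
        /* begin the transaction */
        startTransaction();
        mapper.updateXxx();
        /* ... other processing steps ... */
        /* commit the transaction */
        return commitTransaction();
    } catch (Exception e) {
        /* roll back the transaction; if this call throws, the connection is never released */
        rollbackTransaction();
        log.error(e.getMessage(), e);
        return -1;
    }
}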
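One weakness of the pattern above is that nothing guarantees the connection is released when the rollback itself fails. A minimal sketch of a more defensive shape, written against the plain java.sql.Connection API rather than the project's own startTransaction/commitTransaction helpers (which I have not seen), might look like this. The Work interface and inTransaction method are hypothetical names.

```java
import java.sql.Connection;
import java.sql.SQLException;

public class ManualTxSketch {
    /** Hypothetical callback standing in for mapper.updateXxx() and the later steps. */
    interface Work {
        int run(Connection conn) throws SQLException;
    }

    /** Runs work in one transaction; returns the affected rows, or -1 on failure. */
    static int inTransaction(Connection conn, Work work) {
        try {
            conn.setAutoCommit(false);   // begin the transaction
            int rows = work.run(conn);
            conn.commit();
            return rows;
        } catch (SQLException | RuntimeException e) {
            try {
                conn.rollback();
            } catch (SQLException rollbackFailure) {
                // keep the original error and record the failed rollback with it
                e.addSuppressed(rollbackFailure);
            }
            return -1;
        } finally {
            try {
                conn.close();            // for a pooled connection this returns it to the pool
            } catch (SQLException ignored) {
                // nothing more can be done here
            }
        }
    }
}
```

The point of the finally block is that close() runs whether the commit, the work, or the rollback failed, so a broken transaction cannot hold on to its connection.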
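Steps to reproduce: use the manual transaction and run a concurrent load test with JMeter; the error shows up very easily.
Solutions:
1. Keep the manual transaction and use a Redis distributed lock to control concurrent requests.
2. Drop the manual transaction and use Spring's @Transactional annotation, which lets Spring manage commit and rollback and held up under concurrent requests (recommended).
3. Keep the manual transaction and use a synchronized lock around the method body (not recommended).
In my tests, all three approaches worked.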
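As a rough illustration of option 3, the sketch below serializes callers with a synchronized block and records how many were ever inside at once. The sleep stands in for the real update and commit, and all names are hypothetical. Note that this kind of lock only covers a single JVM, which is presumably one reason it is the least attractive option.

```java
import java.util.ArrayList;
import java.util.List;

public class SerializedUpdateSketch {
    private static final Object LOCK = new Object();
    private static int inFlight = 0;     // callers currently inside the critical section
    private static int maxInFlight = 0;  // highest value inFlight ever reached

    /** Stand-in for updateTableXxx(); the body would run the manual transaction. */
    static int updateTableXxx() {
        synchronized (LOCK) {
            inFlight++;
            maxInFlight = Math.max(maxInFlight, inFlight);
            try {
                Thread.sleep(5);         // pretend to update and commit
                return 1;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            } finally {
                inFlight--;
            }
        }
    }

    /** Calls updateTableXxx() from several threads and reports the peak overlap. */
    static int maxConcurrentCallers(int threads) {
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(SerializedUpdateSketch::updateTableXxx);
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return maxInFlight;
    }

    public static void main(String[] args) {
        System.out.println(maxConcurrentCallers(8)); // 1
    }
}
```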
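After a transaction lock wait times out, Druid frequently reports that the connection has been closed or is unavailable.
So far I have only observed that the two are correlated; the exact cause is still unclear. Feel free to discuss in the comments.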