MongoDB 经常报 “Closed connection“ 的问题排查修复

丶安格拉丶

已于 2024-01-25 18:23:23 修改

阅读量1k

点赞数 10

分类专栏： Java 文章标签： mongodb Java

于 2024-01-25 10:30:53 首次发布

本文链接：https://blog.csdn.net/qq_39579105/article/details/135837979

版权

Java 专栏收录该内容

5 篇文章 0 订阅

订阅专栏

现象

通过消息队列(RabbitMQ)将队列内的消息消费存入MongoDB，因为本来内容就不多所以基本应该是实时的，但是却不知道为什么会偶发消息消费推迟20~30分钟的情况，排查了RabbitMQ和MongoDB，网上没有找到相关问题的帖子，之后排查推测是 MongoDB 连接的问题

报错日志

2023-12-20 15:03:38.051 [org.springframework.amqp.rabbit.RabbitListenerEndpointContainer#3-1] WARN  org.mongodb.driver.connection - Got socket exception on connection [connectionId{localValue:109, serverValue:1993}] to test-mongodb03:17017. All connections to test-mongodb03:17017 will be closed.
2023-12-20 15:03:38.051 [org.springframework.amqp.rabbit.RabbitListenerEndpointContainer#3-1] INFO  org.mongodb.driver.connection - Closed connection [connectionId{localValue:109, serverValue:1993}] to test-mongodb03:17017 because there was a socket exception raised by this connection.
2023-12-20 15:03:38.052 [org.springframework.amqp.rabbit.RabbitListenerEndpointContainer#3-1] INFO  org.mongodb.driver.cluster - No server chosen by ReadPreferenceServerSelector{readPreference=primary} from cluster description ClusterDescription{type=REPLICA_SET, connectionMode=MULTIPLE, serverDescriptions=[ServerDescription{address=test-mongodb02:17017, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[6, 0, 10]}, minWireVersion=0, maxWireVersion=17, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=39535116, setName='test', canonicalAddress=test-mongodb02:17017, hosts=[test-mongodb03:17017, test-mongodb02:17017, test-mongodb01:17017], passives=[], arbiters=[], primary='test-mongodb03:17017', tagSet=TagSet{[]}, electionId=null, setVersion=1, lastWriteDate=Wed Dec 20 15:03:29 GMT+08:00 2023, lastUpdateTimeNanos=5350705627747041}, ServerDescription{address=test-mongodb03:17017, type=UNKNOWN, state=CONNECTING}, ServerDescription{address=test-mongodb01:17017, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[6, 0, 10]}, minWireVersion=0, maxWireVersion=17, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=39518862, setName='test', canonicalAddress=test-mongodb01:17017, hosts=[test-mongodb03:17017, test-mongodb02:17017, test-mongodb01:17017], passives=[], arbiters=[], primary='test-mongodb03:17017', tagSet=TagSet{[]}, electionId=null, setVersion=1, lastWriteDate=Wed Dec 20 15:03:29 GMT+08:00 2023, lastUpdateTimeNanos=5350705837909970}]}. Waiting for 30000 ms before timing out
2023-12-20 15:03:38.091 [cluster-ClusterId{value='65797632227a450fe86e23d5', description='null'}-test-mongodb03:17017] INFO  org.mongodb.driver.cluster - Discovered replica set primary test-mongodb03:17017
2023-12-20 15:03:38.091 [org.springframework.amqp.rabbit.RabbitListenerEndpointContainer#3-1] INFO  org.mongodb.driver.connection - Closed connection [connectionId{localValue:110, serverValue:1992}] to test-mongodb03:17017 because there was a socket exception raised on another connection from this pool.
2023-12-20 15:03:38.091 [org.springframework.amqp.rabbit.RabbitListenerEndpointContainer#3-1] INFO  org.mongodb.driver.connection - Closed connection [connectionId{localValue:111, serverValue:1994}] to test-mongodb03:17017 because there was a socket exception raised on another connection from this pool.
2023-12-20 15:03:38.362 [org.springframework.amqp.rabbit.RabbitListenerEndpointContainer#3-1] INFO  org.mongodb.driver.connection - Opened connection [connectionId{localValue:112, serverValue:2004}] to test-mongodb03:17017
2023-12-20 15:03:38.683 [org.springframework.amqp.rabbit.RabbitListenerEndpointContainer#3-1] INFO  org.mongodb.driver.connection - Opened connection [connectionId{localValue:113, serverValue:2005}] to test-mongodb03:17017
2023-12-20 15:03:40.218 [org.springframework.amqp.rabbit.RabbitListenerEndpointContainer#3-1] INFO  org.mongodb.driver.connection - Opened connection [connectionId{localValue:114, serverValue:2008}] to test-mongodb03:17017

解决方式

根据MongoDB官方文档

FAQ: MongoDB Diagnostics — MongoDB Manual

sysctl net.ipv4.tcp_keepalive_time

获得的值小于等于120(秒)(2分钟)会比较好，但是目前设置的是1200(秒)(20分钟)

也基本符合测试那边的问题，就是数据差不多延迟了二三十分钟(虽然官方文档中说大于5分钟的都无效，会被设置为5分钟)

修改语句

sudo sysctl -w net.ipv4.tcp_keepalive_time=120

要防止重启之后配置变回原来的还得改配置文件，这边没有权限后续就没有操作了，不过步骤写的很清楚，交给运维就行

另外根据官方文档

FAQ: MongoDB Diagnostics — MongoDB Manual

还需要修改 net.ipv4.tcp_retries2 配置，通过以下语句查询，小于等于5会比较好

sysctl net.ipv4.tcp_retries2

可以看到系统原来设置的是15，这边通过

sudo sysctl -w net.ipv4.tcp_retries2=5

进行修改，修改后可以看到值已经变成5了

要防止重启之后配置变回原来的还得改配置文件，跟上面一样是同一个文件，交给运维就行

经过修复之后，该异常已经一周并未出现，推测问题已经解决

丶安格拉丶

关注

10
点赞
踩
12

收藏

觉得还不错? 一键收藏
4
评论
MongoDB 经常报 “Closed connection“ 的问题排查修复

通过消息队列(RabbitMQ)将队列内的消息消费存入MongoDB，因为本来内容就不多所以基本应该是实时的，但是却不知道为什么会偶发消息消费推迟20~30分钟的情况，排查了RabbitMQ和MongoDB，网上没有找到相关问题的帖子，之后排查推测是 MongoDB 连接的问题。也基本符合测试那边的问题，就是数据差不多延迟了二三十分钟(虽然官方文档中说大于5分钟的都无效，会被设置为5分钟)要防止重启之后配置变回原来的还得改配置文件，这边没有权限后续就没有操作了，不过步骤写的很清楚，交给运维就行。
复制链接

扫一扫

专栏目录