Where the problem comes from
Running the following code on Windows triggers the problem. If you are not on Windows, you can stop reading here.
import time

from pymongo import MongoClient

mongodb_setting = dict(
    host='127.0.0.1',
    port=27017,
    username='root',
    password='root',
    authSource='admin',
)
database_name = 'test'
db = MongoClient(**mongodb_setting)[database_name]
table = db['test_table']

def get_some_info():
    # Time a single query against the collection.
    now = time.time()
    table.find_one({})
    print(time.time() - now)

def do_something():
    get_some_info()  # first query
    time.sleep(600)  # do something else for 10 minutes
    get_some_info()  # second query

do_something()
The second query raises pymongo.errors.AutoReconnect.
The official documentation describes it as follows:
exception pymongo.errors.AutoReconnect(message='', errors=None)
Raised when a connection to the database is lost and an attempt to auto-reconnect will be made.
In order to auto-reconnect you must handle this exception, recognizing that the operation which caused it has not necessarily succeeded. Future operations will attempt to open a new connection to the database (and will continue to raise this exception until the first successful connection is made).
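Roughly, this means pymongo will reconnect to MongoDB automatically, but we have to handle the exception ourselves.
To this day I still do not understand: if the driver reconnects automatically anyway, why do we have to handle this exception? Pointers from anyone who knows are welcome!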
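Handling the exception the way the documentation describes could be sketched as a small retry decorator. This is only an illustration: `retry_on_autoreconnect`, `max_attempts` and `delay_seconds` are hypothetical names, not part of pymongo, and the fallback class exists only so the sketch runs without pymongo installed. The quoted documentation notes that the failed operation "has not necessarily succeeded", which is presumably why the driver leaves the retry decision to the caller, so a blind retry like this is only safe for idempotent calls such as reads.

```python
import functools
import time

try:
    from pymongo.errors import AutoReconnect
except ImportError:
    # Stand-in so the sketch runs without pymongo installed (assumption).
    class AutoReconnect(Exception):
        pass


def retry_on_autoreconnect(max_attempts=3, delay_seconds=0.5):
    """Retry the wrapped call when AutoReconnect is raised (hypothetical helper)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except AutoReconnect:
                    if attempt == max_attempts:
                        raise  # give up and let the caller see the error
                    time.sleep(delay_seconds)
        return wrapper
    return decorator


# Usage idea: decorate the function that queries the collection, e.g.
#
# @retry_on_autoreconnect()
# def get_some_info():
#     return table.find_one({})
```

One decorator applied to the application's own query functions avoids wrapping `find_one()`, `insert_one()` and so on individually, although it does nothing about the wait before the first failure.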
Stepping through with a debugger shows where the exception is raised: pool.py, line 262.
def _raise_connection_failure(address, error, msg_prefix=None):
    """Convert a socket.error to ConnectionFailure and raise it."""
    host, port = address
    # If connecting to a Unix socket, port will be None.
    if port is not None:
        msg = '%s:%d: %s' % (host, port, error)
    else:
        msg = '%s: %s' % (host, error)
    if msg_prefix:
        msg = msg_prefix + msg
    if isinstance(error, socket.timeout):
        raise NetworkTimeout(msg)
    elif isinstance(error, SSLError) and 'timed out' in str(error):
        # CPython 2.6, 2.7, PyPy 2.x, and PyPy3 do not distinguish network
        # timeouts from other SSLErrors (https://bugs.python.org/issue10272).
        # Luckily, we can work around this limitation because the phrase
        # 'timed out' appears in all the timeout related SSLErrors raised
        # on the above platforms. CPython >= 3.2 and PyPy3.3 correctly raise
        # socket.timeout.
        raise NetworkTimeout(msg)
    else:
        raise AutoReconnect(msg)
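Approaches
- 1. Do exactly what the official documentation says: catch AutoReconnect and send the same request again. This is a lot of work, because it basically means wrapping every function, such as insert_one() and find_one(). (This is my own understanding; if there is a better way, please let me know. Thanks!)
- 2. A side note: when catching AutoReconnect as in approach 1, you have to sit through a 20 s wait before the exception is raised each time. For example, a find_one() call raises AutoReconnect only after 20 s; we then handle the exception and call find_one() again, which takes about 0.020 s. So a single query ends up taking more than 20 seconds, which is painful. The timeout options are documented in the initializer of class MongoClient in mongo_client.py: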
  - `connectTimeoutMS`: (integer or None) Controls how long (in
    milliseconds) the driver will wait during server monitoring when
    connecting a new socket to a server before concluding the server
    is unavailable. Defaults to ``20000`` (20 seconds).
  - `serverSelectionTimeoutMS`: (integer) Controls how long (in
    milliseconds) the driver will wait to find an available,
    appropriate server to carry out a database operation; while it is
    waiting, multiple server monitoring operations may be carried out,
    each controlled by `connectTimeoutMS`. Defaults to ``30000`` (30
    seconds).
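connectTimeoutMS defaults to 20 s. My earlier workaround was to set both connectTimeoutMS and socketTimeoutMS to 1000 ms and then handle NetworkTimeout instead of AutoReconnect. That was painful too (Windows really burned me).
- 3. In the end I went back to the socket connection itself. An AutoReconnect exception means the connection taken from the pool is already dead. If the pooled connections stayed connected to the MongoDB server, there would be no auto-reconnect exception, so the socket's keepalive (heartbeat) check must be at fault. Socket keepalive is controlled by several parameters:
TCP_KEEPIDLE : how long, in seconds, the connection may carry no data before keepalive probes are sent
TCP_KEEPINTVL : how many seconds to wait before sending another keepalive probe when there is no response
TCP_KEEPCNT : how many unanswered probes are allowed before the connection is closed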
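To make the three parameters concrete, here is a minimal standard-library sketch of turning keepalive on for a plain socket. `enable_keepalive` and its default values are illustrative assumptions, not pymongo code, and the `TCP_*` constants are only set where the platform's `socket` module exposes them.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, count=5):
    """Turn on TCP keepalive and tune it where the platform allows (sketch)."""
    # SO_KEEPALIVE switches keepalive probing on for this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = {}
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", count)):
        option = getattr(socket, name, None)
        if option is None:
            continue  # constant not exposed on this platform
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
        except OSError:
            continue  # platform refused the option; leave its default
        applied[name] = value
    return applied


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        print(enable_keepalive(s))  # which options were actually set
    finally:
        s.close()
```

With these example values, an idle connection would be probed after 60 s, then every 10 s, and dropped after 5 unanswered probes.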
Solution
The relevant code is at line 126 of pool.py in the pymongo source; the key function is _set_keepalive_times(sock).
_MAX_TCP_KEEPIDLE = 300
_MAX_TCP_KEEPINTVL = 10
_MAX_TCP_KEEPCNT = 9

if sys.platform == 'win32':
    try:
        import _winreg as winreg
    except ImportError:
        import winreg

    try:
        with winreg.OpenKey(
                winreg.HKEY_LOCAL_MACHINE,
                r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters") as key:
            _DEFAULT_TCP_IDLE_MS, _ = winreg.QueryValueEx(key, "KeepAliveTime")
            _DEFAULT_TCP_INTERVAL_MS, _ = winreg.QueryValueEx(
                key, "KeepAliveInterval")
            # Make sure these are integers.
            if not isinstance(_DEFAULT_TCP_IDLE_MS, integer_types):
                raise ValueError
            if not isinstance(_DEFAULT_TCP_INTERVAL_MS, integer_types):
                raise ValueError
    except (OSError, ValueError):
        # We could not check the default values so do not attempt to override.
        def _set_keepalive_times(dummy):
            pass
    else:
        def _set_keepalive_times(sock):
            idle_ms = min(_DEFAULT_TCP_IDLE_MS, _MAX_TCP_KEEPIDLE * 1000)
            interval_ms = min(_DEFAULT_TCP_INTERVAL_MS,
                              _MAX_TCP_KEEPINTVL * 1000)
            if (idle_ms < _DEFAULT_TCP_IDLE_MS or
                    interval_ms < _DEFAULT_TCP_INTERVAL_MS):
                sock.ioctl(socket.SIO_KEEPALIVE_VALS,
                           (1, idle_ms, interval_ms))
else:
    def _set_tcp_option(sock, tcp_option, max_value):
        if hasattr(socket, tcp_option):
            sockopt = getattr(socket, tcp_option)
            try:
                # PYTHON-1350 - NetBSD doesn't implement getsockopt for
                # TCP_KEEPIDLE and friends. Don't attempt to set the
                # values there.
                default = sock.getsockopt(socket.IPPROTO_TCP, sockopt)
                if default > max_value:
                    sock.setsockopt(socket.IPPROTO_TCP, sockopt, max_value)
            except socket.error:
                pass

    def _set_keepalive_times(sock):
        _set_tcp_option(sock, 'TCP_KEEPIDLE', _MAX_TCP_KEEPIDLE)
        _set_tcp_option(sock, 'TCP_KEEPINTVL', _MAX_TCP_KEEPINTVL)
        _set_tcp_option(sock, 'TCP_KEEPCNT', _MAX_TCP_KEEPCNT)
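_set_keepalive_times() is defined differently on Windows and on other systems. Let's look at Windows first.
1. The code first looks up two registry values, KeepAliveTime and KeepAliveInterval, under SYSTEM\CurrentControlSet\Services\Tcpip\Parameters. On my Windows 10 machine neither value exists, so the lookup fails and _set_keepalive_times is defined as a no-op (pass). With no keepalive probes, the connection goes stale after a while and AutoReconnect is raised!
2. After adding those two values, they are still compared against pymongo's own limits. Note that the registry values are in milliseconds: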
_MAX_TCP_KEEPIDLE = 300
_MAX_TCP_KEEPINTVL = 10
……
……
def _set_keepalive_times(sock):
    idle_ms = min(_DEFAULT_TCP_IDLE_MS, _MAX_TCP_KEEPIDLE * 1000)
    interval_ms = min(_DEFAULT_TCP_INTERVAL_MS,
                      _MAX_TCP_KEEPINTVL * 1000)
    if (idle_ms < _DEFAULT_TCP_IDLE_MS or
            interval_ms < _DEFAULT_TCP_INTERVAL_MS):
        sock.ioctl(socket.SIO_KEEPALIVE_VALS,
                   (1, idle_ms, interval_ms))
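sock.ioctl() is executed only when at least one of the registry values is larger than the corresponding pymongo maximum. (PS: this check really tripped me up!)
In other words, either KeepAliveInterval must be greater than 10 * 1000,
or KeepAliveTime must be greater than 300 * 1000.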
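The check can be replayed in isolation to see which registry values actually lead to the ioctl call. The sketch below copies the arithmetic from the excerpt into a pure function; `keepalive_override` is a hypothetical name used only for illustration, and it returns the pair that would be passed to `sock.ioctl()`, or `None` when the call is skipped.

```python
MAX_TCP_KEEPIDLE_S = 300   # same limits as in the pymongo excerpt
MAX_TCP_KEEPINTVL_S = 10


def keepalive_override(registry_idle_ms, registry_interval_ms):
    """Return the (idle_ms, interval_ms) given to ioctl, or None if skipped."""
    idle_ms = min(registry_idle_ms, MAX_TCP_KEEPIDLE_S * 1000)
    interval_ms = min(registry_interval_ms, MAX_TCP_KEEPINTVL_S * 1000)
    if idle_ms < registry_idle_ms or interval_ms < registry_interval_ms:
        return idle_ms, interval_ms
    return None


print(keepalive_override(60000, 20000))    # (60000, 10000)
print(keepalive_override(60000, 5000))     # None
print(keepalive_override(7200000, 1000))   # (300000, 1000)
```

The second call shows the trap: both values are present and reasonable, yet neither exceeds its limit, so keepalive is never configured on the socket.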
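Final fix
Developing on Windows has far too many pitfalls.
Steps:
1. Press Win+R, type regedit, and press Enter
2. Navigate to
Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
3. Right-click an empty area > New > QWORD
4. Name the value KeepAliveTime and set it to 60000 (decimal)
5. Name the value KeepAliveInterval and set it to 20000 (decimal)
Done!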
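To confirm the registry edit took effect, the two values can be read back the same way the pymongo excerpt reads them. This is a sketch using the standard-library `winreg` module; `read_keepalive_registry` is a hypothetical helper, and it simply returns `None` on non-Windows systems or when a value is missing.

```python
import sys

REG_PATH = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"


def read_keepalive_registry():
    """Return the two registry values as a dict, or None if unavailable."""
    if sys.platform != "win32":
        return None  # the registry only exists on Windows
    import winreg  # Windows-only standard library module
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, REG_PATH) as key:
            idle_ms, _ = winreg.QueryValueEx(key, "KeepAliveTime")
            interval_ms, _ = winreg.QueryValueEx(key, "KeepAliveInterval")
    except OSError:
        return None  # key or value missing
    return {"KeepAliveTime": idle_ms, "KeepAliveInterval": interval_ms}


print(read_keepalive_registry())
```

Because the lookup in the excerpt runs at module level, the values are presumably read once when pymongo is imported, so an already running Python process would need to be restarted to pick them up.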
Source: https://blog.csdn.net/dslkfajoaijfdoj/article/details/83717238