目录
多线程 update MySQL
数据库里面有一批数据,使用 python 串行更新代码如下:
from sqlalchemy import create_engine
import time
import pandas as pd
start = time.time()
# 连接
con = create_engine('mysql+pymysql://root:123456@localhost:3306/activity_analysic?charset=utf8')
sql1 = """SELECT distinct vin FROM `activity` limit 30"""
vin = list(pd.read_sql(sql1,con)['vin'])
for i,v in enumerate(vin):
sqlupdate = """update activity set speed = 400 where vin = '%s'""" % v
con.execute(sqlupdate)
print(i,v)
end = time.time()
print(end - start)
用了 250s,开两个线程去更新数据库,用时 70s,代码如下:
from sqlalchemy import create_engine
import pandas as pd
import threading
def update_sql(i,v):
try:
con = create_engine('mysql+pymysql://root:123456@localhost:3306/activity_analysic?charset=utf8',
pool_size=100)
sqlupdate = """update activity set speed = 1000 where vin = '%s'""" % v
con.execute(sqlupdate)
print(i,v)
except:
time.sleep(3)
con = create_engine('mysql+pymysql://root:123456@localhost:3306/activity_analysic?charset=utf8',
pool_size=100)
sqlupdate = """update activity set speed = 700 where vin = '%s'""" % v
con.execute(sqlupdate)
print(i,v)
finally:
con.dispose()
# 连接数据库
con = create_engine('mysql+pymysql://root:123456@localhost:3306/activity_analysic?charset=utf8',pool_size=100)
##线程数
thread_nums = 2
pool_sema = threading.BoundedSemaphore(value=thread_nums)
threads = []
sql1 = """SELECT distinct vin FROM `activity` limit 30"""
vin = list(pd.read_sql(sql1,con)['vin'])
for i,v in enumerate(vin):
th = threading.Thread(target=update_sql, args=(i,v,))
th.start()
threads.append(th)
多线程连接使用数据库需要注意几点:
1.sqlalchemy.exc.TimeoutError: QueuePool limit of size 5 overflow 10 reached, connection timed out,出现这个错误,是因为连接池超过限制了,在连接中加 pool_size=100,最后也加一个 con.dispose() 回收连接,就解决了。
2.Lock wait timeout exceeded,这个错误是因为所有线程都在更新,前一个还没有用完,下一个就等待,超时了就报错,也就是死锁了,这里加一个等待 3 秒,多一点时间等待上一个线程执行完毕,可具体设置,这是最简单粗暴的方法。