Reading database data with multiple threads in Python: how to run multithreaded SQL queries in Python so as to get the results of all the queries...

Is there a way to use threads to perform the SQL queries simultaneously so I can cut down on the processing time of my code below? Is there a better method to achieve the same result as below without using the pandas module? Given the size of the data sets I am working with, I cannot store the entire dataset in memory, and I have found that looping over the rows of a SELECT * FROM statement and comparing them against the list I am querying with adds to the processing time.

# DATABASE layout
# _____________________________________________________
# | id          | name               | description     |
# |_____________|____________________|_________________|
# | 1           | John               | Credit Analyst  |
# | 2           | Jane               | Doctor          |
# | ...         | ...                | ...             |
# | 5000000     | Mohammed           | Dentist         |
# |_____________|____________________|_________________|

import sqlite3

SEARCH_IDS = [x for x in range(15000)]
DATABASE_NAME = 'db.db'

def chunks(wholeList, chunkSize=999):
    """Yield successive n-sized chunks from wholeList."""
    for i in range(0, len(wholeList), chunkSize):
        yield wholeList[i:i + chunkSize]

def search_database_for_matches(listOfIdsToMatch):
    '''Takes a list of ids and returns the rows'''
    conn = sqlite3.connect(DATABASE_NAME)
    cursor = conn.cursor()
    sql = "SELECT id, name, description FROM datatable WHERE id IN ({})".format(', '.join(["?" for x in listOfIdsToMatch]))
    cursor.execute(sql, tuple(listOfIdsToMatch))
    rows = cursor.fetchall()
    return rows

def arrange(orderOnList, listToBeOrdered, defaultReturnValue='N/A'):
    '''Takes a list of ids in the desired order and a list of tuples which have ids as the first items.
    The list of tuples is arranged into a new list corresponding to the order of the source list.'''
    from collections import OrderedDict
    resultList = [defaultReturnValue for x in orderOnList]
    indexLookUp = OrderedDict([(value, key) for key, value in enumerate(orderOnList)])
    for item in listToBeOrdered:
        resultList[indexLookUp[item[0]]] = item
    return resultList

def main():
    results = []
    for chunk in chunks(SEARCH_IDS, 999):
        results += search_database_for_matches(chunk)
    results = arrange(SEARCH_IDS, results)
    print(results)

if __name__ == '__main__':
    main()

Solution

Some advice:

Instead of reading the records in chunks using an iterator, you ought to use pagination.

See this question:
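As a rough illustration, here is a minimal sketch of keyset pagination against the same table, assuming the db.db / datatable schema from the question; instead of re-scanning the table, each query resumes after the last id seen on the previous page.

import sqlite3

def paginated_rows(pageSize=1000):
    '''Yield rows from datatable in id order, one page at a time.'''
    conn = sqlite3.connect('db.db')
    cursor = conn.cursor()
    lastId = 0
    while True:
        cursor.execute(
            "SELECT id, name, description FROM datatable "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (lastId, pageSize))
        rows = cursor.fetchall()
        if not rows:
            break
        yield rows
        lastId = rows[-1][0]  # resume after the last id of this page
    conn.close()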

If you're using multithreading/multiprocessing, make sure your database can support it.

See: SQLite And Multiple Threads
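For what it's worth, since search_database_for_matches() above opens a fresh connection on every call, no sqlite3 connection is ever shared between threads, so it can be handed to a thread pool as-is. One possible threaded variant with concurrent.futures (the worker count of 4 is an arbitrary choice):

from concurrent.futures import ThreadPoolExecutor

def main_threaded():
    results = []
    # Each worker runs search_database_for_matches, which opens
    # its own sqlite3 connection, so nothing is shared across threads.
    with ThreadPoolExecutor(max_workers=4) as executor:
        for rows in executor.map(search_database_for_matches,
                                 chunks(SEARCH_IDS, 999)):
            results += rows
    return arrange(SEARCH_IDS, results)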

To implement what you want, you can use a pool of workers which work on each chunk. See Using a pool of workers in the Python documentation.

Example:

import multiprocessing

with multiprocessing.Pool(processes=4) as pool:
    results = pool.map(search_database_for_matches,
                       [chunk for chunk in chunks(SEARCH_IDS, 999)])
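Note that pool.map returns one list of rows per chunk; assuming you want the same final shape as the question's main(), you still need to flatten the nested lists and re-order them with arrange():

flat = [row for rows in results for row in rows]
ordered = arrange(SEARCH_IDS, flat)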
