python输入数据至mysql效率提升,使用python加快将大型数据集从txt文件插入到mySQL的速度...

本文探讨了如何优化Python脚本以提高将大量文本文件数据插入MySQL数据库的效率。主要提出了两种策略:一是使用批量插入多行数据,减少数据库交互次数;二是利用`LOAD DATA INFILE`命令,快速导入数据,显著提升导入速度。
摘要由CSDN通过智能技术生成

background: I have 500 formatted *.txt files that I need to insert into a mysql database. Currently I have a python script to read the files line by line and insert into mySQL database.

Problem: the files are quite big (~100M per txt file), I tested the script and it takes too long to insert just one file to database.

How can I speed up the process by modifying the scripts?

code:

for file in os.listdir(INPUTFILEPATH):

## index += 1

## print "processing %s out of %s files " % (index, totalfiles)

inputfilename = INPUTFILEPATH + "/" + file

open_file = open(inputfilename, 'r')

contents = open_file.readlines()

totalLines = len(contents)

## index2 = 0

for i in range(totalLines):

## index2 +=1

## print "processing %s out of %s lines " % (index2, totalLines)

lineString = contents[i]

lineString = lineString.rstrip('\n')

values = lineString.split('\t')

if ( len(re.findall(r'[0123456789_\'\.]',values[0])) > 0 ):

continue

message = """INSERT INTO %s(word,year,count,volume)VALUES('%s','%s','%s','%s')"""% ('1gram', values[0],values[1],values[2],values[3])

cursor.execute(message)

db.commit()

cursor.close()

db.close()

解决方案

Two options to consider:

1) the easiest is to include multiple rows of values on one insert. This is way, way faster than doing multiple indserts.

Insetad of doing INSERT INTO tbl ( cols ) VALUES ( vals ), do something like INSERT INTO tbl ( cols ) VALUES ( vals ), ( vals ), ( vals )

The amount of rows you can insert at once depends on the maximum packet size of the mysql server, but you can probably do 100, 1000, maybe 10000 rows safely and it should give you a performance increase of an order of magnitude or more.

2) LOAD DATA INFILE is a bit different, requires more work and has its own requirements, but is very very fast.

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值