mysql 8.0 load data_MySQL 8.0 多线程 load data infile

在4核CPU、8GB RAM和SSD磁盘的环境中,比较了单线程与多线程(threads: 8)MySQL 8.0的`load data infile`导入8G CSV文件的性能。单线程导入耗时39分钟35.5036秒,而多线程导入仅需6分钟30.0411秒,效率提升约6倍。多线程导入涉及参数包括:threads、bytesPerChunk和maxRate,以及数据文件分析和分割过程。
摘要由CSDN通过智能技术生成

测试环境:

4 core CPU

8 GB RAM

SSD Disk

Centos 7

导入的文件load_data.csv,大约8G左右。

单线程测试:

mysql > load data infile '/usr/local/mysql/mysql-files/load_data.csv'

into table single.single_load fields terminated by ',' optionally

enclosed by '"' lines terminated by '\n'

(user_id,visible,name,size,latitude,longitude,timestamp,public,description,inserted);

Query OK, 14244516 rows affected, 0 warnings (39 min 35.5036 sec)

Records: 14244516 Deleted: 0 Skipped: 0 Warnings: 0

一共耗时 39 min 35.5036 sec。

多线程测试:

mysql shell > util.importTable("/usr/local/mysql/mysql-files/load_data.csv",

{schema: "parallel", table: "parallel_load", columns:

["user_id","visible","name","size","latitude","longitude","timestamp","public","description"

,"inserted"], dialect: "csv-unix", skipRows: 0, showProgress: true,

fieldsOptionallyEnclosed: true, fieldsTerminatedBy: ",", linesTerminatedBy:

"\n",fieldsEnclosedBy: '"',threads: 8, bytesPerChunk: "1G", maxRate: "2G"})

Importing from file '/usr/local/mysql/mysql-files/load_data.csv' to table

`parallel`.`parallel_load` in MySQL Server at /tmp%2Fmysql.sock using 8 threads

[Worker01] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0

[Worker02] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0

[Worker03] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0

[Worker04] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0

[Worker05] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0

[Worker06] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0

[Worker07] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0

[Worker08] parallel.parallel_load: Records: 1780564 Deleted: 0 Skipped: 0 Warnings: 0

100% (8.06 GB / 8.06 GB), 535.17 KB/s

File '/usr/local/mysql/mysql-files/load_data.csv' (8.06 GB) was imported in 6 min

30.0411 sec at 18.81 MB/s

Total rows affected in parallel.parallel_load: Records: 14244516 Deleted: 0 Skipped: 0

Warnings: 0

一共耗时:6 min 30.0411 sec

结果对比:

从测试结果上看,单线程耗时 39 min 35.5036 sec,多线程耗时6 min 30.0411 sec,多线程效率大约是单线程的6倍。

多线程load data infile的几个参数:

threads,指定线程数

bytesPerChunk,每个chunk的大小

maxRate,线程每秒能够处理的数据最大限制

多线程load data infile内部处理过程:

分析要导入的数据文件

将大的数据文件分割成数据块(chunk)

多线程将数据块导入到MySQL中

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值