An efficient way to import a large number of CSV files into a PostgreSQL database

This post discusses how to efficiently import 500,000 identically formatted CSV files, roughly 272GB of data in total, into a PostgreSQL database. The short answer: load in a single transaction, use COPY, and drop indexes before the load and restore them afterwards. PostgreSQL's COPY command can read CSV-formatted data directly.

I see plenty of examples of importing a CSV into a PostgreSQL db, but what I need is an efficient way to import 500,000 CSVs into a single PostgreSQL db. Each CSV is a bit over 500KB (so a grand total of approx 272GB of data).

The CSVs are identically formatted and there are no duplicate records (the data was generated programmatically from a raw data source). I have been searching and will continue to search online for options, but I would appreciate any direction on getting this done in the most efficient manner possible. I do have some experience with Python, but will dig into any other solution that seems appropriate.

Thanks!

Solution

If you start by reading the PostgreSQL guide "Populating a Database" you'll see several pieces of advice:

Load the data in a single transaction.

Use COPY if at all possible.

Remove indexes, foreign key constraints, etc. before loading the data and restore them afterwards (all three steps are combined in the sketch below).
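
Put together, the load loop might look like the following. This is a minimal Python sketch using psycopg2; the table measurements, its columns, the index measurements_ts_idx, and the /path/to/csvs directory are all hypothetical placeholders, so adjust them to your actual schema:

import glob
import psycopg2

# A minimal sketch: hypothetical table "measurements" with columns
# (column1, column2), hypothetical index "measurements_ts_idx", and the
# CSVs collected under /path/to/csvs. Adjust everything to your schema.
conn = psycopg2.connect("dbname=mydb")
with conn:  # one transaction: commits on success, rolls back on error
    with conn.cursor() as cur:
        # Drop the index so each COPY is a plain append ...
        cur.execute("DROP INDEX IF EXISTS measurements_ts_idx")
        for path in sorted(glob.glob("/path/to/csvs/*.csv")):
            with open(path) as f:
                # Stream the file from the client to the server's COPY
                cur.copy_expert(
                    "COPY measurements (column1, column2) FROM STDIN WITH (FORMAT CSV)",
                    f,
                )
        # ... and rebuild it once, after all 500,000 files are in.
        cur.execute("CREATE INDEX measurements_ts_idx ON measurements (column1)")
conn.close()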

PostgreSQL's COPY statement already supports the CSV format:

COPY table (column1, column2, ...) FROM '/path/to/data.csv' WITH (FORMAT CSV);

so it looks as if you are best off not using Python at all, or using Python only to generate the required sequence of COPY statements.
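
A sketch of that second route: use Python only to emit the SQL, then pipe it into psql (e.g. python gen_copy.py | psql mydb). Again, the table name, column names, and /path/to/csvs directory are assumptions:

import glob

# Emit one COPY statement per file, wrapped in a single transaction.
# Table/column names and the CSV directory are placeholders.
print("BEGIN;")
for path in sorted(glob.glob("/path/to/csvs/*.csv")):
    print(f"COPY measurements (column1, column2) FROM '{path}' WITH (FORMAT CSV);")
print("COMMIT;")

One caveat: server-side COPY ... FROM 'file' reads the file as the database server process, so the paths must be visible and readable on the server machine. If the files live on the client, psql's \copy meta-command, which streams data from the client, is the drop-in alternative.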
