I see plenty of examples of importing a single CSV into a PostgreSQL db, but what I need is an efficient way to import 500,000 CSVs into a single PostgreSQL db. Each CSV is a bit over 500KB (so a grand total of approximately 272GB of data).
The CSVs are identically formatted and there are no duplicate records (the data was generated programmatically from a raw data source). I have been searching and will continue to search online for options, but I would appreciate any direction on getting this done in the most efficient manner possible. I do have some experience with Python, but will dig into any other solution that seems appropriate.
Thanks!
Solution
If you start by reading the PostgreSQL guide "Populating a Database" you'll see several pieces of advice:
Load the data in a single transaction.
Use COPY if at all possible.
Remove indexes, foreign key constraints, etc. before loading the data, and restore them afterwards.
PostgreSQL's COPY statement already supports the CSV format:
COPY table (column1, column2, ...) FROM '/path/to/data.csv' WITH (FORMAT CSV)
so it looks as if you are best off not using Python at all, or using Python only to generate the required sequence of COPY statements. (Note that server-side COPY ... FROM reads files on the database server's filesystem and requires appropriate privileges; from a client machine you can use psql's \copy instead, which reads files locally.)
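As a minimal sketch of that generator approach: the script below globs a directory of CSVs and emits one SQL script that loads them all inside a single transaction, per the guide's advice. The directory, table, and column names here are placeholders, not from the original question.

```python
import glob
import os

def build_load_script(csv_dir, table, columns):
    """Emit one SQL script that COPYs every CSV in csv_dir
    into `table` inside a single transaction."""
    cols = ", ".join(columns)
    lines = ["BEGIN;"]
    for path in sorted(glob.glob(os.path.join(csv_dir, "*.csv"))):
        # Double single quotes so the path is a valid SQL string literal.
        safe = path.replace("'", "''")
        lines.append(f"COPY {table} ({cols}) FROM '{safe}' WITH (FORMAT CSV);")
    lines.append("COMMIT;")
    return "\n".join(lines)

if __name__ == "__main__":
    # Hypothetical paths/names; write the result to a file and
    # run it with e.g.: psql -d mydb -f load.sql
    print(build_load_script("/data/csvs", "mytable", ["column1", "column2"]))
```

Since COPY ... FROM runs on the server, the generated script must be executed where the database can read those paths; otherwise emit \copy lines and feed the script to psql from the client.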