numpy loadtxt function seems to consume too much memory

Latest recommended article published on 2024-04-28 00:24:55

weixin_39539563


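When I load an array using numpy.loadtxt, it seems to take too much memory. For example,

    a = numpy.zeros(int(1e6))

increases memory usage by about 8 MB (observed with htop, or simply computed as 8 bytes * 1 million ≈ 8 MB). On the other hand, if I save and then load this array

    numpy.savetxt('a.csv', a)
    b = numpy.loadtxt('a.csv')

my memory usage increases by about 100 MB! Again, I observed this with htop. I see it both in the IPython shell and when stepping through code with Pdb++.

Any idea what is going on here?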
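To put numbers on the observation above without watching htop, here is a minimal sketch using the standard-library tracemalloc module. The function name measure_loadtxt_peak and the temporary file location are assumptions made for illustration. Note that tracemalloc reports memory traced through Python's allocator hooks, which is not the same quantity as the process size htop shows, and the peak will vary with the NumPy version.

```python
import os
import tempfile
import tracemalloc

import numpy


def measure_loadtxt_peak(n):
    """Round-trip an n-element float array through a text file.

    Returns (array_bytes, file_bytes, peak_traced_bytes), where the peak
    is measured only around the numpy.loadtxt call.
    """
    a = numpy.zeros(n)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "a.csv")  # hypothetical scratch file
        numpy.savetxt(path, a)
        file_bytes = os.path.getsize(path)

        tracemalloc.start()
        b = numpy.loadtxt(path)
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()

    # The reloaded array should match the original in shape.
    assert b.shape == a.shape
    return a.nbytes, file_bytes, peak
```

Calling measure_loadtxt_peak(int(1e6)) and printing the three values lets you compare the 8 MB array, the size of the text file, and the traced peak during loading side by side.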

After reading jozzas's answer, I realized that if I know the array size ahead of time, there is a much more memory-efficient way to do things. Say 'a' is an m x n array:

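    import csv
    import numpy

    # Preallocate the result, then fill it one row at a time.
    b = numpy.zeros((m, n))
    with open('a.csv', 'r') as f:
        # numpy.savetxt separates columns with a space by default
        reader = csv.reader(f, delimiter=' ')
        for i, row in enumerate(reader):
            b[i, :] = numpy.array(row, dtype=float)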

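Solution

Saving this array of floats to a text file creates a 24M text file. When you reload it, numpy goes through the file line by line, parsing the text and recreating the objects.

I would expect memory usage to spike during this time, because numpy doesn't know how big the resulting array needs to be until it reaches the end of the file. So I'd expect at least 24M + 8M + whatever other temporary memory is used.

Here is the relevant part of the numpy code, from /lib/npyio.py: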

    # Parse each line, including the first
    for i, line in enumerate(itertools.chain([first_line], fh)):
        vals = split_line(line)
        if len(vals) == 0:
            continue
        if usecols:
            vals = [vals[i] for i in usecols]
        # Convert each value according to its column and store
        items = [conv(val) for (conv, val) in zip(converters, vals)]
        # Then pack it according to the dtype's nesting
        items = pack_items(items, packing)
        X.append(items)

    #...A bit further on
    X = np.array(X, dtype)
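The excerpt appends every parsed row to a Python list and only converts to an array at the end, so the intermediate list can be much larger than the final array. Here is a rough sketch of that overhead, assuming each row is held as a small Python list of float objects. The function name intermediate_vs_final_bytes is made up for illustration, and the exact byte counts depend on the Python build.

```python
import sys

import numpy


def intermediate_vs_final_bytes(n_rows, n_cols=1):
    """Compare a list-of-lists of Python floats with the packed array.

    Returns (list_bytes, array_bytes). list_bytes is an estimate built
    from sys.getsizeof: the outer list, each row list, and each float.
    """
    # Stand-in for the X list built by the parsing loop (assumption:
    # one small list of float objects per parsed row).
    rows = [[float(j) for j in range(n_cols)] for _ in range(n_rows)]

    list_bytes = sys.getsizeof(rows)
    for row in rows:
        list_bytes += sys.getsizeof(row)
        list_bytes += sum(sys.getsizeof(value) for value in row)

    # The final array stores 8 bytes per float64 element, nothing more
    # per element.
    final = numpy.array(rows, dtype=float)
    return list_bytes, final.nbytes
```

On a typical 64-bit CPython a float object alone is larger than the 8 bytes it occupies inside the array, before counting the list that holds it, which is in line with the spike well above 8 MB reported in the question.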

This additional memory usage shouldn't be a concern, as this is just the way Python works: while your Python process appears to be using 100M of memory, internally it keeps track of which objects are no longer in use and will reuse that memory. For example, if you were to re-run this save-load procedure within the same program (save, load, save, load), your memory usage would not increase to 200M.
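One way to check the save, load, save, load claim is to record traced memory after each round. This is only a sketch: the function name traced_after_each_round is hypothetical, and tracemalloc counts live traced allocations rather than the resident size htop displays, so it shows that nothing accumulates at the Python level, not what the operating system reports.

```python
import os
import tempfile
import tracemalloc

import numpy


def traced_after_each_round(n, rounds=2):
    """Save and reload an n-element array `rounds` times.

    Returns a list with the traced memory (in bytes) still allocated
    after each load.
    """
    readings = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "a.csv")  # hypothetical scratch file
        tracemalloc.start()
        a = numpy.zeros(n)
        for _ in range(rounds):
            numpy.savetxt(path, a)
            # Rebinding `a` drops the last reference to the previous array,
            # so its memory can be released and reused.
            a = numpy.loadtxt(path)
            current, _ = tracemalloc.get_traced_memory()
            readings.append(current)
        tracemalloc.stop()
    return readings
```

If memory were not being reused, the second reading would exceed the first by roughly the size of one more array. In practice the readings should stay close to each other.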
