With a small 10-line test file I tried two approaches: parse the whole thing and select the last N lines, versus load all the lines but only parse the last N:
In [1025]: timeit np.genfromtxt('stack38704949.txt',delimiter=',')[-5:]
1000 loops, best of 3: 741 µs per loop
In [1026]: %%timeit
     ...: with open('stack38704949.txt','rb') as f:
     ...:     lines = f.readlines()
     ...: np.genfromtxt(lines[-5:],delimiter=',')
1000 loops, best of 3: 378 µs per loop
This was tagged as a duplicate of Efficiently Read last 'n' rows of CSV into DataFrame. The accepted answer there used
from collections import deque
and collected the last N lines in that structure.
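The deque idea can be sketched like this. This is a self-contained stand-in, not the accepted answer's exact code: an in-memory `StringIO` replaces the file on disk, and the data values are made up.

```python
from collections import deque
import io

import numpy as np

# Ten CSV rows as a stand-in for the file on disk (hypothetical data):
# row i holds the values i*10, i*10+1, i*10+2.
data = '\n'.join(','.join(str(i * 10 + j) for j in range(3)) for i in range(10))

# deque(maxlen=N) streams the file line by line and discards all but the
# final N lines, so memory stays O(N) no matter how large the file is.
with io.StringIO(data) as f:
    last5 = deque(f, maxlen=5)

# genfromtxt accepts any iterable of lines, so the deque can be parsed directly.
arr = np.genfromtxt(last5, delimiter=',')  # shape (5, 3)
```

Unlike `readlines`, this never materializes the whole file in memory, which is where the advantage for large files comes from.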

This post compares two ways of reading the last N rows of a CSV file in Python: parsing the whole file and then selecting, versus parsing only the last N lines. The results show that the `deque` approach combined with `genfromtxt` takes about the same time as `readlines` plus slicing; for large files, the deque may have the edge because it uses less memory. Reading the tail of the file via `skip_header` is slower because it requires two passes over the file.
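The two-pass `skip_header` variant mentioned above might look like the following sketch (again with an in-memory `StringIO` and made-up data standing in for the file on disk):

```python
import io

import numpy as np

# Same hypothetical data: ten rows, three columns.
data = '\n'.join(','.join(str(i * 10 + j) for j in range(3)) for i in range(10))

# First pass: count the rows.
with io.StringIO(data) as f:
    nrows = sum(1 for _ in f)

# Second pass: tell genfromtxt to skip everything but the last 5 rows.
with io.StringIO(data) as f:
    arr = np.genfromtxt(f, delimiter=',', skip_header=nrows - 5)
```

The two passes are why this approach loses to both `readlines` slicing and the deque on timing.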