seek在python中的意思_f.seek()在Python中的复杂性

Does f.seek(500000,0) go through all the first 499999 characters of the file before getting to the 500000th?

In other words, is f.seek(n,0) of order O(n) or O(1)?

解决方案

You need to be a bit more specific on what type of object f is.

If f is a normal io module object for a file stored on disk, you have to determine if you are dealing with:

The raw binary file object

A buffer object, wrapping the raw binary file

A TextIO object, wrapping the buffer

An in-memory BytesIO or TextIO object

The first option just uses the lseek system call to reposition the file descriptor position. If this call is O(1) depends on the OS and what kind of file system you have. For a Linux system with ext4 filesystem, lseek is O(1).

Buffers just clear the buffer if your seek target is outside of the current buffered region and read in new buffer data. That's O(1) too, but the fixed cost is higher.

For text files, things are more complicated as variable-byte-length codecs and line-ending translation mean you can't always map the binary stream position to a text position without scanning from the start. The implementation doesn't allow for non-zero current-position- or end-relative seeks, and does it's best to minimise how much data is read for absolute seeks. Internal state shared with the text decoder tracks a recent 'safe point' to seek back to and read forward to the desired position. Worst-case this is O(n).

The in-memory file objects are just long, addressable arrays really. Seeking is O(1) because you can just alter the current position pointer value.

There are legion other file-like objects that may or may not support seeking. How they handle seeking is implementation dependent.

The zipfile module supports seeking on zip files opened in read-only mode, and seeking to a point that lies before the data section covered by the current buffer requires a full re-read and decompression of the data up to the desired point, seeking after requires reading from the current position until you reach the new. The gzip, lzma and bz2 modules all use the same shared implementation, that also starts reading from the start if you seek to a point before the current read position (and there's no larger buffer to avoid this).

The chunk module allows seeking within the chunk boundaries and delegates to the underlying object. This is an O(1) operation if the underlying file seek operation is O(1).

Etc. So, it depends.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值