[Python Cookbook Practice] Files and I/O

Code_LT

Article link: https://blog.csdn.net/Code_LT/article/details/108294078

Using file objects: https://docs.python.org/zh-cn/3.7/library/io.html#module-io

Key function:

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

Opens file and returns the corresponding file object. If the file cannot be opened, an OSError is raised.
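
As a minimal sketch of handling that failure, the snippet below catches OSError around open(). The helper name read_text_or_none and the path passed to it are hypothetical, chosen only for illustration.

```python
def read_text_or_none(path):
    """Return the file's text, or None if it cannot be opened (hypothetical helper)."""
    try:
        with open(path, 'rt', encoding='utf-8') as f:
            return f.read()
    except OSError as exc:
        # FileNotFoundError, PermissionError and IsADirectoryError
        # are all subclasses of OSError, so one handler covers them.
        print(f'cannot open {path!r}: {exc.strerror}')
        return None

# Assumed not to exist; any missing path behaves the same way.
result = read_text_or_none('no-such-dir/no-such-file.txt')
```

Catching OSError rather than a specific subclass is a design choice here; narrow it to FileNotFoundError when only a missing file should be tolerated.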

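newline controls how line endings are handled; its legal values are None, '', '\n', '\r', and '\r\n'.
When reading input from the stream, if newline is None, universal newlines mode is enabled: lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.
When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.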
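
The newline rules can be checked with a small experiment. This is only a sketch: the file names mixed.txt and out.txt are made up, and a temporary directory is used so nothing is left behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Write raw bytes so the file contains all three line-ending styles.
    path = os.path.join(tmp, 'mixed.txt')
    with open(path, 'wb') as f:
        f.write(b'a\r\nb\rc\n')

    # newline=None (default): every ending is translated to '\n'.
    with open(path, 'rt', encoding='ascii', newline=None) as f:
        translated = f.read()

    # newline='': endings are recognized but returned untranslated.
    with open(path, 'rt', encoding='ascii', newline='') as f:
        untranslated = f.readlines()

    # newline='\r\n': only '\r\n' terminates a line.
    with open(path, 'rt', encoding='ascii', newline='\r\n') as f:
        crlf_only = f.readlines()

    # On output, newline='\r\n' converts each written '\n' to '\r\n'.
    out_path = os.path.join(tmp, 'out.txt')
    with open(out_path, 'wt', encoding='ascii', newline='\r\n') as f:
        f.write('x\ny\n')
    with open(out_path, 'rb') as f:
        written = f.read()

print(repr(translated))   # 'a\nb\nc\n'
print(untranslated)       # ['a\r\n', 'b\r', 'c\n']
print(crlf_only)          # ['a\r\n', 'b\rc\n']
print(written)            # b'x\r\ny\r\n'
```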

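errors is an optional string that specifies how encoding and decoding errors are handled; it cannot be used in binary mode. A variety of standard error handlers are available (listed under Error Handlers in the codecs documentation), and any error handler name registered with codecs.register_error() is also valid. The standard names include:

'strict' raises a ValueError exception if there is an encoding error. The default value of None has the same effect.
'ignore' ignores errors. Note that ignoring encoding errors can lead to data loss.
'replace' inserts a replacement marker (such as '?') where there is malformed data.
'surrogateescape' represents any incorrect bytes as low surrogate code units in the range U+DC80 to U+DCFF. When the surrogateescape error handler is used while writing data, these code units are turned back into the same bytes. This is useful for processing files in an unknown encoding.
'xmlcharrefreplace' is only supported when writing to a file. Characters not supported by the encoding are replaced with the appropriate XML character reference &#nnn;.
'backslashreplace' replaces malformed data with Python's backslashed escape sequences.
'namereplace' (also only supported when writing) replaces unsupported characters with \N{...} escape sequences.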
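
To make the handlers concrete, the sketch below decodes a file holding one byte that is invalid in UTF-8 with several handlers, and then writes with surrogateescape and xmlcharrefreplace. The file names and sample bytes are assumptions made for illustration.

```python
import os
import tempfile

raw = b'caf\xe9 ok'   # 0xE9 is Latin-1 for e-acute and is not valid UTF-8 here

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'latin1.txt')
    with open(path, 'wb') as f:
        f.write(raw)

    # Decode the same bytes with different error handlers.
    decoded = {}
    for handler in ('ignore', 'replace', 'backslashreplace', 'surrogateescape'):
        with open(path, 'rt', encoding='utf-8', errors=handler) as f:
            decoded[handler] = f.read()

    # 'strict' (the default) raises UnicodeDecodeError, a ValueError subclass.
    try:
        with open(path, 'rt', encoding='utf-8') as f:
            f.read()
        strict_failed = False
    except ValueError:
        strict_failed = True

    # surrogateescape round-trips the undecodable byte unchanged.
    copy_path = os.path.join(tmp, 'copy.txt')
    with open(copy_path, 'wt', encoding='utf-8', errors='surrogateescape') as f:
        f.write(decoded['surrogateescape'])
    with open(copy_path, 'rb') as f:
        round_trip = f.read()

    # xmlcharrefreplace only applies when writing (encoding).
    xml_path = os.path.join(tmp, 'xml.txt')
    with open(xml_path, 'wt', encoding='ascii', errors='xmlcharrefreplace') as f:
        f.write('caf\u00e9')
    with open(xml_path, 'rb') as f:
        xml_bytes = f.read()

print(decoded)
print(round_trip == raw, xml_bytes)
```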

# Read the entire file as a single string
with open('somefile.txt', 'rt') as f:
    data = f.read()

# Iterate over the lines of the file
with open('somefile.txt', 'rt') as f:
    for line in f:
        # process line
        ...

# Write chunks of text data
with open('somefile.txt', 'wt') as f:
    f.write(text1)

# Redirected print statement
with open('somefile.txt', 'wt') as f:
    print(line1, file=f)

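read() and write() handle line endings according to the input and output rules above. When redirecting print() output to a file, the file must be opened in text mode; printing to a file opened in binary mode raises an error.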
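
The text-mode requirement can be demonstrated without touching the disk. In this sketch, io.StringIO and io.BytesIO stand in for files opened with 'wt' and 'wb'; that substitution is an assumption made to keep the example self-contained.

```python
import io

text_stream = io.StringIO()      # stands in for a file opened with 'wt'
binary_stream = io.BytesIO()     # stands in for a file opened with 'wb'

print('hello', file=text_stream)             # works: print() writes str

try:
    print('hello', file=binary_stream)       # fails: a bytes-like object is required
    binary_failed = False
except TypeError:
    binary_failed = True

# To target a binary stream, encode explicitly and call write() instead.
binary_stream.write('hello\n'.encode('utf-8'))
```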

1. Changing the separator and line ending of print()

>>> print('ACME', 50, 91.5)
ACME 50 91.5
>>> print('ACME', 50, 91.5, sep=',')
ACME,50,91.5
>>> print('ACME', 50, 91.5, sep=',', end='!!\n')
ACME,50,91.5!!

>>> for i in range(5):
...     print(i, end=' ')
...
0 1 2 3 4 >>>

The advantage of the sep argument over building the output with str.join() is that str.join() only works on strings, while sep has no such restriction.
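
A short comparison of the two approaches, as a sketch. The io.StringIO buffer is used only to capture what print() would emit.

```python
import io

row = ('ACME', 50, 91.5)

# str.join() rejects non-string items.
try:
    ','.join(row)
    join_failed = False
except TypeError:
    join_failed = True

# Workaround with join: convert each item to str first.
joined = ','.join(str(x) for x in row)

# print() converts each argument with str() itself, so unpacking is enough.
buf = io.StringIO()
print(*row, sep=',', file=buf)
```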

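2. The read() method of file objects

file.read([size])

Reads up to size characters from the file (bytes in binary mode). If size is omitted or negative, reads until the end of the file.

The file fileoper.txt contains: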

1:www.runoob.com
2:www.runoob.com
3:www.runoob.com
4:www.runoob.com
5:www.runoob.com

# A raw string keeps the backslashes in the Windows path from being treated as escapes
with open(r'D:\Program Data\PyCharmPython\deep_learning-master\DeepFM\script\fileoper.txt', 'rb') as f:
    print(f.read(16))

output: b'1:www.runoob.com'

with open(r'D:\Program Data\PyCharmPython\deep_learning-master\DeepFM\script\fileoper.txt', 'r') as f:
    print(f.read(16))

output: 1:www.runoob.com
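
With ASCII-only content the two modes look alike, so the following sketch uses non-ASCII text to show that size counts characters in text mode and bytes in binary mode. The file name and sample text are assumptions.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.txt')
    with open(path, 'wt', encoding='utf-8') as f:
        # Two Chinese characters (3 bytes each in UTF-8) followed by ASCII text.
        f.write('\u4f60\u597dworld')

    with open(path, 'rt', encoding='utf-8') as f:
        first_chars = f.read(2)      # two characters
        rest = f.read()              # everything that remains
        at_eof = f.read(2)           # empty string at end of file

    with open(path, 'rb') as f:
        first_bytes = f.read(2)      # two bytes: only part of the first character
```

An empty result from read() is the usual signal that the end of the file has been reached.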

3. Writing to a file only if it does not already exist

The usual approach:

>>> import os
>>> if not os.path.exists('somefile'):
...     with open('somefile', 'wt') as f:
...         f.write('Hello\n')
... else:
...     print('File already exists!')
...
File already exists!
>>>

An alternative available since Python 3.3, using the 'x' mode:

>>> with open('somefile', 'wt') as f:
...     f.write('Hello\n')
...
6
>>> with open('somefile', 'xt') as f:
...     f.write('Hello\n')
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileExistsError: [Errno 17] File exists: 'somefile'
>>>
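
In a script, the 'x' mode is typically paired with an exception handler. The sketch below wraps that in a hypothetical helper, write_if_absent; the name and return convention are illustrative only.

```python
import os
import tempfile

def write_if_absent(path, text):
    """Create path with text; return False if it already exists (hypothetical helper)."""
    try:
        with open(path, 'xt', encoding='utf-8') as f:
            f.write(text)
        return True
    except FileExistsError:
        return False

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'somefile')
    first = write_if_absent(target, 'Hello\n')    # file is created
    second = write_if_absent(target, 'Bye\n')     # file already exists, left untouched
    with open(target, 'rt', encoding='utf-8') as f:
        content = f.read()
```

Compared with checking os.path.exists() first, this leaves no window between the check and the open in which another process could create the file.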

4. Simulating file I/O in memory

Use io.StringIO() and io.BytesIO().

>>> import io
>>> s = io.StringIO()
>>> s.write('Hello World\n')
12
>>> print('This is a test', file=s)
>>> # Get all of the data written so far
>>> s.getvalue()
'Hello World\nThis is a test\n'
>>>
>>> # Wrap a file interface around an existing string
>>> s = io.StringIO('Hello\nWorld\n')
>>> s.read(4)
'Hell'
>>> s.read()
'o\nWorld\n'
>>>

>>> s = io.BytesIO()
>>> s.write(b'binary data')
11
>>> s.getvalue()
b'binary data'
>>>
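
A common use for these classes is feeding test data to code that expects a file. The sketch below assumes a hypothetical function, count_nonblank_lines, that accepts any text file-like object.

```python
import io

def count_nonblank_lines(f):
    """Count lines with non-whitespace content in a text file-like object (hypothetical)."""
    return sum(1 for line in f if line.strip())

# The StringIO object takes the place of an open text file.
fake_file = io.StringIO('alpha\n\nbeta\n  \ngamma\n')
n = count_nonblank_lines(fake_file)

# seek(0) rewinds the in-memory stream just like a real file.
fake_file.seek(0)
first_line = fake_file.readline()
```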

5. Reading and writing .gz and .bz2 compressed files

# Reading
# gzip compression
import gzip
with gzip.open('somefile.gz', 'rt') as f:
    text = f.read()

# bz2 compression
import bz2
with bz2.open('somefile.bz2', 'rt') as f:
    text = f.read()

# Writing
# gzip compression
import gzip
with gzip.open('somefile.gz', 'wt') as f:
    f.write(text)

# bz2 compression
import bz2
with bz2.open('somefile.bz2', 'wt') as f:
    f.write(text)

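In text mode, gzip.open() and bz2.open() accept the same encoding, errors, and newline parameters as the built-in open(). Both can also be layered on top of an existing file object opened in binary mode, which lets them work with many kinds of file-like objects.

import gzip

# Layer gzip on top of a file that is already open in binary mode
f = open('somefile.gz', 'rb')
with gzip.open(f, 'rt') as g:
    text = g.read()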
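
Because the first argument may be a binary file object, the same layering works on an in-memory stream. This is a sketch: the sample text and the compresslevel value are arbitrary choices.

```python
import gzip
import io

text = 'Hello World\n' * 3

# Compress into an in-memory binary stream instead of a file on disk.
raw = io.BytesIO()
with gzip.open(raw, 'wt', encoding='utf-8', compresslevel=5) as g:
    g.write(text)

# Rewind and decompress from the same stream.
raw.seek(0)
with gzip.open(raw, 'rt', encoding='utf-8') as g:
    restored = g.read()

print(restored == text)
```

Closing the gzip wrapper does not close the underlying stream, which is why raw can be rewound and reused.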

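6. Iterating over fixed-size records

from functools import partial

RECORD_SIZE = 32
with open('somefile.data', 'rb') as f:
    # Call f.read(RECORD_SIZE) repeatedly until it returns b'' (end of file)
    records = iter(partial(f.read, RECORD_SIZE), b'')
    for r in records:
        ...

This uses partial() to create a callable object and the two-argument form of iter() in place of a while loop.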
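
For comparison, here are the explicit loop and the iter() form side by side. This sketch uses io.BytesIO and a small RECORD_SIZE in place of a real data file; both are assumptions made for the demonstration.

```python
import io
from functools import partial

RECORD_SIZE = 4
data = b'aaaabbbbcc'          # two full records plus a short trailing one

# Explicit while loop.
f = io.BytesIO(data)
loop_records = []
while True:
    chunk = f.read(RECORD_SIZE)
    if chunk == b'':          # read() returns b'' at end of stream
        break
    loop_records.append(chunk)

# Same iteration with iter(callable, sentinel).
f = io.BytesIO(data)
iter_records = list(iter(partial(f.read, RECORD_SIZE), b''))

print(iter_records)           # [b'aaaa', b'bbbb', b'cc']
```

Note that the last record may be shorter than RECORD_SIZE when the data length is not an exact multiple.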

7. Reading into a fixed-size buffer with readinto()

import os.path
def read_into_buffer(filename):
    buf = bytearray(os.path.getsize(filename))
    with open(filename, 'rb') as f:
        f.readinto(buf)
    return buf

>>> # Write a sample file
>>> with open('sample.bin', 'wb') as f:
...     f.write(b'Hello World')
...
11
>>> buf = read_into_buffer('sample.bin')
>>> buf
bytearray(b'Hello World')
>>> buf[0:5] = b'Hallo'
>>> buf
bytearray(b'Hallo World')
>>> with open('newsample.bin', 'wb') as f:
...     f.write(buf)
...
11
>>>

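readinto() fills an existing buffer in place, reading up to len(buf) bytes per call and overwriting the previous contents. Unlike read(size), it does not allocate a new object of that size on every call.

Note: always check the number of bytes readinto() actually read; that value tells you when to stop.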

record_size = 32 # Size of each record (adjust value)
buf = bytearray(record_size)
with open('somefile', 'rb') as f:
    while True:
        n = f.readinto(buf)
        if n < record_size:
            break
        # Use the contents of buf
        ...

Similar "into" functions include recv_into() and pack_into().
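
The sketch below reuses one buffer across reads and also shows struct.pack_into(). It differs from the loop above in that it stops only when readinto() returns 0, so a short trailing record is kept rather than dropped. The in-memory source, the record size, and the '<HI' format are assumptions.

```python
import io
import struct

RECORD_SIZE = 4
source = io.BytesIO(b'aaaabbbbcc')     # stands in for a file opened with 'rb'

buf = bytearray(RECORD_SIZE)           # allocated once, reused for every read
view = memoryview(buf)                 # slicing a memoryview does not copy
records = []
while True:
    n = source.readinto(buf)
    if n == 0:                         # end of stream
        break
    records.append(bytes(view[:n]))    # copy only the bytes actually read

# struct.pack_into() writes packed values into an existing buffer at an offset.
# '<HI' is a little-endian unsigned short followed by an unsigned int (6 bytes).
packed = bytearray(8)
struct.pack_into('<HI', packed, 2, 1, 2)

print(records)                         # [b'aaaa', b'bbbb', b'cc']
print(packed)
```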
