A first pass at Python's open() and file objects (casual notes)

open() parameters

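These notes follow the official documentation. There is a lot in it that I have not understood yet, so for now I am writing down the parts I do understand; the official documentation remains the place to go back to.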

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

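1. file

file can be the path of a file (relative or absolute) or an integer file descriptor. If a file descriptor is given, it is closed when the file object is closed, unless closefd is set to False.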

*file* is a path-like object giving the pathname (absolute or relative to the current working directory) of the file to be opened or an integer file descriptor of the file to be wrapped. (If a file descriptor is given, it is closed when the returned I/O object is closed, unless *closefd* is set to `False`.)
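To make the two kinds of file argument concrete, here is a small sketch. The file name demo.txt and the temporary directory are just placeholders for illustration; the point is that a path and a descriptor from os.open() both work, and that the descriptor is closed together with the file object by default.

```python
import os
import tempfile
from pathlib import Path

# Work in a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "demo.txt"  # hypothetical file name

    # 1) A path-like object (a str or a pathlib.Path both work).
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello\n")

    # 2) An integer file descriptor obtained from os.open().
    fd = os.open(path, os.O_RDONLY)
    with open(fd, encoding="utf-8") as f:
        text_from_fd = f.read()

    # closefd defaults to True, so closing f also closed fd;
    # using the descriptor afterwards raises OSError.
    try:
        os.fstat(fd)
        fd_still_open = True
    except OSError:
        fd_still_open = False

print(repr(text_from_fd), fd_still_open)
```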

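2. mode

Specifies how the file is opened. The mode string has two parts: the access mode (r, w, x, a) and the content mode (t, b).

Access modes: 'r' read only, 'w' write only (an existing file is truncated and overwritten), 'x' create a new file (raises an exception if the file already exists), 'a' append.

Content modes: 't' text (str) mode, 'b' binary mode.

The default access mode is 'r' and the default content mode is 't'.

A '+' can be added to the mode to open the file for both reading and writing. For example, 'r+' opens an existing file in text mode for reading and also allows writing to it, without truncating it first.

'U' is the old universal newlines mode. The official documentation marks it as deprecated and advises against using it; in Python 3 newline handling is controlled by the newline parameter instead.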
| Character | Meaning |
| --- | --- |
| 'r' | open for reading (default) |
| 'w' | open for writing, truncating the file first |
| 'x' | open for exclusive creation, failing if the file already exists |
| 'a' | open for writing, appending to the end of the file if it exists |
| 'b' | binary mode |
| 't' | text mode (default) |
| '+' | open a disk file for updating (reading and writing) |
| 'U' | universal newlines mode (deprecated) |
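The mode characters are easier to remember after seeing them side by side. The following is only a sketch in a temporary directory (modes.txt is a made-up file name); it walks through 'x', 'a', 'r+', 'w' and 'rb' on the same file.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes.txt")  # hypothetical file name

    # 'x' creates the file and fails if it already exists.
    with open(path, "x", encoding="utf-8") as f:
        f.write("first\n")
    try:
        open(path, "x")
        second_create_failed = False
    except FileExistsError:
        second_create_failed = True

    # 'a' appends to the end instead of truncating.
    with open(path, "a", encoding="utf-8") as f:
        f.write("second\n")

    # 'r+' opens an existing file for reading and writing,
    # starting at the beginning and without truncating.
    with open(path, "r+", encoding="utf-8") as f:
        before = f.read()
        f.write("third\n")  # the position is at the end after read()

    # 'w' truncates the file before writing.
    with open(path, "w", encoding="utf-8") as f:
        f.write("only\n")
    with open(path, encoding="utf-8") as f:
        after_w = f.read()

    # Adding 'b' returns bytes instead of str.
    with open(path, "rb") as f:
        raw = f.read()

print(second_create_failed, repr(before), repr(after_w), type(raw).__name__)
```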

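3. buffering

Sets the buffering policy. The argument is an integer.

| buffering | Meaning |
| --- | --- |
| 0 | buffering off (binary mode only) |
| 1 | line buffering (text mode only); binary mode does not support line buffering and uses the default buffer size instead |
| >1 | use a fixed-size buffer of roughly that many bytes |
| -1 | default: the buffer size is chosen from the underlying device's block size when that can be determined, otherwise io.DEFAULT_BUFFER_SIZE; interactive text files (isatty() returns True) use line buffering |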
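One way to see what the buffering values do is to look at the type of object open() returns. This is a rough sketch only; buffered.bin is a placeholder name, and the checks reflect CPython's io module.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "buffered.bin")  # hypothetical file name

    # buffering=0 in binary mode returns the raw, unbuffered FileIO object.
    with open(path, "wb", buffering=0) as f:
        unbuffered_type = type(f)
        f.write(b"abc")

    # The default (-1) wraps the raw file in a buffered reader.
    with open(path, "rb") as f:
        buffered_type = type(f)

    # buffering=0 is rejected in text mode.
    try:
        open(path, "r", buffering=0)
        text_unbuffered_allowed = True
    except ValueError:
        text_unbuffered_allowed = False

    # buffering=1 in text mode turns on line buffering.
    with open(path, "w", buffering=1, encoding="utf-8") as f:
        line_buffered = f.line_buffering

print(unbuffered_type.__name__, buffered_type.__name__,
      text_unbuffered_allowed, line_buffered)
```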

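4. encoding

Sets the encoding used to decode and encode the file; text mode only. When it is None, the default is platform dependent and comes from the locale (locale.getpreferredencoding(False)). That is usually UTF-8 on Linux/Unix and GBK (cp936) on Windows with a Simplified Chinese locale.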
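Because the default depends on the platform, passing encoding explicitly is the safer habit. A minimal sketch, with greeting.txt as a made-up file name, that writes and reads UTF-8 and shows that the same text has different bytes in GBK:

```python
import locale
import os
import tempfile

# The encoding open() falls back to when encoding is None (platform dependent).
default_encoding = locale.getpreferredencoding(False)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "greeting.txt")  # hypothetical file name

    # Write with an explicit encoding...
    with open(path, "w", encoding="utf-8") as f:
        f.write("你好")

    # ...and read back with the same one.
    with open(path, encoding="utf-8") as f:
        text = f.read()

    # Binary mode shows the bytes that actually ended up on disk.
    with open(path, "rb") as f:
        utf8_bytes = f.read()

# The same characters encoded as GBK give a different byte sequence.
gbk_bytes = "你好".encode("gbk")

print(default_encoding, text, utf8_bytes, gbk_bytes)
```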

5. errors

Text mode only. Specifies how decoding and encoding errors are handled. The default is None, which behaves like 'strict' and raises an exception.

| errors | Meaning |
| --- | --- |
| 'strict' | to raise a ValueError exception if there is an encoding error. The default value of None has the same effect. |
| 'ignore' | ignores errors. Note that ignoring encoding errors can lead to data loss. |
| 'replace' | causes a replacement marker (such as '?') to be inserted where there is malformed data. |
| 'surrogateescape' | will represent any incorrect bytes as low surrogate code units ranging from U+DC80 to U+DCFF. These surrogate code units will then be turned back into the same bytes when the surrogateescape error handler is used when writing data. This is useful for processing files in an unknown encoding. |

In the sessions below, t is the path of a text file that contains badly encoded bytes.

Default errors (None)

In [48]: f = open(t)

In [49]: f.read()
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-49-571e9fb02258> in <module>()
----> 1 f.read()

~/.pyenv/versions/3.6.5/lib/python3.6/codecs.py in decode(self, input, final)
    319         # decode input (taking the buffer into account)
    320         data = self.buffer + input
--> 321         (result, consumed) = self._buffer_decode(data, self.errors, final)
    322         # keep undecoded input until the next call
    323         self.buffer = data[consumed:]

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x88 in position 1: invalid start byte

    
    

errors = 'surrogateescape'

In [61]: f=open(t, errors='surrogateescape')

In [62]: f.read()
Out[62]: 'a\udc88\udc91.sss.com\nwww.tudou.com\nwww.dtw.com\n'

errors = 'ignore'

In [57]: f = open(t, errors='ignore')

In [58]: f.read()
Out[58]: 'a.sss.com\nwww.tudou.com\nwww.dtw.com\n'

errors = 'replace'

In [54]: f = open(t, errors='replace')

In [55]: f.read()
Out[55]: 'a��.sss.com\nwww.tudou.com\nwww.dtw.com\n'
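The sessions above depend on a file I happened to have. As a reproducible sketch, the script below builds a similar file itself (the bytes 0x88 and 0x91 and the names bad.txt and copy.txt are chosen only for illustration), reads it with each handler, and also shows the surrogateescape round trip back to the original bytes.

```python
import os
import tempfile

# 0x88 and 0x91 are not valid UTF-8 start bytes, so strict decoding fails.
bad_bytes = b"a\x88\x91.sss.com\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "bad.txt")        # hypothetical file name
    copy_path = os.path.join(tmp_dir, "copy.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(bad_bytes)

    # Default (strict) handling raises UnicodeDecodeError.
    try:
        with open(path, encoding="utf-8") as f:
            f.read()
        strict_failed = False
    except UnicodeDecodeError:
        strict_failed = True

    # The other handlers return text instead of raising.
    results = {}
    for handler in ("ignore", "replace", "surrogateescape"):
        with open(path, encoding="utf-8", errors=handler) as f:
            results[handler] = f.read()

    # Writing with surrogateescape turns the surrogates back into
    # the original bytes (newline="" avoids '\n' translation on Windows).
    with open(copy_path, "w", encoding="utf-8",
              errors="surrogateescape", newline="") as f:
        f.write(results["surrogateescape"])
    with open(copy_path, "rb") as f:
        round_trip = f.read()

for handler, text in results.items():
    print(handler, ascii(text))
print(round_trip == bad_bytes)
```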

6. newline

newline controls how universal newlines mode works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. It works as follows:

  • When reading input from the stream, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.
  • When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
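The reading and writing rules above can be tried out on a file with mixed line endings. This is only a sketch; mixed.txt and out.txt are placeholder names, and the file is written in binary mode first so the line endings on disk are exactly what the comments say.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "mixed.txt")    # hypothetical file name
    out_path = os.path.join(tmp_dir, "out.txt")  # hypothetical file name

    # One line of each style: '\n', '\r\n' and a bare '\r'.
    with open(path, "wb") as f:
        f.write(b"unix\nwindows\r\nold-mac\r")

    # newline=None: every line ending is translated to '\n'.
    with open(path, newline=None, encoding="ascii") as f:
        translated = f.readlines()

    # newline='': all three endings split lines but are returned untranslated.
    with open(path, newline="", encoding="ascii") as f:
        untranslated = f.readlines()

    # newline='\r\n': only '\r\n' terminates a line.
    with open(path, newline="\r\n", encoding="ascii") as f:
        crlf_only = f.readlines()

    # Writing with newline='\r\n' translates each '\n' written.
    with open(out_path, "w", newline="\r\n", encoding="ascii") as f:
        f.write("a\nb\n")
    with open(out_path, "rb") as f:
        written = f.read()

print(translated, untranslated, crlf_only, written, sep="\n")
```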
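7. closefd

Controls whether the underlying file descriptor is closed. True (the default) closes it together with the file object; False keeps the descriptor open after the file object has been closed. Use fileobj.fileno() to see the descriptor number.

If closefd is False and a file descriptor rather than a filename was given, the underlying file descriptor will be kept open when the file is closed. If a filename is given, closefd must be True (the default), otherwise an error will be raised.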
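A short sketch of both cases, using a made-up file fd.txt: with closefd=False the descriptor stays usable after the file object is closed (and closing it becomes our job), while combining closefd=False with a filename raises ValueError.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "fd.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("data\n")

    fd = os.open(path, os.O_RDONLY)
    with open(fd, encoding="utf-8", closefd=False) as f:
        same_fd = f.fileno() == fd  # the file object wraps our descriptor
        content = f.read()

    # The file object is closed, but the descriptor is still usable.
    os.lseek(fd, 0, os.SEEK_SET)
    reread = os.read(fd, 100)
    os.close(fd)  # closing it is now our responsibility

    # closefd=False only makes sense with a descriptor, not a filename.
    try:
        open(path, closefd=False)
        filename_allowed = True
    except ValueError:
        filename_allowed = False

print(same_fd, repr(content), reread, filename_allowed)
```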

8. opener

A custom opener can be used by passing a callable as *opener*. The underlying file descriptor for the file object is then obtained by calling *opener* with (*file*, *flags*). *opener* must return an open file descriptor (passing [`os.open`](https://docs.python.org/3.6/library/os.html#os.open) as *opener* results in functionality similar to passing `None`).
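As a sketch of what an opener might look like: private_opener below is a made-up helper that forwards to os.open() but asks for owner-only permissions on newly created files. The permission bits are only meaningful on POSIX systems, and secret.txt is a placeholder name.

```python
import os
import tempfile


def private_opener(path, flags):
    """Hypothetical opener: create files readable and writable by the owner only."""
    return os.open(path, flags, 0o600)


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "secret.txt")  # hypothetical file name

    # open() calls private_opener(path, flags) to get the file descriptor.
    with open(path, "w", encoding="utf-8", opener=private_opener) as f:
        f.write("token\n")

    with open(path, encoding="utf-8") as f:
        content = f.read()

    # Permission bits of the created file (0o600 minus the umask on POSIX).
    permission_bits = os.stat(path).st_mode & 0o777

print(repr(content), oct(permission_bits))
```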

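Read and write methods of file objects:

| Method | Meaning |
| --- | --- |
| read() | reads the whole file at once (or at most size characters or bytes when a size is given) |
| readline() | reads one line per call |
| readlines() | reads the whole file and returns the lines as a list |
| write() | writes its argument to the file; the argument is a str in text mode (bytes in binary mode) |
| writelines() | takes an iterable and writes each element (str) to the file; no newline is added automatically |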
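A small sketch exercising each of these methods on a temporary file (lines.txt is a placeholder name). Note that writelines() writes the strings exactly as given, so the last line here ends without a newline.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "lines.txt")  # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:
        chars_written = f.write("one\n")  # returns the number of characters written
        f.writelines(["two\n", "three"])  # no newline is added automatically

    with open(path, encoding="utf-8") as f:
        first_line = f.readline()  # one line, including its '\n'
        rest = f.read()            # everything after the current position

    with open(path, encoding="utf-8") as f:
        all_lines = f.readlines()  # list with one entry per line

    # Iterating over the file object reads line by line without
    # loading the whole file into memory.
    with open(path, encoding="utf-8") as f:
        stripped = [line.rstrip("\n") for line in f]

print(chars_written, repr(first_line), repr(rest), all_lines, stripped)
```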

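Other methods of file objects

| Method | Meaning |
| --- | --- |
| seek() | moves the file position; the second argument (whence) 0, 1, 2 means relative to the start, the current position, and the end. Binary mode can move forward or backward relative to any of them; text mode only allows seeking from the start, apart from seek(0, 1) and seek(0, 2) |
| close() | closes the file |
| seekable() | whether the file supports seek() |
| readable() | whether the file can be read |
| writable() | whether the file can be written |
| closed | attribute; True if the file is closed |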
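The difference between seeking in binary and text mode is easiest to see by trying it. A sketch under the assumption of a ten-byte file named seek.bin (a placeholder): all three whence values work in binary mode, while a nonzero end-relative seek in text mode raises io.UnsupportedOperation.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "seek.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"0123456789")

    with open(path, "rb") as f:
        can_seek = f.seekable()
        can_write = f.writable()
        f.seek(2)                 # whence=0: from the start
        from_start = f.read(2)    # b"23", position is now 4
        f.seek(3, os.SEEK_CUR)    # whence=1: from the current position (4 + 3)
        from_current = f.read(1)  # b"7"
        f.seek(-2, os.SEEK_END)   # whence=2: from the end
        from_end = f.read()       # b"89"
        position = f.tell()
    is_closed = f.closed          # the with block closed the file

    with open(path, "r", encoding="ascii") as f:
        # Text mode rejects nonzero offsets relative to the end.
        try:
            f.seek(-2, os.SEEK_END)
            text_relative_seek_ok = True
        except io.UnsupportedOperation:
            text_relative_seek_ok = False
        f.seek(0, os.SEEK_END)    # a zero offset is still allowed
        end_position = f.tell()

print(from_start, from_current, from_end, position,
      text_relative_seek_ok, end_position)
```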

Reposted from: https://www.cnblogs.com/dingtianwei/p/9569809.html
