Python Basics - the open() function and how to use it (in detail)


Only objects can have methods called on them, and an open file is itself an object.

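File operations

Working with a file involves three steps:

  1. Open the file and obtain a file object (often called a file handle). This object wraps the operating system's file descriptor.
  2. Operate on the file through that object.
  3. Close the file.
    • Make it a habit to call close() once you are done with a file. Alternatively, open the file with a "with" statement, which closes it automatically when the block exits.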
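The three steps above can be sketched as follows. This is a minimal illustration, not the only way to do it; the scratch file name demo.txt and the variable names are assumptions made for the example.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Explicit style: open, operate, then close in a finally block
# so the file is closed even if the write fails.
f = open(path, "w", encoding="utf-8")
try:
    f.write("hello\n")
finally:
    f.close()
closed_after_explicit = f.closed

# with-statement style: the file is closed when the block exits.
with open(path, "r", encoding="utf-8") as f:
    content = f.read()
closed_after_with = f.closed
```

The with form is usually preferable because the close cannot be forgotten.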

open() creates a file object

  • Opens a file in the requested mode and returns a file object, which is normally assigned to a variable.

Syntax

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

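Parameters accepted by open()

file: the path of the file, either absolute or relative. It may also be an integer file descriptor to be wrapped.

+ Absolute path: starts from the root of the file system
+ Relative path: resolved against the current working directory of the process (not necessarily the directory that contains the script)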
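A small sketch of how a relative path is resolved. It assumes a temporary directory as the working directory and a hypothetical file name relative_demo.txt; the working directory is restored afterwards.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()   # hypothetical working directory
original_cwd = os.getcwd()
os.chdir(workdir)
try:
    # A relative name is looked up in the current working directory.
    with open("relative_demo.txt", "w", encoding="utf-8") as f:
        f.write("data")
    absolute_path = os.path.abspath("relative_demo.txt")
    created_in_cwd = os.path.exists(os.path.join(workdir, "relative_demo.txt"))
finally:
    os.chdir(original_cwd)     # always restore the previous directory
```

When a script must find files next to itself regardless of where it is launched from, building an absolute path first is the safer choice.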

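mode: the mode in which the file is opened; optional, defaults to 'r'

+ r:	read. With 'r' alone the file is opened in text mode for reading only and cannot be written.
+ w:	write. With 'w' alone the file is write-only and cannot be read. Use it with care: the file is truncated as soon as it is opened, even if nothing is written afterwards (the documentation says "open for writing, truncating the file first"). If the file does not exist, it is created.
+ x:	exclusive creation. The file is created if it does not exist; if it already exists, the open fails with FileExistsError. Useful for creating a file only when it is not already there.
+ a:	append. Write-only like 'w', but existing content is kept and data is written at the end of the file; on some Unix systems every write goes to the end regardless of the current seek position. If the existing content does not end with a newline, write a "\n" first so the new data starts on its own line.
+ b:	binary mode
+ t:	text mode (default)
+ +:	open for updating (reading and writing)
 	- r+:	read and write without truncation. The position starts at the beginning of the file, and writes overwrite existing content from the current position, which tell() reports.
	- w+: 	write and read. The file is truncated first. After a write the position is at the end of the written data, so a read returns only what follows that position; call seek() to move back before reading.
	- a+: 	append and read. The position starts at the end of the file, so a read returns an empty string unless seek() is called first; writes still go to the end.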
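The behaviour of the modes above can be checked with a short sketch. The file name modes.txt is an assumption for the example, and the file lives in a temporary directory.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")  # hypothetical file

# 'x' creates the file and fails if it already exists.
with open(path, "x", encoding="utf-8") as f:
    f.write("first\n")
try:
    open(path, "x", encoding="utf-8")
    exclusive_failed = False
except FileExistsError:
    exclusive_failed = True

# 'a' keeps the existing content and writes at the end.
with open(path, "a", encoding="utf-8") as f:
    f.write("second\n")

# 'a+' starts at the end, so read() returns '' until seek(0) is called.
with open(path, "a+", encoding="utf-8") as f:
    read_without_seek = f.read()
    f.seek(0)
    read_after_seek = f.read()

# 'r+' keeps the content and overwrites from the current position (the start).
with open(path, "r+", encoding="utf-8") as f:
    f.write("FIRST")
    f.seek(0)
    after_overwrite = f.read()

# 'w' truncates the file as soon as it is opened, even without a write.
with open(path, "w", encoding="utf-8") as f:
    pass
size_after_w = os.path.getsize(path)
```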

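buffering

+ An optional integer that sets the buffering policy:
	- 0: buffering off (only allowed in binary mode)
	- 1: line buffering (only usable in text mode)
	- >1: size in bytes of a fixed-size chunk buffer
+ When no buffering argument is given, the default policy is:
	- Binary files are buffered in fixed-size chunks.
	- The buffer size is chosen with a heuristic that tries to determine the underlying device's "block size", falling back on io.DEFAULT_BUFFER_SIZE.
	- On many systems the buffer is typically 4096 or 8192 bytes long.
	- "Interactive" text files (those for which isatty() returns True) use line buffering; other text files follow the binary-file policy above.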
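A minimal sketch of the buffering values described above, assuming a hypothetical scratch file buffering.bin. It only inspects the objects open() returns rather than measuring any performance effect.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "buffering.bin")  # hypothetical file

# buffering=0 is only valid in binary mode and returns the raw stream.
with open(path, "wb", buffering=0) as raw:
    raw_type = type(raw)
    raw.write(b"abc")

# Asking for an unbuffered text file raises ValueError.
try:
    open(path, "w", buffering=0)
    unbuffered_text_allowed = True
except ValueError:
    unbuffered_text_allowed = False

# buffering=1 selects line buffering in text mode.
with open(path, "w", buffering=1, encoding="utf-8") as text:
    line_buffered = text.line_buffering

# With the default policy, a binary file gets a buffered wrapper.
with open(path, "rb") as buffered:
    buffered_type = type(buffered)

# Fallback buffer size used when the block size cannot be determined.
default_size = io.DEFAULT_BUFFER_SIZE
```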

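encoding

+ The name of the encoding used to decode or encode the file.
+ In text mode, if encoding is not given, the platform's locale encoding is used; Python 3 calls locale.getpreferredencoding(False) to obtain it.
+ In binary mode, reads and writes work on raw bytes with no decoding or encoding, so encoding should be left unspecified.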
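The difference between text mode with an explicit encoding and binary mode can be sketched like this. The file name encoding.txt and the sample string are assumptions for the example.

```python
import locale
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "encoding.txt")  # hypothetical file
text = "你好, open()"

# Passing encoding explicitly avoids depending on the platform default.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Text mode decodes the bytes to str using the given encoding.
with open(path, "r", encoding="utf-8") as f:
    decoded = f.read()

# Binary mode returns the raw bytes without any decoding.
with open(path, "rb") as f:
    raw = f.read()

# The encoding the locale reports for this platform.
platform_default = locale.getpreferredencoding(False)
```

Because the platform default differs between systems, naming the encoding explicitly tends to make scripts more portable.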

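errors

+ Optional.
+ Specifies how encoding and decoding errors are handled.
+ Cannot be used in binary mode.
+ A variety of standard error handlers are available, and any handler name registered with codecs.register_error() is also valid. The standard names include:
	- ignore	:	ignore errors. **Note**: ignoring encoding errors can lead to data loss.
	- strict		: raise a ValueError exception on an encoding error. The default value None has the same effect.
	- replace	: insert a replacement marker (such as "?") where there is malformed data.
	- surrogateescape	:
		+ Bytes that cannot be decoded are represented as surrogate code points in the range U+DC80 to U+DCFF; when the same handler is used for writing, those code points are turned back into the original bytes.
		+ Useful for processing files in an unknown encoding.
	- xmlcharrefreplace:
		+ Only supported when writing. Characters not supported by the encoding are replaced with the appropriate XML character reference, &#nnn;.
	- backslashreplace	:
		+ Replaces malformed data with Python's backslashed escape sequences.
	- namereplace:
		+ Only supported when writing.
		+ Replaces unsupported characters with \N{...} escape sequences.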
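The handlers listed above can be compared on one deliberately malformed file. This is a sketch under stated assumptions: the file names and the helper read_with are hypothetical, and the byte 0xff is used because it is never valid in UTF-8.

```python
import os
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "errors.txt")  # hypothetical file

# 0xff is never valid in UTF-8, so decoding this file hits an error.
with open(path, "wb") as f:
    f.write(b"ab\xffcd")

def read_with(handler):
    """Hypothetical helper: read the file with the given error handler."""
    with open(path, "r", encoding="utf-8", errors=handler) as f:
        return f.read()

try:
    read_with("strict")
    strict_raised = False
except UnicodeDecodeError:   # a subclass of ValueError
    strict_raised = True

ignored = read_with("ignore")             # bad byte is dropped
replaced = read_with("replace")           # bad byte becomes U+FFFD
escaped = read_with("backslashreplace")   # bad byte becomes the text \xff
surrogate = read_with("surrogateescape")  # bad byte becomes U+DCFF

# surrogateescape restores the original byte when writing back.
with open(path, "w", encoding="utf-8", errors="surrogateescape") as f:
    f.write(surrogate)
with open(path, "rb") as f:
    round_tripped = f.read()

# xmlcharrefreplace applies when writing characters the encoding lacks.
xml_path = os.path.join(tmpdir, "xml.txt")
with open(xml_path, "w", encoding="ascii", errors="xmlcharrefreplace") as f:
    f.write("caf\u00e9")
with open(xml_path, "rb") as f:
    xml_bytes = f.read()
```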

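newline: controls line-ending handling

  • Applies only to text mode.
  • Controls how universal newlines mode works, that is, how line endings are recognized and translated.
  • It can be:
    • None
    • ''
    • '\n'
    • '\r'
    • '\r\n'
  • When reading input from the stream:
    • None: universal newlines mode is enabled. Lines may end in '\n', '\r', or '\r\n', and all of these are translated into '\n' before being returned to the caller.
    • '' (the empty string): universal newlines mode is enabled, but line endings are returned to the caller untranslated.
    • Any other legal value: input lines are terminated only by the given string, and the line ending is returned to the caller untranslated.
  • When writing output to the stream:
    • None: any '\n' written is translated to the system default line separator, os.linesep.
    • '' or '\n': no translation takes place.
    • Any other legal value: every '\n' written is translated to the given string.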
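The reading and writing rules above can be observed on a file that mixes all three line endings. The file name newline.txt is an assumption; the file is written in binary mode first so its bytes are known exactly.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "newline.txt")  # hypothetical file

# Write known bytes: one line per style of line ending.
with open(path, "wb") as f:
    f.write(b"one\r\ntwo\rthree\n")

# newline=None (default): every line ending is translated to '\n'.
with open(path, "r", encoding="ascii") as f:
    translated = f.readlines()

# newline='': lines are still split on all endings, but nothing is translated.
with open(path, "r", encoding="ascii", newline="") as f:
    untranslated = f.readlines()

# newline='\r\n': only '\r\n' terminates a line.
with open(path, "r", encoding="ascii", newline="\r\n") as f:
    crlf_only = f.readlines()

# On writing, newline='\r\n' turns every '\n' into '\r\n'.
with open(path, "w", encoding="ascii", newline="\r\n") as f:
    f.write("a\nb\n")
with open(path, "rb") as f:
    written = f.read()
```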

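closefd: whether closing the file object also closes the file descriptor

  • If closefd is False and a file descriptor rather than a file name was given, the underlying file descriptor is kept open when the file is closed. If a file name is given, closefd must be True (the default); otherwise an error is raised.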
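A sketch of closefd in practice, assuming a descriptor obtained from os.open() and a hypothetical file name closefd.txt. With closefd=False the caller remains responsible for closing the descriptor.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "closefd.txt")  # hypothetical file

# Obtain a raw file descriptor from the operating system.
fd = os.open(path, os.O_WRONLY | os.O_CREAT)

# Wrap the descriptor; closefd=False leaves it open after f is closed.
with open(fd, "w", encoding="utf-8", closefd=False) as f:
    f.write("via wrapper\n")

# The descriptor is still usable here, so it must be closed by hand.
os.write(fd, b"via os.write\n")
os.close(fd)

with open(path, "r", encoding="utf-8") as f:
    lines = f.readlines()

# closefd=False combined with a file name is rejected.
try:
    open(path, "r", closefd=False)
    name_with_closefd_false = True
except ValueError:
    name_with_closefd_false = False
```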

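opener

  • A custom opener can be supplied by passing a callable. open() calls it with (file, flags) and expects it to return an open file descriptor; passing os.open as opener behaves much like passing None.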

Official documentation text

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
Open file and return a corresponding file object. If the file cannot be opened, an OSError is raised.
file is a path-like object giving the pathname (absolute or relative to the current working directory) of the file to be opened or an integer file descriptor of the file to be wrapped. (If a file descriptor is given, it is closed when the returned I/O object is closed, unless closefd is set to False.)
mode is an optional string that specifies the mode in which the file is opened. It defaults to ‘r’ which means open for reading in text mode. Other common values are ‘w’ for writing (truncating the file if it already exists), ‘x’ for exclusive creation and ‘a’ for appending (which on some Unix systems, means that all writes append to the end of the file regardless of the current seek position). In text mode, if encoding is not specified the encoding used is platform dependent: locale.getpreferredencoding(False) is called to get the current locale encoding. (For reading and writing raw bytes use binary mode and leave encoding unspecified.) The available modes are:

Character   Meaning
'r'         open for reading (default)
'w'         open for writing, truncating the file first
'x'         open for exclusive creation, failing if the file already exists
'a'         open for writing, appending to the end of the file if it exists
'b'         binary mode
't'         text mode (default)
'+'         open for updating (reading and writing)

The default mode is ‘r’ (open for reading text, synonym of ‘rt’). Modes ‘w+’ and ‘w+b’ open and truncate the file. Modes ‘r+’ and ‘r+b’ open the file with no truncation.
As mentioned in the Overview, Python distinguishes between binary and text I/O. Files opened in binary mode (including ‘b’ in the mode argument) return contents as bytes objects without any decoding. In text mode (the default, or when ‘t’ is included in the mode argument), the contents of the file are returned as str, the bytes having been first decoded using a platform-dependent encoding or using the specified encoding if given.
There is an additional mode character permitted, ‘U’, which no longer has any effect, and is considered deprecated. It previously enabled universal newlines in text mode, which became the default behaviour in Python 3.0. Refer to the documentation of the newline parameter for further details.
Note
Python doesn’t depend on the underlying operating system’s notion of text files; all the processing is done by Python itself, and is therefore platform-independent.
buffering is an optional integer used to set the buffering policy. Pass 0 to switch buffering off (only allowed in binary mode), 1 to select line buffering (only usable in text mode), and an integer > 1 to indicate the size in bytes of a fixed-size chunk buffer. When no buffering argument is given, the default buffering policy works as follows:
  • Binary files are buffered in fixed-size chunks; the size of the buffer is chosen using a heuristic trying to determine the underlying device’s “block size” and falling back on io.DEFAULT_BUFFER_SIZE. On many systems, the buffer will typically be 4096 or 8192 bytes long.
  • “Interactive” text files (files for which isatty() returns True) use line buffering. Other text files use the policy described above for binary files.
encoding is the name of the encoding used to decode or encode the file. This should only be used in text mode. The default encoding is platform dependent (whatever locale.getpreferredencoding() returns), but any text encoding supported by Python can be used. See the codecs module for the list of supported encodings.
errors is an optional string that specifies how encoding and decoding errors are to be handled—this cannot be used in binary mode. A variety of standard error handlers are available (listed under Error Handlers), though any error handling name that has been registered with codecs.register_error() is also valid. The standard names include:
  • ‘strict’ to raise a ValueError exception if there is an encoding error. The default value of None has the same effect.
  • ‘ignore’ ignores errors. Note that ignoring encoding errors can lead to data loss.
  • ‘replace’ causes a replacement marker (such as ‘?’) to be inserted where there is malformed data.
  • ‘surrogateescape’ will represent any incorrect bytes as code points in the Unicode Private Use Area ranging from U+DC80 to U+DCFF. These private code points will then be turned back into the same bytes when the surrogateescape error handler is used when writing data. This is useful for processing files in an unknown encoding.
  • ‘xmlcharrefreplace’ is only supported when writing to a file. Characters not supported by the encoding are replaced with the appropriate XML character reference &#nnn;.
  • ‘backslashreplace’ replaces malformed data by Python’s backslashed escape sequences.
  • ‘namereplace’ (also only supported when writing) replaces unsupported characters with \N{…} escape sequences.
newline controls how universal newlines mode works (it only applies to text mode). It can be None, ‘’, ‘\n’, ‘\r’, and ‘\r\n’. It works as follows:
  • When reading input from the stream, if newline is None, universal newlines mode is enabled. Lines in the input can end in ‘\n’, ‘\r’, or ‘\r\n’, and these are translated into ‘\n’ before being returned to the caller. If it is ‘’, universal newlines mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.
  • When writing output to the stream, if newline is None, any ‘\n’ characters written are translated to the system default line separator, os.linesep. If newline is ‘’ or ‘\n’, no translation takes place. If newline is any of the other legal values, any ‘\n’ characters written are translated to the given string.
If closefd is False and a file descriptor rather than a filename was given, the underlying file descriptor will be kept open when the file is closed. If a filename is given closefd must be True (the default) otherwise an error will be raised.
A custom opener can be used by passing a callable as opener. The underlying file descriptor for the file object is then obtained by calling opener with (file, flags). opener must return an open file descriptor (passing os.open as opener results in functionality similar to passing None).
The newly created file is non-inheritable.
The following example uses the dir_fd parameter of the os.open() function to open a file relative to a given directory:

>>> import os
>>> dir_fd = os.open('somedir', os.O_RDONLY)
>>> def opener(path, flags):
...     return os.open(path, flags, dir_fd=dir_fd)
...
>>> with open('spamspam.txt', 'w', opener=opener) as f:
...     print('This will be written to somedir/spamspam.txt', file=f)
...
>>> os.close(dir_fd)  # don't leak a file descriptor

The type of file object returned by the open() function depends on the mode. When open() is used to open a file in a text mode (‘w’, ‘r’, ‘wt’, ‘rt’, etc.), it returns a subclass of io.TextIOBase (specifically io.TextIOWrapper). When used to open a file in a binary mode with buffering, the returned class is a subclass of io.BufferedIOBase. The exact class varies: in read binary mode, it returns an io.BufferedReader; in write binary and append binary modes, it returns an io.BufferedWriter, and in read/write mode, it returns an io.BufferedRandom. When buffering is disabled, the raw stream, a subclass of io.RawIOBase, io.FileIO, is returned.
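The mapping from mode to returned class can be checked directly. In this sketch the file name types.bin and the helper opened_type are hypothetical names introduced for the example.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "types.bin")  # hypothetical file
open(path, "wb").close()  # create an empty file so read modes succeed

def opened_type(mode, **kwargs):
    """Hypothetical helper: open the file and report the object's class."""
    with open(path, mode, **kwargs) as f:
        return type(f)

returned = {
    "r": opened_type("r", encoding="utf-8"),          # text mode
    "rb": opened_type("rb"),                          # buffered binary read
    "wb": opened_type("wb"),                          # buffered binary write
    "ab": opened_type("ab"),                          # buffered binary append
    "r+b": opened_type("r+b"),                        # buffered read/write
    "rb unbuffered": opened_type("rb", buffering=0),  # raw stream
}
```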
See also the file handling modules, such as, fileinput, io (where open() is declared), os, os.path, tempfile, and shutil.
Raises an auditing event open with arguments file, mode, flags.
The mode and flags arguments may have been modified or inferred from the original call.
Changed in version 3.3:
  • The opener parameter was added.
  • The ‘x’ mode was added.
  • IOError used to be raised, it is now an alias of OSError.
  • FileExistsError is now raised if the file opened in exclusive creation mode (‘x’) already exists.
Changed in version 3.4:
  • The file is now non-inheritable.
Deprecated since version 3.4, will be removed in version 3.9: The ‘U’ mode.
Changed in version 3.5:
  • If the system call is interrupted and the signal handler does not raise an exception, the function now retries the system call instead of raising an InterruptedError exception (see PEP 475 for the rationale).
  • The ‘namereplace’ error handler was added.
Changed in version 3.6:
  • Support added to accept objects implementing os.PathLike.
  • On Windows, opening a console buffer may return a subclass of io.RawIOBase other than io.FileIO.
