[Python Is Not a Big Snake] The File Operations Trilogy - CSDN Blog

Article link: https://blog.csdn.net/m0_73552311/article/details/134406089

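I haven't updated this blog in a long while. I had midterm exams recently (please bear with me).

It has been almost a month since the last post, so once the exams were over I wanted to get one out quickly.

I originally meant to cover more in this post, but I got a bit tired while writing, so I stopped where it ends.

The quality may not be that high this time. Please forgive me; I won't say more about it here.

On to the main text...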

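Ⅰ. The file operations trilogy

0x00 What are file operations

Er, what is file reading and writing?

File reading and writing is everywhere in everyday computer use.

When we make a PPT, the ppt file gets saved, so the slides are still there when you open them again, even after shutting down.

Here, the process of the PPT software saving the slides is a file write.

And when the PPT software loads that ppt file, that is a file read.

By now you can probably guess roughly what the three steps of the trilogy are: open the file, read or write it, and close it.

Every program we have written so far loses all of its data as soon as it is closed.

That is where file operations become especially important. After all, nobody wants to lose hours of game progress just because the computer was shut down.

Enough talk, let's set sail!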

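0x01 The file operations trilogy: opening a file

Python's built-in functions are small but complete.

The people who built Python already thought of this,

so the built-in functions naturally include one for opening files.

It is where every file operation starts. Without it, this lesson would already be over.

Just kidding.

It is the famous open().

As the name says, open() means "open", and it does exactly that: it opens a file.

Can't wait? So how do we use it?

Hehe, I'm not telling.

Try it yourself with help(open).

---------------------------three hours of experimenting omitted here----------------------------------

To save time, here is the result directly (~~the author is just lazy~~)

It's just a tiny bit long.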

Help on built-in function open in module io:

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
    Open file and return a stream.  Raise OSError upon failure.

    file is either a text or byte string giving the name (and the path
    if the file isn't in the current working directory) of the file to
    be opened or an integer file descriptor of the file to be
    wrapped. (If a file descriptor is given, it is closed when the
    returned I/O object is closed, unless closefd is set to False.)

    mode is an optional string that specifies the mode in which the file
    is opened. It defaults to 'r' which means open for reading in text
    mode.  Other common values are 'w' for writing (truncating the file if
    it already exists), 'x' for creating and writing to a new file, and
    'a' for appending (which on some Unix systems, means that all writes
    append to the end of the file regardless of the current seek position).
    In text mode, if encoding is not specified the encoding used is platform
    dependent: locale.getpreferredencoding(False) is called to get the
    current locale encoding. (For reading and writing raw bytes use binary
    mode and leave encoding unspecified.) The available modes are:

    ========= ===============================================================
    Character Meaning
    --------- ---------------------------------------------------------------
    'r'       open for reading (default)
    'w'       open for writing, truncating the file first
    'x'       create a new file and open it for writing
    'a'       open for writing, appending to the end of the file if it exists
    'b'       binary mode
    't'       text mode (default)
    '+'       open a disk file for updating (reading and writing)
    'U'       universal newline mode (deprecated)
    ========= ===============================================================

    The default mode is 'rt' (open for reading text). For binary random
    access, the mode 'w+b' opens and truncates the file to 0 bytes, while
    'r+b' opens the file without truncation. The 'x' mode implies 'w' and
    raises an `FileExistsError` if the file already exists.

    Python distinguishes between files opened in binary and text modes,
    even when the underlying operating system doesn't. Files opened in
    binary mode (appending 'b' to the mode argument) return contents as
    bytes objects without any decoding. In text mode (the default, or when
    't' is appended to the mode argument), the contents of the file are
    returned as strings, the bytes having been first decoded using a
    platform-dependent encoding or using the specified encoding if given.

    'U' mode is deprecated and will raise an exception in future versions
    of Python.  It has no effect in Python 3.  Use newline to control
    universal newlines mode.

    buffering is an optional integer used to set the buffering policy.
    Pass 0 to switch buffering off (only allowed in binary mode), 1 to select
    line buffering (only usable in text mode), and an integer > 1 to indicate
    the size of a fixed-size chunk buffer.  When no buffering argument is
    given, the default buffering policy works as follows:

    * Binary files are buffered in fixed-size chunks; the size of the buffer
      is chosen using a heuristic trying to determine the underlying device's
      "block size" and falling back on `io.DEFAULT_BUFFER_SIZE`.
      On many systems, the buffer will typically be 4096 or 8192 bytes long.

    * "Interactive" text files (files for which isatty() returns True)
      use line buffering.  Other text files use the policy described above
      for binary files.

    encoding is the name of the encoding used to decode or encode the
    file. This should only be used in text mode. The default encoding is
    platform dependent, but any encoding supported by Python can be
    passed.  See the codecs module for the list of supported encodings.

    errors is an optional string that specifies how encoding errors are to
    be handled---this argument should not be used in binary mode. Pass
    'strict' to raise a ValueError exception if there is an encoding error
    (the default of None has the same effect), or pass 'ignore' to ignore
    errors. (Note that ignoring encoding errors can lead to data loss.)
    See the documentation for codecs.register or run 'help(codecs.Codec)'
    for a list of the permitted encoding error strings.

    newline controls how universal newlines works (it only applies to text
    mode). It can be None, '', '\n', '\r', and '\r\n'.  It works as
    follows:

    * On input, if newline is None, universal newlines mode is
      enabled. Lines in the input can end in '\n', '\r', or '\r\n', and
      these are translated into '\n' before being returned to the
      caller. If it is '', universal newline mode is enabled, but line
      endings are returned to the caller untranslated. If it has any of
      the other legal values, input lines are only terminated by the given
      string, and the line ending is returned to the caller untranslated.

    * On output, if newline is None, any '\n' characters written are
      translated to the system default line separator, os.linesep. If
      newline is '' or '\n', no translation takes place. If newline is any
      of the other legal values, any '\n' characters written are translated
      to the given string.

    If closefd is False, the underlying file descriptor will be kept open
    when the file is closed. This does not work when a file name is given
    and must be True in that case.

    A custom opener can be used by passing a callable as *opener*. The
    underlying file descriptor for the file object is then obtained by
    calling *opener* with (*file*, *flags*). *opener* must return an open
    file descriptor (passing os.open as *opener* results in functionality
    similar to passing None).

    open() returns a file object whose type depends on the mode, and
    through which the standard file operations such as reading and writing
    are performed. When open() is used to open a file in a text mode ('w',
    'r', 'wt', 'rt', etc.), it returns a TextIOWrapper. When used to open
    a file in a binary mode, the returned class varies: in read binary
    mode, it returns a BufferedReader; in write binary and append binary
    modes, it returns a BufferedWriter, and in read/write mode, it returns
    a BufferedRandom.

    It is also possible to use a string or bytearray as a file for both
    reading and writing. For strings StringIO can be used like a file
    opened in a text mode, and for bytes a BytesIO can be used like a file
    opened in a binary mode.

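The key points

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

open() opens a file and returns a stream. It raises OSError on failure.

file is the name of the file to open, given as a text or byte string (include the path if the file is not in the current working directory), or an integer file descriptor to wrap. If a file descriptor is given, it is closed when the returned I/O object is closed, unless closefd is set to False.

mode is an optional string that specifies how the file is opened:

'r' (the default) opens the file for reading in text mode;

'w' opens it for writing, truncating the file if it already exists;

'x' creates a new file and opens it for writing;

'a' opens it for appending (on some Unix systems all writes then go to the end of the file, regardless of the current seek position).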

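So the basic syntax is:

f = open(file_path, mode, encoding=encoding)  # mode defaults to 'r' (read)
# encoding must be passed by keyword: the third positional parameter is buffering
# If no error is raised, f is a file stream (a file object)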

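Let's try it right away!

# Open the file a.txt (it must already exist in the current working directory)
f = open('a.txt')
print(f)

The printed file object shows that the default mode is 'r' and that, on my machine, the encoding is cp936. The default encoding is platform dependent, so you may see a different one, such as UTF-8.

Let's go back and look at the modes more closely.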
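If you would rather check the mode and encoding directly than read them off the printed object, here is a minimal sketch. It assumes a throwaway file in a temporary directory (the name a.txt is reused purely for illustration) and passes encoding='utf-8' explicitly so that the result does not depend on the platform default.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical demo file, created here so the example is self-contained
    demo_path = os.path.join(tmp_dir, "a.txt")
    with open(demo_path, "w", encoding="utf-8") as setup_file:
        setup_file.write("hello\n")

    f = open(demo_path, encoding="utf-8")  # mode is omitted, so it defaults to 'r'
    opened_mode = f.mode
    opened_encoding = f.encoding
    print(opened_mode)      # r
    print(opened_encoding)  # utf-8
    f.close()
```

Naming the encoding explicitly is a common habit, because the same script then behaves the same way on machines with different locale settings.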

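0x02 File modes

Commonly used values of mode:

r: open the file for reading only
w: open the file for writing only (an existing file is truncated first)
a: open the file for appending; new content is written after the existing content, at the end
w+: open the file for both reading and writing (an existing file is truncated first)

The mode argument can be omitted; it defaults to 'r'. The mode also determines whether the file is handled as text or as bytes. By default open() opens files in text mode, as with the four modes above. The text encoding itself is set with the separate encoding argument, not with mode.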
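To see the difference between these modes in practice, here is a small sketch. The file name modes_demo.txt and the temporary directory are assumptions made for the demo, not part of the original post.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes_demo.txt")  # hypothetical file name

    # 'w' creates the file (or empties an existing one) before writing
    with open(path, "w", encoding="utf-8") as f:
        f.write("first\n")

    # 'a' keeps the existing content and writes at the end
    with open(path, "a", encoding="utf-8") as f:
        f.write("second\n")

    # 'r' only allows reading
    with open(path, "r", encoding="utf-8") as f:
        after_append = f.read()

    # 'w+' empties the file, then allows both writing and reading
    with open(path, "w+", encoding="utf-8") as f:
        f.write("third\n")
        f.seek(0)  # move back to the start before reading what was written
        after_w_plus = f.read()

print(repr(after_append))  # 'first\nsecond\n'
print(repr(after_w_plus))  # 'third\n'
```

Note that 'w+' threw away the two earlier lines. If you want to read and write without truncating, the help text lists 'r+' for that.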

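When you need to read or write a file as bytes (binary), just append 'b' to the mode argument:

rb: open a file in binary format, for reading only
wb: open a file in binary format, for writing only
ab: open a file in binary format, for appending
wb+: open a file in binary format, for reading and writing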
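A quick sketch of what binary mode changes: the same file is read once as raw bytes and once as text. The file name and the sample string are assumptions chosen for the demo.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "binary_demo.bin")  # hypothetical file name

    # In binary mode we must write bytes, so the string is encoded by hand
    with open(path, "wb") as f:
        f.write("你好".encode("utf-8"))

    # 'rb' returns the raw bytes without any decoding
    with open(path, "rb") as f:
        raw = f.read()

    # Text mode decodes the same bytes back into a str
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()

print(type(raw).__name__, len(raw))    # bytes 6
print(type(text).__name__, len(text))  # str 2
```

The two characters take six bytes in UTF-8, which is why the lengths differ between the two reads.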

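When you read a text file in the default text mode (this does not apply to binary mode), the line endings in the file ('\n', '\r', or '\r\n') are translated to '\n'.

Conversely, when you write in the default text mode, each '\n' in the text is translated to the platform's line separator, os.linesep.

In other words, the text you read from a txt file always uses '\n' for line breaks, and when you write text, every '\n' becomes a real line break in the file, in whatever form the platform uses.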
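The newline translation can be observed with a small experiment. This sketch writes Windows-style line endings as raw bytes, then reads them back twice: once with the default settings and once with newline='' (which, per the help text, leaves line endings untranslated). The file name is an assumption for the demo.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "newline_demo.txt")  # hypothetical file name

    # Write '\r\n' line endings as raw bytes so nothing is translated on the way in
    with open(path, "wb") as f:
        f.write(b"line1\r\nline2\r\n")

    # Default text mode: '\r\n' is translated to '\n' on input
    with open(path, "r", encoding="utf-8") as f:
        translated = f.read()

    # newline='' keeps the line endings exactly as stored in the file
    with open(path, "r", encoding="utf-8", newline="") as f:
        untranslated = f.read()

print(repr(translated))    # 'line1\nline2\n'
print(repr(untranslated))  # 'line1\r\nline2\r\n'
```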

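0x03 Reading and writing files

The read() method

Once a file has been opened with open(), you can use the various methods of the file object; read() is one of them.

read() reads some data and returns it as a string (in text mode) or a bytes object (in binary mode).

read() takes one parameter:

f.read(size) # f is a file object

The optional size parameter is a number: the maximum number of characters (in text mode) or bytes (in binary mode) to read from the open file. By default, the whole file is read.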

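Suppose there is a file sample1.txt with the following content:

This is python big data analysis!

Now read the file:

with open('sample1.txt') as f:
    content = f.read()  # read the whole file as one string
    print(content)
# No f.close() is needed: the with block closes the file on exit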
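The size parameter of read() is easier to understand with an example. This sketch recreates the sample text in a temporary file (the temporary directory is an assumption for the demo) and reads it in two steps.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample1.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("This is python big data analysis!")

    with open(path, encoding="utf-8") as f:
        first = f.read(4)  # at most 4 characters
        rest = f.read()    # everything from the current position to the end
        at_eof = f.read()  # nothing is left, so this is an empty string

print(repr(first))   # 'This'
print(repr(rest))    # ' is python big data analysis!'
print(repr(at_eof))  # ''
```

Each call continues from where the previous one stopped, and an empty string is how read() signals the end of the file.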

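The readline() method

readline() reads one whole line from the file, including the newline character '\n'.

The newline character (\n) is left at the end of the string, and is only omitted on the last line of the file if the file does not end in a newline. This makes the return value unambiguous.

If f.readline() returns an empty string, the end of the file has been reached, while a blank line is represented by '\n', a string containing only a single newline.

f.readline() takes one parameter:

f.readline(size)

The optional size parameter is the maximum number of characters (in text mode) or bytes (in binary mode) to read from the line.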

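Suppose there is a file sample2.txt with three lines:

hello,my friends!
This is python big data analysis,
let's study.

Let's read it with readline():

with open('sample2.txt') as f:
    print(f.readline())   # the whole first line, including its '\n'
    print(f.readline(5))  # at most 5 characters of the next line
# The with block closes the file, so no f.close() is needed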

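The file object remembers where the previous readline() call stopped, so the next call continues from there.

So when you need to go through a file line by line, readline() is worth a try!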
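Here is a sketch of how the position carries over between readline() calls, followed by a line-by-line loop. It recreates the three-line sample in a temporary directory (an assumption for the demo). The loop iterates over the file object directly, which is a common alternative to calling readline() repeatedly and not something the original post covers.

```python
import os
import tempfile

sample_text = "hello,my friends!\nThis is python big data analysis,\nlet's study."

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample2.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample_text)

    with open(path, encoding="utf-8") as f:
        first_line = f.readline()  # the whole first line, with its '\n'
        partial = f.readline(5)    # only 5 characters of the second line
        remainder = f.readline()   # the rest of that same second line

    # Going through every line: a file object can be used directly in a for loop
    lines_seen = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            lines_seen.append(line.rstrip("\n"))

print(repr(first_line))  # 'hello,my friends!\n'
print(repr(partial))     # 'This '
print(repr(remainder))   # 'is python big data analysis,\n'
print(lines_seen)
```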

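The readlines() method

readlines() looks like readline() but does something different: as described above, readline() reads a single line, whereas readlines() reads all the lines and returns them as a list.

readlines() is usually called with no arguments (it accepts an optional size hint), which makes it even simpler to use. Again using sample2.txt:

with open('sample2.txt') as f:
    print(f.readlines())  # a list with one string per line
# The with block closes the file on exit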

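The output is a list with one string per line of the file; each string keeps the '\n' that ended its line, if there was one.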
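A sketch of what readlines() returns for the three-line sample, assuming the file does not end with a trailing newline (the original does not say either way). The list comprehension that strips the newlines is an extra step added for illustration.

```python
import os
import tempfile

sample_text = "hello,my friends!\nThis is python big data analysis,\nlet's study."

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample2.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample_text)

    with open(path, encoding="utf-8") as f:
        lines = f.readlines()  # every line, each still ending in '\n' if it had one

# Strip the trailing newlines when only the text of each line is wanted
stripped = [line.rstrip("\n") for line in lines]

print(lines)
print(stripped)
```

Because readlines() loads the whole file into memory at once, it is most comfortable with files of modest size.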

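The write() method

As the name suggests, write() writes a string to the file.

It takes just one parameter:

f.write(string) # f is a file object

The string parameter is the string to write (in binary mode, pass a bytes object instead).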

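It is easy to use. For example, to write the following string (note the newline character '\n' inside it)

'hello,my friends!\nthis is python big data analysis'

to the file sample3.txt:

with open('sample3.txt','w') as f:
    f.write('hello,my friends!\nthis is python big data analysis')
# The file is closed automatically when the with block ends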
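Two details about writing are worth a quick sketch: write() returns the number of characters it wrote, and it never adds a newline for you. The example also uses writelines(), a related file method that the original post does not cover; the file name and temporary directory are assumptions for the demo.

```python
import os
import tempfile

text = "hello,my friends!\nthis is python big data analysis"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample3.txt")

    with open(path, "w", encoding="utf-8") as f:
        written = f.write(text)  # number of characters written

    # Append more lines; writelines() does not add '\n', so we include it ourselves
    with open(path, "a", encoding="utf-8") as f:
        f.writelines(["\n", "let's study.\n"])

    with open(path, encoding="utf-8") as f:
        result = f.read()

print(written == len(text))  # True
print(result)
```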

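0x04 Closing the file

When we are done working with a file, we need to close it. This is the final step.

Closing is done with the close() method:

f.close() # close the file

It's so simple that I don't need to teach it, right?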
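If you want to see what closing actually changes, here is a minimal sketch. It checks the file object's closed attribute before and after close(), and shows that using a closed file raises ValueError. The file name is an assumption for the demo.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "close_demo.txt")  # hypothetical file name

    f = open(path, "w", encoding="utf-8")
    f.write("data")
    closed_before = f.closed  # False: the file is still open
    f.close()                 # buffered data is flushed and the file is released
    closed_after = f.closed   # True

    try:
        f.write("more")       # a closed file can no longer be used
    except ValueError as exc:
        error_message = str(exc)

print(closed_before, closed_after)  # False True
print(error_message)
```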

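0x05 Going further with with ... as

The examples above already used with ... as, and some readers who know a bit more will be asking why it has not been explained.

I left it until the end because I think it is easier to understand that way.

The format is: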

with open(filepath) as f:
    # ...省略文件操作....

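That code is roughly equivalent to:

f = open(filepath)
#...file operations omitted....
f.close()

The with form is shorter and easier to use. It also closes the file if an exception is raised inside the block, which the plain version above does not. I won't say more about it.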
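For the curious, here is a sketch of a closer manual equivalent of with, written with try/finally, plus a check that with really does close the file when something goes wrong inside the block. The file name and the RuntimeError are assumptions made up for the demo.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "with_demo.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("some content")

    # Manual version: finally guarantees close() runs even if reading fails
    f = open(path, encoding="utf-8")
    try:
        content_manual = f.read()
    finally:
        f.close()

    # The with version does the same thing in fewer lines
    with open(path, encoding="utf-8") as g:
        content_with = g.read()

    # Even when an exception escapes the block, the file ends up closed
    try:
        with open(path, encoding="utf-8") as h:
            raise RuntimeError("simulated failure inside the block")
    except RuntimeError:
        pass
    closed_after_error = h.closed

print(content_manual == content_with)  # True
print(closed_after_error)              # True
```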