Basic File Operations in Python

frantichow

Article link: https://blog.csdn.net/Bensonofljb/article/details/96589136

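1. Reading a file involves three steps

Open the file
Read it into memory
Close the file

The core logic of basic file operations is to bring the file's contents from disk into memory and then print them.
The main arguments involved are:
File path
Encoding: utf-8, gbk, gb2312
Open mode: read-only (r), write-only (w), append (a), write-read (w+), read-write (r+)

Key point: call the built-in open() function to open a file and obtain a file handle, then operate on the file through that handle.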
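
The three steps above can also be written with a with block, which closes the file automatically even if reading fails. This is a minimal sketch; the file name demo.txt and the temporary directory are assumptions made only for illustration.

```python
import os
import tempfile

# Hypothetical sample file created in a temporary directory for this demo.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, mode="w", encoding="utf-8") as f:
    f.write("hello\n")

# Step 1: open. Step 2: read into memory. Step 3: close (done by `with`).
with open(path, mode="r", encoding="utf-8") as f:
    content = f.read()

print(content)   # hello
print(f.closed)  # True: the handle was closed when the block ended
```

Compared with calling close() by hand, this form cannot forget the last step.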

2. Read-only mode (r)

f = open('文件.txt',mode='r',encoding='utf-8')
content = f.read()
f.close()
print(content)

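# Output
文件

f can be any variable name. It is called a file handle, a file operator, or a file object.

**open()** is a Python built-in function that asks the operating system (Windows, Linux, etc.) to open the file.

If encoding is not given, open() falls back to a platform-dependent default: typically gbk on Chinese-language Windows and utf-8 on Linux.

Text is encoded when it is written and decoded when it is read, so a file must be opened with the same encoding it was saved in. Set the encoding parameter accordingly.

The mode parameter selects how the file is opened. Common values are r, w, a, r+, w+, a+, rb, wb, and ab; the default is r.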
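
To illustrate why the encoding used for reading has to match the one used for writing, here is a small sketch. The file name gbk_demo.txt and the sample text are assumptions for this demo only.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "gbk_demo.txt")  # hypothetical file

# Save the text encoded as gbk.
with open(path, mode="w", encoding="gbk") as f:
    f.write("中文")

# Reading with the matching encoding gives the text back.
with open(path, mode="r", encoding="gbk") as f:
    same = f.read()

# Reading with a different encoding fails for this text.
try:
    with open(path, mode="r", encoding="utf-8") as f:
        f.read()
    mismatch_failed = False
except UnicodeDecodeError:
    mismatch_failed = True

print(same)             # 中文
print(mismatch_failed)  # True
```

A mismatch does not always raise an error; with other byte sequences it can silently produce garbled characters instead.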

3. Binary read mode (rb)

f = open('文件',mode='rb')
content = f.read()
f.close()
print(content)

# Output
b'\xe6\x9d\x8e\xe4\xbf\x8a\xe6\xb3\xa2'

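Data read in rb mode is of type bytes. In rb mode the encoding parameter cannot be specified.

Use cases:

Use rb to read non-text files such as mp3 audio, images, and video, because this kind of data cannot be displayed directly as text. Binary mode is mainly used for transferring and storing data.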
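
As a sketch of the storage use case, the following copies a binary file in fixed-size chunks using rb and wb. The file names, the sample data, and the 4096-byte chunk size are assumptions chosen for the demo.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "source.bin")  # hypothetical binary file
dst = os.path.join(workdir, "copy.bin")

# Create 10240 bytes of sample binary data.
with open(src, mode="wb") as f:
    f.write(bytes(range(256)) * 40)

CHUNK_SIZE = 4096  # read at most this many bytes per call
with open(src, mode="rb") as fin, open(dst, mode="wb") as fout:
    while True:
        chunk = fin.read(CHUNK_SIZE)
        if not chunk:  # an empty bytes object means end of file
            break
        fout.write(chunk)

with open(dst, mode="rb") as f:
    copied = f.read()

print(len(copied))  # 10240
```

Reading in chunks keeps memory use bounded no matter how large the source file is.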

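4. Paths

Relative path

A relative path is resolved against the current working directory. For a file in the same folder, the file name alone is enough.

Absolute path

An absolute path starts at the root directory and runs all the way to the file name.

On Windows, absolute paths contain \, which Python treats as an escape character inside an ordinary string literal, so the path is not read as intended. Solutions:

open('C:\Users\Benson')  # does not work: \U is read as the start of an escape sequence
Solution 1:
open('C:\\Users\\Benson') # each \ is escaped; \\ in the source stands for a single \
Solution 2:
open(r'C:\Users\Benson') # simpler: the r prefix makes a raw string, in which \ is not an escape character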
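
A quick check shows that both solutions produce the same string. As a third option not mentioned above, pathlib from the standard library can build the path from forward slashes; PureWindowsPath is used in this sketch so that it behaves the same on any operating system.

```python
from pathlib import PureWindowsPath

escaped = 'C:\\Users\\Benson'  # solution 1: escaped backslashes
raw = r'C:\Users\Benson'       # solution 2: raw string
print(escaped == raw)          # True

# Alternative: let pathlib insert the separators.
p = PureWindowsPath('C:/Users') / 'Benson'
print(str(p) == raw)           # True
```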

5. Read operations (r, rb)

1. read()

read() loads the entire contents of the file into memory, so a very large file can exhaust memory.

file = open('words/fundstype.txt',mode='r',encoding='utf-8')
content=file.read()
file.close()
print(content)

Output:
普通
短债
货币
保本基金
QDII基金

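read(n) limits how much is read: n characters in text mode, n bytes in binary mode.

file = open('words/fundstype.txt',mode='r',encoding='utf-8')
content=file.read(1) # read one character
content2=file.read() # a later read continues from where the previous one stopped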
file.close()
print(content)
print(content2)

Output:
普
通
短债
货币
保本基金
QDII基金

In r mode, read() returns text (str); in rb mode, it returns bytes.

file = open('words/fundstype.txt',mode='rb')
content=file.read()
file.close()
print(content)

Output:
b'\xe6\x99\xae\xe9\x80\x9a\r\n\xe7\x9f\xad\xe5\x80\xba\r\n\xe8\xb4\xa7\xe5\xb8\x81\r\n\xe4\xbf\x9d\xe6\x9c\xac\xe5\x9f\xba\xe9\x87\x91\r\nQDII\xe5\x9f\xba\xe9\x87\x91'
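
The difference between counting characters and counting bytes can be seen side by side in this small sketch. The file name is an assumption for the demo; the text is the first line of the sample file above.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")  # hypothetical file
with open(path, mode="w", encoding="utf-8") as f:
    f.write("普通")

with open(path, mode="r", encoding="utf-8") as f:
    first_char = f.read(1)   # text mode: 1 means one character

with open(path, mode="rb") as f:
    first_bytes = f.read(3)  # binary mode: 3 means three bytes

print(first_char)                                 # 普
print(first_bytes)                                # b'\xe6\x99\xae'
print(first_bytes.decode("utf-8") == first_char)  # True
```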

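2. readline()

readline() reads one line per call. Note that each line it returns ends with \n (the last line of the file may not), which is why print() shows a blank line after each one.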

file = open('words/fundstype.txt',mode='r',encoding='utf-8')
content1=file.readline()
content2=file.readline()
content3=file.readline()
content4=file.readline()

file.close()
print(content1)
print(content2)
print(content3)
print(content4)

Result:
普通

短债

货币

保本基金

Calling strip() on each line removes the trailing \n:

file = open('words/fundstype.txt',mode='r',encoding='utf-8')
content1=file.readline().strip()
content2=file.readline().strip()
content3=file.readline()
content4=file.readline().strip()

file.close()
print(content1)
print(content2)
print(content3.strip())
print(content4)

Result:
普通
短债
货币
保本基金

3. readlines()

readlines() reads the whole file and returns a list with one element per line. Because everything is read at once, a very large file can exhaust memory.

file = open('words/fundstype.txt',mode='r',encoding='utf-8')
content=file.readlines()

file.close()
print(content)

Result:
['普通\n', '短债\n', '货币\n', '保本基金\n', 'QDII基金']

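For large files, the usual approach is a for loop over the file object, which fetches one line at a time (like repeated readline() calls) and so avoids running out of memory.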

file = open('words/fundstype.txt',mode='r',encoding='utf-8')
for line in file:
    print(line.strip())
file.close()

Result:
普通
短债
货币
保本基金
QDII基金
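
The same loop scales to files with many lines, because only the current line is held in memory. The sketch below generates a 1000-line file and gathers two simple statistics; the file name and its contents are assumptions for the demo.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "big_demo.txt")  # hypothetical file
with open(path, mode="w", encoding="utf-8") as f:
    for i in range(1000):
        f.write(f"line {i}\n")

line_count = 0
longest = ""
with open(path, mode="r", encoding="utf-8") as f:
    for line in f:                    # one line per iteration
        line = line.rstrip("\n")      # drop only the trailing newline
        line_count += 1
        if len(line) > len(longest):  # keep the first line of maximum length
            longest = line

print(line_count)  # 1000
print(longest)     # line 100
```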

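6. Write modes (w, wb)

1. The write() and flush() methods

2. Four steps to write a file

Open the file
Write to the file
Flush to disk
Close the file

When writing a file, get into the habit of flushing once the writing is done: flush() pushes the data buffered in memory out to the file. close() also flushes before closing.

If the file does not exist, w mode creates it. If the file exists, w mode overwrites it: all existing content is cleared the moment the file is opened, before anything is written.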

file = open('words/fundstype.txt',mode='w',encoding='utf-8')
file.write('专户产品')
file.flush()
file.close()

file = open('words/fundstype.txt',mode='r',encoding='utf-8')
for line in file:
    print(line.strip())
file.close()


Result:
专户产品

In wb mode the encoding parameter cannot be specified, so a string must be converted to bytes (here with utf-8) before it is written.

file = open('words/fundstype.txt',mode='wb')
file.write("专户产品".encode('utf-8'))
file.flush()
file.close()

file = open('words/fundstype.txt',mode='r',encoding='utf-8')
content=file.read()
print(content)
file.close()

Output:
专户产品
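
The overwrite behaviour of w mode can be confirmed with a short sketch. The file name and the two strings are assumptions for the demo; the with block takes care of flushing and closing.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "write_demo.txt")  # hypothetical file

with open(path, mode="w", encoding="utf-8") as f:
    f.write("first version, fairly long")

# Opening in w mode again clears the old content before writing.
with open(path, mode="w", encoding="utf-8") as f:
    written = f.write("second")  # write() returns the number of characters

with open(path, mode="r", encoding="utf-8") as f:
    content = f.read()

print(written)  # 6
print(content)  # second
```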

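7. Append modes (a, ab, a+)

In a, ab, and a+ modes, everything written is appended to the end of the file, wherever the cursor happens to be.

If the file does not exist, a mode creates a new file.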

file = open('words/fundstype.txt',mode='a',encoding='utf-8')
file.write('私募')
file.close()

file = open('words/fundstype.txt',mode='r',encoding='utf-8')
content=file.read()
file.close()
print(content)


Result:
专户产品私募
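
As the result shows, write() does not add a line break, so appended text runs straight on from the old content. A small sketch that appends one entry per line instead (the file name is an assumption for the demo):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "append_demo.txt")  # hypothetical file

with open(path, mode="w", encoding="utf-8") as f:
    f.write("专户产品\n")

# Each open in a mode keeps the existing content and adds to the end.
for name in ("私募", "公募"):
    with open(path, mode="a", encoding="utf-8") as f:
        f.write(name + "\n")  # add the newline explicitly

with open(path, mode="r", encoding="utf-8") as f:
    lines = f.read().splitlines()

print(lines)  # ['专户产品', '私募', '公募']
```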

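8. Read-write mode (r+)

In r+ mode, read first and then write. The cursor starts at the beginning of the file, so writing before reading overwrites the existing content from the start; after a full read, the cursor is at the end and the write is appended.

Of the combined read/write modes, r+ is the one used most often.

Pitfall: in r+ text mode, once anything has been read, however little, a following write may not land where the cursor appears to be. For a small file like the one here it goes to the end of the file. To write at a specific position, call seek() first.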

file = open('words/fundstype.txt',mode='r+',encoding='utf-8')
content=file.read()
print(content)
file.write("公募")
file.flush()
print(content)
file.close()

file = open('words/fundstype.txt',mode='r+',encoding='utf-8')
content=file.read()
print(content)
file.close()

Result:
专户产品
专户产品
专户产品公募
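
One way to stay clear of surprises in r+ mode is to call seek() before every write, so the write position is always explicit. The sketch below uses an assumed file name and ASCII sample text so that characters and bytes coincide.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "rplus_demo.txt")  # hypothetical file
with open(path, mode="w", encoding="utf-8") as f:
    f.write("abcdef")

with open(path, mode="r+", encoding="utf-8") as f:
    head = f.read(2)  # partial read
    f.seek(0, 2)      # state explicitly where to write: the end
    f.write("XY")

with open(path, mode="r", encoding="utf-8") as f:
    appended = f.read()

with open(path, mode="r+", encoding="utf-8") as f:
    f.write("ZZ")     # no read first: overwrites from the beginning

with open(path, mode="r", encoding="utf-8") as f:
    overwritten = f.read()

print(appended)     # abcdefXY
print(overwritten)  # ZZcdefXY
```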

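9. Write-read mode (w+)

w+ first clears all existing content, then writes, and finally reads. The read returns nothing, because after the write the cursor is at the end of the file. This mode is rarely used.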

file = open('words/fundstype.txt',mode='w+',encoding='utf-8')
file.write("公募")
file.flush()
content=file.read()
file.close()
print(content)
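
If the written content does need to be read back in w+ mode, moving the cursor to the beginning first makes it visible. A minimal sketch, with an assumed file name:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "wplus_demo.txt")  # hypothetical file

with open(path, mode="w+", encoding="utf-8") as f:
    f.write("公募")
    before = f.read()  # cursor is at the end, so nothing is returned
    f.seek(0)          # move back to the beginning
    after = f.read()   # now the written text is returned

print(repr(before))  # ''
print(after)         # 公募
```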

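10. Append-read modes (a+, a+b)

In a+ mode the cursor is at the end of the file both right after opening and after every write, so read() returns nothing whether it is called before or after writing.

The modes that include b work the same way but operate on bytes.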

file = open('words/fundstype.txt',mode='a+',encoding='utf-8')
file.write("公募")
file.flush()
content=file.read()
file.close()
print(content)
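
As with w+, calling seek(0) before reading lets a+ mode return the whole file, old content included. A minimal sketch, with an assumed file name:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "aplus_demo.txt")  # hypothetical file
with open(path, mode="w", encoding="utf-8") as f:
    f.write("专户产品")

with open(path, mode="a+", encoding="utf-8") as f:
    at_open = f.read()  # cursor starts at the end: nothing to read
    f.write("公募")     # appended after the existing content
    f.seek(0)           # move to the beginning for reading
    whole = f.read()

print(repr(at_open))  # ''
print(whole)          # 专户产品公募
```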

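11. Other related operations

1. seek()

seek(n) moves the cursor to position n. The unit is bytes, so within utf-8 Chinese text n must be a multiple of 3 (each of these characters takes 3 bytes).

seek() is most often used to jump to the beginning or the end of the file.

Move to the beginning: seek(0,0), which can be shortened to seek(0)

A call with a single nonzero argument, such as seek(6), moves the cursor to that byte offset counted from the beginning.

Move to the end: seek(0,2)

The second argument of seek() is the reference point for the offset. The default 0 means the beginning of the file, 1 means the current position, and 2 means the end. In text mode, an offset relative to the current position or the end must be 0; nonzero offsets from those reference points require binary mode.

2. tell()

tell() returns the byte position of the cursor.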

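file = open('words/fundstype.txt',mode='r+',encoding='utf-8')
file.seek(0) # move the cursor to the beginning
content = file.read() # read everything; the cursor is now at the end
print(content)

file.seek(0) # move the cursor back to the beginning
file.seek(0,2) # move the cursor to the end
content2=file.read() # nothing left to read
print(content2)

file.seek(0) # move to the beginning
file.write("陈奕迅") # overwrite from the start; the cursor is now at 9 (3 characters * 3 bytes each in utf-8)
file.flush()
locations=file.tell() # get the current cursor position
print(locations)

file.seek(0) # move the cursor to the beginning again
content3=file.read()
print(content3)

file.close()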

Result:
股票型基金

9
陈奕迅基金
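
The three reference points of seek() are easiest to try out in binary mode, where nonzero offsets from the current position and from the end are allowed. The sketch below uses the named constants from the os module in place of 0, 1, and 2; the file name is an assumption for the demo.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "seek_demo.txt")  # hypothetical file
with open(path, mode="wb") as f:
    f.write("股票型基金".encode("utf-8"))  # 5 characters, 15 bytes

with open(path, mode="rb") as f:
    f.seek(0, os.SEEK_END)   # same as seek(0, 2): jump to the end
    size = f.tell()
    f.seek(-3, os.SEEK_END)  # 3 bytes before the end
    last_char = f.read().decode("utf-8")
    f.seek(3, os.SEEK_SET)   # byte 3 counted from the beginning
    f.seek(3, os.SEEK_CUR)   # 3 more bytes forward from there
    position = f.tell()
    third_char = f.read(3).decode("utf-8")

print(size, last_char, position, third_char)  # 15 金 6 型
```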

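3. truncate()

truncate() deletes everything after the cursor.

To cut a file short, first move the cursor to the position where the cut should happen, then call truncate().
With truncate(n), the file is cut at byte n. Without n, it is cut at the current position, and everything after that point is deleted.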

file = open('words/fundstype.txt',mode='w',encoding='utf-8')
file.write("股票型基金") # write five characters (15 bytes in utf-8)
file.seek(12) # move the cursor to byte 12, just after 基
file.truncate() # delete everything after the cursor
file.close()

file = open('words/fundstype.txt',mode='r+',encoding='utf-8')
content=file.read(3) # read 3 characters
print(content)
file.seek(3)
print(file.tell())
file.truncate() # delete everything after the cursor
file.flush()

content2=file.read()
print(content2)

file.seek(0)
content3=file.read()
print(content3)
file.close()


Result:
股票型
3

股
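
truncate(n) with an explicit size needs no prior seek(). The sketch below opens an assumed demo file in binary mode and keeps only the first 9 bytes, which is exactly three utf-8 Chinese characters.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "truncate_demo.txt")  # hypothetical file
with open(path, mode="w", encoding="utf-8") as f:
    f.write("股票型基金")  # 15 bytes in utf-8

with open(path, mode="r+b") as f:
    new_size = f.truncate(9)  # keep the first 9 bytes; returns the new size

with open(path, mode="r", encoding="utf-8") as f:
    remaining = f.read()

print(new_size, remaining)  # 9 股票型
```

Cutting at a byte count that is not a multiple of 3 would split a character and leave text that no longer decodes as utf-8.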

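12. Modifying a file

The way to modify a file is to read its content into memory, make the changes, write the result to a new file, delete the source file, and then rename the new file to the old name.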

import os
with open('words/fundstype.txt',mode='r',encoding='utf-8') as f1,\
    open('words/fundstype_new.txt',mode='w',encoding='utf-8') as f2:
    content = f1.read()
    new_content=content.replace("股票","货币")
    f2.write(new_content)
os.remove("words/fundstype.txt") # delete the source file
os.rename("words/fundstype_new.txt","words/fundstype.txt") # rename the new file to the original name

Drawback: all the content is read at once, which can exhaust memory.

Solution: read and process the file one line at a time.

import os
with open('fundstype.txt',mode='r',encoding='utf-8') as f1,\
    open('fundstype_new.txt',mode='w',encoding='utf-8') as f2:
    for line in f1:
        new_line=line.replace("股票","货币")
        f2.write(new_line)
os.remove("fundstype.txt") # delete the source file
os.rename("fundstype_new.txt","fundstype.txt") # rename the new file to the original name
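
The line-by-line approach can be wrapped in a small helper. This is only a sketch: the function name replace_in_file and the .tmp suffix are assumptions, and it swaps in os.replace, which overwrites the destination in one step, in place of the os.remove plus os.rename pair used above.

```python
import os
import tempfile

def replace_in_file(path, old, new):
    """Rewrite `path` line by line, replacing `old` with `new`."""
    tmp_path = path + ".tmp"  # hypothetical naming scheme for the new file
    with open(path, mode="r", encoding="utf-8") as src, \
         open(tmp_path, mode="w", encoding="utf-8") as dst:
        for line in src:
            dst.write(line.replace(old, new))
    # os.replace overwrites the destination, so no separate os.remove is needed.
    os.replace(tmp_path, path)

# Usage on a throwaway file in a temporary directory.
demo = os.path.join(tempfile.mkdtemp(), "fundstype.txt")
with open(demo, mode="w", encoding="utf-8") as f:
    f.write("股票型基金\n股票指数\n")

replace_in_file(demo, "股票", "货币")

with open(demo, mode="r", encoding="utf-8") as f:
    updated = f.read()

print(updated)
```

With this variant the original file is never missing: it is either the old version or the new one.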