Python Full-Stack Development, Day 8 (File Operations)

Article link: https://blog.csdn.net/shykevin/article/details/90102210


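I. File Operation Workflow

Open a file with the same encoding it was saved in.

Parameters:

1. File path

2. Encoding, passed as the encoding argument

3. Action (open mode): read-only, write-only, append, read-write, write-read...

Opening an existing file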

f = open(r'D:\qycache\飞碟说.txt',encoding='utf-8',mode='r')
content = f.read()
print(content)
f.close()

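Output:

知识从未如此性感

Code explanation:

f is a variable; it could just as well be named f_obj, file, f_handler... It is called a file handle (a handle is the unique identifier the operating system assigns to an object it creates). In Python terms, f is a file object that wraps that handle.

open is a Python built-in function that asks the operating system to open the file; it is not specific to Windows.

When encoding is omitted, open falls back to a platform-dependent default: typically gbk on Chinese-language Windows and utf-8 on Linux.

f.close() closes the file.

File operation workflow:

Open a file to get a file handle, operate on the handle, then close the file.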
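The workflow above can be sketched as follows. The file name and its content are made up for the demo (the file is created in a temporary directory first), and the try/finally is an addition so that close() still runs if the read fails.

```python
import os
import tempfile

# Hypothetical demo file, created here so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(path, mode='w', encoding='utf-8') as tmp:
    tmp.write('hello')

f = open(path, encoding='utf-8', mode='r')   # 1. open: get a file handle
try:
    content = f.read()                        # 2. operate on the handle
finally:
    f.close()                                 # 3. close, even if step 2 failed

print(content)    # hello
print(f.closed)   # True
```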

Reading:

r: read-only, the content comes back as str

rb: read-only, the content comes back as bytes (use rb for non-text files such as audio)
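A small sketch of the difference between r and rb. The demo file and its content are assumptions; the point is only the type and length of what read() returns.

```python
import os
import tempfile

# Hypothetical demo file holding two Chinese characters, saved as UTF-8.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(demo_path, mode='w', encoding='utf-8') as f:
    f.write('中国')

# Text mode: the bytes are decoded with the given encoding, the result is str.
with open(demo_path, mode='r', encoding='utf-8') as f:
    text = f.read()

# Binary mode: nothing is decoded, the result is bytes (no encoding argument).
with open(demo_path, mode='rb') as f:
    raw = f.read()

print(type(text), len(text))   # <class 'str'> 2
print(type(raw), len(raw))     # <class 'bytes'> 6
```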

Another example:

f = open(r'D:\qycache\飞碟说.txt',encoding='utf-8')
content = f.read()
print(content)
f.close()

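When mode is omitted, it defaults to read-only ('r').

If the encoding passed to open does not match the encoding of the file, an error is raised:

UnicodeDecodeError: 'gbk' codec can't decode byte 0xaa in position 14: illegal multibyte sequence

So, open a file with the same encoding it was saved in.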
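The mismatch can be reproduced in a few lines. This sketch goes in the opposite direction from the error above (the file is stored as gbk and opened as utf-8) because that direction fails reliably for the sample text; the file name and content are assumptions.

```python
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), 'gbk_demo.txt')

# Store the file as GBK ...
with open(demo_path, mode='w', encoding='gbk') as f:
    f.write('中文')

# ... then try to open it as UTF-8: the stored bytes are not valid UTF-8.
mismatch_error = None
try:
    with open(demo_path, encoding='utf-8') as f:
        f.read()
except UnicodeDecodeError as exc:
    mismatch_error = exc
    print('decode failed:', type(exc).__name__)

# Opening with the encoding the file was saved in works.
with open(demo_path, encoding='gbk') as f:
    content = f.read()
```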

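II. File Paths

Absolute path: starts at the root directory and walks down level by level until it reaches the file, for example D:\qycache\飞碟说.txt
Relative path: resolved against the current working directory; when the file sits in that directory, the bare file name is enough.

Relative path example: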

f = open('天气.txt',encoding='utf-8')
content = f.read()
print(content)
f.close()

Make sure the txt file is in the directory the script is run from (usually the same folder as the Python code).
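A short sketch of how a relative name is resolved. The helper path_next_to is a hypothetical name introduced here; it shows one way to anchor a file to the script's folder instead of the working directory.

```python
import os

relative_name = '天气.txt'

# A relative path is joined onto the current working directory.
resolved = os.path.abspath(relative_name)
expected = os.path.join(os.getcwd(), relative_name)
print(resolved == expected)   # True


def path_next_to(script_file, name):
    """Build the path of `name` inside the folder that holds `script_file`."""
    return os.path.join(os.path.dirname(os.path.abspath(script_file)), name)
```

Inside a script, path_next_to(__file__, '天气.txt') would then point at the file next to the script no matter where the script is launched from.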

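On some Windows systems, reading a file fails with:

[Error 22] Invalid argument: '\u202adD:\\xx.txt'

Solutions:

1. Double the backslashes, e.g. 'D:\\xx.txt'

2. Prefix the string with r, e.g. r'D:\xx.txt'

Both fixes stop Python from treating a backslash as the start of an escape sequence. The \u202a in the message is an invisible character that usually comes along when a path is copied from a Windows properties dialog; if it is present, retype the path by hand.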
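To see why the two fixes work, the sketch below compares the string literals directly. The paths are made-up examples and no file is opened.

```python
# '\t' inside a normal string literal is a TAB, so this path is broken.
broken = 'D:\temp.txt'
# Either escape each backslash or use a raw string.
escaped = 'D:\\temp.txt'
raw = r'D:\temp.txt'

print('\t' in broken)    # True
print(escaped == raw)    # True

# A pasted path may start with the invisible character U+202A;
# stripping it gives back the intended path.
pasted = '\u202aD:\\xx.txt'
cleaned = pasted.lstrip('\u202a')
print(cleaned == r'D:\xx.txt')   # True
```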

III. Five Ways to Read a File

In r mode there are 5 ways to read

Method 1: read everything at once with f.read()

f = open('天气.txt',encoding='utf-8')
content = f.read()
print(content)
f.close()

Output:

03月27日(今天)
晴转多云
11～27℃
西南风 1级
重度污染

Method 2: read one line at a time with f.readline()

f = open('天气.txt',encoding='utf-8')
print(f.readline())
print(f.readline())
print(f.readline())
f.close()

Method 3: turn every line of the file into a list element with f.readlines()

f = open('天气.txt',encoding='utf-8')
print(f.readlines())
f.close()

Output:

['03月27日(今天)\n', '晴转多云\n', '11～27℃\n', '西南风 1级\n', '重度污染']

Method 4: read part of the file with f.read(n)

In r mode, read(n) counts in characters (in rb mode it counts in bytes)

f = open('天气.txt',encoding='utf-8')
print(f.read(3))
f.close()

Output:

03月

0, 3 and 月 are 3 characters

Method 5: read with a for loop

f = open('天气.txt',encoding='utf-8')
for i in f:
    print(i.strip())
f.close()

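Output:

03月27日(今天)
晴转多云
11～27℃
西南风 1级
重度污染

The for loop reads one line per iteration, and that line can be freed once the iteration ends, so the loop holds roughly one line in memory at any time instead of the whole file.
Method 5 is the recommended one.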
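The five ways can be compared side by side on one small file. The file name and the three ASCII sample lines are assumptions made for this sketch.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('line one\nline two\nline three')

with open(path, encoding='utf-8') as f:
    whole = f.read()          # 1. the whole file as one str

with open(path, encoding='utf-8') as f:
    first = f.readline()      # 2. one line, trailing newline included

with open(path, encoding='utf-8') as f:
    as_list = f.readlines()   # 3. a list with one element per line

with open(path, encoding='utf-8') as f:
    head = f.read(4)          # 4. the first 4 characters

stripped = []
with open(path, encoding='utf-8') as f:
    for line in f:            # 5. one line per iteration
        stripped.append(line.strip())

print(repr(first))   # 'line one\n'
print(head)          # line
print(stripped)      # ['line one', 'line two', 'line three']
```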

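IV. Writing

w: if the file does not exist, it is created and the content is written

If the file exists, its content is cleared first and then the new content is written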

f = open('log.txt',encoding='utf-8',mode='w')
f.write('人生苦短，我想学Python')
f.close()

wb: writes bytes, so the content must be converted to bytes before writing

f = open('log.txt',mode='wb')
f.write('人生苦短，我想学Python'.encode(encoding="utf-8"))
f.close()

a: append

If the file does not exist, it is created and the content is appended

If the file exists, the content is appended directly

f = open('log2.txt',encoding='utf-8',mode='a')
f.write('666')
f.close()
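A compact sketch of the difference between w and a, using a made-up file in a temporary directory: w replaces whatever was there, a adds to the end.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'log_demo.txt')

with open(path, mode='w', encoding='utf-8') as f:
    f.write('first')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('second')            # 'w' emptied the file before writing
with open(path, encoding='utf-8') as f:
    after_w = f.read()

with open(path, mode='a', encoding='utf-8') as f:
    f.write('666')               # 'a' keeps the content and adds to the end
with open(path, encoding='utf-8') as f:
    after_a = f.read()

print(after_w)   # second
print(after_a)   # second666
```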

r+: read and write; read first, then append.

The wrong way

f = open('log.txt',encoding='utf-8',mode='r+')
f.write('BBB')
content = f.read()
print(content)
f.close()

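Output: empty

Why?

The cursor starts at 0 and moves with every operation, reads included.
Here write('BBB') overwrites from position 0 and leaves the cursor right after the written text, so f.read() only returns whatever follows it. If the file was no longer than the written text, nothing comes back.

With r+, always read first and then write; otherwise content gets overwritten or the read comes back empty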
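For contrast, a minimal sketch of the read-first order with r+. The file name and the starting content 'AAA' are assumptions.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'rplus_demo.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('AAA')

with open(path, mode='r+', encoding='utf-8') as f:
    before = f.read()   # reading moves the cursor to the end of the file
    f.write('BBB')      # so the write lands after the existing content

with open(path, encoding='utf-8') as f:
    after = f.read()

print(before)   # AAA
print(after)    # AAABBB
```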

w+: write first, then read

f = open('log.txt',encoding='utf-8',mode='w+')
f.write('AAA')
content = f.read()
print(content)
f.close()

Output: empty

After writing, the cursor sits at the end, so the read returns nothing.

f = open('log.txt',encoding='utf-8',mode='w+')
f.write('AAA')
print(f.tell()) #report the cursor position, in bytes
f.seek(0) #move the cursor back to the start
content = f.read()
print(content)
f.close()

Output:

3
AAA

Another example

f = open('log.txt',encoding='utf-8',mode='w+')
f.write('中国')
print(f.tell()) #report the cursor position, in bytes
f.seek(2) #move the cursor to byte 2, the middle of the first character
content = f.read()
print(content)
f.close()

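Output:

.....

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xad in position 0: invalid start byte

A Chinese character takes 3 bytes in UTF-8, and seek counts in bytes, so seek(2) lands in the middle of a character

Resumable transfers, as in FTP, depend on the cursor: tell and seek are the tools for that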
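The idea behind a resumable transfer can be sketched with a local copy: look at how many bytes already arrived, seek past them in the source, and append the rest. resume_copy and the sample bytes are hypothetical; a real FTP client works over the network, not between two local files.

```python
import os
import tempfile


def resume_copy(src_path, dst_path, chunk_size=64 * 1024):
    """Copy src to dst, continuing after the bytes dst already holds."""
    already = os.path.getsize(dst_path) if os.path.exists(dst_path) else 0
    with open(src_path, 'rb') as src, open(dst_path, 'ab') as dst:
        src.seek(already)            # skip what was transferred earlier
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
        return src.tell()            # final byte offset in the source


work_dir = tempfile.mkdtemp()
src_file = os.path.join(work_dir, 'source.bin')
dst_file = os.path.join(work_dir, 'partial.bin')
with open(src_file, 'wb') as f:
    f.write(b'0123456789')
with open(dst_file, 'wb') as f:
    f.write(b'0123')                 # pretend the transfer stopped here

final_offset = resume_copy(src_file, dst_file, chunk_size=4)
with open(dst_file, 'rb') as f:
    result = f.read()

print(final_offset)   # 10
print(result)         # b'0123456789'
```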

a+: append and read

No example is given here
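For completeness, a minimal a+ sketch under the same assumptions as the w+ example (made-up file, starting content 'AAA'). Writes always go to the end, and the cursor has to be moved back before reading.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'aplus_demo.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('AAA')

with open(path, mode='a+', encoding='utf-8') as f:
    f.write('BBB')       # appended after the existing content
    f.seek(0)            # move the cursor to the start before reading
    content = f.read()

print(content)   # AAABBB
```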

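Many modes have been listed above

The most commonly used are still r and r+

Of the 5 ways to read in r mode, focus on method 5.

Other methods:

truncate #truncate the file
writable() #whether the file is writable
readable() #whether the file is readable

truncate cuts the file down to a given size, so the file must be opened in a writable mode. w and w+ are no use here because they empty the file on open, so try truncate in r+, a or a+ mode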

f = open('log.txt',encoding='utf-8',mode='r+')
# keep only the first 3 bytes
f.truncate(3)
content = f.read()
print(content)
f.close()

Output:

中

Checking whether the file is writable

f = open('log.txt',encoding='utf-8',mode='r+')
print(f.writable())
f.close()

Output:

True
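readable() and writable() make it easy to tabulate what each mode allows. The sketch below opens one made-up file in every text mode discussed so far and records both answers.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes.txt')
open(path, mode='w', encoding='utf-8').close()   # the file must exist for 'r'

capabilities = {}
for mode in ['r', 'w', 'a', 'r+', 'w+', 'a+']:
    with open(path, mode=mode, encoding='utf-8') as f:
        capabilities[mode] = (f.readable(), f.writable())

for mode, (can_read, can_write) in capabilities.items():
    print(mode, can_read, can_write)
```

Only the modes with a plus sign report True for both.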

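Resources are released in two ways:
1. f.close() #releases the file opened at the operating-system level and closes the file handle
2. del f #releases the application-level variable, i.e. deletes the variable in Python code

To avoid forgetting to release the file handle, use with open; the handle is closed automatically when the block finishes

Feature 1: closes the file handle automatically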

with open('log.txt',encoding='utf-8') as f:
    print(f.read())

Feature 2: work with several files at once

with open('log.txt',encoding='utf-8') as f1,\
    open('log1.txt',encoding='utf-8',mode='r+') as f2:
    print(f1.read())
    print(f2.read())

In the rare cases where a file must be closed before some later action can run, with is not a good fit.

with open is recommended
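One reason with open is the safer default: the handle is closed even when the block raises. The sketch below simulates a failure with a made-up ValueError and then checks f.closed.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('data')

try:
    with open(path, encoding='utf-8') as f:
        raise ValueError('simulated failure inside the block')
except ValueError:
    pass

closed_after_error = f.closed
print(closed_after_error)   # True
```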

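Programs that edit files generally do not modify the original file in place.
They go through 5 steps

1. Read the original file into memory.
2. Modify it in memory to produce the new content.
3. Write the new string to a new file.
4. Delete the original file.
5. Rename the new file to the original name.

Replace every 张三 in the log file with 李四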

import os
#step 1
with open('log',encoding='utf-8') as f1,\
    open('log.bak',encoding='utf-8',mode='w') as f2:
    content = f1.read()
    #step 2
    new_content = content.replace('张三','李四')
    #step 3
    f2.write(new_content)
#step 4
os.remove('log')
#step 5
os.rename('log.bak','log')

This approach is poor: f1.read() loads the entire file into memory, so a large file can exhaust it.

Recommended approach:

import os
#step 1: read the original file line by line
with open('log',encoding='utf-8') as f1,\
    open('log.bak',encoding='utf-8',mode='w') as f2:
    for i in f1:
        #step 2
        new_i = i.replace('张三', '李四')
        #step 3
        f2.write(new_i)
#step 4
os.remove('log')
#step 5
os.rename('log.bak','log')

This way only one line is held in memory at a time.
Most programs that edit files follow these 5 steps.
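The steps can be wrapped in a reusable function. This is a sketch, not the course's code: replace_in_file is a hypothetical helper, and it swaps os.remove plus os.rename for a single os.replace call, which overwrites the target in one step on both Windows and Linux.

```python
import os
import tempfile


def replace_in_file(path, old, new, encoding='utf-8'):
    """Rewrite `path` line by line with every `old` replaced by `new`."""
    tmp_path = path + '.bak'
    with open(path, encoding=encoding) as src, \
            open(tmp_path, mode='w', encoding=encoding) as dst:
        for line in src:
            dst.write(line.replace(old, new))
    # Steps 4 and 5 in one call: the new file takes the original's name.
    os.replace(tmp_path, path)


log_path = os.path.join(tempfile.mkdtemp(), 'log')
with open(log_path, mode='w', encoding='utf-8') as f:
    f.write('张三 logged in\n张三 logged out\n')

replace_in_file(log_path, '张三', '李四')

with open(log_path, encoding='utf-8') as f:
    rewritten = f.read()
```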

Homework:

1. File a.txt: each line holds a product name, price and quantity.
apple 10 3
tesla 100000 1
mac 3000 2
lenovo 30000 3
chicken 10 3
Write code that builds this data structure: [{'name':'apple','price':10,'amount':3},{'name':'tesla','price':100000,'amount':1}......] and computes the total price.
2. Given the following file:
-------
alex是老男孩python发起人，创建人。
alex其实是人妖。
谁说alex是sb？
你们真逗，alex再牛逼，也掩饰不住资深屌丝的气质。
----------
Replace every alex in the file with uppercase SB.




3. Contents of file a1.txt:
name:apple price:10 amount:3 year:2012
name:tesla price:100000 amount:1 year:2013

Write code that builds this data structure:
[{'name':'apple','price':10,'amount':3},
{'name':'tesla','price':100000,'amount':1}......]
and computes the total price.
4. Contents of file a2.txt:
序号     部门      人数      平均年龄      备注
1       python    30         26         单身狗
2       Linux     26         30         没对象
3       运营部     20         24         女生多
Write code that builds this data structure:
[{'序号':'1','部门':'python','人数':30,'平均年龄':26,'备注':'单身狗'},
......]
5. To write from memory tomorrow: the code for exercise 2 (covered in class).

Answers:

Exercise 1

1.1 First read the file contents

with open('a.txt',encoding='utf-8') as f:
    for i in f:
        i = i.strip()
        print(i)

Output:

apple 10 3
tesla 100000 1
mac 3000 2
lenovo 30000 3
chicken 10 3

1.2 Split each line on whitespace into a list

with open('a.txt',encoding='utf-8') as f:
    for i in f:
        i = i.strip().split()
        print(i)

Output:

['apple', '10', '3']
['tesla', '100000', '1']
['mac', '3000', '2']
['lenovo', '30000', '3']
['chicken', '10', '3']

1.3 Put the list items into a dict and print it as a test

with open('a.txt',encoding='utf-8') as f:
    for i in f:
        i = i.strip().split()
        print({'name': i[0], 'price': i[1], 'amount': i[2]})

Output:

{'amount': '3', 'name': 'apple', 'price': '10'}
{'amount': '1', 'name': 'tesla', 'price': '100000'}
{'amount': '2', 'name': 'mac', 'price': '3000'}
{'amount': '3', 'name': 'lenovo', 'price': '30000'}
{'amount': '3', 'name': 'chicken', 'price': '10'}

1.4 Append each dict to an initially empty list

#master list
li = []
with open('a.txt',encoding='utf-8') as f:
    for i in f:
        #split() splits on whitespace by default
        i = i.strip().split()
        #append the dict to the list
        li.append({'name': i[0], 'price': i[1], 'amount': i[2]})

print(li)

Output:

[{'price': '10', 'name': 'apple', 'amount': '3'}, {'price': '100000', 'name': 'tesla', 'amount': '1'}, {'price': '3000', 'name': 'mac', 'amount': '2'}, {'price': '30000', 'name': 'lenovo', 'amount': '3'}, {'price': '10', 'name': 'chicken', 'amount': '3'}]

1.5 Compute the total price; the complete code is:

#master list
li = []
#total price
the_sum = 0
with open('a.txt',encoding='utf-8') as f:
    for i in f:
        #split() splits on whitespace by default
        i = i.strip().split()
        #append the dict to the list
        li.append({'name': i[0], 'price': i[1], 'amount': i[2]})

#iterate over the list
for i in li:
    #add up the total (unit price * quantity)
    the_sum += int(i['price']) * int(i['amount'])

print(li)
print('总价钱为: {}'.format(the_sum))

Output:

[{'amount': '3', 'name': 'apple', 'price': '10'}, {'amount': '1', 'name': 'tesla', 'price': '100000'}, {'amount': '2', 'name': 'mac', 'price': '3000'}, {'amount': '3', 'name': 'lenovo', 'price': '30000'}, {'amount': '3', 'name': 'chicken', 'price': '10'}]
总价钱为: 196060

The teacher's code:

li = []
the_sum = 0
name_list = ['name','price','amount','year']
with open('a.txt',encoding='utf-8') as f1:
    for i in f1:
        l2 = i.strip().split()
        dic = {}
        for j in range(len(l2)):
            dic[name_list[j]] = l2[j]
        li.append(dic)
#iterate over the list
for i in li:
    #add up the total (unit price * quantity)
    the_sum += int(i['price']) * int(i['amount'])

print(li)
print('总价钱为: {}'.format(the_sum))

Same output as above
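The exercise asks for numeric price and amount values, while the answers above keep them as strings. A possible variant that converts them while parsing is sketched below; it writes its own copy of a.txt to a temporary directory so it can run on its own, and the variable names are made up.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'a.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('apple 10 3\ntesla 100000 1\nmac 3000 2\nlenovo 30000 3\nchicken 10 3\n')

goods = []
with open(path, encoding='utf-8') as f:
    for line in f:
        if not line.strip():
            continue                      # ignore blank lines
        name, price, amount = line.split()
        goods.append({'name': name, 'price': int(price), 'amount': int(amount)})

# unit price * quantity, summed over all products
total = sum(item['price'] * item['amount'] for item in goods)

print(goods[0])   # {'name': 'apple', 'price': 10, 'amount': 3}
print(total)      # 196060
```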

Exercise 2 was covered in class, so here is the code directly:

import os
with open('log.txt',encoding='utf-8') as f1,\
    open('log.bak',encoding='utf-8',mode='w') as f2:
    for i in f1:
        new_i = i.replace('alex', 'SB')
        f2.write(new_i)

os.remove('log.txt')
os.rename('log.bak','log.txt')

Run the code; the file now contains:

SB是老男孩python发起人，创建人。
SB其实是人妖。
谁说SB是sb？
你们真逗，SB再牛逼，也掩饰不住资深屌丝的气质。

Exercise 3 is similar to exercise 1

3.1 Read the file and pull out each value

with open('log.txt', encoding='utf-8') as f:
    for i in f:
        # split() splits on whitespace by default
        i = i.strip().split()
        #split on the colon and take the value
        name = i[0].split(':')[1]
        price = i[1].split(':')[1]
        amount = i[2].split(':')[1]
        year = i[3].split(':')[1]
        print(name,price,amount,year)

Output:

apple 10 3 2012
tesla 100000 1 2013

3.2 Put the extracted values into a dict and append it to the master list

# master list
li = []
with open('log.txt', encoding='utf-8') as f:
    for i in f:
        # split() splits on whitespace by default
        i = i.strip().split()
        #split on the colon and take the value
        name = i[0].split(':')[1]
        price = i[1].split(':')[1]
        amount = i[2].split(':')[1]
        year = i[3].split(':')[1]
        #append the dict to the list
        li.append({'name':name, 'price': price, 'amount': amount,'year':year})

print(li)

Output:

[{'name': 'apple', 'price': '10', 'year': '2012', 'amount': '3'}, {'name': 'tesla', 'price': '100000', 'year': '2013', 'amount': '1'}]

3.3 Iterate over the master list and compute the total price; the final code is:

# master list
li = []
# total price
the_sum = 0
with open('log.txt', encoding='utf-8') as f:
    for i in f:
        # split() splits on whitespace by default
        i = i.strip().split()
        # split on the colon and take the value
        name = i[0].split(':')[1]
        price = i[1].split(':')[1]
        amount = i[2].split(':')[1]
        year = i[3].split(':')[1]
        # append the dict to the list
        li.append({'name': name, 'price': price, 'amount': amount, 'year': year})

# iterate over the master list
for i in li:
    # add up the total (unit price * quantity)
    the_sum += int(i['price']) * int(i['amount'])

print(li)
print('总价钱为: {}'.format(the_sum))

Output:

[{'year': '2012', 'amount': '3', 'price': '10', 'name': 'apple'}, {'year': '2013', 'amount': '1', 'price': '100000', 'name': 'tesla'}]
总价钱为: 100030

The teacher's code: easier to extend, it keeps working whether more fields (columns) or more records (rows) are added

l1 = []
the_sum = 0
with open('a1.txt',encoding='utf-8') as f1:
    for i in f1:
        li = i.strip().split()
        dic = {}
        for j in li:
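            #split the string on the colon into a list
            l2 = j.strip().split(':')
            #add to the dict: l2[0] is the key, l2[1] is the value
            dic[l2[0]] = l2[1]
        #append the dict to the list
        l1.append(dic)

# iterate over the master list
for i in l1:
    # add up the total (unit price * quantity)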
    the_sum += int(i['price']) * int(i['amount'])

print(l1)
print('总价钱为: {}'.format(the_sum))

Same output as above.
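The key:value parsing can also be written with dict() over the split pairs. This is an alternative sketch rather than the course's solution; it creates its own a1.txt in a temporary directory and converts price and amount to int so the total needs no further casts.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'a1.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('name:apple price:10 amount:3 year:2012\n'
            'name:tesla price:100000 amount:1 year:2013\n')

records = []
with open(path, encoding='utf-8') as f:
    for line in f:
        # each 'key:value' field becomes one [key, value] pair
        record = dict(field.split(':', 1) for field in line.split())
        record['price'] = int(record['price'])
        record['amount'] = int(record['amount'])
        records.append(record)

total = sum(r['price'] * r['amount'] for r in records)
print(total)   # 100030
```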

Exercise 4

4.1 First read the file contents

with open('log.txt', encoding='utf-8') as f:
    for i in f:
        i = i.strip()
        print(i)

Output:

序号 部门 人数 平均年龄 备注
1 python 30 26 单身狗
2 Linux 26 30 没对象
3 运营部 20 24 女生多

4.2 The first line (序号 部门 人数 平均年龄 备注) is needed as the header row, stored in a list

#line counter
line_num = 0
with open('log.txt', encoding='utf-8') as f:
    for i in f:
        i = i.strip()
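        #increment the line counter
        line_num += 1
        #check whether this is line 1
        if (line_num == 1):
            #strip the line and split it on whitespace
            title = i.strip().split()
            #store the column titles in the header list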
            title_bar = [title[0],title[1],title[2],title[3],title[4]]

print(title_bar)

Output:

['序号', '部门', '人数', '平均年龄', '备注']

4.3 Extract the values that belong under each column (序号 部门 人数 平均年龄 备注)

#line counter
line_num = 0
with open('log.txt', encoding='utf-8') as f:
    for i in f:
        i = i.strip()
        #increment the line counter
        line_num += 1
        #check whether this is line 1
        if (line_num == 1):
            #strip the line and split it on whitespace
            title = i.strip().split()
            #store the column titles in the header list
            title_bar = [title[0],title[1],title[2],title[3],title[4]]
        else:
            #split the data row
            split_data = i.strip().split()
            #serial number
            id = split_data[0]
            #department
            department = split_data[1]
            #headcount
            number = split_data[2]
            #average age
            average_age = split_data[3]
            #remarks
            remarks = split_data[4]
            #print the result
            print(id,department,number,average_age,remarks)

Output:

1 python 30 26 单身狗
2 Linux 26 30 没对象
3 运营部 20 24 女生多

4.4 Change the final print so that it outputs a dict

#line counter
line_num = 0
li = []
with open('log.txt', encoding='utf-8') as f:
    for i in f:
        i = i.strip()
        #increment the line counter
        line_num += 1
        #check whether this is line 1
        if (line_num == 1):
            #strip the line and split it on whitespace
            title = i.strip().split()
            #store the column titles in the header list
            title_bar = [title[0],title[1],title[2],title[3],title[4]]
        else:
            #split the data row
            split_data = i.strip().split()
            #serial number
            id = split_data[0]
            #department
            department = split_data[1]
            #headcount
            number = split_data[2]
            #average age
            average_age = split_data[3]
            #remarks
            remarks = split_data[4]
            #print the result
            print({title_bar[0]:id,title_bar[1]:department,title_bar[2]:number,title_bar[3]:average_age,title_bar[4]:remarks})

Output:

{'人数': '30', '备注': '单身狗', '序号': '1', '平均年龄': '26', '部门': 'python'}
{'人数': '26', '备注': '没对象', '序号': '2', '平均年龄': '30', '部门': 'Linux'}
{'人数': '20', '备注': '女生多', '序号': '3', '平均年龄': '24', '部门': '运营部'}

4.5 Add each dict to the master list instead of printing it; the final code is:

#line counter
line_num = 0
#master list
li = []
with open('log.txt', encoding='utf-8') as f:
    for i in f:
        i = i.strip()
        #increment the line counter
        line_num += 1
        #check whether this is line 1
        if (line_num == 1):
            #strip the line and split it on whitespace
            title = i.strip().split()
            #store the column titles in the header list
            title_bar = [title[0],title[1],title[2],title[3],title[4]]
        else:
            #split the data row
            split_data = i.strip().split()
            #serial number
            id = split_data[0]
            #department
            department = split_data[1]
            #headcount
            number = split_data[2]
            #average age
            average_age = split_data[3]
            #remarks
            remarks = split_data[4]
            #add the dict to the list
            li.append({title_bar[0]:id,title_bar[1]:department,title_bar[2]:number,title_bar[3]:average_age,title_bar[4]:remarks})

print(li)

Output:

[{'备注': '单身狗', '平均年龄': '26', '人数': '30', '序号': '1', '部门': 'python'}, {'备注': '没对象', '平均年龄': '30', '人数': '26', '序号': '2', '部门': 'Linux'}, {'备注': '女生多', '平均年龄': '24', '人数': '20', '序号': '3', '部门': '运营部'}]

The teacher's code: easier to extend, it keeps working whether more fields (columns) or more records (rows) are added

l1 = []
with open('a2.txt',encoding='utf-8') as f1:
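    #read the first line, strip it and split it on whitespace into a list
    list_name = f1.readline().strip().split()
    #the loop starts from the second line, because after one line is read the cursor has moved on to the next line,
    #so the for loop never sees the first line
    for i in f1:
        dic = {}
        #strip the line and split it on whitespace into a list
        i = i.strip().split()
        #iterate over the indexes of i
        for j in range(len(i)):
            #add to the dict: list_name[j] is the key, i[j] is the value, e.g. '序号': '1'
            dic[list_name[j]] = i[j]
        #append to the list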
        l1.append(dic)

print(l1)

Same output as above
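The inner index loop can be replaced by zip, which pairs each header with the value in the same column. This sketch is an alternative to the code above, not part of the course material; it writes its own copy of a2.txt to a temporary directory so it runs standalone.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'a2.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('序号     部门      人数      平均年龄      备注\n'
            '1       python    30         26         单身狗\n'
            '2       Linux     26         30         没对象\n'
            '3       运营部     20         24         女生多\n')

rows = []
with open(path, encoding='utf-8') as f:
    headers = f.readline().split()       # first line: column titles
    for line in f:                       # remaining lines: data rows
        values = line.split()
        if values:
            rows.append(dict(zip(headers, values)))

print(len(rows))   # 3
```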