Chapter 5: File Handling (General-Purpose File Copy Tool; Using seek to Move the File Pointer and Test Reads and Writes in r+, w+, and a+ Modes) - CSDN Blog

Article link: https://blog.csdn.net/qq_41405475/article/details/113529571

Chapter 5: File Handling

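I. What Is a File?

A file is a virtual concept / interface that the operating system provides so that users and applications can work with the hard disk. The layers are:
  User / application
  Operating system (files)
  Computer hardware (hard disk)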

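II. Why Use Files?

Files let users and applications store data permanently on disk (operating on a file is, in effect, operating on the disk).
Users and applications only ever work with files directly. Every file operation is a system call sent to the operating system, which then translates it into the actual disk operations.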

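III. How to Use Files

open() opens a file. Its mode string combines two kinds of flags.

Flags that control the type of content read and written: t and b
Note: t and b cannot be used on their own; they must be combined with r / w / a.
t: text (the default)
  1. Reads and writes work in units of str (Unicode).
  2. Only suitable for text files.
  3. An encoding should be passed to open(), for example encoding='utf-8'; otherwise the platform default is used.
b: binary / bytes (reads and writes work in units of bytes, and no encoding argument is allowed)

Flags that control the read/write operation:
  r: read-only
  w: write-only
  a: append-only
  +: combined with the above as r+, w+, a+ (read and write)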
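As a small illustration of the difference between t and b, the sketch below writes one short string and reads it back in both modes. The scratch file and its name `demo.txt` are assumptions made only for this example.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, mode="wt", encoding="utf-8") as f:
    f.write("hi 你好")

with open(path, mode="rt", encoding="utf-8") as f:
    text = f.read()      # str, decoded from UTF-8

with open(path, mode="rb") as f:
    raw = f.read()       # bytes, exactly as stored on disk

print(type(text), len(text))   # 5 characters
print(type(raw), len(raw))     # 9 bytes: 3 ASCII bytes + 3 bytes per Chinese character
```

The same file yields 5 characters in t mode but 9 bytes in b mode, which is why text mode needs to know the encoding.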

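IV. Basic File Operations

1. Opening a file
① The Windows path separator problem:
open('C:\a\nb\c\d.txt')  # broken: \a and \n are interpreted as escape sequences
Solution 1 (recommended):
open(r'C:\a\nb\c\d.txt')  # the r prefix makes a raw string, so backslashes are not treated as escapes
Solution 2:
open('C:/a/nb/c/d.txt')  # forward slashes are also accepted as path separators on Windows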
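To see what the escape problem actually does to a path, the following sketch compares a plain string with a raw string. The path `C:\a\nb.txt` is a made-up example and is never opened.

```python
plain = 'C:\a\nb.txt'    # \a becomes the BEL character and \n becomes a newline
raw = r'C:\a\nb.txt'     # raw string: every backslash is kept literally

print(repr(plain))
print(repr(raw))
print(len(plain), len(raw))
```

The plain string is two characters shorter because each escape sequence collapses into a single character, so it no longer names the intended file.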
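② open() uses resources in both the application and the operating system
f = open(r'F:\Python学习相关\正课\第2周\day05\world.txt', mode='rt')
# f is a variable (a file object) that lives in the application's memory; the open file itself is a resource held by the operating system
print(f)

x = 10  # lives in the application's memory (the Python interpreter)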
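③ Absolute vs. relative paths (for reference)
  Absolute path: the complete path of the file

    Pros: the path is complete, so the file is easy to locate

    Cons: the path can be long, and the file can no longer be found if any parent directory is renamed or moved

C:\Users\Darker\Desktop\a.txt
  Relative path: a path relative to the current working directory

    Pros: the path is short

    Cons: the file can no longer be found if the program runs from a different directory

a.txt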
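2. Working with the file (reading / writing)
Every read or write request from the application is a system call to the operating system, which then has the disk read the data into memory or write it to disk.
f = open(r'C:\Users\Darker\Desktop\aaa.txt', mode='rt', encoding='UTF-8')
res = f.read()
print(res)
The file object is also called a file handle.
A with statement closes the file automatically when its block ends:
with open(r'C:\Users\Darker\Desktop\aaa.txt', mode='rt', encoding='UTF-8') as f:
    res = f.read()
    print(res)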
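3. Closing the file
f.close()  # release the operating system resource first
del f      # then drop the Python variable (optional); deleting f first would make f.close() fail with NameError
When a with statement opens several files and the line gets too long, a trailing backslash "\" continues it on the next line:
with open(r'C:\Users\Darker\Desktop\aaa.txt', mode='rt', encoding='UTF-8') as f1, \
        open(r'C:\Users\Darker\Desktop\aaa.txt', mode='rt', encoding='UTF-8') as f2:
    res1 = f1.read()
    res2 = f2.read()

    print(res1)
    print(res2)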
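The two ways of closing a file can be checked through the file object's `closed` attribute. This is a minimal sketch that uses a temporary scratch file (an assumption for the demo) instead of the Desktop path above.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "aaa.txt")  # hypothetical scratch file

f = open(path, mode="wt", encoding="utf-8")
f.write("hello")
f.close()                  # without this, buffered data may not reach the disk yet
closed_manually = f.closed

with open(path, mode="rt", encoding="utf-8") as g:
    content = g.read()
closed_by_with = g.closed  # the with block closed g on exit

print(closed_manually, closed_by_with, content)
```

The with form is usually preferable because the file is closed even if the block raises an exception.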

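V. Specifying the Character Encoding

If the encoding argument is omitted, open() falls back to the operating system's default encoding.
with open(r'C:\Users\Darker\Desktop\aaa.txt', mode='rt') as f1:
    res1 = f1.read()  # in t mode, f.read() decodes the bytes it reads into str (Unicode)
    print(res1, type(res1))

# On a Windows system whose default encoding is GBK, reading this UTF-8 file raises:
# UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 16: illegal multibyte sequence
Linux and macOS default to UTF-8.
Windows (with a Chinese locale) defaults to GBK.
Memory: UTF-8 bytes ---- decode ----> Unicode (str)
Disk (contents of aaa.txt): UTF-8 bytes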
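The effect of a mismatched encoding can be reproduced on any platform by passing the wrong encoding explicitly. The sketch below is an illustration under that assumption; the scratch file and sample text are made up.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "aaa.txt")  # hypothetical scratch file
text = "你好, world"

with open(path, mode="wt", encoding="utf-8") as f:
    f.write(text)

# Same encoding on both sides: the text survives the round trip.
with open(path, mode="rt", encoding="utf-8") as f:
    same = f.read()

# Mismatched encoding: either decoding fails or the result is garbled.
try:
    with open(path, mode="rt", encoding="gbk") as f:
        wrong = f.read()
except UnicodeDecodeError:
    wrong = None

print(same == text, wrong == text)
```

Passing encoding explicitly on every open() call avoids depending on the platform default.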

VI. The r / w / a Modes in Detail

All the examples below use t (text) mode.
1. r (the default mode): read-only
If the file does not exist: an error is raised (FileNotFoundError)
If the file exists: the file pointer is placed at the start of the file
with open(r'C:\Users\Darker\Desktop\aaa.txt', mode='rt', encoding='UTF-8') as f:
    print('1st read'.center(50, '*'))
    res = f.read()  # read the entire content from disk into memory
    print(res)

    print('2nd read'.center(50, '*'))
    res1 = f.read()  # the pointer is already at the end, so this returns an empty string
    print(res1)

Output:
*********************1st read*********************
滚滚长江东逝水
滚滚长江东逝
滚滚长江东
滚滚长江
滚滚长
滚滚
滚
*********************2nd read*********************
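The second read comes back empty because the file pointer stays at the end after the first read. As a minimal sketch (with a made-up scratch file), moving the pointer back with `seek(0)` makes the content readable again:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "aaa.txt")  # hypothetical scratch file
with open(path, mode="wt", encoding="utf-8") as f:
    f.write("line 1\nline 2\n")

with open(path, mode="rt", encoding="utf-8") as f:
    first = f.read()    # whole file; the pointer is now at the end
    second = f.read()   # nothing left to read
    f.seek(0)           # move the pointer back to the start
    third = f.read()    # whole file again

print(repr(first), repr(second), repr(third))
```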
Example: a simple login check
Contents of user.txt:
qwe:123
asd:456
zxc:789


inp_username = input('Your name:').strip()
inp_password = input('Your password:').strip()

with open('user.txt', mode='rt', encoding='UTF-8') as f:
    for line in f:
        username, password = line.strip().split(':')
        if inp_username == username and inp_password == password:
            print('Successful')
            break
    else:  # runs only if the loop finished without break, i.e. no line matched
        print('False')
Where the data can live:
Application ==> file
Application ==> database management software ==> file
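2. w: write-only mode
If the file does not exist: an empty file is created
If the file exists: the file is emptied and the file pointer is placed at the start
with open('d.txt', mode='wt', encoding='utf-8') as f:
    # f.read()  # raises an error: write-only, cannot read
    f.write('111\n123')  # open() already emptied the file; this writes the new content
Note 1:
  While a file opened in w mode stays open, consecutive writes follow one another: new content always goes after what was just written

with open('d.txt', mode='wt', encoding='utf-8') as f:
    # f.read()  # raises an error: write-only, cannot read
    f.write('111\n')  # written at the start of the emptied file
    f.write('222\n')  # goes after '111\n'
    f.write('333\n')  # goes after '222\n'
Note 2:
  Reopening the file in w mode empties it again

with open('d.txt', mode='wt', encoding='utf-8') as f:
    f.write('111\n')  # the file now holds only 111
with open('d.txt', mode='wt', encoding='utf-8') as f:
    f.write('222\n')  # emptied on open; the file now holds only 222
with open('d.txt', mode='wt', encoding='utf-8') as f:
    f.write('333\n')  # emptied on open; the file now holds only 333
Example: w mode is used to create brand-new files

A copy tool for text files
src_file = input('Source file path: ').strip()
dst_file = input('Destination file path: ').strip()
with open(src_file, mode='rt', encoding='utf-8') as f1, \
        open(dst_file, mode='wt', encoding='utf-8') as f2:
    res = f1.read()
    f2.write(res)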
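3. a: append-only mode
If the file does not exist: an empty file is created, with the file pointer at the start
If the file exists: the file pointer jumps straight to the end
with open('e.txt', mode='at', encoding='utf-8') as f:
    # f.read()    # raises an error: cannot read
    f.write('我去1\n')
    f.write('我去2\n')
    f.write('我去3\n')
Note: similarities and differences between w mode and a mode
1. In common
    While the file stays open, consecutive writes follow one another: new content always goes after what was written before

2. Differences
    Reopening a file in a mode does not empty it; the file pointer moves straight to the end of the file

    Reopening a file in w mode empties it; the file pointer moves to the start of the file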
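The difference between reopening in w mode and in a mode can be checked with a short experiment. This is a sketch on a temporary scratch file; `read_all` is a helper defined only for this example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "e.txt")  # hypothetical scratch file

def read_all(p):
    with open(p, mode="rt", encoding="utf-8") as f:
        return f.read()

# Reopening in w mode three times: each open() empties the file.
for part in ("111\n", "222\n", "333\n"):
    with open(path, mode="wt", encoding="utf-8") as f:
        f.write(part)
after_w = read_all(path)

# Reopening in a mode three times: each write lands at the end.
for part in ("444\n", "555\n", "666\n"):
    with open(path, mode="at", encoding="utf-8") as f:
        f.write(part)
after_a = read_all(path)

print(repr(after_w))
print(repr(after_a))
```

Only the last w write survives, while every a write is kept.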

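Example: a mode is used to add new content on top of a file's existing content, for instance when writing logs or registering users
Registration:
name = input('Enter a username: ')
pwd = input('Enter a password: ')
with open('db.txt', mode='at', encoding='utf-8') as f:
    f.write('{}:{}\n'.format(name, pwd))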

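VII. For Reference: the + Modes

+ cannot be used on its own; it must be combined with r, w, or a, and adds the missing read or write capability.

r+: raises an error if the file does not exist
w+: creates a new file if the file does not exist (and empties it if it does)
a+: creates a new file if the file does not exist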
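How each + mode treats a missing file can be verified directly. The sketch below tries all three in an empty temporary directory; `try_open` and the file names are assumptions made for the demo.

```python
import os
import tempfile

folder = tempfile.mkdtemp()  # hypothetical scratch directory, empty at first

def try_open(name, mode):
    """Return True if open() succeeds for a file that does not exist yet."""
    try:
        with open(os.path.join(folder, name), mode=mode, encoding="utf-8"):
            return True
    except FileNotFoundError:
        return False

results = {mode: try_open(mode.replace("+", "plus") + ".txt", mode)
           for mode in ("r+t", "w+t", "a+t")}
print(results)
```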

VIII. Moving the File Pointer

# f.seek(number of bytes, whence)
# whence has three possible values:
# 0: relative to the start of the file
# 1: relative to the current position
# 2: relative to the end of the file

# Notes:
# 1. Whatever the mode, seek moves in units of bytes; only read(n) in t mode counts n in characters
# (the outputs noted below assume a.txt starts with hello美)
with open('a.txt',mode='rt',encoding='utf-8') as f:
    data = f.read(6)
    print(data)  # 6 characters: hello美

with open('a.txt',mode='rb') as f:
    data = f.read(6)
    print(data)  # 6 bytes: b'hello\xe7'

with open('a.txt',mode='rb') as f:
    data = f.read(5)
    print(data.decode('utf-8'))  # 5 bytes that decode cleanly: hello
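The character/byte distinction matters because a byte count can end in the middle of a multi-byte character. A minimal sketch, assuming a scratch file whose content is `hello美丽`:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "a.txt")  # hypothetical scratch file
with open(path, mode="wt", encoding="utf-8") as f:
    f.write("hello美丽")

with open(path, mode="rt", encoding="utf-8") as f:
    six_chars = f.read(6)    # 6 characters

with open(path, mode="rb") as f:
    six_bytes = f.read(6)    # 6 bytes

try:
    six_bytes.decode("utf-8")
    split_ok = True
except UnicodeDecodeError:
    split_ok = False         # the 6th byte is only the first third of 美

print(six_chars, six_bytes, split_ok)
```

Reading 5 bytes instead of 6 stops on a character boundary, which is why the `read(5)` example decodes without error.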

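# 2. In t mode only whence 0 can be used with a non-zero offset; in b mode 0, 1, and 2 all work

# Examples
with open('a.txt',mode='rt',encoding='utf-8') as f:  # in t mode, whence 1 or 2 with a non-zero offset raises an error
    f.seek(3,2)  # io.UnsupportedOperation
with open('a.txt',mode='rb') as f:  # b mode does not accept an encoding argument
    f.seek(0,0)

with open('a.txt',mode='rb')as f:
    f.seek(5,0)   # 5 bytes from the start
    print(f.read().decode('utf-8'))
with open('a.txt',mode='rb') as f:
    f.seek(8,0)   # 8 bytes from the start; must land on a character boundary for decode to work
    print(f.read().decode('utf-8'))
    f.seek(19,1)  # 19 bytes forward from the current position (now the end of the file)
    print(f.tell())

    f.seek(-3,2)  # 3 bytes back from the end of the file
    print(f.read().decode('utf-8'))

    f.seek(0,2)   # jump to the end of the file
    print(f.tell())  # so tell() returns the file size in bytes



with open('b.txt',mode='wt',encoding='utf-8') as f:
    f.seek(10,0)
    print(f.tell())  # 10
    f.write("你好")  # the 10 skipped bytes are filled with null bytes
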

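# Application: tail -f access.log
import time
with open('access.log',mode='rb') as f:
    f.seek(0,2)  # start at the end of the file, so only new lines are shown
    while True:
        line=f.readline()
        if len(line) == 0:
            time.sleep(0.3)  # nothing new yet; wait briefly before polling again
        else:
            print(line.decode('utf-8'),end='')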
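The polling loop above runs forever, which makes it awkward to try out. As a sketch of the same idea in a testable form, the hypothetical helper `read_new_lines` below returns whatever complete lines have been appended since the last call; the log path and entries are made up.

```python
import os
import tempfile

def read_new_lines(f):
    """Return the complete lines appended since the last call (hypothetical helper)."""
    lines = []
    while True:
        pos = f.tell()
        line = f.readline()
        if not line:
            break
        if not line.endswith(b"\n"):
            f.seek(pos)   # partial line: rewind and wait for the rest
            break
        lines.append(line.decode("utf-8"))
    return lines

path = os.path.join(tempfile.mkdtemp(), "access.log")  # hypothetical log file
with open(path, mode="wb") as w:
    w.write(b"old entry\n")

with open(path, mode="rb") as reader, open(path, mode="ab") as writer:
    reader.seek(0, 2)                 # skip what is already in the log
    before = read_new_lines(reader)   # nothing new yet
    writer.write(b"GET /index\n")
    writer.flush()                    # push the data out so the reader can see it
    after = read_new_lines(reader)

print(before, after)
```

Calling the helper inside a `while True` loop with a short `time.sleep` would give the same behavior as the loop above.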

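IX. Other File Methods

with open('b.txt',mode='rt',encoding='utf-8') as f:
    data=f.readlines()  # read every line into a list of strings
    print(data,end=' ')
print()

with open('b.txt',mode='wt',encoding='utf-8') as f:
    # f.write("1111\n222\n3333\n")

    lines=["aaaa\n",'bbb\n','cccc\n']

    for line in lines:
        f.write(line)

    f.writelines(lines)    # equivalent to the for loop above
    f.writelines("hello")  # a str is iterable too, so this writes h, e, l, l, o one by one
    f.write("hello")

with open('b.txt',mode='wt',encoding='utf-8') as f:
    f.write('hello\n')
    f.write('world\n')
    f.flush()  # push buffered data to the operating system immediately


with open('b.txt',mode='r+t',encoding='utf-8') as f:
    f.truncate(4)  # cut the file down to its first 4 bytes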
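The snippets above do not show what these methods return or leave behind. The following sketch, run against a temporary scratch file (an assumption for the demo), makes the results visible:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "b.txt")  # hypothetical scratch file

with open(path, mode="wt", encoding="utf-8") as f:
    f.writelines(["aaaa\n", "bbb\n", "cccc\n"])   # no newlines are added for you

with open(path, mode="rt", encoding="utf-8") as f:
    first = f.readline()     # one line, including its trailing "\n"
    rest = f.readlines()     # the remaining lines as a list

with open(path, mode="r+t", encoding="utf-8") as f:
    f.truncate(4)            # keep only the first 4 bytes: "aaaa"

with open(path, mode="rt", encoding="utf-8") as f:
    truncated = f.read()

print(repr(first), rest, repr(truncated))
```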

X. Modifying Files

10.1 Modifying a file: the crude way

a.txt

张一蛋     山东    179    49    12344234523
李二蛋     河北    163    57    13913453521
王全蛋     山西    153    62    18651433422

test.py

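with open('a.txt',mode='r+t',encoding='utf-8') as f:
    res = f.read(9)	# read the first 9 characters: 张一蛋     山
    print(res)
    f.seek(9,0)		# move the pointer to byte 9, right after 张一蛋 (3 characters * 3 bytes each)
    f.write('<男妇女主任>')	# one Chinese character is 3 bytes, so this is 3*5+2=17 bytes; they overwrite the next 17 bytes rather than being inserted

Contents of a.txt after the change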

张一蛋<男妇女主任>9    49    12344234523
李二蛋     河北    163    57    13913453521
王全蛋     山西    153    62    18651433422
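The result shows that writing in the middle of a file overwrites the bytes that follow the pointer; nothing is shifted to make room. A smaller sketch of the same effect, using binary mode and a made-up one-line file so the byte positions are explicit:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "a.txt")  # hypothetical scratch file
with open(path, mode="wb") as f:
    f.write("张一蛋 abcdefg".encode("utf-8"))

with open(path, mode="r+b") as f:
    f.seek(9, 0)                       # byte 9: right after the 3 Chinese characters
    f.write("<男>".encode("utf-8"))    # 5 bytes, replacing the 5 bytes " abcd"

with open(path, mode="rb") as f:
    result = f.read().decode("utf-8")

print(result)
```

If the overwritten span ended in the middle of a multi-byte character, the file would no longer decode as UTF-8, which is one reason this approach is considered crude.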

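10.2 Two ways to modify a file

**Approach 1:** this is how text editors do it

# Idea: read the whole file into memory in one go, make the changes in memory, then overwrite the original file
#   Pros: only one copy of the data exists on disk during the modification
#   Cons: uses a lot of memory for large files

# Read the file

with open('c.txt',mode='rt',encoding='utf-8') as f:
    res=f.read()
    data=res.replace('alex','dsb')
    print(data)
# Write the file

with open('c.txt',mode='wt',encoding='utf-8') as f1:
    f1.write(data)  # write the modified content back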

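**Approach 2:** uses the os module

# Idea: open the original file for reading and a temporary file for writing, read the original line by line,
# write each modified line to the temporary file, then delete the original and rename the temporary file to the original name
#   Pros: uses little memory
#   Cons: two copies of the data exist on disk during the modification

# Contents of c.txt:
life is beautiful
dream is beautiful
learn is beautiful

# Script
import os     # import the os module

with open('c.txt', mode='rt', encoding='utf-8') as f, \
        open('.c.txt.swap', mode='wt', encoding='utf-8') as f1:
    for line in f:
        f1.write(line.replace('alex', 'dsb'))  # replace alex with dsb

os.remove('c.txt') # delete the original file
os.rename('.c.txt.swap', 'c.txt')  # rename the temporary file to the original name

# Contents of c.txt after running, for an input of: alex is sb / sb is alex / egon is hahahahah
# (the sample content shown above contains no alex, so it would be left unchanged)
# dsb is sb
# sb is dsb
# egon is hahahahah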
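The second approach can be wrapped in a reusable function. The sketch below is one possible version, not the article's own code: `replace_in_file` is a hypothetical helper, and it swaps in `os.replace` for the remove-then-rename pair so the original is replaced in a single step.

```python
import os
import tempfile

def replace_in_file(path, old, new):
    """Rewrite path line by line, replacing old with new (hypothetical helper)."""
    swap_path = path + ".swap"   # temporary file next to the original
    with open(path, mode="rt", encoding="utf-8") as src, \
            open(swap_path, mode="wt", encoding="utf-8") as dst:
        for line in src:
            dst.write(line.replace(old, new))
    os.replace(swap_path, path)  # swap the temporary file in for the original in one step

path = os.path.join(tempfile.mkdtemp(), "c.txt")  # hypothetical scratch file
with open(path, mode="wt", encoding="utf-8") as f:
    f.write("life is beautiful\ndream is beautiful\n")

replace_in_file(path, "beautiful", "short")

with open(path, mode="rt", encoding="utf-8") as f:
    result = f.read()
print(result)
```

Only one line is held in memory at a time, at the cost of a second copy of the file on disk until the swap happens.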

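Exercises

# 1. Write a file copy tool
# 2. Write a login program that reads usernames and passwords from a file
# 3. Write a registration program that stores usernames and passwords in a file
# 4. Write a user login interface
# 5. Implement a general-purpose file copy tool
# 6. Use seek to move the file pointer and test what is read and written in r+, w+, and a+ modes
# 7. Implement tail -f access.log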

Answers

# 1. Write a file copy tool
src_file = input('Source file path: ').strip()
dst_file = input('Destination file path: ').strip()
with open(src_file,mode='rt',encoding='utf-8') as f1,\
    open(dst_file,mode='wt',encoding='utf-8') as f2:
    res = f1.read()
    f2.write(res)
# 2. Write a login program that reads usernames and passwords from a file
print("======== Welcome to the login page ========")
inp_name = input("Username: ").strip()
inp_pwd = input("Password: ").strip()
with open(r'userinfo.txt', 'rt', encoding='UTF-8') as f:
    for line in f:
        user, pwd = line.strip().split(':')
        if user == inp_name and pwd == inp_pwd:
            print('Login successful!')
            break
    else:
        print('Wrong username or password!')
print("===========================================")
# Contents of userinfo.txt:
# 	qwe:123
# 	asd:456
# 	zxc:789

# 3. Write a registration program that stores usernames and passwords in a file
print("======== Welcome to the registration page ========")
inp_user = input("Username: ").strip()
inp_pwd = input("Password: ").strip()
with open(r'userinfo1.txt', 'at', encoding='utf-8') as f:
    f.write('{}:{}\n'.format(inp_user,inp_pwd))
print("============ Registration successful ============")

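# 4. Write a user login interface
'''
1. Ask for a username and password; print "Login successful" once they are verified
2. Different users can log in
3. Lock an account after three wrong attempts (hint: store locked users in a file, so that the user stays locked after the program exits)'''
import os
os.makedirs("locked", exist_ok=True)  # directory holding one marker file per locked user
list2=[]  # one entry per failed attempt, used to count failures per username
while True:
    print('''
    1. Register
    2. Log in
    3. Quit
    ''')
    tag = True
    cmd = input("Enter a command> ")
    list1 = ["1","2","3"]
    if cmd == list1[0]:
        username = input("Username to register: ")
        password = input("Password to register: ")
        with open("a.txt","a",encoding="utf-8") as f :
            f.write("{}:{}\n".format(username,password))
    elif cmd == list1[1]:
        while tag:
            user_inp = input("Username: ")
            if os.path.exists(f"locked/{user_inp}"):
                print("This account is locked")
                tag =False
                continue
            pwd_inp = input("Password: ")
            with open("a.txt","r",encoding="utf-8") as f:
                for line in f:
                    username,password = line.strip().split(":")
                    if username == user_inp and password == pwd_inp:
                        print("Login successful")
                        tag = False  # leave the login loop
                        break
                else:  # no line matched, so this attempt failed
                    if list2.count(user_inp) == 2:  # this is the third failure
                        with open(f"locked/{user_inp}","w",encoding="utf-8"):
                            print("Too many wrong attempts; the account is now locked")
                    else:
                        print("Wrong username or password")
                        list2.append(user_inp)
                        print(list2.count(user_inp))
    elif cmd == list1[2]:
        print("Goodbye")
        break
    else:
        print("Invalid command")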

        
# 5. Implement a general-purpose file copy tool
src_file = input('Source file path: ').strip()
dst_file = input('Destination file path: ').strip()
with open(src_file,mode='rb') as f1,\
    open(dst_file,mode='wb') as f2:  # b mode copies any type of file byte for byte
        res = f1.read()
        f2.write(res)
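Reading the whole source file at once can use a lot of memory for large files. One possible variation, offered as a sketch rather than the article's solution, copies in fixed-size chunks; `copy_file`, the 64 KiB chunk size, and the sample data are all assumptions.

```python
import os
import tempfile

def copy_file(src_path, dst_path, chunk_size=64 * 1024):
    """Copy src_path to dst_path in fixed-size chunks (hypothetical helper)."""
    with open(src_path, mode="rb") as src, open(dst_path, mode="wb") as dst:
        while True:
            chunk = src.read(chunk_size)   # at most chunk_size bytes per read
            if not chunk:                  # b"" means end of file
                break
            dst.write(chunk)

folder = tempfile.mkdtemp()                # hypothetical scratch directory
src = os.path.join(folder, "src.bin")
dst = os.path.join(folder, "dst.bin")
payload = bytes(range(256)) * 1000         # 256,000 bytes of sample data
with open(src, mode="wb") as f:
    f.write(payload)

copy_file(src, dst)

with open(dst, mode="rb") as f:
    copied = f.read()
print(len(copied), copied == payload)
```

For everyday use, the standard library's `shutil.copyfile` already provides this kind of copy.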
        
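# 6. Use seek to move the file pointer and test what is read and written in r+, w+, and a+ modes
# r+ (b.txt contains 123456789 beforehand)
with open(r'b.txt',mode='r+t',encoding='utf-8') as f:
    print(f.read())     # prints the content of b.txt, 123456789; the pointer is now at the end
    f.seek(5,0)     # move the pointer to byte 5, just after the 5th character
    print(f.tell())     # prints the current pointer position: 5
    f.write('qwe')     # writes qwe over characters 6-8
    f.seek(0, 0)        # move the pointer back to the start
    print(f.read())     # prints the content again: 12345qwe9
# w+
with open(r'b.txt',mode='w+t',encoding='utf-8') as f:
    print(f.read())     # w+ empties the file on open, so this prints nothing; the pointer is at the start
    f.seek(10,0)     # move the pointer to byte 10, past the end of the empty file
    print(f.tell())     # prints the current pointer position: 10
    f.write('qwe')     # writes qwe at bytes 10-12; the 10 skipped bytes become null bytes
    f.seek(0, 0)        # move the pointer back to the start
    print(f.read())     # prints 10 null characters (often displayed as blanks) followed by qwe
# a+ (assumes b.txt has been reset to 123456789)
with open(r'b.txt',mode='a+t',encoding='utf-8') as f:
    print(f.read())     # a+ does not empty the file, but the pointer starts at the end, so this prints nothing
    f.seek(5,0)     # move the pointer to byte 5
    print(f.tell())     # prints the current pointer position: 5
    f.write('qwe')     # in a mode every write goes to the end of the file, wherever the pointer is
    f.seek(0, 0)        # move the pointer back to the start
    print(f.read())     # prints the content again: 123456789qwe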
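Because each block above changes b.txt, the three experiments are easier to compare when the file is reset before each one. The sketch below does that on a temporary scratch file; `reset` and `seek_write_read` are hypothetical helpers, and the null bytes in the w+ result assume the usual behavior of the file system when writing past the end of a file.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "b.txt")  # hypothetical scratch file

def reset():
    with open(path, mode="wt", encoding="utf-8") as f:
        f.write("123456789")

def seek_write_read(mode):
    """Open in the given mode, seek to byte 5, write 'qwe', return the final content."""
    with open(path, mode=mode, encoding="utf-8") as f:
        f.seek(5, 0)
        f.write("qwe")
        f.seek(0, 0)
        return f.read()

results = {}
for mode in ("r+t", "w+t", "a+t"):
    reset()
    results[mode] = seek_write_read(mode)
print(results)
```

r+ overwrites in place, w+ starts from an empty file, and a+ ignores the seek when writing and appends at the end.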
    
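# 7. Implement tail -f access.log
while 1:
    print('Type any text to add it to the log; type readlog to print the log')
    msg = input('Enter a command: ').strip()
    if msg == 'readlog':
        with open(r'access.log', 'a+b') as f:  # a+ creates the log if it does not exist yet
            f.write(bytes('{}\n'.format(msg), encoding='utf-8'))  # the readlog command itself is logged too
            f.seek(0, 0)  # a+ starts at the end, so move to the start before reading
            log = f.read().decode('utf-8')
            print(log)
        continue
    else:
        with open(r'access.log', 'ab') as f:
            f.write(bytes('{}\n'.format(msg), encoding='utf-8'))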