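10.1 Reading Data from a File
Before a program can work with data, it has to load that data into memory. You can read a file all at once or one line at a time.
10.1.1 Reading an Entire File
Create a file named digit.txt in the same folder as test.py, with the following contents: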
3.14159
895
125
# test.py
'''Read the contents of a txt file and print them.'''
with open('digit.txt') as file_object:
    contents = file_object.read()
print(contents)
'''
Output:
3.14159
895
125
'''
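- The with keyword closes the file automatically when the with block ends, even if an error occurs inside it. That is why the code calls open() but never close().
- Here open() takes one argument, the name of the file to open. The file is expected in the same directory; if it is somewhere else, you need to supply a path. The function returns an object representing the file.
- The read() method reads the entire contents of the file object and returns them as one long string, which is assigned to contents.
The textbook says the blank line at the end of the output comes from the empty string that read() returns at the end of the file. In my own tests it looks more like the effect of print()'s default end='\n': the extra blank line appears only when the file itself ends with a newline, because print() then adds a second one. Calling print(contents, end='') reproduces the original file exactly.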
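The effect of the trailing newline can be checked with a small experiment. This is only a sketch: the file name digit_demo.txt is made up for the demo, and the file is written to a temporary folder so nothing real is touched.

```python
import tempfile
from pathlib import Path

# Hypothetical demo file, created in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'digit_demo.txt'
    path.write_text('3.14159\n895\n125\n', encoding='utf-8')  # note the final '\n'

    with open(path, encoding='utf-8') as file_object:
        contents = file_object.read()

print(repr(contents))      # shows that the final '\n' is part of the string
print(contents, end='')    # prints the text exactly as stored
print(contents.rstrip())   # alternative: strip trailing whitespace before printing
```

With a plain print(contents) this file would produce one extra blank line, because the string already ends with '\n'.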
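10.1.2 File Paths
A file can be located by a relative path or an absolute path.
- Relative path: the file's location relative to the current working directory, which is usually the program's folder when you run the script from there.
- Absolute path: the file's exact location on the computer.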
file_path1 = r'D:\pythonProject1\Notes\digit.txt'  # absolute path
file_path2 = 'digit.txt'  # relative path: a file in the current directory
file_path3 = r'text_file\digit.txt'  # relative path: a file in a subfolder of the current directory
with open(file_path1) as file_object:  # file_path2 or file_path3 works the same way, provided a copy of the file exists there
    contents = file_object.read()
print(contents)
'''
Output:
3.14159
895
125
'''
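As a possible alternative to hand-written path strings, the standard pathlib module can build paths and show how a relative path relates to the current working directory. A minimal sketch, assuming the same text_file folder name used in these notes:

```python
from pathlib import Path

relative_path = Path('text_file') / 'digit.txt'   # relative path, as in the notes
absolute_path = Path.cwd() / relative_path        # anchored at the current working directory

print(Path.cwd())                     # the directory relative paths are resolved against
print(relative_path.is_absolute())    # False
print(absolute_path.is_absolute())    # True
```

A Path object can be passed to open() directly, and the / operator avoids worrying about backslashes on Windows.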
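10.1.3 Reading Line by Line
file_path = r'text_file\digit.txt'
with open(file_path) as file_object:
    lines = file_object.readlines()
    for line in lines:
        print(line)
'''The textbook's original code is shown below. I find the version above clearer, but looping over the file object directly is also standard Python and avoids loading every line into memory at once.
    for line in file_object:
        print(line)
'''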
'''
Output:
3.14159

895

125
'''
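Each line in the text file ends with a newline character and print() adds another one, so the lines are printed with a blank line between them.
Improvement 1: use the rstrip() method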
file_path = r'text_file\digit.txt'
with open(file_path) as file_object:
    for line in file_object:
        print(line.rstrip())
'''
Output:
3.14159
895
125
'''
Improvement 2: use print(line, end=''), which makes the output match the original text exactly
file_path = r'text_file\digit.txt'
with open(file_path) as file_object:
    for line in file_object:
        print(line, end='')
'''
Output:
3.14159
895
125
'''
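To see what each line actually contains, it can help to look at the raw strings. The sketch below uses io.StringIO as a stand-in for an opened text file (an assumption made only so the demo needs no file on disk):

```python
import io

# io.StringIO behaves like an open text file for iteration purposes.
fake_file = io.StringIO('3.14159\n895\n125\n')

plain, stripped = [], []
for line in fake_file:
    plain.append(line)               # keeps the trailing '\n'
    stripped.append(line.rstrip())   # trailing whitespace removed

print(plain)      # ['3.14159\n', '895\n', '125\n']
print(stripped)   # ['3.14159', '895', '125']
```

Note that rstrip() removes all trailing whitespace, not just the newline, whereas print(line, end='') leaves the line untouched.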
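10.1.4 Making a List of Lines from a File
When you use with, the file object returned by open() is usable only inside the with block, because the file is closed afterwards. To work with the file's contents outside the block, store the lines in a list.
file_path = r'text_file\digit.txt'
with open(file_path) as file_object:
    lines = file_object.readlines()
print(lines)
for line in lines:
    print(line.rstrip())
'''
Output: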
['3.14159\n', ' 895\n', ' 125']
3.14159
895
125
'''
The readlines() method reads every line from the file object and stores the lines in a list.
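If the newline characters are not wanted in the list, one option is read() followed by splitlines(). A short sketch, again with io.StringIO standing in for the opened digit.txt:

```python
import io

# Stand-in for the opened file; the leading spaces mirror the list shown above.
fake_file = io.StringIO('3.14159\n 895\n 125')

lines = fake_file.read().splitlines()   # split on line boundaries, dropping the '\n'
print(lines)   # ['3.14159', ' 895', ' 125']
```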
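10.1.5 Working with a File's Contents
Join the data that was read into a single string and print the length of that string.
file_path = r'text_file\digit.txt'
with open(file_path) as file_object:  # read the file line by line
    lines = file_object.readlines()
print(lines)  # print the data that was read (as a list)
pi_string = ''
for line in lines:
    pi_string += line.strip()  # strip whitespace from both ends of each element, then concatenate
print(pi_string)
print(len(pi_string))
'''
Output: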
['3.14159\n', ' 895\n', ' 125']
3.14159895125
13
'''
Python reads all text from a file as strings. To use a value as a number, convert it with int() or float().
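A small sketch of that conversion, reusing the list printed above. It also uses ''.join(), which is a common alternative to building the string with += in a loop:

```python
lines = ['3.14159\n', ' 895\n', ' 125']   # the list shown in the output above

pi_string = ''.join(line.strip() for line in lines)
pi_value = float(pi_string)               # convert the text to a number

print(pi_string)         # still a string
print(type(pi_value))    # <class 'float'>
print(pi_value * 2)      # arithmetic now works
```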
10.1.6 Large Files: One Million Digits
pi_mil.txt
3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679
8214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196
4428810975665933446128475648233786783165271201909145648566923460348610454326648213393607260249141273
7245870066063155881748815209209628292540917153643678925903600113305305488204665213841469519415116094
3305727036575959195309218611738193261179310511854807446237996274956735188575272489122793818301194912
9833673362440656643086021394946395224737190702179860943702770539217176293176752384674818467669405132
0005681271452635608277857713427577896091736371787214684409012249534301465495853710507922796892589235
4201995611212902196086403441815981362977477130996051870721134999999837297804995105973173281609631859
5024459455346908302642522308253344685035261931188171010003137838752886587533208381420617177669147303
5982534904287554687311595628638823537875937519577818577805321712268066130019278766111959092164201989
Suppose we have a very large text file, such as the digits of pi, and we want to print only the first 50 decimal places.
file_path = r'text_file\pi_mil.txt'
with open(file_path) as file_object:
    lines = file_object.readlines()
print(lines)
pi_string = ''
for line in lines:
    pi_string += line.strip()
print(f"{pi_string[:52]}...")  # '3.' plus 50 decimal places
print(len(pi_string))
'''
Output:
['3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679\n', '8214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196\n', '4428810975665933446128475648233786783165271201909145648566923460348610454326648213393607260249141273\n', '7245870066063155881748815209209628292540917153643678925903600113305305488204665213841469519415116094\n', '3305727036575959195309218611738193261179310511854807446237996274956735188575272489122793818301194912\n', '9833673362440656643086021394946395224737190702179860943702770539217176293176752384674818467669405132\n', '0005681271452635608277857713427577896091736371787214684409012249534301465495853710507922796892589235\n', '4201995611212902196086403441815981362977477130996051870721134999999837297804995105973173281609631859\n', '5024459455346908302642522308253344685035261931188171010003137838752886587533208381420617177669147303\n', '5982534904287554687311595628638823537875937519577818577805321712268066130019278766111959092164201989']
3.14159265358979323846264338327950288419716939937510...
1002
'''
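When only the beginning of a very large file is needed, read() also accepts a maximum number of characters, so the whole file does not have to be loaded. The sketch below uses io.StringIO with the first two lines of pi_mil.txt as a stand-in for the real file. It relies on the first line being longer than 52 characters; otherwise the result would include a newline.

```python
import io

first_line = '3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679'
second_line = '8214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196'
fake_file = io.StringIO(first_line + '\n' + second_line + '\n')   # stand-in for the opened file

head = fake_file.read(52)   # read at most 52 characters: '3.' plus 50 decimal places
print(f"{head}...")
print(len(head))            # 52
```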
10.1.7 Is Your Birthday Contained in Pi?
This txt file holds only 1,002 characters ("3." plus 1,000 decimal places); you can try a file with 10,000 digits.
file_path = r'text_file\pi_mil.txt'
with open(file_path) as file_object:
    lines = file_object.readlines()
pi_string = ''
for line in lines:
    pi_string += line.strip()
birthday = input("Enter your birthday(yymmdd): ")
if birthday in pi_string:
    print("in")
else:
    print("not in")
'''
Output:
Enter your birthday(yymmdd): 991122
not in
'''
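The in operator only answers yes or no. If the position is also of interest, str.find() can report it. A minimal sketch; the helper name find_in_digits and the short digit string are made up for the example:

```python
def find_in_digits(digits, pattern):
    """Return the 0-based index where pattern first appears in digits, or None."""
    index = digits.find(pattern)   # find() returns -1 when there is no match
    return index if index != -1 else None

pi_string = '3.14159265358979323846'   # shortened sample, not the full file

print(find_in_digits(pi_string, '5926'))     # 5
print(find_in_digits(pi_string, '991122'))   # None
```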
10.2 Writing to a File
The simplest way to save data permanently is to write it to a file.
10.2.1 Writing to an Empty File
file_path = r'text_file\programming.txt'
with open(file_path, 'w') as file_object:
    file_object.write("I love programming.")
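- The call to open() takes two arguments. The first is the file to open; the second, 'w', opens it in write mode. The second argument can also be 'r' (read mode, the default when omitted), 'a' (append mode), or 'r+' (read and write).
- If the file being written to does not exist, open() creates it automatically (the text_file folder itself must already exist).
- In 'w' mode, if the file already contains data, that data is erased before anything is written.
- The write() method writes a string to the file.
- Running the program creates a text file at the path 'text_file\programming.txt' whose contents are "I love programming."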
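The overwrite behaviour of 'w' mode can be verified with a short experiment. This sketch writes to a temporary folder instead of text_file so that no real file is changed:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    file_path = Path(tmp) / 'programming.txt'

    with open(file_path, 'w', encoding='utf-8') as file_object:
        file_object.write("first version")          # creates the file

    with open(file_path, 'w', encoding='utf-8') as file_object:
        file_object.write("I love programming.")    # old contents are erased first

    with open(file_path, encoding='utf-8') as file_object:
        contents = file_object.read()

print(contents)   # I love programming.
```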
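10.2.2 Writing Multiple Lines
The write() method does not add a newline at the end of the text, so when writing several lines you must add the newline characters yourself.
file_path = r'text_file\programming.txt'
with open(file_path, 'w') as file_object:
    file_object.write("I love programming.\n")
    file_object.write("I love China too!\n")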
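When the lines are already in a list, a loop can add the newlines. The sketch below writes into io.StringIO, used here as a stand-in for a file opened in 'w' mode, and also shows that write() accepts only strings, so numbers need str() first:

```python
import io

lines = ["I love programming.", "I love China too!"]

buffer = io.StringIO()               # stand-in for a file opened with 'w'
for line in lines:
    buffer.write(line + '\n')        # add the newline explicitly
buffer.write(str(2024) + '\n')       # a number must be converted to a string

written = buffer.getvalue()
print(written, end='')
```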
10.2.3 Appending to a File
To add content to a file instead of overwriting what is already there, open it in 'a' mode.
Create another .py file that appends the line "I love python." to programming.txt.
file_path = r'text_file\programming.txt'
with open(file_path, 'a') as file_object:
    file_object.write("I love python.\n")
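A quick way to confirm that 'a' mode keeps the existing lines is to write, append, and read back. This sketch again uses a temporary folder rather than the real text_file directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    file_path = Path(tmp) / 'programming.txt'

    with open(file_path, 'w', encoding='utf-8') as file_object:
        file_object.write("I love programming.\n")

    with open(file_path, 'a', encoding='utf-8') as file_object:
        file_object.write("I love python.\n")       # added after the existing text

    with open(file_path, encoding='utf-8') as file_object:
        lines = file_object.read().splitlines()

print(lines)   # ['I love programming.', 'I love python.']
```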
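10.3 Exceptions
When an error occurs while a program is running, Python displays a traceback, which makes the program feel unfriendly. A try-except block lets you handle the exception instead.
10.3.1 Handling the ZeroDivisionError Exception
print(5/0)
'''
Output: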
Traceback (most recent call last):
File "D:\pythonProject1\Notes\test2.py", line 1, in <module>
print(5/0)
ZeroDivisionError: division by zero
'''
10.3.2 Using try-except Blocks
Handle the exception raised by the code above, so that the user sees a friendly message instead of a traceback.
try:
    print(5/0)
except ZeroDivisionError:
    print("You can't divide by zero.")
'''
You can't divide by zero.
'''
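If the friendly message should also include Python's own description of the problem, the exception object can be captured with as. A small sketch; the variable names error, result and message are just choices for this example:

```python
try:
    result = 5 / 0
except ZeroDivisionError as error:
    result = None                                        # no usable answer
    message = f"You can't divide by zero ({error})."     # str(error) is Python's own description
    print(message)
```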
10.3.3 Using Exceptions to Prevent Crashes
Handling errors properly matters most when the program still has work to do after the error occurs. Take a program that divides two numbers entered by the user as an example.
print("Give me two numbers, and I will divide them.")
print("Enter 'q' to quit.")
while True:
    first_number = input("\nFirst number: ")
    if first_number == 'q':
        break
    second_number = input("Second number: ")
    if second_number == 'q':
        break
    answer = int(first_number) / int(second_number)
    print(answer)
'''
Output:
Give me two numbers, and I will divide them.
Enter 'q' to quit.
First number: 5
Second number: 2
2.5
First number: 5
Second number: 0
Traceback (most recent call last):
File "D:\pythonProject1\Notes\test2.py", line 11, in <module>
answer = int(first_number) / int(second_number)
ZeroDivisionError: division by zero
'''
10.3.4 The else Block
Use a try-except-else block to fix the problem in the previous section's program.
print("Give me two numbers, and I will divide them.")
print("Enter 'q' to quit.")
while True:
    first_number = input("\nFirst number: ")
    if first_number == 'q':
        break
    second_number = input("Second number: ")
    if second_number == 'q':
        break
    try:
        answer = int(first_number) / int(second_number)
    except ZeroDivisionError:  # catch only the error this message describes
        print("You can't divide by 0!")
    else:
        print(answer)
'''
Output:
Give me two numbers, and I will divide them.
Enter 'q' to quit.
First number: 5
Second number: 0
You can't divide by 0!
First number: 5
Second number: 2
2.5
First number: q
'''
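The program above still crashes if the user types something that is not a number, because int() raises ValueError. One possible extension is a second except clause. The sketch below wraps the logic in a hypothetical helper named divide() so it can be tried without typing input:

```python
def divide(first, second):
    """Return first / second, or a message string if the input is unusable (illustrative helper)."""
    try:
        answer = int(first) / int(second)
    except ZeroDivisionError:
        return "You can't divide by 0!"
    except ValueError:
        # int() raises ValueError for text such as 'abc'
        return "Please enter whole numbers only."
    else:
        return answer

print(divide('5', '2'))     # 2.5
print(divide('5', '0'))     # You can't divide by 0!
print(divide('5', 'abc'))   # Please enter whole numbers only.
```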
10.3.5 Handling the FileNotFoundError Exception
A common problem when working with files is a missing file. In the example below, the code raises an error because 'alice.txt' does not exist.
filename = 'alice.txt'
with open(filename, encoding='utf-8') as f:
    contents = f.read()
Handle it with a try-except-else block:
filename = 'alice.txt'
try:
    with open(filename, encoding='utf-8') as f:
        contents = f.read()
except FileNotFoundError:
    print(f"Sorry, the file {filename} does not exist.")
else:
    # Count how many words the file contains
    words = contents.split()
    num_words = len(words)
    print(f"The file {filename} has about {num_words} words.")
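The word count relies on split(), which with no arguments breaks a string at any run of whitespace, including newlines. A tiny sketch on a sample string rather than a whole book:

```python
contents = "Alice was beginning to get very tired\nof sitting by her sister"

words = contents.split()   # splits on spaces and newlines alike
print(words[:3])           # ['Alice', 'was', 'beginning']
print(len(words))          # 12
```

This is why the program reports "about" so many words: punctuation and hyphenated words are counted as they appear between spaces.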
10.3.6 Analyzing Text
You can analyze the text file of an entire book; see the else block in the previous section.
10.3.7 Working with Multiple Files
Move the main work of the code in section 10.3.5 into a function named count_words() and use it to analyze several books.
def count_words(filename):
    try:
        with open(filename, encoding='utf-8') as f:
            contents = f.read()
    except FileNotFoundError:
        print(f"Sorry, the file {filename} does not exist.")
    else:
        words = contents.split()
        num_words = len(words)
        print(f"The file {filename} has about {num_words} words.")
filenames = ['alice.txt', 'siddhartha.txt', 'moby_dick.txt', 'little_women.txt']
for filename in filenames:
    count_words(filename)
10.3.8 Failing Silently
You do not need to tell the user about every error. Sometimes you want the program to stay quiet when an error occurs and carry on as if nothing happened; the pass statement does this.
The code from section 10.3.7 can be changed to:
def count_words(filename):
    try:
        with open(filename, encoding='utf-8') as f:
            contents = f.read()
    except FileNotFoundError:
        pass
    else:
        words = contents.split()
        num_words = len(words)
        print(f"The file {filename} has about {num_words} words.")
filenames = ['alice.txt', 'siddhartha.txt', 'moby_dick.txt', 'little_women.txt']
for filename in filenames:
    count_words(filename)
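A variation worth considering is to have the function return the count, or None for a missing file, and leave the decision about reporting to the caller. This is a sketch rather than the textbook's version; the sample file names are invented and live in a temporary folder:

```python
import tempfile
from pathlib import Path

def count_words(filename):
    """Return the word count of filename, or None if the file is missing."""
    try:
        with open(filename, encoding='utf-8') as f:
            contents = f.read()
    except FileNotFoundError:
        return None               # stay silent; the caller decides what to do
    return len(contents.split())

with tempfile.TemporaryDirectory() as tmp:
    existing = Path(tmp) / 'alice_demo.txt'       # hypothetical sample file
    existing.write_text('down the rabbit hole', encoding='utf-8')
    missing = Path(tmp) / 'no_such_file.txt'      # never created
    counts = {path.name: count_words(path) for path in (existing, missing)}

print(counts)   # {'alice_demo.txt': 4, 'no_such_file.txt': None}
```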
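10.3.9 Deciding Which Errors to Report
This is a judgment call that comes with experience.
10.4 Storing Data
- Python has a built-in module named json.
- JSON data is commonly stored in files with the .json extension.
10.4.1 Using json.dump() and json.load()
- The json.dump() function stores data. It takes two arguments: the data to store and the file object to write it to.
- The json.load() function reads data back. It takes one argument: the file object to read from.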
# Using json.dump()
import json
numbers = [2, 3, 5, 7, 11, 13]
filename = 'numbers.json'
with open(filename, 'w') as f:
    json.dump(numbers, f)
# Using json.load()
import json
filename = 'numbers.json'
with open(filename) as f:
    numbers = json.load(f)
print(numbers)
'''
Output:
[2, 3, 5, 7, 11, 13]
'''
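The json module also has string-based counterparts, json.dumps() and json.loads(), which are handy for seeing what the stored text looks like. A short sketch with a made-up dictionary:

```python
import json

user = {'name': 'zhang xy', 'numbers': [2, 3, 5]}   # sample data for the demo

text = json.dumps(user)       # Python object -> JSON text
restored = json.loads(text)   # JSON text -> Python object

print(text)               # {"name": "zhang xy", "numbers": [2, 3, 5]}
print(restored == user)   # True
```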
10.4.2 Saving and Reading User-Generated Data
Storing data with json keeps it from being lost when the program stops running.
The first time the program runs, numbers.json does not exist, so the except block runs. On the second run the file exists, so the try block reads the data from it and then the else block runs.
import json
filename = 'numbers.json'
try:
    with open(filename) as f:
        username = json.load(f)
except FileNotFoundError:
    username = input("What's your name? ")
    with open(filename, 'w') as f:
        json.dump(username, f)
    print(f"We will remember when you come back, {username}.")
else:
    print(f"Welcome back, {username}.")
'''
Output of the first run:
What's your name? zhang xy
We will remember when you come back, zhang xy.
'''
'''
Output of the second run:
Welcome back, zhang xy.
'''
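A missing file is not the only thing that can go wrong: if the file exists but is empty or damaged, json.load() raises json.JSONDecodeError. One possible way to treat that case like "no stored user" is sketched below; the helper name load_username is invented, and io.StringIO stands in for the opened file:

```python
import io
import json

def load_username(file_object):
    """Return the stored name, or None if the file does not contain valid JSON."""
    try:
        return json.load(file_object)
    except json.JSONDecodeError:
        return None

good = load_username(io.StringIO('"zhang xy"'))   # valid JSON string
bad = load_username(io.StringIO(''))              # empty file: not valid JSON

print(good)   # zhang xy
print(bad)    # None
```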
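10.4.3 Refactoring
Sometimes code runs correctly, but you can still improve it by breaking it up into a series of functions that each do a specific job. This process is called refactoring.
Refactoring makes code clearer, easier to understand, and easier to extend.
Here we refactor the code from section 10.4.2. Its main job is to greet the user: if no user is stored, it records the user and says it will remember them next time; if a user is stored, it says welcome back.
Put each single task into its own function where possible. Function 1 gets the stored user (returns the user if data exists, otherwise None). Function 2 records a new user and returns it. Function 3 greets the user (welcome back if already stored, otherwise a promise to remember them next time).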
import json
def get_stored_username():
    '''Read the stored name.'''
    filename = 'numbers.json'
    try:
        with open(filename) as f:
            username = json.load(f)
    except FileNotFoundError:
        return None
    else:
        return username
def get_new_username():
    '''Prompt the user for a name and store it.'''
    username = input("What's your name? ")
    filename = 'numbers.json'
    with open(filename, 'w') as f:
        json.dump(username, f)
    return username
def greet_user():
    '''Print a greeting.'''
    username = get_stored_username()
    if username:
        print(f"Welcome back, {username}.")
    else:
        username = get_new_username()
        print(f"We will remember when you come back, {username}.")
greet_user()
'''
Output of the first run:
What's your name? zhang xy
We will remember when you come back, zhang xy.
'''
'''
Output of the second run:
Welcome back, zhang xy.
'''
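The refactoring could be taken one step further by passing the file path in as a parameter instead of repeating it inside each function, which also makes the functions easy to try without input(). This is only a sketch: the names store_username and greeting_for, and the file name username.json, are assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path

def get_stored_username(path):
    """Return the stored name, or None if nothing has been stored yet."""
    try:
        with open(path, encoding='utf-8') as f:
            return json.load(f)
    except FileNotFoundError:
        return None

def store_username(path, username):
    """Save the name to path and return it."""
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(username, f)
    return username

def greeting_for(path, new_name):
    """Build the greeting; new_name is used only when no name is stored."""
    username = get_stored_username(path)
    if username:
        return f"Welcome back, {username}."
    username = store_username(path, new_name)
    return f"We will remember when you come back, {username}."

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'username.json'
    first = greeting_for(path, 'zhang xy')    # nothing stored yet
    second = greeting_for(path, 'ignored')    # the stored name wins

print(first)    # We will remember when you come back, zhang xy.
print(second)   # Welcome back, zhang xy.
```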
10.5 Summary
Omitted.