【py code everyday】2023-02-08

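This post shows a UnicodeDecodeError encountered while reading files in Python and works around it by trying different encodings (GB2312, GBK, ISO-8859-1). It also covers handling the ValueError raised when a user enters non-numeric input, and the strategy of failing silently when a file does not exist. It ends with a word-count analysis of Project Gutenberg texts.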

Working with multiple files

def count_words(filename):
    """Count the approximate number of words in a file."""
    try:
        with open(filename, encoding = 'utf-8') as f:
            contents = f.read()
    except FileNotFoundError:
        print(f"Sorry, the file {filename} does not exist.")
    else:
        words = contents.split()
        num_words = len(words)
        print(f"The file {filename} has about {num_words} words.")
filenames = ['text_files/alice.txt','text_files/siddhartha.txt',
            'text_files/moby_dick.txt','text_files/little_women.txt']
for filename in filenames:
    count_words(filename)

Running it raises an exception:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa1 in position 2415: invalid start byte
[Finished in 47ms]

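Searching online turned up the same kind of error, which means the file being read is not valid UTF-8:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbc in position 2: invalid start byte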

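Changing encoding='utf-8' to GB2312, gbk, or ISO-8859-1 gets past the error; any of the three worked here. Note that a successful decode does not prove the encoding is correct: ISO-8859-1 maps every possible byte to a character, so it never raises, but the text may contain wrong characters if the file was written in a different encoding.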

That solved the problem.

    # Option 1: GB2312
    try:
        with open(filename, encoding = 'GB2312') as f:
            contents = f.read()

    # Option 2: gbk
    try:
        with open(filename, encoding = 'gbk') as f:
            contents = f.read()

    # Option 3: ISO-8859-1
    try:
        with open(filename, encoding = 'ISO-8859-1') as f:
            contents = f.read()

Result:

The file text_files/alice.txt has about 17846 words.
Sorry, the file text_files/siddhartha.txt does not exist.
The file text_files/moby_dick.txt has about 6998 words.
The file text_files/little_women.txt has about 67354 words.
[Finished in 62ms]
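
The trial-and-error above can also be written as an explicit fallback loop. This is only a sketch: the helper name read_text_with_fallback and the default candidate list are assumptions made for illustration, not part of the original code. The order of candidates matters, because ISO-8859-1 accepts any byte sequence and should therefore come last.

```python
def read_text_with_fallback(path, encodings=("utf-8", "gbk", "iso-8859-1")):
    """Return (text, encoding) using the first candidate that decodes the file.

    Hypothetical helper for illustration; the candidate list is an assumption.
    """
    last_error = None
    for encoding in encodings:
        try:
            with open(path, encoding=encoding) as f:
                return f.read(), encoding
        except UnicodeDecodeError as err:
            # This candidate cannot decode the file; remember why and try the next.
            last_error = err
    if last_error is None:
        raise ValueError("no candidate encodings were given")
    raise last_error
```

Returning the encoding that succeeded makes it visible which candidate was actually used, instead of silently accepting whatever happened not to raise.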

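Removing the encoding argument altogether also works. In that case open() falls back to the platform's default encoding, which on this machine evidently can decode the file.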

    try:
        with open(filename) as f:
            contents = f.read()

Running it gives the same result as above.
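
To see which encoding open() uses when no encoding argument is given, the standard locale module can be queried. A small sketch (the function name default_text_encoding is made up for this example); on a Chinese-language Windows installation the result is commonly a GBK-compatible code page such as cp936, though that depends on the system configuration:

```python
import locale

def default_text_encoding():
    """Return the locale's preferred encoding, which open() uses by default."""
    # False means: do not call setlocale(), just report the current preference.
    return locale.getpreferredencoding(False)

print(default_text_encoding())
```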

The handy pass statement: used in an except block, it tells Python to do nothing.

def count_words(filename):
    """Count the approximate number of words in a file."""
    try:
        with open(filename,encoding = 'GB2312') as f:
            contents = f.read()
    except FileNotFoundError:
        # print(f"Sorry, the file {filename} does not exist.")
        pass
    else:
        words = contents.split()
        num_words = len(words)
        print(f"The file {filename} has about {num_words} words.")
filenames = ['text_files/alice.txt','text_files/siddhartha.txt',
            'text_files/moby_dick.txt','text_files/little_women.txt']
for filename in filenames:
    count_words(filename)

Result:

The file text_files/alice.txt has about 17846 words.
The file text_files/moby_dick.txt has about 6998 words.
The file text_files/little_women.txt has about 67354 words.
[Finished in 62ms]

Exercises

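A common problem when prompting for numerical input is that people provide text instead of numbers. When you try to convert the input to an int, you get a ValueError. Write a program that prompts for two numbers, adds them together, and prints the result. Catch the ValueError if either input value is not a number, and print a friendly error message. Test your program by entering two numbers and then by entering some text instead of a number.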

print("Enter 'q' to exit!")
try:
    while True:
        first = input("Please enter the first number: ")
        if first == 'q':
            break
        second = input("Please enter the second number: ")
        if second == 'q':
            break
        answer = int(first)+int(second)
        print(f"Your answer is {answer}.")
except ValueError:
    print("Please enter a valid number")

Result:

Enter 'q' to exit!
Please enter the first number: 12
Please enter the second number: ab
Please enter a valid number

Process finished with exit code 0

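Moving the try block inside the loop, so that an invalid input prints the message and the loop continues instead of ending the program: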

print("Enter 'q' to exit!")
while True:
    first = input("\nPlease enter the first number: ")
    if first == 'q':
        break
    second = input("Please enter the second number: ")
    if second == 'q':
        break
    try:
        answer = int(first)+int(second)
        print(f"Your answer is {answer}.")
    except ValueError:
        print("Please enter a valid number")

Result:

Please enter the first number: a
Please enter the second number: 12
Please enter a valid number

Please enter the first number: 34
Please enter the second number: ab
Please enter a valid number
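
One way to make the loop easier to test is to pull the conversion into a small function that returns None instead of raising. This is a sketch; add_as_ints is a hypothetical name, not something from the book:

```python
def add_as_ints(first, second):
    """Return int(first) + int(second), or None if either value is not an integer."""
    try:
        return int(first) + int(second)
    except ValueError:
        # At least one of the two strings could not be converted.
        return None

print(add_as_ints("12", "30"))  # 42
print(add_as_ints("34", "ab"))  # None
```

The input loop then only has to check for None and print the friendly message, which keeps the prompting and the arithmetic separate.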

The book gives the following implementation. It is not wrapped in a while loop, so the program has to be rerun by hand each time.

try:
    x = input("Give me a number: ")
    x = int(x)

    y = input("Give me another number: ")
    y = int(y)
except ValueError:
    print("Sorry, I really needed a number.")    
else:
    sum = x + y
    print(f"The sum of {x} and {y} is {sum}.")

Result:

Give me a number: 12
Give me another number: a
Sorry, I really needed a number.

Process finished with exit code 0

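Exercise 10-8: Cats and Dogs. Make two files, cats.txt and dogs.txt. Store at least three names of cats in the first file and three names of dogs in the second file. Write a program that tries to read these files and print their contents to the screen. Wrap your code in a try-except block to catch the FileNotFoundError, and print a friendly message if a file is missing. Move one of the files to a different location on your system, and make sure the code in the except block executes properly.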

First approach

def read_txt(filename):
    try:
        with open(filename) as f:
            for name in f:
                print(f"The pets is {name.rstrip()}." )
    except FileNotFoundError:
        print(f"\nThis isn't {filename.title()} in the file.")
# filename = 'text_files/cats.txt'
# cat = read_txt(filename)
cats = read_txt('text_files/cats.txt')
print("\n")
dogs = read_txt('text_files/dogs.txt')
rabbits = read_txt('text_files/rabbits.txt')

Result:

The pets is happy.
The pets is cinima.
The pets is sucy.


The pets is huanhuan.
The pets is xixi.
The pets is lulu.

This isn't Text_Files/Rabbits.Txt in the file.
[Finished in 47ms]

Second approach: loop over the .txt filenames with a for loop and open each one with with open.

filenames = ['text_files/cats.txt','dogs.txt']

for filename in filenames:
    print(f"\nReading file: {filename}")
    try:
        with open(filename) as f:
            contents = f.read()
            print(contents)
    except FileNotFoundError:
        print(f"Sorry, I can't find {filename}.")

Reading file: text_files/cats.txt
happy
cinima
sucy

Reading file: dogs.txt
Sorry, I can't find dogs.txt.
[Finished in 62ms]
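
The same loop can be written with pathlib, whose Path.read_text() opens, reads, and closes the file in one call. A sketch, with read_if_present as an illustrative name that does not appear in the original code:

```python
from pathlib import Path

def read_if_present(filename):
    """Return the file's text, or None if the file does not exist."""
    try:
        return Path(filename).read_text()
    except FileNotFoundError:
        return None

for filename in ['text_files/cats.txt', 'dogs.txt']:
    contents = read_if_present(filename)
    if contents is None:
        print(f"Sorry, I can't find {filename}.")
    else:
        print(contents)
```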

Exercise 10-9: Silent Cats and Dogs. Modify your except block in Exercise 10-8 to fail silently if either file is missing.

Here the print call is moved into the else block, so that when a .txt file does not exist it is skipped without printing anything.

filenames = ['text_files/cats.txt','dogs.txt']

for filename in filenames:
    try:
        with open(filename) as f:
            contents = f.read()
    except FileNotFoundError:
        # print(f"Sorry, I can't find {filename}.")
        pass
    else:
        print(f"\nReading file: {filename}")
        print(contents)

Or, using the first approach again:

def read_txt(filename):
    try:
        with open(filename) as f:
            for name in f:
                print(f"The pets is {name.rstrip()}." )
    except FileNotFoundError:
        # print(f"\nThis isn't {filename.title()} in the file.")
        pass
# filename = 'text_files/cats.txt'
# cat = read_txt(filename)
cats = read_txt('text_files/cats.txt')
print("\n")
dogs = read_txt('text_files/dogs.txt')
rabbits = read_txt('text_files/rabbits.txt')

Result:

The pets is happy.
The pets is cinima.
The pets is sucy.


The pets is huanhuan.
The pets is xixi.
The pets is lulu.
[Finished in 47ms]
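
For the silent case, the standard library also offers contextlib.suppress, which does the same job as an except FileNotFoundError block containing only pass. A minimal sketch, with print_file_quietly as a made-up helper name:

```python
from contextlib import suppress

def print_file_quietly(filename):
    """Print a file's contents; do nothing at all if the file is missing."""
    with suppress(FileNotFoundError):
        with open(filename) as f:
            print(f.read())

# Prints nothing if the file is absent.
print_file_quietly('text_files/rabbits.txt')
```

Whether this reads better than try-except-pass is a matter of taste; the behavior is the same.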

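Exercise 10-10: Common Words. Visit Project Gutenberg and find a few texts you'd like to analyze. Download the text files for these works, or copy the raw text from your browser into a text file.
You can use the count() method to find out how many times a word or phrase appears in a string. For example, you can count the number of times 'row' appears in a string.
Notice that converting the string to lowercase using lower() catches all appearances of the word you're looking for, regardless of how it's capitalized.
Write a program that reads the files you found at Project Gutenberg and determines how many times the word 'the' appears in each text. This will be an approximation, because it will also count words such as 'then' and 'there'. Try counting 'the ', with a space in the string, and see how much the result differs.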
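
The count() and lower() behavior described in the exercise can be checked on a short string. The sample sentences here are chosen only for illustration:

```python
line = "Row, row, row your boat"

# count() is case-sensitive, so the capitalized "Row" is not matched.
print(line.count('row'))          # 2
# Lowercasing first catches every occurrence.
print(line.lower().count('row'))  # 3

text = "The cat sat there, then the dog left."
# 'the' also matches inside "there" and "then".
print(text.lower().count('the'))   # 4
# 'the ' with a trailing space only matches the standalone word here.
print(text.lower().count('the '))  # 2
```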

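filename = 'text_files/alice.txt'
with open(filename) as f:
    contents = f.read()
    # Note: title() capitalizes the first letter of every word, so this only
    # matches 'the' inside longer words (e.g. "Other"). The exercise intends lower().
    print(contents.title().count('the'))

Result:

84
[Finished in 62ms]

filename = 'text_files/alice.txt'
with open(filename) as f:
    contents = f.read()
    print(contents.lower().count('then'))

Result:

39
[Finished in 47ms]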

The following is from a reference solution:

def count_words(filename, word):
    # The count here is only approximate.
    try:
        with open(filename, encoding='utf-8') as f:
            contents = f.read()
    except FileNotFoundError:
        pass
    else:
        # Build and print the message only when the file was actually read;
        # otherwise word_count would be undefined.
        word_count = contents.lower().count(word)
        msg = f"'{word}' appears in {filename} about {word_count} times."
        print(msg)

filename = 'text_files/alice.txt'
count_words(filename,'the')

Result:

'the' appears in text_files/alice.txt about 1221 times.
[Finished in 47ms]
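
Since count() matches substrings, one way to count only whole words is a regular expression with word boundaries. This goes slightly beyond the exercise and is a sketch, not the book's solution; count_whole_word is a hypothetical name:

```python
import re

def count_whole_word(text, word):
    """Count case-insensitive occurrences of word as a standalone word."""
    # \b marks a word boundary; re.escape guards against special characters in word.
    pattern = r'\b' + re.escape(word) + r'\b'
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

sample = "The then there the other the."
print(sample.lower().count('the'))      # 6: substring matches
print(count_whole_word(sample, 'the'))  # 3: whole words only
```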