【py code everyday】2023-02-08

学无止境2023

Tags: python, programming languages

Article link: https://blog.csdn.net/JMHSMX/article/details/128929749

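This article shows a UnicodeDecodeError raised while reading files in Python and works around it by trying different encodings (GB2312, GBK, ISO-8859-1). It also covers handling the ValueError raised when the user enters something that is not a number, and failing silently when a file does not exist. It ends with a word-count analysis of Project Gutenberg texts.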

Working with multiple files

def count_words(filename):
    """Count the approximate number of words in a file."""
    try:
        with open(filename, encoding='utf-8') as f:
            contents = f.read()
    except FileNotFoundError:
        print(f"Sorry, the file {filename} does not exist.")
    else:
        words = contents.split()
        num_words = len(words)
        print(f"The file {filename} has about {num_words} words.")


filenames = ['text_files/alice.txt', 'text_files/siddhartha.txt',
             'text_files/moby_dick.txt', 'text_files/little_women.txt']
for filename in filenames:
    count_words(filename)

Running it raised an exception:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa1 in position 2415: invalid start byte
[Finished in 47ms]

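Searching online, I found that the cause is a Unicode decoding error: the file is not valid UTF-8, so the 'utf-8' codec hits a byte that cannot start a UTF-8 sequence (0xa1 here; the post I found reported the same error for byte 0xBC at position 2).

Changing encoding='utf-8' to GB2312, gbk, or ISO-8859-1 made the error go away; any one of the three worked. Note that ISO-8859-1 maps every possible byte to a character, so it never raises a decoding error, which means a clean read does not prove it is the file's real encoding.

That solved the problem: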

    try:
        with open(filename, encoding='GB2312') as f:
            contents = f.read()

    try:
        with open(filename, encoding='gbk') as f:
            contents = f.read()

    try:
        with open(filename, encoding='ISO-8859-1') as f:
            contents = f.read()

Result:

The file text_files/alice.txt has about 17846 words.
Sorry, the file text_files/siddhartha.txt does not exist.
The file text_files/moby_dick.txt has about 6998 words.
The file text_files/little_women.txt has about 67354 words.
[Finished in 62ms]
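
Rather than editing the encoding by hand, the trial-and-error above can be automated. The following is only a sketch: the helper name read_text and the candidate list are assumptions for illustration, not part of the original program. ISO-8859-1 is placed last on purpose, because it accepts any byte sequence and would otherwise hide the stricter candidates.

```python
def read_text(filename, encodings=('utf-8', 'gbk', 'iso-8859-1')):
    """Return (contents, encoding) for the first encoding that decodes cleanly.

    read_text and its default candidate list are hypothetical helpers for
    illustration; adjust the candidates to the files you actually have.
    """
    # Read the raw bytes once, then try each candidate encoding in order.
    with open(filename, 'rb') as f:
        raw = f.read()
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Only reachable if the caller passes a list without a catch-all encoding.
    raise ValueError(f"None of {encodings} could decode {filename}")

# Example use (the path is the one from the article):
# contents, used = read_text('text_files/alice.txt')
# print(f"Decoded with {used}")
```

A successful decode only shows that the bytes are valid in that encoding, so it is still worth looking at the text to check that it reads correctly.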

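Removing the encoding argument altogether also works. Without it, open() falls back to the platform's default encoding, which is presumably a GBK-compatible one on this machine: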

    try:
        with open(filename) as f:
            contents = f.read()

Running it gives the same result as above.
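
To see which encoding open() uses when none is given, the locale module can be queried. This is a small sketch; the value it prints depends on the operating system and its settings, so no particular output is assumed here.

```python
import locale

# Encoding that open() falls back to in text mode when no encoding
# argument is passed. The result is platform dependent.
default_encoding = locale.getpreferredencoding(False)
print(f"Default text encoding: {default_encoding}")
```

Because this default differs between machines, passing an explicit encoding to open() keeps the program's behavior the same everywhere.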

The interesting pass statement: used inside an except block, it tells Python to do nothing.

def count_words(filename):
    """Count the approximate number of words in a file."""
    try:
        with open(filename, encoding='GB2312') as f:
            contents = f.read()
    except FileNotFoundError:
        # print(f"Sorry, the file {filename} does not exist.")
        pass
    else:
        words = contents.split()
        num_words = len(words)
        print(f"The file {filename} has about {num_words} words.")


filenames = ['text_files/alice.txt', 'text_files/siddhartha.txt',
             'text_files/moby_dick.txt', 'text_files/little_women.txt']
for filename in filenames:
    count_words(filename)

Result:

The file text_files/alice.txt has about 17846 words.
The file text_files/moby_dick.txt has about 6998 words.
The file text_files/little_women.txt has about 67354 words.
[Finished in 62ms]
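
The standard library also offers contextlib.suppress, which expresses the same "ignore this exception" idea as try/except/pass. The variant below is a sketch, not the book's solution: it returns the count instead of printing it, and it assumes UTF-8 files for simplicity.

```python
from contextlib import suppress


def count_words(filename):
    """Return the approximate word count, or None if the file is missing."""
    # suppress() swallows FileNotFoundError, just like "except ...: pass".
    with suppress(FileNotFoundError):
        with open(filename, encoding='utf-8') as f:
            return len(f.read().split())
    # Reached only when the file could not be opened.
    return None
```

Returning None lets the caller decide whether a missing file deserves a message, while the function itself stays silent.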

Exercises

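A common problem when prompting for numerical input occurs when people provide text instead of numbers. When you try to convert the input to an int, you'll get a ValueError. Write a program that prompts for two numbers. Add them together and print the result. Catch the ValueError if either input value is not a number, and print a friendly error message. Test your program by entering two numbers and then by entering some text instead of a number.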

print("Enter 'q' to exit!")
try:
    while True:
        first = input("Please enter the first number: ")
        if first == 'q':
            break
        second = input("Please enter the second number: ")
        if second == 'q':
            break
        answer = int(first)+int(second)
        print(f"Your answer is {answer}.")
except ValueError:
    print("Please enter a valid number")

Enter 'q' to exit!
Please enter the first number: 12
Please enter the second number: ab
Please enter a valid number

Process finished with exit code 0

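Moving the try block inside the loop lets the program keep prompting after an invalid input instead of exiting: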

print("Enter 'q' to exit!")
while True:
    first = input("\nPlease enter the first number: ")
    if first == 'q':
        break
    second = input("Please enter the second number: ")
    if second == 'q':
        break
    try:
        answer = int(first)+int(second)
        print(f"Your answer is {answer}.")
    except ValueError:
        print("Please enter a valid number")

Please enter the first number: a 
Please enter the second number: 12
Please enter a valid number

Please enter the first number: 34
Please enter the second number: ab
Please enter a valid number
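
A further refinement is to validate each value as soon as it is typed, so that a bad first number does not force the user to enter the second one as well. The helper below is a sketch: read_int and its input_func parameter are assumptions introduced here (the parameter exists only so the function can be exercised without a keyboard).

```python
def read_int(prompt, input_func=input):
    """Keep prompting until the user types a valid integer, then return it.

    read_int and input_func are hypothetical names for illustration.
    """
    while True:
        text = input_func(prompt)
        try:
            return int(text)
        except ValueError:
            print(f"'{text}' is not a valid number, please try again.")

# Example use:
# first = read_int("Please enter the first number: ")
# second = read_int("Please enter the second number: ")
# print(f"Your answer is {first + second}.")
```

This version leaves out the 'q' shortcut used above; it would need an extra check before the int() conversion.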

The book gives the following implementation. It does not use a while loop, so the program has to be rerun by hand each time:

try:
    x = input("Give me a number: ")
    x = int(x)

    y = input("Give me another number: ")
    y = int(y)
except ValueError:
    print("Sorry, I really needed a number.")    
else:
    sum = x + y
    print(f"The sum of {x} and {y} is {sum}.")

Give me a number: 12
Give me another number: a
Sorry, I really needed a number.

Process finished with exit code 0

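Exercise 10-8: Cats and Dogs. Create two files, cats.txt and dogs.txt. Store at least three names of cats in the first file and at least three names of dogs in the second file. Write a program that tries to read these files and print their contents to the screen. Wrap your code in a try-except block to catch the FileNotFoundError, and print a friendly message if a file is missing. Move one of the files to a different location, and make sure the code in the except block executes properly.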

First approach

def read_txt(filename):
    """Print each pet name stored in filename, one per line."""
    try:
        with open(filename) as f:
            for name in f:
                print(f"The pet is {name.rstrip()}.")
    except FileNotFoundError:
        # Show the path unchanged; title() would alter its capitalization.
        print(f"\nThe file {filename} does not exist.")


# read_txt() returns nothing, so its result is not assigned to a variable.
read_txt('text_files/cats.txt')
print("\n")
read_txt('text_files/dogs.txt')
read_txt('text_files/rabbits.txt')

The pet is happy.
The pet is cinima.
The pet is sucy.


The pet is huanhuan.
The pet is xixi.
The pet is lulu.

The file text_files/rabbits.txt does not exist.
[Finished in 47ms]

Second approach: loop over the .txt filenames with a for loop, and open each one with "with open":

filenames = ['text_files/cats.txt','dogs.txt']

for filename in filenames:
    print(f"\nReading file: {filename}")
    try:
        with open(filename) as f:
            contents = f.read()
            print(contents)
    except FileNotFoundError:
        print(f"Sorry, I can't find {filename}.")


Reading file: text_files/cats.txt
happy
cinima
sucy

Reading file: dogs.txt
Sorry, I can't find dogs.txt.
[Finished in 62ms]
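
The same loop can be written with pathlib, whose Path.read_text() opens, reads, and closes the file in one call. This is a sketch under assumptions: the function name print_pet_files and the returned list of missing files are additions for illustration, and the files are assumed to be UTF-8.

```python
from pathlib import Path


def print_pet_files(filenames):
    """Print the contents of each file and return the names that were missing.

    print_pet_files is a hypothetical helper, not part of the exercise.
    """
    missing = []
    for filename in filenames:
        try:
            contents = Path(filename).read_text(encoding='utf-8')
        except FileNotFoundError:
            print(f"Sorry, I can't find {filename}.")
            missing.append(filename)
        else:
            print(f"\nReading file: {filename}")
            print(contents.rstrip())
    return missing

# Example use:
# print_pet_files(['text_files/cats.txt', 'dogs.txt'])
```

Collecting the missing names makes it easy to report them together at the end instead of interleaving messages with the file contents.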

Exercise 10-9: Silent Cats and Dogs. Modify the except block you wrote in Exercise 10-8 so that the program fails silently if either file is missing.

Here the print calls are moved into the else block, so that when a .txt file does not exist, nothing at all is printed for it.

filenames = ['text_files/cats.txt','dogs.txt']

for filename in filenames:
    try:
        with open(filename) as f:
            contents = f.read()
    except FileNotFoundError:
        # print(f"Sorry, I can't find {filename}.")
        pass
    else:
        print(f"\nReading file: {filename}")
        print(contents)

Alternatively, reuse the first approach:

def read_txt(filename):
    """Print each pet name stored in filename; stay silent if it is missing."""
    try:
        with open(filename) as f:
            for name in f:
                print(f"The pet is {name.rstrip()}.")
    except FileNotFoundError:
        # print(f"\nThe file {filename} does not exist.")
        pass


# read_txt() returns nothing, so its result is not assigned to a variable.
read_txt('text_files/cats.txt')
print("\n")
read_txt('text_files/dogs.txt')
read_txt('text_files/rabbits.txt')

The pet is happy.
The pet is cinima.
The pet is sucy.


The pet is huanhuan.
The pet is xixi.
The pet is lulu.
[Finished in 47ms]

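Exercise 10-10: Common Words. Visit Project Gutenberg and find a few texts you'd like to analyze. Download the text files for these works, or copy the raw text from your browser into a text file.
You can use the count() method to find out how many times a word or phrase appears in a string. For example, it can count the number of times 'row' appears in a string.
Notice that converting the string to lowercase using lower() catches all appearances of the word you're looking for, regardless of how it's capitalized.
Write a program that reads the files you found at Project Gutenberg and determines how many times the word 'the' appears in each text. This will be an approximation, because it will also count words such as 'then' and 'there'. Try counting 'the ', with a space in the string, and see how much lower your count is.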
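
The behavior of count() described in the exercise can be tried out on short strings first. The sample strings below are illustrative assumptions, not text taken from the downloaded files.

```python
# Illustrative sample strings (assumptions, not from the author's files).
line = "Row, row, row your boat"
print(line.count('row'))          # 2: the capitalized "Row" is not matched
print(line.lower().count('row'))  # 3: lower() makes the match case-insensitive

sentence = "The cat sat there, then the dog left."
lowered = sentence.lower()
print(lowered.count('the'))   # 4: also counts "there" and "then"
print(lowered.count('the '))  # 2: the trailing space excludes them
```

Counting 'the ' still misses cases such as "the" followed by a comma or a line break, so it remains an approximation.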

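filename = 'text_files/alice.txt'
with open(filename) as f:
    contents = f.read()
    # Caution: title() capitalizes the first letter of every word, so the
    # lowercase 'the' can only match inside longer words (as in "Other").
    # The exercise intends lower() here, which explains the low count.
    print(contents.title().count('the'))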

84
[Finished in 62ms]

filename = 'text_files/alice.txt'
with open(filename) as f:
    contents = f.read()
    print(contents.lower().count('then'))

39
[Finished in 47ms]

The following comes from a reference solution:

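def count_words(filename, word):
    # The count here is not exact.
    try:
        with open(filename, encoding='utf-8') as f:
            contents = f.read()
    except FileNotFoundError:
        pass
    else:
        word_count = contents.lower().count(word)
        # Build and print the message inside else: word_count does not
        # exist when the file is missing, so using it later would fail.
        msg = f"'{word}' appears in {filename} about {word_count} times."
        print(msg)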

filename = 'text_files/alice.txt'
count_words(filename,'the')

'the' appears in text_files/alice.txt about 1221 times.
[Finished in 47ms]
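
For an exact count of 'the' as a whole word, a regular expression with word boundaries is one option. This goes beyond what the exercise asks for; count_whole_word is a hypothetical helper and the sample sentence is an assumption for illustration.

```python
import re


def count_whole_word(text, word):
    """Count case-insensitive occurrences of word as a whole word in text."""
    # \b matches a word boundary, so 'the' does not match inside 'then'.
    pattern = rf"\b{re.escape(word)}\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


sample = "The cat sat there, then the dog left."
print(count_whole_word(sample, 'the'))  # 2
```

To apply it to a book, pass the file's contents as text, for example the contents variable read in the code above.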
