Python txt file copy: how to use Python to read text copied from the web into a txt file

I'm learning how to read text files. I used this approach:

f=open("sample.txt")

print(f.read())

It worked fine if I typed the txt file myself. But when I copied text from a news article on the web, it produced the following error:

UnicodeEncodeError: 'charmap' codec can't encode character '\u2014' in position 738: character maps to <undefined>

I tried changing the Encoding setting in Notepad++ to UTF-8, as I read somewhere that the encoding was the cause.

I also tried using:

f=open("sample.txt",encoding='utf-8')

from here

But it still didn't work.

Solution

You're on Windows and trying to print to the console. The print() is throwing the exception.

The Windows console only natively supports 8-bit code pages, so any character outside your regional code page will fail to print (despite what people say about chcp 65001).
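The failure can be reproduced without a Windows console by encoding the same character with an 8-bit code page by hand. This is only a sketch: the choice of cp437 is an assumption (the real console code page depends on your regional settings), and try_encode is a hypothetical helper written for this illustration.

```python
# Assumption: the console uses code page 437; yours may differ (run `chcp` to check).
text = "Copied from the web \u2014 with an em dash"


def try_encode(s, codec):
    """Return the encoded bytes, or the error message if the codec cannot represent s."""
    try:
        return s.encode(codec)
    except UnicodeEncodeError as exc:
        return str(exc)


# An 8-bit code page has no slot for U+2014, so encoding fails.
cp437_result = try_encode(text, "cp437")

# UTF-8 can represent every Unicode character, so encoding succeeds.
utf8_result = try_encode(text, "utf-8")

print(cp437_result)
print(utf8_result)
```

print() performs the same encoding step when it writes to the console, which is why the exception is raised there and not in open() or read().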

You need to install and use https://github.com/Drekin/win-unicode-console. This module talks to the console API at a low level, giving support for multi-byte characters for both input and output.

Alternatively, instead of printing to the console, write your output to a file opened with an explicit encoding. For example:

with open("myoutput.log", "w", encoding="utf-8") as my_log:

    my_log.write(body)

Ensure you open the file with the correct encoding.
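A minimal end-to-end sketch of the file-based approach follows. The file names sample.txt and myoutput.log come from the question and answer; the article text and the temporary directory are assumptions made so the example can run on its own.

```python
import os
import tempfile

# Hypothetical stand-in for text pasted from a web page, including an em dash.
article = "Markets rallied \u2014 analysts were surprised.\n"

# Work in a temporary directory so the sketch does not touch existing files.
workdir = tempfile.mkdtemp()
sample_path = os.path.join(workdir, "sample.txt")
log_path = os.path.join(workdir, "myoutput.log")

# Save the sample as UTF-8, as Notepad++ would with its encoding set to UTF-8.
with open(sample_path, "w", encoding="utf-8") as f:
    f.write(article)

# Read it back with the same explicit encoding.
with open(sample_path, encoding="utf-8") as f:
    body = f.read()

# Write to a log file instead of printing to the console.
with open(log_path, "w", encoding="utf-8") as my_log:
    my_log.write(body)

# Confirm that the em dash survived the round trip.
with open(log_path, encoding="utf-8") as my_log:
    round_trip = my_log.read()
```

Nothing is printed here, so the console code page never comes into play; the log file can be opened in any editor that understands UTF-8.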
