Reading and writing files containing diacritics in Python (HTML to txt)

Latest recommended article published 2022-11-12 14:23:50

曾颖老师-造价


You haven't said what the problem actually is, so this is a complete guess.

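What does strip_tags() return? Is it a unicode object or a byte string? If it is the latter, you will probably hit a decoding problem when you try to write it to the file, because the codecs writer first has to decode the byte string with the default ascii codec before it can encode it. For example, if strip_tags() returns a UTF-8 encoded byte string: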

>>> import codecs
>>> s = u'This is \xe4 test\nHere is \xe4nother line.'
>>> print s
This is ä test
Here is änother line.
>>> s_utf8 = s.encode('utf-8')
>>> f = codecs.open('test', 'w', encoding='utf8')
>>> f.write(s) # no problem with this... s is unicode, but
>>> f.write(s_utf8)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/codecs.py", line 691, in write
    return self.writer.write(data)
  File "/usr/lib64/python2.7/codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 8: ordinal not in range(128)

If that is what you are seeing, you need to make sure that you pass unicode to fid.write(result), which probably means making sure that strip_tags() returns unicode.
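One way to follow that advice is to normalise the value to text immediately before writing it. The following is a minimal sketch, not the asker's code: to_text() and write_text() are hypothetical helper names, and it assumes that any byte string coming back from strip_tags() is UTF-8 encoded. It is written so that it runs on Python 2.7 and on Python 3.

```python
import codecs
import os
import tempfile


def to_text(value, encoding="utf-8"):
    """Return text: decode byte strings, pass text through unchanged.

    Hypothetical helper; assumes byte strings are encoded with `encoding`.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def write_text(path, value, encoding="utf-8"):
    """Write value to path, decoding it first if it is a byte string."""
    with codecs.open(path, "w", encoding=encoding) as fid:
        fid.write(to_text(value, encoding))


# Usage: a UTF-8 byte string is decoded before it reaches the writer,
# so the write no longer falls back to the ascii codec.
text = u"This is \xe4 test"
raw = text.encode("utf-8")
path = os.path.join(tempfile.mkdtemp(), "test.txt")
write_text(path, raw)
with codecs.open(path, "r", encoding="utf-8") as fid:
    round_trip = fid.read()
```

Decoding at the boundary like this keeps everything inside the program as unicode, so only the file layer deals with encodings.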

Apart from that, I noticed a couple of other things:

codecs.open() raises an IOError exception if it cannot open the file. It does not return None, so the if not fid: test has no effect. You need to use try/except, ideally combined with with.

try:
    with codecs.open(htmlFile, "r", encoding="utf-8") as fid:
        htmlText = fid.read()
except IOError as e:
    # handle error
    print(e)

Also, data that you read from a file opened with codecs.open() is automatically decoded to unicode, so calling unicode(htmlText) achieves nothing.
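Both points about reading can be checked with a small experiment. This is only a sketch under stated assumptions: read_text() is a hypothetical helper, the file names are invented for the demonstration, and returning None on failure is just one possible way to handle the error. It runs on Python 2.7 and on Python 3.

```python
import codecs
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Return the decoded contents of path, or None if it cannot be read.

    Hypothetical helper; returning None is an illustrative choice only.
    """
    try:
        with codecs.open(path, "r", encoding=encoding) as fid:
            return fid.read()
    except IOError as e:
        print("could not read %s: %s" % (path, e))
        return None


# Create a small UTF-8 encoded HTML file to read back.
tmp_dir = tempfile.mkdtemp()
html_path = os.path.join(tmp_dir, "page.html")
with open(html_path, "wb") as fid:
    fid.write(u"<p>\xe4nother line</p>".encode("utf-8"))

# The result is already text, so no further unicode() call is needed.
html_text = read_text(html_path)

# A missing file raises IOError inside the helper instead of yielding None
# from codecs.open(); the helper catches it and reports the problem.
missing = read_text(os.path.join(tmp_dir, "missing.html"))
```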
