Following up on the problem I ran into in my previous post, "Simple Text Similarity Calculation":
Traceback (most recent call last):
File "D:/pythonlianxi/wenbensimi1.py", line 52, in <module>
d3 = open(doc3).read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xbf in position 2: illegal multibyte sequence
An encoding error. Fine, I added encoding='utf-8' to the open() call, but the problem persisted, this time at a different line:
Traceback (most recent call last):
File "D:/pythonlianxi/wenbensimi1.py", line 9, in <module>
d1 = open(doc1,'r',encoding='utf-8').read()
File "C:\Users\asus\AppData\Local\Programs\Python\Python35\lib\codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'u
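
The two tracebacks together suggest that the input files may not all share one encoding: the default 'gbk' codec fails on one file, and 'utf-8' fails on another. The byte 0xbf at position 2 in the first traceback is also consistent with a UTF-8 byte order mark (EF BB BF) at the start of that file, although that is only a guess from the error message. Below is a minimal sketch, not the fix from the original post, of reading a file whose encoding is unknown by trying a few candidates in order. The helper name read_text and the candidate list are assumptions made for illustration.

```python
def read_text(path, encodings=("utf-8-sig", "gb18030")):
    """Return the text in `path`, trying each candidate encoding in turn.

    Hypothetical helper: the name and the default candidate list are
    assumptions, not part of the original script.
    - "utf-8-sig" decodes UTF-8 with or without a leading BOM.
    - "gb18030" is a superset of GBK, so it also covers GBK files.
    """
    # Read raw bytes first so no implicit platform default codec is applied.
    with open(path, "rb") as f:
        raw = f.read()

    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            # This candidate does not fit the bytes; try the next one.
            continue

    raise ValueError(
        "could not decode {!r} with any of {!r}".format(path, encodings)
    )
```

With this sketch, a line such as d1 = open(doc1,'r',encoding='utf-8').read() would become d1 = read_text(doc1). UTF-8 is tried first on purpose: GBK-encoded Chinese text is rarely valid UTF-8, whereas gb18030 accepts many byte sequences and would silently produce garbage for UTF-8 input.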