运行boilerpipe 时报以下错误:
Traceback (most recent call last):
File "/Users/Adrian/anaconda3/lib/python3.6/site-packages/boilerpipe/extract/__init__.py", line 45, in __init__
self.data = unicode(self.data, encoding)NameError: name 'unicode' is not defined
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "webcrawl.py", line 26, in <module>
fileW.write(str(Extractor(extractor='ArticleExtractor', url=urls).getText() + "\n\n\n" + str(articleDate.publish_date)+"\n\n\n"))
File "/Users/Adrian/anaconda3/lib/python3.6/site-packages/boilerpipe/extract/__init__.py", line 47, in __init__
self.data = self.data.decode(encoding)UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte<