Python error: 'gbk' codec can't encode character '\xa0' in position 389: illegal multibyte sequence
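Cause: '\xa0' is a non-breaking space (U+00A0, &nbsp; in HTML), which the gbk codec cannot encode. Fix: normalize the string so that it becomes a regular space:
unicodedata.normalize("NFKD", unicode_str)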
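import unicodedata
from bs4 import BeautifulSoup

# raw_html is the HTML string being parsed
text_string = BeautifulSoup(raw_html, "lxml").text
# NFKD maps the non-breaking space '\xa0' to a regular space
clean_text = unicodedata.normalize("NFKD", text_string)
print(clean_text)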
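To see why the normalization helps, here is a minimal sketch that reproduces the encoding failure without BeautifulSoup. The sample string and the helper `can_encode_gbk` are made up for illustration; they are not part of the original snippet. It also shows `str.replace` as a narrower alternative, assuming the non-breaking space is the only problematic character.

```python
import unicodedata

# Hypothetical sample text containing a non-breaking space (U+00A0),
# as often found in text extracted from HTML (&nbsp;).
sample = "price:\xa0100"


def can_encode_gbk(text):
    """Return True if text can be encoded with the gbk codec (illustrative helper)."""
    try:
        text.encode("gbk")
        return True
    except UnicodeEncodeError:
        return False


# NFKD applies compatibility decomposition: U+00A0 becomes a plain space.
normalized = unicodedata.normalize("NFKD", sample)

# Narrower alternative: replace only the non-breaking space.
replaced = sample.replace("\xa0", " ")

print(can_encode_gbk(sample))      # the original string fails to encode
print(can_encode_gbk(normalized))  # the normalized string encodes
print(normalized == replaced)
```

Note that NFKD also rewrites other compatibility characters (for example, full-width forms), so `replace` may be the safer choice when the rest of the text should stay unchanged.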
REF:https://stackoverflow.com/questions/10993612/how-to-remove-xa0.
Reposted
2021-05-17 16:58:00