Detecting string encoding in Python
Method 1:
isinstance(s, str) # True if s is a str (a byte string in Python 2)
isinstance(s, unicode) # True if s is a unicode string
Method 2:
if type(s).__name__ != "unicode":
    # s is not unicode yet, so decode it (this assumes the bytes are UTF-8)
    s = s.decode("utf-8")
else:
    pass
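In Python 3 the unicode type no longer exists: str is text and bytes holds encoded data. A minimal sketch of the same check-then-decode idea for Python 3 follows. The helper name to_text and its encoding parameter are illustrative and not part of the original note, and UTF-8 is assumed as the default just as in Method 2.

```python
def to_text(s, encoding="utf-8"):
    """Return s as text, decoding it first if it is a byte string."""
    if isinstance(s, bytes):
        # Encoded data: decode with the assumed encoding
        return s.decode(encoding)
    if isinstance(s, str):
        # Already text: nothing to do
        return s
    raise TypeError("expected bytes or str, got %s" % type(s).__name__)


print(to_text(b"caf\xc3\xa9"))  # bytes are decoded
print(to_text("café"))          # text passes through unchanged
```

Using isinstance rather than comparing type names also accepts subclasses, which is usually what is wanted here.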
Method 3: use chardet to detect the encoding of a web page
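import urllib   # Python 2; in Python 3 urlopen lives in urllib.request
import chardet  # third-party package
rawdata = urllib.urlopen('http://www.google.cn/').read()  # raw bytes of the page
chardet.detect(rawdata)  # guess the encoding from the bytes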
Returns: {'confidence': 0.98999999999999999, 'encoding': 'GB2312'}
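The result is a guess with a confidence score, so it is usually worth checking before decoding. A rough Python 3 sketch of acting on a result shaped like the one above: decode_with_guess, min_confidence and fallback are hypothetical names and defaults chosen for illustration, not chardet settings, and the guess dict is passed in by hand here so the example runs without a network call.

```python
def decode_with_guess(raw, guess, min_confidence=0.8, fallback="utf-8"):
    """Decode raw bytes using a chardet-style guess dict.

    guess is expected to look like {'encoding': 'GB2312', 'confidence': 0.99}.
    min_confidence and fallback are illustrative defaults (assumptions).
    """
    encoding = guess.get("encoding")
    confidence = guess.get("confidence") or 0.0
    if encoding is None or confidence < min_confidence:
        # No usable guess: use the assumed fallback encoding instead
        encoding = fallback
    try:
        return raw.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        # Wrong guess or unknown codec name: decode leniently instead of raising
        return raw.decode(fallback, errors="replace")


raw = "中文编码".encode("gb2312")
guess = {"confidence": 0.99, "encoding": "GB2312"}  # shaped like the result above
print(decode_with_guess(raw, guess))
```

The threshold of 0.8 is arbitrary; a stricter or looser value may suit different data.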