1. To read a UTF-8 file without a BOM, pass encoding='utf-8' to open().
2. To read a UTF-8 file with a BOM, pass encoding='utf-8-sig' to open().
3. To read a GBK file (GBK has no BOM), pass encoding='gbk' to open().
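The difference between cases 1 and 2 can be seen with a small sketch. It assumes a throwaway temporary file written just for the demonstration; the helper name read_text is made up here and is not part of any library.

```python
import codecs
import os
import tempfile


def read_text(path, encoding):
    """Read the whole file as text using the given encoding (hypothetical helper)."""
    with open(path, 'r', encoding=encoding) as f:
        return f.read()


# Write a UTF-8 file that starts with a BOM (bytes EF BB BF).
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'wb') as f:
    f.write(codecs.BOM_UTF8 + 'hello'.encode('utf-8'))

# 'utf-8' keeps the BOM as a leading U+FEFF character.
with_bom_kept = read_text(demo_path, 'utf-8')
# 'utf-8-sig' strips the BOM while decoding.
with_bom_stripped = read_text(demo_path, 'utf-8-sig')

print(repr(with_bom_kept))
print(repr(with_bom_stripped))

os.remove(demo_path)
```

Since 'utf-8-sig' also decodes UTF-8 files that have no BOM, it is a reasonable default whenever the presence of a BOM is unknown.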
Catch-all approach (detect the encoding with chardet):
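import os
import chardet  # third-party package: pip install chardet

# filename is assumed to hold the path of the file to read.
# 32 bytes is enough to catch a BOM; files without one may need a larger sample.
sample_size = min(32, os.path.getsize(filename))
with open(filename, 'rb') as f:
    raw = f.read(sample_size)
result = chardet.detect(raw)
encoding = result['encoding']  # may be None if detection fails
with open(filename, 'r', encoding=encoding) as infile:
    data = infile.read()
print(data)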
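If only the BOM matters and chardet is not available, the same idea can be approximated with the standard library alone. This is a minimal sketch, not the method from the referenced answer: the function name sniff_bom_encoding and the default fallback of 'utf-8' are assumptions made for illustration.

```python
import codecs

# Check longer BOMs first: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_LE, 'utf-32'),
    (codecs.BOM_UTF32_BE, 'utf-32'),
    (codecs.BOM_UTF8, 'utf-8-sig'),
    (codecs.BOM_UTF16_LE, 'utf-16'),
    (codecs.BOM_UTF16_BE, 'utf-16'),
]


def sniff_bom_encoding(raw, default='utf-8'):
    """Return a codec name based on the BOM at the start of raw, else default."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    return default
```

The returned names ('utf-8-sig', 'utf-16', 'utf-32') are codecs that consume the BOM while decoding, so the result can be passed straight to open(). A file with no BOM falls through to the default, so GBK files still need chardet or an explicit encoding='gbk'.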
References:
Reading Unicode file data with BOM chars in Python
http://stackoverflow.com/questions/13590749/reading-unicode-file-data-with-bom-chars-in-python#comment18629764_13591421
The Python API documentation covers this in detail: