I am trying to write an XML file containing UTF-8 encoded data using ElementTree, like this:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ET
import codecs
testtag = ET.Element('unicodetag')
testtag.text = u'Töreboda'  # The o is really ö (o with two dots over it). No idea why SO doesn't display this
expfile = codecs.open('testunicode.xml',"w","utf-8-sig")
ET.ElementTree(testtag).write(expfile,encoding="UTF-8",xml_declaration=True)
expfile.close()
This produces the following error:
Traceback (most recent call last):
  File "unicodetest.py", line 10, in <module>
    ET.ElementTree(testtag).write(expfile,xml_declaration=True)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 815, in write
    serialize(write,self._root,encoding,qnames,namespaces)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 932, in _serialize_xml
    write(_escape_cdata(text,encoding))
  File "/usr/lib/python2.7/codecs.py", line 691, in write
    return self.writer.write(data)
  File "/usr/lib/python2.7/codecs.py", line 351, in write
    data,consumed = self.encode(object,self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
Using the "us-ascii" encoding instead works fine, but it does not preserve the Unicode characters in the data. What is happening?
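
One possible explanation, offered as an assumption rather than a confirmed diagnosis: with encoding="UTF-8", ElementTree already hands encoded byte strings to the file object, and the codecs.open() writer then tries to encode them a second time, which in Python 2 means an implicit ASCII decode that fails on the 0xc3 byte of ö. With "us-ascii" the non-ASCII characters would be written as character references, so every byte stays within ASCII and nothing fails. If that is right, a minimal sketch like the following, which writes to a plain binary file and lets ElementTree do the only encoding step, should avoid the error. The helper name write_utf8_xml and the temporary output path are made up for illustration, and unlike "utf-8-sig" this variant writes no BOM.

```python
# -*- coding: utf-8 -*-
import os
import tempfile
import xml.etree.ElementTree as ET


def write_utf8_xml(path, text):
    """Hypothetical helper: write a one-element XML document as UTF-8 bytes."""
    root = ET.Element('unicodetag')
    root.text = text
    # Open in binary mode with no codecs wrapper, so ElementTree's own
    # UTF-8 encoding is the only encoding step applied to the data.
    with open(path, 'wb') as f:
        ET.ElementTree(root).write(f, encoding='UTF-8', xml_declaration=True)


# Illustrative output location; any writable path would do.
out_path = os.path.join(tempfile.mkdtemp(), 'testunicode.xml')
write_utf8_xml(out_path, u'T\xf6reboda')  # u'\xf6' is ö
```

Parsing the file back with ET.parse(out_path) is a quick way to check that the ö survived the round trip.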