I am using Python with BeautifulSoup version 3.
My XML looks something like this (it comes from a docx file):
Mandatory / Optional
I wanted to extract the content of the 'w:t' tags, so this is what I did:
print soup.findAll('w:t')
This is the error message that I got:
print soup.findAll('w:t')
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in position 43: ordinal not in range(128)
Solution
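The BeautifulSoup object must be created with an XML parser (these parser names are available in BeautifulSoup 4 and require lxml to be installed), as follows: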
BeautifulSoup(markup, "lxml-xml")
or
BeautifulSoup(markup, "xml")
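To make this concrete, here is a minimal sketch of the whole extraction. It assumes Python 3 with the bs4 and lxml packages installed, which is a different setup from the BeautifulSoup 3 install in the question. The markup is a made-up fragment (the real document.xml inside a docx is much larger), and the helper name extract_text_runs is hypothetical.

```python
from bs4 import BeautifulSoup  # BeautifulSoup 4 (package "bs4"), not version 3

# Hypothetical WordprocessingML fragment standing in for the real document.xml.
# The second run contains curly quotes (u"\u201c" and u"\u201d"), the kind of
# character that triggered the UnicodeEncodeError in the question.
markup = (
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:body>'
    '<w:p><w:r><w:t>Mandatory / Optional</w:t></w:r></w:p>'
    '<w:p><w:r><w:t>\u201cquoted\u201d</w:t></w:r></w:p>'
    '</w:body>'
    '</w:document>'
)


def extract_text_runs(xml_text):
    """Return the text of every w:t element, in document order."""
    # "xml" selects the lxml-based XML parser, which keeps the "w:" prefix
    # in the tag name so that find_all("w:t") can match it.
    soup = BeautifulSoup(xml_text, "xml")
    return [tag.get_text() for tag in soup.find_all("w:t")]


texts = extract_text_runs(markup)
```

Collecting the text of each tag as a string, rather than printing the list of tag objects directly, keeps the output step separate from parsing, so any encoding for the console or a file can be chosen explicitly afterwards.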