What is the default encoding for strings in Python 2.x? I've read that there are two ways to declare a string.
string = 'this is a string'
unicode_string = u'this is a unicode string'
The second string is in Unicode.
What is the encoding of the first string?
Solution
As per "Python default/implicit string encodings and conversions" (summarizing its Python 2 part concisely, to minimize duplication):
There are actually multiple independent "default" string encodings in Python 2, used by different parts of its functionality.
Parsing the code and string literals:
str from a literal -- will contain raw bytes from the file, no transcoding is done
unicode from a literal -- the bytes from the file are decode'd with the file's "source encoding", which defaults to ascii
with from __future__ import unicode_literals, all unprefixed string literals in the file are treated as Unicode literals
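To make the parsing rules concrete, here is a minimal sketch. It writes the bytes as escapes so that it does not depend on how this snippet itself is saved, and the explicit decode('utf-8') only stands in for what the parser would do for a u'...' literal, under the assumption that the source file declares UTF-8. The variable names are illustrative.

```python
# The two bytes a UTF-8 encoded source file stores for the character U+00E9.
raw_bytes = b'\xc3\xa9'

# A plain '...' literal in Python 2 keeps those bytes untouched.
# A u'...' literal is decoded with the source encoding; assuming the file
# declares UTF-8, the effect is equivalent to this explicit decode:
decoded = raw_bytes.decode('utf-8')

print(len(raw_bytes))      # 2 (two raw bytes)
print(len(decoded))        # 1 (one Unicode code point)
print(decoded == u'\xe9')  # True
```

Without a source-encoding declaration, Python 2 assumes ascii and refuses to parse a file that contains such non-ASCII bytes.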
Transcoding/type conversion:
str<->unicode type conversion and encode/decode w/o arguments are done with sys.getdefaultencoding()
which is almost always ascii, so any non-ASCII characters will cause a UnicodeError
str can only be decode'd and unicode can only be encode'd. Calling the other method involves an implicit type conversion first (with the aforementioned result)
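The implicit conversion can be sketched by spelling it out explicitly. The helper implicit_to_unicode below is hypothetical (it is not part of Python), and it assumes the default encoding is ascii, as it is on a stock Python 2 install.

```python
import sys

def implicit_to_unicode(raw, default_encoding='ascii'):
    """Hypothetical helper: the decode that Python 2 performs behind the
    scenes when a str is mixed with unicode (assumed default: ascii)."""
    return raw.decode(default_encoding)

# 'ascii' on a stock Python 2; Python 3 reports 'utf-8' here.
print(sys.getdefaultencoding())

# Pure ASCII bytes convert without trouble.
print(implicit_to_unicode(b'plain') == u'plain')  # True

# UTF-8 bytes for a non-ASCII character cannot be decoded as ascii.
try:
    implicit_to_unicode(b'caf\xc3\xa9')
except UnicodeError as exc:
    print(type(exc).__name__)  # UnicodeDecodeError
```

The usual remedy is to never rely on the implicit step: decode bytes with an explicit encoding at the input boundary and encode at the output boundary.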
I/O, including printing:
unicode -- encode'd with the file object's .encoding attribute if set, otherwise implicitly converted to str (with the aforementioned result)
str -- raw bytes are written to the stream, no transcoding is done. For non-ASCII characters, a terminal will show different glyphs depending on its locale settings.
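The output rule for unicode objects can be approximated as follows. This is a sketch, not Python's actual I/O code: encode_for_stream is a hypothetical helper, stream_encoding stands for the stream's .encoding attribute (None when unset), and the ascii fallback is an assumption matching the usual default encoding.

```python
def encode_for_stream(text, stream_encoding=None, default_encoding='ascii'):
    """Hypothetical helper: choose the encoding the way the rule above
    describes, then return the bytes that would reach the stream."""
    return text.encode(stream_encoding or default_encoding)

text = u'caf\xe9'

# The same text produces different bytes for different stream encodings.
print(encode_for_stream(text, 'utf-8') == b'caf\xc3\xa9')  # True
print(encode_for_stream(text, 'latin-1') == b'caf\xe9')    # True

# With no stream encoding, the ascii fallback rejects the non-ASCII character.
try:
    encode_for_stream(text)
except UnicodeError as exc:
    print(type(exc).__name__)  # UnicodeEncodeError
```

This is why printing a unicode object often works in an interactive terminal, where the stream has an encoding, but fails when the output is piped and no encoding is set.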