Every now and then, character encoding trips me up.
>>> import urllib
>>> print urllib.quote('中国')
%E4%B8%AD%E5%9B%BD
>>> s = '%E4%B8%AD%E5%9B%BD'
>>> print urllib.unquote(s).decode('utf8')
中国
>>> s=urllib.quote('中国').encode('utf8')
>>> print s
%E4%B8%AD%E5%9B%BD
>>> s=urllib.quote('中国').decode('utf8')
>>> print s
%E4%B8%AD%E5%9B%BD
>>> print urllib.unquote(s).decode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib64/python2.4/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-5: ordinal not in range(128)
>>> print urllib.unquote(str(s)).decode('utf8')
中国
A few functions worth paying attention to:
urllib.quote()
urllib.unquote()
encode()
decode()
str()
sys.getdefaultencoding()
reload(sys)
sys.setdefaultencoding('utf8')
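Instead of changing the interpreter-wide default with sys.setdefaultencoding, one option is to decode and encode explicitly wherever text crosses a boundary. The following is a rough sketch of that idea in Python 3 syntax (an assumption, since the notes above use Python 2). The helpers to_text, quote_text and unquote_text are hypothetical names, not part of urllib.

```python
from urllib.parse import quote, unquote


def to_text(value, encoding="utf-8"):
    """Return value as str, decoding bytes explicitly (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def quote_text(value, encoding="utf-8"):
    # Normalize to str first so bytes and str inputs share one code path.
    return quote(to_text(value, encoding), encoding=encoding)


def unquote_text(value, encoding="utf-8"):
    # A percent-escaped string is pure ASCII, so bytes input is decoded as ASCII.
    # errors="strict" raises on malformed input instead of silently replacing it.
    return unquote(to_text(value, "ascii"), encoding=encoding, errors="strict")


print(quote_text("中国"))
print(unquote_text(b"%E4%B8%AD%E5%9B%BD"))
```

Passing the encoding at each call keeps the choice visible at the point of use, so no code depends on a hidden process-wide setting.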