Chinese Text Issues with Python in Eclipse

weixin_33908217

Original link: http://www.cnblogs.com/xaf-dfg/p/3244018.html

http://robin.sh/html/733_python-eclipse-encoding.html

http://wenku.baidu.com/view/9786332eed630b1c59eeb575.html

http://blog.sina.com.cn/s/blog_7fd6977b0100tpfd.html

After the editing environment's encoding is changed to UTF-8, the file path "F:\kuaipan\zhid_\datper\daa file" has to be rewritten as "F:kuaipan/zhid_/datper/daa file".
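As a rough illustration of the two spellings, the sketch below compares a backslash path written as a raw string with a forward-slash version. It assumes the intended forward-slash form has a separator after the drive letter (`F:/...`), which the text above does not show, and it uses the standard `ntpath` module so that Windows path rules apply on any platform. The syntax is Python 3 and no file is actually opened.

```python
import ntpath

# Paths copied from the text above; nothing is opened or created here.
raw_path = r"F:\kuaipan\zhid_\datper\daa file"  # raw string keeps backslashes literal
fwd_path = "F:/kuaipan/zhid_/datper/daa file"   # assumption: slash after the drive letter

# ntpath applies Windows path rules regardless of the host platform,
# and normpath converts forward slashes to backslashes.
same = ntpath.normpath(fwd_path) == raw_path
print(same)
```

Forward slashes avoid any accidental backslash escape sequences inside ordinary string literals, which is one reason they are often preferred in source files.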

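A string such as %D6%AA%B5%C0%B9%B1%CF%D7%D5%DF0799, where each % is followed by two hexadecimal digits, is "URL-encoded". It has to be decoded with unquote from the urllib2 module to get back the bytes in their original encoding (which may be GBK, UTF-8, or GB2312). The following is quoted from: http://piyu.blog.163.com/blog/static/19420310320101113102455340/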

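However, the same Chinese text can produce different URL encodings. Baidu and 电玩巴士, for example, do not agree.

Baidu uses GBK, so one Chinese character becomes %xx%xx, two groups in total.

电玩巴士 uses UTF-8, so one Chinese character becomes %xx%xx%xx, three groups in total.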
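The two-group versus three-group difference can be checked directly. This is a minimal sketch in Python 3 syntax (the quoted article uses Python 2), using `urllib.parse.quote` with its `encoding` parameter; the sample character is taken from the string used later in this post.

```python
from urllib.parse import quote

ch = "空"  # a single Chinese character

gbk_quoted = quote(ch, encoding="gbk")     # GBK stores this character in two bytes
utf8_quoted = quote(ch, encoding="utf-8")  # UTF-8 stores it in three bytes

# Each byte becomes one %xx group, so counting "%" counts the bytes.
print(gbk_quoted, gbk_quoted.count("%"))
print(utf8_quoted, utf8_quoted.count("%"))
```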

To convert Chinese text to URL encoding, the urllib library that ships with Python is all you need.

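>>> import sys, urllib

>>> s = "空之境界"

>>> print s

空之境界

>>> urllib.quote(s)

'%BF%D5%D6%AE%BE%B3%BD%E7'  # Note: the original author's output here was UTF-8, whereas mine is GBK

So, to guarantee output in a specific encoding, re-encode the string using the encoding reported by the sys module:

>>> s_utf = s.decode(sys.stdin.encoding).encode("utf-8")  # for GBK output, replace "utf-8" on this line with "gbk"

>>> print s_utf

绌轰箣澧冪晫  # garbled, because the GBK console is displaying UTF-8 bytes; this can be ignored

>>> urllib.quote(s_utf)

'%E7%A9%BA%E4%B9%8B%E5%A2%83%E7%95%8C'  # count them: exactly twelve groups, i.e. four Chinese characters at three bytes each

The counterpart of quote is unquote, which restores a string that was encoded by quote.

>>> urllib.unquote(urllib.quote(s))  # the result is the string s itself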
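For readers on Python 3, where `urllib.quote` and `urllib.unquote` have moved to `urllib.parse`, a roughly equivalent round trip might look like the sketch below. Strings are Unicode there, so the target encoding is passed explicitly instead of depending on the console.

```python
from urllib.parse import quote, unquote

s = "空之境界"

# Percent-encode the same text under two different byte encodings.
quoted_gbk = quote(s, encoding="gbk")
quoted_utf8 = quote(s, encoding="utf-8")

# unquote must be told which encoding the bytes were in to restore the text.
restored_gbk = unquote(quoted_gbk, encoding="gbk")
restored_utf8 = unquote(quoted_utf8, encoding="utf-8")

print(quoted_gbk)
print(quoted_utf8)
print(restored_gbk == s, restored_utf8 == s)
```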

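The only line that needs attention is this one:

s.decode(sys.stdin.encoding).encode("utf-8")

It first decodes the string s using the system's standard input encoding, then re-encodes the result in the specified encoding. On my own system, sys.stdin.encoding is "gbk".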
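The decode-then-encode step can be sketched as a small helper. This is Python 3 syntax, and `transcode` is a hypothetical name introduced only for illustration. The source encoding is passed in explicitly here, because `sys.stdin.encoding` varies by machine and may be unavailable when standard input is not attached to a console.

```python
import sys

def transcode(raw: bytes, source_encoding: str, target_encoding: str = "utf-8") -> bytes:
    """Decode bytes from one encoding, then re-encode them in another."""
    text = raw.decode(source_encoding)   # bytes -> Unicode text
    return text.encode(target_encoding)  # Unicode text -> bytes in the target encoding

# Encoding reported for standard input; falls back to UTF-8 if it is unavailable.
console_encoding = getattr(sys.stdin, "encoding", None) or "utf-8"
print(console_encoding)

gbk_bytes = "空之境界".encode("gbk")  # stands in for input typed on a GBK console
utf8_bytes = transcode(gbk_bytes, "gbk", "utf-8")
print(utf8_bytes)
```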

Python's chardet module can detect the character encoding of a string.

Example program:

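import urllib2, sys, chardet
# The percent-encoded string itself contains only ASCII characters
print chardet.detect('%D6%AA%B5%C0%B9%B1%CF%D7%D5%DF0799')
# Unquote to the raw bytes, then decode those bytes as GB2312
u = urllib2.unquote('%D6%AA%B5%C0%B9%B1%CF%D7%D5%DF0799').decode('gb2312')
print chardet.detect(u)
print u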

Output: {'confidence': 1.0, 'encoding': 'ascii'}
{'confidence': 1.0, 'encoding': 'ascii'}
知道贡献者0799
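The first detection above reports 'ascii' because the percent-encoded string consists only of ASCII characters; the encoding question really applies to the bytes obtained after unquoting. The sketch below, in Python 3 syntax, unquotes to bytes and then tries a short list of candidate encodings in order. This is a simple heuristic swapped in for illustration, not what chardet does, and `decode_percent_encoded` and its candidate list are hypothetical.

```python
from urllib.parse import unquote_to_bytes

def decode_percent_encoded(value, candidates=("utf-8", "gbk")):
    """Return (encoding, text) for the first candidate that decodes cleanly."""
    raw = unquote_to_bytes(value)  # percent-decoding yields raw bytes
    for encoding in candidates:
        try:
            return encoding, raw.decode(encoding)  # strict decoding raises on invalid bytes
        except UnicodeDecodeError:
            continue
    return None, None  # no candidate matched

encoding, text = decode_percent_encoded("%D6%AA%B5%C0%B9%B1%CF%D7%D5%DF0799")
print(encoding, text)
```

UTF-8 is tried first because its strict decoder rejects most byte sequences produced by other encodings, whereas GBK accepts many byte pairs and would more often "succeed" with the wrong text.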

Reposted from: https://www.cnblogs.com/xaf-dfg/p/3244018.html
