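Handling Chinese text in Python has long been a headache for beginners, and this article walks through the topic in detail. In all likelihood a future version of Python will solve the problem for good and spare us this trouble. First, let's check the Python version: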
>>> import sys
>>> sys.version
'2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)]'

(1) Use Notepad to create a file named ChineseTest.py, saved in the default ANSI encoding:
s = "中文"
print s

Let's run it and see:
E:\Project\Python\Test>python ChineseTest.py
  File "ChineseTest.py", line 1
SyntaxError: Non-ASCII character '\xd6' in file ChineseTest.py on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

Now quietly change the file's encoding to UTF-8:
E:\Project\Python\Test>python ChineseTest.py
  File "ChineseTest.py", line 1
SyntaxError: Non-ASCII character '\xe4' in file ChineseTest.py on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

No luck. Since the message gives a URL, let's read it. A quick skim shows that a file containing non-ASCII characters needs an encoding declaration on its first or second line. Change ChineseTest.py back to ANSI and add the declaration:
# coding=gbk
s = "中文"
print s

Try again:
E:\Project\Python\Test>python ChineseTest.py
中文

It works now :)
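The rule from PEP 263 can be sketched in a few lines. This is only an illustration written for Python 3, not how the interpreter itself is implemented: the helper name `declared_encoding` is hypothetical, and the pattern is adapted from the one given in PEP 263.

```python
import re

# Pattern adapted from PEP 263: a comment containing "coding:" or "coding=".
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")


def declared_encoding(source_text):
    """Return the encoding named on line 1 or 2, or None if there is none.

    Hypothetical helper for illustration only.
    """
    for line in source_text.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1)
    return None


with_decl = '# coding=gbk\ns = "中文"\nprint s\n'
without_decl = 's = "中文"\nprint s\n'
too_late = '\n\n# coding=gbk\ns = "中文"\n'  # declaration on line 3 is ignored

print(declared_encoding(with_decl))
print(declared_encoding(without_decl))
print(declared_encoding(too_late))
```

The declaration only tells Python how to interpret the bytes in the file. It does not convert the file, so the editor's save encoding and the declared encoding have to agree.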
# coding=gbk
s = "中文"
print len(s)
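In Python 2 a plain str is a byte string, so len(s) counts bytes rather than characters. The sketch below shows the same effect in Python 3, where bytes and text are separate types (an assumption of this example, since the article itself targets Python 2.5). The first byte of each encoding also matches the '\xd6' and '\xe4' reported in the two SyntaxError messages earlier.

```python
text = "中文"                      # two characters of text
gbk_bytes = text.encode("gbk")     # what an ANSI (GBK) file stores
utf8_bytes = text.encode("utf-8")  # what a UTF-8 file stores

# Characters versus bytes: GBK uses 2 bytes per character here, UTF-8 uses 3.
print(len(text), len(gbk_bytes), len(utf8_bytes))  # 2 4 6

# First byte of the first character in each encoding.
print(hex(gbk_bytes[0]), hex(utf8_bytes[0]))       # 0xd6 0xe4
```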
# coding=gbk
s = "中文"
s1 = u"中文"
s2 = unicode(s, "gbk")  # if the encoding argument is omitted, Python decodes with its default, ASCII
s3 = s.decode("gbk")    # str to unicode is decode; the unicode() function does the same thing
print len(s1)
print len(s2)
print len(s3)
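A rough Python 3 counterpart of the snippet above, offered as a sketch rather than a translation: Python 3 has no unicode() function, so str(raw, "gbk") stands in for unicode(s, "gbk"), and an explicit "ascii" decode stands in for Python 2's default when the encoding argument is omitted.

```python
raw = "中文".encode("gbk")     # plays the role of the Python 2 byte string s

via_decode = raw.decode("gbk")  # like s.decode("gbk")
via_str = str(raw, "gbk")       # like unicode(s, "gbk")

# Both routes give the same two-character text.
print(len(via_decode), len(via_str), via_decode == via_str)

# Decoding the same bytes as ASCII fails, because every byte is above 0x7F.
ascii_error = None
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    ascii_error = exc
print(type(ascii_error).__name__)
```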
# coding=gbk
print open("Test.txt").read()
# coding=gbk
import codecs
print open("Test.txt").read().decode("utf-8")
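One thing to watch for when decoding a text file as UTF-8 is a leading byte order mark. The Python 3 sketch below assumes Test.txt was saved by an editor that writes a UTF-8 BOM, as older versions of Notepad do. The temporary file and its contents are made up for the example.

```python
import codecs
import os
import tempfile

# Build a stand-in for Test.txt: UTF-8 text preceded by a BOM (assumed).
path = os.path.join(tempfile.mkdtemp(), "Test.txt")
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF8 + "中文".encode("utf-8"))

with open(path, "rb") as f:
    data = f.read()

# A plain UTF-8 decode keeps the BOM as an invisible first character.
plain = data.decode("utf-8")

# Option 1: strip the BOM bytes by hand before decoding.
if data.startswith(codecs.BOM_UTF8):
    data = data[len(codecs.BOM_UTF8):]
stripped = data.decode("utf-8")

# Option 2: let the "utf-8-sig" codec drop the BOM while reading.
with open(path, encoding="utf-8-sig") as f:
    via_sig = f.read()

print(len(plain), len(stripped), len(via_sig))  # 3 2 2
```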
# coding=utf-8
s = "中文"
print unicode(s, "utf-8")
Traceback (most recent call last):
  File "ChineseTest.py", line 3, in <module>
    s = unicode(s, "utf-8")
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-1: invalid data
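This error is what you would expect if the file is still saved as ANSI (GBK) while the declaration says utf-8. That reading is an assumption based on the snippets around it. A Python 3 sketch of the mismatch:

```python
# Bytes as an ANSI (GBK) editor would store the literal "中文".
file_bytes = "中文".encode("gbk")

# Treating GBK bytes as UTF-8 fails: 0xD6 opens a two-byte UTF-8 sequence,
# but the following byte 0xD0 is not a valid continuation byte.
try:
    file_bytes.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Decoding with the encoding the bytes were actually written in succeeds.
recovered = file_bytes.decode("gbk")

print(utf8_failed, recovered)
```

The fix is to decode with the encoding the file was really saved in, or to re-save the file in the declared encoding.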
# coding=utf-8
s = "中文"
print unicode(s, "gbk")
# coding=utf-8
s = "中文"
print unicode(s, "cp936")
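The "gbk" and "cp936" versions behave the same because Python's codec registry treats cp936 as an alias for gbk. A quick check, written for Python 3 as a sketch:

```python
import codecs

# Look both names up in the codec registry and compare the canonical names.
gbk_name = codecs.lookup("gbk").name
cp936_name = codecs.lookup("cp936").name
print(gbk_name == cp936_name)

# Decoding the same bytes through either name gives the same text.
raw = "中文".encode("gbk")
print(raw.decode("cp936") == raw.decode("gbk"))
```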