关于python源码字符编码的定义

运行如下Python打印语句:
print u'I "said" do not touch “this.""'
其中包含一个中文的双引号,python解释器报错。报错信息如下:

[wangy@bogon 文档]$ python ex1.py
  File "ex1.py", line 7
SyntaxError: Non-ASCII character '\xe2' in file ex1.py on line 7, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

查看链接  http://www.python.org/peps/pep-0263.html


主要内容如下:

在Python2.1版本中,源码文件仅仅支持Latin-1,西欧国家的字符编码,从而给亚洲的编程爱
好者造成很大的困扰,必须使用“unicode-escape”编码来表示Unicode literals。


解决的方法就是为了让解释器了解源代码的编码,必须对源码文件的编码进行声明。


定义编码的方式:
Python will default to ASCII as standard encoding if no other encoding hints are given.

To define a source code encoding, a magic comment must be placed into the source 

files either as first or second line in the file, such as:


# coding=
or (using formats recognized by popular editors):


#!/usr/bin/python
# -*- coding: -*-
or:


#!/usr/bin/python
# vim: set fileencoding= :




最好使用第一种或者第二种。


文中特别提到在windows平台下,增加Unicode BOM标记在Unicode文件头,因此不需要特别声明文件编码,同理也会在UTF-8文件头增加UTF-8标记,故亦不需要声明。

如果源文件使用 both the UTF-8 BOM mark signature and a magic encoding comment, the only allowed encoding for the comment is 'utf-8'. Any other encoding will cause an 
error.


Examples
These are some examples to clarify the different styles for defining the source code encoding at the top of a Python source file:

With interpreter binary and using Emacs style file encoding comment:

#!/usr/bin/python
# -*- coding: latin-1 -*-
import os, sys
...


#!/usr/bin/python
# -*- coding: iso-8859-15 -*-
import os, sys
...

#!/usr/bin/python
# -*- coding: ascii -*-
import os, sys
...
Without interpreter line, using plain text:
# This Python file uses the following encoding: utf-8
import os, sys
...
Text editors might have different ways of defining the file's encoding, e.g.:
#!/usr/local/bin/python
# coding: latin-1
import os, sys
...
Without encoding comment, Python's parser will assume ASCII text:
#!/usr/local/bin/python
import os, sys
...
Encoding comments which don't work:
Missing "coding:" prefix:
#!/usr/local/bin/python
# latin-1
import os, sys
...
Encoding comment not on line 1 or 2:
#!/usr/local/bin/python
#
# -*- coding: latin-1 -*-
import os, sys
...
Unsupported encoding:
#!/usr/local/bin/python
# -*- coding: utf-42 -*-
import os, sys
...

修改源代码,以UTF-8保存,编辑器使用了Linux下的gedit
# -*- coding: utf-8 -*-
print "hello world!"
print "hello Again"
print "I like trying this"
print "This is fun"
print 'Yay! Printing'
print "I'd much rather you 'not'."
print u'I "said" 这里有中文双引号 “this.""'

正常打印




来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/31142205/viewspace-2133760/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/31142205/viewspace-2133760/

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值