Setting UTF-8 encoding in Python: how to print UTF-8 encoded text to the console in Python < 3?

I'm running a recent Linux system where all my locales are UTF-8:

LANG=de_DE.UTF-8
LANGUAGE=
LC_CTYPE="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
...
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=

Now I want to write UTF-8 encoded content to the console.

Right now Python uses UTF-8 for the FS encoding but sticks to ASCII for the default encoding :-(

>>> import sys
>>> sys.getdefaultencoding()
'ascii'
>>> sys.getfilesystemencoding()
'UTF-8'

I thought the best (clean) way to do this was setting the PYTHONIOENCODING environment variable. But it seems that Python ignores it. At least on my system I keep getting ascii as default encoding, even after setting the envvar.

# tried this in ~/.bashrc and ~/.profile (also sourced them)
# and on the commandline before running python
export PYTHONIOENCODING=UTF-8

If I do the following at the start of a script, it works though:

>>> import sys
>>> reload(sys) # to enable `setdefaultencoding` again
>>> sys.setdefaultencoding("UTF-8")
>>> sys.getdefaultencoding()
'UTF-8'

But that approach seems unclean. So, what's a good way to accomplish this?

Workaround

Instead of changing the default encoding - which is not a good idea (see mesilliac's answer) - I just wrap sys.stdout with a StreamWriter like this:

import codecs, locale, sys
sys.stdout = codecs.getwriter(locale.getpreferredencoding())(sys.stdout)

See this gist for a small utility function that handles it.

Solution

How to print UTF-8 encoded text to the console in Python < 3?

print u"some unicode text \N{EURO SIGN}"
print b"some utf-8 encoded bytestring \xe2\x82\xac".decode('utf-8')

i.e., if you have a Unicode string then print it directly. If you have a bytestring then convert it to Unicode first.

Your locale settings (LANG, LC_CTYPE) indicate a utf-8 locale and therefore (in theory) you could print a utf-8 bytestring directly and it should be displayed correctly in your terminal (if terminal settings are consistent with the locale settings and they should be) but you should avoid it: do not hardcode the character encoding of your environment inside your script; print Unicode directly instead.
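The rule above (decode bytestrings where they enter the program, then print only Unicode) can be sketched as a small helper. The name to_text and its utf-8 default are assumptions made for illustration, not part of the original answer; the default describes the encoding of the incoming data, not of the terminal. The snippet is written so that it should run on both Python 2.7 and Python 3.

```python
def to_text(value, encoding="utf-8"):
    """Return value as a Unicode string, decoding it if it is a bytestring.

    Hypothetical helper: `encoding` is the encoding of the *data*.
    The terminal encoding is left to print() and sys.stdout.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def main():
    euro_bytes = b"some utf-8 encoded bytestring \xe2\x82\xac"
    # Decode first, then print Unicode; print() encodes for the terminal.
    print(to_text(euro_bytes))
    # Text that is already Unicode passes through unchanged.
    print(to_text(u"some unicode text \N{EURO SIGN}"))


if __name__ == "__main__":
    main()
```

Keeping the decode step in one place means the rest of the script never needs to know which encoding the terminal uses.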

There are many wrong assumptions in your question.

With your locale settings, you do not need to set PYTHONIOENCODING to print Unicode to the terminal. A utf-8 locale supports all Unicode characters, i.e., it works as is.

You do not need the workaround sys.stdout = codecs.getwriter(locale.getpreferredencoding())(sys.stdout). It may break if some code (that you do not control) does need to print bytes, and/or it may break while printing Unicode to the Windows console (wrong codepage, can't print undecodable characters). Correct locale settings and/or the PYTHONIOENCODING envvar are enough. Also, if you need to replace sys.stdout, then use io.TextIOWrapper() instead of the codecs module, as the win-unicode-console package does.
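For the case where a text stream really must be built by hand, here is a minimal sketch of the io.TextIOWrapper() approach. The helper name make_text_writer and its defaults are hypothetical, and the example writes to an in-memory io.BytesIO buffer rather than replacing sys.stdout, so nothing global is changed.

```python
import io


def make_text_writer(binary_stream, encoding="utf-8", errors="strict"):
    """Wrap a binary stream so that it accepts Unicode text.

    Hypothetical helper: the name and the defaults are illustrative only.
    """
    return io.TextIOWrapper(binary_stream, encoding=encoding, errors=errors,
                            line_buffering=True)


# Demonstrated on an in-memory buffer instead of the real stdout.
buffer = io.BytesIO()
writer = make_text_writer(buffer)
writer.write(u"price: 5 \N{EURO SIGN}")
writer.flush()

# The bytes exactly as they would reach the underlying stream.
encoded = buffer.getvalue()
```

On Python 3 the binary stream behind standard output is sys.stdout.buffer; whether replacing sys.stdout is warranted at all is a separate decision, and the answer above argues it usually is not.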

sys.getdefaultencoding() is unrelated to your locale settings and to PYTHONIOENCODING. Your assumption that setting PYTHONIOENCODING should change sys.getdefaultencoding() is incorrect. You should check sys.stdout.encoding instead.
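To see the different settings side by side, a short sketch like the following can help. The function name describe_encodings is made up for this example; the calls it uses are from the standard sys and locale modules.

```python
import locale
import sys


def describe_encodings():
    """Collect the encoding-related settings discussed above."""
    return {
        # Used when printing text; may be None on Python 2 when piped.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Implicit str/unicode conversion on Python 2; 'utf-8' on Python 3.
        "default": sys.getdefaultencoding(),
        # Used to encode and decode file names.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding suggested by the locale (LANG, LC_CTYPE, ...).
        "locale": locale.getpreferredencoding(),
    }


for name, value in sorted(describe_encodings().items()):
    print("%-10s %s" % (name, value))
```

Only the "stdout" entry is expected to react to PYTHONIOENCODING; the "default" entry is the one the question was looking at.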

sys.getdefaultencoding() is not used when you print to the console. It may be used as a fallback on Python 2 if stdout is redirected to a file/pipe unless PYTHONIOENCODING is set:

$ python2 -c'import sys; print(sys.stdout.encoding)'
UTF-8
$ python2 -c'import sys; print(sys.stdout.encoding)' | cat
None
$ PYTHONIOENCODING=utf8 python2 -c'import sys; print(sys.stdout.encoding)' | cat
utf8
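The same experiment can be scripted. This is only a sketch: the helper name child_stdout_encoding is hypothetical, and it starts whichever interpreter is running the script (sys.executable) rather than python2 specifically, so the result without PYTHONIOENCODING depends on the Python version and locale in use.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding=None):
    """Return sys.stdout.encoding as seen by a child whose stdout is a pipe."""
    env = dict(os.environ)
    # Start from a known state, then set the variable only if requested.
    env.pop("PYTHONIOENCODING", None)
    if io_encoding is not None:
        env["PYTHONIOENCODING"] = io_encoding
    output = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    return output.decode("ascii").strip()


# Without the variable: version- and locale-dependent ('None' on Python 2).
print(child_stdout_encoding())
# With the variable set, the child reports the requested encoding.
print(child_stdout_encoding("utf-8"))
```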

Do not call sys.setdefaultencoding("UTF-8"); it may corrupt your data silently and/or break 3rd-party modules that do not expect it. Remember that sys.getdefaultencoding() is used to convert bytestrings (str) to/from unicode implicitly in Python 2, e.g., "a" + u"b". See also the quote in @mesilliac's answer.
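The implicit conversion can be made visible by writing the decode step out explicitly. The function below is a hypothetical stand-in that mimics what Python 2 does for bytestring + unicode; its default_encoding parameter plays the role of sys.getdefaultencoding(). It is a sketch of the mechanism, not of any real API.

```python
def py2_style_concat(bytestring, text, default_encoding="ascii"):
    """Mimic Python 2's implicit coercion in `bytestring + text`.

    Python 2 decodes the bytestring with sys.getdefaultencoding()
    before concatenating; here that step is explicit.
    """
    return bytestring.decode(default_encoding) + text


euro_utf8 = b"\xe2\x82\xac"  # EURO SIGN encoded as utf-8

# Pure ASCII data is unaffected by the choice of default encoding.
ascii_result = py2_style_concat(b"a", u"b")

# With the stock 'ascii' default, non-ASCII bytes fail loudly.
try:
    py2_style_concat(euro_utf8, u"!")
    failed_loudly = False
except UnicodeDecodeError:
    failed_loudly = True

# With a changed default the error disappears, so mixed bytes/text code
# paths are no longer flagged ...
decoded = py2_style_concat(euro_utf8, u"!", default_encoding="utf-8")

# ... and when the default does not match the data, the result is
# silently wrong instead of raising.
garbled = py2_style_concat(euro_utf8, u"!", default_encoding="latin-1")
```

The last case is the kind of silent corruption the warning refers to: no exception is raised, yet the text is no longer the euro sign.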
