python打印unicode字符问题_在python中打印unicode字符串,无论环境如何

I'm trying to find a generic solution to print unicode strings from a python script.

The requirements are that it must run in both python 2.7 and 3.x, on any platform, and with any terminal settings and environment variables (e.g. LANG=C or LANG=en_US.UTF-8).

The python print function automatically tries to encode to the terminal encoding when printing, but if the terminal encoding is ascii it fails.

For example, the following works when the environment "LANG=enUS.UTF-8":

x = u'\xea'

print(x)

But it fails in python 2.7 when "LANG=C":

UnicodeEncodeError: 'ascii' codec can't encode character u'\xea' in position 0: ordinal not in range(128)

The following works regardless of the LANG setting, but would not properly show unicode characters if the terminal was using a different unicode encoding:

print(x.encode('utf-8'))

The desired behavior would be to always show unicode in the terminal if it is possible and show some encoding if the terminal does not support unicode. For example, the output would be UTF-8 encoded if the terminal only supported ascii. Basically, the goal is to do the same thing as the python print function when it works, but in the cases where the print function fails, use some default encoding.

解决方案

You can handle the LANG=C case by telling sys.stdout to default to UTF-8 in cases when it would otherwise default to ASCII.

import sys, codecs

if sys.stdout.encoding is None or sys.stdout.encoding == 'ANSI_X3.4-1968':

utf8_writer = codecs.getwriter('UTF-8')

if sys.version_info.major < 3:

sys.stdout = utf8_writer(sys.stdout, errors='replace')

else:

sys.stdout = utf8_writer(sys.stdout.buffer, errors='replace')

print(u'\N{snowman}')

The above snippet fulfills your requirements: it works in Python 2.7 and 3.4, and it doesn't break when LANG is in a non-UTF-8 setting such as C.

It is not a new technique, but it's surprisingly hard to find in the documentation. As presented above, it actually respects non-UTF-8 settings such as ISO 8859-*. It only defaults to UTF-8 if Python would have bogusly defaulted to ASCII, breaking the application.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值