How to print UTF-8 to the console using Python 3.4 (Windows 8)?

I've never fully wrapped my head around encoding and decoding unicode to other formats (utf-8, utf-16, ascii, etc.) but I've reached a wall that is both confusing and frustrating. What I'm trying to do is print utf-8 card symbols (♠,♥,♦,♣) from a python module to a windows console. The console that I'm using is git bash and I'm using console2 as a front-end. I've tried/read a number of approaches below and nothing has worked so far. Let me know if what I'm doing is possible and the right way to do it.

Made sure the console can handle utf-8 characters.

These two tests make me believe that the console isn't the problem.

Attempt the same thing from the python module.

When I execute the .py, this is the result.

print(u'♠')

UnicodeEncodeError: 'charmap' codec can't encode character '\u2660' in position 0: character maps to <undefined>

Attempt to encode ♠.

This gives me back the unicode set encoded in utf-8, but still no spade symbol.

text = '♠'

print(text.encode('utf-8'))

b'\xe2\x99\xa0'

I feel like I'm missing a step or not understanding the whole encode/decode process. I've read this, this, and this. The last of the pages suggests wrapping the sys.stdout into the code but this article says using stdout is unnecessary and points to another page using the codecs module.

I'm so confused! I feel as though quality documentation on this subject is hard to find and hopefully someone can clear this up. Any help is always appreciated!

Austin

Solution

What I'm trying to do is print utf-8 card symbols (♠,♥,♦,♣) from a python module to a windows console

UTF-8 is a byte encoding of Unicode characters. ♠♥♦♣ are Unicode characters which can be reproduced in a variety of encodings and UTF-8 is one of those encodings—as a UTF, UTF-8 can reproduce any Unicode character. But there is nothing specifically “UTF-8” about those characters.
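To make the distinction between a character and its encodings concrete, here is a minimal sketch that encodes the same character with a few codecs from Python's standard library. The helper name show_encodings is purely illustrative and does not come from the question or the answer.

```python
# One Unicode character, several byte encodings.
SPADE = '\u2660'  # BLACK SPADE SUIT, the same character as '♠'


def show_encodings(char, encodings=('utf-8', 'utf-16-le', 'utf-32-be')):
    """Return a dict mapping each encoding name to the bytes it produces."""
    return {name: char.encode(name) for name in encodings}


if __name__ == '__main__':
    # ascii() keeps the output pure ASCII, so this runs on any console.
    print(ascii(SPADE), hex(ord(SPADE)))
    for name, data in sorted(show_encodings(SPADE).items()):
        print(name, data)
```

The code point stays U+2660 throughout; only the byte sequence changes with the codec.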

Other encodings that can reproduce the characters ♠♥♦♣ are Windows code page 850 and 437, which your console is likely to be using under a Western European install of Windows. You can print ♠ in these encodings but you are not using UTF-8 to do so, and you won't be able to use other Unicode characters that are available in UTF-8 but outside the scope of these code pages.
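Whether Python's codec for a particular code page accepts a given character is best checked rather than assumed. The sketch below does that with a hypothetical helper, can_encode, which is not part of any library. It simply attempts the encoding and reports whether it succeeded.

```python
def can_encode(text, encoding):
    """Return True if every character in text has a mapping in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


SUITS = '\u2660\u2665\u2666\u2663'  # the characters ♠ ♥ ♦ ♣

if __name__ == '__main__':
    # Prints one line per codec; only ASCII text is written.
    for name in ('utf-8', 'cp437', 'cp850', 'cp1252', 'ascii'):
        print(name, can_encode(SUITS, name))
```

Running this on the machine in question shows which of the candidate encodings Python itself is willing to use for the suit symbols.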

print(u'♠')

UnicodeEncodeError: 'charmap' codec can't encode character '\u2660'

In Python 3 this is the same as the print('♠') test you did above, so there is something different about how you are invoking the script containing this print, compared to your py -3.4. What does sys.stdout.encoding give you from the script?

To get print working correctly you would have to make sure Python picks up the right encoding. If it is not doing that adequately from the terminal settings you would indeed have to set PYTHONIOENCODING to cp437.
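A quick way to answer the sys.stdout.encoding question from inside the script is sketched here. The function stream_report is an illustrative name, and the fallback to 'ascii' when a stream reports no encoding is an assumption made for this example only.

```python
import io
import sys


def stream_report(stream, sample='\u2660'):
    """Report a text stream's encoding and whether it can encode sample."""
    # Assumption: treat a missing or None encoding as 'ascii'.
    encoding = getattr(stream, 'encoding', None) or 'ascii'
    try:
        sample.encode(encoding)
        encodable = True
    except (UnicodeEncodeError, LookupError):
        encodable = False
    return {'encoding': encoding, 'encodable': encodable}


if __name__ == '__main__':
    # Only ASCII is printed here, so the check itself cannot fail to encode.
    print(stream_report(sys.stdout))
    # A stand-in stream with a deliberately limited encoding, for comparison.
    print(stream_report(io.TextIOWrapper(io.BytesIO(), encoding='ascii')))
```

If the report for sys.stdout differs between the interactive interpreter and the script, that difference is the thing to chase, for example by setting PYTHONIOENCODING before launching the script.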

>>> text = '♠'

>>> print(text.encode('utf-8'))

b'\xe2\x99\xa0'

print can only print Unicode strings. For other types, including the bytes object that results from the encode() method, it prints the object's string form, which for bytes is its literal representation (repr). b'\xe2\x99\xa0' is how you would write a Python 3 bytes literal containing a UTF-8 encoded ♠.
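The difference between printing a str and printing bytes can be seen without involving the console at all. This small sketch sends print's output to an in-memory io.StringIO buffer. The variable names are illustrative.

```python
import io

text = '\u2660'               # the str '♠'
data = text.encode('utf-8')   # a bytes object holding the UTF-8 encoding

buf = io.StringIO()
print(data, file=buf)                          # bytes: print writes its repr
print(data.decode('utf-8') == text, file=buf)  # decoding restores the str
printed = buf.getvalue()
```

Here printed holds the bytes literal as text followed by True, which is why encoding before print never produces the symbol itself.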

If what you want to do is bypass print's implicit encoding (sys.stdout.encoding, which PYTHONIOENCODING overrides) and substitute your own, you can do that explicitly:

>>> import sys

>>> sys.stdout.buffer.write('♠'.encode('cp437'))

This will of course generate wrong output for any consoles not running code page 437 (e.g. non-Western-European installs). Generally, for apps using the C stdio, as Python does, getting non-ASCII characters to the Windows console is just too unreliable to bother with.
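If explicit encoding is used anyway, it may help to decide up front what happens to characters the chosen encoding lacks. The following is a minimal sketch, assuming a hypothetical helper named write_encoded. It relies only on the standard errors argument of str.encode and writes to any binary stream.

```python
import io


def write_encoded(text, binary_stream, encoding, errors='replace'):
    """Encode text explicitly and write the bytes to a binary stream.

    With errors='replace', characters the encoding lacks become '?'
    instead of raising UnicodeEncodeError.
    """
    data = text.encode(encoding, errors)
    binary_stream.write(data)
    return data


if __name__ == '__main__':
    # An in-memory stream stands in for sys.stdout.buffer in this demo.
    demo = io.BytesIO()
    write_encoded('spade: \u2660\n', demo, 'ascii')
    print(demo.getvalue())
```

Passing sys.stdout.buffer as the stream, together with the console's actual code page, gives the same effect as the one-liner above, but degrades to a question mark rather than an exception when a character cannot be encoded.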
