python3文档字符串中文报错_python3还会有中文输出乱码问题?!!

最新推荐文章于 2023-03-19 17:32:54 发布

不当咸鱼

最新推荐文章于 2023-03-19 17:32:54 发布

阅读量666

点赞数

文章标签： python3文档字符串中文报错

本文链接：https://blog.csdn.net/weixin_33037713/article/details/113498440

版权

这篇文章是以前重构一个python2项目为python3时遇到的一个问题，当时记录下来没有发表，现在发表在这里存档。

众所周知，python3相比python2的一个最大的改变就是str统一是unicode编码的，带有中文的字符串再也不用写成 u'中文' 这么麻烦，但是最近却遇到了 print('中文') 报错

UnicodeEncodeError: 'ascii' codec can't encode characters in position 157-158: ordinal not in range(128)

这个问题只发生在生产环境上，本地环境没有问题。那么首先考虑的就是生产环境和本地环境差别在哪里了-----启动方式。

生产环境上使用了 uWSGI 来做服务器处理web请求，而本地环境不是。

尝试在 uWSGI 配置文件中加入编码设置：

[uwsgi]

.....

env=LC_ALL=zh_CN.UTF-8

....

重启，报错again！

经过百度和谷歌狂轰乱炸之后，也找不出什么有用的资料，再仔细想想，以前的项目使用了同样的 runit + uWSGI 为啥没问题呢？结合这一点去看 uWSGI 的官方文档，果然有Python 3的特别说明

http://uwsgi-docs.readthedocs.io/en/latest/Python.html#python-3

image2016-6-29%2019%3A27%3A33.png?version=1&modificationDate=1467203759000&api=v2

再看 PEP3333

image2016-6-29%2019%3A43%3A56.png?version=1&modificationDate=1467203759000&api=v2

真相好像已经若隐若现了？慢着，因为经过尝试，直接使用 uwsgi 启动项目 print 是能正常工作的，而使用 runit 就报错了。

注意最后一段

“For values referred to in this specification as "bytestrings" (i.e., values read fromwsgi.input , passed towrite() or yielded by the application), the valuemustbe of typebytes under Python 3, andstr in earlier versions of Python.”

结合python 的 print 函数代码说明，pritn 默认是输出到 sys.stdout

def print(self, *args, sep=' ', end='\n', file=None): # known special case of print

"""

print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

Prints the values to a stream, or to sys.stdout by default.

Optional keyword arguments:

file: a file-like object (stream); defaults to the current sys.stdout.

sep: string inserted between values, default a space.

end: string appended after the last value, default a newline.

flush: whether to forcibly flush the stream.

"""

pass

print 默认输出到 sys.stdout 。由此猜测，使用 runit 启动项目和 runit 的 svlogd 收集日志改变了 sys.stdout 的输入位置，需要调用到上述的 values read fromwsgi.input , passed towrite() or yielded by the application所以 python 中的unicode str 带有中文的时候就会报错。

经过不懈努力，终于在 PEP3333 文档中发现了这么一段

import os, sys

enc, esc = sys.getfilesystemencoding(), 'surrogateescape'

def unicode_to_wsgi(u):

# Convert an environment variable to a WSGI "bytes-as-unicode" string

return u.encode(enc, esc).decode('iso-8859-1')

def wsgi_to_bytes(s):

return s.encode('iso-8859-1')

def run_with_cgi(application):

environ = {k: unicode_to_wsgi(v) for k,v in os.environ.items()}

environ['wsgi.input'] = sys.stdin.buffer

感觉和 sys.getfilesystemencoding() 有莫大的关系！然后去google寻找蛛丝马迹~~最后还是在万能的 stackoverflow 上找到了相关的

http://stackoverflow.com/questions/9932406/unicodeencodeerror-only-when-running-as-a-cron-job =>

http://stackoverflow.com/questions/492483/setting-the-correct-encoding-when-piping-stdout-in-python => 第二个答案和答案下的评论

https://drj11.wordpress.com/2007/05/14/python-how-is-sysstdoutencoding-chosen/ 最后使用的解决方法

在再生产环境系统中尝试打印出 stdout 的编码

import sys

print(sys.stdout.encoding)

# ANSI_X3.4-1968

而在其他没有使用 runit 的环境打印出的是 utf-8 。

最后，在 uwsgi 的配置文件中加入一句，用来设置python使用的编码格式。

env = LC_CTYPE=zh_CN.utf-8

启动项目！！中文正常输出！！！

就像文章最后说的

BUT WHERE THE HELL IS THIS DOCUMENTED！

-------------------

不当咸鱼

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
python3文档字符串中文报错_python3还会有中文输出乱码问题?!!

这篇文章是以前重构一个python2项目为python3时遇到的一个问题，当时记录下来没有发表，现在发表在这里存档。众所周知，python3相比python2的一个最大的改变就是str统一是unicode编码的，带有中文的字符串再也不用写成 u'中文' 这么麻烦，但是最近却遇到了 print('中文') 报错UnicodeEncodeError: 'ascii' codec can't enc...
复制链接

扫一扫

python3文档字符串中文报错_python3还会有中文输出乱码问题?!!

“相关推荐”对你有帮助么？