Fixing the extra None printed by logging.handlers.HTTPHandler under Python tornado

Latest recommended article published on 2023-07-10 14:16:46

ouyangbro

Category columns: Python, tornado

Article link: https://blog.csdn.net/emaste_r/article/details/78141820

0.

Versions first; writing a blog post without stating versions is just irresponsible!

Python==2.7.10

Tornado==4.2

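1.

Background: with tornado everything is asynchronous, so the business-logic log lines of one request cannot be traced reliably. A handler writes part of its log, goes off to run other business logic, and only later comes back to write the rest, so the log ends up looking like this: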

2017-09-29 23:59:57,459 BusinessFactory.py-create()-270 [INFO] [1000108695] 【获取用户Profile接口】
2017-09-29 23:59:57,460 GetUserProfile.py-run()-21 [INFO] 获取用户profile，user_id=1000108695
2017-09-29 23:59:57,470 UserProfile.py-create_user_profile()-45 [INFO] 用户设备个数：0
2017-09-29 23:59:57,494 APIMain.py-post()-19 [INFO] 【版本：234, 协议：pb】
2017-09-29 23:59:57,517 BusinessFactory.py-create()-270 [INFO] [1000109733] 【获取系统设置接口】
2017-09-29 23:59:57,549 BusinessFactory.py-create()-270 [INFO] [1000109733] 【获取用户Config接口】
2017-09-29 23:59:57,559 web.py-log_request()-1908 [INFO] 200 POST /api (127.0.0.1) 66.55ms
2017-09-29 23:59:57,584 UserProfile.py-create_user_profile()-67 [INFO] 用户功课个数：1
2017-09-29 23:59:57,586 UserProfile.py-create_user_profile()-80 [INFO] 1000108695有第三方用户
2017-09-29 23:59:57,588 web.py-log_request()-1908 [INFO] 200 POST /api (127.0.0.1) 154.04ms

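As you can see, the “获取用户Profile接口” (get user profile API) request logs up to “用户设备个数：0” (user device count: 0), then the “获取系统设置接口” (get system settings API) request is handled, and only after that does the profile request continue with “用户功课个数：1” (user homework count: 1). So the log lines of one request cannot be pinned down precisely! Lots of googling turned up nothing useful, because the solutions people propose are basically all a log server plus logging.handlers.HTTPHandler, and that alone does not fix the out-of-order log lines.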

So I had no choice but to build my own wheel.
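
One way to picture such a wheel: buffer the lines of a single request and hand them to the logger in one call, so that other requests cannot split them up. The sketch below is only an illustration under that assumption; `RequestLogBuffer`, the logger name `buffer_demo` and the capture handler are hypothetical names, not code from the project.

```python
import logging


class RequestLogBuffer(object):
    """Collects the log lines of one request and emits them in a single call.

    Hypothetical helper for illustration; not part of tornado or logging.
    """

    def __init__(self, logger, request_id):
        self.logger = logger
        self.request_id = request_id
        self.lines = []

    def add(self, message):
        # Nothing is written yet; the line is only remembered.
        self.lines.append(message)

    def flush(self, level=logging.ERROR):
        # One logger call per request, so the lines stay together.
        if self.lines:
            self.logger.log(level, "[%s] %s", self.request_id, " | ".join(self.lines))
            self.lines = []


class CaptureHandler(logging.Handler):
    """Demo handler that keeps the rendered messages in memory."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


demo_logger = logging.getLogger("buffer_demo")
demo_logger.propagate = False
capture = CaptureHandler()
demo_logger.addHandler(capture)

buf = RequestLogBuffer(demo_logger, 1000108695)
buf.add("device count: 0")
buf.add("homework count: 1")
before_flush = list(capture.messages)   # still empty: nothing emitted yet
buf.flush()
after_flush = list(capture.messages)    # one combined entry
```

The trade-off of this design is that nothing reaches the log until `flush()` runs, so a crash in the middle of a request would lose the buffered lines unless the flush happens in a `finally` block.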

While building it I ran into something interesting: every line written on the log server was followed by a None:

2017-09-29 17:54:51,780 - GetUserProfileFromThird.run.78 - ERROR - x1
None
2017-09-29 17:54:51,780 - GetUserProfileFromThird.run.78 - ERROR - yyy
None

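2.

First, our code. Thanks go to 残阳似血's blog post: http://qinxuye.me/article/build-log-server-with-tornado/ , which we use as the basis and modify.

Having been mentally contaminated by tornado's source code, I have also started to like putting lots of comments in the code...

The tornado business handler: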

# coding=utf-8
import re
import json
import logging

import tornado.web
from mylog.mylogger import my_logger


class LogAddHandler(tornado.web.RequestHandler):
    tuple_reg = re.compile(r"^\([^\(\)]*\)$")
    float_reg = re.compile(r"^\d*\.\d+$")
    int_reg = re.compile(r"^\d+$")

    def _extract(self, string):
        '''
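        The values in request.arguments are still strings after being joined, so we need to convert them into Python objects here.
        From analysis we know the object can only be a tuple, a float or an int.
        Simply put, eval would work here, but as the saying goes, "eval is evil",
        so we parse with regular expressions instead.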
        '''
        if self.tuple_reg.match(string):
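            # Parse the Python tuple with json.loads as if it were a JS array: swap the surrounding parentheses for square brackets.
            # None is null in JS; this yields a Python list, which is then converted to a tuple.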
            return tuple(json.loads('[%s]' % string[1: -1].replace('None', 'null')))
        elif self.float_reg.match(string):
            return float(string)
        elif self.int_reg.match(string):
            return int(string)
        return string

    def post(self):
        '''
        The original self.request.arguments looks like this:
        import pprint
        original_args = dict(
            [(k, v) for (k, v) in self.request.arguments.iteritems()]
        )
        pprint.pprint(original_args)

        {'args': ['()'],
         'created': ['1506738449.32'],
         'exc_info': ['None'],
         'exc_text': ['None'],
         'filename': ['GetUserProfileFromThird.py'],
         'funcName': ['run'],
         'levelname': ['ERROR'],
         'levelno': ['40'],
         'lineno': ['78'],
         'module': ['GetUserProfileFromThird'],
         'msecs': ['315.39106369'],
         'msg': ["['x1', 'yyy']"],
         'name': ['monitor'],
         'pathname': ['/Users/ouyang/PycharmProjects/myApp/biz_handlers/third_party/GetUserProfileFromThird.py'],
         'process': ['98843'],
         'processName': ['MainProcess'],
         'relativeCreated': ['57897774.2171'],
         'thread': ['140736844747712'],
         'threadName': ['MainThread']
         }

        '''

        args = dict(
            [(k, self._extract(''.join(v))) for (k, v) in self.request.arguments.iteritems()]
        )
        '''
        import pprint
        pprint.pprint(args)
        Result:
        {
            'threadName': 'MainThread',
            'name': 'monitor',
            'thread': 140736060957632,
            'created': 1506739312.87,
            'process': 1520,
            'args': (),
            'msecs': 872.350931168,
            'filename': 'GetUserProfileFromThird.py',
            'levelno': 40,
            'processName': 'MainProcess',
            'lineno': 78,
            'pathname': '/Users/ouyang/PycharmProjects/myApp/biz_handlers/third_party/GetUserProfileFromThird.py',
            'module': 'GetUserProfileFromThird',
            'exc_text': 'None',
            'exc_info': 'None',
            'funcName': 'run',
            'relativeCreated': 259876.040936,
            'levelname': 'ERROR',
            'msg': "['x1', 'yyy']"
        }
        '''

        '''
        As agreed with the client side, they send logs in the following form:
            from logclient import client_logger
            logs = ["x1","yyy"]
            client_logger.error(logs)
        so here we first have to restore msg_lst = ['x1', 'yyy']
        '''
        msg_lst = args['msg'].replace('[', '').replace(']', '').replace('\'', '').split(',')
        msg_lst = [v.strip() for v in msg_lst]

        '''
        Replace 'None' with None, otherwise the log comes out like this:
        2017-09-30 11:09:10,625 - GetUserProfileFromThird.run.78 - ERROR - x1
        None
        2017-09-30 11:09:10,625 - GetUserProfileFromThird.run.78 - ERROR - yyy
        None
        '''
        for key, value in args.iteritems():
            if value == 'None':
                args[key] = None

        for msg in msg_lst:
            # Write only one entry of msg_lst at a time
            args['msg'] = msg

            #import pdb
            #pdb.set_trace()

            # makeLogRecord takes a dict as its argument
            record = logging.makeLogRecord(args)
            my_logger.handle(record)
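
The replace-and-split approach used to rebuild msg_lst would mis-split a message that itself contains a comma, a bracket or a quote. As a hedged alternative (a swapped-in technique, not what the handler does), the standard library's `ast.literal_eval` can parse the repr of a list without the risks of eval. `parse_msg_list` is a hypothetical helper name.

```python
import ast


def parse_msg_list(raw):
    """Turn the repr of a list of strings back into a list of strings.

    ast.literal_eval only evaluates Python literals, so unlike eval it
    cannot run arbitrary code.
    """
    try:
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        # Not a Python literal at all: treat the whole string as one message.
        return [raw]
    if isinstance(value, (list, tuple)):
        return [str(item) for item in value]
    return [str(value)]


simple = parse_msg_list("['x1', 'yyy']")
tricky = parse_msg_list("['a, b', 'c [d]']")    # commas and brackets inside messages
plain = parse_msg_list("plain text message")     # not a list repr
```

With this helper, `simple` matches what the handler produces today, while `tricky` keeps its two messages intact instead of being cut at the inner comma.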

Logging configuration of the log server, mylogger.py:

# coding=utf-8
import os
import sys
import logging


# Create a global logger
def get_logger():
    print '#########Create a global logger#########'
    logger = logging.getLogger('server_logger')
    filename = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'my.log')
    handler = logging.FileHandler(filename)
    formatter = logging.Formatter('%(asctime)s-%(name)s-%(module)s.%(funcName)s.%(lineno)d - %(levelname)s - %(message)s')
    handler.setFormatter(formatter)
    handler.setLevel(logging.ERROR)

    '''
    # logger.propagate = False: do not pass records up to the parent logger
    # The default is True; in that case the root logger also receives the record, and the console prints:
    2017-09-30 11:26:22,493-monitor-GetUserProfileFromThird.run.78 - ERROR - x1
    ERROR:monitor:x1
    2017-09-30 11:26:22,493-monitor-GetUserProfileFromThird.run.78 - ERROR - yyy
    ERROR:monitor:yyy

    The controlling code is callHandlers() in logging's Logger class, at line 1318:
        def callHandlers(self, record):
            """
            If propagate=True, the else branch is taken and c = c.parent walks back up to root;
            root also prints to its StreamHandler (the console), which causes duplicate output.
            """
            c = self
            found = 0
            while c:
                for hdlr in c.handlers:
                    found = found + 1
                    if record.levelno >= hdlr.level:
                        hdlr.handle(record)
                if not c.propagate:
                    c = None    #break out
                else:
                    c = c.parent
            if (found == 0) and raiseExceptions and not self.manager.emittedNoHandlerWarning:
                sys.stderr.write("No handlers could be found for logger"
                                 " \"%s\"\n" % self.name)
                self.manager.emittedNoHandlerWarning = 1
    '''
    logger.propagate = False


    logger.addHandler(handler)

    # Also write to the screen for real-time observation
    handle_for_screen = logging.StreamHandler(sys.stdout)
    handle_for_screen.setFormatter(formatter)
    logger.addHandler(handle_for_screen)
    return logger

my_logger = get_logger()
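
The effect of `propagate` can be checked in isolation. The sketch below uses a parent/child pair of loggers instead of the real root logger so that it does not touch global logging state; the logger names and `ListHandler` are made up for the demo.

```python
import logging


class ListHandler(logging.Handler):
    """Demo handler that stores rendered messages in a list."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


parent = logging.getLogger("demo_parent")
child = logging.getLogger("demo_parent.child")
parent.propagate = False          # keep the demo away from the root logger

parent_handler = ListHandler()
child_handler = ListHandler()
parent.addHandler(parent_handler)
child.addHandler(child_handler)

child.error("first")              # propagate defaults to True: both handlers get it
child.propagate = False
child.error("second")             # now only the child's own handler gets it
```

After the second call, `parent_handler` has seen only "first", which is the same mechanism that stops the duplicate `ERROR:monitor:x1` console lines.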

log_client.py, used in the other projects:

# coding=utf-8
import logging
import logging.handlers

logging_host = '127.0.0.1'
logging_port = 8888
logging_add_url = '/log/'


def get_logger():
    logger = logging.getLogger('monitor')
    http_handler = logging.handlers.HTTPHandler(
        '%s:%s' % (logging_host, logging_port),
        logging_add_url,
        method='POST'
    )
    http_handler.setLevel(logging.ERROR)
    logger.addHandler(http_handler)

    return logger

client_logger = get_logger()
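
It helps to look at what this client actually puts on the wire. HTTPHandler URL-encodes the dict returned by `mapLogRecord()` (by default the record's `__dict__`), and URL-encoding stringifies every value, so a real None leaves the client as the text 'None'. The sketch below reproduces only that encoding step without sending a request; the import fallback is there because the post uses Python 2 while the sketch is also meant to run on Python 3.

```python
import logging
import logging.handlers

try:  # Python 3
    from urllib.parse import urlencode, parse_qs
except ImportError:  # Python 2, the version used in this post
    from urllib import urlencode
    from urlparse import parse_qs

# Constructing the handler does not open a connection.
handler = logging.handlers.HTTPHandler('127.0.0.1:8888', '/log/', method='POST')
record = logging.makeLogRecord({'msg': 'x1', 'levelname': 'ERROR', 'levelno': 40})

payload = handler.mapLogRecord(record)   # the dict HTTPHandler would send
body = urlencode(payload)                # every value goes through str()
received = parse_qs(body)                # roughly what the server side sees

sent_exc_text = payload['exc_text']      # the real None on the client
got_exc_text = received['exc_text']      # the string 'None' on the server
```

This matches the `'exc_text': ['None']` entry in the request.arguments dump shown in the handler's docstring.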

3.

Now step through in the debugger; everything is in logging/__init__.py!

Next, in class Logger (line 1286):

    def handle(self, record):
        """
        Call the handlers for the specified record.

        This method is used for unpickled records received from a socket, as
        well as those created locally. Logger-level filtering is applied.
        """
        if (not self.disabled) and self.filter(record):
            self.callHandlers(record)    ##############<<<<< JUMP

Next, in class Logger (line 1318):

    def callHandlers(self, record):
        """
        Pass a record to all relevant handlers.

        Loop through all handlers for this logger and its parents in the
        logger hierarchy. If no handler was found, output a one-off error
        message to sys.stderr. Stop searching up the hierarchy whenever a
        logger with the "propagate" attribute set to zero is found - that
        will be the last logger whose handlers are called.
        """
        c = self
        found = 0
        while c:
            for hdlr in c.handlers:
                found = found + 1
                if record.levelno >= hdlr.level:
                    hdlr.handle(record)    ##############<<<<< JUMP
            if not c.propagate:
                c = None    #break out
            else:
                c = c.parent
        if (found == 0) and raiseExceptions and not self.manager.emittedNoHandlerWarning:
            sys.stderr.write("No handlers could be found for logger"
                             " \"%s\"\n" % self.name)
            self.manager.emittedNoHandlerWarning = 1
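
One detail of this path is relevant to the log server: `Logger.handle()` does not consult the logger's own level, so only the per-handler check `record.levelno >= hdlr.level` in `callHandlers()` decides what gets written. A small sketch, with a made-up logger name and an in-memory handler standing in for the file and screen handlers:

```python
import logging


class ListHandler(logging.Handler):
    """Demo handler that stores rendered messages in a list."""

    def __init__(self, level=logging.NOTSET):
        logging.Handler.__init__(self, level)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("handle_demo")
logger.propagate = False
logger.setLevel(logging.CRITICAL)          # blocks logger.error(...)

errors_only = ListHandler(logging.ERROR)   # like the FileHandler in mylogger.py
everything = ListHandler()                 # like the StreamHandler (level NOTSET)
logger.addHandler(errors_only)
logger.addHandler(everything)

logger.error("dropped by the logger level")

# Records injected with handle() skip the logger-level check.
for levelno, levelname, msg in [(logging.INFO, "INFO", "info via handle"),
                                (logging.ERROR, "ERROR", "error via handle")]:
    logger.handle(logging.makeLogRecord(
        {"levelno": levelno, "levelname": levelname, "msg": msg}))
```

So on the server, a record of any level that arrives over HTTP reaches the screen handler, and only the file handler's `setLevel(logging.ERROR)` filters it.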

Next, in class Handler (line 744):

    def handle(self, record):
        """
        Conditionally emit the specified logging record.

        Emission depends on filters which may have been added to the handler.
        Wrap the actual emission of the record with acquisition/release of
        the I/O thread lock. Returns whether the filter passed the record for
        emission.
        """
        rv = self.filter(record)
        if rv:
            self.acquire()
            try:
                self.emit(record)   ################<<<<< JUMP
            finally:
                self.release()
        return rv

Next, in class StreamHandler (line 847):

    def emit(self, record):
        """
        Emit a record.

        If a formatter is specified, it is used to format the record.
        The record is then written to the stream with a trailing newline.  If
        exception information is present, it is formatted using
        traceback.print_exception and appended to the stream.  If the stream
        has an 'encoding' attribute, it is used to determine how to do the
        output to the stream.
        """
        try:
            msg = self.format(record)    ###############<<<<< JUMP
            stream = self.stream
            fs = "%s\n"
            if not _unicode: #if no unicode support...
                stream.write(fs % msg)
            else:
                try:
                    if (isinstance(msg, unicode) and
                        getattr(stream, 'encoding', None)):
                        ufs = u'%s\n'
                        try:
                            stream.write(ufs % msg)
                        except UnicodeEncodeError:
                            #Printing to terminals sometimes fails. For example,
                            #with an encoding of 'cp1251', the above write will
                            #work if written to a stream opened or wrapped by
                            #the codecs module, but fail when writing to a
                            #terminal even when the codepage is set to cp1251.
                            #An extra encoding step seems to be needed.
                            stream.write((ufs % msg).encode(stream.encoding))
                    else:
                        stream.write(fs % msg)
                except UnicodeError:
                    stream.write(fs % msg.encode("UTF-8"))
            self.flush()
        except (KeyboardInterrupt, SystemExit):
            raise
        except:
            self.handleError(record)

Next, in class Handler (line 721):

    def format(self, record):
        """
        Format the specified record.

        If a formatter is set, use it. Otherwise, use the default formatter
        for the module.
        """
        if self.formatter:
            fmt = self.formatter
        else:
            fmt = _defaultFormatter
        return fmt.format(record)   ###############<<<<< JUMP

Next, in class Formatter (line 458):

    def format(self, record):
        """
        Format the specified record as text.

        The record's attribute dictionary is used as the operand to a
        string formatting operation which yields the returned string.
        Before formatting the dictionary, a couple of preparatory steps
        are carried out. The message attribute of the record is computed
        using LogRecord.getMessage(). If the formatting string uses the
        time (as determined by a call to usesTime(), formatTime() is
        called to format the event time. If there is exception information,
        it is formatted using formatException() and appended to the message.
        """
        record.message = record.getMessage()
        if self.usesTime():
            record.asctime = self.formatTime(record, self.datefmt)
        s = self._fmt % record.__dict__
        if record.exc_info:
            # Cache the traceback text to avoid converting it multiple times
            # (it's constant anyway)
            if not record.exc_text:
                record.exc_text = self.formatException(record.exc_info)
        if record.exc_text:
            if s[-1:] != "\n":    #############<<
                s = s + "\n"
            try:
                s = s + record.exc_text
            except UnicodeError:
                # Sometimes filenames have non-ASCII chars, which can lead
                # to errors when s is Unicode and record.exc_text is str
                # See issue 8924.
                # We also use replace for when there are multiple
                # encodings, e.g. UTF-8 for the filesystem and latin-1
                # for a script. See issue 13232.
                s = s + record.exc_text.decode(sys.getfilesystemencoding(),
                                               'replace')
        return s

OK, we have reached the end of the hunt. Look at:

        if record.exc_text:
            if s[-1:] != "\n":
                s = s + "\n"
            try:
                s = s + record.exc_text

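I found that when we convert the tornado arguments, exc_text ends up as the string 'None' rather than None, and that is what causes this mysterious None to be printed.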
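
The cause can be reproduced without tornado or a log server. The sketch below formats two records that differ only in whether exc_info and exc_text hold the string 'None' or the real None; the format string is shortened for the demo.

```python
import logging

formatter = logging.Formatter('%(levelname)s - %(message)s')
base = {'msg': 'x1', 'levelname': 'ERROR', 'levelno': 40, 'args': ()}

# What the server built from the form data: the string 'None', which is truthy.
bad = logging.makeLogRecord(dict(base, exc_info='None', exc_text='None'))
# What a locally created record carries: the real None, which is falsy.
good = logging.makeLogRecord(dict(base, exc_info=None, exc_text=None))

bad_output = formatter.format(bad)     # 'ERROR - x1\nNone'
good_output = formatter.format(good)   # 'ERROR - x1'
```

Because `if record.exc_text:` is true for any non-empty string, the formatter appends a newline followed by the text 'None'.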

The fix, applied when processing the arguments that tornado receives:

        for key, value in args.iteritems():
            if value == 'None':
                args[key] = None
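
Converting every 'None' value in the dict to None would also change a log message whose text happens to be "None". A narrower variant, sketched here as an assumption rather than the project's code, converts only the fields that the formatter tests for truthiness. `normalize_none` and `NONE_FIELDS` are hypothetical names; `stack_info` is included because it exists on Python 3 records, while Python 2.7 has no such field.

```python
import logging

# Fields that may arrive as the string 'None' (assumed list for this sketch).
NONE_FIELDS = ('exc_info', 'exc_text', 'stack_info')


def normalize_none(args, fields=NONE_FIELDS):
    """Return a copy of args with 'None' strings in the given fields set to None."""
    cleaned = dict(args)
    for field in fields:
        if cleaned.get(field) == 'None':
            cleaned[field] = None
    return cleaned


raw = {'msg': 'None', 'levelname': 'ERROR', 'levelno': 40,
       'exc_info': 'None', 'exc_text': 'None'}
cleaned = normalize_none(raw)

formatter = logging.Formatter('%(levelname)s - %(message)s')
line = formatter.format(logging.makeLogRecord(cleaned))
```

The message text "None" survives and no extra line is appended; the input dict is left untouched because the helper works on a copy.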

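After debugging this, I find the built-in logging module rather interesting. It first uses a while loop to walk through all the handlers of the logger, then each handler handles the log record in turn. handle() is essentially emit() wrapped in a lock, and emit() is the function that does the actual work: it uses the formatter to build a msg and then writes it to the concrete stream (which may be a file or the console).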
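
That chain can be observed end to end with a StreamHandler writing into an in-memory stream. This is a minimal sketch; the record fields are example values and the import fallback only exists so the snippet runs on both Python 2 and Python 3.

```python
import logging

try:  # Python 2, the version used in this post
    from StringIO import StringIO
except ImportError:  # Python 3
    from io import StringIO

stream = StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(name)s - %(levelname)s - %(message)s'))

record = logging.makeLogRecord(
    {'name': 'monitor', 'levelname': 'ERROR', 'levelno': 40, 'msg': 'x1'})

formatted = handler.format(record)   # what the Formatter produces
passed = handler.handle(record)      # filter, take the lock, emit, release
written = stream.getvalue()          # the formatted text plus a trailing newline
```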

The complete log server project is at: https://github.com/emaste-r/tornado_sync_log_demo

If you find it useful, feel free to give it a Star on GitHub, O(∩_∩)O~

That's all.