使用递归下降文法解析Cognos TM1数据库日志

wanqisheng

已于 2023-01-04 20:19:09 修改

阅读量321

点赞数

分类专栏： unix 文章标签：数据库 oracle

于 2023-01-04 20:16:30 首次发布

本文链接：https://blog.csdn.net/wanqisheng/article/details/128553387

版权

unix 专栏收录该内容

4 篇文章 0 订阅

订阅专栏

同事希望了解TM1中不同对象的执行信息，如开始时间和结束时间，时长，成功与否，错误信息等，以便了解执行快慢，分析错误原因。注意到tm1日志中包含了类似这些信息，所以希望通过分析执行日志来实现这个需求。查看了一下TM1日志的构成，借鉴常见程序日志的记录过程，参考python cookbook中递归下降文法，最后使用python来分析。

TM1的日志类似如下

大致来说每个日志记录行是这样的，

<语句>::= 成分1 + 成分2 + LOG_LEVEL + yyyy-mm-dd hh24:mi:ss.f3 + 日志来源对象 + 描述简短信息（可能存在换行）

程序中为了匹配一个对象开始执行和结束执行，尝试分析后发现，实际是通过PART1和PART2构成了对象执行中的标识。

同时考虑到日志分析时，可能某些对象正在执行之中，需要在下一次分析中完整记录这些正在执行对象的分析结果，所以引入了savepoint，表示这个时间点之前启动的对象都已经执行完成，类似数据库之中的checkpoint.

具体程序如下

# -*- coding:utf-8 -*-

# 保存分析结果到oracle数据库中,用于琪雯和志勇使用
import cx_Oracle

# 修改为使用logging模块来记录日志
import logging

# 用于打印异常函数栈
import traceback

import datetime
# 计算进程消耗cpu
import time

# 使用哪个正则匹配
import re

# 使用命名元组来操作token
import collections

# 为自定义解码错误处理方法而导入
import codecs

# 使用2分搜索查找ti记录位置
import bisect

# 用于保存启动和执行完成的描述信息,描述信息包含行号和具体信息,使用json.dumps(data,separators=(',',':') )
# json.loads(strJson)来复原
import json

# ===========================================================================
# 设计说明
#
# 目的是分析tm1server.log日志,获取执行ti的时间长短,执行结果,失败原因等
# 考虑日志在不断追加之中,且日志可能被其他人修改, 每天早上执行1次,记录分析程序开始时间,结束时间,消耗cpu,
# 分析开始位置所对应时间,结束位置对应时间,本次分析的保存点对应时间(此时间之前的ti都已经执行完成)
# 对于每个ti执行,记录开始时间,结束时间,时长,执行结果,失败原因

# 程序大致过程如下:
# 1. 按照T_tm1slog中analysis_id排序,查找最后一次执行时的save_point和analysis_id
# 2. 如果为空, 则文件从开始分析,新的分析编号为1,
#    否则analysis_id加1作为新的分析编号,同时文件找到此时间,然后读入下一行,作为开始分析
# 3. 假设不同ti存在并发执行的情况,开始进行ti扫描.假设输入为具备行缓冲的文件,使用基于stream的操作
#    1. 扫描符合ti的行,获取ti名称,判断是开始执行还是执行完成或退出,操作时间,按照不同情况,记录对应的各种数据
#    2. 查找字典结构中此ti名称的分析,
#    3. 如果当前行为开始执行,则在此ti名称的分析list中增加一行,假设同一ti操作日志输出会按照时间排序,且忽略不同线程操作时,由于缓冲导致的
#       输出时间顺序不同情况
#    4. 如果是执行返回,则检查此ti名称的分析list是否为空,若空,则提示错误, 否则list排序中查找前次完成操作的位置,然后在之后的未完成操作
#       中记录完成时间和返回信息,若没有未完成的操作,提示错误
# 4. 从本次操作的数据构造保存到数据库中的数据,并保存
#

# 解析中发现tm1server.log文件存在下面现象:
# 1. 文件中存在乱码,文件编码为utf-8,但存在非utf-8的数据
# 2. 存在单独的0x0d,如47533行,在}之后存在0x0d, 随后才是) 导致增加了1行
#    在以下位置或其附近出现语法错误： '})', 字符位置 215
#

# 对于扫描性能,暂时不考虑,每次分析完整文件
# 2022.12.19

# 由于tm1s数据库服务存在重新启动和异常退出问题,所以仅匹配同次启动的情况
# 需要从服务器启动后开始匹配,如果服务器存在shutdown情况,则此前没有匹配成功的全部不再分析
# 服务器启动标志是:
# TM1.Server   thread TM1UpdateLoggerThread starts with id:
# 服务关闭标志是
# TM1.Server   thread TM1UpdateLoggerThread quits with id:
# 2022.12.20

# 1. TI_UNKNOWN1是这个ti的提示信息,查找对应的ti分析记录,补充到描述结构中
# 2. TI_UNKNOWN2实际是异常退出,可以作为返回代码2来处理
# 3. TI_UNKNOWN3应该是ti的提示信息,但发现仅出现1次,Cube数据导出_B0703损益表, 且应该是代码错误,所以暂时按照原有方式
# 2022.12.21

# 1. 增加ti_process_type,引入执行对象类型TM1.Chore,TM1.Process等,parent_begin_time,记录父对象的启动时间,以便后续使用层级来搜索启动关系
#    增加以便查找父对象实例化之后的实际记录,在目前情况下,实际无法区分同时存在多个实例化的情况
# 2. 增加TM1.Chore结构的处理,以便获取完整启动父子关系
# 3. 分析时,当ti启动时,记录父对象的类型,父对象的名称和实例化时间,同时在父对象的child_nodes中记录子的名称,实例化时间,通过双向方式
#    以便TM1.Chore结束时,将其所属的所有子对象关闭,如果子对象没有对应的结束,则此时,使用单独的状态标识这种现象
# 2022.12.21

# 1. ti层级关系判断时，需要增加用户
# 2022.12.25

# 1. 对日志中显示2次的原因
#    例如 317142行的Process A00系统_更新年度预算实际数 executed by chore A00系统_更新年度预算实际数
#    Process "A02年度预算_B0601、F0502部门费用预算" run from process "A00系统_更新年度预算实际数" by user "R*A00系统_更新年度预算实际数"
# ===========================================================================

# ===========================================================================
# 常量及变量说明
# 日志记录对象
logger=None

# 批量执行控制
BATCH_SIZE=100

# 全面预算 数据库的连接字符串 C1_MIDDLEDATA
BGDB_CONN_STR="C1_MIDDLEDATA/mypasswd@bgdb"

# ===========================================================================
# 正则解析

# 查看tm1sserver.log,发现存在简短描述信息换行的情况,猜测实际文法构成是
# <语句列表>::= <语句>|<语句列表><语句>
# <语句>::= 成分1 + 成分2 + LOG_LEVEL + yyyy-mm-dd hh24:mi:ss.f3 + 日志来源对象 + 描述简短信息 （可能存在换行）
# 若语句不符合上述文法,则需要提示
# 其中
# 成分1  从行开始的数值部分,直到分隔符   用途不知  从数据来看，可能是对应执行线程的某种id，
# 成分2  [] 或[]中间包含数值, 直到分隔符  用途不知 从数据看， 应该是实例id， 如果存在，
# LOG_LEVEL : INFO | ERROR | DEBUG | WARN, 其他信息则需要提示, 正常来说,应该是这样的
# CURR_SOURCE 存在 TM1.Mdx.Interface,TM1.Server等不同情况,所以使用 (?:\.\w+)* 非捕获版本
# 日志来源对象,为下面的一种
# TM1.API
# TM1.Blob
# TM1.Chore
# TM1.Client
# TM1.ClientMessage
# TM1.Cube
# TM1.Cube.Dependency
# TM1.Dimension
# TM1.Lock
# TM1.Mdx.Interface
# TM1.Process
# TM1.Server
# TM1.Server.Config
# TM1.Server.Memory
# TM1.Server.Network
# TM1.SQLAPI
# TM1.Subset
# TM1.TILogOutput
# TM1.Transaction
# TM1.User.Thread
LOG_STATEMENT=r'(?P<PART1>^\d+)' + r'\s+'+ \
              r'\[(?P<PART2>\d*)\]' + r'\s+'+ \
              r'(?P<LOG_LEVEL>INFO|ERROR|DEBUG|WARN)' + r'\s+'+ \
              r'(?P<CURR_TIMESTAMP>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}\.\d{3})' + r'\s+'+ \
              r'(?P<CURR_SOURCE>TM1\.\w+(?:\.\w+)*)' + r'\s+' + \
              r'(?P<CURR_DESC>.+?)'+r'$'
# 获取日志语句对应的 编译后正则表达式对象
log_statement = re.compile(LOG_STATEMENT)

# 空行
WS=r'^\s*$'
ws=re.compile(WS)

# 需要从服务器启动后开始匹配,如果服务器存在shutdown情况,则此前没有匹配成功的全部不再分析
# 服务器启动标志是:
# TM1.Server   thread TM1UpdateLoggerThread starts with id:
# 服务关闭标志是
# TM1.Server   thread TM1UpdateLoggerThread quits with id:

SERVER_START=r'thread TM1UpdateLoggerThread starts with id:'
server_start=re.compile(SERVER_START)

SERVER_SHUTDOWN=r'thread TM1UpdateLoggerThread quits with id:'
server_shutdown=re.compile(SERVER_SHUTDOWN)

# TM1.Chore 部分的匹配
# chore的启动,仅1种方式
# 140134112556800   []   INFO   2022-12-05 05:30:01.991   TM1.Chore   Chore "A00系统_更新年度预算实际数" executed by scheduler
CHORE_START=r'^Chore'+r'\s+'+ \
            r'(?P<QUOTATION_MARKS>["\'])?(?P<CHORE_NAME>[^"\']+?)(?(QUOTATION_MARKS)(?P=QUOTATION_MARKS))'+r'\s+'+ \
            r'executed by scheduler'
chore_start=re.compile(CHORE_START)

# chore的执行完成,实际包含2个步骤
# 140133978339072   []   INFO   2022-12-05 05:34:28.160   TM1.Chore   Chore "A00系统_更新年度预算实际数0330" time = 7466.71 seconds
# 140133978339072   []   INFO   2022-12-05 05:34:41.848   TM1.Chore   Chore "A00系统_更新年度预算实际数0330" finished executing
# 猜测这2者之间的间隔为执行完成后,释放其所使用的各种对象,内存等,所以消耗了13秒
# 使用chore_finish1,chore_finish2来结束它所启动的所有ti,
# 其中chore_finish1设置状态为1,chore_finish2设置状态为0,区分看看是否存在1的情况

CHORE_FINISH1=r'^Chore'+r'\s+'+ \
              r'(?P<QUOTATION_MARKS>["\'])?(?P<CHORE_NAME>[^"\']+?)(?(QUOTATION_MARKS)(?P=QUOTATION_MARKS))'+r'\s+'+ \
              r'time'+r'\s*=\s*'+ r'(?P<DURATION>\d+\.?\d+)'+r'\s*'+r'seconds'
chore_finish1=re.compile(CHORE_FINISH1)

CHORE_FINISH2=r'^Chore'+r'\s+'+ \
              r'(?P<QUOTATION_MARKS>["\'])?(?P<CHORE_NAME>[^"\']+?)(?(QUOTATION_MARKS)(?P=QUOTATION_MARKS))'+r'\s+'+ \
              r'finished executing'
chore_finish2=re.compile(CHORE_FINISH2)

# 以下这几种方式可以忽略处理
# TM1.Chore   start wait for next runtime
# TM1.Chore   executing next runtime
# TM1.Chore   Registering chore: A06维度元素导出_组织末级  Start time: 2021/07/28 04:10:00  Frequency: 01:00:00:00
# TM1.Chore   Deactivating chore: A00系统_更新年度预算实际数0330
CHORE_IGNORE1=r'start wait for next runtime'
chore_ignore1=re.compile(CHORE_IGNORE1)

CHORE_IGNORE2=r'executing next runtime'
chore_ignore2=re.compile(CHORE_IGNORE2)

CHORE_IGNORE3=r'Registering chore:'+r'\s+'+r'.*'
chore_ignore3=re.compile(CHORE_IGNORE3)

CHORE_IGNORE4=r'Deactivating chore:'+r'\s+'+r'.*'
chore_ignore4=re.compile(CHORE_IGNORE4)

# ti的启动 存在类似下面3种说明的方式
# Process "A01维度_物料" run from process "A00系统_维度更新_每天0530" by user "R*A00系统_维度更新_每天0530"
# Process A00系统_更新年度预算实际数 executed by chore A00系统_更新年度预算实际数
# Process "生产模块_ti送数_B0205计算分交易类型生产量" executed by user "8037932"
TI_START1=r'^(进程|Process)'+r'\s+'+ \
          r'(?P<QUOTATION_MARKS>["\'])?(?P<TI_NAME>[^"\']+?)(?(QUOTATION_MARKS)(?P=QUOTATION_MARKS))'+r'\s+'+ \
          r'run from process'+r'\s+' + \
          r'(?P<QUOTATION_MARKS2>["\'])?(?P<RUN_FROM_PROCESS_NAME>[^"\']+?)(?(QUOTATION_MARKS2)(?P=QUOTATION_MARKS2))'+r'\s+'+ \
          r'by user'+r'\s+' + \
          r'(?P<QUOTATION_MARKS3>["\'])?(?P<EXECUTED_BY_USER>[^"\']+?)(?(QUOTATION_MARKS3)(?P=QUOTATION_MARKS3))'
ti_start1=re.compile(TI_START1)

TI_START2=r'^(进程|Process)'+r'\s+'+ \
          r'(?P<QUOTATION_MARKS>["\'])?(?P<TI_NAME>[^"\']+?)(?(QUOTATION_MARKS)(?P=QUOTATION_MARKS))'+r'\s+'+ \
          r'executed by chore'+r'\s+' + \
          r'(?P<QUOTATION_MARKS2>["\'])?(?P<RUN_FROM_PROCESS_NAME>[^"\']+?)(?(QUOTATION_MARKS2)(?P=QUOTATION_MARKS2))'+r'$'
ti_start2=re.compile(TI_START2)

TI_START3=r'^(进程|Process)'+r'\s+'+ \
          r'(?P<QUOTATION_MARKS>["\'])?(?P<TI_NAME>[^"\']+?)(?(QUOTATION_MARKS)(?P=QUOTATION_MARKS))'+r'\s+'+ \
          r'executed by user'+r'\s+' + \
          r'(?P<QUOTATION_MARKS2>["\'])?(?P<EXECUTED_BY_USER>[^"\']+?)(?(QUOTATION_MARKS2)(?P=QUOTATION_MARKS2))'+r'$'
ti_start3=re.compile(TI_START3)

# 正常执行完成
# 进程 "A01维度_半成品":  已正常完成执行,耗时 67.95 秒
TI_FINISH1=r'^(进程|Process)'+r'\s+'+ \
           r'(?P<QUOTATION_MARKS>["\'])?(?P<TI_NAME>[^"\']+?)(?(QUOTATION_MARKS)(?P=QUOTATION_MARKS))(:|：)'+r'\s+'+ \
           r'.*'+ r'已正常完成执行'+r'.*'+ \
           r'耗时\s*(?P<DURATION>\d+\.?\d+)\s*秒'
ti_finish1=re.compile(TI_FINISH1)

# 执行完成,但存在错误,其中 已正常完成执行.请在以下文件中检查新消息 这种方式实际仍然是有问题,所以归为一类
# 进程 "A00系统_更新经营分析实际数":  已完成执行,但存在错误.错误文件：<tm1processerror_20221214235040_34120320_a00系统_更新经营分析实际数.log>
# 进程 "P050_Change_ElementName":  已正常完成执行.请在以下文件中检查新消息：<tm1processerror_20220323140015_94976000_p050_change_elementname.log>
TI_FINISH2=r'^(进程|Process)'+r'\s+'+ \
           r'(?P<QUOTATION_MARKS>["\'])?(?P<TI_NAME>[^"\']+?)(?(QUOTATION_MARKS)(?P=QUOTATION_MARKS))(:|：)'+r'\s+'+ \
           r'.*'+ r'(已完成执行|已正常完成执行)'+r'.*'+ \
           r'<'+r'(?P<MESSAGE_FILE>[^<>]+?)'+r'>'
ti_finish2=re.compile(TI_FINISH2)

# 执行终止
# 进程 "A08经营分析_BA0304材料采购成本分析_采购入库": ：执行已中止.错误文件：<tm1processerror_20221214234251_31720320_a08经营分析_ba0304材料采购成本分析_采购入库.log>
# 进程 "A06维度元素导出_客户（层级）": ：执行已中止.错误文件：<tm1processerror_20220325163559_26905088_a06维度元素导出_客户（层级）.log> : 执行 SQL 查询出错：“insert into T_C1_DIMENSION_CUSTOMER (c1,c2,c3) values ( 'B822','HARRY'S.INC-B822','N-B822')”
TI_TERMINATE=r'^(进程|Process)'+r'\s+'+ \
             r'(?P<QUOTATION_MARKS>["\'])?(?P<TI_NAME>[^"\']+?)(?(QUOTATION_MARKS)(?P=QUOTATION_MARKS))(:|：)'+r'\s+'+ \
             r'.*'+r'执行已中止.*错误文件.*'+ \
             r'<'+r'(?P<MESSAGE_FILE>[^<>]+?)'+r'>'
ti_terminate=re.compile(TI_TERMINATE)

# 执行取消
# 进程 "A02年度预算_F0603损益预实_内外部客户营收_按钮": ：ProcessQuit 函数已将执行中止.
# 这里 允许 '或" 来分隔 TI_NAME 或没有分隔符
TI_QUIT=r'^(进程|Process)'+r'\s+'+ \
        r'(?P<QUOTATION_MARKS>["\'])?(?P<TI_NAME>[^"\']+?)(?(QUOTATION_MARKS)(?P=QUOTATION_MARKS))(:|：)'+r'\s+'+ \
        r'.*'+r'ProcessQuit'+r'.*'+r'$'
ti_quit=re.compile(TI_QUIT)

# 其他方式 此时主要是获取 程序的名称,如果这个名称没有启动,则将其认为是启动,含义是启动时,就遇到了错误,所以这个启动和终止是一个提示信息
# A00系统_更新年度预算实际数0330：不存在这样的进程名称：“A02年度预算_B0407人员成本占比_清数”
# 进程 "A00系统_更新年度预算实际数0330": ：无法初始化进程 : A00系统_更新年度预算实际数0330：不存在这样的进程名称：“A02年度预算_B0407人员成本占比_清数”
# 在进程“Cube数据导出_B0703损益表”中找不到参数“”
TI_UNKNOWN1=r'(?P<TI_NAME>[^"\'\s]+?)(:|：)'+ r'\s*' +\
            r'不存在这样的进程名称'
ti_unknown1=re.compile(TI_UNKNOWN1)

TI_UNKNOWN2=r'^(进程|Process)'+r'\s+'+ \
            r'(?P<QUOTATION_MARKS>["\'])?(?P<TI_NAME>[^"\']+?)(?(QUOTATION_MARKS)(?P=QUOTATION_MARKS))(:|：)'+r'\s+'
ti_unknown2=re.compile(TI_UNKNOWN2)

TI_UNKNOWN3=r'.*'+r'(进程|Process)'+r'\s*'+ \
            r'(?P<QUOTATION_MARKS>“)?(?P<TI_NAME>[^"\']+?)(?(QUOTATION_MARKS)”)'
ti_unknown3=re.compile(TI_UNKNOWN3)

# 忽略,此方式没有程序名称,而且是固定的,且是DEBUG级别或者是前面没有解析出 名称的 其他未知方式
# ExecuteProcedures: Cancel request received.
# 139372829603584   [9846]   DEBUG   2022-11-24 13:49:26.455   TM1.Process   ExecuteProcedures: Cancel request received.
TI_IGNORE=r'ExecuteProcedures: Cancel request received.'
ti_ignore=re.compile(TI_IGNORE)

# token类型
Token = collections.namedtuple('Token', ['type', 'value'])

# 用于分析过程的描述,不包含执行开始,结束,消耗cpu这3个统计信息
ANALYSIS_INFO=collections.namedtuple('ANALYSIS_INFO',
                                     ['analysis_id',
                                      'db_started_line','db_started_time','db_shutdowned_line','db_shutdowned_time',
                                      'begin_position_line','begin_position_time','begin_position_desc',
                                      'end_position_line','end_position_time','end_position_desc',
                                      'save_point_line','save_point_time','save_point_desc',
                                      'description'
                                     ]
                                    )

# ===========================================================================
# 函数
def generate_tokens():
    '''
    产生LOG_STATEMENT的token.由于语句可能包含多行的情况,此时考虑token阶段不处理文法
    所以将后续的多行均作为unknown,在文法阶段,将其合并到前一个语句的信息描述部分
    '''
    # tm1server.log文件存在编码无法解析为utf-8的情况,此时使用replace方式,将其作为?处理
    # codecs.register_error()
    # Error Handlers
    def unicode_decode_err_handle(exc):
        '''
        由于文件中包含无法解析为utf-8的编码,而使用replace,backslashreplace方式,取代结果中都包含了\n,导致行无法和文件中行对应
        不便于核对文件数据,所以使用替代方式,使用codecs.register_error注册,然后调用这个替代方式处理
        exc是UnicodeDecodeError异常,为UnicodeError的子类,具有下面属性
        encoding 引发错误的编码名称.
        reason   描述特定编解码器错误的字符串.
        object   编解码器试图要编码或解码的对象.
        start    object 中无效数据的开始位置索引.
        end      object 中无效数据的末尾位置索引（不含）.
        '''
        if not isinstance(exc, UnicodeDecodeError):
            raise TypeError("don't know how to handle %r" % exc)

        sk=str(exc.object[exc.start:exc.end])
        newpos = exc.end + 1
        return (sk, newpos)

    # 设置unicode处理方式为前面替代方法
    codecs.register_error("unicode_decode_err_handle", unicode_decode_err_handle)
    with open("tm1server.log","rt",encoding='utf-8',errors='unicode_decode_err_handle') as f1:
        i=0
        for line in f1:
            i+=1
            if  ws.match(line) :
                # 忽略空行
                continue
            if  log_statement.match(line) :
                match=log_statement.match(line)
                # 参考https://cx-oracle.readthedocs.io/en/latest/user_guide/sql_execution.html?highlight=TIMESTAMP%20WITH%20TIME%20ZONE#id7
                # The timestamps returned are naive timestamps without any time zone information present.
                # TIMESTAMP WITH TIME ZONE中实际没有包含时区信息,所以数据库中列类型仅仅是TIMESTAMP
                tok=Token('LOG_STATEMENT',
                          {'LINE':i,                                                                           # 便于核对
                           'PART1':int(match.group('PART1')),                                                  # 这个应该是和线程某种编号有关对象id
                           'PART2':int(match.group('PART2')) if match.group('PART2')!='' else None,            # 这个应该是和实例化有关的编号 
                           'CURR_TIMESTAMP':datetime.datetime.fromisoformat(match.group('CURR_TIMESTAMP')),    # 使用本地时区时间戳
                           'CURR_SOURCE':match.group('CURR_SOURCE'),                                           # 日志来源对象
                           'CURR_DESC':match.group('CURR_DESC')                                                # 描述信息
                          }
                         )
            else:
                tok=Token('UNKNOWN',
                          {'LINE':i,                                       # 便于核对
                           'CURR_DESC':line                                # 将整行作为描述信息
                          }
                         )
                logger.debug("UNKNOWN TOKEN %d , %s"%(i,line))
            yield tok

class TM1_ServerLog(object):
    '''
    解析tm1server日志文法,提取句子,使用产生器方式来迭代提取
    参考 python cookbook 中P78示例
    '''
    def parse(self,last_info):
        '''
        对tm1server.log文件进行解析,获取主要信息
        '''
        self.tokens = generate_tokens()
        self.tok = None # Last symbol consumed
        self.nexttok = None # Next symbol tokenized
        self._advance() # Load first lookahead token
        self.context(last_info)

    def _advance(self):
        '''
        读取token
        '''
        self.tok, self.nexttok = self.nexttok, next(self.tokens, None)

    def _accept(self, toktype):
        '''
        判断是否为对应类型的token,如果符合要求,则读入下一个token
        '''
        if self.nexttok and self.nexttok.type == toktype:
            self._advance()
            return True
        else:
            return False

    def _expect(self, toktype):
        '''
        判断是否为预期的token,否则产生语法异常
        '''
        if not self._accept(toktype):
            raise SyntaxError('Expected ' + toktype+', current token is '+str(self.tok))

    # Grammar rules follow
    def context(self,last_info):
        '''
        按照下面文法解析trace文件
        <语句列表>::= <语句>|<语句列表><语句>
        考虑为了尽可能的得到执行结构的输出,防止日志输出方式和自己理解的不一致,对于异常方式,均输出提示.方便理解日志,修改代码
        因此这里是全部日志文件解析,后续利用last_savepoint过滤需要的分析结果
        '''
        # 初始化分析TM1.Process的变量
        self.ti_list={}
        self.chore_list={}
        #数据库服务启动说明
        self.started_db={}
        self.started_times=0

        # 当开始阶段没有匹配到语句,则继续读取token
        while not self._accept('LOG_STATEMENT'):
            # 当读取到文件尾时,实际表示文件中不包含需要处理的信息,或文件实际为空
            # 直接返回
            if  self.tok is None:
                return
            # 继续读取下一个token来判断是否进入了语句
            self._advance()

        while True:
            # 解析出语句信息
            curr_value=self.statement()

            if curr_value['CURR_SOURCE']=='TM1.Server':
                # 此时判断是否匹配启动
                self.process_server_data(curr_value,last_info)

            if curr_value['CURR_SOURCE']=='TM1.Process':
                # 此时是需要处理的ti
                self.process_ti_data(curr_value,last_info)

            if curr_value['CURR_SOURCE']=='TM1.Chore':
                # 此时处理chore
                self.process_chore_data(curr_value,last_info)

            # 其他来源的暂时不处理,如果后续处理,这里增加即可

            # 读入后续的语句
            if  self._accept('LOG_STATEMENT'):
                continue

            if  self.nexttok is None:
                # 此时trace文件匹配到尾部
                # 将self.ti_list增加到前面的启动分析中
                # 为防止 TM1.Server关闭后，已经将分析数据追加到self.started_db[self.started_times],所以这里需要判断是否为空
                if len(self.ti_list.keys())>0 or len(self.chore_list.keys())>0:
                    if self.started_times in self.started_db:
                        db_started_line,db_started_time,*ignore_items=self.started_db[self.started_times]
                    else:
                        # 此时表示人为删除了文件,已经没有遇到数据库启动标识
                        db_started_line,db_started_time=None,None
                    
                    self.started_db[self.started_times]=(db_started_line,db_started_time,None,None,self.ti_list,self.chore_list)
                    # 清空ti分析数据,chore分析数据
                    self.ti_list={}
                    self.chore_list={}
                    self.started_times+=1
                break

            # 此时说明遇到了非 LOG_STATEMENT,也不是结尾的情况
            raise SyntaxError('遇到 ' + self.nexttok.type+' 不符合文法要求')

        #这里无返回值,需要外部过程调用,按照上次的保存点来过滤需要的结果
        return

    def statement(self):
        '''
        按照下面文法解析
        语句 ::= LOG_STATEMENT [ UNKNOWN ]
        这里解析后,实际是返回语句的值
        '''
        curr_value=self.tok.value

        # 判断是否包含多行描述信息,这里保持原始文件中多行描述信息的换行方式
        while True:
            if self._accept('UNKNOWN'):
                curr_value['CURR_DESC']=curr_value['CURR_DESC']+'\n'+self.tok.value['CURR_DESC']
                continue
            else:
                break
        return curr_value

    def process_server_data(self,curr_value,last_info):
        '''
        解析是否匹配启动和suhtdown,目的是仅分析同次启动的数据,
        增加TM1.Chore数据,仍然遵守TM1.Process的处理原则,以便后续使用层级来搜索启动关系,从而区分同次
        '''
        # 当前所使用的变量初始化
        line_number=curr_value['LINE']
        curr_timestamp=curr_value['CURR_TIMESTAMP']
        curr_desc=curr_value['CURR_DESC']

        match=server_start.match(curr_desc)
        if match:
            # 此时开始新的数据
            # 需要检查看self.ti_list是否有数据,如果有,则需要其对应到前一次的数据中
            # 为防止 TM1.Server关闭后，已经将分析数据追加到self.started_db[self.started_times],所以这里需要判断是否为空 
            if  len(self.ti_list.keys())>0 or len(self.chore_list.keys())>0 :
                # 此时需要将待处理的ti数据,保存到前次对应的启动数据
                # 如果self.started_times为0,则表示原来没有找到过,此时启动时间未知,结束时间未知
                # 例如是异常终止
                if self.started_times in self.started_db:
                    db_started_line,db_started_time,*ignore_items=self.started_db[self.started_times]
                else:
                    db_started_line,db_started_time=None,None
                self.started_db[self.started_times]=(db_started_line,db_started_time,None,None,self.ti_list,self.chore_list)
            # 恢复新的ti数据分析
            self.ti_list={}
            self.chore_list={}
            self.started_times+=1
            self.started_db[self.started_times]=(line_number,curr_timestamp,None,None,self.ti_list,self.chore_list)

            return

        match=server_shutdown.match(curr_desc)
        if match:
            # 此时填写结束时间
            db_started_line,db_started_time,*ignore_items=self.started_db[self.started_times]
            self.started_db[self.started_times]=(db_started_line,db_started_time,line_number,curr_timestamp,self.ti_list,self.chore_list)
            # 恢复新的ti数据分析
            self.ti_list={}
            self.chore_list={}
        return

    def process_chore_data(self,curr_value,last_info):
        '''
        解析是否匹配启动和suhtdown,目的是仅分析同次启动的数据,
        增加TM1.Chore处理,仍然遵守TM1.Process的处理原则,以便后续使用层级来搜索启动关系,从而区分同次
        '''
        def parse_start(curr_desc):
            # 检查是否符合启动匹配
            match=chore_start.match(curr_desc)
            if  match:
                chore_name=match.group('CHORE_NAME')
                return True,chore_name

            # 此时不匹配启动
            return False,None

        def parse_finish(curr_desc):
            # 检查是否为 执行完成 匹配
            match=chore_finish1.match(curr_desc)
            if  match:
                chore_name=match.group('CHORE_NAME')
                duration=float(match.group('DURATION'))
                return_code=1
                return True,chore_name,duration,return_code

            match=chore_finish2.match(curr_desc)
            if  match:
                chore_name=match.group('CHORE_NAME')
                duration=None
                return_code=0
                return True,chore_name,duration,return_code

            # 此时不匹配完成
            return False,None,None,None

        def parse_ignore(curr_desc):
            # 检查是否为 执行完成 匹配
            if  (   chore_ignore1.match(curr_desc)
                 or chore_ignore2.match(curr_desc)
                 or chore_ignore3.match(curr_desc)
                 or chore_ignore4.match(curr_desc)
                ):
                # 此时是 正常执行完成
                return True

            # 此时不匹配忽略
            return False

        def add_start(line_number,curr_timestamp,description,chore_name):
            # 添加chore对象的实例,由于chore实际是一种计划任务,为对象启动父子关系中的最高层级,实际是根节点
            # 所以是简单添加
            if chore_name not in self.chore_list:
                # 表示这个记录是 这个chore的初次记录
                # 最后一个为list对象,以便在chore完成时,用于子对象搜索,关闭子对象
                # 倒数第2个为return_code状态
                self.chore_list[chore_name]=[(line_number,chore_name,
                                              curr_timestamp,None,None,{line_number:description},
                                              None,[],
                                              part1,part2
                                             )]
            else:
                # 在chore_list中按照启动时间排序的位置,插入当前记录
                # 遵循的原则是如果没有开始时间,则结束时间早的一定排在待插入记录之前
                # 如果没有开始时间,但结束时间在此启动时间之后,则对于待插入的数据,可以插入在前,也可以插入在后
                # 这里简单追加list,然后执行原地排序
                self.chore_list[chore_name].append((line_number,chore_name,
                                                    curr_timestamp,None,None,{line_number:description},
                                                    None,[],
                                                    part1,part2
                                                   ))
                # 如果begin_time 开始时间存在,则使用开始时间, 否则使用end_time结束时间
                self.chore_list[chore_name].sort(key=lambda r: r[2] if r[2] else r[3])

            return

        def add_finish(line_number,curr_timestamp,description,chore_name,duration,return_code):
            # 关闭chore对象,查找对应的实例,设置状态,同时依据子对象结构,关闭ti_list中的子对象
            # 此时添加chore结束 的记录
            if chore_name not in self.chore_list:
                # 此时日志文件不完整,或曾经被删除过,导致启动的记录存在缺失
                # 没有子对象,不关闭子对象
                self.chore_list[chore_name]=[(line_number,chore_name,
                                              None,curr_timestamp,duration,{line_number:description},
                                              return_code,[],
                                              part1,part2
                                             )]
            else:
               # 查找可能匹配的同次chore执行的记录,找到后,则填充那个执行日志的结束时间,状态等,
               # duration可以用于辅助判断是否为同次chore执行
               # 查找机制是查开始时间是last_savepoint之后,在curr_timestamp时间戳之前的日志且没有结束的,
               # 此外如果是正常结束,时长已知的,同时校核一下时长,若时长不正确,则继续搜索下一个
               # 若没有找到,则表示解析又问题,提示的同时,新增加一个日志记录

               # 由于python 3.9中bisect.bisect_left没有包含key参数,所以前面使用其他方式来操作
               time_list=[r[2] if r[2] else r[3] for r in self.chore_list[chore_name]]

               # 暂时从头开始查找,后续视情况再改善效率
               position_begin=0
               position_end=bisect.bisect_right(time_list,curr_timestamp)

               for i in range(position_begin,position_end):
                   # 逐个查找过滤,要求数据是
                   (temp_line_number,temp_chore_name,
                    temp_begin_time,temp_end_time,temp_ignore_duration,temp_description,
                    temp_return_code,temp_child_nodes,
                    temp_part1,temp_part2
                   )=self.chore_list[chore_name][i]
                   # 考虑return_code状态为0和1的情况
                   # 1匹配到有时长的完成,其后继续释放资源,进入状态0,0表示匹配到最终完成,
                   # 如果temp_return_code为0, 则要么完成匹配了,或者是异常的情况,无开始时间,但状态为0,那个对象一定关闭了
                   if  (   temp_return_code==0 
                        or temp_begin_time is None 
                        or temp_part1!=part1 and temp_part2!=part2 
                       ) :
                       continue

                   temp_duration=(curr_timestamp-temp_begin_time).days*86400+ \
                                 (curr_timestamp-temp_begin_time).seconds+ \
                                 (curr_timestamp-temp_begin_time).microseconds*1.0/1000/1000
                   
                   # 现在使用part1,part2来判断,取消原来判断方法
                   # 2022.12.25
                   
                   # 当执行时间很短时,不能简单使用%来判断所以这里使用1秒以下的,和1秒以上的
                   # duration is None,表示本次操作是最终释放资源
                   # if  (   duration is None
                   #      or duration>=1 and temp_duration>=0.9*duration and temp_duration<1.1*duration
                   #      or duration<1 and temp_duration<1
                   #     ):
                   
                   # 采用找到的第一个
                   # 为了便于核对,保留启动时和行号和描述,追加当前行号和描述
                   # 后续使用json来保存description数据
                   temp_description[line_number]=description
                   self.chore_list[chore_name][i]=(temp_line_number,chore_name,
                                                   temp_begin_time,curr_timestamp,temp_duration,temp_description,
                                                   return_code,temp_child_nodes,
                                                   temp_part1,temp_part2
                                                  )
                   # 对所有子节点对象关闭实例,并设置结束状态为4
                   self.close_child_nodes(temp_child_nodes,4)
                   return

               # 此时没有找到匹配
               # 没有子对象,不关闭子对象
               logger.warning("没有找到对应的启动记录,当前操作完成日志为 {} {} {}".format(line_number,curr_timestamp,description))
               self.chore_list[chore_name].append((line_number,chore_name,
                                                   None,curr_timestamp,duration,{line_number:description},
                                                   return_code,[]
                                                  ))
              
            return

        # 当前所使用的变量初始化
        line_number=curr_value['LINE']
        curr_timestamp=curr_value['CURR_TIMESTAMP']
        description=curr_value['CURR_DESC']
        part1=curr_value['PART1']
        part2=curr_value['PART2']

        # 检查是否为启动 匹配
        status,chore_name=parse_start(description)
        if  status:
            # 此时添加到chore_list中
            # 此时已解析 chore_name
            add_start(line_number,curr_timestamp,description,chore_name)
            return

        # 检查是否为完成 匹配
        status,chore_name,duration,finish_code=parse_finish(description)
        if  status:
            # 此时添加到chore_list中
            # 此时已解析 chore_name,duration
            add_finish(line_number,curr_timestamp,description,chore_name,duration,finish_code)
            return

        # 检查是否为可忽略的匹配
        status=parse_ignore(description)
        if  status:
            # 这里均为忽略
            return

        # 这种情况表示是未知的情况,警告输出
        logger.warning("UNKNOWN TM1.Chore %d , %s"%(line_number,description))
        return

    def process_ti_data(self,curr_value,last_info):
        '''
        处理ti的数据.主要是对描述信息解析,将处理的结果保存到ti_list数据结构中
        ti_list结构是 {"ti名称":[按照启动时间排序,同ti名称的不同次执行过程列表],[]...}
        ti_name ：用于对应志勇他们程序中名称
        状态: 启动状态,终止状态,还是其他
        启动状态时：解析出 启动程序,启动用户
        终止状态时：判断出终止状态.是异常终止,还是带警告的终止,或正常结束.同时获取涉及的详细信息记录文件,文件的名称通过解析简短描述
                    中<> 的内容来获取.通过日志文件发现,每次描述信息中<>仅出现1次

        填充ti数据,大致处理过程
        1. 判断是开始执行还是执行完成或退出,操作时间,按照不同情况,记录对应的各种数据
        2. 查找字典结构中此ti名称的分析,
        3. 如果当前行是开始执行,则在此ti名称的分析list中按照启动时间顺序增加一行.
           同一次ti操作日志输出会按照时间排序,但不同次,或不同ti执行程序会由于不同线程线程操作,或者写入机制导致输出时间顺序不同情况
        4. 如果是执行返回,则按照下面原理查找这个对应的同次执行.
           1. 同次执行 日志文件中开始位置一定在结束位置之前,
           2. 不同次的 同执行程序,执行顺序与记录顺序可能不同.
              但结束时间在某次执行开始时间前的,则在list位置中在前
        5. 如果处理分析有违背此分析原则的,则提示,以便后续查找原因

        '''
        def parse_start(curr_desc):
            # 检查是否符合3种启动匹配
            match=ti_start1.match(curr_desc)
            if  match:
                ti_name=match.group('TI_NAME')
                run_from_process_type='TM1.Process'
                run_from_process_name=match.group('RUN_FROM_PROCESS_NAME')
                executed_by_user=match.group('EXECUTED_BY_USER')
                return True,ti_name,run_from_process_type,run_from_process_name,executed_by_user

            match=ti_start2.match(curr_desc)
            if  match:
                ti_name=match.group('TI_NAME')
                run_from_process_type='TM1.Chore'
                run_from_process_name=match.group('RUN_FROM_PROCESS_NAME')
                executed_by_user=None
                return True,ti_name,run_from_process_type,run_from_process_name,executed_by_user

            match=ti_start3.match(curr_desc)
            if  match:
                ti_name=match.group('TI_NAME')
                run_from_process_type=None
                run_from_process_name=None
                executed_by_user=match.group('EXECUTED_BY_USER')
                return True,ti_name,run_from_process_type,run_from_process_name,executed_by_user

            # 此时不匹配启动
            return False,None,None,None,None

        def parse_finish(curr_desc):
            # 包含正常执行完成,
            # 检查是否为 执行完成 匹配
            match=ti_finish1.match(curr_desc)
            if  match:
                # 此时是 正常执行完成
                ti_name=match.group('TI_NAME')
                duration=float(match.group('DURATION'))
                return_code=0
                message_file=None
                return True,ti_name,duration,return_code,message_file

            match=ti_finish2.match(curr_desc)
            if  match:
                # 此时是 带警告的执行完成
                ti_name=match.group('TI_NAME')
                duration=None
                return_code=1
                message_file=match.group('MESSAGE_FILE')
                return True,ti_name,duration,return_code,message_file

            # 检查是否为 执行终止 匹配
            match=ti_terminate.match(curr_desc)
            if  match:
                # 此时是 执行终止
                ti_name=match.group('TI_NAME')
                duration=None
                return_code=2
                message_file=match.group('MESSAGE_FILE')
                return True,ti_name,duration,return_code,message_file

            # 检查是否为 取消执行 匹配
            match=ti_quit.match(curr_desc)
            if  match:
                # 此时是 取消执行
                ti_name=match.group('TI_NAME')
                duration=None
                return_code=3
                message_file=None
                return True,ti_name,duration,return_code,message_file

            # 此时不匹配完成
            return False,None,None,None,None

        def parse_unknown(curr_desc):
            # 处理方法与完成的有所区别,主要是某些情况下,启动和这个退出合并为1次出现
            # 由于ti_unknown1需要将信息增加到最近启动结构中,ti_unknown2需要视为结束,ti_unknown3维持原有方法不变
            # 所以增加返回参数,用于区别对待
            match=ti_unknown1.match(curr_desc)
            if  match:
                ti_name=match.group('TI_NAME')
                return True,ti_name,1

            match=ti_unknown2.match(curr_desc)
            if  match:
                ti_name=match.group('TI_NAME')
                return True,ti_name,2

            match=ti_unknown3.match(curr_desc)
            if  match:
                ti_name=match.group('TI_NAME')
                return True,ti_name,3

            return False,None,None

        def add_start(line_number,curr_timestamp,description,ti_name,run_from_process_type,run_from_process_name,executed_by_user):
            # 判断文件在上次分析结束前的记录是否被修改过,如果修改过,则需要判断是否简单的删除上次分析后可以删除的部分,
            # 如果是,则可能仅分析后面新产生的日志,从而提高分析性能.否则基本无法提高分析性能

            # 判断是否存在上次保存点之前启动,但日志记录很晚的现象.
            # 正常不应该存在此现象,但如果这个程序线程被阻塞或类似原因,造成日志无法及时写入日志文件,则可能存在此现象

            # 考虑启动,执行完成的描述信息都需要保存,所以使用json来保存,以便后续复原
            if (    last_info
                and last_info.save_point_line
                and line_number>last_info.save_point_line
                and last_info.save_point_time
                and curr_timestamp<last_info.save_point_time
               ):
                logger.warning("启动时间早于上次保存点,但启动记录位置晚于上次的保存点 {} {} {}".format(line_number,curr_timestamp,description))

            # 依据run_from_process_type,run_from_process_name 查找父对象的实例启动时间,同时在父对象实例中注册自身实例名称和启动时间
            parent_begin_time=self.register_nodes(run_from_process_type,run_from_process_name,ti_name,curr_timestamp,part1,part2)
            
            # 此时添加ti启动的记录
            # 最后一列增加子节点对象列表,实现父子关系搜索
            if ti_name not in self.ti_list:
                # 表示这个记录是 这个ti的初次记录
                self.ti_list[ti_name]=[(line_number,ti_name,
                                        run_from_process_type,run_from_process_name,parent_begin_time,executed_by_user,
                                        curr_timestamp,None,None,{line_number:description},
                                        None,None,[],
                                        part1,part2
                                       )]
            else:
                # 在ti_list中按照启动时间排序的位置,插入当前记录
                # 遵循的原则是如果没有开始时间,则结束时间早的一定排在待插入记录之前
                # 如果没有开始时间,但结束时间在此启动时间之后,则对于待插入的数据,可以插入在前,也可以插入在后
                # 这里简单追加list,然后执行原地排序
                self.ti_list[ti_name].append((line_number,ti_name,
                                              run_from_process_type,run_from_process_name,parent_begin_time,executed_by_user,
                                              curr_timestamp,None,None,{line_number:description},
                                              None,None,[],
                                              part1,part2
                                             ))
                # 如果begin_time 开始时间存在,则使用开始时间, 否则使用end_time结束时间
                self.ti_list[ti_name].sort(key=lambda r: r[6] if r[6] else r[7])
            
            return

        def add_finish(line_number,curr_timestamp,description,ti_name,duration,return_code,message_file):
            # 此时添加ti结束 的记录
            if ti_name not in self.ti_list:
                # 此时日志文件不完整,或曾经被删除过,导致启动的记录存在缺失
                self.ti_list[ti_name]=[(line_number,ti_name,
                                        None,None,None,None,
                                        None,curr_timestamp,duration,{line_number:description},
                                        return_code,message_file,[],
                                        part1,part2
                                       )]
            else:
               # 查找可能匹配的同次ti执行的记录,找到后,则填充那个执行日志的结束时间,状态等,
               # duration可以用于辅助判断是否为同次ti执行
               # 查找机制是查开始时间是last_savepoint之后,在curr_timestamp时间戳之前的日志且没有结束的,
               # 此外如果是正常结束,时长已知的,同时校核一下时长,若时长不正确,则继续搜索下一个
               # 若没有找到,则表示解析有问题,提示的同时,新增加一个日志记录

               # 由于python 3.9中bisect.bisect_left没有包含key参数,所以前面使用其他方式来操作
               time_list=[r[6] if r[6] else r[7] for r in self.ti_list[ti_name]]

               # 暂时从头开始查找,后续视情况再改善效率
               position_begin=0
               position_end=bisect.bisect_right(time_list,curr_timestamp)

               for i in range(position_begin,position_end):
                   # 逐个查找过滤,要求数据是
                   (temp_line_number,temp_ti_name,
                    temp_run_from_process_type,temp_run_from_process_name,temp_executed_by_user,temp_parent_begin_time,
                    temp_begin_time,temp_end_time,temp_ignore_duration,temp_description,
                    temp_return_code,temp_ignore_message,temp_child_nodes,
                    temp_part1,temp_part2
                   )=self.ti_list[ti_name][i]
                   # 如果这个数据结束时间已经有了,或 开始时间为空 说明这个是一个完成记录
                   # 如果temp_return_code为-1,或其他非None的值,都表示这个数据已经记录完整了
                   # 相同的对象,part1,part2是相同的
                   if  temp_return_code is not None or temp_part1!=part1 and temp_part2!=part2:
                       continue
                   # 由于Process "A02年度预算_B0601、F0502部门费用预算" run from process "A00系统_更新年度预算实际数" by user "R*A00系统_更新年度预算实际数"
                   # 存在显示2次,导致跨天匹配的问题,所以进行天的比对
                   # 2022.12.21
                   temp_duration=(curr_timestamp-temp_begin_time).days*86400+ \
                                 (curr_timestamp-temp_begin_time).seconds+ \
                                 (curr_timestamp-temp_begin_time).microseconds*1.0/1000/1000
                   # 当执行时间很短时,不能简单使用%来判断所以这里使用1秒以下的,和1秒以上的
                   # 使用part1,part2来判断同一对象，所以这里取消时长的判断
                   # duration is None表示非正常执行完成情况
                   # if  (   duration is None
                   #      or duration>=1 and temp_duration>=0.9*duration and temp_duration<1.1*duration
                   #      or duration<1 and temp_duration<1
                   #     ):
                   
                   # 此时不是正常完成的,没有时长,认为是找到的第一个
                   # 或者时长匹配
                   # 为了便于核对,保留启动时和行号和描述,追加当前行号和描述

                   # str(temp_line_number)+':'+temp_description+'\n'+ str(line_number)+':'+description
                   # 后续使用json来保存description数据
                   temp_description[line_number]=description
                   self.ti_list[ti_name][i]=(temp_line_number,ti_name,
                                             temp_run_from_process_type,temp_run_from_process_name,temp_executed_by_user,temp_parent_begin_time,
                                             temp_begin_time,curr_timestamp,temp_duration,temp_description,
                                             return_code,message_file,temp_child_nodes,
                                             temp_part1,temp_part2
                                            )
                   # 由于不确定ti执行时,是否等待子对象完成,所以不处理关闭子节点对象,仅在chore对象关闭时,关闭子节点
                   # self.close_child_nodes(temp_child_nodes,4)
                   return

               # 此时没有找到匹配
               logger.warning("没有找到对应的启动记录,当前操作完成日志为 {} {} {}".format(line_number,curr_timestamp,description))
               self.ti_list[ti_name].append((line_number,ti_name,
                                             None,None,None,None,
                                             None,curr_timestamp,duration,{line_number:description},
                                             return_code,message_file,[],
                                             part1,part2
                                            ))
            return

        def add_extendinfo(line_number,curr_timestamp,description,ti_name):
            # 此时查到最可能的ti,将额外信息追加到描述中
            # 由于是额外信息,可以暂时不实现
            # 2022.12.21
            pass
            return

        def add_unknown(line_number,curr_timestamp,description,ti_name):
            # 此时添加 不是 启动和结束,但可以解析出ti_name的 记录
            # 此时return_code为-1,表示未知的方式,这个暂时单独一行,后续分析后,视情况修改代码,决定如何处理
            # 将当前时间戳,作为此事件发生的时间,填写到begin_time中
            return_code=-1
            if ti_name not in self.ti_list:
                self.ti_list[ti_name]=[(line_number,ti_name,
                                        None,None,None,None,
                                        curr_timestamp,curr_timestamp,None,{line_number:description},
                                        return_code,None,[],
                                        part1,part2
                                       )]
            else:
                self.ti_list[ti_name].append((line_number,ti_name,
                                              None,None,None,None,
                                              curr_timestamp,curr_timestamp,None,{line_number:description},
                                              return_code,None,[],
                                              part1,part2
                                             ))
                # 如果begin_time 开始时间存在,则使用开始时间, 否则使用end_time结束时间
                self.ti_list[ti_name].sort(key=lambda r: r[6] if r[6] else r[7])
            return

        # 当前所使用的变量初始化
        line_number=curr_value['LINE']
        curr_timestamp=curr_value['CURR_TIMESTAMP']
        description=curr_value['CURR_DESC']
        part1=curr_value['PART1']
        part2=curr_value['PART2']

        # 检查是否为启动 匹配
        status,ti_name,run_from_process_type,run_from_process_name,executed_by_user=parse_start(description)
        if  status:
            # 此时添加到ti_list中
            # 此时已解析 ti_name,run_from_process_type,run_from_process_name,executed_by_user
            add_start(line_number,curr_timestamp,description,ti_name,run_from_process_type,run_from_process_name,executed_by_user)
            return

        # 检查是否为完成 匹配 ,包含 正常执行完成,带警告的执行完成, 异常终止, 取消执行
        status,ti_name,duration,return_code,message_file=parse_finish(description)
        if  status:
            # 此时添加到ti_list中
            # 此时已解析 ti_name,duration,return_code,message_file
            add_finish(line_number,curr_timestamp,description,ti_name,duration,return_code,message_file)
            return

        # 检查是否为 可以匹配出ti_name的情况
        status,ti_name,unknown_code=parse_unknown(description)
        if  status:
            # 状态1 ,需要将信息补充到ti的描述信息中
            if  unknown_code==1:
                add_extendinfo(line_number,curr_timestamp,description,ti_name)
                return
            # 状态2, 按照退出方式来处理,等同于 2--执行已中止
            if  unknown_code==2:
                add_finish(line_number,curr_timestamp,description,ti_name,None,2,None)
                return
            # 状态3,维持原来处理方式不变
            # 此时添加到ti_list中,处理方法与完成的有所区别,主要是某些情况下,启动和这个退出合并为1次出现
            if  unknown_code==3:
                add_unknown(line_number,curr_timestamp,description,ti_name)
                return

        # 检查是否为可忽略的情况
        match=ti_ignore.match(description)
        if  match :
            # 此时表示直接忽略
            return

        # 这种情况表示是未知的情况,警告输出
        logger.warning("UNKNOWN TM1.Process %d , %s"%(line_number,description))
        return

    def get_ti_list(self,last_info):
        '''
        假设savepoint之前的内容可能会删除,所以这里仅取此后启动的,但为了对比分析,如何判断此前启动的
        这里意味着需要判断上次分析后,文件是否仅仅追加,而没有删除,通过比对上次的位置和描述信息是否一致来判断
        如果有不一致的情况,则在本次分析的描述中记录

        如果以前没有分析过,则按照行号排序后,行号最小的,对应数据作为本次开始,
        逐个检查状态,直到状态为None的数据,此时这个数据是启动状态,这个时间是savepoint,保留其对应的数据为本次savepoint
        继续检查,直到数据尾部,此时数据为本次分析结束点
        如果前次分析存在,且文件仅追加,或符合要求,不产生歧义,
        则排序后,向后查找,直至找到上次的保存点, 随后以此时间为本次开始,
        随后重复前面步骤,找到savepoint,和结束点
        若记录存在可能的歧义,则仍然按照这个方式来处理,但在分析概述中说明即可
        '''
        def search_positions(position_time,position_desc):
            # 查找符合时间戳和描述内容的日志位置
            # 数据不多,暂时顺序检查
            for temp_i in range(len(all_ti_lists)):
                temp_line,temp_time,temp_desc=get_line_info(temp_i)
                if temp_time==position_time and temp_desc==position_desc:
                    return temp_i

            return None

        def get_line_info(i):
            # 获取这个分析位置的信息
            (temp_times,
             temp_line_number,temp_ti_process_type,temp_ti_name,
             temp_run_from_process_type,temp_run_from_process_name,temp_parent_begin_time,temp_executed_by_user,
             temp_begin_time,temp_end_time,temp_duration,temp_description,
             temp_return_code,temp_message_file
            )=all_ti_lists[i]
            # 获取这个位置的行号
            position_line=temp_line_number
            # 获取这个位置的时间戳
            # 由于可能存在仅有结束时间,没有启动时间的日志,所以这种现象时,获取这个位置的结束时间
            if temp_begin_time:
                  position_time=temp_begin_time
            else:
                position_time=temp_end_time

            # 保存描述信息时,需要分离出原始行号对应数据,以便于核对
            position_desc=temp_description[temp_line_number]

            return position_line,position_time,position_desc

        # 将分析的数据转为list
        all_ti_lists=[]

        # 增加数据库启动过程记录,便于后续核对
        analysis_description=""
        for i in self.started_db:
            (db_started_line,db_started_time,db_shutdowned_line,db_shutdowned_time,
             ti_list,chore_list
            )=self.started_db[i]
            analysis_description=analysis_description+ \
                                 "启动行号: {} 启动时间: {}, 关闭行号: {} 关闭时间: {}\n".format(db_started_line,db_started_time,
                                      db_shutdowned_line,db_shutdowned_time)
            for curr_ti_list in ti_list.values():
                for (curr_line_number,curr_ti_name,
                     curr_run_from_process_type,curr_run_from_process_name,curr_parent_begin_time,curr_executed_by_user,
                     curr_begin_time,curr_end_time,curr_duration,curr_description,
                     curr_return_code,curr_message_file,ignore_child_nodes,
                     curr_part1,curr_part2
                    ) in curr_ti_list:
                    # 增加ti_process_type列,引入执行对象类型TM1.Chore,TM1.Process等,以便后续使用层级来搜索启动关系
                    all_ti_lists.append((i,
                                         curr_line_number,"TM1.Process",curr_ti_name,
                                         curr_run_from_process_type,curr_run_from_process_name,curr_parent_begin_time,curr_executed_by_user,
                                         curr_begin_time,curr_end_time,curr_duration,curr_description,
                                         curr_return_code,curr_message_file
                                        ))

            # 这里处理TM1.Chore对象
            for curr_chore_list in chore_list.values():
                for (curr_line_number,curr_chore_name,
                     curr_begin_time,curr_end_time,curr_duration,curr_description,
                     curr_return_code,ignore_child_nodes,
                     curr_part1,curr_part2
                    ) in curr_chore_list:
                    all_ti_lists.append((i,
                                         curr_line_number,"TM1.Chore",curr_chore_name,
                                         None,None,None,None,
                                         curr_begin_time,curr_end_time,curr_duration,curr_description,
                                         curr_return_code,None
                                        ))
            
        # 按照行位置排序
        all_ti_lists.sort(key=lambda r: r[1] )

        # 假设文件仅仅追加,或者文件从头删除到上次保存点之前的某位置
        # 查找本次开始的位置
        # 如果上次保存点未找到,或文件删除了部分内容,予以描述
        if last_info is None:
            # 此时是首次分析
            i,curr_analysis_id=0,1
        else:
            curr_analysis_id=last_info.analysis_id+1
            # 查找从上次分析的savepoint的位置
            # 并判断自上次分析后,是否删除,且可能删除的行数
            i=search_positions(last_info.save_point_time,last_info.save_point_desc)
            if i is None:
                i=0
                analysis_description=analysis_description+'上次的保存点未找到,重新开始分析文件'
            else:
                # 检查此位置对应数据
                if all_ti_lists[i][1]!=last_info.save_point_line:
                    analysis_description=analysis_description + \
                        '上次的保存点找到,但文件删除了 {} 行'.format(last_info.save_point_line-all_ti_lists[i][1])

        # 获取本次开始分析位置的信息
        curr_begin_position_line,curr_begin_position_time,curr_begin_position_desc=get_line_info(i)

        # 查找savepoint,这里查询最后一次数据库启动的分析
        imax_times=max(self.started_db.keys())
        (db_started_line,db_started_time,db_shutdowned_line,db_shutdowned_time,
         *ignore_items
        )=self.started_db[imax_times]

        for m in range(i,len(all_ti_lists)):
            # 要求是最后一次数据库启动的,且状态为空的数据为savepoint
            # 将TM1.Chore作为保存点的要素,无返回值或今天凌晨后启动的chore
            if  (    all_ti_lists[m][0]==imax_times 
                 and all_ti_lists[m][-2] is None 
                 and all_ti_lists[m][2]=="TM1.Chore"
                  or all_ti_lists[m][8].date()>=datetime.datetime.now().date()
                 and all_ti_lists[m][2]=="TM1.Chore"
                ):
                break

        curr_save_point_line,curr_save_point_time,curr_save_point_desc=get_line_info(m)

        curr_end_position_line,curr_end_position_time,curr_end_position_desc=get_line_info(-1)

        # 获取本次分析的详细数据
        result=all_ti_lists[i:]

        return ANALYSIS_INFO(curr_analysis_id,
                             db_started_line,db_started_time,db_shutdowned_line,db_shutdowned_time,
                             curr_begin_position_line,curr_begin_position_time,curr_begin_position_desc,
                             curr_end_position_line,curr_end_position_time,curr_end_position_desc,
                             curr_save_point_line,curr_save_point_time,curr_save_point_desc,
                             analysis_description
                            ),result

    def close_child_nodes(self,child_nodes,close_code):
        '''
        关闭子节点对象ti实例，并且设置关闭的状态.通过递归调用实现所有子节点对象关闭
        由于假设ti关闭时，其子ti不一定关闭,但chore(将其理解为调度任务,类似cron)关闭时，所有的子ti一定都关闭
        '''
        if child_nodes is None or len(child_nodes)==0:
            return
        for (ti_name,begin_time) in child_nodes:
            # 查询这个ti在self.ti_list中的实例位置
            # 若其存在子节点,则递归调用
            for i in range(len(self.ti_list[ti_name])):
                (curr_line_number,curr_ti_name,
                 curr_run_from_process_type,curr_run_from_process_name,curr_parent_begin_time,curr_executed_by_user,
                 curr_begin_time,curr_end_time,curr_duration,curr_description,
                 ignore_return_code,curr_message_file,curr_child_nodes,
                 curr_part1,curr_part2
                )=self.ti_list[ti_name][i]
                # 简单使用顺序扫描,检查是否是这个时间启动的对象实例
                if  begin_time==curr_begin_time :
                    if  ignore_return_code is None:
                        # 若这个子对象没有关闭,则设置为特殊的状态来关闭
                        # 已关闭则不处理
                        self.ti_list[ti_name][i]=(curr_line_number,curr_ti_name,
                            curr_run_from_process_type,curr_run_from_process_name,curr_parent_begin_time,curr_executed_by_user,
                            curr_begin_time,curr_end_time,curr_duration,curr_description,
                            close_code,curr_message_file,curr_child_nodes,
                            curr_part1,curr_part2
                            )
                    self.close_child_nodes(curr_child_nodes,close_code)
                    break
        return

    def register_nodes(self,parent_process_type,parent_process_name,ti_name,begin_time,part1,part2):
        '''
        子ti向父对象注册
        '''
        if  parent_process_type is None or parent_process_name is None:
            return None
            
        if parent_process_type=='TM1.Chore':
            # 父对象是TM1.Chore
            for i in range(len(self.chore_list[parent_process_name])):
                (curr_line_number,curr_chore_name,
                 curr_begin_time,curr_end_time,curr_duration,curr_description,
                 curr_return_code,curr_child_nodes,
                 curr_part1,curr_part2
                )=self.chore_list[parent_process_name][i]
                if  (    curr_return_code is None and curr_begin_time<=begin_time 
                     and curr_part1==part1 and curr_part2==part2
                    ) :
                    # 这个状态表示未结束，且启动时间早于这个子对象的启动时间
                    curr_child_nodes.append((ti_name,begin_time))
                    self.chore_list[parent_process_name][i]=(curr_line_number,curr_chore_name,
                            curr_begin_time,curr_end_time,curr_duration,curr_description,
                            curr_return_code,curr_child_nodes,
                            curr_part1,curr_part2
                            )
                    return curr_begin_time
            logger.warning("父节点未找到 {} {}".format(ti_name,begin_time))
            return None
        else:
            # 父对象是TM1.Process
            for i in range(len(self.ti_list[parent_process_name])):
                (curr_line_number,curr_ti_name,
                 curr_run_from_process_type,curr_run_from_process_name,curr_parent_begin_time,curr_executed_by_user,
                 curr_begin_time,curr_end_time,curr_duration,curr_description,
                 curr_return_code,curr_message_file,curr_child_nodes,
                 curr_part1,curr_part2
                )=self.ti_list[parent_process_name][i]
                if  (    curr_return_code is None and curr_begin_time<=begin_time 
                     and curr_part1==part1 and curr_part2==part2
                    ):
                    # 这个状态表示未结束，且启动时间早于这个子对象的启动时间
                    curr_child_nodes.append((ti_name,begin_time))
                    self.ti_list[parent_process_name][i]=(curr_line_number,curr_ti_name,
                         curr_run_from_process_type,curr_run_from_process_name,curr_parent_begin_time,curr_executed_by_user,
                         curr_begin_time,curr_end_time,curr_duration,curr_description,
                         curr_return_code,curr_message_file,curr_child_nodes,
                         curr_part1,curr_part2
                        )
                    return curr_begin_time
            logger.warning("父节点未找到 {} {}".format(ti_name,begin_time))
            return None
            
def get_last_analysis():
    '''
    获取前一次的分析的analysis_id, savepoint等信息,使用命名元组返回
    1. 按照analysis_id排序,查找最后一次分析的编号和savepoint时间
    2. 如果是首次分析,则返回 None
    '''

    try:
        conn=cx_Oracle.connect(BGDB_CONN_STR)
        cur=conn.cursor()
        # 使用命名元组的_fields属性来获取需要查询的列名
        # 这里不包含分析开始,分析结束,分析消耗cpu 这3个统计信息
        sql='select '+','.join(ANALYSIS_INFO._fields)+' from T_tm1slog' + chr(10) + \
            ' order by analysis_id desc'

        cur.execute(sql)

        # 这里仅提取1条数据,实际是返回最大的
        res=cur.fetchone()
        cur.close()
    except :
        stack = traceback.format_exc()
        logger.error( 'exception stack: %s' % stack)
    finally:
        conn.close()

    if res:
        return ANALYSIS_INFO(*res)
    else:
        return None

def fill_ti_data(curr_info,analysis_begin,analysis_end,cpu_time,ti_list):
    '''
    将分析后的数据保存到T_tm1slog,t_tm1slog_ti中,这里需要针对ti错误信息文件,返回每个错误的详细信息,而在ti文法解析阶段,
    不读取文件内容,仅到保存到数据库的阶段,才读取文件内容
    Args:
        curr_info       :本次分析信息,包含编号,时间戳,行等记录信息,数据类型为ANALYSIS_INFO
        analysis_begin  :本次分析,进程开始分析时,对应时刻
        analysis_end    :本次分析,进程结束分析时,对应时刻
        cpu_time        :进程消耗的cpu时间,用于查看性能情况,这3个为分析进程统计信息
        ti_list         :需要保存的ti数据
    Returns:
    Raises:
       无抛出异常,仅在遇到异常时,终端中提示
    '''
    # 如果没有需要保存的数据,直接返回
    if curr_info is None and ti_list is None:
        return

    try:
        conn=cx_Oracle.connect(BGDB_CONN_STR)
        conn.begin()

        cur=conn.cursor()
        cur.arraysize=BATCH_SIZE
        cur.bindarraysize=BATCH_SIZE

        # 插入本次分析的概述信息
        # 获取表中列的数据类型
        sql='select * from T_tm1slog where 1=0'
        cur.execute(sql)

        #获取每列的变量类型
        db_types = (d[1] for d in cur.description)
        # cur.description 描述了各返回列的描述,包含名称,类型等
        # oracle不支持?作为绑定变量,所以这里使用位置绑定变量
        col_parameters=",".join(":"+str(i) for i in range(len(cur.description)))

        # 构成插入语句
        sql="insert into T_tm1slog " +chr(10)+ \
            " values("+col_parameters+")"

        cur.prepare(sql)
        # 设置每列的数据类型
        cur.setinputsizes(*db_types)
        # 这里仅插入1条分析概述数据
        cur.execute(sql,(*curr_info,analysis_begin,analysis_end,cpu_time))

        # 插入ti数据
        sql='select * from t_tm1slog_ti where 1=0'
        cur.execute(sql)
        db_types = (d[1] for d in cur.description)
        col_parameters=",".join(":"+str(i) for i in range(len(cur.description)))

        sql="insert into t_tm1slog_ti "+chr(10)+ \
            " values("+col_parameters+")"
        cur.prepare(sql)
        # 设置每列的数据类型
        cur.setinputsizes(*db_types)

        # 由于cx_oracle没有cur.executemany类似hana中设置batch size,所以需要使用循环来判断
        # cx_oracle中cur.executemany行为是1次执行客户端中不判断插入的数据行数和cur.arraysize的关系
        data_arr=[]
        # 检查是否为空
        if  ti_list :
            for (i_start_times,
                 line_number,ti_process_type,ti_name,
                 run_from_process_type,run_from_process_name,parent_begin_time,executed_by_user,
                 begin_time,end_time,duration,description,
                 return_code,message_file
                ) in ti_list:
                # 这里需要针对ti错误信息文件,返回每个错误的详细信息
                if  message_file:
                    try:
                        with open(message_file,'rt',encoding='utf-8') as f:
                            return_message=f.read()
                    except :
                        logger.warning("%s 文件不存在"%message_file)
                else:
                    return_message=None

                # 填充数组
                # 描述信息description实际包含了行号和实际描述组成的字典,使用json.dumps来操作
                # 由于包含中午,所以使用ensure_ascii=False
                data_arr.append((curr_info.analysis_id,
                                 line_number,ti_process_type,ti_name,
                                 run_from_process_type,run_from_process_name,parent_begin_time,executed_by_user,
                                 begin_time,end_time,duration,
                                 json.dumps(description,separators=(',',':'),ensure_ascii=False),
                                 return_code,return_message
                                ))
                if  len(data_arr) ==BATCH_SIZE:
                    cur.executemany(sql,data_arr)
                    data_arr=[]
            if  data_arr:
                cur.executemany(sql,data_arr)
        conn.commit()
    except :
        stack = traceback.format_exc()
        logger.error('exception stack: %s' % stack)
        conn.rollback()
    finally:
        conn.close()

# 处理函数
def process_tm1server_log():
    '''
    处理tm1 server log大致过程
    1. 从数据库中查找上次的savepoint和analysis_id
    2. 建立TM1_ServerLog, 并从此savepoint之后开始过滤ti TM1.Process 的日志
    3. 返回本次处理的begin_position,end_position,savepoint,以及ti_list
    4. 保存数据到数据库中
    由于tm1server.log存在并发写入,由不同线程操作,同时分析程序和tm1s数据库日志操作也存在同时并发的情况
    处理时,不考虑日志被截断（或类似情况）,仅假设为追加操作.不考虑使用文件加锁方式,但需要考虑文件内容正在写入时,读出的情况
    '''
    # 获取前一次的分析的analysis_id, savepoint,若为初次分析,则返回0,None
    last_info=get_last_analysis()

    # 获取进程执行的统计信息
    analysis_begin=datetime.datetime.now()
    cpu_begin=time.process_time()

    # 考虑为了尽可能的得到执行结构的输出,防止日志输出方式和自己理解的不一致,对于异常方式,均输出提示.方便理解日志,修改代码
    # 因此这里是全部日志文件解析,后续利用last_savepoint过滤需要的分析结果
    obj_serverlog = TM1_ServerLog()
    obj_serverlog.parse(last_info)
    # 获取本次处理的begin_position,end_position,savepoint,以及ti_list
    curr_info,ti_list=obj_serverlog.get_ti_list(last_info)

    # 获取进程执行的统计信息
    analysis_end=datetime.datetime.now()
    cpu_time=time.process_time()-cpu_begin

    # 将数据保存到数据库中
    fill_ti_data(curr_info,analysis_begin,analysis_end,cpu_time,ti_list)

# ===========================================================================
# main函数
if __name__ == '__main__':
    logging.basicConfig(level = logging.INFO,format = "%(asctime)s.%(msecs)d pid %(process)d %(filename)s line %(lineno)d %(funcName)s %(message)s",datefmt="%Y-%m-%d %H:%M:%S")
    logger = logging.getLogger(__name__)

    logger.info("begin...")
    # 处理tm1 server 日志
    process_tm1server_log()
    logger.info("finished...")

为了将数据保存到数据库中，便于分析，建立如下的表结构

--用于琪雯她们查询预算中ti执行结果
--详细使用说明参考tm1slog.py
--万起胜
--2022.12.12

--=======================================================================================================================================
drop table t_tm1slog_ti purge;
drop table T_tm1slog purge;

create table T_tm1slog(
  analysis_id                 integer                    ,  --分析编号
  db_started_line             integer                    ,  --C1 数据库启动行号
  db_started_time             TIMESTAMP(3)               ,  --数据库启动的时间戳
  db_shutdowned_line          integer                    ,  --C1 数据库关闭行号
  db_shutdowned_time          TIMESTAMP(3)               ,  --C1 数据库关闭时间戳
  begin_position_line         integer                    ,  --分析开始位置对应行号
  begin_position_time         TIMESTAMP(3)               ,  --分析开始位置对应时间,主要用于核对分析结果,同时显示和保存点之间的差异
  begin_position_desc         varchar2(2000)             ,  --分析开始行所对应描述内容,用于对比文件是否变动
  end_position_line           integer                    ,  --分析结束位置对应行号
  end_position_time           TIMESTAMP(3)               ,  --分析结束位置对应时间,
  end_position_desc           varchar2(2000)             ,  --分析结束行所对应描述内容,用于对比文件是否变动
  save_point_line             integer                    ,  --save_point对应的行号
  save_point_time             TIMESTAMP(3)               ,  --从本次开始截止这个时间,这之间发起的ti均执行完成,完成时间可以在此save_point之后
  save_point_desc             varchar2(2000)             ,  --save_point对应行所对应描述内容,用于对比文件是否变动
  description                 clob                       ,  --上次分析后，如果文件曾经有删除，则说明一下
  analysis_begin              date                       ,  --本次分析开始时间
  analysis_end                date                       ,  --本次分析结束时间
  cpu_time                    number                        --cpu时长,以秒为单位
);

alter table T_tm1slog add constraint pk_T_tm1slog primary key(analysis_id);

comment on table T_tm1slog is '分析概述';
comment on column T_tm1slog.analysis_id         is '分析编号';
comment on column T_tm1slog.db_started_line     is 'C1 数据库启动行号';
comment on column T_tm1slog.db_started_time     is '数据库启动的时间戳';
comment on column T_tm1slog.db_shutdowned_line  is 'C1 数据库关闭行号';
comment on column T_tm1slog.db_shutdowned_time  is 'C1 数据库关闭时间戳';
comment on column T_tm1slog.begin_position_line is '分析开始位置对应行号';
comment on column T_tm1slog.begin_position_time is '分析开始位置对应时间,主要用于核对分析结果,同时显示和保存点之间的差异';
comment on column T_tm1slog.begin_position_desc is '分析开始行所对应描述内容,用于对比文件是否变动';
comment on column T_tm1slog.end_position_line   is '分析结束位置对应行号';
comment on column T_tm1slog.end_position_time   is '分析结束位置对应时间,';
comment on column T_tm1slog.end_position_desc   is '分析结束行所对应描述内容,用于对比文件是否变动';
comment on column T_tm1slog.save_point_line     is 'save_point对应的行号';
comment on column T_tm1slog.save_point_time     is '从本次开始截止这个时间,这之间发起的ti均执行完成,完成时间可以在此save_point之后';
comment on column T_tm1slog.save_point_desc     is 'save_point对应行所对应描述内容,用于对比文件是否变动';
comment on column T_tm1slog.description         is '上次分析后，如果文件曾经有删除，则说明一下';
comment on column T_tm1slog.analysis_begin      is '本次分析开始时间';
comment on column T_tm1slog.analysis_end        is '本次分析结束时间';
comment on column T_tm1slog.cpu_time            is 'cpu时长,以秒为单位';

create table t_tm1slog_ti(
  analysis_id                 integer                    ,  --分析编号
  line_number                 integer                    ,  --日志中行号,用于简化校核,对应启动时间的行号,如果没有启动,则为结束时间的行号
  ti_process_type             varchar2(50)               ,  --引入执行对象类型TM1.Chore,TM1.Process等,以便后续使用层级来搜索启动关系
  ti_name                     varchar2(500)              ,  --ti名称
  run_from_process_type       varchar2(50)               ,  --启动程序类型,例如是TM1.Chore,TM1.Process
  run_from_process_name       varchar2(500)              ,  --启动程序名称,可能使用executed by chore或run from process启动当前程序
  parent_begin_time           TIMESTAMP(3)               ,  --父对象的实例启动时间,用于层级搜索时,获取父，不考虑同时刻存在多个实例情况,此时实际无法确定父
  executed_by_user            varchar2(500)              ,  --启动用户
  begin_time                  TIMESTAMP(3)               ,  --ti开始执行时间
  end_time                    TIMESTAMP(3)               ,  --ti执行完成时间,null表示未处理完成
  duration                    number                     ,  --cpu时长,以秒为单位,主要是便于查询执行时间长的ti
  description                 varchar2(2000)             ,  --简短描述,便于分析当前可能的描述情况
  return_code                 integer                    ,  --执行结果,0--已正常完成执行,1--已完成执行,但存在错误,2--执行已中止,3--执行取消,4--祖先chore关闭,-1--表示未知的方式,用于排除执行时错误 null则表示未处理完成
  return_message              clob                          --返回错误信息,正常完成执行时,无错误信息
);

alter table t_tm1slog_ti add constraint pk_t_tm1slog_ti primary key(analysis_id,line_number);
alter table t_tm1slog_ti add constraint fk_t_tm1slog_ti
  foreign key (analysis_id) references T_tm1slog(analysis_id);

create index idx1_t_tm1slog_ti on t_tm1slog_ti(ti_name);
create index idx2_t_tm1slog_ti on t_tm1slog_ti(begin_time);

comment on table t_tm1slog_ti is 'ti分析日志';
comment on column T_tm1slog_ti.analysis_id      is '分析编号';
comment on column T_tm1slog_ti.line_number      is '日志中行号,用于简化校核';
comment on column T_tm1slog_ti.ti_process_type  is '引入执行对象类型TM1.Chore,TM1.Process等,以便后续使用层级来搜索启动关系';
comment on column T_tm1slog_ti.ti_name          is 'ti名称';
comment on column T_tm1slog_ti.run_from_process_type is '启动程序类型,例如是TM1.Chore,TM1.Process';
comment on column T_tm1slog_ti.run_from_process_name is '启动程序名称,可能使用executed by chore或run from process启动当前程序';
comment on column T_tm1slog_ti.parent_begin_time is '父对象的实例启动时间,用于层级搜索时,获取父，不考虑同时刻存在多个实例情况,此时实际无法确定父';
comment on column T_tm1slog_ti.executed_by_user is '启动用户';
comment on column T_tm1slog_ti.begin_time       is 'ti开始执行时间';
comment on column T_tm1slog_ti.end_time         is 'ti执行完成时间';
comment on column T_tm1slog_ti.duration         is 'cpu时长,以秒为单位,主要是便于查询执行时间长的ti';
comment on column T_tm1slog_ti.description      is '简短描述,便于分析当前可能的描述情况';
comment on column T_tm1slog_ti.return_code      is '执行结果,0--已正常完成执行,1--已完成执行,但存在错误,2--执行已中止';
comment on column T_tm1slog_ti.return_message   is '返回错误信息,正常完成执行时,无错误信息';