Parsing WinMerge-Generated Patch Files with Python

往事如烟0819

Category column: Python Beginner Examples (Python入门实例)

Article link: https://blog.csdn.net/sgx6660888/article/details/116430086

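This is the first code I have written in Python that uses a class.

I had written this parser in VBA before. Having just spent five days learning Python, I wanted to test what I had learned, and to end the May Day holiday on a semicolon rather than a full stop.

The code is fairly rough, and even I find it painful to read. (Coding conventions, exception handling, the use of classes and so on: I have not properly learned any of them yet.)

I also have not decided how to manage the parsed data, so for now the script only prints a log.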

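Features

1. Parse a patch file generated by WinMerge and extract the relevant information.

2. Extracted information: the file name and the line-number ranges of the code before and after the change.

Someone will surely ask what this is good for.

Here is how I currently plan to use it:

1. Extract the files changed for the current requirement change, together with the changed line numbers.

2. Analyze the warnings from static analysis, mark the warnings that fall in changed files and changed lines, and generate a document for focused review.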
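The second step of that plan could be sketched roughly as follows. This is only an illustration under assumptions: the data shapes (`ChangedRanges`, `WarningItem`) and the function name `flag_warnings` are hypothetical and do not appear in the script below, and a real static analysis tool will report its warnings in its own format.

```python
from typing import Dict, List, Tuple

# Hypothetical data shapes, not part of the original script:
#   changed:  {file name: [(first_line, last_line), ...]} in the post-change file
#   warnings: [(file name, line number, message), ...] from a static analysis tool
ChangedRanges = Dict[str, List[Tuple[int, int]]]
WarningItem = Tuple[str, int, str]


def flag_warnings(changed: ChangedRanges,
                  warnings: List[WarningItem]) -> List[Tuple[WarningItem, bool]]:
    """Pair each warning with True if it falls inside a changed line range."""
    flagged = []
    for file_name, line_no, message in warnings:
        ranges = changed.get(file_name, [])  # unchanged files have no ranges
        hit = any(start <= line_no <= stop for start, stop in ranges)
        flagged.append(((file_name, line_no, message), hit))
    return flagged
```

Warnings paired with `True` are the ones worth checking first; how the result is written into a document is left open here.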

Important: I would welcome pointers from more experienced developers.

Input patch file

If you are not familiar with the patch format, see the comments in the code.
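For readers new to the format: each change starts with a header such as `11c11` or `14,45c14`, giving the line range before the change, a change type (`a`, `c` or `d`) and the line range after the change. As a minimal sketch (the names `HUNK_HEADER` and `parse_hunk_header` are made up for this example and are not in the script below), all header shapes can be matched with one regular expression that has optional groups:

```python
import re
from typing import Dict, Optional

# One pattern for all four header shapes: 11c11, 11,12c12, 11c12,24 and 11,12c12,24
HUNK_HEADER = re.compile(
    r"^(?P<BEF_START>\d+)(?:,(?P<BEF_STOP>\d+))?"
    r"(?P<CHG_SYM>[acd])"
    r"(?P<AFT_START>\d+)(?:,(?P<AFT_STOP>\d+))?$"
)


def parse_hunk_header(line: str) -> Optional[Dict[str, str]]:
    """Return the named parts of a header line, or None if it is not a header."""
    match = HUNK_HEADER.match(line.strip())
    if match is None:
        return None
    # Drop the optional groups that did not take part in the match
    return {key: value for key, value in match.groupdict().items() if value is not None}
```

The group names are the same keys the script prints in its log, so `parse_hunk_header("14,45c14")` yields a dictionary with `BEF_START`, `BEF_STOP`, `CHG_SYM` and `AFT_START`. The pattern uses `\d+` so that a header like `0a1,2` (lines added at the very top of a file) is accepted as well.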

diff i circle.yml circle.yml
11c11
<    fedora33_gmake:
---
>    fedora32_gmake:
14,45c14
<        - image: docker.io/fedora:33
---
>        - image: docker.io/fedora:32
66c35
<              make roundtrip CIRCLECI=1 ROUNDTRIP_MAX_ENTRIES=25
---
>              make check roundtrip CIRCLECI=1
104,130c73
<              MAKE=bmake bmake validate-input check codecheck CIRCLECI=1
< 
<    fedora30_bmake_roundtrip:
---
>              MAKE=bmake bmake validate-input check roundtrip codecheck CIRCLECI=1
132c75
<    fedora33_distcheck:
---
>    fedora_distcheck:
135c78
<        - image: docker.io/fedora:33
---
>        - image: docker.io/fedora:latest
diff i Units/review-needed.r/test.vhd.t/input.vhd Units/review-needed.r/test.vhd.t/input.vhd
4649c4649
< end parameterize;
---
> end paramterize;
diff i win32/ctags_vs2013.vcxproj win32/ctags_vs2013.vcxproj
8,11d7
<     <ProjectConfiguration Include="Debug|x64">
<     </ProjectConfiguration>
16,19d11
<     </ProjectConfiguration>
34,39d25
<   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="Configuration">

Output

Line 000001: [diff i circle.yml circle.yml]
Find File!	[True]
Line 000002: [11c11]
{'BEF_START': '11', 'CHG_SYM': 'c', 'AFT_START': '11'}
Find LineNo!	[True]
Line 000003: [<    fedora33_gmake:]
Find Content!	[True]
Line 000004: [---]
Find Content!	[True]
Line 000005: [>    fedora32_gmake:]
Find Content!	[True]
Line 000006: [14,45c14]
{'BEF_START': '14', 'BEF_STOP': '45', 'CHG_SYM': 'c', 'AFT_START': '14'}
Find LineNo!	[True]
Line 000007: [<        - image: docker.io/fedora:33]
Find Content!	[True]
Line 000008: [---]
Find Content!	[True]
Line 000009: [>        - image: docker.io/fedora:32]
Find Content!	[True]
Line 000010: [66c35]
{'BEF_START': '66', 'CHG_SYM': 'c', 'AFT_START': '35'}
Find LineNo!	[True]
Line 000011: [<              make roundtrip CIRCLECI=1 ROUNDTRIP_MAX_ENTRIES=25]
Find Content!	[True]
Line 000012: [---]
Find Content!	[True]
Line 000013: [>              make check roundtrip CIRCLECI=1]
Find Content!	[True]
Line 000014: [104,130c73]
{'BEF_START': '104', 'BEF_STOP': '130', 'CHG_SYM': 'c', 'AFT_START': '73'}
Find LineNo!	[True]
Line 000015: [<              MAKE=bmake bmake validate-input check codecheck CIRCLECI=1]
Find Content!	[True]
Line 000016: [<]
This line can not be parsed(failure!)	[False]
Line 000017: [<    fedora30_bmake_roundtrip:]
Find Content!	[True]
Line 000018: [---]
Find Content!	[True]
Line 000019: [>              MAKE=bmake bmake validate-input check roundtrip codecheck CIRCLECI=1]
Find Content!	[True]
Line 000020: [132c75]
{'BEF_START': '132', 'CHG_SYM': 'c', 'AFT_START': '75'}
Find LineNo!	[True]
Line 000021: [<    fedora33_distcheck:]
Find Content!	[True]
Line 000022: [---]
Find Content!	[True]
Line 000023: [>    fedora_distcheck:]
Find Content!	[True]
Line 000024: [135c78]
{'BEF_START': '135', 'CHG_SYM': 'c', 'AFT_START': '78'}
Find LineNo!	[True]
Line 000025: [<        - image: docker.io/fedora:33]
Find Content!	[True]
Line 000026: [---]
Find Content!	[True]
Line 000027: [>        - image: docker.io/fedora:latest]
Find Content!	[True]
Line 000028: [diff i Units/review-needed.r/test.vhd.t/input.vhd Units/review-needed.r/test.vhd.t/input.vhd]
Find File!	[True]
Line 000029: [4649c4649]
{'BEF_START': '4649', 'CHG_SYM': 'c', 'AFT_START': '4649'}
Find LineNo!	[True]
Line 000030: [< end parameterize;]
Find Content!	[True]
Line 000031: [---]
Find Content!	[True]
Line 000032: [> end paramterize;]
Find Content!	[True]
Line 000033: [diff i win32/ctags_vs2013.vcxproj win32/ctags_vs2013.vcxproj]
Find File!	[True]
Line 000034: [8,11d7]
{'BEF_START': '8', 'BEF_STOP': '11', 'CHG_SYM': 'd', 'AFT_START': '7'}
Find LineNo!	[True]
Line 000035: [<     <ProjectConfiguration Include="Debug|x64">]
Find Content!	[True]
Line 000036: [<     </ProjectConfiguration>]
Find Content!	[True]
Line 000037: [16,19d11]
{'BEF_START': '16', 'BEF_STOP': '19', 'CHG_SYM': 'd', 'AFT_START': '11'}
Find LineNo!	[True]
Line 000038: [<     </ProjectConfiguration>]
Find Content!	[True]
Line 000039: [34,39d25]
{'BEF_START': '34', 'BEF_STOP': '39', 'CHG_SYM': 'd', 'AFT_START': '25'}
Find LineNo!	[True]
Line 000040: [<   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="Configuration">]
Find Content!	[True]
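Line 000016 is the one failure in this log. The input line is `< ` (a removed line that was itself empty), but the script calls `strip()` on every line before checking it, which leaves a bare `<` that no longer starts with the two-character prefix `< `. One possible way around this, shown only as a sketch with a hypothetical helper name `classify_line`, is to remove just the line ending and to accept a bare marker as content:

```python
def classify_line(raw_line: str) -> str:
    """Return 'content', 'separator' or 'other' for one line of the patch."""
    line = raw_line.rstrip("\r\n")  # drop only the line ending, keep other spaces
    if line == "---":
        return "separator"
    # "<" or ">" alone, or followed by a space, marks a removed / added line,
    # including the case where the changed line itself is empty.
    if line[:1] in ("<", ">") and (len(line) == 1 or line[1] == " "):
        return "content"
    return "other"
```

With this check, both `< ` and a `<` whose trailing space was lost are treated as content lines.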

# **********************************************************************************************************************
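# -*- coding: UTF-8 -*-
#
# Parse a WinMerge patch file
# Extract the names of the changed files and the changed line numbers (before and after the change)
#
# How to generate the patch file: menu -> Generate Patch  [required option: include command line]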
#
# Pycharm 2021.1.1 + Python 3.9.4
# WinMerge 2.16.2.10
#
# Author:   sgx6660888
# Blog:     https://blog.csdn.net/sgx6660888
# **********************************************************************************************************************
import re


class ParsePatchFile():
    # ******************************************************************************************************************
    #  class variable shared by all instances
    # ******************************************************************************************************************

    # ******************************************************************************************************************
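    # diff i circle.yml circle.yml          ===> command line, with the relative paths and file names before/after the change
    # 11c11                                 ===> line numbers before/after the change and the change type (a: add  c: change   d: delete)
    # <    fedora33_gmake:                  ===> content before the change
    # ---                                   ===> separator
    # >    fedora32_gmake:                  ===> content after the change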
    # ******************************************************************************************************************
    # Define the rule string for checking
    __RULE_PATTEN_FILE = "diff "
    __RULE_SPLIT_SYM_FILE = " "
    __RULE_PATTEN_CONTENT_1 = "< "
    __RULE_PATTEN_CONTENT_2 = "---"
    __RULE_PATTEN_CONTENT_3 = "> "

    # Define the rule of line no using regular expression for checking

    # The keyword in Dictionary of analysis result
    __QUOTE_BEF_START = "BEF_START"
    __QUOTE_BEF_STOP = "BEF_STOP"
    __QUOTE_AFT_START = "AFT_START"
    __QUOTE_AFT_STOP = "AFT_STOP"

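    # A line number; 0 is allowed because a header such as 0a1,2 is used when lines are added at the top of a file.
    # The trailing ")" closes the named group opened in __QUOTE_DIGIT1..4 below.
    __QUOTE_DIGIT = "[0-9]+)"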
    __QUOTE_DIGIT1 = "(?P<" + __QUOTE_BEF_START + ">" + __QUOTE_DIGIT
    __QUOTE_DIGIT2 = "(?P<" + __QUOTE_BEF_STOP + ">" + __QUOTE_DIGIT
    __QUOTE_DIGIT3 = "(?P<" + __QUOTE_AFT_START + ">" + __QUOTE_DIGIT
    __QUOTE_DIGIT4 = "(?P<" + __QUOTE_AFT_STOP + ">" + __QUOTE_DIGIT
    __QUOTE_CHG = "(?P<CHG_SYM>[acd])"

    # Pattern: e.g. 11,12c12,24
    __RULE_PATTEN_LINENO_1 = "^" + __QUOTE_DIGIT1 + "," + __QUOTE_DIGIT2 \
                             + __QUOTE_CHG + __QUOTE_DIGIT3 + "," + __QUOTE_DIGIT4
    # Pattern: e.g. 11c12,24
    __RULE_PATTEN_LINENO_2 = "^" + __QUOTE_DIGIT1 + __QUOTE_CHG + __QUOTE_DIGIT3 + "," + __QUOTE_DIGIT4
    # Pattern: e.g. 11,12c12
    __RULE_PATTEN_LINENO_3 = "^" + __QUOTE_DIGIT1 + "," + __QUOTE_DIGIT2 + __QUOTE_CHG + __QUOTE_DIGIT3
    # Pattern: e.g. 11c12
    __RULE_PATTEN_LINENO_4 = "^" + __QUOTE_DIGIT1 + __QUOTE_CHG + __QUOTE_DIGIT3

    # ******************************************************************************************************************
    #   Public
    # ******************************************************************************************************************
    def __init__( self ):
        self.PatchFileName = ""
        self.NowLineNo = 0

    def ParseStart( self, FileName: str ) -> bool:
        # Returns True if the file could be opened and read to the end
        self.PatchFileName = FileName
        self.NowLineNo = 0
        ok = self.__ParseSub(FileName)
        if ok:
            print("This file %s has been parsed." % (FileName))
        return ok

    # ******************************************************************************************************************
    #   Private
    # ******************************************************************************************************************
    def __ParseSub( self, file_name: str ) -> bool:
        self.PatchFileName = file_name
        try:
            # The with statement closes the file automatically, even if an error occurs
            with open(self.PatchFileName, errors='ignore') as file_object:
                for line_buff in file_object:
                    # Delete the space in head and tail
                    # NOTE: this also turns an empty changed line "< " into "<", which no rule matches
                    line_buff = line_buff.strip()
                    self.NowLineNo += 1  # record the line no

                    if line_buff == "":  # skip the space line
                        continue

                    # parse the line
                    print("Line %06d: [%s]" % (self.NowLineNo, line_buff))
                    ret = self.__ParseLine(line_buff)
                    print("[%s]" % ret)

        except FileNotFoundError:
            print("[__ParseSub] FileNotFoundError!")
            return False
        except OSError:
            # Any other I/O problem (permission denied, path is a directory, ...)
            print("[__ParseSub] SYSTEM ERROR!")
            return False

        return True

    def __ParseLine( self, lineBuff: str ) -> bool:
        rslt = []
        # check all rules
        # TODO: Add Rule schedule function
        #       1.Run the rules by the the priority( not the rule no)
        #       2.Count the times of rule for optimize the priority of the rules
        return self.__ParseRules(lineBuff, rslt)

    def __ParseRules( self, lineBuff: str, Rslt ) -> bool:
        ret = True

        # print("__ParseCheckRules", type(lineBuff))
        # TODO: Run the rules and add the analysis data into buff.

        # Check function
        if self.__ParseCheckRule_CONTENT(lineBuff, Rslt):
            print("Find Content!", end="\t")
            # TODO:Action function (Add the Result to data struct)
            ret = True
        elif self.__ParseCheckRule_LINENO(lineBuff, Rslt):
            print("Find LineNo!", end="\t")
            # TODO:Action function (Add the Result to data struct)
            ret = True
        elif self.__ParseCheckRule_FILE(lineBuff, Rslt):
            print("Find File!", end="\t")
            # TODO:Action function (Add the Result to data struct)
            ret = True
        else:
            print("This line can not be parsed(failure!)", end="\t")
            ret = False

        return ret

    def __ParseCheckRule_FILE( self, lineBuff: str, Rslt ) -> bool:
        # print("__ParseCheckRule_FILE", type(lineBuff))
        if self.__RULE_PATTEN_FILE == lineBuff[0: len(self.__RULE_PATTEN_FILE)]:
            # Split the file name
            # TODO: Parse the file name
            # Need attention how to do it if include the space in path name
            # e.g: diff i circle.yml circle.yml

            return True

        else:
            return False

    def __ParseCheckRule_LINENO( self, lineBuff: str, Rslt ) -> bool:
        # 11c11

        if self.__ParseCheckRule_LINENO_Patten(self.__RULE_PATTEN_LINENO_1, lineBuff, Rslt):
            pass
        elif self.__ParseCheckRule_LINENO_Patten(self.__RULE_PATTEN_LINENO_2, lineBuff, Rslt):
            pass
        elif self.__ParseCheckRule_LINENO_Patten(self.__RULE_PATTEN_LINENO_3, lineBuff, Rslt):
            pass
        elif self.__ParseCheckRule_LINENO_Patten(self.__RULE_PATTEN_LINENO_4, lineBuff, Rslt):
            pass
        else:
            return False

        return True

    def __ParseCheckRule_LINENO_Patten( self, Patten: str, lineBuff: str, Rslt ) -> bool:
        # 11c11
        reg = re.compile(Patten, re.I)
        reg_match = reg.match(lineBuff)
        if reg_match:
            line_grp = reg_match.groupdict()
            print(line_grp)
            return True

        return False

    def __ParseCheckRule_CONTENT( self, lineBuff: str, Rslt ) -> bool:
        # No Need to parse the string

        # <    fedora33_gmake:
        # ---
        # >    fedora32_gmake:
        if self.__RULE_PATTEN_CONTENT_1 == lineBuff[0: len(self.__RULE_PATTEN_CONTENT_1)]:
            return True
        elif self.__RULE_PATTEN_CONTENT_2 == lineBuff[0: len(self.__RULE_PATTEN_CONTENT_2)]:
            return True
        elif self.__RULE_PATTEN_CONTENT_3 == lineBuff[0: len(self.__RULE_PATTEN_CONTENT_3)]:
            return True
        else:
            return False


# For Test
if __name__ == '__main__':
    pf = ParsePatchFile()
    pf.ParseStart("E:/Study/010.ProgramLanguage/python/sample/ParseWinMergePatchFile/testData/all_file.txt")
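The script above only logs what it finds; the question of how to keep the parsed data is left as a TODO. One possible shape, given here as a hedged sketch and not as part of the class, is a dictionary from file name to a list of change records. The names `Hunk`, `HEADER` and `collect_hunks` are invented for this example, and it assumes that paths contain no spaces, so that the last token of the `diff` line is the post-change file (the script's own comments flag paths with spaces as an open problem).

```python
import re
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Header of one change, e.g. 11c11 or 14,45c14 (the stop numbers are optional)
HEADER = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


@dataclass
class Hunk:
    kind: str                 # 'a', 'c' or 'd'
    before: Tuple[int, int]   # (first, last) line number before the change
    after: Tuple[int, int]    # (first, last) line number after the change


def collect_hunks(lines: Iterable[str]) -> Dict[str, List[Hunk]]:
    """Group the change headers of a patch by the file they belong to."""
    result: Dict[str, List[Hunk]] = {}
    current = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("diff "):
            # Assumption: no spaces in the path, so the last token is the file name
            current = line.split()[-1]
            result.setdefault(current, [])
            continue
        match = HEADER.match(line)
        if match and current is not None:
            b1, b2, kind, a1, a2 = match.groups()
            # A missing stop number means the range covers a single line
            result[current].append(
                Hunk(kind, (int(b1), int(b2 or b1)), (int(a1), int(a2 or a1))))
    return result
```

Calling `collect_hunks(open(path, errors="ignore"))` on a patch file would return one entry per `diff` line. Content lines are skipped without a dedicated check because they start with `<`, `>` or `---` and therefore never match `HEADER`.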