Automatically Updating the Line Numbers of Code Referenced in LaTeX

zBinny

Last modified 2023-12-11 19:35:04

Tags: python, programming languages

First published 2023-12-11 19:33:28

Article link: https://blog.csdn.net/zbinny/article/details/134934535


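When LaTeX includes source code, it does not have to pull in the whole file: given a first and a last line number, it embeds only the code in that range. The code changes often, though, so the first and last line numbers shift and have to be updated by hand. Otherwise the code shown in the compiled PDF no longer lines up with the intended function.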
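
To make the drift concrete, here is a small sketch in Python. It assumes a hypothetical one-line `\lstinputlisting` reference and a tiny made-up script (`Script/demo.sh` and `greet` are illustrative names only), and the `selected_lines` helper merely imitates which lines `firstline`/`lastline` would select; it is not how LaTeX itself is implemented.

```python
import re

# Hypothetical listing reference, written on one line.
TEX = r"\lstinputlisting[language=bash, firstline=3, lastline=5]{Script/demo.sh}"


def selected_lines(tex_line, source):
    """Return the source lines that firstline/lastline select (1-based, inclusive)."""
    first = int(re.search(r"firstline=\s*(\d+)", tex_line).group(1))
    last = int(re.search(r"lastline=\s*(\d+)", tex_line).group(1))
    return source.splitlines()[first - 1:last]


old_source = "#!/bin/bash\n\ngreet() {\n    echo hi\n}\n"
# One comment line is added near the top, so the function moves down by one line.
new_source = "#!/bin/bash\n# new comment\n\ngreet() {\n    echo hi\n}\n"

print(selected_lines(TEX, old_source))  # the whole function
print(selected_lines(TEX, new_source))  # shifted: a blank line, and no closing brace
```

With the unchanged line numbers, the second call picks up a blank line and drops the closing brace, which is exactly the misalignment described above.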

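To deal with this, I wrote a simple script that updates these line numbers. It targets bash shell scripts; other languages can be handled the same way: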

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

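r'''
# This code was produced by asking ERNIE Bot (文心一言).
# Question: I want to write Python that gets the list of functions in a Linux shell script file, along with each function's start and end position in the file.
# Answer:
#     In Python, you can use the os and re modules to read and analyze a shell script. Below is a simple function
# that returns the list of functions in a shell script and their starting positions in the file.
#     This function assumes that your shell script is a Bash script and that functions are defined in the form function_name().
# It uses a regular expression to match the start and end positions of each function.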

def get_shell_functions(file_path):  
    functions = []  
    with open(file_path, 'r') as file:  
        content = file.read()  
        pattern = r'\bfunction\s+(\w+)\s*\([^;{]*\)\s*{\s*(/\*.*?\*/|[^}]+)*\s*}'  
        matches = re.findall(pattern, content, re.MULTILINE | re.DOTALL)  
        for match in matches:  
            start = content.index(match)  
            end = start + len(match)  
            functions.append((match[0], start, end))  
    return functions
    
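# The regular expression ERNIE Bot gave does not follow the actual function syntax, so the matches came out as a mess. The result was disappointing, so I had to write my own!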
'''

import os  
import re  

import BnPlatform.BinnyBase as BinnyBase
  
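def get_shell_functions(content, print_code=False):
    '''
    Get the names of all functions in a bash shell script, along with each function's start and end position in the code
    '''
    functions = []
    
    # This regular expression mainly serves to find where a function starts. The function body is too complex
    # to match the end position, so the idea is to locate the start first and then check how the braces close.
    # There may also be escaped braces and similar statements; those are left to a proper lexer. Simple matching is enough here.
    # Match valid function definitions: function fn(){...} and fn(){...}
    # (?:function\s+)? makes the literal keyword optional; [function] would be a character class, not the word
    pattern = r"(\s*(?:function\s+)?)(\w+)\s*\(\s*\)\s*{.*?}"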
    pos = 0
    while True:
        match = re.search(pattern, content[pos:], re.DOTALL)
        if not match:
            break
        start = match.start()
        end = match.end()
        # Move forward in text for the next search
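        # Extract the function name and its position
        func_name = match.groups()[1]
        # The leading whitespace (and optional `function` keyword) that was matched
        blanks = match.groups()[0]
        start_pos = pos + start + len(blanks)
        end_pos = pos + end
        # Find the real end: the body is closed only when the left and right braces balance
        _, s, e = BinnyBase.bnGetBetweenSymbol(content[start_pos:], '{|}')
        # If the analysis goes wrong, the span may include other functions, or only part of this function.
        # I do not insist on perfection here; at least for my own complex functions it returns the correct result.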
        if start_pos + e + 1 > end_pos:
            end_pos = start_pos + e + 1
        if print_code:
            print('{}[{:>2d} : {:>2d}]'.format(
                content[start_pos:end_pos], start_pos, end_pos - 1))
        functions.append((func_name, start_pos, end_pos))
        pos = end_pos
        
    return functions  


def convert_code_line_number(content, start, end):
    '''Get the line numbers of the start and end positions in the code'''
    return [BinnyBase.bnCountsOf(content[:start], '\n') + 1,
           BinnyBase.bnCountsOf(content[:end], '\n') + 1]

def replace_latex_lstinputlisting(content,
                                  base_shell_name,
                                  row_begin=0, 
                                  row_end=0):
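    '''Replace the first and last line numbers of an lstinputlisting command in LaTeX'''
    # Update the position information that points into the script. Example LaTeX:
    # \lstinputlisting[language=bash,
    #                  caption=LoadBaseFunc.sh/openwrt\_compile\_source,
    #                  label=LoadBaseFunc.sh/openwrt_compile_source,
    #                  firstline=306, lastline=535, columns=flexible,
    #                  breaklines=true]{Script/sanple.sh}
    # re.escape keeps the "." in the file name from matching any character
    pattern = r'\\lstinputlisting\[language=bash.*?' + re.escape(base_shell_name) + '.*?}'
    # The replacement happens within a single line, so no multi-line regex flag is needed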
    row_search = re.search(pattern, content)
    if row_search:
        row_code = content[row_search.start():row_search.end()]
        # Automatically update firstline=306, lastline=535 from the example above to the current values
        row_replace = re.sub(r"firstline=\s*?\d+", f"firstline={row_begin}", row_code)
        row_replace = re.sub(r"lastline=\s*?\d+", f"lastline={row_end}", row_replace)
        if row_replace != row_code:
            return content.replace(row_code, row_replace)
    return None
    


def main(shell_path, tex_files, print_code=False):
    # Read the shell script whose functions the LaTeX files reference
    content_shell = BinnyBase.bnGetFileData(shell_path)
    functions = get_shell_functions(content_shell, print_code=print_code)  
    if functions:
        for tex_file, func_name in tex_files:
            if not os.path.exists(tex_file):
                continue
            for function, start, end in functions:
                if function != func_name:
                    continue
                if print_code:
                    print(f"Function: {function}, Start: {start}, End: {end}")
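                # Convert the positions to line numbers
                row_begin, row_end = convert_code_line_number(content_shell, start, end)
                # First, find the line of LaTeX code that needs to be changed
                base_shell_name = os.path.basename(shell_path)
                # Read the contents of the .tex file
                content = BinnyBase.bnGetFileData(tex_file)
                # Replace the line numbers, and save the file if anything changed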
                _content = replace_latex_lstinputlisting(
                    content, base_shell_name=base_shell_name,
                    row_begin=row_begin, row_end=row_end)
                if _content and _content != content:
                    if print_code:
                        print(f"Replacing the start and end positions of function {function} in {tex_file} ...")
                    BinnyBase.bnSetFileData(tex_file, _content)                
                break


if __name__ == '__main__':
    print_code = False
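    # LaTeX source files, each paired with the name of the function whose line numbers should be updated
    tex_files = []
    tex_files.append((r'../src/XXX.tex', 'xxx_func_name'))
    tex_files.append((r'../src/YYY.tex', 'yyy_func_name'))
    # The script being referenced; it may be edited many times, and this finds the code's new position automatically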
    main(shell_path=r'./sanple.sh', tex_files=tex_files, print_code=print_code)
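
The `BinnyBase` helpers used above (`bnGetBetweenSymbol`, `bnCountsOf`) are not shown in this post. As a rough standard-library sketch of the same idea, the hypothetical helpers below locate a bash function header, count braces until they balance, and convert the character offsets into 1-based line numbers. The names `find_matching_brace` and `function_line_range` are illustrative and not part of `BinnyBase`, and braces inside strings or comments are not handled.

```python
import re


def find_matching_brace(text, open_pos):
    """Return the index of the '}' that closes the '{' at open_pos, or -1."""
    depth = 0
    for i in range(open_pos, len(text)):
        if text[i] == '{':
            depth += 1
        elif text[i] == '}':
            depth -= 1
            if depth == 0:
                return i
    return -1


def function_line_range(script, func_name):
    """Return (first_line, last_line), 1-based, of a bash function, or None."""
    # Accepts both `function fn() {` and `fn() {` at the start of a line.
    head = re.search(
        r"^[ \t]*(?:function\s+)?" + re.escape(func_name) + r"\s*\(\s*\)\s*\{",
        script, re.MULTILINE)
    if not head:
        return None
    close = find_matching_brace(script, head.end() - 1)
    if close < 0:
        return None
    # A line number is the count of newlines before the offset, plus one.
    first_line = script.count('\n', 0, head.start()) + 1
    last_line = script.count('\n', 0, close) + 1
    return first_line, last_line


SCRIPT = (
    "#!/bin/bash\n"
    "\n"
    "function build() {\n"
    '    echo "${HOME}"\n'
    "}\n"
    "\n"
    "clean() {\n"
    "    rm -rf out\n"
    "}\n"
)

print(function_line_range(SCRIPT, "build"))  # (3, 5)
print(function_line_range(SCRIPT, "clean"))  # (7, 9)
```

Counting balanced braces means a `${HOME}` expansion inside the body does not end the function early, which a plain `{.*?}` match would do.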

Usage is simple; with the source code in front of you, no further explanation is needed.
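
For readers who want to try the replacement step on its own, the following is a minimal sketch that uses only the standard `re` module. `update_listing_range` is a hypothetical name, not the function from the script above, and it makes two assumptions of its own: it matches any `\lstinputlisting` whose file argument ends with the given script name (without requiring `language=bash`), and it tolerates an option block wrapped over several lines.

```python
import re


def update_listing_range(tex, script_name, first_line, last_line):
    """Rewrite firstline/lastline in the first listing that includes script_name."""
    # [^\]]* stays inside one [...] option block, even if it spans several lines.
    pattern = re.compile(
        r"\\lstinputlisting\[[^\]]*\]\{[^}]*" + re.escape(script_name) + r"\}")

    def rewrite(match):
        block = match.group(0)
        block = re.sub(r"firstline=\s*\d+", f"firstline={first_line}", block)
        block = re.sub(r"lastline=\s*\d+", f"lastline={last_line}", block)
        return block

    # count=1 limits the change to the first matching listing.
    return pattern.sub(rewrite, tex, count=1)


tex = (
    "Intro\n"
    "\\lstinputlisting[language=bash,\n"
    "                 firstline=306, lastline=535,\n"
    "                 breaklines=true]{Script/sanple.sh}\n"
    "Outro\n"
)
print(update_listing_range(tex, "sanple.sh", 310, 540))
```

If no listing refers to the script, the text is returned unchanged, so the caller can compare the result with the input to decide whether the file needs to be written back.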
