Python: Build Series, Interpreter, Parts 5-6

Latest recommended article published on 2024-10-02 10:53:34

熙鹤v

Article link: https://blog.csdn.net/qq_55280590/article/details/141640427

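Preface

These are the final two installments of the "Build Series: Interpreter".

05

Control flow structures

In the previous installment I mentioned the key variable doom. In this one we finally put it to use.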

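While

How could we implement a loop?

In the previous installment I mentioned a technique used for callback assignment: write the content to a temporary file, then call the stcode function. while can be implemented the same way; it is just a little more involved.

First, decide which syntax you want for while:

while condition:
    ....

while(condition){
    ...
}

The first form, or the second?

Here I will use the second.

Once the syntax is fixed, we need to obtain the contents of the code block to execute, together with the number of lines the block occupies.

Only when we know the line count can we set doom to the number of lines to skip, so that the block's code is not executed a second time by the main loop.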
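
To make the role of the skip counter concrete, here is a minimal sketch. It assumes that doom counts how many upcoming lines the main loop should skip, which is how the text above describes it. The names run_lines and handle_line are hypothetical and are not part of the interpreter built in this series.

```python
def run_lines(lines, handle_line):
    """Run lines one by one; handle_line returns how many following lines to skip."""
    executed = []
    doom = 0  # number of upcoming lines still to be skipped
    for line in lines:
        if doom > 0:
            doom -= 1  # this line belongs to a block that was already handled
            continue
        executed.append(line)
        doom = handle_line(line)
    return executed


def handle_line(line):
    # Hypothetical handler: pretend every while block spans 2 more lines
    # (one body line plus the closing brace).
    return 2 if line.startswith('while') else 0


program = ['print(1)', 'while(3){', 'print(2)', '}', 'print(3)']
print(run_lines(program, handle_line))
# → ['print(1)', 'while(3){', 'print(3)']
```

The body line and the closing brace never reach the main loop, which is why the block's line count has to be known before doom can be set.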

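So how do we get this information?

We add a new function whose job is to collect everything related to a while statement. It takes four parameters: the file path, the keyword, an integer that decides what is returned, and an integer that prevents the same block from being collected twice.

The function first opens the file and reads all lines, then initializes a few variables that track where the while starts, where the block ends, and whether we are currently inside an if block.

While iterating over the lines, it checks whether a line starts with the keyword and is a correctly formatted block header.

If so, it starts collecting the body of the while statement until the block ends. It also has to handle nested if blocks, and it ends the current collection when it reaches the closing }.

After collecting, the function uses parameter 4 as an index to pick one entry from the list of collected blocks.

It then returns a different result depending on parameter 3:

If parameter 3 is 1, it returns the line count of the selected block plus 1.

If parameter 3 is 2, it checks whether it is inside an if block and returns a list containing the block content and either 0 or 1.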

def searchdef2(file_path, keyword, wbornum, whilenum):
    # Read the whole source file up front
    with open(file_path, 'r', encoding='utf-8') as file:
        lines = file.readlines()
    collecting = False      # True while a block is being collected
    function_content = []   # one entry per collected block
    current_content = []    # lines of the block currently being collected
    in_if_block = False     # True while inside an if nested in the block

    for line in lines:
        stripped_line = line.strip()
        # Header line: starts with the keyword and has only "{" after the last ")"
        if stripped_line.startswith(keyword) and '(' in stripped_line and ')' in stripped_line:
            last_close_parenthesis = stripped_line.rfind(')')
            if stripped_line[last_close_parenthesis + 1:].strip() == '{':
                collecting = True
                continue

        if collecting:
            if stripped_line == '}':
                if in_if_block:
                    # This brace closes the nested if, not the block itself
                    in_if_block = False
                else:
                    # End of the block: store it and reset the buffer
                    collecting = False
                    function_content.append(''.join(current_content).strip() + '\n}')
                    current_content = []
            elif stripped_line != '{':
                if 'if' in stripped_line and '(' in stripped_line and ')' in stripped_line:
                    in_if_block = True
                current_content.append(line)

    # whilenum selects which collected block to use (0-based)
    full_content = function_content[whilenum]
    doom2 = len(full_content.splitlines()) + 1
    if wbornum == 1:
        return doom2
    elif wbornum == 2:
        if in_if_block:
            return [full_content, 0]
        else:
            return [full_content, 1]
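
For comparison, the same collection step can be sketched with a single brace-depth counter, which copes with more than one level of nesting. This is an alternative illustration and not the approach used above. The name collect_block is hypothetical, and the sketch assumes the header line ends with { and that braces never appear inside string literals.

```python
def collect_block(lines, start):
    """Return the body lines of the brace block whose header is lines[start]."""
    depth = 0
    body = []
    for line in lines[start:]:
        opens = line.count('{')
        closes = line.count('}')
        # Keep the line if we are inside the block and it does not close it
        if depth > 0 and depth + opens - closes > 0:
            body.append(line)
        depth += opens - closes
        if depth == 0:
            break  # the block's own closing brace was reached
    return body


source = [
    'while(2){',
    '    if(a == 1){',
    '        print(1)',
    '    }',
    '}',
    'print(2)',
]
body = collect_block(source, 0)
print(len(body))  # → 3
```

The number of lines to skip after the header is then len(body) + 1, counting the closing brace.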

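Note that every time the check if 'while' in line is true and the branch runs, the variable while_meet must be incremented by 1. while_meet selects which while statement to fetch, so that the same one is not collected twice.

We also update the value of doom so that the block's code is not executed again.

Beyond that, we need two more functions. The first extracts the loop count from inside the parentheses of the while statement.

The second first declares the global variable ppz,

then writes the collected block content to a temporary file,

and finally reassigns ppz to temp, the path of that temporary file.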

import re
import tempfile

def zddeget2(contentb):
    # Extract the text inside while(...); returns None if there is no match
    pattern = r'while\((.*?)\)'
    match = re.search(pattern, contentb)
    if match:
        value_inside_brackets = match.group(1)
        return value_inside_brackets

def temp_while(temp_content):
    global ppz
    # delete=False keeps the file on disk so stcode can open it by path later
    with tempfile.NamedTemporaryFile(mode='w+', delete=False, encoding='utf-8') as temp_file:
        temp_file.write(temp_content)
        ppz = temp_file.name
    return True

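if 'while' in line:
    while_meet+=1
    function_meet = 1
    # Fetch [block content, flag] for the while statement just encountered
    doom_while = searchdef2(filepathdpc2,'while',2,while_meet-1)
    function_meet = 0
    # Loop count written inside while(...)
    get_keycontent_while = zddeget2(line)
    if temp_while(doom_while[0]):
        # Run the block from the temporary file (count - 1) times
        for i_bmx in range(int(get_keycontent_while)-1):
            stcode(ppz,env,1)
        doom = int(get_keycontent_while)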
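
The temporary-file technique can be tried in isolation with the sketch below. Everything in it is illustrative: run_block_repeatedly and fake_run_file are hypothetical names, and fake_run_file only stands in for stcode by recording what it reads. Unlike the code above, this sketch also removes the temporary file when it is done.

```python
import os
import tempfile


def run_block_repeatedly(block_text, times, run_file):
    """Write block_text to a temporary file and pass its path to run_file `times` times."""
    with tempfile.NamedTemporaryFile(mode='w', delete=False,
                                     encoding='utf-8') as temp_file:
        temp_file.write(block_text)
        temp_path = temp_file.name
    try:
        for _ in range(times):
            run_file(temp_path)
    finally:
        os.remove(temp_path)  # delete=False means cleanup is our job


seen = []


def fake_run_file(path):
    # Hypothetical stand-in for stcode: record the file's contents
    with open(path, encoding='utf-8') as f:
        seen.append(f.read())


run_block_repeatedly('print(1)\n', 3, fake_run_file)
print(len(seen))  # → 3
```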

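If (with else support)

Again starting from the syntax, I will use

if(condition){
...
}

as the main form.

As with while, we first obtain the content of the if block and its line count.

We add a new function that takes three parameters.

It reads the file, then checks each line for a block header that starts with the keyword. Every block found is recorded in a list, where each element is a tuple.

Finally it returns a different result depending on the wbornum parameter:

If wbornum is 1, it returns the line count of each block.

If it is 2, it returns each block's content together with the content of its else part.

If wbornum is neither 1 nor 2, the function returns None.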

def searchdef(file_path, keyword, wbornum):
    with open(file_path, 'r', encoding='utf-8') as file:
        lines = file.readlines()
    function_contents = []  # list of (name, [lines]) tuples, one per block
    function_count = 0
    block_nesting = 0       # current brace depth
    for line in lines:
        stripped_line = line.strip()
        # Header line such as "if(cond){": start a new block entry
        if stripped_line.startswith(keyword) and '(' in stripped_line and ')' in stripped_line:
            if '{' in stripped_line:
                function_count += 1
                block_nesting += 1
                function_name = f"{keyword}{function_count}"
                function_contents.append((function_name, []))
                continue
        if block_nesting > 0:
            if stripped_line == '{':
                block_nesting += 1
            elif stripped_line == '}':
                block_nesting -= 1
                # Only the brace that brings the depth back to 0 is stored
                if block_nesting == 0:
                    function_contents[-1][1].append(line)
            else:
                function_contents[-1][1].append(line)
    results = []
    for _, content in function_contents:
        content_str = ''.join(content).strip()
        if wbornum == 1:
            # Number of collected lines plus one (0 for an empty block)
            results.append(len(content_str.splitlines()) + 1 if content_str else 0)
        elif wbornum == 2:
            # Pair the block content with any lines that start with "else"
            else_content = [line for line in content if line.strip().startswith('else')]
            results.append((content_str, ''.join(else_content).strip()))
    return results if wbornum in (1, 2) else None

Once we have the block content we need to clean it up by removing blank lines and lines that contain only whitespace.

def remove_empty_lines(text):
    lines = text.split("\n")  
    non_empty_lines = [line for line in lines if line.strip() != ""]  
    return "\n".join(non_empty_lines)

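But is that all?

Certainly not. We still need to extract the expression inside the parentheses of the if statement, and add a function that evaluates it.

This function processes a single expression. Whenever it meets a variable, it calls the get_variable method of the env object to substitute the value.

If the expression holds, it returns True; otherwise it returns False.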

import re

def execute_if_statement(condition, env):
    # Operand: a {variable} reference, an integer literal, or a bare word.
    # Two-character operators are listed first so ">=" is not read as ">".
    simple_comparison_pattern = r'\s*(\{\w+\}|\w+)\s*(==|>=|<=|>|<)\s*(\{\w+\}|\w+)\s*$'

    def resolve(token):
        # {name} is looked up in env; digits become int; anything else stays a string
        if token.startswith('{') and token.endswith('}'):
            return env.get_variable(token[1:-1])
        return int(token) if token.isdigit() else token

    simple_match = re.match(simple_comparison_pattern, condition)
    if simple_match:
        left, operator, right = simple_match.groups()
        left_value, right_value = resolve(left), resolve(right)
        if operator == '==':
            return left_value == right_value
        elif operator == '>':
            return left_value > right_value
        elif operator == '<':
            return left_value < right_value
        elif operator == '>=':
            return left_value >= right_value
        elif operator == '<=':
            return left_value <= right_value
    else:
        # Anything more complex is handed to the environment's evaluator
        try:
            condition_result = env.evaluate_expression2(condition)
            return bool(condition_result)
        except Exception:
            return False
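
A table-driven variant of the same comparison step is sketched below, using Python's standard operator module in place of the if/elif chain. It is only an illustration: a plain dict replaces the env object, compare and resolve are hypothetical names, and the `!=` operator is an addition that the original function does not handle.

```python
import operator
import re

# Map each operator token to the function that implements it
OPS = {'==': operator.eq, '!=': operator.ne, '>=': operator.ge,
       '<=': operator.le, '>': operator.gt, '<': operator.lt}

COMPARISON = re.compile(
    r'^\s*(\{\w+\}|\w+)\s*(==|!=|>=|<=|>|<)\s*(\{\w+\}|\w+)\s*$')


def resolve(token, variables):
    """Turn a token into a value: {name} lookup, integer literal, or plain string."""
    if token.startswith('{'):
        return variables[token[1:-1]]
    return int(token) if token.isdigit() else token


def compare(condition, variables):
    """Evaluate a single 'left op right' comparison."""
    match = COMPARISON.match(condition)
    if not match:
        raise ValueError(f'unsupported condition: {condition!r}')
    left, op, right = match.groups()
    return OPS[op](resolve(left, variables), resolve(right, variables))


print(compare('{x} >= 3', {'x': 5}))  # → True
print(compare('2<1', {}))             # → False
```

Adding a new operator then only requires a new entry in OPS and in the regular expression.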

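If it returns True, we write the block for the true case to a temporary file.

Otherwise we write the block for the false case to a temporary file.

doom is then reassigned from the line count returned for the if block.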

if 'if' in line:
    # Fetch (content, else content) for every if block in the file
    doomn = searchdef(filepathdpc2,'if',2)
    # Split the current block into its true and false branches
    doonmnew = doomn[newelseif_count][0].split('}else{')
    doom_true = remove_empty_lines(doonmnew[0])
    try:
        doom_false = remove_empty_lines(doonmnew[1])
    except IndexError:
        doom_false = ''  # the block has no else branch
    # Condition text inside if(...)
    get_keycontent = zddeget(line)
    if execute_if_statement(get_keycontent, env):
        if temp_while(doom_true):
            stcode(ppz,env,1)
    else:
        if doom_false != '':
            if temp_while(doom_false):
                stcode(ppz,env,1)
    newelseif_count+=1
    # Skip the lines of the block that was just handled
    doom = searchdef(filepathdpc2,'if',1)[newelseif_count-1]
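
The branch-splitting step can be looked at on its own with the sketch below. It assumes, as the code above does, that the else header is written exactly as }else{ with no spaces. The names split_branches and drop_blank_lines are hypothetical.

```python
def drop_blank_lines(text):
    """Remove empty and whitespace-only lines."""
    return '\n'.join(line for line in text.split('\n') if line.strip())


def split_branches(block_text):
    """Split an if block's body into (true_branch, false_branch).

    false_branch is an empty string when there is no else part.
    """
    true_part, _, false_part = block_text.partition('}else{')
    return drop_blank_lines(true_part), drop_blank_lines(false_part)


print(split_branches('print(1)\n}else{\nprint(2)\n}'))
# → ('print(1)', 'print(2)\n}')
print(split_branches('print(1)\n}'))
# → ('print(1)\n}', '')
```

str.partition always returns three parts, so no IndexError handling is needed for the case without else.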

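Function

This is a syntax I defined myself:

function(a){
...
}

As before, we first obtain the block content and its line count.

We start by adding a new function.

It takes three parameters: the file path, the keyword, and the index of the function definition.

The function collects lines until it reaches the matching closing brace of the block.

Along the way, an opening brace means a new block is being entered;

a closing brace means the current block is being left.

The function also handles if blocks, so that the content inside an if statement is collected correctly.

The collected function definitions are stored in a list.

Finally, the function uses the given index to pick the corresponding definition from this list, and returns that definition's code block together with its line count.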

def searchdef_def(file_path, keyword, whilenum):
    with open(file_path, 'r', encoding='utf-8') as file:
        lines = file.readlines()
    collecting = False      # True while a definition is being collected
    function_content = []   # one entry per collected definition
    current_content = []    # lines of the definition currently being collected
    in_if_block = False     # True while inside an if nested in the body
    for line in lines:
        stripped_line = line.strip()
        # Header line: starts with the keyword and has only "{" after the last ")"
        if stripped_line.startswith(keyword) and '(' in stripped_line and ')' in stripped_line:
            last_close_parenthesis = stripped_line.rfind(')')
            if stripped_line[last_close_parenthesis + 1:].strip() == '{':
                collecting = True
                continue
        if collecting:
            if stripped_line == '}':
                if in_if_block:
                    # This brace closes the nested if, not the definition
                    in_if_block = False
                else:
                    # End of the definition: store it and reset the buffer
                    collecting = False
                    function_content.append(''.join(current_content).strip() + '\n}')
                    current_content = []
            elif stripped_line != '{':
                if 'if' in stripped_line and '(' in stripped_line and ')' in stripped_line:
                    in_if_block = True
                current_content.append(line)
    # whilenum selects which collected definition to use (0-based)
    full_content2 = function_content[whilenum]
    doom23 = len(full_content2.splitlines()) + 1
    return [doom23, full_content2]

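As with while, every time this branch is entered the variable fun_meet is incremented by 1.

After that we extract the content inside the parentheses of the function statement, write the block to a temporary file, and set the new value of doom.

Finally, we update the value stored under the corresponding key in the dictionary.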

import re
import tempfile

def zddeget3(contentb):
    # Extract the text inside function(...); returns None if there is no match
    pattern = r'function\((.*?)\)'
    match = re.search(pattern, contentb)
    if match:
        value_inside_brackets = match.group(1)
        return value_inside_brackets

def temp_while3(temp_content):
    # Write the block to a temporary file and return the file's path
    with tempfile.NamedTemporaryFile(mode='w+', delete=False, encoding='utf-8') as temp_file:
        temp_file.write(temp_content)
        return temp_file.name

if 'function' in line:
    fun_meet+=1
    get_keycontent = zddeget3(line)
    function_meet = 1
    # [line count, block content] of the definition just encountered
    get_keycontent2 = searchdef_def(filepathdpc2,'function',fun_meet-1)
    doom = int(get_keycontent2[0])
    temp_build = temp_while3(get_keycontent2[1])
    # Remember where this function's body is stored
    functiond[get_keycontent] = temp_build
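
The registry idea behind functiond can be sketched without any files. The version below keeps each body in memory as a list of lines, whereas the code above stores the path of a temporary file. The names functions, define_function, and call_function are hypothetical.

```python
functions = {}  # hypothetical stand-in for functiond: name -> list of body lines


def define_function(name, body_lines):
    """Register a function body under the given name."""
    functions[name] = list(body_lines)


def call_function(name, run_line):
    """Run every stored line of the named function through run_line."""
    if name not in functions:
        raise NameError(f'function {name!r} is not defined')
    for line in functions[name]:
        run_line(line)


define_function('a', ['print(1)', 'print(2)'])
executed = []
call_function('a', executed.append)
print(executed)  # → ['print(1)', 'print(2)']
```

Defining a function only stores its body; nothing runs until the name is looked up at call time, which is also why the definition's lines have to be skipped with doom when they are first encountered.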

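06

Extensions and supplements

Getting arguments

Getting arguments matters a great deal. Suppose you have designed a function, for example md5:

//written by the user
//md5('abc')

The interpreter knows that this line is a function call, but it does not know whether the call has arguments or what their values are. Without that information the line is useless, because the interpreter cannot process it.

So the arguments have to be extracted.

Let us think about writing a function for this.

The function uses a regular expression and takes three parameters:

keyword, text, env

The regular expression looks in text for keyword followed by (...) and captures whatever is inside the parentheses.

The captured content is then split on , to form the individual arguments.

Whenever a variable is encountered, the get_variable method of the env object is called to substitute its value.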

import re

def get_codekey(keyword, text, env):
    # Capture whatever sits inside keyword(...), up to the first ")"
    pattern = rf".*?{re.escape(keyword)}\((.*?)\)"
    match = re.search(pattern, text)
    if match:
        args_raw = match.group(1)
        def replace_variables(arg):
            # Swap each {name} for the value stored in env
            for var_match in re.finditer(r'\{(\w+)\}', arg):
                var_name = var_match.group(1)
                arg = arg.replace('{' + var_name + '}', str(env.get_variable(var_name)))
            return arg
        args = [replace_variables(arg.strip()) for arg in args_raw.split(',') if arg.strip()]
        # Always return a pair; a missing argument is reported as None
        return tuple(args[:2]) if len(args) >= 2 else (args[0] if args else None, None)
    else:
        return None, None

def get_codekey2(keyword, text, env):
    # Capture the argument text; nested parentheses are not allowed inside it
    pattern = rf".*?{re.escape(keyword)}\(((?:[^()]|//|/|\\)*)\)"
    match = re.search(pattern, text)
    if match:
        args_raw = match.group(1).strip()
        def replace_variable(matchobj):
            var_name = matchobj.group(1).strip()
            return str(env.get_variable(var_name))
        # Substitute every {name} before splitting into arguments
        args_processed = re.sub(r'\{(\w+)\}', replace_variable, args_raw)
        if ',' in args_processed:
            return [arg.strip() for arg in args_processed.split(',') if arg.strip()]
        else:
            return [args_processed]
    else:
        return None

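For functions with several parameters, get_codekey2 is recommended; for functions with a single parameter, get_codekey is recommended.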
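
To see argument extraction end to end, here is a small self-contained sketch that parses the md5('abc') example and computes the digest with Python's hashlib. It is a simplified illustration and not the interpreter's own code: parse_args and call_md5 are hypothetical names, a plain dict replaces the env object, and stripping the surrounding quotes is an assumption about how string arguments should be treated.

```python
import hashlib
import re


def parse_args(keyword, text, variables):
    """Return the arguments inside keyword(...) as a list of strings, or None."""
    match = re.search(rf"{re.escape(keyword)}\(([^()]*)\)", text)
    if not match:
        return None
    # Replace each {name} with the value of that variable
    raw = re.sub(r'\{(\w+)\}', lambda m: str(variables[m.group(1)]), match.group(1))
    # Split on commas, trim whitespace, and drop surrounding quotes
    return [arg.strip().strip('\'"') for arg in raw.split(',') if arg.strip()]


def call_md5(text, variables):
    """Evaluate a line such as md5('abc') and return the hex digest."""
    args = parse_args('md5', text, variables)
    if not args:
        raise ValueError('md5 expects one argument')
    return hashlib.md5(args[0].encode('utf-8')).hexdigest()


print(call_md5("md5('abc')", {}))
# → 900150983cd24fb0d6963f7d28e17f72
print(parse_args('add', 'add(1, {n})', {'n': 2}))
# → ['1', '2']
```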