Implementing an Interpreter in Python: A First Look at the Python Interpreter

Latest recommended article published on 2024-04-02 06:32:09

weixin_39677104

Tags: implementing an interpreter in Python

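A Python Interpreter Written in Python is an excellent article whose author implements a Python interpreter (Byterun) in Python. The first half of the article explains the basic structure of a Python interpreter and builds a toy interpreter with a simplified instruction set.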

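Sometimes the whole process of interpreting and executing Python code is called the "Python interpreter". That process consists of the following steps:

1. lexer

2. parser

3. compiler

4. interpreter

The source text is the input. Steps 1, 2 and 3 produce a code object, which becomes the input to step 4. The interpreter discussed here covers only step 4.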
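The four steps can be traced from Python itself. The following is only a sketch: CPython's own lexer and parser are internal, so the standard-library tokenize and ast modules are used here as stand-ins that expose comparable stages, and the filename "<example>" is an arbitrary placeholder.

```python
import ast
import io
import tokenize

source = "7 + 5"

# Step 1 (lexer): split the source text into tokens.
# Only number and operator tokens are kept; bookkeeping tokens are dropped.
tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type in (tokenize.NUMBER, tokenize.OP)
]

# Step 2 (parser): build an abstract syntax tree from the source.
tree = ast.parse(source, mode="eval")

# Step 3 (compiler): turn the tree into a code object.
code = compile(tree, "<example>", "eval")

# Step 4 (interpreter): execute the code object.
result = eval(code)

print(tokens)
print(type(tree.body).__name__)
print(type(code).__name__)
print(result)
```

Only the last call, eval(code), corresponds to the part that the toy interpreter in this post imitates.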

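It may be worth explaining why Python, an interpreted language, has a compiler at all. A so-called interpreted language simply does relatively less work in its compilation step than compiled languages such as C or Java.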

By nature, the Python interpreter is a virtual machine, and more specifically a bytecode interpreter. Virtual machines can be stack-based or register-based; the Python interpreter belongs to the former group.
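To make the distinction concrete, here is a minimal sketch that computes 7 + 5 on two hypothetical machines. The instruction names (PUSH, LOAD, ADD) and the register names (r0, r1, r2) are invented for illustration and do not come from Python or from the article.

```python
# Stack-based: operands are implicit and are taken from the top of the stack.
stack_program = [("PUSH", 7), ("PUSH", 5), ("ADD", None)]


def run_stack(program):
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
    return stack.pop()


# Register-based: every instruction names its destination and operand registers.
register_program = [
    ("LOAD", "r0", 7),
    ("LOAD", "r1", 5),
    ("ADD", "r2", "r0", "r1"),
]


def run_register(program, result_register="r2"):
    registers = {}
    for op, dest, *operands in program:
        if op == "LOAD":
            registers[dest] = operands[0]
        elif op == "ADD":
            registers[dest] = registers[operands[0]] + registers[operands[1]]
    return registers[result_register]


print(run_stack(stack_program))
print(run_register(register_program))
```

The stack version needs no operand names in ADD, which keeps instructions small; the register version spells out where every value lives.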

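The input to the Python interpreter is a code object, an object that contains bytecode. Bytecode is a sequence of instructions and serves as the intermediate representation (IR) of Python code.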
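A real code object can be inspected directly. The sketch below compiles a small expression and looks at a few of its attributes; the expression and the filename "<example>" are arbitrary choices, and the exact bytes in co_code differ between Python versions, so they are not shown.

```python
# Compile an expression into a code object without running it.
code = compile("a + b", "<example>", "eval")

# The instructions themselves are stored as raw bytes.
instructions = code.co_code

# Data used by the instructions is stored separately from them.
names = code.co_names

# Running the code object requires values for the names it refers to.
result = eval(code, {"a": 7, "b": 5})

print(type(instructions).__name__)
print(names)
print(result)
```

The instructions refer to entries of co_names by index rather than embedding the names themselves.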

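Take 7 + 5 as an example. Expressed with a set of toy instructions it looks like the following, where what_to_execute plays the role of the code object and instructions plays the role of the bytecode.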

what_to_execute = {
    "instructions": [
        ("LOAD_VALUE", 0),  # the first number
        ("LOAD_VALUE", 1),  # the second number
        ("ADD_TWO_VALUES", None),
        ("PRINT_ANSWER", None),
    ],
    "numbers": [7, 5],
}
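Before variables enter the picture, these four instructions can be executed with nothing more than a list used as a stack. The following is a minimal sketch rather than the article's own code: the function name run and the printed list (kept so the result can be checked) are assumptions made for this example.

```python
what_to_execute = {
    "instructions": [
        ("LOAD_VALUE", 0),
        ("LOAD_VALUE", 1),
        ("ADD_TWO_VALUES", None),
        ("PRINT_ANSWER", None),
    ],
    "numbers": [7, 5],
}


def run(code):
    """Execute the toy instructions and return everything that was printed."""
    stack = []
    printed = []
    for instruction, argument in code["instructions"]:
        if instruction == "LOAD_VALUE":
            # The argument is an index into the "numbers" table.
            stack.append(code["numbers"][argument])
        elif instruction == "ADD_TWO_VALUES":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif instruction == "PRINT_ANSWER":
            answer = stack.pop()
            print(answer)
            printed.append(answer)
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return printed


output = run(what_to_execute)
```

Note that LOAD_VALUE carries an index, not the number itself; the number lives in a separate table next to the instructions.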

Adding support for variables gives us the toy Python interpreter below.

class Interpreter:
    def __init__(self):
        self.stack = []        # operand stack
        self.environment = {}  # maps variable names to values

    def LOAD_VALUE(self, number):
        self.stack.append(number)

    def PRINT_ANSWER(self):
        answer = self.stack.pop()
        print(answer)

    def ADD_TWO_VALUES(self):
        first_num = self.stack.pop()
        second_num = self.stack.pop()
        total = first_num + second_num
        self.stack.append(total)

    def STORE_NAME(self, name):
        val = self.stack.pop()
        self.environment[name] = val

    def LOAD_NAME(self, name):
        val = self.environment[name]
        self.stack.append(val)

    def parse_argument(self, instruction, argument, what_to_execute):
        """ Understand what the argument to each instruction means. """
        numbers = ["LOAD_VALUE"]
        names = ["LOAD_NAME", "STORE_NAME"]

        if instruction in numbers:
            argument = what_to_execute["numbers"][argument]
        elif instruction in names:
            argument = what_to_execute["names"][argument]

        return argument

    def run_code(self, what_to_execute):
        instructions = what_to_execute["instructions"]
        for each_step in instructions:
            instruction, argument = each_step
            argument = self.parse_argument(instruction, argument, what_to_execute)

            if instruction == "LOAD_VALUE":
                self.LOAD_VALUE(argument)
            elif instruction == "ADD_TWO_VALUES":
                self.ADD_TWO_VALUES()
            elif instruction == "PRINT_ANSWER":
                self.PRINT_ANSWER()
            elif instruction == "STORE_NAME":
                self.STORE_NAME(argument)
            elif instruction == "LOAD_NAME":
                self.LOAD_NAME(argument)

    # better run_code making use of Python's dynamic method lookup
    def execute(self, what_to_execute):
        instructions = what_to_execute["instructions"]
        for each_step in instructions:
            instruction, argument = each_step
            argument = self.parse_argument(instruction, argument, what_to_execute)
            bytecode_method = getattr(self, instruction)
            if argument is None:
                bytecode_method()
            else:
                bytecode_method(argument)

what_to_execute = {
    "instructions": [("LOAD_VALUE", 0),
                     ("STORE_NAME", 0),
                     ("LOAD_VALUE", 1),
                     ("STORE_NAME", 1),
                     ("LOAD_NAME", 0),
                     ("LOAD_NAME", 1),
                     ("ADD_TWO_VALUES", None),
                     ("PRINT_ANSWER", None)],
    "numbers": [1, 2],
    "names": ["a", "b"]}

interpreter = Interpreter()
interpreter.run_code(what_to_execute)
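The listing calls run_code, but the getattr-based execute method is meant to give the same result without the if/elif chain. The dispatch idea can be isolated in a compact sketch. MiniVM and its tables mapping are hypothetical names introduced for this example, and the program leaves its result on the stack instead of printing it so that the final state can be inspected.

```python
class MiniVM:
    """Hypothetical, trimmed-down variant of the Interpreter class."""

    def __init__(self):
        self.stack = []
        self.environment = {}

    def LOAD_VALUE(self, number):
        self.stack.append(number)

    def STORE_NAME(self, name):
        self.environment[name] = self.stack.pop()

    def LOAD_NAME(self, name):
        self.stack.append(self.environment[name])

    def ADD_TWO_VALUES(self):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(left + right)

    def execute(self, code):
        # Which lookup table each instruction's argument indexes into.
        tables = {"LOAD_VALUE": "numbers", "LOAD_NAME": "names", "STORE_NAME": "names"}
        for instruction, argument in code["instructions"]:
            # Look the handler up by name instead of using an if/elif chain.
            method = getattr(self, instruction)
            if instruction in tables:
                method(code[tables[instruction]][argument])
            else:
                method()


program = {
    "instructions": [("LOAD_VALUE", 0), ("STORE_NAME", 0),
                     ("LOAD_VALUE", 1), ("STORE_NAME", 1),
                     ("LOAD_NAME", 0), ("LOAD_NAME", 1),
                     ("ADD_TWO_VALUES", None)],
    "numbers": [1, 2],
    "names": ["a", "b"],
}

vm = MiniVM()
vm.execute(program)
print(vm.environment, vm.stack)
```

With this design, supporting a new instruction only requires adding a method with the matching name. An unknown instruction name makes getattr raise AttributeError.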

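In Python, given a function object obj, obj.__code__ returns its code object and obj.__code__.co_code returns the bytecode. The raw bytecode is a bytes object and is not human-readable. The dis module (the bytecode disassembler) helps here: dis.opname[n] maps an opcode number n to its name, and dis.dis(obj) prints a readable disassembly of the function's bytecode.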
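A short sketch of these tools follows. The function add is an arbitrary example; the opcode names and byte layout it produces vary between Python versions, so no specific output is claimed here.

```python
import dis


def add(a, b):
    return a + b


code = add.__code__

# The raw bytecode is a bytes object: opcodes interleaved with their arguments.
raw = code.co_code
print(type(raw).__name__, len(raw))

# dis.opname is a list indexed by opcode number; the first byte is an opcode.
first_name = dis.opname[raw[0]]
print(first_name)

# dis.get_instructions decodes the whole byte string into instruction objects.
names = [ins.opname for ins in dis.get_instructions(add)]
print(names)

# dis.dis prints a human-readable listing.
dis.dis(add)
```

Indexing dis.opname with arbitrary bytes from co_code is not reliable, because some bytes are arguments rather than opcodes; dis.get_instructions handles that decoding.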

The second half of the article moves on to real Python bytecode; the section on frames is especially worth reading.
