Python Code Objects

0. References

References:

  • Python official documentation, code objects (C API): https://docs.python.org/3/c-api/code.html
  • Python official documentation, inspect module: https://docs.python.org/3/library/inspect.html
  • CPython source on GitHub, Lib/opcode.py: https://github.com/python/cpython/blob/main/Lib/opcode.py
  • CPython source on GitHub, Include/opcode.h: https://github.com/python/cpython/blob/main/Include/opcode.h
  • Code Objects: https://nanguage.gitbook.io/inside-python-vm-cn/5.-code-objects
  • An explanation of code objects in Python: https://blog.csdn.net/jpch89/article/details/86764245
  • Notes on Python opcodes: https://unpyc.sourceforge.net/Opcodes.html

1. Understanding code objects

The official Python documentation describes code objects as follows:

Code objects are a low-level detail of the CPython implementation. Each one represents a chunk of executable code that hasn’t yet been bound into a function.

This description is quite abstract on its own, so the examples below show concretely what a code object is.

1.1. A simple example

To introduce the code object, the subject of this article, start with a small example:

def func():
    pass

print(type(func))
print(func.__code__)  # the key line

Output:

<class 'function'>
<code object func at 0x7fc6277269d0, file "/home/xd/project/learn_python/test/test.py", line 1>

The code object printed from the __code__ attribute in this example is the object this article is about.
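To make the phrase "not yet bound into a function" concrete, here is a minimal sketch that builds a code object with compile() and only afterwards binds it into a function. The source string, the "<example>" filename, and the names module_code, add_code and add are illustrative assumptions, not part of the original example.

```python
import types

# Compile source text directly; the result is a code object, not a function.
source = "def add(x, y):\n    return x + y\n"
module_code = compile(source, "<example>", "exec")
print(type(module_code))

# The code object for `add` is stored among the constants of the module-level code.
add_code = next(c for c in module_code.co_consts if isinstance(c, types.CodeType))
print(add_code.co_name)

# Binding the code object to a globals dict produces a callable function.
add = types.FunctionType(add_code, {})
print(add(2, 3))
```

The same code object could be bound more than once, each time with a different globals dict; the function is the binding, the code object is the reusable compiled body.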

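1.2. The official documentation on code objects

The official Python documentation covers code objects in the inspect module reference:

Python inspect documentation: https://docs.python.org/3/library/inspect.html

The part that concerns code objects is excerpted here:
[Figure python_inspect_code_object_001: excerpt of the code object attributes from the inspect documentation]

Each attribute is followed by an explanation of its meaning. This is the most authoritative description, so consult it whenever you need the details.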

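1.3. Checking the attributes of a code object in practice

We can print the attributes with dir():
[Figure python_inspect_code_object_002: dir() output for a code object]

Many of the attribute names start with "co_"; these are the ones to focus on.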
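Since the screenshot is not reproduced here, the following is a small sketch of what such a dir() call might look like. The exact list depends on the Python version, so no output is shown; the name co_attrs is an illustrative assumption.

```python
def func():
    pass

# Full attribute list of the code object (dunder methods included).
print(dir(func.__code__))

# Keep only the names that start with "co_".
co_attrs = [name for name in dir(func.__code__) if name.startswith("co_")]
print(co_attrs)
```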

2. The attributes of a code object

2.1. Listing the attributes of a code object

We write a simple example function and print the name and value of every attribute that starts with "co_":

def func(a, b=3, *args, **kwargs):
    c = a + b
    mm = 111
    str_mm = "test test"
    print(a + b + mm)
    return c


for attr in dir(func.__code__):
    if attr.startswith('co_'):
        print(f"{attr}:\t{getattr(func.__code__, attr)}")

Output:

co_argcount:	2
co_cellvars:	()
co_code:	b'|\x00|\x01\x17\x00}\x04d\x01}\x05d\x02}\x06t\x00|\x00|\x01\x17\x00|\x05\x17\x00\x83\x01\x01\x00|\x04S\x00'
co_consts:	(None, 111, 'test test')
co_filename:	/home/xd/project/learn_python/test/test.py
co_firstlineno:	1
co_flags:	79
co_freevars:	()
co_kwonlyargcount:	0
co_lnotab:	b'\x00\x01\x08\x01\x04\x01\x04\x01\x10\x01'
co_name:	func
co_names:	('print',)
co_nlocals:	7
co_posonlyargcount:	0
co_stacksize:	3
co_varnames:	('a', 'b', 'args', 'kwargs', 'c', 'mm', 'str_mm')

Some of these attributes are easy to understand and some are not. The next example groups them by category.
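The co_flags value in the output above (79) is a bitmap, so it is easier to read once decoded. The sketch below is one possible way to do that, assuming the CO_* constants exposed by the inspect module; decode_flags is a hypothetical helper, not a standard library function. On the 3.8 run shown above, 79 = 1 + 2 + 4 + 8 + 64, which corresponds to CO_OPTIMIZED, CO_NEWLOCALS, CO_VARARGS, CO_VARKEYWORDS and CO_NOFREE; the set of flags can differ on other Python versions.

```python
import inspect

def func(a, b=3, *args, **kwargs):
    return a + b

def decode_flags(flags):
    """Return the names of the inspect.CO_* constants whose bit is set in flags.

    Hypothetical helper for illustration only.
    """
    names = []
    for name in dir(inspect):
        if name.startswith("CO_"):
            value = getattr(inspect, name)
            if isinstance(value, int) and flags & value:
                names.append(name)
    return sorted(names)

flag_names = decode_flags(func.__code__.co_flags)
print(func.__code__.co_flags, flag_names)
```

Because func accepts *args and **kwargs, CO_VARARGS and CO_VARKEYWORDS should appear in the decoded list regardless of version.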

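2.2. Examining the attributes of a code object

This part draws on the video: https://www.bilibili.com/video/BV12i4y1C7MH/

The example code is almost the same as above, except that the printed attributes are grouped by category and annotated with the official descriptions: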

def func(a, b=3, *args, **kwargs):
    c = a + b
    mm = 111
    str_mm = "test test"
    print(a + b + mm)
    return c


code = func.__code__
print(f"{code.co_code = }")  # string of raw compiled bytecode
print(f"{len(code.co_code) = }")

print(f"{code.co_name = }")  # name with which this code object was defined
# co_filename: name of file in which this code object was created
print(f"{code.co_filename = }")
# co_lnotab: encoded mapping of line numbers to bytecode indices
print(f"{code.co_lnotab = }")

# co_flags: bitmap of CO_* flags, read more:
# https://docs.python.org/3/library/inspect.html#inspect-module-co-flags
print(f"{code.co_flags = }")
print(f"{code.co_stacksize = }")  # virtual machine stack space required

# number of arguments (not including keyword only arguments, * or ** args)
print(f"{code.co_argcount = }")
# co_posonlyargcount: number of positional only arguments
print(f"{code.co_posonlyargcount = }")
# co_kwonlyargcount: number of keyword only arguments (not including ** arg)
print(f"{code.co_kwonlyargcount = }")

print(f"{code.co_nlocals = }")  # number of local variables
# co_varnames: tuple of names of arguments and local variables
print(f"{code.co_varnames = }")
# co_names: tuple of names other than arguments and function locals
print(f"{code.co_names = }")
# co_cellvars: tuple of names of cell variables (referenced by containing scopes)
print(f"{code.co_cellvars = }")
# co_freevars: tuple of names of free variables (referenced via a function’s closure)
print(f"{code.co_freevars = }")

print(f"{code.co_consts = }")  # tuple of constants used in the bytecode

Output (the comments were removed here to keep the display compact):

[Figure python_inspect_code_object_003: output of the script above]
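In the example above, co_cellvars and co_freevars are empty and both co_posonlyargcount and co_kwonlyargcount are 0, so those attributes are hard to interpret from that output alone. The following sketch uses two hypothetical functions, outer/inner and kinds, chosen only to make these attributes non-trivial. It assumes Python 3.8 or later because of the positional-only "/" syntax.

```python
def outer(x):
    def inner(y):
        return x + y      # x is looked up in the enclosing scope
    return inner

inner = outer(1)
# x is a cell variable of outer (an inner scope references it) ...
print(outer.__code__.co_cellvars)   # ('x',)
# ... and a free variable of inner (reached through its closure).
print(inner.__code__.co_freevars)   # ('x',)

def kinds(p, /, a, *, k):
    return p, a, k

code = kinds.__code__
# co_argcount counts p and a (positional-only arguments are included),
# but not the keyword-only argument k.
print(code.co_posonlyargcount, code.co_argcount, code.co_kwonlyargcount)  # 1 2 1
```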

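2.3. Examining the code object in the CPython source

So far the attributes of a code object have been established by experiment and from the official documentation. To confirm them at the source, we look at the definition of the code object in the CPython source.

Note: all the code below is taken from the 3.8 branch of the CPython source; the C implementation may differ in other branches.

The struct for the Python code object is defined in Include/code.h: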

/* Bytecode object */
typedef struct {
    PyObject_HEAD
    int co_argcount;            /* #arguments, except *args */
    int co_posonlyargcount;     /* #positional only arguments */
    int co_kwonlyargcount;      /* #keyword only arguments */
    int co_nlocals;             /* #local variables */
    int co_stacksize;           /* #entries needed for evaluation stack */
    int co_flags;               /* CO_..., see below */
    int co_firstlineno;         /* first source line number */
    PyObject *co_code;          /* instruction opcodes */
    PyObject *co_consts;        /* list (constants used) */
    PyObject *co_names;         /* list of strings (names used) */
    PyObject *co_varnames;      /* tuple of strings (local variable names) */
    PyObject *co_freevars;      /* tuple of strings (free variable names) */
    PyObject *co_cellvars;      /* tuple of strings (cell variable names) */
    /* The rest aren't used in either hash or comparisons, except for co_name,
       used in both. This is done to preserve the name and line number
       for tracebacks and debuggers; otherwise, constant de-duplication
       would collapse identical functions/lambdas defined on different lines.
    */
    Py_ssize_t *co_cell2arg;    /* Maps cell vars which are arguments. */
    PyObject *co_filename;      /* unicode (where it was loaded from) */
    PyObject *co_name;          /* unicode (name, for reference) */
    PyObject *co_lnotab;        /* string (encoding addr<->lineno mapping) See
                                   Objects/lnotab_notes.txt for details. */
    void *co_zombieframe;       /* for optimization only (see frameobject.c) */
    PyObject *co_weakreflist;   /* to support weakrefs to code objects */
    /* Scratch space for extra data relating to the code object.
       Type is a void* to keep the format private in codeobject.c to force
       people to go through the proper APIs. */
    void *co_extra;

    /* Per opcodes just-in-time cache
     *
     * To reduce cache size, we use indirect mapping from opcode index to
     * cache object:
     *   cache = co_opcache[co_opcache_map[next_instr - first_instr] - 1]
     */

    // co_opcache_map is indexed by (next_instr - first_instr).
    //  * 0 means there is no cache for this opcode.
    //  * n > 0 means there is cache in co_opcache[n-1].
    unsigned char *co_opcache_map;
    _PyOpcache *co_opcache;
    int co_opcache_flag;  // used to determine when create a cache.
    unsigned char co_opcache_size;  // length of co_opcache.
} PyCodeObject;

The familiar co_argcount, co_posonlyargcount, co_kwonlyargcount, co_code and so on are all defined here; this struct is, after all, the authoritative definition.
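As a small cross-check from the Python side, the sketch below looks at how a few of these struct fields appear at runtime. The comments in the struct call co_consts and co_names lists, but when read from Python they are tuples. The function is a shortened, illustrative variant of the earlier example.

```python
def func(a, b=3, *args, **kwargs):
    c = a + b
    return c

code = func.__code__

# The PyObject* fields for constants and names are exposed as tuples.
print(type(code.co_consts), type(code.co_names), type(code.co_varnames))

# The int field co_nlocals matches the number of names in co_varnames
# for this function (arguments first, then other locals).
print(code.co_nlocals, code.co_varnames)
```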

3. Comparing co_code with the disassembly

With a small change to the example above:

import dis
def func(a, b=3, *args, **kwargs):
    c = a + b
    mm = 111
    str_mm = "test test"
    print(a + b + mm)
    return c


code = func.__code__
print(f"{len(code.co_code) = }")
print(f"{'*' * 90}")
print(f"{code.co_code}")
print(f"{'*' * 90}")
dis.dis(func)  # dis.dis() prints the disassembly itself and returns None

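Output:
[Figure python_inspect_code_object_004: raw co_code bytes followed by the dis.dis() disassembly]

As you can see:

Reading the co_code attribute of a code object directly is very inconvenient. We usually disassemble it with dis.dis() before reading it, which is also the main use case for dis.dis().

Note:

The dis module is part of the Python standard library. It provides disassembly so that Python bytecode is easier to read.
For details, see my separate article on the topic.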
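To show how the raw bytes in co_code relate to the disassembly, here is a minimal sketch that walks the instructions with dis.get_instructions() and looks up the byte stored at each instruction's offset. The shortened function and the rows list are illustrative assumptions. Opcode numbers and names differ between Python versions, so no expected output is shown.

```python
import dis

def func(a, b=3):
    c = a + b
    return c

code = func.__code__
rows = []
for ins in dis.get_instructions(func):
    # ins.offset is an index into co_code; the byte there is the numeric opcode.
    raw_opcode = code.co_code[ins.offset]
    rows.append((ins.offset, raw_opcode, dis.opname[raw_opcode], ins.argrepr))

# Print offset, raw opcode byte, opcode name, and the decoded argument.
for offset, raw_opcode, name, argrepr in rows:
    print(f"{offset:>4}  {raw_opcode:>3}  {name:<20} {argrepr}")
```

Each row pairs one position in the byte string with the human-readable name that dis.dis() would print, which is the translation step that is tedious to do by hand.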
