From the pdb Source Code to the frame Object

Latest recommended article published on 2022-09-25 20:17:44

懒编程-二两

Article link: https://blog.csdn.net/weixin_30230009/article/details/125476790

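This article takes a close look at how pdb, Python's built-in debugger, works. Starting from pdb.set_trace(), it walks through the frame object, sys.settrace, the basic pdb flow, and the source code of pdb's commands, and shows how the debugger obtains source context during a debugging session.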

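Preface

While debugging a Python program with pdb, I ran into a case where the l and ll commands could not show the surrounding source code (see the screenshot in the original post).

So I decided to dig into how pdb is written and find out why it sometimes fails to retrieve the source context.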

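Minimal example

pdb is Python's built-in debugger. It is written in pure Python on top of two standard-library modules, cmd and bdb. It works well in most situations; even so, IDEs such as PyCharm and VS Code do not use the stock pdb and instead ship their own Python debuggers that integrate with the IDE.

To get an intuitive feel for how pdb runs, let's build a minimal example that starts it: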

import pdb


def fib(n):
    a, b = 1, 1
    # set a breakpoint
    pdb.set_trace()
    for i in range(n - 1):
        a, b = b, a + b

    return a


fib(10)

I ran the code above in PyCharm and then stepped into it with the debugger.

When pdb.set_trace() is called, the first thing it does is create a Pdb instance:

def set_trace(*, header=None):
    # instantiate Pdb
    pdb = Pdb()
    if header is not None:
        pdb.message(header)
    pdb.set_trace(sys._getframe().f_back)

Instantiation calls the __init__ method:

class Pdb(bdb.Bdb, cmd.Cmd):

    _previous_sigint_handler = None

    def __init__(self, completekey='tab', stdin=None, stdout=None, skip=None,
                 nosigint=False, readrc=True):
        bdb.Bdb.__init__(self, skip=skip)
        cmd.Cmd.__init__(self, completekey, stdin, stdout)
        sys.audit("pdb.Pdb")
        # ... omitted

As the class definition shows, Pdb inherits from both bdb.Bdb and cmd.Cmd.
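The inheritance can be checked directly from the interpreter. The snippet below is a small sketch, not part of pdb itself: it prints the method resolution order and collects the names of the do_* methods, which is the naming convention cmd.Cmd uses to map typed commands to handlers. The variable names are illustrative.

```python
import bdb
import cmd
import pdb

# Pdb mixes in both base classes: Bdb supplies tracing, Cmd supplies the prompt.
print([cls.__name__ for cls in pdb.Pdb.__mro__])

is_debugger = issubclass(pdb.Pdb, bdb.Bdb)
is_shell = issubclass(pdb.Pdb, cmd.Cmd)

# Each do_<name> method is reachable as the <name> command at the (Pdb) prompt.
commands = sorted(name[3:] for name in dir(pdb.Pdb) if name.startswith("do_"))
print(commands)
```

The list should include l and ll, the two commands that prompted this investigation.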

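The built-in bdb module is the core framework behind Python's debugging support. It builds on the dynamic instrumentation offered by sys.settrace to implement single-step debugging. The cmd module is a general-purpose module for writing interactive command interpreters; it was not designed specifically for pdb.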
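To make the role of sys.settrace concrete, here is a minimal sketch of a trace function. It is not how bdb is written, only the same hook used in its simplest form; the names tracer, add, and events are made up for the illustration. The interpreter calls the global trace function with a 'call' event for every new frame, and the function it returns becomes that frame's local trace function, which then receives 'line' and 'return' events.

```python
import sys

events = []  # (event, function name, line number) tuples


def tracer(frame, event, arg):
    # Record only events that belong to the function we care about.
    if frame.f_code.co_name == "add":
        events.append((event, frame.f_code.co_name, frame.f_lineno))
    # Returning the tracer keeps line-level tracing active inside this frame.
    return tracer


def add(x, y):
    total = x + y
    return total


sys.settrace(tracer)
try:
    result = add(1, 2)
finally:
    # Always remove the hook, otherwise every later call keeps being traced.
    sys.settrace(None)

for event in events:
    print(event)
```

A debugger built on this hook decides, inside the trace function, whether to keep running or to stop and hand control to the user.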

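Let's start with the simpler of the two, cmd.

cmd is a standard-library module for building interactive shells, and it makes writing your own shell easy. Here is a brief demonstration (it is not the focus of this article, so we won't go into detail):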

from cmd import Cmd


class MyCmd(Cmd):
    def __init__(self):
        Cmd.__init__(self)

    def do_name(self, name):
        print(f'Hello, {name}')

    def do_exit(self, arg):
        print('Bye!')
        return True


if __name__ == '__main__':
    mycmd = MyCmd()
    mycmd.cmdloop()

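The code above defines MyCmd, a subclass of Cmd, and implements do_name and do_exit, which handle the name and exit commands respectively. Calling cmdloop starts the shell: typing name Tom at the (Cmd) prompt prints Hello, Tom, and typing exit prints Bye! and ends the loop, because do_exit returns True.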
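The dispatch from a command line to a do_* method can also be exercised without an interactive prompt. The sketch below assumes a hypothetical GreetCmd class similar to MyCmd, writes to an in-memory buffer instead of the terminal, and feeds lines through onecmd, the method cmdloop calls for each line of input.

```python
import io
from cmd import Cmd


class GreetCmd(Cmd):
    def do_name(self, name):
        # Everything after the command word arrives as one string argument.
        self.stdout.write(f"Hello, {name}\n")

    def do_exit(self, arg):
        self.stdout.write("Bye!\n")
        return True  # a true return value tells cmdloop to stop


out = io.StringIO()
shell = GreetCmd(stdout=out)

# onecmd parses the line, looks up do_<command>, and returns its result.
stop_after_name = shell.onecmd("name pdb")
stop_after_exit = shell.onecmd("exit")

print(out.getvalue())
```

Pdb relies on the same mechanism: each debugger command is a do_* method on the Pdb class.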

The frame object

Let's look at the set_trace function again:

def set_trace(*, header=None):
    pdb = Pdb()
    if header is not None:
        pdb.message(header)
    pdb.set_trace(sys._getframe().f_back)

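After the instance is created, sys._getframe().f_back fetches a frame object, which is then passed to the pdb.set_trace method.

sys._getframe() returns the current frame (stack frame), and its f_back attribute is the frame of the caller, that is, the code that invoked pdb.set_trace().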
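The difference between the current frame and its f_back can be seen in a few lines. This is an illustrative sketch with made-up function names; note that sys._getframe is a CPython implementation detail and is not guaranteed to exist in other Python implementations.

```python
import sys


def who_called_me():
    current = sys._getframe()  # frame of who_called_me itself
    caller = current.f_back    # frame of whoever called it
    return current.f_code.co_name, caller.f_code.co_name, caller.f_lineno


def outer():
    return who_called_me()


current_name, caller_name, caller_line = outer()
print(current_name, caller_name, caller_line)
```

This is why set_trace passes f_back: the debugger should stop in the user's code, not inside pdb's own set_trace function.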

When Python code runs, the interpreter creates the corresponding PyFrameObject objects (the frames mentioned above). The definition of PyFrameObject can be found in the CPython source:

typedef struct _frame {
    PyObject_VAR_HEAD
    struct _frame *f_back;      /* previous frame, or NULL */
    PyCodeObject *f_code;       /* code segment */
    PyObject *f_builtins;       /* builtin symbol table (PyDictObject) */
    PyObject *f_globals;        /* global symbol table (PyDictObject) */
    PyObject *f_locals;         /* local symbol table (any mapping) */
    PyObject **f_valuestack;    /* points after the last local */
    /* Next free slot in f_valuestack.  Frame creation sets to f_valuestack.
       Frame evaluation usually NULLs it, but a frame that yields sets it
       to the current stack top. */
    PyObject **f_stacktop;
    ...
    int f_lasti;                /* Last instruction if called */
    /* Call PyFrame_GetLineNumber() instead of reading this field
       directly.  As of 2.3 f_lineno is only valid when tracing is
       active (i.e. when f_trace is set).  At other times we use
       PyCode_Addr2Line to calculate the line from the current
       bytecode index. */
    int f_lineno;               /* Current line number */
    int f_iblock;               /* index in f_blockstack */
    char f_executing;           /* whether the frame is still executing */
    PyTryBlock f_blockstack[CO_MAXBLOCKS]; /* for try and loop blocks */
    PyObject *f_localsplus[1];  /* locals+stack, dynamically sized */
} PyFrameObject;
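Several of these C fields are exposed as attributes of the Python-level frame object. The sketch below reads a few of them from inside a function; the function name and dictionary keys are illustrative. Keep in mind that the struct layout above belongs to a particular CPython version and the internals have changed across releases, while the attributes used here are the documented Python-level interface.

```python
import sys


def inspect_frame():
    marker = "local value"
    frame = sys._getframe()
    return {
        "code_name": frame.f_code.co_name,           # f_code: the code object
        "has_marker": "marker" in frame.f_locals,    # f_locals: local names
        "same_globals": frame.f_globals is globals(),  # f_globals: module namespace
        "has_builtins": "len" in frame.f_builtins,   # f_builtins: builtin names
        "lineno": frame.f_lineno,                    # current line number
        "lasti": frame.f_lasti,                      # index of last bytecode instruction
    }


info = inspect_frame()
print(info)
```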

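While a program runs, Python creates many PyFrameObject objects. They are linked to one another through f_back, forming a chain that represents the call stack. The interpreter works through the frames on this chain in order: a frame is pushed when its call begins and popped when the call finishes.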
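The chain can be walked from Python by following f_back until it is None. The following sketch uses hypothetical function names; the frames below level_one depend on how the script is launched, so only the first three entries are predictable.

```python
import sys


def call_chain():
    names = []
    frame = sys._getframe()
    # Follow f_back links from the innermost frame to the outermost one.
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return names


def level_two():
    return call_chain()


def level_one():
    return level_two()


chain = level_one()
print(chain)
```

The same traversal is what a debugger performs when it builds the list of frames shown by a stack-trace command.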

The comments in the PyFrameObject definition tell us that: