Python Bytecode: Learning About Python Bytecode and the Interpreter

1. What happens internally when you run a command at the interactive prompt

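When you press Return, Python performs four steps: lexing, parsing, compiling, and interpreting. The lexer breaks the line of code you just typed into tokens (identifiers, keywords, numbers, operators, and so on). The parser takes those tokens and builds a structure that represents the relationships among them (here, an abstract syntax tree). The compiler then takes the abstract syntax tree and turns it into one or more code objects. Finally, the interpreter takes each code object and executes the code it represents.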

Every command we enter goes through these four steps before it is executed.
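The four steps can be sketched with standard-library tools. This is only an illustration, assuming Python 3: tokenize, ast.parse, compile, and exec expose comparable stages, but they are not necessarily the exact internal path the interactive prompt takes. The source string and the "<example>" filename are made up for the sketch.

```python
import ast
import io
import tokenize

source = "a = 1 + 2\n"  # hypothetical line typed at the prompt

# 1. Lexing: split the source text into tokens.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
token_strings = [t.string for t in tokens if t.string.strip()]

# 2. Parsing: build an abstract syntax tree from the source.
tree = ast.parse(source, mode="exec")

# 3. Compiling: turn the tree into a code object.
code = compile(tree, filename="<example>", mode="exec")

# 4. Interpreting: execute the code object in a namespace.
namespace = {}
exec(code, namespace)

print(token_strings)
print(type(tree).__name__, type(code).__name__)
print(namespace["a"])
```

Each stage consumes the output of the previous one, which is why a syntax error is reported before any part of the line runs.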

2. Function objects

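Objects are the basic building blocks of object-oriented programming. In many languages functions can also be treated as objects: in JavaScript a function is a special kind of object, and in functional programming languages such as Haskell and OCaml functions are likewise first-class values.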

Because a function is an object, it supports the same operations as any other object: it can be allocated, freed, copied, assigned, and so on.

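"Functions are first-class objects" means that a function is an object like any other, just as a list or an instance of some class MyObject is an object. Since foo is an object, we can use it without calling it (that is, foo and foo() are very different things). We can pass foo as an argument to another function, or bind it to a new name (other_function = foo). With first-class functions, a great deal becomes possible.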

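Note also that a function as an object is distinct from a function call. A call is a dynamic process, whereas the function object is a static entity: you can apply operations to it according to its type or, in object-oriented terms, invoke the interface (methods) that the object provides.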
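A minimal sketch of the difference between foo and foo(), assuming Python 3. The helper apply_twice and the registry dictionary are hypothetical names introduced only for this illustration.

```python
def foo(a):
    return a + 3

def apply_twice(func, value):
    # func is received as an ordinary object and only called here.
    return func(func(value))

# Binding the function object to a new name does not call it.
other_function = foo

# Function objects can be stored in containers like any other value.
registry = {"add3": foo}

print(other_function is foo)   # same object under two names
print(apply_twice(foo, 1))     # foo(foo(1))
print(registry["add3"](10))
print(foo.__name__)            # attributes of the object itself
```

Only the expressions with parentheses after the name run the function body; everything else manipulates the function object.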

Since a function is an object, what members does a function object have?

>>> dir
<built-in function dir>
>>> dir(dir)
['__call__', '__class__', '__cmp__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__name__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__self__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']

>>> dir(dir.func_code)
Traceback (most recent call last):
  File "", line 1, in <module>
    dir(dir.func_code)
AttributeError: 'builtin_function_or_method' object has no attribute 'func_code'
>>> def foo(a):
    x = 3
    return x + a

>>> foo
<function foo at 0x...>
>>> dir(foo)
['__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__doc__', '__format__', '__get__', '__getattribute__', '__globals__', '__hash__', '__init__', '__module__', '__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']

>>>

The built-in function dir is documented as follows:

dir([object])

Without arguments, return the list of names in the current local scope. With an argument, attempt to return a list of valid attributes for that object.

If the object has a method named __dir__(), this method will be called and must return the list of attributes. This allows objects that implement a custom __getattr__() or __getattribute__() function to customize the way dir() reports their attributes.

If the object does not provide __dir__(), the function tries its best to gather information from the object’s __dict__ attribute, if defined, and from its type object. The resulting list is not necessarily complete, and may be inaccurate when the object has a custom __getattr__().

The default dir() mechanism behaves differently with different types of objects, as it attempts to produce the most relevant, rather than complete, information:

If the object is a module object, the list contains the names of the module’s attributes.

If the object is a type or class object, the list contains the names of its attributes, and recursively of the attributes of its bases.

Otherwise, the list contains the object’s attributes’ names, the names of its class’s attributes, and recursively of the attributes of its class’s base classes.

The resulting list is sorted alphabetically.
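The __dir__() hook described above can be sketched as follows, assuming Python 3. The Config class and its _values dictionary are hypothetical and exist only to show a class whose attributes come from __getattr__ and would otherwise be invisible to dir().

```python
class Config:
    """Hypothetical object whose attributes live in a private dict."""

    def __init__(self):
        self._values = {"host": "localhost", "port": 8080}

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        values = self.__dict__.get("_values", {})
        try:
            return values[name]
        except KeyError:
            raise AttributeError(name)

    def __dir__(self):
        # Report the default names plus the dynamically served ones.
        return list(super().__dir__()) + list(self._values)

cfg = Config()
print(cfg.port)
print([name for name in dir(cfg) if not name.startswith("_")])
```

Without the __dir__ method, host and port would still be readable through __getattr__ but would not show up in the list that dir() returns.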

Besides dir, the built-in help function is also important: it displays the help text for built-in functions.

First, let's see which modules the current Python process has loaded:

>>> import sys
>>> for i in sys.modules.keys():
...     print "%20s:\t%s\n" % (i, sys.modules[i])
...     print "*"*100
...

Each entry is printed as the module name followed by the module object, then a line of 100 asterisks. The modules loaded in this session were:

copy_reg, sre_compile, _sre, encodings, site, __builtin__, sysconfig, __main__, encodings.encodings (None), abc, posixpath, _weakrefset, errno, encodings.codecs (None), sre_constants, re, _abcoll, types, _codecs, encodings.__builtin__ (None), _warnings, genericpath, stat, zipimport, _sysconfigdata, warnings, UserDict, encodings.utf_8, sys, codecs, readline, _sysconfigdata_nd, os.path, sitecustomize, signal, traceback, linecache, posix, encodings.aliases, exceptions, sre_parse, os, _weakref
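For readers on Python 3, the same inspection might look like the sketch below. It is an adaptation, not the author's code: print is a function, and the helper name loaded_module_names is made up for the example. The exact set of modules depends on the interpreter and how it was started.

```python
import sys

def loaded_module_names():
    """Return the names of all currently loaded modules, sorted."""
    # Copy the keys first: sys.modules can change while we iterate.
    return sorted(list(sys.modules))

names = loaded_module_names()
for name in names[:10]:  # show only the first few entries
    print("%20s:\t%r" % (name, sys.modules.get(name)))
```

Sorting the names makes the listing stable between runs, which the dictionary order in the Python 2 session above does not guarantee.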

The members of the __builtin__ module can be listed with the following code:

>>> num = 0
>>> for i in dir(sys.modules["__builtin__"]):
...     print "%20s\t" % i,
...     num += 1
...     if num == 5:
...         print ""
...         num = 0
...
ArithmeticError AssertionError AttributeError BaseException BufferError
BytesWarning DeprecationWarning EOFError Ellipsis EnvironmentError
Exception False FloatingPointError FutureWarning GeneratorExit
IOError ImportError ImportWarning IndentationError IndexError
KeyError KeyboardInterrupt LookupError MemoryError NameError
None NotImplemented NotImplementedError OSError OverflowError
PendingDeprecationWarning ReferenceError RuntimeError RuntimeWarning StandardError
StopIteration SyntaxError SyntaxWarning SystemError SystemExit
TabError True TypeError UnboundLocalError UnicodeDecodeError
UnicodeEncodeError UnicodeError UnicodeTranslateError UnicodeWarning UserWarning
ValueError Warning ZeroDivisionError _ __debug__
__doc__ __import__ __name__ __package__ abs
all any apply basestring bin
bool buffer bytearray bytes callable
chr classmethod cmp coerce compile
complex copyright credits delattr dict
dir divmod enumerate eval execfile
exit file filter float format
frozenset getattr globals hasattr hash
help hex id input int
intern isinstance issubclass iter len
license list locals long map
max memoryview min next object
oct open ord pow print
property quit range raw_input reduce
reload repr reversed round set
setattr slice sorted staticmethod str
sum super tuple type unichr
unicode vars xrange zip
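In Python 3 the module is named builtins rather than __builtin__. A rough equivalent of the five-per-row listing above, with a hypothetical helper rows_of that is not part of the original session, could look like this:

```python
import builtins

def rows_of(names, width=5):
    """Split a list of names into rows of at most `width` items."""
    return [names[i:i + width] for i in range(0, len(names), width)]

names = dir(builtins)
rows = rows_of(names)
for row in rows:
    # Right-justify each name in a fixed-width column.
    print("".join("%28s" % name for name in row))
```

The names differ from the Python 2 list (for example, there is no apply, cmp, or xrange), so the output should not be expected to match the table above.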

3. How the built-in dir is implemented

In /Python-2.7.8/Objects/object.c:

/* Implementation of dir() -- if obj is NULL, returns the names in the current
   (local) scope.  Otherwise, performs introspection of the object: returns a
   sorted list of attribute names (supposedly) accessible from the object
*/
PyObject *
PyObject_Dir(PyObject *obj)
{
    PyObject * result;

    if (obj == NULL)
        /* no object -- introspect the locals */
        result = _dir_locals();
    else
        /* object -- introspect the object */
        result = _dir_object(obj);

    assert(result == NULL || PyList_Check(result));

    if (result != NULL && PyList_Sort(result) != 0) {
        /* sorting the list failed */
        Py_DECREF(result);
        result = NULL;
    }

    return result;
}

As you can see, this is essentially consistent with what help(dir) describes.
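The control flow of PyObject_Dir can be mirrored in Python as a rough sketch. It assumes Python 3 on CPython: py_dir and _MISSING are hypothetical names, sys._getframe is a CPython-specific way to reach the caller's locals, and asking the type for __dir__ stands in for the C helper _dir_object, whose details differ in 2.7.

```python
import sys

_MISSING = object()  # sentinel: distinguishes "no argument" from None

def py_dir(obj=_MISSING):
    if obj is _MISSING:
        # no object: introspect the caller's local names
        result = list(sys._getframe(1).f_locals.keys())
    else:
        # object: ask the object's type for its attribute names
        result = list(type(obj).__dir__(obj))
    # like PyObject_Dir, always return a sorted list
    result.sort()
    return result

def probe():
    alpha = 1
    beta = 2
    return py_dir()

print(probe())
print(py_dir([]) == dir([]))
```

The two branches and the final sort correspond one to one with the if/else and the PyList_Sort call in the C function.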

>>> def foo(a):
...     if a > x:
...         return a/1024
...     else:
...         return a
...
>>> type(foo)
<type 'function'>
>>> dir(foo)
['__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__doc__', '__format__', '__get__', '__getattribute__', '__globals__', '__hash__', '__init__', '__module__', '__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']

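>>> foo.__call__
<method-wrapper '__call__' of function object at 0x...>
>>> foo.__str__
<method-wrapper '__str__' of function object at 0x...>
>>> foo
<function foo at 0x...>
>>> foo.func_closure
>>> type(foo.func_closure)
<type 'NoneType'>
>>> type(foo.func_code)
<type 'code'>
>>> foo.func_code
<code object foo at 0x..., file "<stdin>", line 1>
>>> dir(foo.func_code)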
['__class__', '__cmp__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'co_argcount', 'co_cellvars', 'co_code', 'co_consts', 'co_filename', 'co_firstlineno', 'co_flags', 'co_freevars', 'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 'co_stacksize', 'co_varnames']

>>> foo.func_code.co_argcount
1
>>> foo.func_code.co_cellvars
()
>>> foo.func_code.co_code
'|\x00\x00t\x00\x00k\x04\x00r\x14\x00|\x00\x00d\x01\x00\x15S|\x00\x00Sd\x00\x00S'
>>> foo.func_code.co_consts
(None, 1024)
>>> foo.func_code.co_filename
'<stdin>'
>>> foo.func_code.co_firstlineno
1
>>> foo.func_code.co_flags
67
>>> foo.func_code.co_freevars
()
>>> foo.func_code.co_lnotab
'\x00\x01\x0c\x01\x08\x02'
>>> foo.func_code.co_name
'foo'
>>> foo.func_code.co_names
('x',)
>>> foo.func_code.co_nlocals
1
>>> foo.func_code.co_stacksize
2
>>> foo.func_code.co_varnames
('a',)

Here, foo.func_code.co_code is the function's Python bytecode.
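On Python 3 the func_code attribute is spelled __code__, and co_code is a bytes object. The sketch below assumes Python 3 and reads the same fields as the session above; the summary dictionary is just a convenience for the example, and the raw byte values will differ from the Python 2.7 ones.

```python
def foo(a):
    if a > x:  # x is resolved as a global when foo is called
        return a / 1024
    else:
        return a

code = foo.__code__  # Python 3 spelling of foo.func_code

summary = {
    "co_name": code.co_name,
    "co_argcount": code.co_argcount,
    "co_varnames": code.co_varnames,  # local names, arguments first
    "co_names": code.co_names,        # global and attribute names
    "co_consts": code.co_consts,      # literal constants
}
for key, value in summary.items():
    print(key, "=", value)

raw = list(code.co_code)  # iterating bytes already yields integers
print(raw)
```

Defining foo does not require x to exist: the name is only recorded in co_names and looked up when the function runs.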

The bytes are easier to read as integers, which the built-in ord provides:

>>> help(ord)
Help on built-in function ord in module __builtin__:

ord(...)
    ord(c) -> integer

    Return the integer ordinal of a one-character string.

>>> [ord(i) for i in foo.func_code.co_code]

[124, 0, 0, 116, 0, 0, 107, 4, 0, 114, 20, 0, 124, 0, 0, 100, 1, 0, 21, 83, 124, 0, 0, 83, 100, 0, 0, 83]

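These are the bytes that make up the Python bytecode. The interpreter loops over them, looks up the instruction for each opcode byte, and executes it. Note that the bytecode itself contains no Python objects and no references to objects; its arguments are just integer indices into tables such as co_consts, co_names, and co_varnames.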

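To find out what the bytes mean, you could open the CPython source (the opcode numbers are defined in Include/opcode.h and executed by the loop in Python/ceval.c) and look up what 100 means, what 1 means, what 0 means, and so on. The standard-library dis module does this lookup for you.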
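The number-to-name table is also available from Python as dis.opname, so the lookup can be sketched without reading the C source. This assumes Python 3.6 or later, where every instruction occupies two bytes; the helper opcode_names is hypothetical. Opcode numbers and names vary between versions, and recent versions may include CACHE placeholder entries, so the output will not match the Python 2.7 bytes above.

```python
import dis

def foo(a):
    if a > x:  # x is resolved as a global when foo is called
        return a / 1024
    else:
        return a

def opcode_names(code):
    """Map each instruction's opcode byte to its name via dis.opname."""
    raw = code.co_code
    # Since Python 3.6 every instruction is two bytes: opcode, argument.
    return [dis.opname[raw[i]] for i in range(0, len(raw), 2)]

names = opcode_names(foo.__code__)
print(names)
```

In Python 2.7, by contrast, an instruction is one byte, or three bytes when it takes an argument, which is why the list above shows groups such as 100, 1, 0.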

>>> import dis
>>> dir(dis)
['EXTENDED_ARG', 'HAVE_ARGUMENT', '__all__', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '_have_code', '_test', 'cmp_op', 'dis', 'disassemble', 'disassemble_string', 'disco', 'distb', 'findlabels', 'findlinestarts', 'hascompare', 'hasconst', 'hasfree', 'hasjabs', 'hasjrel', 'haslocal', 'hasname', 'opmap', 'opname', 'sys', 'types']

>>> type(dis.dis)
<type 'function'>
>>> dir(dis.dis)
['__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__doc__', '__format__', '__get__', '__getattribute__', '__globals__', '__hash__', '__init__', '__module__', '__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']

>>> [ord(i) for i in dis.dis.func_code.co_code]

[124, 0, 0, 100, 1, 0, 107, 8, 0, 114, 23, 0, 116, 1, 0, 131, 0, 0, 1, 100, 1, 0, 83, 116, 2, 0, 124, 0, 0, 116, 3, 0, 106, 4, 0, 131, 2, 0, 114, 53, 0, 124, 0, 0, 106, 5, 0, 125, 0, 0, 110, 0, 0, 116, 6, 0, 124, 0, 0, 100, 2, 0, 131, 2, 0, 114, 80, 0, 124, 0, 0, 106, 7, 0, 125, 0, 0, 110, 0, 0, 116, 6, 0, 124, 0, 0, 100, 3, 0, 131, 2, 0, 114, 107, 0, 124, 0, 0, 106, 8, 0, 125, 0, 0, 110, 0, 0, 116, 6, 0, 124, 0, 0, 100, 4, 0, 131, 2, 0, 114, 246, 0, 124, 0, 0, 106, 9, 0, 106, 10, 0, 131, 0, 0, 125, 1, 0, 124, 1, 0, 106, 11, 0, 131, 0, 0, 1, 120, 174, 0, 124, 1, 0, 68, 93, 85, 0, 92, 2, 0, 125, 2, 0, 125, 3, 0, 116, 2, 0, 124, 3, 0, 116, 12, 0, 131, 2, 0, 114, 154, 0, 100, 5, 0, 124, 2, 0, 22, 71, 72, 121, 14, 0, 116, 13, 0, 124, 3, 0, 131, 1, 0, 1, 87, 110, 28, 0, 4, 116, 14, 0, 107, 10, 0, 114, 234, 0, 1, 125, 4, 0, 1, 100, 6, 0, 71, 124, 4, 0, 71, 72, 110, 1, 0, 88, 72, 113, 154, 0, 113, 154, 0, 87, 110, 78, 0, 116, 6, 0, 124, 0, 0, 100, 7, 0, 131, 2, 0, 114, 18, 1, 116, 15, 0, 124, 0, 0, 131, 1, 0, 1, 110, 50, 0, 116, 2, 0, 124, 0, 0, 116, 16, 0, 131, 2, 0, 114, 46, 1, 116, 17, 0, 124, 0, 0, 131, 1, 0, 1, 110, 22, 0, 116, 14, 0, 100, 8, 0, 116, 18, 0, 124, 0, 0, 131, 1, 0, 106, 19, 0, 22, 130, 2, 0, 100, 1, 0, 83]

>>> dir(dis.dis.func_code)

['__class__', '__cmp__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'co_argcount', 'co_cellvars', 'co_code', 'co_consts', 'co_filename', 'co_firstlineno', 'co_flags', 'co_freevars', 'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 'co_stacksize', 'co_varnames']

>>> dis.dis.func_code.co_filename
'/usr/lib/python2.7/dis.py'
>>> dis.dis.func_code.co_consts
('Disassemble classes, methods, functions, or code.\n\n With no argument, disassemble the last traceback.\n\n ', None, 'im_func', 'func_code', '__dict__', 'Disassembly of %s:', 'Sorry:', 'co_code', "don't know how to disassemble %s objects")
>>> dis.dis.func_code.co_names
('None', 'distb', 'isinstance', 'types', 'InstanceType', '__class__', 'hasattr', 'im_func', 'func_code', '__dict__', 'items', 'sort', '_have_code', 'dis', 'TypeError', 'disassemble', 'str', 'disassemble_string', 'type', '__name__')
>>> dis.dis.func_code.co_varnames
('x', 'items', 'name', 'x1', 'msg')
>>> dis.dis.func_code.co_stacksize
6
>>> dis.dis.func_code.co_nlocals
5

In fact, dis.dis itself is nothing more than a sequence of bytecode; the Python interpreter executes it to carry out its job.

Now let's use dis.dis to disassemble the bytecode:

>>> dis.dis(foo.func_code.co_code)
          0 LOAD_FAST                0 (0)
          3 LOAD_GLOBAL              0 (0)
          6 COMPARE_OP               4 (>)
          9 POP_JUMP_IF_FALSE       20
         12 LOAD_FAST                0 (0)
         15 LOAD_CONST               1 (1)
         18 BINARY_DIVIDE
         19 RETURN_VALUE
    >>   20 LOAD_FAST                0 (0)
         23 RETURN_VALUE
         24 LOAD_CONST               0 (0)
         27 RETURN_VALUE
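Because only the raw byte string was passed in, the hints in parentheses above are just the argument indices repeated: the disassembler has no co_names, co_varnames, or co_consts tables to consult. Disassembling the function object resolves them to real names. A sketch assuming Python 3.4 or later, where dis.get_instructions is available; the instruction names and offsets will differ from the Python 2.7 listing.

```python
import dis

def foo(a):
    if a > x:  # x is resolved as a global when foo is called
        return a / 1024
    else:
        return a

# Each Instruction carries the opcode name, the raw argument, and a
# human-readable rendering of that argument (argrepr).
instructions = list(dis.get_instructions(foo))
for ins in instructions:
    print(ins.offset, ins.opname, ins.arg, ins.argrepr)
```

Calling dis.dis(foo) prints a similar listing directly; get_instructions is used here only because it returns the data instead of printing it.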
