(Python编程)扩展模块的细节

Programming Python, 3rd Edition 翻译
最新版本见: http://wiki.woodpecker.org.cn/moin/PP3eD


22.5. Extension Module Details

22.5. 扩展模块的细节

Now that I've shown you the somewhat longer story, let's fill in the rest. The next few sections go into more detail on compilation and linking, code structure, data conversions, error handling, and reference counts. These are core ideas in Python C extensions, some of which we will later learn you can often largely forget.

既然我已经给你看了这个有点长的故事,让我们填满剩下的内容。下面几节将详细讲述编译和链接,代码结构,数据转换,错误处理,及引用计数。这些是Python C扩展的核心概念,但我们以后会知道,其中一些你是可以不管的。

22.5.1. Compilation and Linking

22.5.1. 编译和链接

You always must compile C extension files such as the hello.c example and somehow link them with the Python interpreter to make them accessible to Python scripts, but there is wide variability on how you might go about doing so. For example, a rule of the following form could be used to compile this C file on Linux too:

你总是必须将C语言扩展的源文件,例如hello.c编译,并与Python解释器链接,这样才能用于Python脚本,但是编译和链接的方法是多种多样的。例如,在Linux上,可以用下列的规则来编译这个C文件。

hello.so: hello.c
    gcc hello.c -c -g -fpic -I$(PYINC) -o hello.o
    gcc -shared hello.o -o hello.so
    rm -f hello.o


To compile the C file into a shareable object file on Solaris, you might instead say something like this:

在Solaris上,你会这样编译:

hello.so: hello.c
    cc hello.c -c -KPIC -o hello.o
    ld -G hello.o -o hello.so
    rm hello.o


On other platforms, it's more different still. Because compiler options vary widely, you'll want to consult your C or C++ compiler's documentation or Python's extension manuals for platform- and compiler-specific details. The point is to determine how to compile a C source file into your platform's notion of a shareable or dynamically loaded object file. Once you have, the rest is easy; Python supports dynamic loading of C extensions on all major platforms today.

在其它平台,仍是有许多不同。因为编译选项变化很大,你应该翻翻C或C++编译器的帮助文档,或Python扩展的手册,查看一下平台和编译器相关的细节。关键是如何将C文件编译成为你平台上的共享的或动态装载的目标文件。一旦完成,剩下的就容易了;Python可以在所有主流的平台上动态装载C扩展。
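One way to find the values that a rule like the Linux one above needs is to ask the interpreter itself. The following is a minimal sketch, assuming a recent Python 3 rather than the Python 2.4 used in this book. It uses the standard sysconfig module to report the header directory (what the makefile calls PYINC) and the file suffix this platform expects for extension modules. The helper name build_hints is made up for illustration.

```python
import sysconfig

def build_hints():
    """Return the include directory and extension-module suffix for this Python."""
    include_dir = sysconfig.get_paths()["include"]      # directory holding Python.h
    # EXT_SUFFIX looks like ".cpython-312-x86_64-linux-gnu.so" on Linux;
    # very old versions exposed the same idea under the name "SO".
    suffix = sysconfig.get_config_var("EXT_SUFFIX") or sysconfig.get_config_var("SO")
    return include_dir, suffix

include_dir, suffix = build_hints()
print("compile with -I" + include_dir)
print("name the result hello" + str(suffix))
```

The compiler and linker flags themselves still come from your compiler's documentation; this only removes the guesswork about paths and file names.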

Because build details vary so widely from machine to machine (and even compiler to compiler), the build scripts in this book will take some liberties with platform details. In general, most are shown under the Cygwin Unix-like environment on Windows, partly because it is a simpler alternative to a full Linux install and partly because this writer's background is primarily in Unix development. Be sure to translate for your own context. If you use standard Windows build tools, see also the directories PC and PCbuild in Python's current source distribution for pointers.

因为构建的细节在不同机器上(甚至在不同编译器上)变化很大,本书的构建脚本对于平台很随意。一般是用Windows上的类Unix环境Cygwin,因为相对于一个完整的Linux安装,它比较简单,也因为作者主要在Unix开发环境下工作。你应该把它翻译到自己的环境。如果你使用标准的Windows构建工具,也可以参考Python源代码的PC和PCbuild目录。

22.5.1.1. Dynamic binding

22.5.1.1. 动态绑定

Technically, what I've been showing you so far is called dynamic binding, and it represents one of two ways to link compiled C extensions with the Python interpreter. Since the alternative, static binding, is more complex, dynamic binding is almost always the way to go. To bind dynamically, simply follow these steps:

从技术上说,目前为止我展示给你看的东西称为动态绑定,它是链接C扩展模块与Python的两种方法之一。因为另一方法,静态绑定比较复杂,所以几乎总是采用动态绑定的方法。动态绑定简单地按以下步骤执行:

1. Compile hello.c into a shareable object file for your system (e.g., .dll, .so).

1. 将hello.c编译为你系统上的共享目标文件(如:.dll, .so)。

2. Put the object file in a directory on Python's module search path.

2. 将目标文件放在Python的模块搜索路径下。

That is, once you've compiled the source code file into a shareable object file, simply copy or move the object file to a directory listed in sys.path (which includes PYTHONPATH and .pth path file settings). It will be automatically loaded and linked by the Python interpreter at runtime when the module is first imported anywhere in the Python process, including imports from the interactive prompt, a standalone or embedded Python program, or a C API call.

就是说,一旦你编译出一个共享目标文件,只需复制或移到sys.path所列的一个目录下(其中包括PYTHONPATH和.pth文件指定目录)。它会在初次导入时,自动由Python解释器装入并链接,如从交互命令行导入,通过独立的或内嵌的Python程序导入,或者通过C API调用导入。
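The search described above can be imitated in a few lines. This is only a sketch of the idea, not the interpreter's actual import code: it walks sys.path and looks for a file whose name is the module name plus one of the extension suffixes this platform accepts. It assumes Python 3, where importlib.machinery.EXTENSION_SUFFIXES lists those suffixes; find_extension is a hypothetical helper name.

```python
import os
import sys
from importlib.machinery import EXTENSION_SUFFIXES

def find_extension(name, search_path=None):
    """Return the first compiled extension file for `name` on the search path, or None."""
    for directory in (sys.path if search_path is None else search_path):
        for suffix in EXTENSION_SUFFIXES:        # platform dependent, e.g. ".so" or ".pyd"
            candidate = os.path.join(directory or ".", name + suffix)
            if os.path.isfile(candidate):
                return candidate
    return None

print(find_extension("hello"))   # None unless a compiled hello module is on sys.path
```

If this prints None after a build, the object file is probably sitting in a directory that sys.path does not list.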

Notice that the only non-static name in the hello.c example C file is the initialization function. Python calls this function by name after loading the object file, so its name must be a C global and should generally be of the form initX, where X is both the name of the module in Python import statements and the name passed to Py_InitModule. All other names in C extension files are arbitrary because they are accessed by C pointer, not by name (more on this later). The name of the C source file is arbitrary too: at import time, Python cares only about the compiled object file.

注意,hello.c示例文件中唯一的非静态的名字是初始化函数。Python在装入目标文件后,通过这个名字调用这个函数,所以这个名字必须是全局的,并且形如initX,其中X是Python导入语句中的模块名,也是传递给Py_InitModule函数的名字。C扩展文件中其它所有名字都是任意的,因为它们是通过指针调用的,而不是名字(下详)。C源文件的名字也是任意的,导入时,Python只关心编译后的目标文件。

22.5.1.2. Static binding

22.5.1.2. 静态绑定

Although dynamic binding is preferred in most applications, static binding allows extensions to be added to the Python interpreter in a more permanent fashion. This is more complex, though, because you must rebuild Python itself, and hence you need access to the Python source distribution (an interpreter executable won't do). Moreover, static linking of extensions is prone to change over time, so you should consult the README file at the top of Python's source distribution tree for current details.[*]

动态绑定是大多数应用的优先选择,而静态绑定让扩展模块以更持久的方式加入到Python解释器中。静态绑定较复杂,因为你必须重新编译Python本身,因此需要Python源代码(不是解释器执行程序)。而且,扩展模块的静态链接以后可能会有更改,你应该查询Python源代码顶层的README文件,了解相关细节。[*]

[*] In fact, starting with Python 2.1, the setup.py script at the top of the source distribution attempts to detect which modules can be built, and it automatically compiles them using the distutils system described in the next section. The setup.py script is run by Python's make system after building a minimal interpreter. This process doesn't always work, though, and you can still customize the configuration by editing the Modules/Setup file. As a more recent alternative, see also the example lines in Python's setup.py for xxmodule.c.

[*] 事实上,从Python 2.1开始,源代码发布中的顶层的setup.py脚本会检测哪些模块可以构建,并使用distutils系统自动编译,distutils将在下节描述。Python的make系统先构建一个最小化的解释器,然后调用setup.py脚本。然而这个过程并不总是可行,但你仍可以编辑Modules/Setup文件来配置。作为一个新的方法,请参阅setup.py中关于xxmodule.c的示例行。

In short, though, one way to statically link the extension of Example 22-1 is to add a line such as the following:

hello ~/PP3E/Integrate/Extend/Hello/hello.c


to the Modules/Setup configuration file in the Python source code tree (change the ~ if this isn't in your home directory). Alternatively, you can copy your C file to the Modules directory (or add a link to it there with an ln command) and add a line to Setup, such as hello hello.c.

简单地说,例22-1扩展模块进行静态链接的方法之一是,将下面一行添加到Python源代码树的Modules/Setup配置文件(如果路径不同请按实际情况更改):

hello ~/PP3E/Integrate/Extend/Hello/hello.c

或者,将C文件复制到Modules目录下(或用ln命令添加链接),并且在Setup中添加一行:hello hello.c。

Then, rebuild Python itself by running a make command at the top level of the Python source tree. Python reconstructs its own makefiles to include the module you added to Setup, such that your code becomes part of the interpreter and its libraries. In fact, there's really no distinction between C extensions written by Python users and services that are a standard part of the language; Python is built with this same interface. The full format of module declaration lines looks like this:

然后,在Python源代码树顶层运行make命令,重建Python。Python会重建自己的make文件,使之包含添加到Setup中的模块,这样你的代码就成为解释器和它的库的一部份。事实上,Python用户写的C扩展模块和Python语言的标准服务模块之间并没有实质上的区别;它们的接口是一致的。模块声明行的完整形式如下:

<module> ... [<sourceOrObjectFile> ...] [<cpparg> ...] [<library> ...]


Under this scheme, the name of the module's initialization function must match the name used in the Setup file, or you'll get linking errors when you rebuild Python. The name of the source or object file doesn't have to match the module name; the leftmost name is the resulting Python module's name. This process and syntax are prone to change over time, so again, be sure to consult the README file at the top of Python's source tree.

在这种方案下,模块的初始化函数的名字必须与Setup文件中的名字相一致,不然重建Python会产生链接错误。源文件或目标文件的名字不必匹配模块名;最左边的名字是最终的Python模块名。这个流程和语法以后可能会更改,所以,还是要查看Python源代码树顶层的README文件。
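To make the shape of a declaration line concrete, here is a rough sketch that splits one into its parts. It is a simplification built on assumptions, not Python's real Setup processing: it treats the first word as the module name, words starting with -l or -L as library options, other words starting with a dash as preprocessor arguments, and everything else as source or object files. The function name parse_setup_line is invented for this illustration.

```python
def parse_setup_line(line):
    """Split one Modules/Setup-style line into its parts (simplified)."""
    tokens = line.split("#", 1)[0].split()       # drop any trailing comment
    if not tokens:
        return None
    entry = {"module": tokens[0], "files": [], "cppargs": [], "libraries": []}
    for token in tokens[1:]:
        if token.startswith(("-l", "-L")):
            entry["libraries"].append(token)     # linker options
        elif token.startswith("-"):
            entry["cppargs"].append(token)       # e.g. -I or -D preprocessor options
        else:
            entry["files"].append(token)         # source or object files
    return entry

print(parse_setup_line("hello ~/PP3E/Integrate/Extend/Hello/hello.c"))
```

The leftmost word is what ends up as the importable module name, which is why the file name on the right is free to differ.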

22.5.1.3. Static versus dynamic binding

22.5.1.3. 静态和动态绑定对比

Static binding works on any platform and requires no extra makefile to compile extensions. It can be useful if you don't want to ship extensions as separate files, or if you're on a platform without dynamic linking support. Its downsides are that you need to update Python configuration files and rebuild the Python interpreter itself, so you must therefore have the full source distribution of Python to use static linking at all. Moreover, all statically linked extensions are always added to your interpreter, regardless of whether they are used by a particular program. This can needlessly increase the memory needed to run all Python programs.

静态绑定可在任何平台上工作,并不需要额外的make文件。如果你不想分发独立的扩展模块文件,或者你的平台不支持动态链接,就用静态绑定。它的缺点是,你需要更改Python的配置文件,并重建Python解释器本身,因此为了使用静态绑定,你必须拥有完整的Python源代码。另外,Python解释器总是加载所有的静态链接扩展库,不管它们是否被用到,这样会增加不必要的内存开销。

With dynamic binding, you still need Python include files, but you can add C extensions even if all you have is a binary Python interpreter executable. Because extensions are separate object files, there is no need to rebuild Python itself or to access the full source distribution. And because object files are only loaded on demand in this mode, it generally makes for smaller executables too: Python loads into memory only the extensions actually imported by each program run. In other words, if you can use dynamic linking on your platform, you probably should.

动态绑定虽然需要Python include文件,但是即使你只有Python解释器二进制执行程序,你也可以添加C扩展模块。扩展模块是独立的目标文件,所以不需要重建Python本身,也不需要发行的源码。并且在这种模式下,因为目标仅在需要时装载,它一般生成更小的执行程序。Python装入内存的仅仅是各运行的程序真正导入的扩展模块。换句话说,如果你的平台可以使用动态链接,你大概就应该使用动态绑定。
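You can see both kinds of binding in any running interpreter. The sketch below, which assumes Python 3, classifies a module by checking sys.builtin_module_names (modules linked statically into the interpreter) and, failing that, by looking at the suffix of the file it was loaded from. The function name binding_kind and its three labels are made up here.

```python
import importlib
import sys
from importlib.machinery import EXTENSION_SUFFIXES

def binding_kind(module_name):
    """Classify a module as statically linked, dynamically loaded, or pure Python."""
    if module_name in sys.builtin_module_names:
        return "static"                       # compiled into the interpreter itself
    module = importlib.import_module(module_name)
    filename = getattr(module, "__file__", None) or ""
    if filename.endswith(tuple(EXTENSION_SUFFIXES)):
        return "dynamic"                      # separate object file, loaded on import
    return "python"                           # ordinary Python source or a package

for name in ("sys", "math", "json"):
    print(name, binding_kind(name))
```

Which standard modules come out as static or dynamic depends on how your particular interpreter was built.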

22.5.2. Compiling with the Distutils System

22.5.2. 用Distutils系统编译

As an alternative to makefiles, it's possible to specify compilation of C extensions by writing Python scripts that use tools in the Distutils package, a standard part of Python that is used to build, install, and distribute Python extensions coded in Python or C. Its larger goal is automated building of distributed packages on target machines.

除了用make文件编译,还有另外一种方法,可以写Python脚本来指定编译C扩展模块。这要用到Distutils包中的工具。Distutils是Python的标准部件,用来构建,安装和发布Python或C编码的Python扩展,它的主要目的是自动构建目标机器的发行包。

We won't go into Distutils exhaustively in this text; see Python's standard distribution and installation manuals for more details. Among other things, Distutils is the de facto way to distribute larger Python packages these days. Its tools know how to install a system in the right place on target machines (usually, in Python's standard site-packages) and handle many platform-specific details that are tedious and error prone to accommodate manually.

我们不会详尽研究Distutils,详情请阅Python标准的发行与安装手册。除了别的用处以外,Distutils是目前发布大型Python包的事实上的方式。它的工具知道如何在目标机器的正确位置安装一个系统(通常是在Python的标准site-packages目录), 并且处理许多平台相关的细节,而手工安装则是麻烦的和易错的。

For our purposes here, though, because Distutils also has built-in support for running common compilers on a variety of platforms (including Cygwin), it provides an alternative to makefiles for situations where the complexity of makefiles is either prohibitive or unwarranted. For example, to compile the C code in Example 22-1, we can code the makefile of Example 22-2, or we can code and run the Python script in Example 22-4.

而就我们的目的来说,因为Distutils也内建支持在不同的平台(包括Cygwin)上运行通用的编译器,如果make文件的复杂性让你望而却步,或者认为它不保险,Distutils提供了另外一种编译方法。例如编译例22-1的C代码,我们可以编写例22-2的make文件,或者编写并运行例22-4的Python脚本。

Example 22-4. PP3E/Integrate/Extend/Hello/disthello.py
# to build: python disthello.py build
# resulting dll shows up in build subdir

from distutils.core import setup, Extension
setup(ext_modules=[Extension('hello', ['hello.c'])])


Example 22-4 is a Python script run by Python; it is not a makefile. Moreover, there is nothing in it about a particular compiler or compiler options. Instead, the Distutils tools it employs automatically detect and run an appropriate compiler for the platform, using compiler options that are appropriate for building dynamically linked Python extensions on that platform. For the Cygwin test machine, gcc is used to generate a .dll dynamic library ready to be imported into a Python script, exactly like the result of the makefile in Example 22-2, but considerably simpler:

例22-4是Python脚本而不是make文件。另外,它没有指定特别的编译器或编译选项。相反,它调用的Distutils工具自动检测并运行适合平台的编译器,并使用适当的编译选项,来构建此平台的动态链接Python扩展库。对于测试机器的Cygwin平台,它使用gcc生成一个可以导入Python脚本的.dll动态库,与例22-2的make文件的结果完全相同,只是更为简单。

.../PP3E/Integrate/Extend/Hello$ python disthello.py build
running build
running build_ext
building 'hello' extension
creating build
creating build/temp.cygwin-1.5.19-i686-2.4
gcc -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes
-I/usr/include/python2.4 -c hello.c -o build/temp.cygwin-1.5.19-i686-2.4/hello.o
hello.c:31: warning: function declaration isn't a prototype
creating build/lib.cygwin-1.5.19-i686-2.4
gcc -shared -Wl,--enable-auto-image-base build/temp.cygwin-1.5.19-i686-2.4/hello.o
-L/usr/lib/python2.4/config -lpython2.4
-o build/lib.cygwin-1.5.19-i686-2.4/hello.dll


The resulting binary library file shows up in the generated build subdirectory, but it's used in Python code just as before:

生成的二进制库文件在生成的build子目录中,但是可以和以前的例子一样地使用:

.../PP3E/Integrate/Extend/Hello$ cd build/lib.cygwin-1.5.19-i686-2.4/

.../PP3E/Integrate/Extend/Hello/build/lib.cygwin-1.5.19-i686-2.4$ ls
hello.dll

.../PP3E/Integrate/Extend/Hello/build/lib.cygwin-1.5.19-i686-2.4$ python
>>> import hello
>>> hello.__file__
'hello.dll'
>>> hello.message('distutils')
'Hello, distutils'


Distutils scripts can become much more complex in order to specify build options; for example, here is a slightly more verbose version of ours:

Distutils脚本可以更加复杂,来指定构建的选项;例如,这是一个更详细的版本:

from distutils.core import setup, Extension
setup(name='hello',
     version='1.0',
     ext_modules=[Extension('hello', ['hello.c'])])


Unfortunately, further details about both Distutils and makefiles are beyond the scope of this chapter and book. Especially if you're not used to makefiles, see the Python manuals for more details on Distutils. Makefiles are a traditional way to build code on some platforms and we will employ them in this book, but Distutils can sometimes be simpler in cases where they apply.

有关Distutils和make文件的更多细节超出了本书的范围。如果你特别不习惯make文件,可查阅Python手册中关于Distutils的更多细节。make文件是一些平台上传统的构建方法,我们在本书中使用make文件,但是在适用的情况下,用Distutils可以更简单。

22.5.3. Anatomy of a C Extension Module

22.5.3. C扩展模块剖析

Though simple, the hello.c code of Example 22-1 illustrates the structure common to all C modules. Most of it is glue code, whose only purpose is to wrap the C string processing logic for use in Python scripts. In fact, although this structure can vary somewhat, this file consists of fairly typical boilerplate code:

尽管简单,例22-1的hello.c代码阐明了所有C语言模块的共同结构。大部份是粘合代码,仅用于包裹C语言的字符串处理,使之用于Python脚本。实际上,虽然这个结构可以变化,但是这个文件包含了相当典型的样板文件代码。

Python header files

Python头文件

The C file first includes the standard Python.h header file (from the installed Python Include directory). This file defines almost every name exported by the Python API to C, and it serves as a starting point for exploring the API itself.

C文件首先包含标准的Python.h头文件(在Python安装的Include目录)。这个头文件定义了几乎每一个Python API输出到C语言的名字,而且它也是探索API本身的起点。

Method functions

方法函数

The file then defines a function to be called from the Python interpreter in response to calls in Python programs. C functions receive two Python objects as input, and send either a Python object back to the interpreter as the result or a NULL to trigger an exception in the script (more on this later). In C, a PyObject* represents a generic Python object pointer; you can use more specific type names, but you don't always have to. C module functions can be declared C static (local to the file) because Python calls them by pointer, not by name.

然后是定义一个可由Python解释器调用的函数,可以响应Python程序的调用。C函数接受两个Python对象作为输入,输出一个Python对象返回给解释器作为结果,或者返回NULL在脚本中触发一个异常(下详)。在C语言中,PyObject*代表一个通用的Python对象指针;你可以使用更明确的类型名,但并不总是必须这样做。C模块函数可以声明为C的static(仅在文件内可见),因为Python是通过指针而不是名字来调用它们的。

Registration table

注册表

Near the end, the file provides an initialized table (array) that maps function names to function pointers (addresses). Names in this table become module attribute names that Python code uses to call the C functions. Pointers in this table are used by the interpreter to dispatch C function calls. In effect, the table "registers" attributes of the module. A NULL entry terminates the table.

快到最后时有一个初始化表(数组),将函数名映射为函数指针。表中的名字会成为模块的属性名,Python代码用这个名字来调用C函数。表中的函数指针被解释器用来分派C函数的调用。实际上,这个表“注册”了模块的属性。该表由一个NULL表项终止。

Initialization function

初始化函数

Finally, the C file provides an initialization function, which Python calls the first time this module is imported into a Python program. This function calls the API function Py_InitModule to build up the new module's attribute dictionary from the entries in the registration table and create an entry for the C module on the sys.modules table (described in Chapter 3). Once so initialized, calls from Python are routed directly to the C function through the registration table's function pointers.

最后,这个C文件提供了一个初始化函数,这是Python初次导入该模块时调用的函数。这个函数调用API函数Py_InitModule读取注册表来建立新模块的属性字典,并在sys.modules表中为该C模块新建了一个表项(第3章所述)。一旦初始化,Python调用就通过注册表的函数指针直接传递到C函数。
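The four parts just listed have a close analogue in pure Python, which may help fix the roles in mind. The sketch below is only an analogy and says nothing about how Py_InitModule is implemented: a dictionary plays the registration table, and a function plays the initialization function by building a module object and recording it in sys.modules. The module name hello_sketch and the other identifiers are invented for this illustration.

```python
import sys
import types

def message(from_python):
    """Method function: wraps the string logic, as message does in hello.c."""
    return "Hello, " + from_python

# Registration table: attribute name -> function object
HELLO_METHODS = {"message": message}

def init_hello_sketch():
    """Initialization function: build the module and record it in sys.modules."""
    module = types.ModuleType("hello_sketch")     # hypothetical module name
    module.__dict__.update(HELLO_METHODS)         # fill the attribute dictionary
    sys.modules["hello_sketch"] = module
    return module

init_hello_sketch()
import hello_sketch                               # found in sys.modules, no file needed
print(hello_sketch.message("distutils"))          # Hello, distutils
```

Once the entry exists in sys.modules, later imports find it there, which mirrors why the C initialization function runs only on the first import.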

22.5.4. Data Conversions

22.5.4. 数据转换

C module functions are responsible for converting Python objects to and from C datatypes. In Example 22-1, message gets two Python input objects passed from the Python interpreter: args is a Python tuple holding the arguments passed from the Python caller (the values listed in parentheses in a Python program), and self is ignored. It is useful only for extension types (discussed later in this chapter).

C模块函数要负责Python对象和C数据类型之间的转换。在例22-1中,message从Python解释器得到两个Python输入对象:args是一个Python元组(Python程序中用括号括起来的几个值),持有Python调用者传递的参数,而self参数没用到。self参数仅在扩展类型中有用(本章后面讨论)。

After finishing its business, the C function can return any of the following to the Python interpreter: a Python object (known in C as PyObject*), for an actual result; a Python None (known in C as Py_None), if the function returns no real result; or a C NULL pointer, to flag an error and raise a Python exception.

事情做完后,C函数可以向Python解释器返回下列值之一:如果有实际结果,则返回一个Python对象(即C中的PyObject*);如果没有实际结果,则返回一个Python None(C中的Py_None);或者返回一个C NULL指针,指示有错误来触发一个Python异常。

There are distinct API tools for handling input conversions (Python to C) and output conversions (C to Python). It's up to C functions to implement their call signatures (argument lists and types) by using these tools properly.

有特殊的API工具来处理输入转换(Python到C)和输出转换(C到Python)。C函数负责恰当地使用这些工具,来实现调用签名(即参数列表和类型)。

22.5.4.1. Python to C: using Python argument lists

22.5.4.1. Python到C: 使用Python参数列表

When the C function is run, the arguments passed from a Python script are available in the args Python tuple object. The API function PyArg_Parse (and its cousin, PyArg_ParseTuple, which assumes it is converting a tuple object) is probably the easiest way to extract and convert passed arguments to C form.

当C函数执行时,Python脚本传入的参数在args这个Python元组对象中。API函数PyArg_Parse及其同胞PyArg_ParseTuple可能是提取输入参数并将其转为C形式最简单的方法,其中PyArg_ParseTuple假定它在转换的是一个元组。

PyArg_Parse takes a Python object, a format string, and a variable-length list of C target addresses. It converts the items in the tuple to C datatype values according to the format string, and it stores the results in the C variables whose addresses are passed in. The effect is much like C's scanf string function. For example, the hello module converts a passed-in Python string argument to a C char* using the s convert code:

PyArg_Parse输入一个Python对象,一个格式化串,和一个变长的C目标地址列表。它将元组中的元素根据格式化串转换为C数据类型,结果保存在传入的地址所指的变量中。效果就象C语言的scanf函数。例如,hello模块将传入的Python串对象转为C语言的char*:

PyArg_Parse(args, "(s)", &fromPython)      /* or PyArg_ParseTuple(args, "s", ... */


To handle multiple arguments, simply string format codes together and include corresponding C targets for each code in the string. For instance, to convert an argument list holding a string, an integer, and another string to C, say this:

对于多个参数,只需在格式化串中添加代号,并给出相应的C目标变量。例如,将一个串,一个整数,和另一个串转换到C,这样写:

PyArg_Parse(args, "(sis)", &s1, &i, &s2)   /* or PyArg_ParseTuple(args, "sis", ... */


To verify that no arguments were passed, use an empty format string like this:

为了核实无参数传入,就像这样用一个空的格式化串:

PyArg_Parse(args, "()")


This API call checks that the number and types of the arguments passed from Python match the format string in the call. If there is a mismatch, it sets an exception and returns zero to C (more on errors shortly).

这个API调用检查从Python传入的参数个数与类型,与格式化串相匹配。如果不匹配,它会设置一个异常并返回0到C语言(错误处理见下)。
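The matching rule can be pictured with a small Python stand-in. This is a deliberately simplified simulation, not the real PyArg_ParseTuple: it knows only the s, i, and d codes, allows no nesting, and is stricter about types than the C API. The names CONVERTERS and parse_tuple are hypothetical. Where the C call sets an exception and returns zero, this version simply raises.

```python
# Simplified stand-in for PyArg_ParseTuple: one code per argument, no nesting.
CONVERTERS = {"s": str, "i": int, "d": float}    # format code -> required Python type

def parse_tuple(args, fmt):
    """Return args unchanged if they match fmt, else raise TypeError."""
    if len(args) != len(fmt):
        raise TypeError("expected %d arguments, got %d" % (len(fmt), len(args)))
    for position, (value, code) in enumerate(zip(args, fmt), start=1):
        expected = CONVERTERS[code]
        if not isinstance(value, expected):
            raise TypeError("argument %d must be %s" % (position, expected.__name__))
    return args

s1, i, s2 = parse_tuple(("spam", 42, "eggs"), "sis")   # three codes, three targets
parse_tuple((), "")                                     # verifies no arguments were passed
```

Passing ("spam",) with the format "si" raises TypeError here, which corresponds to the mismatch case described above.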

22.5.4.2. Python to C: using Python return values

22.5.4.2. Python到C: 使用Python返回值

As we'll see in Chapter 23, API functions may also return Python objects to C as results when Python is being run as an embedded language. Converting Python return values in this mode is almost the same as converting Python arguments passed to C extension functions, except that Python return values are not always tuples. To convert returned Python objects to C form, simply use PyArg_Parse. Unlike PyArg_ParseTuple, this call takes the same kinds of arguments but doesn't expect the Python object to be a tuple.

我们会在23章看到,当Python作为内嵌语言运行时,API函数也可以返回结果Python对象到C语言。在这种模式下转换Python返回值和转换传递给C扩展函数的Python参数几乎一样,只是Python返回值并不总是元组。为了转换返回的Python对象到C形式,只需调用PyArg_Parse。这个调用虽然输入参数相同,但它不像PyArg_ParseTuple,它并不要求输入Python对象为一个元组。

22.5.4.3. C to Python: returning values to Python

22.5.4.3. C到Python: 返回值到Python

There are two ways to convert C data to Python objects: by using type-specific API functions or via the general object-builder function, Py_BuildValue. The latter is more general and is essentially the inverse of PyArg_Parse, in that Py_BuildValue converts C data to Python objects according to a format string. For instance, to make a Python string object from a C char*, the hello module uses an s convert code:

有两个方法将C语言数据转为Python对象:类型相关的API函数,或者通用的对象构建函数Py_BuildValue。后者更通用,它实质上是PyArg_Parse的逆操作,因为Py_BuildValue根据格式化串将C数据转化为Python对象。例如,hello模块使用下列代码,从一个C语言的char*获得了一个Python字符串对象:

return Py_BuildValue("s", result);           /* "result" is a C char[] or char* */


More specific object constructors can be used instead:

可以使用更专用的对象构建器,效果是一样的:

return PyString_FromString(result);          /* same effect */


Both calls make a Python string object from a C character array pointer. See the now-standard Python extension and runtime API manuals for an exhaustive list of such calls available. Besides being easier to remember, though, Py_BuildValue has syntax that allows you to build lists in a single step, described next.

两种调用方法都从一个C字符数组指针获得了一个Python字符串对象。请参阅标准的Python扩展和运行时API手册,查看这类函数的详尽列表。然而,Py_BuildValue的语法更容易记忆,它让你一步构建一串数据,如下所述。
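Type-specific constructors can be tried out without compiling anything, because the standard ctypes module exposes the running interpreter's C API as ctypes.pythonapi. The sketch below assumes CPython 3, where the closest equivalent of the Python 2 call PyString_FromString is PyUnicode_FromString; PyLong_FromLong is shown as a second constructor. Py_BuildValue is left out because it takes a variable argument list, which is awkward to call this way.

```python
import ctypes

api = ctypes.pythonapi                      # the running interpreter's own C API

# PyUnicode_FromString: C char* in, new Python str object out
api.PyUnicode_FromString.argtypes = [ctypes.c_char_p]
api.PyUnicode_FromString.restype = ctypes.py_object

# PyLong_FromLong: C long in, new Python int object out
api.PyLong_FromLong.argtypes = [ctypes.c_long]
api.PyLong_FromLong.restype = ctypes.py_object

result = api.PyUnicode_FromString(b"Hello, world")   # bytes stand in for a C char*
number = api.PyLong_FromLong(42)
print(repr(result), repr(number))                     # 'Hello, world' 42
```

Declaring restype as py_object tells ctypes that the call hands back an object reference, so the results behave like any other Python values.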

22.5.4.4. Common conversion codes

22.5.4.4. 通用的转换代码

With a few exceptions, PyArg_Parse(Tuple) and Py_BuildValue use the same conversion codes in format strings. A list of all supported conversion codes appears in Python's extension manuals. The most commonly used are shown in Table 22-1; the tuple, list, and dictionary formats can be nested.

PyArg_Parse(Tuple)和Py_BuildValue在格式化串中使用相同的转化代码,仅有少许不同。完整的转化代码可查阅Python扩展手册。表22-1列出了最常用的代码;元组,链表,和字典的格式化可以嵌套。

Table 22-1. Common Python/C data conversion codes

Table 22-1. 通用Python/C数据转换格式化代码

XXX Todo: Insert the table


These codes are mostly what you'd expect (e.g., i maps between a C int and a Python integer object), but here are a few usage notes on this table's entries:

这些代码大部份和你想的一样(如i代表C语言的int和Python整数对象),但是有各别地方需要注意:

Pass in the address of a char* for s codes when converting to C, not the address of a char array: Python copies out the address of an existing C string (and you must copy it to save it indefinitely on the C side: use strdup or similar).

代码s转换成C数据时,需要传入一个char*的地址,而不是一个字符数组的地址:Python会传出一个已有的C字符串的地址(为了无限期保存该字符串,你必须在C代码中复制该字符串,如用strdup或类似方法)。

The O code is useful to pass raw Python objects between languages; once you have a raw object pointer, you can use lower-level API tools to access object attributes by name, index and slice sequences, and so on.

代码O用于在语言之间传递原始的Python对象;一旦你得到一个原始对象指针,你可以利用底层API访问对象属性,通过名字,或索引,分片序列或其它方式都可以。

The O& code lets you pass in C converter functions for custom conversions. This comes in handy for special processing to map an object to a C datatype not directly supported by conversion codes (for instance, when mapping to or from an entire C struct or C++ class instance). See the extensions manual for more details.

代码O&允许你传入C语言的转换函数来自定义转换。这可以便利地将一个对象映射为一个转换代码不直接支持的C数据类型(例如,转换一个完整的C结构或C++类实例)。详情请阅扩展手册。

The last two entries, [...] and {...}, are currently supported only by Py_BuildValue: you can construct lists and dictionaries with format strings, but you can't unpack them. Instead, the API includes type-specific routines for accessing sequence and mapping components given a raw object pointer.

最后两项,[...]和{...}现在只有Py_BuildValue支持:你可以用格式化串来构建链表和字典,但是你不能分拆它们。作为替代,API中有明确类型的函数来操作序列和映射,输入为一个原始对象指针。

PyArg_Parse supports some extra codes, which must not be nested in tuple formats ((...)):

PyArg_Parse还支持一些额外的代码,但是不能嵌套在元组格式中((...)):

|

The remaining arguments are optional (varargs, much like the Python language's * arguments). The C targets are unchanged if arguments are missing in the Python tuple. For instance, si|sd requires two arguments but allows up to four.

余下的参数为可选的(变参,很像Python语言的*参数)。如果Python元组中没提供参数,则C目标变量将保持不变。例如,si|sd要求两个参数但允许最多四个。

:

The function name follows, for use in error messages set by the call (argument mismatches). Normally Python sets the error message to a generic string.

紧跟函数名,用在错误信息设置中(当参数不匹配时)。正常情况下,Python会设置一个通用的错误信息。

;

A full error message follows, running to the end of the format string.

紧跟一个完整的错误信息,直到格式串尾。

This format code list isn't exhaustive, and the set of convert codes may expand over time; refer to Python's extension manual for further details.

这个格式代码表并不是完备的,并且这个转换代码集会随时扩充;更多详情请阅Python扩展手册。
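The three extra codes divide a format string into sections, which the following sketch makes explicit. It is an illustration only and assumes every conversion code is a single character, which is not true of the full code set (s# is one counterexample). The function name split_format is made up.

```python
def split_format(fmt):
    """Split a format string into (required, optional, function_name, error_text)."""
    codes, name, error_text = fmt, None, None
    if ";" in codes:
        codes, error_text = codes.split(";", 1)   # ";" starts a full error message
    elif ":" in codes:
        codes, name = codes.split(":", 1)         # ":" starts the function name
    required, _, optional = codes.partition("|")  # codes after "|" are optional
    return required, optional, name, error_text

required, optional, name, error_text = split_format("si|sd:message")
print(len(required), "to", len(required) + len(optional), "arguments for", name)
```

For the si|sd case from the text, this reports a minimum of two and a maximum of four arguments.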

22.5.5. Error Handling

22.5.5. 错误处理

When you write C extensions, you need to be aware that errors can occur on either side of the languages fence. The following sections address both possibilities.

当你编写C扩展模块,你要明白两种语言两边都有可能发生错误。下边几节探讨这两种可能性。

22.5.5.1. Raising Python exceptions in C

22.5.5.1. 在C语言中引发Python异常

C extension module functions return a C NULL value for the result object to flag an error. When control returns to Python, the NULL result triggers a normal Python exception in the Python code that called the C function. To name an exception, C code can also set the type and extra data of the exceptions it triggers. For instance, the PyErr_SetString API function sets the exception object to a Python object and sets the exception's extra data to a character string:

C扩展模块函数返回一个C NULL值来标志一个错误。当控制返回到Python,NULL会使调用C函数的Python代码触发一个标准的Python异常。C代码可以设置异常的类型和附加数据来指定它触发的这个异常。例如,PyErr_SetString API函数将异常对象设为一个Python对象,并设置异常的附加数据为一个字符串:

PyErr_SetString(ErrorObject, message)


We will use this in the next example to be more specific about exceptions raised when C detects an error. C modules may also set a built-in Python exception; for instance, returning NULL after saying this:

下一个例子中,当C代码检测到一个错误,我们会用这个函数来具体化引发的异常。C模块也可以设置一个内建的Python异常;例如,在返回NULL之前:

PyErr_SetString(PyExc_IndexError, "index out-of-bounds")


raises a standard Python IndexError exception with the message string data. When an error is raised inside a Python API function, both the exception object and its associated "extra data" are automatically set by Python; there is no need to set it again in the calling C function. For instance, when an argument-passing error is detected in the PyArg_Parse function, the hello module just returns NULL to propagate the exception to the enclosing Python layer, instead of setting its own message.

引发一个标准的带字符串信息的Python IndexError异常。如果错误是在Python API函数中引发的,异常对象和它的附加数据会自动由Python设置;而不必在调用的C函数中重复设置。例如,当PyArg_Parse函数检测到一个参数传递错误,hello模块只需直接返回NULL,就可将异常传播给Python外壳,而不是它自己设置信息。
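The effect of PyErr_SetString can be observed from Python through ctypes.pythonapi. This sketch assumes CPython 3 and relies on documented ctypes behavior for that handle: after each call it checks the interpreter's error indicator and raises whatever exception was set, which plays the part of a C function returning NULL. The wrapper name raise_from_c is invented.

```python
import ctypes

api = ctypes.pythonapi
api.PyErr_SetString.argtypes = [ctypes.py_object, ctypes.c_char_p]
api.PyErr_SetString.restype = None          # the C function returns void

def raise_from_c(exc_type, text):
    """Set an exception through the C API; ctypes then raises it on return."""
    api.PyErr_SetString(exc_type, text.encode("utf-8"))

caught = None
try:
    raise_from_c(IndexError, "index out-of-bounds")
except IndexError as exc:
    caught = str(exc)                       # the "extra data" set by the C call

print(caught)                               # index out-of-bounds
```

The Python caller cannot tell this apart from an IndexError raised by ordinary Python code.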

22.5.5.2. Detecting errors that occur in Python

22.5.5.2. 检测Python产生的错误

Python API functions may be called from C extension functions or from an enclosing C layer when Python is embedded. In either case, C callers simply check the return value to detect errors raised in Python API functions. For pointer result functions, Python returns NULL pointers on errors. For integer result functions, Python generally returns a status code of -1 to flag an error and a 0 or positive value on success. (PyArg_Parse is an exception to this rule: it returns 0 when it detects an error.) To make your programs robust, you should check return codes for error indicators after most Python API calls; some calls can fail for reasons you may not have expected (e.g., memory overflow).

Python API函数可以从C扩展函数中调用,或者当Python内嵌时,从外裹的C语言层调用。无论哪一种,C调用者只是简单地检查返回值来检测Python API函数引发的错误。对返回指针的函数来说,Python返回NULL指针表示错误。对返回整型的函数来说,Python通常返回一个状态码,-1表示错误,0或正数表示成功。(PyArg_Parse例外:它错误时返回0。)为了让你的程序更健壮,对于大多数Python API调用,你都应该检查返回值;有些调用可能会意外失败(如内存溢出)。

22.5.6. Reference Counts

22.5.6. 引用计数

The Python interpreter uses a reference-count scheme to implement garbage collection. Each Python object carries a count of the number of places it is referenced; when that count reaches zero, Python reclaims the object's memory space automatically. Normally, Python manages the reference counts for objects behind the scenes; Python programs simply make and use objects without concern for managing storage space.

Python解释器使用引用计数方案来实现垃圾回收。每一个Python对象都带有一个它被引用数量的计数器;当计数为0时,Python自动回收对象的内存空间。正常情况下,Python在幕后管理对象的引用计数;Python程序只是获取并使用对象,而不必关心存储空间的管理。

When extending or embedding Python, though, integrated C code is responsible for managing the reference counts of the Python objects it uses. How important this becomes depends on how many raw Python objects a C module processes and which Python API functions it calls. In simple programs, reference counts are of minor, if any, concern; the hello module, for instance, makes no reference-count management calls at all.

然而,扩展或内嵌Python时,集成的C代码有责任管理它所使用的Python对象的引用计数。如何实现要看C模块拥有多少原始Python对象,及它调用的Python API函数。在简单的程序中,引用计数关系不大;例如在hello模块中,就没有调用任何引用计数管理函数。

When the API is used extensively, however, this task can become significant. In later examples, we'll see calls of these forms show up:

然而当API广泛使用时,引用计数就变得重要。以后的例子中我们会看到下列形式的调用:

Py_INCREF(obj)

Increments an object's reference count.

增加对象的引用计数。

Py_DECREF(obj)

Decrements an object's reference count (reclaims if zero).

减少对象的引用计数(到0回收)。

Py_XINCREF(obj)

Behaves similarly to Py_INCREF(obj), but ignores a NULL object pointer.

类似Py_INCREF(obj),但忽略NULL对象指针。

Py_XDECREF(obj)

Behaves similarly to Py_DECREF(obj), but ignores a NULL object pointer.

类似Py_DECREF(obj),但忽略NULL对象指针。

C module functions are expected to return either an object with an incremented reference count or NULL to signal an error. As a general rule, API functions that create new objects increment their reference counts before returning them to C; unless a new object is to be passed back to Python, the C program that creates it should eventually decrement the object's counts. In the extending scenario, things are relatively simple; argument object reference counts need not be decremented, and new result objects are passed back to Python with their reference counts intact.

C模块函数应当返回一个已增加引用计数的对象,或者返回NULL表示错误。作为一个普遍规则,API函数创建一个新对象并返回给C时,会先增加它们的引用计数;除非一个新对象要回传给Python,创建它的C程序应该最终减少对象的引用计数。在扩展的情形下,事情是比较简单的;参数对象的引用计数不必减少,并且,回传给Python的新建对象的引用计数不要更改。

The upside of reference counts is that Python will never reclaim a Python object held by C as long as C increments the object's reference count (or doesn't decrement the count on an object it owns). Although it requires counter management calls, Python's garbage collector scheme is fairly well suited to C integration.

引用计数的好处是,只要C增加一个Python对象的引用计数(或者不减少它所拥有对象的引用计数),Python就不会回收这个对象。Python的垃圾回收方法非常适用于C语言集成,只需要调用计数管理函数。
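Reference counts can be watched from Python with sys.getrefcount. The sketch below assumes a standard CPython 3 build and uses Py_IncRef and Py_DecRef, the function forms of Py_XINCREF and Py_XDECREF that the C API exports, reached through ctypes.pythonapi. It shows a C-side reference being taken and then given back.

```python
import ctypes
import sys

api = ctypes.pythonapi
for func in (api.Py_IncRef, api.Py_DecRef):   # function forms of the XINCREF/XDECREF macros
    func.argtypes = [ctypes.py_object]
    func.restype = None

obj = object()                                # a fresh object that only `obj` refers to
start = sys.getrefcount(obj)

api.Py_IncRef(obj)                            # C code "takes" a reference
held = sys.getrefcount(obj)

api.Py_DecRef(obj)                            # and gives it back when finished
released = sys.getrefcount(obj)

print(held - start, released - start)         # 1 0
```

Skipping the second call would leave the count one too high forever, which is exactly the kind of leak that careless extension code produces.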

22.5.7. Other Extension Tasks: Threads

22.5.7. 其它扩展任务: 线程

Some C extensions may be required to perform additional tasks beyond data conversion, error handling, and reference counting. For instance, long-running C extension functions in threaded applications must release and later reacquire the global interpreter lock, so as to allow Python language threads to run in parallel. See the introduction to this topic in Chapter 5 for background details. Calls to long-running tasks implemented in C extensions, for example, are normally wrapped up in two C macros:

一些C扩展可能在数据转换,错误处理,引用计数之外,还要求执行额外的任务。例如,带线程的应用中,执行时间很长的C扩展函数必须释放全局的解释器锁,并在后来再次获取,这样才能让Python语言的线程并发运行。有关的背景知识请阅读第5章的介绍部份。例如,调用C扩展实现的费时的任务,一般用两个宏语句包裹:

Py_BEGIN_ALLOW_THREADS
...Perform a potentially blocking operation...
Py_END_ALLOW_THREADS


The first of these saves the thread state data structure in a local variable and releases the global lock; the second reacquires the lock and restores the thread state from the local variable. The net effect is to allow Python threads to run during the execution of the code in the enclosed block, instead of making them wait. The C code in the calling thread can run independently of and in parallel with other Python threads, as long as it doesn't reenter the Python C API until it reacquires the lock.

第一个宏用一个局部变量保存线程状态数据结构,并释放全局锁;第二个宏重新获取锁并从局部变量恢复线程状态。实际效果是,在被包裹的语句块执行同时,允许Python的线程运行,而不是让它们等待。调用线程中的C代码可以自由地和其它Python线程并发运行,但是重入Python C API时必须重新获取锁。
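The payoff of releasing the lock is easy to measure from Python. In the sketch below, time.sleep stands in for a long-running C extension function: it is implemented in C and releases the global interpreter lock while it waits. Two threads that each wait 0.3 seconds therefore finish in roughly 0.3 seconds overall rather than 0.6. The function name blocking_call and the timings are arbitrary choices for this illustration.

```python
import threading
import time

def blocking_call(seconds):
    """Stand-in for C code wrapped in the two macros: it waits without holding the lock."""
    time.sleep(seconds)

start = time.perf_counter()
workers = [threading.Thread(target=blocking_call, args=(0.3,)) for _ in range(2)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
elapsed = time.perf_counter() - start

# The two waits overlap, so the total is close to one wait, not the sum of both
print("elapsed: %.2f seconds" % elapsed)
```

A C function that held the lock for the whole call would force the two threads to run one after the other.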

The API has additional thread calls, and depending on the application, there may be other C coding requirements in general. In deference to space, though, and because we're about to meet a tool that automates much of our integration work, we'll defer to Python's integration manuals for additional details.
 
还有一些其它的线程API函数,而且不同的应用还可能有其它的普遍性需求。但是出于篇幅考虑,而且我们会使用工具来自动化大多数的集成任务,其它详细内容就留给Python集成手册了。
   

 