Embedding Python in C/C++: Part I

最新推荐文章于 2024-08-11 20:33:24 发布

「已注销」

最新推荐文章于 2024-08-11 20:33:24 发布

阅读量3k

点赞数

分类专栏： python 文章标签： python thread function class reference null

python 专栏收录该内容

3 篇文章 0 订阅

订阅专栏

Introduction

介绍

Inspired by the article "Embedding Python in Multi-Threaded C/C++ Applications" (Linux Journal), I felt the need for a more comprehensive coverage on the topic of embedding Python. While writing this article, I had two objectives:

受《在多线程C/C++应用程序中嵌入Python》（Linux杂志）这篇文章的启发，我觉得自已需要对Pyrthon的嵌入作更全面的理解。当写这篇文章时，我有两个目标：

This is written for programmers who are more experienced in C/C++ than in Python, the tutorial takes a practical approach and omits all theoretical discussions.这篇文章是写给那些相比Python,更熟悉C/C++的程序员的，教程以实际编程为主，同时也引发所有理论探讨。
Try to maintain Python's cross-platform compatibility when writing the embedding code.当写嵌入的代码时，尝试去保持Python的跨平台的兼容性，

Now, you have got some modules written in Python by others and you want to use them. You are experienced in C/C++, but fairly new to Python. You might wonder if you can find a tool to convert them to C code, like the conversion from FORTRAN. The answer is no. Some tools can help you generate the executable from a Python module. Is the problem solved? No. Converting code into executables usually makes things even more complicated, as you must figure out how your C/C++ application communicates with the executable "black-box".

现在，你已经得到一些由其它人写的Python模块，并且你想使用它们。你熟悉C/C++，但是对Python却很陌生。你可能想知道是不是有一个工具，它可以将这些Ｐython代码转化成Ｃ代码，就像ＦＯＲＴＲＡＮ一样。答案是否定的。一些工具可以帮助你从Ｐython模块中生成可执行的东西。问题解决了吗？没有。将代码转化成可执行文件通常将事情变得更加复杂，因为你必须弄清楚你的Ｃ／Ｃ＋＋应用程序怎样与这些可执行的黑例子通信。

I am going to introduce C/C++ programmers to Python/C API, a C library that helps to embed python modules into C/C++ applications. The API library provides a bunch of C routines to initialize the Python Interpreter, call into your Python modules and finish up the embedding. The library is built with Python and distributed with all the recent Python releases.

我将向Ｃ／Ｃ＋＋程序员介绍Ｐythogn／Ｃ的ＡＰＩ，一个帮助将Ｐuthon模块嵌入到Ｃ／Ｃ＋＋程序中的Ｃ库。这个ＡＰＩ库提供一个Ｃ选择时分支来初始化Ｐython解释器，调用你的Ｐython模块并且完成嵌入。这个库是和Ｐython一起编译的，而且安装在所有最近的发布版中。

Part I of this article series discusses the basics of Python embedding. Part II will move on to more advanced topics. This tutorial does not teach the Python language systematically, but I will briefly describe how the Python code works when it comes up. The emphasis will be on how to integrate Python modules with your C/C++ applications. See article: "Embedding Python in C/C++: Part II".

这篇文章的第一部分讨论Ｐython嵌入的基础。第二部分讨论更高级的话题。这篇教程不教Ｐython语法，但是当有需要时，我将会简洁的描述Ｐython代码是怎样工作的。重点是怎样将Ｐython模块整合进你的Ｃ／Ｃ＋＋应用程序中。见文章《在C/C++中嵌入Ｐython：Ｐart：　ＩＩ》

In order to use the source code, you should install a recent Python release, Visual C++ (or GCC compiler on Linux). The environment that I have used to test is: Python 2.4 (Windows and Linux), Visual C++ 6.0 (Windows) or GCC 3.2 (RedHat 8.0 Linux). With Visual C++, select the Release configuration to build. Debug configuration requires Python debug library "python24_d.lib", which is not delivered with normal distributions.

为了使用源码，你应该安装一个最近的Ｐython版本，ＶisualＣ＋＋（或者Ｌinux上的ＧＣＣ编译器）。我用来测试的环境是：。。。。略

Background

背景

Python is a powerful interpreted language, like Java, Perl and PHP. It supports a long list of great features that any programmer would expect, two of my favorite features are "simple" and "portable". Along with the available tools and libraries, Python makes a good language for modeling and simulation developers. Best of all, it's free and the tools and libraries written for Python programmers are also free. For more details on the language, visit the official website.

Ｐython是一个强大的粘合语言，像Ｊava，　Ｐerl和ＰＨＰ。它支持一长串的任何程序员都期望的特性。我最喜欢的两个特性是“简单”和“轻便”。结合一些可用的工具和库，Ｐyhotn成为一个优秀的建模语言和模块开发者。最优的是，它以及所有用它写成的库也是自由免费的。更多细节见官方主页。

Embedding basics: functions, classes and methods

基础嵌入：函数，类和方法

First, let us start from a sample C program that calls a function within a Python module. Here is the source file "call_function.c":

首先，让我们以一个简单的Ｃ程序调用一个在Ｐython模块中的函数起步。下面是“call_function.c”中的代码：

// call_function.c - A sample of calling 
// python functions from C code
// 
#include <Python.h>

int main(int argc, char *argv[])
{
    PyObject *pName, *pModule, *pDict, *pFunc, *pValue;

    if (argc < 3) 
    {
        printf("Usage: exe_name python_source function_name\n");
        return 1;
    }

    // Initialize the Python Interpreter
    Py_Initialize();

    // Build the name object
    pName = PyString_FromString(argv[1]);

    // Load the module object
    pModule = PyImport_Import(pName);

    // pDict is a borrowed reference 
    pDict = PyModule_GetDict(pModule);

    // pFunc is also a borrowed reference 
    pFunc = PyDict_GetItemString(pDict, argv[2]);

    if (PyCallable_Check(pFunc)) 
    {
        PyObject_CallObject(pFunc, NULL);
    } else 
    {
        PyErr_Print();
    }

    // Clean up
    Py_DECREF(pModule);
    Py_DECREF(pName);

    // Finish the Python Interpreter
    Py_Finalize();

    return 0;
}

The Python source file " py_function.py " is as follows:

Ｐython源码文件“py_function.py”如下：

'''py_function.py - Python source designed to '''
'''demonstrate the use of python embedding'''

def multiply():
    c = 12345*6789
    print 'The result of 12345 x 6789 :', c
    return c

Note that checks for the validity of objects are omitted for brevity. On Windows, simply compile the C source and get the executable, which we call " call_function.exe ". To run it, enter the command line " call_function py_function multiply ". The second argument is the name of the Python file (without extension), which when loaded becomes the module name. The third argument is the name of the Python function you are going to call within the module. The Python source does not take part in compiling or linking; it's only loaded and interpreted at run-time. The output from the execution is:

注意，为了简明起见，去掉了检查对象的有效性。在Ｗindow上，直接编译Ｃ源码得到可执行程序“call_function.exe”。为了运行它，　在命令行输入“call_function py_function multiply”，　第二个参数是Ｐython文件的名字（除去后缀），当加载时它会变成模块名。第三个参数是你即将要从这个模块中调用的Ｐython函数的名字。Ｐython源文件没有参与编译或是链接。它只是是运行时加载和翻译。可执行文件的输出如下：

The result of 12345 x 6789 : 83810205

The C code itself is self-explanatory, except that:

Ｃ代码是自解释的，除了以下：

Everything in Python is an object. pDict and pFunc are borrowed references so we don't need to Py_DECREF() them.Ｐython中的一切都是对象，pDict和pＦunc是borrowed reference,所以我们不需要在它们上运行Ｐy_DECREF()
All the Py_XXX and PyXXX_XXX calls are Python/C API calls.所有的Ｐy_XXX和PyXXX_XXX调用都是Ｐython／Ｃ的ＡＰＩ调用
The code will compile and run on all the platforms that Python supports.这段代码可以在所有Ｐython支持的平台上编译运行。

Now, we want to pass arguments to the Python function. We add a block to handle arguments to the call:

现在，我们想给Ｐython函数传递参数，我们增加一个代码块来处理参数调用：

if (PyCallable_Check(pFunc)) 
{
    // Prepare the argument list for the call
    if( argc > 3 )
    {
            pArgs = PyTuple_New(argc - 3);
            for (i = 0; i < argc - 3; i++)
            {
            pValue = PyInt_FromLong(atoi(argv[i + 3]));
                    if (!pValue)
                    {
                PyErr_Print();
                         return 1;
                    }
                    PyTuple_SetItem(pArgs, i, pValue);    
            }
            
        pValue = PyObject_CallObject(pFunc, pArgs);

        if (pArgs != NULL)
        {
            Py_DECREF(pArgs);
        }
    } else
    {
        pValue = PyObject_CallObject(pFunc, NULL);
    }

    if (pValue != NULL) 
    {
        printf("Return of call : %d\n", PyInt_AsLong(pValue));
        Py_DECREF(pValue);
    }
    else 
    {
        PyErr_Print();
    }
    
    // some code omitted...
}

The new C source adds a block of "Prepare the argument list for the call" and a check of the returned value. It creates a tuple (list-like type) to store all the parameters for the call. You can run the command " call_function py_source multiply1 6 7 " and get the output:

新的Ｃ代码增加一块“为调用准备参数列表”的代码块，并且检查了返回值。它新建了一个元组（像列表类型）来存储所有的调用参数。你可以运行命令“call_function py_source multiply1 6 7”，得到的结果如下：

The result of 6 x 7 : 42
Return of call : 42

Writing classes in Python is easy. It is also easy to use a Python class in your C code. All you need to do is create an instance of the object and call its methods, just as you call normal functions. Here is an example:

写一个类在Ｐython中是一件简单的事件，在Ｃ代码中使用Ｐython的类也同样简单。你所需要做的就是创建一个对象实例，然后调用它的方法，就像你调用普通函数一样，下面是例子：

// call_class.c - A sample of python embedding 
// (calling python classes from C code)
//
#include <Python.h>

int main(int argc, char *argv[])
{
    PyObject *pName, *pModule, *pDict, 
                  *pClass, *pInstance, *pValue;
    int i, arg[2];

    if (argc < 4) 
    {
        printf(
          "Usage: exe_name python_fileclass_name function_name\n");
        return 1;
    }

    // some code omitted...
   
    // Build the name of a callable class 
    pClass = PyDict_GetItemString(pDict, argv[2]);

    // Create an instance of the class
    if (PyCallable_Check(pClass))
    {
        pInstance = PyObject_CallObject(pClass, NULL); 
    }

    // Build the parameter list
    if( argc > 4 )
    {
        for (i = 0; i < argc - 4; i++)
            {
                    arg[i] = atoi(argv[i + 4]);
            }
        // Call a method of the class with two parameters
        pValue = PyObject_CallMethod(pInstance, 
                    argv[3], "(ii)", arg[0], arg[1]);
    } else
    {
        // Call a method of the class with no parameters
        pValue = PyObject_CallMethod(pInstance, argv[3], NULL);
    }
    if (pValue != NULL) 
    {
        printf("Return of call : %d\n", PyInt_AsLong(pValue));
        Py_DECREF(pValue);
    }
    else 
    {
        PyErr_Print();
    }
   
    // some code omitted...
}

The third parameter to PyObject_CallMethod() , " (ii) " is a format string, which indicates that the next arguments are two integers. Note that PyObject_CallMethod() takes C variables types as its arguments, not Python objects. This is different from the other calls we have seen so far. The Python source " py_class.py " is as follows:

ＰyＯbject_CallMethod()的第三个参数“（ii）”是一个格式字符串，它指示接下来的参数是两个整数，注意，PyObject_CallMethod（）使用Ｃ变作类型作为它的参数，而不是Ｐython　objects。这与我们以前所见的情况不一样。Ｐy_call.py源码中的内容如下：

'''py_class.py - Python source designed to demonstrate''' 
'''the use of python embedding'''

class Multiply: 
    def __init__(self): 
            self.a = 6 
            self.b = 5 
    
    def multiply(self):
            c = self.a*self.b
    print 'The result of', self.a, 'x', self.b, ':', c
            return c
    
    def multiply2(self, a, b):
            c = a*b
    print 'The result of', a, 'x', b, ':', c
    return c

To run the application, you add a class name between the module and function names, which is "Multiply" in this case. The command line becomes "call_class py_class Multiply multiply" or "call_class py_class Multiply multiply2 9 9".

为了运行程序，你在模块名和函数名之间增加一个类名，　在当前情况下就是"Multiply"，命令行变成"call_call py_class Multiply multiply "或者"call_class py_class Multiply multiply2 9 9"

Multi-threaded Python embedding

多线程下的Python嵌入

With the above preparations, we are ready for some serious business. The Python module and your C/C++ application have to run simultaneously from time to time. This is not uncommon in simulation communities. For instance, the Python module to be embedded is part of a real-time simulation and you run it in parallel with the rest of your simulation. Meanwhile, it interacts with the rest at run-time. One conventional technique is multi-threading. There are a number of options for multi-threaded embedding. We are going to discuss two of them here.

有了以上的准备，我们已经可以开始一些严肃事情了。Ｐython模块和你的Ｃ／Ｃ＋＋应用程序必须同时运行。这在模拟通信中并不罕见。例如，Ｐython模块嵌入到一个实时的模块系统中，它与你其它的仿真同时运行。同时，在运行时它也与其余的仿真交互。一个方便技术是多线程。已经有大量的关系多线程嵌入的观点。我们将讨论其中的两个：

In one approach you create a separate thread in C and call the Python module from the thread function. This is natural and correct, except that you need to protect the Python Interpreter state. Basically, we lock the Python Interpreter before you use it and release after the use so that Python can track its states for the different calling threads. Python provides the global lock for this purpose. Let us look at some source code first. Here is the complete content of "call_thread.c":

一种方法是你在Ｃ产生一个独立的线程，然后从线程函数里调用Ｐython模块。这是非常自然和正确的，除非你需要保护Ｐython解释器状态。基本上，我们在你使用Ｐython解释器之前锁住它，然后在你使用完后释放它，以致Ｐython能够跟踪到不同的线程调用。Ｐython提供全局锁来达到这个目的。让我们先看一下源码。下面是“call_thread.c”的完整内容：

// call_thread.c - A sample of python embedding 
// (C thread calling python functions)
// 
#include <Python.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>

#ifdef WIN32    // Windows includes
#include <Windows.h>
#include <process.h>
#define sleep(x) Sleep(1000*x)
HANDLE handle;
#else    // POSIX includes
#include <pthread.h>
pthread_t mythread;
#endif

void ThreadProc(void*);

#define NUM_ARGUMENTS 5
typedef struct 
{
   int argc;
   char *argv[NUM_ARGUMENTS]; 
} CMD_LINE_STRUCT;

int main(int argc, char *argv[])
{
    int i;
    CMD_LINE_STRUCT cmd;
    pthread_t mythread;

    cmd.argc = argc;
    for( i = 0; i < NUM_ARGUMENTS; i++ )
    {
        cmd.argv[i] = argv[i];
    }

    if (argc < 3) 
    {
        fprintf(stderr,
          "Usage: call python_filename function_name [args]\n");
        return 1;
    }

    // Create a thread
#ifdef WIN32
    // Windows code
    handle = (HANDLE) _beginthread( ThreadProc,0,&cmd);
#else
    // POSIX code
    pthread_create( &mythread, NULL, 
                 ThreadProc, (void*)&cmd );
#endif

    // Random testing code
    for(i = 0; i < 10; i++)
    {
        printf("Printed from the main thread.\n");
    sleep(1);
    }

    printf("Main Thread waiting for My Thread to complete...\n");

    // Join and wait for the created thread to complete...
#ifdef WIN32
    // Windows code
    WaitForSingleObject(handle,INFINITE);
#else
    // POSIX code
    pthread_join(mythread, NULL);
#endif

    printf("Main thread finished gracefully.\n");

    return 0;
}

void ThreadProc( void *data )
{
    int i;
    PyObject *pName, *pModule, *pDict, 
               *pFunc, *pInstance, *pArgs, *pValue;
    PyThreadState *mainThreadState, *myThreadState, *tempState;
    PyInterpreterState *mainInterpreterState;
    
    CMD_LINE_STRUCT* arg = (CMD_LINE_STRUCT*)data;

    // Random testing code
    for(i = 0; i < 15; i++)
    {
        printf("...Printed from my thread.\n");
    sleep(1);
    }

    // Initialize python inerpreter
    Py_Initialize();
        
    // Initialize thread support
    PyEval_InitThreads();

    // Save a pointer to the main PyThreadState object
    mainThreadState = PyThreadState_Get();

    // Get a reference to the PyInterpreterState
    mainInterpreterState = mainThreadState->interp;

    // Create a thread state object for this thread
    myThreadState = PyThreadState_New(mainInterpreterState);
    
    // Release global lock
    PyEval_ReleaseLock();
    
    // Acquire global lock
    PyEval_AcquireLock();

    // Swap in my thread state
    tempState = PyThreadState_Swap(myThreadState);

    // Now execute some python code (call python functions)
    pName = PyString_FromString(arg->argv[1]);
    pModule = PyImport_Import(pName);

    // pDict and pFunc are borrowed references 
    pDict = PyModule_GetDict(pModule);
    pFunc = PyDict_GetItemString(pDict, arg->argv[2]);

    if (PyCallable_Check(pFunc)) 
    {
        pValue = PyObject_CallObject(pFunc, NULL);
    }
    else {
        PyErr_Print();
    }

    // Clean up
    Py_DECREF(pModule);
    Py_DECREF(pName);

    // Swap out the current thread
    PyThreadState_Swap(tempState);

    // Release global lock
    PyEval_ReleaseLock();
    
    // Clean up thread state
    PyThreadState_Clear(myThreadState);
    PyThreadState_Delete(myThreadState);

    Py_Finalize();
    printf("My thread is finishing...\n");

    // Exiting the thread
#ifdef WIN32
    // Windows code
    _endthread();
#else
    // POSIX code
    pthread_exit(NULL);
#endif
}

The thread function needs a bit of explanation. PyEval_InitThreads() initializes Python's thread support.PyThreadState_Swap(myThreadState) swaps in the state for the current thread, andPyThreadState_Swap(tempState) swaps it out. The Python Interpreter will save what happens between the two calls as the state data related to this thread. As a matter of fact, Python saves the data for each thread that is using the Interpreter so that the thread states are mutually exclusive. But it is your responsibility to create and maintain a state for each C thread. You may wonder why we didn't call the first PyEvel_AcquireLock(). Because PyEval_InitThreads() does so by default. In other cases, we do need to use PyEvel_AcquireLock() and PyEvel_ReleaseLock() in pairs.

线程函数需要一些解释。ＰyEval_InitThreads初始化Ｐython的线程支持。ＰyThreadState_Swap(myThreadState)将当前进程状态换入，ＰyＴhreadＳtate_Swap(tempState)则是换出。Ｐython解释器将保存这两个换进换出之间的状态数据。事实上，Ｐython保存每个使用解释器的线程数据，所以线程状态是互斥的。但是，为每个Ｃ线程道理和维护这个状态是你的责任。你可能会好奇为什么我们不先调用PyEvel_AcquireLock().因为它默认做这些事情。在其它情况下，我们确实需要使用PyEvel_AcquireLock()和PyEvel_ReleaseLock（）成对的使用

Run "call_thread py_thread pythonFunc" and you can get the output as shown below. The file "py_thread.py" defines a function called pythonFunc() in which a similar random testing block prints "print from pythonFunc..." to screen fifteen times.

Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
...Printed from my thread.
Printed from the main thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Main Thread waiting for My Thread to complete...
...Printed from my thread.
...Printed from my thread.
...Printed from my thread.
...Printed from my thread.
...Printed from my thread.
My thread is finishing...
Main thread finished gracefully.

Obviously, the implementation is getting complicated because writing multi-threading code in C/C++ is not trivial. Although the code is portable, it contains numerous patches, which require detailed knowledge of the system call interfaces for specific platforms. Fortunately, Python has done most of this for us, which brings up the second solution to our question under discussion, namely, letting Python handle the multi-threading. This time the Python code is enhanced to add a threading model:

''' Demonstrate the use of python threading'''

import time
import threading

class MyThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
    def run(self):
        for i in range(15):
            print 'printed from MyThread...'
            time.sleep(1)

def createThread():
    print 'Create and run MyThread'
    background = MyThread()
    background.start()
    print  'Main thread continues to run in foreground.'
    for i in range(10):
        print 'printed from Main thread.'
        time.sleep(1)
    print  'Main thread joins MyThread and waits until it is done...'
    background.join() # Wait for the background task to finish
    print  'The program completed gracefully.'

The C code does not handle threading any more. All it needs to do is just call createThread(). Try to use the previous "call_function.c". You can run "call_function py_thread createThread" to see how the output looks like. In this case, the second solution is cleaner and simpler. More importantly, Python threading model is portable. While the C threading code for Unix and Windows is different, it remains the same in Python.

The Python function createThread() is not required if our C code calls the thread class' start() and joint() methods. The relevant changes are listed in the following (from the C source file "call_thread_2.c"):

// Create instance
pInstance = PyObject_CallObject(pClass, NULL); 

PyObject_CallMethod(pInstance, "start", NULL);

i = 0;
while(i<10)
{
printf("Printed from C thread...\n");

// !!!Important!!! C thread will not release CPU to 
// Python thread without the following call.
PyObject_CallMethod(pInstance, "join", "(f)", 0.001);        
Sleep(1000);
i++;
}

printf(
  "C thread join and wait for Python thread to complete...\n");
PyObject_CallMethod(pInstance, "join", NULL);        

printf("Program completed gracefully.\n");

Basically, after you create the class instance, call its start() method to create a new thread and execute its run() method. Note that without frequent short joins to the created thread, the created thread can only get executed at the beginning and the main thread will not release any CPU to it until it's done. You may try this out by commenting the joint call within the while loop. The behavior is somehow different from that of the previous case where we had called start() from within the Python module. This seems to be a feature of multi-threading which is not documented in the Python Library Reference .