Performance Optimization: Mixing C and Python with Cython

Latest recommended article published on 2025-04-12 21:26:26

liyuanchao_blog

Category: Python Performance Optimization

Article link: https://blog.csdn.net/mingtiannihaoabc/article/details/119548900

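Installing and Introducing Cython

Cython lets us compile Python code directly into C.
Cython can be thought of as a translator: a piece of software that translates source code from one language into another. Similar tools include CoffeeScript and Dart. They are different tools with different source languages, but both translate to JavaScript.
Cython translates a superset of Python (an extended version of the language) into C/C++, which is then compiled into a Python module. This lets developers:
1. Call native C/C++ from Python code
2. Use static type declarations to bring Python code close to C performance
Installation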
```
pip install cython
```

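Static typing is the main feature that lets Cython, as a translator, generate optimized C code: it turns Python's dynamic behavior into static, faster code (sometimes by several orders of magnitude).
However, it also makes the code more verbose and hurts maintainability and readability. Static types are therefore generally not recommended unless there is good evidence that adding them significantly improves performance.
All C types are available to the developer, and Cython converts automatically between Python and C types on assignment. When an arbitrary-precision Python integer does not fit into the target C type, a Python OverflowError is raised.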
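The overflow rule can be observed without compiling anything. As a rough analogy (not Cython itself), the standard-library `array` module also stores values as C `int`s, so it rejects a Python integer that does not fit. The helper name `fits_in_c_int` is made up for this sketch.

```python
from array import array

def fits_in_c_int(value):
    """Return True if `value` can be stored in a C int, False on overflow."""
    try:
        array("i", [value])  # typecode "i" is a signed C int
        return True
    except OverflowError:
        return False

small_ok = fits_in_c_int(42)
huge_ok = fits_in_c_int(2 ** 100)
```

Assigning a Python integer of that size to a `cdef int` variable in Cython fails in the same way, with an OverflowError at run time.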

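Defining Function Types

Cython lets you define two different kinds of functions:

Standard Python functions: these are exactly the same as functions declared in pure Python code. To define one, use the standard def keyword. They take Python objects as arguments and return Python objects.
C functions: these are optimized versions of standard functions. They can take both Python objects and C types as arguments, and can return either as well. To define one, use the special cdef keyword. C functions can only be called from Cython code; the cpdef keyword defines a function that has both a fast C entry point and a Python-callable wrapper.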

Building a Cython Module

Create a file named SortList.pyx with the following content:



def addNode(list orginList, int capacity, list datas, list sortKeysIndex, int isReverse):
    cdef int isAdd = 1
    cdef list reduceNode = []
    cdef int index = capacity + 1
    if len(orginList) != 0:
        if __compare(datas, orginList[-1], sortKeysIndex, isReverse) == 1:
            if len(orginList) < 5:
                index = __addNormal(orginList, datas, sortKeysIndex, isReverse)
            else:
                index = __addDichotomy(orginList, datas, sortKeysIndex, isReverse)
                if capacity != 0 and len(orginList) > capacity:
                    reduceNode = orginList[-1]
        else:
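            if capacity == 0:
                orginList.append(datas)
                index = len(orginList) - 1  # 0-based position of the appended node
            else:
                if len(orginList) < capacity:
                    orginList.append(datas)
                    index = len(orginList) - 1  # 0-based position of the appended node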
                else:
                    isAdd = 0
    else:
        orginList.append(datas)
        index = 0
    return orginList, isAdd, reduceNode, index

cdef int __addNormal(list orginList, list datas, list sortKeysIndex, int isReverse):
    cdef int insertIndex = 0
    cdef int index
    for index in range(len(orginList)-1, -1, -1):
        if __compare(datas, orginList[index], sortKeysIndex, isReverse) == 1:
            insertIndex = index
        else:
            break
    orginList.insert(insertIndex, datas)
    return insertIndex

cdef int __addDichotomy(list orginList, list datas, list sortKeysIndex, int isReverse):
    cdef int forward = 1
    cdef int frontIndex = 0
    cdef int targetIndex = len(orginList) // 2
    cdef int nextIndex = len(orginList)
    cdef int insertIndex = len(orginList)
    while True:
        if __compare(datas, orginList[targetIndex], sortKeysIndex, isReverse) == 1:
            forward = 1
            nextIndex = targetIndex
        else:
            forward = 2
            frontIndex = targetIndex
        # check whether the stop condition is met
        if frontIndex == nextIndex:
            insertIndex = targetIndex if forward == 1 else targetIndex + 1
            break
        if frontIndex + 1 == nextIndex:
            if forward == 1:
                insertIndex = frontIndex if __compare(datas, orginList[frontIndex], sortKeysIndex, isReverse) == 1 else frontIndex + 1
            else:
                insertIndex = nextIndex if __compare(datas, orginList[nextIndex], sortKeysIndex, isReverse) == 1 else nextIndex + 1
            break
        targetIndex = (frontIndex + nextIndex) // 2
    orginList.insert(insertIndex, datas)
    return insertIndex

cdef int __compare(list cValue, list oValue, list sortKeysIndex, int isReverse):
    if oValue == []:
        return 0
    cdef int keyIndex1 = sortKeysIndex[0]
    cdef int keyIndex2 = sortKeysIndex[1] if len(sortKeysIndex) > 1 else 0
    cdef int result = 0
    if isReverse == 1:
        # isReverse set: smaller values rank higher
        if cValue[keyIndex1] < oValue[keyIndex1]:
            result = 1
        elif cValue[keyIndex1] == oValue[keyIndex1]:
            if keyIndex2:
                if cValue[keyIndex2] < oValue[keyIndex2]:
                    result = 1
    else:
        # isReverse not set: larger values rank higher
        if cValue[keyIndex1] > oValue[keyIndex1]:
            result = 1
        elif cValue[keyIndex1] == oValue[keyIndex1]:
            if keyIndex2:
                if cValue[keyIndex2] > oValue[keyIndex2]:
                    result = 1
    return result
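For cross-checking the compiled module, the same ordering can be sketched in pure Python with the standard `bisect` module. This is not the author's implementation: the names `rank_key` and `add_node_reference` are hypothetical, sort values are assumed to be numeric, capacity handling is left out, and every listed key is always compared.

```python
import bisect

def rank_key(node, sort_keys_index, is_reverse):
    """Map a node to a tuple where a smaller tuple means a higher rank."""
    # is_reverse == 1: smaller values rank higher, so keep the sign.
    # Otherwise larger values rank higher, so negate (numeric values assumed).
    sign = 1 if is_reverse == 1 else -1
    return tuple(sign * node[i] for i in sort_keys_index)

def add_node_reference(nodes, keys, node, sort_keys_index, is_reverse=1):
    """Insert `node` into `nodes`, keeping `keys` as a parallel sorted list.

    Returns the 0-based position at which the node was inserted.
    """
    key = rank_key(node, sort_keys_index, is_reverse)
    # bisect_right places the node after existing nodes with an equal key.
    index = bisect.bisect_right(keys, key)
    keys.insert(index, key)
    nodes.insert(index, node)
    return index

nodes, keys = [], []
positions = [
    add_node_reference(nodes, keys, node, [1, 2])
    for node in ([1, 5, 10], [2, 3, 7], [3, 5, 2])
]
```

Feeding the same nodes to this helper and to `SortList.addNode` gives a quick way to spot ordering differences, although nodes with equal keys may end up in a different relative order.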

Create setup.py with the following content:

from setuptools import setup  # distutils was removed in Python 3.12
from Cython.Build import cythonize
setup(ext_modules = cythonize("SortList.pyx"))
# setup(ext_modules = cythonize("SorterListC.pyx"))

Compiling
```
python setup.py build_ext --inplace
```
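Besides the approach above, you can also:
- Run the cython command to compile the .pyx file into a .c file, then compile the C code into a library manually with a C compiler.
- Use pyximport to import .pyx files directly, just as you would import .py files.
After compilation, the directory contains the generated SortList.c, a build directory, and the compiled extension module.
The file we actually use is the compiled extension module (a .pyd file on Windows, a .so file on Linux and macOS). On Windows the generated file name carries a platform tag, and you can rename it to SortList.pyd.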

Testing

Create a test directory with the following structure:
![test directory structure](https://img-blog.csdnimg.cn/fa41318fc8a743068fbbd16ef1dec343.pn
The code in test.py is as follows:

import SortList


# region performance test results
'''
Capacity   Init total (s)   Add (s/item)   Update (s/item)   Get (s/item)
100000     1.57             0.00045        0.012             0.024
90000      1.34             0.00038        0.011             0.021
80000      1.05             0.00034        0.01              0.018
70000      0.88             0.0003         0.008             0.015
60000      0.71             0.00023        0.006             0.012
50000      0.55             0.0002         0.005             0.009
40000      0.37             0.00016        0.0039            0.0067
30000      0.265            0.0001         0.0026            0.0039
20000      0.139            0.00004        0.0007            0.0014
15000      0.086            0.000024       0.0006            0.0009
10000      0.049            0.000014       0.00028           0.0005
'''
# endregion


class Sortor:
    """
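    Sorted list: uses binary search to keep a list of lists sorted (each inner list is called a node).
    uidIndex: index of the unique ID within a node
    capacity: capacity of the sorted list (0 means unlimited)
    sortKeysIndex: indices of the sort keys within a node; one or two keys are supported
    isReverse: True (the default) ranks smaller values first; False ranks larger values first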
    """

    def __init__(self, uidIndex, capacity, sortKeysIndex, isReverse=True):
        self.sorterList = []
        self.sorterNodeDict = {}
        self.uidIndex = uidIndex
        self.sortKeysIndex = sortKeysIndex
        self.isReverse = int(isReverse)
        self.capacity = capacity

    def addNode(self, node):
        """
        Add a node to the ranking; also used for initialization.
        node: the node data, as a list
        """
        self.delNode(node[self.uidIndex])
        self.sorterList, isAdd, reduceNode, index = SortList.addNode(self.sorterList, self.capacity, node, self.sortKeysIndex, self.isReverse)
        if isAdd == 1:
            self.sorterNodeDict[node[self.uidIndex]] = node
            if reduceNode:
                self.delNode(reduceNode[self.uidIndex])
                del self.sorterNodeDict[reduceNode[self.uidIndex]]
        return index

    def getNode(self, uid):
        """
        Look up a node by uid.
        uid: the unique value in the node data
        """
        if uid in self.sorterNodeDict:
            index = self.sorterList.index(self.sorterNodeDict[uid]) + 1
            return [index, self.sorterNodeDict[uid]]
        return [0, []]

    def getNodeList(self, startIndex=0, cnt=100):
        """
        Return a segment of the ranking.
        startIndex: start position, default 0
        cnt: number of nodes to return, default 100
        """
        # Slicing clamps to the list bounds, so no explicit length checks are needed.
        return self.sorterList[startIndex:startIndex + cnt]

    def delNode(self, uid):
        """
        Delete a node.
        :param uid: the unique value in the node data
        :return: whether the node was removed
        """
        if uid in self.sorterNodeDict:
            if self.sorterNodeDict[uid] in self.sorterList:
                self.sorterList.remove(self.sorterNodeDict[uid])
                return True
        return False

    @property
    def SortList(self):
        """The sorted list."""
        return self.sorterList


def test():
    import random
    import time
    capacity = 15000
    result = []
    testList = []
    sorter = Sortor(0, capacity, [1, 2])
    for i in range(capacity):
        node = [
            capacity + i,
            random.randint(1, 100),
            random.randint(1, 1000000),
        ]
        testList.append(node)

    t1 = time.time()
    for node in testList:
        sorter.addNode(node)
    t2 = time.time()
    print("init ranking time = %s" % (t2-t1))

    testList = []
    for i in range(2000):
        node = [
            10 + i,
            i*2+i,
            i*i+i
        ]
        testList.append(node)
    t1 = time.time()
    for node in testList:
        sorter.addNode(node)
    t2 = time.time()
    print("add node time = %s" % ((t2 - t1)/2000))
    testList = []
    for i in range(200):
        node = [
            capacity + i,
            i*2+i,
            i*i+i
        ]
        testList.append(node)
    t1 = time.time()
    for node in testList:
        sorter.addNode(node)
    t2 = time.time()
    print("update node time = %s" % ((t2 - t1)/200))
    t1 = time.time()
    for i in range(200):
        node = sorter.getNode(capacity+i)
    t2 = time.time()
    print("get node time = %s" % ((t2 - t1)/200))

test()

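Handling More Complex Models

The code above is a fairly simple model. In more complex cases, Cython typically works with two kinds of files:

Definition files: these have the .pxd extension and contain C declarations of the variables, types, and function names that other Cython files need to use.
Implementation files: these have the .pyx extension and contain the implementations of the functions declared in the .pxd file.

Definition files usually contain C type declarations, declarations of external C functions or variables, and declarations of C functions defined in the module. They contain no implementations of C or Python functions, no Python class definitions, and no executable statements.

Official Example

Official documentation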
dishes.pxd:

cdef enum otherstuff:
    sausage, eggs, lettuce

cdef struct spamdish:
    int oz_of_spam
    otherstuff filler

restaurant.pyx:

from __future__ import print_function
cimport dishes
from dishes cimport spamdish

cdef void prepare(spamdish *d):
    d.oz_of_spam = 42
    d.filler = dishes.sausage

def serve():
    cdef spamdish d
    prepare(&d)
    print(f'{d.oz_of_spam} oz spam, filler no. {d.filler}')

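By default, when cimport runs, it searches the include path for a file named modulename.pxd that matches the module name. Whenever a definition file changes, every file that cimports it must be recompiled. Fortunately, Cython.Build.cythonize takes care of this for us.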

Calling C Functions

pass

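Limitations

Generator expressions

Because of evaluation-scope issues, using an iterable inside a generator expression can cause problems.
In addition, Cython evaluates the iterable of a generator expression inside the generator body, whereas CPython evaluates it outside, before the generator is created.
CPython generators expose attributes that allow introspection; Cython's generators do not yet support all of them.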
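CPython's side of this difference is easy to observe: the outermost iterable of a generator expression is evaluated as soon as the expression is created, before any item is requested. The sketch below records when that happens. `noisy_iterable` is a hypothetical helper, and how a given Cython release behaves is not shown here.

```python
def noisy_iterable(log):
    """Record the moment the iterable is evaluated, then return it."""
    log.append("iterable evaluated")
    return [1, 2, 3]

log = []
gen = (x * 2 for x in noisy_iterable(log))  # nothing consumed yet
log_before_iteration = list(log)            # snapshot taken before iterating
values = list(gen)
```

Under CPython the snapshot already contains the log entry, which shows that the iterable was evaluated outside the generator body.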

Comparing char* literals

Cython currently compares char* byte strings by pointer rather than by their actual contents:

cdef char* str = "test string"
print(str == b"test string")

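The code above is not guaranteed to print True. The result depends on the address held by the pointer for the first string, not on the string's actual contents.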
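Plain Python has a close analogue of this address-versus-content distinction: `is` compares object identity while `==` compares contents. A minimal sketch in CPython, using two separately built bytes objects with the same contents (an analogy only, not Cython's char* handling):

```python
first = b"test string"
# Build a second bytes object with equal contents but its own storage.
second = bytes(bytearray(b"test string"))

same_contents = first == second  # compares the bytes themselves
same_object = first is second    # compares identity (the address in CPython)
```

In Cython, one way to get a content comparison is to convert the char* to a Python bytes object before comparing.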

Tuples as function arguments

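Stack frames

Cython currently generates fake tracebacks as part of its exception propagation mechanism. These tracebacks do not fill in the locals and co_code values. Doing so properly would require generating stack frames at function-call time, at a cost in performance. It is unclear whether the Cython team will address this.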