[Advanced Tutorial] ctypes: From Python Novice to C Master

DWORD WlanQueryInterface(
          HANDLE                hClientHandle,
          const GUID              *pInterfaceGuid,
          WLAN_INTF_OPCODE    OpCode,
          PVOID                  pReserved,
          PDWORD                pdwDataSize,
          PVOID                  *ppData,
          PWLAN_OPCODE_VALUE_TYPE pWlanOpcodeValueType
)

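As everyone knows, Python has its own data types, so even with the DLL entry point available you cannot call WlanQueryInterface directly from Python code. This is where ctypes comes in. Take the pywifi source as an example: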

def __wlan_query_interface(self, handle, iface_guid, opcode, data_size, data, opcode_value_type):
        func = native_wifi.WlanQueryInterface
        func.argtypes = [HANDLE, POINTER(GUID), DWORD, c_void_p, POINTER(DWORD), POINTER(POINTER(DWORD)), POINTER(DWORD)]
        # As quoted; ctypes only honors `restype` (singular, not a list), so this line has no effect.
        func.restypes = [DWORD]
        return func(handle, iface_guid, opcode, None, data_size, data, opcode_value_type)

def status(self, obj):
        """Get the wifi interface status."""
        data_size = DWORD()
        data = PDWORD()
        opcode_value_type = DWORD()
        self.__wlan_query_interface(self._handle, obj['guid'], 6, byref(data_size), byref(data), byref(opcode_value_type))
        return status_dict[data.contents.value]

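That said, the wireless interface query method that pywifi provides (status) exposes only a small part of what the underlying API (WlanQueryInterface) can query, although a simple xxx.status(obj) is all it takes to run a query.

What is ctypes? ctypes is a foreign function library for Python. It provides C-compatible data types and an entry point for calling functions in DLLs or shared libraries. It can be used to wrap those functions so that, once wrapped, they can be called in "pure Python" form (like status above).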
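
To make the wrapping idea concrete, here is a minimal sketch that wraps the C function strlen in a plain Python function. The helper names load_c_runtime and c_strlen are hypothetical, and the way the C runtime is located is an assumption chosen so the snippet runs on both Windows and POSIX systems.

```python
import ctypes
import ctypes.util
import sys


def load_c_runtime():
    """Load the platform C runtime (hypothetical helper for this sketch)."""
    if sys.platform == "win32":
        return ctypes.cdll.msvcrt
    # find_library may return None; CDLL(None) then exposes the symbols
    # already loaded into the current process, which include libc on POSIX.
    return ctypes.CDLL(ctypes.util.find_library("c"))


_libc = load_c_runtime()

# Declare the C prototype: size_t strlen(const char *s);
_libc.strlen.argtypes = [ctypes.c_char_p]
_libc.strlen.restype = ctypes.c_size_t  # `restype` is singular and not a list


def c_strlen(data: bytes) -> int:
    """Pure-Python-looking wrapper around the C strlen function."""
    return _libc.strlen(data)


print(c_strlen(b"ctypes"))
```

Callers of c_strlen never touch ctypes themselves, which is the same role status plays for WlanQueryInterface.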

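2. Hello, CALLING

2.1 Dynamic-link libraries (DLL)

A dynamic-link library is a precompiled library of data and functions that a program can import and use directly at run time. A DLL must be loaded before use. For this, ctypes provides three objects: cdll, windll (Windows only), and oledll (Windows only), and makes loading a DLL look like accessing an attribute of one of these objects.

cdll: loads libraries whose functions use the standard cdecl calling convention.

windll: loads libraries whose functions use the stdcall calling convention.

oledll: loads libraries whose functions use the stdcall calling convention, and additionally assumes that the functions return a Windows HRESULT error code (an OSError/WindowsError exception is raised automatically when a call fails).

msvcrt.dll and kernel32.dll serve as examples of loading a DLL.

msvcrt.dll: the MS standard C library, whose functions are declared with the cdecl calling convention; load it through cdll.

kernel32.dll: Windows kernel-level functions declared with the stdcall calling convention; load it through windll.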

>>> from ctypes import *
>>> cdll.msvcrt
<CDLL 'msvcrt', handle 7ffbf6930000 at 0x183d91aeac8>
>>> windll.kernel32
<WinDLL 'kernel32', handle 7ffbf6720000 at 0x183d921ae80>
>>>

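Windows appends the ".dll" file suffix automatically. The standard C library accessed through cdll.msvcrt may be an outdated version that is incompatible with the one Python itself uses. So, wherever possible, use Python's own functionality, or import the msvcrt module.

On Linux, the file name including its extension must be given when loading a library, so attribute access no longer works. Instead, either use the LoadLibrary() method of the DLL loader objects, or create a CDLL instance by calling its constructor (example from the official documentation):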

>>> cdll.LoadLibrary("libc.so.6") 
<CDLL 'libc.so.6', handle ... at ...>
>>> libc = CDLL("libc.so.6")      
>>> libc                          
<CDLL 'libc.so.6', handle ... at ...>
>>>

Before loading, you first need to locate the DLL/shared library (here on a local Windows machine, with user32.dll as the example):

>>> from ctypes.util import find_library
>>> from ctypes import *
>>> find_library('user32')
'C:\\Windows\\system32\\user32.dll'
>>> cdll.LoadLibrary('C:\\Windows\\system32\\user32.dll')
<CDLL 'C:\Windows\system32\user32.dll', handle 7ffa00110000 at 0x23eaf6eeb70>
>>>

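For a shared library wrapped with ctypes, a better habit is to avoid locating the library with find_library() at run time; instead, determine the library name at development time and hardcode it into the wrapper module.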
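
A minimal sketch of that habit follows. The table of library names is an assumption covering a few common platforms and should be adjusted to the systems you actually target; load_libc is a hypothetical helper. The runtime search is kept only as a last resort for platforms the table does not cover.

```python
import ctypes
import ctypes.util
import sys

# Library names fixed at development time (assumed values for common
# platforms, not an exhaustive or authoritative list).
_LIBC_NAMES = {
    "linux": "libc.so.6",
    "darwin": "libc.dylib",
    "win32": "msvcrt",
}


def load_libc():
    """Load the C runtime by a hardcoded name, searching only as a fallback."""
    name = _LIBC_NAMES.get(sys.platform)
    if name is not None:
        try:
            return ctypes.CDLL(name)
        except OSError:
            pass  # fall through to the runtime search below
    return ctypes.CDLL(ctypes.util.find_library("c"))


libc = load_libc()
print(libc.abs(-42))
```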

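2.2 Functions (FUNCTION)

How do you get at the functions in a DLL/shared library?

It is simple: access them as attributes of a class instance (here, the DLL object).

Every function accessed this way becomes an attribute of the DLL object.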

>>> from ctypes import *
>>> libc=cdll.msvcrt
>>> libc.printf
<_FuncPtr object at 0x00000183D91DFA08>
>>> help(libc.printf)
Help on _FuncPtr in module ctypes object:
printf = class _FuncPtr(_ctypes.PyCFuncPtr)
 |  Function Pointer 
 |  Method resolution order:
 |      _FuncPtr
 |      _ctypes.PyCFuncPtr
 |      _ctypes._CData
 |      builtins.object
 |  __call__(self, /, *args, **kwargs)
>>> windll.kernel32.GetModuleHandleA
<_FuncPtr object at 0x00000183D91DFAD8>
>>> windll.kernel32.MyOwnFunction
AttributeError: function 'MyOwnFunction' not found
>>>

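2.2.1 A or W

APIs that operate on strings specify a character set in their declarations. Win32 DLLs such as kernel32.dll and user32.dll usually export both an ANSI version (name ending in A, e.g. GetModuleHandleA) and a UNICODE version (name ending in W, e.g. GetModuleHandleW) of the same function.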

/* ANSI version */
HMODULE GetModuleHandleA(LPCSTR lpModuleName);
/* UNICODE version */
HMODULE GetModuleHandleW(LPCWSTR lpModuleName);

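These are the C prototypes of the Win32 API function GetModuleHandle, which returns a module handle for a given module name. Whether GetModuleHandle stands for the A or the W version depends on whether the UNICODE macro is defined.

windll does not try to pick the actual version of GetModuleHandle by magic; nothing comes from nothing. You must explicitly say whether you are accessing GetModuleHandleA or GetModuleHandleW, and then call it with bytes or str objects respectively.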
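
The bytes-versus-str split can be tried on any platform with the C functions strlen (narrow strings) and wcslen (wide strings), which play the same roles as an A and a W function. This is only a sketch: the library lookup is an assumption made for portability, and the GetModuleHandle part runs on Windows only.

```python
import ctypes
import ctypes.util
import sys

if sys.platform == "win32":
    libc = ctypes.cdll.msvcrt
else:
    libc = ctypes.CDLL(ctypes.util.find_library("c"))

# size_t strlen(const char *s);     narrow ("A"-style) string, takes bytes
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# size_t wcslen(const wchar_t *s);  wide ("W"-style) string, takes str
libc.wcslen.argtypes = [ctypes.c_wchar_p]
libc.wcslen.restype = ctypes.c_size_t

narrow_len = libc.strlen(b"kernel32")
wide_len = libc.wcslen("kernel32")
print(narrow_len, wide_len)

if sys.platform == "win32":
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32")
    kernel32.GetModuleHandleA.argtypes = [wintypes.LPCSTR]
    kernel32.GetModuleHandleA.restype = wintypes.HMODULE
    kernel32.GetModuleHandleW.argtypes = [wintypes.LPCWSTR]
    kernel32.GetModuleHandleW.restype = wintypes.HMODULE
    # Both variants name the same module, so the two handles should match.
    print(kernel32.GetModuleHandleA(b"kernel32")
          == kernel32.GetModuleHandleW("kernel32"))
```

Declaring argtypes makes the choice explicit: with it in place, ctypes converts bytes for the narrow function and str for the wide one.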

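2.2.2 Fails the syntax check, yet runs just fine

Sometimes a DLL exports functions whose names are not valid Python identifiers (such as ??2@YAPAXI@Z). In that case, use getattr() to retrieve the function: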

>>> cdll.msvcrt.??0__non_rtti_object@@QEAA@AEBV0@@Z
SyntaxError: invalid syntax
>>> getattr(cdll.msvcrt, "??0__non_rtti_object@@QEAA@AEBV0@@Z")
<_FuncPtr object at 0x00000183D91DFBA8>
>>>

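2.2.3 Indexing DLL functions

On Windows, some DLLs export functions not by name but by ordinal. These functions can be accessed by indexing the DLL object with the ordinal number (indexing by name works as well):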

>>> windll.kernel32[1]
<_FuncPtr object at 0x000002FD3AAD0AD8>
>>> windll.kernel32[0]
AttributeError: function ordinal 0 not found
>>> windll.kernel32['GetModuleHandleA']
<_FuncPtr object at 0x000002FD3AAD0A08>

2.3 RUNNING, Functions

DLL functions are called just like any other Python callable.

The examples below call time() and GetModuleHandleA().

>>> libc=cdll.msvcrt
>>> libc.time(None)
1591282222
>>> hex(windll.kernel32.GetModuleHandleA(None))
'0x1c700000'
>>>

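If, after a function call, ctypes detects that an invalid number of arguments was passed, it raises ValueError. This behavior should not be relied upon: it was deprecated in Python 3.6.2 and removed in Python 3.7.

Discussion: cdll or windll?

In theory, calling a function exported as stdcall with the cdecl calling convention raises ValueError (and vice versa):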

>>> cdll.kernel32.GetModuleHandleA(None) 
ValueError: Procedure probably called with not enough arguments (4 bytes missing)
>>>
>>> windll.msvcrt.printf(b"spam") 
ValueError: Procedure probably called with too many arguments (4 bytes in excess)

The examples above come from the official Python documentation. On my machine (Python 3.6.5 shell) the actual results are:

>>> from ctypes import *
>>> cdll.kernel32.GetModuleHandleA(None)
477102080
>>> windll.kernel32.GetModuleHandleA(None)
477102080
>>> hex(cdll.kernel32.GetModuleHandleA(None))
'0x1c700000'
>>> windll.msvcrt.printf(b'spam')
4
>>>

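Why, in practice, can functions of either calling convention be called through both cdll and windll?

Let's look at the ctypes source directly (irrelevant parts removed):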

class CDLL(object):
    """An instance of this class represents a loaded dll/shared
    library, exporting functions using the standard C calling
    convention (named 'cdecl' on Windows).

    The exported functions can be accessed as attributes, or by
    indexing with the function name.  Examples:

    <obj>.qsort -> callable object
    <obj>['qsort'] -> callable object

    Calling the functions releases the Python GIL during the call and
    reacquires it afterwards.
    """

    _func_flags_ = _FUNCFLAG_CDECL
    _func_restype_ = c_int
    # default values for repr
    _name = '<uninitialized>'
    _handle = 0
    _FuncPtr = None

    def __init__(self, name, mode=DEFAULT_MODE, handle=None,
                 use_errno=False,
                 use_last_error=False):
        self._name = name
        flags = self._func_flags_
        if use_errno:
            flags |= _FUNCFLAG_USE_ERRNO
        if use_last_error:
            flags |= _FUNCFLAG_USE_LASTERROR

        class _FuncPtr(_CFuncPtr):
            _flags_ = flags
            _restype_ = self._func_restype_
        self._FuncPtr = _FuncPtr

        if handle is None:
            self._handle = _dlopen(self._name, mode)
        else:
            self._handle = handle

    def __repr__(self):
        return "<%s '%s', handle %x at %#x>" % \
               (self.__class__.__name__, self._name,
                (self._handle & (_sys.maxsize*2 + 1)),
                id(self) & (_sys.maxsize*2 + 1))

    def __getattr__(self, name):
        if name.startswith('__') and name.endswith('__'):
            raise AttributeError(name)
        func = self.__getitem__(name)
        setattr(self, name, func)
        return func

    def __getitem__(self, name_or_ordinal):
        func = self._FuncPtr((name_or_ordinal, self))
        if not isinstance(name_or_ordinal, int):
            func.__name__ = name_or_ordinal
        return func

class PyDLL(CDLL):
    """This class represents the Python library itself.  It allows
    accessing Python API functions.  The GIL is not released, and
    Python exceptions are handled correctly.
    """
    _func_flags_ = _FUNCFLAG_CDECL | _FUNCFLAG_PYTHONAPI

if _os.name == "nt":
    class WinDLL(CDLL):
        """This class represents a dll exporting functions using the
        Windows stdcall calling convention.
        """
        _func_flags_ = _FUNCFLAG_STDCALL

    class OleDLL(CDLL):
        """This class represents a dll exporting functions using the
        Windows stdcall calling convention, and returning HRESULT.
        HRESULT error values are automatically raised as OSError
        exceptions.
        """
        _func_flags_ = _FUNCFLAG_STDCALL
        _func_restype_ = HRESULT

class LibraryLoader(object):
    def __init__(self, dlltype):
        self._dlltype = dlltype

    def __getattr__(self, name):
        if name[0] == '_':
            raise AttributeError(name)
        dll = self._dlltype(name)
        setattr(self, name, dll)
        return dll

    def __getitem__(self, name):
        return getattr(self, name)

    def LoadLibrary(self, name):
        return self._dlltype(name)

cdll = LibraryLoader(CDLL)
pydll = LibraryLoader(PyDLL)

if _os.name == "nt":
    windll = LibraryLoader(WinDLL)
    oledll = LibraryLoader(OleDLL)

    if _os.name == "nt":
        GetLastError = windll.kernel32.GetLastError
    else:
        GetLastError = windll.coredll.GetLastError
    from _ctypes import get_last_error, set_last_error

    def WinError(code=None, descr=None):
        if code is None:
            code = GetLastError()
        if descr is None:
            descr = FormatError(code).strip()
        return OSError(None, descr, None, code)

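From the source above, ctypes provides four kinds of DLL objects: CDLL, PyDLL, WinDLL, and OleDLL. The last three are subclasses of CDLL. CDLL and PyDLL are general-purpose, while WinDLL and OleDLL are defined for Windows only. The main difference among the four is the value of _func_flags_: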

Class     _func_flags_
CDLL      _FUNCFLAG_CDECL
WinDLL    _FUNCFLAG_STDCALL
OleDLL    _FUNCFLAG_STDCALL
PyDLL     _FUNCFLAG_CDECL | _FUNCFLAG_PYTHONAPI

The three subclasses inherit all their methods and attributes from CDLL; OleDLL additionally overrides the _func_restype_ attribute.
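
These relationships can be checked from the interpreter. The sketch below relies on the private names _func_flags_ and _func_restype_ seen in the source listing and on the constants exported by the _ctypes extension module, so treat it as an inspection aid that may break between Python versions rather than as supported API.

```python
import ctypes
from _ctypes import FUNCFLAG_CDECL, FUNCFLAG_PYTHONAPI

# The subclasses inherit everything except the class attributes they override.
pydll_is_cdll = issubclass(ctypes.PyDLL, ctypes.CDLL)

# Private attributes, read here for inspection only.
cdll_flags_match = ctypes.CDLL._func_flags_ == FUNCFLAG_CDECL
pydll_flags_match = (
    ctypes.PyDLL._func_flags_ == (FUNCFLAG_CDECL | FUNCFLAG_PYTHONAPI)
)
default_restype_is_int = ctypes.CDLL._func_restype_ is ctypes.c_int

print(pydll_is_cdll, cdll_flags_match, pydll_flags_match,
      default_restype_is_int)

if hasattr(ctypes, "OleDLL"):  # WinDLL and OleDLL exist on Windows only
    print(ctypes.OleDLL._func_restype_ is ctypes.HRESULT)
```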

In addition, ctypes provides four LibraryLoader objects, cdll, windll, pydll, and oledll, which actually perform the loading.

>>> windll=LibraryLoader(WinDLL)
>>> windll.kernel32

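Because windll.__dict__ does not contain the name 'kernel32', the lookup ends up in LibraryLoader.__getattr__, which instantiates WinDLL('kernel32') (this runs CDLL.__init__, which loads the module and obtains its handle). The instance is added to windll's __dict__ and then returned. windll.LoadLibrary('kernel32') works similarly, but returns a new DLL object each time. Indexing by name is also supported.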

>>> windll.kernel32.GetModuleHandleA

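windll.kernel32 returns a WinDLL('kernel32') object. CDLL's __getattr__ and __getitem__ are then called to obtain the _FuncPtr object for GetModuleHandleA, and the function is called through that object. If the function is only ever loaded as windll.kernel32['GetModuleHandleA'], GetModuleHandleA is not added to the __dict__ of the WinDLL('kernel32') object (because of __getattr__, the difference between attribute access and name indexing is not noticeable in use).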
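
The caching difference between attribute access and name indexing can be observed directly. The following sketch uses the C function abs from the C runtime instead of kernel32 so that it also runs outside Windows; the library lookup is an assumption made for portability.

```python
import ctypes
import ctypes.util
import sys

# Create a fresh DLL object so that nothing has been cached on it yet.
if sys.platform == "win32":
    lib = ctypes.CDLL("msvcrt")
else:
    lib = ctypes.CDLL(ctypes.util.find_library("c"))

cached_at_start = "abs" in vars(lib)

by_index = lib["abs"]        # __getitem__: builds a new object, caches nothing
cached_after_index = "abs" in vars(lib)

by_attr = lib.abs            # __getattr__: calls __getitem__, then setattr()
cached_after_attr = "abs" in vars(lib)

print(cached_at_start, cached_after_index, cached_after_attr)
print(lib.abs is by_attr)        # attribute access reuses the cached object
print(lib["abs"] is by_index)    # indexing builds another object each time
print(by_index(-7), by_attr(-7))
```

Both objects call the same C function; only the bookkeeping on the DLL object differs.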

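The ctypes source above still does not reveal the essential difference between windll and cdll when loading DLLs and their functions, because the two key pieces, _dlopen and _CFuncPtr, come from _ctypes.pyd: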

from _ctypes import LoadLibrary as _dlopen
from _ctypes import CFuncPtr as _CFuncPtr

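So how _func_flags_ actually takes effect cannot be seen from this code. If anyone knows why the mixed calls work, I would appreciate your advice. One likely factor: the handles printed earlier are 64-bit addresses, so this Python is a 64-bit build, and 64-bit Windows uses a single calling convention for all functions, which leaves the cdecl/stdcall distinction relevant only on 32-bit x86.

In any case, although both ways work here, follow the guidance in the official Python documentation to avoid unnecessary risk.

To find out the correct calling convention for a function, look up its declaration in the relevant C header file or documentation.

On Windows, ctypes uses Win32 structured exception handling to prevent crashes (such as general protection faults) when functions are called with invalid argument values: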

>>> windll.kernel32.GetModuleHandleA(32)
OSError: exception: access violation reading 0x0000000000000020
>>> getattr(cdll.msvcrt, "??0__non_rtti_object@@QEAA@AEBV0@@Z")(123)
OSError: exception: access violation writing 0x000000000000008B
>>>

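There are still plenty of ways to crash Python with ctypes, so be careful in any case. The faulthandler module can be very helpful for debugging crashes (for example, segmentation faults caused by erroneous C library calls).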
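
A minimal sketch of switching faulthandler on before making risky ctypes calls is shown below. The log file name ctypes_crash.log is a hypothetical choice; a dedicated file is used here only so the example does not depend on how stderr is set up.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
log_path = os.path.join(tempfile.gettempdir(), "ctypes_crash.log")

# The file object must stay open for as long as faulthandler is enabled,
# because the handler writes to its file descriptor when a crash occurs.
crash_log = open(log_path, "w")
faulthandler.enable(file=crash_log, all_threads=True)

print(faulthandler.is_enabled())
```

The same effect is available without code changes by starting Python with -X faulthandler or by setting the PYTHONFAULTHANDLER environment variable.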

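Comparison 1: CDLL, OleDLL, WinDLL, PyDLL

class ctypes.CDLL(name, mode=DEFAULT_MODE, handle=None, use_errno=False, use_last_error=False)
DLL type: cdecl calling convention
Return type: int

class ctypes.OleDLL(name, mode=DEFAULT_MODE, handle=None, use_errno=False, use_last_error=False)
DLL type: stdcall calling convention
Return type: HRESULT (indicates whether the call failed; on failure an exception is raised automatically)

class ctypes.WinDLL(name, mode=DEFAULT_MODE, handle=None, use_errno=False, use_last_error=False)
DLL type: stdcall calling convention
Return type: int
Functions exported by the three DLL classes above release the GIL before the call and reacquire it afterwards.

class ctypes.PyDLL(name, mode=DEFAULT_MODE, handle=None)
DLL type: cdecl calling convention
Return type: int
Functions exported by PyDLL do not release the GIL during the call, and after the call the Python error flag is checked; if it is set, a Python exception is raised. PyDLL is therefore very useful for calling Python C API functions directly.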
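
As a small illustration of PyDLL, ctypes ships a ready-made instance named ctypes.pythonapi. The sketch below uses it to call the C API function Py_GetVersion; the variable names are illustrative only.

```python
import ctypes
import sys

# ctypes.pythonapi is a PyDLL instance bound to the running interpreter.
get_version = ctypes.pythonapi.Py_GetVersion
get_version.argtypes = []
get_version.restype = ctypes.c_char_p  # const char *Py_GetVersion(void)

version = get_version().decode("utf-8")
print(version)
print(sys.version_info[:2])
```

Setting restype matters here: with the default int return type the returned pointer would be misread instead of being converted to bytes.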

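All of these classes can be instantiated by calling them with at least one argument, the pathname of the DLL/shared library. If you already have a handle to a loaded library, pass it as the handle parameter; otherwise the underlying platform's dlopen or LoadLibrary function is used to load the library into the process and obtain a handle to it.

The mode parameter specifies how the library is loaded; see the dlopen(3) manual page for details. On Windows, mode is ignored. On POSIX systems, RTLD_NOW is always added and is not configurable. Flags commonly used for mode are:

ctypes.RTLD_GLOBAL: on platforms where this flag is not available, it is defined as 0.

ctypes.RTLD_LOCAL: on platforms where this flag is not available, it is the same as RTLD_GLOBAL.

ctypes.DEFAULT_MODE: the default mode used to load DLLs/shared libraries. On OS X 10.3 it is the same as RTLD_GLOBAL; on other platforms it is the same as RTLD_LOCAL.

When use_errno is set to True, ctypes provides a safe way to access the system errno error number. ctypes maintains a thread-local copy of the system errno variable. When you call a foreign function created with use_errno=True, ctypes swaps its private copy with the system errno before the call, and swaps them again immediately after the call.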
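
The private copy is read and written with ctypes.get_errno() and ctypes.set_errno(). Below is a POSIX-only sketch that calls access(2) on a path assumed not to exist; errno_after_failed_access is a hypothetical helper, and the path is made up for the demonstration.

```python
import ctypes
import ctypes.util
import errno
import os


def errno_after_failed_access(path: bytes) -> int:
    """Call access(2) on `path` and return the errno that ctypes captured.

    POSIX-only sketch; returns 0 if the path unexpectedly exists.
    """
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.access.argtypes = [ctypes.c_char_p, ctypes.c_int]
    libc.access.restype = ctypes.c_int

    ctypes.set_errno(0)               # reset the ctypes private copy
    if libc.access(path, os.F_OK) == 0:
        return 0                      # the path exists, nothing to report
    return ctypes.get_errno()         # read the ctypes private copy


if os.name == "posix":
    code = errno_after_failed_access(b"/no/such/path/for/ctypes/demo")
    print(errno.errorcode.get(code), os.strerror(code))
```

Reading the value through get_errno() rather than from the C library directly is the point of use_errno: the copy is not disturbed by other C calls the interpreter makes in between.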
