Learning Python, Cython, and PyCuda

Goal

Move from CPU-only Cython code to CPU+GPU code that combines Cython and PyCuda.


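Motivation

The original program uses Python libraries such as PyFITS and Astropy. I plan to take the blunt approach of converting its CPU-parallel parts to GPU-parallel ones, which is why PyCuda comes in. (This probably makes the code more complicated; if you know a simpler way, please leave a comment.)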


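Preparation

1. Skim the first 10 chapters of Magnus Lie Hetland's "Beginning Python" (《Python基础教程》). (This took me one day.)

2. Set up the environment. (There are plenty of tutorials online.)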

  • Note
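    • When installing the NVIDIA CUDA Toolkit, check your graphics card model: the NVIDIA GeForce GTX 300 series and earlier are not supported.
    • On Windows, also check your Visual Studio version: the NVIDIA CUDA Toolkit supports VS15 and earlier. (Linux is recommended.)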
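
Before building anything, it can help to confirm that the pieces are actually installed. The following is a minimal sketch, not part of the original setup: the helper name `check_environment` is hypothetical, and it assumes Python 3.4+ (for `importlib.util.find_spec`), whereas the build in this post was done with Python 2.7. It only reports whether the modules and the `nvcc` compiler can be found; it does not check version compatibility.

```python
import importlib.util
import shutil


def check_environment(modules=("numpy", "Cython", "pycuda"), tools=("nvcc",)):
    """Return a dict mapping each module/tool name to True if it was found."""
    report = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        report[name] = importlib.util.find_spec(name) is not None
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, found in check_environment().items():
        print("{:<8} {}".format(name, "found" if found else "MISSING"))
```

If `nvcc` is reported missing even though the CUDA Toolkit is installed, the toolkit's `bin` directory is probably not on `PATH`.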


Testing Cython + PyCuda

Cython

1. Initial directory layout

(I wrote these initial files by hand; the files added by the build are listed later.)

/*
--testCuda/
 |
  --setup.py
  --test/
   |
    --constants.h
    --constants.pxd
    --test.pyx
    --__init__.py
*/

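Note: a .pyx file holds the Cython implementation code, and a .pxd file holds declarations (much like a C header) that other Cython modules can cimport.

2. File contents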

  • constants.h

//constants.h
#ifndef CONSTANTS_H
#define CONSTANTS_H

#define PI (3.1415926535897932384626433832)
#define TWOPI (PI * 2.0)

#endif

  • constants.pxd

#!python

cdef extern from "./constants.h":
    long double PI
    long double TWOPI

  • test.pyx

#!python
# import python3 compat modules
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

# import std lib
import sys
import traceback

# import cython specifics
cimport cython
from cython.parallel import prange
from cython.operator cimport dereference as deref, preincrement as inc
from cpython cimport bool as python_bool
cimport openmp

# import C/C++ modules
from libc.math cimport exp, cos, sin, sqrt, asin, acos, atan2, fabs, fmod
from libcpp.vector cimport vector
from libcpp.pair cimport pair
from libcpp.set cimport set as cpp_set
from libcpp cimport bool
from libcpp.unordered_map cimport unordered_map

# import numpy/data types
import numpy as np
from numpy cimport (
    int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t,
    uint32_t, uint64_t, float32_t, float64_t
    )
cimport numpy as np

from .constants cimport PI, TWOPI
print ("test: Cython")
print (TWOPI)


  • __init__.py

from .test import *

  • setup.py

#!/usr/bin/env python
from setuptools import setup
from setuptools.extension import Extension
from Cython.Distutils import build_ext
import numpy
import platform
import os

EX_COMP_ARGS = []
# Copied as-is from another project; can anyone explain what each line means?
TEST_EXT = Extension(
	'test.test',
	['test/test.pyx'],
	extra_compile_args=['-fopenmp', '-O3', '-std=c++11'] + EX_COMP_ARGS,
	extra_link_args=['-fopenmp'],
	language='c++',
	include_dirs=[
		numpy.get_include(),
	]
)

setup(
	name='test_Cython',
	packages=['test'],
	cmdclass={'build_ext': build_ext},
	ext_modules=[
		TEST_EXT,
	]
)
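
As a rough guide to what each `Extension` argument in the setup.py above does, here is the same configuration with annotations. This is a sketch under stated assumptions: the wrapper function `make_test_extension` is hypothetical and exists only so the object can be inspected, and the flags are GCC-style, so they assume a GCC-compatible compiler on Linux.

```python
from setuptools import Extension


def make_test_extension(include_dirs=()):
    """Build the Extension used in setup.py, with each argument annotated."""
    return Extension(
        # Dotted import path of the compiled module: package "test", module "test".
        "test.test",
        # Source files; Cython's build_ext translates the .pyx to C++ first.
        sources=["test/test.pyx"],
        # Compiler flags: enable OpenMP (used by prange), optimize at -O3,
        # and compile as C++11.
        extra_compile_args=["-fopenmp", "-O3", "-std=c++11"],
        # Linker flags: link against the OpenMP runtime as well.
        extra_link_args=["-fopenmp"],
        # Generate C++ rather than C; the libcpp.* cimports require this.
        language="c++",
        # Extra header search paths, e.g. numpy.get_include() for "cimport numpy".
        include_dirs=list(include_dirs),
    )
```

In setup.py itself, `include_dirs` would be `[numpy.get_include()]`, as in the original listing.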


3. Build

$ python setup.py build_ext --inplace


4. Run

Start Python:

$ python

Import the module:

>>> import test
test: Cython
6.28318530718
>>>


5. Directory layout after the build

/*
--testCuda/
 |
  --setup.py
  --test/
   |
    --constants.h
    --constants.pxd
    --test.pyx
    --test.cpp
    --test.so
    --__init__.py
    --__init__.pyc
  --build/
   |
    --temp.linux-x86_64-2.7
*/

PyCuda

Add the PyCuda sample to test.pyx:

#!python
# import python3 compat modules
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

# import std lib
import sys
import traceback

# import cython specifics
cimport cython
from cython.parallel import prange
from cython.operator cimport dereference as deref, preincrement as inc
from cpython cimport bool as python_bool
cimport openmp

# import C/C++ modules
from libc.math cimport exp, cos, sin, sqrt, asin, acos, atan2, fabs, fmod
from libcpp.vector cimport vector
from libcpp.pair cimport pair
from libcpp.set cimport set as cpp_set
from libcpp cimport bool
from libcpp.unordered_map cimport unordered_map

# import numpy/data types
import numpy as np
from numpy cimport (
    int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t,
    uint32_t, uint64_t, float32_t, float64_t
    )
cimport numpy as np

from .constants cimport PI, TWOPI

print ("test: Cython")
print (TWOPI)

import pycuda.driver as cuda
import pycuda.autoinit
from pycuda.compiler import SourceModule

a = np.random.randn(4,4)
a = a.astype(np.float32)  # numpy is imported as np in this file
a_gpu = cuda.mem_alloc(a.size * a.dtype.itemsize)
cuda.memcpy_htod(a_gpu, a)
mod = SourceModule("""
    __global__ void doublify(float *a)
    {
      int idx = threadIdx.x + threadIdx.y*4;
      a[idx] *= 2;
    }
    """)
func = mod.get_function(str("doublify"))
func(a_gpu, block=(4,4,1))

a_doubled = np.empty_like(a)
cuda.memcpy_dtoh(a_doubled, a_gpu)
print ("original array:")
print (a)
print ("doubled with kernel:")
print (a_doubled)


Rebuild and run; the output is as follows:

>>> import test
test: Cython
6.28318530718
original array:
[... omitted
]
doubled with kernel:
[... omitted
]
>>>
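
Rather than comparing the two printed arrays by eye, the kernel's indexing can be emulated on the CPU. The sketch below is an illustration, not part of the original test: `doublify_cpu` is a hypothetical helper that runs one loop iteration per CUDA thread of the `block=(4,4,1)` launch and applies the same `idx = threadIdx.x + threadIdx.y*4` formula to a flattened 4x4 array.

```python
def doublify_cpu(flat, block=(4, 4, 1)):
    """Emulate the doublify kernel: one loop iteration per CUDA thread."""
    out = list(flat)
    for ty in range(block[1]):       # plays the role of threadIdx.y
        for tx in range(block[0]):   # plays the role of threadIdx.x
            idx = tx + ty * 4        # same index formula as the kernel
            out[idx] *= 2
    return out


a = [float(i) for i in range(16)]    # stands in for the flattened 4x4 array
a_doubled = doublify_cpu(a)
print(a_doubled == [2 * x for x in a])
```

Each of the 16 threads touches exactly one element, so every element is doubled once. Inside test.pyx the equivalent check on the real GPU result would be `np.allclose(a_doubled, 2 * a)`.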


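Summary:

PyCuda can be combined with Cython! I hope any experts passing by can point me to a few concise posts so that I can understand this more deeply.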
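
On the question of a simpler approach: for element-wise operations like this one, PyCuda's `gpuarray` module can replace the explicit `mem_alloc`, `memcpy_htod`, kernel, and `memcpy_dtoh` steps. The following is a sketch only, swapping in `gpuarray` for the hand-written kernel used above; the function name `doublify_with_gpuarray` is hypothetical, and it needs a CUDA-capable GPU to actually run.

```python
def doublify_with_gpuarray(a):
    """Double a float32 NumPy array on the GPU without writing a kernel."""
    # Imported lazily so this module can be loaded on machines without a GPU.
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
    import pycuda.gpuarray as gpuarray

    a_gpu = gpuarray.to_gpu(a)   # allocate device memory and copy host -> device
    return (2 * a_gpu).get()     # multiply on the device, copy device -> host
```

It would be called as `doublify_with_gpuarray(np.random.randn(4, 4).astype(np.float32))`. Custom kernels via `SourceModule` are still needed when the computation is not a simple array expression.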
