Installing and using pyopencl on Windows

<2022-05-07 Sat>


Installing and using pyopencl on Windows turned out to be harder than I expected, so it is worth writing down.

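My environment is Python 3.10.4. Following the messages from pip install pyopencl, I installed numpy, wheel, and mako one after another; in short, I installed whatever was reported missing. Even so, pyopencl failed to install whether I used cmd, powershell, or the Developer Command Prompt for VS 2022, and always reported: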

src\wrap_cl.hpp(70): fatal error C1083: Cannot open include file: 'CL/cl.h': No such file or directory

In the end I found the solution on the official site. Following the note "Christoph Gohlke distributes binary wheels for PyOpenCL on Windows.", I downloaded pyopencl-2022.1.3-cp310-cp310-win_amd64.whl, which solved the problem:

PS D:\dnld> pip install .\pyopencl-2022.1.3-cp310-cp310-win_amd64.whl
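
A prebuilt wheel only installs when the tags in its filename match the running interpreter. The following is a minimal sketch, not part of the original setup, that compares the tags of the wheel named above with the current interpreter. The helper names parse_wheel_name, interpreter_tags, and wheel_matches_interpreter are made up for illustration, and the check assumes CPython and ignores the optional build tag and compressed tag sets.

```python
import sys
import sysconfig


def parse_wheel_name(filename):
    """Split a wheel filename into name, version, and its three tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel name: " + filename)
    # Layout: name-version[-build]-python_tag-abi_tag-platform_tag
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def interpreter_tags():
    """Return (python_tag, platform_tag) for this interpreter (CPython assumed)."""
    python_tag = "cp{}{}".format(*sys.version_info[:2])
    platform_tag = sysconfig.get_platform().replace("-", "_").replace(".", "_")
    return python_tag, platform_tag


def wheel_matches_interpreter(filename):
    """True if the wheel's python and platform tags equal the interpreter's."""
    info = parse_wheel_name(filename)
    python_tag, platform_tag = interpreter_tags()
    return info["python_tag"] == python_tag and info["platform_tag"] == platform_tag


if __name__ == "__main__":
    wheel = "pyopencl-2022.1.3-cp310-cp310-win_amd64.whl"
    print(parse_wheel_name(wheel))
    print("interpreter tags:", interpreter_tags())
    print("matches:", wheel_matches_interpreter(wheel))
```

If the last line prints False, a wheel built for a different Python version or architecture is probably needed.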

With that, the installation succeeded. Test it with the example from the official site:

#!/usr/bin/env python

import numpy as np
import pyopencl as cl

a_np = np.random.rand(50000).astype(np.float32)
b_np = np.random.rand(50000).astype(np.float32)

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

mf = cl.mem_flags
a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)

prg = cl.Program(ctx, """
__kernel void sum(
    __global const float *a_g, __global const float *b_g, __global float *res_g)
{
  int gid = get_global_id(0);
  res_g[gid] = a_g[gid] + b_g[gid];
}
""").build()

res_g = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)
knl = prg.sum  # Use this Kernel object for repeated calls
knl(queue, a_np.shape, None, a_g, b_g, res_g)

res_np = np.empty_like(a_np)
cl.enqueue_copy(queue, res_np, res_g)

# Check on CPU with Numpy:
print(res_np - (a_np + b_np))
print("res_np       : ", res_np)
print("(a_np + b_np): ", (a_np + b_np))
print(np.linalg.norm(res_np - (a_np + b_np)))
assert np.allclose(res_np, a_np + b_np)

Output:

PS D:\demos\t_pyopencl\demo> python.exe .\demo.py
Choose platform:
[0] <pyopencl.Platform 'Intel(R) OpenCL' at 0x1c89caa1e40>
Choice [0]:
Choose device(s):
[0] <pyopencl.Device 'Intel(R) HD Graphics 530' on 'Intel(R) OpenCL' at 0x1c89cb07160>
[1] <pyopencl.Device 'Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz' on 'Intel(R) OpenCL' at 0x1c89c33df60>
Choice, comma-separated [0]:
Set the environment variable PYOPENCL_CTX=':' to avoid being asked again.
C:\Users\ysouyno\AppData\Local\Programs\Python\Python310\lib\site-packages\pyopencl\__init__.py:270: CompilerWarning: Non-empty compiler output encountered. Set the environment variable PYOPENCL_COMPILER_OUTPUT=1 to see more.
  warn("Non-empty compiler output encountered. Set the "
[0. 0. 0. ... 0. 0. 0.]
res_np       :  [1.3405421  0.803946   0.69187665 ... 0.9120757  0.5142325  0.5313911 ]
(a_np + b_np):  [1.3405421  0.803946   0.69187665 ... 0.9120757  0.5142325  0.5313911 ]
0.0
PS D:\demos\t_pyopencl\demo>
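
The platform and device indices offered by the prompt above can also be listed without the interactive dialog. This is a minimal sketch under the assumption that pyopencl imports successfully and that cl.get_platforms() and get_devices() return entries in the same order as the prompt. format_ctx_choices, list_opencl_devices, and print_ctx_choices are hypothetical helper names, and the pyopencl import is deferred so the formatting helper can be tried on its own.

```python
def format_ctx_choices(platforms):
    """Turn [(platform_name, [device_name, ...]), ...] into 'P:D  names' lines."""
    lines = []
    for p_idx, (p_name, device_names) in enumerate(platforms):
        for d_idx, d_name in enumerate(device_names):
            lines.append("{}:{}  {} / {}".format(p_idx, d_idx, p_name, d_name))
    return lines


def list_opencl_devices():
    """Query the installed OpenCL platforms and their devices via pyopencl."""
    import pyopencl as cl  # deferred: needs pyopencl and an OpenCL runtime

    return [
        (platform.name, [device.name for device in platform.get_devices()])
        for platform in cl.get_platforms()
    ]


def print_ctx_choices():
    """Print one line per device; the prefix (e.g. 0:1) is a PYOPENCL_CTX candidate."""
    for line in format_ctx_choices(list_opencl_devices()):
        print(line)
```

Calling print_ctx_choices() on the machine used here would be expected to show one platform with two devices, matching the prompt.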

The main point here is how to set the PYOPENCL_CTX environment variable. The powershell output is as follows:

PS D:\demos\t_pyopencl\demo> $env:PYOPENCL_COMPILER_OUTPUT=1
PS D:\demos\t_pyopencl\demo> $env:PYOPENCL_CTX='0:0'
PS D:\demos\t_pyopencl\demo> python.exe .\demo.py
C:\Users\ysouyno\AppData\Local\Programs\Python\Python310\lib\site-packages\pyopencl\__init__.py:268: CompilerWarning: Built kernel retrieved from cache. Original from-source build had warnings:
Build on <pyopencl.Device 'Intel(R) HD Graphics 530' on 'Intel(R) OpenCL' at 0x1f43640f230> succeeded, but said:

fcl build 1 succeeded.
bcl build succeeded.

  warn(text, CompilerWarning)
[0. 0. 0. ... 0. 0. 0.]
res_np       :  [1.1701384  1.1061795  1.1798061  ... 0.99728197 0.5002522  1.80389   ]
(a_np + b_np):  [1.1701384  1.1061795  1.1798061  ... 0.99728197 0.5002522  1.80389   ]
0.0
PS D:\demos\t_pyopencl\demo> $env:PYOPENCL_CTX='0:1'
PS D:\demos\t_pyopencl\demo> python.exe .\demo.py
C:\Users\ysouyno\AppData\Local\Programs\Python\Python310\lib\site-packages\pyopencl\__init__.py:268: CompilerWarning: Built kernel retrieved from cache. Original from-source build had warnings:
Build on <pyopencl.Device 'Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz' on 'Intel(R) OpenCL' at 0x1a9cda3ade0> succeeded, but said:

Compilation started
Compilation done
Linking started
Linking done
Device build started
Device build done
Kernel <sum> was successfully vectorized (8)
Done.
  warn(text, CompilerWarning)
C:\Users\ysouyno\AppData\Local\Programs\Python\Python310\lib\site-packages\pyopencl\__init__.py:268: CompilerWarning: From-binary build succeeded, but resulted in non-empty logs:
Build on <pyopencl.Device 'Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz' on 'Intel(R) OpenCL' at 0x1a9cda3ade0> succeeded, but said:

Device build started
Device build done
Reload Program Binary Object.
  warn(text, CompilerWarning)
[0. 0. 0. ... 0. 0. 0.]
res_np       :  [0.3848073 1.6682227 1.1773763 ... 0.801935  1.1112832 1.5165817]
(a_np + b_np):  [0.3848073 1.6682227 1.1773763 ... 0.801935  1.1112832 1.5165817]
0.0
PS D:\demos\t_pyopencl\demo>

The cmd output is as follows:

D:\demos\t_pyopencl\demo>set PYOPENCL_COMPILER_OUTPUT=1

D:\demos\t_pyopencl\demo>set PYOPENCL_CTX=0:0

D:\demos\t_pyopencl\demo>python demo.py
C:\Users\ysouyno\AppData\Local\Programs\Python\Python310\lib\site-packages\pyopencl\__init__.py:268: CompilerWarning: Built kernel retrieved from cache. Original from-source build had warnings:
Build on <pyopencl.Device 'Intel(R) HD Graphics 530' on 'Intel(R) OpenCL' at 0x1bc6e0b1250> succeeded, but said:

fcl build 1 succeeded.
bcl build succeeded.

  warn(text, CompilerWarning)
[0. 0. 0. ... 0. 0. 0.]
res_np       :  [0.6949563  1.1166335  0.57082427 ... 1.3923297  1.2337677  1.4651983 ]
(a_np + b_np):  [0.6949563  1.1166335  0.57082427 ... 1.3923297  1.2337677  1.4651983 ]
0.0

D:\demos\t_pyopencl\demo>set PYOPENCL_CTX=0:1

D:\demos\t_pyopencl\demo>python demo.py
C:\Users\ysouyno\AppData\Local\Programs\Python\Python310\lib\site-packages\pyopencl\__init__.py:268: CompilerWarning: Built kernel retrieved from cache. Original from-source build had warnings:
Build on <pyopencl.Device 'Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz' on 'Intel(R) OpenCL' at 0x22c8a4e1b60> succeeded, but said:

Compilation started
Compilation done
Linking started
Linking done
Device build started
Device build done
Kernel <sum> was successfully vectorized (8)
Done.
  warn(text, CompilerWarning)
C:\Users\ysouyno\AppData\Local\Programs\Python\Python310\lib\site-packages\pyopencl\__init__.py:268: CompilerWarning: From-binary build succeeded, but resulted in non-empty logs:
Build on <pyopencl.Device 'Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz' on 'Intel(R) OpenCL' at 0x22c8a4e1b60> succeeded, but said:

Device build started
Device build done
Reload Program Binary Object.
  warn(text, CompilerWarning)
[0. 0. 0. ... 0. 0. 0.]
res_np       :  [1.0808852  1.2096584  0.8938436  ... 0.35234842 0.3697133  0.88079625]
(a_np + b_np):  [1.0808852  1.2096584  0.8938436  ... 0.35234842 0.3697133  0.88079625]
0.0

D:\demos\t_pyopencl\demo>
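
The same two variables can also be set from inside the script, which avoids depending on the shell. This is only a sketch: it assumes pyopencl reads PYOPENCL_CTX and PYOPENCL_COMPILER_OUTPUT when create_some_context() and build() are called rather than at import time, and configure_pyopencl_env and create_context_without_prompt are hypothetical helper names.

```python
import os


def configure_pyopencl_env(platform_index, device_index, show_compiler_output=False):
    """Set the two environment variables used above for the current process."""
    ctx_spec = "{}:{}".format(platform_index, device_index)
    os.environ["PYOPENCL_CTX"] = ctx_spec
    os.environ["PYOPENCL_COMPILER_OUTPUT"] = "1" if show_compiler_output else "0"
    return ctx_spec


def create_context_without_prompt(platform_index=0, device_index=0):
    """Create a context for the given indices, skipping the interactive prompt."""
    configure_pyopencl_env(platform_index, device_index)
    import pyopencl as cl  # deferred: needs pyopencl and an OpenCL runtime

    return cl.create_some_context()
```

The variables set this way only affect the current Python process and its children, not the surrounding shell session.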

Note that powershell and cmd set the variable differently:

SHELL       ENV
powershell  $env:PYOPENCL_CTX='0:1'
cmd         set PYOPENCL_CTX=0:1

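In a value such as 0:1, the number before the colon (0) selects the platform and the number after it (1) selects the device.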
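
The platform:device pair can also be applied explicitly in code instead of through the environment variable. This is a minimal sketch, not how pyopencl itself parses the variable: parse_ctx_spec and context_from_spec are hypothetical helpers, and only the two-part numeric form used in this post is handled.

```python
def parse_ctx_spec(spec):
    """Parse a 'platform:device' string such as '0:1' into two ints.

    Simplified for illustration: only the numeric two-part form is handled.
    """
    platform_part, device_part = spec.split(":")
    return int(platform_part), int(device_part)


def context_from_spec(spec):
    """Build a pyopencl context for one explicitly chosen device."""
    import pyopencl as cl  # deferred: needs pyopencl and an OpenCL runtime

    platform_index, device_index = parse_ctx_spec(spec)
    platform = cl.get_platforms()[platform_index]
    device = platform.get_devices()[device_index]
    return cl.Context([device])
```

In the demo script, ctx = context_from_spec("0:1") could then stand in for cl.create_some_context(), which on this machine would be expected to pick the CPU device.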

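Why bother learning pyopencl at this point? Because Python code is more concise, kernel functions can be tested quickly, which shortens the development cycle, much like when I learned octave years ago.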
