Before following this post, make sure CUDA and cuDNN are already installed. My environment is:
ubuntu 16.04
anaconda
python 3.6
theano 1.0.2 py36h6bb024c_0
First, install pygpu:
conda install -c conda-forge pygpu
Then run a Theano GPU test. My test script is theano-gpu-test.py:
from theano import function, config, shared
import theano.tensor as T
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print('Looping %d times took' % iters, t1 - t0, 'seconds')
print('Result is', r)
# Under the gpuarray backend, GPU ops are named GpuElemwise, so the check
# must also look at the op's class name, not just isinstance(..., T.Elemwise).
if numpy.any([isinstance(x.op, T.Elemwise) and
              ('Gpu' not in type(x.op).__name__)
              for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')
Command for the GPU version:
THEANO_FLAGS=mode=FAST_RUN,device=cuda,floatX=float32 python theano-gpu-test.py
My output:
Using cuDNN version 6021 on context None
Mapped name None to device cuda: GeForce GTX 1080 Ti (0000:03:00.0)
[GpuElemwise{exp,no_inplace}(<GpuArrayType<None>(float32, vector)>), HostFromGpu(gpuarray)(GpuElemwise{exp,no_inplace}.0)]
Looping 1000 times took 0.23069453239440918 seconds
Result is [1.2317803 1.6187935 1.5227807 ... 2.2077181 2.2996776 1.623233 ]
Used the cpu  # misleading: the old tutorial's check does not recognize GpuElemwise ops; judge by the timing above
Command for the CPU version:
python theano-gpu-test.py
My machine's output:
[Elemwise{exp,no_inplace}(<TensorType(float64, vector)>)]
Looping 1000 times took 26.147013664245605 seconds
Result is [1.23178032 1.61879341 1.52278065 ... 2.20771815 2.29967753 1.62323285]
Used the cpu  # again, judge by the timing above rather than this line
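Because the script fixes the random seed (RandomState(22)), the printed result is deterministic; the first few entries can be reproduced with pure NumPy, no Theano required:

```python
import numpy

# Same size and seed as in theano-gpu-test.py above.
vlen = 10 * 30 * 768
rng = numpy.random.RandomState(22)
expected = numpy.exp(rng.rand(vlen))
# First three entries match the CPU run's printed result:
# 1.23178032 1.61879341 1.52278065
print(expected[:3])
```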
Running the program on the GPU took 0.23 seconds; on the CPU it took 26 seconds.
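The speedup implied by the two timings above can be computed directly:

```python
# Timings copied from the two runs above.
gpu_time = 0.23069453239440918
cpu_time = 26.147013664245605
speedup = cpu_time / gpu_time
print('GPU is about %.0fx faster' % speedup)  # about 113x
```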
If you don't want to prepend that long string of flags every time you use the GPU, configure a .theanorc file; after that you can simply run python theano-gpu-test.py. Here is how I configured mine:
vim ~/.theanorc
and add:
[global]
floatX=float32
device=cuda
Press Esc, then type :wq to save and quit. Tested and working.
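The .theanorc file uses INI syntax, so its contents can be sanity-checked with Python's standard configparser before relying on them (a minimal sketch; Theano itself reads this file when it is imported):

```python
import configparser

# Parse the same [global] section written to ~/.theanorc above.
cfg = configparser.ConfigParser()
cfg.read_string("""\
[global]
floatX = float32
device = cuda
""")
print(cfg['global']['device'])  # cuda
print(cfg['global']['floatX'])  # float32
```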