Windows 10
GPU: GT 720
* Install Anaconda2 (Python 2.7)
http://www.continuum.io/downloads
Make sure the system PATH environment variable includes:
C:\Anaconda2
C:\Anaconda2\Scripts
C:\Anaconda2\Library\bin
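A quick way to confirm the PATH entries listed above is a small script like the following. This is only a sketch: `missing_path_entries` is a helper name made up for these notes, not part of Anaconda, and it only compares strings (case-insensitively, ignoring trailing slashes) without checking that the directories exist. Note that a cmd window opened before PATH was changed will still see the old value.

```python
import os


def missing_path_entries(required, path_value=None, sep=os.pathsep):
    """Return the entries of `required` that are absent from PATH."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")

    def normalize(entry):
        # Windows paths are case-insensitive; also ignore trailing slashes.
        return entry.strip().rstrip("\\/").lower()

    present = set(normalize(p) for p in path_value.split(sep) if p.strip())
    return [r for r in required if normalize(r) not in present]


if __name__ == "__main__":
    required = [
        r"C:\Anaconda2",
        r"C:\Anaconda2\Scripts",
        r"C:\Anaconda2\Library\bin",
    ]
    # An empty list means every required directory is on PATH.
    print(missing_path_entries(required))
```

The same helper can be reused for the Visual Studio and MinGW directories by changing the `required` list.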
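* Install Visual Studio 2013
I used VS2013_RTM_ULT_CHS.iso. Earlier attempts kept failing, possibly because I was not using this version.
Add the following to the system PATH environment variable: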
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64
The Visual Studio version must match the CUDA version; see the Notes section for details.
* Install CUDA 7.5
https://developer.nvidia.com/cuda-downloads
Test in cmd:
nvcc -V
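The same check can be done from Python, which is handy for confirming that the CUDA release on PATH is the one you meant to install. A minimal sketch, assuming the `nvcc -V` output contains text of the form "release X.Y"; `parse_nvcc_release` and `nvcc_release` are names made up for these notes.

```python
import re
import subprocess


def parse_nvcc_release(output):
    """Extract the release number (e.g. '7.5') from `nvcc -V` output."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


def nvcc_release():
    """Run `nvcc -V`; return the release string, or None if nvcc is unavailable."""
    try:
        raw = subprocess.check_output(["nvcc", "-V"], stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError):
        # OSError: nvcc is not on PATH; CalledProcessError: nvcc exited with an error.
        return None
    return parse_nvcc_release(raw.decode("utf-8", "replace"))


if __name__ == "__main__":
    print(nvcc_release())
```

If this prints None, nvcc is most likely not on PATH, or the cmd window was opened before CUDA was installed.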
* Install MinGW
In cmd:
conda install mingw libpython
Make sure the system PATH environment variable includes:
C:\Anaconda2\MinGW\bin
C:\Anaconda2\MinGW\x86_64-w64-mingw32\lib
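Having the MinGW directories on PATH does not by itself prove the compiler was installed, so it can help to look for the executable directly. A small sketch under the assumption that the compiler binary is named `g++.exe`; `find_executable` here is a hand-written helper for these notes, not a library function.

```python
import os


def find_executable(name, path_value=None, sep=os.pathsep):
    """Return the first PATH directory that contains the file `name`, or None."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    for directory in path_value.split(sep):
        # PATH entries are sometimes wrapped in quotes on Windows.
        directory = directory.strip().strip('"')
        if directory and os.path.isfile(os.path.join(directory, name)):
            return directory
    return None


if __name__ == "__main__":
    # Expected to print the MinGW bin directory if the install succeeded.
    print(find_executable("g++.exe"))
```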
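* Install Theano
In cmd:
pip install theano
In the current user's home directory (e.g. C:\Users\valex; this is usually, but not always, the directory cmd opens in),
create a file named .theanorc.txt with the following contents: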
[blas]
ldflags=
[global]
openmp=False
floatX = float32
device = gpu
allow_input_downcast=True
[nvcc]
fastmath = True
flags = -LC:\Anaconda2\libs
compiler_bindir = C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin
[gcc]
cxxflags = -IC:\Anaconda2\MinGW
fastmath = True
[cuda]
root = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5
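Since the home directory is not always where cmd starts, it can be worth printing where Python thinks it is and reading the file back to check for typos. The following is a sketch only: `load_theanorc` is a helper name invented for these notes, and it uses the standard library parser rather than Theano's own config loading, so it shows what is in the file, not necessarily what Theano ends up using.

```python
import os

try:
    from configparser import RawConfigParser  # Python 3
except ImportError:
    from ConfigParser import RawConfigParser  # Python 2.7


def load_theanorc(path=None):
    """Parse a .theanorc file into {section: {option: value}}."""
    if path is None:
        path = os.path.join(os.path.expanduser("~"), ".theanorc.txt")
    parser = RawConfigParser()
    parser.optionxform = str  # keep option names such as floatX case-sensitive
    parser.read(path)         # a missing file simply yields no sections
    return dict((s, dict(parser.items(s))) for s in parser.sections())


if __name__ == "__main__":
    print(os.path.expanduser("~"))  # the directory where .theanorc.txt should live
    print(load_theanorc())
```

An empty dictionary means the file was not found at that location or has no sections.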
* Test whether GPU acceleration works
from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time
vlen = 10 * 30 * 768  # 10 x #cores x #threads per core
iters = 1000
rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
# Print the ops in the compiled graph
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print('Looping %d times took %f seconds' % (iters, t1 - t0))
print('Result is %s' % (r,))
# A plain Elemwise op in the compiled graph means the computation ran on the CPU
if numpy.any([isinstance(node.op, T.Elemwise) for node in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')
* Notes
http://blog.sina.com.cn/s/blog_87ecc6830102wnh1.html