Installing TensorFlow-GPU on Windows 10 for Deep Learning (Nvidia MX150)


[Please include a link to the original article when reposting.]
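1. Use Anaconda to set up a Python 3.6 environment:
https://www.anaconda.com/download/
tensorflow-gpu does not yet support Python 3.7, so after installing Anaconda Version 5.3.1 (which ships with Python 3.7) you need to install Python 3.6 yourself. The steps below set it up so that Jupyter Notebook can switch between Python versions:
1) pip install ipykernel
2) python -m ipykernel install --name python3.6
3) Create a Python 3.6 environment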
conda create -n py36 python=3.6 anaconda
activate py36

4) Register the Python 3.6 interpreter as a Jupyter kernel by giving its full path
C:\ProgramData\Anaconda3\envs\py36\python.exe -m ipykernel install --name python3.6
5) Check that the installation succeeded (create a new notebook in Jupyter Notebook; you should be able to switch kernels). Then run this test in Jupyter Notebook:
import sys
print(sys.executable)
Output (the path points to the py36 environment, as expected):
C:\ProgramData\Anaconda3\envs\py36\python.exe
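
The same check can be extended to confirm the interpreter version as well as its path. The following is a minimal sketch, assuming the target is Python 3.6 as in the steps above; the helper name is_target_python is hypothetical and exists only for illustration.

```python
import sys


def is_target_python(version_info=sys.version_info, target=(3, 6)):
    """Return True when the major.minor version matches the target."""
    return tuple(version_info[:2]) == target


# Show which interpreter the current kernel runs on and whether it is 3.6.
print(sys.executable)
print(sys.version_info[:2], is_target_python())
```

If the second line reports anything other than (3, 6), the notebook is probably still attached to the default kernel rather than the python3.6 one.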

2. Check that your GPU is supported:
List of supported GPU models: https://developer.nvidia.com/cuda-gpus

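3.1. Download CUDA 10.0 (I checked that both CUDA 10.0 and CUDA 9.x support the MX150; note that tensorflow-gpu 1.9.0, installed in step 6, needs CUDA 9.0)
CUDA download: https://developer.nvidia.com/cuda-toolkit-archive
3.2. Download cuDNN, which provides the convolution support (be sure to pick the build that matches your CUDA version)
Download: https://developer.nvidia.com/rdp/cudnn-download
After extracting it, copy the contents over the same-named directories (bin, include, lib) in the CUDA 10.0 installation path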

4. Add the relevant paths to the PATH environment variable (to check the current value, run: echo %path%)
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin;
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\libnvvp;
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\lib\x64;
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\lib;
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include;
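Remember to reboot after the installation; otherwise TensorFlow reports: Could not find 'cudart64_90.dll'. The same error also appears when the TensorFlow and CUDA versions do not match (cudart64_90.dll is the CUDA 9.0 runtime).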
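
To see whether the DLL named in that error is actually reachable, the PATH entries can be scanned from Python. This is a rough sketch rather than an official diagnostic: the helper find_on_path is a hypothetical name, and the DLL name is simply the one from the error message above.

```python
import os


def find_on_path(filename, path_value=None):
    """Return every PATH directory entry that contains the given file."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        directory = directory.strip().strip('"')
        if not directory:
            continue
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


# An empty list means no directory on PATH contains the CUDA 9.0 runtime DLL.
print(find_on_path("cudart64_90.dll"))
```

An empty result after a reboot suggests either a missing PATH entry or an installed CUDA version that differs from the one TensorFlow expects.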

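5. Checks after installing CUDA:
1) Open the NVIDIA Control Panel
Under Help > System Information, check the supported CUDA version
2) Under Help > Components, look at the NVCUDA.DLL entry

3) Command line:
Run nvidia-smi and check that the versions shown on the left and right match (this is the nvidia-smi.exe file; if the command is not found, search the installation drive for it)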
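
The command-line check can also be scripted. The sketch below assumes nvidia-smi is on PATH (it returns None otherwise) and uses the tool's --query-gpu and --format options; the function name query_gpu is hypothetical.

```python
import shutil
import subprocess


def query_gpu():
    """Return 'name, driver_version' lines from nvidia-smi, or None if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # not on PATH; locate nvidia-smi.exe manually as described above
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


print(query_gpu())
```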

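6. Install TensorFlow-GPU:
At the time of writing, the prebuilt tensorflow-gpu packages (1.9.0 and later) do not support CUDA 10.0; tensorflow-gpu 1.9.0 is built against CUDA 9.0.
CUDA version support for each release: https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md
Quick install: pip install tensorflow-gpu==1.9.0 -i https://pypi.douban.com/simple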
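
The version pairing can be written down as a small lookup, which also predicts the DLL name that appears in the "Could not find" error. This is only a sketch: the table is an assumption transcribed from the published tested build configurations and should be verified against the release notes linked above, and both function names are hypothetical.

```python
# Assumed pairs of tensorflow-gpu release series and CUDA version;
# verify against the official release notes before relying on them.
TF_TO_CUDA = {
    "1.9": "9.0",
    "1.10": "9.0",
    "1.11": "9.0",
    "1.12": "9.0",
    "1.13": "10.0",
}


def required_cuda(tf_version):
    """Look up the CUDA version for a tensorflow-gpu version string like '1.9.0'."""
    key = ".".join(tf_version.split(".")[:2])
    return TF_TO_CUDA.get(key)  # None when the version is not in the table


def expected_cudart_dll(cuda_version):
    """Build the CUDA runtime DLL name, e.g. '9.0' gives 'cudart64_90.dll'."""
    return "cudart64_" + cuda_version.replace(".", "") + ".dll"


cuda = required_cuda("1.9.0")
print(cuda, expected_cudart_dll(cuda))
```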

7. Run:
If there is only one GPU, this value must be set to 0; otherwise an error reports that no GPU can be found.
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
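
To confirm that TensorFlow actually sees the GPU after this setting, the visible devices can be listed. A minimal sketch, assuming a TensorFlow 1.x installation where tensorflow.python.client.device_lib is importable; the helper list_gpu_names is a hypothetical name and returns None when TensorFlow is not installed.

```python
import os

# Keep an existing setting if there is one; otherwise expose only GPU 0.
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0")


def list_gpu_names():
    """Return TensorFlow device names of type GPU, or None without TensorFlow."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"]


print(list_gpu_names())
```

An empty list means TensorFlow is installed but found no usable GPU, which usually points back to the CUDA or cuDNN version checks in the earlier steps.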

Run a simple example (an MLP):

# -*- coding: utf-8 -*-
"""
Created on Mon Nov 19 19:33:03 2018

@author: KUMA
"""

import os

# Expose only GPU 0; set this before TensorFlow initializes the GPU.
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

import numpy as np
import tensorflow as tf

class LinearSep:
    def __init__(self):
        self.n_train = 10
        self.n_test = 50
        self.x_train, self.y_train, self.x_test, self.y_test = self._gene_data()
    def _gene_data(self):
        x = np.random.uniform(-1,1,[self.n_train, 2])
        y = (x[:,1]>x[:,0]).astype(np.int32)
        x += np.random.randn(self.n_train, 2)*0.05
        x_test = np.random.uniform(-1,1,[self.n_test, 2])
        y_test = (x_test[:,1]>x_test[:,0]).astype(np.int32)
        return x, y, x_test, y_test
        
# Generate random data
dataset = LinearSep()
X_train, Y_train = dataset.x_train, dataset.y_train
print(Y_train)
Y_train=np.eye(2)[Y_train]
X_test,Y_test=dataset.x_test,dataset.y_test
Y_test=np.eye(2)[Y_test]
x=tf.placeholder(tf.float32,[None,2],name='input')
y=tf.placeholder(tf.float32,[None,2],name='output')
w1 = tf.get_variable(name='w_fc1', shape=[2, 20], dtype=tf.float32)
b1 = tf.get_variable(name='b_fc1', shape=[20], dtype=tf.float32)

out = tf.matmul(x, w1) + b1
out = tf.nn.relu(out)


w2 = tf.get_variable(name='w_fc2', shape=[20, 2], dtype=tf.float32)
b2 = tf.get_variable(name='b_fc2', shape=[2], dtype=tf.float32)
out = tf.matmul(out, w2) +b2
out = tf.nn.softmax(out)
# Cross-entropy loss
loss = -tf.reduce_mean(tf.reduce_sum(y * tf.log(out + 1e-8), axis=1), axis=0)
# Accuracy
correct_pred = tf.equal(tf.argmax(y, axis=1), tf.argmax(out, axis=1))

accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
# Define the optimizer
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)  # 1e-3 is the learning rate
# Initialize the network
#BATCH_SIZE = 128
EPOCH = 7000  # number of optimization steps

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for ep in range(EPOCH):
    sess.run(train_op, feed_dict={x: X_train, y: Y_train})

    # Evaluate only when reporting, to avoid extra work on every step.
    if ep % 1000 == 0:
        loss_train, acc_train = sess.run([loss, accuracy], feed_dict={x: X_train, y: Y_train})
        acc_test = sess.run(accuracy, feed_dict={x: X_test, y: Y_test})
        print(ep, loss_train, acc_train, acc_test)
        print(Y_test.shape)
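# Predict on the test set and split the test points by predicted class.
test_pre = sess.run(out, feed_dict={x: X_test, y: Y_test})
print(len(test_pre))
mask = np.argmax(test_pre, axis=1)
print(mask)
mask_0 = np.where(mask == 0)
mask_1 = np.where(mask == 1)
# The mask has n_test entries, so it must index X_test; indexing X_train
# (only n_train rows) with it raises an IndexError.
X_0 = X_test[mask_0]
X_1 = X_test[mask_1]
print(X_0)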

1) Output in the IDEA console
C:\ProgramData\Anaconda3\envs\py3.6\python.exe C:/Users/13302/IdeaProjects/textmerge/.idea/src/tensorflow/littlework2.py
[0 0 0 0 0 1 1 0 0 1]
2018-11-24 20:04:05.551704: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-11-24 20:04:06.328242: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1392] Found device 0 with properties:
name: GeForce MX150 major: 6 minor: 1 memoryClockRate(GHz): 1.5315
pciBusID: 0000:01:00.0
totalMemory: 4.00GiB freeMemory: 3.32GiB

2018-11-24 20:04:06.328708: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1471] Adding visible gpu devices: 0
2018-11-24 20:04:07.080690: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-11-24 20:04:07.081071: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:958] 0
2018-11-24 20:04:07.081343: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:971] 0: N
2018-11-24 20:04:07.081704: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3041 MB memory) -> physical GPU (device: 0, name: GeForce MX150, pci bus id: 0000:01:00.0, compute capability: 6.1)

2) Run nvidia-smi to check GPU usage (the screenshot was taken after switching to CUDA 9.0; the difference is minor)

PS: Thanks to my classmate Liang.