Deep Learning: Tools

1. Introduction

There are many deep learning tools: TensorFlow, Theano, Caffe, Keras, MXNet, Scikit-learn… Some are written in C++, some in Python, others in R or Java. Where should you start?
Let's look first at the most popular one, TensorFlow, an artificial intelligence learning system developed by Google. Its main strength is distributed computation, especially in multi-GPU environments. Theano is likewise a fairly low-level library, generally used on a single machine. What does "low-level" mean? Cooking twice-cooked pork does not require you to slaughter the pig yourself: slaughtering is the lower-level work, and someone has already done it for you. A higher-level tool such as Keras wraps Theano and TensorFlow in a more human-friendly API. Whether to choose a low-level or a high-level tool depends mainly on whether your goal is to run a meat-packing plant or a restaurant.


2. Tool Overview

Caffe is a C++ library at a relatively low level. A veteran tool, it is stable and performs well, and it also provides Python bindings, though it is less flexible than the Python-native tools.
Theano is a low-level library that transparently uses the GPU for numerical computation. Also a veteran tool, it is fairly stable, supports only single-machine use, and can serve as a Keras backend.
TensorFlow is likewise a low-level library, released by Google. Its main strength is distributed computation, especially in multi-GPU environments, and it can serve as a Keras backend.
CNTK is another low-level library, released by Microsoft; it too can serve as a Keras backend.
MXNet is yet another low-level library, implemented in C++ with interfaces for Python, Lua, R, JavaScript, and other languages. Its strength is distributed computation: it can train a network across multiple CPUs/GPUs. It can also serve as a Keras backend.
Keras is widely used. It wraps Theano, TensorFlow, MXNet, and CNTK in a more human-friendly API (building a complete solution directly on the low-level libraries still means writing a lot of code yourself). It is good for getting started quickly, but it is somewhat slower, and more constrained when designing complex algorithms.
Lasagne is a lightweight library built on Theano for constructing and training networks. Functionally it is a compromise between Theano's low-level programming and Keras's high-level abstraction.
NoLearn wraps Lasagne in a more human-friendly API (much as Keras wraps Theano and TensorFlow); moreover, all NoLearn code is compatible with scikit-learn.
Scikit-learn is the usual machine learning library in Python. Besides deep learning it offers many shallow models; deep learning is not its focus.
Torch is a deep learning library written in Lua.
"High-level" and "low-level" here are relative terms; high-level just means somewhat simpler. When applying algorithms you mostly call the higher layers, at some cost in efficiency and with more constraints; when improving algorithms you rely more on the low-level libraries.
However you call the interfaces, it is still just calling libraries. What really matters is how you design the network architecture and the mathematics behind it, so deep learning is ultimately a contest of "internal strength."

3. Choosing a Tool

Most of the tools above are written in Python or provide Python interfaces, so working with them from Python is recommended.
For getting started, Keras is the suggested choice, for these reasons: the code is simple and learning material is plentiful; Theano, TensorFlow, and others can serve as its backend (once a backend is installed you can use Keras without knowing the backend's details); and it has some extensibility (you can write extra functionality in Theano or TensorFlow statements and combine it with Keras). Above all, it is easy to pick up.
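
As an aside, switching backends is simple. A minimal sketch, assuming Keras honors the KERAS_BACKEND environment variable (it also reads ~/.keras/keras.json); the variable must be set before keras is first imported:

import os
os.environ['KERAS_BACKEND'] = 'theano'  # or 'tensorflow'

import keras.backend as K
print(K.backend())  # expect 'theano'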

4. Quick Installation of Keras, Theano & TensorFlow

1) Installing the software

Below is the simplest installation method; this article and the following ones assume an Ubuntu environment.

$ sudo pip install tensorflow
$ sudo pip install theano
$ sudo pip install keras
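
A quick sanity check that all three packages installed correctly; a minimal sketch assuming each package exposes a __version__ attribute (true for releases of this era):

import tensorflow as tf
import theano
import keras  # typically prints the backend in use, e.g. "Using TensorFlow backend."

print(tf.__version__)
print(theano.__version__)
print(keras.__version__)
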
2) Installing the examples

The best way to learn Keras is from its examples, which requires downloading the Keras source:

$ git clone https://github.com/fchollet/keras.git

The examples are in the keras/examples directory.
The mnist_* examples are recommended: they implement handwritten-digit recognition with CNNs, RNNs, GANs, and other methods. Run one first to see the effect. The code is short, but when reading it you will find that, even with just a few statements and a few parameters, it is still unclear why it is written this way; it does not quite line up with the books that explain the theory. I therefore suggest first implementing a simple neural network yourself (not by calling a library) to understand the overall flow and what each parameter actually does. See the next article, "Deep Learning: BP Neural Networks".
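
For orientation, here is a minimal sketch in the spirit of keras/examples/mnist_mlp.py (assuming the Keras 2.x Sequential API; MNIST is downloaded on first run):

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical

# Load MNIST and flatten the 28x28 images into 784-dim float vectors.
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255
y_train = to_categorical(y_train, 10)  # one-hot labels
y_test = to_categorical(y_test, 10)

# A two-layer MLP: 784 -> 512 -> 10.
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop',
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=128, epochs=2, verbose=1)
print(model.evaluate(x_test, y_test, verbose=0))  # [loss, accuracy]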

5. Configuring GPU Support

1) Notes

After the quick installation above, the examples should basically all run. Beginners are advised to just skim this section rather than try to configure everything in one step. The NVIDIA driver, CUDA, TensorFlow, and Theano versions all depend on one another, and also on the operating system and the GPU model; there really are many pitfalls in this process, some of which can leave the desktop unable to start and seriously derail your study. It is strongly recommended to first get the deep learning frameworks clear in your head, and come back to configure the GPU only when you need heavy computation.
Note: only GPUs with a compute capability of 3.0 or above support tensorflow_gpu; for specific models see https://developer.nvidia.com/cuda-gpus

2) NVIDIA driver

Most NVIDIA cards support deep learning; they differ only in performance. Use the following commands to inspect your GPU (TensorFlow needs compute capability 3.0 or above; Theano only requires CUDA support):

$ lspci  | grep -i vga
$ nvidia-smi

The GPU driver may not be the latest. It is usually installed from the GUI: System Settings -> Software & Updates -> Additional Drivers -> select an available NVIDIA driver. You can also install with apt-get, or download an installer script from the NVIDIA website, but the script method requires shutting down the graphical interface first and restarting it afterwards, which is more troublesome.
Also note that a higher version number does not mean broader card support: some older cards are supported only by older driver versions. This is the biggest pitfall of the whole installation: the upgraded driver no longer supports your hardware, and the graphical interface fails to start, looping endlessly back to the password prompt.

$ ls /proc/driver/nvidia/gpus/

After a successful installation, the directory above will have been created.

3) CUDA

CUDA is a general-purpose parallel computing architecture from NVIDIA that enables the GPU to solve complex computational problems. The machine learning libraries all call the GPU through the CUDA libraries.
CUDA can be installed with the following commands (this method is not recommended):

$ apt-get install nvidia-cuda-toolkit
$ nvcc -V # check the version information after a successful install

Whether it works after installation is a matter of luck. The CUDA, NVIDIA driver, and TensorFlow versions are strongly coupled; when installing CUDA with apt-get, the GPU driver is sometimes upgraded along the way (the installer prompts to upgrade a pile of packages, among them nvidia-xxx), and the upgraded driver may well not support your hardware.
The recommended route is to download the run script from https://developer.nvidia.com/cuda-downloads; the script asks whether to upgrade the GPU driver, and you should answer No. After installation you still need to set some environment variables for the bin and lib directories (typically adding CUDA's bin directory to PATH and its lib64 directory to LD_LIBRARY_PATH). CUDA has many versions; find the one matching your GPU driver, otherwise the GPU will still not be found after installation.

4) Tensorflow_gpu

TensorFlow has a GPU version and a CPU version; the quick installation above installed the CPU one. The GPU version additionally requires CUDA and libcudnn on the system, in versions that match TensorFlow. For example, the latest TensorFlow (at the time of writing) requires CUDA 8.0 and cuDNN 6.0; other combinations fail with all sorts of strange errors. The install command is:

$ sudo pip install tensorflow_gpu

You can also download the latest TensorFlow source from git and compile it yourself, but the build needs tools such as bazel, which is more troublesome.
My successful setup used pip-installed tensorflow_gpu together with cuda_8.0.44_linux.run and cudnn-8.0-linux-x64-v6.0.tgz (the tgz package is about 200 MB; the 15 MB package fails with "function not found" errors).

Test program
After installation, the following program can be used to test that TensorFlow runs correctly:

import tensorflow as tf

hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()      # TF 1.x session API
print(sess.run(hello))   # Hello, TensorFlow!
a = tf.constant(10)
b = tf.constant(32)
print(sess.run(a + b))   # 42
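
To confirm that the GPU version really is using the GPU rather than silently falling back to the CPU, a short sketch assuming the TF 1.x device APIs:

from tensorflow.python.client import device_lib
import tensorflow as tf

# List CPU and GPU devices; a working GPU setup shows a /gpu:0 entry.
print(device_lib.list_local_devices())

# log_device_placement=True logs which device each op is placed on.
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(tf.constant(1.0) + tf.constant(2.0)))
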
5) Theano

Theano also calls the GPU through CUDA. It is simpler than TensorFlow in this respect: it is not strict about the CUDA version, and once the configuration file is set up correctly it can use the GPU.
i. Configuration file

$ vi ~/.theanorc
with the following contents:
[global]
device=gpu
floatX=float32

[nvcc]
optimizer=None
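
A quick way to check that the settings took effect (theano.config exposes these fields in the releases of this era):

import theano

print(theano.config.device)  # expect 'gpu'
print(theano.config.floatX)  # expect 'float32'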

ii. Test program

from theano import function, config, shared
import theano.tensor as T
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
# Plain Elemwise ops run on the CPU; the GPU build replaces them with GpuElemwise.
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')

6. References

1) "My Top 9 Favorite Python Deep Learning Libraries"

http://blog.csdn.net/u013886628/article/details/51819142

2) "Ubuntu 16.04 + CUDA 8.0 + Caffe Installation Tutorial"

http://blog.csdn.net/autocyz/article/details/52299889/

