A Performance Comparison of Deep Learning Frameworks

bbzz2

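A Zhihu thread comparing the various deep learning frameworks:

How do you choose among the many neural network frameworks such as Chainer, Caffe, Torch, and MXNet?

Someone asked for the comparison to be updated four months ago, but as far as I can see it has not been updated since.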

Evaluation of Deep Learning Toolkits

Original text:

Abstract. In this study, I evaluate some popular deep learning toolkits. The candidates are listed in alphabetical order: Caffe, CNTK, TensorFlow, Theano, and Torch. This is a dynamic document and the evaluation, to the best of my knowledge, is based on the current state of their code.

I also provide ratings in some areas because for a lot of people, ratings are useful. However, keep in mind that ratings are inherently subjective [1].

If you find something wrong or inadequate, please help improve by filing an issue.

This post compares Caffe, CNTK, TensorFlow, Theano, and Torch. If you spot any errors, corrections are welcome!

Modeling Capability

In this section, we evaluate each toolkit's ability to train common and state-of-the-art networks without writing too much code. Some of these networks are:

ConvNets: AlexNet, OxfordNet, GoogleNet
RecurrentNets: plain RNN, LSTM/GRU, bidirectional RNN
Sequential modeling with attention.

In addition, we also evaluate the flexibility to create a new type of model.

In short, this section rates how well each toolkit can train common and state-of-the-art networks (the ConvNets, recurrent nets, and attention-based sequence models listed above) without requiring much extra code.

Caffe

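Caffe is the most popular deep learning toolkit in the computer vision community and in industry, and it is actively being extended. Its support for recurrent networks, however, is poor.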

Caffe is perhaps the first mainstream industry-grade deep learning toolkit, started in late 2013, due to its excellent convnet implementation (at the time). It is still the most popular toolkit within the computer vision community, with many extensions being actively added.

However, its support for recurrent networks and language modeling in general is poor, due to its legacy architecture, whose limitations are detailed in the architecture section.

CNTK

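CNTK is better known in the speech community. In CNTK (as in TensorFlow and Theano), a network is expressed as a graph of vector operations, such as matrix addition and multiplication, and a layer is a composition of those operations. The fine granularity of these building blocks lets users create more complex layers without implementing them in a low-level language.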

CNTK is a deep learning system started by the speech people who started the deep learning craze, and it has grown into a more general platform-independent deep learning system. It is better known in the speech community than in the general deep learning community.

In CNTK (as in TensorFlow and Theano), a network is specified as a symbolic graph of vector operations, such as matrix add/multiply or convolution. A layer is just a composition of those operations. The fine granularity of the building blocks (operations) allows users to invent new complex layer types without implementing them in a low-level language (as in Caffe).
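To make the "layer as a composition of operations" idea concrete, here is a minimal pure-Python sketch. It is not the API of CNTK, TensorFlow, or Theano; the names `Node`, `placeholder`, `matmul`, `add`, `relu`, and `dense_layer` are all hypothetical and exist only for this illustration.

```python
# Illustrative sketch of a symbolic graph of vector operations.
# All names here are hypothetical, not taken from any real toolkit.

class Node:
    """A graph node: an operation plus the nodes it takes as inputs."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op          # callable, or None for a placeholder
        self.inputs = inputs  # upstream nodes
        self.name = name      # lookup key for placeholders

    def eval(self, feed):
        # Placeholders read their value from the feed dictionary;
        # every other node first evaluates its inputs, then applies its op.
        if self.op is None:
            return feed[self.name]
        return self.op(*(n.eval(feed) for n in self.inputs))


def placeholder(name):
    return Node(None, name=name)


def matmul(a, b):
    def op(x, y):
        return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
                for row in x]
    return Node(op, (a, b))


def add(a, b):
    # Adds the bias vector y to every row of the matrix x.
    return Node(lambda x, y: [[p + q for p, q in zip(row, y)] for row in x],
                (a, b))


def relu(a):
    return Node(lambda x: [[max(0.0, v) for v in row] for row in x], (a,))


def dense_layer(x, w, b):
    """A 'layer' is only a composition of the primitive operations above."""
    return relu(add(matmul(x, w), b))


x, w, b = placeholder("x"), placeholder("w"), placeholder("b")
y = dense_layer(x, w, b)  # builds the graph; nothing is computed yet

result = y.eval({
    "x": [[1.0, 2.0]],
    "w": [[1.0, -1.0], [0.5, 0.5]],
    "b": [0.0, -1.0],
})
```

Because `dense_layer` is written entirely in terms of graph primitives, a new layer type needs no lower-level code, which is the flexibility the paragraph above describes.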

As of today, CNTK is not usable for a variety of tasks such as sequence-2-sequence.

TensorFlow

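TensorFlow is a relatively new framework. RNNs can be expressed easily and efficiently using the bucketing trick. Current limitations: the RNN API and implementation are suboptimal, bidirectional RNNs are not yet available, and there is no 3D convolution for video.

Every computational flow has to be built as a static graph, which makes some computations difficult, such as beam search (commonly used in sequence prediction tasks).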

State-of-the-art models

RNN API and implementation are suboptimal. The team also commented about it here and here.
Bidirectional RNN not available yet
No 3D convolution, which is useful for video recognition

New models. Since TF uses the symbolic-graph-of-vector-operations approach, specifying a new network is fairly easy. Although it doesn't support symbolic loops yet (at least not well tested/documented, as of 05/2016), RNNs can be made easy and efficient using the bucketing trick.
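As a rough illustration of the bucketing trick as it is commonly described: sequences are grouped into a small number of fixed lengths and padded within each group, so that one statically unrolled graph per bucket is enough. The sketch below is plain Python rather than TensorFlow code, and `bucket_sequences`, the bucket sizes, and the pad value are assumptions made for the example.

```python
def bucket_sequences(sequences, bucket_sizes, pad=0):
    """Place each sequence in the smallest bucket that fits and pad it to that length."""
    buckets = {size: [] for size in sorted(bucket_sizes)}
    for seq in sequences:
        for size in buckets:  # iterates in ascending bucket size
            if len(seq) <= size:
                buckets[size].append(list(seq) + [pad] * (size - len(seq)))
                break
        else:
            raise ValueError("sequence of length %d exceeds the largest bucket" % len(seq))
    return buckets


batches = bucket_sequences(
    [[1, 2], [3, 4, 5, 6], [7], [8, 9, 10]],
    bucket_sizes=(2, 4),
)
```

Each bucket then holds equal-length sequences, so a network unrolled for 2 steps and another unrolled for 4 steps cover the whole dataset with limited padding waste.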

However, TF has a major weakness in terms of modeling flexibility. Every computational flow has to be constructed as a static graph. That makes some computations difficult, such as beam search (which is used frequently in sequence prediction tasks).

Theano

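State-of-the-art models: Theano covers most state-of-the-art networks, either through a higher-level framework or in pure Theano.

New models: Theano pioneered the use of symbolic graphs to express networks. Its symbolic API supports loop control through scan, which makes implementing RNNs easy and efficient.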

State-of-the-art models. Theano has implementations of most state-of-the-art networks, either in the form of a higher-level framework (e.g. Blocks, Keras, etc.) or in pure Theano.

New models. Theano pioneered the trend of using a symbolic graph for programming a network. Theano's symbolic API supports looping control, the so-called scan, which makes implementing RNNs easy and efficient. Users don't always have to define a new model at the tensor operations level. There are a few higher-level frameworks, mentioned above, which make model definition and training simpler.
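The idea behind a scan-style loop can be sketched in plain Python: a step function is applied along a sequence while a state is carried forward, and every intermediate state is collected. This is only an eager analogue and does not reproduce Theano's symbolic scan or its signature; `scan`, `rnn_step`, and the fixed 0.5 weights are hypothetical.

```python
import math


def scan(step, sequence, initial_state):
    """Apply step(x, state) -> new_state along the sequence, collecting every state."""
    states = []
    state = initial_state
    for x in sequence:
        state = step(x, state)
        states.append(state)
    return states


def rnn_step(x, h):
    # A one-unit recurrent cell with fixed, made-up weights.
    return math.tanh(0.5 * x + 0.5 * h)


hidden_states = scan(rnn_step, [1.0, 0.0, -1.0], 0.0)
running_sum = scan(lambda x, s: s + x, [1, 2, 3, 4], 0)
```

An RNN fits this pattern naturally: the step function is the recurrent cell and the carried state is the hidden vector.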

Torch

State-of-the-art models

Excellent for conv nets. It's worth noting that temporal convolution can be done in TensorFlow/Theano via conv2d but that's a trick. The native interface for temporal convolution in Torch makes it slightly more intuitive to use.
Rich set of RNNs available through a non-official extension [2]
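The conv2d trick mentioned in the first item can be illustrated without any framework: a 1D signal is treated as an image of height 1 and the temporal kernel as a 1 x k filter. The sketch below uses a hand-written "valid" 2D convolution (computed as cross-correlation, the usual deep learning convention); `conv2d_valid` and `conv1d_via_conv2d` are hypothetical helpers, not TensorFlow or Theano functions.

```python
def conv2d_valid(image, kernel):
    """2D cross-correlation with no padding and stride 1, on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def conv1d_via_conv2d(signal, kernel):
    # Wrap the signal as a 1 x n image and the kernel as a 1 x k filter,
    # run the 2D routine, then unwrap the single output row.
    return conv2d_valid([signal], [kernel])[0]


edges = conv1d_via_conv2d([1, 2, 3, 4, 5], [1, 0, -1])
```

The extra wrapping and unwrapping is what makes this a trick; a native temporal convolution interface takes the 1D signal directly.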

New models. In Torch, there are multiple ways (stack of layers or graph of layers) to define a network but essentially, a network is defined as a graph of layers. Because of this coarser granularity, Torch is sometimes considered less flexible because for new layer types, users have to implement the full forward, backward, and gradient input update.
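To show what implementing the full forward and backward passes involves for a single layer, here is a minimal Python sketch of a layer computing y = w * x with one scalar weight. It is not Torch's Lua API; the class `Scale` and its methods `forward`, `backward`, and `update` are hypothetical names chosen for the illustration.

```python
class Scale:
    """Hypothetical layer computing y = w * x elementwise with one scalar weight."""

    def __init__(self, w):
        self.w = w
        self.grad_w = 0.0
        self.x = None

    def forward(self, x):
        # Cache the input; the backward pass needs it for the weight gradient.
        self.x = x
        return [self.w * v for v in x]

    def backward(self, grad_output):
        # Gradient with respect to the parameter: dL/dw = sum(dL/dy_i * x_i).
        self.grad_w += sum(g * v for g, v in zip(grad_output, self.x))
        # Gradient with respect to the input: dL/dx_i = w * dL/dy_i.
        return [self.w * g for g in grad_output]

    def update(self, learning_rate):
        # Plain gradient descent step, then reset the accumulated gradient.
        self.w -= learning_rate * self.grad_w
        self.grad_w = 0.0


layer = Scale(2.0)
output = layer.forward([1.0, 3.0])
grad_input = layer.backward([1.0, 1.0])
grad_w_before_update = layer.grad_w
layer.update(0.25)
```

Every new layer type has to supply all of these pieces by hand, whereas a layer composed from graph primitives gets its gradients from the primitives.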

However, unlike Caffe, defining a new layer in Torch is much easier because you don't have to program in C++. Plus, in Torch, the difference between new layer definition and network definition is minimal. In Caffe, layers are defined in C++ while networks are defined via Protobuf.

Torch is more flexible than TensorFlow and Theano in that it is imperative while TF/Theano are declarative (i.e. one has to declare a computational graph). That makes some operations, e.g. beam search, much easier to do in Torch.
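Beam search shows why imperative control flow helps: the loop branches on data (finished hypotheses, pruning) at every step. Below is a minimal Python sketch over a toy next-token table; `beam_search`, `toy_model`, and `TABLE` are hypothetical, and the probabilities are invented for the example.

```python
import math

# Made-up next-token probabilities, keyed by the last token of the prefix.
TABLE = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"</s>": 0.45, "c": 0.55},
    "b": {"</s>": 0.9, "c": 0.1},
    "c": {"</s>": 1.0},
}


def toy_model(prefix):
    """Return log-probabilities of the next token given the prefix."""
    return {tok: math.log(p) for tok, p in TABLE[prefix[-1]].items()}


def beam_search(step_log_probs, start, beam_width, max_len, end_token):
    """Keep the beam_width best partial sequences at each step."""
    beams = [(0.0, [start])]
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == end_token:
                candidates.append((score, seq))  # finished hypotheses carry over
                continue
            for token, logp in step_log_probs(seq).items():
                candidates.append((score + logp, seq + [token]))
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
        if all(seq[-1] == end_token for _, seq in beams):
            break
    return beams


best_score, best_seq = beam_search(toy_model, "<s>", 2, 3, "</s>")[0]
greedy_score, greedy_seq = beam_search(toy_model, "<s>", 1, 3, "</s>")[0]
```

With a beam of 2 the search recovers the higher-probability sequence through "b", while a beam of 1 (greedy decoding) commits to "a" early. Expressing the same data-dependent pruning inside a static graph is the difficulty the text refers to.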

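Torch is excellent for conv nets, and its native temporal convolution interface is more intuitive to use. Unlike Caffe, defining a new layer in Torch is easier because it does not require programming in C++, so the gap between defining a layer and defining a network is small. In Caffe, each layer is defined in C++ while the overall network is configured through a Protobuf file.
TF/Theano are declarative (a computational graph has to be declared), whereas Torch is imperative and therefore more flexible, which makes operations such as beam search easier.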

Left: graph model of CNTK/Theano/TensorFlow; Right: graph model of Caffe/Torch