[Repost] Deep learning these days


Original source: http://www.52ml.net/14277.html


It seems that quite a few people with an interest in deep learning think of it in terms of unsupervised pre-training, autoencoders, stacked RBMs and deep belief networks. It’s easy to get into this groove by watching one of Geoff Hinton’s videos from a few years ago, where he bashes backpropagation in favour of unsupervised methods that are able to discover the structure in data by themselves, the same way the human brain does. Those videos, papers and tutorials linger. They were state of the art once, but things have changed since then.

These days supervised learning is the king again. This has to do with the fact that you can look at data from many different angles, and usually you’d prefer a representation that is useful for the discriminative task at hand. Unsupervised learning will find some angle, but will it be the one you want? In the case of the MNIST digits, sure. Otherwise probably not. Or maybe it will find a lot of angles while you only need one.

Ladies and gentlemen, please welcome Sander Dieleman. He won the Galaxy Zoo challenge on Kaggle, so he might know a thing or two about deep learning. Here’s a Reddit comment about deep learning that Sander recently made, reprinted with the author’s permission:


It’s true that unsupervised pre-training was initially what made it possible to train deeper networks, but over the last few years the pre-training approach has become largely obsolete.

Nowadays, deep neural networks are a lot more similar to their ’80s cousins. Instead of pre-training, the difference now lies in the activation functions and regularisation methods used (and sometimes in the optimisation algorithm, although much more rarely).

I would say that the “pre-training era”, which started around 2006, ended in the early ’10s when people started using rectified linear units (ReLUs), and later dropout, and discovered that pre-training was no longer beneficial for this type of network.

ReLUs (and modern variants such as maxout) suffer significantly less from the vanishing gradient problem. Dropout is a strong regulariser that helps ensure the solution we get generalises well. These are precisely the two issues pre-training sought to solve, but they are now solved in different ways.
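To make these two mechanisms concrete, here is a minimal NumPy sketch (an editorial illustration, not taken from the original post or from Sander’s code): the ReLU’s gradient is 1 for positive inputs, so it does not shrink as it is multiplied through many layers the way saturating sigmoid gradients do, and inverted dropout randomly zeroes activations during training to regularise the network.

```python
import numpy as np

def relu(x):
    # Gradient is 1 wherever x > 0, so repeated multiplication across layers
    # does not drive gradients towards zero the way saturating sigmoids do.
    return np.maximum(0.0, x)

def sigmoid_grad(x):
    # For comparison: the sigmoid derivative is at most 0.25, so stacking
    # many sigmoid layers multiplies gradients by small factors repeatedly.
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

def dropout(x, p=0.5, training=True, rng=np.random.default_rng(0)):
    # Inverted dropout: zero each unit with probability p during training and
    # rescale the survivors, so nothing needs to change at test time.
    if not training:
        return x
    mask = (rng.random(x.shape) >= p) / (1.0 - p)
    return x * mask

h = np.array([-2.0, -0.5, 0.5, 2.0])
print(relu(h))            # [0.  0.  0.5 2. ]
print(sigmoid_grad(h))    # every value <= 0.25
print(dropout(h, p=0.5))  # roughly half the units zeroed, survivors scaled by 2
```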

Using ReLUs also tends to result in faster training, because they’re faster to compute, and because the optimisation is easier.

Commercial applications of deep learning usually consist of very large, feed-forward neural nets with ReLUs and dropout, trained simply with backprop. Unsupervised pre-training still has its applications when labelled data is very scarce and unlabelled data is abundant, but more often than not, it’s no longer needed.
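To make that setup concrete, here is a minimal sketch of such a network: a plain feed-forward net with ReLU activations and dropout, trained purely with backprop and no pre-training. It uses PyTorch and synthetic stand-in data; the layer sizes, learning rate and data shapes are editorial assumptions for illustration, not details from the original post.

```python
import torch
import torch.nn as nn

# A plain feed-forward net: linear layers, ReLU activations, dropout.
model = nn.Sequential(
    nn.Linear(784, 1024),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(1024, 1024),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(1024, 10),
)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

# Synthetic stand-in for labelled data (e.g. flattened 28x28 images, 10 classes).
x = torch.randn(64, 784)
y = torch.randint(0, 10, (64,))

model.train()                 # enables dropout
for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()           # plain backprop, no unsupervised pre-training
    optimizer.step()

model.eval()                  # disables dropout for prediction
```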


Original: http://fastml.com/deep-learning-these-days/

