Theano Deep Learning Tutorials notes: LSTM Networks for Sentiment Analysis

Tutorial: http://deeplearning.net/tutorial/lstm.html

A blog post with many helpful diagrams, worth reading alongside the tutorial: http://colah.github.io/posts/2015-08-Understanding-LSTMs/

An introduction written by a UCSD PhD student: http://blog.terminal.com/demistifying-long-short-term-memory-lstm-recurrent-neural-networks/

Lecture 7 of Geoffrey Hinton's Neural Networks for Machine Learning course on Coursera covers RNNs and LSTMs:

https://class.coursera.org/neuralnets-2012-001/lecture

A blog post with a source-code walkthrough for this section: http://www.cnblogs.com/neopenx/p/4806006.html

 

Large Movie Review Dataset: movie reviews crawled from IMDB, split into two classes according to their ratings. See the tutorial for the download and preprocessing scripts (the imdb.py script provided with this section automatically downloads the preprocessed dataset).
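A minimal sketch of loading the preprocessed data through the tutorial's imdb.py; the keyword arguments and return format follow my reading of that script, so treat them as assumptions if your copy differs.

```python
# Sketch: load the preprocessed IMDB sentiment data via the tutorial's imdb.py
# (the script downloads imdb.pkl on first use).
import imdb  # imdb.py from this section of the tutorial

# Each split is a pair (sentences, labels): sentences are lists of word
# indices, labels are 0 (negative review) or 1 (positive review).
train, valid, test = imdb.load_data(n_words=10000, valid_portion=0.05)
print(len(train[0]), "training reviews,", len(valid[0]), "validation reviews")

# prepare_data pads a minibatch of variable-length reviews into a
# (maxlen, n_samples) matrix plus a mask marking the non-padding time steps.
x, mask, y = imdb.prepare_data(train[0][:16], train[1][:16])
print(x.shape, mask.shape)
```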

 

Model

In a traditional recurrent neural network, during the gradient back-propagation phase, the gradient signal can end up being multiplied a large number of times (as many as the number of timesteps) by the weight matrix associated with the connections between the neurons of the recurrent hidden layer. This means that the magnitude of the weights in the transition matrix can have a strong impact on the learning process.

A traditional RNN is trained with Back-Propagation Through Time (BPTT), i.e., back-propagation across time steps. As a result, the gradient gets multiplied by the recurrent weight matrix (the purple weights in the tutorial's figure) over and over, once per time step, so those weights have a very strong influence on the learning process.

If the weights in this matrix are small (or, more formally, if the leading eigenvalue of the weight matrix is smaller than 1.0), it can lead to a situation called vanishing gradients where the gradient signal gets so small that learning either becomes very slow or stops working altogether. It can also make the task of learning long-term dependencies in the data more difficult. Conversely, if the weights in this matrix are large (or, again, more formally, if the leading eigenvalue of the weight matrix is larger than 1.0), the gradient signal can grow so large that learning diverges; this is referred to as exploding gradients.
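A toy numpy illustration (not part of the tutorial code) of why the leading eigenvalue matters: back-propagating through T time steps multiplies the gradient by the recurrent weight matrix T times, so its norm shrinks towards zero or blows up depending on whether that eigenvalue is below or above 1.

```python
import numpy as np

rng = np.random.RandomState(0)
T = 50                          # number of time steps to back-propagate through
grad0 = rng.randn(10)           # some initial gradient signal at the last step

for scale, name in [(0.9, "leading eigenvalue < 1"),
                    (1.1, "leading eigenvalue > 1")]:
    W = np.eye(10) * scale      # recurrent weights with a known spectrum
    g = grad0.copy()
    for _ in range(T):
        g = W.T.dot(g)          # one step of back-propagation through time
    print("%s: |grad| after %d steps = %.3e" % (name, T, np.linalg.norm(g)))

# The norm changes by roughly 0.9**50 (vanishing) vs 1.1**50 (exploding)
# relative to the initial gradient norm.
```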
