LSTM Networks

1. Recurrent Neural Network (RNN)

Compared to a traditional neural network, an RNN has a kind of persistence: its current state depends on past states.

For example, imagine you want to classify what kind of event is happening at every point in a movie. It’s unclear how a traditional neural network could use its reasoning about previous events in the film to inform later ones [1].
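To make the "current state depends on past states" idea concrete, here is a minimal sketch of a single vanilla RNN step in NumPy. The sizes, weight names, and initialization below are illustrative assumptions for this sketch, not code from the referenced post.

```python
import numpy as np

# Illustrative sizes and randomly initialized weights (assumptions for this sketch).
input_size, hidden_size = 4, 3
rng = np.random.default_rng(0)
W_xh = 0.1 * rng.standard_normal((hidden_size, input_size))   # input-to-hidden weights
W_hh = 0.1 * rng.standard_normal((hidden_size, hidden_size))  # hidden-to-hidden weights
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    # The new hidden state depends on the current input AND the previous state,
    # which is what gives the network its persistence over time.
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

h = np.zeros(hidden_size)
for x_t in rng.standard_normal((5, input_size)):  # a toy sequence of 5 time steps
    h = rnn_step(x_t, h)
print(h)
```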



However, plain RNNs only work well on short-term dependencies; as the gap between the relevant past information and the point where it is needed grows, RNNs become unable to learn to connect the information.


2. LSTM Networks

Long Short-Term Memory networks (LSTMs) are a special kind of RNN, capable of learning long-term dependencies.



LSTMs also have this chain-like structure, but the repeating module has a different structure. Instead of a single neural network layer, there are four (three sigmoid gate layers and one tanh layer), and they interact in a special way.




The key to LSTMs is the cell state, the horizontal line running through the top of the repeating-module diagram in [1].


The LSTM does have the ability to remove or add information to the cell state, carefully regulated by structures called gates. Gates are a way to optionally let information through. They are composed of a sigmoid neural network layer and a pointwise multiplication operation.

The sigmoid layer outputs numbers between zero and one, describing how much of each component should be let through. A value of zero means “let nothing through,” while a value of one means “let everything through!”

An LSTM has three of these gates, to protect and control the cell state.
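As a rough sketch of the gate mechanism (the names, sizes, and random values below are made up for illustration): a sigmoid layer produces values between 0 and 1, and a pointwise multiplication then scales each component of the gated signal by that amount.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical gate parameters and inputs, purely to show the mechanism.
rng = np.random.default_rng(1)
W_gate, b_gate = rng.standard_normal((3, 3)), np.zeros(3)
controller = rng.standard_normal(3)   # what the gate "looks at"
signal = rng.standard_normal(3)       # the information being gated

gate = sigmoid(W_gate @ controller + b_gate)  # each entry is between 0 and 1
passed = gate * signal                        # pointwise multiply: 0 blocks, 1 lets through
```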

3. Step-by-Step LSTM Walk Through

a) The first step in our LSTM is to decide what information we’re going to throw away from the cell state. This decision is made by a sigmoid layer called the “forget gate layer.”
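In the notation of the referenced post, this layer looks at ht−1 and xt and outputs a number between 0 and 1 for each entry of the cell state Ct−1:

ft = σ(Wf · [ht−1, xt] + bf)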



b) The next step is to decide what new information we’re going to store in the cell state. 

This has two parts. First, a sigmoid layer called the “input gate layer” decides which values we’ll update. Next, a tanh layer creates a vector of new candidate values, C~t, that could be added to the state. In the next step, we’ll combine these two to create an update to the state.
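In the referenced post's notation, these two parts are:

it = σ(Wi · [ht−1, xt] + bi)
C~t = tanh(WC · [ht−1, xt] + bC)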



It’s now time to update the old cell state, Ct−1, into the new cell state Ct.
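We multiply the old state by ft, forgetting the things we decided to forget earlier, and then add it ∗ C~t, the new candidate values scaled by how much we decided to update each state value. In the referenced post's notation:

Ct = ft ∗ Ct−1 + it ∗ C~t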



c) Finally, we need to decide what we’re going to output. 

This output will be based on our cell state, but will be a filtered version. 
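First, a sigmoid layer (the output gate) decides which parts of the cell state we’re going to output; then the cell state is put through tanh (pushing the values to be between −1 and 1) and multiplied by the output of the sigmoid gate. In the referenced post's notation:

ot = σ(Wo · [ht−1, xt] + bo)
ht = ot ∗ tanh(Ct)

Putting steps a)–c) together, here is a minimal single-time-step LSTM cell in NumPy. The parameter names, sizes, and random initialization are illustrative assumptions for this sketch; in practice the weights are learned by backpropagation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following the three stages described above."""
    z = np.concatenate([h_prev, x_t])                       # [h_{t-1}, x_t]
    f = sigmoid(params["W_f"] @ z + params["b_f"])          # a) forget gate
    i = sigmoid(params["W_i"] @ z + params["b_i"])          # b) input gate
    c_tilde = np.tanh(params["W_c"] @ z + params["b_c"])    # b) candidate values
    c = f * c_prev + i * c_tilde                            # new cell state
    o = sigmoid(params["W_o"] @ z + params["b_o"])          # c) output gate
    h = o * np.tanh(c)                                      # new hidden state (filtered cell state)
    return h, c

# Toy sizes and randomly initialized parameters, purely for illustration.
input_size, hidden_size = 4, 3
rng = np.random.default_rng(0)
params = {}
for name in ("f", "i", "c", "o"):
    params[f"W_{name}"] = 0.1 * rng.standard_normal((hidden_size, hidden_size + input_size))
    params[f"b_{name}"] = np.zeros(hidden_size)

h = c = np.zeros(hidden_size)
for x_t in rng.standard_normal((5, input_size)):  # run over a toy sequence
    h, c = lstm_step(x_t, h, c, params)
```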



4. Variants on LSTM

a) Adding “peephole connections.” This means that we let the gate layers look at the cell state.
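In the referenced post's diagram, peepholes are added to every gate, so each gate also takes the cell state as an input (many papers give only some gates peepholes):

ft = σ(Wf · [Ct−1, ht−1, xt] + bf)
it = σ(Wi · [Ct−1, ht−1, xt] + bi)
ot = σ(Wo · [Ct, ht−1, xt] + bo)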



b) Using coupled forget and input gates.

Instead of separately deciding what to forget and what we should add new information to, we make those decisions together. We only forget when we’re going to input something in its place. We only input new values to the state when we forget something older.
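In the referenced post's notation, the cell-state update then becomes:

Ct = ft ∗ Ct−1 + (1 − ft) ∗ C~t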



c) Gated Recurrent Unit, or GRU

It combines the forget and input gates into a single “update gate.” It also merges the cell state and hidden state, and makes some other changes. 
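In the referenced post's notation, the GRU uses an update gate zt and a reset gate rt:

zt = σ(Wz · [ht−1, xt])
rt = σ(Wr · [ht−1, xt])
h~t = tanh(W · [rt ∗ ht−1, xt])
ht = (1 − zt) ∗ ht−1 + zt ∗ h~t

The resulting model is simpler than standard LSTM models.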



Reference: [1] http://colah.github.io/posts/2015-08-Understanding-LSTMs/
