Karpathy, Andrej, Justin Johnson, and Li Fei-Fei. "Visualizing and Understanding Recurrent Networks." arXiv preprint arXiv:1506.02078 (2015). (Citations: 79.)
1 RNN
The RNN has the form
$$h_t^l = \tanh\, W^l \begin{pmatrix} h_t^{l-1} \\ h_{t-1}^l \end{pmatrix}$$
where $W^l$ varies between layers but is shared through time, and $h_t^{l-1}$ is the input from the layer below ($h_t^0 = x_t$ at the first layer).
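A minimal NumPy sketch of this recurrence (the paper gives no code; the function name, shapes, and toy data here are illustrative assumptions):

```python
import numpy as np

def rnn_step(W, h_below, h_prev):
    """One vanilla RNN step: h_t^l = tanh(W^l [h_t^{l-1}; h_{t-1}^l]).

    W       : (n, 2n) weight matrix for this layer, shared across time
    h_below : (n,) input from the layer below, h_t^{l-1} (equals x_t at layer 1)
    h_prev  : (n,) this layer's hidden state from the previous time step
    """
    return np.tanh(W @ np.concatenate([h_below, h_prev]))

# Usage: unroll over a toy 10-step input sequence at the first layer.
n = 4
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(n, 2 * n))
h = np.zeros(n)
for x_t in rng.normal(size=(10, n)):
    h = rnn_step(W, x_t, h)
```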
It was observed that the back-propagation dynamics caused the gradients in an RNN to either vanish or explode.
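A one-line sketch of why this happens (standard backprop analysis, written here with the hypothetical single-layer notation $h_i = \tanh(W_h h_{i-1} + W_x x_i)$): the gradient between distant time steps is a product of per-step Jacobians, so its norm scales geometrically with the time lag,
$$\frac{\partial h_t}{\partial h_k} = \prod_{i=k+1}^{t} \operatorname{diag}\bigl(1 - h_i^{\,2}\bigr)\, W_h, \qquad \left\lVert \frac{\partial h_t}{\partial h_k} \right\rVert \le \bigl(\sigma_{\max}(W_h)\bigr)^{t-k},$$
which vanishes when $\sigma_{\max}(W_h) < 1$ and can explode when it is greater than 1.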
2 LSTM
The exploding-gradient concern can be alleviated with a heuristic of clipping the gradients, and LSTMs were designed to mitigate the vanishing-gradient problem. In addition to the hidden state vector $h_t^l$, LSTMs also maintain a memory cell vector $c_t^l$. At each time step the LSTM can choose to read from, write to, or reset the cell using explicit gating mechanisms.
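In the paper's notation, the gated update stacks the three sigmoid gates and the candidate update $g$ into one matrix multiply:
$$\begin{pmatrix} i \\ f \\ o \\ g \end{pmatrix} = \begin{pmatrix} \mathrm{sigm} \\ \mathrm{sigm} \\ \mathrm{sigm} \\ \tanh \end{pmatrix} W^l \begin{pmatrix} h_t^{l-1} \\ h_{t-1}^l \end{pmatrix}, \qquad c_t^l = f \odot c_{t-1}^l + i \odot g, \qquad h_t^l = o \odot \tanh\bigl(c_t^l\bigr).$$

A matching NumPy sketch of one LSTM step, plus the norm-clipping heuristic mentioned above (function names and the clipping threshold are illustrative assumptions, not from the paper):

```python
import numpy as np

def sigm(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(W, h_below, h_prev, c_prev):
    """One LSTM step with the stacked-gate parameterization above.

    W       : (4n, 2n) weights producing the stacked pre-activations [i; f; o; g]
    h_below : (n,) input from the layer below, h_t^{l-1}
    h_prev  : (n,) previous hidden state h_{t-1}^l
    c_prev  : (n,) previous memory cell c_{t-1}^l
    """
    n = h_prev.shape[0]
    z = W @ np.concatenate([h_below, h_prev])            # (4n,) pre-activations
    i, f, o = sigm(z[:n]), sigm(z[n:2*n]), sigm(z[2*n:3*n])
    g = np.tanh(z[3*n:])
    c = f * c_prev + i * g      # forget (f) old cell content, write (i * g) new
    h = o * np.tanh(c)          # read a gated view of the cell
    return h, c

def clip_gradient(grad, threshold=5.0):
    """Norm-based clipping heuristic for the exploding-gradient case."""
    norm = np.linalg.norm(grad)
    return grad if norm <= threshold else grad * (threshold / norm)
```

The design point is that the additive update $c_t^l = f \odot c_{t-1}^l + i \odot g$ lets gradients flow backwards through the cell without being squashed by a nonlinearity at every step, which is what mitigates vanishing.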