Deep Learning
Assorted theory notes. Main references: Dive into Deep Learning (https://d2l.ai/index.html) and Deep Learning (https://www.deeplearningbook.org/).
Author: 拉普拉斯的汪
Optimization Algorithms
Reference: https://d2l.ai/chapter_optimization/index.html
Contents: Review: Stochastic Gradient Descent (SGD); Momentum: Leaky Averages; Example: An Ill-conditioned Problem; The Momentum Method; Adagrad; Motivation: Sparse Features; Additional Benefits: Preconditioning; Th…
Posted 2021-09-11 05:09:36
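The momentum method in this post's outline keeps a leaky average of past gradients. A minimal NumPy sketch, applied to an ill-conditioned quadratic like the one the outline names (the function name and test problem are illustrative choices, not code from the post):

```python
import numpy as np

def sgd_momentum(params, grads, velocities, lr=0.1, beta=0.9):
    """One momentum step: v <- beta * v + g (leaky average of gradients),
    then x <- x - lr * v."""
    for p, g, v in zip(params, grads, velocities):
        v[:] = beta * v + g
        p -= lr * v
    return params, velocities

# Ill-conditioned quadratic f(x) = 0.5 * (x1^2 + 100 * x2^2): plain SGD with a
# usable step size oscillates along x2; the leaky average damps the oscillation.
x = [np.array([1.0, 1.0])]
v = [np.zeros(2)]
for _ in range(300):
    grad = [np.array([1.0, 100.0]) * x[0]]
    x, v = sgd_momentum(x, grad, v, lr=0.005, beta=0.9)
print(x[0])  # close to the minimum at the origin
```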
Attention Mechanisms
Reference: https://d2l.ai/chapter_attention-mechanisms/index.html (10.2–10.6)
Contents: Attention Pooling: An Example; Attention Scoring Functions; Additive Attention; Scaled Dot-Product Attention; Bahdanau Attention; Sequence to Sequence Learning; Incorporate Attention Mod…
Posted 2021-09-06 17:15:28
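Among the scoring functions listed, scaled dot-product attention is compact enough to sketch in NumPy (single-head, unbatched; the variable names and toy tensors are my own):

```python
import numpy as np

def scaled_dot_product_attention(queries, keys, values):
    """softmax(Q K^T / sqrt(d)) V: dot-product scores are divided by sqrt(d)
    so their variance stays O(1) as the key dimension d grows."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ values, weights

q = np.array([[1.0, 0.0]])                  # one query
k = np.array([[1.0, 0.0], [0.0, 1.0]])      # two keys; the first matches q
v = np.array([[10.0], [20.0]])
out, w = scaled_dot_product_attention(q, k, v)
```

The output is a convex combination of the value rows, weighted toward the key most similar to the query.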
Modern Recurrent Neural Networks
Reference: https://d2l.ai/chapter_recurrent-modern/index.html (9.1–9.4)
Contents: Motivation; Gated Recurrent Units (GRU); Reset Gate and Update Gate; Hidden State; Long Short-Term Memory (LSTM); Input Gate, Forget Gate, and Output Gate; Memory Cell; Hidden State; Deep Recurre…
Posted 2021-09-05 17:59:00
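The GRU gates named above combine into a single state update; a NumPy sketch of one step (the parameter-dictionary layout and sizes are illustrative, not the post's code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h, p):
    """One GRU update (d2l 9.1): the reset gate r scales the old state inside
    the candidate; the update gate z interpolates old state and candidate."""
    r = sigmoid(x @ p["Wxr"] + h @ p["Whr"] + p["br"])
    z = sigmoid(x @ p["Wxz"] + h @ p["Whz"] + p["bz"])
    h_cand = np.tanh(x @ p["Wxh"] + (r * h) @ p["Whh"] + p["bh"])
    return z * h + (1.0 - z) * h_cand

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
p = {k: rng.normal(scale=0.1, size=(n_in, n_hid)) for k in ("Wxr", "Wxz", "Wxh")}
p.update({k: rng.normal(scale=0.1, size=(n_hid, n_hid)) for k in ("Whr", "Whz", "Whh")})
p.update({k: np.zeros(n_hid) for k in ("br", "bz", "bh")})

h = np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):   # run a length-5 input sequence
    h = gru_step(x, h, p)
```

Because z ∈ (0, 1) and tanh ∈ (−1, 1), the state stays bounded; with z near 1 the old state passes through almost unchanged, which is what lets gradients survive long sequences.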
Recurrent Neural Networks
References: https://d2l.ai/chapter_recurrent-neural-networks/index.html (8.1 & 8.4); Pattern Recognition and Machine Learning 13.1–13.2
Contents: Sequence Models; Autoregressive Models; Markov Models; Latent Autoregressive Models; Hidden Markov Models; Neural Network…
Posted 2021-09-03 04:11:59
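The autoregressive models in this outline predict x_t from a fixed window of previous values. A least-squares sketch in the spirit of d2l 8.1 (the sine signal and window size tau = 4 are my own choices):

```python
import numpy as np

# Order-tau autoregressive model: regress x_t on the tau previous values.
tau = 4
x = np.sin(0.01 * np.arange(600))                                # synthetic sequence
features = np.stack([x[i:i - tau] for i in range(tau)], axis=1)  # lagged inputs
targets = x[tau:]
w, *_ = np.linalg.lstsq(features, targets, rcond=None)           # fit the AR weights
pred = features @ w
```

A sinusoid satisfies an exact two-term linear recurrence, so the fitted one-step predictions match the targets almost perfectly; on real data the residual measures how far the sequence is from being Markov of order tau.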
Vanishing and Exploding Gradients
References: https://d2l.ai/chapter_multilayer-perceptrons/numerical-stability-and-init.html; Glorot X., Bengio Y. "Understanding the difficulty of training deep feedforward neural networks," Proceedings of the Thirteenth International Conference on Artificia…
Posted 2021-08-30 03:57:20
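The Glorot & Bengio paper cited here motivates Xavier initialization, which the linked d2l section also presents as a remedy for vanishing/exploding gradients; a sketch (function name is mine):

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng):
    """Glorot/Xavier uniform init: draw W ~ U(-a, a) with
    a = sqrt(6 / (fan_in + fan_out)), so Var(W) = 2 / (fan_in + fan_out) --
    the compromise that keeps activation variance stable in the forward pass
    and gradient variance stable in the backward pass."""
    a = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-a, a, size=(fan_in, fan_out))

rng = np.random.default_rng(0)
W = xavier_uniform(512, 256, rng)
```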
Cross-Entropy Loss
References: https://en.wikipedia.org/wiki/Cross_entropy; https://d2l.ai/chapter_linear-networks/softmax-regression.html#loss-function
Definition (Cross-Entropy): The cross-entropy of the distribution q relative to a distribution p over a given set is defin…
Posted 2021-08-30 03:55:23
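The standard form of the definition truncated above is H(p, q) = −Σᵢ p(i) log q(i), which equals H(p) + KL(p‖q); a NumPy sketch (the clipping epsilon is an illustrative safeguard against log 0):

```python
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_i p_i * log(q_i): the expected code length for data
    drawn from p under a code optimized for q; minimized when q = p."""
    q = np.clip(q, eps, None)   # avoid log(0) for zero predicted probabilities
    return -np.sum(p * np.log(q))

p = np.array([1.0, 0.0, 0.0])   # one-hot label
q = np.array([0.7, 0.2, 0.1])   # softmax output
loss = cross_entropy(p, q)      # -log(0.7), the familiar classification loss
```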
Minibatch Stochastic Gradient Descent
References: https://d2l.ai/chapter_linear-networks/linear-regression.html; https://d2l.ai/chapter_linear-networks/linear-regression-scratch.html
Gradient descent: $\mathbf x \leftarrow \mathbf x - \eta\, \partial_{\mathbf x} \mathcal L(\mathbf x)$ …
Posted 2021-08-30 03:54:43
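The two linked sections train linear regression with minibatch SGD; a condensed sketch of that recipe (the synthetic parameters w = (2, −3.4), b = 4.2 follow d2l's example; learning rate and batch size are typical choices, not fixed by this post):

```python
import numpy as np

rng = np.random.default_rng(0)
true_w, true_b = np.array([2.0, -3.4]), 4.2
X = rng.normal(size=(1000, 2))
y = X @ true_w + true_b + rng.normal(scale=0.01, size=1000)  # noisy targets

w, b = np.zeros(2), 0.0
lr, batch_size = 0.03, 10
for epoch in range(5):
    order = rng.permutation(len(X))             # reshuffle each epoch
    for start in range(0, len(X), batch_size):
        batch = order[start:start + batch_size]
        err = X[batch] @ w + b - y[batch]           # minibatch residuals
        w -= lr * X[batch].T @ err / batch_size     # x <- x - eta * grad L(x)
        b -= lr * err.mean()
```

Each step applies the update rule above with the gradient estimated on a random minibatch, trading per-step accuracy for many cheap updates.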