Lecture 2 (Extra): Optimization for Deep Learning

New Optimizers for Deep Learning

What you should already know

image-20220828115340388
  • SGD
image-20220828111715973
  • SGD with momentum (SGDM)
image-20220828112833716
  • Adagrad
image-20220828114131817
  • RMSProp
image-20220828114209064
  • Adam
image-20220828115045206
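The update rules behind these five optimizers can be sketched in plain NumPy (a minimal sketch; the function names and default hyperparameters here are illustrative choices, not taken from the slides):

```python
import numpy as np

def sgd(theta, grad, lr=0.01):
    # Vanilla SGD: step directly against the gradient.
    return theta - lr * grad

def sgdm(theta, grad, v, lr=0.01, beta=0.9):
    # SGD with momentum: accumulate a velocity term across steps.
    v = beta * v - lr * grad
    return theta + v, v

def adagrad(theta, grad, g2_sum, lr=0.01, eps=1e-8):
    # Adagrad: scale each step by the root of the SUM of squared gradients.
    g2_sum = g2_sum + grad**2
    return theta - lr * grad / (np.sqrt(g2_sum) + eps), g2_sum

def rmsprop(theta, grad, v, lr=0.01, alpha=0.9, eps=1e-8):
    # RMSProp: replace Adagrad's sum with an exponential moving average.
    v = alpha * v + (1 - alpha) * grad**2
    return theta - lr * grad / (np.sqrt(v) + eps), v

def adam(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: momentum (first moment) plus RMSProp-style scaling
    # (second moment), both with bias correction; t starts at 1.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    m_hat = m / (1 - b1**t)
    v_hat = v / (1 - b2**t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```

Each function is a single parameter update; in training these would be applied once per mini-batch, threading the state (`v`, `m`, `g2_sum`, `t`) through the loop.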

Some Notations

image-20220828102119813

What is Optimization about?

image-20220828102606552

On-line vs Off-line

image-20220828102924761 image-20220828102954855

Optimizers: Real Application

image-20220828143739812 image-20220828143825424 image-20220828145006618

Adam vs SGDM

Original article

image-20220828145449421 image-20220828145827211 image-20220828150017210 image-20220828150048761 image-20220828150403372

Towards Improving Adam

Simply combine Adam with SGDM?

image-20220828151130217

Troubleshooting

How can we make Adam converge both quickly and well?

image-20220828151808265 image-20220828151855459

AMSGrad [Reddi, et al., ICLR’18]

image-20220828152127255 image-20220828152153806
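The key change AMSGrad makes to Adam is a one-line maximum over the second-moment estimate, so the effective learning rate can never grow between steps. A minimal sketch (bias correction omitted, as in the original paper; names are mine):

```python
import numpy as np

def amsgrad_step(theta, grad, m, v, v_max, lr=0.001,
                 b1=0.9, b2=0.999, eps=1e-8):
    # AMSGrad: like Adam, but divide by the running MAXIMUM of the
    # second-moment estimate, so a sudden small gradient cannot
    # suddenly inflate the step size.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    v_max = np.maximum(v_max, v)
    theta = theta - lr * m / (np.sqrt(v_max) + eps)
    return theta, m, v, v_max
```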

AMSGrad only handles large learning rates

AdaBound [Luo, et al., ICLR’19]

image-20220828152954768

Towards Improving SGDM

image-20220828153416918

LR range test [Smith, WACV’17]

image-20220828153512849

Cyclical LR [Smith, WACV’17]

image-20220828153623183
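The triangular schedule from Smith's cyclical-LR paper sweeps the learning rate linearly between a base and a maximum value with a fixed period. A minimal sketch (default values are illustrative):

```python
import math

def triangular_clr(t, base_lr=1e-4, max_lr=1e-2, step_size=2000):
    # Triangular cyclical LR: the rate rises linearly from base_lr to
    # max_lr over step_size iterations, then falls back, repeating
    # with period 2 * step_size.
    cycle = math.floor(1 + t / (2 * step_size))
    x = abs(t / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)
```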

SGDR [Loshchilov, et al., ICLR’17]

image-20220828153950767
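SGDR anneals the learning rate along a cosine curve and then warm-restarts it at the maximum. A sketch with a fixed restart period (the paper also allows the period to grow after each restart; that refinement is omitted here):

```python
import math

def sgdr_lr(t, lr_min=0.0, lr_max=0.1, T=1000):
    # SGDR: cosine-anneal from lr_max down to lr_min over T steps,
    # then restart at lr_max.
    t_cur = t % T
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t_cur / T))
```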

One-cycle LR [Smith, et al., arXiv’17]

image-20220828154126394
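One-cycle runs a single warm-up to a large learning rate followed by a long anneal to well below the starting rate. A rough sketch; the 30% warm-up fraction and linear (rather than cosine) phases are illustrative assumptions, not the paper's only configuration:

```python
def one_cycle_lr(t, total_steps=1000, max_lr=1e-2, div=25.0, final_div=1e4):
    # One-cycle policy (sketch): warm up linearly from max_lr/div to
    # max_lr over the first 30% of training, then anneal linearly down
    # to max_lr/final_div for the remainder.
    warmup = int(0.3 * total_steps)
    start_lr = max_lr / div
    end_lr = max_lr / final_div
    if t < warmup:
        return start_lr + (max_lr - start_lr) * t / warmup
    frac = (t - warmup) / (total_steps - warmup)
    return max_lr + (end_lr - max_lr) * frac
```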

Does Adam need warm-up?

image-20220830094440492 image-20220830094718417

RAdam [Liu, et al., ICLR’20]

image-20220830094906156
RAdam vs SWATS
image-20220830095437316
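RAdam's answer to the warm-up question is a variance rectification term: while the second-moment estimate is still too noisy, it falls back to a plain momentum update, and afterwards it scales the adaptive step by a rectifier that approaches 1. A sketch of that term, following my reading of the formula in Liu et al.:

```python
import math

def radam_rectifier(t, b2=0.999):
    # RAdam's variance rectification: when rho_t <= 4 the adaptive step
    # is considered too noisy, so RAdam uses a plain momentum update
    # (returned here as None); otherwise the step is scaled by r_t < 1,
    # which approaches 1 as training progresses.
    rho_inf = 2.0 / (1.0 - b2) - 1.0
    rho_t = rho_inf - 2.0 * t * b2**t / (1.0 - b2**t)
    if rho_t <= 4.0:
        return None  # warm-up regime: no adaptive step
    return math.sqrt((rho_t - 4) * (rho_t - 2) * rho_inf /
                     ((rho_inf - 4) * (rho_inf - 2) * rho_t))
```

This built-in ramp is why RAdam behaves like Adam with an automatic warm-up schedule.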

k steps forward, 1 step back

Lookahead [Zhang, et al., arXiv’19]

image-20220830100541476 image-20220830100732973
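The "k steps forward, 1 step back" idea is simple to state in code: an inner optimizer takes k fast steps, then the slow weights move a fraction of the way toward where the fast weights ended up. A minimal sketch with plain SGD as the inner optimizer (Lookahead can wrap any optimizer; the names and defaults here are illustrative):

```python
import numpy as np

def lookahead(slow, grad_fn, k=5, alpha=0.5, inner_lr=0.1):
    # Lookahead: run k fast (inner SGD) steps starting from the slow
    # weights, then interpolate the slow weights a fraction alpha
    # toward the fast result ("k steps forward, 1 step back").
    fast = slow.copy()
    for _ in range(k):
        fast = fast - inner_lr * grad_fn(fast)
    return slow + alpha * (fast - slow)
```

For example, on f(x) = x^2 (gradient 2x) starting from 1.0, the fast weights reach 0.8^5 ≈ 0.328 and the slow weights settle halfway, at about 0.664.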

More than momentum

Nesterov accelerated gradient (NAG) [Nesterov, Dokl. Akad. Nauk SSSR'83]

image-20220830101415480 image-20220830101804986
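NAG differs from plain momentum in one place only: the gradient is evaluated at the "looked-ahead" point rather than at the current parameters. A minimal sketch (names and defaults are mine):

```python
import numpy as np

def nag_step(theta, v, grad_fn, lr=0.1, beta=0.9):
    # Nesterov accelerated gradient: evaluate the gradient at the
    # looked-ahead point theta + beta * v instead of at theta itself,
    # which lets the update "correct" the momentum before applying it.
    v = beta * v - lr * grad_fn(theta + beta * v)
    return theta + v, v
```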

Adam in the future

image-20220830102655748

Do you really know your optimizer?

A story of L2 regularization

image-20220830102917903

AdamW & SGDW with momentum [Loshchilov, arXiv’17]

image-20220830103753884
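The point of AdamW is where the weight decay enters the update: not folded into the gradient (where Adam's adaptive denominator would rescale it), but applied directly to the weights. A sketch of one AdamW step (defaults are illustrative):

```python
import numpy as np

def adamw_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999,
               eps=1e-8, wd=0.01):
    # AdamW: the moment estimates see only the raw loss gradient;
    # weight decay is applied as a separate, decoupled term wd * theta,
    # so it is NOT divided by the adaptive denominator.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    m_hat = m / (1 - b1**t)
    v_hat = v / (1 - b2**t)
    theta = theta - lr * (m_hat / (np.sqrt(v_hat) + eps) + wd * theta)
    return theta, m, v
```

With plain Adam plus L2 regularization, `wd * theta` would instead be added to `grad` before the moment updates, and the decay would effectively shrink for parameters with large gradient history; decoupling removes that interaction.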

Something helps optimization

image-20220830104117026 image-20220830104135225 image-20220830104203207

Summary

image-20220830110110995 image-20220830111534114

Advice

image-20220830111802706