A CLOSER LOOK AT DEEP LEARNING HEURISTICS: LEARNING RATE RESTARTS, WARMUP AND DISTILLATION

Title: A CLOSER LOOK AT DEEP LEARNING HEURISTICS: LEARNING RATE RESTARTS, WARMUP AND DISTILLATION
ABSTRACT:

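Vocabulary: 1. heuristics: rules of thumb 2. knowledge distillation: transferring knowledge from a teacher model to a student model 3. underpinnings: foundations 4. aid: help 5. empirical: based on observation and experiment 6. linear interpolation and visualizations with dimensionality reduction: interpolating along a straight line, and plotting after reducing the number of dimensions 7. mode connectivity and canonical correlation analysis (CCA): connectivity between modes of the loss, and a statistical measure of correlation between two sets of variables 8. hypothesize: to conjecture 9. annealing: gradual reduction (here, of the learning rate) 10. viz.: namely
Paragraph: about mode connectivity and canonical correlation analysis (CCA)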

We explore knowledge distillation and learning rate heuristics of (cosine) restarts and warmup using mode connectivity and CCA.
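To make the two learning rate heuristics concrete, here is a minimal sketch of linear warmup followed by cosine annealing with restarts. It uses the commonly cited cosine form, lr = min_lr + 0.5 * (base_lr - min_lr) * (1 + cos(pi * t_cur / cycle_steps)). The function name `lr_at_step`, the parameter names, and all default values are assumptions chosen for illustration and are not taken from the paper.

```python
import math

def lr_at_step(step, base_lr=0.1, min_lr=0.0, warmup_steps=5, cycle_steps=10):
    """Linear warmup, then cosine annealing with restarts (fixed cycle length).

    All names and defaults are hypothetical, for illustration only.
    """
    if step < warmup_steps:
        # Linear warmup: ramp from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Position inside the current cosine cycle, counted from the end of warmup.
    t_cur = (step - warmup_steps) % cycle_steps
    cosine = 0.5 * (1.0 + math.cos(math.pi * t_cur / cycle_steps))
    return min_lr + (base_lr - min_lr) * cosine

# Learning rates for the first 25 steps: 5 warmup steps, then two cosine cycles.
schedule = [lr_at_step(s) for s in range(25)]
```

In this sketch the rate jumps back to `base_lr` at the start of each cycle (the "restart"). Using a fixed cycle length is a simplification; some variants lengthen the cycle after each restart.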

1 INTRODUCTION

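Vocabulary: 1. commonplace: ordinary, widespread 2. buttressed: supported 3. intuitive: based on intuition 4. ingredient: component 5. step-decay: decreasing in discrete steps 6. mimic: imitate 7. piecewise: defined in segments 8. invariances: properties unchanged under a transformation 9. permutation and scaling: reordering and rescaling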
 

Phrases: 1. out of the need: arising from necessity

2 EMPIRICAL TOOLS

 

Formulas and proofs ......
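As a rough illustration of linear interpolation, one of the tools listed in the abstract vocabulary, the sketch below evaluates a loss along the straight segment theta(alpha) = (1 - alpha) * theta_a + alpha * theta_b between two parameter vectors. The loss `toy_loss` is a made-up two-dimensional function with two minima, and every name here is hypothetical. It is a generic example, not a formula from the paper.

```python
def interpolate(theta_a, theta_b, alpha):
    """Point on the straight line between theta_a (alpha=0) and theta_b (alpha=1)."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(theta_a, theta_b)]

def toy_loss(theta):
    # Hypothetical loss with two minima, at x = -1 and x = +1 (with y = 0).
    x, y = theta
    return (x ** 2 - 1.0) ** 2 + y ** 2

theta_a = [-1.0, 0.0]  # first minimum
theta_b = [1.0, 0.0]   # second minimum

alphas = [i / 10 for i in range(11)]
path_losses = [toy_loss(interpolate(theta_a, theta_b, a)) for a in alphas]

# Height of the loss barrier on the straight path, relative to the worse endpoint.
barrier = max(path_losses) - max(path_losses[0], path_losses[-1])
```

Here the straight path crosses a bump between the two minima, so `barrier` is positive. Mode connectivity asks a different question: whether some curved path between the two solutions keeps the loss low throughout.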

RESULTS:

 

Our empirical analysis sheds light on these heuristics and suggests that: (a) the reasons often quoted for the success of cosine annealing are not evidenced in practice; (b) that the effect of learning rate warmup is to prevent the deeper layers from creating training instability; and (c) that the latent knowledge shared by the teacher is primarily disbursed in the deeper layers.
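For reference on point (c), this is a minimal sketch of a knowledge distillation loss in its commonly used form: a hard-label cross-entropy term mixed with a temperature-softened KL term between teacher and student outputs. Whether the paper uses exactly this weighting is an assumption. The names `distillation_loss`, `temperature`, and `alpha`, and their default values, are chosen only for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """alpha * cross-entropy(label) + (1 - alpha) * T^2 * KL(teacher || student).

    The mixing weight and temperature are hypothetical defaults.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Standard cross-entropy of the student against the true label (T = 1).
    hard = -math.log(softmax(student_logits)[label])
    return alpha * hard + (1 - alpha) * temperature ** 2 * kl

loss_same = distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1], label=0, alpha=0.0)
loss_diff = distillation_loss([0.1, 1.0, 2.0], [2.0, 1.0, 0.1], label=0, alpha=0.0)
```

With `alpha=0.0` only the teacher term remains, so the loss vanishes when the student logits match the teacher's and grows as they diverge.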



 
