Title: ADAPTIVE GRADIENT METHODS WITH DYNAMIC BOUND OF LEARNING RATE
Abstract:
Vocabulary: 1. element-wise scaling term (a term that rescales each coordinate of the gradient separately) 2. gradual and smooth transition 3. prototypes 4. non-adaptive counterparts (the corresponding non-adaptive methods, e.g. SGD, not "peers") 5. portion (a part) 6. plateaus (flat regions where training progress stalls)
Phrase: in spite of its simplicity (despite being simple)
1 INTRODUCTION
Vocabulary: 7. state-of-the-art 8. wherein (in which) 9. instances 10. dominant (prevailing, most influential) 11. scales the gradient uniformly (applies the same scaling to every coordinate) 12. sparse 13. empirical (based on observation and experiment) 14. abate (to lessen) 15. elucidate (to explain clearly) 16. scale-down term 17. constant (unchanging) 18. prototypes
Methods: named AdaBound and AMSBound
We employ dynamic bounds on learning rates in these adaptive methods, where the lower and upper bounds are initialized as zero and infinity respectively, and both smoothly converge to a constant final step size. The new variants can be regarded as adaptive methods at the beginning of training, and they gradually and smoothly transform into SGD (or SGD with momentum) as the time step increases.
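The bounded update described above can be sketched roughly as follows. This is a minimal NumPy sketch, not the paper's exact formulation: the particular bound schedules and the `gamma` convergence-speed parameter here are illustrative choices, but they follow the stated behavior (lower bound grows from 0, upper bound shrinks from infinity, both converging to a final step size).

```python
import numpy as np

def adabound_step(params, grads, state, lr=1e-3, final_lr=0.1,
                  betas=(0.9, 0.999), gamma=1e-3, eps=1e-8):
    """One AdaBound-style parameter update (illustrative sketch).

    The lower bound starts at 0 and the upper bound starts (effectively)
    at infinity; both converge to final_lr as the time step grows, so the
    update behaves like Adam early on and like SGD with momentum later.
    """
    b1, b2 = betas
    state["t"] += 1
    t = state["t"]
    # Adam-style first and second moment estimates
    state["m"] = b1 * state["m"] + (1 - b1) * grads
    state["v"] = b2 * state["v"] + (1 - b2) * grads ** 2
    # Dynamic bounds; the schedule and gamma are illustrative choices here
    lower = final_lr * (1.0 - 1.0 / (gamma * t + 1))  # 0 -> final_lr
    upper = final_lr * (1.0 + 1.0 / (gamma * t))      # large -> final_lr
    # Bias-corrected per-element step size, clipped into [lower, upper]
    step = lr * np.sqrt(1 - b2 ** t) / (1 - b1 ** t)
    step = np.clip(step / (np.sqrt(state["v"]) + eps), lower, upper)
    return params - step * state["m"]
```

Running this on a simple quadratic (gradient `2 * x`) drives the parameter toward zero, with the clipped per-element step size starting Adam-like and tightening toward `final_lr` as training proceeds.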
2 NOTATIONS AND PRELIMINARIES (brief notes only)
Vocabulary: 1. coordinate 2. elementwise (element by element)
EXPERIMENT ON CNN: using DenseNet-121 and ResNet-34 on the CIFAR-10 dataset