Learning rate annealing
- "learning_rate": the base learning rate.
- "learning_rate_decay_a" and "learning_rate_decay_b": learning rate decay parameters. The exact decay formula is determined by learning_rate_schedule.
- "learning_rate_schedule": selects the learning rate decay mode. The supported modes are:
  - "constant": lr = learning_rate
  - "poly": lr = learning_rate * pow(1 + learning_rate_decay_a * num_samples_processed, -learning_rate_decay_b)
  - "exp": lr = learning_rate * pow(learning_rate_decay_a, num_samples_processed / learning_rate_decay_b)
  - "discexp": lr = learning_rate * pow(learning_rate_decay_a, floor(num_samples_processed / learning_rate_decay_b))
  - "linear": lr = max(learning_rate - learning_rate_decay_a * num_samples_processed, learning_rate_decay_b)
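
The five formulas above can be sketched in plain Python to see how each schedule behaves. This is only an illustration of the formulas as written, not the framework's own implementation: the function name `scheduled_lr` and its argument names `decay_a` and `decay_b` are made up for this example and stand in for learning_rate_decay_a and learning_rate_decay_b.

```python
import math


def scheduled_lr(schedule, learning_rate, decay_a, decay_b, num_samples_processed):
    """Evaluate one of the listed schedules (hypothetical helper, not a library API).

    decay_a and decay_b stand in for learning_rate_decay_a and
    learning_rate_decay_b from the list above.
    """
    n = num_samples_processed
    if schedule == "constant":
        # The rate never changes.
        return learning_rate
    if schedule == "poly":
        # Polynomial decay in the number of processed samples.
        return learning_rate * math.pow(1 + decay_a * n, -decay_b)
    if schedule == "exp":
        # Smooth exponential decay: multiplied by decay_a every decay_b samples.
        return learning_rate * math.pow(decay_a, n / decay_b)
    if schedule == "discexp":
        # Stepwise exponential decay: the exponent only changes every decay_b samples.
        return learning_rate * math.pow(decay_a, math.floor(n / decay_b))
    if schedule == "linear":
        # Linear decay, clipped from below at decay_b.
        return max(learning_rate - decay_a * n, decay_b)
    raise ValueError("unknown learning_rate_schedule: %r" % (schedule,))


if __name__ == "__main__":
    # Compare the schedules at a few sample counts with illustrative settings.
    for name, a, b in [("constant", 0.0, 0.0), ("poly", 1e-3, 1.0),
                       ("exp", 0.5, 1000.0), ("discexp", 0.5, 1000.0),
                       ("linear", 1e-5, 0.01)]:
        rates = [scheduled_lr(name, 0.1, a, b, n) for n in (0, 1000, 2500, 20000)]
        print(name, rates)
```

Note that the two parameters mean different things in each mode. In "exp" and "discexp", learning_rate_decay_a is a multiplicative factor and learning_rate_decay_b is a sample count, so with a base rate of 0.1, a factor of 0.5 and a period of 1000 samples, "discexp" halves the rate every 1000 samples and gives 0.025 after 2500 samples. In "linear", learning_rate_decay_a is a per-sample decrement and learning_rate_decay_b is the minimum rate.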