Deep Learning: Optimization for Training Deep Models (Part 0)

Of all of the many optimization problems involved in deep learning, the most difficult is neural network training.
It is quite common to invest days to months of time on hundreds of machines in order to solve even a single instance of the neural network training problem.
Because this problem is so important and so expensive, a specialized set of optimization techniques has been developed for solving it. This chapter presents these optimization techniques for neural network training.
This chapter focuses on one particular case of optimization: finding the parameters θ of a neural network that significantly reduce a cost function J(θ), which typically includes a performance measure evaluated on the entire training set as well as additional regularization terms.
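
For concreteness, such a cost function is typically written in the following standard form (the notation is a sketch assumed here, not fixed by this post: L is a per-example loss, f(x; θ) the network's prediction, p̂_data the empirical distribution over the training set, and Ω a regularizer weighted by λ):

$$
J(\theta) = \mathbb{E}_{(x, y) \sim \hat{p}_{\text{data}}} \, L\big(f(x; \theta),\, y\big) \;+\; \lambda \, \Omega(\theta)
$$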

  • We begin with a description of how optimization used as a training algorithm for a machine learning task differs from pure optimization.
  • Next, we present several of the concrete challenges that make optimization of neural networks difficult.
  • We then define several practical algorithms, including both optimization algorithms themselves and strategies for initializing the parameters. More advanced algorithms adapt their learning rates during training or leverage information contained in the second derivatives of the cost function (a minimal sketch of one such procedure follows this list).
  • Finally, we conclude with a review of several optimization strategies that are formed by combining simple optimization algorithms into higher-level procedures.
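
As a minimal illustration of the kind of procedure the bullet points above refer to, the sketch below runs plain batch gradient descent with a simple 1/t learning-rate decay on a toy regularized least-squares cost. Everything here, including the cost, the decay schedule, and the names cost_and_grad and gradient_descent, is an illustrative assumption, not an algorithm taken from the chapter.

```python
import numpy as np

# Minimal sketch: batch gradient descent on a toy regularized least-squares cost.
# The data, the decay schedule, and every name here are illustrative assumptions.

def cost_and_grad(theta, X, y, lam=0.1):
    # J(theta) = mean squared error on (X, y) plus an L2 penalty lam * ||theta||^2 / 2
    residual = X @ theta - y
    cost = 0.5 * np.mean(residual ** 2) + 0.5 * lam * np.sum(theta ** 2)
    grad = X.T @ residual / len(y) + lam * theta
    return cost, grad

def gradient_descent(X, y, steps=200, lr0=0.1, decay=0.01):
    theta = np.zeros(X.shape[1])
    for t in range(steps):
        lr = lr0 / (1.0 + decay * t)          # simple 1/t learning-rate decay
        _, grad = cost_and_grad(theta, X, y)
        theta -= lr * grad                    # step against the gradient
    return theta

# Usage: fit a noisy linear relationship y ≈ X w.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)
print(gradient_descent(X, y))  # roughly recovers true_w, shrunk slightly by the L2 penalty
```

Replacing the fixed decay schedule with a per-parameter adaptive rule, or exploiting curvature (second-derivative) information, is exactly the kind of refinement the later parts of the chapter discuss.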