Be systematic.
Keep a log of every architecture you've tried, what the hyperparameters (layer sizes, learning rate, etc.) were, and what the resulting performance was. As you try more things, you can start seeing patterns about which parameters matter. If you find a bug in your code, be sure to cross out past results that are invalid due to the bug.
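If you want something lighter than a spreadsheet, a plain CSV log is enough. This is a minimal sketch; the file name, the FIELDS columns, and the log_trial helper are all illustrative choices, not part of the project code.

    import csv
    import os

    FIELDS = ["layer_sizes", "learning_rate", "batch_size", "val_accuracy", "notes"]

    def log_trial(path, trial):
        """Append one trial (a dict with the keys in FIELDS) to a CSV log."""
        new_file = not os.path.exists(path)
        with open(path, "a", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=FIELDS)
            if new_file:
                writer.writeheader()   # write the header only once
            writer.writerow(trial)

    log_trial("experiments.csv", {
        "layer_sizes": "200", "learning_rate": 0.05, "batch_size": 100,
        "val_accuracy": 0.972, "notes": "baseline, one hidden layer",
    })

Results invalidated by a bug can get a note such as "buggy loss function" rather than being deleted, so the history stays intact.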
Start with a shallow network (just two layers, i.e. one non-linearity).
Deeper networks have exponentially more hyperparameter combinations, and getting even a single one wrong can ruin your performance. Use the small network to find a good learning rate and layer size; afterwards you can consider adding more layers of similar size.
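For reference, "two layers" here means two weight matrices with a single non-linearity between them. Below is a hedged PyTorch sketch of that shape; the project uses its own neural-network module, so only the structure carries over, and the sizes 784, 200, and 10 are example values.

    import torch
    import torch.nn as nn

    class ShallowNet(nn.Module):
        def __init__(self, input_size, hidden_size, output_size):
            super().__init__()
            self.layer1 = nn.Linear(input_size, hidden_size)   # first weight matrix
            self.layer2 = nn.Linear(hidden_size, output_size)  # second weight matrix

        def forward(self, x):
            return self.layer2(torch.relu(self.layer1(x)))     # one non-linearity

    model = ShallowNet(input_size=784, hidden_size=200, output_size=10)
    print(model(torch.zeros(1, 784)).shape)   # torch.Size([1, 10])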
If your learning rate is wrong, none of your other hyperparameter choices matter.
You can take a state-of-the-art model from a research paper and change the learning rate so that it performs no better than random.
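A sensible first step is a coarse sweep over learning rates with everything else held fixed. In this sketch, train_and_evaluate is a hypothetical stand-in for your own training loop that returns validation accuracy:

    def sweep_learning_rates(train_and_evaluate):
        results = {}
        for lr in [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0]:   # roughly log-spaced
            results[lr] = train_and_evaluate(learning_rate=lr)
        best_lr = max(results, key=results.get)
        print(f"best learning rate: {best_lr} (accuracy {results[best_lr]:.3f})")
        return best_lr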
Smaller batches require lower learning rates.
When experimenting with different batch sizes, be aware that the best learning rate may be different depending on the batch size.
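One common heuristic (an assumption here, not a project requirement) is to scale the learning rate roughly in proportion to the batch size, then fine-tune from that starting point:

    def scaled_learning_rate(base_lr, base_batch_size, new_batch_size):
        # e.g. if lr = 0.1 worked with batches of 100, start near 0.025 for batches of 25
        return base_lr * (new_batch_size / base_batch_size)

    print(scaled_learning_rate(0.1, 100, 25))   # 0.025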
Making the network too wide generally doesn't hurt accuracy too much.
If you keep making the network wider, accuracy will gradually decline, but computation time will increase quadratically in the layer size; you're likely to give up due to excessive slowness long before the accuracy falls too much. The full autograder for all parts of the project takes 2-12 minutes to run with staff solutions; if your code is taking much longer, you should check it for efficiency.
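The quadratic growth comes from the weight matrices: a layer whose input and output both have the hidden size holds hidden_size * hidden_size weights, so doubling the width roughly quadruples the work for that layer. A quick illustration (the helper name is made up):

    def hidden_layer_params(hidden_size):
        return hidden_size * hidden_size + hidden_size   # weights + biases

    for h in [100, 200, 400]:
        print(h, hidden_layer_params(h))
    # 100 -> 10100, 200 -> 40200, 400 -> 160400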
If your model is returning Infinity or NaN, your learning rate is probably too high for your current architecture.
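A small guard like the following (check_loss is a hypothetical helper, not part of the provided code) catches this early so you can stop the run and retry with a lower learning rate:

    import math

    def check_loss(loss_value, learning_rate):
        if math.isnan(loss_value) or math.isinf(loss_value):
            raise ValueError(
                f"loss diverged ({loss_value}); retry with a learning rate below {learning_rate}"
            )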
Recommended values for your hyperparameters (a sketch for sampling them at random follows this list):
Hidden layer sizes: between 10 and 400
Batch size: between 1 and the size of the dataset. For Q2 and Q3, we require that the total size of the dataset be evenly divisible by the batch size.
Learning rate: between 0.001 and 1.0
Number of hidden layers: between 1 and 3
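If you'd rather not pick values by hand, one option is to sample them at random from the ranges above. This is only a sketch; dataset_size and the helper names are illustrative, and divisors() enforces the Q2/Q3 requirement that the batch size evenly divide the dataset:

    import random

    def divisors(n):
        return [d for d in range(1, n + 1) if n % d == 0]

    def sample_hyperparameters(dataset_size):
        return {
            "hidden_size": random.randint(10, 400),
            "num_hidden_layers": random.randint(1, 3),
            "batch_size": random.choice(divisors(dataset_size)),
            # sample the learning rate on a log scale between 0.001 and 1.0
            "learning_rate": 10 ** random.uniform(-3, 0),
        }

    print(sample_hyperparameters(dataset_size=1000))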