Problems Encountered by Deep Learning Models and Tips for Solving Them (Hung-yi Lee Machine Learning: Tips for Deep Learning)

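When training deep learning models we run into many problems, and not all of them are overfitting.
Consider the following classic example:
[Figure: error of a 20-layer and a 50-layer network on the training set and on the test set as the number of iterations increases]
In this example, as the number of iterations increases, the 50-layer network performs worse than the 20-layer network on the training set itself, and it also performs worse on the test set. We cannot call this overfitting just by looking at the test-set results; we also have to look at the training set. Deep learning therefore presents many different problems, and different problems call for different remedies. The main remedies currently in use are: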

  1. New activation function
  2. Adaptive learning rate
  3. Early stopping
  4. Regularization
  5. Dropout
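The diagnosis described above (look at the training set first, and only then at the test set) can be sketched as a small helper. This is a minimal illustration, not something from the lecture: the function name `diagnose`, both thresholds, and the error values are hypothetical.

```python
def diagnose(train_error, test_error, target_error=0.05, gap_tolerance=0.02):
    """Classify a training run from its final training and test errors.

    Both thresholds are illustrative assumptions, not values from the lecture.
    """
    if train_error > target_error:
        # Poor results on the training set itself: the network was not
        # trained well, so this is not overfitting.
        return "training problem"
    if test_error - train_error > gap_tolerance:
        # Good on the training set but clearly worse on the test set.
        return "overfitting"
    return "ok"


# Hypothetical error values in the spirit of the 20-layer vs 50-layer example.
print(diagnose(train_error=0.12, test_error=0.15))  # bad on both sets
print(diagnose(train_error=0.01, test_error=0.10))  # good on train, bad on test
```

A bad test error alone does not tell the two cases apart, which is why the training error is checked first.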

Part 1: New activation function

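Problem: vanishing gradient
First, what the vanishing gradient problem looks like. During training we find that the layers at the left end of the network (near the input) have small gradients, so they learn slowly and stay close to their random initial values. The layers at the right end (near the output) have large gradients, learn quickly, and are already close to convergence. This convergence is not a good sign: the later layers take the earlier layers' outputs as their inputs, so if the later layers converge while those outputs are still close to random, they have not learned much useful information.
[Figure: a deep network with small gradients in the layers near the input and large gradients in the layers near the output]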
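Why does this happen? Suppose we add a fairly large change $\Delta w$ to a weight near the input. A sigmoid squashes its input into the range (0, 1), so after one sigmoid layer the effect of the change is smaller, after the next layer it is smaller still, and so on layer by layer up to the output. The resulting change in the loss, $\Delta l$, is therefore small even though $\Delta w$ was large, which makes the gradient $\frac{\partial l}{\partial w} \approx \frac{\Delta l}{\Delta w}$ small for the early layers. The problem lies in the sigmoid activation function, so we need to change it.
[Figure: a change $\Delta w$ near the input shrinking as it passes through successive sigmoid layers]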
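The attenuation argument above can be checked numerically on a toy network. The following is a minimal sketch under assumptions of my own: a chain of one-unit sigmoid layers with no biases, all weights set to 1.0, and the network output standing in for the loss. The function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # never larger than 0.25


def weight_gradients(weights, x):
    """Return d(output)/d(w_k) for each layer k of a chain of one-unit sigmoid layers."""
    # Forward pass: remember each layer's input and pre-activation.
    inputs, pre_acts = [], []
    a = x
    for w in weights:
        inputs.append(a)
        z = w * a
        pre_acts.append(z)
        a = sigmoid(z)
    # Backward pass: chain rule from the output back to the first layer.
    grads = [0.0] * len(weights)
    upstream = 1.0  # d(output)/d(activation of the current layer)
    for k in reversed(range(len(weights))):
        local = sigmoid_grad(pre_acts[k])
        grads[k] = upstream * local * inputs[k]
        upstream *= local * weights[k]
    return grads


# Toy setting (an assumption): six layers, every weight equal to 1.0.
for layer, g in enumerate(weight_gradients([1.0] * 6, x=1.0), start=1):
    print(f"layer {layer}: gradient = {g:.2e}")
```

Each layer multiplies the upstream gradient by the sigmoid slope, which is at most 0.25, so in this toy setting the gradient for the first layer comes out orders of magnitude smaller than the gradient for the last layer.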
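We therefore replace the sigmoid with the ReLU function. Why ReLU? It is cheap to compute, it can be viewed as a superposition of infinitely many sigmoids with different biases, and, most importantly, it helps avoid the vanishing gradient problem.
[Figure: the ReLU activation function]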
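To see why ReLU helps, one can compare how a gradient survives a stack of layers under each activation. This is a minimal sketch under assumed conditions: a chain of one-unit layers without biases, weight 1.0, and a positive input so that every ReLU unit is active. The helper `chain_gradient` is hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)


def relu(z):
    return max(0.0, z)


def relu_grad(z):
    # Slope 1 for positive input, 0 otherwise.
    return 1.0 if z > 0 else 0.0


def chain_gradient(act, act_grad, depth, w=1.0, x=1.0):
    """d(output)/d(input) for a chain of `depth` one-unit layers a = act(w * a_prev)."""
    grad, a = 1.0, x
    for _ in range(depth):
        z = w * a
        grad *= w * act_grad(z)  # chain rule: one factor per layer
        a = act(z)
    return grad


print("sigmoid:", chain_gradient(sigmoid, sigmoid_grad, depth=10))
print("relu:   ", chain_gradient(relu, relu_grad, depth=10))
```

With these settings the ReLU chain passes the gradient through unchanged, while the sigmoid chain shrinks it by a factor of at most 0.25 per layer. A ReLU unit with negative input outputs 0 and passes no gradient at all, so the comparison only holds for the active units.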
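ReLU is a special case of Maxout. As its name suggests, Maxout passes on the largest value. Its most notable feature is that it makes the activation function learnable: the activation function keeps changing as training proceeds. Maxout has two steps. The first is grouping, which we specify in advance. The second is comparison: the largest value in each group is selected and passed to the next layer.
[Figure: a Maxout network, with values grouped and the maximum of each group passed on]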
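The two steps (grouping, then taking the maximum of each group) can be sketched as follows. The function name `maxout` and the `group_size` parameter are hypothetical; the input values 5, 7, -1, 1 are the ones used in the lecture's example.

```python
def maxout(values, group_size):
    """Split `values` into consecutive groups and keep the maximum of each group."""
    if len(values) % group_size != 0:
        raise ValueError("number of values must be a multiple of group_size")
    return [
        max(values[i:i + group_size])
        for i in range(0, len(values), group_size)
    ]


# Groups (5, 7) and (-1, 1): the larger element of each group is passed on.
print(maxout([5, 7, -1, 1], group_size=2))  # [7, 1]
```

In a real Maxout layer the grouped values are themselves linear functions of the input with trainable weights, which is what makes the resulting activation function learnable.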
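As shown in the figure above, we put (5, 7) in one group and (-1, 1) in another, then select the larger value of each group, 7 and 1, to pass on to the next layer. We said earlier that ReLU is a special case of Maxout; here is why.
[Figure: ReLU expressed as a Maxout unit]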
We set the other two weights to 0, so the values of the two nodes can be drawn as two straight lines: $z_{1} = wx + b$
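The claim that ReLU is a special case of Maxout can be checked numerically. This is a minimal sketch under one assumption that the text above does not spell out: setting the other two weights to 0 is taken to mean that the second node is the constant line $z_{2} = 0$. The function `maxout_unit` and the values of `w` and `b` are hypothetical.

```python
def relu(z):
    return max(0.0, z)


def maxout_unit(x, w1, b1, w2, b2):
    """A two-element Maxout group: the larger of two linear functions of x."""
    z1 = w1 * x + b1
    z2 = w2 * x + b2  # assumed to be the constant line z2 = 0 when w2 = b2 = 0
    return max(z1, z2)


w, b = 2.0, -1.0  # hypothetical weight and bias for z1 = w*x + b
for x in (-2.0, -0.5, 0.0, 0.5, 1.0, 3.0):
    print(x, maxout_unit(x, w, b, 0.0, 0.0), relu(w * x + b))
```

With the second pair fixed at zero, the Maxout output equals `relu(w * x + b)` for every input tried. If the second pair is trained instead of fixed, the unit can take other piecewise-linear shapes.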

