Polynomial Curve Fitting (Part 3)

For the moment, however, it is instructive to continue with the current approach and to consider how in practice we can apply it to data sets of limited size where we may wish to use relatively complex and flexible models. One technique that is often used to control the over-fitting phenomenon in such cases is that of regularization, which involves adding a penalty term to the error function (1.2) in order to discourage the coefficients from reaching large values. The simplest such penalty term takes the form of a sum of squares of all of the coefficients, leading to a modified error function of the form


$$\tilde{E}(\mathbf{w}) = \frac{1}{2}\sum_{n=1}^{N} \{y(x_n, \mathbf{w}) - t_n\}^2 + \frac{\lambda}{2}\|\mathbf{w}\|^2 \tag{1.4}$$

where $\|\mathbf{w}\|^2 \equiv \mathbf{w}^{\mathrm{T}}\mathbf{w} = w_0^2 + w_1^2 + \cdots + w_M^2$, and the coefficient $\lambda$ governs the relative importance of the regularization term compared with the sum-of-squares error term. Note that often the coefficient $w_0$ is omitted from the regularizer because its inclusion causes the results to depend on the choice of origin for the target variable (Hastie et al., 2001), or it may be included but with its own regularization coefficient (we shall discuss this topic in more detail in Section 5.5.1). Again, the error function in (1.4) can be minimized exactly in closed form (Exercise 1.2). Techniques such as this are known in the statistics literature as shrinkage methods because they reduce the value of the coefficients. The particular case of a quadratic regularizer is called ridge regression (Hoerl and Kennard, 1970). In the context of neural networks, this approach is known as weight decay.
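Because (1.4) is quadratic in the coefficients, setting its gradient to zero yields a linear system that can be solved directly: $\mathbf{w} = (\lambda \mathbf{I} + \boldsymbol{\Phi}^{\mathrm{T}}\boldsymbol{\Phi})^{-1}\boldsymbol{\Phi}^{\mathrm{T}}\mathbf{t}$, where the rows of $\boldsymbol{\Phi}$ are the powers $(1, x_n, \ldots, x_n^M)$. A minimal sketch in Python/NumPy; the function name is illustrative, not from the text:

```python
import numpy as np

def fit_polynomial_ridge(x, t, M, lam):
    """Minimize the regularized error (1.4) in closed form:
    w = (lam * I + Phi^T Phi)^{-1} Phi^T t."""
    Phi = np.vander(x, M + 1, increasing=True)  # columns: 1, x, x^2, ..., x^M
    A = lam * np.eye(M + 1) + Phi.T @ Phi
    # This version penalizes w_0 along with the other coefficients; as noted
    # above, w_0 is often left out of the regularizer in practice.
    return np.linalg.solve(A, Phi.T @ t)
```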


Figure 1.7 shows the results of fitting the polynomial of order M = 9 to the same data set as before but now using the regularized error function given by (1.4). We see that, for a value of ln λ = −18, the over-fitting has been suppressed and we now obtain a much closer representation of the underlying function sin(2πx). If, however, we use too large a value for λ then we again obtain a poor fit, as shown in Figure 1.7 for ln λ = 0. The corresponding coefficients from the fitted polynomials are given in Table 1.2, showing that regularization has the desired effect of reducing the magnitude of the coefficients. The impact of the regularization term on the generalization error can be seen by plotting the value of the RMS error (1.3) for both training and test sets against ln λ, as shown in Figure 1.8. We see that in effect λ now controls the effective complexity of the model and hence determines the degree of over-fitting.

[Figure 1.7: the M = 9 polynomial fitted using the regularized error function (1.4), shown for ln λ = −18 and ln λ = 0.]


[Figure 1.8: training-set and test-set RMS error (1.3) plotted against ln λ for the M = 9 polynomial.]
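The sweep behind Figure 1.8 is easy to emulate. A hedged sketch, reusing fit_polynomial_ridge from the previous snippet; the data-generation details (10 training points, Gaussian noise with standard deviation 0.3, the random seed) are my assumptions, not values taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
M = 9
x_train = np.linspace(0, 1, 10)
t_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.3, x_train.size)
x_test = np.linspace(0, 1, 100)
t_test = np.sin(2 * np.pi * x_test) + rng.normal(0, 0.3, x_test.size)

def rms_error(w, x, t):
    """Root-mean-square error, as in (1.3), computed without the penalty term."""
    Phi = np.vander(x, M + 1, increasing=True)
    return np.sqrt(np.mean((Phi @ w - t) ** 2))

for ln_lam in (-18, -10, -5, 0):
    w = fit_polynomial_ridge(x_train, t_train, M, np.exp(ln_lam))
    print(f"ln lambda = {ln_lam:4d}: "
          f"train E_RMS = {rms_error(w, x_train, t_train):.3f}, "
          f"test E_RMS = {rms_error(w, x_test, t_test):.3f}")
```

As λ grows, the training error rises monotonically while the test error typically falls and then rises again, which is the qualitative shape of Figure 1.8.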

The issue of model complexity is an important one and will be discussed at length in Section 1.3. Here we simply note that, if we were trying to solve a practical application using this approach of minimizing an error function, we would have to find a way to determine a suitable value for the model complexity. The results above suggest a simple way of achieving this, namely by taking the available data and partitioning it into a training set, used to determine the coefficients w, and a separate validation set, also called a hold-out set, used to optimize the model complexity (either M or λ). In many cases, however, this will prove to be too wasteful of valuable training data, and we have to seek more sophisticated approaches.
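A minimal sketch of that hold-out procedure, again reusing the helpers above with M fixed at 9 and selecting λ only (M could be selected the same way); the 70/30 split and the candidate grid for ln λ are illustrative assumptions:

```python
import numpy as np

def choose_ln_lambda(x, t, ln_lams, train_frac=0.7, seed=1):
    """Pick the ln(lambda) with the lowest RMS error on a held-out set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(x.size)
    n_train = int(train_frac * x.size)
    tr, va = idx[:n_train], idx[n_train:]
    scores = []
    for ln_lam in ln_lams:
        w = fit_polynomial_ridge(x[tr], t[tr], M, np.exp(ln_lam))
        scores.append((rms_error(w, x[va], t[va]), ln_lam))
    return min(scores)[1]  # ln(lambda) attaining the lowest validation RMS

# e.g. best = choose_ln_lambda(x_train, t_train, ln_lams=range(-20, 1, 2))
```

Note that the coefficients are fitted only on the training portion; the held-out portion is touched only to score candidate values of λ, which is exactly why this scheme wastes data when the sample is small.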



So far our discussion of polynomial curve fitting has appealed largely to intuition. We now seek a more principled approach to solving problems in pattern recognition by turning to a discussion of probability theory. As well as providing the foundation for nearly all of the subsequent developments in this book, it will also give us some important insights into the concepts we have introduced in the context of polynomial curve fitting and will allow us to extend these to more complex situations.

