Regularization Technique

What is a Regularization Technique? It is a technique mainly used to overcome over-fitting during model fitting, by adding a penalty as the model’s complexity increases. The regularization parameter λ penalizes all the regression parameters except the intercept, so that the model generalizes from the data and avoids over-fitting (i.e., it helps keep the parameters regular or normal). This makes the fit generalize better to unseen data.

Over-fitting means that while training on the training data, the model reads every observation, learns from all of it, and becomes too complex; when the same model is validated on the testing data, the fit becomes much worse.

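To make this concrete, here is a minimal sketch (assuming NumPy and scikit-learn are available; the sine data and the degree-15 polynomial are illustrative choices, not from the original article) where a very flexible model fits the training data almost perfectly but does much worse on held-out test data:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)
X = rng.uniform(0, 1, size=(30, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.2, size=30)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A very flexible degree-15 polynomial: it can track every training point.
model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
model.fit(X_train, y_train)

print("train MSE:", mean_squared_error(y_train, model.predict(X_train)))
print("test  MSE:", mean_squared_error(y_test, model.predict(X_test)))  # typically much larger
```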

What does the Regularization Technique do? The basic concept is that we do not want huge weights on the regression coefficients. The simple regression equation is y = β0 + β1x, where y is the response variable (dependent or target variable), x is the feature variable (independent variable), and the β’s are the regression coefficients (unknown parameters). A small change in the weight of a parameter makes a large difference in the target variable, so regularization ensures that not too much weight is added: no feature is given too much weight, and the least significant features are given zero weight.

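As a toy illustration of this equation (the numbers below are invented), NumPy can fit y = β0 + β1x by least squares, and a small change in the weight β1 visibly shifts every prediction:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

# Least-squares fit of y = b0 + b1 * x; polyfit returns [b1, b0].
b1, b0 = np.polyfit(x, y, deg=1)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")

# A small change in the weight b1 changes the prediction at every x.
print("prediction at x=5:", b0 + b1 * 5)
print("with b1 + 0.1    :", b0 + (b1 + 0.1) * 5)
```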

Working of Regularization: regularization adds the penalty for the higher terms, which decreases the importance given to those terms and brings the model towards lower complexity. Regularization equation:

Min(Σ(yi-βi*xi)² + λ/2 * Σ (|βi|)^p )

where p = 1, 2, … and i = 1, …, n. The most popular choices of p are 1 or 2. Feature selection is thus performed by the regularization.

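The objective above can be written directly as a function of the coefficient vector. A minimal NumPy sketch (function and variable names are my own, and the least-squares term is read as a standard sum over observations) covering p = 1 and p = 2:

```python
import numpy as np

def regularized_loss(beta, X, y, lam, p):
    """Sum of squared residuals plus (lam / 2) * sum(|beta_i| ** p)."""
    residuals = y - X @ beta
    return np.sum(residuals ** 2) + (lam / 2) * np.sum(np.abs(beta) ** p)

X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0]])
y = np.array([3.0, 3.0, 7.0])
beta = np.array([0.5, 1.5])

print("p=1 (lasso-style penalty):", regularized_loss(beta, X, y, lam=0.5, p=1))
print("p=2 (ridge-style penalty):", regularized_loss(beta, X, y, lam=0.5, p=2))
```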

What is a Loss function? A loss function is mainly used to estimate how far the estimated value is from the observed actual value, i.e. Σ(Y - f(x)). It comes in two types:

  1. L1 loss function - gives the sum of absolute differences between actual and estimated values: Σ(|Yi - f(xi)|). Because the absolute value is not differentiable at zero, there is a possibility of multiple solutions.

  2. L2 loss function - gives the sum of squared differences between actual and estimated values: Σ(Yi - f(xi))². This yields the least-squares value and gives one clear (closed-form) solution. A short sketch of both losses follows this list.

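Both losses are one-liners in NumPy; a small sketch with invented numbers:

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])

l1_loss = np.sum(np.abs(y_true - y_pred))  # sum of absolute errors -> 2.0
l2_loss = np.sum((y_true - y_pred) ** 2)   # sum of squared errors  -> 1.5

print("L1 loss:", l1_loss)
print("L2 loss:", l2_loss)
```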

What are the types of Regularization Technique? There are two types of regularization technique:

  1. Lasso Regularization / L1 Regularization - adds the “absolute value of magnitude” of the coefficients as a penalty term to the loss function (a short sketch comparing both penalties follows the two equations below):

arg Min(Σ(yi-βi*xi)² + λ* Σ (|βi|))

  2. Ridge Regularization / L2 Regularization - adds the “squared magnitude” of the coefficients as a penalty term to the loss function:

arg Min(Σ(yi-βi*xi)² + λ* Σ (βi)²)

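A minimal side-by-side sketch of the two penalties (assuming scikit-learn, where λ is called alpha; the synthetic data and alpha values are illustrative choices): Lasso drives the coefficients of irrelevant features exactly to zero, while Ridge only shrinks them.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.RandomState(0)
X = rng.normal(size=(100, 5))
# Only the first two of the five features actually matter.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.1).fit(X, y)  # L1 penalty
ridge = Ridge(alpha=1.0).fit(X, y)  # L2 penalty

print("lasso coefficients:", np.round(lasso.coef_, 3))  # zeros for the unused features
print("ridge coefficients:", np.round(ridge.coef_, 3))  # small but non-zero everywhere
```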

If λ is zero, this reduces to the OLS (Ordinary Least Squares) method; if λ is very large, it penalizes the weights too heavily and leads to under-fitting. Choosing the value of λ is therefore very important.

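One common way to choose λ is cross-validation. A minimal sketch (assuming scikit-learn; the candidate grids are arbitrary choices) on the same kind of synthetic data as above:

```python
import numpy as np
from sklearn.linear_model import LassoCV, RidgeCV

rng = np.random.RandomState(0)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Search a grid of candidate penalties; lambda is called `alpha` in scikit-learn.
ridge = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X, y)
lasso = LassoCV(alphas=np.logspace(-3, 1, 9), cv=5).fit(X, y)

print("best ridge alpha:", ridge.alpha_)
print("best lasso alpha:", lasso.alpha_)
```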

LASSO will shrink the less important features’ coefficients to exactly zero, automatically removing the least significant variables. This helps with variable selection. LASSO mainly works well when only a small number of parameters are significant.

Ridge adds a penalty and, as a result, shrinks the size of the weights. It works well even with a huge number of variable parameters, and it is also used for collinearity problems.

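The collinearity point can be seen in a small sketch (invented data: the second feature is almost a copy of the first). Plain OLS splits the weight between the two correlated features in an unstable way, while Ridge shares it roughly evenly:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.RandomState(1)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.001, size=200)  # nearly a copy of x1
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(scale=0.1, size=200)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)

print("OLS coefficients  :", np.round(ols.coef_, 2))    # typically far from the true (1, 1)
print("ridge coefficients:", np.round(ridge.coef_, 2))  # close to (1, 1)
```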

Finding the optimal weights is a big challenge.

Prediction accuracy is good.

Other model-selection criteria such as AIC, BIC, cross-validation, and step-wise regression also handle over-fitting and perform feature selection, and they work well with a small set of features; these regularization techniques, however, are a great alternative when we are dealing with a large set of features.

Translated from: https://medium.com/swlh/regularization-technique-84779df34092
