Understanding Normalization, Standardization, and Feature Scaling

  • Feature scaling

    Feature scaling is a method used to normalize the range of independent variables or features of data.

    The objective functions of some machine learning algorithms will not work properly without normalization.

    Gradient descent converges much faster with feature scaling than without it.

    • Rescaling (min-max normalization)

      This method rescales the range of the features so that all values fall within [0,1] or [-1,1] (a NumPy sketch of this and the next three methods follows the list of formulas).

      The general formula for min-max scaling to [0,1] is:
      $x' = \frac{x - \min(x)}{\max(x) - \min(x)}$

  • Mean normalization

    $x' = \frac{x - \text{average}(x)}{\max(x) - \min(x)}$

  • Standardization (Z-score Normalization)

    $x' = \frac{x - \bar{x}}{\sigma}$, where $\bar{x}$ is the mean and $\sigma$ is the standard deviation.

  • Scaling to unit length

    $x' = \frac{x}{\|x\|}$
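    All four formulas can be written in a few lines of NumPy. This is a minimal sketch applied to a single 1-D feature; the function names are my own:

    ```python
    import numpy as np

    def min_max_scale(x):
        # Rescaling: x' = (x - min(x)) / (max(x) - min(x)), result in [0, 1]
        return (x - x.min()) / (x.max() - x.min())

    def mean_normalize(x):
        # Mean normalization: x' = (x - average(x)) / (max(x) - min(x))
        return (x - x.mean()) / (x.max() - x.min())

    def standardize(x):
        # Z-score: x' = (x - mean) / standard deviation
        return (x - x.mean()) / x.std()

    def scale_to_unit_length(x):
        # x' = x / ||x||, so the vector ends up with Euclidean norm 1
        return x / np.linalg.norm(x)

    x = np.array([1.0, 5.0, 10.0, 20.0])
    print(min_max_scale(x))         # bounded in [0, 1]
    print(mean_normalize(x))        # centered on 0, range at most 1
    print(standardize(x))           # mean 0, standard deviation 1
    print(scale_to_unit_length(x))  # result has norm 1
    ```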

  • Why Should We Use Feature Scaling?

    Gradient-descent-based algorithms such as linear regression, logistic regression, and neural networks use gradient descent as their optimization technique and therefore require the data to be scaled.

    Distance-based algorithms such as KNN, K-means, and SVM are the most affected by the range of the features, as the scikit-learn sketch below demonstrates.

    Tree-based algorithms, by contrast, are fairly insensitive to the scale of the features: a decision tree splits a node on a single feature, and that split is not influenced by the other features.
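    To see the distance-based effect concretely, here is a small scikit-learn sketch. The wine dataset is just a convenient choice because its features span very different ranges; the exact accuracy numbers will vary:

    ```python
    from sklearn.datasets import load_wine
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_wine(return_X_y=True)

    # KNN on raw features: distances are dominated by the widest-ranged feature.
    raw = cross_val_score(KNeighborsClassifier(), X, y, cv=5).mean()

    # The same model after standardization: every feature contributes comparably.
    scaled = cross_val_score(
        make_pipeline(StandardScaler(), KNeighborsClassifier()), X, y, cv=5
    ).mean()

    print(f"KNN accuracy without scaling: {raw:.3f}")
    print(f"KNN accuracy with scaling:    {scaled:.3f}")
    ```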

  • Normalization vs. Standardization

    These are two of the most commonly used feature scaling techniques in machine learning, but they are often confused with each other.

    Standardization can be helpful when the data follows a Gaussian distribution. Unlike normalization, standardization does not have a bounding range, so even if your data contains outliers, standardization will not squash them into a fixed interval.

    Normalization is a good choice when you know that your data does not follow a Gaussian distribution.
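    The bounding-range difference is easy to see on a toy array with one outlier (the values here are made up purely for illustration):

    ```python
    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # 100.0 is an outlier

    min_max = (x - x.min()) / (x.max() - x.min())
    z_score = (x - x.mean()) / x.std()

    # Min-max is bounded: the outlier maps to 1 and squeezes
    # everything else into a narrow band near 0.
    print(min_max)  # [0.     0.0101 0.0202 0.0303 1.    ]

    # Z-scores are unbounded: the outlier stays far out while the
    # remaining values keep their relative spread.
    print(z_score)  # roughly [-0.54 -0.51 -0.49 -0.46  2.  ]
    ```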

  • Normalization

    Normalization is a scaling technique in which values are shifted and rescaled so that they end up ranging between 0 and 1. It is also known as Min-Max scaling.
    $X' = \frac{X - X_{min}}{X_{max} - X_{min}}$
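    In practice this is typically done with scikit-learn's MinMaxScaler rather than by hand; a minimal sketch on toy single-feature data:

    ```python
    import numpy as np
    from sklearn.preprocessing import MinMaxScaler

    X = np.array([[50.0], [60.0], [80.0], [100.0]])  # one feature, four samples

    scaler = MinMaxScaler()            # default feature_range=(0, 1)
    X_scaled = scaler.fit_transform(X)
    print(X_scaled.ravel())            # [0.  0.2 0.6 1. ]
    ```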

  • Standardization

    Standardization is another scaling technique where the values are centered around the mean with a unit standard deviation.
    $X' = \frac{X - \mu}{\sigma}$
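    The scikit-learn counterpart is StandardScaler; the same toy data, now centered on the mean with unit standard deviation:

    ```python
    import numpy as np
    from sklearn.preprocessing import StandardScaler

    X = np.array([[50.0], [60.0], [80.0], [100.0]])

    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)
    print(X_scaled.ravel())                  # z-score of each sample
    print(X_scaled.mean(), X_scaled.std())   # ~0.0 and 1.0
    ```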

  • References

  1. Feature Scaling for Machine Learning: Understanding the Difference Between Normalization vs. Standardization