Normalization and Standardization in Machine Learning


郝ren



Tags: machine learning, python, artificial intelligence

Original link: https://medium.com/@sailaja.karra/normalization-in-machine-learning-166a364d3edc


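This article discusses normalization and standardization, two key steps in data preprocessing for machine learning. Both techniques adjust the scale of features and can improve algorithm performance. Normalization usually rescales data to the range 0 to 1, while standardization rescales data to zero mean and unit variance. Understanding and correctly applying these methods is important for optimizing machine learning models.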



Normalization is a technique often applied as part of data preparation for machine learning. The goal of normalization is to change the values of numeric columns in the dataset to use a common scale, without distorting differences in the ranges of values or losing information. Normalization is also required for some algorithms to model the data correctly.


For example, assume your input dataset contains one column with values ranging from 0 to 1, and another column with values ranging from 10,000 to 100,000. The great difference in the scale of the numbers could cause problems when you attempt to combine the values as features during modeling.


Normalization avoids these problems by creating new values that maintain the general distribution and ratios in the source data, while keeping values within a scale applied across all numeric columns used in the model.

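To make the two-column example concrete, here is a minimal sketch using min-max scaling. The choice of min-max scaling is an assumption, since the text above does not name a specific method, and the helper `min_max_scale` and the sample values are hypothetical. It shows that a column ranging from 0 to 1 and a column ranging from 10,000 to 100,000 can end up on the same 0-to-1 scale while keeping the relative spacing of their values.

```python
import numpy as np


def min_max_scale(column):
    """Linearly rescale a 1-D array so its minimum maps to 0 and its maximum to 1."""
    column = np.asarray(column, dtype=float)
    lo, hi = column.min(), column.max()
    if hi == lo:
        # A constant column has no spread; return zeros instead of dividing by zero.
        return np.zeros_like(column)
    return (column - lo) / (hi - lo)


# Hypothetical columns mirroring the example: one in [0, 1], one in [10,000, 100,000].
small = np.array([0.0, 0.25, 0.5, 1.0])
large = np.array([10_000.0, 32_500.0, 55_000.0, 100_000.0])

scaled_small = min_max_scale(small)
scaled_large = min_max_scale(large)

print(scaled_small)
print(scaled_large)
```

After scaling, both columns lie in the same range, so neither dominates purely because of its original units. Other rescaling schemes would serve the same purpose; this one is shown only because it is the simplest to read.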

There are several ways to normalize data. Some of them are as follows.


Log transformation

A log transformation is a very useful tool when you have data that clearly does not follow a normal distribution. It can help reduce skewness in skewed data and can help reduce the variability of the data. Make sure your data contains only positive, non-zero numbers, because the logarithm of zero or of a negative number is undefined. For non-negative data that might contain zeros, there is a log1p transformation, log(1 + x), that, as you might have guessed, adds 1 to all the numbers and then does