# Regularization


Consider the polynomial regression model

$$y = w_0 + w_1 x_i + w_2 x_i^2 + \cdots + w_m x_i^m + \varepsilon_i \qquad (i = 1, 2, \cdots, n)$$
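A minimal sketch of fitting this model with scikit-learn, assuming degree m = 3 and synthetic noisy data (the cubic target function and sample size are illustrative choices, not from the original):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Synthetic noisy samples from a cubic function (for illustration only)
rng = np.random.RandomState(0)
x = np.sort(rng.uniform(-3, 3, 30)).reshape(-1, 1)
y = 0.5 * x.ravel() ** 3 - x.ravel() + rng.normal(0, 1, 30)

# PolynomialFeatures expands x into [1, x, x^2, x^3], so LinearRegression
# then fits exactly the model y = w0 + w1*x + w2*x^2 + w3*x^3
model = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
model.fit(x, y)
print(model.named_steps['linearregression'].coef_)
```

With a high degree m and few samples, such a model overfits easily, which is what the penalty terms below are meant to control.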

# L2 Regularization

$$L2:\ \|w\|_2^2 = \sum_{j=1}^{n} w_j^2$$

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda \sum_{j=1}^{n} w_j^2\right]$$
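scikit-learn's `Ridge` minimizes this kind of L2-penalized squared error; its `alpha` parameter plays the role of λ. A small sketch on synthetic data (the data and `alpha=10.0` are illustrative assumptions) showing how the penalty shrinks the weight vector compared to ordinary least squares:

```python
import numpy as np
from sklearn.linear_model import Ridge, LinearRegression

# Synthetic data: only the first two features actually matter
rng = np.random.RandomState(0)
X = rng.normal(size=(20, 5))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, 0.0]) + rng.normal(0, 0.1, 20)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)  # larger alpha => stronger shrinkage

# The L2 penalty pulls weights toward zero, but rarely exactly to zero
print(np.linalg.norm(ols.coef_), np.linalg.norm(ridge.coef_))
```

Note the contrast with L1 below: L2 shrinks all weights smoothly, while L1 can zero some of them out entirely.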

# L1 Regularization

$$L1:\ \|w\|_1 = \sum_{j=1}^{n} |w_j|$$

L1 regularization produces sparse weight vectors, with most weights exactly 0. When a high-dimensional dataset contains many irrelevant features, especially when the number of irrelevant features exceeds the number of samples, this sparsification effectively acts as a feature-selection technique.

```python
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X = iris.data[:, [2, 3]]  # shape: (150, 2)
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

sc = StandardScaler()
sc.fit(X_train)  # compute mean and standard deviation of the training data
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

# penalty='l1' requires a solver that supports it, e.g. liblinear
LR = LogisticRegression(penalty='l1', C=0.1, solver='liblinear')
LR.fit(X_train_std, y_train)
print('Training accuracy:', LR.score(X_train_std, y_train))
print('Test accuracy:', LR.score(X_test_std, y_test))
print(LR.intercept_)
# Intercepts:
# [-0.44864439 -0.40546215 -0.24906672]
print(LR.coef_)
'''
Weight matrix (many entries are exactly 0, showing L1 sparsity):
[[-1.84458718  0.        ]
 [ 0.          0.        ]
 [ 0.          1.50479021]]
'''
```
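The parameter C is the inverse of the regularization strength, so decreasing C should drive more weights to exactly zero. A sketch of that effect on the same two iris features (the specific C grid is an illustrative assumption):

```python
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X = StandardScaler().fit_transform(iris.data[:, [2, 3]])
y = iris.target

# Smaller C = stronger L1 penalty = sparser weight matrix
zeros_by_C = {}
for C in [10.0, 1.0, 0.1, 0.01]:
    lr = LogisticRegression(penalty='l1', C=C, solver='liblinear')
    lr.fit(X, y)
    zeros_by_C[C] = int(np.sum(lr.coef_ == 0))
    print(f'C={C}: {zeros_by_C[C]} of {lr.coef_.size} weights are zero')
```

Plotting the weights against C on a log scale (the "regularization path") makes the same trend visible graphically.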

