【一】 LR (Logistic Regression)
[Supervised] A widely used binary classification algorithm: powerful yet logically simple, with decision boundary $W^T X + b = 0$
- The special case of Softmax regression reduced to two classes; optimized with gradient descent, so features should be normalized first
$$p(y = 1 \mid x, w) = \frac{1}{1 + e^{-w^T x}}$$
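The boundary and the probability formula fit together: exactly on $W^T X + b = 0$ the model outputs $p = 0.5$, and moving away from the boundary pushes $p$ toward 0 or 1. A quick numeric sketch (the weights and inputs below are made-up values for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b = np.array([2.0, -1.0]), 0.5   # hypothetical learned parameters
x_on = np.array([0.25, 1.0])        # w @ x_on + b = 0, i.e. on the boundary
x_off = np.array([2.0, 0.0])        # w @ x_off + b = 4.5, far from the boundary

print(sigmoid(w @ x_on + b))   # 0.5, maximal uncertainty on the boundary
print(sigmoid(w @ x_off + b))  # close to 1, confident positive prediction
```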
【二】 Softmax Regression (a combination of multiple LRs, for multi-class problems)
$$p(y = k \mid x, w) = \frac{e^{-w_k^T x}}{\sum_{i=1}^{K} e^{-w_i^T x}} \qquad \text{s.t.}\; y \in \{1, \ldots, k, \ldots, K\}$$
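The formula above can be sketched directly in NumPy, following the post's sign convention of $e^{-w_k^T x}$ (the weight matrix and input below are made-up values):

```python
import numpy as np

def softmax_post(W, x):
    # W: (K, d) weight matrix with one row w_k per class; x: (d,) feature vector
    scores = np.exp(-W @ x)      # e^{-w_k^T x} for each class k
    return scores / scores.sum() # normalize into a probability distribution

W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.5, 0.5]])       # K = 3 classes, d = 2 features
x = np.array([0.2, -0.4])
p = softmax_post(W, x)
print(p, p.sum())  # K probabilities that sum to 1
```

With K = 2 this reduces to the LR sigmoid up to a shared weight shift, which is the "special case" relationship noted in section 【一】.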
【三】 Sigmoid (activation function)
- Derivative of the activation function; proof that $\sigma'(z) = \sigma(z)(1 - \sigma(z))$:
$$\sigma'(z) = \left( \frac{1}{1 + e^{-z}} \right)' = (-1)(1 + e^{-z})^{-2} \cdot (e^{-z})'$$
$$= \frac{1}{(1 + e^{-z})^2} \, e^{-z} = \frac{1}{1 + e^{-z}} \cdot \frac{e^{-z}}{1 + e^{-z}}$$
$$= \frac{1}{1 + e^{-z}} \cdot \left( 1 - \frac{1}{1 + e^{-z}} \right) = \sigma(z) \cdot (1 - \sigma(z))$$
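The identity just proved can be verified numerically by comparing it against a central finite difference of the sigmoid:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-5.0, 5.0, 11)
h = 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)  # finite-difference derivative
analytic = sigmoid(z) * (1 - sigmoid(z))               # sigma(z)(1 - sigma(z))
print(np.max(np.abs(numeric - analytic)))  # tiny numerical error; the two agree
```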
【四】 Cross Entropy Loss (cross-entropy loss function):
- Loss function
$$L = -\frac{1}{n} \sum_x [\, y \cdot \ln \sigma(z) + (1 - y) \cdot \ln(1 - \sigma(z)) \,]$$
- Derivative of the loss, using $\ln(x)' = 1/x$ and $\ln(1 - x)' = -1/(1 - x)$:
$$\frac{\partial L}{\partial w_i} = -\frac{1}{n} \sum_x \left[ \frac{y}{\sigma(z)} - \frac{1 - y}{1 - \sigma(z)} \right] \frac{\partial \sigma}{\partial w_i} = -\frac{1}{n} \sum_x \left[ \frac{y}{\sigma(z)} - \frac{1 - y}{1 - \sigma(z)} \right] \sigma'(z) \, x_i$$
$$= \frac{1}{n} \sum_x \frac{\sigma'(z) \, x_i}{\sigma(z)(1 - \sigma(z))} \, (\sigma(z) - y) = \frac{1}{n} \sum_x x_i \, (\sigma(z) - y)$$
$$\frac{\partial L}{\partial b} = \frac{1}{n} \sum_x (\sigma(z) - y)$$
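The closed-form gradients derived above can be checked against finite differences of the loss itself (the data below is randomly generated for illustration; $z = w^T x + b$):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, b, X, y):
    s = sigmoid(X @ w + b)
    return -np.mean(y * np.log(s) + (1 - y) * np.log(1 - s))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))                 # 20 samples, 3 features
y = (rng.random(20) > 0.5).astype(float)     # random binary labels
w, b = rng.normal(size=3), 0.1

s = sigmoid(X @ w + b)
grad_w = X.T @ (s - y) / len(y)              # (1/n) sum_x x_i (sigma(z) - y)
grad_b = np.mean(s - y)                      # (1/n) sum_x (sigma(z) - y)

h = 1e-6
num_b = (loss(w, b + h, X, y) - loss(w, b - h, X, y)) / (2 * h)
print(abs(grad_b - num_b))  # tiny numerical error; closed form matches
```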
【五】 Application scenarios
- Loan default (will / won't), ad click-through (click / no click), product recommendation (buy / don't buy), sentiment analysis (positive / negative), disease diagnosis (negative / positive)
【六】 Using LR in code (scikit-learn)
# the old sklearn.linear_model.logistic module path has been removed; import from the package instead
from sklearn.linear_model import LogisticRegression
'''
:param (constructor parameters) the defaults are usually fine
'''
lr = LogisticRegression()
'''
:methods
lr.fit(X, y): LR is a supervised algorithm, so fitting requires labels y
lr.predict(X): returns the predicted class for each sample in X
'''
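Putting the snippet above end to end on a toy dataset (the two-blob data here is made up; normalization via StandardScaler follows the advice in section 【一】):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two Gaussian blobs as a stand-in binary classification dataset
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)),
               rng.normal(3.0, 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

X_scaled = StandardScaler().fit_transform(X)  # normalize features before gradient-based fitting
lr = LogisticRegression().fit(X_scaled, y)    # supervised: needs both X and y

print(lr.predict(X_scaled[:2]))         # predicted classes for the first two samples
print(lr.predict_proba(X_scaled[:2]))   # class probabilities (the sigmoid output)
print(lr.coef_, lr.intercept_)          # learned W and b of the boundary W^T X + b = 0
```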