- Logistic regression is a "generalized linear model" used for classification. It builds on linear regression by passing the linear output through the nonlinear sigmoid function.
Linear regression formula

$$h_\theta(x) = \theta_0 x_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n$$
Linear regression in vector form

$$h_\theta(x) = \theta^T x$$
Sigmoid function

$$g(z) = \frac{1}{1 + e^{-z}}$$

- where $z = h_\theta(x) = \theta^T x$, so

$$g(z) = \frac{1}{1 + e^{-\theta^T x}}$$
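To make the hypothesis concrete, here is a minimal NumPy sketch; the helper names `sigmoid` and `hypothesis` are illustrative, not from the original post.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid function g(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, X):
    """h_theta(x) = g(theta^T x), evaluated for every row of X.

    X     : (m, n) design matrix, one sample per row
    theta : (n,)   parameter vector
    """
    return sigmoid(X @ theta)
```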
Logistic regression loss function

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \right]$$
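The loss formula translates directly into code. A minimal sketch, assuming the `sigmoid`/`hypothesis` helpers defined above; the small `eps` clip is an addition of mine to avoid `log(0)`.

```python
def compute_loss(theta, X, y, eps=1e-12):
    """Cross-entropy loss J(theta), averaged over m samples."""
    m = X.shape[0]
    h = hypothesis(theta, X)          # predicted probabilities, shape (m,)
    h = np.clip(h, eps, 1.0 - eps)    # guard against log(0)
    return -np.mean(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))
```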
Gradient descent update

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)$$
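Even before deriving the closed-form gradient below, the update rule can be exercised with a finite-difference approximation. A rough sketch; `numerical_gradient` is a generic central-difference helper of mine, useful mainly as a check against the analytic gradient derived next.

```python
def numerical_gradient(J, theta, h=1e-5):
    """Central-difference approximation of dJ/dtheta_j for each j."""
    grad = np.zeros_like(theta)
    for j in range(theta.size):
        e = np.zeros_like(theta)
        e[j] = h
        grad[j] = (J(theta + e) - J(theta - e)) / (2.0 * h)
    return grad

def gradient_descent_step(J, theta, alpha):
    """One update theta_j := theta_j - alpha * dJ/dtheta_j."""
    return theta - alpha * numerical_gradient(J, theta)

# Usage: theta = gradient_descent_step(lambda t: compute_loss(t, X, y), theta, 0.1)
```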
Derivative of the sigmoid function

$$g(x) = \frac{1}{1 + e^{-x}}$$

$$g'(x) = g(x)\bigl(1 - g(x)\bigr)$$
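For completeness, this identity is one application of the chain rule, using the fact that $\frac{e^{-x}}{1 + e^{-x}} = 1 - g(x)$:

$$g'(x) = \frac{e^{-x}}{(1 + e^{-x})^2} = \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}} = g(x)\bigl(1 - g(x)\bigr)$$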
Derivation of the partial derivative

$$\frac{\partial J(\theta)}{\partial \theta_j} = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \frac{1}{h_\theta(x^{(i)})} \cdot \frac{\partial h_\theta(x^{(i)})}{\partial \theta_j} - (1 - y^{(i)}) \cdot \frac{1}{1 - h_\theta(x^{(i)})} \cdot \frac{\partial h_\theta(x^{(i)})}{\partial \theta_j} \right]$$

$$= -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \frac{1}{g(\theta^T x^{(i)})} - (1 - y^{(i)}) \frac{1}{1 - g(\theta^T x^{(i)})} \right] \cdot \frac{\partial g(\theta^T x^{(i)})}{\partial \theta_j}$$

Applying the sigmoid derivative $g'(z) = g(z)(1 - g(z))$ from above, together with $\frac{\partial}{\partial \theta_j} \theta^T x^{(i)} = x_j^{(i)}$:

$$= -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \frac{1}{g(\theta^T x^{(i)})} - (1 - y^{(i)}) \frac{1}{1 - g(\theta^T x^{(i)})} \right] \cdot g(\theta^T x^{(i)}) \bigl(1 - g(\theta^T x^{(i)})\bigr) x_j^{(i)}$$

$$= -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \bigl(1 - g(\theta^T x^{(i)})\bigr) - (1 - y^{(i)})\, g(\theta^T x^{(i)}) \right] x_j^{(i)}$$

$$= -\frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} - g(\theta^T x^{(i)}) \right) x_j^{(i)}$$

$$= \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$
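Combining the derived gradient with the update rule gives batch gradient descent. A minimal sketch reusing the helpers above; the vectorized form $\nabla J = \frac{1}{m} X^T \bigl(h_\theta(X) - y\bigr)$ is just the per-component formula stacked over all $j$, and the data/names here are illustrative.

```python
def gradient(theta, X, y):
    """Closed-form gradient: (1/m) * X^T (h_theta(X) - y)."""
    m = X.shape[0]
    return X.T @ (hypothesis(theta, X) - y) / m

def train(X, y, alpha=0.1, n_iters=1000):
    """Batch gradient descent for logistic regression."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        theta -= alpha * gradient(theta, X, y)
    return theta
```

Per component, `theta -= alpha * gradient(theta, X, y)` is exactly the update $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_i \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr) x_j^{(i)}$ from above.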
Questions and discussion are welcome; feel free to leave a comment.