1. What is Logistic Regression
Do not be misled by the word "regression" in its name: logistic regression is actually a classification model, generally used for binary classification problems such as predicting whether a patient has cancer.
2. The Logistic Regression Model
The sigmoid function, also called the logistic function, is an S-shaped curve common in biology:
$$\sigma(z)=\frac{1}{1+e^{-z}}$$
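A minimal NumPy sketch of the sigmoid. The function name `sigmoid` is my own; the branch on the sign of `z` is a standard trick to avoid overflow in `exp` for large `|z|` and is not part of the text's definition:

```python
import numpy as np

def sigmoid(z):
    """Numerically stable logistic function sigma(z) = 1 / (1 + e^{-z})."""
    e = np.exp(-np.abs(z))  # always a value in (0, 1], so it never overflows
    # For z >= 0: 1/(1+e^{-z}); for z < 0: e^{z}/(1+e^{z}) -- algebraically equal.
    return np.where(z >= 0, 1.0 / (1.0 + e), e / (1.0 + e))

print(sigmoid(np.array([-2.0, 0.0, 2.0])))  # S-curve: small, 0.5 at z=0, large
```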
Usually the region where $\sigma(z)>0.5$ is assigned label 1, and the region where $\sigma(z)<0.5$ is assigned label 0, i.e.:
$$P(Y=1\mid x)=\sigma(\omega^{T}x)=\frac{1}{1+e^{-\omega^{T}x}}=\frac{e^{\omega^{T}x}}{1+e^{\omega^{T}x}}=\pi(x^{(i)})$$
$$P(Y=0\mid x)=1-P(Y=1\mid x)=\frac{1}{1+e^{\omega^{T}x}}=1-\pi(x^{(i)})$$
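Given a weight vector $\omega$ and a feature vector $x$, the two class probabilities follow directly from these formulas. A small sketch with made-up values (the weights and sample below are purely illustrative):

```python
import numpy as np

w = np.array([0.5, -1.0, 2.0])  # hypothetical weights omega (first entry: bias weight)
x = np.array([1.0, 0.3, 0.8])   # hypothetical sample (first entry: the constant 1)

z = w @ x                        # omega^T x
p1 = 1.0 / (1.0 + np.exp(-z))    # P(Y=1|x) = sigma(omega^T x)
p0 = 1.0 - p1                    # P(Y=0|x)
print(p1, p0, p1 + p0)           # the two probabilities sum to 1
```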
Assume a training set $T=\{(x^{(1)},y^{(1)}),\cdots,(x^{(m)},y^{(m)})\}$, where $y^{(i)}\in \{0,1\}$ and
$$x^{(i)}= \begin{bmatrix} x^{(i)}_{1} &\cdots & x^{(i)}_{n} \end{bmatrix}^{T}$$
Model: solve
$$\mathop{max}\limits_{\omega}L(\omega)=\prod_{i=1}^{m}\pi(x^{(i)})^{y^{(i)}}\left(1-\pi(x^{(i)})\right)^{1-y^{(i)}}$$
Derivation:
Taking the negative logarithm turns the maximization into an equivalent minimization:
$$\begin{aligned}\mathop{min}\limits_{\omega}L(\omega)&=-\ln\prod_{i=1}^{m}\pi(x^{(i)})^{y^{(i)}}\left(1-\pi(x^{(i)})\right)^{1-y^{(i)}}\\ &=-\sum_{i=1}^{m}\left[y^{(i)}\ln\left(\pi(x^{(i)})\right)+(1-y^{(i)})\cdot \ln\left(1-\pi(x^{(i)})\right)\right]\\ &=-\sum_{i=1}^{m}\left[y^{(i)}\cdot \omega^{T}x^{(i)}-\ln\left(1+e^{\omega^{T}x^{(i)}}\right)\right] \end{aligned}$$
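The simplification can be checked numerically: both forms of the negative log-likelihood must give the same value. A quick sketch with made-up data (all names and values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 3
X = rng.normal(size=(m, n))        # made-up samples x^{(i)}
y = rng.integers(0, 2, size=m)     # labels in {0, 1}
w = rng.normal(size=n)             # some weight vector omega

z = X @ w                          # omega^T x^{(i)} for every i
pi = 1.0 / (1.0 + np.exp(-z))      # pi(x^{(i)})

# Form 1: -sum[ y ln(pi) + (1 - y) ln(1 - pi) ]
nll1 = -np.sum(y * np.log(pi) + (1 - y) * np.log(1 - pi))
# Form 2 (simplified): -sum[ y * omega^T x - ln(1 + e^{omega^T x}) ]
nll2 = -np.sum(y * z - np.log(1 + np.exp(z)))
print(np.isclose(nll1, nll2))      # True: the two forms agree
```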
Then solve by gradient descent:
$$\begin{aligned} \frac{\partial L(\omega)}{\partial \omega}&=-\sum_{i=1}^{m}\left[y^{(i)}\cdot x^{(i)}-\frac{1}{1+e^{\omega^{T} x^{(i)}}}\cdot e^{\omega^{T}x^{(i)}}\cdot x^{(i)}\right]\\ &=-\sum_{i=1}^{m}\left[y^{(i)}-\frac{e^{\omega^{T} x^{(i)}}}{1+e^{\omega^{T} x^{(i)}}}\right]x^{(i)} \end{aligned}$$
Algorithm steps:
Step 1: Initialize $k,\varepsilon,\alpha,MaxN,\omega_{k}$
Step 2: Pick any sample $(x^{(i)},y^{(i)})$ and compute
$$d_{k}=-\left[y^{(i)}-\frac{e^{\omega^{T} x^{(i)}}}{1+e^{\omega^{T} x^{(i)}}}\right]x^{(i)}$$
Step 3: Update $\omega_{k+1}:=\omega_{k}-\alpha d_{k}$
Step 4: Set $k=k+1$; if $\|d_{k}\|<\varepsilon$ or $k>MaxN$, output $\omega$; otherwise go back to Step 2.
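The four steps above can be sketched for the binary case as follows. The toy data, the function name `sgd_logistic`, and the cyclic (rather than random) sample choice in Step 2 are my own simplifications; `MaxN`, `alpha`, and `eps` follow the notation in the text:

```python
import numpy as np

def sgd_logistic(X, Y, alpha=0.1, eps=1e-5, MaxN=5000):
    """Stochastic gradient descent following Steps 1-4 in the text."""
    w = np.ones(X.shape[1])                 # Step 1: initialize omega
    for k in range(MaxN):                   # Step 4: stop when k > MaxN
        i = k % len(X)                      # Step 2: pick a sample (cyclically here)
        z = w @ X[i]
        # d_k = -[y - sigma(omega^T x)] x, using sigma(z) = 1/(1+e^{-z})
        d = -(Y[i] - 1.0 / (1.0 + np.exp(-z))) * X[i]
        w = w - alpha * d                   # Step 3: omega_{k+1} := omega_k - alpha d_k
        if np.linalg.norm(d) < eps:         # Step 4: stop when ||d_k|| < eps
            break
    return w

# Toy linearly separable data; the first column of 1s is the bias term.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 3.0], [1.0, 4.0]])
Y = np.array([0, 0, 1, 1])
w = sgd_logistic(X, Y)
pred = (1.0 / (1.0 + np.exp(-(X @ w))) > 0.5).astype(int)
print(pred)   # should recover the labels 0 0 1 1
```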
3. Regularization
To prevent overfitting, adding a regularization term is a good choice. The model then becomes
$$L(\omega)=-\sum_{i=1}^{m}\left[y^{(i)}\cdot \omega^{T}x^{(i)}-\ln\left(1+e^{\omega^{T}x^{(i)}}\right)\right]+\lambda \|\omega\|^{2}$$
The derivation only additionally requires
$$\frac{\partial \|\omega\|^{2}}{\partial \omega}=2\omega$$
so the final gradient is
$$\frac{\partial L(\omega)}{\partial \omega}=-\sum_{i=1}^{m}\left[y^{(i)}-\frac{e^{\omega^{T} x^{(i)}}}{1+e^{\omega^{T} x^{(i)}}}\right]x^{(i)}+2\lambda\omega$$
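The regularized gradient can be verified against a finite-difference approximation of $L(\omega)$. All data and the value $\lambda=0.1$ below are made up for the check; note the factor $2\lambda\omega$ coming from the derivative of $\lambda\|\omega\|^{2}$:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, lam = 10, 3, 0.1
X = rng.normal(size=(m, n))   # made-up samples
y = rng.integers(0, 2, size=m)
w = rng.normal(size=n)

def L(w):
    """Regularized objective: -sum[y w^T x - ln(1 + e^{w^T x})] + lambda ||w||^2."""
    z = X @ w
    return -np.sum(y * z - np.log(1 + np.exp(z))) + lam * (w @ w)

# Closed-form gradient: -sum[(y - sigma(w^T x)) x] + 2 lambda w
z = X @ w
grad = -(X.T @ (y - 1.0 / (1.0 + np.exp(-z)))) + 2 * lam * w

# Central finite differences along each coordinate direction
h = 1e-6
num = np.array([(L(w + h * e) - L(w - h * e)) / (2 * h) for e in np.eye(n)])
print(np.allclose(grad, num, atol=1e-4))   # True: analytic and numeric gradients agree
```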
4. Multi-class Problems
Since the logistic model solves binary classification, a multi-class problem can be handled in hand-written code by running several binary classifications in sequence. The code below solves a three-class problem this way.
import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split

def load():  # load and preprocess the data
    iris = load_iris()
    scaler = MinMaxScaler()
    x = scaler.fit_transform(iris.data)
    ones = np.ones(x.shape[0])
    X = np.insert(x, 0, values=ones, axis=1)  # prepend a bias column of 1s
    y = iris.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
    df_train = pd.concat([pd.DataFrame(X_train), pd.DataFrame(y_train, columns=['Species'])], axis=1)
    df_test = pd.concat([pd.DataFrame(X_test), pd.DataFrame(y_test, columns=['Species'])], axis=1)
    m, n = X.shape  # 150, 5
    cate = set(y)
    return df_train, df_test, n, cate

def data_split1(df_train):  # classifier 1: class 0 vs classes {1, 2}
    X = np.array(df_train.iloc[:, :5])
    Y = np.array(df_train.iloc[:, -1])
    Y[Y == 2] = 1
    return X, Y

def data_split2(df_train):  # classifier 2: class 1 vs class 2 (class 0 removed)
    df_train = df_train.drop(df_train.index[df_train['Species'] == 0])
    X = np.array(df_train.iloc[:, :5])
    Y = np.array(df_train.iloc[:, -1])
    Y[Y == 1] = 0
    Y[Y == 2] = 1
    return X, Y

def gk(x, y, w):  # per-sample gradient d_k
    h = np.exp(w.T @ x)
    g = -(y - h / (1 + h)) * x
    return g

def logistic(X, Y):  # stochastic gradient descent, following Steps 1-4
    w = np.ones((X.shape[1], 1))  # initialize omega
    k, sigma, alpha, MaxN = 0, 10 ** (-5), 0.1, 8000
    for i in range(MaxN):
        j = i % len(X)  # cycle through the samples
        x = X[j].reshape(-1, 1)
        y = Y[j]
        g = gk(x, y, w)
        w = w - alpha * g
        k += 1
        if np.linalg.norm(g) < sigma:
            break
    return w

def Accuracy(W_matrix, df_test):
    X = np.array(df_test.iloc[:, :5])
    Y = np.array(df_test.iloc[:, -1])
    accur = []
    for x in X:
        # Cascade the two binary classifiers: the first separates class 0
        # from {1, 2}; the second separates class 1 from class 2.
        if (W_matrix[0].T @ x)[0] < 0:
            accur.append(0)
        elif (W_matrix[1].T @ x)[0] < 0:
            accur.append(1)
        else:
            accur.append(2)
    accuracy = sum(1 for a, b in zip(Y, accur) if a == b) / len(Y)
    return accuracy

if __name__ == '__main__':
    df_train, df_test, n, cate = load()
    W_matrix = []
    for split in (data_split1, data_split2):  # len(cate) - 1 binary classifiers
        X, Y = split(df_train)
        w = logistic(X, Y)
        W_matrix.append(w)
    print('w:', W_matrix)
    accuracy = Accuracy(W_matrix, df_test)
    print('accuracy score:', accuracy)