Logistic Regression: Model Derivation and a From-Scratch Implementation

1. What Is Logistic Regression

Despite the word "regression" in its name, logistic regression is actually a classification model. It is typically used for binary classification problems, such as predicting whether a patient has cancer.

2. The Logistic Regression Model

The sigmoid function, also called the logistic function, is an S-shaped curve common in biology:
$$\sigma(z)=\frac{1}{1+e^{-z}}$$
[Figure: plot of the sigmoid curve]
Conventionally, inputs with $\sigma(z)>0.5$ are assigned to label 1 and inputs with $\sigma(z)<0.5$ to label 0, i.e.:

$$P(Y=1\mid x)=\sigma(\omega^{T}x)=\frac{1}{1+e^{-\omega^{T}x}}=\frac{e^{\omega^{T}x}}{1+e^{\omega^{T}x}}=\pi(x^{(i)})$$
$$P(Y=0\mid x)=1-P(Y=1\mid x)=\frac{1}{1+e^{\omega^{T}x}}=1-\pi(x^{(i)})$$
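These two probabilities can be checked numerically with a few lines of NumPy (a minimal sketch; the scores `z` stand in for $\omega^{T}x$):

```python
import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 3.0])    # illustrative scores z = w^T x
p1 = sigmoid(z)                   # P(Y=1 | x)
p0 = 1.0 - p1                     # P(Y=0 | x)
print(np.allclose(p1 + p0, 1.0))  # True: the two probabilities sum to 1
print(sigmoid(0.0))               # 0.5: the decision boundary
```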

Assume a training set $T=\{(x^{(1)},y^{(1)}),\dots,(x^{(m)},y^{(m)})\}$ with labels $y^{(i)}\in\{0,1\}$ and feature vectors $x^{(i)}=\begin{bmatrix}x^{(i)}_{1}&\cdots&x^{(i)}_{n}\end{bmatrix}^{T}$.

Model (maximum likelihood):
$$\max_{\omega}L(\omega)=\prod_{i=1}^{m}\pi(x^{(i)})^{y^{(i)}}\left(1-\pi(x^{(i)})\right)^{1-y^{(i)}}$$

Derivation (take the negative log and minimize):
$$\begin{aligned}\min_{\omega}L(\omega)&=-\ln\prod_{i=1}^{m}\pi(x^{(i)})^{y^{(i)}}\left(1-\pi(x^{(i)})\right)^{1-y^{(i)}}\\&=-\sum_{i=1}^{m}\left[y^{(i)}\ln\pi(x^{(i)})+(1-y^{(i)})\ln\left(1-\pi(x^{(i)})\right)\right]\\&=-\sum_{i=1}^{m}\left[y^{(i)}\omega^{T}x^{(i)}-\ln\left(1+e^{\omega^{T}x^{(i)}}\right)\right]\end{aligned}$$
where the last step uses $\ln\pi(x^{(i)})=\omega^{T}x^{(i)}-\ln(1+e^{\omega^{T}x^{(i)}})$ and $\ln(1-\pi(x^{(i)}))=-\ln(1+e^{\omega^{T}x^{(i)}})$.
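To sanity-check this algebra, the final form of the loss can be compared numerically with the first form on random data (an illustrative sketch; all names here are made up):

```python
import numpy as np

def neg_log_likelihood(w, X, y):
    # final form: -sum_i [ y_i * w^T x_i - ln(1 + e^{w^T x_i}) ]
    z = X @ w
    return -np.sum(y * z - np.log1p(np.exp(z)))

def neg_log_likelihood_probs(w, X, y):
    # first form: -sum_i [ y_i ln(pi_i) + (1 - y_i) ln(1 - pi_i) ]
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
y = rng.integers(0, 2, size=6)
w = rng.normal(size=3)
print(np.isclose(neg_log_likelihood(w, X, y), neg_log_likelihood_probs(w, X, y)))  # True
```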
Next, solve by gradient descent:
$$\begin{aligned}\frac{\partial L(\omega)}{\partial\omega}&=-\sum_{i=1}^{m}\left[y^{(i)}x^{(i)}-\frac{1}{1+e^{\omega^{T}x^{(i)}}}\cdot e^{\omega^{T}x^{(i)}}\cdot x^{(i)}\right]\\&=-\sum_{i=1}^{m}\left[y^{(i)}-\frac{e^{\omega^{T}x^{(i)}}}{1+e^{\omega^{T}x^{(i)}}}\right]x^{(i)}\end{aligned}$$
Algorithm (stochastic gradient descent):
Step 1: Initialize $k$, $\varepsilon$, $\alpha$, $MaxN$, $\omega_{k}$.
Step 2: Pick a sample $(x^{(i)},y^{(i)})$ and compute $d_{k}=-\left[y^{(i)}-\frac{e^{\omega^{T}x^{(i)}}}{1+e^{\omega^{T}x^{(i)}}}\right]x^{(i)}$.
Step 3: $\omega_{k+1}:=\omega_{k}-\alpha d_{k}$.
Step 4: $k:=k+1$. If $\|d_{k}\|<\varepsilon$ or $k>MaxN$, output $\omega$; otherwise return to Step 2.
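These four steps can be sketched in NumPy as follows (a toy illustration, separate from the iris code later in this post; `alpha`, `eps`, and `max_n` play the roles of $\alpha$, $\varepsilon$, and $MaxN$):

```python
import numpy as np

def sgd_logistic(X, y, alpha=0.1, eps=1e-5, max_n=5000, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])                   # Step 1: initialize omega
    for k in range(max_n):
        i = rng.integers(len(X))               # Step 2: pick a random sample
        p = 1.0 / (1.0 + np.exp(-(w @ X[i])))  # sigma(w^T x)
        d = -(y[i] - p) * X[i]                 # d_k = -(y - sigma(w^T x)) x
        w = w - alpha * d                      # Step 3: omega_{k+1} = omega_k - alpha d_k
        if np.linalg.norm(d) < eps:            # Step 4: stop once the step is tiny
            break
    return w

# separable toy data: label 1 exactly when the feature is positive
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])  # column 0 is the bias
y = np.array([0, 0, 1, 1])
w = sgd_logistic(X, y)
preds = (X @ w > 0).astype(int)  # matches y on this toy set
```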

3. Regularization

To prevent overfitting, adding a regularization term is a good option. The model becomes
$$L(\omega)=-\sum_{i=1}^{m}\left[y^{(i)}\omega^{T}x^{(i)}-\ln\left(1+e^{\omega^{T}x^{(i)}}\right)\right]+\lambda\|\omega\|^{2}$$
For the gradient, the only new piece is $\frac{\partial\|\omega\|^{2}}{\partial\omega}=2\omega$, which gives the final result
$$\frac{\partial L(\omega)}{\partial\omega}=-\sum_{i=1}^{m}\left[y^{(i)}-\frac{e^{\omega^{T}x^{(i)}}}{1+e^{\omega^{T}x^{(i)}}}\right]x^{(i)}+2\lambda\omega$$
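In code, the penalty only adds the $2\lambda\omega$ term to the gradient (a sketch with made-up data; `lam` stands for $\lambda$):

```python
import numpy as np

def grad_regularized(w, X, y, lam):
    # gradient of the L2-regularized loss:
    # -sum_i (y_i - sigma(w^T x_i)) x_i + 2 * lam * w
    p = 1.0 / (1.0 + np.exp(-(X @ w)))   # sigma(w^T x_i) for every sample
    return -(X.T @ (y - p)) + 2.0 * lam * w

X = np.array([[1.0, 0.5], [1.0, -0.5]])  # two samples, bias in column 0
y = np.array([1.0, 0.0])
w = np.array([0.3, -0.2])
g = grad_regularized(w, X, y, lam=0.5)
# the regularized and unregularized gradients differ by exactly 2 * lam * w
```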

4. Multiclass Classification

Since the logistic model solves binary problems, a multiclass problem can be handled by chaining several binary classifications. The code below solves a three-class problem (iris) with two nested binary splits: first class 0 versus classes {1, 2}, then class 1 versus class 2.

import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import  MinMaxScaler
from sklearn.model_selection import train_test_split

def load():  # load and preprocess the data
    iris = load_iris()
    scaler = MinMaxScaler()
    x = scaler.fit_transform(iris.data)       # scale the 4 features to [0, 1]
    ones = np.ones(x.shape[0])
    X = np.insert(x, 0, values=ones, axis=1)  # prepend a bias column of ones
    y = iris.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
    df_train = pd.concat([pd.DataFrame(X_train), pd.DataFrame(y_train, columns=['Species'])], axis=1)
    df_test = pd.concat([pd.DataFrame(X_test), pd.DataFrame(y_test, columns=['Species'])], axis=1)
    m, n = X.shape  # 150 samples, 5 columns (bias + 4 features)
    cate = set(y)   # the three class labels {0, 1, 2}
    return df_train, df_test, n, cate

def Weight():  # initialize the weight vector (one entry per column, bias included)
    _, _, n, cate = load()
    w = np.ones((n, 1))
    return w

def data_spilt1(df_train):
    # first binary split: class 0 vs. classes {1, 2}
    X = np.array(df_train.iloc[:, :5])
    Y = np.array(df_train.iloc[:, -1])
    Y[Y == 2] = 1  # merge classes 1 and 2 into label 1
    return X, Y

def data_spilt2(df_train):
    # second binary split: class 1 vs. class 2 (class 0 rows are dropped)
    df_train.drop(df_train.index[(df_train['Species'] == 0)], inplace=True)
    X = np.array(df_train.iloc[:, :5])
    Y = np.array(df_train.iloc[:, -1])
    Y[Y == 1] = 0  # relabel class 1 as 0
    Y[Y == 2] = 1  # relabel class 2 as 1
    return X, Y

def gk(x, y, w):
    # stochastic gradient for one sample: -(y - sigma(w^T x)) * x
    h = np.exp(w.T @ x)
    g = -(y - h / (1 + h)) * x
    return g

def logistic(X, Y):
    # stochastic gradient descent, cycling through the training samples
    w = Weight()
    k, sigma, alpha, MaxN = 0, 10 ** (-5), 0.1, 8000
    for i in range(MaxN):
        j = i % len(X)
        x = X[j].reshape(-1, 1)
        y = Y[j]
        g = gk(x, y, w)
        w = w - alpha * g              # omega_{k+1} = omega_k - alpha * d_k
        k += 1
        if np.linalg.norm(g) < sigma:  # stop once the gradient is tiny
            break
    return w

def Accuracy(W_matrix, df_test):
    # chain the two binary classifiers: the first decides 0 vs. {1, 2},
    # the second decides 1 vs. 2
    X = np.array(df_test.iloc[:, :5])
    Y = np.array(df_test.iloc[:, -1])
    preds = []
    for x in X:
        scores = [w.T @ x for w in W_matrix]
        if scores[0][0] <= 0:    # first classifier says class 0
            preds.append(0)
        elif scores[1][0] <= 0:  # second classifier says class 1
            preds.append(1)
        else:                    # otherwise class 2
            preds.append(2)
    accuracy = sum(1 for a, b in zip(Y, preds) if a == b) / len(Y)
    return accuracy

if __name__ == '__main__':
    df_train, df_test, n, cate = load()
    W_matrix = []
    for split in (data_spilt1, data_spilt2):  # one weight vector per binary split
        X, Y = split(df_train)
        w = logistic(X, Y)
        W_matrix.append(w)
    print('w:', W_matrix)
    accuracy = Accuracy(W_matrix, df_test)
    print('accuracy score:', accuracy)
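As a sanity check, the handwritten classifier can be compared against scikit-learn's built-in `LogisticRegression` on the same preprocessed data (an illustrative sketch; `max_iter=1000` is an arbitrary choice):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

iris = load_iris()
X = MinMaxScaler().fit_transform(iris.data)  # same scaling as above, without the bias column
X_train, X_test, y_train, y_test = train_test_split(
    X, iris.target, test_size=0.3, random_state=42)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
acc = clf.score(X_test, y_test)
print('sklearn accuracy:', acc)
```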