Logistic Regression Implementation

Logistic regression maps the output of linear regression through a sigmoid function, turning it into a probability; thresholding that probability yields a classifier.

Code Implementation

sigmoid:
$f(x) = \frac{1}{1 + e^{-x}}$
Loss function:
$J(\theta) = -l(\theta) = -\sum\limits_{i=1}^{n}\left[y^{(i)}\ln\left(h_{\theta}(x^{(i)})\right) + \left(1-y^{(i)}\right)\ln\left(1-h_{\theta}(x^{(i)})\right)\right]$
Linear function:
$y = XW + b$
Gradient update:
$\theta_j^{t+1} = \theta_j^t - \alpha \cdot \sum\limits_{i=1}^{n}\left(h_{\theta}(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$
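
This update follows from differentiating the loss with the chain rule, using the sigmoid derivative $\sigma'(z) = \sigma(z)\left(1-\sigma(z)\right)$:

$\frac{\partial J(\theta)}{\partial \theta_j} = -\sum\limits_{i=1}^{n}\left[\frac{y^{(i)}}{h_{\theta}(x^{(i)})} - \frac{1-y^{(i)}}{1-h_{\theta}(x^{(i)})}\right] h_{\theta}(x^{(i)})\left(1-h_{\theta}(x^{(i)})\right)x_j^{(i)} = \sum\limits_{i=1}^{n}\left(h_{\theta}(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$

Note that the implementation below divides this sum by the sample count m (a mean gradient), which only rescales the effective learning rate.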

import numpy as np

class My_LogisticRegression:

    # No regularization by default; regularization strength Lambda defaults to 1,
    # learning rate a to 0.001, and the iteration cap epochs to 10001
    def __init__(self, penalty=None, Lambda=1, a=0.001, epochs=10001):
        self.W = None
        self.penalty = penalty
        self.Lambda = Lambda
        self.a = a
        self.epochs = epochs
        # sigmoid
        self.sigmoid = lambda x: 1 / (1 + np.exp(-x))

    # Loss function (mean negative log-likelihood)
    def loss(self, x, y):
        m = x.shape[0]
        # Map the linear output to probabilities
        p = self.sigmoid(x * self.W)
        return (-1 / m) * np.sum(np.multiply(y, np.log(p)) + np.multiply(1 - y, np.log(1 - p)))

    # Predict class labels
    def predict(self, X):
        # Add the bias column
        X = np.concatenate((np.ones((X.shape[0], 1)), X), axis=1)
        y_p = np.mat(X) * self.W
        # Probabilities
        p = self.sigmoid(y_p)
        # Threshold at 0.5
        y_p = np.where(p >= 0.5, 1, 0)
        return y_p

    def fit(self, x, y):
        lossList = []
        # Number of samples
        m = x.shape[0]
        # Add the bias column
        X = np.concatenate((np.ones((m, 1)), x), axis=1)
        # Number of features (including the bias)
        n = X.shape[1]
        # Initialize W
        self.W = np.mat(np.ones((n, 1)))
        xMat = np.mat(X)
        yMat = np.mat(y.reshape(-1, 1))
        # Initialize the current and previous loss
        loss = 0
        pre_loss = loss + 1
        # Iterate at most epochs times
        for i in range(self.epochs):
            # Predicted probabilities
            p = self.sigmoid(xMat * self.W)
            gradient = xMat.T * (p - yMat) / m

            # Add the gradient of the L1/L2 regularization term
            # (same approach as the regularized linear regression earlier);
            # for simplicity the bias weight is penalized as well
            if self.penalty == 'l2':
                gradient = gradient + self.Lambda * self.W / m
            elif self.penalty == 'l1':
                gradient = gradient + self.Lambda * np.sign(self.W) / m

            self.W = self.W - self.a * gradient

            # Track the loss
            pre_loss = loss
            loss = self.loss(xMat, yMat)
            if i % 50 == 0:
                lossList.append(loss)
            # Stop once the loss has essentially converged
            if np.abs(pre_loss - loss) < 0.002:
                break
        # Return the coefficients, the loss history, and the iteration count
        return self.W, lossList, i
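
As a sanity check on the analytic gradient used in fit, it can be compared against a central finite-difference approximation of the loss. Below is a minimal sketch on toy data; the numeric_grad helper and the random data are illustrative, not part of the original post:

def numeric_grad(loss_fn, W, eps=1e-6):
    # Central finite differences of loss_fn around W, one coordinate at a time
    g = np.zeros_like(W)
    for j in range(W.shape[0]):
        d = np.zeros_like(W)
        d[j] = eps
        g[j] = (loss_fn(W + d) - loss_fn(W - d)) / (2 * eps)
    return g

# Toy data: 5 samples, a bias column plus 2 features (illustrative)
rng = np.random.default_rng(0)
X = np.concatenate((np.ones((5, 1)), rng.normal(size=(5, 2))), axis=1)
y = np.array([0, 1, 0, 1, 1]).reshape(-1, 1)

sigmoid = lambda z: 1 / (1 + np.exp(-z))
loss_fn = lambda W: -np.mean(y * np.log(sigmoid(X @ W))
                             + (1 - y) * np.log(1 - sigmoid(X @ W)))

W = np.ones((3, 1))
analytic = X.T @ (sigmoid(X @ W) - y) / X.shape[0]  # same mean gradient as in fit()
print(np.allclose(analytic, numeric_grad(loss_fn, W)))  # expect True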

Load the Breast Cancer Dataset

from sklearn import datasets
data = datasets.load_breast_cancer()
from sklearn.preprocessing import scale  # Z-score standardization
np.set_printoptions(suppress=True)
X, y = data['data'], data['target']
# Z-score standardization: zero mean and unit variance per feature,
# which helps plain gradient descent converge
X = scale(X)
# print(X)
# display(X.shape,y.shape)
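
A quick check (not in the original post) that scale produced zero-mean, unit-variance columns:

print(X.mean(axis=0).round(6))  # all approximately 0
print(X.std(axis=0).round(6))   # all approximately 1
print(X.shape, y.shape)         # (569, 30) (569,)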

Train/Test Split

from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2)
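
Without a fixed random_state, the split (and therefore the accuracy numbers below) will vary from run to run. A reproducible variant, as an optional tweak not in the original:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)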

Training and Evaluation

import warnings
# Silence warnings (e.g. numpy's matrix deprecation notice)
warnings.filterwarnings('ignore')

lgr = My_LogisticRegression(penalty="l1")
W, loss_list, times = lgr.fit(X_train, y_train)
# Predict and evaluate
from sklearn.metrics import accuracy_score
y_pre = lgr.predict(X_test)
score = accuracy_score(y_test, y_pre)
print("L1-regularized logistic regression accuracy:", score)
L1-regularized logistic regression accuracy: 0.9298245614035088
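
Since fit also returns the loss history (one entry every 50 iterations), convergence can be inspected visually; a minimal matplotlib sketch, not part of the original post:

import matplotlib.pyplot as plt

plt.plot(loss_list)
plt.xlabel('checkpoint (every 50 iterations)')
plt.ylabel('training loss')
plt.show()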

Comparison with sklearn

from sklearn.linear_model import LogisticRegression

lcs = LogisticRegression()
lcs.fit(X_train,y_train)
pre = lcs.predict(X_test)
score = accuracy_score(y_test,pre)
print("sklearn 逻辑回归的准确率:",score)
sklearn 逻辑回归的准确率: 0.9824561403508771
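
Note that sklearn's LogisticRegression defaults to L2 regularization with the lbfgs solver, so the comparison above is not quite like-for-like with the hand-written L1 model. A closer match would select the L1 penalty explicitly (a sketch; penalty='l1' requires the liblinear or saga solver):

lcs_l1 = LogisticRegression(penalty='l1', solver='liblinear')
lcs_l1.fit(X_train, y_train)
print("sklearn L1 logistic regression accuracy:", accuracy_score(y_test, lcs_l1.predict(X_test)))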