Logistic (log-odds) regression via gradient descent, with formula derivation and code implementation

请叫我Ricardo

Category: Machine Learning

Article link: https://blog.csdn.net/weixin_43467711/article/details/104072085

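As is well known, logistic regression is a classic machine learning algorithm. Despite the word "regression" in its name, it is used for classification: it models the probability that a sample belongs to a class.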

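In a broad sense it is still a linear model (a generalized linear model): it takes the output of a linear regression and passes it through an activation function, the classic choice being the sigmoid. Note that the sigmoid is not a kernel function in the sense of the Gaussian, polynomial, or linear kernels used by kernel methods. It maps the linear score to a probability, which is what lets a linear model handle classification, but the decision boundary stays linear in the input features.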
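One thing worth checking concretely is what the sigmoid does to the decision boundary. The sketch below is only an illustration: the weights `w`, the bias `b`, and the random points are made-up values, not taken from the post. It suggests that thresholding the probability at 0.5 gives the same labels as thresholding the linear score at 0, so with the raw features the boundary is the hyperplane w·x + b = 0.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters and sample points, chosen only for illustration
w = np.array([2.0, -1.0])
b = 0.5
rng = np.random.default_rng(0)
points = rng.normal(size=(200, 2))

scores = points @ w + b                    # linear score z = w.x + b
labels_from_prob = sigmoid(scores) >= 0.5  # classify by probability
labels_from_score = scores >= 0.0          # classify by the sign of the score

same = bool(np.array_equal(labels_from_prob, labels_from_score))
print("same labels:", same)
```

If a curved boundary is needed, the usual route is to feed nonlinear features (for example polynomial terms) into the same model.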

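From the point of view of the loss function, the loss for logistic regression (LR) comes from maximum likelihood (concretely, take the log of the likelihood and negate it), whereas the loss for linear regression is the least-squares loss.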

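Why do the loss functions differ? Because the response variable y follows a different probability distribution in each case. Linear regression assumes that y, given x, is normally distributed. In logistic regression y is either 0 or 1, so it follows a Bernoulli distribution (a binomial distribution with a single trial).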
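The link between the assumed distribution and the loss can be checked numerically. This is a minimal sketch with made-up labels, probabilities, and targets (all hypothetical values): the average negative log-likelihood of a Bernoulli model equals the cross-entropy, and the Gaussian negative log-likelihood with a fixed sigma equals half the mean squared error plus a constant.

```python
import numpy as np

# Made-up binary labels and predicted probabilities P(y = 1)
y = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
p = np.array([0.9, 0.2, 0.6, 0.7, 0.4])

# Bernoulli likelihood of each label: p^y * (1 - p)^(1 - y)
likelihood = p ** y * (1 - p) ** (1 - y)
bernoulli_nll = -np.log(likelihood).mean()

# Cross-entropy written in its usual loss form
cross_entropy = -(y * np.log(p) + (1 - y) * np.log(1 - p)).mean()

# Made-up real-valued targets and predictions for the Gaussian case
t = np.array([1.2, 0.3, 2.5])
pred = np.array([1.0, 0.5, 2.0])
sigma = 1.0  # assumed fixed noise standard deviation

gaussian_nll = (0.5 * np.log(2 * np.pi * sigma**2)
                + (t - pred) ** 2 / (2 * sigma**2)).mean()
half_mse_plus_const = 0.5 * ((t - pred) ** 2).mean() + 0.5 * np.log(2 * np.pi)

print(np.isclose(bernoulli_nll, cross_entropy))
print(np.isclose(gaussian_nll, half_mse_plus_const))
```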

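Below are the formulas for the logistic activation function and the loss function.

(Figure: activation function and loss function formulas)

Cross-entropy is used as the loss function.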
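For reference, the two formulas can be written out as follows. The notation is an assumption (the original figure is not available); it was chosen to match the `sigmoid` and `loss` methods in the code later in this post, with h the predicted probability and m the number of samples.

```latex
z = w^{\top} x + b, \qquad
h = \sigma(z) = \frac{1}{1 + e^{-z}}

J(w, b) = -\frac{1}{m} \sum_{i=1}^{m}
  \left[ y^{(i)} \log h^{(i)} + \left(1 - y^{(i)}\right) \log\left(1 - h^{(i)}\right) \right]
```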

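The logistic algorithm here is built on a linear model followed by the sigmoid, a monotonic, differentiable function. Why use the sigmoid for logistic regression? Because it squashes any real number into the interval (0, 1), which suits probability prediction, and because its derivative is easy to compute.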
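Both properties are easy to check numerically. The sketch below (the grid of `z` values and the step `eps` are arbitrary choices) confirms that the outputs stay strictly inside (0, 1) on that grid and compares a finite-difference derivative with the closed form σ'(z) = σ(z)(1 − σ(z)).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-10, 10, 2001)   # arbitrary test grid
s = sigmoid(z)

# Outputs stay strictly between 0 and 1 on this grid
in_open_interval = bool(np.all((s > 0) & (s < 1)))

# Central finite difference versus the closed-form derivative
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
analytic = s * (1 - s)
max_abs_error = float(np.max(np.abs(numeric - analytic)))

print(in_open_interval, max_abs_error)
```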

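The sigmoid does tend to cause vanishing gradients during backpropagation, which is its drawback, but that does not stop us from using it for logistic regression. With the cross-entropy loss, the sigmoid's derivative cancels out of the gradient, so the issue is much less of a concern here. More fundamentally, the logistic model is a binary classifier for data that follows a Bernoulli ("two-point") distribution, fitted by maximum likelihood, and writing the Bernoulli distribution in exponential form yields exactly the sigmoid function.
(Source: the "Watermelon Book", Zhou Zhihua's Machine Learning)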
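The claim about the exponential form can be spelled out in a few lines. This is a sketch of the standard argument, with μ denoting P(y = 1) and η the natural parameter (symbols introduced here for illustration):

```latex
P(y \mid \mu) = \mu^{y} (1 - \mu)^{1 - y}
             = \exp\!\left( y \log\frac{\mu}{1 - \mu} + \log(1 - \mu) \right),
\qquad y \in \{0, 1\}

\eta = \log\frac{\mu}{1 - \mu}
\;\Longrightarrow\;
\mu = \frac{1}{1 + e^{-\eta}} = \sigma(\eta)
```

Setting η = wᵀx + b then gives the logistic regression model.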
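Below is the gradient descent derivation. It is rather abstract, so I worked out the gradients for the concrete parameters on scratch paper.
(Figure: gradient descent derivation)
For the weight w and the bias b specifically, the handwritten chain-rule derivation follows.
Preconditions

Gradient of w

Gradient of b. It differs from the gradient of w only in the last factor of the chain rule.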
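Since the handwritten figures are not reproduced here, the chain-rule steps can be sketched as follows. The notation is an assumption, chosen to agree with the `fit` method in the code below: h = σ(z), z = wᵀx + b, X is the m × n data matrix, and L is the loss of a single sample.

```latex
L = -\left[ y \log h + (1 - y) \log(1 - h) \right]

\frac{\partial L}{\partial h} = -\frac{y}{h} + \frac{1 - y}{1 - h} = \frac{h - y}{h (1 - h)},
\qquad
\frac{\partial h}{\partial z} = h (1 - h),
\qquad
\frac{\partial z}{\partial w} = x,
\qquad
\frac{\partial z}{\partial b} = 1

\frac{\partial L}{\partial w} = (h - y)\, x,
\qquad
\frac{\partial L}{\partial b} = h - y

\nabla_{w} J = \frac{1}{m} X^{\top} (h - y),
\qquad
\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left( h^{(i)} - y^{(i)} \right)
```

The factor h(1 − h) from the sigmoid cancels against the denominator of ∂L/∂h, which is why the final gradients are so simple.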

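Next comes the code. The algorithm itself is written by hand without calling a machine learning library; scikit-learn is used only for label encoding and the train/test split.
The iris dataset is used for the demonstration.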

import pandas as pd
import numpy as np
from numpy import dot, log
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import  train_test_split
import matplotlib.pyplot as plt
import matplotlib as mpl

# Feature names
names = ['sepal length', 'sepal width', 'petal length', 'petal width', 'label']
df = pd.read_csv('./iris.data', header=None, names=names)
# Keep only two of the three species (drop Iris-versicolor)
df = df[df['label'] != 'Iris-versicolor']

# Separate the features x from the label y
x = df[names[0:-1]].values
y = df[names[-1]].values
# The labels are species names (strings), so encode them as 0/1
y_first = LabelEncoder().fit_transform(y)
y = np.reshape(y_first, (len(y), 1))

# Use 70% of the data for training and 30% for testing, to check the classification accuracy on the test set
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=0)

cost_x_list = []   # iteration numbers, for plotting
cost_y_list = []   # loss at each iteration

# A logistic regression class modeled on the sklearn interface
class logistic:
    def sigmoid(self, z):
        return 1.0 / (1 + np.exp(-z))

    def loss(self, h, y):    # cross-entropy loss
        return -(y * log(h) + (1 - y) * log(1 - h)).mean()

    def predict_prob(self, x, weight, b):  # probability of class 1
        return self.sigmoid(dot(x, weight) + b)

    def predict(self, x, weight, b):    # final class labels
        pre = self.predict_prob(x, weight, b)
        # threshold the probabilities at 0.5
        return (pre >= 0.5).astype(int).ravel()

    def score(self, x, y, weight, b):   # accuracy
        pre = self.predict(x, weight, b)
        return float(np.mean(pre == np.ravel(y)))

    def fit(self, x, y, alpha, iterations):   # training loop
        m, n = x.shape
        weight = np.zeros((n, 1))  # initial weights
        b = 0  # the bias starts at 0
        j = None
        for iteration in range(1, iterations + 1):
            z = dot(x, weight) + b
            h = self.sigmoid(z)   # h is the predicted probability at this step
            j = self.loss(h, y)   # compute the loss
            # the gradients of w and b follow the handwritten derivation above
            gradient_w = dot(x.T, h - y) / len(y)    # x.T makes the shapes line up: (n, m) times (m, 1)
            gradient_b = np.mean(h - y)
            weight = weight - alpha * gradient_w    # gradient descent update
            b = b - alpha * gradient_b
            cost_x_list.append(iteration)
            cost_y_list.append(j)
            if iteration % 500 == 0:              # print the loss every 500 iterations
                print('loss:', j)
        return weight, b, j           # done: return the weights, the bias, and the last loss
        
alpha = 0.01  # learning rate
iterations = 5000  # number of iterations
lr = logistic()  # instantiate the model
weight,b,j = lr.fit(x_train,y_train,alpha,iterations)
y_pre = lr.predict(x_test,weight,b)
y_prob= lr.predict_prob(x_test,weight,b)
score = lr.score(x_test,y_test,weight,b)

print('final_loss:',j)
print('weight:',weight)
print('b:',b)
print('score:',score)
print('y_pre: ',y_pre)
print('y_prob:',y_prob.ravel())

# Plot the loss curve
plt.figure('loss curve')
cost_plt = plt.gca()
cost_plt.set_xlabel('iteration')
cost_plt.set_ylabel('loss')
plt.title("cost_function")
cost_plt.plot(cost_x_list, cost_y_list, color='r', linewidth=1, alpha=0.6)
plt.show()
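The gradients used in `fit` can be sanity-checked against finite differences. The sketch below does not use the iris data; it uses small synthetic inputs (the shapes, seed, and `eps` are arbitrary assumptions) and standalone helper functions written for this check, then compares the analytic gradients with numerical ones.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(x, y, weight, b):
    # mean cross-entropy, same formula as the loss method above
    h = sigmoid(x @ weight + b)
    return -(y * np.log(h) + (1 - y) * np.log(1 - h)).mean()

def analytic_grads(x, y, weight, b):
    # same expressions as gradient_w and gradient_b in fit
    h = sigmoid(x @ weight + b)
    return x.T @ (h - y) / len(y), float(np.mean(h - y))

# Synthetic data, for illustration only
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 4))
y = (rng.random(size=(50, 1)) < 0.5).astype(float)
weight = rng.normal(scale=0.1, size=(4, 1))
b = 0.1

grad_w, grad_b = analytic_grads(x, y, weight, b)

# Central finite differences, one parameter at a time
eps = 1e-6
num_w = np.zeros_like(weight)
for k in range(weight.shape[0]):
    step = np.zeros_like(weight)
    step[k, 0] = eps
    num_w[k, 0] = (loss(x, y, weight + step, b)
                   - loss(x, y, weight - step, b)) / (2 * eps)
num_b = (loss(x, y, weight, b + eps) - loss(x, y, weight, b - eps)) / (2 * eps)

max_err = max(float(np.max(np.abs(num_w - grad_w))), abs(num_b - grad_b))
print("max abs difference:", max_err)
```

A difference many orders of magnitude below the gradient values themselves is a good sign that the derivation and the code agree.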

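That is the whole process, from the handwritten derivation to the code implementation of logistic regression.
Spending the holiday going back over the fundamentals and re-deriving algorithms was a real pleasure.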
