Machine Learning: The Logistic Regression Model

简范式AI

Category: Machine Learning. Tags: Logistic Regression

Article link: https://blog.csdn.net/havefun00/article/details/79586211

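Machine Learning: The Logistic Regression Model

1. Introduction to the Logistic Regression Model
2. Mathematical Principles of Logistic Regression
3. Algorithm and Python Implementation
4. Summary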

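1. Introduction to the Logistic Regression Model

This article focuses on the binomial logistic regression model. Logistic regression is a classification model represented by the conditional probability distribution $P(Y|X)$, which takes the form of a parameterized logistic distribution. Here the random variable X takes real values, the random variable Y takes the value 1 or 0, and the model parameters are estimated by supervised learning.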
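
To make the phrase "parameterized logistic distribution" concrete, the conditional distribution can be written out as below. This is a sketch under two assumptions that the text above does not state: the parameter vector is named $\theta$ (the symbol used in Section 2), and any bias term is absorbed into $\theta$ by giving x a constant feature equal to 1.

```latex
\begin{aligned}
P(Y=1 \mid x) &= \frac{1}{1+e^{-\theta^{T}x}} \\
P(Y=0 \mid x) &= 1 - P(Y=1 \mid x) = \frac{e^{-\theta^{T}x}}{1+e^{-\theta^{T}x}} \\
\log\frac{P(Y=1 \mid x)}{P(Y=0 \mid x)} &= \theta^{T}x
\end{aligned}
```

The last line shows that the log-odds of the positive class are linear in x, which is why the model is still called a "regression".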

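2. Mathematical Principles of Logistic Regression

Although logistic regression has "regression" in its name, it is actually a classification method, used mainly for binary classification (the output takes only two values, one for each class). To map a real-valued score to a probability between 0 and 1, it uses the sigmoid function (also called the logistic function), which has the form: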

$g(z)=\frac{1}{1+e^{-z}}$
The sigmoid function is plotted in the figure below.

[Figure: plot of the sigmoid function]
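
Since the plot is not reproduced here, a few sample values can stand in for it. The following is a minimal sketch, not the implementation used later in this article: it evaluates the function in a numerically stable way (an added precaution, so that very negative inputs do not overflow `exp`), and the sample points in `zs` are arbitrary illustration values.

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + exp(-z)), evaluated without overflow."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) <= 1, so the direct formula is safe.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, use the equivalent form exp(z) / (1 + exp(z)),
    # which keeps the exponent negative.
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

# Arbitrary sample points, including two extreme ones.
zs = np.array([-1000.0, -2.0, 0.0, 2.0, 1000.0])
probs = sigmoid(zs)
print(probs)
```

The values rise monotonically from 0 toward 1, pass through 0.5 at z = 0, and satisfy g(-z) = 1 - g(z), which is what makes the output usable as a class probability.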

The loss function of logistic regression is defined as follows (the full derivation can be found in other references and is not repeated here):

$Cost(h_\theta(x),y) = \begin{cases} -\log(h_\theta(x)), & y=1 \\ -\log(1-h_\theta(x)), & y=0 \end{cases}$

$J(\theta)=\frac{1}{m}\sum_{i=1}^{m}Cost(h_\theta(x_i),y_i)=-\frac{1}{m}\sum_{i=1}^{m}\left[ y_i\log h_\theta(x_i)+(1-y_i)\log(1-h_\theta(x_i)) \right]$
Note: in these formulas $h_\theta(x)=g(\theta^T x)$ is the sigmoid function applied to $\theta^T x$, and $m$ is the number of training samples.
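
The cost $J(\theta)$ can be checked numerically on a tiny example. The sketch below is illustrative only: the toy data `X` and `y` are made up, the first column of `X` is assumed to be a constant 1 acting as the bias feature, and the clipping constant `eps` is an added safeguard against `log(0)`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y, eps=1e-12):
    """Average cross-entropy J(theta) over the m rows of X."""
    h = sigmoid(X @ theta)
    h = np.clip(h, eps, 1.0 - eps)  # keep log() away from 0
    return -np.mean(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))

# Made-up toy data: column 0 is a constant bias feature, column 1 is one real feature.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

j_zero = cost(np.zeros(2), X, y)            # h = 0.5 for every sample
j_good = cost(np.array([0.0, 3.0]), X, y)   # a theta that separates the two classes
print(j_zero, j_good)
```

With $\theta=0$ every prediction is 0.5, so the cost equals $\log 2$; a parameter vector that ranks the samples correctly gives a smaller cost.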

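Gradient descent is used to find the minimum of $J(\theta)$. Each parameter $\theta_j$ is updated as follows:

$\theta_j:=\theta_j-\alpha \frac{1}{m}\sum_{i=1}^m(h_\theta(x_i)-y_i)x_i^{(j)}$

where $\alpha$ is the learning rate and $x_i^{(j)}$ is the j-th feature of the i-th sample.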
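
The update rule can be applied to all components of $\theta$ at once in vectorized form. The following is a rough sketch of batch gradient descent under stated assumptions: the toy data, the learning rate `alpha=0.1`, and the iteration count are all arbitrary illustration values, and the function names are hypothetical rather than taken from this article.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))

def gradient(theta, X, y):
    """All partial derivatives at once: (1/m) * X^T (h - y)."""
    m = X.shape[0]
    return X.T @ (sigmoid(X @ theta) - y) / m

def batch_gradient_descent(X, y, alpha=0.1, num_iter=200):
    theta = np.zeros(X.shape[1])
    history = [cost(theta, X, y)]
    for _ in range(num_iter):
        # theta_j := theta_j - alpha * (1/m) * sum_i (h(x_i) - y_i) * x_i^(j)
        theta = theta - alpha * gradient(theta, X, y)
        history.append(cost(theta, X, y))
    return theta, history

# Made-up, deliberately non-separable toy data; column 0 is a constant bias feature.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, -0.5],
              [1.0, 0.5], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])

theta, history = batch_gradient_descent(X, y)
print(theta, history[0], history[-1])
```

Recording the cost after every step is a simple way to confirm that the chosen learning rate actually decreases $J(\theta)$.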

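3. Algorithm and Python Implementation

Input: training data $T=\{(x_1,y_1),(x_2,y_2),\cdots,(x_m,y_m)\}$, where $x_i=(x_i^{(1)},x_i^{(2)},\cdots,x_i^{(n)})^T$, $x_i^{(j)}$ is the j-th feature of the i-th sample, and $y_i$ is the label of the i-th sample.
Output: the class of instance x.
The Python implementation is as follows (it uses the dataset horse.rar):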

import numpy as np


def sigmoid(inX):
    # Logistic function g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1 + np.exp(-inX))


def stocGradDescent1(dataMatrix, classLabels, numIter=150):
    # Stochastic gradient descent: one weight update per training sample
    m, n = np.shape(dataMatrix)
    weights = np.ones(n)
    for j in range(numIter):
        dataIndex = list(range(m))  # samples not yet used in this pass
        for i in range(m):
            # Step size decays as training proceeds but never drops below 0.01
            alpha = 4 / (1.0 + j + i) + 0.01
            # Pick a random position among the remaining samples
            randIndex = int(np.random.uniform(0, len(dataIndex)))
            sampleIndex = dataIndex[randIndex]
            h = sigmoid(np.sum(dataMatrix[sampleIndex] * weights))
            error = h - classLabels[sampleIndex]
            weights = weights - alpha * error * dataMatrix[sampleIndex]
            del dataIndex[randIndex]  # each sample is used once per pass
    return weights


def classifyVector(inX, weights):
    # Predict class 1 when the estimated probability exceeds 0.5
    prob = sigmoid(np.sum(inX * weights))
    if prob > 0.5:
        return 1.0
    else:
        return 0.0


def colicTest():
    # Each line holds 21 tab-separated features followed by the class label
    trainingSet = []; trainingLabels = []
    with open('./horse/horseColicTraining.txt') as frTrain:
        for line in frTrain:
            currLine = line.strip().split('\t')
            lineArr = []
            for i in range(21):
                lineArr.append(float(currLine[i]))
            trainingSet.append(lineArr)
            trainingLabels.append(float(currLine[21]))
    trainWeights = stocGradDescent1(np.array(trainingSet), trainingLabels, 500)
    errorCount = 0; numTestVec = 0.0
    with open('./horse/horseColicTest.txt') as frTest:
        for line in frTest:
            numTestVec += 1.0
            currLine = line.strip().split('\t')
            lineArr = []
            for i in range(21):
                lineArr.append(float(currLine[i]))
            predClass = int(classifyVector(np.array(lineArr), trainWeights))
            groundLabel = int(float(currLine[21]))
            if predClass != groundLabel:
                errorCount += 1
    errorRate = float(errorCount) / numTestVec
    # These two lines report the last test sample only
    print('The Predict Class is %d' % predClass)
    print('The Ground Truth is %d' % groundLabel)
    print('the error rate of this test is: %f' % errorRate)
    return errorRate


def multiTest():
    # Average the test error over several independent training runs
    numTests = 10; errorSum = 0.0
    for k in range(numTests):
        errorSum += colicTest()
    print("after %d iterations the average error rate is: %f" % (numTests, errorSum / float(numTests)))
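
Because the horse dataset is not bundled with this text, the same stochastic gradient descent idea can be tried on synthetic data. The block below is a self-contained sketch rather than the code above: `sgd_logistic`, the generating weights `true_w`, the sample sizes, and the random seeds are all hypothetical choices made for illustration, and the input to `exp` is clipped as an added guard against overflow.

```python
import numpy as np

def sigmoid(z):
    # Clip the argument so exp() cannot overflow for extreme scores.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))

def sgd_logistic(X, y, num_iter=50, seed=0):
    """Stochastic gradient descent, visiting every sample once per pass in random order."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    weights = np.ones(n)
    for j in range(num_iter):
        for i, idx in enumerate(rng.permutation(m)):
            alpha = 4.0 / (1.0 + j + i) + 0.01  # decaying step size with a floor of 0.01
            error = sigmoid(X[idx] @ weights) - y[idx]
            weights = weights - alpha * error * X[idx]
    return weights

# Synthetic data: labels are drawn from a logistic model with made-up weights.
rng = np.random.default_rng(42)
m = 400
X = np.hstack([np.ones((m, 1)), rng.normal(size=(m, 2))])  # column 0 is the bias feature
true_w = np.array([0.5, 2.0, -1.0])
y = (rng.random(m) < sigmoid(X @ true_w)).astype(float)

# Hold out the last 100 samples for testing.
train_X, test_X = X[:300], X[300:]
train_y, test_y = y[:300], y[300:]

w = sgd_logistic(train_X, train_y)
pred = (sigmoid(test_X @ w) > 0.5).astype(float)
error_rate = np.mean(pred != test_y)
print('learned weights:', w)
print('test error rate: %f' % error_rate)
```

The labels are noisy by construction, so the test error will not reach zero; the point is only that the learned weights should have the same signs as the generating ones.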