Logistic Regression

韩明宇

Article link: https://blog.csdn.net/qq_37098526/article/details/88766757

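Introduction

Suppose you are the head of a university department and want to estimate each applicant's chance of admission from their scores on two exams. You have historical data on previous applicants that can serve as a training set for logistic regression. For each training example, you have the applicant's scores on the two exams and the admission decision.

Plotting the data

The horizontal and vertical axes show the applicant's scores on the two exams; admitted and non-admitted examples are drawn with two different markers.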

# PLOTDATA Plots the data points X and y into a new figure
#   PLOTDATA(x,y) plots the data points with + for the positive examples
#   and o for the negative examples. X is assumed to be a Mx2 matrix.
from matplotlib import pyplot as plt
import numpy as np


def plotData(X, y):
    exam1_0 = []
    exam2_0 = []
    exam1_1 = []
    exam2_1 = []
    for i in range(len(y)):
        if y[i] == 0:
            exam1_0.append(X[i][0])
            exam2_0.append(X[i][1])
        elif y[i] == 1:
            exam1_1.append(X[i][0])
            exam2_1.append(X[i][1])

    plt.title("Training data")
    plt.xlim(30, 100)
    plt.ylim(30, 100)
    plt.xticks(np.arange(30, 101, 10))
    plt.yticks(np.arange(30, 101, 10))
    plt.xlabel("Exam 1 score")
    plt.ylabel("Exam 2 score")
    plt.scatter(exam1_0, exam2_0, s=50, c='y', marker='o')
    plt.scatter(exam1_1, exam2_1, s=50, c='b', marker='+')
    plt.legend(scatterpoints=1, labels=['Not admitted', 'Admitted'], loc=1)
    plt.show()
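
The loop in plotData that sorts the points into four lists can also be written with boolean masks. The following is a minimal sketch, not the author's code; the helper name split_by_label and the toy scores are made up for illustration.

```python
import numpy as np


def split_by_label(X, y):
    """Return (negatives, positives): the rows of X with label 0 and label 1."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).ravel()  # accept labels of shape (m,) or (m, 1)
    return X[y == 0], X[y == 1]


# Toy data (made up for illustration): two exam scores per applicant
X_demo = np.array([[35.0, 78.0], [60.0, 86.0], [79.0, 75.0], [45.0, 56.0]])
y_demo = np.array([0, 1, 1, 0])

neg, pos = split_by_label(X_demo, y_demo)
print(neg.shape, pos.shape)
```

Each returned array can be passed column by column to plt.scatter, for example plt.scatter(pos[:, 0], pos[:, 1], marker='+').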

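[Figure: scatter plot of the training data, Exam 1 score against Exam 2 score, with admitted (+) and not admitted (o) examples]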

Sigmoid function

The hypothesis of logistic regression is defined as: $h_{\theta }(x)=g(\theta ^{T}x)$

where the function g is the sigmoid function, defined as: $g(z)=\frac{1}{1+e^{-z}}$

# SIGMOID Compute sigmoid function
#   g = SIGMOID(z) computes the sigmoid of z (scalar, vector, or matrix).
import numpy as np


def sigmoid(z):
    # np.exp works element-wise, so z may be a scalar or an array
    z = np.asarray(z, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))
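
A few values make the shape of the sigmoid concrete: g(0) = 0.5, g(z) approaches 1 for large positive z and 0 for large negative z, and g(-z) = 1 - g(z). The sketch below checks these properties and evaluates the hypothesis for one applicant. theta_demo and the scores are made-up values, and the leading 1 in x_demo is the intercept feature x_0, which is assumed here.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-np.asarray(z, dtype=float)))


z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
g = sigmoid(z)
print(g)  # increasing values in (0, 1), with g(0) = 0.5

# Hypothesis h_theta(x) = g(theta^T x) for one applicant.
# theta_demo and the scores are made-up values for illustration;
# the leading 1.0 is the assumed intercept feature x_0.
theta_demo = np.array([-25.0, 0.2, 0.2])
x_demo = np.array([1.0, 45.0, 85.0])
h = sigmoid(theta_demo.dot(x_demo))
print(h)  # estimated probability of admission under theta_demo
```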

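Running gradient descent (defined in the sections that follow) for 1,000,000 iterations gives the output below. The cost decreases steadily and approaches, but has not yet reached, the expected cost of about 0.203: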

Running Gradient Descent ...

After 0 steps, the cost function: [0.69829069]
After 100000 steps, the cost function: [0.38738841]
After 200000 steps, the cost function: [0.31655389]
After 300000 steps, the cost function: [0.28368669]
After 400000 steps, the cost function: [0.2646348]
After 500000 steps, the cost function: [0.25216993]
After 600000 steps, the cost function: [0.24337911]
After 700000 steps, the cost function: [0.23685629]
After 800000 steps, the cost function: [0.23183607]
After 900000 steps, the cost function: [0.22786442]
Cost at theta found by gradient descent: 0.224654

Expected cost (approx): 0.203

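Cost function and gradient

The cost function of logistic regression is defined as:

$J(\theta )=\frac{1}{m}\sum_{i=1}^{m}[-y^{(i)}\log(h_{\theta }(x^{(i)}))-(1-y^{(i)})\log(1-h_{\theta }(x^{(i)}))]$

The gradient of the cost function is a vector of the same length as $\theta$, whose j-th element (j=0,1,...,n) is defined as:

$\frac{\partial J(\theta )}{\partial \theta _{j}}=\frac{1}{m}\sum_{i=1}^{m}(h_{\theta }(x^{(i)})-y^{(i)})x_{j}^{(i)}$

Note that although this gradient looks identical to the linear regression gradient, the formula is actually different, because linear regression and logistic regression define $h_{\theta }(x)$ differently.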
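
The gradient formula can be derived from the sigmoid identity $g'(z)=g(z)(1-g(z))$. A short derivation for a single training example, writing $h=h_{\theta }(x)=g(\theta ^{T}x)$ as shorthand:

```latex
\begin{align*}
\frac{\partial h}{\partial \theta_j}
  &= g'(\theta^{T}x)\,x_j = h(1-h)\,x_j \\
\frac{\partial}{\partial \theta_j}\bigl[-y\log h-(1-y)\log(1-h)\bigr]
  &= \left(-\frac{y}{h}+\frac{1-y}{1-h}\right)h(1-h)\,x_j \\
  &= \bigl(-y(1-h)+(1-y)h\bigr)\,x_j \\
  &= (h-y)\,x_j
\end{align*}
```

Averaging this term over the m training examples gives the expression for $\frac{\partial J(\theta )}{\partial \theta _{j}}$.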

# COSTFUNCTION Compute cost and gradient for logistic regression
#   J, grad = COSTFUNCTION(theta, X, y) computes the cost of using theta as
#   the parameter for logistic regression and the gradient of the cost
#   w.r.t. the parameters.
import numpy as np
from sigmoid import sigmoid


def costFunction(theta, X, y):
    m = len(y)  # number of training examples
    J = 0
    grad = np.zeros(shape=(len(theta), 1))
    for i in range(m):
        h = sigmoid(X[i].dot(theta))  # h_theta(x^(i))
        # np.log is the natural logarithm (base e)
        J += -y[i]*np.log(h) - (1-y[i])*np.log(1-h)
    J = J/m

    for i in range(len(theta)):
        for j in range(m):
            grad[i] += (sigmoid(X[j].dot(theta))-y[j]) * X[j][i]
        grad[i] = grad[i]/m

    return J, grad
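
The same cost and gradient can be computed without explicit loops. The sketch below is a vectorized variant, not the author's code: the name cost_function_vec, the 1-D shapes of theta and y, and the toy data are assumptions made for illustration. It also compares the analytic gradient with central finite differences as an independent check.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def cost_function_vec(theta, X, y):
    """Vectorized cost J(theta) and gradient; theta is (n,), X is (m, n), y is (m,)."""
    m = len(y)
    h = sigmoid(X.dot(theta))          # h_theta(x^(i)) for all i at once
    J = np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))
    grad = X.T.dot(h - y) / m          # j-th entry: mean of (h - y) * x_j
    return J, grad


# Toy data (made up for illustration); the first column is the intercept feature
X_demo = np.array([[1.0, 0.5, 1.5], [1.0, 1.0, 0.2], [1.0, -0.3, 0.8], [1.0, 2.0, -1.0]])
y_demo = np.array([1.0, 0.0, 1.0, 0.0])
theta_demo = np.array([0.1, -0.2, 0.3])

J, grad = cost_function_vec(theta_demo, X_demo, y_demo)

# Central finite differences as an independent check of the gradient
eps = 1e-6
numeric = np.zeros_like(theta_demo)
for j in range(len(theta_demo)):
    step = np.zeros_like(theta_demo)
    step[j] = eps
    J_plus, _ = cost_function_vec(theta_demo + step, X_demo, y_demo)
    J_minus, _ = cost_function_vec(theta_demo - step, X_demo, y_demo)
    numeric[j] = (J_plus - J_minus) / (2 * eps)

print(J, grad, numeric)
```

At theta = 0 every prediction equals 0.5, so the cost is ln 2 (about 0.693) whatever the data; this is a convenient first sanity check for any implementation.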

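Gradient descent

Update all $\theta _{j}$ simultaneously:

$\theta _{j}:=\theta _{j}-\alpha \frac{1}{m}\sum_{i=1}^{m}(h_{\theta }(x^{(i)})-y^{(i)})x_{j}^{(i)}$

(This looks identical to the update rule for linear regression; the difference lies in the definition of $h_{\theta }(x)$.)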

from costFunction import costFunction
import numpy as np


def gradientDescent(X, y, theta, alpha, num_iters):
    theta = np.array(theta, dtype=float)  # work on a copy of the caller's array
    J_history = np.zeros(shape=(num_iters, 1))

    for i in range(num_iters):
        # grad is evaluated once at the current theta, so the loop below
        # updates all theta_j simultaneously
        _, grad = costFunction(theta, X, y)
        for j in range(len(theta)):
            theta[j] -= alpha*grad[j]

        # Save the cost J in every iteration
        cost, _ = costFunction(theta, X, y)
        J_history[i] = cost
        if i % 100000 == 0:
            print("After %d steps, the cost function:" % i, J_history[i])

    return theta, J_history[-1]
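
The update rule can also be applied to the whole parameter vector in one step. The following is a minimal vectorized sketch under stated assumptions: the name gradient_descent_vec, the toy data, and the values alpha=0.1 and num_iters=500 are chosen for illustration and are not the author's settings.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gradient_descent_vec(X, y, theta, alpha, num_iters):
    """Batch gradient descent; returns the final theta and the cost after each step."""
    theta = np.array(theta, dtype=float)   # copy, so the caller's array is untouched
    m = len(y)
    J_history = np.zeros(num_iters)
    for i in range(num_iters):
        h = sigmoid(X.dot(theta))
        theta -= alpha * X.T.dot(h - y) / m   # simultaneous update of all theta_j
        h = sigmoid(X.dot(theta))
        J_history[i] = np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))
    return theta, J_history


# Toy, non-separable data (made up for illustration); first column is the intercept
X_demo = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5], [1.0, 2.0], [1.0, 1.0]])
y_demo = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 1.0])

theta, J_history = gradient_descent_vec(X_demo, y_demo, np.zeros(2), alpha=0.1, num_iters=500)
print(theta, J_history[0], J_history[-1])
```

Returning the whole cost history, rather than only the last value, makes it easy to check that the cost never increases from one step to the next.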

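Evaluating logistic regression

One way to evaluate the learned parameters is to look at how accurately the model predicts the training set. The predict function produces a "1" or "0" prediction for each example in a given dataset using the learned parameter vector $\theta$; the training accuracy of the classifier is the percentage of predictions that agree with the labels.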

# PREDICT Predict whether the label is 0 or 1 using learned logistic
# regression parameters theta
#   p = PREDICT(theta, X) computes the predictions for X using a
#   threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)
from sigmoid import sigmoid
import numpy as np


def predict(theta, X):
    m = len(X)
    p = np.zeros(shape=(m, 1))
    for i in range(m):
        # >= matches the threshold rule stated in the header comment
        if sigmoid(X[i].dot(theta)) >= 0.5:
            p[i] = 1
        else:
            p[i] = 0

    return p
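
The accuracy computation itself is not shown in the listing above. A minimal sketch of how it could be done, assuming predictions and labels are compared element by element; the names predict_vec and accuracy, theta_demo, and the toy data are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def predict_vec(theta, X):
    """Predict 1 where sigmoid(X theta) >= 0.5, else 0."""
    return (sigmoid(X.dot(theta)) >= 0.5).astype(float)


def accuracy(p, y):
    """Percentage of predictions that equal the labels."""
    return np.mean(np.ravel(p) == np.ravel(y)) * 100


# Made-up parameters and data for illustration; first column is the intercept
theta_demo = np.array([-3.0, 1.0])
X_demo = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 4.0], [1.0, 5.0]])
y_demo = np.array([0.0, 1.0, 1.0, 1.0])

p = predict_vec(theta_demo, X_demo)
print(p, accuracy(p, y_demo))
```

Flattening both arrays with np.ravel avoids accidental broadcasting when the predictions have shape (m, 1) and the labels have shape (m,).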

The result is shown below. After 1,000,000 iterations, the training accuracy matches the expected accuracy:

Train Accuracy: 89.000000

Expected accuracy (approx): 89.0

For the full code, see: https://github.com/hanmy1021/MachineLearning
