Machine Learning: Logistic Regression

简单668

Last modified: 2023-01-01 21:42:21

Tags: logistic regression

First published: 2020-11-04 14:24:55

Article link: https://blog.csdn.net/qq_34225469/article/details/109489071

Logistic Regression

I. Core Idea

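Linear regression fits a linear model to the data, and the fitted model outputs continuous values. Because that output is unbounded, it is not well suited to classification problems. Logistic regression is used for classification problems with a discrete target variable: its output is the probability that a sample belongs to a given class, and it is mainly used to decide class membership.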

II. Derivation

1. Logistic regression

The prediction function of linear regression is (taking $x_0 = 1$):

${h_\theta }(x) = {\theta _0} + {\theta _1}{x_1} + {\theta _2}{x_2} + ... + {\theta _n}{x_n} = {\theta ^T}x$

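In logistic regression, the linear-regression value $z = {\theta ^T}x$ is passed through the sigmoid activation function $g$ to produce a probability. As shown in Figure 1, $g (z) > 0.5$ when $z > 0$, $g (z) < 0.5$ when $z < 0$, and $g (z) = 0.5$ when $z = 0$. The prediction function of logistic regression is: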

${h_\theta }(x) = g({\theta ^T}x)$

$g(z) = \frac{1}{{1 + {e^{ - z}}}}$

The prediction function is therefore interpreted as the probability that $y = 1$ given $x$ and $\theta$:

${h_\theta }(x) = P(y = 1|x;\theta )$

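For example, if a given $x$ and $\theta$ yield ${h_\theta }(x)=0.7$, the sample belongs to the positive class ($y = 1$) with probability 0.7 and to the negative class with probability 0.3.

Figure 1. The sigmoid function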
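
The properties of the sigmoid listed above can be checked with a few lines of NumPy. This is a minimal sketch: the function names `sigmoid` and `predict_proba` and the sample values of `theta` and `x` are illustrative assumptions, not part of the original article, and `x` is assumed to contain the constant feature $x_0 = 1$.

```python
import numpy as np


def sigmoid(z):
    """g(z) = 1 / (1 + exp(-z))"""
    return 1.0 / (1.0 + np.exp(-z))


def predict_proba(theta, x):
    """h_theta(x) = g(theta^T x), read as P(y = 1 | x; theta)."""
    return sigmoid(np.dot(x, theta))


# Illustrative values only (assumed for this sketch).
theta = np.array([-1.0, 2.0, 0.5])
x = np.array([1.0, 0.8, 0.4])   # x[0] = 1 is the intercept feature

p = predict_proba(theta, x)
print(sigmoid(0.0))             # 0.5 at z = 0
print(p, 1.0 - p)               # P(y = 1) and P(y = 0)
```

Here $z = {\theta ^T}x$ is positive, so the predicted probability of the positive class is above 0.5.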

2. Loss function

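For the linear regression model, the cost function is defined as the sum of the squared errors of the model. In principle the same definition could be reused for logistic regression, but when we substitute

${h_\theta }(x) = \frac{1}{{1 + {e^{ - {\theta ^T}x}}}}$

into the linear-regression cost function, we obtain the curve in Figure 2, which is non-convex. The loss function therefore has to be redesigned so that it is convex and can be minimized reliably in the optimization step that follows. The redesigned loss function is: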

$J(\theta ) = \frac{1}{m}\sum\limits_{i = 1}^m {[ - {y^{(i)}}\log ({h_\theta }({x^{(i)}})) - (1 - {y^{(i)}})\log (1 - {h_\theta }({x^{(i)}}))]}$

Equivalently, it can be written as:

$J(\theta ) = \frac{1}{m}\sum\limits_{i = 1}^m {\mathrm{Cost}({h_\theta }({x^{(i)}}),{y^{(i)}})}$

For $y = 1$ and $y = 0$ the per-sample Cost function is:
When $y = 1$: $\mathrm{Cost}({h_\theta }(x),y) = - \log ({h_\theta }(x))$
When $y = 0$: $\mathrm{Cost}({h_\theta }(x),y) = - \log (1 - {h_\theta }(x))$

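This function has the following properties:

When the true value is $y = 1$ and the prediction is ${h_\theta }(x)=1$, the error is 0; when $y = 1$ but ${h_\theta }(x)$ is not 1, the error grows as ${h_\theta }(x)$ decreases.
When the true value is $y = 0$ and the prediction is ${h_\theta }(x)=0$, the error is 0; when $y = 0$ but ${h_\theta }(x)$ is not 0, the error grows as ${h_\theta }(x)$ increases.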
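
The two properties above can be checked numerically. The following is a minimal sketch assuming NumPy; `cost` and `loss` are illustrative names, and the small constant `eps` that keeps the logarithm finite is an addition of this sketch rather than part of the formula.

```python
import numpy as np


def cost(h, y, eps=1e-12):
    """Per-sample cost: -log(h) if y == 1, -log(1 - h) if y == 0."""
    h = np.clip(h, eps, 1.0 - eps)   # assumption: clip to avoid log(0)
    return -(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))


def loss(h, y):
    """J(theta): mean of the per-sample costs."""
    return np.mean(cost(h, y))


h = np.array([0.99, 0.7, 0.3, 0.01])

# For y = 1 the cost grows as h decreases; for y = 0 it grows as h increases.
print(cost(h, np.ones_like(h)))
print(cost(h, np.zeros_like(h)))
print(loss(np.array([0.9, 0.2]), np.array([1.0, 0.0])))
```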

Figure 2. The non-convex cost function
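
The non-convexity can also be seen numerically without the plot. The sketch below uses a deliberately tiny setting that is assumed here for illustration only: a single training sample with $x = 1$, $y = 1$ and a scalar $\theta$. It compares the squared-error cost with the logarithmic cost by looking at second differences on a uniform grid, which are non-negative everywhere for a convex function.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# One training sample (x = 1, y = 1) and a scalar parameter theta,
# chosen only to keep the illustration one-dimensional.
theta = np.linspace(-6.0, 6.0, 241)
h = sigmoid(theta)

squared_error = (h - 1.0) ** 2        # linear-regression cost applied to h
cross_entropy = -np.log(h)            # logistic-regression cost for y = 1

# A convex function sampled on a uniform grid has non-negative second differences.
d2_squared = np.diff(squared_error, n=2)
d2_cross = np.diff(cross_entropy, n=2)

print("squared error is convex on the grid:", bool(np.all(d2_squared >= 0)))
print("cross-entropy is convex on the grid:", bool(np.all(d2_cross >= 0)))
```

On this grid the squared-error curve has negative second differences for strongly negative $\theta$, while the logarithmic cost does not, which matches the argument for redesigning the loss.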

3. Solving for the parameters

Minimizing the loss function with respect to the parameters is a convex optimization problem. The derivation is as follows:

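Start from the loss function:

$J(\theta ) = - \frac{1}{m}\sum\limits_{i = 1}^m {[{y^{(i)}}\log ({h_\theta }({x^{(i)}})) + (1 - {y^{(i)}})\log (1 - {h_\theta }({x^{(i)}}))]}$

The sigmoid function satisfies:

$g'(z) = g(z)(1 - g(z))$

Therefore:

$\frac{\partial {h_\theta }(x)}{\partial {\theta _j}} = {h_\theta }(x)(1 - {h_\theta }(x)){x_j}$

Differentiating $J(\theta )$ with respect to ${\theta _j}$ gives:

$\frac{\partial J(\theta )}{\partial {\theta _j}} = \frac{1}{m}\sum\limits_{i = 1}^m {({h_\theta }({x^{(i)}}) - {y^{(i)}})x_j^{(i)}}$

The parameters can then be updated with gradient descent, where $\alpha$ is the learning rate:

${\theta _j} := {\theta _j} - \alpha \frac{1}{m}\sum\limits_{i = 1}^m {({h_\theta }({x^{(i)}}) - {y^{(i)}})x_j^{(i)}}$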
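
One way to gain confidence in a gradient formula of the form $\frac{1}{m}\sum ({h_\theta }(x) - y)x_j$ is to compare it with a finite-difference approximation of the loss and then run a few gradient-descent steps. The sketch below assumes NumPy; the synthetic data, the helper names, and the step size `alpha = 0.5` are assumptions made for illustration and do not come from the article.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss(theta, X, y):
    h = sigmoid(X @ theta)
    return np.mean(-(y * np.log(h) + (1.0 - y) * np.log(1.0 - h)))


def gradient(theta, X, y):
    """Analytic gradient: (1/m) * X^T (h - y)."""
    h = sigmoid(X @ theta)
    return X.T @ (h - y) / len(y)


def numerical_gradient(theta, X, y, eps=1e-6):
    """Central finite differences, used only to check the formula."""
    grad = np.zeros_like(theta)
    for j in range(len(theta)):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (loss(theta + step, X, y) - loss(theta - step, X, y)) / (2 * eps)
    return grad


# Synthetic data, assumed for this sketch only.
rng = np.random.default_rng(0)
X = np.hstack([np.ones((50, 1)), rng.normal(size=(50, 2))])  # x_0 = 1
y = (X[:, 1] + X[:, 2] > 0).astype(float)
theta = np.zeros(3)

# Largest difference between the analytic and the numerical gradient.
print(np.max(np.abs(gradient(theta, X, y) - numerical_gradient(theta, X, y))))

# Gradient descent: theta := theta - alpha * gradient
alpha = 0.5
initial_loss = loss(theta, X, y)
for _ in range(200):
    theta = theta - alpha * gradient(theta, X, y)
print(initial_loss, loss(theta, X, y))
```

The first printed value should be close to zero if the analytic gradient is right, and the loss after the updates should be lower than the initial loss.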

III. Implementation

logistic_regression.py

# -*- coding: utf-8 -*-

import numpy as np


class LogisticRegression(object):

    def __init__(self, learning_rate=0.1, max_iter=100, seed=None):
        self.seed = seed
        self.lr = learning_rate
        self.max_iter = max_iter

    def fit(self, x, y):
        np.random.seed(self.seed)
        self.w = np.random.normal(loc=0.0, scale=1.0, size=x.shape[1])
        self.b = np.random.normal(loc=0.0, scale=1.0)
        self.x = x
        self.y = y
        for i in range(self.max_iter):
            self._update_step()
            # print('loss: \t{}'.format(self.loss()))
            # print('score: \t{}'.format(self.score()))
            # print('w: \t{}'.format(self.w))
            # print('b: \t{}'.format(self.b))

    def _sigmoid(self, z):
        return 1.0 / (1.0 + np.exp(-z))

    def _f(self, x, w, b):
        z = x.dot(w) + b
        return self._sigmoid(z)

    def predict_proba(self, x=None):
        if x is None:
            x = self.x
        y_pred = self._f(x, self.w, self.b)
        return y_pred

    def predict(self, x=None):
        if x is None:
            x = self.x
        y_pred_proba = self._f(x, self.w, self.b)
        y_pred = np.where(y_pred_proba < 0.5, 0, 1)  # threshold the probabilities at 0.5
        return y_pred

    def score(self, y_true=None, y_pred=None):
        if y_true is None or y_pred is None:
            y_true = self.y
            y_pred = self.predict()
        acc = np.mean(np.asarray(y_true) == np.asarray(y_pred))  # fraction of correct predictions
        return acc

    def loss(self, y_true=None, y_pred_proba=None):
        if y_true is None or y_pred_proba is None:
            y_true = self.y
            y_pred_proba = self.predict_proba()
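        # clip probabilities away from 0 and 1 so that np.log never receives 0
        y_pred_proba = np.clip(y_pred_proba, 1e-12, 1.0 - 1e-12)
        return np.mean(-1.0 * (y_true * np.log(y_pred_proba) + (1.0 - y_true) * np.log(1.0 - y_pred_proba)))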

    def _calc_gradient(self):
        # the gradient (h(x) - y) needs the predicted probabilities, not the thresholded 0/1 labels
        y_pred = self.predict_proba()
        d_w = (y_pred - self.y).dot(self.x) / len(self.y)
        d_b = np.mean(y_pred - self.y)
        return d_w, d_b

    def _update_step(self):
        d_w, d_b = self._calc_gradient()
        self.w = self.w - self.lr * d_w
        self.b = self.b - self.lr * d_b
        return self.w, self.b

# -*- coding: utf-8 -*-

import numpy as np
import matplotlib.pyplot as plt
import data_helper
from logistic_regression import LogisticRegression


# data generation
x, y = data_helper.generate_data(seed=272)
x_train, y_train, x_test, y_test = data_helper.train_test_split(x, y)

# visualize data
# plt.scatter(x_train[:,0], x_train[:,1], c=y_train, marker='.')
# plt.show()
# plt.scatter(x_test[:,0], x_test[:,1], c=y_test, marker='.')
# plt.show()

# data normalization (min-max scaling; the test set reuses the training-set statistics)
x_min, x_max = np.min(x_train, axis=0), np.max(x_train, axis=0)
x_train = (x_train - x_min) / (x_max - x_min)
x_test = (x_test - x_min) / (x_max - x_min)

# Logistic regression classifier
clf = LogisticRegression(learning_rate=0.1, max_iter=500, seed=272)
clf.fit(x_train, y_train)

# plot the result
split_boundary_func = lambda x: (-clf.b - clf.w[0] * x) / clf.w[1]
xx = np.arange(0.1, 0.6, 0.1)
plt.scatter(x_train[:,0], x_train[:,1], c=y_train, marker='.')
plt.plot(xx, split_boundary_func(xx), c='red')
plt.show()

# loss on test set
y_test_pred = clf.predict(x_test)
y_test_pred_proba = clf.predict_proba(x_test)
print(clf.score(y_test, y_test_pred))
print(clf.loss(y_test, y_test_pred_proba))
# print(y_test_pred_proba)
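
The script above imports a `data_helper` module that is not shown in the article. The following is a hypothetical stand-in, written only so the script has something to run against: the two Gaussian clusters, the `n_samples` and `test_ratio` parameters, and the extra `seed` argument of `train_test_split` are all assumptions. Only the function names and the order of the return values are taken from the calls in the script.

```python
# data_helper.py (hypothetical sketch; the original module is not shown)
import numpy as np


def generate_data(seed=None, n_samples=200):
    """Return two 2-D Gaussian clusters with labels 0 and 1 (assumed data)."""
    rng = np.random.RandomState(seed)
    half = n_samples // 2
    x0 = rng.normal(loc=[1.0, 4.0], scale=1.0, size=(half, 2))
    x1 = rng.normal(loc=[4.0, 1.0], scale=1.0, size=(n_samples - half, 2))
    x = np.vstack([x0, x1])
    y = np.concatenate([np.zeros(half, dtype=int),
                        np.ones(n_samples - half, dtype=int)])
    return x, y


def train_test_split(x, y, test_ratio=0.3, seed=None):
    """Shuffle the samples and split them; note the return order."""
    rng = np.random.RandomState(seed)
    idx = rng.permutation(len(x))
    n_test = int(len(x) * test_ratio)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return x[train_idx], y[train_idx], x[test_idx], y[test_idx]


x, y = generate_data(seed=272)
x_train, y_train, x_test, y_test = train_test_split(x, y, seed=272)
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
```

With data generated this way, the accuracy and loss printed by the script will differ from whatever the author observed, since the original data generator is unknown.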
