Logistic Regression

Code Wang

Original link: https://blog.csdn.net/wzl1997/article/details/88913865

Logistic regression is a binary classification algorithm. It belongs to supervised learning, and its computational complexity is low.

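1. Model

Logistic regression is a generalized linear model and a linear classifier. It looks for a line that separates the two classes; in higher dimensions this decision boundary is called a hyperplane. The hyperplane can be written as a linear equation:

$WX+b=0$

where $W$ is the weight and $b$ is the bias. In the multidimensional case, $W$ and $X$ are vectors, and $b$ is a scalar. The algorithm also needs a threshold function, which here is the sigmoid function: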

$f(x)=\frac{1}{1+e^{-x}}$

Its derivative is:

$f^{'}(x)=\frac{e^{-x}}{(1+e^{-x})^{2}}=f(x)[1-f(x)]$
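The identity $f'(x)=f(x)[1-f(x)]$ can be checked numerically. The sketch below is only an illustration: the helper names `sigmoid` and `sigmoid_grad` and the sample points are assumptions made here, not part of the original post. It compares the closed form against a central finite difference.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Closed-form derivative f(x) * (1 - f(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

# Compare the closed form with a central finite difference on a few points.
x = np.linspace(-5.0, 5.0, 11)
eps = 1e-6
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
max_gap = np.max(np.abs(numeric - sigmoid_grad(x)))
print(max_gap)  # a very small number if the identity holds
```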

For an input vector X, the probability that it belongs to the positive class is:

$P(y=1|X,W,b)=\sigma (WX+b)=\frac{1}{1+e^{-(WX+b)}}$

Similarly, the probability that it belongs to the negative class is:

$P(y=0|X,W,b)=1-P(y=1|X,W,b)=1-\sigma (WX+b)=\frac{e^{-(WX+b)}}{1+e^{-(WX+b)}}$
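The two probabilities can be computed directly from these formulas. In this minimal sketch the weights, bias, and samples are hypothetical values chosen for illustration, and `class_probabilities` is a name introduced here. A point lying exactly on the hyperplane $WX+b=0$ gets probability 0.5 for both classes.

```python
import numpy as np

def class_probabilities(X, W, b):
    """Return (P(y=1|X), P(y=0|X)) for each row of X."""
    z = X @ W + b
    p_pos = 1.0 / (1.0 + np.exp(-z))
    return p_pos, 1.0 - p_pos

# Hypothetical weights and samples, chosen only for illustration.
W = np.array([0.5, -0.25])
b = 0.0
X = np.array([[1.0, 2.0],    # lies on the hyperplane WX + b = 0
              [-1.0, 0.5]])  # lies on the negative side (WX + b < 0)
p_pos, p_neg = class_probabilities(X, W, b)
print(p_pos, p_neg)
```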

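To obtain the hyperplane we need to solve for the parameters W and b, and to solve for these two parameters we must first define a loss function.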

2. Loss function

The probability that a sample has label y is:

$P(y|X,W,b)=\sigma (WX+b)^{y}(1-\sigma (WX+b))^{1-y}$

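The parameters can be estimated by maximum likelihood. Suppose the training data contains m samples $\{(X^{(1)},y^{(1)}),(X^{(2)},y^{(2)}),\ldots,(X^{(m)},y^{(m)})\}$. The likelihood function can then be written as:

$L_{W,b}=\prod_{i=1}^{m}[h_{W,b}(X^{(i)})^{y^{(i)}}(1-h_{W,b}(X^{(i)}))^{1-y^{(i)}}]$

where $h_{W,b}(X^{(i)})=\sigma (WX^{(i)}+b)$. To maximize the likelihood, one usually works with the log-likelihood. Its negative, the negative log-likelihood (NLL), averaged over the m samples, serves as the loss function: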

$l_{W,b}=-\frac{1}{m}\log L_{W,b}= -\frac{1}{m}\sum_{i=1}^{m}[y^{(i)}\log(h_{W,b}(X^{(i)}))+(1-y^{(i)})\log(1-h_{W,b}(X^{(i)}))]$
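As a rough illustration, the NLL can be evaluated in a few lines of NumPy. The function name `nll`, the clipping constant `eps`, and the toy data are assumptions added here; clipping is just one common way to avoid `log(0)` and is not taken from the original post.

```python
import numpy as np

def nll(W, b, X, y, eps=1e-12):
    """Mean negative log-likelihood of labels y under sigmoid(X @ W + b)."""
    h = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    h = np.clip(h, eps, 1.0 - eps)  # assumption: clip to keep log() finite
    return -np.mean(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))

# Toy data: one feature, one negative and one positive sample.
X = np.array([[0.0], [1.0]])
y = np.array([0.0, 1.0])

loss_zero = nll(np.array([0.0]), 0.0, X, y)   # h = 0.5 everywhere, so loss = log(2)
loss_good = nll(np.array([4.0]), -2.0, X, y)  # separates the two samples, lower loss
print(loss_zero, loss_good)
```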

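To find the minimum of the loss function, a gradient-based method can be used.

3. Gradient descent

Gradient descent is an iterative optimization algorithm. Starting from an initial point, each iteration chooses a descent direction and updates the parameters along it. For the optimization problem $\min_{w} f(w)$, gradient descent proceeds as follows: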

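a. Randomly choose an initial point $w_{0}$
b. Repeat the following steps:

Determine the descent direction: $d_{i}=-\frac{\partial}{\partial w}f(w)|_{w_{i}}$

Choose a step size $\alpha$

Update: $w_{i+1}=w_{i}+\alpha \cdot d_{i}$

c. Stop when the termination condition is met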
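The steps above can be sketched as a small generic routine. Everything here is illustrative: the function name `gradient_descent`, the fixed step size, the gradient-norm stopping rule, and the quadratic test function are assumptions, and a fixed starting point is used instead of a random one so that the run is reproducible.

```python
import numpy as np

def gradient_descent(grad, w0, alpha=0.1, max_iter=1000, tol=1e-8):
    """Minimize a function given its gradient, using a fixed step size alpha."""
    w = np.asarray(w0, dtype=float)
    for _ in range(max_iter):
        d = -grad(w)                 # descent direction
        if np.linalg.norm(d) < tol:  # termination condition (assumed here)
            break
        w = w + alpha * d            # update step
    return w

# Example: f(w) = ||w - target||^2 has gradient 2 * (w - target).
target = np.array([3.0, -1.0])
w_min = gradient_descent(lambda w: 2.0 * (w - target), w0=np.array([0.0, 0.0]))
print(w_min)  # close to the target
```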

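4. Training the model with gradient descent

The gradient of the loss defined above is:

$\nabla_{w_{j}}(l_{W,b})=-\frac{1}{m}\sum_{i=1}^{m}(y^{(i)}-h_{W,b}(X^{(i)}))x_{j}^{(i)}$

[Figure: hand-written derivation of the gradient]

where $x_{j}^{(i)}$ denotes the j-th component of the sample $X^{(i)}$. Because gradient descent moves against the gradient, the update rule is:

$W_{j}=W_{j}-\alpha \nabla_{w_{j}}(l_{W,b})$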
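The gradient can be derived with the chain rule and the sigmoid derivative $f'(x)=f(x)[1-f(x)]$ from section 1. The following is a short sketch, where $h^{(i)}$ is shorthand (introduced here) for $h_{W,b}(X^{(i)})$. First, differentiate the model output:

$\frac{\partial h^{(i)}}{\partial W_{j}}=h^{(i)}(1-h^{(i)})x_{j}^{(i)}$

Then differentiate one term of the sum in the loss:

$\frac{\partial}{\partial W_{j}}[y^{(i)}\log h^{(i)}+(1-y^{(i)})\log(1-h^{(i)})]=[y^{(i)}(1-h^{(i)})-(1-y^{(i)})h^{(i)}]x_{j}^{(i)}=(y^{(i)}-h^{(i)})x_{j}^{(i)}$

Averaging over the m samples and applying the leading minus sign of the NLL gives:

$\frac{\partial l_{W,b}}{\partial W_{j}}=-\frac{1}{m}\sum_{i=1}^{m}(y^{(i)}-h^{(i)})x_{j}^{(i)}$

The same steps applied to b give $-\frac{1}{m}\sum_{i=1}^{m}(y^{(i)}-h^{(i)})$, which is the case of a constant feature equal to 1.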

5. Python source code for training the model

import numpy as np

# sigmoid function
def sig(x):
    return 1.0 / (1 + np.exp(-x))

# train the model with batch gradient descent (BGD)
def lr_train_bgd(feature, label, maxCycle, alpha):
    n = np.shape(feature)[1]  # number of features, including the bias column
    # np.asmatrix replaces the np.mat alias, which was removed in NumPy 2.0
    w = np.asmatrix(np.ones((n, 1)))  # initialize the weights
    for i in range(1, maxCycle + 1):
        h = sig(feature * w)  # predicted probabilities, shape (m, 1)
        err = label - h
        if i % 100 == 0:
            print('-----iter=' + str(i) + ', train loss=' + str(error_rate(h, label)))
        # summed (not averaged) gradient step; the 1/m factor is absorbed into alpha
        w = w + alpha * feature.T * err
    return w

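# compute the mean negative log-likelihood (the loss);
# despite the name, this is not a misclassification rate
def error_rate(h, label):
    m = np.shape(h)[0]
    sum_err = 0.0
    for i in range(m):
        # skip samples where h saturates to exactly 0 or 1, to avoid log(0)
        if h[i, 0] > 0 and (1 - h[i, 0]) > 0:
            sum_err -= (label[i, 0] * np.log(h[i, 0]) + (1 - label[i, 0]) * np.log(1 - h[i, 0]))
    return sum_err / m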

# load the training data: tab-separated features, with the label in the last column
def load_data(file_name):
    feature_data = []
    label_data = []
    with open(file_name) as f:
        for line in f:
            if not line.strip():  # skip blank lines
                continue
            fields = line.strip().split("\t")
            feature_tmp = [1.0]  # constant 1 so that w[0] acts as the bias b
            for i in range(len(fields) - 1):
                feature_tmp.append(float(fields[i]))
            feature_data.append(feature_tmp)
            label_data.append([float(fields[-1])])
    return np.asmatrix(feature_data), np.asmatrix(label_data)

# save the model, in other words, save the weights
def save_model(file_name, w):
    m = np.shape(w)[0]
    w_array = [str(w[i, 0]) for i in range(m)]
    with open(file_name, 'w') as f_w:
        f_w.write("\t".join(w_array))  # tab-separated, so the test script can split on "\t"

# the main function
if __name__ == "__main__":
    feature, label = load_data('data.txt')
    w = lr_train_bgd(feature, label, 1000, 0.01)
    save_model("weights", w)
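For comparison, the same batch gradient descent can be sketched with plain NumPy arrays instead of `np.matrix`. This is not the author's script: the function name `train_bgd`, the synthetic two-cluster data, the random seed, and the step size are all assumptions made for illustration. Unlike the script above, it uses the averaged gradient (the $\frac{1}{m}$ factor from section 4).

```python
import numpy as np

def train_bgd(X, y, max_cycle=1000, alpha=0.1):
    """Batch gradient descent on ndarrays; X must already contain a bias column."""
    w = np.ones(X.shape[1])
    for _ in range(max_cycle):
        h = 1.0 / (1.0 + np.exp(-(X @ w)))
        w = w + alpha * (X.T @ (y - h)) / len(y)  # averaged-gradient update
    return w

# Synthetic data (assumption): two Gaussian clusters, 100 samples each.
rng = np.random.default_rng(0)
neg = rng.normal(loc=-2.0, scale=1.0, size=(100, 2))
pos = rng.normal(loc=2.0, scale=1.0, size=(100, 2))
X = np.hstack([np.ones((200, 1)), np.vstack([neg, pos])])  # prepend the bias column
y = np.concatenate([np.zeros(100), np.ones(100)])

w = train_bgd(X, y)
pred = (X @ w >= 0).astype(float)  # sigmoid(z) >= 0.5 exactly when z >= 0
accuracy = np.mean(pred == y)
print(w, accuracy)
```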

6. Training results

7. Python source code for testing the model

import numpy as np
import matplotlib.pyplot as plt

# load the trained weights
def load_weight(file_name):
    w = []
    with open(file_name) as f:
        for line in f:
            fields = line.strip().split("\t")
            w.append([float(x) for x in fields])
    return np.asmatrix(w)  # shape (1, n): one row of n weights

# load the test data; n is the number of weights (features plus bias)
def load_data(filename, n):
    feature_data = []
    with open(filename) as f:
        for line in f:
            fields = line.strip().split("\t")
            if len(fields) != n - 1:  # skip lines with the wrong number of features
                continue
            feature_tmp = [1.0]  # bias term
            for x in fields:
                feature_tmp.append(float(x))
            feature_data.append(feature_tmp)
    return np.asmatrix(feature_data)

# the function of sigmoid
def sig(x):
    return 1.0 / (1 + np.exp(-x))


# predict the test data
def predict(data, w):
    h = sig(data * w.T)
    m = np.shape(h)[0]
    for i in range(m):
        if h[i, 0] < 0.5:
            h[i, 0] = 0.0
        else:
            h[i, 0] = 1.0
    return h

# the main function
if __name__ == "__main__":
    w = load_weight("weights")
    n = np.shape(w)[1]
    testData = load_data("test_data", n)
    h = predict(testData, w)
    # plot feature 1 against feature 2 as flat arrays; the 0:100 / 100:200 split
    # presumably relies on the test file listing one class first, then the other
    x1 = np.asarray(testData[:, 1]).ravel()
    x2 = np.asarray(testData[:, 2]).ravel()
    plt.plot(x1[0:100], x2[0:100], 'gs')
    plt.plot(x1[100:200], x2[100:200], 'rs')
    plt.show()
    print(h)
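The thresholding loop in `predict` can also be written in vectorized form, and with two features the learned hyperplane from section 1 becomes a line that could be drawn over the scatter plot. The sketch below is an illustration under assumptions: the names `predict_labels` and `boundary_x2` and the example weights are hypothetical, and the weights are taken to be ordered as bias, first feature, second feature, as in the scripts above.

```python
import numpy as np

def predict_labels(data, w):
    """Vectorized thresholding: 1.0 where sigmoid(data @ w) >= 0.5, else 0.0."""
    h = 1.0 / (1.0 + np.exp(-(data @ w)))
    return np.where(h >= 0.5, 1.0, 0.0)

def boundary_x2(w, x1):
    """Solve w[0] + w[1]*x1 + w[2]*x2 = 0 for x2 (requires w[2] != 0)."""
    return -(w[0] + w[1] * x1) / w[2]

# Hypothetical weights: bias, weight of feature 1, weight of feature 2.
w = np.array([1.0, 2.0, -4.0])
data = np.array([[1.0, 1.0, 0.0],   # z = 1 + 2 - 0 = 3, positive side
                 [1.0, 1.0, 2.0]])  # z = 1 + 2 - 8 = -5, negative side
labels = predict_labels(data, w)
x2_on_line = boundary_x2(w, 1.0)    # the boundary point for x1 = 1
print(labels, x2_on_line)
```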