Common Machine Learning Algorithms (2): Logistic Regression

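Unlike linear regression, logistic regression has no closed-form solution. However, because its loss function is convex, we can train the model with gradient descent.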

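We want the model's output to be a probability between 0 and 1. During training we therefore adjust the parameters so that large outputs correspond to positive examples (true label 1) and small outputs correspond to negative examples (true label 0). The loss function expresses this as follows, where $\hat{y}^{(i)} = \sigma(\boldsymbol{w}^\top \boldsymbol{x}^{(i)} + b)$ is the predicted probability for sample $i$ and $n$ is the number of samples:

$$J(\boldsymbol{w}, b) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y^{(i)} \log \hat{y}^{(i)} + \left(1 - y^{(i)}\right) \log\left(1 - \hat{y}^{(i)}\right) \right]$$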
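To make the behaviour of this loss concrete, here is a minimal sketch that evaluates the binary cross-entropy on four hand-picked predictions. The helper name `binary_cross_entropy` and the toy arrays are illustrative assumptions, not part of the original post.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob):
    """Mean binary cross-entropy for labels in {0, 1} and probabilities in (0, 1)."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

# Toy labels and two sets of predicted probabilities (assumed values)
labels = np.array([1, 0, 1, 0])
good = np.array([0.9, 0.1, 0.8, 0.2])  # high probability for 1s, low for 0s
bad = np.array([0.1, 0.9, 0.2, 0.8])   # the opposite: confidently wrong

loss_good = binary_cross_entropy(labels, good)
loss_bad = binary_cross_entropy(labels, bad)
print('loss for good predictions:', loss_good)
print('loss for bad predictions: ', loss_bad)
```

The predictions that agree with the labels should give a much smaller loss than the confidently wrong ones, which is exactly the pressure gradient descent applies to the parameters.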

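Compute the gradient of the loss $J$ with respect to the weight vector $\boldsymbol{w}$ and the bias $b$ (with $\hat{y}^{(i)}$ the predicted probability for sample $i$ and $n$ the number of samples):

$$\frac{\partial J}{\partial \boldsymbol{w}} = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}^{(i)} - y^{(i)}\right) \boldsymbol{x}^{(i)}, \qquad \frac{\partial J}{\partial b} = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}^{(i)} - y^{(i)}\right)$$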

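Update the weights and the bias, where $\eta$ is the learning rate (`l_r` in the code):

$$\boldsymbol{w} \leftarrow \boldsymbol{w} - \eta \, \frac{\partial J}{\partial \boldsymbol{w}}, \qquad b \leftarrow b - \eta \, \frac{\partial J}{\partial b}$$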
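Before relying on hand-derived gradients, it can help to compare them with finite differences. The sketch below does that for the cross-entropy cost on a small random dataset; the helper names (`cost`, `gradients`), the random data, and the step size `eps` are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(w, b, x, y):
    """Mean binary cross-entropy of the logistic model on (x, y)."""
    p = sigmoid(x @ w + b)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradients(w, b, x, y):
    """Analytic gradients: dw = x^T (p - y) / n, db = sum(p - y) / n."""
    n = x.shape[0]
    err = sigmoid(x @ w + b) - y
    return x.T @ err / n, np.sum(err) / n

# Small synthetic problem (assumed data, for the check only)
rng = np.random.default_rng(0)
x = rng.normal(size=(20, 2))
y = (x[:, [0]] + x[:, [1]] > 0).astype(float)  # column vector, shape (20, 1)
w = rng.normal(size=(2, 1)) * 0.1
b = 0.05

dw, db = gradients(w, b, x, y)

# Central finite differences, one parameter at a time
eps = 1e-6
dw_num = np.zeros_like(w)
for j in range(w.shape[0]):
    step = np.zeros_like(w)
    step[j] = eps
    dw_num[j] = (cost(w + step, b, x, y) - cost(w - step, b, x, y)) / (2 * eps)
db_num = (cost(w, b + eps, x, y) - cost(w, b - eps, x, y)) / (2 * eps)

max_diff = max(np.max(np.abs(dw - dw_num)), abs(db - db_num))
print('largest difference between analytic and numeric gradient:', max_diff)
```

A difference that is tiny relative to the gradient itself suggests the analytic expressions match the cost being minimized.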

import numpy as np
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

np.random.seed(123)
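# n_samples: total number of samples to generate; centers: number of classes;
# n_features (not passed here, default 2): number of features per sample
x, y_true = make_blobs(n_samples=1000, centers=2)
# print(x.shape)       # (1000, 2)
# print(y_true.shape)  # one-dimensional, (1000,)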

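# Dataset
fig = plt.figure(figsize=(8,6))
# plt.scatter draws a scatter plot; the x and y arguments are arrays of shape (n,)
# holding the point coordinates, and c sets the colors.
# x[:, 0] is the first column of every row, x[:, 1] is the second column.
plt.scatter(x[:,0], x[:, 1], c=y_true)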
plt.title('Dataset')
plt.xlabel('First feature')
plt.ylabel('Second feature')
plt.show()

# Reshape targets to get column vector with shape (n_samples, 1)
y_true = y_true[:, np.newaxis]
# print(y_true.shape)  # two-dimensional, (1000, 1)
x_train, x_test, y_train, y_test = train_test_split(x, y_true)
print('Shape of x_train: ', x_train.shape)
print('Shape of y_train: ', y_train.shape)
print('Shape of x_test: ', x_test.shape)
print('Shape of y_test: ', y_test.shape)

class logisticRegression:
    def __init__(self):
        pass

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def train(self, x, y_true, n_iters, l_r):
        n_samples, n_features = x.shape
        self.weight = np.zeros((n_features, 1))
        self.bias = 0
        costs = []

        for i in range(n_iters):
            # Forward pass: predicted probabilities, shape (n_samples, 1)
            y_predict = self.sigmoid(np.dot(x, self.weight) + self.bias)
            # Mean binary cross-entropy over the training set
            cost = (-1 / n_samples) * np.sum(y_true * np.log(y_predict) +
                                             (1 - y_true) * np.log(1 - y_predict))
            # Gradients of the cost with respect to the weights and the bias
            dw = (1 / n_samples) * np.dot(x.T, (y_predict - y_true))
            db = (1 / n_samples) * np.sum(y_predict - y_true)

            # Gradient descent step
            self.weight = self.weight - l_r * dw
            self.bias = self.bias - l_r * db

            costs.append(cost)
            if i % 100 == 0:
                print('Cost after iteration {}:{}'.format(i, cost))
        return self.weight, self.bias, costs

    def predict(self, x):
        y_predict = self.sigmoid(np.dot(x, self.weight) + self.bias)
        y_predict_labels = [1 if elem > 0.5 else 0 for elem in y_predict]
        return np.array(y_predict_labels)[:, np.newaxis]

regressor = logisticRegression()
w_trained, b_trained, costs = regressor.train(x_train, y_train, n_iters=600, l_r=0.009)
fig = plt.figure(figsize=(8,6))
plt.plot(np.arange(600), costs)
plt.title('Development of cost over training')
plt.xlabel('Number of iterations')
plt.ylabel('Cost')
plt.show()

y_p_train = regressor.predict(x_train)
y_p_test = regressor.predict(x_test)
# np.mean(np.abs(...)) is the fraction of wrong labels, so scale it to a percentage
print('Train accuracy: ',
      (100 - np.mean(np.abs(y_p_train - y_train)) * 100), '%')
print('Test accuracy: ',
      (100 - np.mean(np.abs(y_p_test - y_test)) * 100), '%')
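One practical caveat about the cost used in `train`: it takes `np.log` of the sigmoid output, which evaluates `log(0)` once the sigmoid saturates to exactly 0 or 1 in floating point. A possible alternative, sketched here as an assumption rather than as part of the original code, computes the same cross-entropy directly from the linear score `z = x·w + b` with `np.logaddexp`. The function names and test values are illustrative.

```python
import numpy as np

def naive_cost(z, y_true):
    """Cross-entropy via sigmoid then log, as in the training loop."""
    p = 1 / (1 + np.exp(-z))
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def stable_cost(z, y_true):
    """Same quantity rewritten as log(1 + exp(z)) - y * z, which avoids log(0)."""
    z = np.asarray(z, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return np.mean(np.logaddexp(0, z) - y_true * z)

# Moderate scores: both versions should agree
z_small = np.array([2.0, -1.0, 0.5])
y_small = np.array([1.0, 0.0, 1.0])
small_naive = naive_cost(z_small, y_small)
small_stable = stable_cost(z_small, y_small)

# Extreme scores with the wrong sign: the sigmoid rounds to exactly 1.0
z_large = np.array([40.0, -40.0])
y_large = np.array([0.0, 1.0])
with np.errstate(divide='ignore', invalid='ignore'):
    large_naive = naive_cost(z_large, y_large)
large_stable = stable_cost(z_large, y_large)

print('moderate scores:', small_naive, small_stable)
print('extreme scores: ', large_naive, large_stable)
```

On the blob data used in this post the naive form works fine; the rewrite mainly matters for larger feature scales or longer training runs.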

 

(2) Calling the sklearn API directly

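from sklearn.linear_model import LogisticRegression  # logistic regression
module = LogisticRegression()
module.fit(x, y)        # fit on features x and labels y
module.score(x, y)      # mean accuracy on (x, y)
module.predict(test)    # predicted labels for new samples `test`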
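The calls above can be tried end to end on a small synthetic dataset. The following is a minimal sketch: the explicit cluster centers, `random_state=0`, and the variable names are assumptions chosen so that the example is reproducible. It also uses `predict_proba`, which returns one probability column per class.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Two well-separated clusters (centers and seed are assumed for illustration)
x_demo, y_demo = make_blobs(n_samples=200, centers=[[-3, -3], [3, 3]],
                            random_state=0)
x_tr, x_te, y_tr, y_te = train_test_split(x_demo, y_demo, random_state=0)

clf = LogisticRegression()
clf.fit(x_tr, y_tr)                 # labels are passed as a 1-D array

proba = clf.predict_proba(x_te)     # shape (n_test, 2): P(class 0), P(class 1)
labels = clf.predict(x_te)          # hard labels, shape (n_test,)
test_acc = clf.score(x_te, y_te)    # mean accuracy as a fraction in [0, 1]

print('coef_ shape:', clf.coef_.shape, 'intercept_ shape:', clf.intercept_.shape)
print('test accuracy:', test_acc)
```

Note that `score` returns a fraction rather than a percentage, and that `LogisticRegression` applies L2 regularization by default, so its coefficients will generally differ from those of the unregularized implementation written by hand earlier.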

Full code:

import numpy as np
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

np.random.seed(123)
x, y_true = make_blobs(n_samples=1000, centers=2)

# Dataset
fig = plt.figure(figsize=(8, 6))
plt.scatter(x[:, 0], x[:, 1], c=y_true)
plt.title('Dataset')
plt.xlabel('First feature')
plt.ylabel('Second feature')
plt.show()

y_true = y_true[:, np.newaxis]
# print(y_true.shape)  # two-dimensional, (1000, 1)
x_train, x_test, y_train, y_test = train_test_split(x, y_true)
print('Shape of x_train: ', x_train.shape)
print('Shape of y_train: ', y_train.shape)
print('Shape of x_test: ', x_test.shape)
print('Shape of y_test: ', y_test.shape)

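module = LogisticRegression()
# Fit on the training split; sklearn expects a 1-D label array, hence ravel()
module.fit(x_train, y_train.ravel())
y_p_train = module.predict(x_train)
y_p_test = module.predict(x_test)
# predict() returns shape (n_samples,), so flatten the labels before comparing;
# the mean absolute difference is the error fraction, scaled here to a percentage
print('Train accuracy: ',
      (100 - np.mean(np.abs(y_p_train - y_train.ravel())) * 100), '%')
print('Test accuracy: ',
      (100 - np.mean(np.abs(y_p_test - y_test.ravel())) * 100), '%')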
