1.6 Implementing logistic regression with Keras

小学渣的春天

Article link: https://blog.csdn.net/Doctor_Wei/article/details/109580242

1.6.1 The sigmoid function

$f(x)=\frac{1}{1+e^{-x}}$
The sigmoid output lies in the open interval (0, 1), so it can be read as a "probability".
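As a quick illustration of the formula above, here is a minimal NumPy sketch. The function name `sigmoid` and the sample inputs are chosen for illustration and are not part of the original article.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative inputs: a large negative value, zero, and a large positive value.
xs = np.array([-5.0, 0.0, 5.0])
probs = sigmoid(xs)
print(probs)  # values stay strictly between 0 and 1; sigmoid(0) is 0.5
```

The outputs are symmetric around 0.5, that is, sigmoid(-x) = 1 - sigmoid(x).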

1.6.2 Loss functions

1.6.2.1 Sum-of-squares loss

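Can we define the loss the same way as the mean squared error used earlier?
$L_2=\sum_{i=1}^N(y_i-\hat{y_i})^2$
where $y_i$ is the true label, $\hat{y_i}$ is the predicted value, and $N$ is the number of samples.
Plotting $L_2$ as a function of the model parameters shows more than one extremum: once $\hat{y_i}$ comes from a sigmoid, $L_2$ is no longer convex.
This makes gradient descent awkward, because the updates can stall at a point that is not the global minimum.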
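As a minimal sketch of why the squared loss is awkward with a sigmoid output, the snippet below evaluates the loss of a single sample over a grid of weights and checks the sign of its discrete second derivative. The one-sample setup (x = 1, y = 1, no bias) and the grid are assumptions made for illustration; a convex function would never show a negative second difference.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical one-sample setup: x = 1, y = 1, prediction = sigmoid(w * x).
w = np.linspace(-6.0, 6.0, 241)
squared_loss = (1.0 - sigmoid(w)) ** 2

# Discrete second derivative: negative values mean the curve is concave there.
second_diff = squared_loss[:-2] - 2.0 * squared_loss[1:-1] + squared_loss[2:]
print("concave somewhere:", bool((second_diff < 0).any()))
print("convex somewhere:", bool((second_diff > 0).any()))
```

Both flags come out true, so the curvature changes sign and the loss is not convex in w.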

1.6.2.2 Cross-entropy

$CrossEntropy=-\sum_{i=1}^N[y_i\ln(\hat{y_i})+(1-y_i)\ln(1-\hat{y_i})]$
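The formula can be sketched directly in NumPy. The helper name `cross_entropy`, the clipping constant `eps`, and the sample values are assumptions for illustration, not part of the original article. The second half checks, for a single sample with x = 1 and y = 1, that the loss has positive curvature over a grid of weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(y, y_hat, eps=1e-12):
    """Sum over samples of -[y ln(y_hat) + (1 - y) ln(1 - y_hat)]."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)  # avoid ln(0)
    return -np.sum(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))

# Confident and correct predictions cost little; confident and wrong ones cost a lot.
y = np.array([1.0, 0.0])
loss_good = cross_entropy(y, np.array([0.9, 0.1]))
loss_bad = cross_entropy(y, np.array([0.1, 0.9]))
print(loss_good, loss_bad)

# One-sample convexity check (x = 1, y = 1) over a grid of weights.
w = np.linspace(-6.0, 6.0, 241)
ce = np.array([cross_entropy(np.array([1.0]), sigmoid(np.array([wi]))) for wi in w])
second_diff = ce[:-2] - 2.0 * ce[1:-1] + ce[2:]
print("convex everywhere on the grid:", bool((second_diff > 0).all()))
```

For this sample the loss reduces to ln(1 + e^{-w}), whose second derivative is sigmoid(w)(1 - sigmoid(w)) > 0, which is why gradient descent behaves well with this loss.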

Import the required modules

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

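Generate the data

Define the data-generating function: data_num samples drawn from a normal distribution with mean 0 and standard deviation 0.2 are labelled 0, and data_num samples with mean 1 and standard deviation 0.2 are labelled 1.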

def create_data(data_num=500):
    np.random.seed(251)
    x1 = np.random.normal(0, 0.2, data_num)
    x2 = np.random.normal(1, 0.2, data_num)
    x = np.append(x1,x2)
    y = np.array([0] * data_num + [1] * data_num)
    return x, y

x,y = create_data()
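A quick sanity check on the generated data can be sketched as follows. The function body is copied from above so that the snippet runs on its own; the checks themselves are an addition for illustration.

```python
import numpy as np

def create_data(data_num=500):
    np.random.seed(251)
    x1 = np.random.normal(0, 0.2, data_num)  # class 0, centred at 0
    x2 = np.random.normal(1, 0.2, data_num)  # class 1, centred at 1
    x = np.append(x1, x2)
    y = np.array([0] * data_num + [1] * data_num)
    return x, y

x, y = create_data()
print(x.shape, y.shape)                      # one feature per sample, 2 * data_num samples
print(x[y == 0].mean(), x[y == 1].mean())    # sample means close to 0 and 1
```

Because the two clusters overlap slightly, no threshold separates them perfectly, so test accuracy a little below 100% is expected.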

Split the data into a training set (80%) and a test set (20%)

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=16)

Scatter plots of the training and test sets

plt.scatter(x_train, y_train, color='g', label='train dataset')
plt.legend()
plt.show()

plt.scatter(x_test, y_test, color='c', label='test dataset')
plt.legend()
plt.show()

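Build the model

tf.keras.Sequential stacks neural network layers in order; call .add() to append a layer.

APIs used:

Fully connected layer: tf.keras.layers.Dense

Parameters used:

input_dim: size of the input; required when this is the first fully connected layer.
units: integer, the number of neurons in the fully connected layer.
activation: activation function; a binary classification output usually uses 'sigmoid'.
name: string, a name for this layer.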

Model configuration: tf.keras.Sequential.compile

Parameters used:

loss: loss function; binary classification uses "binary_crossentropy".
optimizer: optimizer; "sgd" is used here. For more optimizers see https://tensorflow.google.cn/api_docs/python/tf/keras/optimizers
metrics: evaluation metrics; "accuracy" is used here. For more metrics see https://tensorflow.google.cn/api_docs/python/tf/keras/metrics

model = Sequential()

# Fully connected layer: one input feature, one sigmoid output unit
model.add(Dense(input_dim=1, units=1, activation='sigmoid', name='dense'))

# Set the loss function, the optimizer, and the evaluation metrics
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])

Inspect the output shape and parameter count of each layer

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 1)                 2         
=================================================================
Total params: 2
Trainable params: 2
Non-trainable params: 0
_________________________________________________________________
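The two parameters in the summary are one weight and one bias. As a rough NumPy sketch of what this single Dense layer computes, under the assumption of made-up parameter values (Keras initialises and learns the real ones):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameter values, chosen only for illustration.
kernel = np.array([[2.0]])   # shape (input_dim, units) = (1, 1)
bias = np.array([-1.0])      # shape (units,) = (1,)

x_batch = np.array([[0.0], [0.5], [1.0]])   # shape (batch, input_dim)
y_hat = sigmoid(x_batch @ kernel + bias)    # shape (batch, units)

print(y_hat.shape)               # (batch, 1), matching the (None, 1) output shape
print(kernel.size + bias.size)   # total number of trainable parameters
```

The leading None in the summary's output shape stands for the batch dimension, which is only fixed when data is passed in.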

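Train the model

API used:

tf.keras.Sequential.fit

Parameters used:

x: input data.
y: target labels.
batch_size: number of samples used for each gradient update.
epochs: number of training passes over the dataset; one epoch is one full pass over the training data.
validation_split: fraction of the data held out as a validation set, between 0 and 1.
shuffle: whether to shuffle the training data before each epoch; defaults to True.

Returns: a History object whose History.history attribute records the loss and metric values on the training and validation sets for every epoch.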

history = model.fit(x=x_train, y=y_train, batch_size=32, epochs=1000, validation_split=0.3, shuffle=True)
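To make the training step less of a black box, here is a minimal NumPy sketch of mini-batch SGD on the binary cross-entropy loss for a one-feature logistic regression. It is not what Keras runs internally line for line. The zero initialisation, the learning rate of 0.1 (larger than the Keras SGD default of 0.01, to converge in fewer epochs), the 200 epochs, the random seed, and the absence of a validation split are all assumptions made for this illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic data in the same spirit as create_data(): two 1-D Gaussian clusters.
rng = np.random.default_rng(0)
n = 500
x = np.concatenate([rng.normal(0, 0.2, n), rng.normal(1, 0.2, n)])
y = np.concatenate([np.zeros(n), np.ones(n)])

w, b = 0.0, 0.0                          # assumed zero initialisation
lr, batch_size, epochs = 0.1, 32, 200    # assumed hyperparameters

for epoch in range(epochs):
    order = rng.permutation(len(x))      # shuffle once per epoch
    for start in range(0, len(x), batch_size):
        idx = order[start:start + batch_size]
        xb, yb = x[idx], y[idx]
        err = sigmoid(w * xb + b) - yb   # gradient of mean cross-entropy w.r.t. the logit
        w -= lr * np.mean(err * xb)
        b -= lr * np.mean(err)

accuracy = np.mean((sigmoid(w * x + b) > 0.5) == (y == 1))
print(w, b, accuracy)
```

The update uses the fact that, for a sigmoid output with cross-entropy loss, the gradient with respect to the pre-activation is simply the prediction minus the label.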

Plot how the loss and metrics change over the epochs

pd.DataFrame(history.history).plot(figsize=(8, 5))
plt.grid(True)
plt.xlabel('epoch')
plt.show()

Evaluate the model on the test set

loss_test, accuracy_test = model.evaluate(x_test, y_test)
print(loss_test, accuracy_test)
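With this configuration, evaluate returns the mean binary cross-entropy and the accuracy at a 0.5 threshold. A hand computation on made-up values can be sketched as follows; the labels and predicted probabilities below are hypothetical, not outputs of the model above.

```python
import numpy as np

# Hypothetical labels and predicted probabilities, for illustration only.
y_true = np.array([0.0, 0.0, 1.0, 1.0])
y_prob = np.array([0.1, 0.6, 0.8, 0.9])

# Mean binary cross-entropy over the samples.
loss = -np.mean(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))

# Accuracy: threshold the probabilities at 0.5 and compare with the labels.
accuracy = np.mean((y_prob > 0.5) == (y_true == 1.0))

print(loss, accuracy)
```

Only the second sample is misclassified, so the accuracy is 0.75, while the loss also penalises how confident each prediction was.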

Predict with the model

Get the predicted probabilities for the test set

y_test_pred = model.predict(x_test)

Plot the test-set scatter together with the prediction curve

plt.plot(np.sort(x_test), y_test_pred[np.argsort(x_test)], color='r', label='predict')
plt.scatter(x_test, y_test, color='c', label='test dataset')
plt.legend()
plt.show()

Inspect the weight w and the bias b of the logistic regression model

w, b = model.layers[0].get_weights()
print('Weight={0} bias={1}'.format(w.item(), b.item()))

Weight=6.792507171630859 bias=-3.3748743534088135
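These two numbers fix the decision boundary: the model predicts 0.5 where w·x + b = 0, that is at x = -b/w. A small sketch using the values printed above (the helper name `predict_proba` is chosen for illustration):

```python
import math

# Values printed by the trained model above.
w = 6.792507171630859
b = -3.3748743534088135

def predict_proba(x):
    """Probability of class 1 for a scalar input x."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

boundary = -b / w
print(boundary)                                  # roughly 0.497
print(predict_proba(0.0), predict_proba(1.0))    # near the two cluster centres
```

The boundary sits almost exactly halfway between the cluster centres 0 and 1, which is what one would expect for two equally sized clusters with the same spread.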
