[TensorFlow] Implementing Linear Regression and Logistic Regression with TensorFlow

Day-yong

Article link: https://blog.csdn.net/Daycym/article/details/89979772

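Preface

Previous article: A Detailed Introduction to TensorFlow (Tensorflow详细入门)

The previous article covered the basic operations of TensorFlow and noted that it is mainly used for machine learning and deep neural network research. This article uses TensorFlow to implement two classic machine learning models, linear regression and logistic regression, as a way to get more familiar with the library.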

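The workflow breaks down roughly into the following steps:

1. Acquire, preprocess, and split the data
2. Build the model graph
3. Build the loss function
4. Choose the optimizer and the optimization objective
5. Initialize the variables
6. Run the graph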
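
The article's examples train on all of the simulated data, so the "split" part of the first step is never shown. As a minimal sketch of what it could look like, the NumPy snippet below shuffles the samples and holds out a fraction for testing. The function name `train_test_split_np`, the 20% ratio, and the toy arrays are assumptions made for illustration; they do not come from the original code.

```python
import numpy as np


def train_test_split_np(features, targets, test_ratio=0.2, seed=0):
    """Shuffle the samples and split them into a training part and a test part."""
    rng = np.random.RandomState(seed)
    indices = rng.permutation(len(features))
    n_test = int(len(features) * test_ratio)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return features[train_idx], targets[train_idx], features[test_idx], targets[test_idx]


# Toy data: 10 samples with 2 features each (illustrative values only).
features = np.arange(20, dtype=np.float64).reshape(10, 2)
targets = np.arange(10)

x_train, y_train, x_test, y_test = train_test_split_np(features, targets)
print(x_train.shape, x_test.shape)
```

Shuffling before splitting matters when the data is ordered, as it is for the `np.linspace`-based samples used later in this article.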

Author's environment:

- Windows 10, 64-bit
- Python 3.6
- tensorflow-gpu 1.9

The code for this post is available on GitHub.

I. Linear regression with `TensorFlow`

Linear regression:

$y = w * x + b$

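Loss function (MSE):

$J(\theta) = \frac{1}{m}\sum_{i=1}^m\bigg(h_\theta(x^{(i)}) - y^{(i)}\bigg)^2$

Here $h_\theta(x) = w * x + b$ is the prediction and $m$ is the number of samples. This is the form computed by `tf.reduce_mean(tf.square(y_hat - y))` in the code.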
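
TensorFlow derives the gradients of the loss automatically, but it can help to see them written out once. Assuming the mean-squared form of the loss, $J = \frac{1}{m}\sum_i (w x^{(i)} + b - y^{(i)})^2$, the partial derivatives are $\frac{\partial J}{\partial w} = \frac{2}{m}\sum_i (w x^{(i)} + b - y^{(i)})\,x^{(i)}$ and $\frac{\partial J}{\partial b} = \frac{2}{m}\sum_i (w x^{(i)} + b - y^{(i)})$. The NumPy sketch below applies them in a plain gradient descent loop; the helper name `mse_and_grads` and the toy data are illustrative assumptions, not part of the original article.

```python
import numpy as np


def mse_and_grads(w, b, x, y):
    """Return the mean squared error and its gradients with respect to w and b."""
    residual = w * x + b - y
    loss = np.mean(residual ** 2)
    grad_w = 2.0 * np.mean(residual * x)
    grad_b = 2.0 * np.mean(residual)
    return loss, grad_w, grad_b


# Toy data lying exactly on y = 2x + 1 (illustrative values, not from the article).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

w, b = 0.0, 0.0
learning_rate = 0.05
for _ in range(2000):
    loss, grad_w, grad_b = mse_and_grads(w, b, x, y)
    # Step against the gradient, which is what a gradient descent optimizer does.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b, loss)
```

On this noise-free data the loop recovers a slope close to 2 and an intercept close to 1.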

This example fits a linear regression model to simulated data.

Code:

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

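# 1. Build a simulated dataset
np.random.seed(28)  # fix the random seed so every run produces the same data
N = 1000
# loc: mean, scale: standard deviation, size: number of points
x = np.linspace(0, 6, N) + np.random.normal(loc=0.0, scale=2, size=N)
y = 14 * x - 7 + np.random.normal(loc=0.0, scale=5.0, size=N)

# Reshape x and y into column vectors (N x 1 matrices)
x.shape = -1, 1
y.shape = -1, 1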

# 2. Build the model
# Define the variables w and b
# tf.random_uniform draws values from a uniform distribution
# shape: shape of the result; minval / maxval: lower / upper bound of the distribution
w = tf.Variable(initial_value=tf.random_uniform(shape=[1], minval=-1.0, maxval=1.0), name='w')
b = tf.Variable(initial_value=tf.zeros([1]), name='b')
# Prediction (the linear regression formula)
y_hat = w * x + b

# Loss function
# MSE: the mean of the squared differences between predictions and targets
loss = tf.reduce_mean(tf.square(y_hat - y), name='loss')

# Minimize the loss with gradient descent
# (every step uses all N samples, so this is full-batch rather than stochastic gradient descent)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.05)
# Tell the optimizer which tensor to minimize
train = optimizer.minimize(loss, name='train')

# Op that initializes all global variables
init_op = tf.global_variables_initializer()

# Run
def print_info(r_w, r_b, r_loss):
    print("w={},b={},loss={}".format(r_w, r_b, r_loss))


with tf.Session() as sess:
    # Initialize the variables
    sess.run(init_op)

    # Print the initial w, b, and loss
    r_w, r_b, r_loss = sess.run([w, b, loss])
    print_info(r_w, r_b, r_loss)

    # Train for 100 steps
    for step in range(100):
        # One gradient descent update
        sess.run(train)
        # Print w, b, and loss after the update
        r_w, r_b, r_loss = sess.run([w, b, loss])
        print_info(r_w, r_b, r_loss)

# Plot the data and the fitted line
plt.scatter(x, y, c='r')
plt.plot(x, r_w * x + r_b)
plt.show()

Result:

[Figure: scatter plot of the simulated data with the fitted regression line]
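
One way to sanity-check the trained parameters is to compare them with the closed-form least-squares fit of the same data. The sketch below regenerates the simulated dataset with the same seed and fits a straight line with `np.polyfit`; it is an independent check added for illustration and is not part of the original code. The names `w_ls` and `b_ls` are hypothetical.

```python
import numpy as np

# Same simulated data as in the TensorFlow example.
np.random.seed(28)
N = 1000
x = np.linspace(0, 6, N) + np.random.normal(loc=0.0, scale=2, size=N)
y = 14 * x - 7 + np.random.normal(loc=0.0, scale=5.0, size=N)

# A degree-1 polynomial fit returns [slope, intercept] of the least-squares line.
w_ls, b_ls = np.polyfit(x, y, deg=1)
print("closed-form w={:.3f}, b={:.3f}".format(w_ls, b_ls))
```

The values should land near the generating parameters (slope 14, intercept -7). If the gradient descent result is still noticeably different after 100 steps, training longer should bring it closer, since the closed-form solution is the minimizer of the same MSE loss.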

II. Logistic regression with `TensorFlow`

Sigmoid function:

$g(z) = \dfrac {1}{1+e^{-z}}$

Logistic regression:

$h_\theta(x) = g(\theta^Tx) = \dfrac {1}{1+e^{-\theta^Tx}}$
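
As a small illustration of these two formulas, here is a NumPy sketch of the sigmoid and of the logistic regression prediction. It evaluates the function in a numerically stable way, which is an implementation choice made here, not something the article prescribes. The parameter values `[5, 3]` mirror the weights the article uses later to generate labels; the sample points are made up.

```python
import numpy as np


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z)) for array input."""
    z = np.asarray(z, dtype=np.float64)
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) cannot overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, use exp(z) / (1 + exp(z)) so exp never sees a large positive argument.
    exp_z = np.exp(z[~pos])
    out[~pos] = exp_z / (1.0 + exp_z)
    return out


def predict_proba(theta, x):
    """h_theta(x) = g(theta^T x), evaluated for every row of x."""
    return sigmoid(x @ theta)


theta = np.array([5.0, 3.0])
x = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, -1.0]])
print(predict_proba(theta, x))
```

A point on the line $\theta^T x = 0$ gets probability 0.5, and points on opposite sides of it get probabilities that sum to 1.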

Loss function (the negative average log-likelihood, which is minimized):

$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \bigg(y^{(i)}\log h_\theta(x^{(i)})+(1-y^{(i)})\log(1-h_\theta(x^{(i)}))\bigg)$
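
To make the loss concrete, the sketch below evaluates the cross-entropy, taken with a leading minus sign so that smaller is better, for two sets of made-up predictions. The function name `binary_cross_entropy` and the clipping constant `eps` are assumptions introduced here to keep `log` away from zero.

```python
import numpy as np


def binary_cross_entropy(y_true, h, eps=1e-12):
    """Average negative log-likelihood for labels in {0, 1} and probabilities h."""
    h = np.clip(h, eps, 1.0 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(h) + (1.0 - y_true) * np.log(1.0 - h))


y_true = np.array([1.0, 0.0, 1.0, 0.0])
confident = np.array([0.9, 0.1, 0.9, 0.1])  # correct and confident predictions
unsure = np.array([0.5, 0.5, 0.5, 0.5])     # no information about the label

print(binary_cross_entropy(y_true, confident))  # equals -log(0.9)
print(binary_cross_entropy(y_true, unsure))     # equals log(2)
```

Confident correct predictions give a loss close to zero, while predictions of 0.5 for every sample give $\log 2$, which is a useful baseline when reading training logs.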

1. Binary classification with `sigmoid` (logistic regression)

We first train on simulated data and look at the result.

Code for the simulated data:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import Binarizer, OneHotEncoder

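# 1. Generate simulated data
np.random.seed(28)
n = 500
# 500 samples with 2 features each, drawn from a normal distribution with mean 0 and standard deviation 2
x_data = np.random.normal(loc=0, scale=2, size=(n, 2))
# The sign of 5*x1 + 3*x2 decides the class; the labels are then one-hot encoded
y_data = np.dot(x_data, np.array([[5], [3]]))
y_data = OneHotEncoder().fit_transform(Binarizer(threshold=0).fit_transform(y_data)).toarray()

# Grid of points used later to draw the decision regions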
t1 = np.linspace(-8, 10, 100)
t2 = np.linspace(-8, 10, 100)
xv, yv = np.meshgrid(t1, t2)
x_test = np.dstack((xv.flat, yv.flat))[0]

plt.scatter(x_data[y_data[:, 0] == 0][:, 0], x_data[y_data[:, 0] == 0][:, 1], s=50, marker='+', c='red')
plt.scatter(x_data[y_data[:, 0] == 1][:, 0], x_data[y_data[:, 0] == 1][:, 1], s=50, marker='x', c='blue')
plt.show()

[Figure: scatter plot of the two simulated classes]
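
The label construction above chains two scikit-learn transformers: `Binarizer(threshold=0)` maps values greater than 0 to 1 and everything else to 0, and `OneHotEncoder` turns each class index into an indicator row. As a rough NumPy-only sketch of the same idea, with made-up score values standing in for `5*x1 + 3*x2`:

```python
import numpy as np

# Illustrative scores; in the article these come from np.dot(x_data, [[5], [3]]).
scores = np.array([[-2.5], [0.0], [1.7], [4.0]])

# Binarize: values strictly greater than 0 become class 1, the rest class 0.
labels = (scores[:, 0] > 0).astype(int)

# One-hot encode: row i of the identity matrix is the code for class i.
one_hot = np.eye(2)[labels]

print(labels)
print(one_hot)
```

With this encoding, column 0 is 1 for class 0 and column 1 is 1 for class 1, which is why the plotting code selects points with `y_data[:, 0] == 0` and `y_data[:, 0] == 1`.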

import numpy as np
import tensorflow as tf
import matplotlib as mpl
import matplotlib.pyplot as plt
from sklearn.preprocessing import Binarizer, OneHotEncoder

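# 1. Generate simulated data
np.random.seed(28)
n = 500
# 500 samples with 2 features each, drawn from a normal distribution with mean 0 and standard deviation 2
x_data = np.random.normal(loc=0, scale=2, size=(n, 2))
# The sign of 5*x1 + 3*x2 decides the class; the labels are then one-hot encoded
y_data = np.dot(x_data, np.array([[5], [3]]))
y_data = OneHotEncoder().fit_transform(Binarizer(threshold=0).fit_transform(y_data)).toarray()

# Grid of points used at the end to draw the decision regions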
t1 = np.linspace(-8, 10, 100)
t2 = np.linspace(-8, 10, 100)
xv, yv = np.meshgrid(t1, t2)
x_test = np.dstack((xv.flat, yv.flat))[0]

# 2. Build the model
# Input placeholders
# x / y: None means the first dimension is unspecified, so any number of samples can be fed in
# x: 2 is the number of features per sample
# y: 2 is the number of classes (one column per class)
x = tf.placeholder(tf.float32, [None, 2], name='x')
y = tf.placeholder(tf.float32, [None, 2], name='y')

# Prediction model
# Weights w and bias b
# w: the first 2 is the number of input features
# w: the second 2 is the number of classes
# b: the 2 is the number of classes
w = tf.Variable(tf.zeros([2, 2]), name='w')
b = tf.Variable(tf.zeros([2]), name='b')
# act holds the sigmoid outputs (one probability per sample and class), as in the logistic regression formula
act = tf.nn.sigmoid(tf.matmul(x, w) + b)

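# Loss function
# tf.reduce_sum: sum; with axis=1 it sums over each row (same meaning as axis in NumPy)
# tf.reduce_mean: mean; without an axis argument it averages over all elements
# Both cross-entropy terms are included so the loss matches the formula given earlier
cost = -tf.reduce_mean(tf.reduce_sum(y * tf.log(act) + (1 - y) * tf.log(1 - act), axis=1))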

# Minimize the loss with gradient descent
# learning_rate: too large and training may not converge; too small and convergence is slow
train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)

# Compare the predicted class with the true class
# tf.argmax: index of the largest value along an axis, same as in NumPy
# tf.equal: element-wise comparison of two tensors; True where they are equal, False otherwise
pred = tf.equal(tf.argmax(act, axis=1), tf.argmax(y, axis=1))
# Accuracy (True is cast to 1, False to 0)
acc = tf.reduce_mean(tf.cast(pred, tf.float32))

# Variable initializer
init = tf.global_variables_initializer()

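# Total number of training iterations
training_epochs = 50
# Number of batches (not used below: every step feeds the full dataset)
num_batch = int(n / 10)
# Interval for printing progress (not used below)
display_step = 5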

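with tf.Session() as sess:
    # Initialize the variables
    sess.run(init)

    for epoch in range(training_epochs):
        # One training step on the full dataset
        sess.run(train, feed_dict={x: x_data, y: y_data})

    # Predict on the grid points used for plotting
    # y_hat: an N x 2 matrix of class scores
    y_hat = sess.run(act, feed_dict={x: x_test})
    # Pick the class with the largest score for each point
    # y_hat: a vector of N class indices
    y_hat = np.argmax(y_hat, axis=1)

print("Training finished!")
# Plot the decision regions and the data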
cm_light = mpl.colors.ListedColormap(['#bde1f5', '#f7cfc6'])
y_hat = y_hat.reshape(xv.shape)
plt.pcolormesh(xv, yv, y_hat, cmap=cm_light)  # 预测值
plt.scatter(x_data[y_data[:, 0] == 0][:, 0], x_data[y_data[:, 0] == 0][:, 1], s=50, marker='+', c='red')
plt.scatter(x_data[y_data[:, 0] == 1][:, 0], x_data[y_data[:, 0] == 1][:, 1], s=50, marker='o', c='blue')
plt.show()

[Figure: decision regions of the trained model with the simulated data points]
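
The script defines an accuracy tensor `acc` but never evaluates it; inside the session it could be read with something like `sess.run(acc, feed_dict={x: x_data, y: y_data})`. To show what that tensor computes without needing TensorFlow, here is a NumPy sketch of the same argmax comparison. The function name `accuracy` and the sample arrays are made up for illustration.

```python
import numpy as np


def accuracy(probabilities, one_hot_labels):
    """Fraction of samples whose highest-scoring class matches the one-hot label."""
    predicted = np.argmax(probabilities, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return np.mean(predicted == actual)


# Illustrative model outputs (one row per sample, one column per class).
probabilities = np.array([[0.8, 0.3], [0.4, 0.7], [0.6, 0.5], [0.2, 0.9]])
one_hot_labels = np.array([[1, 0], [0, 1], [0, 1], [0, 1]])

print(accuracy(probabilities, one_hot_labels))  # 3 of 4 correct -> 0.75
```

Because each sigmoid output is independent, the two columns need not sum to 1; only their relative order matters for the predicted class.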

2. Multi-class classification with `softmax`

This example uses the MNIST handwritten digit dataset. Each sample is an image of a single handwritten digit with shape $28 \times 28$.

Loading the data:

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
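# input_data is imported from the author's local package tensorflow_linear_regression;
# TensorFlow 1.x also ships an MNIST loader as tensorflow.examples.tutorials.mnist.input_data
from tensorflow_linear_regression import input_data

# Use a font with Chinese glyphs so that Chinese text in plots renders correctly
mpl.rcParams['font.sans-serif'] = [u'simHei']
mpl.rcParams['axes.unicode_minus'] = False

# Load the data
mnist = input_data.read_data_sets('data/', one_hot=True)

# The downloaded dataset is split into three subsets:
# 55,000 training samples (mnist.train),
# 5,000 validation samples (mnist.validation),
# 10,000 test samples (mnist.test).
# Each image is a 28x28 grayscale image, so each row is a 784-dimensional vector.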
trainimg = mnist.train.images
trainlabel = mnist.train.labels
testimg = mnist.test.images
testlabel = mnist.test.labels

# Print the array shapes
print(trainimg.shape)
print(trainlabel.shape)
print(testimg.shape)
print(testlabel.shape)
print(trainlabel[0])

# Show 4 random images
nsample = 4
randidx = np.random.randint(trainimg.shape[0], size=nsample)

for i in randidx:
    # Reshape the flat 784-vector back into a 28 x 28 image
    curr_img = np.reshape(trainimg[i, :], (28, 28))  # 28 by 28 matrix
    # The label is one-hot (a single 1 among ten entries), so the index of the maximum is the digit
    curr_label = np.argmax(trainlabel[i, :])  # Label
    # Draw the matrix as an image
    plt.matshow(curr_img, cmap=plt.get_cmap('gray'))
    plt.title("Image " + str(i) + ", label: " + str(curr_label))
    plt.show()

[Figure: sample MNIST digit images with their labels]

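The downloaded dataset is split into three subsets:

- 55,000 training samples (mnist.train)
- 5,000 validation samples (mnist.validation)
- 10,000 test samples (mnist.test)

Each image is a 28x28 grayscale image, so each row is a 784-dimensional vector.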
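
The model in the next listing applies softmax to the linear scores, $\mathrm{softmax}(z)_k = e^{z_k} / \sum_j e^{z_j}$, and trains with the cross-entropy $-\sum_k y_k \log p_k$ averaged over the batch. The NumPy sketch below shows both pieces on made-up logits. Subtracting the row maximum before exponentiating and clipping before the logarithm are stability choices made for this sketch; the article's code calls `tf.nn.softmax` and `tf.log` directly.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp()."""
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=1, keepdims=True)


def cross_entropy(one_hot_labels, probabilities, eps=1e-12):
    """Mean over samples of -sum_k y_k * log(p_k)."""
    log_p = np.log(np.clip(probabilities, eps, 1.0))
    return -np.mean(np.sum(one_hot_labels * log_p, axis=1))


# Illustrative logits for 2 samples and 3 classes; the second row would overflow a naive exp().
logits = np.array([[2.0, 1.0, 0.1], [1000.0, 1000.0, 1000.0]])
labels = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

probs = softmax(logits)
print(probs.sum(axis=1))
print(cross_entropy(labels, probs))
```

TensorFlow 1.x also provides fused ops such as `tf.nn.softmax_cross_entropy_with_logits_v2`, which take the raw logits and handle the numerical stability internally; whether to switch to them is a design choice outside the scope of the original code.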

Code:

import numpy as np
import tensorflow as tf
from tensorflow_linear_regression import input_data

# Load the data
mnist = input_data.read_data_sets('data/', one_hot=True)

trainimg = mnist.train.images
trainlabel = mnist.train.labels
testimg = mnist.test.images
testlabel = mnist.test.labels

# Placeholders x, y and variables W, b
x = tf.placeholder("float", [None, 784])  # 784 features per image; None leaves the batch size unspecified
y = tf.placeholder("float", [None, 10])  # 10 outputs, one per digit class 0-9
W = tf.Variable(tf.zeros([784, 10]))  # one weight per pixel (784) and class (10)
b = tf.Variable(tf.zeros([10]))  # one bias per class
# Zeros are used for simplicity; the weights could also be drawn from a random distribution

# Model: softmax applied to x*W + b, the multi-class counterpart of the logistic regression formula
actv = tf.nn.softmax(tf.matmul(x, W) + b)
# Cost: cross-entropy averaged over the batch
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(actv), axis=1))
# Optimization
learning_rate = 0.01
# Minimize the cost with gradient descent
optm = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

pred = tf.equal(tf.argmax(actv, 1), tf.argmax(y, 1))
# Accuracy
accr = tf.reduce_mean(tf.cast(pred, "float"))
# Variable initializer
init = tf.global_variables_initializer()

# Number of epochs
training_epochs = 50
# Batch size
batch_size = 100
# Print results every 5 epochs
display_step = 5
# Open a session
sess = tf.Session()
sess.run(init)

for epoch in range(training_epochs):  # loop over epochs
    avg_cost = 0.
    # 55000 / 100 = 550 batches per epoch
    num_batch = int(mnist.train.num_examples / batch_size)
    for i in range(num_batch):
        # next_batch returns the next batch of images and labels
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        feeds = {x: batch_xs, y: batch_ys}
        # One training step
        sess.run(optm, feed_dict=feeds)
        # Accumulate the average cost over the epoch
        avg_cost += sess.run(cost, feed_dict=feeds) / num_batch
    # Report every display_step epochs
    if epoch % display_step == 0:
        feeds_train = {x: mnist.train.images, y: mnist.train.labels}
        feeds_test = {x: mnist.test.images, y: mnist.test.labels}
        train_acc = sess.run(accr, feed_dict=feeds_train)
        test_acc = sess.run(accr, feed_dict=feeds_test)
        print("Epoch: %03d/%03d cost: %.9f train accuracy: %.3f test accuracy: %.3f"
              % (epoch, training_epochs, avg_cost, train_acc, test_acc))
print("Training finished")
sess.close()  # release the session's resources
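
The training loop relies on `mnist.train.next_batch(batch_size)` to hand out mini-batches. For data that lives in plain NumPy arrays, a comparable loop can be sketched as below. The generator name `iterate_minibatches` and the toy arrays are assumptions for illustration; this is not the implementation behind `next_batch`.

```python
import numpy as np


def iterate_minibatches(features, labels, batch_size, rng):
    """Yield shuffled (features, labels) batches that cover the data once (one epoch)."""
    order = rng.permutation(len(features))
    # Only full batches are produced, mirroring int(num_examples / batch_size) in the article.
    for start in range(0, len(features) - batch_size + 1, batch_size):
        batch_idx = order[start:start + batch_size]
        yield features[batch_idx], labels[batch_idx]


rng = np.random.RandomState(0)
features = np.arange(50, dtype=np.float64).reshape(10, 5)  # 10 toy samples, 5 features each
labels = np.arange(10)

batches = list(iterate_minibatches(features, labels, batch_size=4, rng=rng))
print(len(batches), batches[0][0].shape)
```

With 10 samples and a batch size of 4 this yields two batches and drops the remaining two samples for that epoch; reshuffling on every call means different samples are dropped each time.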