TensorFlow Study Notes (3): Machine Learning Basics (2) Linear Regression

recusant

Article link: https://blog.csdn.net/weixin_38047275/article/details/81433756

I.

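In statistics, linear regression is a form of regression analysis that models the relationship between one or more independent variables and a dependent variable using a least-squares function called the linear regression equation. This function is a linear combination of one or more model parameters called regression coefficients. In machine learning, linear regression is also one of the simplest modeling techniques: given a set of data points as a training set, the goal is to find the linear function that best fits the data.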

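The general form of the linear function is Y=WX+b

·Y is the value to be predicted

·X is a set of independent predictor variables

·W is the weight, a parameter the model learns during training

·b is also a learned parameter, called the bias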
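
As a quick illustration of the formula above, here is a minimal plain-Python sketch of the single-variable case. The function name predict and the sample values of w, b, and the inputs are made up for illustration; they do not come from the exercise.

```python
def predict(x, w, b):
    """Return the linear prediction y = w * x + b for a single input value."""
    return w * x + b


# Hypothetical parameter values and inputs, chosen only for illustration.
w, b = 2.0, 1.0
inputs_x = [1.0, 2.0, 3.0]
predictions = [predict(x, w, b) for x in inputs_x]
print(predictions)  # [3.0, 5.0, 7.0]
```

Training amounts to finding the w and b for which these predictions are as close as possible to the observed values.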

II. A Linear Regression Example

This is a homework exercise from Andrew Ng's Machine Learning course. We will solve it with TensorFlow.

In this part of this exercise, you will implement linear regression with one variable to predict profits for a food truck. Suppose you are the CEO of a restaurant franchise and are considering different cities for opening a new outlet. The chain already has trucks in various cities and you have data for profits and populations from the cities. You would like to use this data to help you select which city to expand to next.

The file ex1data1.txt contains the dataset for our linear regression problem. The first column is the population of a city and the second column is the profit of a food truck in that city. A negative value for profit indicates a loss.

The ex1.m script has already been set up to load this data for you.

We download the training data and convert it to a CSV file; the resulting data is shown in the figure.

First we import TensorFlow and initialize the weight w and the bias b to 0. Because this is a single-variable regression problem, w is a scalar.

# Note: this script uses the TensorFlow 1.x graph/session API
import os

import matplotlib.pyplot as plt
import tensorflow as tf

# Initialize the parameters
w = tf.Variable(0.0, dtype=tf.float32, name="weight")
b = tf.Variable(0.0, dtype=tf.float32, name="bias")

We fit the data with the linear function y=wx+b and use the mean squared error as the loss function.

# Linear function
def inference(X):
    return tf.multiply(X, w) + b

# Loss function: mean of the squared prediction errors
def loss(X, Y):
    Y_predict = inference(X)
    return tf.reduce_mean(tf.square(Y_predict - Y))
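
To make the loss concrete, here is a small plain-Python sketch of the same quantity. Note that the loss above is the plain mean of the squared errors, whereas the course handout defines its cost with an extra factor of 1/2; both are minimized by the same w and b. The function name mean_squared_error and the target values are hypothetical.

```python
def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical inputs and targets; with w = b = 0 every prediction starts at 0.
sample_x = [1.0, 2.0, 3.0]
targets = [1.0, 2.0, 3.0]
initial_predictions = [0.0 * x + 0.0 for x in sample_x]
initial_loss = mean_squared_error(initial_predictions, targets)
print(initial_loss)
```

With all predictions at zero, the loss is simply the mean of the squared targets, which is the value training starts from.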

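Each call to read fetches one line from the file, and the decode_csv op parses that line into a list of tensors. The record_defaults parameter supplies the value to use when a field is missing, and the type of each default also determines the type of the corresponding tensor.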

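# Read the file
def read_csv(batch_size, file_name, record_defaults):
    # Build a filename queue; resolve the path relative to this script
    file_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), file_name)
    filename_queue = tf.train.string_input_producer([file_path])
    # Create a reader that returns one line per read
    reader = tf.TextLineReader()
    key, value = reader.read(filename_queue)
    # decode_csv converts a text line into a list of tensors typed by record_defaults
    decoded = tf.decode_csv(value, record_defaults=record_defaults)
    # Group the decoded rows into batches
    return tf.train.batch(decoded, batch_size=batch_size, capacity=batch_size * 100)


# BATCH_SIZE is not defined in the original post; 97 is an assumption
# (the number of rows in ex1data1.txt), so that one batch covers the whole dataset.
BATCH_SIZE = 97


def inputs():
    x, y = read_csv(BATCH_SIZE, 'data.csv', [[0.0], [0.0]])
    return x, y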
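
The role of the defaults can be illustrated without TensorFlow. The sketch below mimics the behavior described above in plain Python: an empty field falls back to its default, and the type of the default decides how the text is converted. It is only an illustration, not how tf.decode_csv is implemented; decode_csv_line and the sample rows are hypothetical.

```python
import csv


def decode_csv_line(line, record_defaults):
    """Parse one CSV line; empty fields fall back to the matching default."""
    fields = next(csv.reader([line]))
    if len(fields) != len(record_defaults):
        raise ValueError(
            "expected %d fields, got %d" % (len(record_defaults), len(fields))
        )
    decoded = []
    for field, default in zip(fields, record_defaults):
        if field == "":
            decoded.append(default)
        else:
            # The type of the default decides how the text is converted.
            decoded.append(type(default)(field))
    return decoded


# Hypothetical rows: one complete, one with the second field missing.
complete_row = decode_csv_line("6.1101,17.592", [0.0, 0.0])
partial_row = decode_csv_line("5.5277,", [0.0, 0.0])
print(complete_row, partial_row)
```

Because both defaults here are floats, both columns are read as floating-point values, which matches the [[0.0], [0.0]] passed to read_csv above.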

Training uses gradient descent to approach the minimum of the loss function.

# Train with gradient descent, using a step size (learning rate) of 0.01
def train(total_loss):
    learning_rate = 0.01
    return tf.train.GradientDescentOptimizer(learning_rate).minimize(total_loss)
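
To show what the optimizer does at each step, here is a hand-written sketch of gradient descent on the mean squared error for y = wx + b, in plain Python. It is a simplified illustration rather than TensorFlow's implementation; the function names and the toy data (points lying exactly on y = 2x + 1) are hypothetical.

```python
def gradient_step(xs, ys, w, b, learning_rate):
    """One gradient descent update for the mean squared error of y = w*x + b."""
    m = len(xs)
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    # Partial derivatives of mean((w*x + b - y)^2) with respect to w and b.
    grad_w = 2.0 / m * sum(e * x for e, x in zip(errors, xs))
    grad_b = 2.0 / m * sum(errors)
    return w - learning_rate * grad_w, b - learning_rate * grad_b


def mse(xs, ys, w, b):
    """Mean squared error of the line y = w*x + b on the given points."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0  # same starting point as in the post
initial_loss = mse(xs, ys, w, b)
for _ in range(1000):
    w, b = gradient_step(xs, ys, w, b, learning_rate=0.01)
final_loss = mse(xs, ys, w, b)
print(w, b, final_loss)
```

Each update moves w and b a small distance against the gradient, so the loss shrinks and the parameters drift toward the slope and intercept of the underlying line.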

Finally, we create a Session object, fit the data, and plot the fitted line.

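with tf.Session() as sess:
    # Build the graph: input pipeline, loss, and training op
    x, y = inputs()
    total_loss = loss(x, y)
    train_op = train(total_loss)

    # Initialize the variables once the graph is complete
    sess.run(tf.global_variables_initializer())

    # Start the queue-runner threads that feed the input pipeline
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    train_steps = 1000
    for step in range(train_steps):
        # Fetch the loss in the same call so it is computed on the training batch
        _, loss_val = sess.run([train_op, total_loss])
        if (step + 1) % 50 == 0:
            w_val, b_val = sess.run([w, b])
            print('step:', step + 1, 'loss:', loss_val, 'w=', w_val, 'b=', b_val)

    # Fetch x and y in a single call so that both come from the same batch
    X, Y, w_val, b_val = sess.run([x, y, w, b])
    print(w_val)
    print(b_val)

    plt.plot(X, Y, 'ro', label="Original data")
    plt.plot(X, w_val * X + b_val, label="Fitted line")
    plt.legend()
    plt.show()

    # Stop the queue-runner threads; the with block closes the session
    coord.request_stop()
    coord.join(threads)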

The output is as follows: