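Machine Learning Algorithms 09-04: Linear regression with TensorFlow automatic differentiation, and solving with an optimizer using epochs and shuffled mini-batches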

Latest recommended article published on 2022-09-23 19:38:38

熙仪繁华

Category: Machine Learning Algorithms. Tags: tensorflow, machine learning algorithms

Article link: https://blog.csdn.net/weixin_41672684/article/details/122510769


1 Linear regression with TensorFlow automatic differentiation

2 Solving with an optimizer: epochs, mini-batches, and shuffling

1 Linear regression with TensorFlow automatic differentiation

Summary: 1. In TensorFlow, y must be converted to a column vector.

gradients=tf.gradients(mes,[theta])[0]   # the key line to understand
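Why the column vector matters can be seen without TensorFlow, because TensorFlow follows NumPy-style broadcasting. The following is a minimal NumPy sketch with made-up values (the array names are illustrative, not from the article): subtracting a flat target of shape (m,) from predictions of shape (m, 1) silently produces an (m, m) matrix, so the "MSE" is computed over every prediction/target pair.

```python
import numpy as np

m = 4
# Predictions shaped like the output of tf.matmul(X, theta): (m, 1)
y_pred = np.arange(m, dtype=np.float32).reshape(-1, 1)
# Targets as a flat array, the way housing.target is stored: (m,)
y_flat = np.arange(m, dtype=np.float32)
# The same targets reshaped into a column vector: (m, 1)
y_col = y_flat.reshape(-1, 1)

bad_error = y_pred - y_flat   # broadcasts to (m, m): every prediction minus every target
good_error = y_pred - y_col   # element-wise difference, shape (m, 1)

bad_mse = float(np.mean(bad_error ** 2))
good_mse = float(np.mean(good_error ** 2))

print(bad_error.shape, good_error.shape)
print(bad_mse, good_mse)
```

Here the predictions equal the targets exactly, so the correct loss is 0, yet the broadcast version reports a nonzero value.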

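# Written for the TensorFlow 1.x API (tf.Session, tf.random_uniform).
# On TensorFlow 2.x, use "import tensorflow.compat.v1 as tf" and call tf.disable_v2_behavior().
import tensorflow as tf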
import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.preprocessing import StandardScaler

n_epochs=10000
learning_rate=0.01
housing=fetch_california_housing()
m,n=housing.data.shape
# print(m,n)  #20640 8
# print(housing.data,housing.target)
"""
[[   8.3252       41.            6.98412698 ...    2.55555556
    37.88       -122.23      ]
 [   8.3014       21.            6.23813708 ...    2.10984183
    37.86       -122.22      ]
 [   7.2574       52.            8.28813559 ...    2.80225989
    37.85       -122.24      ]
 ...
 [   1.7          17.            5.20554273 ...    2.3256351
    39.43       -121.22      ]
 [   1.8672       18.            5.32951289 ...    2.12320917
    39.43       -121.32      ]
 [   2.3886       16.            5.25471698 ...    2.61698113
    39.37       -121.24      ]] [4.526 3.585 3.521 ... 0.923 0.847 0.894]

"""
housing_data_plus_bias=np.c_[np.ones((m,1)),housing.data]
# print(housing_data_plus_bias)
"""
 [   1.            2.3886       16.         ...    2.61698113
    39.37       -121.24      ]]
"""
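# Standardize: fit() computes the mean and variance of each column.
# Note: the all-ones bias column has zero variance, so it is scaled to 0
# (see the first column of the output below) and the model learns no intercept.
scaler=StandardScaler().fit(housing_data_plus_bias)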
scaler_housing_data_plus_bias=scaler.transform(housing_data_plus_bias)
# print(scaler_housing_data_plus_bias)
"""
 ...
 [ 0.         -1.14259331 -0.92485123 ... -0.0717345   1.77823747
  -0.8237132 ]
 [ 0.         -1.05458292 -0.84539315 ... -0.09122515  1.77823747
  -0.87362627]
 [ 0.         -0.78012947 -1.00430931 ... -0.04368215  1.75014627
  -0.83369581]]
"""
X=tf.constant(scaler_housing_data_plus_bias,dtype=tf.float32,name="X")  # (20640, 9)
y=tf.constant(housing.target.reshape(-1,1),dtype=tf.float32,name="y")   # (20640, 1)

# Initialize theta with shape (9, 1), drawn uniformly from [-1, 1)
theta=tf.Variable(tf.random_uniform([n+1,1],-1,1),name="theta")

# Predicted y
y_pred=tf.matmul(X,theta,name="predictions")                            # (20640, 1)

# Error
error=y_pred-y                                                          # (20640, 1)

# Loss function (mean squared error)
mes=tf.reduce_mean(tf.square(error),name='mse')

# Gradient. The hand-written GD formula would be
# 2/m * tf.matmul(tf.transpose(X),error)   # (9, 20640) x (20640, 1) = (9, 1)
gradients=tf.gradients(mes,[theta])[0]   # automatic differentiation

# Update theta
training_op=tf.assign(theta,theta-learning_rate*gradients)
init=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        if epoch%100==0:
            print("epoch=",epoch,"MSE=",mes.eval())
        sess.run(training_op)
    best_theta=theta.eval()
    print(best_theta)

epoch= 9400 MSE= 4.803255
epoch= 9500 MSE= 4.803255
epoch= 9600 MSE= 4.803255
epoch= 9700 MSE= 4.8032546
epoch= 9800 MSE= 4.803255
epoch= 9900 MSE= 4.803255
[[ 0.0176723 ]
 [ 0.8296331 ]
 [ 0.11875448]
 [-0.26555073]
 [ 0.3057157 ]
 [-0.00450223]
 [-0.03932685]
 [-0.8998542 ]
 [-0.87051094]]
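To make concrete what tf.gradients returns for this loss, the sketch below checks in plain NumPy that the closed-form MSE gradient, 2/m * X^T (X theta - y), agrees with a finite-difference estimate. The data is random and the sizes are arbitrary assumptions, chosen only to keep the check fast; it is not the housing data.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 3
X = np.c_[np.ones((m, 1)), rng.normal(size=(m, n))]   # (m, n+1), bias column first
y = rng.normal(size=(m, 1))                           # column vector, (m, 1)
theta = rng.uniform(-1, 1, size=(n + 1, 1))           # (n+1, 1)


def mse(t):
    """Mean squared error of the linear model X @ t against y."""
    error = X @ t - y
    return float(np.mean(error ** 2))


# Closed-form gradient of the MSE with respect to theta
analytic = 2.0 / m * X.T @ (X @ theta - y)

# Central finite differences, one coordinate of theta at a time
eps = 1e-6
numeric = np.zeros_like(theta)
for j in range(theta.shape[0]):
    step = np.zeros_like(theta)
    step[j, 0] = eps
    numeric[j, 0] = (mse(theta + step) - mse(theta - step)) / (2 * eps)

max_abs_diff = float(np.max(np.abs(analytic - numeric)))
print(max_abs_diff)
```

Automatic differentiation produces the same quantity without the formula being written by hand, which is why the update line above can use gradients directly.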

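The essence of backpropagation:

The first step of backpropagation: $\partial Loss/\partial y_p$

This value depends on the loss function. If $Loss=\sum\left(y-y_p\right)^{2}$, then $\partial Loss/\partial y_p=-2\left(y-y_p\right)=2\left(y_p-y\right)$.

Add gate: the local gradient is 1, so the upstream gradient passes through to each input unchanged.

Multiply gate: the local gradient with respect to one input equals the value of the other input.

Sigmoid gate: the local gradient is $\left(1-\sigma\left(x\right)\right)\sigma\left(x\right)$.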
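The gate rules above can be traced by hand on a tiny computation. The sketch below uses a single neuron p = sigmoid(w*x + b) with a squared-error loss; the neuron and the input values are assumptions chosen for illustration. Each backward line applies one of the rules, and the result is compared with a finite-difference estimate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w, x, b, y):
    prod = w * x            # multiply gate
    z = prod + b            # add gate
    p = sigmoid(z)          # sigmoid gate
    loss = (y - p) ** 2     # squared-error loss for one sample
    return p, loss


w, x, b, y = 0.5, 2.0, -0.3, 1.0
p, loss = forward(w, x, b, y)

# Backward pass, from the loss back to the inputs
d_p = -2.0 * (y - p)          # first step: dLoss/dp
d_z = d_p * (1.0 - p) * p     # sigmoid gate: multiply by (1 - sigma) * sigma
d_prod = d_z * 1.0            # add gate: gradient passes through unchanged
d_b = d_z * 1.0               # add gate, second input
d_w = d_prod * x              # multiply gate: the other input's value
d_x = d_prod * w              # multiply gate: the other input's value

# Finite-difference check of dLoss/dw and dLoss/db
eps = 1e-6
num_d_w = (forward(w + eps, x, b, y)[1] - forward(w - eps, x, b, y)[1]) / (2 * eps)
num_d_b = (forward(w, x, b + eps, y)[1] - forward(w, x, b - eps, y)[1]) / (2 * eps)

print(d_w, num_d_w)
print(d_b, num_d_b)
```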

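2 Solving with an optimizer: epochs, mini-batches, and shuffling

Optimizer: tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(mse)

Question: how does the optimizer know which parameter of mse to differentiate with respect to? It computes the gradient for every trainable variable that mse depends on, and then updates those variables.

The goal of training is to find the theta that minimizes the loss function. In practice this means taking the gradient of mse with respect to the variable theta and then updating theta by gradient descent.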
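Conceptually, one run of the optimizer's training op does two things: it evaluates the gradient of the loss with respect to theta, and it applies theta = theta - learning_rate * gradient. The NumPy sketch below mimics that loop on synthetic, noise-free data; the helper names mse_and_grad and gradient_descent_step are hypothetical and are not TensorFlow functions.

```python
import numpy as np


def mse_and_grad(theta, X, y):
    """Return the MSE and its gradient with respect to theta."""
    error = X @ theta - y
    return float(np.mean(error ** 2)), 2.0 / len(X) * X.T @ error


def gradient_descent_step(theta, grad, learning_rate):
    """Hypothetical helper: one plain gradient descent update."""
    return theta - learning_rate * grad


rng = np.random.default_rng(42)
X = np.c_[np.ones((200, 1)), rng.normal(size=(200, 2))]   # bias column plus 2 features
true_theta = np.array([[1.0], [2.0], [-3.0]])
y = X @ true_theta                                         # noise-free targets

theta = np.zeros((3, 1))
losses = []
for _ in range(500):
    loss, grad = mse_and_grad(theta, X, y)
    losses.append(loss)
    theta = gradient_descent_step(theta, grad, learning_rate=0.1)

print(losses[0], losses[-1])
print(theta.ravel())
```

Because theta is the only variable the loss depends on, it is the only thing updated, which mirrors the answer to the question above.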

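Summary:
1. Split the data into a training set and a test set.
2. Standardize the test set with the statistics computed from the training set.
3. Use mini-batch GD; X and y are placeholders.
4. Training and evaluation both need feed_dict={X:   ,y:  }
5. Computing the gradients and updating theta are done in one step by the optimizer.
6. Use a double for loop over epochs and batches. Shuffle the data by index before running each epoch's batches.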
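The steps in this summary can be sketched end to end in NumPy, with the gradient written out by hand in place of the optimizer. This is only an illustration under stated assumptions: the data is synthetic (1000 samples, 3 features), and the hyperparameters are picked for the toy problem rather than taken from the article.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the housing data (an assumption, not the real dataset)
X_raw = rng.normal(loc=5.0, scale=2.0, size=(1000, 3))
true_w = np.array([[1.5], [-2.0], [0.5]])
y_all = X_raw @ true_w + 4.0 + rng.normal(scale=0.1, size=(1000, 1))  # column vector

# 1. Split into a training set and a test set
idx = rng.permutation(len(X_raw))
train_idx, test_idx = idx[:750], idx[750:]
X_train, X_test = X_raw[train_idx], X_raw[test_idx]
y_train, y_test = y_all[train_idx], y_all[test_idx]

# 2. Standardize both sets with the training set's statistics, then add the bias column
mean, std = X_train.mean(axis=0), X_train.std(axis=0)
X_train = np.c_[np.ones((len(X_train), 1)), (X_train - mean) / std]
X_test = np.c_[np.ones((len(X_test), 1)), (X_test - mean) / std]


def mse(theta, X, y):
    return float(np.mean((X @ theta - y) ** 2))


n_epochs, batch_size, learning_rate = 50, 100, 0.05
n_batches = math.ceil(len(X_train) / batch_size)   # 8 batches; the last one has 50 samples
theta = rng.uniform(-1, 1, size=(X_train.shape[1], 1))
initial_test_mse = mse(theta, X_test, y_test)

# 6. Outer loop over epochs, inner loop over batches, shuffling once per epoch
for epoch in range(n_epochs):
    order = rng.permutation(len(X_train))          # shuffle indices so X and y stay aligned
    X_shuf, y_shuf = X_train[order], y_train[order]
    for i in range(n_batches):
        X_b = X_shuf[i * batch_size:(i + 1) * batch_size]
        y_b = y_shuf[i * batch_size:(i + 1) * batch_size]
        grad = 2.0 / len(X_b) * X_b.T @ (X_b @ theta - y_b)
        theta = theta - learning_rate * grad       # 5. gradient and update in one place

final_test_mse = mse(theta, X_test, y_test)
print(initial_test_mse, final_test_mse)
```

Shuffling an index array and applying it to both X and y keeps each sample paired with its target, which is the reason the article shuffles indices instead of the arrays separately.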

import tensorflow as tf
import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split


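# TensorFlow computes the gradients for us, and it also offers a more convenient way to solve the problem:
# a set of ready-made optimizers, including a gradient descent optimizer.
# Replace the corresponding lines of the earlier code and everything works as before.

# Hyperparameters. Grid search simply tries combinations of hyperparameter values
# and keeps the combination that gives the smallest loss.
n_epochs = 1000  # number of epochs
learning_rate = 0.001
batch_size = 2000  # batch size; number of batches = m / batch_size, rounded up with math.ceil() or down with math.floor() or //

# Load the data. The earlier version handed all of the data to the X and y nodes at once,
# which is BGD (Batch Gradient Descent); for larger datasets, mini-batch GD is preferred.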
housing = fetch_california_housing()
m, n = housing.data.shape
# Standardization can be done with TensorFlow, NumPy, or sklearn's StandardScaler
# The target is reshaped into a column vector so that y_pred - y keeps the shape (batch, 1)
X_train, X_test, y_train, y_test = train_test_split(housing.data, housing.target.reshape(-1, 1))
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_train= np.c_[np.ones((len(X_train), 1)), X_train]  # add the bias column after scaling

X_test = scaler.transform(X_test)  # the test set reuses the training set's statistics
X_test= np.c_[np.ones((len(X_test), 1)), X_test]
#housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data] # m 1+n
# scaler = StandardScaler().fit(housing_data_plus_bias)
# scaled_housing_data_plus_bias = scaler.transform(housing_data_plus_bias)
#
# X_train = scaled_housing_data_plus_bias[:18000]
# X_test = scaled_housing_data_plus_bias[18000:]
# y_train = housing.target.reshape(-1, 1)[:18000]
# y_test = housing.target.reshape(-1, 1)[18000:]

# Using placeholders for X and y makes mini-batch GD possible
# Build the computation graph
# X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name='X')
# y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')
X = tf.placeholder(dtype=tf.float32, name='X')  # each mini-batch feeds different data, so reserve a slot first
y = tf.placeholder(dtype=tf.float32, name='y')

# tf.random_uniform creates a graph node holding random values of the given shape and range, much like NumPy's rand()
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name='theta')
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")  # loss function
# Gradient formula: (y_pred - y) * xj
# gradients = 2/m * tf.matmul(tf.transpose(X), error)  # option 1: gradient from the formula
# gradients = tf.gradients(mse, [theta])[0]            # option 2: automatic differentiation
# For BGD the assignment is theta_new = theta - (learning_rate * gradients)
# training_op = tf.assign(theta, theta - learning_rate * gradients)  # step theta against the gradient

# MomentumOptimizer often converges faster than plain gradient descent
# optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.9)
training_op = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(mse)  # differentiates mse with respect to the variables it depends on; here that is theta
init = tf.global_variables_initializer()
init = tf.global_variables_initializer()

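# Training starts here
with tf.Session() as sess:
    sess.run(init)  # initialize the variables

    # n_batch = int(18000 / batch_size)  # 9 batches for the fixed 18000-row split
    # With the default 75/25 split there are 15480 training rows, so this gives 7 full
    # batches; the remaining 1480 rows of each shuffled epoch are skipped.
    n_batch = int(len(X_train)/ batch_size)

    for epoch in range(n_epochs):    # epochs: 1000
        if epoch % 100==0:
            temp_theta=theta.eval()
            print(temp_theta)
            # Print the training and test loss to see whether it is decreasing (not essential)
            print("Epoch", epoch, "MSE = ",   # training-set loss
                  sess.run(mse, feed_dict={  # placeholders must be fed through feed_dict
                      X: X_train,
                      y: y_train
                  }))
            print("Epoch", epoch, "MSE = ",   # test-set loss
                  sess.run(mse, feed_dict={
                      X: X_test,
                      y: y_test
                  }))
        # The position of the shuffle matters: inside the epoch loop, outside the batch loop
        # arr = np.arange(18000)
        arr = np.arange(len(X_train))
        np.random.shuffle(arr)      # shuffle the index array so every epoch sees the data in a new order
        X_train = X_train[arr]      # index X_train and y_train with the same shuffled indices to keep them aligned; X_train must be a NumPy array for this indexing
        y_train = y_train[arr]

        for i in range(n_batch):    # one pass over the batches
            sess.run(training_op, feed_dict={
                X: X_train[i*batch_size: i*batch_size + batch_size],
                y: y_train[i*batch_size: i*batch_size + batch_size]
            })

    best_theta = theta.eval()
    print(best_theta)

# Finally, evaluate the model on the test set to check for overfitting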

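Results (for each epoch, the first line is the training MSE and the second is the test MSE). The code above prints only every 100 epochs, so this log appears to come from a variant that printed every epoch: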

Epoch 997 MSE =  4.755162
Epoch 997 MSE =  6.118693
Epoch 998 MSE =  4.755121
Epoch 998 MSE =  6.1186137
Epoch 999 MSE =  4.7550774
Epoch 999 MSE =  6.1185174
[[-0.76788545]
 [ 0.9318873 ]
 [ 0.3088649 ]
 [-0.4280878 ]
 [ 0.433204  ]
 [ 0.04665219]
 [-0.1027037 ]
 [-0.520092  ]
 [-0.21609364]]

Process finished with exit code 0
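The batch-size comment in the code mentions rounding the batch count up or down. A small arithmetic sketch shows the difference for this dataset, assuming the default test_size of 0.25 in train_test_split; batch_bounds is a hypothetical helper, not part of the article's code.

```python
import math

m = 20640                        # samples in the California housing data
n_test = math.ceil(m * 0.25)     # default test_size=0.25 gives 5160 test rows
n_train = m - n_test             # 15480 training rows
batch_size = 2000

n_full = n_train // batch_size                 # rounding down: full batches only
n_all = math.ceil(n_train / batch_size)        # rounding up: includes the partial batch
skipped = n_train - n_full * batch_size        # rows never seen per epoch when rounding down


def batch_bounds(n_samples, size):
    """Hypothetical helper: (start, stop) index pairs that cover every sample."""
    return [(start, min(start + size, n_samples)) for start in range(0, n_samples, size)]


bounds = batch_bounds(n_train, batch_size)
print(n_full, n_all, skipped)
print(bounds[-1])
```

Rounding down, as int(len(X_train)/batch_size) does, drops the last partial batch in each epoch. Because the data is reshuffled every epoch, different rows are dropped each time, so the effect on training is usually small.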
