Gradient Descent in Linear Regression

Latest recommended article published on 2023-06-19 15:49:32

BogeyDa

Category: Algorithm_TensorFlow. Tags: gradient descent, TensorFlow, linear regression

Article link: https://blog.csdn.net/lrglgy/article/details/91644497

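TensorFlow Study Notes

Chapter 5: Gradient Descent

Parameter values obtained by direct calculation are not always precise enough, so they are usually refined further through training, and gradient descent is a commonly used optimization algorithm for this.
In TensorFlow, the gradient can be computed by hand or automatically, or an optimizer can compute and apply it for you. In addition, changing how the data is fed in makes it possible to implement mini-batch gradient descent.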

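5.1 Preparation

Import the required packages (the code in these notes uses the TensorFlow 1.x graph API, such as tf.Session and tf.placeholder)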

import tensorflow as tf
import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.preprocessing import StandardScaler

Download and prepare the data

housing = fetch_california_housing()
m,n = housing.data.shape # number of rows (samples) and columns (features)
housing_data_plus_bias = np.c_[np.ones((m,1)),housing.data] # add a bias column of ones, i.e. the b in y=ax+b

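Data preprocessing (standardization)

scaler = StandardScaler().fit(housing.data)
# scale the features first, then add the bias column (scaling a constant column of ones would turn it into zeros)
scaled_housing_data_plus_bias = np.c_[np.ones((m,1)), scaler.transform(housing.data)]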
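Standardization rescales every column to zero mean and unit variance. A constant column, such as a bias column of ones, has zero variance, so the order of "scale" and "add bias" matters. The following is a minimal NumPy-only sketch on made-up data; the `standardize` helper is hypothetical and only mimics how a scaler treats zero-variance columns.

```python
import numpy as np

# Synthetic stand-in for the housing features (assumption: 5 samples, 2 features).
rng = np.random.default_rng(0)
features = rng.normal(loc=[10.0, -3.0], scale=[4.0, 0.5], size=(5, 2))

def standardize(data):
    """Hypothetical helper: rescale each column to zero mean and unit variance.

    Columns with zero variance are only centered, so no division by zero occurs.
    """
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    std[std == 0.0] = 1.0
    return (data - mean) / std

# Adding the bias column BEFORE scaling centers the ones to zeros.
with_bias_first = standardize(np.c_[np.ones((5, 1)), features])

# Scaling first and adding the bias column afterwards keeps the ones intact.
bias_last = np.c_[np.ones((5, 1)), standardize(features)]

print(with_bias_first[:, 0])  # bias column has become all zeros
print(bias_last[:, 0])        # bias column is still all ones
```

With a zeroed bias column the first entry of theta has no effect on the prediction, so the model cannot learn an intercept.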

5.2 Computing the gradient by hand

Compute the gradient from its formula and write the gradient descent update by hand.
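For reference, the formula follows from the MSE cost function. Here $m$ is the number of samples, $X$ the data matrix including the bias column, $y$ the label vector, $\theta$ the parameter vector, and $\eta$ the learning rate (the symbol $\eta$ is an assumption; the code calls it `global_learning_rate`):

```latex
\mathrm{MSE}(\theta) = \frac{1}{m}\,\lVert X\theta - y \rVert^{2}

\nabla_{\theta}\,\mathrm{MSE}(\theta) = \frac{2}{m}\, X^{\top}\,(X\theta - y)

\theta \leftarrow \theta - \eta\, \nabla_{\theta}\,\mathrm{MSE}(\theta)
```

These three lines correspond to the `mse`, `gradient`, and `training_op` nodes of the computation graph.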

Build the computation graph

n_epochs = 1000
global_learning_rate = 0.01
X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1,1), dtype=tf.float32, name="y") # labels
XT = tf.transpose(X)
theta = tf.Variable(tf.random_uniform([n+1,1],-1.0,1.0),name="theta")     # parameters
y_pred =  tf.matmul(X, theta, name="prediction")                          # predictions
error = y_pred-y                                                          # error
mse = tf.reduce_mean(tf.square(error), name="mse")                        # mean squared error (cost function)
gradient = 2/m * tf.matmul(XT, error)                                     # gradient
training_op = tf.assign(theta, theta-global_learning_rate*gradient)       # training step

Create a session and run the graph

init = tf.global_variables_initializer()                                  # add the initialization node

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):                                         # train step by step
        if epoch % 100 == 0:
            print("Epoch:", epoch, "MSE=", mse.eval())                    # report the MSE every 100 epochs
        sess.run(training_op)                                             # run one training step on every epoch, updating theta

    best_theta = theta.eval()                                             # training finished: fetch the final parameters
    print("The best theta is", best_theta)
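The same training loop can be sketched without TensorFlow, which makes it easier to see what each graph node computes. This is an illustrative NumPy version on synthetic data, not the housing data; `true_theta` and the noise level are assumptions chosen for the example.

```python
import numpy as np

# Synthetic regression problem (assumption: stands in for the scaled housing data).
rng = np.random.default_rng(42)
m, n = 200, 3
features = rng.normal(size=(m, n))
X = np.c_[np.ones((m, 1)), features]                   # bias column + features
true_theta = np.array([[4.0], [2.0], [-1.0], [0.5]])   # hypothetical ground truth
y = X @ true_theta + 0.01 * rng.normal(size=(m, 1))

n_epochs = 1000
learning_rate = 0.01
theta = rng.uniform(-1.0, 1.0, size=(n + 1, 1))        # random initial parameters

for epoch in range(n_epochs):
    error = X @ theta - y                              # prediction minus label
    if epoch % 100 == 0:
        print("Epoch:", epoch, "MSE =", float(np.mean(error ** 2)))
    gradient = 2 / m * X.T @ error                     # gradient of the MSE
    theta = theta - learning_rate * gradient           # update on every epoch

final_mse = float(np.mean((X @ theta - y) ** 2))
print("Final MSE:", final_mse)
```

The update runs on every epoch, while the MSE is only printed every 100 epochs.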

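5.3 Computing the gradient automatically

Let TensorFlow's built-in function compute the gradient automatically, while still writing the gradient descent update by hand.

Build the computation graph
Only the gradient=... line of the hand-computed graph needs to change: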
```
gradient = tf.gradients(mse, [theta])[0]                                  # compute the gradient with reverse-mode autodiff
```
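TensorFlow computes gradients automatically using reverse-mode automatic differentiation. Other approaches are numerical differentiation, symbolic differentiation, and forward-mode automatic differentiation; for how they are used and how they differ, see the blog post on automatic differentiation.
Create the session
Same as for the hand-computed gradient, so it is not repeated here.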
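To make the difference between the differentiation methods in this section more concrete, the sketch below compares an exact gradient with numerical differentiation (central finite differences) on the MSE cost. It is a NumPy illustration under stated assumptions: the closed-form gradient stands in for the exact value an automatic differentiation routine would return, and the data are random.

```python
import numpy as np

# Random data (assumption: any small regression problem works for this check).
rng = np.random.default_rng(0)
m, n = 50, 3
X = np.c_[np.ones((m, 1)), rng.normal(size=(m, n))]
y = rng.normal(size=(m, 1))
theta = rng.uniform(-1.0, 1.0, size=(n + 1, 1))

def mse(theta):
    """Mean squared error of the linear model for the given parameters."""
    return float(np.mean((X @ theta - y) ** 2))

def analytic_gradient(theta):
    """Exact gradient of the MSE, standing in for the autodiff result."""
    return 2 / m * X.T @ (X @ theta - y)

def numerical_gradient(f, theta, eps=1e-6):
    """Central finite differences: two cost evaluations per parameter."""
    grad = np.zeros_like(theta)
    for i in range(theta.shape[0]):
        step = np.zeros_like(theta)
        step[i, 0] = eps
        grad[i, 0] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return grad

max_diff = np.max(np.abs(analytic_gradient(theta) - numerical_gradient(mse, theta)))
print("Largest difference between the two gradients:", max_diff)
```

Numerical differentiation needs two cost evaluations per parameter and is only approximate, whereas reverse-mode automatic differentiation obtains all partial derivatives in a single backward pass.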

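5.4 Optimizers

Compute the gradient automatically and let one of TensorFlow's optimizers perform the optimization.

Build the computation graph
Only the training_op=... line of the automatic-gradient graph needs to change: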

## Define the optimizer (gradient descent)
# optimizer = tf.train.GradientDescentOptimizer(learning_rate = global_learning_rate)
## Define the optimizer (momentum)
optimizer = tf.train.MomentumOptimizer(learning_rate = global_learning_rate, momentum = 0.9)
training_op = optimizer.minimize(mse)

Create the session
Same as for the automatic gradient, so it is not repeated here.
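As a rough picture of what the two optimizers do differently, the sketch below implements the commonly documented momentum update (accumulate the gradient into a velocity term, then step along the velocity) in NumPy and compares it with plain gradient descent. The data, the `run` helper, and the number of epochs are assumptions made for illustration; this is not TensorFlow's internal code.

```python
import numpy as np

# Synthetic, noise-free regression problem (assumption).
rng = np.random.default_rng(1)
m, n = 200, 3
X = np.c_[np.ones((m, 1)), rng.normal(size=(m, n))]
true_theta = np.array([[4.0], [2.0], [-1.0], [0.5]])   # hypothetical ground truth
y = X @ true_theta

def gradient(theta):
    """Gradient of the MSE with respect to theta."""
    return 2 / m * X.T @ (X @ theta - y)

def run(momentum, n_epochs=100, learning_rate=0.01):
    """Hypothetical helper: train from zeros and return the final MSE.

    momentum=0.0 reduces the update to plain gradient descent.
    """
    theta = np.zeros((n + 1, 1))
    velocity = np.zeros_like(theta)
    for _ in range(n_epochs):
        velocity = momentum * velocity + gradient(theta)   # accumulate gradients
        theta = theta - learning_rate * velocity           # step along the velocity
    return float(np.mean((X @ theta - y) ** 2))

mse_plain = run(momentum=0.0)
mse_momentum = run(momentum=0.9)
print("plain gradient descent:", mse_plain)
print("momentum 0.9:          ", mse_momentum)
```

On this small problem, with the same learning rate and the same number of epochs, the momentum run ends at a lower MSE, which is the usual motivation for trying it.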

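5.5 Mini-batch gradient descent

To implement mini-batch gradient descent, the input data X and y must be replaced with a mini-batch at every training step. A placeholder node can perform this replacement: it does no computation and simply outputs the data it is fed at run time.

Build the computation graph
X and y need to be defined as placeholder nodes, and the batch size and the number of batches need to be set. The rest of the graph is the same as before, except that these notes use the plain gradient descent optimizer here instead of the momentum optimizer: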

X = tf.placeholder(tf.float32, shape=(None, n+1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
batch_size = 100
n_batches= int(np.ceil(m/batch_size))

Create the session
First define a function that fetches a batch, then create the session, as follows:

init = tf.global_variables_initializer()                                  # add the initialization node

def fetch_batch(epoch, batch_index, batch_size):
    np.random.seed(epoch * n_batches + batch_index)                       # reproducible batch for each (epoch, batch_index)
    indices = np.random.randint(m, size=batch_size)                       # sample row indices at random, with replacement
    X_batch = scaled_housing_data_plus_bias[indices]
    y_batch = housing.target.reshape(-1, 1)[indices]
    return X_batch, y_batch

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})     # feed the mini-batch into the placeholders
    best_theta = theta.eval()
    print("The best theta is", best_theta)
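The mini-batch loop can also be sketched in plain NumPy. Note one deliberate difference: instead of sampling indices with replacement as `fetch_batch` does, this version shuffles the data once per epoch and slices it into consecutive batches, which is a common alternative. The data, `true_theta`, and the epoch count are assumptions for illustration.

```python
import numpy as np

# Synthetic regression problem (assumption: stands in for the housing data).
rng = np.random.default_rng(7)
m, n = 1000, 3
X_all = np.c_[np.ones((m, 1)), rng.normal(size=(m, n))]
true_theta = np.array([[4.0], [2.0], [-1.0], [0.5]])   # hypothetical ground truth
y_all = X_all @ true_theta + 0.01 * rng.normal(size=(m, 1))

n_epochs = 100
batch_size = 100
n_batches = int(np.ceil(m / batch_size))               # batches per epoch
learning_rate = 0.01
theta = rng.uniform(-1.0, 1.0, size=(n + 1, 1))

for epoch in range(n_epochs):
    order = rng.permutation(m)                         # reshuffle once per epoch
    for batch_index in range(n_batches):
        idx = order[batch_index * batch_size:(batch_index + 1) * batch_size]
        X_batch, y_batch = X_all[idx], y_all[idx]
        error = X_batch @ theta - y_batch
        gradient = 2 / len(idx) * X_batch.T @ error    # gradient on this batch only
        theta = theta - learning_rate * gradient

final_mse = float(np.mean((X_all @ theta - y_all) ** 2))
print("Batches per epoch:", n_batches, "final MSE:", final_mse)
```

For the housing data, with 20,640 samples and batch_size = 100, the same np.ceil expression gives 207 batches per epoch.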