The linear regression algorithm can be expressed as a matrix equation, Ax = b, where we solve for the coefficient vector x. Note that when the observation matrix A is not square, the least-squares solution is x = (AᵀA)⁻¹Aᵀb. To show this concretely, we will generate two-dimensional data, solve for the coefficients with TensorFlow, and plot the fitted line.
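Before the TensorFlow version, the normal equation can be checked by hand with plain NumPy. The following minimal sketch (the 3×2 system is an invented example, not from the original) fits y = x + 1 through three exact points, so the solution should be slope 1 and intercept 1:

```python
import numpy as np

# Over-determined system: 3 observations, 2 coefficients (slope, intercept).
# Points (1,2), (2,3), (3,4) lie exactly on y = x + 1.
A = np.array([[1.0, 1.0],
              [2.0, 1.0],
              [3.0, 1.0]])
b = np.array([[2.0], [3.0], [4.0]])

# Normal equation: x = (A^T A)^{-1} A^T b
x = np.linalg.inv(A.T @ A) @ A.T @ b
print(x)  # slope = 1, intercept = 1
```

Because the points are exactly collinear, the least-squares solution reproduces the line with zero residual.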
# Linear Regression: Matrix Inverse Method
#----------------------------------
#
# This function shows how to use TensorFlow to
# solve linear regression via the matrix inverse.
#
# Given Ax=b, solving for x:
# x = (t(A) * A)^(-1) * t(A) * b
# where t(A) is the transpose of A
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops
ops.reset_default_graph()
# Initialize the graph session
sess = tf.Session()
# Generate the data
x_vals = np.linspace(0, 10, 100)
y_vals = x_vals + np.random.normal(0, 1, 100)
# Create the matrices needed for the inverse method.
# Build matrix A by stacking the column x_vals_column with a column of ones.
x_vals_column = np.transpose(np.matrix(x_vals))
ones_column = np.transpose(np.matrix(np.repeat(1, 100)))
A = np.column_stack((x_vals_column, ones_column))
# Then create the b matrix from y_vals
b = np.transpose(np.matrix(y_vals))
# Convert the A and b matrices into tensors
A_tensor = tf.constant(A)
b_tensor = tf.constant(b)
# Matrix inverse solution
tA_A = tf.matmul(tf.transpose(A_tensor), A_tensor)
tA_A_inv = tf.matrix_inverse(tA_A)
product = tf.matmul(tA_A_inv, tf.transpose(A_tensor))
solution = tf.matmul(product, b_tensor)
solution_eval = sess.run(solution)
# Extract the coefficients from the solution: the slope and the y-intercept
slope = solution_eval[0][0]
y_intercept = solution_eval[1][0]
print('slope: ' + str(slope))
print('y_intercept: ' + str(y_intercept))
# Compute the best-fit line
best_fit = []
for i in x_vals:
    best_fit.append(slope*i + y_intercept)
# Plot the results
plt.plot(x_vals, y_vals, 'o', label='Data')
plt.plot(x_vals, best_fit, 'r-', label='Best fit line', linewidth=3)
plt.legend(loc='upper left')
plt.show()
Output:
slope: 0.997867286573
y_intercept: -0.0626160341584
The approach here solves for the coefficients directly through matrix operations. Most TensorFlow algorithms are instead trained iteratively, using backpropagation to update the model variables automatically. Fitting the model with a direct solve, as done here, simply illustrates TensorFlow's flexible usage.
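The direct solve can also be cross-checked with NumPy's `np.linalg.lstsq`, which solves the same least-squares problem but via a more numerically stable factorization instead of explicitly forming (AᵀA)⁻¹. The sketch below regenerates comparable data (the fixed seed is an assumption added so the run is reproducible, not part of the original):

```python
import numpy as np

np.random.seed(0)  # assumed seed for reproducibility; the original does not set one
x_vals = np.linspace(0, 10, 100)
y_vals = x_vals + np.random.normal(0, 1, 100)

# Same design matrix as in the TensorFlow version: [x, 1]
A = np.column_stack((x_vals, np.ones(100)))

# lstsq avoids inverting A^T A, which can be ill-conditioned for larger problems
coef, residuals, rank, sv = np.linalg.lstsq(A, y_vals, rcond=None)
slope, y_intercept = coef
print(slope, y_intercept)  # close to the true values 1 and 0
```

For a small, well-conditioned problem like this one the two methods agree to many decimal places; `lstsq` is the safer default when the columns of A are nearly collinear.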