TF/03_Linear_Regression/04_Loss_Functions_in_Linear_Regressions

Loss Functions in Linear Regression

The choice of loss function can significantly impact the convergence of TensorFlow algorithms. We will compare and contrast using the L1 and L2 loss functions for linear regression.

L1 Loss

The L1 loss is the absolute value of the difference between the target (y) and the prediction (p):

L1(y,p) = |y - p|

L2 Loss

The L2 loss is the squared difference between the target (y) and the prediction (p):

L2(y,p) = (y - p)^2

Summary

L2 LossL1 Loss
More stableLess Stable
Not very robustRobust

Graph of L1 vs L2 Loss Functions

L1 and L2 Loss

Graph of L1 vs L2 Loss Functions (L2 not converging)

Here is an example of the L2 function not converging. Despite a large learning rate, L1 has converged but L2 has not.
L1 Converging

L2 Not Converging

Graphical Summary of L1 and L2 with Learning Rates

Here is a plot of a 1D example of L1 and L2 loss with a small and large learning rate.

L1 and L2

To note:

Top Left

  • L1 loss with small learning rate: Robust, converges, but may take a while to converge.

Top Right

  • L2 loss with small learning rate: Robust, converges, but may take a while to converge.

Bottom Left

  • L1 loss with large learning rate: More robust, and less likely to explode, but may bounce around the optimum at the end.

Bottom Right

  • L2 loss with large learning rate: Not robust, explodes because of large learning rate. Very sensitive to learning rate.

Moral of the story: When your algorithm isn’t converging, try decreasing the learning rate first.

04_lin_reg_l1_vs_l2.py

# Linear Regression: L1 vs L2
#----------------------------------
#
# This function shows how to use TensorFlow to
# solve linear regression via the matrix inverse.

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from sklearn import datasets
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Create graph
#修改位置
config = tf.ConfigProto(allow_soft_placement= True, log_device_placement= True)
sess = tf.Session(config= config)
# Create graph
# sess = tf.Session()

# Load the data
# iris.data = [(Sepal Length, Sepal Width, Petal Length, Petal Width)]
iris = datasets.load_iris()
x_vals = np.array([x[3] for x in iris.data])
y_vals = np.array([y[0] for y in iris.data])

# Declare batch size and number of iterations
batch_size = 25
learning_rate = 0.4 # Will not converge with learning rate at 0.4
iterations = 50

# Initialize placeholders
x_data = tf.placeholder(shape=[None, 1], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)

# Create variables for linear regression
A = tf.Variable(tf.random_normal(shape=[1,1]))
b = tf.Variable(tf.random_normal(shape=[1,1]))

# Declare model operations
model_output = tf.add(tf.matmul(x_data, A), b)

# Declare loss functions
loss_l1 = tf.reduce_mean(tf.abs(y_target - model_output))

# Declare optimizers
my_opt_l1 = tf.train.GradientDescentOptimizer(learning_rate)
train_step_l1 = my_opt_l1.minimize(loss_l1)

# Initialize variables
init = tf.global_variables_initializer()
sess.run(init)

# Training loop
loss_vec_l1 = []
for i in range(iterations):
    rand_index = np.random.choice(len(x_vals), size=batch_size)
    rand_x = np.transpose([x_vals[rand_index]])
    rand_y = np.transpose([y_vals[rand_index]])
    sess.run(train_step_l1, feed_dict={x_data: rand_x, y_target: rand_y})
    temp_loss_l1 = sess.run(loss_l1, feed_dict={x_data: rand_x, y_target: rand_y})
    loss_vec_l1.append(temp_loss_l1)
    if (i+1)%25==0:
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)) + ' b = ' + str(sess.run(b)))


# L2 Loss
# Reinitialize graph
ops.reset_default_graph()

# Create graph
#修改位置
config = tf.ConfigProto(allow_soft_placement= True, log_device_placement= True)
sess = tf.Session(config= config)
# Create graph
# sess = tf.Session()

# Initialize placeholders
x_data = tf.placeholder(shape=[None, 1], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)

# Create variables for linear regression
A = tf.Variable(tf.random_normal(shape=[1,1]))
b = tf.Variable(tf.random_normal(shape=[1,1]))

# Declare model operations
model_output = tf.add(tf.matmul(x_data, A), b)

# Declare loss functions
loss_l2 = tf.reduce_mean(tf.square(y_target - model_output))

# Declare optimizers
my_opt_l2 = tf.train.GradientDescentOptimizer(learning_rate)
train_step_l2 = my_opt_l2.minimize(loss_l2)

# Initialize variables
init = tf.global_variables_initializer()
sess.run(init)

loss_vec_l2 = []
for i in range(iterations):
    rand_index = np.random.choice(len(x_vals), size=batch_size)
    rand_x = np.transpose([x_vals[rand_index]])
    rand_y = np.transpose([y_vals[rand_index]])
    sess.run(train_step_l2, feed_dict={x_data: rand_x, y_target: rand_y})
    temp_loss_l2 = sess.run(loss_l2, feed_dict={x_data: rand_x, y_target: rand_y})
    loss_vec_l2.append(temp_loss_l2)
    if (i+1)%25==0:
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)) + ' b = ' + str(sess.run(b)))


# Plot loss over time
plt.plot(loss_vec_l1, 'k-', label='L1 Loss')
plt.plot(loss_vec_l2, 'r--', label='L2 Loss')
plt.title('L1 and L2 Loss per Generation')
plt.xlabel('Generation')
plt.ylabel('L1 Loss')
plt.legend(loc='upper right')
plt.show()
InternalError: Blas SGEMM launch failed : a.shape=(25, 1), b.shape=(1, 1), m=25, n=1, k=1
     [[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](_recv_Placeholder_0/_7, Variable/read)]]
  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值