Andrew Ng Machine Learning, Notes 2: Regression

1. Linear Regression with One Variable

  Suppose the training set contains $m$ examples $(x^{(1)},y^{(1)}),(x^{(2)},y^{(2)}),\dots,(x^{(m)},y^{(m)})$.
  Hypothesis (prediction function)

$$h_\theta(x) = \theta_0 + \theta_1 x$$
  Cost function
$$J(\theta_0,\theta_1) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$
  Optimization objective
$$(\theta_0,\theta_1) = \arg\min_{\theta_0,\theta_1} J(\theta_0,\theta_1)$$
  A commonly used method is gradient descent. The gradient is closely related to the directional derivative, and its role can be justified from the Taylor-series expansion; the derivation is simple and omitted here. The algorithm proceeds as follows (a NumPy sketch is given after the notes below):
$$\text{temp}_0 = \theta_0 - \alpha\,\frac{\partial}{\partial \theta_0}J(\theta_0,\theta_1)$$
$$\text{temp}_1 = \theta_1 - \alpha\,\frac{\partial}{\partial \theta_1}J(\theta_0,\theta_1)$$
$$\theta_0 = \text{temp}_0$$
$$\theta_1 = \text{temp}_1$$

Notes

  • Do not swap steps two and three of the procedure above: the computation in step two still depends on $\theta_0$, so all temp values must be computed before any parameter is overwritten (simultaneous update).

  • Mean-normalize the training data; this helps speed up convergence.

  • Pay attention to the choice of the learning rate (step size).
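  Below is a minimal NumPy sketch of the procedure above: batch gradient descent with a simultaneous update and mean normalization. The function and variable names are illustrative, not from the course, and the inputs are assumed to be NumPy arrays.

import numpy as np

def gradient_descent_one_var(x, y, alpha=0.1, steps=1000):
    # mean-normalize the input feature; this speeds up convergence
    x = (x - x.mean()) / x.std()
    m = len(y)
    theta0, theta1 = 0.0, 0.0
    for _ in range(steps):
        h = theta0 + theta1 * x                        # predictions for all m examples
        # compute both partial derivatives BEFORE overwriting either parameter
        grad0 = (h - y).sum() / m
        grad1 = ((h - y) * x).sum() / m
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1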

2. Multivariate Linear Regression

  Multivariate linear regression extends the single-variable case. The $m$ training examples are $(x_1^{(1)},x_2^{(1)},\dots,x_n^{(1)},y^{(1)}),(x_1^{(2)},x_2^{(2)},\dots,x_n^{(2)},y^{(2)}),\dots,(x_1^{(m)},x_2^{(m)},\dots,x_n^{(m)},y^{(m)})$.
  Hypothesis (prediction function)

$$h_{\vec\theta}(\vec x) = \vec\theta^{\,T}\vec x = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n$$
where $\vec\theta = (\theta_0,\theta_1,\dots,\theta_n)^T$ and $\vec x = (1,x_1,x_2,\dots,x_n)^T$.
  Cost function
$$J(\vec\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_{\vec\theta}(\vec x^{(i)}) - y^{(i)}\right)^2$$
  Optimization objective
$$\vec\theta = \arg\min_{\vec\theta} J(\vec\theta)$$

2.1 Gradient Descent

  Multivariate linear regression can still be solved with gradient descent; it is simply gradient descent in a higher-dimensional space, which is no longer as easy to visualize as in three dimensions. The algorithm proceeds as follows:

$$\text{temp}_0 = \theta_0 - \alpha\,\frac{\partial}{\partial \theta_0}J(\vec\theta)$$
$$\text{temp}_1 = \theta_1 - \alpha\,\frac{\partial}{\partial \theta_1}J(\vec\theta)$$
$$\vdots$$
$$\text{temp}_n = \theta_n - \alpha\,\frac{\partial}{\partial \theta_n}J(\vec\theta)$$
$$\theta_0 = \text{temp}_0$$
$$\theta_1 = \text{temp}_1$$
$$\vdots$$
$$\theta_n = \text{temp}_n$$
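  The same update can be written in vectorized form. A minimal sketch, assuming $X$ is the $m \times (n+1)$ design matrix whose rows are the $\vec x^{(i)T}$ with a leading column of ones (names are illustrative):

import numpy as np

def gradient_descent_multi(X, y, alpha=0.01, steps=1000):
    m, n_plus_1 = X.shape
    theta = np.zeros(n_plus_1)
    for _ in range(steps):
        grad = X.T @ (X @ theta - y) / m   # all n+1 partial derivatives at once
        theta = theta - alpha * grad       # every theta_j is updated simultaneously
    return theta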

2.2 Least Squares (Normal Equation)

  Another way to solve multivariate linear regression is the least-squares method (the normal equation). The derivation is as follows.
  The whole prediction process can be written as a system of equations

$$\vec x^{(1)T}\vec\theta = y^{(1)}$$
$$\vec x^{(2)T}\vec\theta = y^{(2)}$$
$$\vdots$$
$$\vec x^{(m)T}\vec\theta = y^{(m)}$$

  Writing the system in matrix form,
$$X\vec\theta = \vec y$$
where the matrix $X = \left[\vec x^{(1)T};\ \vec x^{(2)T};\ \dots;\ \vec x^{(m)T}\right]$. Solving for the least-squares estimate gives
$$\vec\theta = (X^TX)^{-1}X^T\vec y$$
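  This formula translates directly into NumPy. A minimal sketch (the function name is illustrative); np.linalg.lstsq solves the same least-squares problem and is numerically safer than forming the explicit inverse:

import numpy as np

def normal_equation(X, y):
    # solves min ||X theta - y||^2 without forming (X^T X)^{-1} explicitly
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

# the textbook form, usable when X^T X is well conditioned:
# theta = np.linalg.inv(X.T @ X) @ X.T @ y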

2.3 Geometric Interpretation of Least Squares

  Rewriting the system of equations, let

$$\vec x_i = \left(x_i^{(1)}\ x_i^{(2)}\ \dots\ x_i^{(m)}\right)^T$$
$$\vec y = \left(y^{(1)}\ y^{(2)}\ \dots\ y^{(m)}\right)^T$$
Then
$$\vec y = \theta_0\vec x_0 + \theta_1\vec x_1 + \dots + \theta_n\vec x_n$$
  In general this system has no exact solution, but we can find the $\vec\theta$ that minimizes the cost function. Geometrically, minimizing the cost function means finding, within the linear subspace spanned by $\vec x_0,\vec x_1,\dots,\vec x_n$, the point closest to $\vec y$. Clearly that point is the projection of $\vec y$ onto the subspace, which is the essence of the least-squares solution.
  In fact, each weight represents the contribution of the corresponding feature to the final result, and more training examples make the measurement of each feature's contribution more accurate.
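  A quick numerical check of this interpretation (the data below is made up for illustration, not from the notes): with the least-squares $\vec\theta$, the residual $\vec y - X\vec\theta$ is orthogonal to every column $\vec x_i$ of $X$, exactly as a projection requires.

import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 3))])  # m = 50 examples, bias column plus n = 3 features
y = rng.normal(size=50)

theta = np.linalg.lstsq(X, y, rcond=None)[0]
residual = y - X @ theta
print(X.T @ residual)   # approximately zero: the residual is orthogonal to the column space of X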

2.4 Choosing Between Least Squares and Gradient Descent

  When the number of features $n$ exceeds a certain threshold, the matrix inversion in the normal equation becomes very expensive, and gradient descent is generally preferred. As a rough reference, the threshold is around $10^5 \sim 10^6$ features.

3. Polynomial Regression

  Polynomial regression can be converted into multivariate linear regression; the key is to combine the known features into new ones. For example, the hypothesis

$$h_{\vec\theta}(\vec x) = \theta_0 + \theta_1 x_1 + \theta_2 x_1^2 + \theta_3 \sqrt{x_1}$$
where every term can be computed from the known feature $x_1$.
  As for which higher-order terms to include in the polynomial, an initial choice can be made from the rough shape of the data. Practical constraints also matter: for example, the total price of a house generally does not decrease as its area grows, so a quadratic term alone is not enough and a cubic term is needed as well.
  In polynomial regression, feature scaling becomes especially important, because the model contains different powers of the same feature and their ranges differ widely; yet, since they all come from the same underlying feature, they cannot simply be rescaled by different amounts.
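  A minimal sketch of this conversion: scale the underlying feature once so that its powers stay in a moderate range, build the powers as separate columns, and fit with the normal equation from section 2.2. The function name and the choice of a cubic are illustrative assumptions.

import numpy as np

def polynomial_fit(x, y, degree=3):
    # scale the raw feature once; every power is then derived from the scaled x
    x_scaled = (x - x.mean()) / x.std()
    # each power becomes one feature column; k = 0 gives the all-ones bias column
    X = np.column_stack([x_scaled ** k for k in range(degree + 1)])
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta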

4. Linear Regression with One Variable in TensorFlow

  The code draws on blog posts found online. This was my first time writing TensorFlow, and every beginning is hard, but I did manage to get started. I will later put together a dedicated summary of commonly used TensorFlow functions; after all, only once they are learned by heart can they be used freely.

'''
Author       :  vivalazxp
Date         :  8/23/2018
Description  :  linear regression with one variable
'''
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
'''
Description   :  create data for linear regression with one variable
Param         :  @weight         weight of the line to be fitted
                 @bias           bias of the line to be fitted
                 @numData        number of training examples
                 @horizon_limit  range of the horizontal axis
                 @sigma          power of the noise
Return        :  @data_horizon   horizontal-axis of training data    shape=(numData,)
                 @data_vertical  vertical-axis of training data    shape=(numData,)
'''
def data_create_lin_reg_one_val(numData, weight, bias, horizon_limit, sigma):
    data_horizon = horizon_limit * 2*(np.random.rand(numData)-0.5)
    data_vertical = weight * data_horizon + bias
    # add noise
    data_vertical += sigma * np.random.randn(numData)
    print('------------- create training data successfully --------------')
    return data_horizon, data_vertical
'''
Description   :  use tensorflow to complete linear regression with one variable
Param         :  @data_horizon   horizontal-axis of training data
                 @data_vertical  vertical-axis of training data
                 @steps          total number of learning steps
                 @alpha          learning rate
Return        :  @weight_fitted  weight of the fitted line
                 @bias_fitted    bias of the fitted line
                 @loss           cost value at each step  shape=(steps,)
'''
def tf_lin_reg_one_val(data_horizon, data_vertical, steps, alpha):
    horizon_from_data = tf.placeholder(tf.float32)
    vertical_from_data = tf.placeholder(tf.float32)
    # initialize randomly weight and bias
    weight_fitted = tf.Variable(tf.random_normal([1]))
    bias_fitted = tf.Variable(tf.random_normal([1]))
    # cost function and optimizer
    vertical_pred = tf.multiply(weight_fitted, horizon_from_data) + bias_fitted
    cost = tf.reduce_mean(tf.pow(vertical_pred - vertical_from_data, 2))
    optimizer = tf.train.GradientDescentOptimizer(alpha).minimize(cost)
    # session initialization
    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)
    print('---------------- train started ------------------------')
    loss = np.zeros(steps)
    for step in range(steps):
        sess.run(optimizer, feed_dict={horizon_from_data: data_horizon, vertical_from_data: data_vertical})
        loss[step] = sess.run(cost,feed_dict={horizon_from_data: data_horizon, vertical_from_data: data_vertical})
    print('---------------- train finished ------------------------')
    weight_fitted = sess.run(weight_fitted)
    bias_fitted = sess.run(bias_fitted)
    return weight_fitted, bias_fitted, loss


if __name__ == "__main__":
    weight = 100
    bias = 2.0
    horizon_limit = 10
    numData = 1000
    sigma = weight
    steps = 10000
    alpha = 0.0001
    data_horizon, data_vertical = data_create_lin_reg_one_val(numData, weight, bias, horizon_limit, sigma)
    weight_fitted, bias_fitted, loss = tf_lin_reg_one_val(data_horizon, data_vertical, steps, alpha)
    # log
    print('expected  weight = ', weight, ', expected  bias = ', bias)
    print('regression weight = ', weight_fitted, ', regression bias = ', bias_fitted)
    # fitting line
    plt.figure(1)
    horizon_fit = np.linspace(-horizon_limit, horizon_limit, 200)
    vertical_fit = weight_fitted*horizon_fit + bias_fitted
    plt.plot(data_horizon, data_vertical, 'o', label='training data')
    plt.plot(horizon_fit, vertical_fit, 'r', label='regression line')
    plt.legend()
    plt.xlabel('horizontal axis')
    plt.ylabel('vertical axis')
    plt.title('linear regression with one value')
    # cost variation
    plt.figure(2)
    plt.plot(range(steps), loss)
    plt.xlabel('step')
    plt.ylabel('loss')
    plt.title('loss variation in linear regression with one value')

    plt.show()

[Figure: training data and the fitted regression line]
[Figure: loss versus training step]

5. Polynomial Regression with Gradient Descent in TensorFlow

  This code also draws on material found online. While tuning the parameters I ran into NaN values; I will keep investigating this later (see the note after the figures below).

'''
Author       :  vivalazxp
Date         :  11/9/2018
Description  :  non-linear (polynomial) regression of sin(x)
'''
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

'''
Description   :  create data for non-linear regression of sin(x)
Param         :  @numData        number of training examples
                 @sigma          power of the noise
                 @horizon_limit  range of the horizontal axis
Return        :  @data_horizon   horizontal-axis of training data    shape=(numData,)
                 @data_vertical  vertical-axis of training data    shape=(numData,)
'''
def data_create_sin_non_lin_reg(numData, sigma, horizon_limit):
     data_horizon = np.linspace(-horizon_limit, horizon_limit, numData)
     data_vertical = np.sin(data_horizon)
     # add noise
     data_vertical += sigma * np.random.randn(numData)
     print('---------- create data successfully ----------')
     return data_horizon, data_vertical
'''
Description   :  use tensorflow to complete non-linear regression of sin(x)
Param         :  @n_order        fit sin(x) with an n-order polynomial
                 @data_horizon   horizontal-axis of training data
                 @data_vertical  vertical-axis of training data
                 @alpha          learning rate
                 @steps          total number of learning steps
Return        :  @theta          weights of the fitted polynomial  shape=(n_order+1,)
                 @loss           cost value at each step  shape=(steps,)
'''
def tf_non_lin_reg(n_order,data_horizon, data_vertical, alpha, steps):
    numData = data_vertical.shape[0]
    #placeholder for training data
    horizon_from_data = tf.placeholder(tf.float32)
    vertical_from_data = tf.placeholder(tf.float32)
    # randomly initialize theta (the polynomial coefficients)
    theta = tf.Variable(tf.random_normal([n_order+1]))
    vertical_pred = tf.zeros(numData)
    for index_n in range(n_order+1):
        vertical_pred = vertical_pred + theta[index_n] * tf.pow(horizon_from_data, float(index_n))

    #cost function and optimizer
    cost = tf.reduce_mean(tf.square(vertical_pred - vertical_from_data))
    optimizer = tf.train.GradientDescentOptimizer(alpha).minimize(cost)
    #session
    sess = tf.Session()
    sess.run(tf.global_variables_initializer())
    print('-------- train started --------')
    loss = np.zeros(steps)
    for step in range(steps):
        sess.run(optimizer, feed_dict={horizon_from_data: data_horizon, vertical_from_data: data_vertical})
        loss[step] = sess.run(cost, feed_dict={horizon_from_data: data_horizon, vertical_from_data: data_vertical})
    print('-------- train finished --------')
    theta = sess.run(theta)
    return theta, loss

def main():
    numData = 100
    sigma = 0.2
    n_order = 3
    horizon_limit = 3
    alpha = 0.005
    steps = 1000
    data_horizon, data_vertical = data_create_sin_non_lin_reg(numData, sigma, horizon_limit)
    theta, loss = tf_non_lin_reg(n_order, data_horizon, data_vertical, alpha, steps)
    # fitting line
    plt.figure(1)
    horizon_fit = np.linspace(-horizon_limit, horizon_limit, 200)
    vertical_fit = np.zeros(200)
    for index in range(n_order+1):
        vertical_fit = np.add(vertical_fit, theta[index]* horizon_fit ** index)

    plt.plot(data_horizon, data_vertical, 'o', label='training data')
    plt.plot(horizon_fit, vertical_fit, 'r', label='regression curve')
    plt.legend()
    plt.xlabel('horizontal axis')
    plt.ylabel('vertical axis')
    plt.title('non-linear regression')

    # cost variation
    plt.figure(2)
    plt.plot(range(steps), loss)
    plt.xlabel('step')
    plt.ylabel('loss')
    plt.title('loss variation in non-linear regression')
    plt.show()

if __name__ == "__main__":
    main()

[Figure: training data and the fitted polynomial curve]
[Figure: loss versus training step]
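  One possible cause of the NaN values mentioned above, consistent with the note on feature scaling in section 3, is that the higher powers of the unscaled inputs make the gradients blow up for larger n_order or alpha. The lines below are an untested workaround sketch (an assumption, not a confirmed fix), reusing the functions defined above:

# untested guess: scale the inputs into [-1, 1] before training so the powers stay bounded
data_horizon, data_vertical = data_create_sin_non_lin_reg(numData, sigma, horizon_limit)
data_horizon_scaled = data_horizon / horizon_limit      # all powers of x now stay within [-1, 1]
theta, loss = tf_non_lin_reg(n_order, data_horizon_scaled, data_vertical, alpha, steps)
# the same scaling must then be applied to horizon_fit before plotting the fitted curve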
