Linear Regression

steven_miao

Category: Machine Learning

Article link: https://blog.csdn.net/steven_miao/article/details/51966269

This post is written in memory of days gone by.

Formulas

A linear model for a set of data has the form:

$Y = \theta^T X$
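The loss function is given below. It is half the sum of the squared differences between the fitted values and the actual values. For why this particular function is used, see the Stanford University machine learning course.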

$J(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^{2}$
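As a minimal sketch of how this loss can be evaluated, the snippet below computes $J(\theta)$ with NumPy. The function name `squared_error_loss` and the three-point data set are made up for illustration, and the design matrix is assumed to carry a leading column of ones for the intercept.

```python
import numpy as np


def squared_error_loss(theta, X, y):
    """Return J(theta) = 1/2 * sum((X @ theta - y) ** 2)."""
    residuals = X @ theta - y      # h_theta(x^(i)) - y^(i) for every sample
    return 0.5 * float(np.sum(residuals ** 2))


# Illustrative data: y = 1 + 2 * x exactly, so theta = (1, 2) gives zero loss.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])

loss_at_truth = squared_error_loss(np.array([1.0, 2.0]), X, y)  # 0.0
loss_at_zero = squared_error_loss(np.zeros(2), X, y)            # 0.5 * (1 + 9 + 25) = 17.5
print(loss_at_truth, loss_at_zero)
```

The loss is zero only when every fitted value matches its target; any other parameter vector gives a positive value.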
For a good fit, the smaller the loss the better, that is:

$\min_{\theta} J(\theta)$
There are two ways to solve this minimization:
1. Least squares (closed-form solution)

$\theta = (X^T X)^{-1} X^T Y$
2. Gradient descent. For a single training example $(x, y)$, the partial derivative of the loss is

$\frac{\partial J(\theta)}{\partial \theta_j} = (h_\theta(x) - y) x_j$

so each step moves $\theta_j$ against the gradient:

$\theta_j := \theta_j + \alpha (y^{(i)} - h_\theta(x^{(i)})) x_j^{(i)}$
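Here alpha is the learning rate, and its value is critical. If it is too large, the steps overshoot and the iteration fails to converge; if it is too small, convergence is slow and takes a long time.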
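For reference, the step from the loss to both methods can be written out as follows. This is the standard derivation, with $h_\theta(x) = \theta^T x$ and, as a notational assumption added here, $X$ denoting the matrix whose rows are the samples $x^{(i)}$ and $Y$ the column of targets.

Differentiating the loss over all samples gives

$\frac{\partial J(\theta)}{\partial \theta_j} = \sum_{i} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$

or, in matrix form,

$\nabla_\theta J(\theta) = X^T (X\theta - Y)$

Setting the gradient to zero yields $X^T X \theta = X^T Y$, which gives the least-squares formula whenever $X^T X$ is invertible. Gradient descent instead takes repeated steps against the gradient:

$\theta := \theta - \alpha \nabla_\theta J(\theta) = \theta + \alpha X^T (Y - X\theta)$

This is why the update rule carries a plus sign together with $y - h_\theta(x)$, while the derivative carries $h_\theta(x) - y$.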

Python implementation

import numpy as np
import matplotlib.pyplot as plt


def leastSquares(train_x, train_y):
    """Closed-form solution theta = (X^T X)^-1 X^T y."""
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(train_x.T @ train_x, train_x.T @ train_y)


def LMS(train_x, train_y, opts):
    """Batch gradient descent on the squared-error loss."""
    numSamples, numFeatures = train_x.shape
    alpha = opts['alpha']
    maxIter = opts['maxIter']
    weights = np.ones((numFeatures, 1))
    if opts['optimizeType'] == 'gradDescent':
        for k in range(maxIter):
            output = train_x @ weights   # h_theta(x) for every sample
            error = train_y - output     # y - h_theta(x)
            weights = weights + alpha * train_x.T @ error
    return weights


def showLogRegres(weights, train_x, train_y):
    # train_x has shape (numSamples, 2) with a leading column of ones;
    # train_y and weights are column vectors.
    numSamples, numFeatures = train_x.shape
    if numFeatures != 2:
        print("Sorry! I can not draw because the dimension of your data is not 2!")
        return 1

    # draw all samples
    plt.plot(train_x[:, 1], train_y[:, 0], 'or')
    # draw the fitted line between the smallest and largest x
    min_x = train_x[:, 1].min()
    max_x = train_x[:, 1].max()
    y_min_x = float(weights[0, 0] + weights[1, 0] * min_x)
    y_max_x = float(weights[0, 0] + weights[1, 0] * max_x)
    plt.plot([min_x, max_x], [y_min_x, y_max_x], '-g')
    plt.xlabel('x'); plt.ylabel('y')
    plt.show()


if __name__ == '__main__':
    train_x = np.array([(1, 2104), (1, 1600), (1, 2400), (1, 1416), (1, 3000)], dtype=float)
    train_y = np.array([[400, 330, 369, 232, 540]], dtype=float).T
    opts = {'alpha': 0.00000001,
            'maxIter': 10000,
            'optimizeType': 'gradDescent'}

    weights = LMS(train_x, train_y, opts)
    theta = leastSquares(train_x, train_y)
    print('gradient descent:', weights.ravel())
    print('least squares:', theta.ravel())
    showLogRegres(theta, train_x, train_y)

Notes on the program

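This program fits a straight line to a small data set. Note that alpha in the gradient descent run is very small. The value was found by trial and error; with alpha = 0.1 the iteration does not converge.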
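One way to see why alpha has to be so small is to look at the eigenvalues of $X^T X$ for this data. The sketch below is an illustration added here, not part of the original program. It relies on the standard result that, for a quadratic loss, the batch update used in `LMS` is stable only when alpha is below 2 divided by the largest eigenvalue of $X^T X$. The helper name `run_gradient_descent` is hypothetical.

```python
import numpy as np

# Same data as in the program above: a column of ones plus one feature in the thousands.
X = np.array([[1, 2104], [1, 1600], [1, 2400], [1, 1416], [1, 3000]], dtype=float)
y = np.array([[400], [330], [369], [232], [540]], dtype=float)

# For the update  w := w + alpha * X^T (y - X w), the error along each eigenvector
# of X^T X is multiplied by (1 - alpha * eigenvalue) per step, so the iteration is
# stable only when alpha < 2 / largest_eigenvalue.
largest_eigenvalue = np.linalg.eigvalsh(X.T @ X).max()
alpha_limit = 2.0 / largest_eigenvalue
print("alpha must stay below about", alpha_limit)


def run_gradient_descent(alpha, num_iters):
    """Run the batch update from LMS and return the final weights."""
    w = np.ones((X.shape[1], 1))
    for _ in range(num_iters):
        w = w + alpha * X.T @ (y - X @ w)
    return w


w_small = run_gradient_descent(1e-8, 1000)  # below the limit: weights stay bounded
w_large = run_gradient_descent(0.1, 20)     # far above the limit: weights blow up
print(np.abs(w_small).max(), np.abs(w_large).max())
```

Because the feature values are in the thousands, the largest eigenvalue is in the tens of millions, which pushes the usable range for alpha down to roughly 1e-7 and below.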

Further thoughts

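In gradient descent for this model the choice of alpha matters a great deal. Is there a good way to adjust alpha automatically, decreasing it when it is set too large and increasing it when it is too small?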
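One simple possibility, offered here only as a sketch and not as the author's method, is to check the loss after every step: if the loss went down, keep the step and enlarge alpha a little; if it went up, discard the step and shrink alpha. This resembles what is sometimes called the "bold driver" heuristic. The function names and the `grow` and `shrink` factors below are assumptions chosen for illustration.

```python
import numpy as np


def loss(w, X, y):
    """Squared-error loss J(w) = 1/2 * sum((X w - y)^2)."""
    r = X @ w - y
    return 0.5 * float(np.sum(r ** 2))


def adaptive_gradient_descent(X, y, alpha=0.1, num_iters=200, grow=1.1, shrink=0.5):
    """Gradient descent that shrinks alpha after a bad step and grows it after a good one."""
    w = np.ones((X.shape[1], 1))
    current = loss(w, X, y)
    for _ in range(num_iters):
        candidate = w + alpha * X.T @ (y - X @ w)
        candidate_loss = loss(candidate, X, y)
        if candidate_loss < current:
            # The step helped: accept it and try a slightly larger alpha next time.
            w, current = candidate, candidate_loss
            alpha *= grow
        else:
            # The step overshot: discard it and retry with a smaller alpha.
            alpha *= shrink
    return w, alpha


# Same data as in the program above.
X = np.array([[1, 2104], [1, 1600], [1, 2400], [1, 1416], [1, 3000]], dtype=float)
y = np.array([[400], [330], [369], [232], [540]], dtype=float)

initial_loss = loss(np.ones((2, 1)), X, y)
w, final_alpha = adaptive_gradient_descent(X, y, alpha=0.1)
print(w.ravel(), final_alpha, loss(w, X, y))
```

Starting from alpha = 0.1, the first steps are rejected and alpha is halved until the loss begins to fall, so the run no longer diverges. The heuristic only guards against overshooting; it does not by itself remove the slow convergence caused by the two columns of the data having very different scales.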
