用梯度下降实现线性回归
y = mx + b 其中m是线性函数的斜率(Slope of the line),b是偏置( bias)
数据集:Swedish Insurance Dataset
代价函数/损失函数 Cost Function [J]= (1/2n) * sum((y_hat-y)^2)
其中:n为输入数据点的总个数
y_hat:使用从梯度下降获得的“m”和“b”值的y的预测值。
y:已知的真实标签数据y
梯度对参数求导:
Cost Function [J]= (1/2n) * sum(((m*X + b)-y)^2)
CostFunction对m求导:dJ/dm = (1/n) * sum(X * ((m*X+b)-y))
CostFunction对b求导:dJ/db = (1/n) * sum((m*X+b)-y)
梯度更新:
m = m - alpha*(dJ/dm)
b = b - alpha*(dJ/db)
其中,alpha是学习率(Learning Rate)
code
# Import Dependencies
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# IPython magic: renders plots inline — only valid inside a Jupyter notebook.
%matplotlib inline
# Load the Swedish Auto Insurance dataset; assumes the sheet has an 'ID'
# column (used as the index) plus 'X' and 'Y' columns — TODO confirm.
df = pd.read_excel('F:/datasets/slr06.xls', index_col=u'ID')
#dataset txt format http://t.cn/RfHWAbI
#dataset Swedish Auto Insurance Dataset
#http://college.cengage.com/mathematics/brase/understandable_statistics/7e/students/datasets/slr/excel/slr06.xls
# convert data to array
# X: inputs, y: targets — both as float64 1-D arrays for the math below.
X = np.array(df['X'], dtype=np.float64)
y = np.array(df['Y'], dtype=np.float64)
# Quick scatter plot of the raw data to eyeball the linear relationship.
fig,ax = plt.subplots()
ax.scatter(X,y)
def cost_Function(m, b, X, y):
    """Mean-squared-error cost J = (1/2n) * sum((m*X + b - y)^2).

    Args:
        m: slope of the candidate line.
        b: intercept (bias) of the candidate line.
        X: 1-D numpy array of inputs.
        y: 1-D numpy array of true targets, same length as X.

    Returns:
        Scalar cost value (numpy float).
    """
    n = float(len(X))
    # np.sum reduces the ndarray in C; the original builtin sum() iterated
    # element-by-element at Python speed.
    return np.sum(((m * X + b) - y) ** 2) / (2.0 * n)
def gradientDescent(X, y, m, b, alpha, iters):
    """Fit y ≈ m*X + b by batch gradient descent.

    Uses the documented batch gradients
        dJ/dm = (1/n) * sum(X * ((m*X + b) - y))
        dJ/db = (1/n) * sum((m*X + b) - y)
    Fix: the original initialized gradient_m/gradient_b as accumulators but
    then OVERWROTE them for every sample and applied the update inside the
    per-sample loop — an accidental stochastic descent with an effective
    step of alpha/n, contradicting the batch formulas above. The gradient
    is now computed over the whole dataset per iteration, vectorized.

    Args:
        X: 1-D numpy array of inputs.
        y: 1-D numpy array of targets, same length as X.
        m: initial slope.
        b: initial intercept (bias).
        alpha: learning rate.
        iters: number of gradient-descent iterations.

    Returns:
        [m, b, hist] — fitted slope, fitted intercept, and the cost after
        each iteration (len(hist) == iters) for convergence analysis.
    """
    # n: number of data points (float so the divisions stay floating-point)
    n = float(len(X))
    hist = []  # cost recorded after every iteration
    for _ in range(iters):
        # Residuals of the current line on every sample at once.
        error = (m * X + b) - y
        gradient_m = np.sum(X * error) / n
        gradient_b = np.sum(error) / n
        # Step opposite the gradient, scaled by the learning rate.
        m = m - alpha * gradient_m
        b = b - alpha * gradient_b
        # Cost J = (1/2n) * sum(residual^2) with the updated m and b
        # (same formula as cost_Function; inlined so this routine is
        # self-contained).
        hist.append(np.sum(((m * X + b) - y) ** 2) / (2.0 * n))
    return [m, b, hist]
# --- Hyper-parameters and starting point --------------------------------
lr = 0.0001        # learning rate (alpha)
initial_m = 0      # starting slope
initial_b = 0      # starting intercept (bias)
iterations = 1000  # number of gradient-descent steps

print("Starting gradient descent...")
# Report the cost before any training so progress can be judged.
print("Initial Error at m = {0} and b = {1} is error = {2}".format(initial_m, initial_b, cost_Function(initial_m, initial_b, X, y)))

# Run the optimisation and show the fitted parameters.
m, b, hist = gradientDescent(X, y, initial_m, initial_b, lr, iterations)
print('Values obtained after {0} iterations are m = {1} and b = {2}'.format(iterations, m, b))

# Predictions of the fitted line on the training inputs.
y_hat = m * X + b
print('y_hat: ', y_hat)

# --- Best-fit line over the data ----------------------------------------
fig, ax = plt.subplots()
ax.scatter(X, y, c='r')
ax.plot(X, y_hat, c='y')
ax.set(xlabel='X', ylabel='y', title='Best Fit Line Plot')

# --- Prediction for a single new input ----------------------------------
predict_X = 76
predict_y = m * predict_X + b
print('predict_y: ', predict_y)

fig, ax = plt.subplots()
ax.scatter(X, y)
ax.scatter(predict_X, predict_y, c='r', s=100)  # highlight the prediction
ax.plot(X, y_hat, c='y')
ax.set(xlabel='X', ylabel='y', title='Prediction Plot')

# --- Cost per iteration (should decrease monotonically) -----------------
fig, ax = plt.subplots()
ax.plot(hist)
ax.set_title('Cost Function Over Time')