Gradient Descent for Univariate Linear Regression: Theory, Code, and Explanation

Implementing linear regression with gradient descent.

y = mx + b, where m is the slope of the line and b is the bias (the intercept).

Dataset: Swedish Auto Insurance Dataset

Cost function / loss function: J = (1/2n) * sum((y_hat - y)^2)

where:

n is the total number of input data points.

y_hat is the predicted value of y, computed with the "m" and "b" values obtained from gradient descent.

y is the known label data.
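As a quick sanity check, the cost can be computed directly with NumPy. This is a minimal sketch; the toy arrays below are made up for illustration and are not from the insurance dataset:

# Minimal sketch: evaluating J on made-up toy data
import numpy as np

X_toy = np.array([1.0, 2.0, 3.0])
y_toy = np.array([2.0, 4.0, 6.0])
m, b = 2.0, 0.0                        # assumed parameter values for illustration

y_hat = m * X_toy + b                  # predictions
J = np.sum((y_hat - y_toy)**2) / (2 * len(X_toy))
print(J)                               # 0.0, since the toy fit is perfect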

Derivatives of the cost function with respect to the parameters:

Cost function: J = (1/2n) * sum(((m*X + b) - y)^2)

Derivative of the cost function with respect to m: dJ/dm = (1/n) * sum(X * ((m*X + b) - y))

Derivative of the cost function with respect to b: dJ/db = (1/n) * sum((m*X + b) - y)
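The sums come from differentiating each squared term with the chain rule; the 1/2 in the cost cancels the factor of 2 brought down by the exponent. Written out in LaTeX notation:

J(m,b) = \frac{1}{2n}\sum_{i=1}^{n}\bigl((m x_i + b) - y_i\bigr)^2

\frac{\partial J}{\partial m} = \frac{2}{2n}\sum_{i=1}^{n}\bigl((m x_i + b) - y_i\bigr)\cdot x_i = \frac{1}{n}\sum_{i=1}^{n} x_i\bigl((m x_i + b) - y_i\bigr)

\frac{\partial J}{\partial b} = \frac{1}{n}\sum_{i=1}^{n}\bigl((m x_i + b) - y_i\bigr)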

Parameter updates:

m = m - alpha * (dJ/dm)

b = b - alpha * (dJ/db)

where alpha is the learning rate.
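For reference, one update step can also be written in vectorized NumPy form. This is a minimal sketch; the step helper below is illustrative and is not part of the code in this post, which uses an explicit per-point loop instead:

# Minimal sketch: one vectorized gradient-descent step
# ("step" is a hypothetical helper, equivalent to summing per-point gradients)
import numpy as np

def step(m, b, X, y, alpha):
    n = len(X)
    error = (m * X + b) - y            # residuals y_hat - y
    grad_m = np.sum(X * error) / n     # dJ/dm
    grad_b = np.sum(error) / n         # dJ/db
    return m - alpha * grad_m, b - alpha * grad_b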

Code

# Import Dependencies
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

df = pd.read_excel('F:/datasets/slr06.xls', index_col=u'ID')
# Dataset (txt format): http://t.cn/RfHWAbI
# Dataset: Swedish Auto Insurance Dataset
# http://college.cengage.com/mathematics/brase/understandable_statistics/7e/students/datasets/slr/excel/slr06.xls

# convert data to array
X = np.array(df['X'], dtype=np.float64)
y = np.array(df['Y'], dtype=np.float64)

fig,ax = plt.subplots()
ax.scatter(X,y)

def cost_Function(m,b,X,y):
    # J = (1/2n) * sum((y_hat - y)^2), with y_hat = m*X + b
    return sum(((m*X + b) - y)**2)/(2*float(len(X)))

def gradientDescent(X,y,m,b,alpha,iters):
    # n: number of data points
    n = float(len(X))
    # List to store the cost at each iteration for analysis
    hist = []
    # Perform gradient descent for iters iterations
    for _ in range(iters):
        # Accumulate the gradients over all data points (batch gradient);
        # note the += — overwriting here would keep only the last point's gradient
        gradient_m = 0
        gradient_b = 0
        for i in range(len(X)):
            gradient_m += (1/n) * X[i] * ((m*X[i] + b) - y[i])
            gradient_b += (1/n) * ((m*X[i] + b) - y[i])
        # Update the parameters against the gradient direction
        m = m - (alpha*gradient_m)
        b = b - (alpha*gradient_b)
        # Record the cost with the new values of "m" and "b"
        hist.append(cost_Function(m,b,X,y))
    return [m,b,hist]


# Learning Rate
lr = 0.0001
# Initial Values of "m" and "b"
initial_m = 0
initial_b = 0
# Number of Iterations
iterations = 1000
print("Starting gradient descent...")


# Check error with initial Values of m and b
print("Initial Error at m = {0} and b = {1} is error = {2}".format(initial_m, initial_b, cost_Function(initial_m, initial_b, X, y)))


[m,b,hist] = gradientDescent(X, y, initial_m, initial_b, lr, iterations)

print('Values obtained after {0} iterations are m = {1} and b = {2}'.format(iterations,m,b))


y_hat = (m*X + b)
print('y_hat: ',y_hat)

fig,ax = plt.subplots()
ax.scatter(X,y,c='r')
ax.plot(X,y_hat,c='y')
ax.set_xlabel('X')
ax.set_ylabel('y')
ax.set_title('Best Fit Line Plot')

predict_X = 76
predict_y = (m*predict_X + b)
print('predict_y: ',predict_y)



fig,ax = plt.subplots()
ax.scatter(X,y)
ax.scatter(predict_X,predict_y,c='r',s=100)
ax.plot(X,y_hat,c='y')
ax.set_xlabel('X')
ax.set_ylabel('y')
ax.set_title('Prediction Plot')


fig,ax = plt.subplots()
ax.plot(hist)
ax.set_title('Cost Function Over Time')

 
