Getting Started with Machine Learning: Univariate Linear Regression with Gradient Descent (Jupyter notebook code)

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt 
%matplotlib inline

Load the data. The file has two comma-separated columns, named x (input) and y (label) here.

In [2]:
data = pd.read_csv('ex1data1.txt', header=None, names=['x', 'y'])
print(data.head())
data.describe()  # summary statistics for x and y: count, mean, std, min/max, quartiles
        x        y
0  6.1101  17.5920
1  5.5277   9.1302
2  8.5186  13.6620
3  7.0032  11.8540
4  5.8598   6.8233
Out[2]:
               x          y
count  97.000000  97.000000
mean    8.159800   5.839135
std     3.869884   5.510262
min     5.026900  -2.680700
25%     5.707700   1.986900
50%     6.589400   4.562300
75%     8.578100   7.046700
max    22.203000  24.147000

Extract the training data: inputs and labels

In [3]:
X = data.loc[:, 'x']
Y = data.loc[:, 'y']
one = np.ones(X.shape)
X1 = np.array([one, X])  # row of ones stacked on top of x: shape (2, m)
#X1 = np.insert(X, 0, values=one, axis=0)
print(X1.shape)
print(Y.shape)
(2, 97)
(97,)
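
The row of ones is what lets a single matrix product evaluate theta1 + theta2*x for every sample at once. A minimal sketch of that idea, using hypothetical toy inputs (`x`, `theta`) rather than the values from ex1data1.txt:

```python
import numpy as np

# Hypothetical toy inputs standing in for the x column of the data set.
x = np.array([1.0, 2.0, 3.0])

# Stack a row of ones on top of x: shape (2, m), the same layout as X1 above.
X1 = np.vstack([np.ones_like(x), x])

# With theta = [theta1, theta2], theta @ X1 gives theta1 + theta2*x per sample.
theta = np.array([0.5, 2.0])
predictions = theta @ X1

print(X1.shape)
print(predictions)
```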

Iteratively compute theta (the parameters of the linear equation). The model being fitted is y = theta1 + theta2*x
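
For reference, the update that theta_cal below performs on each iteration can be written out as follows. This is reconstructed from the code, with \alpha standing for the learning rate argument `a` and m for the number of samples; both parameters are updated from the same old values of theta:

```latex
\theta_1 \leftarrow \theta_1 - \frac{\alpha}{m} \sum_{i=1}^{m} \left( \theta_1 + \theta_2 x_i - y_i \right)
\qquad
\theta_2 \leftarrow \theta_2 - \frac{\alpha}{m} \sum_{i=1}^{m} \left( \theta_1 + \theta_2 x_i - y_i \right) x_i
```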

In [4]:
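def theta_cal(X, Y, a, num):
    """Batch gradient descent; a is the learning rate, num the number of iterations."""
    theta1 = 0
    theta2 = 0
    res = np.array([0, 0])  # row 0 holds the initial theta
    for count in range(num):
        # accumulate the gradient terms over all samples using the current theta
        sum1 = 0
        sum2 = 0
        for i in range(X.shape[0]):
            sum1 += theta1+theta2*X[i]-Y[i]
            sum2 += (theta1+theta2*X[i]-Y[i])*X[i]
        # both updates use sums computed from the old theta values
        theta1 = theta1 - a*sum1/X.shape[0]
        theta2 = theta2 - a*sum2/X.shape[0]
        temp = np.array([theta1, theta2])
        res = np.vstack((res, temp))  # append this iteration's theta as a new row
    return res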
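
The inner Python loop over samples can be replaced by NumPy array operations. The following is a sketch of a vectorized equivalent, not the notebook's own code: the name `theta_cal_vectorized` and the synthetic data (`x_demo`, `y_demo`, generated from y = 1 + 2x) are assumptions made for illustration.

```python
import numpy as np

def theta_cal_vectorized(x, y, lr, num):
    """Batch gradient descent for y = theta1 + theta2*x.

    Returns an array of shape (num + 1, 2): the initial theta followed by
    the theta obtained after each iteration.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = x.shape[0]
    theta = np.zeros(2)
    history = [theta.copy()]
    for _ in range(num):
        err = theta[0] + theta[1] * x - y                  # residual per sample
        grad = np.array([err.sum(), (err * x).sum()]) / m  # gradient terms
        theta = theta - lr * grad                          # simultaneous update
        history.append(theta.copy())
    return np.array(history)

# Synthetic, noise-free data (an assumption for this demo): y = 1 + 2x.
x_demo = np.linspace(0.0, 1.0, 20)
y_demo = 1.0 + 2.0 * x_demo
hist = theta_cal_vectorized(x_demo, y_demo, 0.5, 2000)
print(hist[-1])
```

Collecting the rows in a list and converting once at the end also avoids re-copying the whole result array on every iteration.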

Compute the loss for each theta obtained

In [7]:
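def lost_fun(theta, X, Y):
    """Return one loss value per row of theta; X is the (2, m) matrix X1."""
    res = []
    for i in range(theta.shape[0]):
        lost = (theta[i].T).dot(X) - Y.T  # residuals for the i-th theta
        lost = lost**2
        # X.shape[0] is 2 here (not the sample count m), so this is half the
        # sum of squared errors; divide by X.shape[1] instead to get the mean
        lost = lost.sum()/X.shape[0]
        res.append(lost)
    return res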
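
The per-theta loop can also be collapsed into one matrix product. The sketch below computes the mean squared error for every row of theta at once. The function name `mse_per_theta` and the toy data are hypothetical. It averages over the m samples, so it differs from lost_fun above by a constant factor when X has shape (2, m).

```python
import numpy as np

def mse_per_theta(thetas, X1, y):
    """Mean squared error for each row of thetas; X1 has shape (2, m)."""
    residuals = thetas @ X1 - y           # shape (k, m): one row per theta
    return (residuals ** 2).mean(axis=1)  # average over the m samples

# Toy data (an assumption for this demo): y = 1 + 2x exactly.
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])
X1 = np.vstack([np.ones_like(x), x])

thetas = np.array([[0.0, 0.0],   # initial guess
                   [1.0, 2.0]])  # the exact parameters
losses = mse_per_theta(thetas, X1, y)
print(losses)
```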
In [13]:
a = theta_cal(X, Y, 0.01, 3000)
print(a)
[[ 0.          0.        ]
 [ 0.05839135  0.6532885 ]
 [ 0.06289175  0.77000978]
 ..., 
 [-3.87798708  1.19124606]
 [-3.87801916  1.19124929]
 [-3.87805118  1.1912525 ]]

Compute the loss at each gradient descent iteration

In [14]:
lost = lost_fun(a, X1, Y)
lost = np.array(lost)
x=np.arange(0,lost.shape[0],1)
plt.axis([0, x.shape[0], 430, 600])
plt.plot(x, lost)
plt.show()

Plot the data as a scatter plot, together with the fitted lines as they approach it

In [39]:
plt.scatter(X, Y)
x = np.linspace(0,25,25)
# after roughly the first 10 gradient descent steps the line already fits reasonably well
for i in range(11):  # theta after 0, 1, ..., 10 updates
    plt.plot(x, a[i,0]+a[i,1]*x)
plt.show()
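
One way to judge how close gradient descent has come is to compare its final theta with the closed-form least-squares line. A minimal sketch using `np.polyfit` follows. The data is synthetic and the generating parameters (-4.0, 1.2, unit noise) are assumptions chosen only to resemble the scale of the plot, not values taken from ex1data1.txt.

```python
import numpy as np

# Synthetic stand-in data (assumption): a noisy line over a similar x range.
rng = np.random.default_rng(0)
x = rng.uniform(5.0, 22.0, size=97)
y = -4.0 + 1.2 * x + rng.normal(0.0, 1.0, size=97)

# A degree-1 polyfit returns the least-squares [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)
print(intercept, slope)  # compare with [theta1, theta2] from gradient descent
```

If the last row returned by theta_cal is still far from these values, more iterations or a different learning rate may be needed.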