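Gradient Descent for Univariate Linear Regression
The assignment requirements and related materials for the linear regression part of Andrew Ng's machine learning course can be downloaded from NetEase Cloud Classroom.
Plot of the experimental result: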
Formulas:
Hypothesis function:
$$h_{\theta}(x)=\theta_0+\theta_1x$$
Cost function:
$$J(\theta_0,\theta_1)=\frac{1}{2m}\sum_{i=1}^m(h_{\theta}(x^{(i)})-y^{(i)})^2$$
The cost function is a function of the parameters $\theta$. The goal is to find the $\theta_0$ and $\theta_1$ that minimize it.
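As a quick illustration of the cost function, the sketch below evaluates $J(\theta_0,\theta_1)$ on a tiny made-up dataset. The function name `compute_cost` and the data are assumptions for illustration only; they are not part of the course material.

```python
import numpy as np

def compute_cost(x, y, theta0, theta1):
    """Squared-error cost J(theta0, theta1) with the 1/(2m) factor."""
    m = len(y)
    residuals = theta0 + theta1 * x - y   # h_theta(x^(i)) - y^(i) for every i
    return np.sum(residuals ** 2) / (2 * m)

# Made-up data that lies exactly on the line y = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

print(compute_cost(x, y, 1.0, 2.0))  # perfect fit -> 0.0
print(compute_cost(x, y, 0.0, 0.0))  # all-zero parameters -> 10.5
```

The perfect fit gives a cost of zero, and any other choice of parameters gives a larger value, which is what gradient descent exploits.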
Gradient descent: an iterative method for finding a minimum of a function.
$$\theta_j=\theta_j-\alpha\frac{\partial}{\partial\theta_j}J(\theta_0,\theta_1)$$ (for $j=0$ and $j=1$)
For $j=0$: $$\theta_0=\theta_0-\alpha\frac{\partial}{\partial\theta_0}J(\theta_0,\theta_1)$$
For $j=1$: $$\theta_1=\theta_1-\alpha\frac{\partial}{\partial\theta_1}J(\theta_0,\theta_1)$$
$\alpha$ is the learning rate.
$\frac{\partial}{\partial\theta_j}J(\theta_0,\theta_1)$ is the partial derivative of the cost function with respect to $\theta_0$ or $\theta_1$.
Substitute the hypothesis function into the cost function and take the partial derivative:
$$\frac{\partial}{\partial\theta_j}\frac{1}{2m}\sum_{i=1}^{m}(\theta_0+\theta_1x^{(i)}-y^{(i)})^2$$
$$\frac{\partial}{\partial\theta_0}J(\theta_0,\theta_1)=\frac{1}{m}\sum_{i=1}^{m}(\theta_0+\theta_1x^{(i)}-y^{(i)})$$
$$\frac{\partial}{\partial\theta_1}J(\theta_0,\theta_1)=\frac{1}{m}\sum_{i=1}^{m}((\theta_0+\theta_1x^{(i)}-y^{(i)})x^{(i)})$$
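One way to gain confidence in the two partial derivatives is to compare them with a central finite-difference approximation of the cost function. The sketch below does this on made-up data; the helper names `cost` and `gradients`, the sample points, and the step size `eps` are assumptions for illustration.

```python
import numpy as np

def cost(x, y, t0, t1):
    """J(theta0, theta1) = 1/(2m) * sum of squared errors."""
    return np.sum((t0 + t1 * x - y) ** 2) / (2 * len(y))

def gradients(x, y, t0, t1):
    """Analytic partial derivatives with respect to theta0 and theta1."""
    err = t0 + t1 * x - y
    return err.mean(), (err * x).mean()

# Made-up data and an arbitrary point in parameter space
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 2.5, 3.5, 5.0])
t0, t1, eps = 0.5, -0.3, 1e-6

g0, g1 = gradients(x, y, t0, t1)
# Central differences: (J(t + eps) - J(t - eps)) / (2 * eps)
n0 = (cost(x, y, t0 + eps, t1) - cost(x, y, t0 - eps, t1)) / (2 * eps)
n1 = (cost(x, y, t0, t1 + eps) - cost(x, y, t0, t1 - eps)) / (2 * eps)

print(g0, n0)  # the two values should agree closely
print(g1, n1)
```

If the analytic and numerical values disagree by more than rounding error, the derivative formula or its implementation is likely wrong.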
Iterate `iterations` times, updating both components of $\theta$ simultaneously on each iteration:
$$temp_0=\theta_0-\alpha\frac{\partial}{\partial\theta_0}J(\theta_0,\theta_1)$$
$$temp_1=\theta_1-\alpha\frac{\partial}{\partial\theta_1}J(\theta_0,\theta_1)$$
$$\theta_0=temp_0$$
$$\theta_1=temp_1$$
Finally, substitute the resulting $\theta$ back into the hypothesis function.
Code:
# -*- coding: utf-8 -*-
"""
Gradient descent for univariate linear regression (ex1data1.txt).
"""
import numpy as np
import matplotlib.pyplot as plt

# Load the data; the course file is comma-separated
data = np.loadtxt("ex1data1.txt", delimiter=",")
x = data[:, 0]
y = data[:, 1]
m = len(y)

fig = plt.figure(figsize=(8, 6))
plt.scatter(x, y)
plt.title("MarkerSize")
plt.xlabel("Population of City in 10,000s")
plt.ylabel("Profit in $10,000s")

ones = np.ones(m)                # x0 is a column of ones
x = np.column_stack((ones, x))   # prepend the x0 column to x
theta = np.zeros(2)              # initialize theta0 and theta1 to 0
iterations = 1500                # number of iterations
alpha = 0.01                     # learning rate

# cost function
def cost_function(m, theta):
    j = 0
    for i in range(m):
        j = j + np.square(theta[0] + theta[1] * x[i, 1] - y[i])
    return 1 / (2 * m) * j

def derivatives_theta0(m, theta):  # partial derivative with respect to theta0
    j = 0
    for i in range(m):
        j = j + (theta[0] + theta[1] * x[i, 1] - y[i])
    return 1 / m * j

def derivatives_theta1(m, theta):  # partial derivative with respect to theta1
    j = 0
    for i in range(m):
        j = j + ((theta[0] + theta[1] * x[i, 1] - y[i]) * x[i, 1])
    return 1 / m * j

for i in range(iterations):  # update theta0 and theta1 simultaneously
    temp0 = theta[0] - alpha * derivatives_theta0(m, theta)
    temp1 = theta[1] - alpha * derivatives_theta1(m, theta)
    theta[0] = temp0
    theta[1] = temp1

print(cost_function(m, theta))

# Plot the fitted line against the feature column only (x[:, 0] is all ones)
z = theta[0] + theta[1] * x[:, 1]
plt.plot(x[:, 1], z)
plt.show()
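The per-sample loops above can also be written with matrix operations. The following is a minimal sketch of that idea, not the author's implementation: the function name `gradient_descent`, the synthetic data, and the chosen `alpha` and `iterations` values are all assumptions for illustration.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, iterations=1500):
    """Batch gradient descent; X must already contain the column of ones."""
    m = len(y)
    theta = np.zeros(X.shape[1])
    for _ in range(iterations):
        error = X @ theta - y              # h_theta(x^(i)) - y^(i) for every i
        gradient = X.T @ error / m         # all partial derivatives at once
        theta = theta - alpha * gradient   # simultaneous update of every theta_j
    return theta

# Synthetic data (made up) that lies exactly on y = 1 + 2x
x = np.linspace(0, 5, 50)
y = 1 + 2 * x
X = np.column_stack((np.ones_like(x), x))

theta = gradient_descent(X, y, alpha=0.05, iterations=5000)
print(theta)  # should be close to [1, 2]
```

Because the whole gradient is computed before `theta` is reassigned, the simultaneous-update requirement is satisfied without temporary variables.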
Gradient Descent for Multivariate Linear Regression
# -*- coding: utf-8 -*-
"""
Created on Thu Jul 18 10:32:24 2019
@author: f_atm
"""
import numpy as np

# Columns: size, number of bedrooms, price (the course file is comma-separated)
data = np.loadtxt("ex1data2.txt", delimiter=",")
x1 = data[:, 0]  # size
x2 = data[:, 1]  # number of bedrooms
y = data[:, 2]   # price
m = len(y)

def featureNormalize(feature):
    # Feature scaling (mean normalization): subtract the mean, divide by the range
    mean_data = feature.mean()
    sn_data = feature.max() - feature.min()
    print(mean_data)
    print(sn_data)
    return (feature - mean_data) / sn_data, mean_data, sn_data

# hypothesis function h(x)=theta_0*x_0+theta_1*x_1+theta_2*x_2, with x_0 = 1
theta = np.zeros(3)
f_x1, mean_x1, range_x1 = featureNormalize(x1)
f_x2, mean_x2, range_x2 = featureNormalize(x2)

# cost function
def cost_function(m, theta):
    j = 0
    for i in range(m):
        j = j + np.square(theta[0] + theta[1] * f_x1[i] + theta[2] * f_x2[i] - y[i])
    return 1 / (2 * m) * j

def derivatives_theta0(m, theta):
    j = 0
    for i in range(m):
        j = j + (theta[0] + theta[1] * f_x1[i] + theta[2] * f_x2[i] - y[i])
    return 1 / m * j

def derivatives_theta1(m, theta):
    j = 0
    for i in range(m):
        j = j + ((theta[0] + theta[1] * f_x1[i] + theta[2] * f_x2[i] - y[i]) * f_x1[i])
    return 1 / m * j

def derivatives_theta2(m, theta):
    j = 0
    for i in range(m):
        j = j + ((theta[0] + theta[1] * f_x1[i] + theta[2] * f_x2[i] - y[i]) * f_x2[i])
    return 1 / m * j

iterations = 1500
alpha = 0.01
for i in range(iterations):  # update all three parameters simultaneously
    temp0 = theta[0] - alpha * derivatives_theta0(m, theta)
    temp1 = theta[1] - alpha * derivatives_theta1(m, theta)
    temp2 = theta[2] - alpha * derivatives_theta2(m, theta)
    theta[0] = temp0
    theta[1] = temp1
    theta[2] = temp2

def predict_house_price(size, num):
    # The inputs must be normalized with the same mean and range as the training data
    size = (size - mean_x1) / range_x1
    num = (num - mean_x2) / range_x2
    return theta[0] + theta[1] * size + theta[2] * num

print('%f' % predict_house_price(2104, 3))
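The note about normalizing the prediction inputs deserves a small example: a new input has to be scaled with the mean and range computed from the training data, not with values of its own. The sketch below shows the idea on a made-up list of house sizes; `fit_scaler` and `apply_scaler` are hypothetical helper names, not part of the script above.

```python
import numpy as np

def fit_scaler(feature):
    """Return the mean and range (max - min) of a training feature."""
    return feature.mean(), feature.max() - feature.min()

def apply_scaler(value, mean, value_range):
    """Mean normalization using statistics taken from the training data."""
    return (value - mean) / value_range

# Made-up training sizes, for illustration only
sizes = np.array([852.0, 1416.0, 2104.0, 3000.0, 4478.0])
mean, value_range = fit_scaler(sizes)

scaled = apply_scaler(sizes, mean, value_range)
print(scaled.mean())                 # approximately 0
print(scaled.max() - scaled.min())   # approximately 1

# A new input is scaled with the same training statistics
print(apply_scaler(2104.0, mean, value_range))
```

Keeping the mean and range as return values avoids copying rounded numbers from printed output into the prediction call.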
Normal Equation
# -*- coding: utf-8 -*-
"""
Created on Thu Jul 18 14:29:16 2019
@author: f_atm
"""
import numpy as np

# Columns: size, number of bedrooms, price (the course file is comma-separated)
data = np.loadtxt("ex1data2.txt", delimiter=",")
x1 = data[:, 0]  # size
x2 = data[:, 1]  # number of bedrooms
y = data[:, 2]   # price
m = len(y)

ones = np.ones(m)                     # x0 is a column of ones
x = np.column_stack((ones, x1, x2))

# Normal equation: theta = (X^T X)^(-1) X^T y, where @ is matrix multiplication
theta = np.linalg.inv(x.T @ x) @ x.T @ y

def predict_house_price(size, num):
    # No feature scaling is needed here, so raw inputs are used directly
    return theta[0] + theta[1] * size + theta[2] * num

print('%f' % predict_house_price(2104, 3))
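The normal equation $\theta=(X^TX)^{-1}X^Ty$ can be evaluated in more than one way. The script above uses `np.linalg.inv`; the sketch below compares it with `np.linalg.solve` and `np.linalg.lstsq`, which avoid forming the explicit inverse. The data is synthetic and generated from a known linear model, purely as an assumption for illustration.

```python
import numpy as np

# Made-up data generated from y = 3 + 2*x1 - 1*x2 (no noise)
rng = np.random.default_rng(0)
features = rng.uniform(0, 10, size=(20, 2))
y = 3 + 2 * features[:, 0] - features[:, 1]
X = np.column_stack((np.ones(len(y)), features))

# Explicit inverse, as in the normal equation formula
theta_inv = np.linalg.inv(X.T @ X) @ X.T @ y

# Solve the linear system (X^T X) theta = X^T y without forming the inverse
theta_solve = np.linalg.solve(X.T @ X, X.T @ y)

# Least-squares solver working directly on X
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(theta_inv)
print(theta_solve)
print(theta_lstsq)  # all three should be close to [3, 2, -1]
```

On well-conditioned data the three results agree; solving the system or using a least-squares routine is generally the more numerically robust choice when $X^TX$ is close to singular.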