Linear Regression with One Variable#
Import the required libraries#
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Read the data#
path = 'ex1data1.txt'  # path to the data file
data = pd.read_csv(path, header=None, names=['Population','Profit'])
# print(data.head())  # uncomment to preview the first five rows
Output:
| | Population | Profit |
|---|---|---|
| 0 | 6.1101 | 17.5920 |
| 1 | 5.5277 | 9.1302 |
| 2 | 8.5186 | 13.6620 |
| 3 | 7.0032 | 11.8540 |
| 4 | 5.8598 | 6.8233 |
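To make the effect of `header=None` and `names=[...]` concrete, here is a minimal sketch that parses only the five rows shown above from an in-memory string. It assumes the file is comma-separated with no header row, which is what the `read_csv` call implies; the `sample` and `df` names are illustrative and not part of the original code.

```python
import io
import pandas as pd

# Only the first five rows from the output above; the real file has more.
# Assumption: values are comma-separated and the file has no header line.
sample = io.StringIO(
    "6.1101,17.5920\n"
    "5.5277,9.1302\n"
    "8.5186,13.6620\n"
    "7.0032,11.8540\n"
    "5.8598,6.8233\n"
)

# header=None: treat the first line as data, not as column names.
# names=[...]: supply the column names ourselves.
df = pd.read_csv(sample, header=None, names=['Population', 'Profit'])

print(df.shape)   # (rows, columns)
print(df.dtypes)  # both columns should be parsed as floats
```

Without `header=None`, pandas would consume the first data row as column names, so one sample would be silently lost.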
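Plot a scatter plot#
data.plot(kind='scatter', x='Population', y='Profit', figsize=(12,8))
# plt.show()  # uncomment to display the figure
Output: a scatter plot with Population on the x-axis and Profit on the y-axis.
Compute the cost function#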
$$J(\theta)=\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2$$
where $h_\theta(x)=\theta^{T}x=\theta_0x_0+\theta_1x_1+\dots+\theta_nx_n$ (here $n=1$ and $x_0=1$).
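For the single-feature case used here ($n=1$, $x_0=1$), the cost can be written out explicitly. With the parameters initialized to zero, every prediction is zero, so the cost reduces to a scaled sum of the squared targets:

$$J(\theta_0,\theta_1)=\frac{1}{2m}\sum_{i=1}^{m}\left(\theta_0+\theta_1x^{(i)}-y^{(i)}\right)^2,\qquad J(0,0)=\frac{1}{2m}\sum_{i=1}^{m}\left(y^{(i)}\right)^2$$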
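len = len(data)  # number of training examples m (note: this name shadows the built-in len)
c = 0  # accumulated cost J(theta)
theta = np.zeros(2)  # [theta0, theta1], initialized to zero
t = []  # iteration numbers
cost = []  # cost value at each iteration
theta0 = []
theta1 = []
for j in range(len):
    c += 1.0/(2*len) * pow(theta[0] * 1 + theta[1] * data.Population[j] - data.Profit[j], 2)
# print(c)  # uncomment to print the cost
Output: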
32.072733877455676
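The loop above can also be expressed with NumPy array operations. The following is a minimal sketch, not the original author's code: `compute_cost` is a hypothetical helper name, and the three data points are synthetic (chosen so the result is easy to check by hand) rather than taken from `ex1data1.txt`.

```python
import numpy as np

def compute_cost(X, y, theta):
    """Return J(theta) = 1/(2m) * sum((X @ theta - y)^2)."""
    m = len(y)                     # number of training examples
    residuals = X @ theta - y      # h_theta(x^(i)) - y^(i) for every i
    return float(residuals @ residuals) / (2 * m)

# Synthetic data (an assumption for illustration): y equals x exactly.
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])  # prepend the x0 = 1 column

cost_zero = compute_cost(X, y, np.zeros(2))          # (1 + 4 + 9) / (2 * 3)
cost_fit = compute_cost(X, y, np.array([0.0, 1.0]))  # perfect fit, zero cost
print(cost_zero, cost_fit)
```

Applied to the real data, the same function would take `X` built from `data.Population` and `y = data.Profit.to_numpy()`, and with `theta` set to zeros it should agree with the loop result up to floating-point rounding.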
Implement gradient descent#
批量梯度下