Machine Learning: Andrew Ng Course Assignments (1: Univariate Linear Regression)

Latest recommended article published on 2024-06-19 21:37:28

Po1nterzz

Tags: machine learning, linear regression, algorithms

Article link: https://blog.csdn.net/m0_64080654/article/details/134058476

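I am a beginner in machine learning. I came across Andrew Ng's machine learning course by chance and worked through the lectures and assignments by following two uploaders on Bilibili. Here I organize the theory, reproduce the assignment in Python with Jupyter Notebook, and record the problems I ran into, which also helps consolidate what I learned. If you find any mistakes, corrections are welcome!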

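Reference: Andrew Ng Machine Learning Series, After-Class Exercise 01: Linear Regression, Bilibili video (original title: 吴恩达机器学习系列课后习题01-线性回归_哔哩哔哩_bilibili)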

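I. Univariate Linear Regression

Reading the data

import pandas as pd

data = pd.read_csv('ex1data1.txt', names=['population', 'profit'])
data.head()

pd.read_csv() reads the text file 'ex1data1.txt', stored in the same directory as the notebook, into the variable data. The file has no header row, so the names argument assigns the column names 'population' and 'profit'.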
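To see what the names argument does without needing the data file, here is a minimal sketch that reads a few rows from an in-memory string. The three sample rows and the variable names sample and df are assumptions made for illustration; they only imitate the two-column, headerless layout described above.

```python
import io
import pandas as pd

# Hypothetical sample rows in a two-column, headerless, comma-separated layout.
sample = io.StringIO("6.1101,17.592\n5.5277,9.1302\n8.5186,13.662\n")

# Because names= is given, the first line is treated as data, not as a header.
df = pd.read_csv(sample, names=['population', 'profit'])

print(df.shape)          # (3, 2)
print(list(df.columns))  # ['population', 'profit']
```

Without names, pandas would use the first data row as the header, and one sample would be lost.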

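The DataFrame method .head() returns the first five rows of the data by default.

data.head()

The DataFrame method .tail() returns the last five rows by default.

data.tail()

The DataFrame method .describe() summarizes each column: count, mean, standard deviation, minimum, quartiles, and maximum.

data.describe()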
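The three inspection methods can be tried on a tiny hand-made table, where the results are easy to verify by eye. This is only a sketch; the DataFrame df and its values are made up for illustration.

```python
import pandas as pd

# A made-up six-row table, small enough to check by hand.
df = pd.DataFrame({'population': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                   'profit': [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]})

first = df.head()      # default n=5: rows 0 to 4
last = df.tail(2)      # an explicit n is also accepted: rows 4 and 5
stats = df.describe()  # one row per statistic, one column per numeric column

print(list(stats.index))                # ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max']
print(stats.loc['mean', 'population'])  # 3.5
```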

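Drawing the scatter plot

The DataFrame method .plot.scatter() draws the data as a scatter plot. 'population' and 'profit' are the column names used for the x and y axes, and label sets the legend entry, here 'population'.

data.plot.scatter('population', 'profit', label='population')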

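Adding a constant column and slicing

Hypothesis: $h_{\theta}(x) = \theta_{0} + \theta_{1}x$

In this hypothesis, $\theta_{0}$ stands alone and has no corresponding feature. To evaluate the hypothesis as a single matrix product, we add a column of ones to the features, which acts as $x_{0} = 1$. The dimensions of the feature matrix and the parameter vector then match, and $h_{\theta}(x) = X\theta$.

The DataFrame method .insert() inserts a column at a given position.

data.insert(0, 'ones', 1)  # 0: insert at column position 0; 'ones': name of the new column; 1: fill every row with 1
data.head()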
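The effect of the column of ones can be checked numerically: with it, the matrix product gives the same values as evaluating the hypothesis sample by sample. A minimal sketch follows; the feature values and the parameters theta0 and theta1 are arbitrary numbers chosen for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])   # one feature, three samples
theta0, theta1 = 0.5, 2.0       # arbitrary illustrative parameters

X = np.column_stack([np.ones_like(x), x])   # shape (3, 2): columns [1, x]
theta = np.array([[theta0], [theta1]])      # shape (2, 1)

vectorized = X @ theta              # shape (3, 1)
elementwise = theta0 + theta1 * x   # shape (3,)

print(np.allclose(vectorized.ravel(), elementwise))  # True
```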

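Following the linear hypothesis above, we slice data: the 'ones' and 'population' columns form the feature matrix X, and the 'profit' column is the target y.

The DataFrame indexer .iloc performs position-based slicing. It is used with square brackets rather than called like a method.

In .iloc[ , ], the first entry selects rows, and : selects all rows. The second entry selects columns: 0:-1 runs from the first column up to, but not including, the last column, that is, a half-open interval.

X = data.iloc[:, 0:-1]
y = data.iloc[:, -1]
# check the result
X.head()
y.head()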
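The half-open column slice is easiest to see on a small table. The sketch below uses a made-up three-row DataFrame with the same column layout; df, X_part, and y_part are names chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'ones': [1, 1, 1],
                   'population': [6.1, 5.5, 8.5],
                   'profit': [17.6, 9.1, 13.7]})

X_part = df.iloc[:, 0:-1]   # all rows, every column except the last: a DataFrame
y_part = df.iloc[:, -1]     # all rows, last column only: a Series

print(list(X_part.columns))  # ['ones', 'population']
print(y_part.name)           # profit
```

Selecting a single column position returns a one-dimensional Series, which is why y needs extra care when it is later converted to a NumPy array.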

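Converting data types

NumPy arrays are more convenient than DataFrames for the matrix operations that follow, so we convert both with the .values attribute and then check their sizes with the .shape attribute. (pandas also provides .to_numpy() for the same purpose.)

X = X.values
y = y.values  # convert from pandas to NumPy arrays

X.shape
y.shape

Note: y came from a single column, so after conversion it is one-dimensional with shape (97,). It must be reshaped into a two-dimensional column with .reshape(). Otherwise, subtracting it from a (97, 1) array of predictions would broadcast to a (97, 97) matrix.

y = y.reshape(-1, 1)  # -1 lets NumPy infer the number of rows (97 here)
y.shape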
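Why the reshape matters can be shown with three numbers instead of 97. In this sketch, pred stands in for a column of predictions and y_1d for a target array that is still one-dimensional; both names and values are made up for illustration.

```python
import numpy as np

pred = np.array([[1.0], [2.0], [3.0]])   # shape (3, 1), a column of predictions
y_1d = np.array([1.0, 2.0, 3.0])         # shape (3,), a one-dimensional target

wrong = pred - y_1d                  # broadcasting pairs every row with every element
right = pred - y_1d.reshape(-1, 1)   # element-by-element difference

print(wrong.shape)   # (3, 3)
print(right.shape)   # (3, 1)
```

No error is raised in the first case, so the mistake would only show up as a wrong cost value.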

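Cost function: costFunction

With parameters $\theta_{0}, \theta_{1}$ and m training examples, the hypothesis and cost function are:

Hypothesis: $h_{\theta}(x) = \theta_{0} + \theta_{1}x$

$J(\theta_{0},\theta_{1}) = \frac{1}{2m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)}) - y^{(i)})^{2}$

In matrix form, initialize theta as a 2-by-1 zero matrix. Multiplying X by theta then gives the predictions for all examples at once.

import numpy as np

theta = np.zeros((2, 1))
HX = X @ theta  # predictions for all examples, one row each

def costFunction(X, y, theta):
    # squared error of every prediction
    preY2 = np.power(X @ theta - y, 2)
    # sum over the examples and divide by 2m
    return np.sum(preY2) / (2 * len(X))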
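A worked example on three points makes the formula concrete. The sketch below uses a hypothetical function compute_cost that mirrors costFunction, applied to made-up data lying exactly on the line y = x. With theta set to zero the residuals are -1, -2, -3, so the cost is (1 + 4 + 9) / (2 * 3) = 14 / 6.

```python
import numpy as np

def compute_cost(X, y, theta):
    """Half the mean squared error, as in the formula for J."""
    residuals = X @ theta - y
    return np.sum(residuals ** 2) / (2 * len(X))

# Made-up data on the line y = x, with the column of ones already added.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([[1.0], [2.0], [3.0]])

cost_zero = compute_cost(X, y, np.zeros((2, 1)))           # 14 / 6
cost_exact = compute_cost(X, y, np.array([[0.0], [1.0]]))  # perfect fit

print(np.isclose(cost_zero, 14 / 6))  # True
print(cost_exact)                     # 0.0
```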

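Gradient descent: gradientDescent

For every parameter $\theta_{j}$, the gradient descent update is:

$\theta_{j} := \theta_{j} - \alpha \frac{\partial}{\partial \theta_{j}} J(\theta)$

$\theta_{j} := \theta_{j} - \alpha \frac{1}{m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)}) - y^{(i)}) x_{j}^{(i)}$

All $\theta_{j}$ are updated simultaneously, with $x_{0}^{(i)} = 1$.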
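The partial derivative in the update rule comes from differentiating the cost function term by term. A short derivation, assuming $x_{0}^{(i)} = 1$ for the constant column and writing X for the matrix whose rows are the training examples:

$\frac{\partial}{\partial \theta_{j}} J(\theta) = \frac{1}{2m}\sum_{i=1}^{m} 2\,(h_{\theta}(x^{(i)}) - y^{(i)}) \frac{\partial h_{\theta}(x^{(i)})}{\partial \theta_{j}} = \frac{1}{m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)}) - y^{(i)})\, x_{j}^{(i)}$

Stacking the components for all j gives the vector form:

$\nabla_{\theta} J(\theta) = \frac{1}{m} X^{T}(X\theta - y), \qquad \theta := \theta - \frac{\alpha}{m} X^{T}(X\theta - y)$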

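Gradient descent in matrix form (returns the final theta and the cost after each iteration):

def gradientDescent(X, y, theta, alpha, iterations):
    costs = []  # cost after each update

    for i in range(iterations):
        # X.T @ (X @ theta - y) / m is the gradient of the cost
        theta = theta - alpha * (X.T @ (X @ theta - y)) / len(X)
        cost = costFunction(X, y, theta)
        costs.append(cost)

    return theta, costs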
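One way to gain confidence in the vectorized gradient is to compare it with a finite-difference approximation of the cost. The sketch below is not part of the original assignment; cost, analytic_grad, and numeric_grad are hypothetical helper names, and the data and theta values are made up.

```python
import numpy as np

def cost(X, y, theta):
    return np.sum((X @ theta - y) ** 2) / (2 * len(X))

def analytic_grad(X, y, theta):
    # The expression used inside the gradient descent update.
    return X.T @ (X @ theta - y) / len(X)

def numeric_grad(X, y, theta, eps=1e-6):
    # Central differences: perturb one parameter at a time.
    grad = np.zeros_like(theta)
    for j in range(theta.shape[0]):
        step = np.zeros_like(theta)
        step[j, 0] = eps
        grad[j, 0] = (cost(X, y, theta + step) - cost(X, y, theta - step)) / (2 * eps)
    return grad

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 4.0]])
y = np.array([[2.0], [3.0], [7.0]])
theta = np.array([[0.5], [-1.0]])

match = np.allclose(analytic_grad(X, y, theta), numeric_grad(X, y, theta), atol=1e-6)
print(match)  # True
```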

Initialize the required parameters and call the function.

alpha = 0.01       # learning rate
iterations = 2000  # number of gradient descent steps

theta, costs = gradientDescent(X, y, theta, alpha, iterations)
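A simple sanity check for the result of gradient descent is to compare it with the least-squares solution from np.linalg.lstsq. The sketch below does this on a small synthetic data set, not on ex1data1.txt; run_gd is a hypothetical helper that repeats the same update, and the learning rate 0.1 is chosen for this synthetic data only.

```python
import numpy as np

def run_gd(X, y, alpha, iterations):
    theta = np.zeros((X.shape[1], 1))
    for _ in range(iterations):
        theta = theta - alpha * (X.T @ (X @ theta - y)) / len(X)
    return theta

# Synthetic data: a line plus a small deterministic wiggle.
x = np.linspace(0.0, 2.0, 21)
X = np.column_stack([np.ones_like(x), x])
y = (1.0 + 3.0 * x + 0.2 * np.sin(7.0 * x)).reshape(-1, 1)

theta_gd = run_gd(X, y, alpha=0.1, iterations=2000)
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)   # closed-form least squares

print(np.allclose(theta_gd, theta_ls, atol=1e-6))  # True
```

If the two results disagree noticeably, the usual causes are too few iterations or a learning rate that is too small or too large for the data.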

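Plotting

1. Cost versus number of iterations

fig, ax = plt.subplots() creates a figure object (Figure) and a subplot object (Axes). fig is the whole figure, and ax is the subplot on which the drawing calls are made.

Then .plot() on ax draws a line chart with the iteration index on the x axis and the values in costs on the y axis.

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(np.arange(iterations), costs)
ax.set(xlabel='iterations', ylabel='cost', title='cost vs iterations')
plt.show()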
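The shape of the cost curve depends on the learning rate alpha: a suitable value makes the cost fall at every step, while a value that is too large makes it grow. The following sketch illustrates this on synthetic data, not on ex1data1.txt; run_gd is a hypothetical helper, and the two alpha values are picked for this particular data set.

```python
import numpy as np

def run_gd(X, y, alpha, iterations):
    theta = np.zeros((X.shape[1], 1))
    costs = []
    for _ in range(iterations):
        theta = theta - alpha * (X.T @ (X @ theta - y)) / len(X)
        costs.append(np.sum((X @ theta - y) ** 2) / (2 * len(X)))
    return theta, costs

# Synthetic data on the line y = 1 + 3x.
x = np.linspace(0.0, 2.0, 21)
X = np.column_stack([np.ones_like(x), x])
y = (1.0 + 3.0 * x).reshape(-1, 1)

_, costs_small = run_gd(X, y, alpha=0.1, iterations=50)   # converging run
_, costs_large = run_gd(X, y, alpha=1.0, iterations=50)   # diverging run

print(costs_small[0] > costs_small[-1])   # True: the cost went down
print(costs_large[-1] > costs_large[0])   # True: the cost blew up
```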

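2. Fitted line and scatter plot

np.linspace() creates the x values: the minimum and maximum of the second column of X are the endpoints, with 100 evenly spaced points from one to the other. The hypothesis then gives the predicted y for each point, and these predictions are drawn as the fitted line.

Hypothesis: $h_{\theta}(x) = \theta_{0} + \theta_{1}x$

x = np.linspace(X[:, -1].min(), X[:, -1].max(), 100)
y_ = theta[0, 0] + theta[1, 0] * x  # predictions along the line

fig, ax = plt.subplots()
ax.scatter(X[:, -1], y, label='training-data')
ax.plot(x, y_, 'r', label='predictLine')
ax.legend()
ax.set(xlabel='population', ylabel='profit')
plt.show()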
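The same hypothesis that draws the line can also predict the output for new inputs. A minimal sketch follows, assuming a hypothetical helper predict and made-up parameter values; the numbers in theta are for illustration only and are not the result of the training run.

```python
import numpy as np

# Hypothetical parameters, chosen only to make the arithmetic easy to follow.
theta = np.array([[-3.5], [1.2]])

def predict(theta, x):
    """Evaluate theta0 + theta1 * x for one or more inputs."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    X_new = np.column_stack([np.ones_like(x), x])   # add the constant column
    return (X_new @ theta).ravel()

preds = predict(theta, [3.5, 7.0])
print(np.allclose(preds, [0.7, 4.9]))  # True
```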

II. Complete Code

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# read the data
data = pd.read_csv('ex1data1.txt', names=['population', 'profit'])

# draw the scatter plot
data.plot.scatter('population', 'profit', label='population')
plt.show()

# add the constant column
data.insert(0, 'ones', 1)

# slice into features and target
X = data.iloc[:, 0:-1]
y = data.iloc[:, -1]

# convert to NumPy arrays
X = X.values
y = y.values
y = y.reshape(len(X), 1)

# initialize theta
theta = np.zeros((2, 1))

# cost function
def costFunction(X, y, theta):
    preY = X @ theta
    return np.sum(np.power((preY - y), 2)) / (2 * len(X))

# gradient descent
def gradientDescent(X, y, theta, alpha, iterations):
    costs = []

    for i in range(iterations):
        preY = X @ theta
        theta = theta - alpha * (X.T @ (preY - y)) / len(X)
        cost = costFunction(X, y, theta)
        costs.append(cost)

    # same return order as in the walkthrough: final theta first, then the costs
    return theta, costs

# initialize the parameters (note: this listing uses alpha = 0.003, the walkthrough uses 0.01)
alpha = 0.003
iterations = 2000
theta_final, costs = gradientDescent(X, y, theta, alpha, iterations)

# plotting
# 1. cost versus number of iterations
fig, ax = plt.subplots()
ax.plot(np.arange(iterations), costs, label='costs vs iterations')
ax.set(xlabel='iterations', ylabel='costFunction')
plt.show()

# 2. fitted line
fig, ax = plt.subplots()
x = np.linspace(X[:, -1].min(), X[:, -1].max(), 100)
preY = theta_final[0, 0] + theta_final[1, 0] * x

ax.scatter(X[:, -1], y, label='data')
ax.plot(x, preY, c='r', label='preY')
ax.legend()
ax.set(xlabel='population', ylabel='profit')

plt.show()