Supervised Learning: Linear Regression

不想努力了TT

Last modified 2022-10-25 22:11:22

Category: Machine Learning

First published 2022-10-23 19:39:37

Article link: https://blog.csdn.net/Isaac_gk/article/details/127479270

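Introduction

    Regression is an important problem in supervised learning. It models the relationship between input variables and an output variable, in particular how the output value changes as the input values change. A regression model is a function that maps the input variables to the output variable.

    The goal of regression is to predict a numeric (continuous) target value.

    Linear regression: given a known data set, train the parameters of a linear model (written w here and $\theta$ in the formulas below) by gradient descent, then use the trained model to predict the unknown target value for new data.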

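Formal definition

Hypothesis function: $h_{\theta}(x) = \sum\limits_{i=0}^{n} \theta_{i}x_{i} = \theta^{T}x$ , where $x_{0} = 1$ and $n$ is the number of features

Loss function (for a single sample): $L(\theta) = (h_{\theta}(x) - y)^{2}$

Cost function (over all $m$ training samples, where the superscript $i$ denotes the $i$-th sample): $J(\theta) = \frac{1}{2m}\sum\limits_{i = 1}^{m}(h_{\theta}(x^{i})-y^{i})^{2}$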
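
As a rough illustration of these definitions, the following NumPy sketch evaluates the hypothesis and the cost on a tiny made-up data set. The function names `hypothesis` and `cost` and the sample values are assumptions for illustration and do not come from the original post.

```python
import numpy as np

def hypothesis(X, theta):
    """h_theta(x) = theta^T x for every row of X (X already contains x_0 = 1)."""
    return X @ theta

def cost(X, y, theta):
    """J(theta) = 1/(2m) * sum of squared residuals over the m samples."""
    m = X.shape[0]
    residual = hypothesis(X, theta) - y
    return np.sum(residual ** 2) / (2 * m)

# Hypothetical data: three samples, one feature; the first column is x_0 = 1
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 2.0, 3.0])

cost_perfect = cost(X, y, np.array([0.0, 1.0]))  # the line y = x fits exactly
cost_zero = cost(X, y, np.array([0.0, 0.0]))     # all-zero parameters
print(cost_perfect, cost_zero)
```

The exact fit gives a cost of zero, while the all-zero parameter vector leaves every residual equal to the target value, so its cost is strictly larger.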

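Gradient descent algorithm

      The main goal is to find the minimum of the objective function by iteration, or at least to converge toward it.

Let the objective function be $J(\theta)$ . Randomly initialize $\theta$ , choose a step size $\alpha$ and a number of iterations $T$ (a separate quantity from the sample count $m$ in the cost function), and compute the gradient $\nabla J(\theta)$
for $i = 1, \dots, T$ :
$\theta : = \theta - \alpha \nabla J(\theta)$
When the objective function is strictly convex and the step size is suitable, the iteration converges to the global minimum. Otherwise it may only reach a local minimum, and which one it reaches depends on the starting point $\theta$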
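
To make the dependence on the starting point concrete, here is a small sketch on a non-convex function. The function $f(x) = x^4 - 4x^2 + x$ is a hypothetical example chosen for illustration; it has two local minima, one near $x \approx 1.35$ and one near $x \approx -1.47$, and only the left one is the global minimum. The step size and iteration count are also assumptions.

```python
def grad_f(x):
    # Derivative of the hypothetical objective f(x) = x**4 - 4*x**2 + x
    return 4 * x**3 - 8 * x + 1

def gradient_descent(x0, alpha=0.01, iterations=1000):
    # Plain fixed-step gradient descent starting from x0
    x = x0
    for _ in range(iterations):
        x = x - alpha * grad_f(x)
    return x

x_from_right = gradient_descent(2.0)    # settles in the right-hand basin
x_from_left = gradient_descent(-2.0)    # settles in the left-hand basin
print(x_from_right, x_from_left)
```

The same update rule with the same step size ends in two different stationary points, purely because of where it started.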

A simple example implementation

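# Objective function f(x) = x**2 + 2*x + 5, specified directly
def f(x):
  return x**2 + 2*x + 5

# Its derivative f'(x) = 2*x + 2, also worked out by hand
def grad_f(x):
  return 2*x + 2

# Initialize x, alpha and the number of iterations
x = 3
alpha = 0.8
iteraterNum = 100

for i in range(iteraterNum):
  x = x - alpha * grad_f(x)  # one gradient descent step

print(x, f(x))  # x approaches the minimizer -1, where f(x) = 4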

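Why the parameters matter:
alpha determines how fast x moves downhill. If alpha is too small, progress is very slow. If alpha is too large, the iterates can oscillate and the method may fail to converge. For $f(x) = x^2 + 2x + 5$ the update is $x + 1 \leftarrow (1 - 2\alpha)(x + 1)$, so alpha = 0.8 oscillates around -1 but still converges, while any alpha above 1 diverges.
iteraterNum is the number of iterations. If it is too small, x may not get close enough to the minimum and the result will be imprecise. If it is too large, the extra iterations are wasted because x is already very close to the minimum. A common remedy is to set a tolerance and stop as soon as the change between two consecutive iterates falls below it.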
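
The tolerance-based stopping rule and the effect of alpha can be sketched as follows. The function name `gradient_descent_1d`, the tolerance `tol`, and the three step sizes are assumptions chosen for illustration; the objective is $f(x) = x^2 + 2x + 5$ with derivative $2x + 2$.

```python
def gradient_descent_1d(grad, x0, alpha, max_iters=1000, tol=1e-8):
    """Run gradient descent until two consecutive iterates differ by less than tol.

    Returns the final x and the number of iterations actually performed.
    """
    x = x0
    for i in range(max_iters):
        x_new = x - alpha * grad(x)
        if abs(x_new - x) < tol:
            return x_new, i + 1
        x = x_new
    return x, max_iters

def grad(x):
    # Derivative of f(x) = x**2 + 2*x + 5
    return 2 * x + 2

x_small, n_small = gradient_descent_1d(grad, 3.0, alpha=0.01)  # slow but steady
x_good, n_good = gradient_descent_1d(grad, 3.0, alpha=0.4)     # fast convergence
x_large, n_large = gradient_descent_1d(grad, 3.0, alpha=1.1)   # diverges
print(n_small, n_good, n_large)
```

With the small step size the loop needs many more iterations to satisfy the tolerance, the moderate step size stops after only a few, and the oversized step never satisfies it and uses up the whole iteration budget while moving away from the minimum.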

Solving linear regression with gradient descent

Gradient descent is used to find the $\theta$ that minimizes the cost function.

$\frac{\partial J(\theta)}{\partial \theta_{j}} = \frac{\partial }{\partial \theta_{j}} \frac{1}{2m}\sum\limits_{i=1}^{m}( h_{\theta}(x^{i})-y^{i})^{2} = 2 \times \frac{1}{2m} \sum\limits_{i=1}^{m} [ (h_{\theta}(x^{i})-y^{i}) \frac{\partial }{\partial \theta_{j}}(h_{\theta}(x^{i})-y^{i}) ]$
$= \frac{1}{m} \sum\limits_{i=1}^{m} [(h_{\theta}(x^{i})-y^{i})x_{j}^{i}]$ , because $\frac{\partial }{\partial \theta_{j}}(h_{\theta}(x^{i})-y^{i}) = x_{j}^{i}$

In summary, the update rule, applied to all $j$ simultaneously, is: $\theta_{j} : = \theta_{j} - \alpha \frac{1}{m} \sum\limits_{i=1}^{m}(h_{\theta}(x^{i})-y^{i})x_{j}^{i}$
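
One way to sanity-check this derivative is to compare it against a finite-difference approximation of the cost. The sketch below does that on random data; the helper names (`cost`, `analytic_gradient`, `numerical_gradient`), the random data, and the step `eps` are all assumptions for illustration.

```python
import numpy as np

def cost(X, y, theta):
    # J(theta) = 1/(2m) * sum((X theta - y)^2)
    m = X.shape[0]
    return np.sum((X @ theta - y) ** 2) / (2 * m)

def analytic_gradient(X, y, theta):
    # Vectorized form of (1/m) * sum_i (h(x^i) - y^i) * x_j^i for every j
    m = X.shape[0]
    return X.T @ (X @ theta - y) / m

def numerical_gradient(X, y, theta, eps=1e-5):
    # Central finite differences, one coordinate of theta at a time
    grad = np.zeros_like(theta)
    for j in range(theta.shape[0]):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (cost(X, y, theta + step) - cost(X, y, theta - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = np.hstack([np.ones((20, 1)), rng.normal(size=(20, 2))])  # x_0 = 1 column first
y = rng.normal(size=(20, 1))
theta = rng.normal(size=(3, 1))

max_diff = np.max(np.abs(analytic_gradient(X, y, theta) - numerical_gradient(X, y, theta)))
print(max_diff)
```

If the derivation is right, the two gradients should agree up to floating-point rounding, so `max_diff` should be very small.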

Linear regression implementation (corrected)

import numpy as np
import matplotlib.pyplot as plt

# Load the data set (comma-separated; the last column is the target)
def load_dataset():
  data = np.loadtxt('data/data.txt',delimiter=',')
  n = data.shape[1] - 1  # number of features
  X = data[:,0:n]
  y = data[:,-1].reshape(-1,1)
  return X,y


# Feature normalization (zero mean, unit sample standard deviation per column)
def feature_normalize(X):
  mu = np.average(X,axis=0)
  sigma = np.std(X,axis=0,ddof=1)
  X = (X-mu)/sigma
  return X,mu,sigma


# Compute the cost function J(theta)
def compute_cost(X,y,theta):
  m = X.shape[0]
  return np.sum(np.power(np.dot(X,theta)-y,2))/(2*m)


# Gradient descent
def gradient_descent(X,y,theta,iterations,alpha):
  c = np.ones(X.shape[0])
  X = np.insert(X,0,values = c, axis=1)  # prepend a column of ones for the intercept term
  m = X.shape[0]
  costs = np.zeros(iterations)	# cost after each iteration
  for num in range(iterations):
    # update every theta_j simultaneously: theta := theta - (alpha/m) * X^T (X theta - y)
    theta = theta - (alpha/m)*np.dot(X.T,np.dot(X,theta)-y)
    costs[num] = compute_cost(X,y,theta)
  return theta,costs


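# Prediction function
def predict(X,mu,sigma,theta) :
  X = (np.asarray(X,dtype=float)-mu)/sigma  # apply the training-set normalization
  c = np.ones(X.shape[0])
  X = np.insert(X,0,values=c,axis=1)
  return np.dot(X,theta)
  
  
# Main
if __name__ == '__main__':
  X,y = load_dataset()
  X,mu,sigma = feature_normalize(X)
  theta = np.zeros(X.shape[1]+1).reshape(-1,1)
  iterations = 400
  alpha = 0.01
  theta,costs = gradient_descent(X,y,theta,iterations,alpha)
  # Plot the cost against the iteration number
  x_axis = np.linspace(1,iterations,iterations)
  plt.plot(x_axis,costs[0:iterations])
  plt.show()
  # Plot the (normalized) data and the fitted line; only meaningful with a single feature
  if X.shape[1] == 1:
    plt.scatter(X,y)
    h_theta = theta[0] + theta[1]*X
    plt.plot(X,h_theta)
    plt.show()
  # Predict a new sample; it needs one value per feature column in data.txt (two here)
  print(predict([[14,164]],mu,sigma,theta))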
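
Because `data/data.txt` is not included with the post, the following self-contained sketch runs the same kind of pipeline on synthetic data and compares the result with NumPy's least-squares solver as a sanity check. The synthetic data, the helper names, and the hyperparameters are assumptions for illustration; `np.linalg.lstsq` is a different technique from gradient descent and is used here only as a reference answer.

```python
import numpy as np

def add_intercept(X):
    # Prepend the x_0 = 1 column
    return np.hstack([np.ones((X.shape[0], 1)), X])

def gradient_descent(Xb, y, alpha, iterations):
    # Batch gradient descent on J(theta) = 1/(2m) * sum((Xb theta - y)^2)
    m = Xb.shape[0]
    theta = np.zeros((Xb.shape[1], 1))
    for _ in range(iterations):
        theta = theta - (alpha / m) * Xb.T @ (Xb @ theta - y)
    return theta

# Hypothetical data: two features on very different scales plus a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) * [5.0, 50.0] + [10.0, 150.0]
noise = rng.normal(scale=0.1, size=(100, 1))
y = (3.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1]).reshape(-1, 1) + noise

# Normalize the features, then add the intercept column
mu = X.mean(axis=0)
sigma = X.std(axis=0, ddof=1)
Xb = add_intercept((X - mu) / sigma)

theta_gd = gradient_descent(Xb, y, alpha=0.1, iterations=2000)
theta_ls, *_ = np.linalg.lstsq(Xb, y, rcond=None)  # reference solution
print(np.hstack([theta_gd, theta_ls]))
```

On normalized features the two parameter vectors should agree closely, which suggests the gradient descent loop is minimizing the same cost that least squares solves directly.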