Machine Learning - Linear Regression
Preface
Linear regression is one of the fundamental models in machine learning, and it is a supervised model.
Definition: linear regression is a statistical analysis method for determining the quantitative dependence between two or more variables.
Expression: $y = wx + b$
Parameters: $\theta_0, \theta_1$
Prediction: $h = \theta_0 + \theta_1 x = \begin{bmatrix}1 & x\end{bmatrix}\begin{bmatrix}\theta_0 \\ \theta_1\end{bmatrix} = X\theta$
h = np.dot(X,theta)
Cost: $J = \frac{1}{2m}\sum(h-y)^2$
J = 0.5 * np.mean((h - y) ** 2)
Gradient descent: $\Delta\theta_j = \frac{1}{m}X^T(h-y)$
deltatheta = (1.0 / m) * X.T.dot(h - y)
Parameter update: $\theta_j = \theta_j - \alpha\,\Delta\theta_j$
theta = theta - alpha * deltatheta
Regularization
Regularized cost: $J = \frac{1}{2m}\sum(h-y)^2 + \frac{\lambda}{2m}\theta^2$
Regularized gradient descent: $\Delta\theta_j = \frac{1}{m}X^T(h-y) + \frac{\lambda}{m}\theta$ (the penalty term's derivative carries $\frac{\lambda}{m}$, since $\frac{\partial}{\partial\theta}\frac{\lambda}{2m}\theta^2 = \frac{\lambda}{m}\theta$)
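The regularized cost and gradient above can be sketched as two NumPy functions (a minimal sketch; the function names `reg_cost` and `reg_gradient`, and the common convention of excluding the bias term `theta[0]` from the penalty, are assumptions not stated in the original):

```python
import numpy as np

def reg_cost(X, y, theta, lam):
    # J = (1/2m) * sum((h - y)^2) + (lambda/2m) * sum(theta^2),
    # with the bias term theta[0] left out of the penalty.
    m = len(y)
    h = X.dot(theta)
    penalty = (lam / (2 * m)) * np.sum(theta[1:] ** 2)
    return 0.5 * np.mean((h - y) ** 2) + penalty

def reg_gradient(X, y, theta, lam):
    # Delta theta = (1/m) * X.T @ (h - y) + (lambda/m) * theta,
    # again not penalizing the bias term.
    m = len(y)
    h = X.dot(theta)
    grad = (1.0 / m) * X.T.dot(h - y)
    grad[1:] += (lam / m) * theta[1:]
    return grad
```

With `lam = 0` both functions reduce to the unregularized cost and gradient from the previous section.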
Univariate Linear Regression
import numpy as np
import matplotlib.pyplot as plt

x = [4, 3, 3, 4, 2, 2, 0, 1, 2, 5, 1, 2, 5, 1, 3]
y = [8, 6, 6, 7, 4, 4, 2, 4, 5, 9, 3, 4, 8, 3, 6]
X = np.c_[np.ones(len(x)), x]  # prepend a column of ones for the intercept
y = np.c_[y]                   # reshape y into a column vector

# Prediction: h = X @ theta
def mov(theta):
    h = np.dot(X, theta)
    return h

# Cost: J = (1/2m) * sum((h - y)^2)
def cos(h):
    j = 0.5 * np.mean((h - y) ** 2)
    return j

# Gradient descent
def grad(sums=10000, alph=0.1):
    m, n = X.shape
    theta = np.zeros((n, 1))
    j = np.zeros(sums)          # cost history, one entry per iteration
    for i in range(sums):
        h = mov(theta)
        j[i] = cos(h)
        te = (1 / m) * X.T.dot(h - y)  # gradient of the cost
        theta -= alph * te             # update step
    return h, j, theta

if __name__ == '__main__':
    h, j, theta = grad()
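As a sanity check on the script above, the intercept and slope found by gradient descent should agree with an ordinary least-squares fit. One way to verify this (the comparison against `np.polyfit` is my addition, not part of the original article) is:

```python
import numpy as np

x = [4, 3, 3, 4, 2, 2, 0, 1, 2, 5, 1, 2, 5, 1, 3]
y = [8, 6, 6, 7, 4, 4, 2, 4, 5, 9, 3, 4, 8, 3, 6]
X = np.c_[np.ones(len(x)), x]
Y = np.c_[y]

# Gradient descent with the same update rule as the article
theta = np.zeros((2, 1))
m = X.shape[0]
for _ in range(10000):
    h = X.dot(theta)
    theta -= 0.1 * (1 / m) * X.T.dot(h - Y)

# Closed-form least-squares fit for comparison
slope, intercept = np.polyfit(x, y, 1)
print(theta.ravel())      # [intercept, slope] from gradient descent
print(intercept, slope)   # the same values from np.polyfit
```

With a small dataset like this, 10000 iterations at a learning rate of 0.1 are more than enough for the two results to match to many decimal places.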