Machine Learning – Linear Regression
If you spot errors or have questions, feel free to get in touch; please credit the source when reposting.
Definition of linear regression
$$h(x) = \sum_{i=1}^{n} \theta_i x_i = \theta^T x$$
In the formula above, n is the feature dimension. The goal is to minimize the loss function J below, where m is the number of training samples:
$$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)^2$$
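The gradient-descent code below calls a helper compute_cost that evaluates this J(θ); its name and signature come from the later code in this post, but the body is not shown there, so here is a minimal NumPy sketch under the same array conventions:

import numpy as np

def compute_cost(X, y, theta):
    '''
    J(theta) = 1/(2m) * sum((X @ theta - y)^2)
    X: (m, n) ndarray, y: (m,) ndarray, theta: (n,) ndarray
    '''
    m = y.size
    residuals = X.dot(theta) - y          # h(x^(i)) - y^(i) for every sample
    return residuals.dot(residuals) / (2.0 * m)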
Gradient descent
$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)\cdot\frac{\partial h(x^{(i)})}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)\cdot x_j^{(i)}$$
- batch gradient descent
For each j:
$$\theta_j := \theta_j - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)\cdot x_j^{(i)}$$
def batch_gradient_descent(X, y, theta, alpha, num_iters):
    '''
    X: (m, n) ndarray, m is the number of training samples, n is the feature dimension
    y: (m,) ndarray
    '''
    m = y.size
    J_history = np.zeros(num_iters)
    for i in range(num_iters):
        predictions = np.dot(X, theta)        # h(x) for all m samples
        updates = X.T.dot(predictions - y)    # summed gradient over the whole batch
        theta = theta - alpha * (1.0 / m) * updates
        J_history[i] = compute_cost(X, y, theta)
    return theta, J_history
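A hedged usage sketch (the synthetic data, learning rate, and iteration count below are arbitrary choices, not from the post); a column of ones is prepended to X so that θ₀ plays the role of the intercept:

# hypothetical example: fit y ≈ 2 + 3x on synthetic data
m = 100
x = np.random.rand(m)
y = 2 + 3 * x + 0.1 * np.random.randn(m)
X = np.column_stack([np.ones(m), x])   # prepend the bias column -> shape (m, 2)
theta, J_history = batch_gradient_descent(X, y, np.zeros(2), alpha=0.5, num_iters=1000)
# theta should end up close to [2, 3], and J_history should be (roughly) decreasing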
- stochastic gradient descent
When there are many training samples, a full pass over all of them per update is expensive, so instead update with one sample at a time; updating with a small batch of samples at a time is called mini-batch (see the sketch after the SGD code below).
for i = 1:m {
    for j = 1:n {
        $\theta_j := \theta_j - \alpha\left(h(x^{(i)}) - y^{(i)}\right)\cdot x_j^{(i)}$
    }
}
def stochastic_gradient_descent(X, y, theta, alpha, num_iters):
    '''
    X: (m, n) ndarray, m is the number of training samples, n is the feature dimension
    y: (m,) ndarray
    '''
    m = y.size
    J_history = np.zeros(num_iters)
    for i in range(num_iters):
        for j in range(m):
            # update theta using the gradient of a single sample,
            # recomputing its prediction with the current theta
            prediction = X[j, :].dot(theta)
            updates = X[j, :] * (prediction - y[j])
            theta = theta - alpha * updates
        J_history[i] = compute_cost(X, y, theta)
    return theta, J_history
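Below is a minimal sketch of the mini-batch variant mentioned above; the batch_size parameter and the per-epoch shuffling are my own additions, not from the post. Each update uses the averaged gradient of a small slice of samples instead of a single one:

def mini_batch_gradient_descent(X, y, theta, alpha, num_iters, batch_size=32):
    '''
    X: (m, n) ndarray, y: (m,) ndarray
    '''
    m = y.size
    J_history = np.zeros(num_iters)
    for i in range(num_iters):
        perm = np.random.permutation(m)            # shuffle the sample order each epoch
        for start in range(0, m, batch_size):
            idx = perm[start:start + batch_size]
            predictions = X[idx].dot(theta)
            updates = X[idx].T.dot(predictions - y[idx])   # summed gradient of this mini-batch
            theta = theta - alpha * (1.0 / idx.size) * updates
        J_history[i] = compute_cost(X, y, theta)
    return theta, J_history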
Least squares (normal equation)
Writing the loss function in matrix form (dropping the constant $1/m$ factor, which does not change the minimizer):

$$\frac{1}{2}(X\theta - y)^T(X\theta - y) = \frac{1}{2}\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)^2$$
Taking the derivative with respect to $\theta$ gives

$$\nabla_\theta J(\theta) = X^T X\theta - X^T y,$$

and setting the derivative to zero at the minimum yields
$$\theta = (X^T X)^{-1} X^T y$$

The $\theta$ that minimizes $J(\theta)$ is thus obtained in one step from this closed-form solution.
def normal_eqn(X, y):
    # closed-form solution: theta = (X^T X)^{-1} X^T y
    # the pseudo-inverse is used so the call also works when X^T X is singular
    theta = np.linalg.pinv(X.T.dot(X)).dot(X.T).dot(y)
    return theta
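As a quick sanity check (a sketch; it reuses the synthetic X and y from the gradient-descent example above), the closed-form result should agree with NumPy's built-in least-squares solver:

theta_ne = normal_eqn(X, y)
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(theta_ne, theta_lstsq))   # expected: True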