Coursera Andrew Ng Machine Learning (5): Exercise 5, Bias and Variance

芋圆乌龙茶

Article link: https://blog.csdn.net/qq_37751989/article/details/108033441

Contents

I. Data
- 1. Loading the data
- 2. Preparing the data
II. Defining the cost function
III. Defining the gradient
IV. Fitting the data

I. Data

import numpy as np
import matplotlib.pyplot as plt
from scipy.io import loadmat
import scipy.optimize as opt

1. Loading the data

data=loadmat(r'G:\Coursera-ML-AndrewNg-Notes\code\ex5-bias vs variance\ex5data1.mat')#raw string, so the backslashes are not read as escape sequences
X=data['X']
Xval=data['Xval']
Xtest=data['Xtest']
y=data['y']
yval=data['yval']
ytest=data['ytest']
#shapes: X (12, 1), Xval (21, 1), Xtest (21, 1)
#        y (12, 1), yval (21, 1), ytest (21, 1)

plt.scatter(X,y,color='b')
plt.xlabel('water level')
plt.ylabel('flow')
plt.show()

[Figure: scatter plot of the training data, water level vs. flow]

2. Preparing the data

X=np.insert(X,0,np.ones(X.shape[0]),axis=1)
Xval=np.insert(Xval,0,np.ones(Xval.shape[0]),axis=1)
Xtest=np.insert(Xtest,0,np.ones(Xtest.shape[0]),axis=1)
y=y.flatten()
yval=yval.flatten()
ytest=ytest.flatten()
theta=np.ones(X.shape[1])

II. Defining the cost function

def cost(theta,X,y):
    return np.sum(np.power(X@theta-y,2))/(2*len(X))
def regularized_cost(theta,X,y,l):
    return cost(theta,X,y)+l/(2*len(X))*np.sum(np.power(theta[1:],2))
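
As a quick check of the two functions above, here is a minimal worked example on made-up numbers (the arrays `X_demo`, `y_demo`, `perfect` and `shifted` are illustrative and not part of the exercise data). The values are small enough to verify by hand, and the regularized cost leaves the bias term `theta[0]` out of the penalty, as the code above does.

```python
import numpy as np

def cost(theta, X, y):
    # sum of squared errors divided by 2m, as in the post
    return np.sum(np.power(X @ theta - y, 2)) / (2 * len(X))

def regularized_cost(theta, X, y, l):
    # the bias term theta[0] is left out of the penalty
    return cost(theta, X, y) + l / (2 * len(X)) * np.sum(np.power(theta[1:], 2))

# Made-up data: a bias column plus one feature, with y equal to the feature.
X_demo = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y_demo = np.array([1.0, 2.0, 3.0])

perfect = np.array([0.0, 1.0])  # predicts y exactly
shifted = np.array([1.0, 1.0])  # every prediction is off by 1

cost_perfect = cost(perfect, X_demo, y_demo)                      # 0 / 6
cost_shifted = cost(shifted, X_demo, y_demo)                      # 3 / 6
reg_cost_shifted = regularized_cost(shifted, X_demo, y_demo, 3)   # 3/6 + (3/6) * 1**2
print(cost_perfect, cost_shifted, reg_cost_shifted)
```

With m = 3 the three squared errors of 1 give 3/6, and the penalty with l = 3 adds another 3/6 for the single non-bias weight.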

III. Defining the gradient

def gradient(theta,X,y):
    return (X.T@(X@theta-y))/len(X)
def regularized_gradient(theta,X,y,l):
    reg=(l/len(X))*theta
    reg[0]=0
    return gradient(theta,X,y)+reg
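
One common way to gain confidence in an analytic gradient like the one above is to compare it with a central finite-difference estimate of the cost. The sketch below is an added check, not part of the original exercise code; `numerical_gradient`, the random test data and the value `l_demo` are assumptions made for illustration.

```python
import numpy as np

def cost(theta, X, y):
    return np.sum(np.power(X @ theta - y, 2)) / (2 * len(X))

def regularized_cost(theta, X, y, l):
    return cost(theta, X, y) + l / (2 * len(X)) * np.sum(np.power(theta[1:], 2))

def gradient(theta, X, y):
    return (X.T @ (X @ theta - y)) / len(X)

def regularized_gradient(theta, X, y, l):
    reg = (l / len(X)) * theta
    reg[0] = 0  # the bias term is not regularized
    return gradient(theta, X, y) + reg

def numerical_gradient(f, theta, eps=1e-6):
    # hypothetical helper: central difference along each coordinate
    grad = np.zeros_like(theta, dtype=float)
    for j in range(theta.size):
        step = np.zeros_like(theta, dtype=float)
        step[j] = eps
        grad[j] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return grad

# Random made-up problem: 6 examples, a bias column and 2 features.
rng = np.random.default_rng(0)
X_demo = np.insert(rng.normal(size=(6, 2)), 0, 1.0, axis=1)
y_demo = rng.normal(size=6)
theta_demo = rng.normal(size=3)
l_demo = 1.5

analytic = regularized_gradient(theta_demo, X_demo, y_demo, l_demo)
numeric = numerical_gradient(lambda t: regularized_cost(t, X_demo, y_demo, l_demo), theta_demo)
max_diff = np.max(np.abs(analytic - numeric))
print(max_diff)  # should be tiny if cost and gradient agree
```

Because the cost is quadratic in theta, the central difference is exact up to floating-point rounding, so any sizeable difference would point to a mismatch between the cost and the gradient.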

IV. Fitting the data

def training(theta,X,y,l):
    result=opt.minimize(fun=regularized_cost,x0=theta,args=(X,y,l),method='TNC',jac=regularized_gradient,options={'disp': True})
    return result.x
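
For regularized linear regression the minimizer also has a closed form, which can serve as a sanity check on whatever the optimizer returns. This is a different technique from the `opt.minimize` call above (the regularized normal equation), and `normal_equation` plus the demo arrays are made up for illustration. Setting the gradient to zero gives $(X^TX+\lambda L)\theta=X^Ty$, where $L$ is the identity matrix with its top-left entry set to 0 so that the bias is not penalized.

```python
import numpy as np

def regularized_gradient(theta, X, y, l):
    reg = (l / len(X)) * theta
    reg[0] = 0  # the bias term is not regularized
    return (X.T @ (X @ theta - y)) / len(X) + reg

def normal_equation(X, y, l):
    # hypothetical helper: solve (X^T X + l * L) theta = X^T y
    penalty = np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + l * penalty, X.T @ y)

# Made-up data, not the values from ex5data1.mat.
x_demo = np.array([-15.0, -5.0, 0.0, 10.0, 20.0, 35.0])
y_demo = np.array([2.0, 1.0, 3.0, 8.0, 15.0, 30.0])
X_demo = np.column_stack([np.ones(x_demo.size), x_demo])

theta_closed = normal_equation(X_demo, y_demo, 1)
grad_at_solution = regularized_gradient(theta_closed, X_demo, y_demo, 1)
print(theta_closed, grad_at_solution)  # the gradient should be numerically zero
```

On the real data, `training(theta, X, y, l)` and `normal_equation(X, y, l)` should agree up to the optimizer's tolerance.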

1. Linear regression

final_theta=training(theta,X,y,1)
a=final_theta[1]
b=final_theta[0]

(1) Fitted line

plt.scatter(X[:,1],y,color='b',label='training data')
plt.plot(X[:,1],a*X[:,1]+b,color='r',label='prediction')
plt.xlabel('water level')
plt.ylabel('flow')
plt.legend()
plt.show()

[Figure: training data with the fitted regression line]

(2) Learning curves

Cost (error) on the training set and on the cross-validation set:
$J_{train}(\theta)=\frac{1}{2m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})^2$
$J_{cv}(\theta)=\frac{1}{2m_{cv}}\sum_{i=1}^{m_{cv}}(h_{\theta}(x_{cv}^{(i)})-y_{cv}^{(i)})^2$

training_cost=[]
cv_cost=[]
for i in range(1,len(X)+1):
    t=training(theta,X[:i,:],y[:i],l=0)
    tc=cost(t,X[:i,:],y[:i])
    cv=cost(t,Xval,yval)#evaluate on the full validation set
    training_cost.append(tc)
    cv_cost.append(cv)
plt.plot(range(1,len(X)+1),training_cost,color='b',label='training cost')
plt.plot(range(1,len(X)+1),cv_cost,color='r',label='cv cost')
plt.xlabel('m(training set size)')
plt.ylabel('error')
plt.legend()
plt.show()

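[Figure: learning curves for linear regression, training cost and cv cost vs. training set size]
As the number of training examples grows, both the training error and the cross-validation error stay high. This indicates high bias, that is, underfitting.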

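2. Polynomial regression

Normalization: every data set should be scaled with the mean and sample standard deviation of the training set. The training-set mean and sample standard deviation are therefore stored and reused on the validation and test data.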

def add_polynomial_feature(X,power):
    X_poly=X.copy()
    for i in range(2,power+1):
        X_poly=np.insert(X_poly,i,np.power(X_poly[:,1],i),axis=1)
    return X_poly

def means_stds(X):
    means=np.mean(X,axis=0)#mean of each column
    stds=np.std(X,axis=0,ddof=1)#sample standard deviation: np.std defaults to ddof=0 (population), ddof=1 gives the sample value, which is also the pandas default
    return means,stds

def feature_normalize(X,means,stds):
    X[:,1:]=(X[:,1:]-means[1:])/stds[1:]
    return X
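
To make the normalization rule concrete, here is a small sketch on made-up arrays (`X_train_demo` and `X_val_demo` are illustrative). The statistics come from the training rows only and are then applied to the validation rows, so the validation features need not end up with zero mean. The helpers are repeated from above so that the snippet runs on its own. Since `feature_normalize` writes into its argument, copies are passed in.

```python
import numpy as np

def means_stds(X):
    means = np.mean(X, axis=0)          # per-column mean
    stds = np.std(X, axis=0, ddof=1)    # per-column sample standard deviation
    return means, stds

def feature_normalize(X, means, stds):
    # modifies X in place; column 0 (the bias column) is left untouched
    X[:, 1:] = (X[:, 1:] - means[1:]) / stds[1:]
    return X

# Made-up data: bias column plus one feature.
X_train_demo = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
X_val_demo = np.array([[1.0, 2.0], [1.0, 4.0]])

means, stds = means_stds(X_train_demo)  # training statistics only
train_norm = feature_normalize(X_train_demo.copy(), means, stds)
val_norm = feature_normalize(X_val_demo.copy(), means, stds)

print(train_norm[:, 1])  # training feature centered on 0
print(val_norm[:, 1])    # validation feature on the same scale
```

The training feature has mean 2 and sample standard deviation 1, so 1, 2, 3 map to -1, 0, 1 and the validation values 2 and 4 map to 0 and 2.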

(1) Error versus polynomial degree

def plot_degree_error(degree):
    training_cost=[]
    cv_cost=[]
    for d in range(1,degree+1):
        X_poly=add_polynomial_feature(X,d)
        X_poly_means,X_poly_stds=means_stds(X_poly)
        X_poly=feature_normalize(X_poly,X_poly_means,X_poly_stds)
        Xval_poly=add_polynomial_feature(Xval,d)
        Xval_poly=feature_normalize(Xval_poly,X_poly_means,X_poly_stds)
        theta=np.ones(d+1)
        final_theta=training(theta,X_poly,y,0)
        tc=cost(final_theta,X_poly,y)
        cv=cost(final_theta,Xval_poly,yval)
        training_cost.append(tc)
        cv_cost.append(cv)
    plt.plot(range(1,degree+1),training_cost,color='b',label='training cost')
    plt.plot(range(1,degree+1),cv_cost,color='r',label='cv cost')
    plt.xlabel('degree of polynomial d')
    plt.ylabel('error')
    plt.legend()
    plt.show()

plot_degree_error(10)

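[Figure: training cost and cv cost vs. polynomial degree d]
The plot suggests that the degree-3 and degree-8 polynomials perform comparatively well.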

(2) Learning curves for different $\lambda$

power=8
X_poly=add_polynomial_feature(X,power)
X_poly_means,X_poly_stds=means_stds(X_poly)
X_poly=feature_normalize(X_poly,X_poly_means,X_poly_stds)

Xval_poly=add_polynomial_feature(Xval,power)
Xval_poly=feature_normalize(Xval_poly,X_poly_means,X_poly_stds)

Xtest_poly=add_polynomial_feature(Xtest,power)
Xtest_poly=feature_normalize(Xtest_poly,X_poly_means,X_poly_stds)

theta2=np.ones(power+1)

def plot_fitting_figure(l):
    final_theta=training(theta2,X_poly,y,l)
    x=np.linspace(X[:,1].min(),X[:,1].max(),100)
    xx=x.reshape(-1,1)
    xx=np.insert(xx,0,1,axis=1)
    xx=add_polynomial_feature(xx,power)
    xx=feature_normalize(xx,X_poly_means,X_poly_stds)
    yy=xx@final_theta
    plt.plot(x,yy,color='r',label='prediction')
    plt.scatter(X[:,1],y,color='b',label='training data')
    plt.xlabel('water level')
    plt.ylabel('flow')
    plt.legend()
    plt.show()

def plot_learning_curve(l):
    training_cost=[]
    cv_cost=[]
    for i in range(1,len(X)+1):
        t=training(theta2,X_poly[:i,:],y[:i],l)
        tc=regularized_cost(t,X_poly[:i,:],y[:i],0)
        cv=regularized_cost(t,Xval_poly,yval,0)#evaluate on the full validation set
        training_cost.append(tc)
        cv_cost.append(cv)
    plt.plot(range(1,len(X)+1),training_cost,color='b',label='training cost')
    plt.plot(range(1,len(X)+1),cv_cost,color='r',label='cv cost')
    plt.xlabel('m(training set size)')
    plt.ylabel('error')
    plt.legend()
    plt.show()
    return cv_cost[len(X)-1]

$\lambda=0$

plot_fitting_figure(l=0)
plot_learning_curve(l=0)

[Figure: polynomial fit and learning curves for $\lambda=0$]
The training cost is too low to be realistic. This is overfitting.

$\lambda=1$

plot_fitting_figure(l=1)
plot_learning_curve(l=1)

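[Figure: polynomial fit and learning curves for $\lambda=1$]
The training cost has risen a little and is no longer 0, which means the overfitting has been reduced.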

$\lambda=100$

plot_fitting_figure(l=100)
plot_learning_curve(l=100)

[Figure: polynomial fit and learning curves for $\lambda=100$]
Too much regularization: the model now underfits.
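
The move from overfitting to underfitting as $\lambda$ grows can also be seen in the size of the fitted weights. The sketch below uses synthetic data, a hypothetical degree of 6 and a closed-form solver (`ridge_closed_form`, an illustrative helper that stands in for the optimizer used in this post), and prints the norm of the non-bias weights for the three values of $\lambda$ tried above.

```python
import numpy as np

def ridge_closed_form(X, y, l):
    # hypothetical helper: regularized normal equation, bias term unpenalized
    penalty = np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + l * penalty, X.T @ y)

# Synthetic data (not ex5data1.mat): a noisy quadratic on 12 points.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 12)
y_demo = 2.0 * x**2 + 0.3 * rng.normal(size=x.size)

# Polynomial features up to an assumed degree, normalized per column.
degree = 6
features = np.column_stack([x**p for p in range(1, degree + 1)])
features = (features - features.mean(axis=0)) / features.std(axis=0, ddof=1)
X_demo = np.column_stack([np.ones(x.size), features])

norms = []
for l in [0, 1, 100]:
    theta_l = ridge_closed_form(X_demo, y_demo, l)
    norms.append(np.linalg.norm(theta_l[1:]))  # size of the penalized weights
    print(l, norms[-1])
```

A larger $\lambda$ pulls the non-bias weights toward zero. With $\lambda=0$ they are free to chase the noise, while with $\lambda=100$ the curve flattens toward the mean of y.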

(3) Choosing a suitable $\lambda$

l_candidate = [0, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10]
training_cost=[]
cv_cost=[]
for l in l_candidate:
    final_theta2=training(theta2,X_poly,y,l)
    tc=cost(final_theta2,X_poly,y)
    cv=cost(final_theta2,Xval_poly,yval)
    training_cost.append(tc)
    cv_cost.append(cv)
plt.plot(l_candidate,training_cost,color='b',label='training cost')
plt.plot(l_candidate,cv_cost,color='r',label='cv cost')
plt.xlabel('lambda')
plt.ylabel('error')
plt.legend()
plt.show()

[Figure: training cost and cv cost vs. $\lambda$]

l_candidate[np.argmin(cv_cost)]
# cv_cost is lowest at l=3

for l in l_candidate:
    final_theta=training(theta2,X_poly,y,l)
    print('cv cost(l={}) = {}'.format(l, cost(final_theta, Xval_poly, yval)))#the cost here is computed on the validation set

[Output: the cost printed for each candidate l]

final_theta=training(theta2,X_poly,y,l=3)
cost(final_theta, Xtest_poly, ytest)

The cost computed on the test set is 3.8598760845199265.
