Cross-Validation Methods

This post looks at the role of cross-validation in evaluating machine learning models, walking through the implementation of K-fold cross-validation, leave-one-out cross-validation, and leave-P-out cross-validation. Using the `KFold`, `LeaveOneOut`, and `LeavePOut` classes from `sklearn`, the dataset is split and an SGDRegressor model is trained on each split. For every method, the mean squared error is computed on both the training and validation sets to assess model performance.
  1. K-fold cross-validation
    Split the original data into K groups. Each subset in turn serves as the validation set, with the remaining K-1 subsets forming the training set, yielding K models. The average of the K models' validation-set accuracy is taken as the performance metric of the K-fold cross-validated classifier.
from sklearn.model_selection import KFold
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

kf = KFold(n_splits=10)

# new_train_pca_16, train, and target are defined in the earlier preprocessing steps
for k, (train_index, test_index) in enumerate(kf.split(new_train_pca_16)):
    train_data, test_data = train.values[train_index], train.values[test_index]
    train_target, test_target = target[train_index], target[test_index]
    clf = SGDRegressor(max_iter=1000, tol=1e-3)
    clf.fit(train_data, train_target)
    train_score = mean_squared_error(train_target, clf.predict(train_data))
    test_score = mean_squared_error(test_target, clf.predict(test_data))
    print(k, 'fold train MSE:', train_score)
    print(k, 'fold test MSE:', test_score)
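The loop above depends on variables defined earlier in the post (`new_train_pca_16`, `train`, `target`). As a self-contained sketch of the same K-fold procedure, using synthetic data in place of those variables (all names here are illustrative):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

# Synthetic regression data standing in for the blog's preprocessed dataset
rng = np.random.RandomState(0)
X = rng.rand(100, 5)
y = X @ np.array([1.0, 2.0, 3.0, 4.0, 5.0]) + 0.1 * rng.randn(100)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
fold_mse = []
for train_idx, val_idx in kf.split(X):
    model = SGDRegressor(max_iter=1000, tol=1e-3, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    fold_mse.append(mean_squared_error(y[val_idx], model.predict(X[val_idx])))

# The average validation MSE over the K folds is the cross-validation score
print('mean CV MSE:', float(np.mean(fold_mse)))
```

Averaging over folds is what makes the estimate more stable than a single train/validation split: every sample is used for validation exactly once.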

  2. Leave-one-out cross-validation
    The training set consists of all samples except one; the single held-out sample forms the validation set. For a dataset of N samples, this produces N distinct training sets and N distinct validation sets.
from sklearn.model_selection import LeaveOneOut

loo = LeaveOneOut()
for k, (train_index, test_index) in enumerate(loo.split(new_train_pca_16)):
    train_data, test_data = train.values[train_index], train.values[test_index]
    train_target, test_target = target[train_index], target[test_index]
    clf = SGDRegressor(max_iter=1000, tol=1e-3)
    clf.fit(train_data, train_target)
    train_score = mean_squared_error(train_target, clf.predict(train_data))
    test_score = mean_squared_error(test_target, clf.predict(test_data))
    print(k, 'fold train MSE:', train_score)
    print(k, 'fold test MSE:', test_score)
    if k >= 9:  # LOO produces N splits; stop after the first 10 to keep output short
        break
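A quick way to see the "N splits, one sample each" structure is to inspect the splitter directly; this small sketch uses a toy array in place of the real dataset:

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut

X = np.arange(20).reshape(10, 2)  # 10 samples, 2 features
loo = LeaveOneOut()

# LeaveOneOut yields exactly N splits, each with a single-sample validation set
n_splits = loo.get_n_splits(X)
sizes = [len(val_idx) for _, val_idx in loo.split(X)]
print(n_splits, set(sizes))  # → 10 {1}
```

This is why LOO gets expensive quickly: for N samples, N models must be trained, which is also the reason the blog's loop breaks after the first 10 iterations.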
  3. Leave-P-out cross-validation
    Remove P samples from the complete dataset, generating every possible training/validation set combination.
from sklearn.model_selection import LeavePOut

lpo = LeavePOut(p=10)
for k, (train_index, test_index) in enumerate(lpo.split(new_train_pca_16)):
    train_data, test_data = train.values[train_index], train.values[test_index]
    train_target, test_target = target[train_index], target[test_index]
    clf = SGDRegressor(max_iter=1000, tol=1e-3)
    clf.fit(train_data, train_target)
    train_score = mean_squared_error(train_target, clf.predict(train_data))
    test_score = mean_squared_error(test_target, clf.predict(test_data))
    print(k, 'fold train MSE:', train_score)
    print(k, 'fold test MSE:', test_score)
    if k >= 9:  # LeavePOut enumerates far too many splits to run in full
        break
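Unlike K-fold, `LeavePOut` enumerates all C(N, p) ways of holding out p samples, so the number of splits grows combinatorially. A small sketch with a toy array (illustrative names only) makes the count explicit:

```python
import numpy as np
from math import comb
from sklearn.model_selection import LeavePOut

X = np.arange(12).reshape(6, 2)  # 6 samples, 2 features
lpo = LeavePOut(p=2)

# LeavePOut generates C(N, p) splits: here C(6, 2) = 15
n_splits = sum(1 for _ in lpo.split(X))
print(n_splits, comb(6, 2))  # → 15 15
```

With p=10 on a realistically sized dataset, C(N, 10) is astronomically large, which is why the loop above only samples the first few splits rather than iterating to completion.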