1. Two-fold cross-validation
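import numpy as np
from sklearn.model_selection import KFold

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
# y = np.array([1, 2, 3, 4])
kf = KFold(n_splits=2)
# 2-fold cross-validation: the data is split into a first half and a second half,
# and each half serves as the test set once
for train_index, test_index in kf.split(X):
    # train_index and test_index are arrays of row indices, not the rows themselves
    print('train_index', train_index, 'test_index', test_index)
# After the loop, train_index and test_index still hold the indices of the last fold
train_X = X[train_index]
test_X = X[test_index]
print("train_X", train_X)
print("test_X", test_X)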
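Note: because this is two-fold cross-validation, the dataset is split into two blocks, D1 and D2. Each block serves once as the training set and once as the test set.
Results: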
train_index [2 3] test_index [0 1]
train_index [0 1] test_index [2 3]
train_X [[1 2]
 [3 4]]
test_X [[1 2]
 [3 4]]
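In the output above, train_X and test_X appear only once (for the last fold), and they look identical because X contains repeated rows. The following is a minimal sketch, not the original author's code, that slices the data inside the loop so the subsets of every fold are kept. It assumes the commented-out y is the label vector; the folds list is a name introduced here purely for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
y = np.array([1, 2, 3, 4])  # assumption: labels taken from the commented-out line

kf = KFold(n_splits=2)
folds = []  # hypothetical container: one (train_X, train_y, test_X, test_y) tuple per fold
for train_index, test_index in kf.split(X):
    # Index X and y with the same index arrays so rows and labels stay aligned
    folds.append((X[train_index], y[train_index], X[test_index], y[test_index]))

for k, (train_X, train_y, test_X, test_y) in enumerate(folds, start=1):
    print("fold", k, "train_y", train_y, "test_y", test_y)
```

Printing the labels rather than the rows makes it easier to see which block was held out, since the labels are distinct even though the rows repeat.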
2. Three-fold cross-validation
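import numpy as np
from sklearn.model_selection import KFold

Y = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
# y = np.array([1, 2, 3, 4])
kf = KFold(n_splits=3)
# 3-fold cross-validation: the data is split into three consecutive blocks,
# and each block serves as the test set once
for i, (train_index, test_index) in enumerate(kf.split(Y), start=1):
    print(i)  # fold counter, starting at 1
    print('train_index', train_index, 'test_index', test_index)
    # train_index and test_index are arrays of row indices
    train_Y = Y[train_index]
    test_Y = Y[test_index]
    print("train_Y", train_Y)
    print("test_Y", test_Y)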
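Note: three-fold cross-validation splits the whole dataset into three parts, D1, D2 and D3. Each part serves once as the test set while the other two form the training set.
Results: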
# Fold 1: D2 and D3 form the training set, D1 is the test set
train_index [2 3 4 5] test_index [0 1]
train_Y [[ 5  6]
 [ 7  8]
 [ 9 10]
 [11 12]]
test_Y [[1 2]
 [3 4]]
# Fold 2: D1 and D3 form the training set, D2 is the test set
train_index [0 1 4 5] test_index [2 3]
train_Y [[ 1  2]
 [ 3  4]
 [ 9 10]
 [11 12]]
test_Y [[5 6]
 [7 8]]
# Fold 3: D1 and D2 form the training set, D3 is the test set
train_index [0 1 2 3] test_index [4 5]
train_Y [[1 2]
 [3 4]
 [5 6]
 [7 8]]
test_Y [[ 9 10]
 [11 12]]
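The results above suggest two properties of the split: within a fold the training and test indices never overlap, and across the three folds every sample is used for testing exactly once. A small sketch to check this on the same Y is shown below; test_counts is a helper name introduced here for illustration and is not part of scikit-learn.

```python
import numpy as np
from sklearn.model_selection import KFold

Y = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
kf = KFold(n_splits=3)

# test_counts[j] counts how many times row j lands in a test set (hypothetical helper)
test_counts = np.zeros(len(Y), dtype=int)
for train_index, test_index in kf.split(Y):
    # Within one fold, no row may be in both the training set and the test set
    assert np.intersect1d(train_index, test_index).size == 0
    # Together, the two index arrays must cover every row
    assert len(train_index) + len(test_index) == len(Y)
    test_counts[test_index] += 1

print(test_counts)  # [1 1 1 1 1 1]
```

The same check works for any n_splits that does not exceed the number of samples, so it can serve as a quick sanity test when changing the number of folds.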