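1. Basics
K-fold (k-fold cross-validation splitting)
KFold is a splitter class; its split() method is a generator that yields one (train indices, test indices) pair per fold.
sklearn.model_selection.KFold(<n_splits=integer k, at least 2>, <shuffle=boolean, True shuffles the dataset before splitting>, <random_state=random seed or random number generator, only used when shuffle=True>)
split(X[,y,groups])
X: training dataset, shape (n_samples, n_features)
y: labels, shape (n_samples,)
Splits the dataset into a training set and a test set; each fold serves as the test set exactly once.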
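To make the splitting rule concrete, here is a minimal pure-Python sketch of how unshuffled k-fold indices can be generated. The helper name kfold_indices is hypothetical and is not part of scikit-learn. It assumes the fold-size rule described in the scikit-learn documentation, where the first n_samples % n_splits folds receive one extra sample.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train, test) index lists for unshuffled k-fold splitting.

    Hypothetical helper for illustration; it is not part of scikit-learn.
    """
    if n_splits < 2 or n_splits > n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds receive one additional sample.
        size = base + 1 if fold < extra else base
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


for train, test in kfold_indices(9, 3):
    print("train:", train, "test:", test)
```

With 9 samples and 3 folds, the test folds are the consecutive blocks [0, 1, 2], [3, 4, 5] and [6, 7, 8], and the training indices are whatever remains in each round.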
2. Code without shuffling
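import numpy as np
from sklearn.model_selection import KFold

# Training dataset: 9 samples, 4 features each
X = np.array([[1, 2, 3, 4],
              [11, 12, 13, 14],
              [21, 22, 23, 24],
              [31, 32, 33, 34],
              [41, 42, 43, 44],
              [51, 52, 53, 54],
              [61, 62, 63, 64],
              [71, 72, 73, 74],
              [81, 82, 83, 84]])
# Labels
y = np.array([1, 1, 0, 0, 1, 1, 0, 0, 1])
# Do not shuffle the dataset before splitting
folder = KFold(n_splits=3, shuffle=False)
# No random seed is set: random_state only matters when shuffle=True
for train_index, test_index in folder.split(X, y):
    print("Train Index:", train_index)
    print("Test Index:", test_index)
    print("X_train:", X[train_index])
    print("X_test:", X[test_index])
    print("")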
3. Output
[out]:
Train Index: [3 4 5 6 7 8]
Test Index: [0 1 2]
X_train: [[31 32 33 34]
[41 42 43 44]
[51 52 53 54]
[61 62 63 64]
[71 72 73 74]
[81 82 83 84]]
X_test: [[ 1 2 3 4]
[11 12 13 14]
[21 22 23 24]]

Train Index: [0 1 2 6 7 8]
Test Index: [3 4 5]
X_train: [[ 1 2 3 4]
[11 12 13 14]
[21 22 23 24]
[61 62 63 64]
[71 72 73 74]
[81 82 83 84]]
X_test: [[31 32 33 34]
[41 42 43 44]
[51 52 53 54]]

Train Index: [0 1 2 3 4 5]
Test Index: [6 7 8]
X_train: [[ 1 2 3 4]
[11 12 13 14]
[21 22 23 24]
[31 32 33 34]
[41 42 43 44]
[51 52 53 54]]
X_test: [[61 62 63 64]
[71 72 73 74]
[81 82 83 84]]
4. Code with shuffling
# Shuffle the dataset before splitting; random_state=0 makes the shuffle reproducible
shuffle_folder = KFold(n_splits=3, random_state=0, shuffle=True)
for train_index, test_index in shuffle_folder.split(X, y):
    print("Shuffled Train Index:", train_index)
    print("Shuffled Test Index:", test_index)
    print("Shuffled X_train:", X[train_index])
    print("Shuffled X_test:", X[test_index])
    print("")
5. Output
[out]:
Shuffled Train Index: [0 3 4 5 6 8]
Shuffled Test Index: [1 2 7]
Shuffled X_train: [[ 1 2 3 4]
[31 32 33 34]
[41 42 43 44]
[51 52 53 54]
[61 62 63 64]
[81 82 83 84]]
Shuffled X_test: [[11 12 13 14]
[21 22 23 24]
[71 72 73 74]]

Shuffled Train Index: [0 1 2 3 5 7]
Shuffled Test Index: [4 6 8]
Shuffled X_train: [[ 1 2 3 4]
[11 12 13 14]
[21 22 23 24]
[31 32 33 34]
[51 52 53 54]
[71 72 73 74]]
Shuffled X_test: [[41 42 43 44]
[61 62 63 64]
[81 82 83 84]]

Shuffled Train Index: [1 2 4 6 7 8]
Shuffled Test Index: [0 3 5]
Shuffled X_train: [[11 12 13 14]
[21 22 23 24]
[41 42 43 44]
[61 62 63 64]
[71 72 73 74]
[81 82 83 84]]
Shuffled X_test: [[ 1 2 3 4]
[31 32 33 34]
[51 52 53 54]]
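
The shuffled folds are no longer consecutive blocks, but two properties should still hold: the same random_state gives the same folds on every run, and every sample lands in exactly one test fold. The sketch below checks both. The helper shuffled_test_folds and the placeholder array are assumptions made for illustration; only KFold and its split() method come from scikit-learn.

```python
import numpy as np
from sklearn.model_selection import KFold

# Placeholder data (an assumption for this sketch): 9 samples, 2 features each
X_demo = np.arange(18).reshape(9, 2)


def shuffled_test_folds(seed):
    """Return the test indices of each shuffled fold as a list of lists."""
    kf = KFold(n_splits=3, shuffle=True, random_state=seed)
    return [test.tolist() for _, test in kf.split(X_demo)]


first = shuffled_test_folds(0)
second = shuffled_test_folds(0)

# Same seed, same folds
print(first == second)

# Every sample index appears in exactly one test fold
print(sorted(i for fold in first for i in fold) == list(range(9)))
```

Both checks should print True. Omitting random_state while keeping shuffle=True would generally give different folds on each run.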