Machine Learning Quick Notes Series: sklearn.model_selection.KFold
KFold
class sklearn.model_selection.KFold(n_splits=5, *, shuffle=False, random_state=None)
K-Folds cross-validator
Provides train/test indices to split data into train/test sets. Splits the dataset into k consecutive folds (without shuffling by default).
Each fold is then used once as a validation set while the k - 1 remaining folds form the training set.
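To make the "consecutive folds" behaviour concrete, here is a minimal sketch on a toy array. The data and the variable names (`X`, `folds`) are made up for illustration. With the default `shuffle=False`, each test fold is a contiguous block of indices.

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy data (an assumption for illustration): 6 samples, 2 features.
X = np.arange(12).reshape(6, 2)

kf = KFold(n_splits=3)  # shuffle=False by default

# Collect (train indices, test indices) for every split.
folds = [(train_idx.tolist(), test_idx.tolist()) for train_idx, test_idx in kf.split(X)]

for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
# train: [2, 3, 4, 5] test: [0, 1]
# train: [0, 1, 4, 5] test: [2, 3]
# train: [0, 1, 2, 3] test: [4, 5]
```

Every sample lands in the test set exactly once, and the remaining k - 1 folds form the training set for that iteration.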
Parameters
n_splits : int, default=5
Number of folds. Must be at least 2.
(Changed in version 0.22: n_splits default value changed from 3 to 5.)
shuffle : bool, default=False
Whether to shuffle the data before splitting it into folds. Note that the samples within each split will not be shuffled.
random_state : int or RandomState instance, default=None
When shuffle is True, random_state affects the ordering of the indices, which controls the randomness of each fold. Otherwise, this parameter has no effect. Pass an int for reproducible output across multiple function calls. See Glossary.
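The interaction of `n_splits`, `shuffle` and `random_state` can be checked with a small experiment. This is only a sketch: the toy data and the helper `collect_test_folds` are hypothetical names introduced here and are not part of scikit-learn.

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy data (an assumption for illustration): 10 samples, 1 feature.
X = np.arange(10).reshape(-1, 1)

def collect_test_folds(cv):
    """Hypothetical helper: return the test indices of every split as lists."""
    return [test_idx.tolist() for _, test_idx in cv.split(X)]

plain = collect_test_folds(KFold(n_splits=3))
a = collect_test_folds(KFold(n_splits=3, shuffle=True, random_state=0))
b = collect_test_folds(KFold(n_splits=3, shuffle=True, random_state=0))

print(plain)   # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(a)       # fold membership is randomised, sizes stay 4, 3, 3
print(a == b)  # True: the same integer seed reproduces the same folds
```

Because 10 is not divisible by 3, the first fold receives the extra sample. With `shuffle=True` the assignment of samples to folds changes, but the indices inside each fold still come back in ascending order, which is what "the samples within each split will not be shuffled" refers to.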
Methods
get_n_splits(X=None, y=None, groups=None)
Returns the number of splitting iterations in the cross-validator.
- Parameters:
X : object
Always ignored, exists for compatibility.
y : object
Always ignored, exists for compatibility.
groups : object
Always ignored, exists for compatibility.
- Returns:
n_splits : int
Returns the number of splitting iterations in the cross-validator.
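As a quick illustration, `get_n_splits` can be called without any arguments, since `X`, `y` and `groups` are ignored. The toy data below is an assumption for the example. The returned value matches the number of splits that `split` actually produces.

```python
from sklearn.model_selection import KFold

kf = KFold(n_splits=4)

# No arguments needed: X, y and groups exist only for API compatibility.
n_iterations = kf.get_n_splits()
print(n_iterations)  # 4

# Toy data (an assumption for illustration): 8 samples, 1 feature.
X = [[i] for i in range(8)]
n_generated = len(list(kf.split(X)))
print(n_generated)  # 4
```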
split(X, y=None, groups=None)
Generate indices to split data into training and test sets.
- Parameters:
X : array-like of shape (n_samples, n_features)
Training data, where n_samples is the number of samples and n_features is the number of features.
y : array-like of shape (n_samples,), default=None
The target variable for supervised learning problems.
groups : array-like of shape (n_samples,), default=None
Group labels for the samples used while splitting the dataset into train/test set.
- Yields:
train : ndarray
The training set indices for that split.
test : ndarray
The testing set indices for t