参数:
n : int
数据集中的元素总数。
n_iter : int (default 10)
重新洗牌和分裂迭代次数。
test_size : float (default 0.1), int, or None
如果是float类型的数据, 这个数应该介于0-1.0之间,代表test集所占比例. 如果是int类型, 代表test集的数量. 如果为None, 值将自动设置为train集大小的补集
train_size : float, int, or None (default is None)
如果是float类型的数据 应该介于0和1之间,并表示数据集在train集分割中所占的比例 如果是int类型, 代表train集的样本数量. 如果为None, 值将自动设置为test集大小的补集
random_state : int or RandomState
用于随机抽样的伪随机数发生器状态。
>>> from sklearn import cross_validation >>> rs = cross_validation.ShuffleSplit(4, n_iter=3, ... test_size=.25, random_state=0) >>> len(rs) 3 >>> print(rs) ... ShuffleSplit(4, n_iter=3, test_size=0.25, ...) >>> for train_index, test_index in rs: ... print("TRAIN:", train_index, "TEST:", test_index) ... TRAIN: [3 1 0] TEST: [2] TRAIN: [2 1 3] TEST: [0] TRAIN: [0 2 1] TEST: [3]
>>> rs = cross_validation.ShuffleSplit(4, n_iter=3, ... train_size=0.5, test_size=.25, random_state=0) >>> for train_index, test_index in rs: ... print("TRAIN:", train_index, "TEST:", test_index) ... TRAIN: [3 1] TEST: [2] TRAIN: [2 1] TEST: [0] TRAIN: [0 2] TEST: [3] .. automethod:: __init__