sklearn.model_selection.train_test_split: splitting a dataset 8:2


lilibiu


Tags: sklearn, pytorch, python

Article link: https://blog.csdn.net/HJ33_/article/details/120432636


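This article shows how to use sklearn.model_selection.train_test_split to split a dataset into a training set and a test set at a ratio of 8:2. It explains what each parameter means, illustrates the usage with an example, and highlights the role of random_state in making the split reproducible.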


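For a PyTorch workflow, the data is split into training and test sets at a ratio of 8:2 using sklearn.model_selection.train_test_split.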
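A minimal sketch of the 8:2 split, assuming a small toy dataset (the arrays `X` and `y` and the seed value are made up for illustration and are not from the original article):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 100 samples with 4 features each, and binary labels.
X = np.arange(400).reshape(100, 4)
y = np.arange(100) % 2

# test_size=0.2 sends 20% of the samples to the test set; the other 80% train.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)  # (80, 4) (20, 4)
```

For a PyTorch workflow, the resulting NumPy arrays could then be converted to tensors, for example with torch.from_numpy.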

Diagram of the dataset split

[Figure: diagram of the dataset split; image not available]

sklearn.model_selection.train_test_split: call syntax and parameter meanings

[Image not available]

The parameters have the following meanings:

Parameters

*arrays : sequence of indexables with same length / shape[0]

    Allowed inputs are lists, numpy arrays, scipy-sparse matrices or pandas dataframes.

test_size : float or int, default=None

    If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples. If None, the value is set to the complement of the train size. If train_size is also None, it will be set to 0.25.

train_size : float or int, default=None

    If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples. If None, the value is automatically set to the complement of the test size.

random_state : int, RandomState instance or None, default=None

    Controls the shuffling applied to the data before applying the split. Pass an int for reproducible output across multiple function calls. See the scikit-learn Glossary.

shuffle : bool, default=True

    Whether or not to shuffle the data before splitting. If shuffle=False then stratify must be None.

stratify : array-like, default=None

    If not None, data is split in a stratified fashion, using this as the class labels. Read more in the scikit-learn User Guide.
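The effect of random_state, shuffle and stratify can be sketched as follows. The data here is a hypothetical imbalanced toy set (80 samples of class 0, 20 of class 1), and the seed 42 is an arbitrary choice:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced data: 80 samples of class 0, 20 of class 1.
X = np.arange(100).reshape(100, 1)
y = np.array([0] * 80 + [1] * 20)

# The same integer random_state yields the same split on every call.
a_train, a_test = train_test_split(X, test_size=0.2, random_state=42)
b_train, b_test = train_test_split(X, test_size=0.2, random_state=42)
same_split = np.array_equal(a_test, b_test)

# shuffle=False keeps the original order: the last 20% becomes the test set.
ordered_train, ordered_test = train_test_split(X, test_size=0.2, shuffle=False)

# stratify=y preserves the 4:1 class ratio in both subsets.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
train_counts = np.bincount(y_tr)
test_counts = np.bincount(y_te)

print(same_split, ordered_test[0, 0], train_counts, test_counts)
```

Stratification is mainly useful for classification with imbalanced labels, where a purely random 8:2 split might leave very few minority-class samples in the test set.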

Returns

splitting : list, length=2 * len(arrays)

    List containing train-test split of inputs.

    New in version 0.16: If the input is sparse, the output will be a scipy.sparse.csr_matrix. Else, output type is the same as the input type.
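To illustrate the return value, here is a small sketch that passes three inputs and gets six outputs back, ordered as a train/test pair for each input. The arrays, including the per-sample weights `w`, are hypothetical:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)   # row i holds the values 2*i and 2*i + 1
y = list(range(10))                # a plain Python list
w = np.linspace(0.1, 1.0, 10)      # hypothetical per-sample weights

result = train_test_split(X, y, w, test_size=0.2, random_state=0)

# Three inputs give six outputs: (train, test) for each input, in input order.
X_train, X_test, y_train, y_test, w_train, w_test = result

# All inputs are split with the same indices, so rows stay aligned:
# the first column of X is 2*i, which maps back to the label i.
rows_aligned = np.array_equal(X_train[:, 0] // 2, y_train)

print(len(result), type(y_train).__name__, rows_aligned)
```

Because the list input comes back as lists and the array inputs come back as arrays, the outputs can usually be passed on to the same code that handled the unsplit data.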

Here is an example:

>>> import numpy as np
>>> from sklearn.model_selection import train_test_split
>>> X
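The example above is cut off in the source. As a separate sketch (not the author's original example), the following shows how test_size and train_size behave as a float, as an int, and when both are left at None, using a hypothetical array of 10 samples:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 samples, 2 features.
X = np.arange(20).reshape(10, 2)

# Neither size given: test_size falls back to 0.25, rounded up to 3 samples.
d_train, d_test = train_test_split(X, random_state=0)

# Integer test_size: an absolute number of test samples.
i_train, i_test = train_test_split(X, test_size=3, random_state=0)

# Only train_size given: the test set is the complement (here 2 samples).
t_train, t_test = train_test_split(X, train_size=0.8, random_state=0)

print(len(d_train), len(d_test))  # 7 3
print(len(i_train), len(i_test))  # 7 3
print(len(t_train), len(t_test))  # 8 2
```

For an 8:2 split, passing either test_size=0.2 or train_size=0.8 is enough, since the other size is filled in as the complement.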
