Parameters of train_test_split
test_size : float, int, None, optional
If float, should be between 0.0 and 1.0 and represent the proportion
of the dataset to include in the test split. If int, represents the
absolute number of test samples. If None, the value is set to the
complement of the train size. By default, the value is set to 0.25.
Since version 0.21, it remains 0.25 only
if ``train_size`` is unspecified; otherwise it complements
the specified ``train_size``.
train_size : float, int, or None, default None
If float, should be between 0.0 and 1.0 and represent the
proportion of the dataset to include in the train split. If
int, represents the absolute number of train samples. If None,
the value is automatically set to the complement of the test size.
random_state : int, RandomState instance or None, optional (default=None)
If int, random_state is the seed used by the random number generator;
If RandomState instance, random_state is the random number generator;
If None, the random number generator is the RandomState instance used
by `np.random`.
shuffle : boolean, optional (default=True)
Whether or not to shuffle the data before splitting. If shuffle=False
then stratify must be None.
stratify : array-like or None (default is None)
If not None, data is split in a stratified fashion, using this as
the class labels.
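The effect of ``stratify`` can be illustrated with a minimal sketch. It assumes the Iris dataset (150 samples, 50 per class) and an arbitrary seed of 0; the variable names `train_counts` and `test_counts` are introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # 150 samples, 3 classes of 50 each

# Stratified split: passing the labels as `stratify` keeps the class
# proportions of y the same in the training and test parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Count how many samples of each class ended up in each part.
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)
print(train_counts)  # [40 40 40]
print(test_counts)   # [10 10 10]
```

Without ``stratify``, the per-class counts in the test set would generally vary from one seed to another.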
from sklearn.model_selection import train_test_split
from sklearn import datasets
iris = datasets.load_iris()  # Iris dataset
X = iris.data
y = iris.target
# Split the data into 80% training and 20% test samples
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Print the sizes of the training and test sets
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
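As a further sketch of the parameters described above, the following toy example shows an integer ``test_size``, the reproducibility given by a fixed ``random_state``, and the behaviour of ``shuffle=False``. The toy arrays and the names `sizes`, `same_split` and `y_te_ordered` are assumptions made for this illustration, not part of the original code.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)  # 10 toy samples, 2 features each
y = np.arange(10)                 # labels 0..9, one per sample

# Integer test_size: an absolute number of test samples (3 of 10).
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=3, random_state=42)
sizes = (len(y_tr), len(y_te))

# The same integer seed used twice yields the same split.
_, _, _, y_te_again = train_test_split(X, y, test_size=3, random_state=42)
same_split = np.array_equal(y_te, y_te_again)

# shuffle=False: the data are not shuffled, so the test set is the
# last 3 samples in their original order.
_, _, _, y_te_ordered = train_test_split(X, y, test_size=3, shuffle=False)

print(sizes)         # (7, 3)
print(same_split)    # True
print(y_te_ordered)  # [7 8 9]
```

Fixing ``random_state`` to an integer is what makes a split such as the 80/20 one above repeatable between runs.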