Random sampling:
import pandas as pd
# Randomly draw 100 rows from the DataFrame df (sample is a DataFrame method, not a top-level pandas function)
df.sample(n=100)
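A minimal runnable sketch of the call above. The toy DataFrame and the column name "value" are made up for illustration; random_state is optional and only there to make the draw repeatable.

```python
import pandas as pd

# Hypothetical toy data: 1000 rows with a single numeric column.
df = pd.DataFrame({"value": range(1000)})

# Draw 100 rows without replacement; random_state fixes the seed.
sample_n = df.sample(n=100, random_state=0)

# Draw 10% of the rows instead of a fixed count.
sample_frac = df.sample(frac=0.1, random_state=0)

print(len(sample_n), len(sample_frac))
```

Use n for a fixed number of rows and frac for a proportion; the two cannot be passed together.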
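Stratified sampling:
Use the train_test_split function from sklearn.model_selection to sample flexibly.
Passing stratify=y keeps the class proportions of y the same in both splits.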
from sklearn.model_selection import train_test_split
# y is the column to stratify on (it can be one of the attribute columns of X)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, stratify=y)
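To draw a stratified sample from a single DataFrame rather than a train/test pair, one option is to pass the DataFrame itself and keep only the smaller split. The sketch below assumes a made-up imbalanced dataset with a "label" column; the names df, feature, and label are illustrative only. It also shows groupby(...).sample, available in pandas 1.1 and later, as an alternative.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced data: 900 rows of class "a", 100 rows of class "b".
df = pd.DataFrame({
    "feature": range(1000),
    "label": ["a"] * 900 + ["b"] * 100,
})

# Keep the 10% split as the stratified sample; it preserves the 9:1 class ratio.
_, sample = train_test_split(
    df, test_size=0.1, stratify=df["label"], random_state=0
)

# Alternative with pandas alone: sample 10% within each label group.
by_group = df.groupby("label").sample(frac=0.1, random_state=0)

print(sample["label"].value_counts().to_dict())
print(by_group["label"].value_counts().to_dict())
```

With these inputs both approaches should return 100 rows split 90 to 10 between the two classes, matching the proportions in the full data.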