1. Read a CSV file
import pandas as pd
df = pd.read_csv('/data1/littlesc/Uplift/criteo-uplift-v2.1.csv')
2. Get the number of rows
df.shape[0]
3. Rename one or more columns
df = df.rename(columns={'treatment': 'treatment_label'})
4. Filter rows
df_t = df[df['treatment_label']==1]
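Filtering on more than one condition follows the same pattern. A minimal sketch on toy data; df_demo and the visit column are assumptions made up for illustration, not taken from the notes above:

```python
import pandas as pd

# Toy data; the frame and the "visit" column are illustrative assumptions.
df_demo = pd.DataFrame({
    "treatment_label": [1, 0, 1, 1, 0],
    "visit": [1, 0, 0, 1, 1],
})

# Combine conditions with & (and) or | (or); wrap each condition in parentheses.
df_t_visit = df_demo[(df_demo["treatment_label"] == 1) & (df_demo["visit"] == 1)]

# The same filter written as a query string.
df_q = df_demo.query("treatment_label == 1 and visit == 1")

print(df_t_visit.shape[0])
```

The parentheses matter because & binds more tightly than ==, so leaving them out changes what is compared and usually raises an error.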
5. Split into training and validation sets
from sklearn.model_selection import train_test_split
df_train, df_test = train_test_split(df_use, test_size=0.3, random_state=111)
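For uplift data it can help to keep the treated/control ratio the same in both splits. A minimal sketch, assuming a toy stand-in for df_use (which is not defined in these notes) and using the stratify argument of train_test_split:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for df_use; the real frame is not shown in these notes.
df_use = pd.DataFrame({
    "f0": range(20),
    "treatment_label": [1, 0] * 10,
})

# stratify keeps the share of treatment_label == 1 roughly equal in both parts.
df_train, df_test = train_test_split(
    df_use,
    test_size=0.3,
    random_state=111,
    stratify=df_use["treatment_label"],
)

print(df_train.shape[0], df_test.shape[0])
```

Fixing random_state makes the split reproducible across runs.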
6. Take two with the same column names