Import the libraries used below:
import numpy as np
import pandas as pd
Read a CSV file:
data2 = pd.read_csv('data2(1).csv', encoding='gbk')
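The encoding argument matters when a file was saved in a legacy Chinese encoding. The following is a minimal sketch, assuming a small hypothetical file (sample_gbk.csv, with made-up contents) written in GBK and read back with the same encoding:

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample content containing Chinese text (not from the real data set).
text = "编号,备注\n1,大黄蜂\n2,蜜蜂\n"

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "sample_gbk.csv")
    # Write the file in GBK, as a spreadsheet exported on a Chinese-locale system might be.
    with open(csv_path, "w", encoding="gbk") as f:
        f.write(text)
    # Read it back with the matching encoding.
    df_csv = pd.read_csv(csv_path, encoding="gbk")

print(df_csv)
```

Reading a GBK file without the matching encoding typically raises a UnicodeDecodeError or produces garbled text, which is the usual sign that encoding needs to be set.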
Read an Excel file:
dm = pd.read_excel(path + "2021MCM_ProblemC_ Images_by_GlobalID.xlsx")
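Concatenating path and the file name with + only works if path already ends with a separator. A possible alternative, sketched here with a hypothetical directory name (data), is to build the path with pathlib and check that the file exists before reading:

```python
from pathlib import Path

import pandas as pd

# 'data' is a hypothetical directory; substitute the real location of the file.
data_dir = Path("data")
xlsx_path = data_dir / "2021MCM_ProblemC_ Images_by_GlobalID.xlsx"

if xlsx_path.exists():
    # pd.read_excel accepts a Path object as well as a string.
    dm = pd.read_excel(xlsx_path)
else:
    print(f"File not found: {xlsx_path}")
```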
Join (merge) two DataFrames:
data = pd.merge(ds, dm)
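Called without on, pd.merge joins on every column name the two frames share, and it defaults to an inner join. The sketch below makes both choices explicit. The frames are small hypothetical stand-ins for ds and dm, and using GlobalID as the shared key is an assumption suggested by the Excel file name, not something the notes state:

```python
import pandas as pd

# Hypothetical stand-ins for ds and dm; 'GlobalID' as the join key is an assumption.
ds = pd.DataFrame({
    "GlobalID": ["a", "b", "c"],
    "Lab Status": ["Negative ID", "Positive ID", "Negative ID"],
})
dm = pd.DataFrame({
    "GlobalID": ["a", "b", "d"],
    "FileName": ["a.jpg", "b.jpg", "d.jpg"],
})

# Inner join: keep only keys present in both frames.
data = pd.merge(ds, dm, on="GlobalID", how="inner")

# Left join with an indicator column showing where each row came from.
data_left = pd.merge(ds, dm, on="GlobalID", how="left", indicator=True)

print(data)
print(data_left)
```

The _merge column added by indicator=True is a quick way to see which rows found no match in the other frame.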
Filter rows:
Select the rows whose 'Lab Status' column has the value 'Negative ID':
dN = data[data['Lab Status'].isin(['Negative ID'])]
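For a single value, an equality comparison gives the same result as isin, while isin is more convenient when several values should match. A small sketch on hypothetical data:

```python
import pandas as pd

# Hypothetical rows; only the column names follow the notes above.
data = pd.DataFrame({
    "Lab Status": ["Negative ID", "Positive ID", "Unverified", "Negative ID"],
    "Notes": ["n1", "p1", "u1", "n2"],
})

# Boolean mask with ==, equivalent to .isin(['Negative ID']) for one value.
dN = data[data["Lab Status"] == "Negative ID"]

# .isin is handy when matching any of several values.
labelled = data[data["Lab Status"].isin(["Negative ID", "Positive ID"])]

print(dN)
print(labelled)
```

Note that filtering keeps the original index labels, so dN here has index 0 and 3.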
Stack rows (concatenate vertically):
dfr = pd.concat([dNt, dPt], axis=0, ignore_index=True)
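The effect of ignore_index=True is easiest to see by comparing the resulting index with and without it. A minimal sketch, with hypothetical stand-ins for dNt and dPt:

```python
import pandas as pd

# Hypothetical stand-ins for dNt (negative samples) and dPt (positive samples).
dNt = pd.DataFrame({"Notes": ["n1", "n2"], "Lab Status": ["Negative ID"] * 2})
dPt = pd.DataFrame({"Notes": ["p1"], "Lab Status": ["Positive ID"]})

# Without ignore_index the original labels are kept, so duplicates can appear.
kept = pd.concat([dNt, dPt], axis=0)

# With ignore_index=True the result gets a fresh 0..n-1 index.
dfr = pd.concat([dNt, dPt], axis=0, ignore_index=True)

print(kept.index.tolist())
print(dfr.index.tolist())
```

A unique index matters for later steps such as reindex, which raises an error when the index contains duplicate labels.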
Replace values:
dfr = dfr.replace({'Lab Status': {'Negative ID': 0, 'Positive ID': 1}})
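When every value in the column is covered by the mapping, Series.map is a possible alternative to replace for turning labels into integers. This is a sketch on hypothetical data, not the approach used in the line above:

```python
import pandas as pd

# Mapping taken from the notes; the rows are hypothetical.
labels = {"Negative ID": 0, "Positive ID": 1}
dfr = pd.DataFrame({"Lab Status": ["Negative ID", "Positive ID", "Negative ID"]})

# map() looks each value up in the dict; values missing from the dict become NaN,
# so the cast to int64 only succeeds when every label is covered.
dfr["Lab Status"] = dfr["Lab Status"].map(labels).astype("int64")

print(dfr)
```

The design difference: replace leaves unmapped values untouched, whereas map turns them into NaN, which makes unexpected labels easier to notice.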
Shuffle the rows randomly:
dfr = dfr.reindex(np.random.permutation(dfr.index))
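An alternative sketch for shuffling, assuming reproducibility is wanted: DataFrame.sample with frac=1 returns all rows in random order, and random_state fixes the seed. The data and the seed value 42 are hypothetical:

```python
import pandas as pd

# Hypothetical data with the two columns used in the notes.
dfr = pd.DataFrame({"Notes": list("abcde"), "Lab Status": [0, 1, 0, 1, 0]})

# frac=1 samples 100% of the rows without replacement, i.e. a shuffle.
# reset_index(drop=True) discards the old labels and renumbers from 0.
shuffled = dfr.sample(frac=1, random_state=42).reset_index(drop=True)

print(shuffled)
```

Each row stays intact during the shuffle, so every note keeps its own label.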
Select a subset of columns:
dfr = dfr.loc[:, ['Notes', 'Lab Status']]
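Passing a list of column names in square brackets selects the same columns as the .loc form. A short sketch on hypothetical data (the GlobalID column is included only to show a column being dropped):

```python
import pandas as pd

# Hypothetical frame with one extra column that is not needed downstream.
dfr = pd.DataFrame({
    "GlobalID": ["a", "b"],
    "Notes": ["n1", "n2"],
    "Lab Status": [0, 1],
})

cols = ["Notes", "Lab Status"]

by_loc = dfr.loc[:, cols]        # label-based selection: all rows, listed columns
by_brackets = dfr[cols].copy()   # same columns; .copy() makes independence explicit

print(by_loc.equals(by_brackets))
```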
Change a column's data type:
dfr['Notes'] = dfr['Notes'].astype('string')
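Casting to 'string' uses the dedicated pandas string dtype, which keeps missing values as missing rather than turning them into text. A minimal sketch with hypothetical notes:

```python
import pandas as pd

# Hypothetical notes, one of them missing.
dfr = pd.DataFrame({"Notes": ["seen near hive", None, "large wasp"]})

# Cast to the pandas string dtype; the missing entry stays missing (<NA>).
notes_str = dfr["Notes"].astype("string")

print(notes_str.dtype)
print(notes_str.isna().tolist())
print(notes_str.str.upper())
```

String methods under .str skip the missing entry instead of failing on it.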
Show summary information about the DataFrame:
dfr.info()
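info() prints its summary rather than returning it. If the text is needed as a string, for example for logging, it can be captured through the buf parameter. A sketch on hypothetical data:

```python
import io

import pandas as pd

# Hypothetical frame with one missing note.
dfr = pd.DataFrame({
    "Notes": pd.array(["seen near hive", None], dtype="string"),
    "Lab Status": [0, 1],
})

# info() writes to the given buffer instead of standard output.
buf = io.StringIO()
dfr.info(buf=buf)
summary = buf.getvalue()

print(summary)
```

The summary lists each column with its non-null count and dtype, which is a quick check that the earlier type conversion and filtering did what was intended.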