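Back for another check-in challenge~ This time I picked a project that is relatively simple.
Learning goals
In short: understand and analyze the data, produce a results spreadsheet, then submit it to see the score.
Data overview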
train.csv
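- id: unique identifier assigned to each heartbeat signal
- heartbeat_signals: the heartbeat signal sequence (values separated by commas)
- label: heartbeat signal class (0, 1, 2, 3)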
testA.csv
- id: unique identifier assigned to each heartbeat signal
- heartbeat_signals: the heartbeat signal sequence (values separated by commas)
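Since heartbeat_signals arrives as one comma-separated string per row, it has to be split into numbers before any model can use it. Here is a minimal sketch of one way to do that. The helper name parse_signals and the tiny demo frame are made up for illustration, and the sketch assumes every row holds the same number of values.

```python
import numpy as np
import pandas as pd

def parse_signals(df, column="heartbeat_signals"):
    """Split each comma-separated string into floats and stack the rows.

    Hypothetical helper; assumes all rows contain the same number of values.
    """
    rows = [[float(x) for x in s.split(",")] for s in df[column]]
    return np.array(rows, dtype=np.float64)

# Made-up miniature frame with the same columns as train.csv
demo = pd.DataFrame({
    "id": [0, 1],
    "heartbeat_signals": ["0.99,0.94,0.76", "1.0,0.95,0.70"],
    "label": [0.0, 2.0],
})

signals = parse_signals(demo)
print(signals.shape)  # (rows, values per signal)
```

On the real data the same call would be parse_signals(train), giving one row per signal.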
Fine-tuning the baseline
My environment was missing a few packages, so I installed them first.
%pip install lightgbm
%pip install xgboost
%pip install catboost
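If you are not sure which of these packages are already there, a quick check like the sketch below can save a reinstall. The helper name missing_packages is hypothetical; it only looks up whether each module can be found, without importing it.

```python
import importlib.util

def missing_packages(names):
    """Return the names that cannot be found in the current environment.

    Hypothetical helper for illustration; it does not install anything.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]

# The three packages installed above
print(missing_packages(["lightgbm", "xgboost", "catboost"]))
```

Anything it prints still needs a %pip install.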
Reading the data
import pandas as pd
train = pd.read_csv('train.csv')
test = pd.read_csv('testA.csv')
train.head()
   id                                heartbeat_signals  label
0   0  0.9912297987616655,0.9435330436439665,0.764677…    0.0
1   1  0.9714822034884503,0.9289687459588268,0.572932…    0.0
2   2  1.0,0.9591487564065292,0.7013782792997189,0.23…    2.0
3   3  0.9757952826275774,0.9340884687738161,0.659636…    0.0
4   4  0.0,0.055816398940721094,0.26129357194994196,0…    2.0
test.head()
       id                                heartbeat_signals
0  100000  0.9915713654170097,1.0,0.6318163407681274,0.13…
1  100001  0.6075533139615096,0.5417083883163654,0.340694…
2  100002  0.9752726292239277,0.6710965234906665,0.686758…
3  100003  0.9956348033996116,0.9170249621481004,0.521096…
4  100004  1.0,0.8879490481178918,0.745564725322326,0.531…
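One thing the train.head() output shows is that label is stored as a float (0.0, 2.0) even though the classes are 0 to 3. Here is a small sketch of casting it to int and counting the classes. The demo frame below only reuses the five labels visible in the head output, so the counts say nothing about the full dataset.

```python
import pandas as pd

# Only the five labels visible in train.head(); not the real distribution
demo = pd.DataFrame({
    "id": [0, 1, 2, 3, 4],
    "label": [0.0, 0.0, 2.0, 0.0, 2.0],
})

# Cast the float labels to integer class ids
demo["label"] = demo["label"].astype(int)

# Count rows per class, ordered by class id
counts = demo["label"].value_counts().sort_index()
print(counts)
```

Running the same two lines on train would show how balanced the four classes are.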