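1. Problem
Today we look at a regression problem: the Kaggle competition Bike Sharing Demand, where the goal is to predict bike rental counts from features such as date and time, weather, and temperature. The training and test sets look roughly like this: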
// train
datetime,season,holiday,workingday,weather,temp,atemp,humidity,windspeed,casual,registered,count
2011-01-01 00:00:00,1,0,0,1,9.84,14.395,81,0,3,13,16
2011-01-01 01:00:00,1,0,0,1,9.02,13.635,80,0,8,32,40
// test
datetime,season,holiday,workingday,weather,temp,atemp,humidity,windspeed
2011-01-20 00:00:00,1,0,1,1,10.66,11.365,56,26.0027
2011-01-20 01:00:00,1,0,1,1,10.66,13.635,56,
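As a rough illustration of how the two files differ, the sketch below parses the two training rows quoted above and compares the column lists. It uses only the Python standard library. The names `TRAIN_SAMPLE`, `TEST_HEADER`, and `load_rows` are made up for this example, and the rows are embedded as strings as a stand-in for the actual Kaggle files.

```python
import csv
import io
from datetime import datetime

# The two training rows quoted above, embedded so the sketch runs
# without the Kaggle files (an assumption made for illustration).
TRAIN_SAMPLE = """datetime,season,holiday,workingday,weather,temp,atemp,humidity,windspeed,casual,registered,count
2011-01-01 00:00:00,1,0,0,1,9.84,14.395,81,0,3,13,16
2011-01-01 01:00:00,1,0,0,1,9.02,13.635,80,0,8,32,40
"""

# Header line of the test set, as quoted above.
TEST_HEADER = "datetime,season,holiday,workingday,weather,temp,atemp,humidity,windspeed"


def load_rows(text):
    """Parse CSV text into a list of dicts, converting the datetime column."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["datetime"] = datetime.strptime(row["datetime"], "%Y-%m-%d %H:%M:%S")
        rows.append(row)
    return rows


train_rows = load_rows(TRAIN_SAMPLE)
train_columns = list(train_rows[0].keys())
test_columns = TEST_HEADER.split(",")

# Columns that appear only in the training set.
target_only = [c for c in train_columns if c not in test_columns]
print(target_only)                     # ['casual', 'registered', 'count']
print(train_rows[1]["datetime"].hour)  # 1
```

The three columns missing from the test set are the rental figures themselves, so they are available only for training.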
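Looking at the data above, we can see that the total rental count equals the rentals by registered users plus the rentals by unregistered users, i.e. casual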
+ regis