Kaggle (Bike Sharing Demand) *20%*
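Competition: https://www.kaggle.com/c/bike-sharing-demand
GitHub repository: https://github.com/cqychen/mykaggle/tree/master/Bike%20Sharing%20Demand
One point worth stressing: the features set the ceiling on how good the result can be, and the model only determines how closely you approach that ceiling.
Data exploration
This competition is about predicting bike rental demand, roughly the equivalent of ofo or Mobike in China.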
You are provided hourly rental data spanning two years. For this competition, the training set is comprised of the first 19 days of each month, while the test set is the 20th to the end of the month. You must predict the total count of bikes rented during each hour covered by the test set, using only information available prior to the rental period.
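In other words, the training set covers the first 19 days of each month together with the rental counts, and the test set covers the 20th through the end of each month. Our main task is to predict the hourly rental count for those later days.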
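
The split rule described above can be sketched as follows. This is only an illustration on synthetic hourly timestamps, not the competition files; the helper `split_by_day` and its `last_train_day` parameter are hypothetical names introduced here.

```python
import pandas as pd

def split_by_day(frame, last_train_day=19):
    """Split rows by day of month: days 1..last_train_day vs. the rest."""
    is_train = frame["datetime"].dt.day <= last_train_day
    return frame[is_train], frame[~is_train]

# Synthetic stand-in for the real data: one row per hour for Jan-Feb 2011.
hours = pd.date_range("2011-01-01 00:00", "2011-02-28 23:00",
                      freq=pd.offsets.Hour())
frame = pd.DataFrame({"datetime": hours})

train, test = split_by_day(frame)
print(len(train), len(test))
```

Because the test days sit at the end of every month rather than at the end of the whole period, each test row has training data both before and after it, which is why the competition rules restrict you to information available prior to the rental period.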
Column | Description |
---|---|
datetime | hourly date + timestamp |
season | 1 = spring, 2 = summer, 3 = fall, 4 = winter |
holiday | whether the day is considered a holiday |
workingday | whether the day is neither a weekend nor a holiday |
weather | 1: Clear, Few clouds, Partly cloudy; 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist; 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds; 4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog |
temp | temperature in Celsius |
atemp | "feels like" temperature in Celsius |
humidity | relative humidity |
windspeed | wind speed |
casual | number of non-registered user rentals initiated |
registered | number of registered user rentals initiated |
count | number of total rentals |
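
To make the column descriptions concrete, here is a small sketch that parses `datetime` and checks that `count` equals `casual + registered`. The two rows are illustrative values, and the sum relation is an assumption read off the descriptions ("total rentals"), so it is worth verifying on the real file rather than taking it on faith.

```python
import pandas as pd

# Illustrative rows only; the values are not meant to reproduce the real file.
sample = pd.DataFrame({
    "datetime": ["2011-01-01 00:00:00", "2011-01-01 01:00:00"],
    "casual": [3, 8],
    "registered": [13, 32],
    "count": [16, 40],
})

# The datetime column arrives as text, so convert it before using it.
sample["datetime"] = pd.to_datetime(sample["datetime"])
sample["hour"] = sample["datetime"].dt.hour
sample["weekday"] = sample["datetime"].dt.dayofweek  # Monday = 0

# Assumed relation: total rentals = casual rentals + registered rentals.
consistent = bool((sample["casual"] + sample["registered"]
                   == sample["count"]).all())
print(consistent)
```
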
Data overview
Read in the data and take a look at the general information:
The training set has 12 columns and no missing values. Nice!
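
A minimal sketch of that check, assuming pandas. So that it runs on its own, it reads two illustrative rows from an in-memory string; with the real data you would point `pd.read_csv` at the downloaded training file instead (the name `train.csv` is an assumption here).

```python
import io
import pandas as pd

# Illustrative rows in the 12-column layout from the table above.
# With the real data: train = pd.read_csv("train.csv", parse_dates=["datetime"])
SAMPLE_CSV = """datetime,season,holiday,workingday,weather,temp,atemp,humidity,windspeed,casual,registered,count
2011-01-01 00:00:00,1,0,0,1,9.84,14.395,81,0.0,3,13,16
2011-01-01 01:00:00,1,0,0,1,9.02,13.635,80,0.0,8,32,40
"""

train = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["datetime"])

print(train.shape)   # (rows, columns)
train.info()         # dtypes and non-null counts per column

# Count missing values per column; an all-zero result means nothing is missing.
missing = train.isnull().sum()
print(missing)
```
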