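Data source: Kaggle
Source code and data for this article: Baidu Cloud, extraction code: 2vng
This article uses Python for data cleaning; the charts in the body were made with Tableau.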
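Project background
Have you ever wondered when the best time of year to book a hotel room is? Or the optimal length of stay to get the best daily rate? What if you wanted to predict whether a hotel is likely to receive a disproportionately high number of special requests?
This hotel booking dataset can help you explore those questions!
The dataset contains booking information for a city hotel and a resort hotel, including when the booking was made, the length of stay, the number of adults, children, and/or babies, and the number of available parking spaces.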
The data comes from the article "Hotel Booking Demand Datasets" by Nuno Antonio, Ana Almeida, and Luis Nunes, published in Data in Brief, Volume 22, February 2019.
The data was downloaded and cleaned by Thomas Mock and Antoine Bichat for #TidyTuesday during the week of February 11, 2020.
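Defining the goals
Analysis angles:
- Hotel operations
  - Booking occupancy rate, cancellation rate, repeat-booking rate, customer mix, room types booked, average booking price
  - Deposits, room assignment, guest traffic trends
  - Channel quality, share of orders
- User profile
  - Booking lead time, arrival month, number of guests, travel party composition, length of stay, interval between repeat bookings, meal bookings
  - Number of booking changes, customer type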
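Several of the operational metrics in this list reduce to simple group-wise averages over the booking table. The following is a minimal sketch in pandas, assuming the columns `hotel`, `is_canceled`, and `is_repeated_guest` as named in the Kaggle dataset. The helper name `booking_rates` is hypothetical and the sample rows are made up purely for illustration; this is not the cleaning code used for the article.

```python
import pandas as pd


def booking_rates(df: pd.DataFrame) -> pd.DataFrame:
    """Per-hotel booking count, cancellation rate, and repeat-guest rate.

    Hypothetical helper: because is_canceled and is_repeated_guest are 0/1
    flags, their mean within a group is the corresponding rate.
    """
    return df.groupby("hotel").agg(
        bookings=("is_canceled", "size"),
        cancel_rate=("is_canceled", "mean"),
        repeat_rate=("is_repeated_guest", "mean"),
    )


# Made-up sample rows, only to show the expected shape of the input.
sample = pd.DataFrame(
    {
        "hotel": ["City Hotel", "City Hotel", "Resort Hotel", "Resort Hotel"],
        "is_canceled": [1, 0, 0, 0],
        "is_repeated_guest": [0, 1, 0, 0],
    }
)

rates = booking_rates(sample)
print(rates)
```

The same pattern extends to other groupings, such as distribution channel or customer type, by changing the `groupby` key.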
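Questions:
- For guests, when is the best time of year to book a hotel room?
- How can the hotel increase its revenue?
- Predict whether a hotel booking will be canceled. (To be updated)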
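One possible way to approach the first question is to compare the average daily rate across arrival months. The sketch below assumes the dataset's `adr` (average daily rate) column together with `arrival_date_month` and `is_canceled`. The helper name `mean_rate_by_month`, the choice to exclude canceled bookings, and the sample values are all assumptions made for illustration, not results from the real data.

```python
import pandas as pd

# Calendar order, so the result is not sorted alphabetically by month name.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def mean_rate_by_month(df: pd.DataFrame) -> pd.Series:
    """Mean adr per arrival month, using non-canceled bookings only.

    Hypothetical helper; dropping canceled bookings is an assumption.
    """
    stayed = df[df["is_canceled"] == 0]
    monthly = stayed.groupby("arrival_date_month")["adr"].mean()
    # Keep only the months present in the data, in calendar order.
    return monthly.reindex([m for m in MONTHS if m in monthly.index])


# Made-up sample rows, only to show the shape of the input.
sample = pd.DataFrame(
    {
        "arrival_date_month": ["August", "August", "January", "January", "March"],
        "is_canceled": [0, 0, 0, 1, 0],
        "adr": [180.0, 200.0, 60.0, 500.0, 90.0],
    }
)

monthly = mean_rate_by_month(sample)
cheapest = monthly.idxmin()  # month with the lowest mean daily rate
print(monthly)
print(cheapest)
```

Splitting the same calculation by `hotel` would show whether the city hotel and the resort hotel follow different seasonal patterns.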
Data overview
Field | Definition |
---|---|
hotel | Hotel (H1 = Resort Hotel or H2 = City Hotel) |
is_canceled | Value indicating if the booking was canceled (1) or not (0) |
lead_time | Number of days that elapsed between the entering date of the booking into the PMS and the arrival date |
arrival_date_year | Year of arrival date |
arrival_date_month | Month of arrival date |
arrival_date_week_number | Week number of year for arrival date |
arrival_date_day_of_month | Day of arrival date |
stays_in_weekend_nights | Number of weekend nights (Saturday or Sunday) the guest stayed or booked to stay at the hotel |
stays_in_week_nights | Number of week nights (Monday to Friday) the guest stayed or booked to stay at the hotel |
adults | Number of adults |
children | Number of children |
babies | Number of babies |
meal | Type of meal booked. Categories are presented in standard hospitality meal packages: Undefined/SC – no meal package; BB – Bed & Breakfast; HB – Half board (breakfast and one other meal – usually dinner); FB – Full board (breakfast, lunch and dinner) |
country | Country of origin. Categories are represented in the ISO 3166 format (written as ISO 3155–3:2013 in the source paper) |
market_segment | Market segment designation. In categories, the term “TA” means “Travel Agents” and “TO” means “Tour Operators” |
distribution_channel | Booking distribution channel. The term “TA” means “Travel Agents” and “TO” means “Tour Operators” |
is_repeated_guest | Value indicating if the booking name was from a repeated guest (1) or not (0) |
previous_cancellations | Number of previous bookings that were cancelled by the customer prior to the current booking |
previous_bookings_not_canceled | Number of previous bookings not cancelled by the customer prior to the current booking |
reserved_room_type | Code of room type reserved. Code is presented instead of designation for anonymity reasons. |
assigned_room_type | Code for the type of room assigned to the booking. Sometimes the assigned room type differs from the reserved room type due to hotel operation reasons (e.g. overbooking) or by customer request. Code is presented instead of designation for anonymity reasons. |
booking_changes | Number of changes/amendments made to the booking from the moment the booking was entered on the PMS until the moment of check-in or cancellation |
deposit_type | Indication on if the customer made a deposit to guarantee the booking. This variable can assume three categories: No Deposit – no deposit was made; Non Refund – a deposit was made in the value of the total stay cost; Refundable – a deposit was made with a value under the total cost of stay. |
agent | ID of the travel agency that made the booking |
company | ID of the company/entity that made the booking or responsible for paying the booking. ID is presented instead of designation for anonymity reasons |
days_in_waiting_list | Number of days the booking was in the waiting list before it was confirmed to the customer |
customer_type | Type of booking, assuming one of four categories: Contract - when the booking has an allotment or other type of contract associated to it; Group – when the booking is associated to a group; Transient – when the booking is not part of a group or contract, and is not associated to other transient booking; Transient-party – when the booking is transient, but is associated to at least othe |