MySQL数据分析1_酒店预定需求分析
本文所有数据来源于kaggle:Hotel booking demand
使用工具:MySQL,Excel
1.背景
您是否想过一年中什么时候预定酒店房间?还是为了获得最佳每日房价而获得的最佳停留时间?如果您想预测酒店是否可能收到过多的特殊要求,该怎么办?该酒店预订数据集可以帮助您探索这些问题!
2.提出问题
1)酒店运营分析(城市酒店和假日酒店预订需求和入住率比较、客流量趋势、渠道等角度)
2)用户分析(预定时长、入住时长、预定餐饮、特殊要求、用户类型等)
3)顾客一年中最佳预定酒店时间是什么时候?
4)酒店该怎样增加收入?
3.数据理解
以上两张图片是字段和部分数据截图,原始数据共32个字段,每个字段119390行
对每个字段理解如下:
字段 | 解释 |
---|---|
hotel | Hotel(H1 = Resort Hotel or H2 = City Hotel) |
is_canceled | 是否取消预定(取消(1),不取消(0)) |
lead_time | 预定时长(从预定到入住的时间) |
arrival_date_year | 到达时间(年) |
arrival_date_month | 到达时间(月) |
arrival_date_day_of_month | 到达时间(日) |
arrival_date_week_number | 到达时间(在当年为第几周) |
stays_in_weekend_nights | 在周末的入住天数 |
stays_in_week_nights | 在工作日的入住天数 |
adults | 入住成人数 |
children | 入住儿童数 |
babies | 入住婴儿数 |
meal | 预定餐饮类型(Undefined/SC – no meal package; BB – Bed & Breakfast; HB – Half board (breakfast and one other meal – usually dinner; FB – Full board (breakfast, lunch and dinner)) |
market_segment | 细分市场( “TA” means “Travel Agents” and “TO” means “Tour Operators”) |
distribution_channel | 分销渠道 |
is_repeated_guest | 是否为老客(是(1),否(0)) |
previous_cancellations | 客户在当前预订之前取消的先前预订数 |
previous_bookings_not_canceled | 客户在当前预订之前未取消的先前预订数 |
reserved_room_type | 预定房间类型 |
assigned_room_type | (酒店)安排的入住房间类型(房型可能会受时间等因素有所调整) |
deposit_type | 押金类型 |
agent | 预定的旅行社ID |
company | 预定的公司ID |
days_in_waiting_list | 酒店方确认预定所需时长(从顾客下预定订单到酒店方确认订单所需时长) |
customer_type | 顾客类型(Contract - when the booking has an allotment or other type of contract associated to it; Group – when the booking is associated to a group; Transient – when the booking is not part of a group or contract, and is not associated to other transient booking; Transient-party – when the booking is transient, but is associated to at least other transient booking) |
adr | 日平均消费 |
required_car_parking_spaces | 顾客要求提供的停车位数量 |
total_of_special_requests | 顾客特殊需求的总数量 |
reservation_status | 预定最终状态(Canceled – booking was canceled by the customer; Check-Out – customer has checked in but already departed; No-Show – customer did not check-in and did inform the hotel of the reason why) |
reservation_status_date | 预定最终状态确认时间 |
4.数据清洗
分为两个大方向,用户方向和酒店运营方向
用户方向
– 1.选择用户画像需要字段的子集,创建用户表。
CREATE table hotel_customer
AS(SELECT hotel,lead_time,adults,children,babies,country,meal,stays_in_week_nights,stays_in_weekend_nights,booking_changes,customer_type,required_car_parking_spaces,total_of_special_requests
FROM hotel_bookings)
– 2.清洗数据,检查数据完整性
SELECT count(hotel),count(lead_time),count(adults),count(children),count(babies),count(country),count(meal