酒店预定需求分析

通过对酒店预定数据的分析,发现城市酒店预定量大但取消率高,度假酒店则相反。房间变更过的客户预订取消率低,城市酒店应关注双人入住服务,度假酒店应针对家庭客户改善服务。新客取消预订率高于熟客,酒店可实施首单优惠策略。此外,城市酒店可通过1/2晚优惠降低取消率,度假酒店适合推出长租方案吸引客户。
摘要由CSDN通过智能技术生成

在这里插入图片描述

了解样本基本信息

import pandas as pd
path = 'C:/Users/Administrator/Desktop/酒店需求案例分析/hotel_bookings.csv'
data = pd.read_csv(path)
pd.set_option('display.max_columns', 100)
pd.set_option('display.width', 500)
print(data.describe())
         is_canceled      lead_time  arrival_date_year  arrival_date_week_number  arrival_date_day_of_month  stays_in_weekend_nights  stays_in_week_nights         adults       children         babies  is_repeated_guest  previous_cancellations  previous_bookings_not_canceled  booking_changes          agent      company  days_in_waiting_list            adr  required_car_parking_spaces  total_of_special_requests
count  119390.000000  119390.000000      119390.000000             119390.000000              119390.000000            119390.000000         119390.000000  119390.000000  119386.000000  119390.000000      119390.000000           119390.000000                   119390.000000    119390.000000  103050.000000  6797.000000         119390.000000  119390.000000                119390.000000              119390.000000
mean        0.370416     104.011416        2016.156554                 27.165173                  15.798241                 0.927599              2.500302       1.856403       0.103890       0.007949           0.031912                0.087118                        0.137097         0.221124      86.693382   189.266735              2.321149     101.831122                     0.062518                   0.571363
std         0.482918     106.863097           0.707476                 13.605138                   8.780829                 0.998613              1.908286       0.579261       0.398561       0.097436           0.175767                0.844336                        1.497437         0.652306     110.774548   131.655015             17.594721      50.535790                     0.245291                   0.792798
min         0.000000       0.000000        2015.000000                  1.000000                   1.000000                 0.000000              0.000000       0.000000       0.000000       0.000000           0.000000                0.000000                        0.000000         0.000000       1.000000     6.000000              0.000000      -6.380000                     0.000000                   0.000000
25%         0.000000      18.000000        2016.000000                 16.000000                   8.000000                 0.000000              1.000000       2.000000       0.000000       0.000000           0.000000                0.000000                        0.000000         0.000000       9.000000    62.000000              0.000000      69.290000                     0.000000                   0.000000
50%         0.000000      69.000000        2016.000000                 28.000000                  16.000000                 1.000000              2.000000       2.000000       0.000000       0.000000           0.000000                0.000000                        0.000000         0.000000      14.000000   179.000000              0.000000      94.575000                     0.000000                   0.000000
75%         1.000000     160.000000        2017.000000                 38.000000                  23.000000                 2.000000              3.000000       2.000000       0.000000       0.000000           0.000000                0.000000                        0.000000         0.000000     229.000000   270.000000              0.000000     126.000000                     0.000000                   1.000000
max         1.000000     737.000000        2017.000000                 53.000000                  31.000000                19.000000             50.000000      55.000000      10.000000      10.000000           1.000000               26.000000                       72.000000        21.000000     535.000000   543.000000            391.000000    5400.000000                     8.000000                   5.000000

异常值处理

统计出现缺失值的各列所包含的缺失值数目

import pandas as pd
path = 'C:/Users/Administrator/Desktop/酒店需求案例分析/hotel_bookings.csv'
data = pd.read_csv(path)
missing = data.isnull().sum()[data.isnull().sum()>0]
print(missing)
children         4
country        488
agent        16340
company     112593
dtype: int64

填充缺失值

# children和country列由于缺失数目较小,采用对应列众数进行填充
data['children'].fillna(data['children'].mode()[0],inplace=True)
data['country'].fillna(data['country'].mode()[0],inplace=True)
# agent列将缺失值单独作为新类别,标记为0
data['agent'].fillna(0,inplace=True)
# company缺失值过多 直接删除此列
data.drop('company',axis=1,inplace=True)

脏数据处理

#将meal中undefined和SC都表示未预定,合并为一类
data['meal'].replace('Undefined','SC',inplace=True)
#删除入住人数为0以及入住天数为0,删除细分市场不确定的,删除客房日均价小于0的列
zero_guest = data[data[['adults', 'children', 'babies']].sum(axis = 1)==0]
data.drop(zero_guest.index,inplace=True)
zero_day = data[data[['stays_in_weekend_nights','stays_in_week_nights']].sum(axis = 1)==0]
data.drop(zero_day.index,inplace=True)
uadr = data[data['adr']<0]
data.drop(uadr.index,inplace=True)
umarket_segment = data[data['market_segment']=='Undefined']
data.drop(umarket_segment.index,inplace=True)

酒店信息分析

城市酒店与度假酒店对比分析

不同年份两家酒店总营业额比较

data['total_adr'] = (data['stays_in_weekend_nights']+data['stays_in_week_nights'])*data['adr']
pd.pivot_table(data,values='total_adr',index=['arrival_date_year'],columns=['hotel'],aggfunc='sum').plot.bar()
plt.show()

在这里插入图片描述
2015年,城市酒店与度假酒店年总营业额持平。其余年份,城市酒店营业额都遥遥领先于度假酒店。

营业额精确到月份

data['total_adr'] = (data['stays_in_weekend_nights']+data['stays_in_week_nights'])*data['adr']
dict = {
   'July':7,'August':8,'September':9,'October':10,'November':11,'December':12,'January':1,'February':2,'March':3,'April':4,'May':5,'June':6,}
data['arrival_date_month'] = data['arrival_date_month'].map(dict)
pd.pivot_table(data,values='total_adr',index=['arrival_date_month'],columns=['hotel'],aggfunc='sum').plot.bar()
plt.show()

在这里插入图片描述
从图可以发现,7,8月份度假酒店月营业额高于城市酒店,其余月份的营业额均是城市酒店高于度假酒店,且两家酒店月营业额走势一致。

两家酒店月均收费比较

pd.pivot_table(data,values='total_adr',index=['arrival_date_month'],columns=['hotel'],aggfunc='mean').plot()
plt.show()

在这里插入图片描述
可以看出,度假酒店的月均收费变化较大,七八月份的月均收费较高,这可能是其此时月营业额高于城市酒店的原因。其余月份,城市酒店的日均收费高于度假酒店。

酒店类型与预定率取消率

a = data[data['is_canceled'] == 0].groupby(['hotel']).is_canceled.count()
b = data[data[
  • 2
    点赞
  • 10
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值