Hotel Booking Data Analysis

In this project, Our goal is to develop a prediction model for predicting the possibility of hotel booking cancellation using data mining algorithms. This data is extracted from Hotel Profit Management System (PMS) and it contains booking information due to arrive between 7/1/2015 and 8/31/2017 from one resort hotel and one city hotel located in Portugal.

Introduction

Hotel cancellations are always a bad news for the business yet it’s unavoidable. The incorrect estimation of guest can lead to wastage of resources for eg. if a large group of individuals were to cancel their booking, at the last moment, this may lead to wastage of raw food materials and it is often difficult to find new guests at such a short notice

If hotel’s booking system were to have an algorithm, which could predict the possibility of cancellation, it would be easier for the hotel management to develop better cancellation policy, to predict demand accurately and to take proactive actions to manage business and resources accordingly. We are trying to analyze the booking data of 3 years (between 7/1/2015 and 8/31/2017) of multiple hotels and trying to find a common trend which can be seen in the case of cancellation. In short, we are trying to determine, what are some of the common characteristics which indicate possibility of a cancellation.

We will first identify which parameters are correlated with booking cancellation. After deciding the correlated variables, we will see whether they positively influence the cancellation or negatively influence the cancellation. Finally, we plan to come up with a logistic regression model which can predict cancellation.

Packages Required

Below is the list of packages used in the project.

library(psych)
library(magrittr)   
library(tidyverse)  
library(car)        
library(knitr)

The purpose of the packages are as below:

  • Package psych is used to plot multiple histograms.
  • Package tidyverse consists of 6 core packages out of which the below 3 are most important for this project:
  • ggplot2: Used for creating powerful visualizations
  • dplyr: Used for data manipulation
  • tidyr: Used for data modifications
  • Package car is used for identifying outliers in a boxplot.
  • Package knitr is for displaying tables.

Data Preparation

The data was obtained at tidytuesday and detailed information can be found in the original publication (Nuno Antonio, Ana de Almeida, Luis Nunes, 2019). Since it is uncommon to find real business data, the authors intended to provide the hotel demand data to researchers and educators in revenue management, machine learning or data mining.

Data Import

There are 119390 observation and 32 variables in hotels.csv file.

# Importing .csv files 
hotels <- read.csv("data/hotels.csv") 
# View the structure of hotels
str(hotels)
## 'data.frame':    119390 obs. of  32 variables:
##  $ hotel                         : chr  "Resort Hotel" "Resort Hotel" "Resort Hotel" "Resort Hotel" ...
##  $ is_canceled                   : int  0 0 0 0 0 0 0 0 1 1 ...
##  $ lead_time                     : int  342 737 7 13 14 14 0 9 85 75 ...
##  $ arrival_date_year             : int  2015 2015 2015 2015 2015 2015 2015 2015 2015 2015 ...
##  $ arrival_date_month            : chr  "July" "July" "July" "July" ...
##  $ arrival_date_week_number      : int  27 27 27 27 27 27 27 27 27 27 ...
##  $ arrival_date_day_of_month     : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ stays_in_weekend_nights       : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ stays_in_week_nights          : int  0 0 1 1 2 2 2 2 3 3 ...
##  $ adults                        : int  2 2 1 1 2 2 2 2 2 2 ...
##  $ children                      : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ babies                        : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ meal                          : chr  "BB" "BB" "BB" "BB" ...
##  $ country                       : chr  "PRT" "PRT" "GBR" "GBR" ...
##  $ market_segment                : chr  "Direct" "Direct" "Direct" "Corporate" ...
##  $ distribution_channel          : chr  "Direct" "Direct" "Direct" "Corporate" ...
##  $ is_repeated_guest             : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ previous_cancellations        : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ previous_bookings_not_canceled: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ reserved_room_type            : chr  "C" "C" "A" "A" ...
##  $ assigned_room_type            : chr  "C" "C" "C" "A" ...
##  $ booking_changes               : int  3 4 0 0 0 0 0 0 0 0 ...
##  $ deposit_type                  : chr  "No Deposit" "No Deposit" "No Deposit" "No Deposit" ...
##  $ agent                         : chr  "NULL" "NULL" "NULL" "304" ...
##  $ company                       : chr  "NULL" "NULL" "NULL" "NULL" ...
##  $ days_in_waiting_list          : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ customer_type                 : chr  "Transient" "Transient" "Transient" "Transient" ...
##  $ adr                           : num  0 0 75 75 98 ...
##  $ required_car_parking_spaces   : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ total_of_special_requests     : int  0 0 0 0 1 1 0 1 1 0 ...
##  $ reservation_status            : chr  "Check-Out" "Check-Out" "Check-Out" "Check-Out" ...
##  $ reservation_status_date       : chr  "2015-07-01" "2015-07-01" "2015-07-02" "2015-07-02" ...
# View the first 6 rows of hotels
head(hotels)
##          hotel is_canceled lead_time arrival_date_year arrival_date_month
## 1 Resort Hotel           0       342              2015               July
## 2 Resort Hotel           0       737              2015               July
## 3 Resort Hotel           0         7              2015               July
## 4 Resort Hotel           0        13              2015               July
## 5 Resort Hotel           0        14              2015               July
## 6 Resort Hotel           0        14              2015               July
##   arrival_date_week_number arrival_date_day_of_month stays_in_weekend_nights
## 1                       27                         1                       0
## 2                       27                         1                       0
## 3                       27                         1                       0
## 4                       27                         1                       0
## 5                       27                         1                       0
## 6                       27                         1                       0
##   stays_in_week_nights adults children babies meal country market_segment
## 1                    0      2        0      0   BB     PRT         Direct
## 2                    0      2        0      0   BB     PRT         Direct
## 3                    1      1        0      0   BB     GBR         Direct
## 4                    1      1        0      0   BB     GBR      Corporate
## 5                    2      2        0      0   BB     GBR      Online TA
## 6                    2      2        0      0   BB     GBR      Online TA
##   distribution_channel is_repeated_guest previous_cancellations
## 1               Direct                 0                      0
## 2               Direct                 0                      0
## 3               Direct                 0                      0
## 4            Corporate                 0                      0
## 5                TA/TO                 0                      0
## 6                TA/TO                 0                      0
##   previous_bookings_not_canceled reserved_room_type assigned_room_type
## 1                              0                  C                  C
## 2                              0                  C                  C
## 3                              0                  A                  C
## 4                              0                  A                  A
## 5                              0                  A                  A
## 6                              0                  A                  A
##   booking_changes deposit_type agent company days_in_waiting_list customer_type
## 1               3   No Deposit  NULL    NULL                    0     Transient
## 2               4   No Deposit  NULL    NULL                    0     Transient
## 3               0   No Deposit  NULL    NULL                    0     Transient
## 4               0   No Deposit   304    NULL                    0     Transient
## 5               0   No Deposit   240    NULL                    0     Transient
## 6               0   No Deposit   240    NULL                    0     Transient
##   adr required_car_parking_spaces total_of_special_requests reservation_status
## 1   0                           0                         0          Check-Out
## 2   0                           0                         0          Check-Out
## 3  75                           0                         0          Check-Out
## 4  75                           0                         0          Check-Out
## 5  98                           0                         1          Check-Out
## 6  98                           0                         1          Check-Out
##   reservation_status_date
## 1              2015-07-01
## 2              2015-07-01
## 3              2015-07-02
## 4              2015-07-02
## 5              2015-07-03
## 6              2015-07-03

Change Variable Types

Following integer variables have been switched to factors for the convenience in analysis.

hotels$hotel <- as.factor(hotels$hotel)
hotels$is_canceled  <- as.factor(hotels$is_canceled)
hotels$arrival_date_year   <- as.factor(hotels$arrival_date_year )
hotels$arrival_date_month   <- as.factor(hotels$arrival_date_month )
hotels$arrival_date_week_number  <- as.factor(hotels$arrival_date_week_number )
hotels$arrival_date_day_of_month   <- as.factor(hotels$arrival_date_day_of_month)
hotels$meal <- as.factor(hotels$meal)
hotels$country <- as.factor(hotels$country)
hotels$market_segment <- as.factor(hotels$market_segment)
hotels$distribution_channel <- as.factor(hotels$distribution_channel)
hotels$is_repeated_guest   <- as.factor(hotels$is_repeated_guest)
hotels$previous_cancellations  <- as.factor(hotels$previous_cancellations)
hotels$previous_bookings_not_canceled <- as.factor(hotels$previous_bookings_not_canceled)
hotels$previous_cancellations  <- as.factor(hotels$previous_cancellations)
hotels$reserved_room_type <- as.factor(hotels$reserved_room_type)
hotels$assigned_room_type <- as.factor(hotels$assigned_room_type)
hotels$booking_changes  <- as.factor(hotels$booking_changes)
hotels$deposit_type <- as.factor(hotels$deposit_type)
hotels$agent <- as.factor(hotels$agent)
hotels$company <- as.factor(hotels$company)
hotels$customer_type <- as.factor(hotels$customer_type)
hotels$reservation_status <- as.factor(hotels$reservation_status)
hotels$reservation_status_date <- as.factor(hotels$reservation_status_date)
#hotels$arrival_date <- as.factor(hotels$arrival_date)

Create new variables

New variable “arrival_date” and “stay_nights” were created using existing variables.

#create new variable: arrival_date
hotels$arrival_date <-paste(hotels$arrival_date_month, 
                            hotels$arrival_date_day_of_month,
                            hotels$arrival_date_year,sep="-")
as.Date(hotels$arrival_date, format="%B-%d-%Y")
#create new variable: stays_nights
hotels %>% mutate(stays_nights= stays_in_weekend_nights + stays_in_week_nights) -> hotels_new

Regrouping categorical variable: meal

Regarding type of meal, there are 10,650 records of SC (no meal package) and 1,169 records of Undefined. The level of “Undefined” were combined as the level of SC.

#summary(hotels_new)
levels(hotels_new$meal)[5] <- "SC"
summary(hotels_new)
##           hotel       is_canceled   lead_time   arrival_date_year
##  City Hotel  :79330   0:75166     Min.   :  0   2015:21996       
##  Resort Hotel:40060   1:44224     1st Qu.: 18   2016:56707       
##                                   Median : 69   2017:40687       
##                                   Mean   :104                    
##                                   3rd Qu.:160                    
##                                   Max.   :737                    
##                                                                  
##  arrival_date_month arrival_date_week_number arrival_date_day_of_month
##  August :13877      33     :  3580           17     : 4406            
##  July   :12661      30     :  3087           5      : 4317            
##  May    :11791      32     :  3045           15     : 4196            
##  October:11160      34     :  3040           25     : 4160            
##  April  :11089      18     :  2926           26     : 4147            
##  June   :10939      21     :  2854           9      : 4096            
##  (Other):47873      (Other):100858           (Other):94068            
##  stays_in_weekend_nights stays_in_week_nights     adults      
##  Min.   : 0.0000         Min.   : 0.0         Min.   : 0.000  
##  1st Qu.: 0.0000         1st Qu.: 1.0         1st Qu.: 2.000  
##  Median : 1.0000         Median : 2.0         Median : 2.000  
##  Mean   : 0.9276         Mean   : 2.5         Mean   : 1.856  
##  3rd Qu.: 2.0000         3rd Qu.: 3.0         3rd Qu.: 2.000  
##  Max.   :19.0000         Max.   :50.0         Max.   :55.000  
##                                                               
##     children           babies          meal          country     
##  Min.   : 0.0000   Min.   : 0.000000   BB:92310   PRT    :48590  
##  1st Qu.: 0.0000   1st Qu.: 0.000000   FB:  798   GBR    :12129  
##  Median : 0.0000   Median : 0.000000   HB:14463   FRA    :10415  
##  Mean   : 0.1039   Mean   : 0.007949   SC:11819   ESP    : 8568  
##  3rd Qu.: 0.0000   3rd Qu.: 0.000000              DEU    : 7287  
##  Max.   :10.0000   Max.   :10.000000              ITA    : 3766  
##  NA's   :4                                        (Other):28635  
##        market_segment  distribution_channel is_repeated_guest
##  Online TA    :56477   Corporate: 6677      0:115580         
##  Offline TA/TO:24219   Direct   :14645      1:  3810         
##  Groups       :19811   GDS      :  193                       
##  Direct       :12606   TA/TO    :97870                       
##  Corporate    : 5295   Undefined:    5                       
##  Complementary:  743                                         
##  (Other)      :  239                                         
##  previous_cancellations previous_bookings_not_canceled reserved_room_type
##  0      :112906         0      :115770                 A      :85994     
##  1      :  6051         1      :  1542                 D      :19201     
##  2      :   116         2      :   580                 E      : 6535     
##  3      :    65         3      :   333                 F      : 2897     
##  24     :    48         4      :   229                 G      : 2094     
##  11     :    35         5      :   181                 B      : 1118     
##  (Other):   169         (Other):   755                 (Other): 1551     
##  assigned_room_type booking_changes      deposit_type        agent      
##  A      :74053      0      :101314   No Deposit:104641   9      :31961  
##  D      :25322      1      : 12701   Non Refund: 14587   NULL   :16340  
##  E      : 7806      2      :  3805   Refundable:   162   240    :13922  
##  F      : 3751      3      :   927                       1      : 7191  
##  G      : 2553      4      :   376                       14     : 3640  
##  C      : 2375      5      :   118                       7      : 3539  
##  (Other): 3530      (Other):   149                       (Other):42797  
##     company       days_in_waiting_list         customer_type  
##  NULL   :112593   Min.   :  0.000      Contract       : 4076  
##  40     :   927   1st Qu.:  0.000      Group          :  577  
##  223    :   784   Median :  0.000      Transient      :89613  
##  67     :   267   Mean   :  2.321      Transient-Party:25124  
##  45     :   250   3rd Qu.:  0.000                             
##  153    :   215   Max.   :391.000                             
##  (Other):  4354                                               
##       adr          required_car_parking_spaces total_of_special_requests
##  Min.   :  -6.38   Min.   :0.00000             Min.   :0.0000           
##  1st Qu.:  69.29   1st Qu.:0.00000             1st Qu.:0.0000           
##  Median :  94.58   Median :0.00000             Median :0.0000           
##  Mean   : 101.83   Mean   :0.06252             Mean   :0.5714           
##  3rd Qu.: 126.00   3rd Qu.:0.00000             3rd Qu.:1.0000           
##  Max.   :5400.00   Max.   :8.00000             Max.   :5.0000           
##                                                                         
##  reservation_status reservation_status_date arrival_date        stays_nights   
##  Canceled :43017    2015-10-21:  1461       Length:119390      Min.   : 0.000  
##  Check-Out:75166    2015-07-06:   805       Class :character   1st Qu.: 2.000  
##  No-Show  : 1207    2016-11-25:   790       Mode  :character   Median : 3.000  
##                     2015-01-01:   763                          Mean   : 3.428  
##                     2016-01-18:   625                          3rd Qu.: 4.000  
##                     2015-07-02:   469                          Max.   :69.000  
##                     (Other)   :114477

Missing Values

There are almost no missing values except “children”.

colSums(is.na(hotels_new))
##                          hotel                    is_canceled 
##                              0                              0 
##                      lead_time              arrival_date_year 
##                              0                              0 
##             arrival_date_month       arrival_date_week_number 
##                              0                              0 
##      arrival_date_day_of_month        stays_in_weekend_nights 
##                              0                              0 
##           stays_in_week_nights                         adults 
##                              0                              0 
##                       children                         babies 
##                              4                              0 
##                           meal                        country 
##                              0                              0 
##                 market_segment           distribution_channel 
##                              0                              0 
##              is_repeated_guest         previous_cancellations 
##                              0                              0 
## previous_bookings_not_canceled             reserved_room_type 
##                              0                              0 
##             assigned_room_type                booking_changes 
##                              0                              0 
##                   deposit_type                          agent 
##                              0                              0 
##                        company           days_in_waiting_list 
##                              0                              0 
##                  customer_type                            adr 
##                              0                              0 
##    required_car_parking_spaces      total_of_special_requests 
##                              0                              0 
##             reservation_status        reservation_status_date 
##                              0                              0 
##                   arrival_date                   stays_nights 
##                              0                              0

There are 4 missing values in “children” and they were replaced by zero.

hotels_new$children[is.na(hotels_new$children)]  <- 0

Outlier

To see the distribution of adr (Average Daily Rate) and to investigate any existing outliers, a histogram and a boxplot were created.

hist(hotels_new$adr)

boxplot(hotels_new$adr) 

The outlier in adr was identified.

Boxplot(~adr, data=hotels_new, id=list(n=Inf))  

The outlier was replaced by NA and below is new boxplot without the outlier.

hotels_new[48516, 28] <- NA # row num.  is 48516; column num. for adr is 28
boxplot(hotels_new$adr) 

Variable Description

Below is the table of variable description.

Variable Type Description
hotel factor Hotel (City Hotel or Resort Hotel)
is_canceled factor Value indicating if the booking was canceled (1) or not (0)
lead_time integer Number of days that elapsed between the entering date of the booking into the PMS and the arrival date
arrival_date_year factor Year of arrival date
arrival_date_month factor Month of arrival date
arrival_date_week_number factor Week number of year for arrival date
arrival_date_day_of_month factor Day of arrival date
stays_in_weekend_nights integer Number of weekend nights (Saturday or Sunday) the guest stayed or booked to stay at the hotel
stays_in_week_nights integer Number of week nights (Monday to Friday) the guest stayed or booked to stay at the hotel
adults integer Number of adults
children numeric Number of children
babies integer Number of babies
meal factor Type of meal booked. SC - no meal package; BB - Bed & Breakfast; HB - Half board (breakfast and one other meal – usually dinner); FB - Full board (breakfast, lunch and dinner)
country factor Country of origin. Categories are represented in the ISO 3155–3:2013 format
market_segment factor Market segment designation. In categories, the term “TA” means “Travel Agents” and “TO” means “Tour Operators”
distribution_channel factor Booking distribution channel. The term “TA” means “Travel Agents” and “TO” means “Tour Operators”
is_repeated_guest factor Value indicating if the booking name was from a repeated guest (1) or not (0)
previous_cancellations factor Number of previous bookings that were cancelled by the customer prior to the current booking
previous_bookings_not_canceled factor Number of previous bookings not cancelled by the customer prior to the current booking
reserved_room_type factor Code of room type reserved. Code is presented instead of designation for anonymity reasons
assigned_room_type factor Code for the type of room assigned to the booking. Sometimes the assigned room type differs from the reserved room type due to hotel operation reasons (e.g. overbooking) or by customer request. Code is presented instead of designation for anonymity reasons
booking_changes factor Number of changes/amendments made to the booking from the moment the booking was entered on the PMS until the moment of check-in or cancellation
deposit_type factor Indication on if the customer made a deposit to guarantee the booking. This variable can assume three categories: No Deposit – no deposit was made; Non Refund – a deposit was made in the value of the total stay cost; Refundable – a deposit was made with a value under the total cost of stay.
agent factor ID of the travel agency that made the booking
company factor ID of the company/entity that made the booking or responsible for paying the booking. ID is presented instead of designation for anonymity reasons
days_in_waiting_list integer Number of days the booking was in the waiting list before it was confirmed to the customer
customer_type factor Type of booking, assuming one of four categories: Contract - when the booking has an allotment or other type of contract associated to it; Group – when the booking is associated to a group; Transient – when the booking is not part of a group or contract, and is not associated to other transient booking; Transient-party – when the booking is transient, but is associated to at least other transient booking
adr numeric Average Daily Rate as defined by dividing the sum of all lodging transactions by the total number of staying nights
required_car_parking_spaces integer Number of car parking spaces required by the customer
total_of_special_requests integer Number of special requests made by the customer (e.g. twin bed or high floor)
reservation_status factor Reservation last status, assuming one of three categories: Canceled – booking was canceled by the customer; Check-Out – customer has checked in but already departed; No-Show – customer did not check-in and did inform the hotel of the reason why
reservation_status_date factor Date at which the last status was set. This variable can be used in conjunction with the ReservationStatus to understand when was the booking canceled or when did the customer checked-out of the hotel
arrival_date character Arrival Date
stays_nights integer Stays in nights

Exploratory Data Analysis

We begin Exploratory Data Analysis by looking at the histograms of the numeric columns.This helps us to visualize if there is a skewness to the data or if any particular variable follows a perfectly normal distribution. We see that stays_nights follows almost a normal distribution curve with a longer tail to the right.

multi.hist(hotels_new[,sapply(hotels_new, is.numeric)])

The canceled and not canceled groups were displayed for the two hotels. It is clear that the number of cancellations for city hotel were higher than resort hotel.

# hotel and cancellation
hotels_new %>% ggplot( aes(x=hotel, fill=is_canceled)) + geom_bar() + scale_fill_discrete(labels=c("not canceled", "canceled")) + guides(fill=guide_legend(title=NULL)) +  scale_fill_manual(values=c("#999999", "#E69F00")) 

We further expand our analysis to visualize the number of cancellations on a monthly basis as below. This will help us to know in which months the hotel owners were affected the worst.

count <- table(hotels_new$is_canceled, hotels_new$arrival_date_month)

barplot(count, main = "Number of cancellations by Months",horiz = TRUE, names.arg = c("Jan", "Feb","March","April","May","June","July","Aug","Sept","Oct","Nov","Dec"),col=c("red","pink") ,cex.names = 0.4)

We can find the total number of guests by adding 3 columns of adults, children and babies. This will tell us the total number of guests per room which is an important metric but was not included in the original dataset.

hotels_new$no_of_guests = rowSums(cbind(hotels_new$adults,hotels_new$children, hotels_new$babies),na.rm = TRUE)
summary (hotels_new$no_of_guests)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   2.000   2.000   1.968   2.000  55.000

Next we see the trend of bookings that were converted and canceled over the period of three years period over which the data was collected. This will also help us to identify if any outside factors or events led people to cancel their hotel bookings during a particular year.

ggplot(data = hotels_new,aes(arrival_date_year,fill = (is_canceled))) +
  geom_bar(position = 'dodge') +
  scale_y_continuous(name = "No. of Bookings",labels = scales::comma) +
  xlab("Year") +
  ggtitle("Status by number of bookings across years") +
  labs(fill = 'Booking Status')

We tried correlating the variables with the dependent variable is_canceled but we were not able to make conclusive decisions on the basis of that information. We only see variables that are weakly correlated or are negatively correlated. It would be helpful to learn a modification technique that would help us to increase the correlation of the variables.

corr_hotels <- hotels_new[unlist(lapply(hotels_new, is.numeric))]
cor(corr_hotels)[,1] * 100
##                   lead_time     stays_in_weekend_nights 
##                  100.000000                    8.567113 
##        stays_in_week_nights                      adults 
##                   16.579936                   11.951869 
##                    children                      babies 
##                   -3.761282                   -2.091502 
##        days_in_waiting_list                         adr 
##                   17.008418                          NA 
## required_car_parking_spaces   total_of_special_requests 
##                  -11.645057                   -9.571205 
##                stays_nights                no_of_guests 
##                   15.716697                    7.226505

We have done a preliminary fit in a linear regression. Going forward we plan to use forward and backward variable selection techniques to optimize the models. As of now we have dropped certain variables from our regression model.

lm_model<- glm(is_canceled ~ lead_time  + hotel + no_of_guests + deposit_type + adr + total_of_special_requests + required_car_parking_spaces  + days_in_waiting_list + customer_type, data = hotels_new , family = "binomial")
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
summary(lm_model)
## 
## Call:
## glm(formula = is_canceled ~ lead_time + hotel + no_of_guests + 
##     deposit_type + adr + total_of_special_requests + required_car_parking_spaces + 
##     days_in_waiting_list + customer_type, family = "binomial", 
##     data = hotels_new)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.0201  -0.8151  -0.5512   0.1949   2.7214  
## 
## Coefficients:
##                                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                  -2.217e+00  5.059e-02 -43.811  < 2e-16 ***
## lead_time                     4.474e-03  8.072e-05  55.432  < 2e-16 ***
## hotelResort Hotel            -1.425e-01  1.580e-02  -9.022  < 2e-16 ***
## no_of_guests                  1.772e-01  1.212e-02  14.620  < 2e-16 ***
## deposit_typeNon Refund        5.180e+00  1.054e-01  49.156  < 2e-16 ***
## deposit_typeRefundable       -5.177e-02  1.992e-01  -0.260 0.794956    
## adr                           4.671e-03  1.701e-04  27.462  < 2e-16 ***
## total_of_special_requests    -5.504e-01  1.032e-02 -53.350  < 2e-16 ***
## required_car_parking_spaces  -1.665e+01  4.289e+01  -0.388 0.697812    
## days_in_waiting_list         -2.822e-03  4.739e-04  -5.955 2.61e-09 ***
## customer_typeGroup           -7.372e-01  1.607e-01  -4.588 4.47e-06 ***
## customer_typeTransient        7.130e-01  4.486e-02  15.895  < 2e-16 ***
## customer_typeTransient-Party -1.691e-01  4.701e-02  -3.597 0.000322 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 157396  on 119388  degrees of freedom
## Residual deviance: 112211  on 119376  degrees of freedom
##   (1 observation deleted due to missingness)
## AIC: 112237
## 
## Number of Fisher Scoring iterations: 16