library(tidyverse)
library(ggplot2)
In this project, the hotel booking dataset is used to analyze the average daily rate of city hotels and resort hotels separately. Because 99.9% of the bookings have an average daily rate at or below 326.2016 (`quantile(booking$adr, 0.999)`), we set the cutoff at 350 to remove the outliers.
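The outlier step can be sketched in base R as follows. This is only an illustration under stated assumptions: the small `booking` data frame is a made-up stand-in for the real dataset (only the `hotel` and `adr` columns are mimicked), the names `adr_p999`, `adr_cutoff`, and `booking_trimmed` are hypothetical, and a strict `<` comparison at the cutoff is assumed because the text does not say how ties at 350 are handled.

```r
# Made-up stand-in for the hotel booking data (NOT the real dataset).
booking <- data.frame(
  hotel = c("City Hotel", "City Hotel", "Resort Hotel", "Resort Hotel", "City Hotel"),
  adr   = c(95, 120, 60, 210, 5400)
)

# 99.9th percentile of the average daily rate.
# On the real data the report gives 326.2016 for this value.
adr_p999 <- quantile(booking$adr, 0.999)

# Fixed cutoff chosen slightly above that percentile, as in the text.
adr_cutoff <- 350

# Keep only bookings below the cutoff (strict "<" is an assumption).
booking_trimmed <- booking[booking$adr < adr_cutoff, ]
```

Using a round cutoff just above the percentile keeps the threshold easy to state while still discarding only the most extreme rates.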
The figure below shows that the average daily rate for the majority of city hotel bookings is around 100, whereas for a large proportion of resort hotel bookings it is below 100. This suggests that resort hotels are generally cheaper than city hotels. Another interesting finding is that resort hotel bookings outnumber city hotel bookings when the average daily rate is higher than 180.
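A comparison like the one described above could be drawn with ggplot2 roughly as follows. This is a sketch, not the code behind the original figure: the plot type (overlaid density curves) is an assumption, the `booking_trimmed` values are made up so that the example runs on its own, and the axis labels are illustrative.

```r
library(ggplot2)

# Made-up values so the example runs without the real dataset.
booking_trimmed <- data.frame(
  hotel = rep(c("City Hotel", "Resort Hotel"), each = 8),
  adr   = c(80, 95, 100, 105, 110, 120, 130, 150,   # city (illustrative)
            40, 50, 60, 70, 85, 190, 220, 260)      # resort (illustrative)
)

# Overlaid density curves of the average daily rate, one per hotel type.
p <- ggplot(booking_trimmed, aes(x = adr, fill = hotel)) +
  geom_density(alpha = 0.5) +
  labs(x = "Average daily rate", y = "Density", fill = "Hotel type")

p
```

Density curves are normalized within each hotel type, so they compare the shape of the two distributions; swapping in `geom_histogram(position = "identity", alpha = 0.5)` would show booking counts instead.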
##
## -- Column specification --------------------------------------------------------
## cols(
## .default = col_double(),
## hotel = col_character(),
## arrival_date_month = col_character(),
## meal = col_character(),
## country = col_character(),
## market_segment = col_character(),
## distribution_channel = col_character(),
## reserved_room_type = col_character(),
## assigned_room_type = col_character(),
## deposit_type = col_character(),
## agent = col_character(),
## company = col_character(),
## customer_type = col_character(),
## reservation_status = col_character(),
## reservation_status_date = col_date(format = "")
## )
## i Use `spec()` for the full column specifications.
## 99.9%
## 326.2016