1. INTRODUCTION
When was the last time you travelled, and how much did you pay for your stay? Did the hotel price feel higher than usual? Did you wonder what was behind it? Have you ever thought about what drives the pricing behaviour of hotels in India, and which factors push room rents up or down? If not, this report is the place for you. We investigate the issue and try to answer these questions.
Variation in hotel prices is a common phenomenon, not only in India but across the globe. Why is this so? What leads hotel owners to charge differently, and what motivates a tourist to pay more for some hotels, or for a hotel in a particular place? In this report we analyse which independent factors contribute to this pricing strategy. Using the data, graphs and regression analysis, we try to identify the factors that affect pricing behaviour in the hotel industry. A hotel’s price reflects an assessment of the value that tourists see and their willingness to pay for the hotel’s rooms and services; it signals whether the hotel is worth the money. We examine whether the hotel industry charges tourists a price premium based on location or on the timing of the visit, and whether hotels extract the maximum a tourist is willing to pay for visiting a particular location.
2. OVERVIEW OF THE STUDY
We are concerned with the pricing strategy of the hotel industry, based on evidence collected from 42 cities on 8 different dates. Through an empirical study of the available dataset, we try to establish whether TouristDestination has any influence on the RoomRent of hotels. The price a hotel charges reflects the worth of its location, and tourists in turn rely on prices to assess the worth of a particular destination. We study price differences between tourist destinations and other destinations, along with other variables such as New Year's Eve versus normal days, to strengthen the analysis. If there is a price for location, we would expect prices in tourist destinations, or in certain cities, to be higher than in the rest.
3. AN EMPIRICAL FIELD STUDY OF HOTEL PRICES
We focus on data for hotels in 42 cities, classified into categories on the basis of features that represent their worth. Consumers are asked to pay a price premium for the pleasure of viewing a particular exotic location from the hotel room. We expect hotel rooms with exotic views to be priced higher than rooms without them, after controlling for other factors. Accordingly, we construct the following hypothesis:
HYPOTHESIS H1: Hotel prices differ according to tourist destination. Being in a tourist destination increases the price of a hotel.
4. DATA SOURCE AND DESCRIPTION
For this study we gathered data from https://in.hotels.com/; the data set is 2523 KB in size. It contains many hotel features and some external factors. We use the dummy variable IsTouristDestination to split the cities into two groups: 1 if the city is a tourist place and 0 if it is not. Other dummy variables, such as IsMetroCity, FreeBreakfast and HasSwimmingPool, likewise split the data into two groups for ease of analysis. HotelDescription and HotelName are also available and could strongly affect prices, but we ignore them to limit the scope of the analysis. The Date variable is also ignored; instead we use the dummy variable IsNewYearEve to summarise the 8 dates, with 1 for New Year's Eve and 0 for a normal date. HotelPincode is ignored because a pincode is unlikely to influence how much a guest is willing to pay. We also ignore CityName, because the 42 cities are summarised by the two dummy variables IsMetroCity and IsTouristDestination, each taking the values 0 and 1.
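The 0/1 coding described above can be sketched on a toy data frame. This is only an illustration: the rows, the choice of "Dec 31 2016" as the New Year's Eve date and the list `tourist_cities` are assumptions made for the example, not the study's actual classification.

```r
# Hypothetical mini data set; the real file has one row per hotel per date.
toy <- data.frame(
  CityName = c("Goa", "Goa", "Delhi", "Delhi"),
  Date     = c("Dec 24 2016", "Dec 31 2016", "Dec 24 2016", "Dec 31 2016"),
  stringsAsFactors = FALSE
)

# Assumption: Dec 31 is the date flagged as New Year's Eve.
toy$IsNewYearEve <- as.integer(toy$Date == "Dec 31 2016")

# Assumption: an illustrative list of tourist cities, not the study's own list.
tourist_cities <- c("Goa")
toy$IsTouristDestination <- as.integer(toy$CityName %in% tourist_cities)

print(toy)
```

Coding each category as an integer 0/1 column lets `lm()` read the coefficient as the average price difference between the two groups, other variables held fixed.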
5. REGRESSION ANALYSIS
We used a stepwise regression approach and compared two models, concluding that Model2 is the better fit. We checked Model2 for multicollinearity using VIF and compared the two models using AIC and adjusted R-squared. Model2 has the lower AIC (270310.6 against 270314.1) and the slightly higher adjusted R-squared (0.18986 against 0.18983), so we prefer it, although the margins are small. Its F-statistic is also highly significant (p-value < 2.2e-16).
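The comparison logic can be sketched on simulated data, independently of the hotel data set. All variable names below (`x1`, `x2`, `fit_big`, `fit_small`) are hypothetical; the sketch only shows how AIC and adjusted R-squared are put side by side for a larger and a smaller model.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)                 # pure noise, unrelated to y by construction
y  <- 2 + 3 * x1 + rnorm(n)
sim <- data.frame(y, x1, x2)

fit_big   <- lm(y ~ x1 + x2, data = sim)  # includes the irrelevant predictor
fit_small <- lm(y ~ x1, data = sim)       # drops it

# Lower AIC and higher adjusted R-squared both favour a model.
comparison <- data.frame(
  model  = c("fit_big", "fit_small"),
  adj_r2 = c(summary(fit_big)$adj.r.squared, summary(fit_small)$adj.r.squared),
  aic    = c(AIC(fit_big), AIC(fit_small))
)
print(comparison)
```

Both criteria penalise extra predictors, which is why dropping variables that add little can improve them even though the plain R-squared never rises when a variable is removed.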
To test the hypothesis we propose the following model:
RoomRent = b0 + b1*Population + b2*IsMetroCity + b3*IsTouristDestination + b4*IsNewYearEve + b5*StarRating + b6*Airport + b7*FreeWifi + b8*HotelCapacity + b9*HasSwimmingPool + error
The error term captures the deviation between the true and estimated values.
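One way to formalise H1 in terms of this model is as a one-sided test on the coefficient of IsTouristDestination. The one-sided framing and the symbol \varepsilon for the error term are assumptions of this sketch; the coefficient indices follow the model above.

```latex
\begin{aligned}
\text{RoomRent} &= b_0 + b_1\,\text{Population} + b_2\,\text{IsMetroCity} + b_3\,\text{IsTouristDestination} + b_4\,\text{IsNewYearEve} \\
&\quad + b_5\,\text{StarRating} + b_6\,\text{Airport} + b_7\,\text{FreeWifi} + b_8\,\text{HotelCapacity} + b_9\,\text{HasSwimmingPool} + \varepsilon \\
H_0 &: b_3 = 0 \qquad \text{against} \qquad H_1 : b_3 > 0 \\
t &= \frac{\hat{b}_3}{\operatorname{SE}(\hat{b}_3)}
\end{aligned}
```

A large positive t value leads to rejecting H_0 in favour of a positive price effect of tourist destinations.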
# Read the data
hotel <- read.csv("Cities42.csv")
View(hotel)
# OLS models
Model1 <- RoomRent ~ Population + CityRank + IsMetroCity + IsTouristDestination + IsWeekend + IsNewYearEve + StarRating + Airport + FreeWifi + FreeBreakfast + HotelCapacity + HasSwimmingPool
fit1 <- lm(Model1, data = hotel)
# Best-fit model
Model2 <- RoomRent ~ StarRating + Population + IsMetroCity + IsTouristDestination + IsNewYearEve + Airport + FreeWifi + HotelCapacity + HasSwimmingPool
fit2 <- lm(Model2, data = hotel)
summary(fit2)
##
## Call:
## lm(formula = Model2, data = hotel)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11839 -2385 -691 1045 309532
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -8.560e+03 4.055e+02 -21.109 < 2e-16 ***
## StarRating 3.598e+03 1.104e+02 32.582 < 2e-16 ***
## Population -1.244e-04 2.263e-05 -5.499 3.88e-08 ***
## IsMetroCity -6.369e+02 2.132e+02 -2.988 0.00282 **
## IsTouristDestination 1.918e+03 1.374e+02 13.958 < 2e-16 ***
## IsNewYearEve 8.430e+02 1.739e+02 4.849 1.26e-06 ***
## Airport 1.001e+01 2.716e+00 3.684 0.00023 ***
## FreeWifi 5.952e+02 2.217e+02 2.685 0.00726 **
## HotelCapacity -1.040e+01 1.029e+00 -10.115 < 2e-16 ***
## HasSwimmingPool 2.147e+03 1.598e+02 13.434 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6600 on 13222 degrees of freedom
## Multiple R-squared: 0.1904, Adjusted R-squared: 0.1899
## F-statistic: 345.5 on 9 and 13222 DF, p-value: < 2.2e-16
This regression estimates the effect of TouristDestination on RoomRent, controlling for the other factors in the model. If there is a “price for tourist destination”, we expect the coefficient on IsTouristDestination to be positive. The estimate is positive (about 1918, with a t value of 13.958) and statistically significant, in line with our expectation.
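The t value and a confidence interval for the IsTouristDestination coefficient can be recomputed from the rounded figures in the summary output. This is a minimal sketch: the three inputs are copied from the output above, while the one-sided p-value and the 95% level are choices made for the illustration.

```r
# Values copied from the summary(fit2) output for IsTouristDestination.
estimate  <- 1918     # 1.918e+03
std_error <- 137.4    # 1.374e+02
df_resid  <- 13222    # residual degrees of freedom

# t statistic: estimate divided by its standard error.
t_value <- estimate / std_error

# One-sided p-value for the alternative "coefficient > 0".
p_one_sided <- pt(t_value, df = df_resid, lower.tail = FALSE)

# Two-sided 95% confidence interval for the coefficient.
ci <- estimate + c(-1, 1) * qt(0.975, df = df_resid) * std_error

print(round(t_value, 2))
print(p_one_sided)
print(round(ci))
```

Because the inputs are rounded, the recomputed t value agrees with the reported 13.958 only to about two decimal places.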
6. RESULTS
The coefficient on IsTouristDestination is positive: hotels in tourist destinations charge higher prices, other factors held constant. Internal features such as FreeWifi and HasSwimmingPool also have positive coefficients and raise hotel prices. The IsNewYearEve coefficient is positive as well, so prices rise further on New Year's Eve, when tourist destinations are in greater demand.
7. CONCLUSION
This paper was motivated by the need for research that sheds light on how internal and external factors, and TouristDestination in particular, influence pricing strategies in the hotel industry. Its particular contribution is to show that hotels charge a price premium to tourists who travel to exotic tourist places to experience something different. We also found that tourists visiting tourist destinations in non-metro cities are charged more than those visiting tourist destinations in metro cities, perhaps as a price for the peace that non-metro cities offer.
REGRESSION ANALYSIS IN THE HOTEL PRICING STRATEGIES
Variable               beta         Std error    t-stats
(Intercept)            -8.560e+03   4.055e+02    -21.109
StarRating              3.598e+03   1.104e+02     32.582
Population             -1.244e-04   2.263e-05     -5.499
IsMetroCity            -6.369e+02   2.132e+02     -2.988
IsTouristDestination    1.918e+03   1.374e+02     13.958
IsNewYearEve            8.430e+02   1.739e+02      4.849
Airport                 1.001e+01   2.716e+00      3.684
FreeWifi                5.952e+02   2.217e+02      2.685
HotelCapacity          -1.040e+01   1.029e+00    -10.115
HasSwimmingPool         2.147e+03   1.598e+02     13.434
APPENDIX
DESCRIPTIVE STATISTICS
library(psych)
summary(hotel)
## CityName Population CityRank IsMetroCity
## Delhi :2048 Min. : 8096 Min. : 0.00 Min. :0.0000
## Jaipur : 768 1st Qu.: 744983 1st Qu.: 2.00 1st Qu.:0.0000
## Mumbai : 712 Median : 3046163 Median : 9.00 Median :0.0000
## Bangalore: 656 Mean : 4416837 Mean :14.83 Mean :0.2842
## Goa : 624 3rd Qu.: 8443675 3rd Qu.:24.00 3rd Qu.:1.0000
## Kochi : 608 Max. :12442373 Max. :44.00 Max. :1.0000
## (Other) :7816
## IsTouristDestination IsWeekend IsNewYearEve Date
## Min. :0.0000 Min. :0.0000 Min. :0.0000 Dec 21 2016:1611
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000 Dec 24 2016:1611
## Median :1.0000 Median :1.0000 Median :0.0000 Dec 25 2016:1611
## Mean :0.6972 Mean :0.6228 Mean :0.1244 Dec 28 2016:1611
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.0000 Dec 31 2016:1611
## Max. :1.0000 Max. :1.0000 Max. :1.0000 Dec 18 2016:1608
## (Other) :3569
## HotelName RoomRent StarRating
## Vivanta by Taj : 32 Min. : 299 Min. :0.000
## Goldfinch Hotel : 24 1st Qu.: 2436 1st Qu.:3.000
## OYO Rooms : 24 Median : 4000 Median :3.000
## The Gordon House Hotel: 24 Mean : 5474 Mean :3.459
## Apnayt Villa : 16 3rd Qu.: 6299 3rd Qu.:4.000
## Bentleys Hotel Colaba : 16 Max. :322500 Max. :5.000
## (Other) :13096
## Airport
## Min. : 0.20
## 1st Qu.: 8.40
## Median : 15.00
## Mean : 21.16
## 3rd Qu.: 24.00
## Max. :124.00
##
## HotelAddress
## The Mall, Shimla : 32
## #2-91/14/8, White Fields, Kondapur, Hitech City, Hyderabad, 500084 India: 16
## 121, City Terrace, Walchand Hirachand Marg, Mumbai, Maharashtra : 16
## 14-4507/9, Balmatta Road, Near Jyothi Circle, Hampankatta : 16
## 144/7, Rajiv Gandi Salai (OMR), Kottivakkam, Chennai, Tamil Nadu : 16
## 17, Oliver Road, Colaba, Mumbai, Maharashtra : 16
## (Other) :13120
## HotelPincode HotelDescription FreeWifi FreeBreakfast
## Min. : 100025 3 : 120 Min. :0.0000 Min. :0.0000
## 1st Qu.: 221001 Abc : 112 1st Qu.:1.0000 1st Qu.:0.0000
## Median : 395003 3-star hotel: 104 Median :1.0000 Median :1.0000
## Mean : 397430 3.5 : 88 Mean :0.9259 Mean :0.6491
## 3rd Qu.: 570001 4 : 72 3rd Qu.:1.0000 3rd Qu.:1.0000
## Max. :7000157 (Other) :12728 Max. :1.0000 Max. :1.0000
## NA's : 8
## HotelCapacity HasSwimmingPool
## Min. : 0.00 Min. :0.0000
## 1st Qu.: 16.00 1st Qu.:0.0000
## Median : 34.00 Median :0.0000
## Mean : 62.51 Mean :0.3558
## 3rd Qu.: 75.00 3rd Qu.:1.0000
## Max. :600.00 Max. :1.0000
##
Average RoomRent on the basis of TouristDestination
attach(hotel)
aggregate(RoomRent, by=list(IsTouristDestination),mean)
## Group.1 x
## 1 0 4111.003
## 2 1 6066.024
Average RoomRent on the basis of MetroCity
aggregate(RoomRent, by=list(IsMetroCity),mean)
## Group.1 x
## 1 0 5782.794
## 2 1 4696.073
Average RoomRent on the NewYearEve
aggregate(RoomRent, by=list(IsNewYearEve),mean)
## Group.1 x
## 1 0 5367.606
## 2 1 6222.826
Average RoomRent on the basis of TouristDestination and MetroCity
aggregate(hotel$RoomRent,by=list(touristplace= hotel$IsTouristDestination, MetroCity= hotel$IsMetroCity),mean)
## touristplace MetroCity x
## 1 0 0 4006.435
## 2 1 0 6755.728
## 3 0 1 4646.136
## 4 1 1 4706.608
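The tourist premium within each city type follows directly from these four group means. The sketch below hard-codes the means from the table above; the data frame `means` and the two `premium_*` names are introduced only for the illustration.

```r
# Group means copied from the aggregate() output above.
means <- data.frame(
  touristplace = c(0, 1, 0, 1),
  MetroCity    = c(0, 0, 1, 1),
  x            = c(4006.435, 6755.728, 4646.136, 4706.608)
)

# Tourist premium = mean rent in tourist places minus mean rent elsewhere,
# computed separately for non-metro and metro cities.
premium_non_metro <- means$x[means$touristplace == 1 & means$MetroCity == 0] -
  means$x[means$touristplace == 0 & means$MetroCity == 0]
premium_metro <- means$x[means$touristplace == 1 & means$MetroCity == 1] -
  means$x[means$touristplace == 0 & means$MetroCity == 1]

print(c(non_metro = premium_non_metro, metro = premium_metro))
```

These are raw differences in averages with no other controls, so they describe the data rather than isolate a causal effect.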
Two-way contingency table based on TouristDestination and MetroCity
view<- xtabs(~ IsTouristDestination+IsMetroCity, data= hotel)
view
## IsMetroCity
## IsTouristDestination 0 1
## 0 3352 655
## 1 6120 3105
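The counts in this table can be turned into shares to see how the observations are distributed. The sketch rebuilds the table from the four printed counts; the object names are hypothetical.

```r
# Counts copied from the xtabs() output above (filled column by column).
counts <- matrix(
  c(3352, 6120, 655, 3105), nrow = 2,
  dimnames = list(IsTouristDestination = c("0", "1"),
                  IsMetroCity = c("0", "1"))
)

total         <- sum(counts)
tourist_share <- sum(counts["1", ]) / total  # share of rows in tourist destinations
metro_share   <- sum(counts[, "1"]) / total  # share of rows in metro cities

# Within each tourist group, the split between non-metro and metro cities.
row_shares <- prop.table(counts, margin = 1)

print(total)
print(round(c(tourist = tourist_share, metro = metro_share), 4))
print(round(row_shares, 3))
```

The two overall shares should agree with the means of IsTouristDestination and IsMetroCity reported in the descriptive statistics, which gives a quick consistency check on the table.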
View of RoomRent based on different cities
boxplot(hotel$RoomRent ~ hotel$CityName ,main="price bifurcation for cities",xlab="rent",ylab="Cities" ,horizontal=TRUE, ylim=c(0,50000),col=c("red","yellow","brown", "blue", "peachpuff","beige","orchid3", "chartreuse"))

BOXPLOT based on Tourist destination
boxplot(hotel$RoomRent ~ hotel$IsTouristDestination, main = "price for tourist destination", xlab = "rent", ylab = "IsTouristDestination", horizontal = TRUE, ylim = c(0, 100000), col = c("red", "yellow"))

Coefficient plot
library(coefplot)
coefplot(fit2, intercept=FALSE)

Plot Of the model
library(leaps)
leap2 <- regsubsets(Model2, data = hotel, nbest=1)
plot(leap2, scale="adjr2")

INTERACTION BETWEEN ROOMRENT AND TOURISTDESTINATION
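# RoomRent cannot serve as both the x factor and the response, so the plot shows
# mean RoomRent by IsMetroCity, traced separately for each IsTouristDestination group.
interaction.plot(x.factor = hotel$IsMetroCity, trace.factor = hotel$IsTouristDestination,
response = hotel$RoomRent, type = "b",
col = c("red", "blue"), pch = c(16, 18),
main = "Interaction between RoomRent and TouristDestination")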

MAIN EFFECT AND TWO WAY INTERACTION
library(HH)
interaction2wt(RoomRent ~ IsMetroCity+IsTouristDestination, data=hotel)

FOR MODEL SIGNIFICANCE AND ROBUSTNESS
summary(fit1)$adj.r.squared
## [1] 0.1898256
summary(fit2)$adj.r.squared
## [1] 0.1898573
AIC(fit1)
## [1] 270314.1
AIC(fit2)
## [1] 270310.6
Remove variables with VIF > 2.5 and rebuild the model until no VIF exceeds 2.5
Model2 <- RoomRent ~ StarRating+Population+IsMetroCity+IsTouristDestination+IsNewYearEve+Airport+FreeWifi+HotelCapacity+HasSwimmingPool
fit2 <- lm(Model2, data = hotel)
all_vifs <- car::vif(fit2)
print(all_vifs)
## StarRating Population IsMetroCity
## 2.118451 2.820418 2.807261
## IsTouristDestination IsNewYearEve Airport
## 1.210458 1.000013 1.160325
## FreeWifi HotelCapacity HasSwimmingPool
## 1.024652 1.888342 1.777718
signif_all <- names(all_vifs)
selectedMod <- fit2 # fallback in case no VIF exceeds the threshold
# Remove the variable with the highest VIF and rebuild the model until no VIF exceeds 2.5.
while (any(all_vifs > 2.5)) {
  var_with_max_vif <- names(which.max(all_vifs)) # variable with the largest VIF
  signif_all <- signif_all[!(signif_all %in% var_with_max_vif)] # drop it
  myForm <- as.formula(paste("RoomRent ~", paste(signif_all, collapse = " + "))) # new formula
  selectedMod <- lm(myForm, data = hotel) # rebuild the model with the new formula
  all_vifs <- car::vif(selectedMod)
}
summary(selectedMod)
##
## Call:
## lm(formula = myForm, data = hotel)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11654 -2365 -710 1067 309426
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -8838.099 402.792 -21.942 < 2e-16 ***
## StarRating 3569.749 110.439 32.323 < 2e-16 ***
## IsMetroCity -1530.867 138.033 -11.091 < 2e-16 ***
## IsTouristDestination 2094.588 133.722 15.664 < 2e-16 ***
## IsNewYearEve 843.370 174.054 4.845 1.28e-06 ***
## Airport 11.506 2.705 4.253 2.12e-05 ***
## FreeWifi 534.928 221.665 2.413 0.0158 *
## HotelCapacity -11.137 1.021 -10.907 < 2e-16 ***
## HasSwimmingPool 2225.460 159.331 13.968 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6608 on 13223 degrees of freedom
## Multiple R-squared: 0.1886, Adjusted R-squared: 0.1881
## F-statistic: 384.1 on 8 and 13223 DF, p-value: < 2.2e-16
car::vif(selectedMod)
## StarRating IsMetroCity IsTouristDestination
## 2.113742 1.174547 1.144109
## IsNewYearEve Airport FreeWifi
## 1.000013 1.148621 1.022146
## HotelCapacity HasSwimmingPool
## 1.856658 1.763424
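For readers who want to see what a VIF measures, it can be computed by hand in base R: regress each predictor on the remaining ones and take 1 / (1 - R-squared). The sketch below uses simulated data and a hypothetical helper `vif_by_hand`; it is meant to illustrate the definition and is not the code used in the study, which relies on `car::vif`.

```r
set.seed(42)
n  <- 500
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)  # deliberately correlated with x1
x3 <- rnorm(n)                       # independent of the others
y  <- 1 + x1 + x2 + x3 + rnorm(n)
sim <- data.frame(y, x1, x2, x3)

# VIF of predictor p = 1 / (1 - R^2) from regressing p on the other predictors.
vif_by_hand <- function(data, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    aux <- lm(reformulate(others, response = p), data = data)
    1 / (1 - summary(aux)$r.squared)
  })
}

vifs <- vif_by_hand(sim, c("x1", "x2", "x3"))
print(round(vifs, 2))
```

A predictor that is well explained by the others gets a large VIF, which is why dropping Population above brings the VIF of IsMetroCity down sharply.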