India is expected to become a developed country soon, and its tertiary sector is growing at a very high rate. Tourism, which is part of the tertiary sector, is the most rapidly growing part of it. Travellers, from teenagers to families, look for a pleasant and convenient place to stay. Special occasions such as New Year's Eve or notable days in Indian history may lead to an increase in hotel room prices.
This project examines which factors are correlated with the pricing of hotel rooms. Are these factors consistent across the dataset, or are there exceptions? The analysis in this report addresses these questions.
I have analysed hotel room prices across 42 cities in India. Several factors are considered in this report, such as whether the city is a metro city and how far the hotel is from the nearest airport. The dataset records hotel prices on different dates, including New Year's Eve and Christmas.
The specific objective of this study is to analyse the pricing strategy used by hotels located in 42 cities in India. We compared the prices of hotel rooms on the basis of four factors: 1. IsNewYearEve, 2. IsWeekend, 3. StarRating, and 4. whether the hotel has a swimming pool (HasSwimmingPool).
NULL HYPOTHESIS: RoomRent and IsNewYearEve are independent.
For this study, we collected data from the Hotels.com website (https://in.hotels.com/) in October 2016. The minimum room rent is merely INR 299 per night, compared with a maximum of INR 322500 per night. The mean and median (INR 5474 and INR 4000, respectively) are much closer to the minimum than to the maximum. This shows that very expensive hotels are relatively few in the dataset and act as outliers. The most expensive entries belong to a heritage hotel, which helps explain their pricing. This point is also supported by the fact that most hotels have an average star rating (the median star rating is 3).
Hotels with a swimming pool are expected to have a higher price tag than hotels without one. Tourists want to relax after a tiring journey, so they tend to book hotels with a swimming pool, and demand for such rooms will be higher. Only 4708 of the 13232 entries in the dataset are for hotels with a swimming pool. Since the majority of entries are from hotels without one, a swimming pool can be regarded as a luxury.
The price of a hotel room is also expected to depend on the number of rooms the hotel has. A larger capacity suggests that the hotel has extensive infrastructure and better amenities, so its room rent is expected to be higher.
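The figures quoted above can be cross-checked with a few lines of R. The sketch below is only illustrative: it re-enters by hand the counts and summary values reported elsewhere in this report (it does not read Cities42.csv), and the variable names are chosen here for the example.

```r
# Counts copied from the table(HasSwimmingPool) output in this report.
pool_counts <- c(no_pool = 8524, pool = 4708)
total_entries <- sum(pool_counts)        # total number of entries
pool_share <- prop.table(pool_counts)    # proportions that sum to 1

# Summary values copied from summary(hotel) for RoomRent (INR per night).
rent <- c(min = 299, median = 4000, mean = 5474, max = 322500)

# A mean well above the median suggests a right-skewed distribution.
mean_minus_median <- rent[["mean"]] - rent[["median"]]
max_over_median <- rent[["max"]] / rent[["median"]]

print(round(100 * pool_share, 1))
print(c(mean_minus_median = mean_minus_median,
        max_over_median = max_over_median))
```

The gap between mean and median, together with a maximum many times larger than the median, is what motivates treating the most expensive entries as outliers.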
Room Rent: The room rent is one of the most important factors when choosing a hotel. Generally, the higher the quality of service, the higher the room rent. Many factors affect room pricing in the hotel industry.
Star Rating: In India, hotels are assigned a star rating by the Ministry of Tourism. The rating ranges from 1 (poor service) to 5 (excellent service) and is assigned according to the services, ambience and quality the hotel offers its customers. Since the star rating is an assurance of quality and luxury, the price of hotel rooms is positively correlated with it (the correlation between RoomRent and StarRating in this dataset is about 0.37). Tourists often choose hotels based on star ratings and testimonials.
Hotel Capacity: Usually, the more rooms a hotel has, the bigger the hotel and the higher its room rent is likely to be. This is because hotels built on a large scale tend to have good amenities, a high star rating and good infrastructure, so they can charge a higher price for their rooms. Most hotels with a large number of rooms have a high star rating and belong to big hotel brands, so they charge more than low-capacity hotels, which cannot afford to maintain a large number of rooms.
Swimming Pools: It is likely that a room in a hotel with a swimming pool will be more expensive than a room in a hotel without one. The majority of hotels do not have a swimming pool, owing to space constraints, financial constraints and the hassle of managing a pool. This makes a swimming pool a luxury. It is also an ideal way for a family, a group of friends or an individual to have some fun and relax.
City: The city plays a major role. Tourists, especially foreigners, choose to stay in well-furnished hotels located in elite cities like Mumbai, Delhi, Goa etc. The city in which a hotel is situated is therefore likely to strongly influence its room rent. A variable, CityRank, was used to uniquely identify each city in the dataset. Each city has different characteristics, which influence the pricing of hotel rooms in that city. Some cities have a hotel that does not follow the city's general pricing trend. This might be because the hotel has something special to offer, e.g. it is a heritage hotel.
Tourist Destination: Cities that are popular tourist destinations have a higher demand for hotel rooms, since more people want to stay there to visit tourist spots known for their beauty and natural surroundings. This allows hotel owners to charge a higher price for their rooms than hotel owners in cities that are not popular tourist destinations.
Distance from nearest airport: Hotel room prices are expected to be correlated with the distance of the hotel from the nearest airport. A hotel that is very far from the airport would not be preferred by most travellers, because they would not like to spend a lot of time travelling from the airport to the hotel and back. If they could find a hotel closer to the airport, they would be more likely to stay there.
Free Wifi: Nowadays, wifi is an important part of our lives, as it connects us to social media where we share updates about our daily life. People therefore expect free wifi to be available at the hotel where they are staying, so offering free wifi is an added advantage for a hotel.
We proposed the following linear model, where u is the error term:
RoomRent = a0 + a1*IsMetroCity + a2*IsTouristDestination + a3*IsNewYearEve + a4*IsWeekend + a5*StarRating + a6*Airport + a7*HasSwimmingPool + a8*FreeWifi + a9*FreeBreakfast + u
hotel <- read.csv("Cities42.csv", header = TRUE, sep = ",")
fit <- lm( RoomRent ~ IsMetroCity + IsTouristDestination + IsNewYearEve + IsWeekend + StarRating + Airport + HasSwimmingPool + FreeWifi + FreeBreakfast, hotel)
summary(fit)
##
## Call:
## lm(formula = RoomRent ~ IsMetroCity + IsTouristDestination +
## IsNewYearEve + IsWeekend + StarRating + Airport + HasSwimmingPool +
## FreeWifi + FreeBreakfast, data = hotel)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10451 -2349 -712 978 310394
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -7726.314 399.765 -19.327 < 2e-16 ***
## IsMetroCity -1867.441 135.859 -13.745 < 2e-16 ***
## IsTouristDestination 2330.496 133.374 17.473 < 2e-16 ***
## IsNewYearEve 887.140 182.783 4.854 1.23e-06 ***
## IsWeekend -96.609 124.494 -0.776 0.4378
## StarRating 3014.799 98.267 30.680 < 2e-16 ***
## Airport 11.310 2.721 4.156 3.26e-05 ***
## HasSwimmingPool 1875.286 156.573 11.977 < 2e-16 ***
## FreeWifi 562.209 224.987 2.499 0.0125 *
## FreeBreakfast 309.581 123.255 2.512 0.0120 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6636 on 13222 degrees of freedom
## Multiple R-squared: 0.1817, Adjusted R-squared: 0.1811
## F-statistic: 326.2 on 9 and 13222 DF, p-value: < 2.2e-16
We regressed RoomRent on IsMetroCity, IsTouristDestination, IsNewYearEve, IsWeekend, StarRating, Airport, HasSwimmingPool, FreeWifi, and FreeBreakfast. The coefficients of IsTouristDestination, IsNewYearEve, StarRating, Airport, HasSwimmingPool, FreeWifi and FreeBreakfast are positive, while those of IsMetroCity and IsWeekend are negative (the IsWeekend coefficient is not statistically significant, p = 0.44). A positive coefficient indicates that the factor is associated with a higher room rent when the other factors are held constant. For example, each additional star is associated with an increase of about INR 3015 in room rent.
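The interpretation of the coefficients can be made concrete by computing a fitted value by hand. The sketch below copies the estimates from the summary(fit) output above and applies them to a hypothetical hotel; the hotel's characteristics, the Airport value of 15 (in whatever distance unit the dataset uses) and all variable names are assumptions made for illustration, not a row of the data.

```r
# Coefficient estimates copied from the summary(fit) output in this report.
coefs <- c(
  "(Intercept)"        = -7726.314,
  IsMetroCity          = -1867.441,
  IsTouristDestination =  2330.496,
  IsNewYearEve         =   887.140,
  IsWeekend            =   -96.609,
  StarRating           =  3014.799,
  Airport              =    11.310,
  HasSwimmingPool      =  1875.286,
  FreeWifi             =   562.209,
  FreeBreakfast        =   309.581
)

# Hypothetical hotel (an assumption for illustration only):
# 5-star, non-metro tourist destination, New Year's Eve on a weekend,
# Airport = 15, with pool, free wifi and free breakfast.
hotel_x <- c(
  "(Intercept)" = 1, IsMetroCity = 0, IsTouristDestination = 1,
  IsNewYearEve = 1, IsWeekend = 1, StarRating = 5, Airport = 15,
  HasSwimmingPool = 1, FreeWifi = 1, FreeBreakfast = 1
)

# Align by name so the order of the two vectors cannot get out of sync.
predicted_rent <- sum(coefs * hotel_x[names(coefs)])

# The same hotel on a date that is not New Year's Eve.
hotel_x0 <- replace(hotel_x, "IsNewYearEve", 0)
baseline_rent <- sum(coefs * hotel_x0[names(coefs)])

print(c(predicted = predicted_rent, baseline = baseline_rent))
```

The difference between the two fitted values equals the IsNewYearEve coefficient, which is what "holding the other factors constant" means in a linear model.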
We found empirical support for the statement that room rents are higher when the booking date is on or around New Year's Eve (i.e. we reject the null hypothesis that RoomRent and IsNewYearEve are independent). The star rating of a hotel has the strongest association with room rent, followed by whether the hotel is in a tourist destination.
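One way to check which predictors stand out is to rank them by the absolute value of their t statistic. The sketch below re-enters the estimates and standard errors from the summary(fit) output by hand (the data frame and column names are chosen here for the example) and recomputes the t values as estimate divided by standard error.

```r
# Estimates and standard errors copied from the summary(fit) output.
coef_table <- data.frame(
  term = c("IsMetroCity", "IsTouristDestination", "IsNewYearEve",
           "IsWeekend", "StarRating", "Airport", "HasSwimmingPool",
           "FreeWifi", "FreeBreakfast"),
  estimate = c(-1867.441, 2330.496, 887.140, -96.609, 3014.799,
               11.310, 1875.286, 562.209, 309.581),
  std_error = c(135.859, 133.374, 182.783, 124.494, 98.267,
                2.721, 156.573, 224.987, 123.255)
)

# t value = estimate / standard error, as in the regression output.
coef_table$t_value <- coef_table$estimate / coef_table$std_error

# Sort by the absolute t value, largest first.
ranked <- coef_table[order(-abs(coef_table$t_value)), ]
print(ranked, row.names = FALSE)
```

Note that a t value reflects how precisely a coefficient is estimated, not only how large the effect is in rupees, and that raw coefficients are not directly comparable across predictors measured on different scales (Airport is a distance, the indicators are 0/1).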
This paper was motivated by the need for research that could improve our understanding of how heritage tourism influences pricing strategies in the hotel industry. We found that tourist destination, New Year's Eve, star rating, distance from the airport, swimming pool, free wifi and free breakfast have a positive effect on hotel room prices. Of these, tourist destination, star rating and swimming pool have large effects, while the others have small but statistically significant effects.
hotel <- read.csv("Cities42.csv", header = TRUE, sep = ",")
View(hotel)
summary(hotel)
## CityName Population CityRank IsMetroCity
## Delhi :2048 Min. : 8096 Min. : 0.00 Min. :0.0000
## Jaipur : 768 1st Qu.: 744983 1st Qu.: 2.00 1st Qu.:0.0000
## Mumbai : 712 Median : 3046163 Median : 9.00 Median :0.0000
## Bangalore: 656 Mean : 4416837 Mean :14.83 Mean :0.2842
## Goa : 624 3rd Qu.: 8443675 3rd Qu.:24.00 3rd Qu.:1.0000
## Kochi : 608 Max. :12442373 Max. :44.00 Max. :1.0000
## (Other) :7816
## IsTouristDestination IsWeekend IsNewYearEve Date
## Min. :0.0000 Min. :0.0000 Min. :0.0000 Dec 21 2016:1611
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000 Dec 24 2016:1611
## Median :1.0000 Median :1.0000 Median :0.0000 Dec 25 2016:1611
## Mean :0.6972 Mean :0.6228 Mean :0.1244 Dec 28 2016:1611
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.0000 Dec 31 2016:1611
## Max. :1.0000 Max. :1.0000 Max. :1.0000 Dec 18 2016:1608
## (Other) :3569
## HotelName RoomRent StarRating
## Vivanta by Taj : 32 Min. : 299 Min. :0.000
## Goldfinch Hotel : 24 1st Qu.: 2436 1st Qu.:3.000
## OYO Rooms : 24 Median : 4000 Median :3.000
## The Gordon House Hotel: 24 Mean : 5474 Mean :3.459
## Apnayt Villa : 16 3rd Qu.: 6299 3rd Qu.:4.000
## Bentleys Hotel Colaba : 16 Max. :322500 Max. :5.000
## (Other) :13096
## Airport
## Min. : 0.20
## 1st Qu.: 8.40
## Median : 15.00
## Mean : 21.16
## 3rd Qu.: 24.00
## Max. :124.00
##
## HotelAddress
## The Mall, Shimla : 32
## #2-91/14/8, White Fields, Kondapur, Hitech City, Hyderabad, 500084 India: 16
## 121, City Terrace, Walchand Hirachand Marg, Mumbai, Maharashtra : 16
## 14-4507/9, Balmatta Road, Near Jyothi Circle, Hampankatta : 16
## 144/7, Rajiv Gandi Salai (OMR), Kottivakkam, Chennai, Tamil Nadu : 16
## 17, Oliver Road, Colaba, Mumbai, Maharashtra : 16
## (Other) :13120
## HotelPincode HotelDescription FreeWifi FreeBreakfast
## Min. : 100025 3 : 120 Min. :0.0000 Min. :0.0000
## 1st Qu.: 221001 Abc : 112 1st Qu.:1.0000 1st Qu.:0.0000
## Median : 395003 3-star hotel: 104 Median :1.0000 Median :1.0000
## Mean : 397430 3.5 : 88 Mean :0.9259 Mean :0.6491
## 3rd Qu.: 570001 4 : 72 3rd Qu.:1.0000 3rd Qu.:1.0000
## Max. :7000157 (Other) :12728 Max. :1.0000 Max. :1.0000
## NA's : 8
## HotelCapacity HasSwimmingPool
## Min. : 0.00 Min. :0.0000
## 1st Qu.: 16.00 1st Qu.:0.0000
## Median : 34.00 Median :0.0000
## Mean : 62.51 Mean :0.3558
## 3rd Qu.: 75.00 3rd Qu.:1.0000
## Max. :600.00 Max. :1.0000
##
a <- with(hotel, table(HasSwimmingPool))
a
## HasSwimmingPool
## 0 1
## 8524 4708
b <- with(hotel, table(FreeBreakfast))
b
## FreeBreakfast
## 0 1
## 4643 8589
c <- with(hotel, table(FreeWifi))
c
## FreeWifi
## 0 1
## 981 12251
d <- with(hotel, table(StarRating))
d
## StarRating
## 0 1 2 2.5 3 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4 4.1
## 16 8 440 632 5953 8 16 8 1752 8 24 16 32 2463 24
## 4.3 4.4 4.5 4.7 4.8 5
## 16 8 376 8 16 1408
e <- with(hotel, table(IsNewYearEve))
e
## IsNewYearEve
## 0 1
## 11586 1646
f <- with(hotel, table(IsWeekend))
f
## IsWeekend
## 0 1
## 4991 8241
g <- with(hotel, table(IsTouristDestination))
g
## IsTouristDestination
## 0 1
## 4007 9225
h <- with(hotel, table(IsMetroCity))
h
## IsMetroCity
## 0 1
## 9472 3760
i <- with(hotel, table(CityRank))
i
## CityRank
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
## 712 2048 656 416 536 424 512 80 600 768 32 128 16 136 160
## 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
## 432 448 624 128 264 40 224 336 392 48 160 120 272 104 456
## 32 33 34 35 36 37 38 39 40 42 43 44
## 48 56 280 64 136 88 128 136 264 144 328 288
j <- with(hotel, table(Population))
j
## Population
## 8096 38471 38472 38473 41377 65471 88430 98658
## 288 325 1 2 144 264 136 128
## 102138 132016 140925 169578 201026 451735 499487 595575
## 88 136 64 280 56 456 104 608
## 744983 755379 885363 957352 960787 1180570 1201815 1286678
## 392 160 120 48 336 40 264 128
## 1447187 1457723 1465625 1637875 1760285 2167447 2490891 2765348
## 48 624 112 224 432 160 136 16
## 2817105 2975440 3046163 3124458 4467797 4496694 5577940 6731790
## 128 32 768 600 80 512 424 536
## 7088416 8443675 11034555 12442373
## 416 656 2048 712
k <- with(hotel, table(CityName))
k
## CityName
## Agra Ahmedabad Amritsar Bangalore
## 432 424 136 656
## Bhubaneswar Chandigarh Chennai Darjeeling
## 120 336 416 136
## Delhi Gangtok Goa Guwahati
## 2048 128 624 48
## Haridwar Hyderabad Indore Jaipur
## 48 536 160 768
## Jaisalmer Jodhpur Kanpur Kochi
## 264 224 16 608
## Kolkata Lucknow Madurai Manali
## 512 128 112 288
## Mangalore Mumbai Munnar Mysore
## 104 712 328 160
## Nainital Ooty Panchkula Pune
## 144 136 64 600
## Puri Rajkot Rishikesh Shimla
## 56 128 88 280
## Srinagar Surat Thiruvanthipuram Thrissur
## 40 80 392 32
## Udaipur Varanasi
## 456 264
l <- xtabs( ~ CityName + HasSwimmingPool, data = hotel)
l
## HasSwimmingPool
## CityName 0 1
## Agra 248 184
## Ahmedabad 312 112
## Amritsar 96 40
## Bangalore 408 248
## Bhubaneswar 96 24
## Chandigarh 256 80
## Chennai 176 240
## Darjeeling 128 8
## Delhi 1304 744
## Gangtok 112 16
## Goa 192 432
## Guwahati 16 32
## Haridwar 40 8
## Hyderabad 392 144
## Indore 152 8
## Jaipur 395 373
## Jaisalmer 184 80
## Jodhpur 120 104
## Kanpur 16 0
## Kochi 312 296
## Kolkata 329 183
## Lucknow 96 32
## Madurai 88 24
## Manali 272 16
## Mangalore 80 24
## Mumbai 472 240
## Munnar 320 8
## Mysore 120 40
## Nainital 120 24
## Ooty 120 16
## Panchkula 48 16
## Pune 368 232
## Puri 24 32
## Rajkot 96 32
## Rishikesh 64 24
## Shimla 240 40
## Srinagar 24 16
## Surat 64 16
## Thiruvanthipuram 128 264
## Thrissur 32 0
## Udaipur 248 208
## Varanasi 216 48
m <- xtabs( ~ CityName + FreeBreakfast, data = hotel)
m
## FreeBreakfast
## CityName 0 1
## Agra 256 176
## Ahmedabad 72 352
## Amritsar 96 40
## Bangalore 248 408
## Bhubaneswar 32 88
## Chandigarh 280 56
## Chennai 96 320
## Darjeeling 0 136
## Delhi 652 1396
## Gangtok 72 56
## Goa 264 360
## Guwahati 8 40
## Haridwar 24 24
## Hyderabad 201 335
## Indore 8 152
## Jaipur 370 398
## Jaisalmer 155 109
## Jodhpur 104 120
## Kanpur 0 16
## Kochi 216 392
## Kolkata 217 295
## Lucknow 29 99
## Madurai 0 112
## Manali 80 208
## Mangalore 24 80
## Mumbai 240 472
## Munnar 48 280
## Mysore 16 144
## Nainital 64 80
## Ooty 80 56
## Panchkula 24 40
## Pune 152 448
## Puri 32 24
## Rajkot 0 128
## Rishikesh 19 69
## Shimla 64 216
## Srinagar 16 24
## Surat 8 72
## Thiruvanthipuram 160 232
## Thrissur 8 24
## Udaipur 128 328
## Varanasi 80 184
n <- xtabs( ~ CityName + FreeWifi, data = hotel)
n
## FreeWifi
## CityName 0 1
## Agra 8 424
## Ahmedabad 56 368
## Amritsar 16 120
## Bangalore 24 632
## Bhubaneswar 0 120
## Chandigarh 64 272
## Chennai 16 400
## Darjeeling 0 136
## Delhi 48 2000
## Gangtok 8 120
## Goa 128 496
## Guwahati 0 48
## Haridwar 8 40
## Hyderabad 0 536
## Indore 0 160
## Jaipur 54 714
## Jaisalmer 32 232
## Jodhpur 16 208
## Kanpur 0 16
## Kochi 80 528
## Kolkata 39 473
## Lucknow 0 128
## Madurai 0 112
## Manali 32 256
## Mangalore 0 104
## Mumbai 48 664
## Munnar 40 288
## Mysore 0 160
## Nainital 32 112
## Ooty 40 96
## Panchkula 0 64
## Pune 24 576
## Puri 24 32
## Rajkot 0 128
## Rishikesh 8 80
## Shimla 56 224
## Srinagar 8 32
## Surat 8 72
## Thiruvanthipuram 32 360
## Thrissur 0 32
## Udaipur 32 424
## Varanasi 0 264
o <- xtabs( ~ CityName + StarRating, data = hotel)
o
## StarRating
## CityName 0 1 2 2.5 3 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4
## Agra 0 8 32 16 176 0 0 0 72 0 0 0 0 48
## Ahmedabad 0 0 8 24 160 0 0 0 104 0 0 0 0 80
## Amritsar 0 0 0 0 80 0 0 0 24 0 0 0 0 24
## Bangalore 0 0 0 0 304 0 0 0 8 0 0 0 0 224
## Bhubaneswar 0 0 0 0 96 0 0 0 16 0 0 0 0 0
## Chandigarh 0 0 0 8 144 0 0 0 96 0 0 0 0 48
## Chennai 0 0 32 40 64 0 0 0 24 0 0 0 0 160
## Darjeeling 0 0 0 0 88 0 0 0 24 0 0 0 0 24
## Delhi 0 0 32 96 920 0 0 0 208 8 0 0 0 416
## Gangtok 0 0 0 0 72 0 0 0 48 0 0 0 0 8
## Goa 0 0 128 24 216 0 0 0 32 0 0 0 0 136
## Guwahati 0 0 0 0 24 0 0 0 16 0 0 0 0 0
## Haridwar 0 0 0 8 8 0 0 0 8 0 0 0 0 16
## Hyderabad 0 0 8 24 256 0 0 0 48 0 8 16 16 80
## Indore 0 0 0 8 120 0 0 0 16 0 0 0 0 0
## Jaipur 0 0 24 24 320 0 0 0 112 0 0 0 0 160
## Jaisalmer 8 0 8 32 144 0 0 0 24 0 0 0 0 40
## Jodhpur 8 0 0 0 96 0 0 0 56 0 0 0 0 32
## Kanpur 0 0 0 0 16 0 0 0 0 0 0 0 0 0
## Kochi 0 0 48 72 224 0 0 0 48 0 0 0 0 104
## Kolkata 0 0 8 16 305 0 0 0 48 0 0 0 0 63
## Lucknow 0 0 0 0 64 0 0 0 24 0 0 0 0 32
## Madurai 0 0 0 0 72 0 0 0 32 0 0 0 0 8
## Manali 0 0 0 16 160 0 0 0 32 0 0 0 0 72
## Mangalore 0 0 0 0 64 0 0 0 16 0 0 0 0 24
## Mumbai 0 0 24 32 216 0 0 0 176 0 0 0 0 120
## Munnar 0 0 0 32 128 0 0 0 88 0 0 0 0 64
## Mysore 0 0 8 0 120 0 0 0 8 0 0 0 0 24
## Nainital 0 0 0 0 72 0 0 0 48 0 0 0 0 8
## Ooty 0 0 0 0 88 0 0 0 16 0 0 0 0 32
## Panchkula 0 0 0 16 24 0 0 0 8 0 0 0 0 8
## Pune 0 0 24 32 264 0 0 0 96 0 0 0 0 88
## Puri 0 0 0 0 48 0 0 0 0 0 0 0 0 8
## Rajkot 0 0 0 0 88 0 0 0 0 0 0 0 0 24
## Rishikesh 0 0 0 8 48 0 0 0 16 0 0 0 0 16
## Shimla 0 0 8 40 120 0 0 0 48 0 0 0 0 40
## Srinagar 0 0 0 0 8 0 0 0 0 0 0 0 0 16
## Surat 0 0 0 0 48 0 0 0 0 0 0 0 0 32
## Thiruvanthipuram 0 0 8 0 168 0 0 0 8 0 0 0 0 120
## Thrissur 0 0 0 0 16 0 0 0 0 0 0 0 0 16
## Udaipur 0 0 24 40 176 8 0 8 80 0 16 0 16 24
## Varanasi 0 0 16 24 128 0 16 0 24 0 0 0 0 24
## StarRating
## CityName 4.1 4.3 4.4 4.5 4.7 4.8 5
## Agra 0 0 0 16 0 0 64
## Ahmedabad 0 0 0 8 0 0 40
## Amritsar 0 0 0 8 0 0 0
## Bangalore 0 0 0 8 0 0 112
## Bhubaneswar 0 0 0 0 0 0 8
## Chandigarh 0 0 0 16 0 0 24
## Chennai 0 0 0 16 0 0 80
## Darjeeling 0 0 0 0 0 0 0
## Delhi 0 0 0 16 0 0 352
## Gangtok 0 0 0 0 0 0 0
## Goa 0 0 0 8 0 0 80
## Guwahati 0 0 0 0 0 0 8
## Haridwar 0 0 0 8 0 0 0
## Hyderabad 0 8 8 16 8 0 40
## Indore 0 0 0 8 0 0 8
## Jaipur 0 0 0 40 0 0 88
## Jaisalmer 0 0 0 0 0 0 8
## Jodhpur 0 0 0 16 0 0 16
## Kanpur 0 0 0 0 0 0 0
## Kochi 0 0 0 24 0 0 88
## Kolkata 0 0 0 16 0 0 56
## Lucknow 0 0 0 0 0 0 8
## Madurai 0 0 0 0 0 0 0
## Manali 0 0 0 0 0 0 8
## Mangalore 0 0 0 0 0 0 0
## Mumbai 0 0 0 48 0 0 96
## Munnar 0 0 0 16 0 0 0
## Mysore 0 0 0 0 0 0 0
## Nainital 0 0 0 8 0 0 8
## Ooty 0 0 0 0 0 0 0
## Panchkula 0 0 0 8 0 0 0
## Pune 0 8 0 48 0 0 40
## Puri 0 0 0 0 0 0 0
## Rajkot 0 0 0 0 0 0 16
## Rishikesh 0 0 0 0 0 0 0
## Shimla 0 0 0 8 0 0 16
## Srinagar 0 0 0 0 0 0 16
## Surat 0 0 0 0 0 0 0
## Thiruvanthipuram 0 0 0 0 0 0 88
## Thrissur 0 0 0 0 0 0 0
## Udaipur 16 0 0 8 0 16 24
## Varanasi 8 0 0 8 0 0 16
p <- xtabs( ~ CityName + IsTouristDestination, data = hotel)
p
## IsTouristDestination
## CityName 0 1
## Agra 0 432
## Ahmedabad 424 0
## Amritsar 0 136
## Bangalore 656 0
## Bhubaneswar 120 0
## Chandigarh 336 0
## Chennai 328 88
## Darjeeling 0 136
## Delhi 0 2048
## Gangtok 0 128
## Goa 0 624
## Guwahati 0 48
## Haridwar 0 48
## Hyderabad 536 0
## Indore 160 0
## Jaipur 0 768
## Jaisalmer 0 264
## Jodhpur 0 224
## Kanpur 16 0
## Kochi 0 608
## Kolkata 327 185
## Lucknow 128 0
## Madurai 0 112
## Manali 0 288
## Mangalore 104 0
## Mumbai 0 712
## Munnar 0 328
## Mysore 0 160
## Nainital 0 144
## Ooty 0 136
## Panchkula 64 0
## Pune 600 0
## Puri 0 56
## Rajkot 128 0
## Rishikesh 0 88
## Shimla 0 280
## Srinagar 0 40
## Surat 80 0
## Thiruvanthipuram 0 392
## Thrissur 0 32
## Udaipur 0 456
## Varanasi 0 264
q <- xtabs( ~ FreeBreakfast + HasSwimmingPool, data = hotel)
q
## HasSwimmingPool
## FreeBreakfast 0 1
## 0 2805 1838
## 1 5719 2870
r <- xtabs( ~ FreeBreakfast + FreeWifi, data = hotel)
r
## FreeWifi
## FreeBreakfast 0 1
## 0 606 4037
## 1 375 8214
s <- xtabs( ~ FreeWifi + HasSwimmingPool, data = hotel)
s
## HasSwimmingPool
## FreeWifi 0 1
## 0 592 389
## 1 7932 4319
t <- xtabs( ~ IsWeekend + IsTouristDestination, data = hotel)
t
## IsTouristDestination
## IsWeekend 0 1
## 0 1454 3537
## 1 2553 5688
u <- xtabs( ~ IsNewYearEve + IsTouristDestination, data = hotel)
u
## IsTouristDestination
## IsNewYearEve 0 1
## 0 3504 8082
## 1 503 1143
v <- xtabs( ~ StarRating + HasSwimmingPool, data = hotel)
v
## HasSwimmingPool
## StarRating 0 1
## 0 8 8
## 1 8 0
## 2 392 48
## 2.5 616 16
## 3 5236 717
## 3.2 0 8
## 3.3 16 0
## 3.4 0 8
## 3.5 1272 480
## 3.6 0 8
## 3.7 0 24
## 3.8 8 8
## 3.9 8 24
## 4 848 1615
## 4.1 8 16
## 4.3 0 16
## 4.4 8 0
## 4.5 48 328
## 4.7 0 8
## 4.8 0 16
## 5 48 1360
w <- xtabs( ~ StarRating + FreeBreakfast, data = hotel)
w
## FreeBreakfast
## StarRating 0 1
## 0 16 0
## 1 0 8
## 2 216 224
## 2.5 296 336
## 3 1789 4164
## 3.2 0 8
## 3.3 8 8
## 3.4 0 8
## 3.5 661 1091
## 3.6 8 0
## 3.7 0 24
## 3.8 8 8
## 3.9 16 16
## 4 783 1680
## 4.1 0 24
## 4.3 16 0
## 4.4 0 8
## 4.5 224 152
## 4.7 8 0
## 4.8 0 16
## 5 594 814
x <- xtabs( ~ StarRating + FreeWifi, data = hotel)
x
## FreeWifi
## StarRating 0 1
## 0 0 16
## 1 0 8
## 2 80 360
## 2.5 104 528
## 3 336 5617
## 3.2 0 8
## 3.3 0 16
## 3.4 0 8
## 3.5 96 1656
## 3.6 0 8
## 3.7 0 24
## 3.8 0 16
## 3.9 0 32
## 4 231 2232
## 4.1 0 24
## 4.3 0 16
## 4.4 0 8
## 4.5 24 352
## 4.7 0 8
## 4.8 0 16
## 5 110 1298
y <- xtabs( ~ IsMetroCity + HasSwimmingPool, data = hotel)
y
## HasSwimmingPool
## IsMetroCity 0 1
## 0 6163 3309
## 1 2361 1399
z <- xtabs( ~ IsWeekend + HasSwimmingPool, data = hotel)
z
## HasSwimmingPool
## IsWeekend 0 1
## 0 3229 1762
## 1 5295 2946
A <- xtabs( ~ IsNewYearEve + HasSwimmingPool, data = hotel)
A
## HasSwimmingPool
## IsNewYearEve 0 1
## 0 7466 4120
## 1 1058 588
B <- xtabs( ~ IsMetroCity + FreeBreakfast, data = hotel)
B
## FreeBreakfast
## IsMetroCity 0 1
## 0 3470 6002
## 1 1173 2587
C <- xtabs( ~ IsWeekend + FreeBreakfast, data = hotel)
C
## FreeBreakfast
## IsWeekend 0 1
## 0 1728 3263
## 1 2915 5326
D <- xtabs( ~ IsNewYearEve + FreeBreakfast, data = hotel)
D
## FreeBreakfast
## IsNewYearEve 0 1
## 0 4060 7526
## 1 583 1063
E <- xtabs( ~ IsMetroCity + FreeWifi, data = hotel)
E
## FreeWifi
## IsMetroCity 0 1
## 0 838 8634
## 1 143 3617
Fi <- xtabs( ~ IsWeekend + FreeWifi, data = hotel)
Fi
## FreeWifi
## IsWeekend 0 1
## 0 375 4616
## 1 606 7635
G <- xtabs( ~ IsNewYearEve + FreeWifi, data = hotel)
G
## FreeWifi
## IsNewYearEve 0 1
## 0 859 10727
## 1 122 1524
H <- xtabs( ~ StarRating + IsMetroCity, data = hotel)
H
## IsMetroCity
## StarRating 0 1
## 0 16 0
## 1 8 0
## 2 344 96
## 2.5 456 176
## 3 4336 1617
## 3.2 8 0
## 3.3 16 0
## 3.4 8 0
## 3.5 1312 440
## 3.6 0 8
## 3.7 24 0
## 3.8 16 0
## 3.9 32 0
## 4 1696 767
## 4.1 24 0
## 4.3 16 0
## 4.4 8 0
## 4.5 288 88
## 4.7 8 0
## 4.8 16 0
## 5 840 568
I <- xtabs( ~ StarRating + IsTouristDestination, data = hotel)
I
## IsTouristDestination
## StarRating 0 1
## 0 0 16
## 1 0 8
## 2 64 376
## 2.5 152 480
## 3 1888 4065
## 3.2 0 8
## 3.3 0 16
## 3.4 0 8
## 3.5 448 1304
## 3.6 0 8
## 3.7 8 16
## 3.8 16 0
## 3.9 16 16
## 4 839 1624
## 4.1 0 24
## 4.3 16 0
## 4.4 8 0
## 4.5 128 248
## 4.7 8 0
## 4.8 0 16
## 5 416 992
J <- xtabs( ~ IsMetroCity + IsWeekend, data = hotel)
J
## IsWeekend
## IsMetroCity 0 1
## 0 3578 5894
## 1 1413 2347
K <- xtabs( ~ IsNewYearEve + IsMetroCity, data = hotel)
K
## IsMetroCity
## IsNewYearEve 0 1
## 0 8295 3291
## 1 1177 469
boxplot( RoomRent ~ CityRank, data = hotel)
boxplot( RoomRent ~ IsMetroCity, data = hotel)
boxplot( RoomRent ~ IsTouristDestination, data = hotel)
boxplot( RoomRent ~ IsNewYearEve, data = hotel)
boxplot( RoomRent ~ IsWeekend, data = hotel)
boxplot( RoomRent ~ StarRating, data = hotel)
boxplot( RoomRent ~ Airport, data = hotel)
boxplot( RoomRent ~ FreeWifi, data = hotel)
boxplot( RoomRent ~ FreeBreakfast, data = hotel)
boxplot( RoomRent ~ HasSwimmingPool, data = hotel)
hist(hotel$CityRank)
hist(hotel$IsMetroCity)
hist(hotel$IsTouristDestination)
hist(hotel$IsWeekend)
hist(hotel$IsNewYearEve)
hist(hotel$StarRating)
hist(hotel$Airport)
hist(hotel$FreeWifi)
hist(hotel$FreeBreakfast)
hist(hotel$HotelCapacity)
hist(hotel$HasSwimmingPool)
hist(hotel$RoomRent)
plot(hotel$RoomRent, hotel$CityRank)
plot(hotel$RoomRent, hotel$IsMetroCity)
plot(hotel$RoomRent, hotel$IsTouristDestination)
plot(hotel$RoomRent, hotel$IsNewYearEve)
plot(hotel$RoomRent, hotel$IsWeekend)
plot(hotel$RoomRent, hotel$StarRating)
plot(hotel$RoomRent, hotel$Airport)
plot(hotel$RoomRent, hotel$FreeWifi)
plot(hotel$RoomRent, hotel$FreeBreakfast)
plot(hotel$RoomRent, hotel$HasSwimmingPool)
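cor(hotel[, c("Population", "CityRank", "IsMetroCity", "IsTouristDestination", "IsWeekend", "IsNewYearEve", "RoomRent", "StarRating", "HotelCapacity")])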
## Population CityRank IsMetroCity
## Population 1.0000000000 -0.8353204432 0.7712260105
## CityRank -0.8353204432 1.0000000000 -0.5643937903
## IsMetroCity 0.7712260105 -0.5643937903 1.0000000000
## IsTouristDestination -0.0482029722 0.2807134520 0.1763717063
## IsWeekend 0.0115926802 -0.0072564766 0.0018118005
## IsNewYearEve 0.0007332482 -0.0006326444 0.0006464753
## RoomRent -0.0887280632 0.0939855292 -0.0668397705
## StarRating 0.1341365933 -0.1333810133 0.0776028661
## HotelCapacity 0.2599830516 -0.2561197059 0.1871502153
## IsTouristDestination IsWeekend IsNewYearEve
## Population -0.048202972 0.011592680 0.0007332482
## CityRank 0.280713452 -0.007256477 -0.0006326444
## IsMetroCity 0.176371706 0.001811801 0.0006464753
## IsTouristDestination 1.000000000 -0.019481101 -0.0022663884
## IsWeekend -0.019481101 1.000000000 0.2923820508
## IsNewYearEve -0.002266388 0.292382051 1.0000000000
## RoomRent 0.122502963 0.004580134 0.0384912269
## StarRating -0.040554998 0.006378436 0.0023608970
## HotelCapacity -0.094356091 0.006306507 0.0013526790
## RoomRent StarRating HotelCapacity
## Population -0.088728063 0.134136593 0.259983052
## CityRank 0.093985529 -0.133381013 -0.256119706
## IsMetroCity -0.066839771 0.077602866 0.187150215
## IsTouristDestination 0.122502963 -0.040554998 -0.094356091
## IsWeekend 0.004580134 0.006378436 0.006306507
## IsNewYearEve 0.038491227 0.002360897 0.001352679
## RoomRent 1.000000000 0.369373425 0.157873308
## StarRating 0.369373425 1.000000000 0.637430337
## HotelCapacity 0.157873308 0.637430337 1.000000000
library(corrgram)
## Warning: package 'corrgram' was built under R version 3.4.3
corrgram(hotel, order = TRUE, lower.panel=panel.shade,
upper.panel=panel.pie, text.panel=panel.txt)
pairs( ~ CityRank + StarRating + Airport + RoomRent, hotel)
chisq.test(hotel$RoomRent, hotel$CityRank)
## Warning in chisq.test(hotel$RoomRent, hotel$CityRank): Chi-squared
## approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: hotel$RoomRent and hotel$CityRank
## X-squared = 238740, df = 88355, p-value < 2.2e-16
chisq.test(hotel$RoomRent, hotel$IsMetroCity)
## Warning in chisq.test(hotel$RoomRent, hotel$IsMetroCity): Chi-squared
## approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: hotel$RoomRent and hotel$IsMetroCity
## X-squared = 6443.9, df = 2155, p-value < 2.2e-16
chisq.test(hotel$RoomRent, hotel$IsTouristDestination)
## Warning in chisq.test(hotel$RoomRent, hotel$IsTouristDestination): Chi-
## squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: hotel$RoomRent and hotel$IsTouristDestination
## X-squared = 6413.2, df = 2155, p-value < 2.2e-16
chisq.test(hotel$RoomRent, hotel$IsNewYearEve)
## Warning in chisq.test(hotel$RoomRent, hotel$IsNewYearEve): Chi-squared
## approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: hotel$RoomRent and hotel$IsNewYearEve
## X-squared = 2047.8, df = 2155, p-value = 0.9505
chisq.test(hotel$RoomRent, hotel$IsWeekend)
## Warning in chisq.test(hotel$RoomRent, hotel$IsWeekend): Chi-squared
## approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: hotel$RoomRent and hotel$IsWeekend
## X-squared = 1587.6, df = 2155, p-value = 1
chisq.test(hotel$RoomRent, hotel$StarRating)
## Warning in chisq.test(hotel$RoomRent, hotel$StarRating): Chi-squared
## approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: hotel$RoomRent and hotel$StarRating
## X-squared = 132390, df = 43100, p-value < 2.2e-16
chisq.test(hotel$RoomRent, hotel$Airport)
## Warning in chisq.test(hotel$RoomRent, hotel$Airport): Chi-squared
## approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: hotel$RoomRent and hotel$Airport
## X-squared = 1584700, df = 581850, p-value < 2.2e-16
chisq.test(hotel$RoomRent, hotel$FreeWifi)
## Warning in chisq.test(hotel$RoomRent, hotel$FreeWifi): Chi-squared
## approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: hotel$RoomRent and hotel$FreeWifi
## X-squared = 5914.3, df = 2155, p-value < 2.2e-16
chisq.test(hotel$RoomRent, hotel$FreeBreakfast)
## Warning in chisq.test(hotel$RoomRent, hotel$FreeBreakfast): Chi-squared
## approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: hotel$RoomRent and hotel$FreeBreakfast
## X-squared = 6492.4, df = 2155, p-value < 2.2e-16
chisq.test(hotel$RoomRent, hotel$HasSwimmingPool)
## Warning in chisq.test(hotel$RoomRent, hotel$HasSwimmingPool): Chi-squared
## approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: hotel$RoomRent and hotel$HasSwimmingPool
## X-squared = 8394.6, df = 2155, p-value < 2.2e-16
t.test(hotel$RoomRent, hotel$CityRank)
##
## Welch Two Sample t-test
##
## data: hotel$RoomRent and hotel$CityRank
## t = 85.635, df = 13231, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 5334.200 5584.116
## sample estimates:
## mean of x mean of y
## 5473.99184 14.83374
t.test(hotel$RoomRent, hotel$IsMetroCity)
##
## Welch Two Sample t-test
##
## data: hotel$RoomRent and hotel$IsMetroCity
## t = 85.863, df = 13231, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 5348.750 5598.666
## sample estimates:
## mean of x mean of y
## 5473.9918380 0.2841596
t.test(hotel$RoomRent, hotel$IsTouristDestination)
##
## Welch Two Sample t-test
##
## data: hotel$RoomRent and hotel$IsTouristDestination
## t = 85.856, df = 13231, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 5348.337 5598.253
## sample estimates:
## mean of x mean of y
## 5473.9918380 0.6971735
t.test(hotel$RoomRent, hotel$IsNewYearEve)
##
## Welch Two Sample t-test
##
## data: hotel$RoomRent and hotel$IsNewYearEve
## t = 85.865, df = 13231, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 5348.910 5598.825
## sample estimates:
## mean of x mean of y
## 5473.9918380 0.1243954
t.test(hotel$RoomRent, hotel$IsWeekend)
##
## Welch Two Sample t-test
##
## data: hotel$RoomRent and hotel$IsWeekend
## t = 85.858, df = 13231, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 5348.411 5598.327
## sample estimates:
## mean of x mean of y
## 5473.9918380 0.6228083
t.test(hotel$RoomRent, hotel$StarRating)
##
## Welch Two Sample t-test
##
## data: hotel$RoomRent and hotel$StarRating
## t = 85.813, df = 13231, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 5345.575 5595.491
## sample estimates:
## mean of x mean of y
## 5473.991838 3.458933
t.test(hotel$RoomRent, hotel$Airport)
##
## Welch Two Sample t-test
##
## data: hotel$RoomRent and hotel$Airport
## t = 85.535, df = 13231, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 5327.875 5577.792
## sample estimates:
## mean of x mean of y
## 5473.99184 21.15874
t.test(hotel$RoomRent, hotel$FreeWifi)
##
## Welch Two Sample t-test
##
## data: hotel$RoomRent and hotel$FreeWifi
## t = 85.853, df = 13231, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 5348.108 5598.024
## sample estimates:
## mean of x mean of y
## 5473.9918380 0.9258615
t.test(hotel$RoomRent, hotel$FreeBreakfast)
##
## Welch Two Sample t-test
##
## data: hotel$RoomRent and hotel$FreeBreakfast
## t = 85.857, df = 13231, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 5348.385 5598.301
## sample estimates:
## mean of x mean of y
## 5473.9918380 0.6491082
t.test(hotel$RoomRent, hotel$HasSwimmingPool)
##
## Welch Two Sample t-test
##
## data: hotel$RoomRent and hotel$HasSwimmingPool
## t = 85.862, df = 13231, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 5348.678 5598.594
## sample estimates:
## mean of x mean of y
## 5473.9918380 0.3558041
t.test(hotel$RoomRent, hotel$IsNewYearEve)
##
## Welch Two Sample t-test
##
## data: hotel$RoomRent and hotel$IsNewYearEve
## t = 85.865, df = 13231, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 5348.910 5598.825
## sample estimates:
## mean of x mean of y
## 5473.9918380 0.1243954
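The p-value is less than 0.05, so we reject the null hypothesis. Note that t.test(hotel$RoomRent, hotel$IsNewYearEve) compares the mean of RoomRent (5473.99) with the mean of the 0/1 indicator (0.12), so the IsNewYearEve coefficient in the regression below (p = 1.23e-06) is the more direct evidence that room rents differ on New Year's Eve.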
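To compare room rents between the two groups defined by an indicator such as IsNewYearEve, the formula interface of t.test can be used. The following is a minimal sketch on made-up data: the toy data frame and its values are assumptions for illustration and are not taken from Cities42.csv.

```r
# Toy data: 50 ordinary dates and 50 New Year's Eve dates (made-up values).
toy <- data.frame(
  IsNewYearEve = rep(c(0, 1), each = 50),
  RoomRent = c(seq(3000, 5000, length.out = 50),
               seq(4000, 6000, length.out = 50))
)

# Welch two-sample t-test of RoomRent between the two groups.
res <- t.test(RoomRent ~ IsNewYearEve, data = toy)

print(res$estimate)  # group means for IsNewYearEve = 0 and IsNewYearEve = 1
print(res$p.value)
```

On the real data, the analogous call would be t.test(RoomRent ~ IsNewYearEve, data = hotel); its result is not reported here because it was not part of the original analysis.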
fit <- lm( RoomRent ~ IsMetroCity + IsTouristDestination + IsNewYearEve + IsWeekend + StarRating + Airport + HasSwimmingPool + FreeWifi + FreeBreakfast, hotel)
summary(fit)
##
## Call:
## lm(formula = RoomRent ~ IsMetroCity + IsTouristDestination +
## IsNewYearEve + IsWeekend + StarRating + Airport + HasSwimmingPool +
## FreeWifi + FreeBreakfast, data = hotel)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10451 -2349 -712 978 310394
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -7726.314 399.765 -19.327 < 2e-16 ***
## IsMetroCity -1867.441 135.859 -13.745 < 2e-16 ***
## IsTouristDestination 2330.496 133.374 17.473 < 2e-16 ***
## IsNewYearEve 887.140 182.783 4.854 1.23e-06 ***
## IsWeekend -96.609 124.494 -0.776 0.4378
## StarRating 3014.799 98.267 30.680 < 2e-16 ***
## Airport 11.310 2.721 4.156 3.26e-05 ***
## HasSwimmingPool 1875.286 156.573 11.977 < 2e-16 ***
## FreeWifi 562.209 224.987 2.499 0.0125 *
## FreeBreakfast 309.581 123.255 2.512 0.0120 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6636 on 13222 degrees of freedom
## Multiple R-squared: 0.1817, Adjusted R-squared: 0.1811
## F-statistic: 326.2 on 9 and 13222 DF, p-value: < 2.2e-16