A Machine Learning approach in Revenue Management

Can Machine Learning methods be applied to Revenue Management in the hospitality industry?

Let’s find out with a case study of a small hotel in Ecuador.

This report walks step by step through the creation of a regression model that predicts a suitable ADR (Average Daily Rate) for a given period of the year. Here the period is a week, but the approach can be applied to any time window.

This model is designed specifically for small data sets. Its output is not a specific price but a recommended discount or increase on the tariff. The model uses Random Forest, an ensemble learning method for classification and regression that constructs a multitude of decision trees at training time and, for regression, outputs the mean prediction of the individual trees.
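To make the "mean prediction of the individual trees" concrete, here is a minimal sketch in Python (standard library only). It is not the code behind this report: it is a simplified stand-in for a random forest that bags one-split regression trees (stumps) on a single feature and averages them, without the per-split feature subsampling of a full random forest. The function names and the data are hypothetical.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def fit_stump(xs, ys):
    """Fit a one-split regression tree (stump) on a single numeric feature."""
    best = None
    # Candidate thresholds: every distinct x except the largest,
    # so that both sides of the split are non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: predict the overall mean
        overall = mean(ys)
        return lambda x: overall
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_forest(xs, ys, n_trees=50, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees


def predict_forest(trees, x):
    """Regression output of the ensemble: the mean of the tree predictions."""
    return mean(tree(x) for tree in trees)


# Hypothetical data: week index vs. recommended tariff adjustment in percent.
weeks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
adjustment = [-10, -10, -10, -10, -10, 10, 10, 10, 10, 10]
forest = fit_forest(weeks, adjustment)
print(predict_forest(forest, 2), predict_forest(forest, 9))
```

Averaging many trees trained on resampled data reduces the variance of any single tree, which is one reason the method is attractive when few observations are available.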

The data covers a complete year of operations at the analysed hotel and comes from its booking system.

Variable Forecasting

A very important variable included in the model is the ADR of the Luxury Hotels segment, which is publicly available on the Ecuadorian Tourism website. During training this variable proved relevant, and it is a creative way of compensating for the lack of data: because the tariffs of large luxury hotels rely heavily on Revenue Management approaches, hospitality demand should be implicit in their prices.

First, the Luxury Hotels’ ADR is forecast so that it can be fed into the main algorithm. This is done with ARIMA modelling, a widely known method in econometrics.

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,1)
## Q* = 4.305, df = 9, p-value = 0.8902
## 
## Model df: 1.   Total lags used: 10
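As an illustration of what an ARIMA(0,1,1) model does, the sketch below (Python, standard library only, not the code used for this report) computes one-step-ahead forecasts for the model y[t] - y[t-1] = e[t] + theta * e[t-1] without a constant. The value of theta, the starting rule and the series are hypothetical; the fitted coefficient is not shown in the output above. The sketch also checks a standard identity: this model gives the same forecasts as simple exponential smoothing with alpha = 1 + theta.

```python
def ima11_forecasts(series, theta):
    """One-step-ahead forecasts for y[t] - y[t-1] = e[t] + theta * e[t-1]."""
    forecast = series[0]  # assumption: the first forecast equals the first value
    forecasts = []
    for y in series:
        forecasts.append(forecast)
        error = y - forecast          # one-step forecast error e[t]
        forecast = y + theta * error  # forecast for the next period
    return forecasts, forecast


def ses_next(series, alpha):
    """Simple exponential smoothing; returns the forecast for the next period."""
    level = series[0]
    for y in series:
        level = alpha * y + (1.0 - alpha) * level
    return level


# Hypothetical monthly luxury-segment ADR values and a hypothetical theta.
adr = [100.0, 104.0, 98.0, 101.0, 103.0]
theta = -0.5
in_sample, next_forecast = ima11_forecasts(adr, theta)
print(next_forecast, ses_next(adr, 1.0 + theta))
```

With theta = 0 the model reduces to a random walk, so the forecast is simply the last observed value; the closer theta gets to -1, the more heavily past observations are smoothed.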

The function auto.arima() is very powerful and usually accurate at determining the ARIMA parameters, but it is good practice to use its result as a reference and then try different values that make AIC and BIC as small as possible.

## initial  value 2.162026 
## iter   2 value 1.820634
## iter   3 value 1.769744
## iter   4 value 1.678337
## iter   5 value 1.667284
## iter   6 value 1.606492
## iter   7 value 1.501966
## iter   8 value 1.482634
## iter   9 value 1.479730
## iter  10 value 1.474727
## iter  11 value 1.465515
## iter  12 value 1.461840
## iter  13 value 1.460484
## iter  14 value 1.460024
## iter  15 value 1.459134
## iter  16 value 1.458995
## iter  17 value 1.458951
## iter  18 value 1.458349
## iter  19 value 1.458099
## iter  20 value 1.458093
## iter  21 value 1.458093
## iter  21 value 1.458093
## iter  21 value 1.458093
## final  value 1.458093 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = constant, optim.control = list(trace = trc, REPORT = 1, 
##     reltol = tol))
## 
## Coefficients:
##           ma1     ma2  constant
##       -1.7901  0.7901    0.0172
## s.e.   0.2235  0.2016    0.0199
## 
## sigma^2 estimated as 14.85:  log likelihood = -97.82,  aic = 203.64
## 
## $degrees_of_freedom
## [1] 31
## 
## $ttable
##          Estimate     SE t.value p.value
## ma1       -1.7901 0.2235 -8.0098  0.0000
## ma2        0.7901 0.2016  3.9191  0.0005
## constant   0.0172 0.0199  0.8622  0.3952
## 
## $AIC
## [1] 3.864838
## 
## $AICc
## [1] 3.956236
## 
## $BIC
## [1] 2.996798
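The `aic` figure in the fit output above can be reproduced from the reported log likelihood. The sketch below (Python, standard library only, for illustration) uses the textbook definitions AIC = -2 logL + 2k and BIC = -2 logL + k ln(n). The parameter count k = 4 (ma1, ma2, constant and the innovation variance) matches the reported value; the number of observations n = 34 is an inference from the 31 degrees of freedom plus 3 coefficients, so treat it as an assumption.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 logL + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 logL + k ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


log_lik = -97.82  # from the ARIMA(0,1,2) fit output
k = 4             # ma1, ma2, constant, plus the innovation variance
n = 34            # assumption: 31 degrees of freedom + 3 coefficients

print(round(aic(log_lik, k), 2))  # 203.64, matching the `aic` in the fit output
print(bic(log_lik, k, n))
```

The `$AIC`, `$AICc` and `$BIC` entries in the same output appear to be on a different, rescaled basis, so they should only be compared with values produced the same way for another model, not with the `aic` figure.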

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,2)
## Q* = 4.6684, df = 8, p-value = 0.7924
## 
## Model df: 2.   Total lags used: 10

##                      ME     RMSE      MAE       MPE     MAPE      MASE
## Training set -0.8501674 4.328935 3.482772 -1.058143 3.690139 0.8313273
## Test set      3.9128048 3.912805 3.912805  4.285657 4.285657 0.9339748
##                     ACF1 Theil's U
## Training set -0.09855856        NA
## Test set              NA       NaN

##                      ME     RMSE      MAE       MPE     MAPE     MASE
## Training set -0.8792335 4.385621 3.574708 -1.092797 3.787806 0.853272
## Test set      3.4860220 3.486022 3.486022  3.818206 3.818206 0.832103
##                    ACF1 Theil's U
## Training set -0.2176559        NA
## Test set             NA       NaN

The parameters p = 0, d = 1, q = 2 gave a smaller AIC and BIC but a higher RMSE. For reasons outside the scope of this document, the RMSE result is set aside and these parameter values are used.
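For readers unfamiliar with the columns of the accuracy tables above, the following sketch (Python, standard library only, with hypothetical numbers) computes ME, RMSE, MAE and MAPE using the convention error = actual - forecast. It is an illustration of the definitions, not the routine that produced the tables.

```python
import math


def accuracy(actual, predicted):
    """Basic forecast accuracy measures; error = actual - forecast."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    return {
        "ME": sum(errors) / n,                              # mean error (bias)
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),  # root mean squared error
        "MAE": sum(abs(e) for e in errors) / n,             # mean absolute error
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }


# Hypothetical values: several training points, one held-out point.
print(accuracy([100.0, 96.0, 104.0], [98.0, 99.0, 101.0]))
print(accuracy([100.0], [96.0]))
```

When the held-out set contains a single observation with a positive error, ME, RMSE and MAE coincide. That is consistent with the "Test set" rows above, where the three values are identical, so those test figures rest on very little data.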

Demand forecasting

As in conventional Revenue Management models, the expected demand must be determined before a price can be recommended. Forecasting the demand of a small hotel with common regression methods can be a real challenge, because the data may show no trend, seasonality or cyclicality and so behave like plain white noise. For this reason a Machine Learning method, a Random Forest model, is used.

The tuning log is condensed here. For each of the five folds (Fold1 to Fold5), a model was fitted for every combination of mtry in {2, 11, 20, 29, 38, 48, 57, 66, 75, 85} and splitrule in {variance, extratrees}, with min.node.size fixed at 5.

## Aggregating results
## Selecting tuning parameters
## Fitting mtry = 2, splitrule = variance, min.node.size = 5 on full training set

## Random Forest 
## 
## 255 samples
##   6 predictor
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 52, 51, 51, 50, 51 
## Resampling results across tuning parameters:
## 
##   mtry  splitrule   RMSE      Rsquared     MAE     
##    2    variance    1.675814  0.006994966  1.355533
##    2    extratrees  1.675815  0.006072875  1.356658
##   11    variance    1.730637  0.007140870  1.402845
##   11    extratrees  1.728605  0.008897949  1.402465
##   20    variance    1.754023  0.006416812  1.417889
##   20    extratrees  1.753579  0.009141068  1.419164
##   29    variance    1.761820  0.007476158  1.420380
##   29    extratrees  1.761859  0.009065438  1.422141
##   38    variance    1.772503  0.007215856  1.426317
##   38    extratrees  1.772155  0.009571863  1.430146
##   48    variance    1.778333  0.007633092  1.429860
##   48    extratrees  1.777419  0.010154849  1.429631
##   57    variance    1.786463  0.006773515  1.436016
##   57    extratrees  1.786545  0.010748347  1.437259
##   66    variance    1.796578  0.006555431  1.441382
##   66    extratrees  1.791495  0.010055486  1.439941
##   75    variance    1.795988  0.007355445  1.438707
##   75    extratrees  1.797735  0.009935190  1.443549
##   85    variance    1.795583  0.007309348  1.439665
##   85    extratrees  1.800963  0.009939183  1.443698
## 
## Tuning parameter 'min.node.size' was held constant at a value of 5
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were mtry = 2, splitrule =
##  variance and min.node.size = 5.

## ranger variable importance
## 
##   only 20 most important variables shown (out of 85)
## 
##                  Overall
## Wholeyear_week49  100.00
## Weekendyes         95.86
## Wholeyear_week52   84.98
## Wholeyear_week21   73.03
## Wholeyear_week35   72.00
## Wholeyear_week18   60.06
## Wholeyear_week26   59.79
## Wholeyear_week19   53.37
## National_ADR_Lux   44.40
## Date27             41.55
## Date8              40.82
## Date15             37.52
## Date9              37.00
## Wholeyear_week9    36.42
## Date2              36.19
## Wholeyear_week48   34.58
## Wholeyear_week12   33.89
## Wholeyear_week24   31.37
## Date7              31.34
## Wholeyear_week44   30.40
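The importance scores above are permutation-based (the ranger summary reports "Variable importance mode: permutation"). The idea can be sketched in a few lines of Python (standard library only): shuffle one column, measure how much the prediction error grows, and repeat for every column. This is a simplified, model-agnostic variant evaluated on the full data set, whereas ranger works with out-of-bag samples per tree; the model, the data and the function names are hypothetical.

```python
import random


def mse(model, rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_repeats=20, seed=0):
    """Mean increase in MSE when each column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this column permuted.
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
            increases.append(mse(model, permuted, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical model that depends on column 0 only.
rows = [[float(i), float((i * 7) % 5)] for i in range(10)]
model = lambda r: 3.0 * r[0]
targets = [model(r) for r in rows]

raw = permutation_importance(model, rows, targets)
# Assumption: rescale so that the largest score is 100, similar in spirit
# to the 0-100 scale of the table above.
scaled = [100.0 * v / max(raw) for v in raw]
print(scaled)
```

A column the model never uses gets an importance of zero, because shuffling it cannot change the predictions.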

## Ranger result
## 
## Call:
##  ranger::ranger(dependent.variable.name = ".outcome", data = x,      mtry = param$mtry, min.node.size = param$min.node.size, splitrule = as.character(param$splitrule),      write.forest = TRUE, probability = classProbs, ...) 
## 
## Type:                             Regression 
## Number of trees:                  500 
## Sample size:                      255 
## Number of independent variables:  85 
## Mtry:                             2 
## Target node size:                 5 
## Variable importance mode:         permutation 
## OOB prediction error (MSE):       2.81277 
## R squared (OOB):                  0.002446965

##       RMSE     Rsquared      MAE Resample
## 1 1.678531 2.409955e-03 1.349231    Fold1
## 2 1.655384 2.322109e-03 1.354431    Fold3
## 3 1.677604 2.394917e-02 1.354889    Fold5
## 4 1.682520 4.709294e-06 1.364029    Fold2
## 5 1.685028 6.288887e-03 1.355086    Fold4

Cross-validation shows that the model’s predictions of the number of nights sold are off by about 1.7 rooms (RMSE of roughly 1.68, MAE of roughly 1.36). In absolute terms this is a usable error given the small dataset used for training, although the R-squared close to zero (about 0.007) indicates that the model explains little of the variation in demand. As more observations are included, especially from other hotels, accuracy should improve.
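A useful sanity check for cross-validated errors like the ones above is to compare them with a baseline that always predicts the training mean. The sketch below (Python, standard library only) shows a plain k-fold procedure that works with any `fit` function returning a predictor. It is an illustration under assumed names and made-up data, not the resampling code used for this report.

```python
import math
import random


def k_fold_rmse(xs, ys, fit, k=5, seed=0):
    """Return the held-out RMSE of each fold (requires len(xs) >= k)."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        squared = [(ys[i] - model(xs[i])) ** 2 for i in held_out]
        scores.append(math.sqrt(sum(squared) / len(squared)))
    return scores


def fit_mean(xs, ys):
    """Baseline: ignore the features and always predict the training mean."""
    average = sum(ys) / len(ys)
    return lambda x: average


# Hypothetical data: week index and nights sold.
weeks = list(range(1, 21))
nights = [4, 6, 5, 3, 7, 5, 4, 6, 5, 5, 2, 8, 5, 4, 6, 5, 3, 7, 5, 4]
baseline_scores = k_fold_rmse(weeks, nights, fit_mean)
print(sum(baseline_scores) / len(baseline_scores))
```

If a trained model's cross-validated RMSE is not clearly below the baseline's, the features are contributing little, which is what an R-squared near zero also suggests.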

With the trained model it is possible to predict the desired value: the number of rooms expected to be booked in a specific week, which here is almost 5 rooms.

## [1] 4.893775

Revenue Management model

After forecasting the ADR of luxury hotels and the expected demand, all of the values required by the main model are available, so the next step is to train it. Once again, a Random Forest algorithm is used.

The tuning log is condensed here. For each of the five folds (Fold1 to Fold5), a model was fitted for every combination of mtry in {2, 11, 20, 30, 39, 48, 58, 67, 76, 86} and splitrule in {variance, extratrees}, with min.node.size fixed at 5.

## Aggregating results
## Selecting tuning parameters
## Fitting mtry = 20, splitrule = variance, min.node.size = 5 on full training set

## Random Forest 
## 
## 255 samples
##   7 predictor
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 51, 51, 50, 51, 52 
## Resampling results across tuning parameters:
## 
##   mtry  splitrule   RMSE      Rsquared    MAE     
##    2    variance    35.64537  0.04260288  16.94683
##    2    extratrees  35.67870  0.04238226  17.01525
##   11    variance    35.05880  0.05293725  16.20334
##   11    extratrees  35.08789  0.04656351  16.33825
##   20    variance    35.02170  0.05469900  16.17562
##   20    extratrees  35.14911  0.04806177  16.24721
##   30    variance    35.14785  0.05375290  16.17055
##   30    extratrees  35.08443  0.04823476  16.21399
##   39    variance    35.28313  0.05628401  16.17481
##   39    extratrees  35.35568  0.04615051  16.33081
##   48    variance    35.24087  0.05793983  16.18200
##   48    extratrees  35.40581  0.04714222  16.26177
##   58    variance    35.34334  0.05919189  16.11184
##   58    extratrees  35.67065  0.04423829  16.31402
##   67    variance    35.54663  0.05643594  16.21936
##   67    extratrees  36.01755  0.04191758  16.39337
##   76    variance    35.57972  0.05819911  16.18512
##   76    extratrees  36.09467  0.04067501  16.41347
##   86    variance    35.82114  0.05992753  16.21665
##   86    extratrees  36.21089  0.04207217  16.40301
## 
## Tuning parameter 'min.node.size' was held constant at a value of 5
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were mtry = 20, splitrule =
##  variance and min.node.size = 5.

## ranger variable importance
## 
##   only 20 most important variables shown (out of 86)
## 
##                     Overall
## Night_Sold_Complete  100.00
## Holidayyes            60.61
## National_ADR_Lux      60.32
## Wholeyear_week27      49.80
## Date6                 46.19
## GoodWeatheryes        41.75
## Weekendyes            41.75
## Wholeyear_week52      19.41
## Wholeyear_week41      17.17
## Wholeyear_week51      14.93
## Wholeyear_week22      13.66
## Date5                 13.56
## Wholeyear_week9       12.95
## Wholeyear_week29      12.48
## Date8                 12.08
## Date4                 12.03
## Wholeyear_week11      12.01
## Wholeyear_week21      11.79
## Wholeyear_week35      11.66
## Wholeyear_week30      11.46

## Ranger result
## 
## Call:
##  ranger::ranger(dependent.variable.name = ".outcome", data = x,      mtry = param$mtry, min.node.size = param$min.node.size, splitrule = as.character(param$splitrule),      write.forest = TRUE, probability = classProbs, ...) 
## 
## Type:                             Regression 
## Number of trees:                  500 
## Sample size:                      255 
## Number of independent variables:  86 
## Mtry:                             20 
## Target node size:                 5 
## Variable importance mode:         permutation 
## OOB prediction error (MSE):       1120.516 
## R squared (OOB):                  0.1381651

##       RMSE    Rsquared      MAE Resample
## 1 26.95810 0.006798056 16.54126    Fold2
## 2 35.22445 0.081101753 15.16477    Fold4
## 3 37.58829 0.098947744 16.22267    Fold1
## 4 38.95422 0.022246992 16.75195    Fold3
## 5 36.38342 0.064400471 16.19743    Fold5

On the test set, the predictions differed from the real values by around $35; considering the average ADR of the hotel, this makes the model roughly 80% accurate. However, as the previous graph shows, when the model failed to predict the values it still estimated the direction of the price correctly most of the time, whether it went up or down.
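The two ideas in the paragraph above, error relative to the average ADR and agreement on the direction of the price, can be sketched as follows. This is only an illustration: the vectors `actual` and `predicted` hold made-up values rather than the hotel's data, and reading "accuracy" as one minus the RMSE divided by the mean ADR is an assumption about how the 80% figure was obtained.

```r
# Hypothetical observed and predicted ADR values (illustrative, not the hotel's data)
actual    <- c(120, 135, 110, 150, 140, 125)
predicted <- c(118, 128, 121, 139, 146, 119)

# Root mean squared error, in the same units as the ADR
rmse <- sqrt(mean((actual - predicted)^2))

# Assumed reading of "accuracy": 1 - RMSE relative to the average ADR
relative_error <- rmse / mean(actual)
accuracy       <- 1 - relative_error

# Directional accuracy: share of consecutive periods in which the predicted
# series moves in the same direction (up or down) as the observed series
actual_change        <- diff(actual)
predicted_change     <- diff(predicted)
directional_accuracy <- mean(sign(actual_change) == sign(predicted_change))

print(round(c(rmse = rmse, accuracy = accuracy,
              directional_accuracy = directional_accuracy), 3))
```

Reporting the directional figure next to the RMSE helps show that a model with a sizeable error can still be useful for deciding whether to raise or lower a tariff.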

After fitting the model, the next step, just as before, is to create a data frame containing the values of the predictors for a given period, in this case a specific week.
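A minimal sketch of such a data frame is shown below. The column names are inferred from the variable importance output (Night_Sold_Complete, Holiday, National_ADR_Lux, Wholeyear_week, Date, GoodWeather, Weekend); every value, the factor levels, and the model object name `rf_model` are assumptions made for illustration only.

```r
# One row of hypothetical predictor values for the period to be priced.
# Column names follow the variable importance output; values and factor
# levels are assumptions and must match those used when training the model.
yes_no <- c("no", "yes")

new_week <- data.frame(
  Night_Sold_Complete = 4.89,                                      # forecasted nights sold (illustrative)
  Holiday             = factor("no",  levels = yes_no),
  National_ADR_Lux    = 130,                                       # forecasted luxury-segment ADR (illustrative)
  Wholeyear_week      = factor("27", levels = as.character(1:52)), # assumed week-of-year coding
  Date                = factor("6",  levels = as.character(1:12)), # placeholder levels, meaning not confirmed
  GoodWeather         = factor("yes", levels = yes_no),
  Weekend             = factor("no",  levels = yes_no)
)

str(new_week)

# With a fitted model object (here called `rf_model`, a hypothetical name),
# the recommended ADR would then be obtained with:
# predicted_adr <- predict(rf_model, newdata = new_week)
```

Keeping the factor levels identical to those in the training data matters, because a level the model has never seen cannot be converted into the dummy variables it expects.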

Supporting visualisations

In business, having a model is not enough: to add value, it must be embedded in a concrete process. For this reason, some visualisations were created to support any decision to raise the price or offer a discount in a given week.

As this model is intended to help small and medium hotels manage their tariffs intelligently, it is unlikely that they will have a team of developers or a booking engine capable of fully automating the revenue management process. For this reason, the delivered tool must be understandable even by the owner, who is sometimes the person deciding on changes in fares.

The following graph shows the price recommended by the model and the price from the previous year, with the size of each point representing the number of nights sold on that particular day of the week.

The next visualisation may be the most important one because it suggests the discount or increment to be applied to the tariff. The suggested value can be used exactly as the model indicates, or thresholds can be established on the recommended percentage so that a fixed discount or lift is applied to the price (e.g. any suggested discount greater than 10% is treated as 10%, while any smaller one is treated as 5%).
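The threshold rule from the example above can be sketched as a small helper. The function name `snap_adjustment` and its argument names are hypothetical, and applying the same rule symmetrically to increments is an assumption based on the mention of "a fixed discount or lift".

```r
# Map the model's suggested percentage change to a fixed step.
# Negative values are discounts, positive values are increments.
# Changes larger than `threshold` (in absolute value) become `large_step`;
# smaller non-zero changes become `small_step`; zero stays zero.
snap_adjustment <- function(suggested_pct, threshold = 10,
                            small_step = 5, large_step = 10) {
  step <- ifelse(abs(suggested_pct) > threshold, large_step, small_step)
  sign(suggested_pct) * step
}

# Hypothetical suggestions, in percent
suggested <- c(-2.87, -14.2, 3.5, 12, 0)
print(snap_adjustment(suggested))
# -5 -10 5 10 0
```

Fixed steps like these are easier to communicate to staff and guests than a value such as -2.87%, at the cost of ignoring part of the model's signal.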

As can be seen in the previous graph, the model forecasted a similar demand (same week of previous year = 5, predicted week = 4.89) but suggested a decrease in price (-2.87%). As stated before, this value can also be mapped to a fixed percentage, in which case it would mean a 5% discount.

The next two visualisations give a general overview of how occupancy rates have been behaving relative to the ADR in a given week. The last one in particular gives a clear picture of whether the revenue management approach is adding value to the business: if it is, RevPAR should increase along with Total Revenue. This is easy to see from the greater height of the bars and their lighter colour.
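For reference, the metrics behind these visualisations can be computed as in the sketch below, using the standard definitions: occupancy is room-nights sold over room-nights available, ADR is room revenue over room-nights sold, and RevPAR is room revenue over room-nights available (equivalently, ADR times occupancy). The data frame `weekly` and all of its values are hypothetical.

```r
# Hypothetical weekly figures (illustrative only, not the hotel's data)
weekly <- data.frame(
  week            = c(26, 27, 28),
  rooms_available = c(70, 70, 70),      # room-nights on offer, e.g. 10 rooms x 7 nights (assumed)
  rooms_sold      = c(35, 42, 49),      # room-nights sold
  room_revenue    = c(3500, 4620, 5880) # total room revenue for the week
)

weekly$occupancy <- weekly$rooms_sold   / weekly$rooms_available
weekly$adr       <- weekly$room_revenue / weekly$rooms_sold
weekly$revpar    <- weekly$room_revenue / weekly$rooms_available

print(weekly)
```

Because RevPAR combines price and occupancy, it shows whether a discount that fills more rooms, or a lift that sells fewer, actually improved the result for the week.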

Note:

The model can become more complex as more data is used to train it, which may in turn improve its accuracy. This can be done by adding observations from other hotels or new variables related to finance, social media, website traffic and APIs, among others.

Small and medium hotels sometimes lack proper information systems, so in real life the work may not be limited to building a model but may also involve improving their data strategy.