This multinomial logistic model uses data collected by a travel company on hotel reservations made for its clients. The dataset includes variables such as price and commission earned, as well as client-specific information such as the desired location of the hotel within a certain square-mile radius. The goal is to predict the type of room booked. All results below are for one particular model, which uses commission percentage, the number of days left to cancel the reservation, the number of days booked in advance, and provider name as predictors. Of all the fitted models, this one had the best fit in terms of residual deviance and AIC. Some of the R output and the results from the model are shown below.
The plots below show the distributions of the numerical variables. The variable ‘Days To Cancel’ is concentrated in a very narrow range with a few extreme outliers, so its values are capped.
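The source does not say which thresholds were used for capping. As a minimal sketch, assuming the cap is applied at the 1st and 99th percentiles (an assumption, as are the helper name `cap_outliers` and the synthetic data), capping could look like this in base R:

```r
# Hypothetical helper: cap a numeric vector at chosen quantiles (winsorizing).
# The 1% / 99% thresholds are an assumption, not taken from the report.
cap_outliers <- function(x, lower = 0.01, upper = 0.99) {
  bounds <- quantile(x, probs = c(lower, upper), na.rm = TRUE, names = FALSE)
  pmin(pmax(x, bounds[1]), bounds[2])
}

# Synthetic stand-in for 'Days To Cancel': mostly small values, a few extremes.
set.seed(1)
days_to_cancel <- c(rpois(995, lambda = 2), 180, 240, 365, 400, 730)

capped <- cap_outliers(days_to_cancel)

# Compare the spread before and after capping.
summary(days_to_cancel)
summary(capped)
```

Capping keeps every observation in the data while limiting the leverage of the extreme values, which is why it is often preferred to dropping rows.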
The multinomial logistic model is then estimated, with the following results.
## # weights: 88 (70 variable)
## initial value 40773.689749
## iter 10 value 29231.487055
## iter 20 value 26757.516629
## iter 30 value 25010.096129
## iter 40 value 23191.352424
## iter 50 value 21743.045702
## iter 60 value 19933.012389
## iter 70 value 17872.486022
## iter 80 value 17155.056061
## iter 90 value 16881.400846
## iter 100 value 16798.566166
## final value 16798.566166
## stopped after 100 iterations
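The log above ends with `stopped after 100 iterations`. Assuming the model was fitted with `nnet::multinom` (the log format suggests this, but the call itself is not shown), that message means the optimiser reached its iteration limit, which defaults to 100, rather than reporting convergence. The following is a minimal sketch on simulated data, with hypothetical variable names, of how the limit can be raised through `maxit` and how the `convergence` flag can be checked:

```r
library(nnet)

# Simulated three-class data; x1 and x2 are hypothetical standardized covariates.
set.seed(42)
n <- 600
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
p <- cbind(1, exp(-0.5 + 0.8 * toy$x1), exp(0.3 - 0.6 * toy$x2))
p <- p / rowSums(p)
toy$room_type <- factor(apply(p, 1, function(pr) {
  sample(c("A", "B", "C"), 1, prob = pr)
}))

# A deliberately tiny iteration limit, then a generous one.
fit_short <- multinom(room_type ~ x1 + x2, data = toy, maxit = 3, trace = FALSE)
fit_long  <- multinom(room_type ~ x1 + x2, data = toy, maxit = 500, trace = FALSE)

# convergence is 1 if the iteration limit was reached, 0 otherwise.
fit_short$convergence
fit_long$convergence

# A fit that stopped early generally has a higher deviance.
c(short = fit_short$deviance, long = fit_long$deviance)
```

If the flag is 1, the reported coefficients and standard errors come from an unfinished optimisation and are best treated with caution until the model is refitted with a larger `maxit`.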
| y.level | term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|---|
| DDNR | (Intercept) | 1.269939e+00 | 0.2357378 | 1.013706e+00 | 0.3107229 | 8.000599e-01 | 2.015781e+00 |
| DDNR | ProviderBedsOnline | 4.751408e-01 | 0.4225471 | -1.761092e+00 | 0.0782229 | 2.075629e-01 | 1.087665e+00 |
| DDNR | ProviderBooKing | 7.634502e+34 | 0.4933594 | 1.628034e+02 | 0.0000000 | 2.902902e+34 | 2.007840e+35 |
| DDNR | ProviderGoGlobal | 7.694611e-01 | 0.2833867 | -9.247604e-01 | 0.3550906 | 4.415374e-01 | 1.340929e+00 |
| DDNR | ProviderRateHawk | 0.000000e+00 | NaN | NaN | NaN | NaN | NaN |
| DDNR | ProviderRestel | 4.439713e-01 | 0.4047762 | -2.006036e+00 | 0.0448525 | 2.008208e-01 | 9.815241e-01 |
| DDNR | ProviderTeldar | 3.008218e-01 | 0.3479662 | -3.452167e+00 | 0.0005561 | 1.520968e-01 | 5.949749e-01 |
| DDNR | CommisionPercentage | 0.000000e+00 | 0.2361755 | -1.207480e+02 | 0.0000000 | 0.000000e+00 | 0.000000e+00 |
| DDNR | DaysToCancel | 8.174740e+02 | 0.6717884 | 9.982636e+00 | 0.0000000 | 2.191022e+02 | 3.050010e+03 |
| DDNR | AdvanceBooking | 9.743475e-01 | 0.0264677 | -9.818466e-01 | 0.3261754 | 9.250911e-01 | 1.026227e+00 |
| DDR | (Intercept) | 6.520242e+00 | 0.3166923 | 5.920293e+00 | 0.0000000 | 3.505054e+00 | 1.212922e+01 |
| DDR | ProviderBedsOnline | 2.024770e-02 | 0.7939363 | -4.911873e+00 | 0.0000009 | 4.271500e-03 | 9.597880e-02 |
| DDR | ProviderBooKing | 0.000000e+00 | NaN | NaN | NaN | NaN | NaN |
| DDR | ProviderGoGlobal | 1.658320e-02 | 0.8939510 | -4.585673e+00 | 0.0000045 | 2.875600e-03 | 9.563130e-02 |
| DDR | ProviderRateHawk | 0.000000e+00 | 0.0000000 | -5.631053e+13 | 0.0000000 | 0.000000e+00 | 0.000000e+00 |
| DDR | ProviderRestel | 1.464900e-03 | 0.8161423 | -7.996142e+00 | 0.0000000 | 2.959000e-04 | 7.252700e-03 |
| DDR | ProviderTeldar | 3.043528e-01 | 0.4099529 | -2.901718e+00 | 0.0037112 | 1.362777e-01 | 6.797198e-01 |
| DDR | CommisionPercentage | 0.000000e+00 | 0.2424708 | -3.064543e+02 | 0.0000000 | 0.000000e+00 | 0.000000e+00 |
| DDR | DaysToCancel | 2.753900e+05 | 0.7404843 | 1.691588e+01 | 0.0000000 | 6.451293e+04 | 1.175573e+06 |
| DDR | AdvanceBooking | 9.422913e-01 | 0.0370143 | -1.605887e+00 | 0.1082987 | 8.763519e-01 | 1.013192e+00 |
| DSNR | (Intercept) | 2.114600e-03 | 0.6975084 | -8.829825e+00 | 0.0000000 | 5.389000e-04 | 8.297600e-03 |
| DSNR | ProviderBedsOnline | 1.183538e+00 | 0.1559123 | 1.080790e+00 | 0.2797907 | 8.719057e-01 | 1.606553e+00 |
| DSNR | ProviderBooKing | 9.343628e+00 | 0.0000000 | 6.395492e+12 | 0.0000000 | 9.343628e+00 | 9.343628e+00 |
| DSNR | ProviderGoGlobal | 3.299812e+00 | 0.1083269 | 1.102095e+01 | 0.0000000 | 2.668586e+00 | 4.080348e+00 |
| DSNR | ProviderRateHawk | 0.000000e+00 | NaN | NaN | NaN | NaN | NaN |
| DSNR | ProviderRestel | 4.643915e-01 | 0.1762666 | -4.351517e+00 | 0.0000135 | 3.287349e-01 | 6.560285e-01 |
| DSNR | ProviderTeldar | 7.102536e-01 | 0.1286369 | -2.659681e+00 | 0.0078215 | 5.519726e-01 | 9.139226e-01 |
| DSNR | CommisionPercentage | 6.643898e+17 | 5.4351510 | 7.550415e+00 | 0.0000000 | 1.570414e+13 | 2.810812e+22 |
| DSNR | DaysToCancel | 5.650031e+00 | 0.7789354 | 2.223113e+00 | 0.0262082 | 1.227495e+00 | 2.600650e+01 |
| DSNR | AdvanceBooking | 9.979370e-01 | 0.0068445 | -3.017242e-01 | 0.7628623 | 9.846390e-01 | 1.011415e+00 |
| DSR | (Intercept) | 6.400000e-05 | 0.6573950 | -1.469033e+01 | 0.0000000 | 1.760000e-05 | 2.320000e-04 |
| DSR | ProviderBedsOnline | 1.602527e-01 | 0.7286642 | -2.512821e+00 | 0.0119770 | 3.842070e-02 | 6.684140e-01 |
| DSR | ProviderBooKing | 0.000000e+00 | NaN | NaN | NaN | NaN | NaN |
| DSR | ProviderGoGlobal | 2.221293e-01 | 0.8712552 | -1.726814e+00 | 0.0842011 | 4.027100e-02 | 1.225236e+00 |
| DSR | ProviderRateHawk | 0.000000e+00 | 0.0000000 | -4.772163e+15 | 0.0000000 | 0.000000e+00 | 0.000000e+00 |
| DSR | ProviderRestel | 2.020900e-03 | 0.7970210 | -7.784224e+00 | 0.0000000 | 4.238000e-04 | 9.637800e-03 |
| DSR | ProviderTeldar | 2.683466e+00 | 0.3178244 | 3.105831e+00 | 0.0018974 | 1.439340e+00 | 5.002979e+00 |
| DSR | CommisionPercentage | 1.543467e+12 | 4.6587135 | 6.024207e+00 | 0.0000000 | 1.671065e+08 | 1.425613e+16 |
| DSR | DaysToCancel | 2.480189e+05 | 0.7400724 | 1.678384e+01 | 0.0000000 | 5.814789e+04 | 1.057878e+06 |
| DSR | AdvanceBooking | 1.022640e+00 | 0.0288561 | 7.758274e-01 | 0.4378509 | 9.664077e-01 | 1.082144e+00 |
| OtherNR | (Intercept) | 4.315714e+00 | 0.5044300 | 2.898841e+00 | 0.0037454 | 1.605761e+00 | 1.159910e+01 |
| OtherNR | ProviderBedsOnline | 2.593187e-01 | 0.1087575 | -1.241016e+01 | 0.0000000 | 2.095363e-01 | 3.209286e-01 |
| OtherNR | ProviderBooKing | 8.705562e+35 | 0.5839896 | 1.417053e+02 | 0.0000000 | 2.771431e+35 | 2.734574e+36 |
| OtherNR | ProviderGoGlobal | 2.605211e-01 | 0.0779069 | -1.726511e+01 | 0.0000000 | 2.236291e-01 | 3.034991e-01 |
| OtherNR | ProviderRateHawk | 1.621222e+05 | 0.3605474 | 3.327192e+01 | 0.0000000 | 7.997308e+04 | 3.286555e+05 |
| OtherNR | ProviderRestel | 5.501854e-01 | 0.0888941 | -6.721477e+00 | 0.0000000 | 4.622131e-01 | 6.549014e-01 |
| OtherNR | ProviderTeldar | 2.860313e-01 | 0.0806742 | -1.551493e+01 | 0.0000000 | 2.441988e-01 | 3.350298e-01 |
| OtherNR | CommisionPercentage | 8.538400e-03 | 3.9904708 | -1.193638e+00 | 0.2326195 | 3.400000e-06 | 2.128495e+01 |
| OtherNR | DaysToCancel | 1.680788e+00 | 0.2977218 | 1.744120e+00 | 0.0811382 | 9.377598e-01 | 3.012549e+00 |
| OtherNR | AdvanceBooking | 1.007553e+00 | 0.0050252 | 1.497442e+00 | 0.1342784 | 9.976784e-01 | 1.017526e+00 |
| OtherR | (Intercept) | 3.373000e-02 | 0.5193976 | -6.525575e+00 | 0.0000000 | 1.218720e-02 | 9.335300e-02 |
| OtherR | ProviderBedsOnline | 3.667270e-02 | 0.7196617 | -4.593440e+00 | 0.0000044 | 8.948800e-03 | 1.502865e-01 |
| OtherR | ProviderBooKing | 7.444359e+34 | 0.8416818 | 9.539870e+01 | 0.0000000 | 1.430166e+34 | 3.874969e+35 |
| OtherR | ProviderGoGlobal | 1.725410e-02 | 0.8677726 | -4.678307e+00 | 0.0000029 | 3.149500e-03 | 9.452380e-02 |
| OtherR | ProviderRateHawk | 2.677728e+04 | 0.3592058 | 2.838292e+01 | 0.0000000 | 1.324372e+04 | 5.414059e+04 |
| OtherR | ProviderRestel | 2.716800e-03 | 0.7793395 | -7.581164e+00 | 0.0000000 | 5.898000e-04 | 1.251510e-02 |
| OtherR | ProviderTeldar | 1.062443e+00 | 0.3005546 | 2.015313e-01 | 0.8402831 | 5.894856e-01 | 1.914865e+00 |
| OtherR | CommisionPercentage | 4.604200e-03 | 3.5594911 | -1.511676e+00 | 0.1306164 | 4.300000e-06 | 4.931671e+00 |
| OtherR | DaysToCancel | 2.555965e+05 | 0.7400054 | 1.682603e+01 | 0.0000000 | 5.993233e+04 | 1.090055e+06 |
| OtherR | AdvanceBooking | 9.943564e-01 | 0.0274743 | -2.059972e-01 | 0.8367931 | 9.422277e-01 | 1.049369e+00 |
| SSR | (Intercept) | 1.413012e-01 | 0.5381193 | -3.636482e+00 | 0.0002764 | 4.921510e-02 | 4.056897e-01 |
| SSR | ProviderBedsOnline | 1.303432e-01 | 0.7192120 | -2.833079e+00 | 0.0046102 | 3.183420e-02 | 5.336821e-01 |
| SSR | ProviderBooKing | 0.000000e+00 | 0.0000000 | -2.586047e+92 | 0.0000000 | 0.000000e+00 | 0.000000e+00 |
| SSR | ProviderGoGlobal | 3.467950e-02 | 0.8684947 | -3.870612e+00 | 0.0001086 | 6.321300e-03 | 1.902555e-01 |
| SSR | ProviderRateHawk | 0.000000e+00 | 0.0000000 | -1.397089e+29 | 0.0000000 | 0.000000e+00 | 0.000000e+00 |
| SSR | ProviderRestel | 3.221900e-03 | 0.7810546 | -7.346195e+00 | 0.0000000 | 6.971000e-04 | 1.489190e-02 |
| SSR | ProviderTeldar | 2.850531e+00 | 0.3020752 | 3.467697e+00 | 0.0005249 | 1.576881e+00 | 5.152912e+00 |
| SSR | CommisionPercentage | 0.000000e+00 | 3.7332925 | -6.429057e+00 | 0.0000000 | 0.000000e+00 | 1.000000e-07 |
| SSR | DaysToCancel | 2.543593e+05 | 0.7400064 | 1.681945e+01 | 0.0000000 | 5.964211e+04 | 1.084781e+06 |
| SSR | AdvanceBooking | 9.966835e-01 | 0.0274349 | -1.210864e-01 | 0.9036226 | 9.445058e-01 | 1.051744e+00 |
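The column names resemble the output of `broom::tidy()` with exponentiated estimates, although the call is not shown, so this is an assumption. One row can be checked by hand in base R: if `estimate` is an odds ratio and `std.error` is on the log-odds scale, then the Wald statistic, p-value and 95% interval should follow from those two numbers. A sketch using the DDNR / ProviderRestel row:

```r
# Values copied from the table row: DDNR, ProviderRestel.
odds_ratio <- 0.4439713
std_error  <- 0.4047762   # assumed to be on the log-odds scale

log_odds <- log(odds_ratio)

# Wald z statistic and two-sided p-value.
z       <- log_odds / std_error
p_value <- 2 * pnorm(-abs(z))

# 95% confidence interval, built on the log scale and then exponentiated.
ci <- exp(log_odds + c(-1, 1) * qnorm(0.975) * std_error)

round(c(z = z, p = p_value, conf.low = ci[1], conf.high = ci[2]), 4)
```

The results agree with the `statistic`, `p.value`, `conf.low` and `conf.high` entries in that row, which supports reading the `estimate` column as exponentiated coefficients.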
The estimate column shows the odds ratio for each term: the multiplicative change in the odds of choosing a particular room type rather than the base category (the Single Standard Non-Refundable room) when the provider differs from the reference provider, or when the commission percentage, the number of days left to cancel, or the number of days booked in advance increases by one unit. Most of the provider odds ratios are below 1, and very large values appear only for some providers (such as BooKing or RateHawk). Most of the estimates are statistically significant, although the rows with estimates of essentially zero or of order 1e+34, and those with NaN standard errors, are likely numerically unstable and should be read with caution. The graphs below show the predicted probabilities of choosing each room type against two of the numerical covariates. The probabilities of choosing the refundable rooms (Single Standard, Double Standard, Double Deluxe and Other types) go up as the number of days booked in advance and the number of days available to cancel the reservation go up.
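To illustrate how odds relative to the base category translate into predicted probabilities, here is a small base R sketch using the exponentiated intercepts from the table. It describes a booking with the reference provider and all numerical covariates at zero, which may not correspond to a realistic reservation, so the numbers are illustrative only. The label `base` stands for the Single Standard Non-Refundable room.

```r
# Exponentiated intercepts copied from the table: odds of each room type
# versus the base category when every covariate is at zero / reference level.
odds_vs_base <- c(DDNR = 1.269939, DDR = 6.520242, DSNR = 2.1146e-03,
                  DSR = 6.4e-05, OtherNR = 4.315714, OtherR = 3.373e-02,
                  SSR = 1.413012e-01)

# The base category has odds 1 against itself; each probability is that
# category's odds divided by the total over all categories.
all_odds <- c(base = 1, odds_vs_base)
probs <- all_odds / sum(all_odds)

round(probs, 4)
```

The same calculation applies at any covariate values: multiply each intercept odds by the relevant odds ratios raised to the covariate values, then normalise.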
Below are the plots for residuals against the numerical variables to check model fit.
The residual versus predicted value plots for the Single Standard and Other room types are shown next.
The plots don't look like a random scatter around the mean, as they should. In the residual-versus-variable plots, the residuals are concentrated at a few specific values rather than occurring randomly, which might suggest other influences on room choice that are unaccounted for. In the residual-versus-predicted-value plots, there is a definite trend, with the residuals going down as the predicted probability of choosing the room type goes up. The predictions are therefore more accurate when the probability of choosing the room type is higher, and the model does not do well at predicting room types that the client is less likely to pick.

The model has several shortfalls. It does not account for all the variables involved in deciding on a room type and hence cannot generate the best predictions. The classes are not well separable with the features the model currently uses. More (and better) features are needed, but that data is not currently available. Models with more complex decision boundaries, such as SVMs or deep neural networks, might work better in this case.

In terms of business implications, the model suggests that the Single Standard and unclassified ‘Other’ room types are the most popular, and that one or more unaccounted-for variables may be interacting with commission percentage and affecting the results obtained.
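The report does not state how the residuals were defined. As a sketch under the assumption that they are raw residuals (observed class indicator minus predicted probability), the residual checks discussed above could be reproduced as follows on simulated data. All names are hypothetical, and `nnet::multinom` is assumed as the fitting function.

```r
library(nnet)

# Simulated three-class data with one hypothetical covariate x.
set.seed(7)
n <- 500
x <- rnorm(n)
p <- cbind(1, exp(0.8 * x), exp(-0.5 + 1.2 * x))
p <- p / rowSums(p)
y <- factor(apply(p, 1, function(pr) sample(c("A", "B", "C"), 1, prob = pr)))
toy <- data.frame(x = x, y = y)

fit <- multinom(y ~ x, data = toy, trace = FALSE)

# Fitted probabilities (n x 3) and observed class indicators (n x 3).
pred <- predict(fit, type = "probs")
obs  <- model.matrix(~ y - 1, data = toy)
colnames(obs) <- levels(toy$y)

# Raw residual per observation and class: indicator minus probability.
raw_resid <- obs - pred[, colnames(obs)]

# Binned residuals for class "C": average residual within deciles of the
# predicted probability, which is easier to read than the raw scatter.
bins <- cut(pred[, "C"],
            breaks = quantile(pred[, "C"], probs = seq(0, 1, 0.1)),
            include.lowest = TRUE)
binned <- tapply(raw_resid[, "C"], bins, mean)
round(binned, 3)
```

With this definition each raw residual equals either `1 - p` or `-p`, so a scatter against the predicted probability falls on two downward-sloping lines by construction. Averaging within bins is one way to check whether a trend remains after accounting for that.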