Operation Analytics - Casino

Import Libraries
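
The library chunk itself is not shown in this report. Judging only from the functions called later (datatable(), corrplot(), Desc(), stargazer() and the %>% pipe), the packages below are a plausible set. This is an inference from those calls, not the authors' original chunk.

```r
# Packages inferred from the calls used later in this report (an assumption):
# DT -> datatable(), corrplot -> corrplot(), DescTools -> Desc(),
# stargazer -> stargazer(), magrittr -> %>%
pkgs <- c("DT", "corrplot", "DescTools", "stargazer", "magrittr")

# Check which packages are installed without attaching them
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
}

# Attach the packages that are available
for (p in pkgs[installed]) library(p, character.only = TRUE)
```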

Import Data

Explore Data

datatable(dataCasino)
str(dataCasino)
## Classes 'tbl_df', 'tbl' and 'data.frame':    5000 obs. of  9 variables:
##  $ ID         : chr  "Player 1" "Player 2" "Player 3" "Player 4" ...
##  $ Slots      : int  1013 68 148 63 92 658 358 139 673 65 ...
##  $ BJ         : int  6190 23 0 17 44 0 359 0 279 15 ...
##  $ Craps      : int  4276 23 0 28 18 0 172 0 103 10 ...
##  $ Bac        : int  868 12 0 9 10 0 103 0 103 12 ...
##  $ Bingo      : int  0 0 0 0 0 106 0 0 0 0 ...
##  $ Poker      : int  0 28 0 23 26 0 258 0 161 26 ...
##  $ Other      : int  0 53 0 52 60 0 422 0 546 69 ...
##  $ Total Spend: int  12348 207 148 193 250 764 1671 139 1866 196 ...

There are 5,000 observations of 9 variables. The first column holds the player ID (in running order), the next 7 columns hold the amount spent on each type of game, and the last column holds each player's total spend.
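
As a quick sanity check on that structure, the sketch below rebuilds the first ten rows from the str() output above and compares Total Spend with the sum of the seven game columns. The name sample10 is introduced here for illustration only. On these rows the two agree to within one unit, which may reflect rounding in the source data, though that is a guess.

```r
# First ten rows, copied from the str() output above
sample10 <- data.frame(
  Slots = c(1013, 68, 148, 63, 92, 658, 358, 139, 673, 65),
  BJ    = c(6190, 23, 0, 17, 44, 0, 359, 0, 279, 15),
  Craps = c(4276, 23, 0, 28, 18, 0, 172, 0, 103, 10),
  Bac   = c(868, 12, 0, 9, 10, 0, 103, 0, 103, 12),
  Bingo = c(0, 0, 0, 0, 0, 106, 0, 0, 0, 0),
  Poker = c(0, 28, 0, 23, 26, 0, 258, 0, 161, 26),
  Other = c(0, 53, 0, 52, 60, 0, 422, 0, 546, 69),
  `Total Spend` = c(12348, 207, 148, 193, 250, 764, 1671, 139, 1866, 196),
  check.names = FALSE
)

games <- c("Slots", "BJ", "Craps", "Bac", "Bingo", "Poker", "Other")

# Difference between the reported total and the sum of the game columns
gap <- sample10$`Total Spend` - rowSums(sample10[, games])
print(gap)
```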

Correlation Clusters

Let’s take a look at the correlation clusters. Bingo falls into one cluster, and Poker and Other into another. The remaining games (Slots, Bac, Craps and BJ) fall into the third cluster. The correlation between BJ, Bac and Craps in particular is very high, so people tend to play these games together. Bingo, on the other hand, has low correlations with the other games, suggesting that people who play Bingo tend to stick to Bingo.

cor(dataCasino[,2:9]) %>% corrplot(order = "hclust", addrect = 3, method ="number", tl.cex=0.9)
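
The grouping drawn by order = "hclust" with addrect = 3 can be approximated in base R. The sketch below assumes that corrplot clusters on the distance 1 - correlation with complete linkage, which is my understanding of its default and should be checked against the package documentation. It uses only the ten rows visible in the str() output (sample10 is an illustrative name), so the groups it returns need not match the full-data plot.

```r
# First ten rows, copied from the str() output above
sample10 <- data.frame(
  Slots = c(1013, 68, 148, 63, 92, 658, 358, 139, 673, 65),
  BJ    = c(6190, 23, 0, 17, 44, 0, 359, 0, 279, 15),
  Craps = c(4276, 23, 0, 28, 18, 0, 172, 0, 103, 10),
  Bac   = c(868, 12, 0, 9, 10, 0, 103, 0, 103, 12),
  Bingo = c(0, 0, 0, 0, 0, 106, 0, 0, 0, 0),
  Poker = c(0, 28, 0, 23, 26, 0, 258, 0, 161, 26),
  Other = c(0, 53, 0, 52, 60, 0, 422, 0, 546, 69),
  `Total Spend` = c(12348, 207, 148, 193, 250, 764, 1671, 139, 1866, 196),
  check.names = FALSE
)

corr <- cor(sample10)

# Treat 1 - correlation as a distance and cluster hierarchically
hc <- hclust(as.dist(1 - corr), method = "complete")

# Cut the tree into three groups, mirroring addrect = 3
groups <- cutree(hc, k = 3)
print(split(names(groups), groups))
```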

Visualisation

Let’s take a quick look at the distributions of the variables. There are outliers, and the distributions are highly right-skewed.

Desc(dataCasino$Slots, main = "Slots", plotit = TRUE)
## ------------------------------------------------------------------------- 
## Slots
## 
##     length       n    NAs  unique      0s    mean  meanCI
##      5'000   5'000      0     992     625  291.77  282.71
##             100.0%   0.0%           12.5%          300.84
##                                                          
##        .05     .10    .25  median     .75     .90     .95
##       0.00    0.00  62.00  104.00  507.00  794.00  935.05
##                                                          
##      range      sd  vcoef     mad     IQR    skew    kurt
##   1'861.00  326.86   1.12  154.19  445.00    1.27    1.01
##                                                          
## lowest : 0 (625), 9, 10, 11, 13 (2)
## highest: 1'673, 1'697, 1'753, 1'854, 1'861

Desc(dataCasino$BJ, main = "BJ", plotit = TRUE)
## ------------------------------------------------------------------------- 
## BJ
## 
##     length       n    NAs  unique      0s    mean    meanCI
##      5'000   5'000      0     735   2'084  283.29    258.36
##             100.0%   0.0%           41.7%            308.22
##                                                            
##        .05     .10    .25  median     .75     .90       .95
##       0.00    0.00   0.00   24.00  190.00  404.10  1'768.35
##                                                            
##      range      sd  vcoef     mad     IQR    skew      kurt
##   7'294.00  899.17   3.17   35.58  190.00    4.52     20.87
##                                                            
## lowest : 0 (2'084), 3, 4 (3), 5 (2), 7 (3)
## highest: 6'778, 6'839, 7'052, 7'104, 7'294

Desc(dataCasino$Craps, main = "Craps", plotit = TRUE)
## ------------------------------------------------------------------------- 
## Craps
## 
##     length       n    NAs  unique      0s    mean    meanCI
##      5'000   5'000      0     619   2'128  267.63    241.74
##             100.0%   0.0%           42.6%            293.52
##                                                            
##        .05     .10    .25  median     .75     .90       .95
##       0.00    0.00   0.00   16.00  117.50  265.10  1'942.65
##                                                            
##      range      sd  vcoef     mad     IQR    skew      kurt
##   7'251.00  933.88   3.49   23.72  117.50    4.44     19.45
##                                                            
## lowest : 0 (2'128), 2, 3, 4 (4), 5 (7)
## highest: 6'749, 6'869, 7'059, 7'111, 7'251

Desc(dataCasino$Bac, main = "Bac", plotit = TRUE)
## ------------------------------------------------------------------------- 
## Bac
## 
##     length       n    NAs  unique     0s    mean  meanCI
##      5'000   5'000      0     438  2'180   82.07   75.22
##             100.0%   0.0%          43.6%           88.92
##                                                         
##        .05     .10    .25  median    .75     .90     .95
##       0.00    0.00   0.00    7.00  35.00  122.00  620.05
##                                                         
##      range      sd  vcoef     mad    IQR    skew    kurt
##   2'254.00  247.14   3.01   10.38  35.00    4.20   18.09
##                                                         
## lowest : 0 (2'180), 1, 2 (7), 3 (11), 4 (31)
## highest: 1'752, 1'807, 1'893, 1'990, 2'254

Desc(dataCasino$Bingo, main = "Bingo", plotit = TRUE)
## ------------------------------------------------------------------------- 
## Bingo
## 
##   length       n    NAs  unique     0s   mean  meanCI
##    5'000   5'000      0     135  4'506  10.09    9.20
##           100.0%   0.0%          90.1%          10.97
##                                                      
##      .05     .10    .25  median    .75    .90     .95
##     0.00    0.00   0.00    0.00   0.00   0.00  100.05
##                                                      
##    range      sd  vcoef     mad    IQR   skew    kurt
##   213.00   31.94   3.17    0.00   0.00   3.12    8.67
##                                                      
## lowest : 0 (4'506), 5, 17, 23, 24
## highest: 172 (2), 175, 180, 184, 213

Desc(dataCasino$Poker, main = "Poker", plotit = TRUE)
## ------------------------------------------------------------------------- 
## Poker
## 
##   length       n    NAs  unique     0s    mean  meanCI
##    5'000   5'000      0     373  2'395   54.59   51.66
##           100.0%   0.0%          47.9%           57.53
##                                                       
##      .05     .10    .25  median    .75     .90     .95
##     0.00    0.00   0.00   11.00  27.00  214.00  260.00
##                                                       
##    range      sd  vcoef     mad    IQR    skew    kurt
##   914.00  105.86   1.94   16.31  27.00    2.78    9.90
##                                                       
## lowest : 0 (2'395), 1, 3, 4 (4), 5 (7)
## highest: 749, 812, 828, 860, 914

Desc(dataCasino$Other, main = "Other", plotit = TRUE)
## ------------------------------------------------------------------------- 
## Other
## 
##     length       n    NAs  unique     0s    mean  meanCI
##      5'000   5'000      0     596  2'278  132.97  126.94
##             100.0%   0.0%          45.6%          139.00
##                                                         
##        .05     .10    .25  median    .75     .90     .95
##       0.00    0.00   0.00   34.50  74.00  536.00  610.05
##                                                         
##      range      sd  vcoef     mad    IQR    skew    kurt
##   1'025.00  217.51   1.64   51.15  74.00    1.57    1.04
##                                                         
## lowest : 0 (2'278), 3, 4, 6, 7
## highest: 899 (2), 901, 962, 972, 1'025

Desc(dataCasino$`Total Spend`, main = "Total Spend", plotit = TRUE)
## ------------------------------------------------------------------------- 
## Total Spend
## 
##      length         n     NAs  unique      0s      mean    meanCI
##       5'000     5'000       0   1'597       0  1'122.42  1'060.70
##                100.0%    0.0%            0.0%            1'184.14
##                                                                  
##         .05       .10     .25  median     .75       .90       .95
##       87.00    113.00  185.00  336.50  757.00  2'185.20  6'740.10
##                                                                  
##       range        sd   vcoef     mad     IQR      skew      kurt
##   15'569.00  2'226.22    1.98  292.81  572.00      3.66     13.49
##                                                                  
## lowest : 13, 16, 18, 25, 27
## highest: 14'330, 14'333, 14'394, 14'568, 15'582
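
The skew and kurt figures above can be cross-checked by hand. The sketch below applies the plain moment-based formulas to the ten BJ values visible in the str() output. Desc() may apply a different bias correction, so the numbers are not expected to match its output exactly; the point is only that the mean sits far above the median and the skewness is positive.

```r
# First ten BJ values, copied from the str() output above
bj <- c(6190, 23, 0, 17, 44, 0, 359, 0, 279, 15)

m <- mean(bj)

# Moment-based skewness and excess kurtosis (no bias correction)
skew <- mean((bj - m)^3) / mean((bj - m)^2)^1.5
excess_kurt <- mean((bj - m)^4) / mean((bj - m)^2)^2 - 3

print(c(mean = m, median = median(bj), share_zero = mean(bj == 0),
        skew = skew, excess_kurt = excess_kurt))
```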

We are interested in the impact of removing Bingo. Note that Bingo contributes only about 0.9% of total spend. The bulk of total spend, about 75%, comes from Slots, BJ and Craps.

apply(dataCasino[,2:9],2,sum)
##       Slots          BJ       Craps         Bac       Bingo       Poker 
##     1458871     1416453     1338136      410343       50432      272961 
##       Other Total Spend 
##      664869     5612088
(bingoPct <- sum(dataCasino$Bingo) / sum(dataCasino$`Total Spend`) * 100)
## [1] 0.8986317
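
The same calculation extends to every game. The sketch below reuses the column sums printed above, so it runs without the original data file; the names game_totals and share are introduced here for illustration.

```r
# Column sums copied from the apply() output above
game_totals <- c(Slots = 1458871, BJ = 1416453, Craps = 1338136,
                 Bac = 410343, Bingo = 50432, Poker = 272961, Other = 664869)
total_spend <- 5612088

# Each game's share of total spend, in percent, largest first
share <- sort(game_totals / total_spend * 100, decreasing = TRUE)
print(round(share, 1))

# Combined share of the three largest games
print(round(sum(share[1:3]), 1))
```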

Linear Regression with Interaction Terms

fit <- lm(`Total Spend` ~ Bingo*Slots + Bingo*BJ + Bingo*Craps, data = dataCasino)
stargazer(fit, type = "text",
          dep.var.labels.include = TRUE, column.labels = c("Linear", "LinearInteraction"))
## 
## ================================================
##                         Dependent variable:     
##                     ----------------------------
##                            `Total Spend`        
##                                Linear           
## ------------------------------------------------
## Bingo                          0.296            
##                               (0.347)           
##                                                 
## Slots                         1.685***          
##                               (0.013)           
##                                                 
## BJ                            1.061***          
##                               (0.009)           
##                                                 
## Craps                         1.017***          
##                               (0.009)           
##                                                 
## Bingo:Slots                  -0.006***          
##                               (0.001)           
##                                                 
## Bingo:BJ                                        
##                                                 
##                                                 
## Bingo:Craps                                     
##                                                 
##                                                 
## Constant                     79.622***          
##                               (4.834)           
##                                                 
## ------------------------------------------------
## Observations                   5,000            
## R2                             0.988            
## Adjusted R2                    0.988            
## Residual Std. Error     248.364 (df = 4994)     
## F Statistic         79,330.240*** (df = 5; 4994)
## ================================================
## Note:                *p<0.1; **p<0.05; ***p<0.01
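
In an R formula, a*b is shorthand for a + b + a:b, so the formula passed to lm() requests four main effects and three Bingo interaction terms. The sketch below lists that expansion with base R's terms(); it needs no data and fits no model.

```r
# The model formula as written in the lm() call above
f <- `Total Spend` ~ Bingo * Slots + Bingo * BJ + Bingo * Craps

# Expand the * shorthand into individual model terms
term_labels <- attr(terms(f), "term.labels")
print(term_labels)

# An equivalent, more compact way to write the same right-hand side
f_compact <- `Total Spend` ~ Bingo * (Slots + BJ + Craps)
print(setequal(term_labels, attr(terms(f_compact), "term.labels")))
```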

Study Group 2

19 August 2017