Store 24 : Managing Employee Retention

This R Markdown file is written to find any correlations between the employee tenure and the sales/profits of a given store. It involves reading a dataset of 75 stores, containing various variables.

Reading of the dataset

setwd("~/Muyeena/Internship/Case Studies/Store 24")
store = read.csv("Store24.csv")
library(psych)
str(store)
## 'data.frame':    75 obs. of  14 variables:
##  $ store     : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Sales     : int  1060294 1619874 1099921 1053860 1227841 1703140 1809256 1378482 2113089 1080979 ...
##  $ Profit    : int  265014 424007 222735 210122 300480 469050 476355 361115 474725 278625 ...
##  $ MTenure   : num  0 86.22 23.89 0 3.88 ...
##  $ CTenure   : num  24.8 6.64 5.03 5.37 6.87 ...
##  $ Pop       : int  7535 8630 9695 2797 20335 16926 17754 20824 26519 16381 ...
##  $ Comp      : num  2.8 4.24 4.49 4.25 1.65 ...
##  $ Visibility: int  3 4 3 4 2 3 2 4 2 4 ...
##  $ PedCount  : int  3 3 3 2 5 4 5 3 4 3 ...
##  $ Res       : int  1 1 1 1 0 1 1 1 1 1 ...
##  $ Hours24   : int  1 1 1 1 1 0 1 1 1 0 ...
##  $ CrewSkill : num  3.56 3.2 3.8 2.06 3.65 3.58 3.94 3.98 3.22 3.54 ...
##  $ MgrSkill  : num  3.15 3.56 4.12 4.1 3.59 ...
##  $ ServQual  : num  86.8 94.7 78.9 100 68.4 ...
head(store)
##   store   Sales Profit   MTenure   CTenure   Pop     Comp Visibility
## 1     1 1060294 265014   0.00000 24.804930  7535 2.797888          3
## 2     2 1619874 424007  86.22219  6.636550  8630 4.235555          4
## 3     3 1099921 222735  23.88854  5.026694  9695 4.494666          3
## 4     4 1053860 210122   0.00000  5.371663  2797 4.253946          4
## 5     5 1227841 300480   3.87737  6.866530 20335 1.651364          2
## 6     6 1703140 469050 149.93590 11.351130 16926 3.184613          3
##   PedCount Res Hours24 CrewSkill MgrSkill  ServQual
## 1        3   1       1      3.56 3.150000  86.84327
## 2        3   1       1      3.20 3.556667  94.73510
## 3        3   1       1      3.80 4.116667  78.94776
## 4        2   1       1      2.06 4.100000 100.00000
## 5        5   0       1      3.65 3.588889  68.42164
## 6        4   1       0      3.58 4.605556  94.73510
attach(store)
## The following object is masked _by_ .GlobalEnv:
## 
##     store

As can be seen above, the dataset is read into a variable called store and the same has been attached for easier access to its variables.


Summary Statistics of R

In this, we get the summary of all the variables of R, and compare it with Exhibit 3 given in the case study.

sum = describe(store)
sum[2:10,c(3,4,8,9)]
##                  mean        sd       min        max
## Sales      1205413.12 304531.31 699306.00 2113089.00
## Profit      276313.61  89404.08 122180.00  518998.00
## MTenure         45.30     57.67      0.00     277.99
## CTenure         13.93     17.70      0.89     114.15
## Pop           9825.59   5911.67   1046.00   26519.00
## Comp             3.79      1.31      1.65      11.13
## Visibility       3.08      0.75      2.00       5.00
## PedCount         2.96      0.99      1.00       5.00
## Res              0.96      0.20      0.00       1.00

This is now compared with the summary data available in Exhibit 3 of the case study.

As can be seen above, both the values are identical.


Analysis of important variables

Here, we analyse the most important variables for this case study, which include -

  • Profit
  • MTenure
  • CTenure

The same has been shown in a table, by executing the following code.

sum[3:5,3:4]
##              mean       sd
## Profit  276313.61 89404.08
## MTenure     45.30    57.67
## CTenure     13.93    17.70

Least and Most Profitable stores

In this task, we arrange the dataset in an ascending/descending order of the Profit, and display data related to it.

The top ten MOST profitable stores are
newdata = store[order(-store$Profit),]  
newdata[1:10,1:5]
##    store   Sales Profit   MTenure    CTenure
## 74    74 1782957 518998 171.09720  29.519510
## 7      7 1809256 476355  62.53080   7.326488
## 9      9 2113089 474725 108.99350   6.061602
## 6      6 1703140 469050 149.93590  11.351130
## 44    44 1807740 439781 182.23640 114.151900
## 2      2 1619874 424007  86.22219   6.636550
## 45    45 1602362 410149  47.64565   9.166325
## 18    18 1704826 394039 239.96980  33.774130
## 11    11 1583446 389886  44.81977   2.036961
## 47    47 1665657 387853  12.84790   6.636550
The top ten LEAST profitable stores are
newdata = store[order(store$Profit),]
newdata[1:10,1:5]
##    store   Sales Profit     MTenure   CTenure
## 57    57  699306 122180  24.3485700  2.956879
## 66    66  879581 146058 115.2039000  3.876797
## 41    41  744211 147327  14.9180200 11.926080
## 55    55  925744 147672   6.6703910 18.365500
## 32    32  828918 149033  36.0792600  6.636550
## 13    13  857843 152513   0.6571813  1.577002
## 54    54  811190 159792   6.6703910  3.876797
## 52    52 1073008 169201  24.1185600  3.416838
## 61    61  716589 177046  21.8184200 13.305950
## 37    37 1202917 187765  23.1985000  1.347023

Scatter Plots

In this task, we first draw scatterplot of Profit vs. MTenure so that we can visualize their relationship.

par(mfrow=c(1,1))
plot(store$Profit~store$MTenure,
     cex = 1,
     col = "blue",
     pch = 20,
     main = "Scatterplot of Profit vs. MTenure", 
     xlab = "MTenure",
     ylab = "Profit")
abline(lm(store$Profit ~ store$MTenure), col = "green")

We now draw scatterplot of Profit vs. CTenure so that we can visualize their relationship too.

par(mfrow=c(1,1))
plot(store$Profit~store$CTenure,
     cex = 1,
     col = "blue",
     pch = 20,
     main = "Scatterplot of Profit vs. CTenure", 
     xlab = "CTenure",
     ylab = "Profit")
abline(lm(store$Profit ~ store$CTenure), col = "green")

Correlation Matrix

In R, we can see the correlations of each variable with every other variable using a correlation matrix. Higher the magnitude, greater the relationship between the two variables.

round(cor(store), 2)
##            store Sales Profit MTenure CTenure   Pop  Comp Visibility
## store       1.00 -0.23  -0.20   -0.06    0.02 -0.29  0.03      -0.03
## Sales      -0.23  1.00   0.92    0.45    0.25  0.40 -0.24       0.13
## Profit     -0.20  0.92   1.00    0.44    0.26  0.43 -0.33       0.14
## MTenure    -0.06  0.45   0.44    1.00    0.24 -0.06  0.18       0.16
## CTenure     0.02  0.25   0.26    0.24    1.00  0.00 -0.07       0.07
## Pop        -0.29  0.40   0.43   -0.06    0.00  1.00 -0.27      -0.05
## Comp        0.03 -0.24  -0.33    0.18   -0.07 -0.27  1.00       0.03
## Visibility -0.03  0.13   0.14    0.16    0.07 -0.05  0.03       1.00
## PedCount   -0.22  0.42   0.45    0.06   -0.08  0.61 -0.15      -0.14
## Res        -0.03 -0.17  -0.16   -0.06   -0.34 -0.24  0.22       0.02
## Hours24     0.03  0.06  -0.03   -0.17    0.07 -0.22  0.13       0.05
## CrewSkill   0.05  0.16   0.16    0.10    0.26  0.28 -0.04      -0.20
## MgrSkill   -0.07  0.31   0.32    0.23    0.12  0.08  0.22       0.07
## ServQual   -0.32  0.39   0.36    0.18    0.08  0.12  0.02       0.21
##            PedCount   Res Hours24 CrewSkill MgrSkill ServQual
## store         -0.22 -0.03    0.03      0.05    -0.07    -0.32
## Sales          0.42 -0.17    0.06      0.16     0.31     0.39
## Profit         0.45 -0.16   -0.03      0.16     0.32     0.36
## MTenure        0.06 -0.06   -0.17      0.10     0.23     0.18
## CTenure       -0.08 -0.34    0.07      0.26     0.12     0.08
## Pop            0.61 -0.24   -0.22      0.28     0.08     0.12
## Comp          -0.15  0.22    0.13     -0.04     0.22     0.02
## Visibility    -0.14  0.02    0.05     -0.20     0.07     0.21
## PedCount       1.00 -0.28   -0.28      0.21     0.09    -0.01
## Res           -0.28  1.00   -0.09     -0.15    -0.03     0.09
## Hours24       -0.28 -0.09    1.00      0.11    -0.04     0.06
## CrewSkill      0.21 -0.15    0.11      1.00    -0.02    -0.03
## MgrSkill       0.09 -0.03   -0.04     -0.02     1.00     0.36
## ServQual      -0.01  0.09    0.06     -0.03     0.36     1.00

Correlations

Based on the above table, we can just measure and display the correlation between Profit and MTenure as follows :

0.44

Similarly, we have the correlation between Profit and CTenure as :

0.26


Corrgram

In R, we can construct visualizations called Corrgram, which is basically a visual representation of the correlation matrix made above. This is an important way to visualize the data.

library(corrplot)
## corrplot 0.84 loaded
library(gplots)
## 
## Attaching package: 'gplots'
## The following object is masked from 'package:stats':
## 
##     lowess
library(corrgram)

corrgram(store, lower.panel=panel.shade,
         upper.panel=panel.pie, text.panel=panel.txt,
         main="Corrgram of stores intercorrelations")

As can be seen in the above data, Profit has a high correlation with -

  • Sales
  • Mtenure
  • Pop
  • PedCount

Whereas, the following variables have somewhat an effect on Profit -

  • CTenure
  • Visibility
  • CrewSkill
  • MgrSkill
  • ServQual

Pearson’s Correlation Tests

Here, we run the pearson’s correlation test to find any relationship between the profit of each store, and the manager/crew tenure. We also make a note of the p-value as it is essential in rejecting or accepting a hypothesis.

Profit and MTenure :

The Null hypothesis states that :

There is no relationship between Profit and Manager’s Tenure.

m = cor.test(store$Profit, store$MTenure)
m
## 
##  Pearson's product-moment correlation
## 
## data:  store$Profit and store$MTenure
## t = 4.1731, df = 73, p-value = 8.193e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.2353497 0.6055175
## sample estimates:
##       cor 
## 0.4388692

The p-value thus obtained is 8.210^{-5} which is very small. Therefore, we can reject the null hypothesis.

Profit and CTenure

In terms of crew’s tenure, the hypothesis will be as such :

There is no relationship between Profit and Crew’s Tenure.

c = cor.test(store$Profit, store$CTenure)
c
## 
##  Pearson's product-moment correlation
## 
## data:  store$Profit and store$CTenure
## t = 2.2786, df = 73, p-value = 0.02562
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.03262507 0.45786339
## sample estimates:
##       cor 
## 0.2576789

The p-value thus obtained is 0.02562 This value is less than 0.05, which means we can reject the null hypothesis. But, it also show’s that a Crew Member’s tenure will not have that significant an impact on the profit of the store.


Regression Analysis

The major task asked in the case study was to see the effect of a Manager’s and Crew’s tenure in the organization on the sales of their store. This had to be calculated, while also accounting for the store location, popularity, etc.

We can run a basic linear regression model to get an estimated effect of the tenure’s of the employees on the profit of the stores.

This can be carried out as follows :

model = Profit ~ MTenure + CTenure + Comp + PedCount + Res + Hours24 + Visibility

reg = lm(model, data = store)
summary(reg)
## 
## Call:
## lm(formula = model, data = store)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -117573  -31547   -3433   26249  115426 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  19521.5    69215.5   0.282  0.77878    
## MTenure        721.6      131.0   5.510 6.16e-07 ***
## CTenure       1006.9      437.1   2.303  0.02437 *  
## Comp        -27692.7     5615.0  -4.932 5.66e-06 ***
## PedCount     46618.9     7854.5   5.935 1.15e-07 ***
## Res          87331.6    40701.8   2.146  0.03553 *  
## Hours24      59035.6    20321.9   2.905  0.00497 ** 
## Visibility   14140.2     9416.1   1.502  0.13787    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 59160 on 67 degrees of freedom
## Multiple R-squared:  0.6036, Adjusted R-squared:  0.5622 
## F-statistic: 14.57 on 7 and 67 DF,  p-value: 2.271e-11
Explanatory Variables

The following variables, have a significant effect (p<0.05) on the profit of the store :

  • MTenure
  • CTenure
  • Comp
  • PedCount
  • Res
  • Hours24

The explanatory variable which does not have a significant effect (p>0.05) on the profit of the store is :

  • Visibility
Beta Co-effecients

The expected change in the Profit at a store, if the Manager’s tenure i.e. number of months of experience with Store24, increases by one month is 721.58.

The expected change in the Profit at a store, if the Crew’s tenure i.e. number of months of experience with Store24, increases by one month is 1006.91

Executive Summary

After all the tasks carried out on the given dataset, when can infer a lot of important points. Some of the important points are -

  • Manager’s tenure in the organization has a great impact on the overall profit of Store24. A very low p-value indicates that the two of them are highly related.

  • A crew member’s tenure will also have an impact on the profit of the store, but it’s p-value is not as high as compared to the manager’s tenure.

  • In terms of overall impact, a crew member’s tenure is supposed to have greater impact than a manager’s tenure.

  • Apart from the manager’s tenure; the total pedestrian footfall in the store area, the loaction of the store (residential or industrial) and the working hours of the store also have an impact on it’s profit.

  • The number of competitors in its Vicinity, has a great negative impact on the profit garnered from a store.

  • Visibility does not have any significant impact on the profit of the store.