Store24 (A):-Managing Employee Retention

Task 4c:-

Code to read the data into the dataframe called Store.df:-

setwd("~/Desktop/5 SRM Kashish Mukheja/Downoad content")
Store.df <- read.csv("Store24.csv")
View(Store.df)

Code to get the summary statistics of the data:-

library(psych)
describe(Store.df)
##            vars  n       mean        sd     median    trimmed       mad
## store         1 75      38.00     21.79      38.00      38.00     28.17
## Sales         2 75 1205413.12 304531.31 1127332.00 1182031.25 288422.04
## Profit        3 75  276313.61  89404.08  265014.00  270260.34  90532.00
## MTenure       4 75      45.30     57.67      24.12      33.58     29.67
## CTenure       5 75      13.93     17.70       7.21      10.60      6.14
## Pop           6 75    9825.59   5911.67    8896.00    9366.07   7266.22
## Comp          7 75       3.79      1.31       3.63       3.66      0.82
## Visibility    8 75       3.08      0.75       3.00       3.07      0.00
## PedCount      9 75       2.96      0.99       3.00       2.97      1.48
## Res          10 75       0.96      0.20       1.00       1.00      0.00
## Hours24      11 75       0.84      0.37       1.00       0.92      0.00
## CrewSkill    12 75       3.46      0.41       3.50       3.47      0.34
## MgrSkill     13 75       3.64      0.41       3.59       3.62      0.45
## ServQual     14 75      87.15     12.61      89.47      88.62     15.61
##                  min        max      range  skew kurtosis       se
## store           1.00      75.00      74.00  0.00    -1.25     2.52
## Sales      699306.00 2113089.00 1413783.00  0.71    -0.09 35164.25
## Profit     122180.00  518998.00  396818.00  0.62    -0.21 10323.49
## MTenure         0.00     277.99     277.99  2.01     3.90     6.66
## CTenure         0.89     114.15     113.26  3.52    15.00     2.04
## Pop          1046.00   26519.00   25473.00  0.62    -0.23   682.62
## Comp            1.65      11.13       9.48  2.48    11.31     0.15
## Visibility      2.00       5.00       3.00  0.25    -0.38     0.09
## PedCount        1.00       5.00       4.00  0.00    -0.52     0.11
## Res             0.00       1.00       1.00 -4.60    19.43     0.02
## Hours24         0.00       1.00       1.00 -1.82     1.32     0.04
## CrewSkill       2.06       4.64       2.58 -0.43     1.64     0.05
## MgrSkill        2.96       4.62       1.67  0.27    -0.53     0.05
## ServQual       57.90     100.00      42.10 -0.66    -0.72     1.46

Task 4d:-

Code to measure the mean and standard deviation of Profit:-

mean(Store.df$Profit)
## [1] 276313.6
sd(Store.df$Profit)
## [1] 89404.08

Code to measure the mean and standard deviation of MTenure:-

mean(Store.df$MTenure)
## [1] 45.29644
sd(Store.df$MTenure)
## [1] 57.67155

Code to measure the mean and standard deviation of CTenure:-

mean(Store.df$CTenure)
## [1] 13.9315
sd(Store.df$CTenure)
## [1] 17.69752

Task 4f:-

Code to print the {StoreID, Sales, Profit, MTenure, CTenure} of the top 10 most profitable stores:-

TopProfit<- Store.df[order(-Store.df$Profit),]
head(TopProfit[c("store","Sales","Profit","MTenure","CTenure")],n=10)
##    store   Sales Profit   MTenure    CTenure
## 74    74 1782957 518998 171.09720  29.519510
## 7      7 1809256 476355  62.53080   7.326488
## 9      9 2113089 474725 108.99350   6.061602
## 6      6 1703140 469050 149.93590  11.351130
## 44    44 1807740 439781 182.23640 114.151900
## 2      2 1619874 424007  86.22219   6.636550
## 45    45 1602362 410149  47.64565   9.166325
## 18    18 1704826 394039 239.96980  33.774130
## 11    11 1583446 389886  44.81977   2.036961
## 47    47 1665657 387853  12.84790   6.636550

Code to print the {StoreID, Sales, Profit, MTenure, CTenure} of the bottom 10 least profitable stores:-

BotProfit<- Store.df[order(-Store.df$Profit),]
tail(BotProfit[c("store","Sales","Profit","MTenure","CTenure")],n=10)
##    store   Sales Profit     MTenure   CTenure
## 37    37 1202917 187765  23.1985000  1.347023
## 61    61  716589 177046  21.8184200 13.305950
## 52    52 1073008 169201  24.1185600  3.416838
## 54    54  811190 159792   6.6703910  3.876797
## 13    13  857843 152513   0.6571813  1.577002
## 32    32  828918 149033  36.0792600  6.636550
## 55    55  925744 147672   6.6703910 18.365500
## 41    41  744211 147327  14.9180200 11.926080
## 66    66  879581 146058 115.2039000  3.876797
## 57    57  699306 122180  24.3485700  2.956879
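
The call above sorts in descending order and then takes the tail, so the least profitable store appears last. An equivalent approach, sketched here on a made-up toy data frame (`toy` and its values are illustrative and not taken from Store24.csv), is to sort in ascending order and use `head()`, which lists the least profitable store first.

```r
# Toy data standing in for Store.df (values are made up for illustration).
toy <- data.frame(
  store  = 1:6,
  Profit = c(300, 120, 450, 90, 210, 380)
)

# Ascending order puts the least profitable stores first,
# so head() returns the bottom n directly.
bottom3 <- head(toy[order(toy$Profit), c("store", "Profit")], n = 3)
print(bottom3)
```

On the real data the same pattern would use `Store.df` and `n = 10`.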

Task 4g:-

Code to draw a scatter plot of Profit vs. MTenure:-

library(car)
## 
## Attaching package: 'car'
## The following object is masked from 'package:psych':
## 
##     logit
scatterplot(Profit ~ MTenure, data = Store.df,
            xlab = "MTenure", ylab = "Profit",
            main = "Scatterplot of Profit vs. MTenure",
            labels = row.names(Store.df))

Task 4h:-

Code to draw a scatter plot of Profit vs. CTenure:-

scatterplot(Profit ~ CTenure, data = Store.df,
            xlab = "CTenure", ylab = "Profit",
            main = "Scatterplot of Profit vs. CTenure",
            labels = row.names(Store.df))

Task 4i:-

Code to construct a Correlation Matrix for all the variables in the dataset. (Display the numbers up to 2 Decimal places)

res<-cor(Store.df)
round(res,2)
##            store Sales Profit MTenure CTenure   Pop  Comp Visibility
## store       1.00 -0.23  -0.20   -0.06    0.02 -0.29  0.03      -0.03
## Sales      -0.23  1.00   0.92    0.45    0.25  0.40 -0.24       0.13
## Profit     -0.20  0.92   1.00    0.44    0.26  0.43 -0.33       0.14
## MTenure    -0.06  0.45   0.44    1.00    0.24 -0.06  0.18       0.16
## CTenure     0.02  0.25   0.26    0.24    1.00  0.00 -0.07       0.07
## Pop        -0.29  0.40   0.43   -0.06    0.00  1.00 -0.27      -0.05
## Comp        0.03 -0.24  -0.33    0.18   -0.07 -0.27  1.00       0.03
## Visibility -0.03  0.13   0.14    0.16    0.07 -0.05  0.03       1.00
## PedCount   -0.22  0.42   0.45    0.06   -0.08  0.61 -0.15      -0.14
## Res        -0.03 -0.17  -0.16   -0.06   -0.34 -0.24  0.22       0.02
## Hours24     0.03  0.06  -0.03   -0.17    0.07 -0.22  0.13       0.05
## CrewSkill   0.05  0.16   0.16    0.10    0.26  0.28 -0.04      -0.20
## MgrSkill   -0.07  0.31   0.32    0.23    0.12  0.08  0.22       0.07
## ServQual   -0.32  0.39   0.36    0.18    0.08  0.12  0.02       0.21
##            PedCount   Res Hours24 CrewSkill MgrSkill ServQual
## store         -0.22 -0.03    0.03      0.05    -0.07    -0.32
## Sales          0.42 -0.17    0.06      0.16     0.31     0.39
## Profit         0.45 -0.16   -0.03      0.16     0.32     0.36
## MTenure        0.06 -0.06   -0.17      0.10     0.23     0.18
## CTenure       -0.08 -0.34    0.07      0.26     0.12     0.08
## Pop            0.61 -0.24   -0.22      0.28     0.08     0.12
## Comp          -0.15  0.22    0.13     -0.04     0.22     0.02
## Visibility    -0.14  0.02    0.05     -0.20     0.07     0.21
## PedCount       1.00 -0.28   -0.28      0.21     0.09    -0.01
## Res           -0.28  1.00   -0.09     -0.15    -0.03     0.09
## Hours24       -0.28 -0.09    1.00      0.11    -0.04     0.06
## CrewSkill      0.21 -0.15    0.11      1.00    -0.02    -0.03
## MgrSkill       0.09 -0.03   -0.04     -0.02     1.00     0.36
## ServQual      -0.01  0.09    0.06     -0.03     0.36     1.00

Task 4j:-

Code to measure the correlation between Profit and MTenure. (Display the numbers up to 2 Decimal places)

round(cor(Store.df$Profit,Store.df$MTenure),2)
## [1] 0.44

Code to measure the correlation between Profit and CTenure. (Display the numbers up to 2 Decimal places)

round(cor(Store.df$Profit,Store.df$CTenure),2)
## [1] 0.26

Task 4k:-

Code to construct the following Corrgram based on all variables in the dataset:-

library(corrgram)
library(ellipse)
## 
## Attaching package: 'ellipse'
## The following object is masked from 'package:car':
## 
##     ellipse
## The following object is masked from 'package:graphics':
## 
##     pairs
corrgram(res, order = FALSE,
         lower.panel = panel.shade, upper.panel = panel.pie,
         text.panel = panel.txt,
         main = "Corrgram of store variables")

Task 4l:-

Run a Pearson’s Correlation test on the correlation between Profit and MTenure. What is the p-value?

corpm<-cor.test(Store.df$Profit , Store.df$MTenure, method = "pearson")
corpm$p.value
## [1] 8.193133e-05

p-value = 8.193e-05

Run a Pearson’s Correlation test on the correlation between Profit and CTenure. What is the p-value?

corpc<-cor.test(Store.df$Profit , Store.df$CTenure, method = "pearson")
corpc$p.value
## [1] 0.0256203

p-value = 0.0256203
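
As a rough cross-check of how such a p-value arises, the Pearson test statistic can be computed by hand as t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. The sketch below plugs in the rounded correlation between Profit and MTenure (0.44) and n = 75 from the summary statistics; because r is rounded, the result will only approximately match the `cor.test()` output.

```r
r <- 0.44   # rounded correlation between Profit and MTenure reported above
n <- 75     # number of stores

# Under H0 (true correlation = 0), this statistic follows a t distribution
# with n - 2 degrees of freedom.
t_stat  <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value, matching the default alternative of cor.test().
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

cat("t =", round(t_stat, 3), " p =", signif(p_value, 3), "\n")
```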

Task 4m:-

Run a regression of Profit on {MTenure, CTenure, Comp, Pop, PedCount, Res, Hours24, Visibility}.

fit<-lm(Profit~ MTenure + CTenure + Comp + Pop + PedCount + Res + Hours24 , data = Store.df)
summary(fit)
## 
## Call:
## lm(formula = Profit ~ MTenure + CTenure + Comp + Pop + PedCount + 
##     Res + Hours24, data = Store.df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -107360  -34389   -6529   35039  122875 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  48251.178  60494.260   0.798 0.427911    
## MTenure        791.141    126.085   6.275 2.94e-08 ***
## CTenure        947.183    424.601   2.231 0.029050 *  
## Comp        -25413.478   5529.167  -4.596 1.96e-05 ***
## Pop              3.802      1.473   2.581 0.012043 *  
## PedCount     32266.120   9040.103   3.569 0.000668 ***
## Res          91993.749  39501.555   2.329 0.022891 *  
## Hours24      64414.635  19758.441   3.260 0.001752 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 57360 on 67 degrees of freedom
## Multiple R-squared:  0.6273, Adjusted R-squared:  0.5884 
## F-statistic: 16.11 on 7 and 67 DF,  p-value: 3.151e-12

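Since the p-value of the overall F-test (3.151e-12) is below 0.05, we reject the null hypothesis that all slope coefficients are zero. Each of the seven predictors in the fitted model is also individually significant at the 0.05 level. Note that the model as fitted above does not include Visibility, although the task lists it.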
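
The task statement lists Visibility among the regressors, but the `lm()` call above leaves it out. One way to add a term to an existing fit is `update()`. The sketch below uses simulated stand-in data (the data frame `sim`, its values, and the reduced set of predictors are assumptions for illustration only), so its coefficients say nothing about the real stores.

```r
set.seed(7)
# Simulated stand-in data; only the column names follow the report.
sim <- data.frame(
  MTenure    = runif(60, 0, 200),
  Comp       = runif(60, 1, 6),
  Visibility = sample(2:5, 60, replace = TRUE)
)
sim$Profit <- 200000 + 700 * sim$MTenure - 20000 * sim$Comp +
  rnorm(60, sd = 30000)

fit_small <- lm(Profit ~ MTenure + Comp, data = sim)

# update() refits the model with one extra term added to the formula.
fit_full <- update(fit_small, . ~ . + Visibility)

print(names(coef(fit_full)))
```

On the real data the analogous call would be `update(fit, . ~ . + Visibility)`, followed by `summary()` on the result.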

Task 4n:-

Based on TASK 4m, answer the following questions:
15. List the explanatory variable(s) whose beta-coefficients are statistically significant (p < 0.05).
Ans.] The explanatory variables whose beta-coefficients are statistically significant (p < 0.05) are MTenure, CTenure, Comp, Pop, PedCount, Res and Hours24.
16. List the explanatory variable(s) whose beta-coefficients are not statistically significant (p > 0.05).
Ans.] Visibility is the only explanatory variable whose beta-coefficient is not statistically significant (p > 0.05). The fit shown under Task 4m does not include Visibility, so this answer presumably refers to the model with Visibility added.

Task 4o:-

Based on TASK 4m, answer the following questions:
17. What is the expected change in the Profit at a store if the Manager’s tenure, i.e. number of months of experience with Store24, increases by one month?
Ans.] Profit is expected to increase by 760.993 units for each additional month of Manager’s tenure, holding the other predictors constant. (The fit shown under Task 4m, which omits Visibility, gives 791.141.)
18. What is the expected change in the Profit at a store if the Crew’s tenure, i.e. number of months of experience with Store24, increases by one month?
Ans.] Profit is expected to increase by 944.978 units for each additional month of Crew’s tenure, holding the other predictors constant. (The fit shown under Task 4m, which omits Visibility, gives 947.183.)
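
The "expected change per extra month" reading of a regression coefficient can be checked directly: for two stores that differ only by one month of manager tenure, the difference in predicted profit equals the MTenure coefficient. The sketch below demonstrates this on simulated data (the data frame `sim`, its values, and the two hypothetical stores are assumptions made for illustration).

```r
set.seed(1)
# Simulated stand-in data; the column names mirror the report, the values do not.
sim <- data.frame(MTenure = runif(50, 0, 200), CTenure = runif(50, 0, 60))
sim$Profit <- 100000 + 800 * sim$MTenure + 900 * sim$CTenure +
  rnorm(50, sd = 5000)

fit_sim <- lm(Profit ~ MTenure + CTenure, data = sim)

# Two hypothetical stores that differ only by one month of manager tenure.
base  <- data.frame(MTenure = 40, CTenure = 10)
plus1 <- data.frame(MTenure = 41, CTenure = 10)

# The difference in predictions equals the fitted MTenure coefficient.
change <- predict(fit_sim, plus1) - predict(fit_sim, base)
print(unname(change))
print(unname(coef(fit_sim)["MTenure"]))
```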

Additional Analysis

1. T-test to check the hypothesis “Stores in residential areas have more profit than stores in industrial areas”

t.test(Store.df$Profit[Store.df$Res==1],Store.df$Profit[Store.df$Res==0],alternative = "less")
## 
##  Welch Two Sample t-test
## 
## data:  Store.df$Profit[Store.df$Res == 1] and Store.df$Profit[Store.df$Res == 0]
## t = -1.4991, df = 2.2038, p-value = 0.1307
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
##      -Inf 60012.99
## sample estimates:
## mean of x mean of y 
##  273422.7  345695.7
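
The direction of a one-sided test is set by the `alternative` argument of `t.test()`: with `t.test(x, y, ...)`, `"greater"` takes mean(x) > mean(y) as the alternative hypothesis and `"less"` takes mean(x) < mean(y). The sketch below illustrates this on simulated data (`group_a` and `group_b` are hypothetical and unrelated to the store profits); the two one-sided p-values are complementary.

```r
set.seed(42)
# Simulated profits for two hypothetical groups of stores.
group_a <- rnorm(30, mean = 100, sd = 15)
group_b <- rnorm(30, mean = 110, sd = 15)

# alternative = "greater": H1 is mean(group_a) > mean(group_b)
p_greater <- t.test(group_a, group_b, alternative = "greater")$p.value

# alternative = "less": H1 is mean(group_a) < mean(group_b)
p_less <- t.test(group_a, group_b, alternative = "less")$p.value

cat("p (greater):", p_greater, " p (less):", p_less, "\n")
```

A hypothesis of the form "group x has more profit than group y" therefore corresponds to `alternative = "greater"` when x is passed first.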

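Conclusion:- Here Res is a categorical variable. Since the p-value (0.1307) is greater than 0.05, we fail to reject the null hypothesis. The test as run uses alternative = "less", so it asks whether residential stores earn less than industrial ones, which is the opposite direction to the stated hypothesis. The residential sample mean (273422.7) is below the industrial one (345695.7), so the data give no support to the claim that residential stores are more profitable, and the difference is not statistically significant either way. With only 3 non-residential stores among the 75 (df = 2.2), the test has very little power.

2. Regression between Profit and MTenure, CTenure, Res:-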

fit1<-lm(Profit~ MTenure + CTenure + Res , data = Store.df)
summary(fit1)
## 
## Call:
## lm(formula = Profit ~ MTenure + CTenure + Res, data = Store.df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -165603  -48276   -7952   36649  195254 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 278383.0    52589.0   5.294 1.28e-06 ***
## MTenure        622.9      167.1   3.727 0.000386 ***
## CTenure        652.1      578.1   1.128 0.263089    
## Res         -41009.8    50397.7  -0.814 0.418524    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 80400 on 71 degrees of freedom
## Multiple R-squared:  0.224,  Adjusted R-squared:  0.1912 
## F-statistic: 6.833 on 3 and 71 DF,  p-value: 0.000413

3. T-test to check the hypothesis “Stores open for 24 hours have more profit than stores that are not”

t.test(Store.df$Profit[Store.df$Hours24==1],Store.df$Profit[Store.df$Hours24==0],alternative = "greater")
## 
##  Welch Two Sample t-test
## 
## data:  Store.df$Profit[Store.df$Hours24 == 1] and Store.df$Profit[Store.df$Hours24 == 0]
## t = -0.18413, df = 13.615, p-value = 0.5717
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  -65861.03       Inf
## sample estimates:
## mean of x mean of y 
##  275318.0  281540.4

Conclusion:- Here Hours24 is a categorical variable. Since the p-value (0.5717) is greater than 0.05, we fail to reject the null hypothesis: there is no significant evidence that stores open 24 hours earn more profit than the others.

Task 4p:-

Prepare an “Executive Summary”. Please add this to the end of your Rmd file. Specifically, please create a qualitative summary of Managerial Insights, based on your data analysis, especially your Regression Analysis. You may write this in paragraph form or in point form.
Ans.] From the above regression analyses we can infer the following:-
1. There is a statistically significant association (p < 0.05) between Profit and each of MTenure, CTenure, Comp, Pop, PedCount, Res and Hours24.
2. Profit has a moderate positive correlation with MTenure (0.44) and a weaker positive correlation with CTenure (0.26).
3. There is no statistically significant association between Profit and Visibility.
4. Profit shows a moderate negative correlation with Comp (-0.33), and Comp has a significant negative coefficient in the regression.
5. The R-squared of the regression is 0.6273, so more than a third of the variation in Profit remains unexplained; additional predictors may be needed.
6. Both MTenure and CTenure have significant positive coefficients in the regression, MTenure at the 0.001 level and CTenure at the 0.05 level.
7. There is a weak negative correlation between MTenure and Hours24 (-0.17).
8. Service quality is positively correlated with Profit (0.36) and Manager’s skill (0.36), and more weakly with MTenure (0.18) and CTenure (0.08). Hence, improving service quality through managers’ and crews’ skill might go along with higher profit and better retention, although correlation alone does not establish this.