title: “Employee Retention Strategy Case” author: “Puneet Rajput” output: html_document —

Q.1) Find out the summary of the data frame.

store.df <- read.csv(paste("Store24.csv", sep = ""))
View(store.df)
summary(store.df)

##      store          Sales             Profit          MTenure      
##  Min.   : 1.0   Min.   : 699306   Min.   :122180   Min.   :  0.00  
##  1st Qu.:19.5   1st Qu.: 984579   1st Qu.:211004   1st Qu.:  6.67  
##  Median :38.0   Median :1127332   Median :265014   Median : 24.12  
##  Mean   :38.0   Mean   :1205413   Mean   :276314   Mean   : 45.30  
##  3rd Qu.:56.5   3rd Qu.:1362388   3rd Qu.:331314   3rd Qu.: 50.92  
##  Max.   :75.0   Max.   :2113089   Max.   :518998   Max.   :277.99  
##     CTenure              Pop             Comp          Visibility  
##  Min.   :  0.8871   Min.   : 1046   Min.   : 1.651   Min.   :2.00  
##  1st Qu.:  4.3943   1st Qu.: 5616   1st Qu.: 3.151   1st Qu.:3.00  
##  Median :  7.2115   Median : 8896   Median : 3.629   Median :3.00  
##  Mean   : 13.9315   Mean   : 9826   Mean   : 3.788   Mean   :3.08  
##  3rd Qu.: 17.2156   3rd Qu.:14104   3rd Qu.: 4.230   3rd Qu.:4.00  
##  Max.   :114.1519   Max.   :26519   Max.   :11.128   Max.   :5.00  
##     PedCount         Res          Hours24       CrewSkill    
##  Min.   :1.00   Min.   :0.00   Min.   :0.00   Min.   :2.060  
##  1st Qu.:2.00   1st Qu.:1.00   1st Qu.:1.00   1st Qu.:3.225  
##  Median :3.00   Median :1.00   Median :1.00   Median :3.500  
##  Mean   :2.96   Mean   :0.96   Mean   :0.84   Mean   :3.457  
##  3rd Qu.:4.00   3rd Qu.:1.00   3rd Qu.:1.00   3rd Qu.:3.655  
##  Max.   :5.00   Max.   :1.00   Max.   :1.00   Max.   :4.640  
##     MgrSkill        ServQual     
##  Min.   :2.957   Min.   : 57.90  
##  1st Qu.:3.344   1st Qu.: 78.95  
##  Median :3.589   Median : 89.47  
##  Mean   :3.638   Mean   : 87.15  
##  3rd Qu.:3.925   3rd Qu.: 99.90  
##  Max.   :4.622   Max.   :100.00

Q.2) Find out the mean and standard deviation of Profit, MTenure, CTenure.

mean(store.df$Profit)

## [1] 276313.6

sd(store.df$Profit)

## [1] 89404.08

mean(store.df$MTenure)

## [1] 45.29644

sd(store.df$MTenure)

## [1] 57.67155

mean(store.df$CTenure)

## [1] 13.9315

sd(store.df$CTenure)

## [1] 17.69752

Q.3) Sort the mpg of mtcars in ascending and descending order.

attach(mtcars)
View(mtcars)
newdata <- mtcars[order(mpg),]
View(newdata)
pewdata <- mtcars[order(-mpg),]
View(pewdata)
detach(mtcars)

Q.4) Print {StoreID, Sales, Profit, MTenure, CTenure} of top 10 most profitable stores.

attach(store.df)
topmostdata.df <- store.df[order(-Profit),]
topmostdata.df[1:10, c('store','Sales','Profit','MTenure','CTenure')]

##    store   Sales Profit   MTenure    CTenure
## 74    74 1782957 518998 171.09720  29.519510
## 7      7 1809256 476355  62.53080   7.326488
## 9      9 2113089 474725 108.99350   6.061602
## 6      6 1703140 469050 149.93590  11.351130
## 44    44 1807740 439781 182.23640 114.151900
## 2      2 1619874 424007  86.22219   6.636550
## 45    45 1602362 410149  47.64565   9.166325
## 18    18 1704826 394039 239.96980  33.774130
## 11    11 1583446 389886  44.81977   2.036961
## 47    47 1665657 387853  12.84790   6.636550

Q.4) Print {StoreID, Sales, Profit, MTenure, CTenure} of top 10 least profitable stores.

leastmost.df <- store.df[order(Profit),]
leastmost.df[1:10, c('store','Sales','Profit','MTenure','CTenure')]

##    store   Sales Profit     MTenure   CTenure
## 57    57  699306 122180  24.3485700  2.956879
## 66    66  879581 146058 115.2039000  3.876797
## 41    41  744211 147327  14.9180200 11.926080
## 55    55  925744 147672   6.6703910 18.365500
## 32    32  828918 149033  36.0792600  6.636550
## 13    13  857843 152513   0.6571813  1.577002
## 54    54  811190 159792   6.6703910  3.876797
## 52    52 1073008 169201  24.1185600  3.416838
## 61    61  716589 177046  21.8184200 13.305950
## 37    37 1202917 187765  23.1985000  1.347023

Q.5) Visualise variations of MTEnure and Profit through scatterplot.

plot(Profit ~ MTenure, data= store.df, main="ScatterPlot of MTenure VS Profit")
abline(lm(store.df$Profit~store.df$MTenure))

Q.6) Visualise the variations of Profit with CTenure.

plot(Profit ~ CTenure, data= store.df, main="ScatterPlot of CTenure VS Profit")
abline(lm(store.df$Profit~store.df$CTenure))

Q.7) Obtain the correlation matrix between all the variables.

library(corrplot)
corrplot(corr = cor(store.df[,], use="complete.obs"), method="ellipse")

Q.8) Calculate the correlation between Profit and MTenure.

 cor(store.df[,c("Profit", "MTenure")])

##            Profit   MTenure
## Profit  1.0000000 0.4388692
## MTenure 0.4388692 1.0000000

Q.9) Calculate the correlation between Profit and CTenure.

 cor(store.df[,c("Profit","CTenure")])

##            Profit   CTenure
## Profit  1.0000000 0.2576789
## CTenure 0.2576789 1.0000000

Q.10) Obtain corrgram using all the variables.

library(corrgram)
corrgram(store.df, upper.panel = panel.pie, lower.panel = panel.shade, text.panel = panel.txt, main="Corrgram of Employees Retention Case Review")

Q.11) Run pearson correlation test on Profit Vs. MTenure. What is the p - value?

cor.test(store.df$MTenure, store.df$Profit)

## 
##  Pearson's product-moment correlation
## 
## data:  store.df$MTenure and store.df$Profit
## t = 4.1731, df = 73, p-value = 8.193e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.2353497 0.6055175
## sample estimates:
##       cor 
## 0.4388692

Q.12) Run a pearson correlation test on Profit Vs. CTenure. What is the p - value?

cor.test(store.df$CTenure, store.df$Profit)

## 
##  Pearson's product-moment correlation
## 
## data:  store.df$CTenure and store.df$Profit
## t = 2.2786, df = 73, p-value = 0.02562
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.03262507 0.45786339
## sample estimates:
##       cor 
## 0.2576789

Q.13) Run regression of Profit on {MTenure, CTenure Comp, Pop, PedCount, Res, Hours24, Visibility}.

model = Profit ~ MTenure+CTenure+Comp+Pop+PedCount+Res+Hours24+Visibility
fit <- lm(model, data=store.df)
fit

## 
## Call:
## lm(formula = model, data = store.df)
## 
## Coefficients:
## (Intercept)      MTenure      CTenure         Comp          Pop  
##    7610.041      760.993      944.978   -25286.887        3.667  
##    PedCount          Res      Hours24   Visibility  
##   34087.359    91584.675    63233.307    12625.447

Q.14) Separate out the variables which are statistically significant from those which aren’t.

summary(fit)

## 
## Call:
## lm(formula = model, data = store.df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -105789  -35946   -7069   33780  112390 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   7610.041  66821.994   0.114 0.909674    
## MTenure        760.993    127.086   5.988 9.72e-08 ***
## CTenure        944.978    421.687   2.241 0.028400 *  
## Comp        -25286.887   5491.937  -4.604 1.94e-05 ***
## Pop              3.667      1.466   2.501 0.014890 *  
## PedCount     34087.359   9073.196   3.757 0.000366 ***
## Res          91584.675  39231.283   2.334 0.022623 *  
## Hours24      63233.307  19641.114   3.219 0.001994 ** 
## Visibility   12625.447   9087.620   1.389 0.169411    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 56970 on 66 degrees of freedom
## Multiple R-squared:  0.6379, Adjusted R-squared:  0.594 
## F-statistic: 14.53 on 8 and 66 DF,  p-value: 5.382e-12

Statistically significant variable(s): {MTenure, CTenure, Comp, Pop, PedCount, Res, Hours24}

Statistically insignificant variable(s): {Visibility}

Q.15) What is expected change in the Profit at a store, if the Manager’s tenure i.e. number of months of experience with Store24, increases by one month?

Ans) Expected change in Profit if Manager’s Tenure is increased by one = 760.041

Expected change in Profit if Crew’s Tenure is increased by one = 944.978

Q.16) Please prepare an “Executive Summary”. Please add this to the end of your Rmd file. Specifically, please create a qualitative summary of Managerial Insights, based on your data analysis, especially your Regression Analysis. You may write this in paragraph form or in point form.

Ans) 1) Average level of managers’ tenure is thrice than the crew.

2) Stiffness of competion is tough: 35 competitors per lac people. Since the results of regression analysis showed that one competitor causes a loss of around 25k so setting up of stores in less competent area would be a better choice initially. Or it can be said that committee should form influencial policies to face the long term competition.

3) Store’s visibility is good: 3 is the average rating. Visibility of store gates is making a positive impact on the outcome, special attention should be given to making visible gates.

4) Crew’s tenure are more deviated as compared to the managers’ with their averages. Uniform retention needs to done in crew department.

5) Most profitable stores have witnessed sales of atleast 15lac. It can be increased if stores are set up in the residential area having empoying more pedestrain footfall.

6)Least profitable stores generally makes less than 1.5lac profit. It also depicts that managers tenure plays a vital role in determining sales at the extremes while crew’s tenure isn’t of prime significance in such cases.

7) Manager’s retention upto 50 months of their initial experience forms remarkable mark on the profit but after then it’s influence on the profit gets saturated. So, they should be exchanged with the new ones as they become most experienced.

8) Crew’s retention upto 20 months of their initial experience forms remarkable mark on the profit but after then it’s influence on the profit gets saturated. So, they should be exchanged with the new ones as they become most experienced.

9) Managers presence produces more sales and profits than the crew’s. It is because we keep on retaining crew who don’t make such big difference in the profit after their initial 2 year of experience while managers remains more proficient upto 4 years.

10) Manager’s skills is important to have to sell more but surprisingly, Local population density outweighs skills in terms of profit made. It is easily analysed that retaining skilled managers at high costs is making less impact than setting up of stores in the densely populated areas - high percentage of locals visiting the stores. Location factors should be given more preference than the employee skills as the former is more statistically correlated to the profit and sales.