Qualitatively identify the crucial issues being faced by the management. Based on your judgment, prepare a list of the most important questions that matter.
The Issues Faced by the Management of Store24 and the most important questions are as follows:
The analysis of the COO of Store 24 showed that manager and crew tenure in the top ten most profitable stores was almost four times the level of manager and crew tenure in the least profitable stores
Increasing Manager and Crew Tenure is a priority to the CEO and the Founder Bob Gordon.
Deciding and giving bonuses as per Crew Retention Rate of the Manager policy is implemented .
Many of the stores have low levels of tenure.
Increasing tenure in most stores may have a relatively large impact on financial performance since managers and crew are developing new skills on a daily basis.
The stores in the sample appear to be widely geographically disbursed, complicating site-location factors.
Lot of Variance in the data. For Eg. population within a half-mile radius ranges from a low of 1,046 to a high of 26,519.
The Management wondered how to incorporate all of this new information into a single coherent analysis
Download and review the Store24.csv data file associated with this case. You may open it in Excel for convenience. Using R, read the data into a data frame called store.
store.df<-read.csv(paste("G:/R Intern/Store24.csv",sep=""))
View(store.df)
dim(store.df)
## [1] 75 14
So we have read this Data from Excel File and Stored it in a Data Frame.
Using R, get the summary statistics of the data. Confirm that the summary statistics generated from R are consistent with Exhibit 3 from the Case.
str(store.df)
## 'data.frame': 75 obs. of 14 variables:
## $ store : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Sales : int 1060294 1619874 1099921 1053860 1227841 1703140 1809256 1378482 2113089 1080979 ...
## $ Profit : int 265014 424007 222735 210122 300480 469050 476355 361115 474725 278625 ...
## $ MTenure : num 0 86.22 23.89 0 3.88 ...
## $ CTenure : num 24.8 6.64 5.03 5.37 6.87 ...
## $ Pop : int 7535 8630 9695 2797 20335 16926 17754 20824 26519 16381 ...
## $ Comp : num 2.8 4.24 4.49 4.25 1.65 ...
## $ Visibility: int 3 4 3 4 2 3 2 4 2 4 ...
## $ PedCount : int 3 3 3 2 5 4 5 3 4 3 ...
## $ Res : int 1 1 1 1 0 1 1 1 1 1 ...
## $ Hours24 : int 1 1 1 1 1 0 1 1 1 0 ...
## $ CrewSkill : num 3.56 3.2 3.8 2.06 3.65 3.58 3.94 3.98 3.22 3.54 ...
## $ MgrSkill : num 3.15 3.56 4.12 4.1 3.59 ...
## $ ServQual : num 86.8 94.7 78.9 100 68.4 ...
print("MEAN")
## [1] "MEAN"
format(apply(store.df[,2:14],MARGIN=2,mean), scientific=FALSE)
## Sales Profit MTenure CTenure
## "1205413.120000" " 276313.613333" " 45.296444" " 13.931499"
## Pop Comp Visibility PedCount
## " 9825.586667" " 3.787751" " 3.080000" " 2.960000"
## Res Hours24 CrewSkill MgrSkill
## " 0.960000" " 0.840000" " 3.456667" " 3.637976"
## ServQual
## " 87.153844"
print("Standard Deviation ")
## [1] "Standard Deviation "
format(apply(store.df[,2:14],MARGIN=2,sd), scientific=FALSE)
## Sales Profit MTenure CTenure
## "304531.3090766" " 89404.0763381" " 57.6715512" " 17.6975171"
## Pop Comp Visibility PedCount
## " 5911.6737697" " 1.3113905" " 0.7490535" " 0.9924036"
## Res Hours24 CrewSkill MgrSkill
## " 0.1972788" " 0.3690748" " 0.4065854" " 0.4084571"
## ServQual
## " 12.6133920"
print("Min ")
## [1] "Min "
format(apply(store.df[,2:14],MARGIN=2,min), scientific=FALSE)
## Sales Profit MTenure CTenure
## "699306.0000000" "122180.0000000" " 0.0000000" " 0.8870637"
## Pop Comp Visibility PedCount
## " 1046.0000000" " 1.6513640" " 2.0000000" " 1.0000000"
## Res Hours24 CrewSkill MgrSkill
## " 0.0000000" " 0.0000000" " 2.0600000" " 2.9566670"
## ServQual
## " 57.8955200"
print("Max ")
## [1] "Max "
format(apply(store.df[,2:14],MARGIN=2,max), scientific=FALSE)
## Sales Profit MTenure CTenure
## "2113089.000000" " 518998.000000" " 277.987700" " 114.151900"
## Pop Comp Visibility PedCount
## " 26519.000000" " 11.127880" " 5.000000" " 5.000000"
## Res Hours24 CrewSkill MgrSkill
## " 1.000000" " 1.000000" " 4.640000" " 4.622222"
## ServQual
## " 100.000000"
Thus, we confirm that the Summary Stats are the same as in Exhibit 3 from the Case.
1.Use R to measure the mean and standard deviation of Profit.
mean(store.df$Profit)
## [1] 276313.6
sd(store.df$Profit)
## [1] 89404.08
2.Use R to measure the mean and standard deviation of MTenure.
mean(store.df$MTenure)
## [1] 45.29644
sd(store.df$MTenure)
## [1] 57.67155
3.Use R to measure the mean and standard deviation of CTenure.
mean(store.df$CTenure)
## [1] 13.9315
sd(store.df$CTenure)
## [1] 17.69752
In this TASK, we will learn how to sort a dataframe based on a data column.
store.profit<-store.df[order(-(store.df$Profit)),]
View(store.profit[1:10,1:6])
Thus We can see the Most profitable 10 Stores and their details as mentioned Above.
View(store.profit[66:75,1:6])
Thus We can see the Least profitable 10 Stores and their details as mentioned Above.
Use R to draw a scatter plot of Profit vs. MTenure.
library(car)
scatterplot(Profit~MTenure,ylab="Profit",xlab="MTenure",data=store.df,main="ScatterPlot of Profit vs MTenure",pch=19)
Use R to draw a scatter plot of Profit vs. CTenure.
library(car)
scatterplot(Profit~CTenure,ylab="Profit",xlab="CTenure",data=store.df,main="ScatterPlot of Profit vs CTenure",pch=19)
Use R to construct a Correlation Matrix for all the variables in the dataset. (Display the numbers up to 2 Decimal places)
r<-round(cor(store.df),2)
r
## store Sales Profit MTenure CTenure Pop Comp Visibility
## store 1.00 -0.23 -0.20 -0.06 0.02 -0.29 0.03 -0.03
## Sales -0.23 1.00 0.92 0.45 0.25 0.40 -0.24 0.13
## Profit -0.20 0.92 1.00 0.44 0.26 0.43 -0.33 0.14
## MTenure -0.06 0.45 0.44 1.00 0.24 -0.06 0.18 0.16
## CTenure 0.02 0.25 0.26 0.24 1.00 0.00 -0.07 0.07
## Pop -0.29 0.40 0.43 -0.06 0.00 1.00 -0.27 -0.05
## Comp 0.03 -0.24 -0.33 0.18 -0.07 -0.27 1.00 0.03
## Visibility -0.03 0.13 0.14 0.16 0.07 -0.05 0.03 1.00
## PedCount -0.22 0.42 0.45 0.06 -0.08 0.61 -0.15 -0.14
## Res -0.03 -0.17 -0.16 -0.06 -0.34 -0.24 0.22 0.02
## Hours24 0.03 0.06 -0.03 -0.17 0.07 -0.22 0.13 0.05
## CrewSkill 0.05 0.16 0.16 0.10 0.26 0.28 -0.04 -0.20
## MgrSkill -0.07 0.31 0.32 0.23 0.12 0.08 0.22 0.07
## ServQual -0.32 0.39 0.36 0.18 0.08 0.12 0.02 0.21
## PedCount Res Hours24 CrewSkill MgrSkill ServQual
## store -0.22 -0.03 0.03 0.05 -0.07 -0.32
## Sales 0.42 -0.17 0.06 0.16 0.31 0.39
## Profit 0.45 -0.16 -0.03 0.16 0.32 0.36
## MTenure 0.06 -0.06 -0.17 0.10 0.23 0.18
## CTenure -0.08 -0.34 0.07 0.26 0.12 0.08
## Pop 0.61 -0.24 -0.22 0.28 0.08 0.12
## Comp -0.15 0.22 0.13 -0.04 0.22 0.02
## Visibility -0.14 0.02 0.05 -0.20 0.07 0.21
## PedCount 1.00 -0.28 -0.28 0.21 0.09 -0.01
## Res -0.28 1.00 -0.09 -0.15 -0.03 0.09
## Hours24 -0.28 -0.09 1.00 0.11 -0.04 0.06
## CrewSkill 0.21 -0.15 0.11 1.00 -0.02 -0.03
## MgrSkill 0.09 -0.03 -0.04 -0.02 1.00 0.36
## ServQual -0.01 0.09 0.06 -0.03 0.36 1.00
round(cor(x=store.df$Profit,y=store.df$MTenure),2)
## [1] 0.44
round(cor(x=store.df$Profit,y=store.df$CTenure),2)
## [1] 0.26
Use R to construct the Corrgram based on all variables in the dataset.
The bivariate relationships underlying correlations can be visualized through scatter plots and scatter plot matrices, whereas correlograms provide a unique and powerful method for comparing a large number of correlation coefficients in a meaningful way.
library(corrgram)
library(corrplot)
## corrplot 0.84 loaded
corrgram(r, lower.panel = panel.shade, upper.panel = panel.pie,
text.panel = panel.txt, main = "Corrgram of Store Variables")
As we see the Profit in the above Corrgram, we see that it is very strongly linked with Sales. Less Strongly to MTenure, yet strongly correlated.
It is linked with CTenure,Pop,PedCount and MGrSkill with lesser correlation in Visibility and Crewskill. Not correlated with Comp,Res,Hours24(as in negative)
The above statements show us that as per a Manager’s Point of View, more emphasis should be paid to variable(s) having more correlation with Profit while keeping lesser tabs on those having no effect at all. The Hours24 column thus hould be evaluated with less concern along with location of the Stores .
Also, a strong negative correlation implies with Comp that as Profit Increases, Comp decreases.
library(psych)
##
## Attaching package: 'psych'
## The following object is masked from 'package:car':
##
## logit
corr.test(store.df, method="pearson",use="complete")
## Call:corr.test(x = store.df, use = "complete", method = "pearson")
## Correlation matrix
## store Sales Profit MTenure CTenure Pop Comp Visibility
## store 1.00 -0.23 -0.20 -0.06 0.02 -0.29 0.03 -0.03
## Sales -0.23 1.00 0.92 0.45 0.25 0.40 -0.24 0.13
## Profit -0.20 0.92 1.00 0.44 0.26 0.43 -0.33 0.14
## MTenure -0.06 0.45 0.44 1.00 0.24 -0.06 0.18 0.16
## CTenure 0.02 0.25 0.26 0.24 1.00 0.00 -0.07 0.07
## Pop -0.29 0.40 0.43 -0.06 0.00 1.00 -0.27 -0.05
## Comp 0.03 -0.24 -0.33 0.18 -0.07 -0.27 1.00 0.03
## Visibility -0.03 0.13 0.14 0.16 0.07 -0.05 0.03 1.00
## PedCount -0.22 0.42 0.45 0.06 -0.08 0.61 -0.15 -0.14
## Res -0.03 -0.17 -0.16 -0.06 -0.34 -0.24 0.22 0.02
## Hours24 0.03 0.06 -0.03 -0.17 0.07 -0.22 0.13 0.05
## CrewSkill 0.05 0.16 0.16 0.10 0.26 0.28 -0.04 -0.20
## MgrSkill -0.07 0.31 0.32 0.23 0.12 0.08 0.22 0.07
## ServQual -0.32 0.39 0.36 0.18 0.08 0.12 0.02 0.21
## PedCount Res Hours24 CrewSkill MgrSkill ServQual
## store -0.22 -0.03 0.03 0.05 -0.07 -0.32
## Sales 0.42 -0.17 0.06 0.16 0.31 0.39
## Profit 0.45 -0.16 -0.03 0.16 0.32 0.36
## MTenure 0.06 -0.06 -0.17 0.10 0.23 0.18
## CTenure -0.08 -0.34 0.07 0.26 0.12 0.08
## Pop 0.61 -0.24 -0.22 0.28 0.08 0.12
## Comp -0.15 0.22 0.13 -0.04 0.22 0.02
## Visibility -0.14 0.02 0.05 -0.20 0.07 0.21
## PedCount 1.00 -0.28 -0.28 0.21 0.09 -0.01
## Res -0.28 1.00 -0.09 -0.15 -0.03 0.09
## Hours24 -0.28 -0.09 1.00 0.11 -0.04 0.06
## CrewSkill 0.21 -0.15 0.11 1.00 -0.02 -0.03
## MgrSkill 0.09 -0.03 -0.04 -0.02 1.00 0.36
## ServQual -0.01 0.09 0.06 -0.03 0.36 1.00
## Sample Size
## [1] 75
## Probability values (Entries above the diagonal are adjusted for multiple tests.)
## store Sales Profit MTenure CTenure Pop Comp Visibility
## store 0.00 1.00 1.00 1.00 1.00 0.89 1.00 1.00
## Sales 0.05 0.00 0.00 0.00 1.00 0.03 1.00 1.00
## Profit 0.09 0.00 0.00 0.01 1.00 0.01 0.26 1.00
## MTenure 0.63 0.00 0.00 0.00 1.00 1.00 1.00 1.00
## CTenure 0.87 0.03 0.03 0.04 0.00 1.00 1.00 1.00
## Pop 0.01 0.00 0.00 0.60 0.99 0.00 1.00 1.00
## Comp 0.79 0.04 0.00 0.12 0.55 0.02 0.00 1.00
## Visibility 0.82 0.26 0.25 0.18 0.57 0.67 0.81 0.00
## PedCount 0.06 0.00 0.00 0.60 0.47 0.00 0.21 0.23
## Res 0.79 0.15 0.17 0.60 0.00 0.04 0.06 0.85
## Hours24 0.82 0.59 0.83 0.16 0.53 0.06 0.27 0.69
## CrewSkill 0.68 0.16 0.17 0.39 0.03 0.01 0.72 0.09
## MgrSkill 0.54 0.01 0.00 0.05 0.29 0.48 0.05 0.53
## ServQual 0.00 0.00 0.00 0.12 0.49 0.29 0.88 0.07
## PedCount Res Hours24 CrewSkill MgrSkill ServQual
## store 1.00 1.00 1.00 1.00 1.00 0.37
## Sales 0.01 1.00 1.00 1.00 0.49 0.05
## Profit 0.00 1.00 1.00 1.00 0.37 0.11
## MTenure 1.00 1.00 1.00 1.00 1.00 1.00
## CTenure 1.00 0.22 1.00 1.00 1.00 1.00
## Pop 0.00 1.00 1.00 1.00 1.00 1.00
## Comp 1.00 1.00 1.00 1.00 1.00 1.00
## Visibility 1.00 1.00 1.00 1.00 1.00 1.00
## PedCount 0.00 0.99 1.00 1.00 1.00 1.00
## Res 0.01 0.00 1.00 1.00 1.00 1.00
## Hours24 0.02 0.45 0.00 1.00 1.00 1.00
## CrewSkill 0.07 0.19 0.37 0.00 1.00 1.00
## MgrSkill 0.46 0.78 0.74 0.86 0.00 0.14
## ServQual 0.96 0.44 0.62 0.78 0.00 0.00
##
## To see confidence intervals of the correlations, print with the short=FALSE option
Ans. 0.01
Ans. 1.00
Run a regression of Profit on {MTenure, CTenure Comp, Pop, PedCount, Res, Hours24, Visibility}.
Model1=Profit~MTenure+CTenure+Comp+Pop+PedCount+Res+Hours24+Visibility
fit<-lm(Model1,data=store.df)
summary(fit)
##
## Call:
## lm(formula = Model1, data = store.df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -105789 -35946 -7069 33780 112390
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7610.041 66821.994 0.114 0.909674
## MTenure 760.993 127.086 5.988 9.72e-08 ***
## CTenure 944.978 421.687 2.241 0.028400 *
## Comp -25286.887 5491.937 -4.604 1.94e-05 ***
## Pop 3.667 1.466 2.501 0.014890 *
## PedCount 34087.359 9073.196 3.757 0.000366 ***
## Res 91584.675 39231.283 2.334 0.022623 *
## Hours24 63233.307 19641.114 3.219 0.001994 **
## Visibility 12625.447 9087.620 1.389 0.169411
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 56970 on 66 degrees of freedom
## Multiple R-squared: 0.6379, Adjusted R-squared: 0.594
## F-statistic: 14.53 on 8 and 66 DF, p-value: 5.382e-12
Based on TASK 3m, answer the following questions:
List the explanatory variable(s) whose beta-coefficients are statistically significant (p < 0.05). Ans. MTenure,CTenure,Comp,Pop,PedCount,Res,Hours24
List the explanatory variable(s) whose beta-coefficients are not statistically significant (p > 0.05). Ans. Visibility
Model1=Profit~MTenure+CTenure+Comp+Pop+PedCount+Res+Hours24+Visibility
fit<-lm(Model1,data=store.df)
coef(fit)
## (Intercept) MTenure CTenure Comp Pop
## 7610.041452 760.992734 944.978026 -25286.886662 3.666606
## PedCount Res Hours24 Visibility
## 34087.358789 91584.675234 63233.307162 12625.447050
Ans. If the Manager’s tenure i.e. number of months of experience with Store24, increases by one month then the Store24 will experience a profit of $760.992734 .
Ans.If the Manager’s tenure i.e. number of months of experience with Store24, increases by one month then the Store24 will experience a profit of $944.978026 .
Please prepare an “Executive Summary”. Please add this to the end of your Rmd file. Specifically, please create a qualitative summary of Managerial Insights, based on your data analysis, especially your Regression Analysis. You may write this in paragraph form or in point form.
Executive Summary:
As seen first, Store 24 is a wide chain of stores having 75 Subsidiary stores.
Subsidiary Stores are located in geographically different areas having different population sizes.
The stores are located in either Industrial or Residential Areas.
There is a high variance and standard deviation in data with high fluctuations in profit, population size and the number of years of Tenure if the Employees.
There also exists a competition with other store chains.
There is a high tenure rate, either the Manager or the Crew or both for stores performing well.
Stores performing less have nearly one fourth of the Tenure as their more profitable counterparts (top 10).
The tenure of Managers is mostly common between 0-50 months.
The tenure of Crew is considerabky lower, averaging around 13 months implying low customer retention.
The Stores having high tenure employees have higher profit.
The profit better correlated with Manager’s Tenure as compared to Crew’s Tenure.
The nearby population plays a huge part on Profit along with Manager Tenure and Crew Tenure.
Comp, Res, Hours24 have negative correlation with Profit. So, irrespective of the the type of area and the hours a store remains open, the profit still is the same.
The Pedestrian Count near the store and the ManagerTenure along with Population are highly correlated to profit.
Thus Concluding our report, I would like to assert as to how by customer retention and better advertisement and bonuses, the Profit can be increased .