Reading the data set

setwd("C:\\Users\\saihe\\Documents")
store.df <- read.csv("Store24.csv")
View(store.df)

Summary statistics

library(psych)
describe(store.df)
##            vars  n       mean        sd     median    trimmed       mad
## store         1 75      38.00     21.79      38.00      38.00     28.17
## Sales         2 75 1205413.12 304531.31 1127332.00 1182031.25 288422.04
## Profit        3 75  276313.61  89404.08  265014.00  270260.34  90532.00
## MTenure       4 75      45.30     57.67      24.12      33.58     29.67
## CTenure       5 75      13.93     17.70       7.21      10.60      6.14
## Pop           6 75    9825.59   5911.67    8896.00    9366.07   7266.22
## Comp          7 75       3.79      1.31       3.63       3.66      0.82
## Visibility    8 75       3.08      0.75       3.00       3.07      0.00
## PedCount      9 75       2.96      0.99       3.00       2.97      1.48
## Res          10 75       0.96      0.20       1.00       1.00      0.00
## Hours24      11 75       0.84      0.37       1.00       0.92      0.00
## CrewSkill    12 75       3.46      0.41       3.50       3.47      0.34
## MgrSkill     13 75       3.64      0.41       3.59       3.62      0.45
## ServQual     14 75      87.15     12.61      89.47      88.62     15.61
##                  min        max      range  skew kurtosis       se
## store           1.00      75.00      74.00  0.00    -1.25     2.52
## Sales      699306.00 2113089.00 1413783.00  0.71    -0.09 35164.25
## Profit     122180.00  518998.00  396818.00  0.62    -0.21 10323.49
## MTenure         0.00     277.99     277.99  2.01     3.90     6.66
## CTenure         0.89     114.15     113.26  3.52    15.00     2.04
## Pop          1046.00   26519.00   25473.00  0.62    -0.23   682.62
## Comp            1.65      11.13       9.48  2.48    11.31     0.15
## Visibility      2.00       5.00       3.00  0.25    -0.38     0.09
## PedCount        1.00       5.00       4.00  0.00    -0.52     0.11
## Res             0.00       1.00       1.00 -4.60    19.43     0.02
## Hours24         0.00       1.00       1.00 -1.82     1.32     0.04
## CrewSkill       2.06       4.64       2.58 -0.43     1.64     0.05
## MgrSkill        2.96       4.62       1.67  0.27    -0.53     0.05
## ServQual       57.90     100.00      42.10 -0.66    -0.72     1.46

The mean and standard deviation of Profit, MTenure and CTenure

apply(store.df[,3:5], 2, mean)
##       Profit      MTenure      CTenure 
## 276313.61333     45.29644     13.93150
apply(store.df[,3:5], 2, sd)
##      Profit     MTenure     CTenure 
## 89404.07634    57.67155    17.69752

The top 10 most profitable stores

attach(store.df)
prof <- store.df[order(-Profit),] 
prof[1:10,1:5] 
##    store   Sales Profit   MTenure    CTenure
## 74    74 1782957 518998 171.09720  29.519510
## 7      7 1809256 476355  62.53080   7.326488
## 9      9 2113089 474725 108.99350   6.061602
## 6      6 1703140 469050 149.93590  11.351130
## 44    44 1807740 439781 182.23640 114.151900
## 2      2 1619874 424007  86.22219   6.636550
## 45    45 1602362 410149  47.64565   9.166325
## 18    18 1704826 394039 239.96980  33.774130
## 11    11 1583446 389886  44.81977   2.036961
## 47    47 1665657 387853  12.84790   6.636550

The bottom 10 least profitable stores

least <- store.df[order(Profit),] 
least[1:10,1:5] 
##    store   Sales Profit     MTenure   CTenure
## 57    57  699306 122180  24.3485700  2.956879
## 66    66  879581 146058 115.2039000  3.876797
## 41    41  744211 147327  14.9180200 11.926080
## 55    55  925744 147672   6.6703910 18.365500
## 32    32  828918 149033  36.0792600  6.636550
## 13    13  857843 152513   0.6571813  1.577002
## 54    54  811190 159792   6.6703910  3.876797
## 52    52 1073008 169201  24.1185600  3.416838
## 61    61  716589 177046  21.8184200 13.305950
## 37    37 1202917 187765  23.1985000  1.347023
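
The two rankings above rely on attach(store.df) to make Profit visible by name. Below is a minimal sketch of the same ordering without attach(). It uses a small made-up data frame, so the `toy` values are hypothetical and are not taken from Store24.csv.

```r
# Hypothetical toy data; these numbers are NOT from Store24.csv.
toy <- data.frame(store  = 1:6,
                  Profit = c(300, 120, 510, 275, 90, 430))

# Refer to the column through the data frame instead of attaching it.
top3    <- head(toy[order(-toy$Profit), ], 3)   # highest profit first
bottom3 <- head(toy[order(toy$Profit), ], 3)    # lowest profit first

top3
bottom3
```

Referencing `toy$Profit` explicitly avoids picking up a different object named `Profit` from the search path by accident.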

To draw a scatter plot of Profit vs. MTenure

library(car)
## 
## Attaching package: 'car'
## The following object is masked from 'package:psych':
## 
##     logit
scatterplot(store.df$MTenure, store.df$Profit, main = "Scatterplot of Profit vs MTenure", xlab = "MTenure", ylab = "Profit", pch=10)

To draw a scatter plot of Profit vs. CTenure

scatterplot(store.df$CTenure, store.df$Profit, main = "Scatterplot of Profit vs Crew Tenure", xlab = "Crew Tenure", ylab = "Profit", pch=16)

To construct a Correlation Matrix for all the variables in the dataset. (Display the numbers up to 2 Decimal places)

round(cor(store.df),2)
##            store Sales Profit MTenure CTenure   Pop  Comp Visibility
## store       1.00 -0.23  -0.20   -0.06    0.02 -0.29  0.03      -0.03
## Sales      -0.23  1.00   0.92    0.45    0.25  0.40 -0.24       0.13
## Profit     -0.20  0.92   1.00    0.44    0.26  0.43 -0.33       0.14
## MTenure    -0.06  0.45   0.44    1.00    0.24 -0.06  0.18       0.16
## CTenure     0.02  0.25   0.26    0.24    1.00  0.00 -0.07       0.07
## Pop        -0.29  0.40   0.43   -0.06    0.00  1.00 -0.27      -0.05
## Comp        0.03 -0.24  -0.33    0.18   -0.07 -0.27  1.00       0.03
## Visibility -0.03  0.13   0.14    0.16    0.07 -0.05  0.03       1.00
## PedCount   -0.22  0.42   0.45    0.06   -0.08  0.61 -0.15      -0.14
## Res        -0.03 -0.17  -0.16   -0.06   -0.34 -0.24  0.22       0.02
## Hours24     0.03  0.06  -0.03   -0.17    0.07 -0.22  0.13       0.05
## CrewSkill   0.05  0.16   0.16    0.10    0.26  0.28 -0.04      -0.20
## MgrSkill   -0.07  0.31   0.32    0.23    0.12  0.08  0.22       0.07
## ServQual   -0.32  0.39   0.36    0.18    0.08  0.12  0.02       0.21
##            PedCount   Res Hours24 CrewSkill MgrSkill ServQual
## store         -0.22 -0.03    0.03      0.05    -0.07    -0.32
## Sales          0.42 -0.17    0.06      0.16     0.31     0.39
## Profit         0.45 -0.16   -0.03      0.16     0.32     0.36
## MTenure        0.06 -0.06   -0.17      0.10     0.23     0.18
## CTenure       -0.08 -0.34    0.07      0.26     0.12     0.08
## Pop            0.61 -0.24   -0.22      0.28     0.08     0.12
## Comp          -0.15  0.22    0.13     -0.04     0.22     0.02
## Visibility    -0.14  0.02    0.05     -0.20     0.07     0.21
## PedCount       1.00 -0.28   -0.28      0.21     0.09    -0.01
## Res           -0.28  1.00   -0.09     -0.15    -0.03     0.09
## Hours24       -0.28 -0.09    1.00      0.11    -0.04     0.06
## CrewSkill      0.21 -0.15    0.11      1.00    -0.02    -0.03
## MgrSkill       0.09 -0.03   -0.04     -0.02     1.00     0.36
## ServQual      -0.01  0.09    0.06     -0.03     0.36     1.00

To measure the correlation between Profit and MTenure. (Display the numbers up to 2 Decimal places)

round(cor(store.df$Profit, store.df$MTenure),2)
## [1] 0.44

To measure the correlation between Profit and CTenure. (Display the numbers up to 2 Decimal places)

round(cor(store.df$Profit, store.df$CTenure),2)
## [1] 0.26

To construct a Corrgram based on all variables in the dataset

library(corrgram)
par(mfrow = c(1,1))
corrgram(store.df,order = FALSE,
         upper.panel = panel.pie,
         lower.panel = panel.shade,
         text.panel = panel.txt,
         main = "Corrgram of store variables")

Run a Pearson’s Correlation test on the correlation between Profit and MTenure

cor.test(Profit,MTenure)
## 
##  Pearson's product-moment correlation
## 
## data:  Profit and MTenure
## t = 4.1731, df = 73, p-value = 8.193e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.2353497 0.6055175
## sample estimates:
##       cor 
## 0.4388692

The p-value (8.193e-05) is below 0.05, so the positive correlation between Profit and MTenure (r = 0.44) is statistically significant at the 5% level.
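
As a check on the output above, the t statistic reported by cor.test can be recovered from the sample correlation r and the sample size n (here n = 75, so df = n - 2 = 73). This worked sketch uses values rounded from the printed output, so the last digit may differ slightly:

```latex
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
  = \frac{0.4389 \times \sqrt{73}}{\sqrt{1 - 0.4389^{2}}}
  \approx \frac{3.750}{0.8985}
  \approx 4.17
```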

Run a Pearson’s Correlation test on the correlation between Profit and CTenure

cor.test(Profit,CTenure)
## 
##  Pearson's product-moment correlation
## 
## data:  Profit and CTenure
## t = 2.2786, df = 73, p-value = 0.02562
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.03262507 0.45786339
## sample estimates:
##       cor 
## 0.2576789

The p-value (0.02562) is below 0.05, so the positive correlation between Profit and CTenure (r = 0.26) is statistically significant at the 5% level, although it is weaker than the correlation with MTenure.
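
The 95 percent confidence interval printed above can be reproduced by hand with the Fisher z-transform. The sketch below assumes only the two numbers read from the output (r and the sample size implied by df = 73). The variable names are illustrative.

```r
# Values copied from the cor.test output above.
r <- 0.2576789   # sample correlation between Profit and CTenure
n <- 75          # number of stores (df = 73 implies n = 75)

# Fisher z-transform: atanh(r) is approximately normal with SE 1/sqrt(n - 3).
z  <- atanh(r)
se <- 1 / sqrt(n - 3)

# 95% interval on the z scale, then back-transform with tanh().
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)
round(ci, 4)
```

Because the interval excludes zero, it leads to the same conclusion as the p-value.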

Run a regression of Profit on {MTenure, CTenure, Comp, Pop, PedCount, Res, Hours24, Visibility}

reg <-  lm( Profit ~ MTenure+CTenure+Comp+Pop+PedCount+Res+Hours24+Visibility,data=store.df )
summary(reg)
## 
## Call:
## lm(formula = Profit ~ MTenure + CTenure + Comp + Pop + PedCount + 
##     Res + Hours24 + Visibility, data = store.df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -105789  -35946   -7069   33780  112390 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   7610.041  66821.994   0.114 0.909674    
## MTenure        760.993    127.086   5.988 9.72e-08 ***
## CTenure        944.978    421.687   2.241 0.028400 *  
## Comp        -25286.887   5491.937  -4.604 1.94e-05 ***
## Pop              3.667      1.466   2.501 0.014890 *  
## PedCount     34087.359   9073.196   3.757 0.000366 ***
## Res          91584.675  39231.283   2.334 0.022623 *  
## Hours24      63233.307  19641.114   3.219 0.001994 ** 
## Visibility   12625.447   9087.620   1.389 0.169411    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 56970 on 66 degrees of freedom
## Multiple R-squared:  0.6379, Adjusted R-squared:  0.594 
## F-statistic: 14.53 on 8 and 66 DF,  p-value: 5.382e-12

From the output above, the explanatory variables whose coefficients are statistically significant (p < 0.05) are MTenure, CTenure, Comp, Pop, PedCount, Res and Hours24. The coefficient of Visibility is not statistically significant (p = 0.169).
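
The Adjusted R-squared in the summary output corrects the Multiple R-squared for the number of predictors. As a worked check, with n = 75 stores and p = 8 predictors (values rounded from the output above):

```latex
\bar{R}^{2} = 1 - (1 - R^{2})\,\frac{n-1}{n-p-1}
            = 1 - (1 - 0.6379)\times\frac{74}{66}
            \approx 1 - 0.4060
            = 0.594
```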

The coefficients:

reg$coefficients
##   (Intercept)       MTenure       CTenure          Comp           Pop 
##   7610.041452    760.992734    944.978026 -25286.886662      3.666606 
##      PedCount           Res       Hours24    Visibility 
##  34087.358789  91584.675234  63233.307162  12625.447050

Holding the other predictors constant, one additional month of manager tenure (months of experience with Store24) is associated with an expected increase in Profit of about 760.99 dollars. One additional month of crew tenure is associated with an expected increase of about 944.98 dollars.
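
To illustrate why a coefficient can be read as the expected change per one-unit increase, the sketch below fits a small linear model on simulated data and compares two predictions that differ only by one month of MTenure. The data frame `toy` and its generating values are hypothetical and are not the Store24 data.

```r
# Hypothetical simulated data; NOT the Store24 data set.
set.seed(1)
toy <- data.frame(MTenure = runif(50, 0, 100),
                  CTenure = runif(50, 0, 50))
toy$Profit <- 200000 + 750 * toy$MTenure + 900 * toy$CTenure +
  rnorm(50, sd = 1000)

fit <- lm(Profit ~ MTenure + CTenure, data = toy)

# Two stores that are identical except for one extra month of MTenure.
base  <- data.frame(MTenure = 24, CTenure = 7)
plus1 <- transform(base, MTenure = MTenure + 1)

# The difference in predicted Profit equals the MTenure coefficient.
delta <- predict(fit, plus1) - predict(fit, base)
all.equal(unname(delta), unname(coef(fit)["MTenure"]))
```

The same reasoning applies to the fitted model `reg`: each coefficient is the change in predicted Profit for a one-unit change in that variable with the others held fixed.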

Executive Summary

We can infer that the profit of a store depends on both the manager’s tenure and the crew’s tenure: both coefficients are positive and statistically significant. The population and the pedestrian traffic near the store also have a significant positive effect on profit. Stores open 24 hours show higher profit once the other variables are held constant (the Hours24 coefficient is positive and significant), although the raw correlation between Hours24 and Profit is close to zero (-0.03). Skill and tenure are only weakly correlated (0.23 for managers, 0.26 for crew), so the data give only limited support to the suggestion that investment in development and training programs for crew members or managers would be beneficial for the firm. The fewer competitors a store has, the higher its profit. The overall model is statistically significant (F-test p-value = 5.382e-12) and explains about 59% of the variation in Profit (adjusted R-squared = 0.594), so it is a reasonable basis for decisions.