1. Introduction

The fast food sector is highly competitive and, like many other industries, it is dominated by large companies. To drive customer traffic, smaller businesses must develop new and creative marketing techniques. This implies that such businesses need to stay in constant touch with their customers, and one of the best ways to do so is proper marketing research. A small fast food company must know what key customers want and will buy before developing marketing and advertising strategies (http://smallbusiness.chron.com) {1}. Marketing campaigns are therefore conducted to get a current overview of the trends and tastes that people in a given area prefer.

This report addresses the promotional campaign of VeganVista, a fast food chain in New Delhi. The main issue is which promotional campaign customers respond to most positively. In this study, I investigate the key metrics that drive the sales (in thousands) of VeganVista outlets located in different localities {2}.

2. Literature Review

Marketing managers today need to make full use of the variety of marketing mix tools in order to achieve maximum benefit. Sales promotion is a very important marketing tool in the food industry (Sue Peattie; 1998) {3}.

Many promotional tools are in use today, such as coupons, bonus packs, free samples and sweepstakes, and they are very commonly offered to consumers. It is the next step in this process, the consumer response to such promotional strategies, that has not been well understood (Chem L. Narayana and P. S. Raju; 1985) {4}. Given the importance of consumer response to promotional campaigns, this study analyzes the influence of these promotional strategies on sales and consumer decisions.

3. Data Description

For this study, I collected data from the IBM Watson Analytics website (https://www.ibm.com/communities/analytics/watson-analytics-blog/marketing-campaign-eff-usec_-fastf/).

VeganVista is a self-service fast food restaurant; no table service is provided. It is located in the Connaught Place area of New Delhi, with several branches all over Delhi. The chain plans to add a new vegan patty burger to its menu and has decided on three possible marketing campaigns for promoting the new product. In order to determine which promotion has the greatest effect on sales, the new item is introduced at locations in several randomly selected markets. A different promotion is used at each location, and the weekly sales of the new item are recorded for the first four weeks.

Our major focus was on the three promotional strategies used, which are as follows -

  1. Collectibles - Fast food companies can drive sales through collectibles, particularly those that kids enjoy, for instance dolls, glasses or other mementos related to a popular animated film. The owners decided to provide one free item with the purchase of the vegan patty burger. This marketing strategy entices people to come back until they have all the collectibles.

  2. Societal Marketing - It includes volunteering or collecting money or items for charity. Consumers who relate to our ideas and values due to our charitable work may, in turn, patronize our fast food restaurant.

  3. Loyalty Programs - Frequency card programs are a popular type of loyalty program for fast food restaurants. The owners plan to invite people to fill out an application and reward them according to the frequency with which they order the particular product. They planned on giving a free drink after the first four orders, then free fries after the next four orders. Ultimately, a customer could earn a free meal after 12 orders.

4. Model Analysis

I have used two models to test the hypothesis of how different variables drive sales in thousands.

4.1 First Model -

Null Hypothesis - MarketSize, Promotion and week have equal effect on sales.

In order to test the hypothesis, the proposed model is as follows:

\[SalesInThousands= \beta_0 + \beta_1 MarketSize + \beta_2 Promotion + \beta_3 week\]

4.2 Second Model -

Null Hypothesis - MarketID, LocationID, AgeOfStore and Promotion have equal effect on sales.

In order to test the hypothesis, the proposed model is as follows:

\[SalesInThousands= \beta_0 + \beta_1 MarketID + \beta_2 LocationID + \beta_3 AgeOfStore + \beta_4 Promotion\]
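Because MarketSize, Promotion, week and MarketID are treated as categorical, `lm` does not estimate a single slope for each of them. It expands every factor into indicator (dummy) columns measured against a baseline level. The sketch below illustrates this on a small made-up data frame; the `toy` object and its values are hypothetical and are not the study data.

```r
# Hypothetical toy data: three promotions, sales values invented for illustration
toy <- data.frame(
  Promotion = factor(c(1, 1, 2, 2, 3, 3)),
  Sales     = c(58, 60, 47, 45, 55, 57)
)

# model.matrix() shows the design matrix that lm() builds from the formula
X <- model.matrix(Sales ~ Promotion, data = toy)
print(colnames(X))

# With a single factor, the intercept is the mean of the baseline level
# (Promotion 1) and each coefficient is a difference from that baseline
fit <- lm(Sales ~ Promotion, data = toy)
print(coef(fit))
```

This is why the regression output lists Promotion2 and Promotion3 but no Promotion1: the first level is absorbed into the intercept.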

## Reading the dataset
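# stringsAsFactors = TRUE keeps MarketSize a factor, as in the output below (the read.csv default changed in R 4.0.0)
MarCamp.df <- read.csv("WA_Fn-UseC_-Marketing-Campaign-Eff-UseC_-FastF.csv", stringsAsFactors = TRUE)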

## Converting Promotion, week and MarketID columns to factors from integers

MarCamp.df$Promotion <- as.factor(MarCamp.df$Promotion)
MarCamp.df$week <- as.factor(MarCamp.df$week)
MarCamp.df$MarketID <- as.factor(MarCamp.df$MarketID)
str(MarCamp.df)
## 'data.frame':    548 obs. of  7 variables:
##  $ MarketID        : Factor w/ 10 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ MarketSize      : Factor w/ 3 levels "Large","Medium",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ LocationID      : int  1 1 1 1 2 2 2 2 3 3 ...
##  $ AgeOfStore      : int  4 4 4 4 5 5 5 5 12 12 ...
##  $ Promotion       : Factor w/ 3 levels "1","2","3": 3 3 3 3 2 2 2 2 1 1 ...
##  $ week            : Factor w/ 4 levels "1","2","3","4": 1 2 3 4 1 2 3 4 1 2 ...
##  $ SalesInThousands: num  33.7 35.7 29 39.2 27.8 ...
## First Regression Model

fit_a <- lm(SalesInThousands ~ MarketSize+Promotion+week, data = MarCamp.df)
summary(fit_a)
## 
## Call:
## lm(formula = SalesInThousands ~ MarketSize + Promotion + week, 
##     data = MarCamp.df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -25.2132  -7.8601   0.7121   7.8724  24.8168 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       74.8332     1.3461  55.591  < 2e-16 ***
## MarketSizeMedium -26.5213     1.0423 -25.445  < 2e-16 ***
## MarketSizeSmall  -13.8219     1.6460  -8.397 4.05e-16 ***
## Promotion2       -10.7674     1.1522  -9.345  < 2e-16 ***
## Promotion3        -1.0156     1.1535  -0.881    0.379    
## week2             -0.4040     1.3183  -0.306    0.759    
## week3             -0.3160     1.3183  -0.240    0.811    
## week4             -0.5775     1.3183  -0.438    0.662    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.91 on 540 degrees of freedom
## Multiple R-squared:  0.5814, Adjusted R-squared:  0.5759 
## F-statistic: 107.1 on 7 and 540 DF,  p-value: < 2.2e-16
## Second Regression Model

fit_b <- lm(SalesInThousands ~ MarketID+LocationID+AgeOfStore+Promotion, data = MarCamp.df)
summary(fit_b)
## 
## Call:
## lm(formula = SalesInThousands ~ MarketID + LocationID + AgeOfStore + 
##     Promotion, data = MarCamp.df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.5060  -3.4551  -0.1593   3.4745  14.3174 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 39.8781902  0.9108658  43.781  < 2e-16 ***
## MarketID2   25.9118551  4.9504734   5.234 2.39e-07 ***
## MarketID3   49.8706379 10.1158147   4.930 1.10e-06 ***
## MarketID4   19.6766409 14.6969086   1.339    0.181    
## MarketID5   16.0503762 19.7758281   0.812    0.417    
## MarketID6    2.1058084 24.6771579   0.085    0.932    
## MarketID7    9.9726005 29.6060346   0.337    0.736    
## MarketID8   13.2753016 34.4303218   0.386    0.700    
## MarketID9   18.1428750 39.3210401   0.461    0.645    
## MarketID10  20.2760451 44.4695539   0.456    0.649    
## LocationID  -0.0009556  0.0492052  -0.019    0.985    
## AgeOfStore   0.0129108  0.0345882   0.373    0.709    
## Promotion2  -9.7180316  0.5627960 -17.267  < 2e-16 ***
## Promotion3  -4.9327702  0.5763881  -8.558  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.165 on 534 degrees of freedom
## Multiple R-squared:  0.9072, Adjusted R-squared:  0.905 
## F-statistic: 401.7 on 13 and 534 DF,  p-value: < 2.2e-16
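
The standard errors of the MarketID coefficients in the second model grow steadily from about 5 to about 44. One possible explanation, offered here as an assumption rather than a verified diagnosis, is that LocationID is numbered in blocks by market and is therefore almost collinear with MarketID (the correlation matrix in the R code section reports a correlation of 1.00 between the two). The sketch below reproduces that pattern on simulated data; all names and values (`toy`, the block size of 100, the market effects) are hypothetical.

```r
set.seed(1)                                   # reproducible simulated noise

n_market   <- 4
per_market <- 40
market     <- factor(rep(seq_len(n_market), each = per_market))

# Hypothetical numbering: location IDs come in blocks of 100 per market
location <- (as.integer(market) - 1) * 100 +
  rep(1:10, each = 4, times = n_market)

# Simulated sales: a market effect plus noise, no true location effect
sales <- 40 + c(0, 25, 50, 20)[as.integer(market)] +
  rnorm(length(market), sd = 5)
toy <- data.frame(market, location, sales)

fit_with    <- lm(sales ~ market + location, data = toy)
fit_without <- lm(sales ~ market, data = toy)

se_with    <- coef(summary(fit_with))[, "Std. Error"]
se_without <- coef(summary(fit_without))[, "Std. Error"]

# Standard errors of the market coefficients with and without location
print(round(cbind(with_location    = se_with[names(se_without)],
                  without_location = se_without), 3))
```

In the simulated fit, the market standard errors grow with the distance between location blocks, much like the pattern in the second model. If the same mechanism is at work in the study data, dropping LocationID (or MarketID) would likely give more stable estimates.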

4.3 T-Test Analysis

# Converting from Factor to Integer
MarCamp.df$Promotion <- as.integer(MarCamp.df$Promotion)
# Dataset with Promotion strategies 1 and 2 only
Prom12.df <- MarCamp.df[which(MarCamp.df$Promotion <= 2), ]
## View(Prom12.df)

# Dataset with Promotion strategies 2 and 3 only
Prom23.df <- MarCamp.df[which(MarCamp.df$Promotion >= 2), ]
## View(Prom23.df)

# Dataset with Promotion strategies 1 and 3 only
Prom13.df <- MarCamp.df[which(MarCamp.df$Promotion != 2), ]
## View(Prom13.df)

# Converting from Integer to Factor
MarCamp.df$Promotion <- as.factor(MarCamp.df$Promotion)
Prom12.df$Promotion <- as.factor(Prom12.df$Promotion)
Prom23.df$Promotion <- as.factor(Prom23.df$Promotion)
Prom13.df$Promotion <- as.factor(Prom13.df$Promotion)

t.test(SalesInThousands ~ Promotion, data=Prom12.df)
## 
##  Welch Two Sample t-test
## 
## data:  SalesInThousands by Promotion
## t = 6.4275, df = 346.78, p-value = 4.29e-10
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   7.474093 14.065101
## sample estimates:
## mean in group 1 mean in group 2 
##        58.09901        47.32941
t.test(SalesInThousands ~ Promotion, data=Prom23.df)
## 
##  Welch Two Sample t-test
## 
## data:  SalesInThousands by Promotion
## t = -4.8814, df = 370.02, p-value = 1.569e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.271854  -4.798253
## sample estimates:
## mean in group 2 mean in group 3 
##        47.32941        55.36447
t.test(SalesInThousands ~ Promotion, data=Prom13.df)
## 
##  Welch Two Sample t-test
## 
## data:  SalesInThousands by Promotion
## t = 1.556, df = 355.92, p-value = 0.1206
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.7216369  6.1907240
## sample estimates:
## mean in group 1 mean in group 3 
##        58.09901        55.36447
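
As a consistency check, the confidence interval reported by the first Welch test can be rebuilt from the other numbers in its output: the difference in group means, the t statistic and the degrees of freedom. This is only a sketch of the arithmetic using the printed (rounded) values; the variable names are illustrative.

```r
# Values copied from the Welch t-test output for promotions 1 and 2
mean_1 <- 58.09901
mean_2 <- 47.32941
t_stat <- 6.4275
dof    <- 346.78

diff_means <- mean_1 - mean_2          # estimated difference in mean sales
std_error  <- diff_means / t_stat      # t = difference / standard error
t_crit     <- qt(0.975, dof)           # two-sided 95% critical value

ci <- diff_means + c(-1, 1) * t_crit * std_error
print(round(ci, 3))
```

The result should agree with the reported interval of 7.474 to 14.065 up to rounding.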

5. Discussion and Results

5.1 Interpret the results of t-Tests

  1. The p-value based on the first t-test for promotional strategies 1 and 2 = 4.29e-10
  2. The p-value based on the second t-test for promotional strategies 2 and 3 = 1.569e-06
  3. The p-value based on the third t-test for promotional strategies 1 and 3 = 0.1206

  4. The t-tests show a significant difference in average sales between promotions 1 and 2 and between promotions 2 and 3, but no significant difference between promotions 1 and 3.
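
Because three pairwise tests are run on the same data, the chance of at least one false positive is higher than 5%. A common safeguard, which is not part of the original analysis, is to adjust the p-values for multiple comparisons. The sketch below applies the Bonferroni correction from base R's `p.adjust` to the three p-values reported above.

```r
# p-values copied from the three Welch t-tests
p_values <- c("1 vs 2" = 4.29e-10, "2 vs 3" = 1.569e-06, "1 vs 3" = 0.1206)

# Bonferroni: multiply each p-value by the number of tests (capped at 1)
adjusted <- p.adjust(p_values, method = "bonferroni")
print(adjusted)

# Which comparisons remain significant at the 5% level
significant <- adjusted < 0.05
print(significant)
```

With only three comparisons, the correction multiplies each p-value by three, so the conclusions are unchanged: promotions 1 and 2 and promotions 2 and 3 still differ significantly, while promotions 1 and 3 do not.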

5.2 Regression Analysis

Both models are informative, as each contains statistically significant variables.

The explanatory variable(s) whose beta-coefficients are statistically significant (p < 0.05) -

  1. MarketSize
  2. Promotion
  3. MarketID

5.3 Insights

  1. The p-value of the overall F-test is less than 2.2e-16 for both models, far below 0.05, so each model as a whole explains a statistically significant share of the variation in sales.

  2. Both models pass the F-test (F = 107.1 on 7 and 540 DF for the first model, F = 401.7 on 13 and 534 DF for the second).

  3. According to the adjusted R-squared, the predictor variables explain about 58% of the variance in sales in the first model and about 91% in the second, so the second model fits the data considerably better.

  4. Market size is strongly related to sales. The same is the case with the second promotion strategy.

  5. Markets 2 and 3 have significantly higher sales than the baseline market 1, as shown by their positive beta coefficients in the second model.

  6. Holding all other variables constant, sales are lower in small and medium markets than in large markets, as shown by their negative beta coefficients.

  7. Promotion strategy 2, i.e. Societal Marketing, is also associated with lower sales than the baseline promotion strategy 1.
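
The adjusted R-squared and F-statistic quoted above can be recovered from the multiple R-squared, the sample size and the number of estimated slopes. The sketch below uses the rounded figures from the two model summaries, so small differences in the last digit are expected; the helper names are illustrative.

```r
# Adjusted R-squared: penalises R-squared for the number of predictors p
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Overall F-statistic expressed in terms of R-squared
f_stat <- function(r2, n, p) (r2 / p) / ((1 - r2) / (n - p - 1))

n <- 548                       # observations in the dataset

# First model: 7 estimated slopes, second model: 13
model_1 <- c(adj_r2 = adj_r2(0.5814, n, 7),  f_value = f_stat(0.5814, n, 7))
model_2 <- c(adj_r2 = adj_r2(0.9072, n, 13), f_value = f_stat(0.9072, n, 13))

print(round(model_1, 3))
print(round(model_2, 3))
```

The second model spends six more degrees of freedom than the first, yet its adjusted R-squared is still much higher, which is what the comparison of 58% and 91% rests on.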

5.4 Plot Analysis

  1. The first plot in the code suggests that the first promotional strategy, collectibles, drives sales the most, followed by the third strategy and then the second (mean sales of 58.10, 55.36 and 47.33 thousand respectively).

  2. The second plot suggests that age of a store does not have much effect on sales.

  3. The third plot suggests that Large Market Size drives sales in thousands more than small and medium market sizes.

  4. Small and large Market Sizes have most sales with third promotional strategy of loyalty programs.

  5. Medium market size has most effect on sales with second promotional strategy of societal Marketing.

6. Conclusion

This report is motivated by the need for research on how different promotional techniques can improve sales (in thousands). I found that customers respond more positively to collectibles and loyalty programs than to societal marketing, which recorded the lowest average sales. Also, large markets record the highest average sales, followed by small and then medium markets.

7. References

Fast Food Marketing, Available from - http://smallbusiness.chron.com/fast-food-marketing-strategies-3408.html [25 Feb, 2018]

Marketing Campaign, Promotion Effectiveness - Fast Food Chain, available on https://www.ibm.com/communities/analytics/watson-analytics-blog/marketing-campaign-eff-usec_-fastf/ [25 Feb, 2018]

Sue Peattie, (1998) “Promotional competitions as a marketing tool in food retailing”, British Food Journal, Vol. 100 Issue: 6, pp.286-294, https://doi.org/10.1108/00070709810230472 [25 Feb, 2018]

Chem L. Narayana & P. S. Raju (2013) Gifts versus Sweepstakes: Consumer Choices and Profiles, Journal of Advertising, 14:1, 50-53, DOI: 10.1080/00913367.1985.10672930. [25 Feb, 2018]

Data Science and Analytics using R programming by Prof. Sameer Mathur, Marketing Professor at IIM Lucknow, course available on https://www.udemy.com/

8. Appendices

Table 1: Summary Statistics of the Marketing Campaign Study

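Total Observations in Table: 548

Cell contents, top to bottom: N, chi-square contribution, N / row total, N / column total, N / table total

             | Promotion Strategy Adopted
 Market Size |         1 |         2 |         3 | Row Total |
-------------|-----------|-----------|-----------|-----------|
       Large |        56 |        64 |        48 |       168 |
             |     0.203 |     0.703 |     1.611 |           |
             |     0.333 |     0.381 |     0.286 |     0.307 |
             |     0.326 |     0.340 |     0.255 |           |
             |     0.102 |     0.117 |     0.088 |           |
-------------|-----------|-----------|-----------|-----------|
      Medium |        96 |       108 |       116 |       320 |
             |     0.196 |     0.029 |     0.352 |           |
             |     0.300 |     0.338 |     0.362 |     0.584 |
             |     0.558 |     0.574 |     0.617 |           |
             |     0.175 |     0.197 |     0.212 |           |
-------------|-----------|-----------|-----------|-----------|
       Small |        20 |        16 |        24 |        60 |
             |     0.072 |     1.021 |     0.567 |           |
             |     0.333 |     0.267 |     0.400 |     0.109 |
             |     0.116 |     0.085 |     0.128 |           |
             |     0.036 |     0.029 |     0.044 |           |
-------------|-----------|-----------|-----------|-----------|
Column Total |       172 |       188 |       188 |       548 |
             |     0.314 |     0.343 |     0.343 |           |
-------------|-----------|-----------|-----------|-----------|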

Table 2: Regression Analysis according to Model 1

Variable         | \(\beta\)   | SE     | t-statistic
-----------------|-------------|--------|------------
Intercept        |  74.8332*** | 1.3461 |  55.591
MarketSizeMedium | -26.5213*** | 1.0423 | -25.445
MarketSizeSmall  | -13.8219*** | 1.6460 |  -8.397
Promotion2       | -10.7674*** | 1.1522 |  -9.345

*** p < 0.001

Table 3: Regression Analysis according to Model 2

Variable    | \(\beta\)     | SE         | t-statistic
------------|---------------|------------|------------
(Intercept) | 39.8781902*** |  0.9108658 |  43.781
MarketID2   | 25.9118551*** |  4.9504734 |   5.234
MarketID3   | 49.8706379*** | 10.1158147 |   4.930
Promotion2  | -9.7180316*** |  0.5627960 | -17.267
Promotion3  | -4.9327702*** |  0.5763881 |  -8.558

*** p < 0.001

9. R Code supporting study

Reading the dataset

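# stringsAsFactors = TRUE keeps MarketSize a factor, as in the output below (the read.csv default changed in R 4.0.0)
MarCamp.df <- read.csv("WA_Fn-UseC_-Marketing-Campaign-Eff-UseC_-FastF.csv", stringsAsFactors = TRUE)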
str(MarCamp.df)
## 'data.frame':    548 obs. of  7 variables:
##  $ MarketID        : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ MarketSize      : Factor w/ 3 levels "Large","Medium",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ LocationID      : int  1 1 1 1 2 2 2 2 3 3 ...
##  $ AgeOfStore      : int  4 4 4 4 5 5 5 5 12 12 ...
##  $ Promotion       : int  3 3 3 3 2 2 2 2 1 1 ...
##  $ week            : int  1 2 3 4 1 2 3 4 1 2 ...
##  $ SalesInThousands: num  33.7 35.7 29 39.2 27.8 ...
dim(MarCamp.df)
## [1] 548   7
## Converting Promotion, week and MarketID columns to factors from integers

MarCamp.df$Promotion <- as.factor(MarCamp.df$Promotion)
MarCamp.df$week <- as.factor(MarCamp.df$week)
MarCamp.df$MarketID <- as.factor(MarCamp.df$MarketID)
str(MarCamp.df)
## 'data.frame':    548 obs. of  7 variables:
##  $ MarketID        : Factor w/ 10 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ MarketSize      : Factor w/ 3 levels "Large","Medium",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ LocationID      : int  1 1 1 1 2 2 2 2 3 3 ...
##  $ AgeOfStore      : int  4 4 4 4 5 5 5 5 12 12 ...
##  $ Promotion       : Factor w/ 3 levels "1","2","3": 3 3 3 3 2 2 2 2 1 1 ...
##  $ week            : Factor w/ 4 levels "1","2","3","4": 1 2 3 4 1 2 3 4 1 2 ...
##  $ SalesInThousands: num  33.7 35.7 29 39.2 27.8 ...

Descriptive statistics (min, max, median etc) of each variable

summary(MarCamp.df)
##     MarketID    MarketSize    LocationID      AgeOfStore     Promotion
##  3      : 88   Large :168   Min.   :  1.0   Min.   : 1.000   1:172    
##  10     : 80   Medium:320   1st Qu.:216.0   1st Qu.: 4.000   2:188    
##  5      : 60   Small : 60   Median :504.0   Median : 7.000   3:188    
##  6      : 60                Mean   :479.7   Mean   : 8.504            
##  7      : 60                3rd Qu.:708.0   3rd Qu.:12.000            
##  1      : 52                Max.   :920.0   Max.   :28.000            
##  (Other):148                                                          
##  week    SalesInThousands
##  1:137   Min.   :17.34   
##  2:137   1st Qu.:42.55   
##  3:137   Median :50.20   
##  4:137   Mean   :53.47   
##          3rd Qu.:60.48   
##          Max.   :99.65   
## 
library(psych)
describe(MarCamp.df)
##                  vars   n   mean     sd median trimmed    mad   min    max
## MarketID*           1 548   5.72   2.88    6.0    5.76   4.45  1.00  10.00
## MarketSize*         2 548   1.80   0.61    2.0    1.75   0.00  1.00   3.00
## LocationID          3 548 479.66 287.97  504.0  483.96 421.06  1.00 920.00
## AgeOfStore          4 548   8.50   6.64    7.0    7.63   5.93  1.00  28.00
## Promotion*          5 548   2.03   0.81    2.0    2.04   1.48  1.00   3.00
## week*               6 548   2.50   1.12    2.5    2.50   1.48  1.00   4.00
## SalesInThousands    7 548  53.47  16.76   50.2   52.02  12.76 17.34  99.65
##                   range  skew kurtosis    se
## MarketID*          9.00 -0.02    -1.18  0.12
## MarketSize*        2.00  0.14    -0.53  0.03
## LocationID       919.00 -0.02    -1.16 12.30
## AgeOfStore        27.00  1.04     0.35  0.28
## Promotion*         2.00 -0.05    -1.48  0.03
## week*              3.00  0.00    -1.37  0.05
## SalesInThousands  82.31  0.80     0.14  0.72

Effect of the three promotion strategies on sales

boxplot(SalesInThousands ~ Promotion,data=MarCamp.df, 
        main="Plot of Promotion Strategy vs Sales", ylab="Promotion Strategy", 
        xlab="Sales in Thousands", horizontal=TRUE,
        col=c("red","blue","yellow"))

Effect of Age of Store on Sales

plot(SalesInThousands ~ AgeOfStore , data=MarCamp.df, 
    xlab="Age Of Store", ylab="Sales in Thousands", 
    main="Visualization of Sales wrt Age of store")

Effect of Market Size on Sales

boxplot(SalesInThousands ~ MarketSize ,data=MarCamp.df, 
        main="Plot of Market Size and Sales", ylab="Market Size", 
        xlab="Sales In Thousands", horizontal=TRUE,
        col=c("red","blue","peachpuff","yellow", "green", "pink"))

Effect of Location ID on Promotional Strategy

boxplot(LocationID ~ Promotion ,data=MarCamp.df, 
        main="Plot of Promotion Strategy and Location ID", ylab="Promotion Strategy", 
        xlab="Location ID", horizontal=TRUE,
        col=c("red","blue","peachpuff","yellow", "green", "pink"))

Effect of Market Size on Sales

library(lattice)
histogram(~SalesInThousands | MarketSize, data=MarCamp.df)

Relationship between Market Size and Promotional Strategy

library(lattice)
histogram(~Promotion | MarketSize, data=MarCamp.df)

Percentage of Sales wrt market size and Promotion Strategies

# histogram of percentages
histogram(~SalesInThousands | Promotion + MarketSize, data=MarCamp.df,
#          type="count", 
          layout=c(3,3), 
          col=c("burlywood", "darkolivegreen", "red", "yellow", "peachpuff", "blue"))

Relation between market size, Promotion Strategies and week

# histogram of counts
histogram(~week | Promotion + MarketSize, data=MarCamp.df,
          type="count", 
          layout=c(3,3), 
          col=c("burlywood", "darkolivegreen", "red", "yellow", "peachpuff", "blue"))

Mean Sales wrt Market Size, Promotion, Age of Store and week

aggregate(SalesInThousands ~ MarketSize, data = MarCamp.df, mean)
##   MarketSize SalesInThousands
## 1      Large         70.11673
## 2     Medium         43.98534
## 3      Small         57.40933
aggregate(SalesInThousands ~ Promotion, data = MarCamp.df, mean)
##   Promotion SalesInThousands
## 1         1         58.09901
## 2         2         47.32941
## 3         3         55.36447
aggregate(SalesInThousands ~ AgeOfStore, data = MarCamp.df, mean)
##    AgeOfStore SalesInThousands
## 1           1         58.41562
## 2           2         59.17950
## 3           3         60.22750
## 4           4         53.43773
## 5           5         48.81864
## 6           6         51.36667
## 7           7         52.12875
## 8           8         50.47575
## 9           9         48.99607
## 10         10         39.31375
## 11         11         57.15937
## 12         12         47.48292
## 13         13         59.64250
## 14         14         49.06333
## 15         15         42.67375
## 16         17         49.93750
## 17         18         50.71000
## 18         19         63.63800
## 19         20         60.20250
## 20         22         59.68833
## 21         23         65.09750
## 22         24         51.14083
## 23         25         45.42500
## 24         27         52.39250
## 25         28         52.28500
aggregate(SalesInThousands ~ week, data = MarCamp.df, mean)
##   week SalesInThousands
## 1    1         53.79058
## 2    2         53.38657
## 3    3         53.47460
## 4    4         53.21307

Combined mean sales and age of store wrt market size and promotional strategies

aggregate(cbind(SalesInThousands, AgeOfStore) ~ MarketSize, data = MarCamp.df, mean)
##   MarketSize SalesInThousands AgeOfStore
## 1      Large         70.11673   7.142857
## 2     Medium         43.98534   8.787500
## 3      Small         57.40933  10.800000
aggregate(cbind(SalesInThousands, AgeOfStore) ~ Promotion, data = MarCamp.df, mean)
##   Promotion SalesInThousands AgeOfStore
## 1         1         58.09901   8.279070
## 2         2         47.32941   7.978723
## 3         3         55.36447   9.234043

Number of stores wrt Promotion strategies and market size in One way Contingency Tables

with(MarCamp.df, table(Promotion))
## Promotion
##   1   2   3 
## 172 188 188
with(MarCamp.df, table(MarketSize))
## MarketSize
##  Large Medium  Small 
##    168    320     60
prop.table(with(MarCamp.df, table(Promotion)))*100 # percentages
## Promotion
##        1        2        3 
## 31.38686 34.30657 34.30657
prop.table(with(MarCamp.df, table(MarketSize)))*100 # percentages
## MarketSize
##    Large   Medium    Small 
## 30.65693 58.39416 10.94891

Number of stores wrt Promotion strategies and market size in two way Contingency Tables

mytable1 <- xtabs(~ MarketSize+Promotion, data=MarCamp.df)
mytable1 # frequencies
##           Promotion
## MarketSize   1   2   3
##     Large   56  64  48
##     Medium  96 108 116
##     Small   20  16  24
addmargins(mytable1)
##           Promotion
## MarketSize   1   2   3 Sum
##     Large   56  64  48 168
##     Medium  96 108 116 320
##     Small   20  16  24  60
##     Sum    172 188 188 548
library(gmodels) 
CrossTable(MarCamp.df$MarketSize, MarCamp.df$Promotion)
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  548 
## 
##  
##                       | MarCamp.df$Promotion 
## MarCamp.df$MarketSize |         1 |         2 |         3 | Row Total | 
## ----------------------|-----------|-----------|-----------|-----------|
##                 Large |        56 |        64 |        48 |       168 | 
##                       |     0.203 |     0.703 |     1.611 |           | 
##                       |     0.333 |     0.381 |     0.286 |     0.307 | 
##                       |     0.326 |     0.340 |     0.255 |           | 
##                       |     0.102 |     0.117 |     0.088 |           | 
## ----------------------|-----------|-----------|-----------|-----------|
##                Medium |        96 |       108 |       116 |       320 | 
##                       |     0.196 |     0.029 |     0.352 |           | 
##                       |     0.300 |     0.338 |     0.362 |     0.584 | 
##                       |     0.558 |     0.574 |     0.617 |           | 
##                       |     0.175 |     0.197 |     0.212 |           | 
## ----------------------|-----------|-----------|-----------|-----------|
##                 Small |        20 |        16 |        24 |        60 | 
##                       |     0.072 |     1.021 |     0.567 |           | 
##                       |     0.333 |     0.267 |     0.400 |     0.109 | 
##                       |     0.116 |     0.085 |     0.128 |           | 
##                       |     0.036 |     0.029 |     0.044 |           | 
## ----------------------|-----------|-----------|-----------|-----------|
##          Column Total |       172 |       188 |       188 |       548 | 
##                       |     0.314 |     0.343 |     0.343 |           | 
## ----------------------|-----------|-----------|-----------|-----------|
## 
## 
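
The chi-square contributions printed in each cell can be summed into a single test of whether promotions were assigned independently of market size. The sketch below rebuilds the table from the printed counts and runs base R's `chisq.test`; it is a supplementary check and not part of the original analysis.

```r
# Counts copied from the two-way table of MarketSize by Promotion
counts <- matrix(
  c(56,  64,  48,
    96, 108, 116,
    20,  16,  24),
  nrow = 3, byrow = TRUE,
  dimnames = list(MarketSize = c("Large", "Medium", "Small"),
                  Promotion  = c("1", "2", "3"))
)

test <- chisq.test(counts)
print(test$statistic)   # sum of the per-cell chi-square contributions
print(test$parameter)   # degrees of freedom: (3 - 1) * (3 - 1)
print(test$p.value)
```

A statistic of about 4.75 on 4 degrees of freedom is well below the 5% critical value of 9.49, which is consistent with the promotions having been spread across market sizes roughly at random.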

Number of stores wrt Promotion strategies, market size and week in three way Contingency Tables

mytable2 <- xtabs(~ MarketSize+week+Promotion, data=MarCamp.df)
mytable2
## , , Promotion = 1
## 
##           week
## MarketSize  1  2  3  4
##     Large  14 14 14 14
##     Medium 24 24 24 24
##     Small   5  5  5  5
## 
## , , Promotion = 2
## 
##           week
## MarketSize  1  2  3  4
##     Large  16 16 16 16
##     Medium 27 27 27 27
##     Small   4  4  4  4
## 
## , , Promotion = 3
## 
##           week
## MarketSize  1  2  3  4
##     Large  12 12 12 12
##     Medium 29 29 29 29
##     Small   6  6  6  6
ftable(mytable2) # Compact 3-way Table
##                 Promotion  1  2  3
## MarketSize week                   
## Large      1              14 16 12
##            2              14 16 12
##            3              14 16 12
##            4              14 16 12
## Medium     1              24 27 29
##            2              24 27 29
##            3              24 27 29
##            4              24 27 29
## Small      1               5  4  6
##            2               5  4  6
##            3               5  4  6
##            4               5  4  6

Correlation Matrix for effect of Market Size on Promotion

library(psych)
#Conversion from factor to numeric form for making the correlation matrix
MarCamp.df$Promotion <- as.integer(MarCamp.df$Promotion)
MarCamp.df$week <- as.integer(MarCamp.df$week)
MarCamp.df$MarketID <- as.integer(MarCamp.df$MarketID)
MarCamp.df$MarketSize <- as.integer(MarCamp.df$MarketSize) # 1-Large, 2-Medium, 3-Small (alphabetical factor levels)
str(MarCamp.df)
## 'data.frame':    548 obs. of  7 variables:
##  $ MarketID        : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ MarketSize      : int  2 2 2 2 2 2 2 2 2 2 ...
##  $ LocationID      : int  1 1 1 1 2 2 2 2 3 3 ...
##  $ AgeOfStore      : int  4 4 4 4 5 5 5 5 12 12 ...
##  $ Promotion       : int  3 3 3 3 2 2 2 2 1 1 ...
##  $ week            : int  1 2 3 4 1 2 3 4 1 2 ...
##  $ SalesInThousands: num  33.7 35.7 29 39.2 27.8 ...
corr.test(MarCamp.df, use="complete")
## Call:corr.test(x = MarCamp.df, use = "complete")
## Correlation matrix 
##                  MarketID MarketSize LocationID AgeOfStore Promotion  week
## MarketID             1.00      -0.26       1.00      -0.05     -0.05  0.00
## MarketSize          -0.26       1.00      -0.27       0.16      0.06  0.00
## LocationID           1.00      -0.27       1.00      -0.05     -0.05  0.00
## AgeOfStore          -0.05       0.16      -0.05       1.00      0.06  0.00
## Promotion           -0.05       0.06      -0.05       0.06      1.00  0.00
## week                 0.00       0.00       0.00       0.00      0.00  1.00
## SalesInThousands    -0.19      -0.45      -0.19      -0.03     -0.06 -0.01
##                  SalesInThousands
## MarketID                    -0.19
## MarketSize                  -0.45
## LocationID                  -0.19
## AgeOfStore                  -0.03
## Promotion                   -0.06
## week                        -0.01
## SalesInThousands             1.00
## Sample Size 
## [1] 548
## Probability values (Entries above the diagonal are adjusted for multiple tests.) 
##                  MarketID MarketSize LocationID AgeOfStore Promotion week
## MarketID             0.00       0.00       0.00       1.00      1.00  1.0
## MarketSize           0.00       0.00       0.00       0.00      1.00  1.0
## LocationID           0.00       0.00       0.00       1.00      1.00  1.0
## AgeOfStore           0.24       0.00       0.24       0.00      1.00  1.0
## Promotion            0.28       0.19       0.24       0.16      0.00  1.0
## week                 1.00       1.00       1.00       1.00      1.00  0.0
## SalesInThousands     0.00       0.00       0.00       0.51      0.17  0.8
##                  SalesInThousands
## MarketID                        0
## MarketSize                      0
## LocationID                      0
## AgeOfStore                      1
## Promotion                       1
## week                            1
## SalesInThousands                0
## 
##  To see confidence intervals of the correlations, print with the short=FALSE option
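
Note that `as.integer()` on a factor returns the position of each level, and R orders levels alphabetically by default, so MarketSize is coded Large = 1, Medium = 2, Small = 3 in the correlation matrix above. If codes that increase with market size are wanted, the levels can be set explicitly first. A minimal sketch on a hypothetical vector:

```r
size <- c("Small", "Medium", "Large", "Medium")   # hypothetical values

# Default: levels sorted alphabetically (Large, Medium, Small)
default_codes <- as.integer(factor(size))
print(default_codes)

# Explicit level order so that codes increase with market size
ordered_size  <- factor(size, levels = c("Small", "Medium", "Large"))
ordered_codes <- as.integer(ordered_size)
print(ordered_codes)
```

With the default coding, the correlation of -0.45 between MarketSize and SalesInThousands should be read with Large as the lowest code, not the highest.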

Construct a Corrgram based on all variables in the dataset.

library(corrgram)
corrgram(MarCamp.df[, names(MarCamp.df)], order=FALSE,
         main="Corrgram of dataset variables",
         lower.panel=panel.shade, upper.panel=panel.pie,
         diag.panel=panel.minmax, text.panel=panel.txt)

Construct a scatter plot matrix of the dataset.

# Converting from Integer to Factor
MarCamp.df$Promotion <- as.factor(MarCamp.df$Promotion)
MarCamp.df$week <- as.factor(MarCamp.df$week)
MarCamp.df$MarketID <- as.factor(MarCamp.df$MarketID)
MarCamp.df$MarketSize <- as.factor(MarCamp.df$MarketSize)
library(car)
scatterplotMatrix(formula = ~ MarketID + MarketSize + LocationID + AgeOfStore +
                  Promotion + week + SalesInThousands, cex=0.6, data=MarCamp.df, 
                  diagonal="histogram")

Running t-Tests

# Converting from Factor to Integer
MarCamp.df$Promotion <- as.integer(MarCamp.df$Promotion)
# Dataset with Promotion strategies 1 and 2 only
Prom12.df <- MarCamp.df[which(MarCamp.df$Promotion <= 2), ]
## View(Prom12.df)

# Dataset with Promotion strategies 2 and 3 only
Prom23.df <- MarCamp.df[which(MarCamp.df$Promotion >= 2), ]
## View(Prom23.df)

# Dataset with Promotion strategies 1 and 3 only
Prom13.df <- MarCamp.df[which(MarCamp.df$Promotion != 2), ]
## View(Prom13.df)

# Converting from Integer to Factor
MarCamp.df$Promotion <- as.factor(MarCamp.df$Promotion)
Prom12.df$Promotion <- as.factor(Prom12.df$Promotion)
Prom23.df$Promotion <- as.factor(Prom23.df$Promotion)
Prom13.df$Promotion <- as.factor(Prom13.df$Promotion)

t.test(SalesInThousands ~ Promotion, data=Prom12.df)
## 
##  Welch Two Sample t-test
## 
## data:  SalesInThousands by Promotion
## t = 6.4275, df = 346.78, p-value = 4.29e-10
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   7.474093 14.065101
## sample estimates:
## mean in group 1 mean in group 2 
##        58.09901        47.32941
t.test(SalesInThousands ~ Promotion, data=Prom23.df)
## 
##  Welch Two Sample t-test
## 
## data:  SalesInThousands by Promotion
## t = -4.8814, df = 370.02, p-value = 1.569e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.271854  -4.798253
## sample estimates:
## mean in group 2 mean in group 3 
##        47.32941        55.36447
t.test(SalesInThousands ~ Promotion, data=Prom13.df)
## 
##  Welch Two Sample t-test
## 
## data:  SalesInThousands by Promotion
## t = 1.556, df = 355.92, p-value = 0.1206
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.7216369  6.1907240
## sample estimates:
## mean in group 1 mean in group 3 
##        58.09901        55.36447

First Regression Model

In this model we are considering the maximum number of important variables -

  1. Independent Variables - MarketSize, Promotion, week
  2. Dependent Variable - SalesInThousands

fit_a <- lm(SalesInThousands ~ MarketSize+Promotion+week, data = MarCamp.df)
summary(fit_a)
## 
## Call:
## lm(formula = SalesInThousands ~ MarketSize + Promotion + week, 
##     data = MarCamp.df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -25.2132  -7.8601   0.7121   7.8724  24.8168 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  74.8332     1.3461  55.591  < 2e-16 ***
## MarketSize2 -26.5213     1.0423 -25.445  < 2e-16 ***
## MarketSize3 -13.8219     1.6460  -8.397 4.05e-16 ***
## Promotion2  -10.7674     1.1522  -9.345  < 2e-16 ***
## Promotion3   -1.0156     1.1535  -0.881    0.379    
## week2        -0.4040     1.3183  -0.306    0.759    
## week3        -0.3160     1.3183  -0.240    0.811    
## week4        -0.5775     1.3183  -0.438    0.662    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.91 on 540 degrees of freedom
## Multiple R-squared:  0.5814, Adjusted R-squared:  0.5759 
## F-statistic: 107.1 on 7 and 540 DF,  p-value: < 2.2e-16

Second Regression Model

In this model we are considering only those variables whose effect Daer esp. wants to see given in the last paragraph of the case study. 1) Independent Variables - MarketID, LocationID, AgeOfStore, Promotion 2) Dependent Variables - SalesInThousands

fit_b <- lm(SalesInThousands ~ MarketID+LocationID+AgeOfStore+Promotion, data = MarCamp.df)
summary(fit_b)
## 
## Call:
## lm(formula = SalesInThousands ~ MarketID + LocationID + AgeOfStore + 
##     Promotion, data = MarCamp.df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.5060  -3.4551  -0.1593   3.4745  14.3174 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 39.8781902  0.9108658  43.781  < 2e-16 ***
## MarketID2   25.9118551  4.9504734   5.234 2.39e-07 ***
## MarketID3   49.8706379 10.1158147   4.930 1.10e-06 ***
## MarketID4   19.6766409 14.6969086   1.339    0.181    
## MarketID5   16.0503762 19.7758281   0.812    0.417    
## MarketID6    2.1058084 24.6771579   0.085    0.932    
## MarketID7    9.9726005 29.6060346   0.337    0.736    
## MarketID8   13.2753016 34.4303218   0.386    0.700    
## MarketID9   18.1428750 39.3210401   0.461    0.645    
## MarketID10  20.2760451 44.4695539   0.456    0.649    
## LocationID  -0.0009556  0.0492052  -0.019    0.985    
## AgeOfStore   0.0129108  0.0345882   0.373    0.709    
## Promotion2  -9.7180316  0.5627960 -17.267  < 2e-16 ***
## Promotion3  -4.9327702  0.5763881  -8.558  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.165 on 534 degrees of freedom
## Multiple R-squared:  0.9072, Adjusted R-squared:  0.905 
## F-statistic: 401.7 on 13 and 534 DF,  p-value: < 2.2e-16