The fast food sector is becoming highly competitive and, like many other industries, it is dominated by large companies. To drive customer traffic, smaller businesses must develop new and creative marketing techniques, which implies staying in constant touch with customers. One of the best ways to do this is proper marketing research: a small fast food company must know what its key customers want and will buy before developing marketing and advertising strategies (http://smallbusiness.chron.com) {1}. Marketing campaigns are therefore conducted to get a current overview of the trends and tastes that people in a given area prefer.
This report addresses the promotional campaign of VeganVista, a fast food chain in New Delhi. The first issue is which customers respond more positively to which campaign. In this study, I investigate the key metrics that drive VeganVista’s sales (in thousands) across its different localities {2}.
Marketing managers today need to use the full variety of marketing mix tools to achieve maximum benefit. Sales promotion is a very important marketing tool in the food industry (Sue Peattie; 1998) {3}.
Today there are many promotional tools, such as coupons, bonus packs, free samples and sweepstakes, and they are very commonly offered to consumers. It is the next step in this process, the consumer response to such promotional strategies, that has not been well understood (Chem L. Narayana and P.S. Raju; 1985) {4}. Given the importance of consumer response to promotional campaigns, this study analyzes how these promotional strategies influence sales and consumer decisions.
For this study, I collected data from the IBM Watson Analytics website (https://www.ibm.com/communities/analytics/watson-analytics-blog/marketing-campaign-eff-usec_-fastf/).
VeganVista is a self-service fast food restaurant; no table service is provided. It is located in the Connaught Place area of New Delhi, with several branches all over Delhi. The chain plans to add a new vegan patty burger to its menu and has decided on three possible marketing campaigns for promoting it. To determine which promotion has the greatest effect on sales, the new item is introduced at locations in several randomly selected markets. A different promotion is used at each location, and the weekly sales of the new item are recorded for the first four weeks.
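The random assignment of promotions to locations described above could be sketched roughly as follows. This is only an illustration, not the chain's actual procedure: the number of locations (12), the seed, and the names `location_ids`, `promotion_pool` and `assignment` are all assumptions.

```r
# Hypothetical sketch: randomly assign one of three promotions to each location
set.seed(42)                 # assumed seed, only for reproducibility
location_ids <- 1:12         # assumed number of test locations

# Build a balanced pool of promotion labels, then shuffle it
promotion_pool <- rep_len(1:3, length(location_ids))
assignment <- data.frame(
  LocationID = location_ids,
  Promotion  = factor(sample(promotion_pool))
)

# Each promotion ends up at the same number of locations
table(assignment$Promotion)
```

Shuffling a balanced pool, rather than drawing each label independently, keeps the group sizes equal, which makes the later comparison of mean sales easier to read.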
Our major focus was on the three promotional strategies used, which are as follows:
Collectibles - Fast food companies can drive sales through collectibles, particularly those that kids enjoy, for instance dolls, glasses or other mementos related to a popular animated film. The owners decided to provide one free item with the purchase of the vegan patty burger. This strategy entices people to come back until they have all the collectibles.
Societal Marketing - It includes volunteering or collecting money or items for charity. Consumers who relate to our ideas and values due to our charitable work may, in turn, patronize our fast food restaurant.
Loyalty Programs - Frequency card programs are a popular type of loyalty program for fast food restaurants. The owners plan to invite people to fill out an application and reward them according to how often they order the product: a free drink after the first four orders, free fries after the next four orders and, ultimately, a free meal after 12 orders.
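The reward schedule of the loyalty program can be written down as a small lookup. The function name `loyalty_reward` is hypothetical, and the sketch assumes rewards are granted exactly on the 4th, 8th and 12th order; what happens after the 12th order is not stated in the plan, so it is not modelled here.

```r
# Hypothetical helper: reward earned on a customer's n-th order of the product.
# Thresholds follow the plan described above: drink at 4, fries at 8, meal at 12.
loyalty_reward <- function(order_count) {
  stopifnot(is.numeric(order_count), length(order_count) == 1, order_count >= 0)
  if (order_count == 4) {
    "free drink"
  } else if (order_count == 8) {
    "free fries"
  } else if (order_count == 12) {
    "free meal"
  } else {
    "none"
  }
}

# Rewards for a few example order counts
sapply(c(3, 4, 8, 12), loyalty_reward)
```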
I have used two models to test how different variables drive sales (in thousands).
## 4.1 First Model
Null Hypothesis - MarketSize, Promotion and week have equal effect on sales. In order to test the hypothesis, the proposed model is as follows:
\[SalesInThousands= \beta_0 + \beta_1 MarketSize + \beta_2 Promotion + \beta_3 week\]
## 4.2 Second Model
Null Hypothesis - MarketID, LocationID, AgeOfStore and Promotion have equal effect on sales. In order to test the hypothesis, the proposed model is as follows:
\[SalesInThousands= \beta_0 + \beta_1 MarketID + \beta_2 LocationID + \beta_3 AgeOfStore + \beta_4 Promotion\]
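Because MarketSize, Promotion, week and MarketID enter the models as factors, R estimates one coefficient per non-baseline level rather than a single \(\beta\) per variable. The sketch below shows this expansion on a few made-up rows; the `toy` data frame is hypothetical and only mirrors the factor levels of the dataset.

```r
# Toy data (hypothetical values) with the same factor levels as the dataset
toy <- data.frame(
  MarketSize = factor(c("Large", "Medium", "Small", "Medium")),
  Promotion  = factor(c(1, 2, 3, 1))
)

# model.matrix() shows the dummy (treatment-coded) columns that lm() would use;
# the first level of each factor ("Large", "1") is absorbed into the intercept
design <- model.matrix(~ MarketSize + Promotion, data = toy)
colnames(design)
```

This is why the regression output lists terms such as MarketSizeMedium and Promotion2: each is a shift relative to the baseline level of its factor.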
## Reading the dataset
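## stringsAsFactors = TRUE keeps MarketSize as a factor (R >= 4.0 reads strings as character by default)
MarCamp.df <- read.csv("WA_Fn-UseC_-Marketing-Campaign-Eff-UseC_-FastF.csv", stringsAsFactors = TRUE)
## Converting Promotion, week and MarketID columns to factors from integers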
MarCamp.df$Promotion <- as.factor(MarCamp.df$Promotion)
MarCamp.df$week <- as.factor(MarCamp.df$week)
MarCamp.df$MarketID <- as.factor(MarCamp.df$MarketID)
str(MarCamp.df)
## 'data.frame': 548 obs. of 7 variables:
## $ MarketID : Factor w/ 10 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ MarketSize : Factor w/ 3 levels "Large","Medium",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ LocationID : int 1 1 1 1 2 2 2 2 3 3 ...
## $ AgeOfStore : int 4 4 4 4 5 5 5 5 12 12 ...
## $ Promotion : Factor w/ 3 levels "1","2","3": 3 3 3 3 2 2 2 2 1 1 ...
## $ week : Factor w/ 4 levels "1","2","3","4": 1 2 3 4 1 2 3 4 1 2 ...
## $ SalesInThousands: num 33.7 35.7 29 39.2 27.8 ...
## First Regression Model
fit_a <- lm(SalesInThousands ~ MarketSize+Promotion+week, data = MarCamp.df)
summary(fit_a)
##
## Call:
## lm(formula = SalesInThousands ~ MarketSize + Promotion + week,
## data = MarCamp.df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -25.2132 -7.8601 0.7121 7.8724 24.8168
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 74.8332 1.3461 55.591 < 2e-16 ***
## MarketSizeMedium -26.5213 1.0423 -25.445 < 2e-16 ***
## MarketSizeSmall -13.8219 1.6460 -8.397 4.05e-16 ***
## Promotion2 -10.7674 1.1522 -9.345 < 2e-16 ***
## Promotion3 -1.0156 1.1535 -0.881 0.379
## week2 -0.4040 1.3183 -0.306 0.759
## week3 -0.3160 1.3183 -0.240 0.811
## week4 -0.5775 1.3183 -0.438 0.662
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.91 on 540 degrees of freedom
## Multiple R-squared: 0.5814, Adjusted R-squared: 0.5759
## F-statistic: 107.1 on 7 and 540 DF, p-value: < 2.2e-16
## Second Regression Model
fit_b <- lm(SalesInThousands ~ MarketID+LocationID+AgeOfStore+Promotion, data = MarCamp.df)
summary(fit_b)
##
## Call:
## lm(formula = SalesInThousands ~ MarketID + LocationID + AgeOfStore +
## Promotion, data = MarCamp.df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.5060 -3.4551 -0.1593 3.4745 14.3174
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 39.8781902 0.9108658 43.781 < 2e-16 ***
## MarketID2 25.9118551 4.9504734 5.234 2.39e-07 ***
## MarketID3 49.8706379 10.1158147 4.930 1.10e-06 ***
## MarketID4 19.6766409 14.6969086 1.339 0.181
## MarketID5 16.0503762 19.7758281 0.812 0.417
## MarketID6 2.1058084 24.6771579 0.085 0.932
## MarketID7 9.9726005 29.6060346 0.337 0.736
## MarketID8 13.2753016 34.4303218 0.386 0.700
## MarketID9 18.1428750 39.3210401 0.461 0.645
## MarketID10 20.2760451 44.4695539 0.456 0.649
## LocationID -0.0009556 0.0492052 -0.019 0.985
## AgeOfStore 0.0129108 0.0345882 0.373 0.709
## Promotion2 -9.7180316 0.5627960 -17.267 < 2e-16 ***
## Promotion3 -4.9327702 0.5763881 -8.558 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.165 on 534 degrees of freedom
## Multiple R-squared: 0.9072, Adjusted R-squared: 0.905
## F-statistic: 401.7 on 13 and 534 DF, p-value: < 2.2e-16
# Converting from Factor to Integer
MarCamp.df$Promotion <- as.integer(MarCamp.df$Promotion)
# Dataset with Promotion strategies 1 and 2 only
Prom12.df <- MarCamp.df[which(MarCamp.df$Promotion <= 2), ]
## View(Prom12.df)
# Dataset with Promotion strategies 2 and 3 only
Prom23.df <- MarCamp.df[which(MarCamp.df$Promotion >= 2), ]
## View(Prom23.df)
# Dataset with Promotion strategies 1 and 3 only
Prom13.df <- MarCamp.df[which(MarCamp.df$Promotion != 2), ]
## View(Prom13.df)
# Converting from Integer to Factor
MarCamp.df$Promotion <- as.factor(MarCamp.df$Promotion)
Prom12.df$Promotion <- as.factor(Prom12.df$Promotion)
Prom23.df$Promotion <- as.factor(Prom23.df$Promotion)
Prom13.df$Promotion <- as.factor(Prom13.df$Promotion)
t.test(SalesInThousands ~ Promotion, data=Prom12.df)
##
## Welch Two Sample t-test
##
## data: SalesInThousands by Promotion
## t = 6.4275, df = 346.78, p-value = 4.29e-10
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 7.474093 14.065101
## sample estimates:
## mean in group 1 mean in group 2
## 58.09901 47.32941
t.test(SalesInThousands ~ Promotion, data=Prom23.df)
##
## Welch Two Sample t-test
##
## data: SalesInThousands by Promotion
## t = -4.8814, df = 370.02, p-value = 1.569e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.271854 -4.798253
## sample estimates:
## mean in group 2 mean in group 3
## 47.32941 55.36447
t.test(SalesInThousands ~ Promotion, data=Prom13.df)
##
## Welch Two Sample t-test
##
## data: SalesInThousands by Promotion
## t = 1.556, df = 355.92, p-value = 0.1206
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.7216369 6.1907240
## sample estimates:
## mean in group 1 mean in group 3
## 58.09901 55.36447
The p-value of the t-test comparing promotional strategies 1 and 3 is 0.1206, which is above 0.05.
The t-tests showed a significant difference in average sales between promotions 1 and 2 and between promotions 2 and 3, but no significant difference between promotions 1 and 3.
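The three pairwise comparisons can also be run in a single call with `pairwise.t.test()` from base R's stats package. The sketch below uses simulated sales, not the real dataset, and the group means and spreads are made up. Note that it adjusts the p-values for multiple comparisons (Holm's method, the function's default), which the separate t-tests above do not do.

```r
# Simulated stand-in for the real data (all values are made up)
set.seed(1)
sim <- data.frame(
  Promotion = factor(rep(1:3, each = 50)),
  SalesInThousands = c(rnorm(50, 58, 16), rnorm(50, 47, 15), rnorm(50, 55, 17))
)

# All three Welch comparisons in one call, with Holm-adjusted p-values;
# pool.sd = FALSE makes each pair use its own variances, as t.test() does
pw <- pairwise.t.test(sim$SalesInThousands, sim$Promotion,
                      pool.sd = FALSE, p.adjust.method = "holm")
pw$p.value
```

The result is a lower-triangular matrix of p-values, one per pair of promotions, so no intermediate two-promotion subsets are needed.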
Both models are informative, as each contains statistically significant variables.
The explanatory variables whose beta-coefficients are statistically significant (p < 0.05) are MarketSizeMedium, MarketSizeSmall and Promotion2 in the first model, and MarketID2, MarketID3, Promotion2 and Promotion3 in the second model.
The p-value of both models is less than 2.2e-16, which is far below 0.05; therefore, the models as a whole are useful for predicting sales.
Both models therefore pass the overall F-test.
According to the adjusted R-squared, the predictor variables explain about 58% and 91% of the variance in sales in the first and second model respectively, which suggests that the chosen variables are adequate for explaining sales.
There is a very strong relationship between market size (small and medium, relative to large) and sales. The same is true of the second promotion strategy.
Markets 2 and 3 have significantly higher sales than market 1, the baseline, as shown by the positive coefficients for MarketID2 and MarketID3.
Sales are lower for small and medium market sizes than for large markets, keeping all other variables constant, as shown by the negative beta coefficients.
Promotion strategy 2, i.e. societal marketing, also decreases sales relative to promotion strategy 1.
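To make the baseline interpretation concrete, the coefficients of the first model can be added up by hand. The intercept is the expected weekly sales for the baseline combination (Large market, Promotion 1, week 1), and every other coefficient is a shift from it. The values below are copied from the first model's summary output; the variable names are only illustrative.

```r
# Coefficients copied from the first model's summary output
coefs <- c(Intercept = 74.8332, MarketSizeMedium = -26.5213,
           MarketSizeSmall = -13.8219, Promotion2 = -10.7674,
           Promotion3 = -1.0156)

# Baseline: Large market, Promotion 1, week 1
baseline <- coefs[["Intercept"]]

# Medium market with Promotion 2 in week 1: add the two relevant shifts
medium_promo2 <- baseline + coefs[["MarketSizeMedium"]] + coefs[["Promotion2"]]

c(baseline = baseline, medium_promo2 = medium_promo2)
```

Under this model, a medium-market store running Promotion 2 is expected to sell about 37.5 thousand per week, compared with about 74.8 thousand for the baseline.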
The first plot in the code suggests that the first promotional strategy, collectibles, drives sales the most, followed by the third strategy and then the second (mean sales of 58.1, 55.4 and 47.3 thousand respectively).
The second plot suggests that the age of a store does not have much effect on sales.
The third plot suggests that large markets have higher sales (in thousands) than small and medium markets.
Small and large market sizes have the highest sales with the third promotional strategy, loyalty programs.
Medium market size has the highest sales with the second promotional strategy, societal marketing.
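Claims about which promotion works best within each market size can be checked numerically with a two-way table of means. The sketch below runs on simulated sales, so its numbers say nothing about VeganVista; only the column names match the dataset, and `cell_means` and `best` are illustrative names.

```r
# Simulated stand-in (made-up sales) using the dataset's column names
set.seed(3)
sim <- expand.grid(MarketSize = c("Large", "Medium", "Small"),
                   Promotion = factor(1:3), rep = 1:10)
sim$SalesInThousands <- rnorm(nrow(sim), mean = 50, sd = 10)

# Mean sales for every MarketSize x Promotion combination
cell_means <- aggregate(SalesInThousands ~ MarketSize + Promotion,
                        data = sim, FUN = mean)

# For each market size, keep the promotion with the highest mean
best <- do.call(rbind, lapply(split(cell_means, cell_means$MarketSize),
                              function(d) d[which.max(d$SalesInThousands), ]))
best
```

Replacing `sim` with `MarCamp.df` would give the corresponding table for the real data.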
This report is motivated by the need for research that could improve sales (measured in thousands) using different promotional techniques. I found that customers respond more positively in the case of societal marketing and loyalty programs. Also, the medium and large market sizes drive sales in thousands to a large extent.
{1} Fast Food Marketing. Available from: http://smallbusiness.chron.com/fast-food-marketing-strategies-3408.html [25 Feb, 2018]
{2} Marketing Campaign, Promotion Effectiveness - Fast Food Chain. Available from: https://www.ibm.com/communities/analytics/watson-analytics-blog/marketing-campaign-eff-usec_-fastf/ [25 Feb, 2018]
{3} Sue Peattie (1998) “Promotional competitions as a marketing tool in food retailing”, British Food Journal, Vol. 100, Issue 6, pp. 286-294, https://doi.org/10.1108/00070709810230472 [25 Feb, 2018]
{4} Chem L. Narayana & P. S. Raju (1985) “Gifts versus Sweepstakes: Consumer Choices and Profiles”, Journal of Advertising, Vol. 14, Issue 1, pp. 50-53, https://doi.org/10.1080/00913367.1985.10672930 [25 Feb, 2018]
{5} Prof. Sameer Mathur (Marketing Professor at IIM Lucknow), Data Science and Analytics using R programming. Course available from: https://www.udemy.com/
Total Observations in Table: 548
Promotion strategy adopted (columns 1 to 3) by market size (rows). Each cell lists, from top to bottom: N, chi-square contribution, N / row total, N / column total, N / table total. The total cells list N and its share of the table total.
| Market Size | 1 | 2 | 3 | Row Total |
|---|---|---|---|---|
| Large | 56 | 64 | 48 | 168 |
| | 0.203 | 0.703 | 1.611 | |
| | 0.333 | 0.381 | 0.286 | 0.307 |
| | 0.326 | 0.340 | 0.255 | |
| | 0.102 | 0.117 | 0.088 | |
| Medium | 96 | 108 | 116 | 320 |
| | 0.196 | 0.029 | 0.352 | |
| | 0.300 | 0.338 | 0.362 | 0.584 |
| | 0.558 | 0.574 | 0.617 | |
| | 0.175 | 0.197 | 0.212 | |
| Small | 20 | 16 | 24 | 60 |
| | 0.072 | 1.021 | 0.567 | |
| | 0.333 | 0.267 | 0.400 | 0.109 |
| | 0.116 | 0.085 | 0.128 | |
| | 0.036 | 0.029 | 0.044 | |
| Column Total | 172 | 188 | 188 | 548 |
| | 0.314 | 0.343 | 0.343 | |
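The chi-square contributions in the cross-tabulation can be summed into a single test of whether promotions were spread evenly across market sizes. The sketch below re-enters the counts from the table and runs `chisq.test()` from base R; the object names `counts` and `chi` are illustrative.

```r
# Counts copied from the cross-tabulation above
counts <- matrix(c(56, 64, 48,
                   96, 108, 116,
                   20, 16, 24),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(MarketSize = c("Large", "Medium", "Small"),
                                 Promotion = c("1", "2", "3")))

# Pearson chi-square test of independence between market size and promotion
chi <- chisq.test(counts)
chi
```

The statistic is the sum of the nine cell contributions (about 4.75 on 4 degrees of freedom, with a p-value of roughly 0.31), so there is no evidence that the promotions were allocated unevenly across market sizes.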
First model, statistically significant coefficients (*** p < 0.001):
| Variable | \(\beta\) | SE | t-statistic |
|---|---|---|---|
| Intercept | 74.8332*** | 1.3461 | 55.591 |
| MarketSizeMedium | -26.5213*** | 1.0423 | -25.445 |
| MarketSizeSmall | -13.8219*** | 1.6460 | -8.397 |
| Promotion2 | -10.7674*** | 1.1522 | -9.345 |
Second model, statistically significant coefficients (*** p < 0.001):
| Variable | \(\beta\) | SE | t-statistic |
|---|---|---|---|
| (Intercept) | 39.8781902*** | 0.9108658 | 43.781 |
| MarketID2 | 25.9118551*** | 4.9504734 | 5.234 |
| MarketID3 | 49.8706379*** | 10.1158147 | 4.930 |
| Promotion2 | -9.7180316*** | 0.5627960 | -17.267 |
| Promotion3 | -4.9327702*** | 0.5763881 | -8.558 |
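The quantities quoted in the discussion (significant coefficients, the overall F-test p-value and the adjusted R-squared) can be read from a fitted model programmatically instead of from the printed summary. The sketch below uses simulated data and a hypothetical helper, `significant_terms()`; neither is part of the original analysis.

```r
# Simulated data (made up): x1 affects y, x2 is pure noise
set.seed(4)
n <- 200
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- 1 + 2 * sim$x1 + rnorm(n)
fit <- lm(y ~ x1 + x2, data = sim)
fit_summary <- summary(fit)

# Hypothetical helper: rows of the coefficient table with p-value below alpha
significant_terms <- function(model_summary, alpha = 0.05) {
  coef_table <- model_summary$coefficients
  coef_table[coef_table[, "Pr(>|t|)"] < alpha, , drop = FALSE]
}
sig <- significant_terms(fit_summary)

# Overall F-test p-value, recomputed from the F statistic and its degrees of freedom
f <- fit_summary$fstatistic
f_p_value <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

# Adjusted R-squared from its definition: 1 - (1 - R^2) * (n - 1) / (n - p - 1)
p <- length(coef(fit)) - 1
adj_r2 <- 1 - (1 - fit_summary$r.squared) * (n - 1) / (n - p - 1)

sig
c(f_p_value = f_p_value, adj_r2 = adj_r2, reported_adj_r2 = fit_summary$adj.r.squared)
```

Applied to `fit_a` and `fit_b`, the same calls would return the rows shown in the two coefficient tables. The adjusted R-squared penalizes each extra predictor, which is why it sits slightly below the multiple R-squared in both models.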
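## stringsAsFactors = TRUE keeps MarketSize as a factor (R >= 4.0 reads strings as character by default)
MarCamp.df <- read.csv("WA_Fn-UseC_-Marketing-Campaign-Eff-UseC_-FastF.csv", stringsAsFactors = TRUE)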
str(MarCamp.df)
## 'data.frame': 548 obs. of 7 variables:
## $ MarketID : int 1 1 1 1 1 1 1 1 1 1 ...
## $ MarketSize : Factor w/ 3 levels "Large","Medium",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ LocationID : int 1 1 1 1 2 2 2 2 3 3 ...
## $ AgeOfStore : int 4 4 4 4 5 5 5 5 12 12 ...
## $ Promotion : int 3 3 3 3 2 2 2 2 1 1 ...
## $ week : int 1 2 3 4 1 2 3 4 1 2 ...
## $ SalesInThousands: num 33.7 35.7 29 39.2 27.8 ...
dim(MarCamp.df)
## [1] 548 7
## Converting Promotion, week and MarketID columns to factors from integers
MarCamp.df$Promotion <- as.factor(MarCamp.df$Promotion)
MarCamp.df$week <- as.factor(MarCamp.df$week)
MarCamp.df$MarketID <- as.factor(MarCamp.df$MarketID)
str(MarCamp.df)
## 'data.frame': 548 obs. of 7 variables:
## $ MarketID : Factor w/ 10 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ MarketSize : Factor w/ 3 levels "Large","Medium",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ LocationID : int 1 1 1 1 2 2 2 2 3 3 ...
## $ AgeOfStore : int 4 4 4 4 5 5 5 5 12 12 ...
## $ Promotion : Factor w/ 3 levels "1","2","3": 3 3 3 3 2 2 2 2 1 1 ...
## $ week : Factor w/ 4 levels "1","2","3","4": 1 2 3 4 1 2 3 4 1 2 ...
## $ SalesInThousands: num 33.7 35.7 29 39.2 27.8 ...
summary(MarCamp.df)
## MarketID MarketSize LocationID AgeOfStore Promotion
## 3 : 88 Large :168 Min. : 1.0 Min. : 1.000 1:172
## 10 : 80 Medium:320 1st Qu.:216.0 1st Qu.: 4.000 2:188
## 5 : 60 Small : 60 Median :504.0 Median : 7.000 3:188
## 6 : 60 Mean :479.7 Mean : 8.504
## 7 : 60 3rd Qu.:708.0 3rd Qu.:12.000
## 1 : 52 Max. :920.0 Max. :28.000
## (Other):148
## week SalesInThousands
## 1:137 Min. :17.34
## 2:137 1st Qu.:42.55
## 3:137 Median :50.20
## 4:137 Mean :53.47
## 3rd Qu.:60.48
## Max. :99.65
##
library(psych)
describe(MarCamp.df)
## vars n mean sd median trimmed mad min max
## MarketID* 1 548 5.72 2.88 6.0 5.76 4.45 1.00 10.00
## MarketSize* 2 548 1.80 0.61 2.0 1.75 0.00 1.00 3.00
## LocationID 3 548 479.66 287.97 504.0 483.96 421.06 1.00 920.00
## AgeOfStore 4 548 8.50 6.64 7.0 7.63 5.93 1.00 28.00
## Promotion* 5 548 2.03 0.81 2.0 2.04 1.48 1.00 3.00
## week* 6 548 2.50 1.12 2.5 2.50 1.48 1.00 4.00
## SalesInThousands 7 548 53.47 16.76 50.2 52.02 12.76 17.34 99.65
## range skew kurtosis se
## MarketID* 9.00 -0.02 -1.18 0.12
## MarketSize* 2.00 0.14 -0.53 0.03
## LocationID 919.00 -0.02 -1.16 12.30
## AgeOfStore 27.00 1.04 0.35 0.28
## Promotion* 2.00 -0.05 -1.48 0.03
## week* 3.00 0.00 -1.37 0.05
## SalesInThousands 82.31 0.80 0.14 0.72
boxplot(SalesInThousands ~ Promotion,data=MarCamp.df,
main="Plot of Promotion Strategy vs Sales", ylab="Promotion Strategy",
xlab="Sales in Thousands", horizontal=TRUE,
col=c("red","blue","yellow"))
plot(SalesInThousands ~ AgeOfStore , data=MarCamp.df,
xlab="Age Of Store", ylab="Sales in Thousands",
main="Visualization of Sales wrt Age of store")
boxplot(SalesInThousands ~ MarketSize ,data=MarCamp.df,
main="Plot of Market Size and Sales", ylab="Market Size",
xlab="Sales In Thousands", horizontal=TRUE,
col=c("red","blue","peachpuff","yellow", "green", "pink"))
boxplot(LocationID ~ Promotion ,data=MarCamp.df,
main="Plot of Promotion Strategy and Location ID", ylab="Promotion Strategy",
xlab="Location ID", horizontal=TRUE,
col=c("red","blue","peachpuff","yellow", "green", "pink"))
library(lattice)
histogram(~SalesInThousands | MarketSize, data=MarCamp.df)
library(lattice)
histogram(~Promotion | MarketSize, data=MarCamp.df)
# histogram of percentages
histogram(~SalesInThousands | Promotion + MarketSize, data=MarCamp.df,
# type="count",
layout=c(3,3),
col=c("burlywood", "darkolivegreen", "red", "yellow", "peachpuff", "blue"))
# histogram of counts
histogram(~week | Promotion + MarketSize, data=MarCamp.df,
type="count",
layout=c(3,3),
col=c("burlywood", "darkolivegreen", "red", "yellow", "peachpuff", "blue"))
aggregate(SalesInThousands ~ MarketSize, data = MarCamp.df, mean)
## MarketSize SalesInThousands
## 1 Large 70.11673
## 2 Medium 43.98534
## 3 Small 57.40933
aggregate(SalesInThousands ~ Promotion, data = MarCamp.df, mean)
## Promotion SalesInThousands
## 1 1 58.09901
## 2 2 47.32941
## 3 3 55.36447
aggregate(SalesInThousands ~ AgeOfStore, data = MarCamp.df, mean)
## AgeOfStore SalesInThousands
## 1 1 58.41562
## 2 2 59.17950
## 3 3 60.22750
## 4 4 53.43773
## 5 5 48.81864
## 6 6 51.36667
## 7 7 52.12875
## 8 8 50.47575
## 9 9 48.99607
## 10 10 39.31375
## 11 11 57.15937
## 12 12 47.48292
## 13 13 59.64250
## 14 14 49.06333
## 15 15 42.67375
## 16 17 49.93750
## 17 18 50.71000
## 18 19 63.63800
## 19 20 60.20250
## 20 22 59.68833
## 21 23 65.09750
## 22 24 51.14083
## 23 25 45.42500
## 24 27 52.39250
## 25 28 52.28500
aggregate(SalesInThousands ~ week, data = MarCamp.df, mean)
## week SalesInThousands
## 1 1 53.79058
## 2 2 53.38657
## 3 3 53.47460
## 4 4 53.21307
aggregate(cbind(SalesInThousands, AgeOfStore) ~ MarketSize, data = MarCamp.df, mean)
## MarketSize SalesInThousands AgeOfStore
## 1 Large 70.11673 7.142857
## 2 Medium 43.98534 8.787500
## 3 Small 57.40933 10.800000
aggregate(cbind(SalesInThousands, AgeOfStore) ~ Promotion, data = MarCamp.df, mean)
## Promotion SalesInThousands AgeOfStore
## 1 1 58.09901 8.279070
## 2 2 47.32941 7.978723
## 3 3 55.36447 9.234043
with(MarCamp.df, table(Promotion))
## Promotion
## 1 2 3
## 172 188 188
with(MarCamp.df, table(MarketSize))
## MarketSize
## Large Medium Small
## 168 320 60
prop.table(with(MarCamp.df, table(Promotion)))*100 # percentages
## Promotion
## 1 2 3
## 31.38686 34.30657 34.30657
prop.table(with(MarCamp.df, table(MarketSize)))*100 # percentages
## MarketSize
## Large Medium Small
## 30.65693 58.39416 10.94891
mytable1 <- xtabs(~ MarketSize+Promotion, data=MarCamp.df)
mytable1 # frequencies
## Promotion
## MarketSize 1 2 3
## Large 56 64 48
## Medium 96 108 116
## Small 20 16 24
addmargins(mytable1)
## Promotion
## MarketSize 1 2 3 Sum
## Large 56 64 48 168
## Medium 96 108 116 320
## Small 20 16 24 60
## Sum 172 188 188 548
library(gmodels)
CrossTable(MarCamp.df$MarketSize, MarCamp.df$Promotion)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | Chi-square contribution |
## | N / Row Total |
## | N / Col Total |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 548
##
##
## | MarCamp.df$Promotion
## MarCamp.df$MarketSize | 1 | 2 | 3 | Row Total |
## ----------------------|-----------|-----------|-----------|-----------|
## Large | 56 | 64 | 48 | 168 |
## | 0.203 | 0.703 | 1.611 | |
## | 0.333 | 0.381 | 0.286 | 0.307 |
## | 0.326 | 0.340 | 0.255 | |
## | 0.102 | 0.117 | 0.088 | |
## ----------------------|-----------|-----------|-----------|-----------|
## Medium | 96 | 108 | 116 | 320 |
## | 0.196 | 0.029 | 0.352 | |
## | 0.300 | 0.338 | 0.362 | 0.584 |
## | 0.558 | 0.574 | 0.617 | |
## | 0.175 | 0.197 | 0.212 | |
## ----------------------|-----------|-----------|-----------|-----------|
## Small | 20 | 16 | 24 | 60 |
## | 0.072 | 1.021 | 0.567 | |
## | 0.333 | 0.267 | 0.400 | 0.109 |
## | 0.116 | 0.085 | 0.128 | |
## | 0.036 | 0.029 | 0.044 | |
## ----------------------|-----------|-----------|-----------|-----------|
## Column Total | 172 | 188 | 188 | 548 |
## | 0.314 | 0.343 | 0.343 | |
## ----------------------|-----------|-----------|-----------|-----------|
##
##
mytable2 <- xtabs(~ MarketSize+week+Promotion, data=MarCamp.df)
mytable2
## , , Promotion = 1
##
## week
## MarketSize 1 2 3 4
## Large 14 14 14 14
## Medium 24 24 24 24
## Small 5 5 5 5
##
## , , Promotion = 2
##
## week
## MarketSize 1 2 3 4
## Large 16 16 16 16
## Medium 27 27 27 27
## Small 4 4 4 4
##
## , , Promotion = 3
##
## week
## MarketSize 1 2 3 4
## Large 12 12 12 12
## Medium 29 29 29 29
## Small 6 6 6 6
ftable(mytable2) # Compact 3-way Table
## Promotion 1 2 3
## MarketSize week
## Large 1 14 16 12
## 2 14 16 12
## 3 14 16 12
## 4 14 16 12
## Medium 1 24 27 29
## 2 24 27 29
## 3 24 27 29
## 4 24 27 29
## Small 1 5 4 6
## 2 5 4 6
## 3 5 4 6
## 4 5 4 6
library(psych)
#Conversion from factor to numeric form for making the correlation matrix
MarCamp.df$Promotion <- as.integer(MarCamp.df$Promotion)
MarCamp.df$week <- as.integer(MarCamp.df$week)
MarCamp.df$MarketID <- as.integer(MarCamp.df$MarketID)
MarCamp.df$MarketSize <- as.integer(MarCamp.df$MarketSize) # 1-Large, 2-Medium, 3-Small (alphabetical factor levels)
str(MarCamp.df)
## 'data.frame': 548 obs. of 7 variables:
## $ MarketID : int 1 1 1 1 1 1 1 1 1 1 ...
## $ MarketSize : int 2 2 2 2 2 2 2 2 2 2 ...
## $ LocationID : int 1 1 1 1 2 2 2 2 3 3 ...
## $ AgeOfStore : int 4 4 4 4 5 5 5 5 12 12 ...
## $ Promotion : int 3 3 3 3 2 2 2 2 1 1 ...
## $ week : int 1 2 3 4 1 2 3 4 1 2 ...
## $ SalesInThousands: num 33.7 35.7 29 39.2 27.8 ...
corr.test(MarCamp.df, use="complete")
## Call:corr.test(x = MarCamp.df, use = "complete")
## Correlation matrix
## MarketID MarketSize LocationID AgeOfStore Promotion week
## MarketID 1.00 -0.26 1.00 -0.05 -0.05 0.00
## MarketSize -0.26 1.00 -0.27 0.16 0.06 0.00
## LocationID 1.00 -0.27 1.00 -0.05 -0.05 0.00
## AgeOfStore -0.05 0.16 -0.05 1.00 0.06 0.00
## Promotion -0.05 0.06 -0.05 0.06 1.00 0.00
## week 0.00 0.00 0.00 0.00 0.00 1.00
## SalesInThousands -0.19 -0.45 -0.19 -0.03 -0.06 -0.01
## SalesInThousands
## MarketID -0.19
## MarketSize -0.45
## LocationID -0.19
## AgeOfStore -0.03
## Promotion -0.06
## week -0.01
## SalesInThousands 1.00
## Sample Size
## [1] 548
## Probability values (Entries above the diagonal are adjusted for multiple tests.)
## MarketID MarketSize LocationID AgeOfStore Promotion week
## MarketID 0.00 0.00 0.00 1.00 1.00 1.0
## MarketSize 0.00 0.00 0.00 0.00 1.00 1.0
## LocationID 0.00 0.00 0.00 1.00 1.00 1.0
## AgeOfStore 0.24 0.00 0.24 0.00 1.00 1.0
## Promotion 0.28 0.19 0.24 0.16 0.00 1.0
## week 1.00 1.00 1.00 1.00 1.00 0.0
## SalesInThousands 0.00 0.00 0.00 0.51 0.17 0.8
## SalesInThousands
## MarketID 0
## MarketSize 0
## LocationID 0
## AgeOfStore 1
## Promotion 1
## week 1
## SalesInThousands 0
##
## To see confidence intervals of the correlations, print with the short=FALSE option
library(corrgram)
corrgram(MarCamp.df[, names(MarCamp.df)], order=FALSE,
main="Corrgram of dataset variables",
lower.panel=panel.shade, upper.panel=panel.pie,
diag.panel=panel.minmax, text.panel=panel.txt)
# Converting from Integer to Factor
MarCamp.df$Promotion <- as.factor(MarCamp.df$Promotion)
MarCamp.df$week <- as.factor(MarCamp.df$week)
MarCamp.df$MarketID <- as.factor(MarCamp.df$MarketID)
MarCamp.df$MarketSize <- as.factor(MarCamp.df$MarketSize)
library(car)
scatterplotMatrix(formula = ~ MarketID + MarketSize + LocationID + AgeOfStore +
Promotion + week + SalesInThousands, cex=0.6, data=MarCamp.df,
diagonal="histogram")
# Converting from Factor to Integer
MarCamp.df$Promotion <- as.integer(MarCamp.df$Promotion)
# Dataset with Promotion strategies 1 and 2 only
Prom12.df <- MarCamp.df[which(MarCamp.df$Promotion <= 2), ]
## View(Prom12.df)
# Dataset with Promotion strategies 2 and 3 only
Prom23.df <- MarCamp.df[which(MarCamp.df$Promotion >= 2), ]
## View(Prom23.df)
# Dataset with Promotion strategies 1 and 3 only
Prom13.df <- MarCamp.df[which(MarCamp.df$Promotion != 2), ]
## View(Prom13.df)
# Converting from Integer to Factor
MarCamp.df$Promotion <- as.factor(MarCamp.df$Promotion)
Prom12.df$Promotion <- as.factor(Prom12.df$Promotion)
Prom23.df$Promotion <- as.factor(Prom23.df$Promotion)
Prom13.df$Promotion <- as.factor(Prom13.df$Promotion)
t.test(SalesInThousands ~ Promotion, data=Prom12.df)
##
## Welch Two Sample t-test
##
## data: SalesInThousands by Promotion
## t = 6.4275, df = 346.78, p-value = 4.29e-10
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 7.474093 14.065101
## sample estimates:
## mean in group 1 mean in group 2
## 58.09901 47.32941
t.test(SalesInThousands ~ Promotion, data=Prom23.df)
##
## Welch Two Sample t-test
##
## data: SalesInThousands by Promotion
## t = -4.8814, df = 370.02, p-value = 1.569e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.271854 -4.798253
## sample estimates:
## mean in group 2 mean in group 3
## 47.32941 55.36447
t.test(SalesInThousands ~ Promotion, data=Prom13.df)
##
## Welch Two Sample t-test
##
## data: SalesInThousands by Promotion
## t = 1.556, df = 355.92, p-value = 0.1206
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.7216369 6.1907240
## sample estimates:
## mean in group 1 mean in group 3
## 58.09901 55.36447
In this model we are considering the maximum number of important variables: 1) Independent variables - MarketSize, Promotion, week; 2) Dependent variable - SalesInThousands.
fit_a <- lm(SalesInThousands ~ MarketSize+Promotion+week, data = MarCamp.df)
summary(fit_a)
##
## Call:
## lm(formula = SalesInThousands ~ MarketSize + Promotion + week,
## data = MarCamp.df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -25.2132 -7.8601 0.7121 7.8724 24.8168
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 74.8332 1.3461 55.591 < 2e-16 ***
## MarketSize2 -26.5213 1.0423 -25.445 < 2e-16 ***
## MarketSize3 -13.8219 1.6460 -8.397 4.05e-16 ***
## Promotion2 -10.7674 1.1522 -9.345 < 2e-16 ***
## Promotion3 -1.0156 1.1535 -0.881 0.379
## week2 -0.4040 1.3183 -0.306 0.759
## week3 -0.3160 1.3183 -0.240 0.811
## week4 -0.5775 1.3183 -0.438 0.662
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.91 on 540 degrees of freedom
## Multiple R-squared: 0.5814, Adjusted R-squared: 0.5759
## F-statistic: 107.1 on 7 and 540 DF, p-value: < 2.2e-16
In this model we are considering only those variables whose effect Daer especially wants to see, as given in the last paragraph of the case study: 1) Independent variables - MarketID, LocationID, AgeOfStore, Promotion; 2) Dependent variable - SalesInThousands.
fit_b <- lm(SalesInThousands ~ MarketID+LocationID+AgeOfStore+Promotion, data = MarCamp.df)
summary(fit_b)
##
## Call:
## lm(formula = SalesInThousands ~ MarketID + LocationID + AgeOfStore +
## Promotion, data = MarCamp.df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.5060 -3.4551 -0.1593 3.4745 14.3174
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 39.8781902 0.9108658 43.781 < 2e-16 ***
## MarketID2 25.9118551 4.9504734 5.234 2.39e-07 ***
## MarketID3 49.8706379 10.1158147 4.930 1.10e-06 ***
## MarketID4 19.6766409 14.6969086 1.339 0.181
## MarketID5 16.0503762 19.7758281 0.812 0.417
## MarketID6 2.1058084 24.6771579 0.085 0.932
## MarketID7 9.9726005 29.6060346 0.337 0.736
## MarketID8 13.2753016 34.4303218 0.386 0.700
## MarketID9 18.1428750 39.3210401 0.461 0.645
## MarketID10 20.2760451 44.4695539 0.456 0.649
## LocationID -0.0009556 0.0492052 -0.019 0.985
## AgeOfStore 0.0129108 0.0345882 0.373 0.709
## Promotion2 -9.7180316 0.5627960 -17.267 < 2e-16 ***
## Promotion3 -4.9327702 0.5763881 -8.558 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.165 on 534 degrees of freedom
## Multiple R-squared: 0.9072, Adjusted R-squared: 0.905
## F-statistic: 401.7 on 13 and 534 DF, p-value: < 2.2e-16