Question: Do coupons and advertising affect peanut butter sales?
# Install pacman if needed
if (!require("pacman")) install.packages("pacman")
## Loading required package: pacman
# load packages
pacman::p_load(pacman,
tidyverse, openxlsx, ggpubr)
#Dataset is in datasets subfolder
(peanutbutter <- read.xlsx("datasets/Coupondata.xlsx"))
#Lowercase column names
(peanutbutter <- rename_with(peanutbutter, tolower))
#Reshape to long format: keep advertising, stack the remaining columns into coupon/sales pairs
(peanutbutter_long <- peanutbutter %>%
  pivot_longer(!advertising, names_to = "coupon", values_to = "sales"))
#Plot using ggpubr
ggline(peanutbutter_long, x = "advertising", y = "sales",
add = c("mean_se", "jitter"),
color = "coupon", palette = "startrek",
title = "Sales increase when ad spending increases",
subtitle = "(Advertising and coupon do not interact)",
legend.title = "coupon status"
)
#Fit the two-way ANOVA with both main effects and their interaction
two_way_aov <- aov(sales ~ advertising*coupon, data = peanutbutter_long)
#The anova table
summary(two_way_aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## advertising 1 18096 18096 75.506 2.4e-05 ***
## coupon 1 1323 1323 5.520 0.0467 *
## advertising:coupon 1 16 16 0.068 0.8006
## Residuals 8 1917 240
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-values for the main effects of advertising (2.4e-05) and coupon (0.0467) are both below 0.05, while the p-value for the interaction (advertising:coupon) is large (0.8006). We therefore conclude that advertising and coupons each increase sales, and that the two factors do not interact.
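As a rough check on the table, the p-values can be recomputed from the printed F values, and the sums of squares give a simple effect-size measure. The sketch below copies the rounded numbers from the output above, so results only match to rounding. The names `anova_tab`, `p_value`, and `eta_sq` are illustrative, and eta squared is an addition that is not part of the original analysis.

```r
# Values copied (rounded) from the printed ANOVA table above
anova_tab <- data.frame(
  term    = c("advertising", "coupon", "advertising:coupon"),
  df      = c(1, 1, 1),
  sum_sq  = c(18096, 1323, 16),
  f_value = c(75.506, 5.520, 0.068)
)
resid_df <- 8
resid_ss <- 1917

# p-value = upper-tail area of the F distribution with (df, resid_df) degrees of freedom
anova_tab$p_value <- pf(anova_tab$f_value, anova_tab$df, resid_df, lower.tail = FALSE)

# Eta squared = share of the total sum of squares attributed to each term
total_ss <- sum(anova_tab$sum_sq) + resid_ss
anova_tab$eta_sq <- anova_tab$sum_sq / total_ss

print(anova_tab)
```

By this measure advertising accounts for about 85% of the total variation in sales (18096 / 21352) and coupon for about 6% (1323 / 21352).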
#Diagnostic plots
#plot(two_way_aov)
hist(two_way_aov$residuals, prob = TRUE)
lines(density(two_way_aov$residuals))
qqnorm(two_way_aov$residuals)
qqline(two_way_aov$residuals)
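The plots above check the normality of the residuals by eye. Formal tests can complement them. The sketch below uses R's built-in `ToothGrowth` data as a stand-in so that it runs on its own (an assumption for illustration, not the peanut butter data); the same calls should apply with `two_way_aov` and `peanutbutter_long` substituted.

```r
# Stand-in model on built-in data (NOT the peanut butter data): two factors with replication
tg <- transform(ToothGrowth, dose = factor(dose))
fit <- aov(len ~ supp * dose, data = tg)

# Shapiro-Wilk test: the null hypothesis is that the residuals are normally distributed
sw <- shapiro.test(residuals(fit))

# Bartlett test: the null hypothesis is equal variance across the factor-level combinations
bt <- bartlett.test(len ~ interaction(supp, dose), data = tg)

print(sw)
print(bt)
```

A large p-value in either test means there is no evidence against the corresponding assumption. With a sample as small as the twelve observations here, these tests have little power, so the plots remain worth reading.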
#What we can expect in sales when there is advertising vs. no advertising
peanutbutter_long %>%
group_by(advertising) %>%
summarize(sales_forecast = round(mean(sales),2),
std_dev = round(sd(sales),2))
#What we can expect in sales when there is a coupon vs no coupon
peanutbutter_long %>%
group_by(coupon) %>%
summarize(sales_forecast = round(mean(sales),2),
std_dev = round(sd(sales),2))
| Sales with: | Forecast |
|---|---|
| No advertising | 80.67 |
| With advertising | 158.33 |
| No coupon | 109 |
| With coupon | 130 |
Advertising tends to increase sales by about 78 (158.33 - 80.67) over no advertising, and a coupon tends to increase sales by 21 (130 - 109) over no coupon.
To predict peanut butter sales when both a coupon and advertising are used, we can use the following equation:
predicted sales = overall average + factor A effect (if significant) + factor B effect (if significant)
#Calculate the overall average
overall_avg <- mean(peanutbutter_long$sales)
paste("The overall sales average is", overall_avg, sep=" ")
## [1] "The overall sales average is 119.5"
#A factor effect is the mean at the chosen factor level minus the overall average
#Calculate advertising effect (with advertising)
adv_effect <- (158.33 - overall_avg)
#Calculate coupon effect (with coupon)
coupon_effect <- (130 - overall_avg)
#Calculate Forecast:
#predicted value = overall average + factor A effect (if significant) + factor B effect (if significant)
predicted_sales <- round(overall_avg + adv_effect + coupon_effect, 2)
paste("Forecast when there is both advertising and a coupon is:", predicted_sales, sep = " ")
## [1] "Forecast when there is both advertising and a coupon is: 168.83"
The forecast equation for two-way ANOVA (with or without replication) is:
predicted value = overall average + factor A effect (if significant) + factor B effect (if significant)
If a factor is not significant, then its factor effect is assumed to be 0.
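The rule can be wrapped in a small helper. This is a sketch under stated assumptions: each factor effect is taken as the level mean minus the overall average, the level means are the rounded values from the table above, and `forecast_sales` and its arguments are hypothetical names introduced here.

```r
# Rounded level means and overall average taken from the results above
overall_avg  <- 119.5
adv_means    <- c(no = 80.67, yes = 158.33)
coupon_means <- c(no = 109, yes = 130)

# Hypothetical helper: effect = level mean - overall average;
# an effect flagged as not significant is set to 0
forecast_sales <- function(advertising, coupon,
                           adv_significant = TRUE, coupon_significant = TRUE) {
  adv_effect    <- if (adv_significant) adv_means[[advertising]] - overall_avg else 0
  coupon_effect <- if (coupon_significant) coupon_means[[coupon]] - overall_avg else 0
  round(overall_avg + adv_effect + coupon_effect, 2)
}

# Forecasts for all four combinations of advertising and coupon
combos <- expand.grid(advertising = c("no", "yes"), coupon = c("no", "yes"),
                      stringsAsFactors = FALSE)
combos$forecast <- mapply(forecast_sales, combos$advertising, combos$coupon,
                          USE.NAMES = FALSE)
print(combos)
```

Passing `coupon_significant = FALSE` drops the coupon term, so the forecast falls back to the advertising level mean.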
In two-way ANOVA with replication, if the interaction effect is significant, the predicted value of the response variable (y) is the mean of all observations having that combination of factor levels. If the interaction effect is not significant, you can proceed with the analysis as if it were a two-way ANOVA without replication.
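When the interaction is significant, the forecast reduces to a lookup of the cell mean. A minimal sketch, using R's built-in `ToothGrowth` data as a stand-in (an assumption for illustration, not the peanut butter data; `cell_means` and `cell_forecast` are hypothetical names):

```r
# Stand-in data (NOT the peanut butter data): response len, factors supp and dose
tg <- transform(ToothGrowth, dose = factor(dose))

# Mean of the response for every combination of factor levels
cell_means <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
print(cell_means)

# Hypothetical helper: the forecast for one combination is the mean of that cell
cell_forecast <- function(supp_level, dose_level) {
  cell_means$len[cell_means$supp == supp_level & cell_means$dose == dose_level]
}
cell_forecast("OJ", "2")
```

For the peanut butter data the equivalent table would come from `aggregate(sales ~ advertising + coupon, data = peanutbutter_long, FUN = mean)`.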