A clothing company recently launched a marketing campaign featuring a famous actor. The goal was to increase profits (USD) by associating the brand with a well-liked celebrity. After the campaign, the company wants to determine if the campaign was effective. The company has data for 60 clothing stores. Did the sales increase after the campaign?
NULL HYPOTHESIS (H0): There is no difference between pre-campaign sales and post-campaign sales.
ALTERNATE HYPOTHESIS (H1): There is a difference between pre-campaign sales and post-campaign sales.
# install.packages("readxl")
library(readxl)
# IMPORT EXCEL FILE INTO R STUDIO
A6R4 <- read_excel("C:/Users/sravz/Downloads/A6R4.xlsx")
# CALCULATE THE DIFFERENCE SCORES
Before <- A6R4$PreCampaignSales
After <- A6R4$PostCampaignSales
Differences <- After - Before
HISTOGRAM
hist(Differences,
main = "Histogram of Difference in Sales",
xlab = "Value",
ylab = "Frequency",
col = "pink",
border = "black",
breaks = 20)
QUESTIONS
QUESTION 1: Is the histogram symmetrical, positively skewed, or negatively skewed?
ANSWER: The data is positively skewed.
QUESTION 2: Did the histogram look too flat, too tall, or did it have a proper bell curve?
ANSWER: It does NOT have a proper bell curve.
SHAPIRO-WILK TEST
shapiro.test(Differences)
##
## Shapiro-Wilk normality test
##
## data: Differences
## W = 0.94747, p-value = 0.01186
QUESTION
QUESTION 1: Was the data normally distributed or abnormally distributed?
ANSWER: p = 0.012 < 0.05, so the data is NOT normally distributed. Hence, switching to the Wilcoxon signed-rank test.
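The decision rule used above (check normality of the difference scores, then pick the paired test) can be sketched in code. This is a minimal illustration, not the assignment's required workflow: `diffs` is simulated, right-skewed data standing in for `Differences`, and `alpha` and `chosen_test` are hypothetical names introduced here.

```r
# Simulated, right-skewed difference scores (NOT the A6R4 data)
set.seed(42)
diffs <- rexp(60, rate = 1 / 2000) - 1500

alpha <- 0.05                 # significance level assumed for the normality check
sw <- shapiro.test(diffs)     # Shapiro-Wilk test on the difference scores

# Pick the paired test based on whether normality is rejected
chosen_test <- if (sw$p.value < alpha) {
  "Wilcoxon signed-rank test"
} else {
  "paired t-test"
}
chosen_test
```

The normality check is run on the differences rather than on the Before and After columns separately, because both paired tests operate on the difference scores.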
BOXPLOT
boxplot(Differences,
main = "Distribution of Differences in sales (After - Before)",
ylab = "Difference in Sales",
col = "yellow",
border = "darkblue")
QUESTIONS
QUESTION 1: How many dots are in your boxplot?
ANSWER: There is one dot in the boxplot.
QUESTION 2: Where are the dots in your boxplot?
ANSWER: Far from the whiskers (lines of the boxplot).
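QUESTION 3: Based on the dots and their location, is the data normal?
ANSWER: The single dot far from the whiskers is an outlier, which suggests the data is not normally distributed. This agrees with the Shapiro-Wilk result (p = 0.012 < 0.05).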
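Rather than counting dots by eye, the outliers can be listed with `boxplot.stats()`, which applies the same default 1.5 × IQR rule that `boxplot()` uses to draw them. The sketch below uses a small hypothetical vector `x` in place of `Differences`.

```r
# Hypothetical values standing in for `Differences` (NOT the A6R4 data)
x <- c(-3, -1, 0, 1, 2, 2, 3, 4, 5, 40)

stats <- boxplot.stats(x)   # default coef = 1.5, as in boxplot()
stats$out                   # values that would be drawn as dots: 40
length(stats$out)           # number of dots: 1
```

Running `boxplot.stats(Differences)$out` on the real data would show which stores produce the dot reported above.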
DESCRIPTIVES FOR BEFORE SCORES
mean(Before, na.rm = TRUE)
## [1] 25154.53
median(Before, na.rm = TRUE)
## [1] 24624
sd(Before, na.rm = TRUE)
## [1] 12184.4
length(Before)
## [1] 60
DESCRIPTIVES FOR AFTER SCORES
mean(After, na.rm = TRUE)
## [1] 26873.45
median(After, na.rm = TRUE)
## [1] 25086
sd(After, na.rm = TRUE)
## [1] 14434.37
length(After)
## [1] 60
WILCOXON SIGNED-RANK TEST
wilcox.test(Before, After, paired = TRUE)
##
## Wilcoxon signed rank test with continuity correction
##
## data: Before and After
## V = 640, p-value = 0.0433
## alternative hypothesis: true location shift is not equal to 0
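To make the `V` statistic easier to interpret, the sketch below reproduces it by hand on hypothetical paired data. `V` is the sum of the ranks of the positive differences, where the differences are taken as first argument minus second argument. With `wilcox.test(Before, After, paired = TRUE)` that means Before minus After, so a small `V` relative to the maximum n(n + 1)/2 points to higher After values. The vectors `before` and `after` are made up for illustration and contain no ties or zero differences.

```r
# Hypothetical paired values (NOT the A6R4 data)
before <- c(10, 12, 9, 15, 11, 14, 13, 8)
after  <- c(13, 11, 14, 19, 17, 21, 12.5, 16)

d <- before - after          # same order as wilcox.test(before, after, paired = TRUE)
r <- rank(abs(d))            # rank the absolute differences
V_manual <- sum(r[d > 0])    # sum of ranks where before > after

V_test <- unname(wilcox.test(before, after, paired = TRUE)$statistic)

V_max <- length(d) * (length(d) + 1) / 2   # largest possible V
c(V_manual = V_manual, V_test = V_test, V_max = V_max)
```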
# install.packages("coin")
# install.packages("rstatix")
library(coin)
## Loading required package: survival
library(rstatix)
##
## Attaching package: 'rstatix'
## The following objects are masked from 'package:coin':
##
## chisq_test, friedman_test, kruskal_test, sign_test, wilcox_test
## The following object is masked from 'package:stats':
##
## filter
RANK BISERIAL CORRELATION (EFFECT SIZE)
df_long <- data.frame(
id = rep(1:length(Before), 2),
time = rep(c("Before", "After"), each = length(Before)),
score = c(Before, After)
)
wilcox_effsize(df_long, score ~ time, paired = TRUE)
## # A tibble: 1 × 7
## .y. group1 group2 effsize n1 n2 magnitude
## * <chr> <chr> <chr> <dbl> <int> <int> <ord>
## 1 score After Before 0.261 60 60 small
QUESTIONS
Q1) What is the size of the effect?
ANSWER: An effect size of 0.26 indicates the difference between the groups was small, as shown in the magnitude column of the output.
Q2) Which group had the higher average score?
ANSWER: The After scores were higher (median 25086 vs. 24624; mean 26873.45 vs. 25154.53).
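For reference, two effect sizes are commonly reported with the signed-rank test, and they are not the same number. The sketch below computes both by hand on hypothetical data: the matched-pairs rank-biserial correlation, and r = Z / sqrt(N). As far as the rstatix documentation describes it, `wilcox_effsize()` reports the second kind, with Z taken from the coin package, so its value need not match this hand calculation exactly. The Z here is a plain normal approximation without tie or continuity correction, which is an assumption of this sketch.

```r
# Hypothetical paired values (NOT the A6R4 data)
before <- c(10, 12, 9, 15, 11, 14, 13, 8)
after  <- c(13, 11, 14, 19, 17, 21, 12.5, 16)

d <- after - before
d <- d[d != 0]                 # zero differences carry no rank information
r <- rank(abs(d))
w_pos <- sum(r[d > 0])         # rank sum where after > before
w_neg <- sum(r[d < 0])         # rank sum where after < before

# Matched-pairs rank-biserial correlation
r_rb <- (w_pos - w_neg) / (w_pos + w_neg)

# r = Z / sqrt(N), with Z from the normal approximation to the signed-rank statistic
n <- length(d)
z <- (w_pos - n * (n + 1) / 4) / sqrt(n * (n + 1) * (2 * n + 1) / 24)
r_z <- z / sqrt(n)

c(rank_biserial = r_rb, r_from_z = r_z)
```

In this sketch a positive value means the After scores tend to be higher, because the differences are taken as After minus Before.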
A Wilcoxon signed-rank test was conducted to compare sales before and after the marketing campaign across 60 clothing stores. Median sales were significantly lower before the campaign (Md = 24624) than after (Md = 25086), V = 640, p = 0.043. These results indicate that sales were significantly higher after the campaign. The effect size was r = 0.261, indicating a small effect.