Week 6 - Scenario 4

Research Scenario 4

Sustainability Initiatives and Brand Loyalty A clothing company recently launched a marketing campaign featuring a famous actor. The goal was to increase profits (USD) by associating the brand with a well-liked celebrity. After the campaign, the company wants to determine if the campaign was effective. The company has data for 60 clothing stores. Did the sales increase after the campaign?

Hypotheses

Null Hypothesis (H₀)

There is no difference in store sales before and after the celebrity campaign.

Alternate Hypothesis (H₁)

There is a difference in store sales before and after the celebrity campaign.

Results Paragraph

A Wilcoxon Signed-Rank Test was conducted to compare store sales before and after the celebrity campaign among 60 clothing stores, because the Shapiro-Wilk test showed that the difference scores were not normally distributed (W = 0.947, p = 0.012). Median sales were significantly higher after the campaign (Md = 25,086) than before (Md = 24,624), V = 640, p = 0.043. The effect size was r = 0.26, indicating a small effect. These results indicate that store sales were significantly higher after the campaign, although the increase was modest.

R Code

# INSTALL REQUIRED PACKAGE
# install.packages("readxl")

# LOAD THE PACKAGE
library(readxl)

# IMPORT EXCEL FILE INTO R STUDIO
A6R4 <- read_excel("C:/Users/Nithin Kumar Adki/Downloads/A6R4.xlsx")


# CALCULATE THE DIFFERENCE SCORES
Before <- A6R4$PreCampaignSales
After <- A6R4$PostCampaignSales

Differences <- After - Before

# CREATE THE HISTOGRAMS
hist(Differences,
     main = "Histogram of Difference Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "blue",
     border = "black",
     breaks = 20)

# QUESTION 1: Is the histogram symmetrical, positively skewed, or negatively skewed?
# ANSWER: Positively skewed. The histogram shows a longer tail on the right side, indicating that some stores had much larger increases in sales.

# QUESTION 2: Did the histogram look too flat, too tall, or did it have a proper bell curve?
# ANSWER: Slightly flat. The distribution is spread out and lacks a sharp peak, suggesting it doesn’t follow a normal bell curve.
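
# OPTIONAL NUMERIC CHECK OF HISTOGRAM SHAPE
# A minimal sketch, not part of the assignment template: sample skewness and excess
# kurtosis give numbers to set beside the two histogram answers above.
# shape_summary is a hypothetical helper name; it uses only base R.
shape_summary <- function(x) {
  x <- x[!is.na(x)]                  # drop missing values
  m <- mean(x)
  v <- mean((x - m)^2)               # population (divide-by-n) variance
  c(skewness = mean((x - m)^3) / v^1.5,
    excess_kurtosis = mean((x - m)^4) / v^2 - 3)
}

# Skewness above 0 points to a longer right tail; excess kurtosis below 0 points to
# a flatter-than-normal shape. In this script it could be called as:
# shape_summary(Differences)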

# SHAPIRO-WILK TEST
shapiro.test(Differences)

## 
##  Shapiro-Wilk normality test
## 
## data:  Differences
## W = 0.94747, p-value = 0.01186

# QUESTION 1: Was the data normally distributed or abnormally distributed?
# ANSWER: Abnormally distributed (not normal). The p-value from the Shapiro-Wilk test was 0.01186, which is less than 0.05. This means the difference scores depart significantly from a normal distribution, so we use the Wilcoxon Signed-Rank Test instead of a paired t-test.

# BOXPLOT
boxplot(Differences,
        main = "Distribution of Score Differences (After - Before)",
        ylab = "Difference in Scores",
        col = "blue",
        border = "darkblue")

# QUESTION 1: How many dots are in your boxplot?
# ANSWER: One or two dots.

# QUESTION 2: Where are the dots in your boxplot?
# ANSWER: Very close to the whiskers (lines of the boxplot).

# QUESTION 3: Based on the dots and their location, is the data normal?
# ANSWER: Not normal. The boxplot alone shows only one or two dots, both close to the whiskers, which does not by itself suggest a strong departure from normality. However, the Shapiro-Wilk test indicates the data is not normal, so we use the Wilcoxon Signed-Rank Test.
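
# OPTIONAL COUNT OF BOXPLOT DOTS
# A sketch, assuming the default boxplot() settings used above: boxplot.stats() from
# base R (grDevices) returns the points drawn as dots, so they can be counted
# rather than read off the plot. count_boxplot_dots is a hypothetical helper name.
count_boxplot_dots <- function(x) {
  length(boxplot.stats(x)$out)       # values beyond 1.5 * IQR from the hinges
}

count_boxplot_dots(c(1, 2, 3, 4, 5, 100))   # toy data: 100 is the only outlier
# In this script it could be called as: count_boxplot_dots(Differences)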

# DESCRIPTIVES FOR BEFORE SCORES
mean(Before, na.rm = TRUE)

## [1] 25154.53

median(Before, na.rm = TRUE)

## [1] 24624

sd(Before, na.rm = TRUE)

## [1] 12184.4

length(Before)

## [1] 60

# DESCRIPTIVES FOR AFTER SCORES
mean(After, na.rm = TRUE)

## [1] 26873.45

median(After, na.rm = TRUE)

## [1] 25086

sd(After, na.rm = TRUE)

## [1] 14434.37

length(After)

## [1] 60
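
# OPTIONAL COMPACT DESCRIPTIVES
# A sketch, not part of the assignment template. length() counts missing values too,
# while the other descriptives above use na.rm = TRUE, so this hypothetical helper
# (describe_sales) reports the non-missing n together with the median and IQR.
describe_sales <- function(x) {
  c(n = sum(!is.na(x)),
    median = median(x, na.rm = TRUE),
    iqr = IQR(x, na.rm = TRUE))
}

describe_sales(c(10, 20, 30, 40, NA))   # toy data with one missing value
# In this script it could be called as: describe_sales(Before); describe_sales(After)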

# WILCOXON SIGNED-RANK TEST
wilcox.test(Before, After, paired = TRUE)

## 
##  Wilcoxon signed rank test with continuity correction
## 
## data:  Before and After
## V = 640, p-value = 0.0433
## alternative hypothesis: true location shift is not equal to 0
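
# OPTIONAL CHECK OF THE P-VALUE FROM V
# A sketch of the normal approximation behind the output above, assuming no zero
# differences and no tied ranks among the 60 stores (the data are not shown here,
# so this is an assumption). V and n are copied from the output.
V <- 640
n <- 60
mu <- n * (n + 1) / 4                           # expected V under H0
sigma <- sqrt(n * (n + 1) * (2 * n + 1) / 24)   # standard deviation of V under H0
z <- (V - mu - sign(V - mu) * 0.5) / sigma      # continuity correction
p <- 2 * pnorm(-abs(z))                         # two-sided p-value
round(c(z = z, p = p), 4)                       # p should land close to the reported 0.0433
# V is below mu because wilcox.test(Before, After) ranks Before - After, and the
# larger-ranked differences are negative (After higher than Before).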

# EFFECT SIZE FOR WILCOXON SIGNED-RANK TEST
# INSTALL REQUIRED PACKAGE
# install.packages("rstatix")

# LOAD THE PACKAGE
library(rstatix)

## 
## Attaching package: 'rstatix'

## The following object is masked from 'package:stats':
## 
##     filter

# CALCULATE EFFECT SIZE r (Z / sqrt(N)) FROM LONG-FORMAT DATA
df_long <- data.frame(
  id = rep(seq_along(Before), 2),
  time = rep(c("Before", "After"), each = length(Before)),
  score = c(Before, After)
)

wilcox_effsize(df_long, score ~ time, paired = TRUE)

## # A tibble: 1 × 7
##   .y.   group1 group2 effsize    n1    n2 magnitude
## * <chr> <chr>  <chr>    <dbl> <int> <int> <ord>    
## 1 score After  Before   0.261    60    60 small
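
# OPTIONAL CHECK OF THE EFFECT SIZE
# A sketch: wilcox_effsize() reports r = |Z| / sqrt(N), where N is the number of
# pairs. Z is recovered here from the rounded two-sided p-value reported by
# wilcox.test(), so the result is approximate.
p_reported <- 0.0433
n_pairs <- 60
z_approx <- qnorm(p_reported / 2)               # negative-tail z for a two-sided p
r_approx <- abs(z_approx) / sqrt(n_pairs)
round(r_approx, 3)                              # should land close to the 0.261 above
# rstatix labels 0.10 to < 0.30 as small, 0.30 to < 0.50 as moderate, >= 0.50 as large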

# QUESTION
# ANSWER: The effect size was r = 0.261, which indicates a small effect. The difference between before and after sales was statistically significant, but its practical magnitude was modest.

# Q2) Which group had the higher average score?
# ANSWER: After. Sales after the campaign were higher than before (median 25,086 vs. 24,624; mean 26,873.45 vs. 25,154.53).