SCENARIO 4

Sustainability Initiatives and Brand Loyalty

A clothing company recently launched a marketing campaign featuring a famous actor. The goal was to increase profits (USD) by associating the brand with a well-liked celebrity. After the campaign, the company wants to determine if the campaign was effective. The company has data for 60 clothing stores. Did the sales increase after the campaign?

Null Hypothesis (H₀): There is no difference in store sales before and after the celebrity campaign.

Alternative Hypothesis (H₁): There is a difference in store sales before and after the celebrity campaign.

DESCRIPTIVE STATISTICS AND NORMALITY TEST

library(readxl)
dataset <- read_excel("C:/Users/Poojitha Dibbamadugu/Downloads/A6R4.xlsx")
Before <- dataset$PreCampaignSales
After <- dataset$PostCampaignSales

Differences <- After - Before
hist(Differences,
     main = "Histogram of Difference Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "purple",
     border = "black",
     breaks = 20)

# QUESTION 1: Is the histogram symmetrical, positively skewed, or negatively skewed?
# ANSWER: Positively skewed. The histogram shows a longer tail on the right side, indicating that some stores had much larger increases in sales.

# QUESTION 2: Did the histogram look too flat, too tall, or did it have a proper bell curve?
# ANSWER: Slightly flat. The distribution is spread out and lacks a sharp peak, suggesting it doesn’t follow a normal bell curve.

shapiro.test(Differences)

## 
##  Shapiro-Wilk normality test
## 
## data:  Differences
## W = 0.94747, p-value = 0.01186

# QUESTION 1: Was the data normally distributed or abnormally distributed?
# If p > 0.05 (P-value is GREATER than .05) this means the data is NORMAL (continue with Dependent t-test).
# If p < 0.05 (P-value is LESS than .05) this means the data is NOT normal (switch to Wilcoxon Signed Rank).
# ANSWER: Abnormally distributed. The p-value from the Shapiro-Wilk test was 0.01186, which is less than 0.05. This means the data is not normal, so we must use the Wilcoxon Signed Rank Test.
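
The decision rule above can be written as a small function. This is a minimal sketch, not part of the original script: `choose_paired_test` is a hypothetical helper, and the two demo vectors are constructed for illustration only (they are not the A6R4 data).

```r
# Hypothetical helper: applies the p > 0.05 rule from the questions above
# to a vector of difference scores and names the test to run next.
choose_paired_test <- function(differences, alpha = 0.05) {
  sw <- shapiro.test(differences)
  if (sw$p.value > alpha) "Dependent t-test" else "Wilcoxon Signed Rank"
}

# Illustrative inputs only (NOT the store data):
bell_shaped  <- qnorm(ppoints(60))              # evenly spaced normal quantiles
right_skewed <- exp(seq(0, 5, length.out = 60)) # long right tail

choose_paired_test(bell_shaped)
choose_paired_test(right_skewed)
```

With the actual data, the call would be `choose_paired_test(Differences)`.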

boxplot(Differences,
        main = "Distribution of Score Differences (After - Before)",
        ylab = "Difference in Scores",
        col = "purple",
        border = "darkblue")

# QUESTION 1: How many dots are in your boxplot?
# A) No dots.
# B) One or two dots. 
# C) Many dots.
# ANSWER: One or two dots.

# QUESTION 2: Where are the dots in your boxplot?
# A) There are no dots.
# B) Very close to the whiskers (lines of the boxplot).
# C) Far from the whiskers (lines of the boxplot).

# ANSWER: Very close to the whiskers (lines of the boxplot).

# QUESTION 3: Based on the dots and their location, is the data normal?
# If there are no dots, the data is normal.
# If there are one or two dots and they are CLOSE to the whiskers, the data is normal.
# If there are many dots (more than one or two) and they are FAR AWAY from the whiskers, this means data is NOT normal. Switch to a Wilcoxon Signed Rank.
# Anything else could be normal or abnormal. Check if there is a big difference between the median and the mean. If there is a big difference, the data is not normal. If there is a small difference, the data is normal.

# ANSWER:  Despite only one or two dots close to the whiskers, the Shapiro-Wilk test confirms the data is not normal. Therefore, we use the Wilcoxon Signed Rank Test.
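
The boxplot checks above (number of dots, gap between mean and median) can also be read off numerically instead of by eye. A minimal sketch, assuming the base R function `boxplot.stats()`, which returns the points that `boxplot()` draws as dots; `demo_diff` holds made-up values for illustration and is not the A6R4 data.

```r
# Illustrative difference scores only (NOT the store data).
demo_diff <- c(-3, -1, 0, 1, 2, 2, 3, 4, 5, 30)

# Points beyond 1.5 x IQR from the hinges; these are the "dots" in the boxplot.
outliers <- boxplot.stats(demo_diff)$out
length(outliers)   # how many dots
outliers           # which values they are

# A large gap between mean and median is another sign of skew.
mean_median_gap <- mean(demo_diff) - median(demo_diff)
mean_median_gap
```

With the actual data, `boxplot.stats(Differences)$out` would list the dots shown in the plot.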

mean(Before, na.rm = TRUE)

## [1] 25154.53

median(Before, na.rm = TRUE)

## [1] 24624

sd(Before, na.rm = TRUE)

## [1] 12184.4

length(Before)

## [1] 60

mean(After, na.rm = TRUE)

## [1] 26873.45

median(After, na.rm = TRUE)

## [1] 25086

sd(After, na.rm = TRUE)

## [1] 14434.37

length(After)

## [1] 60

WILCOXON SIGNED RANK

wilcox.test(Before, After, paired = TRUE)

## 
##  Wilcoxon signed rank test with continuity correction
## 
## data:  Before and After
## V = 640, p-value = 0.0433
## alternative hypothesis: true location shift is not equal to 0
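
For reading the V statistic: in `wilcox.test(x, y, paired = TRUE)`, V is the sum of the ranks of |x - y| for the pairs where x - y is positive. Because the call above passes Before first, V counts the rank weight of stores where Before exceeded After. A minimal sketch with made-up values (not the A6R4 data) that reproduces V by hand and shows that swapping the arguments gives the complement n(n + 1)/2 - V when there are no zero differences:

```r
# Illustrative paired values only (NOT the store data).
x <- c(10, 12, 15, 11, 20, 18)
y <- c(12, 11, 19, 16, 23, 25)

d <- x - y                    # paired differences
ranks <- rank(abs(d))         # rank the absolute differences
v_manual <- sum(ranks[d > 0]) # sum of ranks where x > y

v_xy <- unname(wilcox.test(x, y, paired = TRUE)$statistic)
v_yx <- unname(wilcox.test(y, x, paired = TRUE)$statistic)

n <- length(d)
c(manual = v_manual, x_first = v_xy, y_first = v_yx, total = n * (n + 1) / 2)
```

The p-value of the two-sided test is the same whichever argument comes first; only V changes.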

library(rstatix)

## 
## Attaching package: 'rstatix'

## The following object is masked from 'package:stats':
## 
##     filter

# Reshape to long format: one row per store per time point
df_long <- data.frame(
  id = rep(seq_along(Before), 2),
  time = rep(c("Before", "After"), each = length(Before)),
  score = c(Before, After)
)
wilcox_effsize(df_long, score ~ time, paired = TRUE)

## # A tibble: 1 × 7
##   .y.   group1 group2 effsize    n1    n2 magnitude
## * <chr> <chr>  <chr>    <dbl> <int> <int> <ord>    
## 1 score After  Before   0.261    60    60 small

# QUESTION
# Answer the questions below as a comment within the R script:
# Q1) What is the size of the effect?
# ± 0.00 to 0.09  = small
# ± 0.10 to 0.29  = moderate
# ± 0.30 to 0.49  = large
# ± 0.50 to 1.00  = very large

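# ANSWER: The effect size from wilcox_effsize() was r = 0.261. On the scale above (± 0.10 to 0.29) this is a moderate effect, so the difference in sales before and after the campaign was modest rather than large. (rstatix labels it "small" because it uses different cutoffs.)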
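
The effect size in the tibble above can be cross-checked by hand. A minimal sketch, assuming r = |Z| / sqrt(n) with n = 60 pairs and assuming Z can be approximated from the two-sided p-value printed by `wilcox.test()` (rstatix computes Z differently, so small discrepancies are expected). `label_effect` is a hypothetical helper that applies the scale listed under Q1; it is not an rstatix function.

```r
# Values copied from the output above.
p_value <- 0.0433   # two-sided p-value from wilcox.test()
n_pairs <- 60       # number of stores

# Approximate Z from the two-sided p-value, then convert to r.
z <- qnorm(p_value / 2)
r <- abs(z) / sqrt(n_pairs)
round(r, 3)

# Hypothetical helper: applies the assignment's scale from Q1.
label_effect <- function(r) {
  a <- abs(r)
  if (a < 0.10) {
    "small"
  } else if (a < 0.30) {
    "moderate"
  } else if (a < 0.50) {
    "large"
  } else {
    "very large"
  }
}

label_effect(r)
```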


# Q2) Which group had the higher average score?
# ANSWER: After scores were higher than Before scores (median 25,086 vs. 24,624; mean 26,873.45 vs. 25,154.53).

RESULT PARAGRAPH

A Wilcoxon Signed-Rank Test was conducted to compare store sales before and after the celebrity campaign among 60 clothing stores. Median sales were significantly higher after the campaign (Md = 25,086) than before (Md = 24,624), V = 640, p = .043. The null hypothesis of no difference is therefore rejected: store sales were higher after the campaign. The effect size was r = 0.26, indicating a moderate effect on the scale used above.

A6-R4

Team 4

2025-11-21
