# DEPENDENT T-TEST & WILCOXON SIGNED-RANK TEST
# Used to test if there is a difference between Before scores and After scores (comparing the means).
# NULL HYPOTHESIS (H0)
# The null hypothesis is ALWAYS used.
# There is no difference between the Before scores and After scores.
# ALTERNATE HYPOTHESIS (H1)
# Choose ONE of the three options below (based on your research scenario):
# 1) NON-DIRECTIONAL ALTERNATE: There is a difference between the Before scores and After scores.
# 2) DIRECTIONAL ALTERNATE HYPOTHESIS ONE: Before scores are higher than After scores.
# 3) DIRECTIONAL ALTERNATE HYPOTHESIS TWO: After scores are higher than Before scores.
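# ........................................................
# EXAMPLE (sketch): HOW THE THREE ALTERNATE HYPOTHESES MAP TO R CODE
# • A minimal sketch with made-up numbers; hyp_before and hyp_after are hypothetical, not the assignment data.
# • wilcox.test() and t.test() both accept an `alternative` argument.
# • With paired = TRUE the test works on x minus y, which here is Before minus After.
hyp_before <- c(10, 12, 9, 14, 11, 13, 8, 15)
hyp_after  <- c(11, 14, 12, 18, 16, 19, 15, 23)
# 1) Non-directional: there is a difference between Before and After.
hyp_two_sided <- wilcox.test(hyp_before, hyp_after, paired = TRUE, alternative = "two.sided")
# 2) Directional one: Before scores are higher than After scores.
hyp_before_higher <- wilcox.test(hyp_before, hyp_after, paired = TRUE, alternative = "greater")
# 3) Directional two: After scores are higher than Before scores.
hyp_after_higher <- wilcox.test(hyp_before, hyp_after, paired = TRUE, alternative = "less")
# Compare the three p-values side by side.
c(two_sided = hyp_two_sided$p.value,
  before_higher = hyp_before_higher$p.value,
  after_higher = hyp_after_higher$p.value)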
# ========================================================
# >> IMPORT EXCEL FILE <<
# ========================================================
# Import your Excel dataset into R to conduct analyses.
# 1) INSTALL REQUIRED PACKAGE
# • If never installed, remove the hashtag before the install code.
# • If previously installed, leave the hashtag in front of the code.
#install.packages("readxl")
# ........................................................
# 2) LOAD THE PACKAGE
# • Always reload the package you want to use.
library(readxl)
# ........................................................
# 3) IMPORT EXCEL FILE INTO R STUDIO
# • Download the Excel file from One Drive and save it to your desktop.
# • Right-click the Excel file and click “Copy as path” from the menu.
# • In RStudio, replace the example path below with your actual path.
# • Replace each backslash \ with a forward slash / or with a double backslash \\:
# ✘ WRONG "C:\Users\Joseph\Desktop\mydata.xlsx"
# ✔ CORRECT "C:/Users/Joseph/Desktop/mydata.xlsx"
# ✔ CORRECT "C:\\Users\\Joseph\\Desktop\\mydata.xlsx"
# • Replace "A6R4" with the name of your Excel dataset (without the .xlsx).
A6R4 <- read_excel("C:\\Users\\leena\\Desktop\\SLU\\Sem 3 Fall 1\\Week 6\\A6R4.xlsx")
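# ........................................................
# EXAMPLE (sketch): THREE EQUIVALENT WAYS TO WRITE A WINDOWS PATH
# • A minimal sketch using the example path from the comments above; the file itself is hypothetical.
# • The raw string form r"(...)" assumes R 4.0.0 or newer.
path_doubled <- "C:\\Users\\Joseph\\Desktop\\mydata.xlsx"
# Convert every backslash to a forward slash (fixed = TRUE treats "\\" as a literal backslash).
path_forward <- gsub("\\", "/", path_doubled, fixed = TRUE)
# A raw string lets you paste the copied path without editing the backslashes.
path_raw <- r"(C:\Users\Joseph\Desktop\mydata.xlsx)"
identical(path_doubled, path_raw)
# Check that a file really exists at the path before calling read_excel().
file.exists(path_forward)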
# ============================================
# >> CALCULATE THE DIFFERENCE SCORES <<
# ============================================
# Calculate the difference between the Before scores and the After scores.
# ............................................
# 1) RENAME THE VARIABLES
# • Replace "A6R4" with your dataset name (without .xlsx)
# • Replace "PreCampaignSales" with the name of your variable for before scores.
# • Replace "PostCampaignSales" with the name of your variable for after scores.
Before <- A6R4$PreCampaignSales
After <- A6R4$PostCampaignSales
Differences <- After - Before
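# ............................................
# EXAMPLE (sketch): CHECK THE PAIRS BEFORE ANALYZING
# • A minimal sketch with hypothetical toy vectors (pair_pre and pair_post are not the assignment data).
# • A difference score is NA whenever either score in the pair is missing,
#   so the number of usable pairs can be smaller than length().
pair_pre  <- c(100, 120, NA, 90, 105)
pair_post <- c(110, 115, 130, NA, 125)
pair_diff <- pair_post - pair_pre
pair_diff
# Total rows versus complete pairs.
length(pair_diff)
sum(!is.na(pair_diff))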
# ========================================================
# >> HISTOGRAM <<
# ========================================================
# Create a histogram for difference scores to visually check skewness and kurtosis.
# .........................................................
# 1) CREATE THE HISTOGRAMS
# • You do not need to edit this code.
hist(Differences,
     main = "Histogram of Difference Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "blue",
     border = "black",
     breaks = 20)

# ........................................................
# 2) WRITE THE REPORT
# Answer the questions below as a comment within the R script:
# Q1) Is the histogram symmetrical, positively skewed, or negatively skewed?
# Ans: The histogram is positively skewed.
# Q2) Did the histogram look too flat, too tall, or did it have a proper bell curve?
# Ans: The histogram is too tall.
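# ........................................................
# EXAMPLE (sketch): PUT NUMBERS ON SKEWNESS AND KURTOSIS
# • A minimal sketch in base R with hypothetical toy scores; it uses the simple moment formulas,
#   which may differ slightly from the bias-corrected versions some packages report.
# • Skewness above 0 suggests a positive (right) skew; below 0 suggests a negative (left) skew.
# • Excess kurtosis above 0 suggests a taller peak than a normal curve; below 0 suggests a flatter one.
shape_scores <- c(1, 1, 2, 2, 3, 10)
shape_centered <- shape_scores - mean(shape_scores)
shape_sd <- sqrt(mean(shape_centered^2))          # population SD (divides by n)
shape_skewness <- mean(shape_centered^3) / shape_sd^3
shape_excess_kurtosis <- mean(shape_centered^4) / shape_sd^4 - 3
c(skewness = shape_skewness, excess_kurtosis = shape_excess_kurtosis)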
# ========================================================
# >> SHAPIRO-WILK TEST <<
# ========================================================
# Check the normality for the difference between the groups.
# ........................................................
# 1) CONDUCT SHAPIRO-WILK TEST
# • You do not need to edit the code.
shapiro.test(Differences)
##
## Shapiro-Wilk normality test
##
## data: Differences
## W = 0.94747, p-value = 0.01186
# ........................................................
# 2) WRITE THE REPORT
# Answer the questions below as a comment within the R script:
# Q1) Was the data normally distributed or abnormally distributed?
# If p > 0.05 (P-value is GREATER than .05) this means the data is NORMAL (continue with Dependent t-test).
# If p < 0.05 (P-value is LESS than .05) this means the data is NOT normal (switch to Wilcoxon Signed-Rank).
# Ans: p = 0.01186, which is less than 0.05, so the difference scores are not normally distributed. We will use the Wilcoxon Signed-Rank test.
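# ........................................................
# EXAMPLE (sketch): TURN THE DECISION RULE INTO CODE
# • A minimal sketch; choose_test() is a hypothetical helper, not part of any package.
# • It applies the rule above: p > .05 keeps the Dependent t-test, otherwise switch to Wilcoxon Signed-Rank.
# • The two toy vectors are idealized quantiles, used only so the example is reproducible.
choose_test <- function(difference_scores, alpha = 0.05) {
  p <- shapiro.test(difference_scores)$p.value
  if (p > alpha) "Dependent t-test" else "Wilcoxon Signed-Rank test"
}
norm_like_diffs <- qnorm(ppoints(40))   # evenly spaced normal quantiles (bell shaped)
skewed_diffs <- qexp(ppoints(40))       # evenly spaced exponential quantiles (right skewed)
choose_test(norm_like_diffs)
choose_test(skewed_diffs)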
# ========================================================
# >> BOXPLOT <<
# ========================================================
# Check for any outliers impacting the mean.
# ........................................................
# 1) CREATE THE BOXPLOT
# • You do not need to edit this code
boxplot(Before, After,
        names = c("Before", "After"),
        main = "Boxplot of Before and After Scores",
        col = c("lightblue", "lightgreen"))

# ........................................................
# 2) WRITE THE REPORT
# Answer the questions below as a comment within the R script:
# Q1) Were there any dots outside of the boxplots? These dots represent participants with extreme scores.
# Ans: There were some dots outside the boxplots.
# Q2) If there are outliers, are they changing the mean so much that the mean no longer accurately represents the average score?
# Ans: The outliers are not changing the mean very much.
# Q3) Make a decision. If the outliers are extreme, you will need to switch to a Wilcoxon Signed-Rank test.
# If there are no outliers, or the outliers are not extreme, continue with Dependent t-test.
# Ans: The outliers are not extreme, but the difference scores are not normally distributed (Shapiro-Wilk p < 0.05), so we need to use the Wilcoxon Signed-Rank test.
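# ........................................................
# EXAMPLE (sketch): LIST THE OUTLIERS AND SEE HOW MUCH THEY MOVE THE MEAN
# • A minimal sketch with hypothetical toy scores (out_scores is not the assignment data).
# • boxplot.stats()$out returns the values that boxplot() would draw as dots.
out_scores <- c(1, 2, 3, 4, 5, 100)
out_values <- boxplot.stats(out_scores)$out
out_values
# Compare the mean with and without the flagged values; the median barely reacts to outliers.
out_mean_all <- mean(out_scores)
out_mean_trimmed <- mean(out_scores[!out_scores %in% out_values])
c(mean_all = out_mean_all, mean_without_outliers = out_mean_trimmed, median_all = median(out_scores))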
# ========================================================
# >> DESCRIPTIVE STATISTICS <<
# ========================================================
# Calculate the mean, median, SD, and sample size for each group.
# ........................................................
# 1) DESCRIPTIVES FOR BEFORE SCORES
# • You do not need to edit this code
mean(Before, na.rm = TRUE)
## [1] 25154.53
median(Before, na.rm = TRUE)
## [1] 24624
sd(Before, na.rm = TRUE)
## [1] 12184.4
length(Before)
## [1] 60
# ........................................................
# 2) DESCRIPTIVES FOR AFTER SCORES
# • You do not need to edit this code
mean(After, na.rm = TRUE)
## [1] 26873.45
median(After, na.rm = TRUE)
## [1] 25086
sd(After, na.rm = TRUE)
## [1] 14434.37
length(After)
## [1] 60
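# ........................................................
# EXAMPLE (sketch): ONE HELPER FOR ALL FOUR DESCRIPTIVES
# • A minimal sketch; describe_scores() is a hypothetical helper, not part of any package.
# • length() counts missing values too, so this version counts only non-missing scores,
#   which keeps n consistent with the na.rm = TRUE used for the mean, median, and SD.
describe_scores <- function(x) {
  c(mean = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE),
    n = sum(!is.na(x)))
}
desc_example <- describe_scores(c(10, 20, NA, 30))
desc_example
# Usage with this script's variables would be: describe_scores(Before); describe_scores(After)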
# ========================================================
# >> DEPENDENT T-TEST & WILCOXON SIGNED-RANK TEST <<
# ========================================================
# Check if the means from Before and After are different.
# ........................................................
# 1) CHOOSE THE TEST
# • If difference scores were normally distributed, use Dependent t-test.
# • If difference scores were NOT normally distributed, use Wilcoxon Signed-Rank test.
# Ans: Wilcoxon Signed-Rank Test
# ........................................................
# 2) CONDUCT THE PROPER TEST
# • Before and After were created in the difference scores section above, so no variable names need to be replaced here.
# OPTION 1: DEPENDENT T-TEST
# • Note: The Dependent t-test is also called the Paired Samples t-test.
# • Remove the hashtag to use the code
# • There are no other edits you need to make to the code.
# t.test(Before, After, paired = TRUE)
# OPTION 2: WILCOXON SIGNED-RANK TEST
# • The hashtag has already been removed because this is the test chosen above.
# • There are no other edits you need to make to the code.
wilcox.test(Before, After, paired = TRUE)
##
## Wilcoxon signed rank test with continuity correction
##
## data: Before and After
## V = 640, p-value = 0.0433
## alternative hypothesis: true location shift is not equal to 0
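# .......................................................
# EXAMPLE (sketch): WHERE THE V STATISTIC COMES FROM
# • A minimal sketch with hypothetical toy scores (v_before and v_after are not the assignment data).
# • V is the sum of the ranks of the positive differences, where the difference is x minus y
#   (Before minus After in the call above) and ranks are taken on the absolute differences.
v_before <- c(10, 12, 9, 14, 11, 13)
v_after  <- c(12, 11, 13, 17, 10.5, 19)
v_diff <- v_before - v_after
v_diff <- v_diff[v_diff != 0]              # zero differences are dropped before ranking
v_ranks <- rank(abs(v_diff))
v_manual <- sum(v_ranks[v_diff > 0])
v_from_test <- unname(wilcox.test(v_before, v_after, paired = TRUE)$statistic)
c(manual = v_manual, from_wilcox.test = v_from_test)
# If Before and After did not differ, V would sit near n * (n + 1) / 4.
# For 60 pairs (assuming no zero differences) that midpoint is:
60 * (60 + 1) / 4
# A V below the midpoint means the After scores tended to be the higher ones.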
# .......................................................
# 3) DETERMINE STATISTICAL SIGNIFICANCE
# • If results were statistically significant (p < .05), continue to effect size section below.
# • If results were NOT statistically significant (p > .05), skip to reporting section below.
# • NOTE: Getting results that are not statistically significant does NOT mean you switch to Wilcoxon Signed-Rank.
# The Wilcoxon Signed-Rank test is chosen only because the data are not normally distributed, never because of the outcome's significance.
# ========================================================
# >> EFFECT SIZE FOR WILCOXON SIGNED-RANK TEST <<
# ========================================================
# Determine how big of a difference there was between the group means.
# ........................................................
# 1) INSTALL REQUIRED PACKAGE
# - If never installed, remove the hashtag before the install code.
# - If previously installed, leave the hashtag in front of the code.
#install.packages("rstatix")
#install.packages("coin")
# ........................................................
# 2) LOAD THE PACKAGE
# Always reload the package you want to use.
library(rstatix)
##
## Attaching package: 'rstatix'
## The following object is masked from 'package:stats':
##
## filter
# ........................................................
# 3) CALCULATE THE EFFECT SIZE (r)
# - You do not need to edit this code.
# - Note: wilcox_effsize() reports the effect size r = Z / sqrt(N).
# Reshape to long format: all Before rows first, then all After rows, in the same order so the pairs line up.
df_long <- data.frame(
  id = rep(seq_along(Before), 2),
  time = rep(c("Before", "After"), each = length(Before)),
  score = c(Before, After)
)
wilcox_effsize(df_long, score ~ time, paired = TRUE)
## # A tibble: 1 × 7
##   .y.   group1 group2 effsize    n1    n2 magnitude
## * <chr> <chr>  <chr>    <dbl> <int> <int> <ord>
## 1 score After  Before   0.261    60    60 small
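# ........................................................
# EXAMPLE (sketch): APPROXIMATE THE EFFECT SIZE BY HAND
# • A minimal sketch, assuming the effect size is r = |Z| / sqrt(N) with N = number of pairs,
#   as described in the rstatix documentation for wilcox_effsize().
# • Z is recovered here from the two-sided p-value printed by wilcox.test() above, so the result is
#   an approximation; rstatix takes Z from the coin package, which may differ slightly.
es_p_value <- 0.0433    # p-value from the wilcox.test() output above
es_n_pairs <- 60        # sample size from the descriptives above
es_z <- qnorm(es_p_value / 2)
es_r <- abs(es_z) / sqrt(es_n_pairs)
round(es_r, 3)
# For contrast, the matched-pairs rank-biserial correlation is a different statistic:
# (sum of positive ranks minus sum of negative ranks) / total rank sum.
# Using V = 640 and assuming no zero differences were dropped:
es_total_ranks <- es_n_pairs * (es_n_pairs + 1) / 2
es_rank_biserial <- (640 - (es_total_ranks - 640)) / es_total_ranks
round(es_rank_biserial, 3)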
# ........................................................
# 4) WRITE THE REPORT
# Answer the questions below as a comment within the R script:
#
# Q1) What is the size of the effect?
# ± 0.00 to 0.09 = no effect
# ± 0.10 to 0.29 = small
# ± 0.30 to 0.49 = moderate
# ± 0.50 to 1.00 = large
# Examples:
# An effect size of 0.05 indicates the difference between the Before and After scores was not truly meaningful. There was no effect.
# An effect size of 0.22 indicates the difference between the Before and After scores was small.
# Ans: 0.261 = small effect
#
# Q2) Which group had the higher average score?
# - With the way we calculated differences (After minus Before), if it is positive, it means the After scores were higher.
# - If it is negative, it means the Before scores were higher.
# - You can also easily look at the means and tell which scores were higher.
# Ans: After > Before. The After scores are higher.
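# ........................................................
# EXAMPLE (sketch): CHECK THE DIRECTION FROM THE DIFFERENCE SCORES
# • A minimal sketch with hypothetical toy scores (dir_before and dir_after are not the assignment data).
# • Differences are After minus Before, so a positive median means After tended to be higher.
dir_before <- c(20, 25, 30, 22, 28)
dir_after  <- c(24, 27, 29, 30, 33)
dir_change <- dir_after - dir_before
median(dir_change)
# Count how many pairs went down (-1), stayed the same (0), or went up (1).
table(sign(dir_change))
# Usage with this script's variables would be: median(Differences, na.rm = TRUE)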
# ================================================================================================
# Research Report on Results: Wilcoxon Signed-Rank Test
# ================================================================================================
# Goal: Write a paragraph summarizing your findings
# Directions:
# For your results summary, you should report the following information:
# 1. The name of the inferential test used
# Ans: Wilcoxon Signed-Rank Test
# 2. The names of the two related conditions or time points you analyzed
# Ans: Pre-Campaign Sales and Post-Campaign Sales
# 3. The sample size (n)
# Ans: n = 60
# 4. Whether the test was statistically significant (p < .05) or not (p > .05)
# Ans: Yes, the test was statistically significant (p = 0.0433 < 0.05)
# 5. The median for each condition
# Ans: Pre-Campaign Sales Median = 24,624; Post-Campaign Sales Median = 25,086
# 6. Whether scores significantly increased, decreased, or stayed the same
# Ans: Scores significantly increased after the campaign
# 7. The test statistic (W or V depending on your R output)
# Ans: V = 640
# 8. The p-value (exact if > .001, or report p < .001)
# Ans: p = 0.0433
# 9. If significant, the direction of the difference
# Ans: Post-Campaign Sales were significantly higher than Pre-Campaign Sales
# 10. The effect size (r) and its interpretation
# Ans: r = 0.261 → Small effect
# ================================================================================================
# Ans:
# A Wilcoxon Signed-Rank Test was conducted to compare sales before and after the marketing campaign (n = 60). Median post-campaign sales (Md = 25,086) were significantly higher than median pre-campaign sales (Md = 24,624), V = 640, p = .043. The effect size was r = .26, indicating a small effect. These results suggest that sales showed a statistically significant but modest increase after the campaign.
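# ................................................................................................
# EXAMPLE (sketch): FORMAT THE NUMBERS FOR THE WRITE-UP
# • A minimal sketch; format_p() is a hypothetical helper, not part of any package.
# • It follows direction 8 above: report the exact p-value, or "p < .001" when it is smaller than .001.
# • The values passed in are the ones printed earlier in this script.
format_p <- function(p) {
  if (p < 0.001) {
    "p < .001"
  } else {
    # Round to three decimals and drop the leading zero.
    paste0("p = ", sub("^0", "", sprintf("%.3f", p)))
  }
}
report_stats <- sprintf(
  "Md = %s (post) vs. Md = %s (pre), V = %d, %s",
  format(25086, big.mark = ","),
  format(24624, big.mark = ","),
  640L,
  format_p(0.0433)
)
report_stats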