# DEPENDENT T-TEST & WILCOXON SIGNED-RANK TEST
# Used to test if there is a difference between Before Training and After Training (comparing the means).
# NULL HYPOTHESIS (H0)
# The null hypothesis is ALWAYS used.
# There is no difference between Before Training and After Training scores.
# ALTERNATE HYPOTHESIS (H1)
# Choose ONE of the three options below (based on your research scenario):
# 1) NON-DIRECTIONAL ALTERNATE HYPOTHESIS: There is a difference between Before Training and After Training scores.
# 2) DIRECTIONAL ALTERNATE HYPOTHESIS ONE: Before Training scores are higher than After Training scores.
# 3) DIRECTIONAL ALTERNATE HYPOTHESIS TWO: After Training scores are higher than Before Training scores.
# ========================================================
# >> IMPORT EXCEL FILE <<
# ========================================================
# Import your Excel dataset into R to conduct analyses.
# 1) INSTALL REQUIRED PACKAGE
# • If never installed, remove the hashtag before the install code.
# • If previously installed, leave the hashtag in front of the code.
#install.packages("readxl")
# ........................................................
# 2) LOAD THE PACKAGE
# • Always reload the package you want to use.
library(readxl)
# ........................................................
# 3) IMPORT EXCEL FILE INTO R STUDIO
A6R3 <- read_excel("C:\\Users\\leena\\Desktop\\SLU\\Sem 3 Fall 1\\Week 6\\A6R3.xlsx")
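# ........................................................
# 4) OPTIONAL: CHECK THE IMPORT
# • A minimal sketch, not part of the original template.
# • It assumes the score columns are named PreTraining and PostTraining, as used in the next section.
# • The script stops with an error here if either column is missing.
stopifnot(all(c("PreTraining", "PostTraining") %in% names(A6R3)))
str(A6R3)   # shows each column's type; both score columns should be numeric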
# ============================================
# >> CALCULATE THE DIFFERENCE SCORES <<
# ============================================
# Calculate the difference between the Before scores versus the after scores.
# ............................................
# 1) NAME THE VARIABLES
# • The dataset used here is A6R3 (the Excel file name without .xlsx).
# • Before scores come from the PreTraining column.
# • After scores come from the PostTraining column.
# • Differences is After minus Before, so a positive value means the After score was higher.
Before <- A6R3$PreTraining
After <- A6R3$PostTraining
Differences <- After - Before
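# ............................................
# 2) OPTIONAL: CHECK THE PAIRS
# • A minimal sketch, not part of the original template.
# • A paired test needs both scores for each participant, so count the complete pairs
#   and any participants with a missing Before or After score.
sum(complete.cases(Before, After))   # number of participants with both scores
sum(is.na(Differences))              # number of participants missing at least one score
summary(Differences)                 # positive values mean the After score was higher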
# ========================================================
# >> HISTOGRAM <<
# ========================================================
# Create a histogram for difference scores to visually check skewness and kurtosis.
# .........................................................
# 1) CREATE THE HISTOGRAM
# • You do not need to edit this code.
hist(Differences,
     main = "Histogram of Difference Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "blue",
     border = "black",
     breaks = 20)

# ........................................................
# 2) WRITE THE REPORT
# Answer the questions below as a comment within the R script:
# Q1) Is the histogram symmetrical, positively skewed, or negatively skewed?
#Ans: The histogram is symmetrical.
# Q2) Did the histogram look too flat, too tall, or did it have a proper bell curve?
#Ans: It has a proper bell curve.
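# ........................................................
# 3) OPTIONAL: PUT NUMBERS ON SKEWNESS AND KURTOSIS
# • A minimal sketch, not part of the original template; all variable names here are illustrative.
# • It uses simple moment-based formulas in base R; packages may use slightly different formulas.
# • Skewness near 0 suggests symmetry; excess kurtosis near 0 suggests a bell-shaped peak.
d_clean <- Differences[!is.na(Differences)]
dev <- d_clean - mean(d_clean)                          # deviations from the mean
skewness_est <- mean(dev^3) / mean(dev^2)^1.5           # third standardized moment
excess_kurtosis_est <- mean(dev^4) / mean(dev^2)^2 - 3  # fourth standardized moment minus 3
skewness_est
excess_kurtosis_est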
# ========================================================
# >> SHAPIRO-WILK TEST <<
# ========================================================
# Check the normality for the difference between the groups.
# ........................................................
# 1) CONDUCT SHAPIRO-WILK TEST
# • You do not need to edit the code.
shapiro.test(Differences)
##
## Shapiro-Wilk normality test
##
## data: Differences
## W = 0.98773, p-value = 0.21
# ........................................................
# 2) WRITE THE REPORT
# Answer the questions below as a comment within the R script:
# Q1) Was the data normally distributed or not normally distributed?
# If p > 0.05 (P-value is GREATER than .05), the test found no significant departure from normality; treat the data as NORMAL (continue with Dependent t-test).
# If p < 0.05 (P-value is LESS than .05), the data is NOT normal (switch to Wilcoxon Signed-Rank).
# Ans: Data is normally distributed (W = 0.988, p = 0.21 > 0.05).
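# ........................................................
# 3) OPTIONAL: Q-Q PLOT AND STORED P-VALUE
# • A sketch, not part of the original template; sw_result is an illustrative name.
# • Points that fall close to the line support the normality conclusion.
qqnorm(Differences, main = "Normal Q-Q Plot of Difference Scores")
qqline(Differences, col = "red")
sw_result <- shapiro.test(Differences)
sw_result$p.value > 0.05   # TRUE means no significant departure from normality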
# ========================================================
# >> BOXPLOT <<
# ========================================================
# Check for any outliers impacting the mean.
# ........................................................
# 1) CREATE THE BOXPLOT
# • You do not need to edit this code
boxplot(Before, After,
        names = c("Before", "After"),
        main = "Boxplot of Before and After Scores",
        col = c("lightblue", "lightgreen"))

# ........................................................
# 2) WRITE THE REPORT
# Answer the questions below as a comment within the R script:
# Q1) Were there any dots outside of the boxplots? These dots represent participants with extreme scores.
#Ans: No
# Q2) If there are outliers, are they changing the mean so much that the mean no longer accurately represents the average score?
#Ans: No, the mean and median are nearly identical, so the averages represent the data well.
# Q3) Make a decision. If the outliers are extreme, you will need to switch to a Wilcoxon Signed-Rank test.
# If there are no outliers, or the outliers are not extreme, continue with the Dependent t-test.
#Ans: Continuing with the Dependent t-test
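# ........................................................
# 3) OPTIONAL: LIST THE OUTLIERS AS NUMBERS
# • A minimal sketch, not part of the original template.
# • boxplot.stats() returns the values a boxplot draws as dots (points beyond the whiskers).
# • An empty result means there are no dots for that set of scores.
boxplot.stats(Before)$out
boxplot.stats(After)$out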
# ========================================================
# >> DESCRIPTIVE STATISTICS <<
# ========================================================
# Calculate the mean, median, SD, and sample size for each group.
# ........................................................
# 1) DESCRIPTIVES FOR BEFORE SCORES
# • You do not need to edit this code
mean(Before, na.rm = TRUE)
## [1] 59.73333
median(Before, na.rm = TRUE)
## [1] 60
sd(Before, na.rm = TRUE)
## [1] 7.966091
length(Before)
## [1] 150
# ........................................................
# 2) DESCRIPTIVES FOR AFTER SCORES
# • You do not need to edit this code
mean(After, na.rm = TRUE)
## [1] 69.24
median(After, na.rm = TRUE)
## [1] 69.5
sd(After, na.rm = TRUE)
## [1] 9.481653
length(After)
## [1] 150
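# ........................................................
# 3) OPTIONAL: COUNT ONLY NON-MISSING SCORES
# • A minimal sketch, not part of the original template.
# • length() counts missing values too, while the mean, median, and SD above drop them with na.rm = TRUE.
# • If these counts match length(), there are no missing scores.
sum(!is.na(Before))
sum(!is.na(After))
# Mean and SD of the difference scores (After minus Before), which the paired test is built on
mean(Differences, na.rm = TRUE)
sd(Differences, na.rm = TRUE)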
# ========================================================
# >> DEPENDENT T-TEST & WILCOXON SIGNED-RANK TEST <<
# ========================================================
# Check if the means from Before and After are different.
# ........................................................
# 1) CHOOSE THE TEST
# • If difference scores were normally distributed, use Dependent t-test.
# • If difference scores were NOT normally distributed, use Wilcoxon Signed-Rank test.
#Ans: Dependent t-test
# ........................................................
# 2) CONDUCT THE PROPER TEST
# • The code below uses the Before and After variables created in the difference scores section above.
# • Both variables must list the same participants in the same order, because the test pairs them row by row.
# OPTION 1: DEPENDENT T-TEST
# • Note: The Dependent t-test is also called the Paired Samples t-test.
# • Remove the hashtag to use the code
# • There are no other edits you need to make to the code.
t.test(Before, After, paired = TRUE)
##
## Paired t-test
##
## data: Before and After
## t = -23.285, df = 149, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -10.313424 -8.699909
## sample estimates:
## mean difference
## -9.506667
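# OPTIONAL: DIRECTIONAL VERSIONS OF THE DEPENDENT T-TEST
# • A sketch, not part of the original template. Remove the hashtag only from the line that matches your alternate hypothesis.
# • t.test(Before, After, paired = TRUE) tests the mean of Before minus After.
# • alternative = "less" matches "After Training is higher than Before Training".
# • alternative = "greater" matches "Before Training is higher than After Training".
#t.test(Before, After, paired = TRUE, alternative = "less")
#t.test(Before, After, paired = TRUE, alternative = "greater")
# • The paired test is the same as a one-sample t-test on the difference scores.
# • Because Differences is After minus Before, the sign of t and of the confidence interval flips.
t.test(Differences, mu = 0)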
# OPTION 2: WILCOXON SIGNED-RANK TEST
# • Remove the hashtag to use the code
# • There are no other edits you need to make to the code.
#wilcox.test(Before, After, paired = TRUE)
# .......................................................
# 3) DETERMINE STATISTICAL SIGNIFICANCE
# • If results were statistically significant (p < .05), continue to effect size section below.
# • If results were NOT statistically significant (p > .05), skip to reporting section below.
# • NOTE: Getting results that are not statistically significant does NOT mean you switch to Wilcoxon Signed-Rank.
# The Wilcoxon Signed-Rank test is only for data that are not normally distributed; the choice is not based on outcome significance.
#Ans: Results were statistically significant, p < 0.001
# ========================================================
# >> EFFECT SIZE FOR DEPENDENT T-TEST <<
# ========================================================
# Determine how big of a difference there was between the group means.
# • Remove the hashtags to use the code below.
# ........................................................
# 1) INSTALL REQUIRED PACKAGE
# • If never installed, remove the hashtag before the install code.
# • If previously installed, leave the hashtag in front of the code.
#install.packages("effectsize")
# ........................................................
# 2) LOAD THE PACKAGE
# Always reload the package you want to use.
library(effectsize)
# ........................................................
# 3) CALCULATE COHEN’S D
# • You do not need to edit the code
cohens_d(Before, After, paired = TRUE)
## For paired samples, 'repeated_measures_d()' provides more options.
## Cohen's d | 95% CI
## --------------------------
## -1.90 | [-2.17, -1.63]
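# ........................................................
# OPTIONAL: CHECK COHEN'S D BY HAND
# • A sketch, not part of the original template; paired_diff and d_by_hand are illustrative names.
# • It assumes the paired d above is the mean of the paired differences divided by their SD.
#   That assumption fits this output: t / sqrt(n) = -23.285 / sqrt(150) is about -1.90.
paired_diff <- Before - After   # same direction as cohens_d(Before, After)
d_by_hand <- mean(paired_diff, na.rm = TRUE) / sd(paired_diff, na.rm = TRUE)
d_by_hand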
# ........................................................
# 4) WRITE THE REPORT
# Answer the questions below as a comment within the R script:
#
# Q1) What is the size of the effect?
# The effect means how big or small was the difference between the group averages.
# ± 0.00 to 0.19 = ignore
# ± 0.20 to 0.49 = small
# ± 0.50 to 0.79 = moderate
# ± 0.80 to 1.29 = large
# ± 1.30 and above = very large
# Examples:
# A Cohen's d of 0.10 indicates the difference between the group averages was not truly meaningful. There was no effect.
# A Cohen's d of 0.22 indicates the difference between the group averages was small.
#Ans: t = -23.285, n = 150
# Cohen's d = -1.90; the absolute value of 1.90 falls in the "very large" range.
#
# Q2) Which group had the higher average score?
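# - The Differences variable was calculated as After minus Before, so a positive mean difference means the After scores were higher.
# - t.test(Before, After, paired = TRUE) and cohens_d(Before, After, paired = TRUE) use Before minus After instead, so their negative values here also mean the After scores were higher.
# - You can also easily look at the means and tell which scores were higher.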
#Ans: Before Training: M = 59.73, SD = 7.97
# After Training: M = 69.24, SD = 9.48
# The after training scores were higher.
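# ........................................................
# OPTIONAL: LABEL THE EFFECT SIZE WITH CODE
# • A sketch, not part of the original template; label_effect_size() is an illustrative helper.
# • It applies the cutoffs listed in Q1 to the absolute value of d.
label_effect_size <- function(d) {
  cutoffs <- c(0, 0.20, 0.50, 0.80, 1.30, Inf)
  labels <- c("ignore", "small", "moderate", "large", "very large")
  # right = FALSE makes each range include its lower cutoff, e.g. 0.20 counts as "small"
  as.character(cut(abs(d), breaks = cutoffs, labels = labels, right = FALSE))
}
label_effect_size(-1.90)   # "very large"
label_effect_size(0.22)    # "small"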
# ================================================================================================
# Research Report on Results: Dependent t-test (Paired Samples t-test)
# ================================================================================================
# Goal: Write a paragraph summarizing your findings
# Directions:
# For your results summary, you should report the following information:
# 1. The name of the inferential test used (Dependent t-test or Paired Samples t-test) -> Dependent t-test
# 2. The names of the two related conditions or time points you analyzed (use proper labels) -> Before Training versus After Training
# 3. The sample size (n) -> n = 150
# 4. Whether the test was statistically significant (p < .05) or not (p > .05) -> Statistically significant, p < .001
# 5. The mean (M) and standard deviation (SD) for each condition -> Before Training: M = 59.73, SD = 7.97
# After Training: M = 69.24, SD = 9.48
# 6. Whether scores significantly increased, decreased, or stayed the same across time/conditions
#Ans: Scores increased significantly
# 7. Degrees of freedom (df)
#Ans: df = 149
# 8. t-value
#Ans: t = -23.29
# 9. p-value (exact value if > .001, or p < .001)
#Ans: p < .001
# 10. If there was a significant difference, report the effect size (Cohen’s d) and interpretation (small, medium, large)
#Ans: Cohen's d = -1.90 (absolute value 1.90, very large effect)
# ================================================================================================
# Paragraph
# A dependent t-test was conducted to compare scores before and after training among 150 participants.
# Results showed that post-training scores (M = 69.24, SD = 9.48) were significantly higher than
# pre-training scores (M = 59.73, SD = 7.97), t(149) = -23.29, p < .001.
# The mean difference (Before minus After) was -9.51 points, with a 95% confidence interval from -10.31 to -8.70.
# The effect size was Cohen's d = -1.90 (absolute value 1.90), indicating a very large effect.
# These results suggest that the training program produced a substantial improvement in participant scores.
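# OPTIONAL: BUILD THE STATISTICS STRING FROM THE TEST RESULT
# • A sketch, not part of the original template; tt is an illustrative name.
# • Pulling the numbers from the stored result avoids copying errors when writing the paragraph.
tt <- t.test(Before, After, paired = TRUE)
sprintf("t(%d) = %.2f, %s",
        as.integer(tt$parameter),   # degrees of freedom
        tt$statistic,               # t-value
        ifelse(tt$p.value < .001, "p < .001",
               sprintf("p = %.3f", tt$p.value)))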
#===================================================================================================
# DEPENDENT T-TEST
# Note: The Dependent t-test is also called the Paired Samples t-test.
# Remove the hashtag to use the code
# There are no other edits you need to make to the code.
t.test(Before, After, paired = TRUE)
##
## Paired t-test
##
## data: Before and After
## t = -23.285, df = 149, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -10.313424 -8.699909
## sample estimates:
## mean difference
## -9.506667
# DETERMINE STATISTICAL SIGNIFICANCE
# If results were statistically significant (p < .05), continue to effect size section below.
# If results were NOT statistically significant (p > .05), skip to reporting section below.
# NOTE: Getting results that are not statistically significant does NOT mean you switch to Wilcoxon Signed-Rank.
# The Wilcoxon Signed-Rank test is only for data that are not normally distributed; the choice is not based on outcome significance.
# EFFECT SIZE FOR DEPENDENT T-TEST
# Purpose: Determine how big of a difference there was between the group means.
# INSTALL REQUIRED PACKAGE
# If never installed, remove the hashtag before the install code.
# If previously installed, leave the hashtag in front of the code.
#install.packages("effectsize")
# LOAD THE PACKAGE
# Always reload the package you want to use.
library(effectsize)
# CALCULATE COHEN’S D
# You do not need to edit the code.
# Just remove the hashtag.
cohens_d(Before, After, paired = TRUE)
## For paired samples, 'repeated_measures_d()' provides more options.
## Cohen's d | 95% CI
## --------------------------
## -1.90 | [-2.17, -1.63]
# QUESTIONS
# Answer the questions below as a comment within the R script:
#
# Q1) What is the size of the effect?
# The effect means how big or small was the difference between the group averages.
# ± 0.00 to 0.19 = ignore
# ± 0.20 to 0.49 = small
# ± 0.50 to 0.79 = moderate
# ± 0.80 to 1.29 = large
# ± 1.30 and above = very large
# Ans: Cohen’s d = -1.90. The absolute value of 1.90 indicates a VERY LARGE effect size.
#
# Q2) Which group had the higher average score?
# Ans: The After Training group had the higher average (M = 69.24, SD = 9.48) compared to Before Training (M = 59.73, SD = 7.97).
# Research Report on Results: Dependent t-test
# Goal: Write a paragraph summarizing your findings
# Directions:
# For your results summary, you should report the following information:
# 1. The name of the inferential test used (Dependent t-test or Paired Samples t-test)
# Ans: Dependent t-test (Paired Samples t-test)
# 2. The names of the two related conditions or time points you analyzed (use proper labels)
# Ans: Before Training and After Training
# 3. The sample size (n)
# Ans: n = 150 participants
# 4. Whether the test was statistically significant (p < .05) or not (p > .05)
# Ans: Yes, statistically significant (p < .001)
# 5. The mean (M) and standard deviation (SD) for each condition
# Ans: Before Training: M = 59.73, SD = 7.97
# After Training: M = 69.24, SD = 9.48
# 6. Whether scores significantly increased, decreased, or stayed the same across time/conditions
# Ans: Scores significantly increased after training
# 7. Degrees of freedom (df)
# Ans: df = 149
# 8. t-value
# Ans: t = -23.29
# 9. p-value (exact value if > .001, or p < .001)
# Ans: p < .001
# 10. If there was a significant difference, report the effect size (Cohen’s d) and interpretation (small, medium, large)
# Ans: Cohen’s d = 1.90, very large effect
# Answer:
# A dependent t-test was conducted to compare training scores before and after training among 150 participants.
# Results showed that post-training scores (M = 69.24, SD = 9.48) were significantly higher than pre-training scores
# (M = 59.73, SD = 7.97), t(149) = -23.29, p < .001.
# The mean difference (Before minus After) was -9.51 points.
# The effect size was Cohen’s d = 1.90, indicating a very large effect.
# These findings suggest that the training program produced a substantial improvement in participant scores.