DEPENDENT T-TEST & WILCOXON SIGNED-RANK TEST
Used to test whether the mean of the After scores differs from the mean of the Before scores for the same participants.
NULL HYPOTHESIS (H0)
There is no difference between communication abilities of employees before and after the training.
ALTERNATE HYPOTHESIS (H1)
There is a difference between communication abilities of employees before and after the training.
IMPORT EXCEL FILE
Purpose
Import your Excel dataset into R to conduct analyses.
INSTALL REQUIRED PACKAGE
install.packages("readxl")
LOAD THE PACKAGE
Load the package at the start of every new R session before using it.
library(readxl)
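install.packages() only needs to run once per machine, while library() is needed in every session. A minimal sketch of a guard that skips the install when the package is already present. The helper name needs_install is hypothetical and not part of the original script.

```r
# Hypothetical helper (not part of the original script): TRUE when a
# package is not yet installed on this machine.
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with every R installation, so this prints FALSE.
print(needs_install("stats"))

# Intended use in this script, left commented out so the sketch runs offline:
# if (needs_install("readxl")) install.packages("readxl")
# library(readxl)
```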
IMPORT EXCEL FILE INTO R STUDIO
A6R3 <- read_excel("/Users/alfred/Desktop/A6R3.xlsx")
CALCULATE THE DIFFERENCE SCORES
Calculate the difference between the Before scores and the After scores (After minus Before).
Before <- A6R3$PreTraining
After <- A6R3$PostTraining
Differences <- After - Before
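To make the sign convention concrete, here is a small sketch with made-up scores for three hypothetical participants. These numbers are not from the A6R3 data.

```r
# Made-up scores for three hypothetical participants (not the A6R3 data).
before_demo <- c(60, 55, 70)
after_demo  <- c(68, 55, 66)

# After minus Before: positive = improved, zero = unchanged, negative = declined.
diff_demo <- after_demo - before_demo
print(diff_demo)
# 8 0 -4
```

With this ordering, a positive difference means the score went up after training.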
HISTOGRAM
Create a histogram for difference scores to visually check skewness and kurtosis.
CREATE THE HISTOGRAM
hist(Differences,
     main = "Histogram of Difference Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "blue",
     border = "black",
     breaks = 20)
QUESTIONS
Q1) Is the histogram symmetrical, positively skewed, or negatively skewed?
- The histogram appears roughly symmetrical with no strong skew, although there may be a slight positive skew. Higher After scores shift the center of the distribution above zero; they do not by themselves make it skewed.
Q2) Did the histogram look too flat, too tall, or did it have a proper bell curve?
- The histogram has a reasonably proper bell curve shape, indicating a fairly normal distribution.
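The visual check can be backed up with numbers. The sketch below computes moment-based skewness and excess kurtosis in base R. The function names are hypothetical, and the data are simulated for illustration only, not the A6R3 difference scores.

```r
# Moment-based sample skewness: about 0 for a symmetric distribution,
# positive for a longer right tail, negative for a longer left tail.
skewness_demo <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Excess kurtosis: about 0 for a normal distribution,
# negative for lighter tails, positive for heavier tails.
excess_kurtosis_demo <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2 - 3
}

# Simulated difference scores, for illustration only.
set.seed(1)
sim_diff <- rnorm(150, mean = 9.5, sd = 5)
skewness_demo(sim_diff)
excess_kurtosis_demo(sim_diff)
```

Values close to 0 for both statistics would agree with a roughly symmetrical, bell-shaped histogram.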
SHAPIRO-WILK TEST
Check the normality of the difference scores.
CONDUCT SHAPIRO-WILK TEST
shapiro.test(Differences)
##
## Shapiro-Wilk normality test
##
## data: Differences
## W = 0.98773, p-value = 0.21
QUESTIONS
Q1) Were the data normally distributed or not normally distributed?
- The Shapiro-Wilk test returned p = 0.21, which is greater than 0.05, so we fail to reject the null hypothesis of normality. The difference scores are consistent with a normal distribution, so we proceed with the Dependent t-test.
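The decision rule used in this step can be sketched as a small function. The name choose_paired_test and the 0.05 default are assumptions for illustration. The data are simulated, and the outlier check in the next section still applies before a final decision.

```r
# Hypothetical helper: suggest a test from the Shapiro-Wilk p-value alone.
choose_paired_test <- function(differences, alpha = 0.05) {
  p <- shapiro.test(differences)$p.value
  if (p > alpha) "Dependent t-test" else "Wilcoxon signed-rank test"
}

# Simulated difference scores, for illustration only.
set.seed(42)
sim_diff <- rnorm(150, mean = 9.5, sd = 5)
choose_paired_test(sim_diff)

# Strongly right-skewed values, which the test should flag as non-normal.
skewed_demo <- exp(seq(0, 10, length.out = 100))
choose_paired_test(skewed_demo)
```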
BOXPLOT
Check for any outliers impacting the mean.
CREATE THE BOXPLOT
boxplot(Before, After,
        names = c("Before", "After"),
        main = "Boxplot of Before and After Scores",
        col = c("lightblue", "lightgreen"))
QUESTIONS
Q1) Were there any dots outside of the boxplots? These dots represent participants with extreme scores.
- Yes, there were a few dots outside the boxplots, indicating the presence of outliers.
Q2) If there are outliers, are they changing the mean so much that it no longer accurately represents the average score?
- The outliers have some effect on the mean, but not drastically. The mean Before score is 59.73 and the median is 60, while the mean After score is 69.24 and the median is 69.5. Since the mean and median are very close in both groups, the mean still reasonably represents the average score despite the outliers.
Q3) Make a decision. If the outliers are extreme, you will need to switch to a Wilcoxon Signed-Rank test.
If there are no outliers, or the outliers are not extreme, continue with the Dependent t-test.
- Since the data is normally distributed (Shapiro-Wilk p = 0.21) and outliers are not extreme, we can continue with the Dependent t-test.
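The dots on a boxplot can also be listed numerically. A minimal sketch using boxplot.stats() from base R, which flags values more than 1.5 box lengths beyond the box. The scores are made up and are not the A6R3 data.

```r
# Made-up scores with one extreme value (not the A6R3 data).
scores_demo <- c(10, 12, 11, 13, 12, 50)

# Values that boxplot() would draw as dots.
outliers_demo <- boxplot.stats(scores_demo)$out
print(outliers_demo)
# 50

# Compare the mean and median with and without the flagged values.
c(mean = mean(scores_demo), median = median(scores_demo))
kept <- scores_demo[!scores_demo %in% outliers_demo]
c(mean = mean(kept), median = median(kept))
```

In this toy example the mean drops from 18 to 11.6 once the outlier is removed while the median stays at 12, which is the kind of gap between mean and median that signals an influential outlier.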
DESCRIPTIVE STATISTICS
Calculate the mean, median, SD, and sample size for each group.
DESCRIPTIVES FOR BEFORE SCORES
mean(Before, na.rm = TRUE)
## [1] 59.73333
median(Before, na.rm = TRUE)
## [1] 60
sd(Before, na.rm = TRUE)
## [1] 7.966091
length(Before)
## [1] 150
DESCRIPTIVES FOR AFTER SCORES
mean(After, na.rm = TRUE)
## [1] 69.24
median(After, na.rm = TRUE)
## [1] 69.5
sd(After, na.rm = TRUE)
## [1] 9.481653
length(After)
## [1] 150
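The four descriptives can be gathered in one call. The helper below is a hypothetical convenience, not part of the original script. It also counts only non-missing values, whereas length() counts NA entries too.

```r
# Hypothetical helper: mean, median, SD, and non-missing sample size.
describe_scores <- function(x) {
  c(mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE),
    n      = sum(!is.na(x)))
}

# Made-up scores with one missing value (not the A6R3 data).
demo <- c(60, 55, NA, 70, 65)
describe_scores(demo)   # n is 4: the NA is excluded
length(demo)            # 5: length() includes the NA
```

Usage with the worksheet's vectors would be describe_scores(Before) and describe_scores(After).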
DEPENDENT T-TEST
t.test(Before, After, paired = TRUE)
##
## Paired t-test
##
## data: Before and After
## t = -23.285, df = 149, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -10.313424 -8.699909
## sample estimates:
## mean difference
## -9.506667
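A paired t-test is a one-sample t-test on the difference scores. The sketch below shows this on simulated data (not the A6R3 data). Note that t.test(Before, After, paired = TRUE) works with Before minus After, which is why the mean difference above is negative even though After scores are higher.

```r
# Simulated paired scores, for illustration only.
set.seed(123)
before_sim <- rnorm(30, mean = 60, sd = 8)
after_sim  <- before_sim + rnorm(30, mean = 9.5, sd = 5)

# Paired t-test, same argument order as in the worksheet.
paired_fit <- t.test(before_sim, after_sim, paired = TRUE)

# Equivalent one-sample test on Before minus After.
d <- before_sim - after_sim
one_sample_fit <- t.test(d, mu = 0)

# The same statistic by hand: mean difference over its standard error.
t_by_hand <- mean(d) / (sd(d) / sqrt(length(d)))

c(paired     = unname(paired_fit$statistic),
  one_sample = unname(one_sample_fit$statistic),
  by_hand    = t_by_hand)
```

All three values agree, and the degrees of freedom equal the number of pairs minus one.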
DETERMINE STATISTICAL SIGNIFICANCE
If results were statistically significant (p < .05), continue to the effect size section below.
If results were NOT statistically significant (p ≥ .05), skip to the reporting section below.
NOTE
Getting results that are not statistically significant does NOT mean you switch to the Wilcoxon Signed-Rank test.
The Wilcoxon Signed-Rank test is used when the difference scores are not normally distributed or contain extreme outliers. The choice is never based on whether the result was significant.
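For reference, if the assumption checks had failed, the non-parametric alternative could be run as sketched below. The data are simulated with skewed differences and are not the A6R3 data. exact = FALSE is an assumption made here to request the normal approximation, which avoids warnings when there are ties or zero differences.

```r
# Simulated paired scores with skewed improvements, for illustration only.
set.seed(7)
before_sim <- round(rnorm(40, mean = 60, sd = 8))
after_sim  <- before_sim + round(rexp(40, rate = 0.1))

# Wilcoxon signed-rank test on the paired scores.
wilcox_fit <- wilcox.test(before_sim, after_sim, paired = TRUE, exact = FALSE)
wilcox_fit$statistic
wilcox_fit$p.value
```

With the worksheet's vectors the call would be wilcox.test(Before, After, paired = TRUE, exact = FALSE).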
EFFECT SIZE FOR DEPENDENT T-TEST
Purpose
Determine how large the difference between the Before and After means is, in standard deviation units.
INSTALL REQUIRED PACKAGE
install.packages("effectsize")
LOAD THE PACKAGE
Load the package at the start of every new R session before using it.
library(effectsize)
CALCULATE COHEN’S D
cohens_d(Before, After, paired = TRUE)
## For paired samples, 'repeated_measures_d()' provides more options.
## Cohen's d | 95% CI
## --------------------------
## -1.90 | [-2.17, -1.63]
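As a cross-check, the reported value matches the paired effect size obtained by dividing the t statistic by the square root of the number of pairs (equivalently, the mean difference divided by the SD of the differences). That this is the formula used by cohens_d() with paired = TRUE is an assumption here, supported only by the matching result.

```r
# Values copied from the paired t-test output above.
t_stat <- -23.285   # t statistic
n      <- 150       # number of pairs (df + 1)

# Paired effect size: t divided by the square root of the number of pairs.
d_z <- t_stat / sqrt(n)
round(d_z, 2)
# -1.9 (R drops the trailing zero of -1.90)
```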
QUESTIONS
Q1) What is the size of the effect?
± 0.00 to 0.09 = small
± 0.10 to 0.29 = moderate
± 0.30 to 0.49 = large
± 0.50 to 1.00 = very large
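- Cohen's d for the paired samples is -1.90. Its magnitude (1.90) lies beyond the top of the scale above and well above Cohen's conventional 0.80 benchmark for a large effect, so the effect is very large. The negative sign only reflects the order of the arguments (Before minus After). The difference between Before and After scores is substantial and meaningful.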
Q2) Which group had the higher average score?
- The mean Before score is 59.73 and the mean After score is 69.24. Since the After mean is higher, the After (post-training) scores had the higher average. Both sets of scores come from the same participants.