DEPENDENT T-TEST & WILCOXON SIGNED-RANK

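Used to test whether there is a difference between Before scores and After scores from the same participants. The Dependent t-test compares the mean of the paired differences to zero; the Wilcoxon signed-rank test is the rank-based alternative.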

NULL HYPOTHESIS (H0)

There is no difference in employees' communication abilities before and after the training.

ALTERNATIVE HYPOTHESIS (H1)

There is a difference in employees' communication abilities before and after the training.

IMPORT EXCEL FILE

Purpose

Import your Excel dataset into R to conduct analyses.

INSTALL REQUIRED PACKAGE

install.packages("readxl")

LOAD THE PACKAGE

Load the package in every new R session before using its functions.

library(readxl)

IMPORT EXCEL FILE INTO R STUDIO

A6R3 <- read_excel("/Users/alfred/Desktop/A6R3.xlsx")
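
Before moving on, it can help to confirm that the import produced the columns the later steps rely on. The following is a minimal sketch, assuming the sheet holds numeric columns named PreTraining and PostTraining (the names used in the next step). The small data frame A6R3_demo is a hypothetical stand-in for the real file so the snippet runs on its own.

```r
# Hypothetical stand-in for the imported data; in practice A6R3 comes from read_excel()
A6R3_demo <- data.frame(PreTraining  = c(55, 60, 62, 58),
                        PostTraining = c(65, 70, 68, 71))

# Confirm the expected columns exist and are numeric
needed      <- c("PreTraining", "PostTraining")
has_columns <- all(needed %in% names(A6R3_demo))
is_numeric  <- all(sapply(A6R3_demo[needed], is.numeric))

# Count missing values per column
missing_counts <- colSums(is.na(A6R3_demo[needed]))

has_columns
is_numeric
missing_counts
```

If a score column is imported as text, is_numeric will be FALSE and the later calculations will fail, so it is worth catching here.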

CALCULATE THE DIFFERENCE SCORES

Calculate the difference between the Before and After scores (After minus Before), so a positive difference means the score increased.

Before <- A6R3$PreTraining
After <- A6R3$PostTraining

Differences <- After - Before
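
To illustrate how the sign of a difference score reads, here is a small sketch with made-up scores for four employees (hypothetical values, not the A6R3 data). Because the subtraction is After minus Before, positive values indicate improvement.

```r
# Hypothetical scores for four employees (not the A6R3 data)
before_demo <- c(55, 60, 62, 58)
after_demo  <- c(65, 70, 60, 71)

diff_demo <- after_demo - before_demo   # positive = score went up

improved  <- sum(diff_demo > 0)    # number who scored higher afterwards
declined  <- sum(diff_demo < 0)    # number who scored lower afterwards
unchanged <- sum(diff_demo == 0)   # number with no change

c(improved = improved, declined = declined, unchanged = unchanged)
```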

HISTOGRAM

Create a histogram for difference scores to visually check skewness and kurtosis.

CREATE THE HISTOGRAM

hist(Differences,
     main = "Histogram of Difference Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "blue",
     border = "black",
     breaks = 20)

QUESTIONS

Q1) Is the histogram symmetrical, positively skewed, or negatively skewed?

- The histogram appears roughly symmetrical with no strong skew; at most there is a slight positive skew. (Higher After scores shift the whole distribution toward positive values, but that shift alone does not create skew.)

Q2) Did the histogram look too flat, too tall, or did it have a proper bell curve?

- The histogram has a reasonably proper bell curve shape, indicating a fairly normal distribution.
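
The visual impression of skewness and kurtosis can be backed up with numbers. The sketch below uses simple moment-based formulas written in base R; the helper names skewness() and excess_kurtosis() are defined here for illustration and are not from any package, and the data are simulated rather than the A6R3 differences.

```r
# Simulated difference scores (hypothetical, for illustration only)
set.seed(1)
diff_demo <- rnorm(150, mean = 9.5, sd = 5)

# Moment-based skewness: 0 for a perfectly symmetric distribution,
# positive for a longer right tail, negative for a longer left tail
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Excess kurtosis: 0 for a normal distribution,
# positive = too tall / heavy-tailed, negative = too flat
excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2 - 3
}

skewness(diff_demo)
excess_kurtosis(diff_demo)
```

Values close to 0 on both measures agree with a histogram that looks symmetrical and bell-shaped.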

SHAPIRO-WILK TEST

Check the normality of the difference scores.

CONDUCT SHAPIRO-WILK TEST

shapiro.test(Differences)
## 
##  Shapiro-Wilk normality test
## 
## data:  Differences
## W = 0.98773, p-value = 0.21

QUESTIONS

Q1) Was the data normally distributed or non-normally distributed?

- The Shapiro-Wilk test returned W = 0.988, p = 0.21. Because p is greater than .05, we fail to reject the assumption of normality; the difference scores are consistent with a normal distribution. Therefore, we use the Dependent t-test.
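
The decision rule above can also be written out in code. This is a sketch on simulated difference scores (hypothetical, not the A6R3 data); the .05 cutoff is the one used in this document, and the variable names are illustrative.

```r
# Simulated difference scores (hypothetical, not the A6R3 data)
set.seed(42)
diff_demo <- rnorm(150, mean = 9.5, sd = 5)

sw    <- shapiro.test(diff_demo)   # returns an object of class "htest"
alpha <- 0.05

# p > alpha : no evidence against normality, so the Dependent t-test is reasonable
# p <= alpha: normality is doubtful, so consider the Wilcoxon signed-rank test
if (sw$p.value > alpha) {
  choice <- "Dependent t-test"
} else {
  choice <- "Wilcoxon signed-rank test"
}

sw$p.value
choice
```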

BOXPLOT

Check for any outliers impacting the mean.

CREATE THE BOXPLOT

boxplot(Before, After,
        names = c("Before", "After"),
        main = "Boxplot of Before and After Scores",
        col = c("lightblue", "lightgreen"))

QUESTIONS

Q1) Were there any dots outside of the boxplots? These dots represent participants with extreme scores.

- Yes, there were a few dots outside the boxplots, indicating the presence of outliers.

Q2) If there are outliers, are they changing the mean so much that the mean no longer accurately represents the average score?

- The outliers have some effect on the mean, but not drastically. The mean Before score is 59.73 and the median is 60, while the mean After score is 69.24 and the median is 69.5. Since the mean and median are very close in both groups, the mean still reasonably represents the average score despite the outliers.

Q3) Make a decision. If the outliers are extreme, you will need to switch to the Wilcoxon signed-rank test.

If there are no outliers, or the outliers are not extreme, continue with the Dependent t-test.

- Since the difference scores are consistent with a normal distribution (Shapiro-Wilk p = 0.21) and the outliers are not extreme, we can continue with the Dependent t-test.
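
The dots on a boxplot can be listed numerically, which makes it easier to judge how much they pull the mean. A minimal sketch with made-up scores (hypothetical, not the A6R3 data), using the base R function boxplot.stats():

```r
# Hypothetical scores with one unusually high value (not the A6R3 data)
scores_demo <- c(52, 55, 58, 59, 60, 60, 61, 63, 65, 95)

# Values more than 1.5 * IQR beyond the hinges, i.e. the points boxplot() draws as dots
outliers <- boxplot.stats(scores_demo)$out
outliers

# Compare the mean with and without those values, alongside the median
mean_all     <- mean(scores_demo)
mean_trimmed <- mean(scores_demo[!scores_demo %in% outliers])
c(mean_all = mean_all, mean_trimmed = mean_trimmed, median = median(scores_demo))
```

If the mean barely moves when the flagged values are left out and stays close to the median, the outliers are unlikely to be distorting it.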

DESCRIPTIVE STATISTICS

Calculate the mean, median, SD, and sample size for the Before scores and the After scores.

DESCRIPTIVES FOR BEFORE SCORES

mean(Before, na.rm = TRUE)
## [1] 59.73333
median(Before, na.rm = TRUE)
## [1] 60
sd(Before, na.rm = TRUE)
## [1] 7.966091
length(Before)
## [1] 150

DESCRIPTIVES FOR AFTER SCORES

mean(After, na.rm = TRUE)
## [1] 69.24
median(After, na.rm = TRUE)
## [1] 69.5
sd(After, na.rm = TRUE)
## [1] 9.481653
length(After)
## [1] 150
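
The four descriptive calls can be bundled into one helper so both sets of scores are summarized the same way. This is a sketch; describe_scores() is a hypothetical helper name, and the scores are made up. Note that it counts non-missing values for n, whereas length() would also count any NA entries.

```r
# Hypothetical scores, including one missing value (not the A6R3 data)
scores_demo <- c(55, 60, NA, 62, 58)

describe_scores <- function(x) {
  c(mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE),
    n      = sum(!is.na(x)))   # non-missing count; length(x) would include NAs
}

result_demo <- describe_scores(scores_demo)
result_demo
```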

DEPENDENT T-TEST

t.test(Before, After, paired = TRUE)
## 
##  Paired t-test
## 
## data:  Before and After
## t = -23.285, df = 149, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -10.313424  -8.699909
## sample estimates:
## mean difference 
##       -9.506667
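
The negative sign in this output reflects the order of the arguments: t.test(Before, After, paired = TRUE) works with Before minus After, so higher After scores give a negative mean difference and a negative t. The sketch below, on simulated scores (hypothetical, not the A6R3 data), shows that swapping the arguments only flips the sign and that the statistic equals the mean of the differences divided by its standard error.

```r
# Simulated paired scores (hypothetical, not the A6R3 data)
set.seed(123)
before_demo <- rnorm(30, mean = 60, sd = 8)
after_demo  <- before_demo + rnorm(30, mean = 9.5, sd = 5)

paired_ba <- t.test(before_demo, after_demo, paired = TRUE)   # Before - After
paired_ab <- t.test(after_demo, before_demo, paired = TRUE)   # After - Before

# The same statistic computed by hand from the difference scores
d        <- before_demo - after_demo
t_manual <- mean(d) / (sd(d) / sqrt(length(d)))

unname(paired_ba$statistic)
unname(paired_ab$statistic)
t_manual
paired_ba$p.value
```

The p-value is identical in both directions, so the conclusion does not depend on which set of scores is listed first.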

DETERMINE STATISTICAL SIGNIFICANCE

If results were statistically significant (p < .05), continue to effect size section below.

If results were NOT statistically significant (p ≥ .05), skip to reporting section below.

NOTE

Getting results that are not statistically significant does NOT mean you switch to the Wilcoxon signed-rank test.

The Wilcoxon signed-rank test is only for non-normally distributed difference scores or extreme outliers. The choice is never based on whether the outcome was significant.
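
For reference, this is roughly what the alternative would look like if the normality or outlier checks had failed. It is a sketch on simulated, deliberately skewed scores (hypothetical, not the A6R3 data) using the base R function wilcox.test(); it was not needed for this dataset.

```r
# Simulated paired scores with skewed changes (hypothetical, not the A6R3 data)
set.seed(7)
before_demo <- rnorm(30, mean = 60, sd = 8)
after_demo  <- before_demo + rexp(30, rate = 0.1) - 2

# paired = TRUE requests the signed-rank test on the paired differences
wilcox_result <- wilcox.test(before_demo, after_demo, paired = TRUE)

wilcox_result$method
wilcox_result$statistic   # V: sum of ranks of the positive differences
wilcox_result$p.value
```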

EFFECT SIZE FOR DEPENDENT T-TEST

Purpose

Determine how large the difference between the Before and After means was.

INSTALL REQUIRED PACKAGE

install.packages("effectsize")

LOAD THE PACKAGE

Load the package in every new R session before using its functions.

library(effectsize)

CALCULATE COHEN’S D

cohens_d(Before, After, paired = TRUE)
## For paired samples, 'repeated_measures_d()' provides more options.
## Cohen's d |         95% CI
## --------------------------
## -1.90     | [-2.17, -1.63]
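
As a cross-check, the paired effect size reported here appears to correspond to the mean of the difference scores divided by their standard deviation, which is also the paired t statistic divided by the square root of the number of pairs. The sketch below applies that identity to the t and df reported in this document, then verifies it on simulated scores (hypothetical, not the A6R3 data) using base R only.

```r
# Values copied from the paired t-test output in this document
t_value <- -23.285
n_pairs <- 150            # df = 149, so 150 pairs
d_from_report <- t_value / sqrt(n_pairs)
round(d_from_report, 2)   # agrees with the -1.90 shown by cohens_d()

# The same identity checked on simulated paired scores
set.seed(2024)
before_demo <- rnorm(40, mean = 60, sd = 8)
after_demo  <- before_demo + rnorm(40, mean = 9.5, sd = 5)
diff_demo   <- before_demo - after_demo

d_by_hand <- mean(diff_demo) / sd(diff_demo)
t_demo    <- unname(t.test(before_demo, after_demo, paired = TRUE)$statistic)
d_from_t  <- t_demo / sqrt(length(diff_demo))

c(d_by_hand = d_by_hand, d_from_t = d_from_t)
```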

QUESTIONS

Q1) What is the size of the effect?

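± 0.00 to 0.19 = trivial

± 0.20 to 0.49 = small

± 0.50 to 0.79 = medium

± 0.80 or more = large

- Cohen’s d for the paired samples is -1.90. By Cohen’s conventional benchmarks (0.20 small, 0.50 medium, 0.80 large), an absolute value of 1.90 is a large effect, more than double the 0.80 threshold. The negative sign only reflects the order of subtraction (Before minus After). The difference between Before and After scores is substantial and meaningful.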

Q2) Which group had the higher average score?

- The mean Before score is 59.73, and the mean After score is 69.24. The After (post-training) scores had the higher average, about 9.51 points above the Before scores.