DEPENDENT T-TEST & WILCOXON SIGNED-RANK

# Used to test if there is a difference between Before scores and After scores (comparing the means).

# NULL HYPOTHESIS (H0): There is no difference in the employees' communication skills before and after the training.

# ALTERNATIVE HYPOTHESIS (H1): Employees' communication skills increased after the training.

#install.packages("readxl")
library(readxl)
dataset <- read_excel("/Users/mac/Downloads/A6R3.xlsx")
Before <- dataset$PreTraining
After <- dataset$PostTraining

Differences <- After - Before
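
Difference scores only make sense when each employee has both a Before and an After score. The sketch below shows one way to check this. It uses made-up numbers rather than the A6R3.xlsx data, and the names before_demo, after_demo, and complete_pairs are illustrative only.

```r
# Hypothetical stand-in scores; the real values come from the Excel file.
before_demo <- c(58, 61, NA, 64, 55)
after_demo  <- c(66, 70, 72, NA, 63)

# Both vectors need one entry per employee, in the same order.
stopifnot(length(before_demo) == length(after_demo))

# Keep only employees who have both scores.
complete_pairs <- complete.cases(before_demo, after_demo)
differences_demo <- after_demo[complete_pairs] - before_demo[complete_pairs]

sum(complete_pairs)   # number of usable pairs
differences_demo      # After minus Before for those pairs
```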

HISTOGRAM

Create a histogram for difference scores to visually check skewness and kurtosis.

hist(Differences,
     main = "Histogram of Difference Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "blue",
     border = "black",
     breaks = 20)
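
The histogram gives a visual impression of skewness and kurtosis. The same two quantities can also be computed as numbers. This is a minimal sketch using the moment-based formulas on simulated difference scores (not the A6R3 data). Values near 0 for both are what a bell-shaped distribution would be expected to produce.

```r
# Simulated difference scores for illustration only.
set.seed(1)
differences_demo <- rnorm(150, mean = 9.5, sd = 5)

# Standardize using the population-style SD (divide by n, not n - 1).
centered <- differences_demo - mean(differences_demo)
z <- centered / sqrt(mean(centered^2))

skewness        <- mean(z^3)       # > 0 means a longer right tail
excess_kurtosis <- mean(z^4) - 3   # > 0 means heavier tails than normal

skewness
excess_kurtosis
```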

WRITE THE REPORT

Answer the questions below as a comment within the R script:

Q1) The histogram is positively skewed

Q2) The histogram has a proper bell curve

SHAPIRO-WILK TEST

Check the normality of the difference scores (After minus Before).

CONDUCT SHAPIRO-WILK TEST

shapiro.test(Differences)
## 
##  Shapiro-Wilk normality test
## 
## data:  Differences
## W = 0.98773, p-value = 0.21

QUESTIONS

Answer the questions below as a comment within the R script:

Q1) The difference scores were normally distributed (W = 0.99, p = .21).

If p > 0.05 (p-value is GREATER than .05), the test found no evidence against normality, so treat the data as NORMAL (continue with Dependent t-test).

If p < 0.05 (p-value is LESS than .05), the data are NOT normal (switch to Wilcoxon Signed-Rank).
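
The decision rule above can be written as a small helper. The function name choose_test is hypothetical, the data are simulated, and the 0.21 in the last line is the p-value from the Shapiro-Wilk output shown earlier.

```r
# Hypothetical helper: pick the test from a Shapiro-Wilk p-value.
choose_test <- function(p_value, alpha = 0.05) {
  if (p_value > alpha) "Dependent t-test" else "Wilcoxon Signed-Rank"
}

# Simulated difference scores for illustration only.
set.seed(2)
differences_demo <- rnorm(150, mean = 9.5, sd = 5)

sw <- shapiro.test(differences_demo)
choose_test(sw$p.value)

# Applying the rule to the p-value reported above:
choose_test(0.21)
```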

BOXPLOT

Check for any outliers impacting the mean.

CREATE THE BOXPLOT

boxplot(Before, After,
        names = c("Before", "After"),
        main = "Boxplot of Before and After Scores",
        col = c("lightblue", "lightgreen"))

QUESTIONS

Answer the questions below as a comment within the R script:

Q1) Were there any dots outside of the boxplots? yes

Q2) If there are outliers, are they changing the mean so much that the mean no longer accurately represents the average score? No

Q3) Make a decision. If the outliers are extreme, you will need to switch to a Wilcoxon Signed-Rank test.

If there are no outliers, or the outliers are not extreme, continue with the Dependent t-test.
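
To judge whether outliers are distorting the mean, one option is to list the points the boxplot would draw as dots and compare the mean with and without them. The sketch below uses base R's boxplot.stats() on made-up scores (not the A6R3 data).

```r
# Hypothetical scores with one unusually high value.
scores_demo <- c(52, 55, 58, 60, 61, 63, 65, 90)

# Points beyond 1.5 x the hinge spread, i.e. the dots on a boxplot.
outliers <- boxplot.stats(scores_demo)$out
outliers

# Compare the mean with and without those points, and against the median.
mean(scores_demo)
mean(scores_demo[!scores_demo %in% outliers])
median(scores_demo)
```

If the two means are close to each other and to the median, the outliers are probably not extreme enough to change the choice of test.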

DESCRIPTIVE STATISTICS

Calculate the mean, median, SD, and sample size for each group.

DESCRIPTIVES FOR BEFORE SCORES

mean(Before, na.rm = TRUE)
## [1] 59.73333
median(Before, na.rm = TRUE)
## [1] 60
sd(Before, na.rm = TRUE)
## [1] 7.966091
length(Before)
## [1] 150

DESCRIPTIVES FOR AFTER SCORES

mean(After, na.rm = TRUE)
## [1] 69.24
median(After, na.rm = TRUE)
## [1] 69.5
sd(After, na.rm = TRUE)
## [1] 9.481653
length(After)
## [1] 150
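
The four descriptives can be collected in one helper so that both groups are summarized the same way. The function name describe_scores is hypothetical and the scores are made up. Note that length() counts missing values, so this sketch counts only non-missing scores for n.

```r
# Hypothetical helper returning mean, median, SD, and non-missing n.
describe_scores <- function(x) {
  c(mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE),
    n      = sum(!is.na(x)))
}

# Made-up scores with one missing value.
scores_demo <- c(58, 61, 60, NA, 55)
result <- describe_scores(scores_demo)
result
```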

DEPENDENT T-TEST

Note: The Dependent t-test is also called the Paired Samples t-test.

t.test(Before, After, paired = TRUE)
## 
##  Paired t-test
## 
## data:  Before and After
## t = -23.285, df = 149, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -10.313424  -8.699909
## sample estimates:
## mean difference 
##       -9.506667
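
H1 in this script is directional (scores increased), while t.test() defaults to a two-sided alternative. The sketch below, run on simulated data rather than the A6R3 scores, shows how a one-sided version could be requested with the alternative argument. It also shows that a paired t-test gives the same result as a one-sample t-test on the difference scores.

```r
# Simulated paired scores for illustration only.
set.seed(3)
before_demo <- rnorm(150, mean = 60, sd = 8)
after_demo  <- before_demo + rnorm(150, mean = 9.5, sd = 5)

# Two-sided (default) and one-sided paired tests; After is listed first,
# so "greater" tests whether After scores exceed Before scores.
two_sided <- t.test(after_demo, before_demo, paired = TRUE)
one_sided <- t.test(after_demo, before_demo, paired = TRUE,
                    alternative = "greater")

# Equivalent one-sample test on the difference scores.
on_diffs <- t.test(after_demo - before_demo, mu = 0,
                   alternative = "greater")

one_sided$statistic
on_diffs$statistic
```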

DETERMINE STATISTICAL SIGNIFICANCE

If results were statistically significant (p < .05), continue to effect size section below.

If results were NOT statistically significant (p > .05), skip to reporting section below.

NOTE: Getting results that are not statistically significant does NOT mean you switch to Wilcoxon Signed-Rank.

The Wilcoxon Signed-Rank test is for difference scores that are not normally distributed (or that contain extreme outliers). The choice does not depend on whether the result is significant.
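
If the normality or outlier checks had failed, the Wilcoxon alternative could be run with base R's wilcox.test(). This is a minimal sketch on simulated, skewed data (not the A6R3 scores). exact = FALSE is set here because tied or zero differences rule out an exact p-value.

```r
# Simulated paired scores with skewed gains, for illustration only.
set.seed(4)
before_demo <- round(rnorm(30, mean = 60, sd = 8))
after_demo  <- before_demo + round(rexp(30, rate = 1 / 8))

# Paired Wilcoxon test using the normal approximation.
w <- wilcox.test(after_demo, before_demo, paired = TRUE, exact = FALSE)

w$statistic   # V, the sum of ranks of the positive differences
w$p.value
```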

EFFECT SIZE FOR DEPENDENT T-TEST

Purpose: Determine how big of a difference there was between the group means.

INSTALL REQUIRED PACKAGE

If never installed, remove the hashtag before the install code.

If previously installed, leave the hashtag in front of the code.

#install.packages("effectsize")

LOAD THE PACKAGE

Always reload the package you want to use.

library(effectsize)

CALCULATE COHEN’S D

cohens_d(Before, After, paired = TRUE)
## For paired samples, 'repeated_measures_d()' provides more options.
## Cohen's d |         95% CI
## --------------------------
## -1.90     | [-2.17, -1.63]
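
As a cross-check, the d reported above can be reproduced by hand. This sketch assumes that cohens_d(..., paired = TRUE) standardizes the mean difference by the SD of the difference scores, which is algebraically the same as t divided by the square root of n. The t-value and n are taken from the output above; the second half repeats the check on simulated data.

```r
# Values from the paired t-test output above.
t_value <- -23.285
n       <- 150

d_from_t <- t_value / sqrt(n)
round(d_from_t, 2)

# Same identity on simulated difference scores (illustration only).
set.seed(5)
differences_demo <- rnorm(150, mean = -9.5, sd = 5)
d_direct <- mean(differences_demo) / sd(differences_demo)
d_via_t  <- unname(t.test(differences_demo)$statistic) /
  sqrt(length(differences_demo))
all.equal(d_direct, d_via_t)
```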

QUESTIONS

Answer the questions below as a comment within the R script:

Q1) What is the size of the effect?

The effect size describes how big or small the difference between the group averages was.

± 0.00 to 0.19 = ignore

± 0.20 to 0.49 = small

± 0.50 to 0.79 = moderate

± 0.80 to 1.29 = large

± 1.30 and above = very large

Examples:

A Cohen’s d of 0.10 indicates the difference between the group averages was too small to be meaningful. There was no effect.

A Cohen’s d of 0.22 indicates the difference between the group averages was small.
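
The scale above can be turned into a lookup so that any d is labeled consistently. The function name interpret_d is hypothetical; the cut points and labels are the ones listed in this section, applied to the absolute value of d.

```r
# Hypothetical helper mapping |d| to the labels used in this script.
interpret_d <- function(d) {
  cut_points <- c(0, 0.20, 0.50, 0.80, 1.30)
  labels     <- c("ignore", "small", "moderate", "large", "very large")
  labels[findInterval(abs(d), cut_points)]
}

# The two examples above, plus the d from this analysis.
interpret_d(c(0.10, 0.22, -1.90))
```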

There was a very large difference between the Before and After scores (d = -1.90).

Q2) Which group had the higher average score?

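The Differences variable was calculated as After minus Before, so a positive difference means the After scores were higher.

t.test(Before, After, ...) and cohens_d(Before, After, ...) subtract in the opposite order (Before minus After), so the negative t-value, mean difference, and Cohen’s d in the output above also mean the After scores were higher.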

You can also easily look at the means and tell which scores were higher.

The After scores were higher (M = 69.24 vs. M = 59.73).

Research Report on Results: Dependent t-test

Goal: Write a paragraph summarizing your findings

Directions:

For your results summary, you should report the following information:

1. The name of the inferential test used (Dependent t-test or Paired Samples t-test)

3. The sample size (n)

4. Whether the test was statistically significant (p < .05) or not (p > .05)

5. The mean (M) and standard deviation (SD) for each condition

6. Whether scores significantly increased, decreased, or stayed the same across time/conditions

7. Degrees of freedom (df)

8. t-value

9. EXACT p-value to three decimals. NOTE: If p > .05, just report p > .05. If p < .001, just report p < .001.

10. If there was a significant difference, report the effect size (Cohen’s d) and interpretation (small, moderate, large, or very large)

Example:

# A dependent t-test was conducted to compare the pre-training and post-training communication scores of 150 participants.
# Scores after training (M = 69.24, SD = 9.48) were significantly higher than scores before training (M = 59.73, SD = 7.97), t(149) = -23.29, p < .001.
# The mean difference (Before minus After) was -9.51, 95% CI [-10.31, -8.70]. The effect size was Cohen's d = -1.90, a very large effect.
# These results indicate that scores increased significantly following the training program.
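
The reporting items listed above can also be assembled in code, which avoids copying errors. In this sketch the helper format_p and the variable names are hypothetical, and the numbers are copied from the output earlier in this script. The p-value is entered as 2.2e-16 because t.test() only prints that upper bound.

```r
# Values copied from the descriptives, t-test, and effect size output.
n        <- 150L
m_before <- 59.73333; sd_before <- 7.966091
m_after  <- 69.24;    sd_after  <- 9.481653
t_value  <- -23.285
p_value  <- 2.2e-16   # upper bound printed by t.test()
d_value  <- -1.90

# Hypothetical helper implementing the p-value reporting rule.
format_p <- function(p) {
  if (p < .001) return("p < .001")
  if (p > .05)  return("p > .05")
  sub("0\\.", ".", sprintf("p = %.3f", p))   # drop the leading zero
}

report <- sprintf(
  paste("Scores after training (M = %.2f, SD = %.2f) were higher than",
        "before training (M = %.2f, SD = %.2f), t(%d) = %.2f, %s, d = %.2f."),
  m_after, sd_after, m_before, sd_before,
  n - 1L, t_value, format_p(p_value), d_value)

cat(report, "\n")
```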