Two-sample t-test & Mann-Whitney U test

When two conditions are independent

E.g., each participant completed only one of the two tasks (conditions) in your experiment, so no participant contributes a score to both groups.

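# Load the packages used below
library(dplyr)    # %>%, group_by(), summarise()
library(rstatix)  # cohens_d(), wilcox_effsize()
library(ggplot2)  # box plot

# Create a dummy dataset as an example
Dataset <- data.frame(score = c(45, 85, 66, 78, 92, 94, 91, 85, 62, 60,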
                             40, 56, 70, 80, 90, 88, 95, 90, 45, 55,
                             84, 88, 88, 90, 92, 93, 91, 85, 80, 73,
                             97, 100, 93, 91, 90, 87, 94, 83, 92, 93),
                   condition = c(rep('condition1', 20), rep('condition2', 20)))

Descriptive statistics

# Find the sample size, mean, and standard deviation for each group
Dataset %>%
  group_by(condition) %>%
  summarise(
    count = n(),
    mean = mean(score),
    sd = sd(score)
  )

## # A tibble: 2 × 4
##   condition  count  mean    sd
##   <chr>      <int> <dbl> <dbl>
## 1 condition1    20  73.4 18.3 
## 2 condition2    20  89.2  6.09

Check data normality assumption

# Perform the Shapiro-Wilk test
# Note: this tests the pooled scores of both conditions together
normality_check <- shapiro.test(Dataset$score)
if (normality_check$p.value > 0.05){
  print("No evidence that the data deviate from a normal distribution. Please check the result of the two-sample t-test.")
} else {
  print("The data are not normally distributed. Please check the result of the Mann-Whitney U test.")
}

## [1] "The data are not normally distributed. Please check the result of the Mann-Whitney U test."
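
The check above pools both conditions into one sample. A common alternative is to test each condition separately, since the t-test's normality assumption applies within each group. The following is a minimal base R sketch of that idea, not part of the original analysis; the names `p_values` and `all_normal` are illustrative, and the 0.05 cut-off simply mirrors the one used above.

```r
# Same dummy data as in the example above
Dataset <- data.frame(score = c(45, 85, 66, 78, 92, 94, 91, 85, 62, 60,
                                40, 56, 70, 80, 90, 88, 95, 90, 45, 55,
                                84, 88, 88, 90, 92, 93, 91, 85, 80, 73,
                                97, 100, 93, 91, 90, 87, 94, 83, 92, 93),
                      condition = c(rep('condition1', 20), rep('condition2', 20)))

# Split the scores by condition and run the Shapiro-Wilk test on each group
p_values <- sapply(split(Dataset$score, Dataset$condition),
                   function(x) shapiro.test(x)$p.value)
p_values

# Illustrative decision rule (an assumption): prefer the t-test only if
# neither group shows evidence against normality at the 0.05 level
all_normal <- all(p_values > 0.05)
all_normal
```

With small samples the Shapiro-Wilk test has limited power, so a Q-Q plot per group (e.g., `qqnorm()`) is a useful complement.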

Two sample t test

ref.: http://www.sthda.com/english/wiki/unpaired-two-samples-t-test-in-r

# Compute the t-test
# Student's two-sample t-test (assumes equal variances):
# t_res <- t.test(score ~ condition, data = Dataset, var.equal = TRUE)
# Welch's t-test (the default; does not assume equal variances):
t_res <- t.test(score ~ condition, data = Dataset)
t_res

## 
##  Welch Two Sample t-test
## 
## data:  score by condition
## t = -3.6679, df = 23.143, p-value = 0.001269
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -24.786092  -6.913908
## sample estimates:
## mean in group condition1 mean in group condition2 
##                    73.35                    89.20
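
To make the Welch output easier to read, the sketch below recomputes the t statistic, the Welch-Satterthwaite degrees of freedom, and the two-sided p-value by hand in base R. The variable names (`t_manual`, `df_manual`, `p_manual`) are illustrative only; the results should agree with the `t.test()` output above.

```r
# Same dummy data as in the example above
Dataset <- data.frame(score = c(45, 85, 66, 78, 92, 94, 91, 85, 62, 60,
                                40, 56, 70, 80, 90, 88, 95, 90, 45, 55,
                                84, 88, 88, 90, 92, 93, 91, 85, 80, 73,
                                97, 100, 93, 91, 90, 87, 94, 83, 92, 93),
                      condition = c(rep('condition1', 20), rep('condition2', 20)))

x <- Dataset$score[Dataset$condition == "condition1"]
y <- Dataset$score[Dataset$condition == "condition2"]

# Per-group variance of the mean
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Welch t statistic: mean difference divided by its standard error
t_manual <- (mean(x) - mean(y)) / sqrt(vx + vy)

# Welch-Satterthwaite approximation of the degrees of freedom
df_manual <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# Two-sided p-value from the t distribution
p_manual <- 2 * pt(-abs(t_manual), df_manual)

c(t = t_manual, df = df_manual, p = p_manual)
```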

# Compute the effect size (Cohen's d)
t_effectSize <- cohens_d(score ~ condition, data = Dataset)
t_effectSize

## # A tibble: 1 × 7
##   .y.   group1     group2     effsize    n1    n2 magnitude
## * <chr> <chr>      <chr>        <dbl> <int> <int> <ord>    
## 1 score condition1 condition2   -1.16    20    20 large
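
As a rough cross-check, Cohen's d can also be computed by hand. The sketch below assumes the unequal-variance form, in which the mean difference is divided by the square root of the average of the two group variances; this is understood to be what `cohens_d()` uses by default (`var.equal = FALSE`), but treat that as an assumption and consult the rstatix documentation. The name `d_manual` is illustrative.

```r
# Same dummy data as in the example above
Dataset <- data.frame(score = c(45, 85, 66, 78, 92, 94, 91, 85, 62, 60,
                                40, 56, 70, 80, 90, 88, 95, 90, 45, 55,
                                84, 88, 88, 90, 92, 93, 91, 85, 80, 73,
                                97, 100, 93, 91, 90, 87, 94, 83, 92, 93),
                      condition = c(rep('condition1', 20), rep('condition2', 20)))

x <- Dataset$score[Dataset$condition == "condition1"]
y <- Dataset$score[Dataset$condition == "condition2"]

# Standardizer: square root of the mean of the two group variances
sd_avg <- sqrt((var(x) + var(y)) / 2)

# Cohen's d: standardized mean difference (condition1 minus condition2)
d_manual <- (mean(x) - mean(y)) / sd_avg
d_manual
```

Because both groups have 20 observations, this standardizer equals the classical pooled standard deviation, so the equal-variance form of d gives the same value here.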

Mann-Whitney U test

ref.: https://www.statology.org/mann-whitney-u-test-r/

# Compute Mann-Whitney U test
u_res <- wilcox.test(score ~ condition, data = Dataset)

## Warning in wilcox.test.default(x = c(45, 85, 66, 78, 92, 94, 91, 85, 62, :
## cannot compute exact p-value with ties

u_res

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  score by condition
## W = 96, p-value = 0.005047
## alternative hypothesis: true location shift is not equal to 0

# Compute the effect size (r)
wilcox_effsize(score ~ condition, data = Dataset)

## # A tibble: 1 × 7
##   .y.   group1     group2     effsize    n1    n2 magnitude
## * <chr> <chr>      <chr>        <dbl> <int> <int> <ord>    
## 1 score condition1 condition2   0.445    20    20 moderate
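
The effect size r reported here is conventionally defined as the absolute Z statistic divided by the square root of the total sample size. The sketch below reproduces that calculation in base R using the normal approximation with a tie correction and no continuity correction. That this matches how `wilcox_effsize()` obtains Z internally is an assumption, and the names `W`, `z`, and `r_manual` are illustrative.

```r
# Same dummy data as in the example above
Dataset <- data.frame(score = c(45, 85, 66, 78, 92, 94, 91, 85, 62, 60,
                                40, 56, 70, 80, 90, 88, 95, 90, 45, 55,
                                84, 88, 88, 90, 92, 93, 91, 85, 80, 73,
                                97, 100, 93, 91, 90, 87, 94, 83, 92, 93),
                      condition = c(rep('condition1', 20), rep('condition2', 20)))

in_group1 <- Dataset$condition == "condition1"
n1 <- sum(in_group1)
n2 <- sum(!in_group1)
N  <- n1 + n2

# Rank all scores together (tied scores receive the average rank)
ranks <- rank(Dataset$score)

# W statistic: rank sum of the first group minus its minimum possible value
W <- sum(ranks[in_group1]) - n1 * (n1 + 1) / 2

# Standard deviation of W under the null hypothesis, corrected for ties
tie_sizes <- as.numeric(table(Dataset$score))
sigma_W <- sqrt(n1 * n2 / 12 *
                  ((N + 1) - sum(tie_sizes^3 - tie_sizes) / (N * (N - 1))))

# Z statistic (no continuity correction) and effect size r = |Z| / sqrt(N)
z <- (W - n1 * n2 / 2) / sigma_W
r_manual <- abs(z) / sqrt(N)
c(W = W, z = z, r = r_manual)
```

The resulting W should match the `wilcox.test()` output above, and r should be close to the value reported by `wilcox_effsize()`.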

Box plot with median and mean

ggplot(Dataset, aes(x = condition, y = score)) +
  geom_boxplot(width = 0.3) +  # Box shows the median and quartiles
  stat_summary(fun = mean, geom = "point", col = "black") +  # Mark each group's mean with a point
  stat_summary(fun = mean, geom = "text", col = "black", size = 2.7,  # Label each mean below its point
               vjust = 3,
               aes(label = paste("Mean:", round(after_stat(y), digits = 2)))) +  # after_stat() replaces the deprecated ..y..
  xlab("Conditions") +
  ylab("Score")

cly

2022-08-27
