Dependent t-test

• Compares two means based on related data.

• E.g., Data from the same people measured at different times.

• Data from ‘matched’ samples. Sometimes called: “Matched-samples” t-test

Example

• Are invisible people mischievous?

• 12 participants

• Manipulation

• Placed participants in an enclosed community riddled with hidden cameras.

• For the first week, participants' normal behaviour was observed.

• For the second week, participants were given an invisibility cloak.

• Outcome

• Measured how many mischievous acts participants performed in week 1 and in week 2.

Rationale for the dependent t-test

• We will take the difference of the two scores for each participant. If there is no effect of the treatment (e.g., being invisible), then we expect the difference scores to be approximately 0, on average.

• What are our null and alternative hypotheses?

H0: μ_d = 0

Ha: μ_d ≠ 0

Here μ_d is the population mean of the difference scores.

Calculation of test statistic (t value)

Focuses on difference scores
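A standard form of the paired t statistic, consistent with the by-hand R calculation in this section, can be written as follows. The symbols are notation assumed here: \bar{D} is the mean of the difference scores, s_D their standard deviation, n the number of pairs, and \mu_D the mean difference stated by the null hypothesis.

```latex
t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}}, \qquad df = n - 1
```

Under H0 the value of \mu_D is 0, so the numerator reduces to \bar{D}.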

Assumptions of the dependent t-test

The sampling distribution is normally distributed. In the dependent t-test this means that the sampling distribution of the differences between scores should be normal, not the scores themselves.
Dependent data are measured at least at the interval level. [same assumption as independent t-test]

Data

Running the code in R (by hand)

Or load Invisibility RM.csv

library(readr)
Invisibility_RM <- read_csv("Invisibility RM.csv")

## Rows: 12 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (2): No_Cloak, Cloak
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

head(Invisibility_RM)

## # A tibble: 6 × 2
##   No_Cloak Cloak
##      <dbl> <dbl>
## 1        3     4
## 2        1     3
## 3        5     6
## 4        4     6
## 5        6     8
## 6        4     5

#scores typed in by hand: data holds No_Cloak, data2 holds Cloak
data <- c(3, 1, 5, 4, 6, 4, 6, 2, 0, 5, 4, 5)
data2 <- c(4, 3, 6, 6, 8, 5, 5, 4, 2, 5, 7, 5)

#calculate difference scores
diff <- data - data2
#the same differences, taken from the loaded file
Invisibility_RM$No_Cloak - Invisibility_RM$Cloak

##  [1] -1 -2 -1 -2 -2 -1  1 -2 -2  0 -3  0

mean(diff) #mean of differences

## [1] -1.25

sd(diff) #standard deviation of differences

## [1] 1.13818

t <- mean(diff) / (sd(diff)/sqrt(length(diff)))
t

## [1] -3.80443

Here n is the length of diff, and its square root goes in the denominator. The negative value, -3.8, puts t in the left tail of the distribution. Had we subtracted No_Cloak from Cloak instead, the value would be positive and t would sit in the right tail.

#pt- area to the left on the t-distribution curve
pt(t, df=11)

## [1] 0.001460396

The degrees of freedom are n - 1, that is, df <- length(diff) - 1, which equals 11. pt() returns the area to the left of t. If t were in the right tail of the distribution instead, you would use 1 - pt(t, df).

#2 sided test - need to multiply by 2
2*pt(t, df=11)

## [1] 0.002920793
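The tail logic above can be sketched in one place. This is a minimal illustration, not part of the original notes: the names d, t_val, p_left, p_right and p_two are assumed here, and the difference scores are retyped from the output shown earlier.

```r
# Difference scores (No_Cloak - Cloak), retyped from the notes
d <- c(-1, -2, -1, -2, -2, -1, 1, -2, -2, 0, -3, 0)
n <- length(d)
df <- n - 1
t_val <- mean(d) / (sd(d) / sqrt(n))

# Left-tail area: P(T <= t_val)
p_left <- pt(t_val, df)

# Right-tail area for the mirrored statistic (Cloak - No_Cloak gives -t_val);
# lower.tail = FALSE is the same as 1 - pt(-t_val, df)
p_right <- pt(-t_val, df, lower.tail = FALSE)

# Two-sided p-value written so it works for either sign of t
p_two <- 2 * pt(-abs(t_val), df)

c(left = p_left, right = p_right, two_sided = p_two)
```

Because the t distribution is symmetric, the left-tail area for t and the right-tail area for -t are the same number, so the direction of subtraction does not change the two-sided p-value.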

Check assumption of normality

#check assumption - normality
hist(diff)

shapiro.test(diff)

## 
##  Shapiro-Wilk normality test
## 
## data:  diff
## W = 0.91231, p-value = 0.2284

If these were my actual data, we would run the non-parametric test, because the histogram is skewed and the sample size is small, even though the Shapiro-Wilk test (p = 0.23) does not reject normality.
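A normal Q-Q plot is another common way to look at the same assumption, and it may be easier to read than a histogram with only 12 values. The sketch below is an optional addition, not part of the original analysis; the names d and sw are assumed here.

```r
# Difference scores (No_Cloak - Cloak), retyped from the notes
d <- c(-1, -2, -1, -2, -2, -1, 1, -2, -2, 0, -3, 0)

# Q-Q plot: points that stay close to the line are consistent with normality
qqnorm(d)
qqline(d)

# Shapiro-Wilk test stored as an object so its pieces can be pulled out
sw <- shapiro.test(d)
sw$statistic   # W
sw$p.value     # p-value
```

With heavily tied data like these, the points form horizontal steps on the Q-Q plot, which is one reason to treat any single normality check with caution.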

Run dependent t-test

#run t-test;
#note that here we use commas and not ~ because our data are in different
#columns. Neither data nor data2 is the “independent” variable
t.test(data, data2, paired = TRUE)

## 
##  Paired t-test
## 
## data:  data and data2
## t = -3.8044, df = 11, p-value = 0.002921
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -1.9731653 -0.5268347
## sample estimates:
## mean difference 
##           -1.25

Notice that we pass the data with a comma and not a ~. Recall that the formula form is dependent ~ independent, but dependent data are laid out differently: this is a repeated-measures design, so the scores sit in two separate columns, and each person must be on the same row in both. Give the first column (before), then the second column (after), and tell R the data are paired. You MUST use the comma for the dependent t-test, so the call reads dependent, dependent. The ~ does not go with data in this layout and should not be used for this test.
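One way to see why each person has to stay on the same row: the paired test is a one-sample t-test on the row-wise differences. The sketch below illustrates this with the scores retyped from the notes; the names no_cloak, cloak, paired_res, one_sample and shuffled are assumed here for illustration.

```r
# Scores retyped from the notes; row i is the same participant in both vectors
no_cloak <- c(3, 1, 5, 4, 6, 4, 6, 2, 0, 5, 4, 5)
cloak    <- c(4, 3, 6, 6, 8, 5, 5, 4, 2, 5, 7, 5)

# Paired test on the two columns
paired_res <- t.test(no_cloak, cloak, paired = TRUE)

# One-sample test on the differences against a mean of 0
one_sample <- t.test(no_cloak - cloak, mu = 0)

# Breaking the row alignment (here by reversing one column) changes the result
shuffled <- t.test(no_cloak, rev(cloak), paired = TRUE)

c(paired     = unname(paired_res$statistic),
  one_sample = unname(one_sample$statistic),
  shuffled   = unname(shuffled$statistic))
```

The first two statistics agree, while the third does not, even though the column means are unchanged. The pairing itself carries information.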

t.test(Invisibility_RM$No_Cloak, Invisibility_RM$Cloak, paired=TRUE)

## 
##  Paired t-test
## 
## data:  Invisibility_RM$No_Cloak and Invisibility_RM$Cloak
## t = -3.8044, df = 11, p-value = 0.002921
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -1.9731653 -0.5268347
## sample estimates:
## mean difference 
##           -1.25
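The 95 percent confidence interval in this output can be reproduced by hand from the difference scores. The sketch below assumes the usual construction (mean difference plus or minus the critical t value times the standard error); the names d, se, t_crit and ci are illustrative.

```r
# Difference scores (No_Cloak - Cloak), retyped from the notes
d <- c(-1, -2, -1, -2, -2, -1, 1, -2, -2, 0, -3, 0)
n <- length(d)

se     <- sd(d) / sqrt(n)          # standard error of the mean difference
t_crit <- qt(0.975, df = n - 1)    # critical value for a two-sided 95% interval

# Lower and upper limits of the interval
ci <- mean(d) + c(-1, 1) * t_crit * se
ci
```

The interval does not contain 0, which matches the two-sided p-value falling below 0.05.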

What if the assumptions are not met?

Use the Wilcoxon signed-rank test, a non-parametric test that works on ranks rather than raw scores.
Working with ranks removes the dependence on the difference scores being normally distributed.

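Steps to creating ranks:

Take the difference score for each participant and set aside any differences of 0
Rank the absolute values of the remaining differences (1 for the smallest; tied values share the average rank)
Sum the ranks that belong to positive differences
Sum the ranks that belong to negative differences
Your test statistic is the smaller of the two sums (R reports V, the sum of the positive ranks)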
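The ranking steps can be sketched by hand in R. This is a minimal illustration using the difference scores from these notes; the names d, d_nz, r, pos_sum and neg_sum are assumed here.

```r
# Difference scores (No_Cloak - Cloak), retyped from the notes
d <- c(-1, -2, -1, -2, -2, -1, 1, -2, -2, 0, -3, 0)

# Zero differences are dropped before ranking
d_nz <- d[d != 0]

# Rank the absolute differences; rank() gives tied values their average rank
r <- rank(abs(d_nz))

# Sum the ranks separately for positive and negative differences
pos_sum <- sum(r[d_nz > 0])
neg_sum <- sum(r[d_nz < 0])

c(positive = pos_sum, negative = neg_sum, smaller = min(pos_sum, neg_sum))
```

For these data the positive ranks sum to 2.5 and the negative ranks to 52.5, so the smaller sum and the V reported by wilcox.test() are the same number.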

Wilcoxon signed-rank test

wilcox.test(data, data2, paired = TRUE, exact = FALSE)

## 
##  Wilcoxon signed rank test with continuity correction
## 
## data:  data and data2
## V = 2.5, p-value = 0.01085
## alternative hypothesis: true location shift is not equal to 0

wilcox.test(Invisibility_RM$No_Cloak, Invisibility_RM$Cloak, paired = TRUE, exact = FALSE)

## 
##  Wilcoxon signed rank test with continuity correction
## 
## data:  Invisibility_RM$No_Cloak and Invisibility_RM$Cloak
## V = 2.5, p-value = 0.01085
## alternative hypothesis: true location shift is not equal to 0

Again, we have commas.

exact = FALSE: the exact argument specifies whether to compute an exact p-value or use an asymptotic (normal) approximation. Here we use the approximation. It is usually associated with larger samples, but we also request it here because the ranked differences contain ties.

V is the test statistic. The p-value is 0.011, which is smaller than 0.05, so the difference is significant. This gives us evidence to reject the null hypothesis in favor of the alternative: the number of mischievous acts differs significantly between the cloak and no-cloak conditions.

Summary

For the dependent t-test:

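1. What type of variables do we need for this test?

An outcome measured at least at the interval level, recorded in 2 dependent (paired) groups.

2. What are the null and alternative hypotheses?

Null: the mean of the difference scores equals zero. Alternative: the mean of the difference scores is not equal to zero.

3. What test (and R code) do we need to run the statistical analysis?

The dependent t-test: t.test() with paired = TRUE.

4. What are the assumptions of the test?

The difference scores need to be normally distributed, and the data must be at least interval level.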

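• How do we check the assumptions of the test?

Look at a histogram of the difference scores with hist() and run shapiro.test() on them.

• What alternative tests do we run if the assumptions are not met?

The Wilcoxon signed-rank test: wilcox.test() with paired = TRUE, exact = FALSE.

Note: the p-value is the probability of getting a result at least this extreme if the null hypothesis is true. It is not the probability that the conclusion is true.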

Communication

1. Descriptive statistics

2. Description of the null hypothesis

3. A “stat” block

4. The results are interpreted

On average, students who were not wearing cloaks engaged in fewer mischievous acts (M = 3.75, SD = 1.91) than the same students did when wearing invisibility cloaks (M = 5.00, SD = 1.65). A dependent t-test was conducted to test whether the mean of the difference scores was significantly different from 0. This difference, 1.25, was significant, t(11) = -3.80, p = 0.003. Based on this result, we can conclude that students performed more mischievous acts when wearing invisibility cloaks than when they weren’t wearing cloaks.
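The descriptive statistics for a report like this can be computed directly from the two columns. A minimal sketch, with the scores retyped from the notes and the names no_cloak, cloak and desc assumed here:

```r
# Scores retyped from the notes
no_cloak <- c(3, 1, 5, 4, 6, 4, 6, 2, 0, 5, 4, 5)
cloak    <- c(4, 3, 6, 6, 8, 5, 5, 4, 2, 5, 7, 5)

# Mean and standard deviation for each condition, rounded for reporting
desc <- rbind(
  no_cloak = c(M = mean(no_cloak), SD = sd(no_cloak)),
  cloak    = c(M = mean(cloak),    SD = sd(cloak))
)
round(desc, 2)

# Mean difference, here taken as Cloak - No_Cloak so that it is positive
mean(cloak - no_cloak)
```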

First sentence: Descriptive statistics using the median

Second sentence: test name

Third: stat block

Fourth: interpretation

Students who were not wearing cloaks engaged in fewer mischievous acts (Mdn = 4) than the same students did when wearing invisibility cloaks (Mdn = 5). A Wilcoxon signed-rank test was conducted to test whether the ranks of the positive and negative difference scores differed significantly. This median difference, 1.5, was significant, V = 2.5, p = 0.011. Based on this result, we can conclude that students performed more mischievous acts when wearing invisibility cloaks than when they weren’t wearing cloaks.
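For the non-parametric report, the medians can be obtained the same way. A short sketch, again with the scores retyped from the notes and the names no_cloak and cloak assumed here:

```r
# Scores retyped from the notes
no_cloak <- c(3, 1, 5, 4, 6, 4, 6, 2, 0, 5, 4, 5)
cloak    <- c(4, 3, 6, 6, 8, 5, 5, 4, 2, 5, 7, 5)

# Median number of mischievous acts in each condition
median(no_cloak)
median(cloak)

# Median of the paired differences (Cloak - No_Cloak)
median(cloak - no_cloak)
```

The median of the differences is not the same as the difference between the two medians, so it helps to state which one is being reported.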

For APA format, search Google for “How do I report a t-test?”. The non-parametric report does not use averages: we talk about medians and look at whether the ranks differ from each other.

Dependent t-test

Lorraine Gaudio

2023-03-07
