This document was composed from Dr. Snopkowski’s ANTH 504 Week 8 lecture and Danielle Navarro’s 2021 Learning statistics with R Chapter 13.
• Compares two means based on related data.
• E.g., Data from the same people measured at different times.
• Data from ‘matched’ samples. Sometimes called: “Matched-samples” t-test
• Are invisible people mischievous?
• 12 participants, each measured twice (24 scores in total)
• Manipulation
• Placed participants in an enclosed community riddled with hidden cameras.
• For the first week, participants' normal behaviour was observed.
• For the second week, participants were given an invisibility cloak.
• Outcome
• measured how many mischievous acts participants performed in week 1 and week 2.
• We will take the difference of the two scores for each participant. If there is no effect of the treatment (e.g., being invisible), then we expect the difference scores to be approximately 0, on average.
• What are our null and alternative hypotheses?
H0: μ_D = 0
Ha: μ_D ≠ 0
Focuses on difference scores
The sampling distribution is normally distributed. In the dependent t-test this means that the sampling distribution of the differences between scores should be normal, not the scores themselves.
Dependent data are measured at least at the interval level. [same assumption as independent t-test]
Running the code in R (by hand)
Or load Invisibility RM.csv
library(readr)
Invisibility_RM <- read_csv("Invisibility RM.csv")
## Rows: 12 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (2): No_Cloak, Cloak
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(Invisibility_RM)
## # A tibble: 6 × 2
## No_Cloak Cloak
## <dbl> <dbl>
## 1 3 4
## 2 1 3
## 3 5 6
## 4 4 6
## 5 6 8
## 6 4 5
data <- c(3, 1, 5, 4, 6, 4, 6, 2, 0, 5, 4, 5)   # No_Cloak scores (week 1)
data2 <- c(4, 3, 6, 6, 8, 5, 5, 4, 2, 5, 7, 5)  # Cloak scores (week 2)
#calculate difference scores
diff <- data - data2
Invisibility_RM$No_Cloak - Invisibility_RM$Cloak
## [1] -1 -2 -1 -2 -2 -1 1 -2 -2 0 -3 0
mean(diff) #mean of differences
## [1] -1.25
sd(diff) #standard deviation of differences
## [1] 1.13818
t <- mean(diff) / (sd(diff)/sqrt(length(diff)))
t
## [1] -3.80443
n is the length of “diff”; its square root goes in the denominator of the standard error. A t value of -3.8 falls in the left tail of the t distribution. Had we subtracted No_Cloak from Cloak instead, the t value would be positive and would fall in the right tail.
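Written out, the calculation above is the usual paired t statistic. The symbols \bar{D}, s_D, and n are notation chosen here for the mean, standard deviation, and number of difference scores; the numbers are the ones printed by R above.

```latex
t = \frac{\bar{D}}{s_D / \sqrt{n}} = \frac{-1.25}{1.13818 / \sqrt{12}} \approx -3.80
```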
#pt- area to the left on the t-distribution curve
pt(t, df=11)
## [1] 0.001460396
Degrees of freedom are n - 1, i.e.
df <- length(diff) - 1, which equals 11. If t were on
the other (right) side of the distribution, you would use
1 - pt(t, df) instead.
#2 sided test - need to multiply by 2
2*pt(t, df=11)
## [1] 0.002920793
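The tail logic can be written so that it does not depend on the sign of t. This is a minimal sketch, not part of the lecture code: the names no_cloak, cloak, d, and t_stat are chosen here for illustration (they also avoid masking the base R functions t() and diff()), and lower.tail = FALSE is the standard pt() argument for the right-tail area.

```r
# Scores copied from the example above
no_cloak <- c(3, 1, 5, 4, 6, 4, 6, 2, 0, 5, 4, 5)
cloak    <- c(4, 3, 6, 6, 8, 5, 5, 4, 2, 5, 7, 5)

d      <- no_cloak - cloak              # difference scores
n      <- length(d)
t_stat <- mean(d) / (sd(d) / sqrt(n))   # paired t statistic
df     <- n - 1

p_left  <- pt(t_stat, df)                       # area to the left of t
p_right <- pt(t_stat, df, lower.tail = FALSE)   # area to the right of t

# Two-sided p-value: double the tail area beyond |t|,
# which works whether t is negative or positive
p_two <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)
p_two
```

Using abs(t_stat) means the same line gives the two-sided p-value regardless of which column was subtracted from which.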
#check assumption - normality
hist(diff)
shapiro.test(diff)
##
## Shapiro-Wilk normality test
##
## data: diff
## W = 0.91231, p-value = 0.2284
If this were my actual data, we’d run the non-parametric test because the histogram is skewed and the sample size is small, even though the Shapiro-Wilk test does not reject normality (p = 0.23).
#run t-test;
#note that here we use commas and not ~ because our data are in different
#columns. Neither data nor data2 are the “independent” variable
t.test(data, data2, paired=T)
##
## Paired t-test
##
## data: data and data2
## t = -3.8044, df = 11, p-value = 0.002921
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -1.9731653 -0.5268347
## sample estimates:
## mean difference
## -1.25
We pass the two columns separated by a comma, not a ~. Recall that the
formula form is dependent ~ independent, but repeated-measures data are
laid out differently: the dependent scores sit in 2 different columns,
and each person needs to be on the same row in both. Give the first
column (initial) and then the second column (after), and tell R the data
are paired. You MUST use the comma form for the dependent t-test: both
arguments are dependent scores, so the ~ form does not fit this layout
and will give the wrong answer.
t.test(Invisibility_RM$No_Cloak, Invisibility_RM$Cloak, paired=TRUE)
##
## Paired t-test
##
## data: Invisibility_RM$No_Cloak and Invisibility_RM$Cloak
## t = -3.8044, df = 11, p-value = 0.002921
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -1.9731653 -0.5268347
## sample estimates:
## mean difference
## -1.25
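A paired t-test is equivalent to a one-sample t-test on the difference scores, which is why the hypotheses are stated in terms of the mean difference. The sketch below checks this and rebuilds the 95% confidence interval by hand; the variable names are chosen here for illustration.

```r
# Scores copied from the example above
no_cloak <- c(3, 1, 5, 4, 6, 4, 6, 2, 0, 5, 4, 5)
cloak    <- c(4, 3, 6, 6, 8, 5, 5, 4, 2, 5, 7, 5)
d        <- no_cloak - cloak

# Paired test on the two columns
paired_res <- t.test(no_cloak, cloak, paired = TRUE)

# One-sample test of the difference scores against 0
one_sample_res <- t.test(d, mu = 0)

# 95% confidence interval by hand: mean +/- critical t * standard error
se         <- sd(d) / sqrt(length(d))
t_crit     <- qt(0.975, df = length(d) - 1)
ci_by_hand <- mean(d) + c(-1, 1) * t_crit * se

paired_res$statistic
one_sample_res$statistic
ci_by_hand
```

Both calls should report the same t, df, and p-value, and ci_by_hand should match the interval printed by t.test().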
The Wilcoxon signed-rank test is the non-parametric alternative. It
works on ranks rather than raw scores, so it does not depend on the differences being normally distributed.
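Steps to creating ranks:
Take the difference scores and drop any that equal 0
Rank the absolute values of the remaining differences (1 for the smallest; ties share the average rank)
Then sum up the ranks of the positive differences
Sum up the ranks of the negative differences
Your test statistic is the smaller of the two sums (R reports V, the sum of the positive ranks)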
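The ranking steps can be sketched by hand in R. This is an illustration rather than the lecture's code: the variable names are chosen here, zero differences are dropped before ranking, and rank() gives tied values their average rank.

```r
# Scores copied from the example above
no_cloak <- c(3, 1, 5, 4, 6, 4, 6, 2, 0, 5, 4, 5)
cloak    <- c(4, 3, 6, 6, 8, 5, 5, 4, 2, 5, 7, 5)
d        <- no_cloak - cloak

d_nonzero <- d[d != 0]            # zero differences carry no sign, so drop them
ranks     <- rank(abs(d_nonzero)) # rank absolute differences; ties get average ranks

pos_sum <- sum(ranks[d_nonzero > 0])  # sum of ranks for positive differences
neg_sum <- sum(ranks[d_nonzero < 0])  # sum of ranks for negative differences

pos_sum
neg_sum
min(pos_sum, neg_sum)                 # the smaller of the two sums

# V as reported by wilcox.test() is the sum of the positive ranks
v_from_r <- unname(wilcox.test(no_cloak, cloak, paired = TRUE, exact = FALSE)$statistic)
v_from_r
```

With these data the positive-rank sum is also the smaller sum, so the hand calculation and R's V agree.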
wilcox.test(data, data2, paired=T, exact=F)
##
## Wilcoxon signed rank test with continuity correction
##
## data: data and data2
## V = 2.5, p-value = 0.01085
## alternative hypothesis: true location shift is not equal to 0
wilcox.test(Invisibility_RM$No_Cloak, Invisibility_RM$Cloak, paired = TRUE, exact = FALSE)
##
## Wilcoxon signed rank test with continuity correction
##
## data: Invisibility_RM$No_Cloak and Invisibility_RM$Cloak
## V = 2.5, p-value = 0.01085
## alternative hypothesis: true location shift is not equal to 0
Again, we have commas.
exact = FALSE: specifies whether to compute exact
p-values or use a normal (asymptotic) approximation. Here we ask for
the approximation because the difference scores contain ties and
zeros, and R cannot compute an exact p-value in that case.
V is the test statistic. The p-value is 0.01, which is smaller than 0.05, so the difference is significant. This provides evidence to reject our null hypothesis in favor of the alternative: the number of mischievous acts differs significantly between the cloak and no-cloak conditions.
For the dependent t-test:
1.What type of variables do we need for this test?
An interval-level outcome measured in 2 dependent (paired) groups.
2.What are the null and alternative hypotheses?
Null: the mean of the difference scores equals zero. Alternative: the mean of the difference scores is not equal to zero.
3.What test (and R code) do we need to run the statistical analysis?
t.test() with paired = TRUE.
4.What are the assumptions of the test?
The difference scores need to be normally distributed.
• How do we check the assumptions of the test?
hist() and shapiro.test() on the difference scores.
• What alternative tests do we run if the assumptions are not met?
wilcox.test() with
paired = TRUE, exact = FALSE
The p-value is the probability of getting a result at least as extreme as the one observed if the null hypothesis were true. It is not the probability that the conclusion is true.
1.Descriptive Statistics
2.Description of the null hypothesis
3.A “stat” block
4.The results are interpreted
On average, students who were not wearing cloaks engaged in fewer
mischievous acts (M = 3.75, SD = 1.91) than the same
students did when wearing invisibility cloaks
(M = 5, SD = 1.65). A dependent t-test was conducted to
test whether the mean difference score was significantly different
from 0. This difference, 1.25, was significant,
t(11) = -3.8, p = 0.003. Based on this result, we can
conclude that students performed more mischievous acts when wearing
invisibility cloaks than when they weren’t wearing cloaks.
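The descriptive statistics for a write-up like this can be computed directly from the two score vectors. A minimal sketch, with variable names chosen here for illustration:

```r
# Scores copied from the example above
no_cloak <- c(3, 1, 5, 4, 6, 4, 6, 2, 0, 5, 4, 5)
cloak    <- c(4, 3, 6, 6, 8, 5, 5, 4, 2, 5, 7, 5)

# Mean and standard deviation for each condition, rounded for reporting
m_no_cloak  <- round(mean(no_cloak), 2)
sd_no_cloak <- round(sd(no_cloak), 2)
m_cloak     <- round(mean(cloak), 2)
sd_cloak    <- round(sd(cloak), 2)

# Mean difference, reported as Cloak minus No_Cloak so it is positive
mean_diff <- mean(cloak - no_cloak)

c(m_no_cloak, sd_no_cloak, m_cloak, sd_cloak, mean_diff)
```

Computing the values in one place makes it easier to keep the numbers in the sentence consistent with the test output.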
First sentence: Descriptive statistics using the median
Second sentence: test name
Third: stat block
Fourth: interpretation
Students who were not wearing cloaks engaged in fewer mischievous
acts (Mdn = 4) than the same students did when wearing
invisibility cloaks (Mdn = 5). A Wilcoxon signed-rank test
was conducted to test whether the ranked difference scores were
significantly different from 0. The median difference, 1.5, was significant,
V = 2.5, p = 0.011. Based on this result, we can conclude
that students performed more mischievous acts when wearing
invisibility cloaks than when they weren’t wearing cloaks.
Search “How do I report a t-test?” in Google for APA format. The non-parametric write-up does not report averages: we report medians and test whether the ranks differ from each other.
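The medians for the non-parametric write-up can be obtained the same way as the means. A short sketch, with variable names chosen here for illustration:

```r
# Scores copied from the example above
no_cloak <- c(3, 1, 5, 4, 6, 4, 6, 2, 0, 5, 4, 5)
cloak    <- c(4, 3, 6, 6, 8, 5, 5, 4, 2, 5, 7, 5)

mdn_no_cloak <- median(no_cloak)          # median acts without the cloak
mdn_cloak    <- median(cloak)             # median acts with the cloak
mdn_diff     <- median(cloak - no_cloak)  # median of the paired differences

c(mdn_no_cloak, mdn_cloak, mdn_diff)
```

Note that the median of the paired differences need not equal the difference between the two medians, so it helps to say which one is being reported.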