August 20, 2019
pop = rnorm(100000) #base population
sd1 = NULL #initial sample 1 (with nothing in it)
sd2 = NULL #initial sample 2 (with nothing in it)
for (i in 1:1000) { #take 1000 samples, append them to sd1 and sd2
s1 = sample(pop, 5) #take a sample of 5 from the population
sd1 = append(sd1, sqrt(sum((s1 - mean(s1))^2)/(length(s1) - 1))) #sd with n-1
sd2 = append(sd2, sqrt(sum((s1 - mean(s1))^2)/length(s1))) #sd with n
}
par(mfrow = c(1, 2)) #change parameter settings to a 1x2 plotting area
hist(sd1, xlim = c(0, 2)) #draw a histogram of
abline(v = 1, col = 'red')
abline(v = mean(sd1), col = 'green')
hist(sd2, xlim = c(0, 2))
abline(v = 1, col = 'red')
abline(v = mean(sd2), col = 'green')
pnorm(q = 160, mean = 164, sd = 6) [1] 0.2524925
Is a value that we'd get 25% of the time by chance rare?
This situation can be analysed with a t-test:
scream1 = c(180, 165, 122, 156, 170) #max heart rates scream1 scream2 = c(190, 145, 100, 138, 166) #max heart rates scream2 t.test(scream1, scream2)
As the observed difference between the sample means gets larger and the spread around the means gets smaller, the more confident we become that the second scenario (above) is correct (i.e. that the null hypothesis should be rejected)
We need an objective metric, that takes into account both the difference between the samples AND their standard deviations
We need a metric (a test statistic!) that puts the difference between the samples into perspective with
This is called the t-statistic:
\[t = \frac{\text{observed difference - expected difference}}{\text{estimate of the standard deviations}}\]
In fact, the expected difference is mostly zero (this is the case in the following examples)
\[ t = \frac{\bar{X_1}-\bar{X_2}}{\sqrt{\frac{s^2_p}{n_1} + \frac{s^2_p}{n_2}}} \]
\[ s^2_p = \frac{(n_1 - 1)s^2_1 + (n_2 - 1)s^2_2}{n_1 + n_2 -2} \]
Use the equivalent commands to rnorm(), pnorm(), and qnorm()
rt(100, df = 10); pt(q = 0, df = 10); qt(p = .025, df = 10)
realspider = c(3, 5, 3, 7, 8, 5) spiderpicture = c(5, 6, 3, 8, 7, 8)
You can now organise your data in two ways, the so-called 'wide'- or 'long' format:
d_wide <- data.frame(realspider, spiderpicture)
d_long <- data.frame(treat = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1),
anxiety = c(realspider, spiderpicture))
head(d_wide, 3)
realspider spiderpicture
1 3 5
2 5 6
3 3 3
head(d_long, 3)
treat anxiety
1 0 3
2 0 5
3 0 3
t.test()t.test(d_wide$realspider, d_wide$spiderpicture) t.test(d_long$anxiety ~ d_long$treat) #or: t.test(anxiety ~ treat, data = d_long)
What does the dollar sign do again…?
I recommend the long format, it is more versatile and easier to use when you have a lot of data!
The result however looks the same:
t.test(d_long$anxiety ~ d_long$treat)
Welch Two Sample t-test
data: d_long$anxiety by d_long$treat
t = -0.86966, df = 9.9746, p-value = 0.4049
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.562974 1.562974
sample estimates:
mean in group 0 mean in group 1
5.166667 6.166667
How do we interpret this output?
Apart from a recall of the data, the group means and a confidence interval for the difference between groups, we obtain a t-value, the degrees of freedom, and a p-value
We now compare our obtained t-value against a random t-distribution: how rare is our t-value, which reflects the difference between the groups and their standard deviations?
pt(q = -0.87, df = 10) [1] 0.20235
This is our p-value! (in fact we need to multiply it by 2 to account for the two tails)
qt(p = .025, df = 10) [1] -2.228139
We fail to reject our null hypotheis, but we cannot accept our null hypothesis
We cannot state that there is no difference in anxiety between seeing a real spider or a picture of a spider
We can only say that we don't have enough evidence to reject the null hypothesis, as we could be committing a type II error, and we don't know the probability for such an error (\(\beta\), see slides on power test!)
On average, participants did not experience greater anxiety from real spiders (6.1 \(\pm\) 0.79 s.e.) than from pictures of spiders (5.1 \(\pm\) 0.83 s.e.; t-test, p = 0.4).
or, if it were significant:
On average, participants did experience greater anxiety from real spiders (xy \(\pm\) xy s.e.) than from pictures of spiders (xy \(\pm\) x s.e.; t-test, p < xyz).
d1 = data.frame(transpiration = c(2, 4, 3, 4, 3, 5, 5, 4, 3, 6, 5, 4, 1, 2, 1, 4, 2, 3, 4, 3, 3, 2, 1, 4),
co2 = rep(c('before', 'after'), each = 12))
head(d1)
transpiration co2
1 2 before
2 4 before
3 3 before
4 4 before
5 3 before
6 5 before
t.test(d1$transpiration ~ d1$co2, paired = T)
Paired t-test
data: d1$transpiration by d1$co2
t = -3.7607, df = 11, p-value = 0.003151
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-2.3778894 -0.6221106
sample estimates:
mean of the differences
-1.5
#set argument paired to TRUE
In R, choose alternative = 'two-sided', 'greater' or 'less' (by default the argument is set to 'two-sided':
t.test(d1$transpiration ~ d1$co2, paired = T, alternative = 'greater')
Ingredients:
Method: