August 22, 2022

Good morning!

Week 6

Midsemester test nightmare

Midsemester / Exam

What to put on the cheat sheet?

  • Think of the test/exam as a normal exam where you don’t get access to your notes
  • As you prepare for the test/exam, put things on your cheat sheet that you would otherwise have to remember or learn by heart

Possible examples:

  • All R functions that you have used so far (and how to apply them)
  • The formulas for sd, se, var, sum of squares
  • Examples for systematic vs. non-systematic variation, type I/II errors
  • An example for a t-test, from null hypothesis to interpretation
  • The ‘recipe’ for a confidence interval calculation

What we will do today

  • A little simulation of why we divide by \(n-1\), not \(n\), when calculating the variance (and hence the standard deviation)
  • Quick recap of the box1-box2 example
  • What is a t-test?
  • What different t-tests exist?
  • What is a t-distribution?
  • What is one-tailed vs. two-tailed testing?
  • How to do a t-test in R
  • What assumptions need to be met?

Why do we divide by \(n-1\), not \(n\), when calculating the variance?

pop = rnorm(100000) #base population, drawn from a distribution with variance 1

var1 = NULL #empty vector to collect variance estimates computed with n-1
var2 = NULL #empty vector to collect variance estimates computed with n

for (i in 1:1000) { #take 1000 samples, append an estimate to var1 and var2 each time
  s1 = sample(pop, 5) #take a sample of 5 from the population
  var1 = append(var1, sum((s1 - mean(s1))^2)/(length(s1) - 1)) #variance with n-1
  var2 = append(var2, sum((s1 - mean(s1))^2)/length(s1)) #variance with n
}

par(mfrow = c(1, 2)) #change parameter settings to a 1x2 plotting area
hist(var1, xlim = c(0, 5)) #histogram of the n-1 estimates
abline(v = 1, col = 'red') #true variance
abline(v = mean(var1), col = 'green') #mean of the estimates
hist(var2, xlim = c(0, 5)) #histogram of the n estimates
abline(v = 1, col = 'red') #true variance
abline(v = mean(var2), col = 'green') #mean of the estimates
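
To see the bias directly, compare the averages of the two sets of estimates against the true variance of 1 (a quick check, not on the slides, that you can run after the code above):

mean(var1) #should be close to 1: dividing by n-1 gives an (approximately) unbiased estimate
mean(var2) #should be noticeably below 1: dividing by n underestimates the variance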

Recap week 5

pnorm(q = 160, mean = 164, sd = 6)
[1] 0.2524925

Is a value that we’d get 25% of the time by chance rare?

The story with the two boxes

The simplest experiment

  • Only one response variable, one predictor
  • The predictor variable is binomial (e.g. treated, non-treated)
  • For example, we can ask if the movie ‘Scream 2’ is scarier than the original ‘Scream’.
  • We could measure heart rates (which indicate anxiety) during both films and compare them.

This situation can be analysed with a t-test:

scream1 = c(180, 165, 122, 156, 170) #max heart rates scream1
scream2 = c(190, 145, 100, 138, 166) #max heart rates scream2
t.test(scream1, scream2)

The t-test

  • Independent t-test (or simply t-test)
    • Compares two means based on independent data.
    • E.g. data from different groups of people
  • Paired (or dependent) t-test
    • Compares two means based on related data.
    • E.g. data from the same people measured at different times.
    • Data from ‘matched’ samples (e.g. before - after)
  • One-tailed vs. two-tailed testing

Rationale for the t-test (1)

  • Two samples are collected and the sample means calculated. These means might differ by either a little or a lot. Our null hypothesis is: ‘There is no difference between the samples’
  • If the samples come from the same population, then we expect their means to be roughly equal (give or take a little due to chance)
  • We compare the difference between the sample means that we collected to the difference between the sample means that we would expect to obtain if there were no effect (i.e. if the null hypothesis were true). We use the standard deviation(s) as a gauge of the variability in our samples. Why does the spread in the two samples matter?

Rationale for the t-test (2)

  • If the difference between the samples we have collected is unusually large then we can assume one of two things:
    • There is no ‘effect’ (no real difference), but sample means in our population fluctuate a lot and we have, by chance, collected one or two atypical samples.
    • OR: the two samples come from different populations and each is typical of its parent population. In this scenario, the difference between the samples represents a genuine difference between the populations (and so the null hypothesis is incorrect).

The larger the observed difference between the sample means, and the smaller the spread around those means, the more confident we become that the second scenario above is correct (i.e. that the null hypothesis should be rejected)

We need an objective metric that takes into account both the difference between the sample means AND their standard deviations

Rationale for the t-test (in brief)

We need a metric (a test statistic!) that puts the difference between the samples into perspective with

  • the difference between the samples that we would expect by chance, and
  • the standard deviations of the two samples

This is called the t-statistic:

\[t = \frac{\text{observed difference - expected difference}}{\text{estimate of the standard error of the difference}}\]

In fact, the expected difference is usually zero (as it is in the following examples)

The t-statistic, the test statistic for a t-test

\[ t = \frac{\bar{X_1}-\bar{X_2}}{\sqrt{\frac{s^2_p}{n_1} + \frac{s^2_p}{n_2}}} \]

\[ s^2_p = \frac{(n_1 - 1)s^2_1 + (n_2 - 1)s^2_2}{n_1 + n_2 -2} \]

\(t\) follows a \(t\)-distribution!

\(n_1 + n_2 - 2\) are the degrees of freedom, the only parameter for this distribution
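
As a small sketch (using the Scream heart-rate data from earlier; not part of the slides), the t-statistic can be computed by hand from these formulas and checked against t.test(). Note that t.test() defaults to the Welch version, so var.equal = TRUE is needed to match the pooled-variance formula above:

x1 = c(180, 165, 122, 156, 170); x2 = c(190, 145, 100, 138, 166) #scream1 and scream2
n1 = length(x1); n2 = length(x2)
sp2 = ((n1 - 1)*var(x1) + (n2 - 1)*var(x2)) / (n1 + n2 - 2) #pooled variance
(mean(x1) - mean(x2)) / sqrt(sp2/n1 + sp2/n2) #t-statistic by hand
t.test(x1, x2, var.equal = TRUE)$statistic #should give the same value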

The t-distribution

Use rt(), pt(), and qt(), the equivalents of rnorm(), pnorm(), and qnorm():

rt(100, df = 10); pt(q = 0, df = 10); qt(p = .025, df = 10)
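
As an aside (a small sketch, not from the slides), overlaying a t-density on the standard normal shows its heavier tails at small degrees of freedom:

curve(dnorm(x), from = -4, to = 4, lty = 2, ylab = 'density') #standard normal (dashed)
curve(dt(x, df = 5), from = -4, to = 4, add = TRUE) #t-distribution with 5 df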

Arachnophobia example

  • Is arachnophobia (fear of spiders) specific to real spiders or is a picture enough?
  • Participants
    • 12 arachnophobic individuals
  • Manipulation
    • 6 participants were exposed to a real spider
    • 6 were exposed to a picture of the same spider
  • Response variable: anxiety (using an imaginary anxiety meter…)
  • Our null hypothesis is: There is no difference in anxiety between seeing a real spider and seeing a picture of a spider
realspider = c(3, 5, 3, 7, 8, 5)
spiderpicture = c(5, 6, 3, 8, 7, 8)

Arachnophobia example

You can now organise your data in two ways, in the so-called ‘wide’ or ‘long’ format:

d_wide = data.frame(realspider, spiderpicture)
d_long = data.frame(treat = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1),
                     anxiety = c(realspider, spiderpicture))
head(d_wide, 3)
  realspider spiderpicture
1          3             5
2          5             6
3          3             3
head(d_long, 3)
  treat anxiety
1     0       3
2     0       5
3     0       3
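
If you already have the wide data frame, base R's stack() is one way to convert it to a long layout (a sketch; note that the grouping column will then contain the original column names rather than 0/1):

d_long2 = stack(d_wide) #gives columns 'values' and 'ind'
names(d_long2) = c('anxiety', 'treat') #rename to match d_long
head(d_long2, 3)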

The t-test in R

  • To do a t-test we use the function t.test()
  • Depending on the format of your data, you can use this function in two ways:
t.test(d_wide$realspider, d_wide$spiderpicture)
t.test(d_long$anxiety ~ d_long$treat) #or:
t.test(anxiety ~ treat, data = d_long) #preferred

What does the dollar sign do again…?

I recommend the long format: it is more versatile and easier to use when you have a lot of data!

The result, however, looks the same:

The t-test in R

t.test(d_long$anxiety ~ d_long$treat)

    Welch Two Sample t-test

data:  d_long$anxiety by d_long$treat
t = -0.86966, df = 9.9746, p-value = 0.4049
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 -3.562974  1.562974
sample estimates:
mean in group 0 mean in group 1 
       5.166667        6.166667 

How do we interpret this output?

Comparing the t-value against a random t-distribution

  • Besides a restatement of the data, the group means, and a confidence interval for the difference between groups, we obtain a t-value, the degrees of freedom, and a p-value

  • We now compare our obtained t-value against a random t-distribution: how rare is our t-value, which reflects the difference between the groups and their standard deviations?

pt(q = -0.87, df = 10)
[1] 0.20235

This is (half of) our p-value! We need to multiply it by 2 to account for the two tails, which gives approximately the p-value of 0.40 reported by t.test()
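
Using the unrounded t-value and the Welch degrees of freedom from the output, the two-tailed p-value can be reproduced exactly (a quick check, not part of the slides):

2 * pt(q = -abs(-0.86966), df = 9.9746) #≈ 0.4049, the p-value reported by t.test()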

The t-test in R

qt(p = .025, df = 10) #critical t-value for a two-tailed test at alpha = 0.05
[1] -2.228139

Our observed t-value (-0.87) lies well inside the interval from -2.23 to 2.23, so it is not extreme enough to reject the null hypothesis at the 5% level

Arachnophobia example: our conclusion

We fail to reject our null hypothesis, but we cannot accept our null hypothesis either

We cannot state that there is no difference in anxiety between seeing a real spider or a picture of a spider

We can only say that we don’t have enough evidence to reject the null hypothesis, as we could be committing a type II error, and we don’t know the probability for such an error (\(\beta\), see slides on power test!)

How to report the results of a t-test

On average, participants did not experience greater anxiety from real spiders (5.2 \(\pm\) 0.83 s.e.) than from pictures of spiders (6.2 \(\pm\) 0.79 s.e.; t-test, p = 0.4).

or, if it were significant:

On average, participants did experience greater anxiety from real spiders (xy \(\pm\) xy s.e.) than from pictures of spiders (xy \(\pm\) xy s.e.; t-test, p = xyz).

Paired t-test: example

  • Does plant transpiration respond to treatment with carbon dioxide?
  • Null hypothesis: plant transpiration is not affected by carbon dioxide
  • Measure transpiration in 12 leaves before and after treatment with carbon dioxide
  • The measurements are paired: the 12 leaves are the same before and after the treatment
  • Could you create a fictitious data frame for this example?

Paired t-test: example

d1 = data.frame(transpiration = c(2, 4, 3, 4, 3, 5, 5, 4, 3, 6, 5, 4, 1, 2, 1, 4, 2, 3, 4, 3, 3, 2, 1, 4),
                co2 = rep(c('before', 'after'), each = 12))
head(d1)
  transpiration    co2
1             2 before
2             4 before
3             3 before
4             4 before
5             3 before
6             5 before

Paired t-test: example

t.test(d1$transpiration ~ d1$co2, paired = T)

    Paired t-test

data:  d1$transpiration by d1$co2
t = -3.7607, df = 11, p-value = 0.003151
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
 -2.3778894 -0.6221106
sample estimates:
mean difference 
           -1.5 
Note the argument paired = TRUE (abbreviated to T above).
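
As a sanity check (a sketch, not from the slides): a paired t-test is equivalent to a one-sample t-test on the within-pair differences, so the following should reproduce the same t, df, and p-value:

before = d1$transpiration[d1$co2 == 'before']
after = d1$transpiration[d1$co2 == 'after']
t.test(after - before) #one-sample t-test on the differences (after minus before)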

One- vs. two-tailed tests

  • Depending on whether we expect differences between groups to occur in both directions or only in one direction, we use 1- or 2-tailed t-tests
  • In the spider example, differences could occur in both directions (those shown a picture could be more afraid): 2-tailed
  • If you test whether carrying backpacks makes people shorter (paired, before and after): 1-tailed (carrying backpacks can’t make you taller!)

In R, choose alternative = 'two.sided', 'greater', or 'less' (by default the argument is set to 'two.sided'):

t.test(d1$transpiration ~ d1$co2, paired = T, alternative = 'greater')
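
One thing to watch out for (a hedged note on direction): 'greater' and 'less' refer to the first group relative to the second, and with the formula interface the groups are taken in the order of the factor levels (alphabetical for a character variable such as co2):

levels(factor(d1$co2)) #'after' comes first, so it plays the role of the first group
t.test(d1$transpiration ~ d1$co2, paired = T, alternative = 'less') #alternative: after < before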

Assumptions of a t-test

  • Both the independent t-test and the paired t-test are parametric tests based on the normal distribution. Therefore, they assume:
    • The sampling distribution is normally distributed. In the paired t-test this means that the sampling distribution of the differences between scores should be normal, not the scores themselves (assumption of normality).
    • Variances in the two samples are roughly equal (assumption of homogeneity of variance). This assumption can, however, be relaxed if variance heterogeneity is accounted for, and by default R does exactly that (hence the ‘Welch Two Sample t-test’ in the output above)!
    • Scores are independent of each other, except within the pairs of a paired t-test (assumption of independence). Some quick ways to check these assumptions are sketched after this list.
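
Some quick, informal checks for the spider data (a sketch, not on the slides; note that shapiro.test() looks at the scores rather than the sampling distribution, and has little power with only 6 observations per group):

shapiro.test(d_long$anxiety[d_long$treat == 0]) #normality check, real-spider group
shapiro.test(d_long$anxiety[d_long$treat == 1]) #normality check, picture group
var.test(d_long$anxiety[d_long$treat == 0],
         d_long$anxiety[d_long$treat == 1]) #F test for equality of variances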

Example

  • Research question: Does the body height of people sitting in the rear half of the lecture theatre differ from that of people in the front half?
  • Think about how we should sample. How many samples do we need?
  • Set up a data frame with fictitious data, once in the wide and once in the long format (one possible long-format sketch follows this list)
  • Formulate a null hypothesis
  • Think of what kind of t-test we should use
  • Think of the assumptions. Are they verified?
  • Think about type I and type II errors!
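
One possible sketch in the long format, with entirely fictitious heights (only look at this after trying it yourself):

set.seed(1) #for reproducibility of the fictitious data
heights = data.frame(position = rep(c('front', 'rear'), each = 10),
                     height = round(c(rnorm(10, 170, 8), rnorm(10, 172, 8)))) #heights in cm
t.test(height ~ position, data = heights) #independent (Welch) t-test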

Summary: the t-test soup

Ingredients:

  • One continuous response variable
  • One binomial predictor variable
  • The mean and standard deviations for both groups

Method:

  • Mix in the two groups with their means and standard deviations
  • The outcome will be a t-statistic (your test statistic)
  • Compare against a random t-distribution with the same degrees of freedom
  • Decide whether your t-value is rare, medium-rare, or not rare at all by looking at the p-value…

What will we have learnt in Week 6?

  • How to compute a t-test
  • What an independent and a paired t-test is, how they are used
    • how to formulate a null hypothesis for a t-test
    • how to run a t-test in R
    • how to interpret the output of a t-test
  • How the argument ‘alternative’ is used (one- and two-tailed tests)
  • What the t-distribution is, how it relates to the normal distribution
  • What it means to commit a type I or type II error in a t-test
  • What assumptions need to be met in a t-test

Glossary Week 6

  • t-statistic
  • t-distribution
  • t-test
  • independent t-test
  • paired t-test
  • one-tailed t-test
  • two-tailed t-test
  • p-value