Biological Sampling and Interpretation

September 19, 2022

Timetable second half (tentative)

Week 8

Date	Week	Lab	Assignment	Topic
29.8.	7	-	-	Week of midsemester test
19.9.	8	6	-	Statistical power, power test, plotting/tabulating data correctly
26.9.	9	7	-	The Chi-squared test
3.10.	10	8	-	Correlation analysis, reporting statistical results
10.10.	11	9	-	Linear Regression
17.10.	12	10	2	Exam week, Assignment 2 due on Friday

What will we do today

Week 8

Midsem test
Quick recap on t-test
Data transformations
non-parametric t-test alternatives (Wilcoxon test)
What is a power test?
The power t-test
Plotting data in R (bar plots)

Histogram of midsemester test results

Week 8

Recap on t-test

Week 8

What is the t-statistic (basics of the formula)?
What is a t-distribution?
Independent t-test vs. paired t-test
one vs. two-tailed t-test
How does the t-value link to the p-value?

Rationale for the t-test

Week 8

We need a metric (a test statistic!) that puts the difference between the samples into perspective with the standard deviations of the two samples !

This is called the t-statistic in a two sample test:

\[t = \frac{\text{observed difference - expected difference}}{\text{estimate of the standard deviations}}\]

So the larger the numerator or the smaller the denominator, the larger t will turn out. Think what this means in terms of the p-value!

Example question

Week 8

The formula for calculating a t-value makes sure a t-value becomes larger (and is thus more likely to produce a significant p-value) as…

…the observed difference between samples becomes larger
…the estimated standard deviations of the samples become larger
…the observed difference between samples becomes smaller
all of the above

The assumptions of a t-test

A t-test will give trustworthy results if the underlying assumptions are met:

Week 8

The two samples that are compared should follow a normal distribution
The two samples should be taken independently (e.g. not clustered)
The variances of the two samples can be unequal (set ‘var.equal’ argument to ‘FALSE’, which is the default)

But how can we test for normality of the data?

Using a test (e.g. the Shapiro-Wilk test: shapiro.test())
Looking at the so-called ‘qqplot’ using the function qqnorm(): do the points sit on a straight line?
Looking at the histogram of the variable: does it look bell-shaped?

Example: Are the marks of the mid-semester test normally distributed?

Using the Shapiro-Wilk test

Week 8

The null hypothesis of the Shapiro-Wilk test is: ‘The data are not significantly different from a normal distribution’

shapiro.test(test$mark) #your midsemester test marks

    Shapiro-Wilk normality test

data:  test$mark
W = 0.96243, p-value = 0.02131

What do we conclude?

Looking at the qq-plot

Week 8

The qq-plot plots the theoretical quantiles (of a perfect normal distribution) against the quantiles of the distribution in question, simple example:

x = c(1, 7, 8, 9, 9, 9, 10, 10, 10)
theoretical_quantiles = qnorm(p = c(.06, .17, .28, .39, .5, .61, .72, .83, .94))
plot(theoretical_quantiles, x)

If the distribution of \(x\) is approximately normal, the dots sit on a straight line! Use qqnorm() and qqline() to create this plot.

Looking at the qq-plot

Week 8

qq-plot for the midsemester test results:

qqnorm(test$mark, main = NA)
qqline(test$mark) #adds line where quantiles should match

- What do we conclude? The interpretation of a qq-plot needs experience, dots systematically deviating from the line indicate that your sample is not normally distributed. Question: can percentages be normally distributed?

What if the assumption of normally distributed data is violated in a t-test?

Week 8

The assumption of normality is often difficult to assess in small sample sizes. In a t-test with n = 3 for example, this is virtually impossible
Once you are sure that you cannot/should not assume normality in your data, consider alternatives
One possibility is to transform the data
A Non-parametric alternative (in the case of the t-test) is the Wilcoxon test
For more complicated models (linear regression, ANOVA) there are a multitude of ways to deal with non-normality (not discussed here)

Non-parametric means you are not assuming anything about the parameters of population (‘distribution-free’ tests)

Data transformation

The log transformation

Week 8

Consider the size of fish (x). A log transformation makes the distribution look much more normal:

hist(x)
hist(log(x))

Data transformation

The square root transformation

Week 8

Consider a variable x that shows a ratio (e.g. mark in %). A square root transformation makes the distribution look much more normal:

hist(x)
hist(sqrt(x))

The Wilcoxon test

A two-sample test of non-normally distributed samples

Week 8

Robust against an increased probability for type I and type II errors due to the violation of the assumption of normally distributed samples
Based on ranks, the absolute numbers are irrelevant
Therefore we do not required the data to be normally distributed
E.g., testing the two samples a = c(2, 3, 5) against b = c(3, 6, 8) will yield the very same results than testing x = c(0, 3, 4) against y = c(3, 23, 700). This is why:

a = c(2, 3, 5); b = c(3, 6, 8)
x = c(0, 3, 4); y = c(3, 23, 700)
rank(c(a, b))
[1] 1.0 2.5 4.0 2.5 5.0 6.0
rank(c(x, y))
[1] 1.0 2.5 4.0 2.5 5.0 6.0

You should be able to understand and interpret the above code!

The Wilcoxon test (example)

Week 8

Did the food science students do better than the environmental sciences students?

Check for normality of the samples visually

qqnorm(test$mark[test$Pathway == 'Food Science'])
qqline(test$mark[test$Pathway == 'Food Science'])
qqnorm(test$mark[test$Pathway == 'Env Sci'])
qqline(test$mark[test$Pathway == 'Env Sci'])

The Wilcoxon test (example)

Week 8

Check for normality of the samples using the Shapiro-Wilk test

shapiro.test(test$mark[test$Pathway == 'Food Science'])

    Shapiro-Wilk normality test

data:  test$mark[test$Pathway == "Food Science"]
W = 0.94241, p-value = 0.4889
shapiro.test(test$mark[test$Pathway == 'Env Sci'])

    Shapiro-Wilk normality test

data:  test$mark[test$Pathway == "Env Sci"]
W = 0.91797, p-value = 0.119

What would you conclude?

If you are unsure, using both a parametric and a non-parametric test is a good idea:

The Wilcoxon test (example)

Week 8

t.test(test$mark[test$Pathway == 'Food Science'],
            test$mark[test$Pathway == 'Env Sci'])

    Welch Two Sample t-test

data:  test$mark[test$Pathway == "Food Science"] and test$mark[test$Pathway == "Env Sci"]
t = 1.3396, df = 28.199, p-value = 0.1911
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.919604 18.748664
sample estimates:
mean of x mean of y 
 77.69231  70.27778

wilcox.test(test$mark[test$Pathway == 'Food Science'],
            test$mark[test$Pathway == 'Env Sci'])

    Wilcoxon rank sum test with continuity correction

data:  test$mark[test$Pathway == "Food Science"] and test$mark[test$Pathway == "Env Sci"]
W = 136.5, p-value = 0.4437
alternative hypothesis: true location shift is not equal to 0

The Wilcoxon test (example)

Week 8

Note the use of [] and ==. What does it do?

test$mark[test$Pathway == 'Food Science'] #same as
test[test$Pathway == 'Env Sci', 'Mark']

What do we conclude from the test?
Both tests are non-significant, so we can be reasonably confident that there is no difference between the two groups
The Wilcoxon test sacrifices some information in the data (that’s the price we have to pay) but it is more robust

Statistical power in a two-sample test

Week 8

Statistical power is the probability to detect a significant difference between two samples, given there is a true difference between them

This probability should normally be >80%
If it is less, we have little chance (power) to detect a difference, should there be one
By conducting a pilot study, we can estimate the variation and the expected difference between samples (in the case of a t-test)
This allows us to conduct a power test and plan the required sample size

Remember the example with the two boxes?

What was your power in this example?

Week 8

Statistical power in a t-test

Week 8

There is a trade-off between sample size (or standard deviations), expected difference, type-I error probability (\(\alpha\)) and type-II error probability (\(\beta\)) and the power (\(1-\beta\))!

Power t-test: example

Week 8

You are trying to find out whether it is possible to detect a 20% change in transpiration if you apply a certain treatment to forest trees, and your funding restricts you to a sample size of 8 trees:

In a pilot study you find a mean of 4 \(Ld^{-1}\) (liters per day) with a standard deviation of 2 \(Ld^{-1}\). The expected difference is 1 \(Ld^{-1}\)

We simulate a t-test based on these assumptions, what do we conclude?

set.seed(0)
t.test(rnorm(8, mean = 4, sd = 2), rnorm(8, mean = 5, sd = 2))

    Welch Two Sample t-test

data:  rnorm(8, mean = 4, sd = 2) and rnorm(8, mean = 5, sd = 2)
t = -0.6851, df = 13.997, p-value = 0.5045
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.124121  1.611485
sample estimates:
mean of x mean of y 
 4.297588  5.053906

Power t-test: example continuted

Week 8

Given the above, a t-test is unlikely to detect a potential difference, so how big would our sample size need to be?

power.t.test(delta = 1, sd = 2, power = 0.8)

     Two-sample t test power calculation 

              n = 63.76576
          delta = 1
             sd = 2
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

NOTE: n is number in *each* group

This tells you that you need a sample size of c. 65 to achieve a power of 80% (i.e. 80% change to detect the difference if there is one)

Conclusion: Don’t even start the experiment if your sample size is restricted to 8, you only have about a 15% chance to detect a potential difference (see next slide)!

Power t-test: example continuted

Week 8

power.t.test(n = 8, delta = 1, sd = 2, sig.level = 0.05)

     Two-sample t test power calculation 

              n = 8
          delta = 1
             sd = 2
      sig.level = 0.05
          power = 0.1521558
    alternative = two.sided

NOTE: n is number in *each* group

Again, sample size, expected difference, standard deviations, \(\alpha\) (type I error probability), and \(\beta\) are traded off, you can’t have the cake and eat it!

Power t-test: visualisation

Week 8

http://www.statstudio.net/free-tools/power-analysis/

Example question

Week 8

What can you do to improve your power in a t-test?

You can try to reduce unsystematic variation in order to obtain lower standard deviations
You can decrease your \(\alpha\)-level
You can reduce the sample size
All of the above is correct

Bar plots

Week 8

Plotting the results of a t-test

Week 8

Create a simple data frame, one categorical, one continuous variable:

d1 = data.frame(sex = rep(c('M', 'F'), each = 10),
                 height = c(rnorm(n = 10, mean = 177, sd = 9),
                            rnorm(10, 160, 9)))
head(d1, 14)
   sex   height
1    M 179.2700
2    M 168.9727
3    M 180.9211
4    M 165.8622
5    M 174.9816
6    M 180.3966
7    M 178.2000
8    M 184.2377
9    M 176.4860
10   M 181.5325
11   F 169.7719
12   F 153.7814
13   F 148.4386
14   F 160.4205

Plotting the results of a t-test

Week 8

Conduct the t-test (as seen earlier):

t.test(d1$height ~ d1$sex)

For a quick look at the data, use the exploratory plotting function boxplot:

boxplot(d1$height ~ d1$sex)

Plotting the results of a t-test (customised plot)

Week 8

First calculate the mean and standard errors per group:

m = tapply(X = d1$height, INDEX = d1$sex, FUN = mean) #mean per group
s = tapply(X = d1$height, INDEX = d1$sex, FUN = sd) #calculating the sd per group

and plot it:

hh = barplot(m, ylim = c(0, 200))
segments(hh, m, hh, m+s)

What will we have learnt by the end of this week?

Week 8

How to do a simple data transformation
How to perform a Wilcoxon test (and when it makes sense to do so)
How to test for normality visually or with a test
What statistical power is
How to perform a power-t-test to determine the required sample size
How to plot the results from a t-test quickly
How to plot a customised a bar plot

Glossary

Week 8

log transformation
square root transformation
Wilcoxon rank-sum test
non-parametric tests
qq-plot (qqnorm(), qqline())
power, sample size, significance level (\(\alpha\)), type-II error probability (1-\(\beta\))
boxplot()
barplot()
segments()