M. Drew LaMar
February 26, 2016
Definition: A Type I error is rejecting a true null hypothesis. The probability of a Type I error is given by Pr[Reject H0 | H0 is true] = α
Definition: A Type II error is failing to reject a false null hypothesis. The probability of a Type II error is given by Pr[Do not reject H0 | H0 is false] = β
Definition: The power of a statistical test (denoted 1−β) is given by Pr[Reject H0 | H0 is false] = 1 − β = 1 − Pr[Type II error]
Statistical power example
https://qubeshub.org/tools/statpowerviz/
Power of a statistical test is a function of
- Significance level α
- Variability of data
- Sample size
- Effect size
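As a rough illustration of how significance level, sample size, and effect size enter, the sketch below computes the exact power of a two-sided binomial test. The helper `binom_power` and the values passed to it (sample sizes 18 and 50, true proportions 0.7 and 0.9) are hypothetical choices for illustration only. Variability is not a separate argument here because, for binomial data, it is determined by the sample size and the true proportion.

```r
# Exact power of a two-sided binomial test of H0: p = p0.
# binom_power is a hypothetical helper written for this illustration.
binom_power <- function(n, p_true, p0 = 0.5, alpha = 0.05) {
  x <- 0:n
  # P-value of the test for every possible number of successes
  pvals <- sapply(x, function(k) binom.test(k, n, p = p0)$p.value)
  # Outcomes that lead to rejecting H0 at level alpha
  reject <- pvals <= alpha
  # Probability of landing in the rejection region when the true proportion is p_true
  sum(dbinom(x[reject], n, p_true))
}

binom_power(18, 0.7)                # baseline
binom_power(50, 0.7)                # larger sample size
binom_power(18, 0.9)                # larger effect size
binom_power(18, 0.7, alpha = 0.10)  # larger significance level
```

Setting `p_true` equal to `p0` returns the actual Type I error rate of the test, which for a discrete test is at most `alpha`.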
From proportions and binomial distributions…
…to working with direct frequency distributions.
Frequency | Right | Left |
---|---|---|
Observed | 14 | 4 |
Expected | 9 | 9 |
Note: The binomial test is an example of a goodness-of-fit test.
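The observed and expected counts for the Right/Left example can be reproduced with an exact binomial test. This is a minimal sketch that assumes the null hypothesis is a proportion of 0.5 for each category, which is what the expected counts of 9 and 9 imply; the variable names are chosen here for illustration.

```r
# Observed counts for the two categories
observed <- c(Right = 14, Left = 4)
n <- sum(observed)

# Expected counts under H0: p = 0.5 for each category
expected <- n * c(Right = 0.5, Left = 0.5)
expected

# Exact two-sided binomial test of H0: p = 0.5
result <- binom.test(observed[["Right"]], n, p = 0.5)
result$p.value
```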
Definition: A goodness-of-fit test is a method for comparing an observed frequency distribution with the frequency distribution that would be expected under a simple probability model governing the occurrence of different outcomes.
Definition: A model in this case is a simplified, mathematical representation that mimics how we think a natural process works.
Assignment Problem #21
A more recent study of Feline High-Rise Syndrome (FHRS) included data on the month in which each of 119 cats fell (Vnuk et al. 2004). The data are in the accompanying table. Can we infer that the rate of cats falling varies between months of the year?
Month | Number fallen | Month | Number fallen |
---|---|---|---|
January | 4 | July | 19 |
February | 6 | August | 13 |
March | 8 | September | 12 |
April | 10 | October | 12 |
May | 9 | November | 7 |
June | 14 | December | 5 |
Null and alternative hypotheses
Question: What are the null and alternative hypotheses?
Answer:
H0: The frequency of cats falling is the same in each month.
HA: The frequency of cats falling is not the same in each month.
Observed and Expected Frequencies
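# Month labels and observed number of falls per month
rows <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
Obs <- c(4, 6, 8, 10, 9, 14, 19, 13, 12, 12, 7, 5)
# Expected counts under H0: the total number of falls spread evenly over 12 months
Exp <- rep(sum(Obs)/12, 12)
# Combine into a 12 x 2 table and append a row of column totals
FHRSTable <- matrix(c(Obs, Exp), ncol = 2, dimnames = list(rows, c("Obs","Exp")))
addmargins(FHRSTable, margin = 1)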
Obs Exp
Jan 4 9.916667
Feb 6 9.916667
Mar 8 9.916667
Apr 10 9.916667
May 9 9.916667
Jun 14 9.916667
Jul 19 9.916667
Aug 13 9.916667
Sep 12 9.916667
Oct 12 9.916667
Nov 7 9.916667
Dec 5 9.916667
Sum 119 119.000000
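# Two groups of 12 bars: all observed counts, then all expected counts
barplot(FHRSTable, beside=TRUE)
# Transposed table: 12 groups (months), each with an observed and an expected bar
barplot(t(FHRSTable), beside=TRUE)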
χ² test statistic
Definition: The χ² statistic measures the discrepancy between observed frequencies from the data and expected frequencies from the null hypothesis and is given by
$$\chi^2 = \sum_i \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i}$$
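Discuss: What would support the null hypothesis more: a small value or a large value for χ²?
Answer: A small value for χ², because it means the observed frequencies are close to the frequencies expected under the null hypothesis.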
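The statistic can be computed for the FHRS data directly from its definition. The sketch below redefines `Obs` and `Exp` so that it runs on its own; the names `chi2` and `chi2_builtin` are introduced here for illustration. As a cross-check, it compares the hand calculation with the statistic reported by R's built-in `chisq.test`, which assumes equal expected proportions by default when given a vector of counts.

```r
# Observed falls per month (January to December) and equal expected counts under H0
Obs <- c(4, 6, 8, 10, 9, 14, 19, 13, 12, 12, 7, 5)
Exp <- rep(sum(Obs) / 12, 12)

# Chi-squared statistic: sum over months of (Observed - Expected)^2 / Expected
chi2 <- sum((Obs - Exp)^2 / Exp)
chi2

# Cross-check against the statistic from the built-in goodness-of-fit test
chi2_builtin <- unname(chisq.test(Obs)$statistic)
all.equal(chi2, chi2_builtin)
```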