For each problem, state the hypotheses and your conclusion, and attach the code. The significance level is \(0.05\).
This assignment uses Fisher's exact test and the chi-square test. We denote the odds ratio by \(\theta\). There are six problems; see the document in the repository for more details.
Is it statistically significant that the proportion of chipmunks trilling is higher when they are closer to their burrow?
Here \(\theta\) is the odds of trilling at 10 m relative to the odds of trilling at 100 m. Our hypotheses are \(H_0: \theta = 1\) and \(H_1: \theta > 1\). We input the data and use fisher.test with the argument alternative = "greater".
trills <- array(c(16, 8, 3, 18),
                dim = c(2, 2),
                dimnames = list(
                  trilled = c("did", "did not"),
                  distance = c("10 M", "100 M")))
fisher.test(trills, alternative = "greater")
##
## Fisher's Exact Test for Count Data
##
## data: trills
## p-value = 0.0004321
## alternative hypothesis: true odds ratio is greater than 1
## 95 percent confidence interval:
## 2.829883 Inf
## sample estimates:
## odds ratio
## 11.23249
Since the \(p\)-value (0.0004321) is below 0.05, we reject \(H_0\): chipmunks are significantly more likely to trill when they are closer to their burrow.
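As a cross-check, the one-sided \(p\)-value can be reproduced from the hypergeometric distribution that Fisher's exact test conditions on. The following is a minimal sketch rather than part of the original solution; the variable names are illustrative, and it assumes the same table layout as above.

```r
# Same 2 x 2 table as in the problem (rows: trilled, columns: distance).
trills <- array(c(16, 8, 3, 18), dim = c(2, 2))

# With all margins fixed, the top-left cell is hypergeometric under H0.
m <- sum(trills[, 1])  # total observed at 10 M
n <- sum(trills[, 2])  # total observed at 100 M
k <- sum(trills[1, ])  # total that trilled
x <- trills[1, 1]      # trilled at 10 M

# One-sided p-value: P(X >= x) under H0.
p_manual <- phyper(x - 1, m, n, k, lower.tail = FALSE)
p_fisher <- fisher.test(trills, alternative = "greater")$p.value

print(c(manual = p_manual, fisher.test = p_fisher))
```

The two values should agree, which shows that alternative = "greater" sums the upper tail of the top-left cell count.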
Is there no evidence that the two species of birds use the substrates in different proportions?
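Because this is a \(4 \times 2\) table, a single odds ratio \(\theta\) is not defined, so we state the hypotheses in terms of independence: \(H_0:\) substrate use is independent of species and \(H_1:\) it is not. We input the data and use fisher.test; for tables larger than \(2 \times 2\) the test is always two-sided and the alternative argument is ignored.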
substrate <- array(c(15, 20, 14, 6, 8, 5, 7, 1),
                   dim = c(4, 2),
                   dimnames = list(
                     type = c("Vegetation", "Shoreline", "Water", "Structures"),
                     bird = c("Heron", "Egret")))
fisher.test(substrate, alternative = "two.sided")
##
## Fisher's Exact Test for Count Data
##
## data: substrate
## p-value = 0.5491
## alternative hypothesis: two.sided
Since the \(p\)-value (0.5491) exceeds 0.05, we fail to reject \(H_0\): there is no significant evidence that the two species use the substrates in different proportions.
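One possible reason to prefer an exact test here is the size of the expected counts. The sketch below computes them under independence and applies the common rule of thumb that every expected count should be at least 5 for the chi-square approximation. The threshold is a convention we assume for illustration; the assignment does not state it.

```r
# Same 4 x 2 table as in the problem.
substrate <- array(c(15, 20, 14, 6, 8, 5, 7, 1),
                   dim = c(4, 2),
                   dimnames = list(
                     type = c("Vegetation", "Shoreline", "Water", "Structures"),
                     bird = c("Heron", "Egret")))

# Expected count under independence: row total * column total / grand total.
expected <- outer(rowSums(substrate), colSums(substrate)) / sum(substrate)
print(round(expected, 2))

# Number of cells that fall below the assumed threshold of 5.
n_small <- sum(expected < 5)
print(n_small)
```

Any cell flagged here is a sign that the chi-square approximation may be unreliable, whereas Fisher's exact test makes no large-sample assumption.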
Is there a significant difference in synonymous/replacement ratio between polymorphisms and fixed differences?
Our hypotheses are \(H_0: \theta = 1\) and \(H_1: \theta \neq 1\). We input the data and use fisher.test with the argument alternative = 'two.sided'.
gene <- array(c(43, 17, 2, 7),
              dim = c(2, 2),
              dimnames = list(
                fixity = c("polymorphic", "fixed"),
                synonymicity = c("synonymous", "replacement")))
fisher.test(gene, alternative = "two.sided")
##
## Fisher's Exact Test for Count Data
##
## data: gene
## p-value = 0.006653
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
## 1.437432 92.388001
## sample estimates:
## odds ratio
## 8.540913
Since the \(p\)-value (0.006653) is below 0.05, we reject \(H_0\): the synonymous/replacement ratio differs significantly between polymorphisms and fixed differences.
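The odds ratio printed by fisher.test is the conditional maximum likelihood estimate, not the familiar cross-product ratio, so the two numbers generally differ slightly. A short sketch, with illustrative variable names, that computes both for this table:

```r
# Same 2 x 2 table as in the problem.
gene <- array(c(43, 17, 2, 7), dim = c(2, 2))

# Sample (cross-product) odds ratio: (a * d) / (b * c).
or_sample <- (gene[1, 1] * gene[2, 2]) / (gene[1, 2] * gene[2, 1])

# Conditional MLE reported by fisher.test.
or_cmle <- unname(fisher.test(gene)$estimate)

print(c(sample = or_sample, conditional_mle = or_cmle))
```

Both estimates are well above 1 here, in line with the rejection of \(H_0\).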
We want to check whether being a man or a woman (columns) is independent of having voted in the last election (rows). In other words, are sex and voting independent?
Our hypotheses are \(H_0:\) the variables are independent and \(H_1:\) they are not independent. We input the data and use chisq.test.
vote <- array(c(2792, 1486, 3591, 2131),
              dim = c(2, 2),
              dimnames = list(
                position = c("voted", "didn't Vote"),
                sex = c("Men", "Women")))
chisq.test(vote)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: vote
## X-squared = 6.5523, df = 1, p-value = 0.01047
Since the \(p\)-value (0.01047) is below 0.05, we reject \(H_0\): sex and voting are not independent.
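For a \(2 \times 2\) table, chisq.test applies Yates' continuity correction by default, which is why the output header mentions it. The sketch below recomputes the statistic by hand, with and without the correction, to make the formula explicit; the variable names are illustrative.

```r
# Same 2 x 2 table as in the problem.
vote <- array(c(2792, 1486, 3591, 2131), dim = c(2, 2))

# Expected counts under independence.
expected <- outer(rowSums(vote), colSums(vote)) / sum(vote)

# Pearson statistic with Yates' correction (subtract 0.5 from each |O - E|).
x2_yates <- sum((abs(vote - expected) - 0.5)^2 / expected)

# Pearson statistic without the correction.
x2_plain <- sum((vote - expected)^2 / expected)

# p-value for the corrected statistic, df = (2 - 1) * (2 - 1) = 1.
p_yates <- pchisq(x2_yates, df = 1, lower.tail = FALSE)

print(c(x2_yates = x2_yates, x2_plain = x2_plain, p_yates = p_yates))
```

The uncorrected version can be requested directly with chisq.test(vote, correct = FALSE).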
What would we conclude (about this die)?
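There is only one categorical variable here, so this is a goodness-of-fit test rather than a test of independence. Our hypotheses are \(H_0:\) every face has probability \(1/6\) (the die is fair) and \(H_1:\) at least one face has a different probability. We input the data and use chisq.test, which tests against equal probabilities by default.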
die <- array(c(42, 55, 38, 57, 64, 44),
             dim = c(6, 1),
             dimnames = list(
               face = c("1", "2", "3", "4", "5", "6"),
               count = c("observed")))
chisq.test(die)
##
## Chi-squared test for given probabilities
##
## data: die
## X-squared = 10.28, df = 5, p-value = 0.06768
Since the \(p\)-value (0.06768) exceeds 0.05, we fail to reject \(H_0\): the data do not provide significant evidence that the die is unfair.
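The statistic can be verified by hand. The die was rolled \(42 + 55 + 38 + 57 + 64 + 44 = 300\) times, so under a fair die each face has expected count \(E_i = 300/6 = 50\). Writing \(O_i\) for the observed counts (notation introduced here for illustration):

\[
X^2 = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i}
    = \frac{(-8)^2 + 5^2 + (-12)^2 + 7^2 + 14^2 + (-6)^2}{50}
    = \frac{514}{50} = 10.28
\]

with \(6 - 1 = 5\) degrees of freedom. This is below the 0.05 critical value of the \(\chi^2_5\) distribution (about 11.07), which agrees with the \(p\)-value above.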
Based on the data, what is your conclusion (pertaining to Calendar Effect for hockey players and date of birth)?
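As with the die, this is a goodness-of-fit test on a single categorical variable. Our hypotheses are \(H_0:\) players' birthdays are equally likely to fall in each quarter (probability \(1/4\) each) and \(H_1:\) at least one quarter has a different probability. We input the data and use chisq.test.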
hockey <- array(c(84, 77, 35, 34),
                dim = c(4, 1),
                dimnames = list(
                  quarter = c("Jan. to March", "April to June",
                              "July to Sept.", "Oct. to Dec."),
                  players = c("count")))
chisq.test(hockey)
##
## Chi-squared test for given probabilities
##
## data: hockey
## X-squared = 37.2348, df = 3, p-value = 4.104e-08
Since the \(p\)-value (4.104e-08) is far below 0.05, we reject \(H_0\): players' birthdays are not evenly distributed across the quarters, which is consistent with a calendar effect.
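To see which quarters drive the result, the null probabilities can be spelled out and the standardized residuals inspected. This is a sketch under the same simplifying assumption as the test above, namely that each quarter is equally likely; the object name fit is illustrative.

```r
# Observed counts of players by birth quarter (same data as above).
hockey <- c("Jan. to March" = 84, "April to June" = 77,
            "July to Sept." = 35, "Oct. to Dec." = 34)

# Goodness-of-fit test with the null probabilities written explicitly.
fit <- chisq.test(hockey, p = rep(1 / 4, 4))

# Expected count per quarter under H0.
print(fit$expected)

# Standardized residuals: positive means more players than expected.
print(round(fit$stdres, 2))
```

Positive residuals in the first half of the year and negative ones in the second half would indicate that early-year birthdays are over-represented.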