This report works through basic statistical inference tests (two-sample t-tests, an F test of variances, correlation tests, and chi-squared tests) using R, RStudio and knitr.
iris<-read.csv("iris.csv")
deer<-read.csv("Deer.csv")
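# Simulated heights for three groups; no seed is set, so the draws change on every run
aragorn <- rnorm(50, mean = 180, sd = 10)
gimli <- rnorm(50, mean = 132, sd = 15)
legolas <- rnorm(50, mean = 195, sd = 15)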
t.test(legolas, aragorn, alternative="two.sided")
##
## Welch Two Sample t-test
##
## data: legolas and aragorn
## t = 5.4299, df = 84.227, p-value = 5.348e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 8.084051 17.426540
## sample estimates:
## mean of x mean of y
## 191.8878 179.1325
t.test(legolas, gimli, alternative="two.sided")
##
## Welch Two Sample t-test
##
## data: legolas and gimli
## t = 19.88, df = 94.476, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 55.45838 67.76428
## sample estimates:
## mean of x mean of y
## 191.8878 130.2765
We find significant evidence to reject the null hypothesis that there is no difference in mean height between the actors who played Legolas and the actors who played Aragorn (t = 5.43, df = 84.2, p = 5.3e-07; 95% confidence interval for the difference 8.08 to 17.43). Likewise, we find significant evidence to reject the null hypothesis that there is no difference in mean height between the actors who played Legolas and the actors who played Gimli (t = 19.88, df = 94.5, p < 2.2e-16; 95% confidence interval 55.46 to 67.76).
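The decision described above can also be read off the test object rather than the printed output. The following is a minimal sketch, assuming simulated heights like those in this report; the seed, the significance level `alpha`, and the names `res` and `reject_null` are illustrative assumptions, so the numbers will not match the output shown earlier.

```r
# Illustrative only: the seed is an assumption, so values differ from the report's output.
set.seed(42)
legolas <- rnorm(50, mean = 195, sd = 15)
aragorn <- rnorm(50, mean = 180, sd = 10)

# Welch two-sample t-test (t.test defaults to var.equal = FALSE)
res <- t.test(legolas, aragorn, alternative = "two.sided")

alpha <- 0.05                       # assumed significance level
reject_null <- res$p.value < alpha  # TRUE means we reject "no difference in means"

res$estimate   # sample means of the two groups
res$conf.int   # 95% confidence interval for the difference in means
reject_null
```

Working from `res$p.value` and `res$conf.int` keeps the written conclusion tied to the computed values, which helps when the data are re-simulated on each knit.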
var.test(gimli, legolas)
##
## F test to compare two variances
##
## data: gimli and legolas
## F = 1.4787, num df = 49, denom df = 49, p-value = 0.1745
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.8391182 2.6057208
## sample estimates:
## ratio of variances
## 1.478684
We do not find significant evidence and therefore fail to reject the null hypothesis that there is no difference in variance between the Gimli and Legolas groups (F = 1.48, p = 0.17; the 95% confidence interval for the variance ratio, 0.84 to 2.61, contains 1).
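One possible follow-up to the F test is a pooled (equal-variance) t-test. The sketch below assumes re-simulated data with an arbitrary seed, and the names `f_res` and `pooled` are illustrative. Failing to reject equal variances does not prove the variances are equal, so the Welch test used earlier in this report remains a reasonable default.

```r
# Illustrative only: seed and object names are assumptions, not part of the original analysis.
set.seed(42)
gimli   <- rnorm(50, mean = 132, sd = 15)
legolas <- rnorm(50, mean = 195, sd = 15)

# F test comparing the two sample variances
f_res <- var.test(gimli, legolas)
f_res$estimate   # ratio of sample variances, var(gimli) / var(legolas)
f_res$conf.int   # 95% confidence interval for the true variance ratio

# Pooled two-sample t-test, which assumes equal variances
pooled <- t.test(legolas, gimli, var.equal = TRUE)
pooled$parameter # degrees of freedom: n1 + n2 - 2
```

With two groups of 50, the pooled test has 98 degrees of freedom, whereas the Welch test estimates a (usually smaller, non-integer) value from the data.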
# Split the data by species code
setosa <- subset(iris, Code == "1")
versicolor <- subset(iris, Code == "2")
virginica <- subset(iris, Code == "3")
cor.test(setosa$Sepal.Length, setosa$Sepal.Width)
##
## Pearson's product-moment correlation
##
## data: setosa$Sepal.Length and setosa$Sepal.Width
## t = 7.6807, df = 48, p-value = 6.71e-10
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.5851391 0.8460314
## sample estimates:
## cor
## 0.7425467
cor.test(versicolor$Sepal.Length, versicolor$Sepal.Width)
##
## Pearson's product-moment correlation
##
## data: versicolor$Sepal.Length and versicolor$Sepal.Width
## t = 4.2839, df = 48, p-value = 8.772e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2900175 0.7015599
## sample estimates:
## cor
## 0.5259107
cor.test(virginica$Sepal.Length, virginica$Sepal.Width)
##
## Pearson's product-moment correlation
##
## data: virginica$Sepal.Length and virginica$Sepal.Width
## t = 3.5619, df = 48, p-value = 0.0008435
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2049657 0.6525292
## sample estimates:
## cor
## 0.4572278
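We find significant evidence to reject the null hypothesis that there is no correlation between sepal length and sepal width in each of the three iris species: Setosa (r = 0.74, p = 6.7e-10), Versicolor (r = 0.53, p = 8.8e-05), and Virginica (r = 0.46, p = 0.0008). The correlation is positive in all three species and strongest in Setosa.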
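The three correlation tests can be run in one pass instead of three separate calls. This is a sketch that uses R's built-in `datasets::iris`, which has a `Species` column; the report's `iris.csv` uses a `Code` column instead, so treating the two as equivalent is an assumption. The names `by_species` and `cor_summary` are illustrative.

```r
# Uses the built-in iris data (Species column), not the report's iris.csv (Code column).
by_species <- split(datasets::iris, datasets::iris$Species)

# Run Pearson's correlation test per species and keep the estimate and p-value
cor_summary <- t(sapply(by_species, function(d) {
  res <- cor.test(d$Sepal.Length, d$Sepal.Width)
  c(r = unname(res$estimate), p_value = res$p.value)
}))

cor_summary  # one row per species, columns r and p_value
```

Collecting the results in a single matrix makes it easier to compare the strength of the correlation across species side by side.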
chisq.test(table(deer$Month))
##
## Chi-squared test for given probabilities
##
## data: table(deer$Month)
## X-squared = 997.07, df = 11, p-value < 2.2e-16
chisq.test(table(deer$Tb,deer$Farm))
## Warning in chisq.test(table(deer$Tb, deer$Farm)): Chi-squared approximation may
## be incorrect
##
## Pearson's Chi-squared test
##
## data: table(deer$Tb, deer$Farm)
## X-squared = 129.09, df = 26, p-value = 1.243e-15
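We find statistically significant evidence to reject the null hypothesis that deer are caught uniformly across months (X-squared = 997.07, df = 11, p < 2.2e-16). Similarly, we find significant evidence to reject the null hypothesis that tuberculosis status is independent of farm (X-squared = 129.09, df = 26, p = 1.2e-15), so the proportion of tuberculosis cases differs between farms. The warning indicates that some expected cell counts are small, so the p-value of the second test should be treated as approximate.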
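The "Chi-squared approximation may be incorrect" warning in the output above arises when some expected cell counts are small. A minimal sketch of how to inspect this and obtain a Monte Carlo p-value follows. The counts in `tb_by_farm` are hypothetical and are not the deer data; the seed and the number of replicates `B` are also assumptions.

```r
# Hypothetical counts for illustration only; NOT taken from Deer.csv.
tb_by_farm <- matrix(c(12,  3, 1,
                       30, 25, 2),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(Tb   = c("positive", "negative"),
                                     Farm = c("A", "B", "C")))

# Standard test; the warning is suppressed here because we inspect its cause directly
res <- suppressWarnings(chisq.test(tb_by_farm))
res$expected            # expected counts under independence
any(res$expected < 5)   # small expected counts are what trigger the warning

# Monte Carlo p-value, which does not rely on the chi-squared approximation
set.seed(1)             # assumed seed, for repeatability of the simulation
sim <- chisq.test(tb_by_farm, simulate.p.value = TRUE, B = 10000)
sim$p.value
```

When the simulated and asymptotic p-values lead to the same decision, the conclusion is less likely to depend on the approximation.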