-For this lab, the goal was to use the data to run simple inference tests.
-Before beginning, we loaded the dplyr package and read in the data.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
Deer <- read.csv("Deer.csv")
iris <- read.csv("iris.csv")
-For the first problem, we had to make a vector of 50 random Legolas actors with a mean height of 195 cm and a standard deviation of 15 cm. From there we ran t-tests to compare this sample with our sets of Aragorn and Gimli actors.
aragorn <- rnorm(50, mean = 180, sd = 10)
gimli <- rnorm(50, mean = 132, sd = 15)
legolas <- rnorm(50, mean = 195, sd = 15)
t.test(aragorn, legolas, alternative = "two.sided")
##
## Welch Two Sample t-test
##
## data: aragorn and legolas
## t = -5.4084, df = 84.979, p-value = 5.752e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -18.269358 -8.447525
## sample estimates:
## mean of x mean of y
## 181.3406 194.6991
t.test(gimli, legolas, alternative = "two.sided")
##
## Welch Two Sample t-test
##
## data: gimli and legolas
## t = -22.596, df = 97.43, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -69.02309 -57.87722
## sample estimates:
## mean of x mean of y
## 131.2489 194.6991
As the output shows, mean height differs significantly between the Legolas and Aragorn actors (p = 5.752e-07) and between the Legolas and Gimli actors (p < 2.2e-16).
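Because rnorm() draws new values on every run, the exact statistics change each time the lab is knitted. The following is a minimal sketch of one way to make the comparison reproducible and to read the key numbers from the test object. The seed value is an assumption and was not part of the original lab, so these numbers will not match the output printed above.

```r
# Assumption: the seed is arbitrary and was not used in the original lab.
set.seed(42)

aragorn <- rnorm(50, mean = 180, sd = 10)
legolas <- rnorm(50, mean = 195, sd = 15)

# t.test() returns an "htest" list whose components can be read directly.
res <- t.test(aragorn, legolas, alternative = "two.sided")

res$statistic  # Welch t statistic
res$parameter  # Welch-adjusted degrees of freedom
res$p.value    # two-sided p-value
res$conf.int   # 95% confidence interval for mean(aragorn) - mean(legolas)
```

Reading values from the object, rather than copying them from the printed output, keeps the write-up in step with whatever sample was actually drawn.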
For the second problem, we re-ran the variance test (F test) to compare the Gimli and Legolas actors.
var.test(gimli, legolas)
##
## F test to compare two variances
##
## data: gimli and legolas
## F = 0.85784, num df = 49, denom df = 49, p-value = 0.5936
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.4868036 1.5116753
## sample estimates:
## ratio of variances
## 0.8578397
As the output shows, there is no statistically significant difference in variance between the two groups (F = 0.858, p = 0.5936). This is expected, since both samples were drawn with a standard deviation of 15 cm.
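The F statistic reported by var.test() is the ratio of the two sample variances, so it can be checked by hand. The sketch below assumes an arbitrary seed (not used in the original lab), so its values will differ from the output above.

```r
# Assumption: the seed is arbitrary and was not used in the original lab.
set.seed(42)

gimli   <- rnorm(50, mean = 132, sd = 15)
legolas <- rnorm(50, mean = 195, sd = 15)

# The F statistic is the ratio of the sample variances.
f_manual <- var(gimli) / var(legolas)
f_test   <- var.test(gimli, legolas)
all.equal(f_manual, unname(f_test$statistic))

# Two-sided p-value from the F distribution with n - 1 = 49 df in each group.
p_lower  <- pf(f_manual, df1 = 49, df2 = 49)
p_manual <- 2 * min(p_lower, 1 - p_lower)
all.equal(p_manual, f_test$p.value)
```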
For the third problem, we reran the correlation test between Sepal.Length and Sepal.Width in the iris dataset, this time separately for each of the three species.
iris_seto <- iris %>%
  filter(Species == "setosa")
cor.test(iris_seto$Sepal.Length, iris_seto$Sepal.Width)
##
## Pearson's product-moment correlation
##
## data: iris_seto$Sepal.Length and iris_seto$Sepal.Width
## t = 7.6807, df = 48, p-value = 6.71e-10
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.5851391 0.8460314
## sample estimates:
## cor
## 0.7425467
iris_versi <- iris %>%
  filter(Species == "versicolor")
cor.test(iris_versi$Sepal.Length, iris_versi$Sepal.Width)
##
## Pearson's product-moment correlation
##
## data: iris_versi$Sepal.Length and iris_versi$Sepal.Width
## t = 4.2839, df = 48, p-value = 8.772e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2900175 0.7015599
## sample estimates:
## cor
## 0.5259107
iris_virgin <- iris %>%
  filter(Species == "virginica")
cor.test(iris_virgin$Sepal.Length, iris_virgin$Sepal.Width)
##
## Pearson's product-moment correlation
##
## data: iris_virgin$Sepal.Length and iris_virgin$Sepal.Width
## t = 3.5619, df = 48, p-value = 0.0008435
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2049657 0.6525292
## sample estimates:
## cor
## 0.4572278
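All three correlations are positive and statistically significant, strongest for setosa (r = 0.74) and weakest for virginica (r = 0.46). The three filter-then-test steps can also be done in one pass. This is a rough sketch in base R, assuming that R's built-in iris data set has the same columns as the iris.csv file used in the lab.

```r
# Assumption: the built-in datasets::iris stands in for iris.csv.
by_species <- split(datasets::iris, datasets::iris$Species)

# Run cor.test() once per species and keep the estimate and p-value.
cors <- sapply(by_species, function(d) {
  ct <- cor.test(d$Sepal.Length, d$Sepal.Width)
  c(r = unname(ct$estimate), p = ct$p.value)
})

round(cors, 4)  # one column per species, rows "r" and "p"
```

Collecting the results in one matrix makes it easier to compare species side by side than scanning three separate printouts.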
-Finally, we used chisq.test() to test whether the number of deer caught differs significantly between months, and whether cases of tuberculosis are distributed evenly across farms.
table(Deer$Month)
##
## 1 2 3 4 5 6 7 8 9 10 11 12
## 256 165 27 3 2 35 11 19 58 168 189 188
chisq.test(table(Deer$Month))
##
## Chi-squared test for given probabilities
##
## data: table(Deer$Month)
## X-squared = 997.07, df = 11, p-value < 2.2e-16
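With a single table of counts and no other arguments, chisq.test() runs a goodness-of-fit test against equal probabilities for every month. The statistic can be reproduced by hand as a check. The sketch below copies the monthly counts from the table above; the variable names are illustrative.

```r
# Monthly counts copied from the table(Deer$Month) output.
observed <- c(256, 165, 27, 3, 2, 35, 11, 19, 58, 168, 189, 188)
names(observed) <- 1:12

# Under the null hypothesis every month has the same expected count.
expected <- rep(sum(observed) / length(observed), length(observed))

# Pearson chi-squared statistic, degrees of freedom, and upper-tail p-value.
x2      <- sum((observed - expected)^2 / expected)
df_gof  <- length(observed) - 1
p_value <- pchisq(x2, df = df_gof, lower.tail = FALSE)

round(x2, 2)  # agrees with X-squared = 997.07 in the output above
all.equal(x2, unname(chisq.test(observed)$statistic))
```

The very large statistic reflects how far the monthly counts are from uniform: catches are concentrated in months 10 through 2 and nearly absent in months 4 and 5.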
table(Deer$Farm, Deer$Tb)
##
## 0 1
## AL 10 3
## AU 23 0
## BA 67 5
## BE 7 0
## CB 88 3
## CRC 4 0
## HB 22 1
## LCV 0 1
## LN 28 6
## MAN 27 24
## MB 16 5
## MO 186 31
## NC 24 4
## NV 18 1
## PA 11 0
## PN 39 0
## QM 67 7
## RF 23 1
## RN 21 0
## RO 31 0
## SAL 0 1
## SAU 3 0
## SE 16 10
## TI 9 0
## TN 16 2
## VISO 13 1
## VY 15 4
chisq.test(table(Deer$Farm, Deer$Tb))
## Warning in chisq.test(table(Deer$Farm, Deer$Tb)): Chi-squared approximation may
## be incorrect
##
## Pearson's Chi-squared test
##
## data: table(Deer$Farm, Deer$Tb)
## X-squared = 129.09, df = 26, p-value = 1.243e-15
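The test indicates that tuberculosis prevalence differs between farms, but the warning matters: many farms have few deer, so some expected counts are small and the chi-squared approximation may be unreliable. One possible way to check this, sketched below, is to inspect the expected counts and then use the Monte Carlo p-value that chisq.test() offers through simulate.p.value. The table is rebuilt from the counts printed above; the seed and number of replicates are assumptions.

```r
# Counts copied from the table(Deer$Farm, Deer$Tb) output.
farms <- c("AL", "AU", "BA", "BE", "CB", "CRC", "HB", "LCV", "LN", "MAN",
           "MB", "MO", "NC", "NV", "PA", "PN", "QM", "RF", "RN", "RO",
           "SAL", "SAU", "SE", "TI", "TN", "VISO", "VY")
tb0 <- c(10, 23, 67, 7, 88, 4, 22, 0, 28, 27, 16, 186, 24, 18, 11, 39,
         67, 23, 21, 31, 0, 3, 16, 9, 16, 13, 15)
tb1 <- c(3, 0, 5, 0, 3, 0, 1, 1, 6, 24, 5, 31, 4, 1, 0, 0,
         7, 1, 0, 0, 1, 0, 10, 0, 2, 1, 4)
tab <- matrix(c(tb0, tb1), ncol = 2,
              dimnames = list(Farm = farms, Tb = c("0", "1")))

# Expected counts under independence; small values trigger the warning.
classic <- suppressWarnings(chisq.test(tab))
sum(classic$expected < 5)  # number of cells with expected count below 5

# Monte Carlo p-value, which does not rely on the chi-squared approximation.
set.seed(1)  # assumption: arbitrary seed, for reproducibility only
sim <- chisq.test(tab, simulate.p.value = TRUE, B = 10000)
sim$p.value
```

If the simulated p-value is also very small, the conclusion that cases are not evenly distributed across farms does not depend on the approximation.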