Module09 (5/29/2026)

Author

Ethan Schatz

This module explores simple inference tests in R. In this module, we will run:

1. t-tests

2. Analysis of Variance

3. Correlation Tests

4. Chi-squared Tests

Project

Variables

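# Data sets read from local CSV files
iris <- read.csv("E:/Summer 2026/data/iris.csv")
deer <- read.csv("E:/Summer 2026/data/Deer.csv")
# Simulated heights: 50 random normal draws per character (values change on every run)
aragorn <- rnorm(50, mean = 180, sd = 10)
gimli <- rnorm(50, mean = 132, sd = 15)
legolas <- rnorm(50, mean = 195, sd = 15)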
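
Because rnorm draws new random values on every run, the test output in this module will not match exactly when the code is rerun. A minimal sketch of one way to make the simulated heights repeatable, assuming a fixed seed is acceptable here (the seed value 2026 and the names legolas_a and legolas_b are arbitrary and not part of the original code):

```r
# Fix the random number generator state (the seed value is an arbitrary assumption)
set.seed(2026)
legolas_a <- rnorm(50, mean = 195, sd = 15)

# Resetting the same seed reproduces the same 50 values
set.seed(2026)
legolas_b <- rnorm(50, mean = 195, sd = 15)

# Check that both draws are exactly the same
identical(legolas_a, legolas_b)
```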

Test 1

t.test(legolas, gimli, alternative="two.sided")


    Welch Two Sample t-test

data:  legolas and gimli
t = 19.989, df = 97.69, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 52.08915 63.57240
sample estimates:
mean of x mean of y 
 192.3782  134.5474

t.test(legolas, aragorn, alternative="two.sided")


    Welch Two Sample t-test

data:  legolas and aragorn
t = 5.8117, df = 80.406, p-value = 1.195e-07
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  9.380038 19.147960
sample estimates:
mean of x mean of y 
 192.3782  178.1142

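Both Welch two-sample t-tests show a significant difference in mean height: between the Legolas and Gimli actors (t = 19.99, df = 97.7, p < 2.2e-16; 95% CI for the difference in means 52.1 to 63.6) and between the Legolas and Aragorn actors (t = 5.81, df = 80.4, p = 1.2e-07; 95% CI 9.4 to 19.1). Because the heights are simulated with rnorm, the exact values change on each run.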
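
t.test returns a list-like object, so the values in the printout can be extracted directly instead of being copied by hand. A small sketch on freshly simulated samples (the seed and the names ending in _sim are assumptions, so the numbers will differ from the output shown here):

```r
set.seed(1)  # arbitrary seed, only to make this sketch repeatable
legolas_sim <- rnorm(50, mean = 195, sd = 15)
gimli_sim <- rnorm(50, mean = 132, sd = 15)

res <- t.test(legolas_sim, gimli_sim, alternative = "two.sided")

res$statistic  # t value
res$parameter  # Welch-adjusted degrees of freedom
res$p.value    # two-sided p-value
res$conf.int   # 95% confidence interval for the difference in means
res$estimate   # the two sample means
```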

Test 2

var.test(legolas,gimli)


    F test to compare two variances

data:  legolas and gimli
F = 1.1193, num df = 49, denom df = 49, p-value = 0.6947
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.635193 1.972470
sample estimates:
ratio of variances 
           1.11933

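The F test finds no significant difference between the variances of the Legolas and Gimli heights (F = 1.12, p = 0.69), and the 95% confidence interval for the variance ratio (0.64 to 1.97) includes 1. This is expected, since both samples were simulated with sd = 15.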
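
Since the F test gives no evidence of unequal variances, a pooled (Student's) t-test is a possible alternative to the Welch default used in Test 1. The sketch below compares the two on freshly simulated data with an arbitrary seed (an assumption, not the original samples). var.equal = TRUE is the t.test argument that requests pooling.

```r
set.seed(1)  # arbitrary seed, only to make this sketch repeatable
legolas_sim <- rnorm(50, mean = 195, sd = 15)
gimli_sim <- rnorm(50, mean = 132, sd = 15)

# Default: Welch test, which does not assume equal variances
welch <- t.test(legolas_sim, gimli_sim)

# Pooled test, which assumes both groups share one variance
pooled <- t.test(legolas_sim, gimli_sim, var.equal = TRUE)

pooled$parameter  # degrees of freedom: 50 + 50 - 2 = 98
welch$parameter   # Welch degrees of freedom, never larger than the pooled value
```

With equal sample sizes the two t statistics coincide, so only the degrees of freedom differ between the two versions.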

Test 3

#Setosa
cor.test(iris$Sepal.Length[iris$Species == "setosa"], iris$Sepal.Width[iris$Species == "setosa"])


    Pearson's product-moment correlation

data:  iris$Sepal.Length[iris$Species == "setosa"] and iris$Sepal.Width[iris$Species == "setosa"]
t = 7.6807, df = 48, p-value = 6.71e-10
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.5851391 0.8460314
sample estimates:
      cor 
0.7425467

#Versicolor
cor.test(iris$Sepal.Length[iris$Species == "versicolor"], iris$Sepal.Width[iris$Species == "versicolor"])


    Pearson's product-moment correlation

data:  iris$Sepal.Length[iris$Species == "versicolor"] and iris$Sepal.Width[iris$Species == "versicolor"]
t = 4.2839, df = 48, p-value = 8.772e-05
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.2900175 0.7015599
sample estimates:
      cor 
0.5259107

#Virginica
cor.test(iris$Sepal.Length[iris$Species == "virginica"], iris$Sepal.Width[iris$Species == "virginica"])


    Pearson's product-moment correlation

data:  iris$Sepal.Length[iris$Species == "virginica"] and iris$Sepal.Width[iris$Species == "virginica"]
t = 3.5619, df = 48, p-value = 0.0008435
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.2049657 0.6525292
sample estimates:
      cor 
0.4572278

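Sepal length and sepal width are positively correlated in all three species:

Setosa: Strong positive correlation (r = 0.74, p = 6.7e-10)

Versicolor: Moderate positive correlation (r = 0.53, p = 8.8e-05)

Virginica: Moderate positive correlation (r = 0.46, p = 0.00084)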
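
The three cor.test calls repeat the same subsetting pattern. One way to condense them is to split the data by species and apply cor.test to each piece. The sketch uses R's built-in iris data under the hypothetical name flowers, on the assumption that the CSV used in this module has the same columns and values:

```r
flowers <- datasets::iris  # built-in copy, standing in for the CSV (assumption)

# One data frame per species, then one correlation test per data frame
by_species <- split(flowers, flowers$Species)
results <- lapply(by_species, function(d) cor.test(d$Sepal.Length, d$Sepal.Width))

# Collect the correlation estimate and p-value for each species
summary_table <- data.frame(
  r = sapply(results, function(x) unname(x$estimate)),
  p = sapply(results, function(x) x$p.value)
)
round(summary_table, 4)
```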

Test 4

table(deer$Month)


  1   2   3   4   5   6   7   8   9  10  11  12 
256 165  27   3   2  35  11  19  58 168 189 188

chisq.test(table(deer$Month))


    Chi-squared test for given probabilities

data:  table(deer$Month)
X-squared = 997.07, df = 11, p-value < 2.2e-16

table(deer$Farm, deer$Tb)

      
         0   1
  AL    10   3
  AU    23   0
  BA    67   5
  BE     7   0
  CB    88   3
  CRC    4   0
  HB    22   1
  LCV    0   1
  LN    28   6
  MAN   27  24
  MB    16   5
  MO   186  31
  NC    24   4
  NV    18   1
  PA    11   0
  PN    39   0
  QM    67   7
  RF    23   1
  RN    21   0
  RO    31   0
  SAL    0   1
  SAU    3   0
  SE    16  10
  TI     9   0
  TN    16   2
  VISO  13   1
  VY    15   4

chisq.test(table(deer$Farm, deer$Tb))

Warning in chisq.test(table(deer$Farm, deer$Tb)): Chi-squared approximation may
be incorrect


    Pearson's Chi-squared test

data:  table(deer$Farm, deer$Tb)
X-squared = 129.09, df = 26, p-value = 1.243e-15

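Deer observations are not evenly distributed across the months of the year (X-squared = 997.07, df = 11, p < 2.2e-16): most fall in months 10 to 12 and 1 to 2, and very few in months 4 and 5. There is a significant association between farm and TB status (X-squared = 129.09, df = 26, p = 1.243e-15). However, R warns that the chi-squared approximation may be incorrect, because many farms have very small counts, so this p-value should be interpreted with caution.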
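
chisq.test issues the "approximation may be incorrect" warning when any expected cell count falls below 5. A rough sketch of how to check this and obtain a Monte Carlo p-value instead follows. The counts are transcribed from the Farm by Tb table printed in this section, while the object names, the seed, and B = 2000 are assumptions made for illustration.

```r
# Counts transcribed from the printed Farm x Tb table
farms <- c("AL", "AU", "BA", "BE", "CB", "CRC", "HB", "LCV", "LN", "MAN", "MB",
           "MO", "NC", "NV", "PA", "PN", "QM", "RF", "RN", "RO", "SAL", "SAU",
           "SE", "TI", "TN", "VISO", "VY")
tb0 <- c(10, 23, 67, 7, 88, 4, 22, 0, 28, 27, 16, 186, 24, 18, 11, 39, 67, 23,
         21, 31, 0, 3, 16, 9, 16, 13, 15)
tb1 <- c(3, 0, 5, 0, 3, 0, 1, 1, 6, 24, 5, 31, 4, 1, 0, 0, 7, 1, 0, 0, 1, 0,
         10, 0, 2, 1, 4)
farm_tb <- matrix(c(tb0, tb1), ncol = 2,
                  dimnames = list(Farm = farms, Tb = c("0", "1")))

# Same test as in the module; suppressWarnings() only silences the known warning
res <- suppressWarnings(chisq.test(farm_tb))

# Number of cells whose expected count is below 5
sum(res$expected < 5)

# Monte Carlo p-value, which does not rely on the chi-squared approximation
set.seed(1)  # arbitrary seed so the simulation is repeatable
sim <- chisq.test(farm_tb, simulate.p.value = TRUE, B = 2000)
sim$p.value
```

If the simulated p-value is also small, the conclusion of an association between farm and TB status does not depend on the large-sample approximation.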

DISCLAIMER:

ChatGPT was used during the process of writing the code for the purpose of debugging and fixing errors in the code.