Problem 1: Chi-Square Goodness of Fit Test

This problem concerns the two alleles of the ACTN3 gene, R and X.
We test whether the two alleles are equally common.

Hypotheses

\[ H_0: p_R = p_X = 0.5 \]

\[ H_a: p_R \neq p_X \]

Run the Test

# Observed counts
obs <- c(R = 244, X = 192)

# Chi-square goodness of fit against equal proportions
chisq_test1 <- chisq.test(obs, p = c(0.5, 0.5))
chisq_test1
## 
##  Chi-squared test for given probabilities
## 
## data:  obs
## X-squared = 6.2018, df = 1, p-value = 0.01276

P-value

chisq_test1$p.value
## [1] 0.01276179
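
As a check on what chisq.test() reports, the statistic can be recomputed by hand from the observed counts. This is only a sketch: the names expected, stat_manual, and p_manual are illustrative and not part of the original analysis.

```r
# Observed allele counts, copied from the problem above
obs <- c(R = 244, X = 192)

# Expected counts under H0: each allele gets half of the total
expected <- sum(obs) * c(R = 0.5, X = 0.5)

# Chi-square statistic: sum of (observed - expected)^2 / expected
stat_manual <- sum((obs - expected)^2 / expected)

# Upper-tail p-value with df = (number of categories - 1) = 1
p_manual <- pchisq(stat_manual, df = length(obs) - 1, lower.tail = FALSE)

stat_manual
p_manual
```

Both values should agree with the X-squared and p-value printed by chisq.test() above, since no continuity correction is applied in the goodness-of-fit case.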

Conclusion

The p-value (0.0128) is less than 0.05, so we reject H0.
The data provide evidence that the R and X alleles are not equally common.


Problem 2: Chi-Square Test for Association

We test whether vitamin use is associated with gender, using the NutritionStudy dataset.

Load dataset

# Opens an interactive file picker; choose the NutritionStudy CSV file
nutrition <- read.csv(file.choose())

Create the contingency table

vit_gender_table <- table(nutrition$VitaminUse, nutrition$Sex)
vit_gender_table
##             
##              Female Male
##   No             87   24
##   Occasional     77    5
##   Regular       109   13

Hypotheses

\[ H_0: \text{Vitamin use and gender are independent} \]

\[ H_a: \text{Vitamin use and gender are associated} \]

Chi-square Test

chisq_test2 <- chisq.test(vit_gender_table)
chisq_test2
## 
##  Pearson's Chi-squared test
## 
## data:  vit_gender_table
## X-squared = 11.071, df = 2, p-value = 0.003944

P-value

chisq_test2$p.value
## [1] 0.003944277
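
The chi-square approximation is usually considered reliable when every expected cell count is at least 5. The sketch below rebuilds the contingency table from the counts printed above (so it runs without the CSV file) and inspects the expected counts and Pearson residuals. The names vit_gender, res, and min_expected are illustrative.

```r
# Rebuild the 3 x 2 table from the printed counts (filled column by column)
vit_gender <- matrix(
  c(87, 77, 109,   # Female: No, Occasional, Regular
    24,  5,  13),  # Male:   No, Occasional, Regular
  nrow = 3,
  dimnames = list(VitaminUse = c("No", "Occasional", "Regular"),
                  Sex = c("Female", "Male"))
)

res <- chisq.test(vit_gender)

# Expected counts under independence: row total * column total / grand total
res$expected

# Pearson residuals: (observed - expected) / sqrt(expected)
res$residuals

# Smallest expected count, for the "at least 5" rule of thumb
min_expected <- min(res$expected)
min_expected
```

The smallest expected count is about 10.9 (Occasional, Male), so the rule of thumb is met. The residuals show which cells contribute most to the statistic.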

Conclusion

The p-value (0.0039) is less than 0.05, so we reject H0.
The data provide evidence that vitamin use and gender are associated.


Problem 3: ANOVA Test (Fish Gill Rates)

We test whether the mean gill rate differs across the three calcium levels (Low, Medium, High).

Build FishGills3 dataset

FishGills3 <- data.frame(
  Calcium = c(
    rep("Low", 30),
    rep("Medium", 30),
    rep("High", 30)
  ),
  GillRate = c(
    # Low
    55,63,78,85,65,98,68,84,44,87,48,86,93,64,83,79,85,65,88,47,68,86,57,53,58,47,62,64,50,45,
    # Medium
    38,42,63,46,55,63,36,58,73,69,55,68,63,73,45,79,41,83,60,48,59,33,67,43,57,72,46,74,68,83,
    # High
    59,45,63,52,59,78,72,53,69,68,57,63,68,83,38,85,68,63,58,48,42,42,80,42,52,37,57,62,40,42
  )
)

Quick preview

head(FishGills3)
##   Calcium GillRate
## 1     Low       55
## 2     Low       63
## 3     Low       78
## 4     Low       85
## 5     Low       65
## 6     Low       98

Hypotheses

\[ H_0: \mu_{\text{Low}} = \mu_{\text{Medium}} = \mu_{\text{High}} \]

\[ H_a: \text{At least one mean is different} \]

Run ANOVA

anova_result <- aov(GillRate ~ Calcium, data = FishGills3)
summary(anova_result)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## Calcium      2   2037  1018.6   4.648 0.0121 *
## Residuals   87  19064   219.1                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

P-value

summary(anova_result)[[1]][["Pr(>F)"]][1]
## [1] 0.01207706
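
To see where the F value comes from, the ANOVA table can be reproduced by hand from the group means and sums of squares. This is a sketch using the same gill rate values as above; the names gill, ss_between, ss_within, f_stat, and p_val are illustrative.

```r
# Gill rates by calcium level, copied from the data frame above
gill <- list(
  Low    = c(55,63,78,85,65,98,68,84,44,87,48,86,93,64,83,79,85,65,88,47,68,86,57,53,58,47,62,64,50,45),
  Medium = c(38,42,63,46,55,63,36,58,73,69,55,68,63,73,45,79,41,83,60,48,59,33,67,43,57,72,46,74,68,83),
  High   = c(59,45,63,52,59,78,72,53,69,68,57,63,68,83,38,85,68,63,58,48,42,42,80,42,52,37,57,62,40,42)
)

n           <- lengths(gill)          # group sizes (30 each)
group_means <- sapply(gill, mean)     # mean gill rate per calcium level
grand_mean  <- mean(unlist(gill))     # mean over all 90 observations

# Between-group sum of squares: variation of group means around the grand mean
ss_between <- sum(n * (group_means - grand_mean)^2)

# Within-group sum of squares: variation of observations around their group mean
ss_within <- sum(sapply(gill, function(x) sum((x - mean(x))^2)))

df_between <- length(gill) - 1        # 3 groups - 1 = 2
df_within  <- sum(n) - length(gill)   # 90 observations - 3 groups = 87

# F statistic: ratio of the two mean squares, with its upper-tail p-value
f_stat <- (ss_between / df_between) / (ss_within / df_within)
p_val  <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

round(group_means, 2)
f_stat
p_val
```

The group means are about 68.5 (Low), 58.67 (Medium), and 58.17 (High), so the Low group has the highest mean gill rate. The F statistic and p-value should agree with the aov() output above.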

Conclusion

The p-value (0.0121) is less than 0.05, so we reject H0.
The data provide evidence that the mean gill rate differs for at least one calcium level.


End of Assignment