nutrition_study <- read.csv("NutritionStudy.csv")
Fish_Gills <- read.csv("FishGills3.csv")

Problem 1: ACTN3 Alleles (Chi-Square Goodness-of-Fit)

Of the 436 people in the sample, 244 have allele R and 192 have allele X. Test whether the two alleles are equally likely in the population at α = 0.05.

Problem 1 Hypotheses

\(H_0\): \(p_R = p_X = 0.5\)

\(H_A\): \(p_R \neq 0.5\) (equivalently, \(p_R \neq p_X\), since \(p_R + p_X = 1\))

# Observed counts and expected values

observed <- c(244, 192)
theoretical_prop <- c(0.5, 0.5)

expected_values <- theoretical_prop * sum(observed)
expected_values

## [1] 218 218

# Chi-square test and p-value

chisq_actn3 <- chisq.test(observed)
chisq_actn3

## 
##  Chi-squared test for given probabilities
## 
## data:  observed
## X-squared = 6.2018, df = 1, p-value = 0.01276

# Results

cat("Chi-square Test for ACTN3 Alleles\n")

## Chi-square Test for ACTN3 Alleles

cat("Test statistic:", round(chisq_actn3$statistic, 3), "\n")

## Test statistic: 6.202

cat("Degrees of freedom:", chisq_actn3$parameter, "\n")

## Degrees of freedom: 1

cat("P-value:", signif(chisq_actn3$p.value, 4), "\n")

## P-value: 0.01276

Problem 1 Conclusion

The chi-square test yielded a test statistic of 6.202 with 1 degree of freedom and a p-value of 0.0128, which is below α = 0.05. Therefore, we reject the null hypothesis and conclude that the R and X alleles are not equally likely in the population.
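As a check on the result above, the test statistic can be reproduced by hand from the observed and expected counts. This is a minimal sketch rather than part of the original analysis; the names `manual_stat` and `manual_p` are introduced here only for illustration.

```r
# Observed allele counts (R, X) and expected counts under H0: p_R = p_X = 0.5
observed <- c(244, 192)
expected <- rep(sum(observed) / 2, 2)

# Chi-square statistic: sum over categories of (observed - expected)^2 / expected
manual_stat <- sum((observed - expected)^2 / expected)

# Upper-tail p-value from the chi-square distribution with k - 1 = 1 df
manual_p <- pchisq(manual_stat, df = length(observed) - 1, lower.tail = FALSE)

round(manual_stat, 4)
signif(manual_p, 4)
```

Both values should agree with the `chisq.test()` output (X-squared = 6.2018, p-value = 0.01276). Each expected count is 218, well above the usual minimum of 5, so the chi-square approximation is reasonable here.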

Problem 2: Vitamin Use and Gender

We want to determine whether vitamin use is associated with gender (Sex) in the NutritionStudy dataset.

Problem 2 Hypotheses and Contingency Table

\(H_0\): Vitamin use and gender are independent (no association).

\(H_A\): Vitamin use and gender are associated (there is a relationship).

# Contingency table of VitaminUse by Sex

tab_vit_sex <- table(nutrition_study$VitaminUse, nutrition_study$Sex)
tab_vit_sex

##             
##              Female Male
##   No             87   24
##   Occasional     77    5
##   Regular       109   13

# Chi-square test and p-value

chisq_vit_sex <- chisq.test(tab_vit_sex)
chisq_vit_sex

## 
##  Pearson's Chi-squared test
## 
## data:  tab_vit_sex
## X-squared = 11.071, df = 2, p-value = 0.003944

# Results

cat("Chi-square Test for Association: Vitamin Use vs Sex\n")

## Chi-square Test for Association: Vitamin Use vs Sex

cat("--------------------------------------------------\n")

## --------------------------------------------------

cat("Test statistic:", round(chisq_vit_sex$statistic, 3), "\n")

## Test statistic: 11.071

cat("Degrees of freedom:", chisq_vit_sex$parameter, "\n")

## Degrees of freedom: 2

cat("P-value:", signif(chisq_vit_sex$p.value, 6), "\n")

## P-value: 0.00394428

Problem 2 Conclusion

The chi-square test gives a test statistic of approximately 11.07 with 2 degrees of freedom and a p-value of about 0.00394. Because this p-value is less than α = 0.05, we reject the null hypothesis and conclude that there is statistically significant evidence of an association between vitamin use and gender in the population from which this sample was drawn.
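The test of independence compares each observed cell count with the count expected if vitamin use and gender were unrelated. The sketch below rebuilds the table from the counts printed above (so it does not need the CSV file) and recomputes the statistic by hand. The object names `tab`, `expected`, `manual_stat`, and `manual_p` are illustrative and not part of the original analysis.

```r
# Counts copied from the contingency table printed above (filled column by column)
tab <- matrix(c(87, 77, 109, 24, 5, 13), nrow = 3,
              dimnames = list(VitaminUse = c("No", "Occasional", "Regular"),
                              Sex = c("Female", "Male")))

# Expected count per cell under independence: row total * column total / grand total
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
round(expected, 2)

# Common rule of thumb: every expected count should be at least 5
all(expected >= 5)

# Chi-square statistic, degrees of freedom (r - 1)(c - 1), and upper-tail p-value
manual_stat <- sum((tab - expected)^2 / expected)
df <- (nrow(tab) - 1) * (ncol(tab) - 1)
manual_p <- pchisq(manual_stat, df = df, lower.tail = FALSE)

round(manual_stat, 3)
signif(manual_p, 4)
```

The smallest expected count is in the Occasional/Male cell (about 10.9), so the rule of thumb is met, and the hand-computed statistic should match the `chisq.test()` output of 11.071 on 2 degrees of freedom.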

Problem 3: Calcium Level and Fish Gill Rate

We want to determine whether mean gill rate differs among three calcium levels (Low, Medium, High) in the FishGills3 dataset.

Problem 3 Hypotheses and ANOVA Test

\(H_0\): \(\mu_{Low} = \mu_{Medium} = \mu_{High}\)
(the mean gill rate is the same for all calcium levels)

\(H_A\): At least one mean gill rate differs from the others.

# Treat Calcium as a factor, then check the data structure and group sizes
Fish_Gills$Calcium <- as.factor(Fish_Gills$Calcium)

str(Fish_Gills)

## 'data.frame':    90 obs. of  2 variables:
##  $ Calcium : Factor w/ 3 levels "High","Low","Medium": 2 2 2 2 2 2 2 2 2 2 ...
##  $ GillRate: int  55 63 78 85 65 98 68 84 44 87 ...

table(Fish_Gills$Calcium)

## 
##   High    Low Medium 
##     30     30     30

# ANOVA test

anova_gill <- aov(GillRate ~ Calcium, data = Fish_Gills)
summary(anova_gill)

##             Df Sum Sq Mean Sq F value Pr(>F)  
## Calcium      2   2037  1018.6   4.648 0.0121 *
## Residuals   87  19064   219.1                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Results

anova_summary <- summary(anova_gill)

F_val       <- anova_summary[[1]]$`F value`[1]
p_val       <- anova_summary[[1]]$`Pr(>F)`[1]
df_between  <- anova_summary[[1]]$Df[1]
df_within   <- anova_summary[[1]]$Df[2]

cat("One-way ANOVA: GillRate ~ Calcium\n")

## One-way ANOVA: GillRate ~ Calcium

cat("---------------------------------\n")

## ---------------------------------

cat("F-statistic:", round(F_val, 3), "\n")

## F-statistic: 4.648

cat("Degrees of freedom (between, within):", df_between, ",", df_within, "\n")

## Degrees of freedom (between, within): 2 , 87

cat("P-value:", signif(p_val, 6), "\n")

## P-value: 0.0120771

Problem 3 Conclusion

The one-way ANOVA gives an F-statistic of approximately 4.65 with degrees of freedom (2, 87) and a p-value of about 0.0121. Because this p-value is less than α = 0.05, we reject the null hypothesis and conclude that there is statistically significant evidence that the mean gill rate for at least one calcium level differs from the others. The ANOVA by itself does not identify which levels differ.
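The F-statistic reported above is the ratio of the between-group mean square to the within-group mean square. The sketch below recomputes it from the sums of squares in the printed ANOVA table. Because those sums of squares are rounded in the printout, the result is only expected to match to about three decimal places. The names `f_manual` and `p_manual` are illustrative.

```r
# Values copied from the printed ANOVA table (sums of squares are rounded there)
ss_between <- 2037
df_between <- 2
ss_within <- 19064
df_within <- 87

# Mean square = sum of squares / degrees of freedom
ms_between <- ss_between / df_between
ms_within <- ss_within / df_within

# F = between-group mean square / within-group mean square
f_manual <- ms_between / ms_within

# Upper-tail p-value from the F distribution with (2, 87) degrees of freedom
p_manual <- pf(f_manual, df_between, df_within, lower.tail = FALSE)

round(f_manual, 3)
signif(p_manual, 3)
```

These should be close to the `aov()` results of F = 4.648 and p = 0.0121.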

Chi-Square and ANOVA HW

Zoe Kaplan

2025-12-14