Three problems: one chi-square goodness-of-fit test, one chi-square test of independence, and one one-way ANOVA. Each one follows the same pattern: state hypotheses, run the test, interpret the result.
ACTN3 is a gene that encodes alpha-actinin-3, a protein in fast-twitch muscle fibers. The gene has two main alleles: R (functional) and X (non-functional). The R allele is linked to better performance in strength/speed/power sports; the X allele is associated with endurance.
A study of 436 people classified 244 as R and 192 as X. Does this provide evidence that the two options are NOT equally likely?
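State your hypotheses: H₀: the R and X classifications are equally likely (p_R = p_X = 0.5). Hₐ: the two classifications are not equally likely.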
# Q1. Run a chi-square goodness-of-fit test.
# (Hint: observed <- c(244, 192); chisq.test(observed))
observed <- c(244, 192)
chisq.test(observed)
##
## Chi-squared test for given probabilities
##
## data: observed
## X-squared = 6.2018, df = 1, p-value = 0.01276
# Q2. What is the p-value? At α = 0.05, do you reject H₀?
# p-value = 0.01276. Since 0.01276 < 0.05, we reject H₀.
Q3. Write your conclusion in plain English: The sample provides sufficient evidence that the R and X classifications are not equally likely. R was observed more often (244 of 436) than an even split would predict.
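As a check on the chisq.test() output, the statistic can be reproduced by hand from the observed counts. This is a minimal sketch: the equal-probability expected counts come from H₀, and the variable names other than observed are illustrative.

```r
# Observed counts from the study (244 R, 192 X)
observed <- c(R = 244, X = 192)

# Under H0 each category is equally likely, so each expected count is n / 2
n <- sum(observed)
expected <- n * c(0.5, 0.5)

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi_sq <- sum((observed - expected)^2 / expected)

# Upper-tail p-value with (number of categories - 1) degrees of freedom
p_value <- pchisq(chi_sq, df = length(observed) - 1, lower.tail = FALSE)

round(chi_sq, 4)   # 6.2018, matching the X-squared value above
round(p_value, 5)  # 0.01276
```

Both values agree with the chisq.test() output, which suggests the default equal-probability null is the one being tested here.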
The NutritionStudy.csv dataset contains data on vitamin
use (VitaminUse) and gender (Sex) for many
participants. Is there a significant association between these two
variables?
Download NutritionStudy.csv from the Datasets folder on
Blackboard.
nutrition <- read.csv("NutritionStudy.csv")
State your hypotheses: H₀: vitamin use and sex are independent (no association). Hₐ: vitamin use and sex are associated.
# Q4. Build a contingency table of VitaminUse and Sex using table().
table_vit <- table(nutrition$VitaminUse, nutrition$Sex)
# Q5. Run a chi-square test of independence on that table.
chisq.test(table_vit)
##
## Pearson's Chi-squared test
##
## data: table_vit
## X-squared = 11.071, df = 2, p-value = 0.003944
# Q6. What is the p-value? Do you reject H₀ at α = 0.05?
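# p-value = 0.003944. Since 0.003944 < 0.05, we reject H₀.
Q7. Write your conclusion in plain English: There is sufficient evidence of an association between vitamin use and sex in this dataset. The two variables do not appear to be independent.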
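The reported p-value can be recovered from the test statistic and degrees of freedom alone, without the data file. This is a rough sketch that takes X-squared = 11.071 and df = 2 from the output above. Because the statistic is rounded, the last digit of the result could differ slightly from what chisq.test() prints.

```r
# Values copied from the chisq.test() output above
chi_sq <- 11.071
df <- 2   # (rows - 1) * (columns - 1) for the contingency table

# Upper-tail probability of the chi-square distribution
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

signif(p_value, 4)  # 0.003944
p_value < 0.05      # TRUE, so H0 is rejected at the 5% level
```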
Researchers wanted to know how water chemistry affects fish ventilation. Fish were randomly assigned to one of three tanks with different calcium levels:
The team counted gill rates (beats per minute) for 30 fish in each
tank. The data is in FishGills3.csv.
Download FishGills3.csv from the Datasets folder on
Blackboard.
fish <- read.csv("FishGills3.csv")
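State your hypotheses: H₀: the mean gill rate is the same at all three calcium levels. Hₐ: at least one calcium level has a different mean gill rate.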
# Q8. Run a one-way ANOVA testing GillRate by Calcium.
# (Hint: aov(GillRate ~ Calcium, data = fish))
anova_result <- aov(GillRate ~ Calcium, data = fish)
anova_result
## Call:
## aov(formula = GillRate ~ Calcium, data = fish)
##
## Terms:
## Calcium Residuals
## Sum of Squares 2037.222 19064.333
## Deg. of Freedom 2 87
##
## Residual standard error: 14.80305
## Estimated effects may be unbalanced
# Q9. Use summary() on the result. What is the F statistic and p-value?
summary(anova_result)
## Df Sum Sq Mean Sq F value Pr(>F)
## Calcium 2 2037 1018.6 4.648 0.0121 *
## Residuals 87 19064 219.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# F = 4.648 and p-value = 0.0121.
# Q10. At α = 0.05, do you reject H₀?
# Since the p-value (0.0121) is smaller than 0.05, we reject H₀.
Q11. Write your conclusion in plain English:
There is sufficient evidence that the mean gill rate differs for at least one calcium level, so calcium level appears to affect gill rate. The ANOVA alone does not show which levels differ from each other.
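The F statistic in the summary() table follows directly from the sums of squares and degrees of freedom printed by aov(). The sketch below recomputes it from those printed values only, not from FishGills3.csv. The variable names are illustrative.

```r
# Sums of squares and degrees of freedom from the aov() output above
ss_between <- 2037.222   # Calcium
ss_within  <- 19064.333  # Residuals
df_between <- 2          # 3 groups - 1
df_within  <- 87         # 90 fish - 3 groups

# Mean squares are sums of squares divided by their degrees of freedom
ms_between <- ss_between / df_between
ms_within  <- ss_within / df_within

# F is the ratio of between-group to within-group mean squares
f_stat <- ms_between / ms_within
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

round(f_stat, 3)           # 4.648
round(p_value, 4)          # 0.0121
round(sqrt(ms_within), 3)  # 14.803, the residual standard error
```

The square root of the residual mean square reproduces the residual standard error that aov() reports, which is a quick way to confirm the table was read correctly.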