# Reading CSV files
nutrition_data <- read.csv("NutritionStudy.csv")
fish_data <- read.csv("FishGills3.csv")
str(nutrition_data)
## 'data.frame': 315 obs. of 17 variables:
## $ ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Age : int 64 76 38 40 72 40 65 58 35 55 ...
## $ Smoke : chr "No" "No" "No" "No" ...
## $ Quetelet : num 21.5 23.9 20 25.1 21 ...
## $ Vitamin : int 1 1 2 3 1 3 2 1 3 3 ...
## $ Calories : num 1299 1032 2372 2450 1952 ...
## $ Fat : num 57 50.1 83.6 97.5 82.6 56 52 63.4 57.8 39.6 ...
## $ Fiber : num 6.3 15.8 19.1 26.5 16.2 9.6 28.7 10.9 20.3 15.5 ...
## $ Alcohol : num 0 0 14.1 0.5 0 1.3 0 0 0.6 0 ...
## $ Cholesterol : num 170.3 75.8 257.9 332.6 170.8 ...
## $ BetaDiet : int 1945 2653 6321 1061 2863 1729 5371 823 2895 3307 ...
## $ RetinolDiet : int 890 451 660 864 1209 1439 802 2571 944 493 ...
## $ BetaPlasma : int 200 124 328 153 92 148 258 64 218 81 ...
## $ RetinolPlasma: int 915 727 721 615 799 654 834 825 517 562 ...
## $ Sex : chr "Female" "Female" "Female" "Female" ...
## $ VitaminUse : chr "Regular" "Regular" "Occasional" "No" ...
## $ PriorSmoke : int 2 1 2 2 1 2 1 1 1 2 ...
head(nutrition_data)
## ID Age Smoke Quetelet Vitamin Calories Fat Fiber Alcohol Cholesterol
## 1 1 64 No 21.4838 1 1298.8 57.0 6.3 0.0 170.3
## 2 2 76 No 23.8763 1 1032.5 50.1 15.8 0.0 75.8
## 3 3 38 No 20.0108 2 2372.3 83.6 19.1 14.1 257.9
## 4 4 40 No 25.1406 3 2449.5 97.5 26.5 0.5 332.6
## 5 5 72 No 20.9850 1 1952.1 82.6 16.2 0.0 170.8
## 6 6 40 No 27.5214 3 1366.9 56.0 9.6 1.3 154.6
## BetaDiet RetinolDiet BetaPlasma RetinolPlasma Sex VitaminUse PriorSmoke
## 1 1945 890 200 915 Female Regular 2
## 2 2653 451 124 727 Female Regular 1
## 3 6321 660 328 721 Female Occasional 2
## 4 1061 864 153 615 Female No 2
## 5 2863 1209 92 799 Female Regular 1
## 6 1729 1439 148 654 Female No 2
tail(nutrition_data)
## ID Age Smoke Quetelet Vitamin Calories Fat Fiber Alcohol Cholesterol
## 310 310 48 No 24.6147 2 2021.1 72.2 16.6 9.0 299.1
## 311 311 46 No 25.8967 3 2263.6 98.2 19.4 2.6 306.5
## 312 312 45 No 23.8270 1 1841.1 84.2 14.1 2.2 257.7
## 313 313 49 No 24.2613 1 1125.6 44.8 11.9 4.0 150.5
## 314 314 31 No 23.4525 1 2729.6 144.4 13.2 2.2 381.8
## 315 315 45 No 26.5081 1 1627.0 77.4 9.9 0.2 195.6
## BetaDiet RetinolDiet BetaPlasma RetinolPlasma Sex VitaminUse PriorSmoke
## 310 1392 1027 144 752 Female Occasional 2
## 311 2572 1261 164 216 Female No 2
## 312 1665 465 80 328 Female Regular 1
## 313 6943 520 300 502 Female Regular 1
## 314 741 644 121 684 Female Regular 2
## 315 1242 554 233 826 Female Regular 1
str(fish_data)
## 'data.frame': 90 obs. of 2 variables:
## $ Calcium : chr "Low" "Low" "Low" "Low" ...
## $ GillRate: int 55 63 78 85 65 98 68 84 44 87 ...
head(fish_data)
## Calcium GillRate
## 1 Low 55
## 2 Low 63
## 3 Low 78
## 4 Low 85
## 5 Low 65
## 6 Low 98
tail(fish_data)
## Calcium GillRate
## 85 High 52
## 86 High 37
## 87 High 57
## 88 High 62
## 89 High 40
## 90 High 42
Hypothesis for Problem 1:
\(H_0\): \(p_R = p_X\) (the two alleles are equally
frequent in the population)
\(H_a\): \(p_R \neq p_X\) (the two alleles are
not equally frequent in the population)
observed <- c(244, 192)
p_null <- c(0.5, 0.5)
expected_values <- p_null * sum(observed)
expected_values
## [1] 218 218
chisq.test(observed, p = p_null)
##
## Chi-squared test for given probabilities
##
## data: observed
## X-squared = 6.2018, df = 1, p-value = 0.01276
Conclusion for Problem 1:
With a p-value of 0.01276, which is below the 5% significance level, we
reject the null hypothesis; there is sufficient evidence to conclude
that the two alleles are not equally frequent in the population.
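As a check on the output above, the test statistic can be reproduced by hand from the observed and expected counts. This is a minimal sketch; the names `chi_sq_stat` and `p_value` are introduced here for illustration and are not part of the original analysis.

```r
# Recompute the goodness-of-fit statistic from the observed counts
observed <- c(244, 192)
p_null <- c(0.5, 0.5)
expected_values <- p_null * sum(observed)

# Pearson statistic: sum over categories of (O - E)^2 / E
chi_sq_stat <- sum((observed - expected_values)^2 / expected_values)

# Upper-tail probability with df = number of categories - 1
p_value <- pchisq(chi_sq_stat, df = length(observed) - 1, lower.tail = FALSE)

round(chi_sq_stat, 4)
round(p_value, 5)
```

The two rounded values should agree with the `X-squared` and `p-value` that `chisq.test()` reported.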
Hypothesis for Problem 2:
\(H_0\): Vitamin use and gender are
not associated (they are independent)
\(H_a\): Vitamin use and gender are
associated
vitamin_gender_table <- table(nutrition_data$VitaminUse, nutrition_data$Sex)
vitamin_gender_table
##
## Female Male
## No 87 24
## Occasional 77 5
## Regular 109 13
chisq.test(vitamin_gender_table)
##
## Pearson's Chi-squared test
##
## data: vitamin_gender_table
## X-squared = 11.071, df = 2, p-value = 0.003944
Conclusion for Problem 2:
With a p-value of 0.003944, which is below the 5% significance level, we
reject the null hypothesis; there is sufficient evidence to conclude
that vitamin use is associated with gender. The p-value measures the
strength of the evidence against independence, not the strength of the
association itself.
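The chi-squared approximation relies on the expected counts not being too small, and the Male/Occasional cell has only 5 observations, so it is worth checking. The sketch below rebuilds the table from the counts printed above (the name `vitamin_gender_counts` is introduced here for illustration) and applies the common rule of thumb that every expected count should be at least 5.

```r
# Rebuild the contingency table from the counts printed above
vitamin_gender_counts <- matrix(
  c(87, 24,
    77,  5,
    109, 13),
  nrow = 3, byrow = TRUE,
  dimnames = list(VitaminUse = c("No", "Occasional", "Regular"),
                  Sex = c("Female", "Male"))
)

# Expected counts under independence: (row total * column total) / grand total
test_result <- chisq.test(vitamin_gender_counts)
round(test_result$expected, 1)

# Rule of thumb: every expected count should be at least 5
all(test_result$expected >= 5)
```

With the original data frame loaded, the same check is available as `chisq.test(vitamin_gender_table)$expected`.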
Hypothesis for Problem 3:
\(H_0\): \(\mu_L = \mu_M = \mu_H\) (mean gill rate is the same
at every calcium level of the water)
\(H_a\): at least one of \(\mu_L\), \(\mu_M\), \(\mu_H\) differs
from the others (mean gill rate depends on the calcium level of the
water)
fish_data$Calcium <- as.factor(fish_data$Calcium)
anova_test <- aov(GillRate ~ Calcium, data = fish_data)
anova_test
## Call:
## aov(formula = GillRate ~ Calcium, data = fish_data)
##
## Terms:
## Calcium Residuals
## Sum of Squares 2037.222 19064.333
## Deg. of Freedom 2 87
##
## Residual standard error: 14.80305
## Estimated effects may be unbalanced
summary(anova_test)
## Df Sum Sq Mean Sq F value Pr(>F)
## Calcium 2 2037 1018.6 4.648 0.0121 *
## Residuals 87 19064 219.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion for Problem 3:
With a p-value of 0.0121, which is below the 5% significance level, we
reject the null hypothesis; there is sufficient evidence to conclude
that the mean gill rate differs for at least one calcium level.
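To show where the F value comes from, the sketch below recomputes it from the sums of squares and degrees of freedom printed by `aov()`. The variable names are introduced here for illustration only.

```r
# Sums of squares and degrees of freedom copied from the aov() output
ss_between <- 2037.222   # Calcium
df_between <- 2
ss_within  <- 19064.333  # Residuals
df_within  <- 87

# Mean squares are sums of squares divided by their degrees of freedom
ms_between <- ss_between / df_between
ms_within  <- ss_within / df_within

# F statistic and its upper-tail p-value
f_stat  <- ms_between / ms_within
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

round(f_stat, 3)
round(p_value, 4)
```

The ANOVA only indicates that at least one group mean differs. A follow-up such as `TukeyHSD(anova_test)` could indicate which pairs of calcium levels differ; it was not run as part of this analysis.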