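library(tidyverse)   # read_csv(), dplyr verbs, ggplot2
library(knitr)       # kable()
library(kableExtra)  # kable_styling(), row_spec(), column_spec()
penguins <- suppressMessages(read_csv("penguins.csv"))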
p_1 <- penguins %>%
filter(Sex == "MALE") %>%
group_by(Island) %>%
summarize(
mean_wt = round(mean(Body_Mass), 1),
sd_wt = round(sd(Body_Mass), 1),
sample_size = n()
)
kable(p_1, col.names = c("Island", "Mean Body Mass (g)", "SD Body Mass (g)", "Sample Size (# of Obs. Penguins)")) %>%
kable_styling(bootstrap_options = c("striped", "hover", "bordered"), full_width = FALSE) %>%
row_spec(0, color = "white", background = "skyblue") %>%
column_spec(1, bold = TRUE)
| Island | Mean Body Mass (g) | SD Body Mass (g) | Sample Size (# of Obs. Penguins) |
|---|---|---|---|
| Biscoe | 4050.0 | 355.6 | 22 |
| Dream | 4045.5 | 330.5 | 28 |
| Torgersen | 4034.8 | 372.5 | 23 |
p_2 <- penguins %>%
filter(Sex == "MALE") %>%
group_by(Island)
mass_histogram <- ggplot(p_2, aes(x = Body_Mass)) +
geom_histogram(aes(fill = Island), bins = 8, color = "white") +
facet_wrap(~ Island, scales = "free")
mass_histogram
mass_quantile <- ggplot(p_2, aes(sample = Body_Mass)) +
geom_qq(aes(color = Island)) +
facet_wrap(~ Island, scales = "free")
mass_quantile
To construct confidence intervals for the mean body mass from the sample data, I would use the t-distribution because we do not know the population standard deviation (only the sample standard deviation), and the histograms and Q-Q plots suggest that the samples are approximately normally distributed.
Biscoe <- p_2 %>%
filter(Island == "Biscoe")
Dream <- p_2 %>%
filter(Island == "Dream")
biscoe_ci <- t.test(Biscoe$Body_Mass)
dream_ci <- t.test(Dream$Body_Mass)
Mean male penguin body mass for Biscoe Island (n = 22) is 4050 g, with a 95% confidence interval of [3892, 4208] g.
Mean male penguin body mass for Dream Island (n = 28) is 4045.5 g, with a 95% confidence interval of [3917, 4174] g.
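The intervals reported above can be cross-checked by hand. The following is a minimal sketch, not part of the original analysis: it assumes only the rounded summary statistics from the table (mean, SD, n), and the helper name `t_ci` is hypothetical. Because the inputs are rounded, the bounds may differ slightly from what `t.test()` returns on the raw data.

```r
# Hypothetical helper: two-sided t-based confidence interval
# computed from summary statistics rather than raw observations.
t_ci <- function(xbar, s, n, conf_level = 0.95) {
  se <- s / sqrt(n)                                   # standard error of the mean
  t_crit <- qt(1 - (1 - conf_level) / 2, df = n - 1)  # critical t value
  c(lower = xbar - t_crit * se, upper = xbar + t_crit * se)
}

# Rounded summary statistics taken from the table above (an assumption:
# the raw data would give slightly more precise inputs)
biscoe_ci_manual <- t_ci(xbar = 4050.0, s = 355.6, n = 22)
dream_ci_manual  <- t_ci(xbar = 4045.5, s = 330.5, n = 28)

round(biscoe_ci_manual)
round(dream_ci_manual)
```

The critical value comes from `qt()` with n - 1 degrees of freedom, which is what makes this a t-based interval rather than a z-based one.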
bodymass_t <- t.test(Biscoe$Body_Mass, Dream$Body_Mass)
bodymass_t
##
## Welch Two Sample t-test
##
## data: Biscoe$Body_Mass and Dream$Body_Mass
## t = 0.045448, df = 43.575, p-value = 0.964
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -193.5580 202.4866
## sample estimates:
## mean of x mean of y
## 4050.000 4045.536
Mean male penguin body mass at Biscoe Island (n = 22) and Dream Island (n = 28) does not differ significantly (Welch's t(43.6) = 0.045, p = .964). The sample mean at Biscoe Island (4050 g) differs from the sample mean at Dream Island (4045.5 g) by only 4.5 g. If the two populations had the same mean, there would be a 96.4% chance of observing a difference in sample means at least this large, so the data provide no evidence that the population means differ.
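For readers who want to see where the test statistic and the fractional degrees of freedom come from, here is a rough sketch of the Welch calculation. It assumes the rounded means, SDs, and sample sizes reported above, so the values should agree with the `t.test()` output only approximately; the variable names are illustrative.

```r
# Rounded summary statistics from the table and t-test output above
m1 <- 4050.000; s1 <- 355.6; n1 <- 22   # Biscoe
m2 <- 4045.536; s2 <- 330.5; n2 <- 28   # Dream

v1 <- s1^2 / n1                         # squared standard error, Biscoe
v2 <- s2^2 / n2                         # squared standard error, Dream

welch_t  <- (m1 - m2) / sqrt(v1 + v2)   # Welch t statistic
# Welch-Satterthwaite approximation to the degrees of freedom
welch_df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
# Two-sided p-value: probability of a |t| at least this large under H0
welch_p  <- 2 * pt(-abs(welch_t), df = welch_df)

c(t = welch_t, df = welch_df, p = welch_p)
```

The p-value is computed under the assumption that the population means are equal; it is not the probability that they are equal.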
biscoe_t_oneside <- t.test(Biscoe$Body_Mass, mu = 4200, alternative = "less")
biscoe_t_oneside
##
## One Sample t-test
##
## data: Biscoe$Body_Mass
## t = -1.9787, df = 21, p-value = 0.03056
## alternative hypothesis: true mean is less than 4200
## 95 percent confidence interval:
## -Inf 4180.445
## sample estimates:
## mean of x
## 4050
The journal “Penguin Research” claims that the mean body mass of adult male Adélie penguins at Biscoe Island is 4200 g. However, our study sample (n = 22) had a mean body mass of 4050 g. A one-tailed, one-sample t-test at the 5% significance level (t(21) = -1.98, p = .03) indicates that the mean body mass of adult male penguins at Biscoe Island is significantly less than 4200 g. If the true mean were 4200 g, there would be only about a 3% chance of randomly drawing a sample with a mean of 4050 g or less, and thus the true mean body mass is likely less than 4200 g.
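The one-sided result can also be reproduced from the summary statistics. This is a sketch under the assumption that the rounded sample SD (355.6 g) from the table is close enough to the raw value; small differences from the `t.test()` output are expected, and the variable names are illustrative.

```r
# Rounded summary statistics for Biscoe Island and the claimed mean
xbar <- 4050     # sample mean (g)
s    <- 355.6    # sample SD (g), rounded
n    <- 22       # sample size
mu0  <- 4200     # mean claimed by the journal (g)

se <- s / sqrt(n)                          # standard error of the mean
one_t <- (xbar - mu0) / se                 # one-sample t statistic
# Lower-tail p-value: P(T <= t) when the true mean is mu0
one_p <- pt(one_t, df = n - 1)
# Upper bound of the one-sided 95% confidence interval (-Inf, upper]
one_upper <- xbar + qt(0.95, df = n - 1) * se

c(t = one_t, p = one_p, upper = one_upper)
```

Because the alternative is "less", only the lower tail of the t-distribution contributes to the p-value, and the confidence interval is bounded only from above.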
penguin_boxplot <- ggplot(p_2, aes(x = Island, y = Body_Mass)) +
geom_boxplot(aes(fill = Island), alpha = 0.8, show.legend = F) +
geom_jitter(width = 0.1, alpha = 0.8) +
labs(x = "Penguin Observations by Island (n (B) = 22, n (D) = 28, n (T) = 23)", y = "Body Mass (g)", title = "Male Adélie Penguin Body Mass Observations (Created by Chase Brewster)")
penguin_boxplot