##chunk 1
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.6
## ✔ forcats 1.0.1 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.1 ✔ tibble 3.3.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.2
## ✔ purrr 1.2.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(here)
## here() starts at /Users/oliviacowen/Desktop/5-Data Analysis & Visualization Practice
library(car)
## Loading required package: carData
##
## Attaching package: 'car'
##
## The following object is masked from 'package:dplyr':
##
## recode
##
## The following object is masked from 'package:purrr':
##
## some
##chunk 2
library(here)
fish_raw <- read.csv(here("data", "raw", "pond_fish_body_morpho.csv"))
fish_condition <- fish_raw %>%
mutate(body_condition = mass_g / length_mm) %>%
filter(!is.na(body_condition))
glimpse(fish_condition)
## Rows: 175
## Columns: 6
## $ site <chr> "pond_a", "pond_a", "pond_a", "pond_a", "pond_a", "pond…
## $ sci_name <chr> "Lepomis macrochirus", "Lepomis macrochirus", "Lepomis …
## $ common_name <chr> "Bluegill", "Bluegill", "Bluegill", "Bluegill", "Bluegi…
## $ length_mm <dbl> 118.99, 121.42, 117.28, 108.46, 127.77, 129.99, 123.26,…
## $ mass_g <dbl> 52.93, 53.13, 53.53, 47.05, 56.70, 63.76, 50.81, 55.32,…
## $ body_condition <dbl> 0.4448273, 0.4375721, 0.4564291, 0.4338005, 0.4437661, …
##chunk 3: data to test whether the average body condition of fish differed among species
anova1 <- aov(body_condition ~ common_name, data = fish_condition)
summary(anova1)
## Df Sum Sq Mean Sq F value Pr(>F)
## common_name 2 0.17503 0.08752 153.7 <2e-16 ***
## Residuals 172 0.09795 0.00057
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
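As a rough check on the size of the species effect, eta-squared can be worked out from the sums of squares in the table above. The sketch below hard-codes the rounded values printed by `summary(anova1)`; the variable names are illustrative and not part of the original analysis.

```r
# Values copied (rounded) from the ANOVA table above
ss_species  <- 0.17503   # Sum Sq for common_name
ss_residual <- 0.09795   # Sum Sq for Residuals
df_species  <- 2
df_residual <- 172

# Eta-squared: share of total variation attributed to species
eta_sq <- ss_species / (ss_species + ss_residual)

# Rebuild the F statistic from the mean squares as a consistency check
f_value <- (ss_species / df_species) / (ss_residual / df_residual)

round(eta_sq, 3)
round(f_value, 1)
```

Eta-squared comes out near 0.64, and the rebuilt F value agrees with the 153.7 reported in the table, up to rounding of the printed sums of squares.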
A one-way ANOVA was run to compare body condition among fish species. Species had a significant and strong effect on body condition (F(2, 172) = 153.7, p < 2 × 10⁻¹⁶); the between-species sum of squares (0.17503) exceeds the residual sum of squares (0.09795).
#chunk 4 Tukey
TukeyHSD(anova1)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = body_condition ~ common_name, data = fish_condition)
##
## $common_name
## diff lwr upr p adj
## Redbreast sunfish-Bluegill 0.04837591 0.03865795 0.05809387 0
## Warmouth-Bluegill -0.03073998 -0.04207760 -0.01940235 0
## Warmouth-Redbreast sunfish -0.07911588 -0.09029819 -0.06793358 0
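Because the `p adj` column prints as 0 after rounding, it can help to pull the adjusted p-values out of the `TukeyHSD()` result and format them directly. The sketch below uses the built-in `PlantGrowth` data as a stand-in (an assumption, since the fish data file is not reproduced here); with the fish data, `anova1` and `$common_name` would take the place of `pg_aov` and `$group`.

```r
# Stand-in example on a built-in dataset (not the fish data)
pg_aov <- aov(weight ~ group, data = PlantGrowth)
pg_tukey <- TukeyHSD(pg_aov)

# The result is a list with one matrix per factor in the model
pg_table <- pg_tukey$group

# Format adjusted p-values so very small values show as "<0.001" rather than 0
p_formatted <- format.pval(pg_table[, "p adj"], digits = 3, eps = 0.001)

# A comparison is significant at the 5% family-wise level when its interval excludes 0
excludes_zero <- pg_table[, "lwr"] > 0 | pg_table[, "upr"] < 0

data.frame(p_adj = p_formatted, excludes_zero = excludes_zero)
```

Checking whether the interval from `lwr` to `upr` excludes zero gives the same decision as comparing `p adj` with 0.05, which is a quick way to read the table.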
The Tukey test showed that all three species differ significantly from one another. Redbreast sunfish had a significantly higher body condition than Bluegill (diff = 0.04837591, adjusted p < 0.001), while Warmouth had a significantly lower body condition than both other species (diff = -0.03073998 versus Bluegill and diff = -0.07911588 versus Redbreast sunfish, adjusted p < 0.001 for both).
#chunk 5 Diagnostic Plots to confirm assumptions
par(mfrow = c(1, 2))
plot(anova1, which = c(2, 3))
leveneTest(anova1)
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 0.1035 0.9018
## 172
The diagnostic plots show that the ANOVA assumptions were met. The
Q-Q plot closely follows the reference line, indicating that the
normality assumption is reasonable. The scale-location plot shows a
similar spread of residuals across all fitted values, indicating that
the homogeneity of variance assumption is reasonable. This is supported
by Levene's test (F(2, 172) = 0.1035, p = 0.9018), which found no
evidence of unequal variances among species.
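To make the Levene result less of a black box, the median-centred version of the test can be reproduced in base R as a one-way ANOVA on absolute deviations from each group's median. This is a sketch on the built-in `PlantGrowth` data (a stand-in, not the fish data), and `levene_by_hand` is a hypothetical helper name.

```r
# Hypothetical helper: median-centred Levene test using only base R
levene_by_hand <- function(y, group) {
  group <- factor(group)                  # explicit factor avoids a coercion warning
  centre <- ave(y, group, FUN = median)   # each observation's group median
  abs_dev <- abs(y - centre)              # absolute deviation from that median
  anova(lm(abs_dev ~ group))              # one-way ANOVA on the deviations
}

lev <- levene_by_hand(PlantGrowth$weight, PlantGrowth$group)
lev

# Extract the p-value; a large value gives no evidence of unequal variances
lev_p <- lev[["Pr(>F)"]][1]
lev_p
```

Converting the grouping variable to a factor up front is also one way to avoid the "group coerced to factor" warning seen in the output above.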
#chunk 6
fish_lm <- fish_raw %>%
filter(!is.na(mass_g), !is.na(length_mm), length_mm != 0) %>%
lm(mass_g ~ length_mm, data = .)
summary(fish_lm)
##
## Call:
## lm(formula = mass_g ~ length_mm, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.2803 -3.1280 -0.4475 3.7589 10.3408
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.66209 5.13416 -1.103 0.272
## length_mm 0.50852 0.04278 11.888 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.726 on 173 degrees of freedom
## Multiple R-squared: 0.4496, Adjusted R-squared: 0.4464
## F-statistic: 141.3 on 1 and 173 DF, p-value: < 2.2e-16
The linear model found a significant predictive relationship between body length and mass: length is a significant predictor of mass in these fish (F(1, 173) = 141.3, p < 2.2e-16). The slope of 0.50852 means mass increases by about 0.51 g for each additional mm of length. The R² = 0.4496 indicates that about 45% of the variation in fish mass is explained by length.
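The fitted coefficients can be turned into a worked prediction, and the reported F statistic can be cross-checked against R². The sketch below hard-codes the rounded values from the `summary(fish_lm)` output above; the 120 mm example length is an arbitrary illustrative choice within the range of lengths shown earlier.

```r
# Values copied (rounded) from the summary(fish_lm) output above
intercept <- -5.66209
slope     <- 0.50852
r_squared <- 0.4496
df_resid  <- 173

# Predicted mass (g) for a hypothetical fish 120 mm long
length_example <- 120
mass_predicted <- intercept + slope * length_example

# With one predictor, F = (R^2 / (1 - R^2)) * residual df
f_from_r2 <- r_squared / (1 - r_squared) * df_resid

# Pearson correlation implied by R^2 (positive because the slope is positive)
r_implied <- sqrt(r_squared)

round(c(mass_predicted = mass_predicted,
        f_from_r2 = f_from_r2,
        r_implied = r_implied), 2)
```

The predicted mass is about 55.4 g, the F value rebuilt from R² is about 141.3, and the implied correlation is about 0.67, all consistent with the printed model summary.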
#chunk 8
lm_model1 <- lm(mass_g ~ length_mm, data = fish_condition)
par(mfrow = c(1, 2))
plot(lm_model1)
shapiro.test(lm_model1$residuals)
##
## Shapiro-Wilk normality test
##
## data: lm_model1$residuals
## W = 0.98601, p-value = 0.07838
The diagnostic plots show that the linear model assumptions were met. The Q-Q plot closely follows the reference line, indicating that the normality assumption is reasonable. This is supported by the Shapiro-Wilk test (W = 0.98601, p = 0.07838), which does not reject normality of the residuals at the 0.05 level. The residuals vs fitted values plot shows evenly distributed residuals across all fitted values, indicating that the homogeneity of variance assumption is reasonable.
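The Shapiro-Wilk decision can be written out explicitly so that the test statistic, p-value, and conclusion are all stored in named objects. This sketch uses the built-in `cars` data as a stand-in for the fish data, and the 0.05 cutoff is the conventional choice rather than something stated in the original.

```r
# Stand-in model on a built-in dataset (not the fish data)
cars_lm <- lm(dist ~ speed, data = cars)

# Shapiro-Wilk test on the residuals; the null hypothesis is normality
sw <- shapiro.test(residuals(cars_lm))
sw_w <- unname(sw$statistic)   # W statistic
sw_p <- sw$p.value             # p-value

# TRUE means normality is not rejected at the chosen level
alpha <- 0.05
normality_ok <- sw_p > alpha

c(W = sw_w, p = sw_p)
normality_ok
```

With the fish model, `residuals(lm_model1)` would take the place of `residuals(cars_lm)`; a p-value above the cutoff, such as the 0.07838 reported above, means normality is not rejected.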
#chunk 9
fish_summary <- fish_condition %>%
group_by(common_name) %>%
summarise(mean_fish = mean(body_condition),
sd = sd(body_condition))
ggplot(fish_summary,
aes(x = common_name,
y = mean_fish)) +
geom_jitter(data = fish_condition,
aes(x = common_name,
y = body_condition,
col = common_name),
width = 0.25) +
geom_point(size = 5) +
geom_errorbar(aes(ymin = mean_fish - sd,
ymax = mean_fish +sd),
lwd = 1.5,
width = 0.5) +
labs(x = "Fish Species",
y = "Body Condition (g/mm)") +
theme_bw()
#chunk 10
ggplot(fish_condition, aes(x = length_mm, y = mass_g)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE, color = "blue") +
labs(
x = "Body Length (mm)",
y = "Mass (g)",
title = "Relationship Between Fish Length and Mass"
) +
theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'
The graphs and statistical analyses show us the relationships between the
different variables presented to us within the morphological fish data.
Overall, given the graphs, statistical test outputs, and assumption tests,
I would say that the assumptions have been met. My take-home message is
that fish length and body mass are positively correlated, and that length
is a significant predictor of mass, explaining about 45% of its variation.
My current comfort working with these statistical tests is 1. I’m having
a hard time understanding how to interpret the graphs we create. I will
go over your PowerPoints to see if that clears up my confusion in the
future.