Load packages and read in datasets for class exercises:
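library(tidyverse) # assumed: supplies %>%, the dplyr verbs, ggplot2, and fct_reorder()
library(ggridges) # supplies geom_density_ridges()
fish.abundance <- read.csv(file = "data/fish_abundance.csv")
site.info <- read.csv(file = "data/site_info.csv")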
Re-familiarize yourself with the “fish_abundance.csv” dataset. It records counts of many fish species across different sites and depths. Fish diversity often decreases with depth on coral reefs, so let’s explore whether there is a relationship between depth and diversity.
summary(fish.abundance)
##     surveyid           country              site              sitelat
##  Min.   :  4000720   Length:6159        Length:6159        Min.   :-14.69
##  1st Qu.:  4000733   Class :character   Class :character   1st Qu.:-14.69
##  Median :  4000750   Mode  :character   Mode  :character   Median :-14.66
##  Mean   :402351719                                         Mean   :-14.67
##  3rd Qu.:912347536                                         3rd Qu.:-14.65
##  Max.   :912347584                                         Max.   :-14.65
##     sitelong      surveydate            depth           family
##  Min.   :145.4   Length:6159        Min.   : 1.000   Length:6159
##  1st Qu.:145.4   Class :character   1st Qu.: 3.000   Class :character
##  Median :145.4   Mode  :character   Median : 4.500   Mode  :character
##  Mean   :145.5                      Mean   : 5.569
##  3rd Qu.:145.5                      3rd Qu.: 6.000
##  Max.   :145.5                      Max.   :16.000
##     genus             species              block           total
##  Length:6159        Length:6159        Min.   :1.000   Min.   :   1.0
##  Class :character   Class :character   1st Qu.:1.000   1st Qu.:   1.0
##  Mode  :character   Mode  :character   Median :1.000   Median :   3.0
##                                        Mean   :1.497   Mean   :  31.6
##                                        3rd Qu.:2.000   3rd Qu.:  12.0
##                                        Max.   :2.000   Max.   :8000.0
##     genspe
##  Length:6159
##  Class :character
##  Mode  :character
##
##
##
fish.sprich <- fish.abundance %>%
select(site, depth, genspe) %>%
distinct() %>%
group_by(site, depth) %>%
summarize(sprich = n())
## `summarise()` has grouped output by 'site'. You can override using the
## `.groups` argument.
fish.sprich
## # A tibble: 43 × 3
## # Groups: site [15]
## site depth sprich
## <chr> <dbl> <int>
## 1 Bird Islets 2.5 107
## 2 Bird Islets 3 117
## 3 Bird Islets 3.1 64
## 4 Bird Islets 10 40
## 5 Blue Hole 2 68
## 6 Blue Hole 3.5 71
## 7 Blue Hole 9 65
## 8 Blue Hole 10 58
## 9 Blue Lagoon Channel 5 88
## 10 Blue Lagoon Channel 8 85
## # … with 33 more rows
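The richness calculation above counts the unique species names within each site and depth combination. As a minimal sketch of the same idea, the toy data frame below (hypothetical rows, not taken from "fish_abundance.csv") suggests that `distinct()` followed by `n()` gives the same counts as the `n_distinct()` shortcut. It assumes only that dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data: repeated species records at two site/depth combinations
toy <- tibble(
  site   = c("A", "A", "A", "B", "B"),
  depth  = c(3, 3, 3, 10, 10),
  genspe = c("sp1", "sp1", "sp2", "sp3", "sp3")
)

# Approach used in the exercise: drop duplicate rows, then count rows per group
rich.distinct <- toy %>%
  select(site, depth, genspe) %>%
  distinct() %>%
  group_by(site, depth) %>%
  summarize(sprich = n(), .groups = "drop")

# Shortcut: count unique species names directly within each group
rich.ndistinct <- toy %>%
  group_by(site, depth) %>%
  summarize(sprich = n_distinct(genspe), .groups = "drop")
```

Setting `.groups = "drop"` here returns an ungrouped tibble and silences the grouping message; the exercise code keeps the default grouping instead.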
fish.sprich.plot <- ggplot(fish.sprich, aes(x = depth, y = sprich, color = site)) +
geom_point()
fish.sprich.plot
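One way to look at the depth and richness relationship more directly (not part of the original exercise) is to overlay a linear trend on the scatterplot and compute a rank correlation. The sketch below uses only the ten rows of `fish.sprich` printed above, so it is illustrative and does not summarize the full 43-row table. The names `sprich.head`, `depth.cor`, and `trend.plot` are made up for this example.

```r
library(ggplot2)

# First ten rows of fish.sprich as printed above (a subset, not the full table)
sprich.head <- data.frame(
  site   = c(rep("Bird Islets", 4), rep("Blue Hole", 4), rep("Blue Lagoon Channel", 2)),
  depth  = c(2.5, 3, 3.1, 10, 2, 3.5, 9, 10, 5, 8),
  sprich = c(107, 117, 64, 40, 68, 71, 65, 58, 88, 85)
)

# Spearman rank correlation between depth and species richness
depth.cor <- cor(sprich.head$depth, sprich.head$sprich, method = "spearman")
depth.cor

# Scatterplot colored by site, with one linear trend fitted across all points
trend.plot <- ggplot(sprich.head, aes(x = depth, y = sprich)) +
  geom_point(aes(color = site)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "black")
trend.plot
```

Running the same code on the full `fish.sprich` object would be the more informative check. A single pooled trend also ignores differences among sites, which the later exercises explore.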
fish.sprich.fam <- fish.abundance %>%
filter(family %in% c("Acanthuridae", "Chaetodontidae", "Serranidae", "Pomacentridae")) %>%
select(site, depth, family, genspe) %>%
distinct() %>%
group_by(site, depth, family) %>%
summarize(sprich = n())
## `summarise()` has grouped output by 'site', 'depth'. You can override using the
## `.groups` argument.
fish.sprich.fam
## # A tibble: 162 × 4
## # Groups: site, depth [43]
## site depth family sprich
## <chr> <dbl> <chr> <int>
## 1 Bird Islets 2.5 Acanthuridae 9
## 2 Bird Islets 2.5 Chaetodontidae 6
## 3 Bird Islets 2.5 Pomacentridae 29
## 4 Bird Islets 2.5 Serranidae 2
## 5 Bird Islets 3 Acanthuridae 5
## 6 Bird Islets 3 Chaetodontidae 9
## 7 Bird Islets 3 Pomacentridae 29
## 8 Bird Islets 3 Serranidae 4
## 9 Bird Islets 3.1 Acanthuridae 3
## 10 Bird Islets 3.1 Chaetodontidae 7
## # … with 152 more rows
fish.sprich.fam.plot <- ggplot(fish.sprich.fam, aes(x = depth, y = sprich, color = site)) +
geom_point() +
facet_wrap(.~family, scales = "free")
fish.sprich.fam.plot
In the last plot from Exercise 1, it appears as though some sites have higher species richness than others. Let’s examine why species richness varies across sites using “site_info.csv,” which includes metadata on the exposure of each site.
fish.sprich.site <- fish.abundance %>%
select(surveyid, site, genspe) %>%
distinct() %>%
group_by(surveyid, site) %>%
summarize(sprich = n()) %>%
left_join(site.info)
## `summarise()` has grouped output by 'surveyid'. You can override using the
## `.groups` argument.
## Joining, by = "site"
fish.sprich.site
## # A tibble: 62 × 4
## # Groups: surveyid [62]
## surveyid site sprich exposure
## <int> <chr> <int> <chr>
## 1 4000720 Watsons Bay north 87 lagoon
## 2 4000721 Watsons Bay north 84 lagoon
## 3 4000722 Watsons-Turtle Reef 71 lagoon
## 4 4000723 Watsons-Turtle Reef 82 lagoon
## 5 4000724 Horseshoe Reef 60 lagoon
## 6 4000725 Horseshoe Reef 68 lagoon
## 7 4000726 Vickys Reef 66 lagoon
## 8 4000727 Vickys Reef 82 lagoon
## 9 4000728 Mermaid Cove dropoff 88 exposed
## 10 4000729 Mermaid Cove dropoff 68 exposed
## # … with 52 more rows
fish.sprich.site.plot <- ggplot(fish.sprich.site, aes(x = exposure, y = sprich, fill = exposure)) +
geom_violin(draw_quantiles = c(0.025, 0.5, 0.975)) +
geom_jitter(width = 0.1)
fish.sprich.site.plot
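The `left_join()` used to attach exposure keeps every survey row, so any site missing from the metadata would end up with `NA` exposure rather than being dropped. As a hedged sketch with hypothetical toy tables (`toy.rich` and `toy.info` are invented stand-ins, not the course data), the code below names the join key explicitly and uses `anti_join()` to list unmatched sites.

```r
library(dplyr)

# Hypothetical toy tables standing in for the richness summary and site metadata
toy.rich <- tibble(
  site   = c("Reef A", "Reef B", "Reef C"),
  sprich = c(80L, 65L, 72L)
)
toy.info <- tibble(
  site     = c("Reef A", "Reef B"),
  exposure = c("lagoon", "exposed")
)

# Naming the key explicitly avoids relying on the automatic "Joining, by" guess
toy.joined <- left_join(toy.rich, toy.info, by = "site")

# Sites present in the richness table but absent from the metadata
toy.unmatched <- anti_join(toy.rich, toy.info, by = "site")
```

An empty result from the `anti_join()` check on the real tables would indicate that every survey site has exposure metadata.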
# 1) Calculate the total abundance of each family in a given survey.
fish.abun.survey <- fish.abundance %>%
group_by(surveyid, family) %>%
summarize(total.fish = sum(total))
## `summarise()` has grouped output by 'surveyid'. You can override using the
## `.groups` argument.
fish.abun.survey
## # A tibble: 1,249 × 3
## # Groups: surveyid [62]
## surveyid family total.fish
## <int> <chr> <int>
## 1 4000720 Acanthuridae 37
## 2 4000720 Blenniidae 9
## 3 4000720 Carangidae 2
## 4 4000720 Chaetodontidae 33
## 5 4000720 Gobiidae 5
## 6 4000720 Haemulidae 3
## 7 4000720 Labridae 235
## 8 4000720 Lutjanidae 1
## 9 4000720 Microdesmidae 19
## 10 4000720 Mugilidae 4
## # … with 1,239 more rows
# 2) Plot the distribution of per-survey family abundance using density curves.
fish.abun.plot <- ggplot(fish.abun.survey,
aes(x = total.fish, y = family)) +
geom_density_ridges(alpha = 0.5, fill = "steelblue")
fish.abun.plot
## Picking joint bandwidth of 19.9
# 3) Bonus: Transform the x-axis to make the plot more useful.
# use fct_reorder() to reorder the y-variable as descending based on the total sum of fish in each family
# use the rel_min_height argument of geom_density_ridges() to cut the tails
fish.abun.plot2 <- ggplot(fish.abun.survey,
aes(x = log10(total.fish),
y = fct_reorder(family, total.fish, .fun = sum))) +
geom_density_ridges(alpha = 0.5, rel_min_height = 0.005, fill = "steelblue") +
xlab("Total abundance (log10)") +
ylab("Fish family")
fish.abun.plot2
## Picking joint bandwidth of 0.215
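To see what `fct_reorder()` does to the family ordering, the sketch below applies it to the ten family totals printed earlier for survey 4000720. This is a single-survey subset, so the ordering is illustrative only, and `survey.4000720` and `family.ordered` are names made up for this example.

```r
library(forcats)

# Family totals for survey 4000720, copied from the printed output above
survey.4000720 <- data.frame(
  family = c("Acanthuridae", "Blenniidae", "Carangidae", "Chaetodontidae", "Gobiidae",
             "Haemulidae", "Labridae", "Lutjanidae", "Microdesmidae", "Mugilidae"),
  total.fish = c(37, 9, 2, 33, 5, 3, 235, 1, 19, 4)
)

# Reorder factor levels by summed abundance (ascending is the fct_reorder default)
family.ordered <- fct_reorder(survey.4000720$family,
                              survey.4000720$total.fish,
                              .fun = sum)
levels(family.ordered)
```

Because ggplot2 draws the first factor level at the bottom of a discrete y-axis, ascending levels place the most abundant family at the top of the ridgeline plot.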