Lab Assignment 8: FIA Data Analysis

Species Distributions and Biogeographic Patterns

Biogeography, Department of Earth Sciences, University of Memphis, Spring 2026

1. Overview

This assignment builds directly on the skills you developed in Lab 2 (data exploration, visualization, for loops) and Lab 3 (Importance Values). You will apply those techniques to investigate biogeographic patterns in eastern US forests using FIA data — the same data underlying the papers we have been discussing in class.

There are 4 tasks worth a total of 100 points. Each task requires R code, plots, and a brief written interpretation. Submit a single R Markdown (.Rmd) file that knits to HTML, containing your code, output, and written answers.

Why This Assignment Matters: The skills you practice here — computing Importance Values, mapping species distributions, analyzing latitudinal profiles, and calculating weighted mean latitudes — are directly applicable to your semester project analysis. The techniques in Tasks 1–4 form the analytical foundation for investigating species distribution patterns, range dynamics, and community composition in FIA data. Mastering these now will give you a significant head start on your project.

Due Date: April 27, 10 p.m.

2. Setup

Use the same data files and setup code from Labs 2 and 3. Your R Markdown file should begin with:

library(readr)
library(dplyr)
library(ggplot2)

# Load data (same as Labs 2-3)
tree <- read_csv("fia_eastern31_recent.csv", show_col_types = FALSE)
ref_species <- read_csv("REF_SPECIES_trimmed.csv", show_col_types = FALSE)
tree <- tree %>% left_join(ref_species, by = "SPCD")

# Live trees only
live <- tree %>% filter(STATUSCD == 1)

# Basal area (from Lab 3)
live <- live %>% mutate(BA = pi/4 * DIA^2)

Tip: Make sure your .Rmd knits successfully before submitting. If a plot takes a long time to render, consider sampling or filtering the data first, then using the full dataset for your final version.
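One way to follow this tip is to draft plots on a random subset and switch back to the full data for the final knit. The sketch below is only an illustration: `live_toy` is a made-up stand-in for `live`, and the sample size and seed are arbitrary choices.

```r
library(dplyr)

# Toy stand-in for `live` (hypothetical values, not real FIA data)
live_toy <- data.frame(
  GRIDID = rep(1:50, each = 20),
  DIA    = seq(1, 30, length.out = 1000)
)

set.seed(42)  # makes the draft sample reproducible between knits

# Draft version: keep 100 random rows while you tune the plot
live_draft <- live_toy %>% slice_sample(n = 100)

nrow(live_draft)
```

Once the plot looks right, point the plotting code back at the full data frame before submitting.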

3. Tasks

Task 1: Map Species Richness Across the Eastern US (25 pts)

Question: Calculate the number of unique tree species in each grid cell (GRIDID) and create a map of species richness using the grid cell coordinates (centroidLon, centroidLat). Then answer: Where is tree species richness highest? Is there a clear spatial pattern?

Steps:

1. Group the live data by GRIDID, centroidLon, and centroidLat.
2. Summarize by counting the number of distinct species (n_distinct(COMMON_NAME)) in each grid cell. Call this column richness.
3. Create a scatter-style map using geom_point() with longitude on x, latitude on y, and richness mapped to both color and size.
4. Use scale_color_viridis_c() for a perceptually uniform color scale.
5. Add coord_quickmap() for proper aspect ratio.
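To see what the grouping and counting steps produce before running them on the full dataset, it can help to try them on a tiny table. The sketch below uses a hypothetical five-row data frame (`toy`) with the same column names as `live`; the values are invented for illustration.

```r
library(dplyr)

# Toy stand-in for `live`: two grid cells, hypothetical values
toy <- data.frame(
  GRIDID      = c(1, 1, 1, 2, 2),
  centroidLon = c(-90, -90, -90, -85, -85),
  centroidLat = c(35, 35, 35, 40, 40),
  COMMON_NAME = c("red maple", "red maple", "white oak",
                  "balsam fir", "balsam fir")
)

# One row per grid cell; n_distinct() counts species, not stems
toy_richness <- toy %>%
  group_by(GRIDID, centroidLon, centroidLat) %>%
  summarise(richness = n_distinct(COMMON_NAME), .groups = "drop")

toy_richness
# Grid 1 holds two species (red maple, white oak); grid 2 holds one.
```

Note that the three stems in grid 1 yield a richness of 2, because repeated species are counted once.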

Starter Code:

richness_grid <- live %>%
  group_by(GRIDID, centroidLon, centroidLat) %>%
  summarise(richness = ___, .groups = "drop")

ggplot(richness_grid, aes(x = centroidLon, y = centroidLat)) +
  geom_point(aes(color = ___, size = ___), alpha = 0.7) +
  scale_color_viridis_c(option = "C") +
  coord_quickmap() +
  labs(title = "___",
       color = "# Species", size = "# Species") +
  theme_minimal()

Deliverables: (a) The completed code, (b) your map, (c) 2–3 sentences describing the spatial pattern you observe — where is richness highest and lowest? Does it follow a latitudinal gradient?

Connection to Course Readings: The latitudinal diversity gradient is one of the most fundamental patterns in biogeography. Your map shows this gradient in real FIA data. Recall from Lab 2 (Q7) that richness peaks around 33–36°N — the southern Appalachians. Think about why this region is so species-rich (glacial refugia, topographic heterogeneity, moisture availability).

Task 2: IV Rankings vs. Count Rankings (25 pts)

Question: Pick one state of your choice. For that state, compare the top 10 species by raw stem count against the top 10 species by mean Importance Value. Do the rankings agree? Which species move up or down, and why?

Steps:

1. Filter the live data to your chosen state (STATE_ABBR) and store the result as state_data.
2. Count ranking: Count the total number of stems per species and rank the top 10.
3. IV ranking: Compute plot-level IVs (relative density + relative dominance, as in Lab 3), then average by species across all plots in the state. Rank the top 10.
4. Create a side-by-side visualization. One effective approach: make two horizontal bar charts using patchwork or gridExtra, or a single combined table.
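For the combined-table option mentioned in the last step, one possible layout joins the two rankings by species and adds a rank-change column. This is a sketch under assumptions: `count_rank_toy` and `iv_rank_toy` are hypothetical three-row tables with invented numbers, and `rank_change` is an illustrative column name.

```r
library(dplyr)

# Hypothetical top-3 rankings (real ones come from your state data)
count_rank_toy <- data.frame(
  COMMON_NAME = c("red maple", "sweetgum", "white oak"),
  n           = c(500, 420, 300)
) %>%
  mutate(count_rank = row_number(desc(n)))

iv_rank_toy <- data.frame(
  COMMON_NAME = c("white oak", "red maple", "loblolly pine"),
  mean_IV     = c(48, 41, 35)
) %>%
  mutate(iv_rank = row_number(desc(mean_IV)))

# full_join keeps species that appear in only one of the two lists
rank_compare <- full_join(count_rank_toy, iv_rank_toy, by = "COMMON_NAME") %>%
  mutate(rank_change = count_rank - iv_rank) %>%  # positive = moved up under IV
  arrange(iv_rank)

rank_compare
```

Species missing from one list show NA in that list's columns, which makes it easy to spot species that enter or leave the top ranks.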

Hints:

# Step 2 — Raw count ranking (straightforward)
count_rank <- state_data %>%
  count(COMMON_NAME, sort = TRUE) %>%
  slice_head(n = 10)

# Step 3 — IV computation (follow Lab 3 steps)
# Remember: IV = relative density + relative dominance
# First compute plot totals, then species totals per plot,
# then relative values, then average across plots.
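The recipe in the hint comments above can be traced on a toy tree list. The sketch below makes two assumptions that may differ from your Lab 3 code: `PLOT_ID` is a placeholder for whatever plot identifier you used, and the species mean is taken only over plots where the species occurs. All diameters are invented.

```r
library(dplyr)

# Toy tree list: two plots, hypothetical diameters.
# PLOT_ID is a placeholder name, not necessarily the column in your data.
toy_trees <- data.frame(
  PLOT_ID     = c("A", "A", "A", "A", "B", "B"),
  COMMON_NAME = c("red maple", "red maple", "red maple", "white oak",
                  "red maple", "white oak"),
  DIA         = c(5, 5, 5, 20, 10, 10)
) %>%
  mutate(BA = pi / 4 * DIA^2)

# Species totals per plot, then relative values within each plot
plot_iv <- toy_trees %>%
  group_by(PLOT_ID, COMMON_NAME) %>%
  summarise(sp_n = n(), sp_BA = sum(BA), .groups = "drop") %>%
  group_by(PLOT_ID) %>%
  mutate(
    rel_density   = sp_n  / sum(sp_n)  * 100,
    rel_dominance = sp_BA / sum(sp_BA) * 100,
    IV            = rel_density + rel_dominance  # out of 200 per plot
  ) %>%
  ungroup()

# Average across plots (assumption: only plots where the species occurs)
species_iv <- plot_iv %>%
  group_by(COMMON_NAME) %>%
  summarise(mean_IV = mean(IV), .groups = "drop") %>%
  arrange(desc(mean_IV))

species_iv
```

In this toy data red maple has twice as many stems as white oak (4 vs. 2), yet white oak ranks first by mean IV because its single large tree dominates basal area in plot A.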

Deliverables: (a) State your chosen state, (b) the completed code, (c) the side-by-side comparison (plot or table), (d) 3–4 sentences explaining which species changed rank and what ecological trait explains the shift (think about tree size, shade tolerance, growth form).

Tip: Species with fewer but larger trees gain rank when switching from counts to IV. Pioneer species (fast-growing, small-diameter) tend to drop. This is exactly why IV is preferred in community ecology — it captures structural importance, not just numerical abundance.

Task 3: Latitudinal IV Profiles — Species Replacement Along the Gradient (25 pts)

Question: Select three species that represent different positions along the latitudinal gradient: one southern species, one northern species, and one widespread species. Plot their IV vs. latitude profiles on a single figure and interpret the pattern of species replacement.

Suggested Species (or choose your own):

Category	Example Species	Approximate Range
Southern	loblolly pine, sweetgum, slash pine	Peak ~30–35°N
Northern	balsam fir, quaking aspen, sugar maple	Peak ~44–48°N
Widespread	red maple, white oak, black cherry	Broad range, peak varies

Steps:

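1. Compute grid-level IVs for all species across all 31 states (reuse your Lab 3 code for the full eastern US). Store the result as grid_iv, with one row per grid cell and species and columns centroidLon, centroidLat, COMMON_NAME, and mean_IV; the starter code here and in Task 4 expects those names.
2. Filter to your three chosen species.
3. Plot IV (y-axis) against centroidLat (x-axis) with geom_point() and geom_smooth(method = "loess"). Use color to distinguish species.
4. Add a meaningful title and axis labels.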
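When you interpret where a species peaks, you may want a number rather than an eyeball estimate. One possible approach, sketched below, fits a loess curve directly with base R and reads off the latitude of its maximum. The data here are synthetic (a bell curve built to peak at 36°N), and the span value is an arbitrary choice, so treat this as an illustration rather than a required step.

```r
# Synthetic IV-vs-latitude profile for one hypothetical species
lat <- seq(30, 48, by = 0.5)
iv  <- 60 * exp(-((lat - 36)^2) / (2 * 3^2))  # bell shape centered on 36

# Fit a loess curve (base R stats::loess; span chosen arbitrarily)
fit <- loess(iv ~ lat, span = 0.5)

# Predict on a fine latitude grid and find where the curve is highest
pred <- data.frame(lat = seq(30, 48, by = 0.1))
pred$iv_hat <- predict(fit, newdata = pred)

peak_lat <- pred$lat[which.max(pred$iv_hat)]
peak_lat
```

With real grid-level data you would fit one curve per species, for example by filtering `grid_iv` to a single COMMON_NAME first.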

Starter Code:

selected <- c("loblolly pine", "red maple", "balsam fir")

grid_iv %>%
  filter(COMMON_NAME %in% selected) %>%
  mutate(COMMON_NAME = factor(COMMON_NAME, levels = selected)) %>%
  ggplot(aes(x = centroidLat, y = mean_IV, color = COMMON_NAME)) +
  geom_point(alpha = 0.3, size = 1) +
  geom_smooth(method = "loess", se = TRUE, linewidth = 1.2) +
  labs(title = "___",
       x = "Latitude (°N)",
       y = "Mean Importance Value",
       color = "Species") +
  theme_minimal()

Deliverables: (a) State which three species you chose and why, (b) the completed code, (c) the overlaid IV profile plot, (d) 4–5 sentences interpreting the pattern: Where does each species peak? Where do they overlap? How does this illustrate species replacement along a climatic gradient?

Connection to Course Readings: These latitudinal IV curves are the same type of response analyzed by Murphy et al. (2010), who fitted HOF models (Huisman-Olff-Fresco) to abundance vs. latitude for 92 species and found that 62% had peaks skewed northward. Your plot visualizes the raw data behind that analysis. Also think about Zhu et al. (2014): they shifted from geographic space (latitude) to climate space (temperature × precipitation). What might your profiles look like if you replaced latitude with mean annual temperature?

Task 4: Side-by-Side IV Maps — A Southern vs. Northern Species (25 pts)

Question: Create two maps showing the geographic distribution of IV for one southern species and one northern species (use the same species from Task 3, or pick new ones). Display the maps side by side. Then calculate the abundance-weighted mean latitude for each species and mark it on your maps.

Steps:

1. From your grid-level IV data, filter to your two chosen species.
2. For each species, calculate the abundance-weighted mean latitude: wtd_lat = sum(mean_IV * centroidLat) / sum(mean_IV)
3. Create two maps using facet_wrap(~ COMMON_NAME) (or use patchwork). Map IV to color and use coord_quickmap().
4. Add a horizontal line (geom_hline()) at the weighted mean latitude for each species.
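A small worked example shows how the weighted mean differs from a plain average of grid latitudes. The three grid cells below are hypothetical: a species with IVs of 10, 30, and 60 at 30°N, 35°N, and 40°N. The weighted mean is (10·30 + 30·35 + 60·40) / (10 + 30 + 60) = 3750 / 100 = 37.5°N, pulled north of the unweighted mean of 35°N because the species is most important in the northern cell.

```r
library(dplyr)

# Hypothetical grid-level IVs for a single species
toy_iv <- data.frame(
  COMMON_NAME = c("species A", "species A", "species A"),
  centroidLat = c(30, 35, 40),
  mean_IV     = c(10, 30, 60)
)

toy_lat <- toy_iv %>%
  group_by(COMMON_NAME) %>%
  summarise(
    plain_lat = mean(centroidLat),                           # 35
    wtd_lat   = sum(mean_IV * centroidLat) / sum(mean_IV),   # 37.5
    .groups = "drop"
  )

toy_lat
```

Base R's `weighted.mean(centroidLat, mean_IV)` gives the same result and can serve as a cross-check.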

Starter Code:

# Weighted mean latitude per species
sp_means <- grid_iv %>%
  filter(COMMON_NAME %in% c("loblolly pine", "balsam fir")) %>%
  group_by(COMMON_NAME) %>%
  summarise(
    wtd_lat = sum(mean_IV * centroidLat) / sum(mean_IV),
    .groups = "drop"
  )

# Map with facets
grid_iv %>%
  filter(COMMON_NAME %in% c("loblolly pine", "balsam fir")) %>%
  ggplot(aes(x = centroidLon, y = centroidLat)) +
  geom_point(aes(color = mean_IV), size = 1.5, alpha = 0.7) +
  geom_hline(data = sp_means,
             aes(yintercept = wtd_lat),
             linetype = "dashed", color = "red", linewidth = 0.8) +
  scale_color_viridis_c(option = "D") +
  facet_wrap(~ COMMON_NAME) +
  coord_quickmap() +
  labs(title = "___", color = "IV") +
  theme_minimal()

Deliverables: (a) The completed code, (b) the side-by-side maps with weighted mean latitude lines, (c) report the weighted mean latitude for each species, (d) 2–3 sentences: How far apart are the two species’ weighted mean latitudes? What does this tell you about their thermal niches?

Connection to Course Readings: The weighted mean latitude you just computed is conceptually the same metric used by Woodall et al. (2009), who compared seedling mean latitude vs. adult biomass mean latitude and found northward shifts for many species. In that study, a higher mean latitude for seedlings than adults was interpreted as evidence of migration. You are computing the adult weighted mean latitude here — this is the baseline that Woodall compared seedlings against.

4. Submission Guidelines

File format: Submit a single .Rmd file that knits successfully to HTML.
File naming: LastName_Assignment1.Rmd
Code style: Comment your code so a classmate could follow your logic. Use meaningful variable names.
Plots: Every plot must have a descriptive title, axis labels, and a legend where applicable.
Written answers: Place your written interpretations immediately below the relevant code chunk, not at the end of the document.

Academic Integrity: You may discuss approaches with classmates, but all code and writing must be your own. If you use AI assistance for code, you must disclose this and be able to explain every line.

5. Quick Reference

Key Functions from Labs 2 & 3

Task	Function / Pattern
Count unique values	`n_distinct(column)`
Group & summarize	`group_by(...) %>% summarise(...)`
Basal area	`BA = pi/4 * DIA^2`
Relative density	`(species_n / plot_total_n) * 100`
Relative dominance	`(species_BA / plot_total_BA) * 100`
Importance Value	`rel_density + rel_dominance` (out of 200)
Scatter map	`geom_point(aes(x = lon, y = lat, color = value)) + coord_quickmap()`
Smooth trend	`geom_smooth(method = "loess")`
Faceted panels	`facet_wrap(~ variable)`
Color scales	`scale_color_viridis_c()` for continuous, `scale_color_brewer()` for discrete

End of Lab Assignment 1 — Good luck!