Species Richness Map Creation
Here I describe my project for the course R for Life. Species distribution and diversity are my favorite topics in ecology, and I’m always curious to learn more about them. Below is the process of creating a species richness map from bird data collected in Australia by Prof. Vladimír Remeš and colleagues.
Loading the packages and data
First we need to load two packages:
ggplot2
dplyr
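The loading step can be sketched as follows. The extra check for the maps package is an addition of this sketch: ggplot2's map_data() relies on maps being installed, and it is used later when the base map is built.

```r
# Load the plotting and data-manipulation packages used throughout the post
library(ggplot2)
library(dplyr)

# map_data() needs the 'maps' package to be installed (it does not have to be attached)
if (!requireNamespace("maps", quietly = TRUE)) {
  message("Package 'maps' is missing; install it with install.packages('maps')")
}
```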
Let’s load the dataset:
Select the map
Now it’s time to create the first function, subset_map. It subsets the world map data frame to a specified region and geographic bounding box.
Here is the final function:
subset_map <- function(region_name, lat_min, lat_max, long_min, long_max) {
  world_map <- map_data("world")
  region_map <- subset(world_map, region %in% region_name & lat >= lat_min & lat <= lat_max & long >= long_min & long <= long_max)
  return(region_map)
}
Let’s break down each component of the function to understand how it works:
Parameters:
region_name: The name of the region to subset, or a vector of names, matched against the region column of the world map (e.g., a country such as "Australia").
lat_min: The minimum latitude for the bounding box.
lat_max: The maximum latitude for the bounding box.
long_min: The minimum longitude for the bounding box.
long_max: The maximum longitude for the bounding box.
world_map <- map_data("world") loads the world map data using the map_data function from the ggplot2 package (which requires the maps package to be installed). This data frame contains the geographical coordinates (latitude and longitude) and other information needed to draw the map of the world.
region_map <- subset(...) keeps only the rows of world_map for which all of the following hold:
The region column in world_map matches the region_name parameter provided to the function.
The lat (latitude) values are between lat_min and lat_max.
The long (longitude) values are between long_min and long_max.
return(region_map) returns the filtered data frame.
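A call to the function might look like the sketch below. The function is repeated so the snippet runs on its own, and the bounding-box values are illustrative assumptions for mainland Australia and Tasmania, not values taken from the original project.

```r
library(ggplot2)  # provides map_data(); the 'maps' package must be installed

# Copy of subset_map() from this post, repeated so the snippet is self-contained
subset_map <- function(region_name, lat_min, lat_max, long_min, long_max) {
  world_map <- map_data("world")
  subset(world_map, region %in% region_name &
           lat >= lat_min & lat <= lat_max &
           long >= long_min & long <= long_max)
}

# Assumed bounding box: latitude -45 to -10, longitude 110 to 155
australia <- subset_map("Australia", -45, -10, 110, 155)

# Inspect the columns that the map layer uses later (long, lat, group)
head(australia[, c("long", "lat", "group", "region")])
```

Because region_name is compared with %in%, a vector of names can be passed to keep several regions at once.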
Computing species richness for selected group
The spr function is designed to calculate species
richness either by rows or columns in a given dataset. It focuses on
specified genera and can either add a new column representing species
richness for each row or a new row summarizing species richness across
columns. Here’s a breakdown of the function’s steps:
Parameters
data: A data frame containing the dataset to be processed.
genera: A vector of genera names to be included in the species richness calculation.
rowcol: A string indicating the direction for the calculation. It can be either "rows" to calculate species richness for each row or "columns" to calculate species richness for each column.
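The core idea, selecting columns by genus name and summing across them, can be tried on a small table first. The sketch below uses a made-up presence/absence table (the sites, species, and values are hypothetical, not the real dataset) and assumes each species column holds 1 for present and 0 for absent, so that a row sum equals the number of species at a site.

```r
library(dplyr)

# Toy presence/absence table: hypothetical sites and species
toy <- data.frame(
  Site_code         = c("S1", "S2", "S3"),
  Acanthiza_pusilla = c(1, 0, 1),
  Acanthiza_nana    = c(1, 0, 0),
  Malurus_cyaneus   = c(1, 1, 1),
  Corvus_coronoides = c(1, 1, 1)   # not in the selected genera, so ignored
)

genera <- c("Acanthiza", "Malurus")

# Join the genus names into one regular expression: "Acanthiza|Malurus"
genus_pattern <- paste(genera, collapse = "|")

# Keep only the columns whose names match one of the genera
genus_cols <- select(toy, matches(genus_pattern))

# Row sums give the number of selected species recorded at each site
toy$sp_richness <- rowSums(genus_cols, na.rm = TRUE)
toy[, c("Site_code", "sp_richness")]
```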
spr <- function(data, genera, rowcol = "rows") {
  # Create a regular expression that matches any of the genera
  genus_pattern <- paste(genera, collapse = "|")
  # Keep the identifier columns plus every column whose name matches a genus
  id_columns <- c("X", "Site_code", "LONG", "LAT")
  filtered_data <- data %>%
    select(all_of(id_columns), matches(genus_pattern))
  # Columns to sum: everything except the identifiers
  genus_columns <- setdiff(names(filtered_data), id_columns)
  if (rowcol == "rows") {
    # Row sums (species per site, assuming 0/1 data) go into a new column "sp_richness"
    filtered_data <- filtered_data %>%
      mutate(sp_richness = rowSums(select(filtered_data, all_of(genus_columns)), na.rm = TRUE))
  } else if (rowcol == "columns") {
    # Calculate column sums
    sum_column <- colSums(select(filtered_data, all_of(genus_columns)), na.rm = TRUE)
    # One-row data frame holding the sums; t() keeps the column names intact
    sum_row <- as.data.frame(t(sum_column))
    # Append the summary row; bind_rows() fills the identifier columns with NA,
    # which avoids type clashes (e.g. writing 0 into a character Site_code column)
    filtered_data <- bind_rows(filtered_data, sum_row)
    # Name the new row (assumes `data` is a plain data frame, not a tibble)
    rownames(filtered_data)[nrow(filtered_data)] <- "sp_richness"
  } else {
    stop("Invalid rowcol. Use 'rows' or 'columns'.")
  }
  return(filtered_data)
}
Map creation
final_map <- function(polygon_data, point_data, polygon_fill, main_title, col_grad_low, col_grad_high) {
  ggplot() +
    geom_polygon(data = polygon_data, aes(x = long, y = lat, group = group), fill = polygon_fill, color = "black") +
    geom_point(data = point_data, aes(x = LONG, y = LAT, color = sp_richness), size = 3) +
    labs(title = main_title) +
    scale_color_gradient(low = col_grad_low, high = col_grad_high) +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))
}
Here’s an explanation of the function and its components.
The final_map function in R is designed to create a
geographic map using ggplot2. It displays polygons (such as
country or state borders) and points (such as specific locations) with a
color gradient representing a variable of interest (e.g., species
richness).
Parameters
polygon_data: A data frame containing the polygon data. This should include columns for:
  long: Longitude coordinates for the polygon vertices.
  lat: Latitude coordinates for the polygon vertices.
  group: A grouping variable to define each polygon.
point_data: A data frame containing the point data. This should include columns for:
  LONG: Longitude coordinates for the points.
  LAT: Latitude coordinates for the points.
  sp_richness: The variable used for the color gradient, typically representing species richness or similar.
polygon_fill: A color used to fill the polygons.
main_title: A string for the main title of the plot.
col_grad_low: The color representing the low end of the gradient scale for point colors.
col_grad_high: The color representing the high end of the gradient scale for point colors.
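To see how the parameters fit together, here is a minimal sketch that calls final_map on made-up inputs. The function is repeated so the snippet runs on its own. The square "polygon", the three sites, their richness values, and the colors are all hypothetical stand-ins for the output of subset_map and spr.

```r
library(ggplot2)

# Copy of final_map() from this post, repeated so the snippet is self-contained
final_map <- function(polygon_data, point_data, polygon_fill, main_title,
                      col_grad_low, col_grad_high) {
  ggplot() +
    geom_polygon(data = polygon_data, aes(x = long, y = lat, group = group),
                 fill = polygon_fill, color = "black") +
    geom_point(data = point_data, aes(x = LONG, y = LAT, color = sp_richness),
               size = 3) +
    labs(title = main_title) +
    scale_color_gradient(low = col_grad_low, high = col_grad_high) +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))
}

# Hypothetical polygon: a single square, so one value of `group`
square <- data.frame(long = c(0, 10, 10, 0), lat = c(0, 0, 10, 10), group = 1)

# Hypothetical sites with made-up richness values
sites <- data.frame(LONG = c(2, 5, 8), LAT = c(3, 7, 4), sp_richness = c(4, 12, 7))

p <- final_map(square, sites, "grey90", "Toy species richness map", "yellow", "darkred")
print(p)
```

With the real data, polygon_data would be the result of subset_map and point_data the result of spr with rowcol = "rows", since that mode adds the sp_richness column the point layer expects.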