Species Richness Map Creation

Here I describe my project for the course R for Life. Species distribution and diversity are my favorite topics in ecology, and I am always curious to learn more about them. Below is the process of creating a species richness map from bird data collected in Australia by Prof. Vladimír Remeš and colleagues.

Loading the packages and data

First we need to load two packages:

  • ggplot2

  • dplyr
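A minimal sketch of the loading step. One point worth knowing: map_data(), used later in this post, relies on the maps package being installed, although maps does not have to be attached.

```r
# Attach the two packages used throughout this post
library(ggplot2)  # plotting functions and map_data()
library(dplyr)    # select(), mutate(), bind_rows() and the %>% pipe

# map_data("world") needs the maps package to be installed;
# uncomment the next line if it is missing
# install.packages("maps")
```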

Let’s load the dataset:

data <- read.csv("C:/Users/Matěj Tvarůžka/Documenty/Programming/R_statistics/Geocomputing_with_R/data/pres_abs_locs_final_LH.csv")
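The functions below select the columns X, Site_code, LONG and LAT by name, so it can help to check that they exist right after loading. The following is only a sketch: toy_data and its species column are invented stand-ins for the real file, and required_cols is a name introduced here for illustration.

```r
# Toy stand-in for the real dataset (values are invented for illustration)
toy_data <- data.frame(
  X = 1:2,
  Site_code = c("S1", "S2"),
  LONG = c(145.1, 150.3),
  LAT = c(-37.8, -33.9),
  Acanthiza_pusilla = c(1, 0)  # hypothetical species column
)

# Columns that spr() later selects by name
required_cols <- c("X", "Site_code", "LONG", "LAT")
missing_cols <- setdiff(required_cols, names(toy_data))

# Stop early with a readable message if anything is missing
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}
```

With the real data, the same check would be run on data instead of toy_data.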

Select the map

Now it is time to create the first function, subset_map. It subsets the world map data frame to a specified region and to a bounding box given by latitude and longitude limits.

Here is the final function:

subset_map <- function(region_name, lat_min, lat_max, long_min, long_max) {
  world_map <- map_data("world")
  region_map <- subset(world_map, region %in% region_name & lat >= lat_min & lat <= lat_max & long >= long_min & long <= long_max)
  return(region_map)
}

Let’s break down each component of the function to understand how it works:

  • Parameters:

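    • region_name: The name of the region to keep, as it appears in the region column of the world map (e.g., a country such as "Australia"). Because the function uses %in%, a vector of several names also works.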

    • lat_min: The minimum latitude for the bounding box.

    • lat_max: The maximum latitude for the bounding box.

    • long_min: The minimum longitude for the bounding box.

    • long_max: The maximum longitude for the bounding box.

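  • Loading the world map:

    world_map <- map_data("world")

    • map_data("world") loads the world map using the map_data function from the ggplot2 package (it needs the maps package to be installed). The resulting data frame contains the longitude and latitude of each polygon vertex, plus the group, order and region columns needed to draw the map.

  • Subsetting the map:

    region_map <- subset(world_map, region %in% region_name & lat >= lat_min & lat <= lat_max & long >= long_min & long <= long_max)

    subset() keeps only the rows of world_map for which all of the following hold:

    • The region column in world_map matches one of the values passed in region_name.

    • The lat (latitude) value lies between lat_min and lat_max, inclusive.

    • The long (longitude) value lies between long_min and long_max, inclusive.

  • return(region_map) returns the filtered data frame.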
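To make the filtering logic concrete, here is a small sketch that applies the same conditions to a toy data frame. The coordinates are invented, and toy_map only mimics the column names of the map_data("world") output.

```r
# Toy data frame with the same column names as the world map data
# (coordinates are invented for illustration)
toy_map <- data.frame(
  long   = c(115, 140, 160, 140, 10),
  lat    = c(-30, -20, -25, -5, 50),
  region = c("Australia", "Australia", "Australia", "Australia", "Germany")
)

# Same conditions as in subset_map(): region match plus bounding box
toy_subset <- subset(
  toy_map,
  region %in% "Australia" &
    lat >= -50 & lat <= -10 &
    long >= 110 & long <= 155
)

# Only the first two rows pass all conditions
print(toy_subset)
```

Note that the filter acts on individual vertices, so a polygon that crosses the edge of the bounding box loses the vertices outside it and may be drawn with a distorted outline.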

Computing species richness for selected group

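The spr function calculates species richness either by rows or by columns of a dataset. It keeps the identifier and coordinate columns plus every column whose name matches one of the specified genera. It then either adds a new column, sp_richness, holding the sum for each row (site), or appends a new row holding the sum of each column. The row sums equal the number of species per site only if the data are coded as presence/absence (1/0), as the file name suggests; the function itself does not check this. With such data, the column sums give the number of sites at which each species was recorded.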

Parameters

  • data: A data frame containing the dataset to be processed.

  • genera: A vector of genera names to be included in the species richness calculation.

  • rowcol: A string indicating the direction for the calculation. It can be either "rows" to calculate species richness for each row or "columns" to calculate species richness for each column.

spr <- function(data, genera, rowcol = "rows") {
  # Create a pattern that matches any of the genera
  genus_pattern <- paste(genera, collapse = "|")
  
  # Filter columns that match any of the genera names
  filtered_data <- data %>%
    select(X, Site_code, LONG, LAT, matches(genus_pattern))
  
  # Define the columns to use for row sums
  genus_columns <- grep(genus_pattern, names(filtered_data), value = TRUE)
  
  if (rowcol == "rows") {
    # Calculate row sums and add as a new column "sp_richness"
    filtered_data <- filtered_data %>%
      mutate(sp_richness = rowSums(select(filtered_data, all_of(genus_columns)), na.rm = TRUE))
  } else if (rowcol == "columns") {
    # Calculate column sums
    sum_column <- colSums(select(filtered_data, all_of(genus_columns)), na.rm = TRUE)
    
    # Build a one-row template that keeps each column's type,
    # so that bind_rows() can combine it with filtered_data without type conflicts
    sum_row <- filtered_data[1, , drop = FALSE]
    sum_row[1, ] <- NA
    
    # Assign the calculated column sums to the genus columns in the new row
    sum_row[1, genus_columns] <- sum_column
    
    # Add the new row to the original data; identifier and coordinate columns stay NA
    filtered_data <- bind_rows(filtered_data, sum_row)
    
    # Name the new row
    rownames(filtered_data)[nrow(filtered_data)] <- "sp_richness"
  } else {
    stop("Invalid direction. Use 'rows' or 'columns'.")
  }
  
  return(filtered_data)
}
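The core of spr is the genus pattern and the row sums, which can be sketched in base R on a toy table. The column names below are hypothetical and assume a Genus_species naming scheme, which may differ from the real dataset. The anchored pattern is an optional stricter variant, not part of spr: it avoids accidental matches when a genus name occurs inside another column name.

```r
# Toy presence/absence table; species column names are hypothetical and
# assumed to follow a Genus_species pattern
toy_data <- data.frame(
  Site_code = c("S1", "S2", "S3"),
  Acanthiza_pusilla = c(1, 0, 1),
  Acanthiza_nana = c(1, 0, 0),
  Gerygone_fusca = c(1, 1, 0),
  Malurus_cyaneus = c(1, 1, 1)
)

genera <- c("Acanthiza", "Gerygone")

# Same pattern construction as in spr(): "Acanthiza|Gerygone"
genus_pattern <- paste(genera, collapse = "|")

# Optional stricter variant: match only at the start of the column name
anchored_pattern <- paste0("^(", genus_pattern, ")_")

genus_columns <- grep(anchored_pattern, names(toy_data), value = TRUE)

# Row-wise richness: number of matched species present at each site
toy_data$sp_richness <- rowSums(toy_data[, genus_columns], na.rm = TRUE)

print(toy_data[, c("Site_code", "sp_richness")])
```

Malurus_cyaneus is ignored because its genus is not in genera, so the three sites get a richness of 3, 1 and 1.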

Map creation

final_map <- function(polygon_data, point_data, polygon_fill, main_title, col_grad_low, col_grad_high) {
  ggplot() +
    geom_polygon(data = polygon_data, aes(x = long, y = lat, group = group), fill = polygon_fill, color = "black") +
    geom_point(data = point_data, aes(x = LONG, y = LAT, color = sp_richness), size = 3) +
    labs(title = main_title) +
    scale_color_gradient(low = col_grad_low, high = col_grad_high) +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))
}

Here’s an explanation of the function and its components.

The final_map function in R is designed to create a geographic map using ggplot2. It displays polygons (such as country or state borders) and points (such as specific locations) with a color gradient representing a variable of interest (e.g., species richness).

Parameters

  • polygon_data: A data frame containing the polygon data. This should include columns for:

    • long: Longitude coordinates for the polygon vertices.

    • lat: Latitude coordinates for the polygon vertices.

    • group: A grouping variable to define each polygon.

  • point_data: A data frame containing the point data. This should include columns for:

    • LONG: Longitude coordinates for the points.

    • LAT: Latitude coordinates for the points.

    • sp_richness: The variable used for the color gradient, typically representing species richness or similar.

  • polygon_fill: A color used to fill the polygons.

  • main_title: A string for the main title of the plot.

  • col_grad_low: The color representing the low end of the gradient scale for point colors.

  • col_grad_high: The color representing the high end of the gradient scale for point colors.
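The layering described above can be tried without the real data. The sketch below builds the same two layers on an invented square polygon and three invented sites; toy_polygon, toy_points and toy_plot are names introduced only for this illustration.

```r
library(ggplot2)

# Invented square outline standing in for a country polygon
toy_polygon <- data.frame(
  long = c(0, 10, 10, 0),
  lat = c(0, 0, 10, 10),
  group = 1
)

# Three invented sites with made-up richness values
toy_points <- data.frame(
  LONG = c(2, 5, 8),
  LAT = c(3, 7, 4),
  sp_richness = c(1, 4, 9)
)

# Polygon layer first, then points colored by richness on a two-color gradient
toy_plot <- ggplot() +
  geom_polygon(data = toy_polygon, aes(x = long, y = lat, group = group),
               fill = "snow", color = "black") +
  geom_point(data = toy_points, aes(x = LONG, y = LAT, color = sp_richness), size = 3) +
  scale_color_gradient(low = "orange", high = "red4") +
  theme_minimal()
```

Because the result is an ordinary ggplot object, further components can be added to it with +, for example coord_quickmap() to keep a map-like aspect ratio. That addition is a suggestion and not part of final_map.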

Example of use

1. Select map

Australia_map <- subset_map(region_name = "Australia", lat_min = -50, lat_max = -10, long_min = 110, long_max = 155)

2. Select the genera of interest (here, all genera of one family) and compute species richness by rows or columns

Acanthizidae <- c("Acanthiza", "Acanthornis", "Smicrornis", "Calamanthus", "Hylacola", "Pycnoptilus", "Pyrrholaemus", "Origma", "Sericornis", "Gerygone", "Aphelocephala", "Oreoscopus")
Acanthizidae_data <- spr(data, genera = Acanthizidae, rowcol = "rows")

3. Create a species richness map

Acanthizidae_map <- final_map(polygon_data = Australia_map, point_data = Acanthizidae_data, polygon_fill = "snow",
          main_title = "Species Richness of Family Acanthizidae in Australia Sites",
          col_grad_low = "orange", col_grad_high = "red4")
print(Acanthizidae_map)