Module 13: Apply it to your data 12

Import your data

library(tidyverse) # dplyr verbs, stringr helpers, and the %>% pipe used below
data <- readxl::read_excel("../00_data/NationoalParkSpecies1.xlsx")

Repeat the same operation over different columns of a data frame

Case of numeric variables

# Summarize mean and total of all numeric columns
# na.rm = TRUE is passed through anonymous functions, as dplyr >= 1.1.0 expects
numeric_summary <- data %>%
  summarise(across(
    where(is.numeric),
    list(mean = \(x) mean(x, na.rm = TRUE), sum = \(x) sum(x, na.rm = TRUE))
  ))

numeric_summary

## # A tibble: 1 × 6
##   References_mean References_sum Observations_mean Observations_sum
##             <dbl>          <dbl>             <dbl>            <dbl>
## 1            5.68           9710             0.202              345
## # ℹ 2 more variables: Vouchers_mean <dbl>, Vouchers_sum <dbl>
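
To see how `across()` builds the output column names, here is a minimal sketch on a toy tibble. The column names mirror the dataset, but the values are made up for illustration, and `toy` and `toy_summary` are hypothetical names. It assumes dplyr 1.1.0 or later and R 4.1 or later (for the `\(x)` shorthand).

```r
library(dplyr)

# Hypothetical toy data: column names mirror the real dataset, values do not
toy <- tibble(
  ParkName     = c("A", "A", "B"),
  References   = c(2, NA, 4),
  Observations = c(1, 0, NA)
)

# Each numeric column gets one output column per function in the named list.
# .names spells out the default "{.col}_{.fn}" pattern explicitly.
toy_summary <- toy %>%
  summarise(across(
    where(is.numeric),
    list(mean = \(x) mean(x, na.rm = TRUE), sum = \(x) sum(x, na.rm = TRUE)),
    .names = "{.col}_{.fn}"
  ))

toy_summary
```

The character column `ParkName` is skipped because `where(is.numeric)` selects only numeric columns.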

Create your own function

# Define a custom function to flag species with "Rare" abundance (case-insensitive)
flag_rare <- function(abundance) {
  # The comparison already returns TRUE/FALSE (NA for missing values)
  str_to_lower(abundance) == "rare"
}

# Apply the custom function to your dataset
data <- data %>%
  mutate(RareSpeciesFlag = flag_rare(Abundance))
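
Before trusting the new column, it can help to try the function on a small vector. The sketch below redefines `flag_rare()` with the same logic so the block runs on its own. The input labels are hypothetical, and `flags_no_na` is one possible way to treat missing abundance as "not rare", not something the analysis above does.

```r
library(stringr)

# Same logic as flag_rare() above, redefined so this block is self-contained
flag_rare <- function(abundance) {
  str_to_lower(abundance) == "rare"
}

# Hypothetical abundance labels: mixed case plus a missing value
toy_abundance <- c("Rare", "Common", "RARE", NA)

flags <- flag_rare(toy_abundance)
flags

# Optional: turn NA into FALSE if a strict TRUE/FALSE flag is preferred
flags_no_na <- !is.na(flags) & flags
flags_no_na
```

A missing `Abundance` yields `NA` rather than `FALSE`, which matters if the flag is later summed or used in `filter()`.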

Repeat the same operation over different elements of a list

When you have a grouping variable (factor)

# Split the dataset by ParkName
data_split <- data %>%
  group_split(ParkName)

# Apply a summary function to each park’s subset
lapply(data_split, function(df) {
  df %>%
    summarise(
      park = first(ParkName),
      total_obs = sum(Observations, na.rm = TRUE)
    )
})

## [[1]]
## # A tibble: 1 × 2
##   park                 total_obs
##   <chr>                    <dbl>
## 1 Acadia National Park       345
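
With more than one park, `lapply()` returns a list with one tibble per group. The sketch below uses a toy tibble with two made-up parks to show the pattern and then stacks the results with `bind_rows()`. The names `toy` and `per_park` and all values are assumptions for illustration.

```r
library(dplyr)

# Hypothetical toy data with two parks
toy <- tibble(
  ParkName     = c("Park A", "Park A", "Park B"),
  Observations = c(3, NA, 5)
)

per_park <- toy %>%
  group_split(ParkName) %>%           # list of tibbles, one per park
  lapply(function(df) {
    df %>%
      summarise(
        park      = first(ParkName),  # constant within each split
        total_obs = sum(Observations, na.rm = TRUE)
      )
  }) %>%
  bind_rows()                         # stack the one-row results

per_park
```

Stacking the list into a single tibble makes the per-park totals easier to compare or plot than a list of separate one-row tibbles.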

Create your own

Choose either one of the two cases above and apply it to your data

# Group by Category (e.g., Mammal, Bird) and summarize relevant metrics
data %>%
  group_by(CategoryName) %>%
  summarise(
    total_observations = sum(Observations, na.rm = TRUE),
    mean_references = mean(References, na.rm = TRUE),
    n_species = n()
  )

## # A tibble: 6 × 4
##   CategoryName   total_observations mean_references n_species
##   <chr>                       <dbl>           <dbl>     <int>
## 1 Amphibian                      15            8.4         15
## 2 Bird                          286           11.8        364
## 3 Fish                            1            2.79        38
## 4 Mammal                         37            8.73        55
## 5 Reptile                         5            6.36        11
## 6 Vascular Plant                  1            3.79      1226
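
The two cases can also be combined: `across()` works inside a grouped `summarise()`, so the same function is applied to several columns within each category. This is a sketch on made-up values, with `toy` and `by_category` as hypothetical names. Note that `n()` counts rows per group, which equals the number of species only if each row is a distinct species.

```r
library(dplyr)

# Hypothetical toy data: column names mirror the dataset, values are made up
toy <- tibble(
  CategoryName = c("Bird", "Bird", "Mammal"),
  Observations = c(2, NA, 1),
  References   = c(10, 20, 6)
)

by_category <- toy %>%
  group_by(CategoryName) %>%
  summarise(
    # Same function applied to both columns within each category
    across(c(Observations, References), \(x) sum(x, na.rm = TRUE),
           .names = "total_{.col}"),
    n_rows = n()  # number of rows in the group
  )

by_category
```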

Chapter 21 Iteration

Daniel Lee