problem matrices
First, load the required R packages.
library(Rcompadre)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.2 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Then download the database.
compadre <- cdb_fetch("Compadre")
## This is COMPADRE version 6.23.5.0 (release date May_06_2023)
## See user agreement at https://compadre-db.org/Help/UserAgreement
## See how to cite with `citation(Rcompadre)`
Wherever possible, the COMPADRE databases split the complete A matrix into the submatrices U, F and C. These submatrices represent growth/survival, sexual reproduction and clonal reproduction, respectively.
In this example, I want to find matrices that have a problem in the U matrix. Specifically, I want to find cases where stage-specific survival is recorded as 0 or as 1. These values are unrealistic and likely caused by sampling error. Some analyses may not work with these matrices, so it is a good idea to examine them carefully.
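To make the decomposition concrete, here is a minimal sketch using a made-up three-stage life cycle. The stage names and all numbers are hypothetical and are not taken from COMPADRE; `F_mat` and `C_mat` are illustrative names chosen to avoid masking R's built-in `F`. The sketch shows that A is the element-wise sum of U, F and C, and that each column sum of U is the survival probability of the corresponding stage.

```r
# Hypothetical 3-stage example; all values are invented for illustration.
stages <- c("seedling", "juvenile", "adult")

# U: survival and growth transitions (columns = stage at time t)
U <- matrix(c(0.10, 0.00, 0.00,
              0.30, 0.50, 0.00,
              0.00, 0.25, 0.90),
            nrow = 3, byrow = TRUE, dimnames = list(stages, stages))

# F: sexual reproduction (here, adults produce seedlings)
F_mat <- matrix(0, nrow = 3, ncol = 3, dimnames = list(stages, stages))
F_mat["seedling", "adult"] <- 4.2

# C: clonal reproduction (none in this example)
C_mat <- matrix(0, nrow = 3, ncol = 3, dimnames = list(stages, stages))

# The complete projection matrix is the sum of the three submatrices
A <- U + F_mat + C_mat

# Column sums of U are the stage-specific survival probabilities
stage_survival <- colSums(U)
stage_survival
```

In this toy example every stage has a survival strictly between 0 and 1, so none of the columns would count as a problem under the criterion used here.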
Before proceeding, check the documentation for the function cdb_flag, which examines the data for common problems and flags them in columns added to the database.
First, I extract the U matrices into a list using matU.
U_matrices <- matU(compadre)
I can look at an individual matrix using double-square-bracket subsetting. For example, here I look at the 10th matrix.
U_matrices[[10]]
## U1 U2 U3 U4 U5
## U1 0.38 0.00 0.00 0.00 0.0
## U2 0.11 0.18 0.10 0.07 0.0
## U3 0.01 0.10 0.23 0.26 0.1
## U4 0.00 0.00 0.10 0.15 0.6
## U5 0.00 0.00 0.02 0.04 0.1
Next, I need to write a small function that examines a single matrix and tests whether it has a problem. One could adapt this function to identify other issues, but in this case it checks whether any column sum of the U matrix is equal to 0 or to 1.
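problemFinderFunction <- function(m) {
  # Column sums of U give the survival probability of each stage
  column_sums <- colSums(m)
  # Check for problem survival: a column sum of exactly 0 or exactly 1
  problem_detected <- any(column_sums == 0) || any(column_sums == 1)
  return(problem_detected)
}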
It is a good idea to check the function on a known matrix, like this:
U_matrices[[10]]
## U1 U2 U3 U4 U5
## U1 0.38 0.00 0.00 0.00 0.0
## U2 0.11 0.18 0.10 0.07 0.0
## U3 0.01 0.10 0.23 0.26 0.1
## U4 0.00 0.00 0.10 0.15 0.6
## U5 0.00 0.00 0.02 0.04 0.1
colSums(U_matrices[[10]])
## U1 U2 U3 U4 U5
## 0.50 0.28 0.45 0.52 0.80
problemFinderFunction(U_matrices[[10]])
## [1] FALSE
and on a matrix that has columns summing to zero:
U_matrices[[1]]
## U1 U2 U3 U4 U5 U6 U7 U8 U9
## U1 0.077 0 0 0 0 0 0 0 0
## U2 0.037 0 0 0 0 0 0 0 0
## U3 0.003 0 0 0 0 0 0 0 0
## U4 0.000 0 0 0 0 0 0 0 0
## U5 0.000 0 0 0 0 0 0 0 0
## U6 0.000 0 0 0 0 0 0 0 0
## U7 0.000 0 0 0 0 0 0 0 0
## U8 0.000 0 0 0 0 0 0 0 0
## U9 0.000 0 0 0 0 0 0 0 0
colSums(U_matrices[[1]])
## U1 U2 U3 U4 U5 U6 U7 U8 U9
## 0.117 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
problemFinderFunction(U_matrices[[1]])
## [1] TRUE
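One caveat about exact comparisons: column sums are floating-point numbers, so a sum that is meant to be 1 can differ from 1 by a tiny amount and slip past `== 1`. The sketch below is a hypothetical variant (the name `problemFinderTolerant` and the default `tol` are assumptions, not part of Rcompadre) that compares against 0 and 1 within a tolerance. Whether you want this depends on your data; it will flag slightly more matrices than the exact check.

```r
# Hypothetical tolerance-based variant of the exact check.
# `tol` is an assumed default; pick a value appropriate for your data.
problemFinderTolerant <- function(m, tol = 1e-8) {
  column_sums <- colSums(m)
  zero_survival <- abs(column_sums) < tol      # effectively 0
  full_survival <- abs(column_sums - 1) < tol  # effectively 1
  any(zero_survival | full_survival)
}

# Toy matrix (invented values): the second column sums to just under 1
m <- matrix(c(0.5, 1 - 1e-12,
              0.2, 0),
            nrow = 2, byrow = TRUE)

# The exact comparison misses the second column ...
exact_check <- any(colSums(m) == 0) || any(colSums(m) == 1)

# ... while the tolerant comparison flags it
tolerant_check <- problemFinderTolerant(m)
```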
When you are sure that it works correctly, you can apply the function to the whole list of matrices using sapply. The result here is a logical vector (TRUE/FALSE) indicating whether each matrix has a problem.
problemMatrix <- sapply(U_matrices, problemFinderFunction)
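As a small self-contained illustration of this step, the sketch below applies the same check to a toy list of matrices (invented values standing in for `U_matrices`). It uses `vapply()` instead of `sapply()`, a swapped-in alternative that additionally verifies every result is a single logical value. The names `toy_matrices`, `has_problem` and `toy_flags` are hypothetical.

```r
# Toy list standing in for U_matrices; all values are invented.
toy_matrices <- list(
  ok        = matrix(c(0.2, 0.1, 0.3, 0.4), nrow = 2),  # column sums 0.3, 0.7
  zero_col  = matrix(c(0.2, 0.1, 0.0, 0.0), nrow = 2),  # second column sums to 0
  full_surv = matrix(c(0.5, 0.5, 0.3, 0.4), nrow = 2)   # first column sums to 1
)

# Same logic as the check in the text, repeated so this block runs on its own
has_problem <- function(m) {
  column_sums <- colSums(m)
  any(column_sums == 0) || any(column_sums == 1)
}

# vapply() behaves like sapply() but errors if a result is not one logical value
toy_flags <- vapply(toy_matrices, has_problem, logical(1))

toy_flags         # named logical vector, one element per matrix
sum(toy_flags)    # number of flagged matrices
which(toy_flags)  # positions (and names) of the flagged matrices
```

Summaries such as `sum()` and `which()` are a quick way to see how many matrices were flagged before joining the result to the metadata.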
You could add this indicator vector as an additional column to the original COMPADRE database metadata like this.
compadre_metadata <- cdb_metadata(compadre)
compadre_metadata <- cbind(compadre_metadata, problemMatrix)
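Note that cbind matches rows purely by position, so this step relies on the indicator vector having one element per metadata row, in the same order. The sketch below illustrates a simple guard on a toy data frame; the data and the names `toy_metadata` and `toy_flags` are hypothetical.

```r
# Toy metadata standing in for cdb_metadata(compadre); values are invented.
toy_metadata <- data.frame(
  SpeciesAuthor   = c("Species_a", "Species_b", "Species_c"),
  YearPublication = c("1999", "2005", "2016"),
  stringsAsFactors = FALSE
)

# Toy indicator vector, one element per row of the metadata
toy_flags <- c(FALSE, TRUE, TRUE)

# Guard against silent misalignment before binding by position
stopifnot(length(toy_flags) == nrow(toy_metadata))

# Naming the argument sets the name of the new column explicitly
toy_metadata <- cbind(toy_metadata, problemMatrix = toy_flags)
toy_metadata
```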
Then you can filter compadre_metadata in the usual way.
problemData <- compadre_metadata %>%
  filter(problemMatrix == TRUE) %>%
  select(SpeciesAuthor, Authors, Journal, YearPublication, DOI_ISBN) %>%
  as_tibble()
problemData
## # A tibble: 4,685 × 5
## SpeciesAuthor Authors Journal YearPublication DOI_ISBN
## <chr> <chr> <chr> <chr> <chr>
## 1 Abies_balsamea "Silver… Am Nat 1999 10.1086…
## 2 Abies_balsamea "Silver… Am Nat 1999 10.1086…
## 3 Agave_angustifolia "Arias-… Bot Sci 2016 10.1712…
## 4 Ascophyllum_nodosum_3 "Aberg" Mar Ec… 1990 10.3354…
## 5 Astragalus_australis_var._olympicus "Kaye" <NA> 1990 <NA>
## 6 Astragalus_australis_var._olympicus "Kaye" <NA> 1990 <NA>
## 7 Betula_pendula "Maille… J Appl… 1982 10.2307…
## 8 Carex_membranacea "Tolvan… J Veg … 2001 10.2307…
## 9 Carex_membranacea "Tolvan… J Veg … 2001 10.2307…
## 10 Cassia_nemophila "Siland… Oecolo… 1983 10.1007…
## # ℹ 4,675 more rows
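Because each row corresponds to one matrix, the same species and source can appear several times in this table. A minimal base R sketch of how one might summarise such a result is shown below, using a toy data frame that mimics the shape of the first few rows (the name `toy_problem_data` is hypothetical, and the toy rows are for illustration only).

```r
# Toy data frame mimicking the shape of problemData; for illustration only.
toy_problem_data <- data.frame(
  SpeciesAuthor   = c("Abies_balsamea", "Abies_balsamea", "Agave_angustifolia"),
  YearPublication = c("1999", "1999", "2016"),
  stringsAsFactors = FALSE
)

# Number of flagged matrices per species
flag_counts <- table(toy_problem_data$SpeciesAuthor)
flag_counts

# One row per distinct species/source combination
distinct_sources <- unique(toy_problem_data)
distinct_sources
```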