Abstract
TO BE COMPLETEDThe functions available in the package estimate volume and phytomass for a given species or for a group of similar species. An unequivocal species identification procedure is hence essential to access the system.
“European and Mediterranean Plant Protection Organization” EPPO offers a coding system accessible also via REST API services, a basic building block for the develoment of a procedure for species codes selection, dynamically based on EPPO services.
ForIt ver 2 will adopt EPPO codes and rely on its web services to match species names with avaliable volume and weight estimation functions.
The package will include a subset of EPPO species DB, limited to the species named in the reference work for INFC volume estimation (Tabacchi et al., 2011) and to a subset of connected attributes, to provide a local key to access the available functions.
This set is obtained using EPPO on line tools - Batch processing. Quick and simple login is required to access the tools. Uploading an Excel file with a column containing species names, corresponding codes can be found in the successive column of the Execel file one can download after processing.
library(tidyverse)
library(readxl)
library(writexl)
require(Matrix)
library(ForIT)
data("INFCstats")
data("INFCdomain")
INFCstats %>%
dplyr::select(specie) %>%
unique %>%
write_xlsx(path = "INFC_bare_species_list.xlsx")
Batch processing with EPPO] the list obtained from ForIt ver.1 data, two names are not recognised.
INFCstats %>%
dplyr::select(specie) %>%
unique %>%
mutate(specie2 = as.character(specie)) %>%
# ci sono 2 correzioni da apportare: angustiGolia e togliere spp.
mutate(specie2 = replace(specie2, specie2 == "Fraxinus angustigolia", "Fraxinus angustifolia")) %>%
mutate(specie2 = replace(specie2, specie2 == "Eucalyptus spp.", "Eucalyptus")) %>%
write_xlsx(path = "INFC_corr_species_list.xlsx")
Using the ‘Batch processing’ EPPO tool, EPPOCodes have been coupled with species (specie2) names producing EPPO_01.xlsx file Further processing the latter, following attributes have been added * PrefName,
* Synonyms_la, # arguable info!!
* the Taxonomy columns,
producing EPPO_04.xlsx.
The last step sligthly corrupts the table: what I suppose should be the ‘Subclass’ info (only defined for some classes), ends up in the first column, substituting only partially its content.
The original content of the first column, key connecting this table with the version 1 tables, has to be recovered joining, on EPPOCode, EPPO_01.xlsx and EPPO_04.xlsx.
Modifications with respect to its original content have to be moved to the Subclass column (My hypotesis!!)
INFC_EPPO_db <- read_xlsx("EPPO_01.xlsx") %>%
dplyr::select(specie, EPPOCode) %>%
full_join(unique(dplyr::select(INFCstats, specie, pag))) %>%
rename(Species_ForIt1 = specie) %>%
inner_join(read_xlsx("EPPO_04.xlsx"), by = c("EPPOCode" = "EPPOCode")) %>%
mutate(Subclass = ifelse(Species_ForIt1 == specie, Subclass, specie)) %>%
dplyr::select(-specie, -specie2)
## Joining, by = "specie"
## Warning: Column `specie` joining character vector and factor, coercing into
## character vector
# just for a quick check
INFC_EPPO_db %>% dplyr::select(1:5) %>% filter(Species_ForIt1 != PrefName)
## # A tibble: 4 x 5
## Species_ForIt1 EPPOCode pag PrefName Synonyms_la
## <chr> <chr> <int> <chr> <chr>
## 1 Eucalyptus spp. 1EUCG 287 Eucalyptus <NA>
## 2 Fraxinus angust~ FRXAN 315 Fraxinus ang~ Fraxinus angustifolia sub~
## 3 Pinus laricio PIUNL 117 Pinus nigra ~ Pinus laricio;Pinus laric~
## 4 Populus canesce~ POPCN 399 Populus x ca~ Populus alba x P. tremula~
There are 26 distinct function groups, estimating stem volume and 4 tree biomass dryweigth components (see below).
The five enities are estimated based on a common pool of samples for each group sharing also model structure.
n_par structure
2 y = b_0 + b_1 * d^2 * h
3 y = b_0 + b_1 * d^2 * h + b_2 * d
Groups are identified through the page number (pag) of the reference publication.
INFCcatalog <- INFCstats %>%
dplyr::select(pag, n_oss, n_par) %>%
unique
The key to the individual functions is given combinig the ‘page number’ with the acronym of the estimated entity.
For each function, beside parameter values, also the var-covar matrix as well as the standard error have been reported in the publication and hence acqired in the data base.
(estimated_entities <- tribble(
~entity, ~entity_definition,
"vol", "volume of the stem and large branches [dm^3]",
"dw1", "phytomass of the stem and large braches [kg]",
"dw2", "phytomass of the small branches [kg]",
"dw3", "phytomass of the stump [kg]",
"dw4", "phytomass of the whole tree [kg]"))
## # A tibble: 5 x 2
## entity entity_definition
## <chr> <chr>
## 1 vol volume of the stem and large branches [dm^3]
## 2 dw1 phytomass of the stem and large braches [kg]
## 3 dw2 phytomass of the small branches [kg]
## 4 dw3 phytomass of the stump [kg]
## 5 dw4 phytomass of the whole tree [kg]
# assemble b-coefficients matix (it is just an array!)
assemble_bm <- function(n, b0, b1, b2) {
if (n == 2) x <- array(c(b0, b1), dim = c(n, 1)) else
if (n == 3) x <- array(c(b0, b1, b2), dim = c(n, 1)) else
x <- NULL
}
# assemble variance-covariance-matix (only the upper triangle is required)
assemble_vcm <- function(n, m1, m3, m4, m5, m7, m8, m9) {
if (n == 2) x <- c(m1, m3, m4) else
if (n == 3) x <- c(m1, m4, m7, m5, m8, m9) else
x <- NULL
return(new("dspMatrix", Dim = as.integer(c(n,n)), x = x))
}
INFCparam <- INFCstats %>%
dplyr::select(pag, mod, b.1., b.2., b.3., mvc.1., mvc.3., mvc.4., mvc.5., mvc.7., mvc.8., mvc.9., saq) %>%
unique
# There is an error: there are pag+mod key replicates
INFCparam %>%
group_by(pag, mod) %>%
summarise(c = n()) %>%
filter(c !=1) %>%
inner_join(INFCparam)
## Joining, by = c("pag", "mod")
## # A tibble: 2 x 14
## # Groups: pag [?]
## pag mod c b.1. b.2. b.3. mvc.1. mvc.3. mvc.4. mvc.5.
## <int> <fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 231 dw2 2 5.33 0.00556 0 0.332 -1.59e-4 2.80e-7 0
## 2 231 dw2 2 5.33 0.00556 0 0.332 -1.59e-4 2.80e-7 0
## # ... with 4 more variables: mvc.7. <dbl>, mvc.8. <dbl>, mvc.9. <dbl>,
## # saq <dbl>
INFCparam <- INFCparam %>%
# correction of error above
filter(!(pag == 231 & mod == "dw2" & saq == 0.00000028)) %>%
add_column(entity = as.character(.$mod), .after = 1) %>%
mutate(entity = replace(entity, entity == "v", "vol")) %>%
dplyr::select(-mod) %>%
inner_join(INFCcatalog) %>%
mutate(bm = pmap(list(n_par, b.1., b.2., b.3.), assemble_bm),
vcm = pmap(list(n_par, mvc.1., mvc.3., mvc.4., mvc.5., mvc.7., mvc.8., mvc.9.), assemble_vcm)) %>%
dplyr::select(pag, entity, saq, bm, vcm)
## Joining, by = "pag"
Interpolation functions produce unreliable extrapolations. It is quite important to verify, for every single estimation, if, at least, input values fall within the domain defined by the calibration sample extent. Actually original calibration dataset is not available, otherwise the density distribution of the observations could be used as a base for this evaluation. In the reference work the only information available concerning applicabilty domain of the functions proposed is given by the extent and limits of the tabulated versions of the functions. Tables charateristics are constant for each species (or species group): minimum amd maximum diameter and height are the same and the distribution of the table cells for which no value is calculated is identical.
library(raster)
library(rgeos)
outOfRangeDistance <- function(mat) {
# mat[!mat] <- NA
# mdrmat <- raster(vals = mat, crs = '+proj=utm +zone=12 +datum=WGS84',
rst <- raster(vals = mat, crs = '+proj=utm +zone=12 +datum=WGS84',
nrows = dim(mat)[1],
ncols = dim(mat)[2],
xmn = range(as.integer(dimnames(mat)[[2]]))[1],
xmx = range(as.integer(dimnames(mat)[[1]]))[2],
ymn = range(as.integer(dimnames(mat)[[1]]))[1],
ymx = range(as.integer(dimnames(mat)[[1]]))[2])
domLim <- rasterToPolygons(rst, fun = function(x) x==T, dissolve = T)
# distance %>%
# as.matrix
# mdrmat[mdrmat == 0] <- NA
# dimnames(mdrmat) <- dimnames(mat)
return(domLim)
}
assemble_dm <- function(d, h) {
# Present domain as a lgl matrix
d_range <- range(as.numeric(d))
h_range <- range(as.numeric(h))
dom <- matrix(F,
nrow = d_range[2] - d_range[1] +1,
ncol = h_range[2] - h_range[1] +1,
dimnames = list(d = c(d_range[1]:d_range[2]),
h = c(h_range[1]:h_range[2]))
)
for (i in 1:length(d)) dom[d[i],h[i]] <- T
return(outOfRangeDistance(dom))
}
INFCdomains <- INFCdomain %>%
mutate(spg = map_chr(key, ~str_split(., "-")[[1]][1]),
dbh = map_chr(key, ~str_split(., "-")[[1]][2]),
h_tot = map_chr(key, ~str_split(., "-")[[1]][3])) %>%
dplyr::select(-key, -in.range) %>%
left_join(mutate(unique(dplyr::select(INFCstats, spg, pag)),
spg = as.character(spg)))
## Joining, by = "spg"
# Some species codes mismatch:
# compare
INFCdomains %>% dplyr::select(spg, pag) %>% unique %>% filter(is.na(pag))
## spg pag
## 1 Cusp NA
## 2 Euoc NA
## 3 Pips NA
# with
INFCstats %>% dplyr::select(spg, pag, specie) %>% unique %>%
filter(str_detect(spg, "^(?:Cu|Eu|Pi)"))
## spg pag specie
## 1 Cuar 47 Cupressus arizonica
## 2 Cuse 47 Cupressus sempervirens
## 3 Eusp 287 Eucalyptus spp.
## 4 Piab 75 Picea abies
## 5 Pice 89 Pinus cembra
## 6 Piha 103 Pinus halepensis
## 7 Pila 117 Pinus laricio
## 8 Pini 131 Pinus nigra
## 9 Pipi 145 Pinus pinaster
## 10 Pipi 159 Pinus pinea
## 11 Pira 187 Pinus radiata
## 12 Pist 187 Pinus strobus
## 13 Pisy 173 Pinus sylvestris
# and hence correct
INFCdomains <- INFCdomains %>%
mutate(
pag = case_when(spg == "Cusp" ~ 47L,
spg == "Euoc" ~ 287L,
spg == "Pips" ~ 145L,
TRUE ~ pag))
INFCdomains %>% dplyr::select(spg, pag) %>% unique %>% filter(is.na(pag))
## [1] spg pag
## <0 rows> (or 0-length row.names)
INFCdomains %>% dplyr::select(pag) %>% unique %>% nrow
## [1] 26
# If 'pag', contains no NA and all 26 entry pages are considered, then we can proceed
INFCdomains <- INFCdomains %>%
unique %>%
group_by(pag) %>%
nest() %>%
mutate(dom = map(data, ~assemble_dm(.$dbh, .$h_tot))) #%>%
# dplyr::select(-data) ## Dominio in ForIT-1, da eliminare conclusa la fase di test
# for domains represented as poligons
# st_distance(x, y, ..., dist_fun, by_element = FALSE, which = "distance", par = 0, tolerance = 0)
## OLD for Domains represented by matrices
# library(magrittr)
# plotDom <- function(x, t) {
# image(as.integer(dimnames(x)[[1]]),
# as.integer(dimnames(x)[[2]]),
# x,
# ylim = rev(range(as.integer(dimnames(x)[[2]]))),
# xlab = 'h_tot',
# ylab = 'dbh',
# col = (terrain.colors(12)),
# main = paste("Page:", pag)
# )
# }
# INFCdomains %$% map2(dom, pag, plotDom)
rmarkdown::render("PreparingUpgrade.Rmd")
markdown::rpubsUpload("ForIt CRAN package - Preparing Upgrade", "PreparingUpgrade.html",
id = "https://api.rpubs.com/api/v1/document/440006/6be3bb33c3004b5cad3cc32d8baae593")
# continueUrl
# [1] "http://rpubs.com/publish/claim/440006/37856ca880654cfcaefc313d451bf664"
# http://rpubs.com/scotti/440006