1 Packages needed

library(WorldFlora)
library(data.table)
library(stringr)

2 Introduction

In previous posts (see here, here and here), I showed how the WorldFlora package (Kindt 2020) can be used to standardize names from GlobalTreeSearch with the taxonomic backbone data from World Flora Online (WFO) or the World Checklist of Vascular Plants (WCVP).

Here I show standardization results for the latest version of GlobalTreeSearch with the latest taxonomic backbone data sets of WFO and WCVP. Some of the scripts also had to be modified because of changes in the more recent versions of the backbone data sets.

3 First modify the names of hybrid species in the downloaded database files

In this document, I use the latest available versions of World Flora Online (WFO, version 2023.12 downloaded from Zenodo) and the World Checklist of Vascular Plants (WCVP, version 11 downloaded from the Kew data depository).

As shown previously for the WCVP, I recommend first using a text editor to replace instances of ' × ' by ' ×' in the WCVP. I now recommend doing the same for the latest (2023) versions of WFO.
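For readers who prefer to script this replacement rather than use a text editor, the idea could be sketched in R as follows. This is only an illustration on a small character vector; the object names (`times`, `hybrid.names`, `fixed.names`) are hypothetical, and applying the same call to a complete backbone file (for example after reading it with readLines()) is an assumption that is not tested here.

```r
# The multiplication sign used as hybrid marker, written as a Unicode escape
times <- "\u00d7"

# Hypothetical example vector; two entries contain ' × ' (space, sign, space)
hybrid.names <- c("Sorbus avonensis",
                  paste0("Sorbus ", times, " avonensis"),
                  paste0("Inga ", times, " andersonii"))

# Replace ' × ' by ' ×' so that the sign is attached to the epithet
fixed.names <- gsub(paste0(" ", times, " "),
                    paste0(" ", times),
                    hybrid.names,
                    fixed = TRUE)

print(fixed.names)
```

Names without the hybrid marker are left unchanged, so the replacement should be safe to apply to an entire column of names.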

4 Load the taxonomic backbone data file of World Flora Online

I used the latest available version of World Flora Online (v.2023.12). The file was downloaded earlier, and its location was then provided to WFO.download via its argument ‘WFO.file’.

(You could use WFO.remember(WFO.file=file.choose()) to be certain that the right version of the WFO backbone is used.)

In the taxonomic backbone, World Flora Online lists about half a million current species names.

WFO.remember()
## Data sourced from: E:\Roeland\WorldFloraOnline\2023\wfo_202312_RK.csv (Thu Dec 28 10:43:25 2023)
## Reading WFO data
## The WFO data is now available from WFO.data
nrow(WFO.data)
## [1] 1576062
nrow(WFO.data[WFO.data$taxonRank == "species", ])
## [1] 1164296
nrow(WFO.data[WFO.data$taxonRank == "species" & WFO.data$acceptedNameUsageID == "", ])
## [1] 517755
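The counting logic used above can be tried on a toy table. The data frame below is invented for illustration (the IDs are not from WFO); it only mimics the two columns used in the counts, where an empty `acceptedNameUsageID` is taken to mark a current name.

```r
# Invented toy backbone with the two columns used for counting
toy.backbone <- data.frame(
  taxonID = c("id-1", "id-2", "id-3", "id-4"),
  taxonRank = c("species", "species", "genus", "species"),
  acceptedNameUsageID = c("", "id-1", "", ""),
  stringsAsFactors = FALSE
)

# Logical vectors for the two conditions
is.species <- toy.backbone$taxonRank == "species"
is.current <- toy.backbone$acceptedNameUsageID == ""

# Number of species-level names, and those among them that are current
n.species <- sum(is.species)
n.current.species <- sum(is.species & is.current)

print(c(species = n.species, current.species = n.current.species))
```

Summing logical vectors gives the same counts as nrow() on the subsetted data, without creating a copy of the subset.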

5 Load the complete list of species of GlobalTreeSearch

The downloadable complete list of tree species (version 1.7) was obtained from Global Tree Search. The list includes 57,922 species names (roughly 11 percent of the current species names in World Flora Online).

# GTS.file <- choose.files()
GTS.file <- "E:\\Roeland\\WorldFloraOnline\\2023\\global_tree_search_trees_1_7.csv"
GTS <- fread(GTS.file, header=TRUE, encoding="UTF-8")
head(GTS)
##               TaxonName                                Author V3 V4
## 1:     Abarema abbottii (Rose & Leonard) Barneby & J.W.Grimes NA NA
## 2:      Abarema acreana                   (J.F.Macbr.) L.Rico NA NA
## 3:   Abarema adenophora          (Ducke) Barneby & J.W.Grimes NA NA
## 4:    Abarema alexandri           (Urb.) Barneby & J.W.Grimes NA NA
## 5: Abarema asplenifolia        (Griseb.) Barneby & J.W.Grimes NA NA
## 6:   Abarema auriculata         (Benth.) Barneby & J.W.Grimes NA NA
##    Citation:  GlobalTreeSearch online database. Botanic Gardens Conservation International. Richmond, U.K. Available at www.bgci.org. Accessed on DD/MM/YYYY. DOI: 10.13140/RG.2.2.34077.79847
## 1:                                                                                                                                                            DOI: 10.13140/RG.2.2.34077.79847
## 2:                                                                                                                                                                                            
## 3:                                                                                                                                                                                            
## 4:                                                                                                                                                                                            
## 5:                                                                                                                                                                                            
## 6:
GTS <- GTS[, c("TaxonName", "Author")]
nrow(GTS)
## [1] 57922

6 Standardize species names with World Flora Online

6.1 Use WFO.match.fuzzyjoin

Everything is in place now to start the matching process. To avoid a crash of the WFO.match.fuzzyjoin() function, however, the data needs to be split. This can be done relatively easily via the cut() function.

It takes about an hour for the matching to be completed…

cuts <- cut(c(1:nrow(GTS)), breaks=20, labels=FALSE)
cut.i <- sort(unique(cuts))

start.time <- Sys.time()

for (i in 1:length(cut.i)) {

cat(paste("Cut: ", i, "\n"))  
    
GTS.i <- WFO.one(WFO.match.fuzzyjoin(spec.data=GTS[cuts==cut.i[i], ],
                                     WFO.data=WFO.data,
                                     spec.name="TaxonName",
                                     Authorship="Author",
                                     fuzzydist.max=3),
                 verbose=FALSE)

if (i==1) {
  GTS.WFO <- GTS.i
}else{
  GTS.WFO <- rbind(GTS.WFO, GTS.i)
}

}
## Cut:  1
## Checking for fuzzy matches for 20 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  2
## Checking for fuzzy matches for 25 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  3
## Checking for fuzzy matches for 21 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  4
## Checking for fuzzy matches for 32 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  5
## Checking for fuzzy matches for 17 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  6
## Checking for fuzzy matches for 19 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  7
## Checking for fuzzy matches for 21 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  8
## Checking for fuzzy matches for 30 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  9
## Checking for fuzzy matches for 24 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  10
## Checking for fuzzy matches for 23 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  11
## Checking for fuzzy matches for 16 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  12
## Checking for fuzzy matches for 26 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  13
## Checking for fuzzy matches for 14 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  14
## Checking for fuzzy matches for 34 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  15
## Checking for fuzzy matches for 25 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  16
## Checking for fuzzy matches for 53 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  17
## Checking for fuzzy matches for 50 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  18
## Checking for fuzzy matches for 32 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  19
## Checking for fuzzy matches for 30 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  20
## Checking for fuzzy matches for 34 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
end.time <- Sys.time()
end.time - start.time # 57.78924 mins
## Time difference of 59.00002 mins
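The splitting pattern used in this section can be explored on a small data set before committing to an hour-long run. The sketch below uses an invented data frame and a hypothetical placeholder function (`process.chunk`) instead of the actual matching; it collects the pieces with lapply() and binds them once at the end, which is an alternative to growing the result with rbind() inside the loop.

```r
# Toy stand-in for the species list (names are placeholders)
toy <- data.frame(TaxonName = paste("Genus", letters[1:10]),
                  stringsAsFactors = FALSE)

# Assign each row to one of 4 consecutive chunks of similar size
chunk.id <- cut(seq_len(nrow(toy)), breaks = 4, labels = FALSE)

# Hypothetical placeholder for the expensive matching step
process.chunk <- function(chunk) {
  chunk$n.char <- nchar(chunk$TaxonName)
  chunk
}

# Process chunk by chunk, then bind the pieces together once
pieces <- lapply(split(toy, chunk.id), process.chunk)
result <- do.call(rbind, pieces)

print(table(chunk.id))
print(nrow(result))
```

Because the chunk numbers increase with the row index, the combined result keeps the original row order.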

6.2 Breakdown of matches

The results can be subdivided into species that could not be matched, species that could be directly matched and species with fuzzy matches.

# not matched
nrow(GTS.WFO[GTS.WFO$Matched == FALSE, ])
## [1] 8
# directly matched
nrow(GTS.WFO[GTS.WFO$Matched == TRUE & GTS.WFO$Fuzzy == FALSE, ])
## [1] 57376
GTS.fuzzy <- GTS.WFO[GTS.WFO$Fuzzy == TRUE, ]
nrow(GTS.fuzzy)
## [1] 538
nrow(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 1, ])
## [1] 369
nrow(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 2, ])
## [1] 160
nrow(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 3, ])
## [1] 9
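As a quick consistency check, the counts reported above should add up to the number of rows in the GlobalTreeSearch list (57922), and the three fuzzy distance classes should add up to the number of fuzzy matches. The sketch below simply re-enters the printed counts by hand; the object names are hypothetical.

```r
# Counts copied from the output above
counts <- c(not.matched = 8, direct = 57376, fuzzy = 538)
fuzzy.by.dist <- c(dist1 = 369, dist2 = 160, dist3 = 9)

# The three categories should cover every row of GTS
total <- sum(counts)
stopifnot(total == 57922)

# The distance classes should cover every fuzzy match
stopifnot(sum(fuzzy.by.dist) == counts[["fuzzy"]])

# Share of each category in percent
print(round(100 * counts / total, 2))
```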

6.3 Check for likely fuzzy matches

With the WFO.acceptable.match() function, fuzzy matches can be identified where the submitted and matched names differ only in gender endings or only in vowels.

accept.var <- WFO.acceptable.match(GTS.fuzzy,
                                   spec.name="TaxonName",
                                   no.vowels=TRUE)

GTS.fuzzy <- data.frame(GTS.fuzzy,
                        acceptable=accept.var)

Most of the fuzzy matches can be accepted.

nrow(GTS.fuzzy[GTS.fuzzy$acceptable == TRUE, ])
## [1] 428
head(GTS.fuzzy[GTS.fuzzy$acceptable == TRUE, 
               c("TaxonName", "scientificName", "Old.name")])
##                   TaxonName        scientificName Old.name
## 120   Abrahamia itremoensis Abrahamia itromoensis         
## 1054   Acropogon calcicolus   Acropogon calcicola         
## 1239 Adelobotrys macranthus Adelobotrys macrantha         
## 1488   Aeschynomene burttii Aeschynomene burttiie         
## 1606       Ageratina urbani     Ageratina urbanii         
## 1757         Aidia congesta       Aidia congestum
tail(GTS.fuzzy[GTS.fuzzy$acceptable == TRUE, 
               c("TaxonName", "scientificName", "Old.name")])
##                    TaxonName        scientificName Old.name
## 253617  Xylopia subdehiscens Xylopia sub-dehiscens         
## 258221    Xylosma kaalaensis     Xylosma kaalensis         
## 266618    Zabelia tyaihyonii     Zabelia tyaihyoni         
## 267816 Zanthoxylum amapaense  Zanthoxylum amapense         
## 286121   Ziziphus cambodiana  Ziziphus cambodianus         
## 296819  Zygocarpum caeruleum  Zygocarpum coeruleum
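To give an intuition for what a vowel-only difference means, the sketch below compares two names after removing the vowels a, e, i, o and u. It is a simplified illustration with hypothetical helper functions and is not the code used inside WFO.acceptable.match(), which applies additional rules.

```r
# Simplified illustration only; NOT the implementation of WFO.acceptable.match()
strip.vowels <- function(x) {
  gsub("[aeiou]", "", tolower(x))
}

# TRUE when two names are identical once vowels are ignored
same.without.vowels <- function(name1, name2) {
  strip.vowels(name1) == strip.vowels(name2)
}

# A pair listed among the acceptable matches above
vowel.only <- same.without.vowels("Zygocarpum caeruleum", "Zygocarpum coeruleum")

# A pair listed among the matches that need a manual check
consonant.change <- same.without.vowels("Barringtonia augusta", "Barringtonia angusta")

print(c(vowel.only = vowel.only, consonant.change = consonant.change))
```

The second pair differs by a consonant (u versus n), which is why a comparison of this kind would not accept it automatically.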

Many of the remaining fuzzy matches could also be accepted after a manual check.

nrow(GTS.fuzzy[GTS.fuzzy$acceptable == FALSE, ])
## [1] 110
GTS.fuzzy[GTS.fuzzy$acceptable == FALSE & GTS.fuzzy$Fuzzy.dist==1, 
               c("TaxonName", "scientificName", "Old.name")]
##                              TaxonName                  scientificName
## 1348         Adinandra macquilingensis        Adinandra maquilingensis
## 1428             Aegiphila luschnathii            Aegiphila luschnatii
## 1492   Aeschynomene pararubrofarinacea Aeschynomene pararuhrofarinacea
## 6011             Arachnothryx chaconii           Arachnothryx chaconis
## 10261             Ardisia labisiifolia           Ardisia labrisiifolia
## 13301                Arytera litoralis              Arytera littoralis
## 22671             Barringtonia augusta            Barringtonia angusta
## 9882               Buchanania evrardii            Buchanania ×evrardii
## 18613           Calophyllum touranense         Calophyllum tournanense
## 6913          Castanopsis catappifolia        Castanopsis catalpifolia
## 14114            Castanopsis motleyana          Castanopsis mottleyana
## 14983            Cinnamomum malabatrum          Cinnamomum malabathrum
## 16515            Citharexylum mocinnoi            Citharexylum mocinoi
## 3154            Comocladia ebrenbergii          Comocladia ehrenbergii
## 6054               Coprosma tahitensis              Coprosma taitensis
## 9795           Dendropanax langsdorfii        Dendropanax langsdorffii
## 12865            Dicoryphe buddleoides          Dicoryphe buddlejoides
## 15735          Diospyros baloen-idjoek         Diospyros baloen-ldjoek
## 17375               Diospyros exculpta             Diospyros exsculpta
## 16420              Euclinia squamifera            Euclinia ×squamifera
## 6836                 Eugenia marshiana               Eugenia marchiana
## 11267                Euodia nishimurae             Melicope nishimurae
## 15157                   Eurya leptanta                 Eurya leptantha
## 15696               Eurya sandwicensis             Eurya sandwichensis
## 25606            Ficus scott-elliottii            Ficus scott-elliotii
## 28186              Fordia brachybotrys             Fordia brachybotrya
## 30602            Furcraea macdougallii            Furcraea macdougalii
## 28228                Garcinia matsudae                Garcinia matudae
## 29928              Garcinia moseleyana            Garcinia moselleyana
## 3848              Garcinia schlecbteri            Garcinia schlechteri
## 6388            Geissanthus serrulatus           Geissanthus serulatus
## 11468              Gmelina leichardtii            Gmelina leichhardtii
## 14997      Graffenrieda conostegioides     Graffenrieda comostegioides
## 5449                 Hibbertia wagapii               Hibbertia wayapii
## 6189                Hibiscus tiliaceus             Hibiscus tilliaceus
## 7068            Hirtella brachystachya          Hirtella brachystachys
## 151310          Hymenodictyon perrieri           Hymenodictyon perieri
## 18757                       Ilex linii                      Ilex limii
## 21709                  Inga andersonii                Inga ×andersonii
## 138710                Licaria cannella                 Licaria canella
## 256114               Luehea conwentzii               Luehea conwentsii
## 17180            Machilus minkweiensis            Machilus mikweiensis
## 61213           Magnolia argyrothricha           Magnolia argyrotricha
## 63811          Magnolia caricifragrans          Magnolia caricifragans
## 77911               Magnolia macclurei               Magnolia maclurei
## 189410         Melanochyla fulvinervis         Melanochyla fulvinervia
## 266311            Mezoneuron kavaiense            Mezoneuron kauaiense
## 277710            Miconia buddlejoides             Miconia budlejoides
## 287810                 Miconia doniana                 Miconia doriana
## 7850               Miconia paspaloides            Miconia paspaploides
## 86214       Moldenhawera luschnathiana       Moldenhawera lushnathiana
## 176116            Myrcia dolichopetala             Myrcia dolicopetala
## 254313          Pancheria xaragurensis           Pancheria xaraguensis
## 38611              Pavetta yambatensis             Pavetta yamhatensis
## 88813          Phellocalyx vollescenii          Phellocalyx vollesenii
## 96314                 Phoebe mathewsii               Phoebe matthewsii
## 237118            Plectroniella armata                Canthium armatum
## 262514            Plumeria trouinensis           Plumeria ×stenopetala
## 289215             Polyosma ilicifolia            Polyosma illicifolia
## 203615          Psychotria uncumariana           Psychotria ucumariana
## 256515            Pyrus tamamschiannae             Pyrus tamamschianae
## 267119              Quararibea mayanum              Quararibea mayarum
## 31141             Quercus shangxiensis             Quercus shanxiensis
## 116315                Sorbus avonensis               Sorbus ×avonensis
## 135816                 Sorbus roopiana                Sorbus ×roopiana
## 177316         Stenostomum albobruneum        Stenostomum albobrunneum
## 228317                  Styrax obassis                  Styrax obassia
## 169100              Syzygium dealbatum               Syzygium dealatum
## 33017            Syzygium hullettianum            Syzygium hulletianum
## 52517                 Syzygium naiadum                Syzygium najadum
## 55121            Syzygium normanbiensc           Syzygium normanbiense
## 94616            Syzygium vanderwateri            Syzygium vandewateri
## 96617              Syzygium vidalianum             Syzygium vidaliarum
## 197219        Ternstroemia conicocarpa         Ternstroemia coniocarpa
## 274016            Tricalysia elliottii             Tricalysia elliotii
## 53718               Vaccinium dubiosum              Vaccinium duhiosum
## 196519           Wendlandia buddleacea          Wendlandia buddlejacea
## 226219      Xanthophyllum schizocarpon      Xanthophyllum schixocarpon
##                    Old.name
## 1348                       
## 1428                       
## 1492                       
## 6011                       
## 10261                      
## 13301                      
## 22671                      
## 9882                       
## 18613                      
## 6913                       
## 14114                      
## 14983                      
## 16515                      
## 3154                       
## 6054                       
## 9795                       
## 12865                      
## 15735                      
## 17375                      
## 16420                      
## 6836                       
## 11267     Evodia nishimurae
## 15157                      
## 15696                      
## 25606                      
## 28186                      
## 30602                      
## 28228                      
## 29928                      
## 3848                       
## 6388                       
## 11468                      
## 14997                      
## 5449                       
## 6189                       
## 7068                       
## 151310                     
## 18757                      
## 21709                      
## 138710                     
## 256114                     
## 17180                      
## 61213                      
## 63811                      
## 77911                      
## 189410                     
## 266311                     
## 277710                     
## 287810                     
## 7850                       
## 86214                      
## 176116                     
## 254313                     
## 38611                      
## 88813                      
## 96314                      
## 237118  Plectroniella amata
## 262514 Plumeria trouinenais
## 289215                     
## 203615                     
## 256515                     
## 267119                     
## 31141                      
## 116315                     
## 135816                     
## 177316                     
## 228317                     
## 169100                     
## 33017                      
## 52517                      
## 55121                      
## 94616                      
## 96617                      
## 197219                     
## 274016                     
## 53718                      
## 196519                     
## 226219
GTS.fuzzy[GTS.fuzzy$acceptable == FALSE & GTS.fuzzy$Fuzzy.dist==2, 
               c("TaxonName", "scientificName", "Old.name")]
##                           TaxonName              scientificName
## 25451        Beguea tsaratananensis        Beguea tsaratanensis
## 10052         Buchanania sessifolia     Buchanania sessilifolia
## 22902          Canarium multinervis         Canarium multinerve
## 23282             Canarium subtilis            Canarium subtile
## 26662         Coccoloba ramosissima     Coccoloba ramosisissima
## 7295   Decarydendron ranomafanensis Decarydendron ranomafanense
## 18286         Eriocoelum dzangensis        Eriocoelum dzangense
## 24127           Euadenia trifoliata           Crateva monticola
## 154112  Grazielodendron riodocensis  Grazielodendron riodocense
## 222411        Memecylon arnhemensis        Memecylon arnhemense
## 116012     Monteverdia schummaniana       Maytenus schumanniana
## 53511           Pentaceras australe        Pentaceras australis
## 161915        Stadtmannia acuminata         Stadmania acuminata
## 162017          Stadtmannia excelsa           Stadmania excelsa
## 162121           Stadtmannia glauca            Stadmania glauca
## 162217         Stadtmannia leandrii          Stadmania leandrii
## 162317    Stadtmannia oppositifolia     Stadmania oppositifolia
## 162417        Stadtmannia serrulata         Stadmania serrulata
## 36617        Syzygium kanneliyensis       Syzygium kanneliyense
## 202017       Ternstroemia oleifolia      Ternstroemia alnifolia
## 208616       Tetrachyron orizabense    Tetrachyron orizabaensis
## 270717              Trema orientale            Trema orientalis
## 293219    Tridesmostemon congoensis    Tridesmostemon congoense
## 186100            Turraea laciniata           Turraea laciniosa
## 222219          Xanthophyllum laeve        Xanthophyllum laevis
##                        Old.name
## 25451                          
## 10052                          
## 22902                          
## 23282                          
## 26662                          
## 7295                           
## 18286                          
## 24127     Euadenia trifoliolata
## 154112                         
## 222411                         
## 116012 Monteverdia schumanniana
## 53511                          
## 161915                         
## 162017                         
## 162121                         
## 162217                         
## 162317                         
## 162417                         
## 36617                          
## 202017                         
## 208616  Tetrachyron orizabensis
## 270717                         
## 293219                         
## 186100                         
## 222219
GTS.fuzzy[GTS.fuzzy$acceptable == FALSE & GTS.fuzzy$Fuzzy.dist==3, 
               c("TaxonName", "scientificName", "Old.name")]
##                           TaxonName             scientificName
## 23842  Capparidastrum cuatrecasanum  Morisonia cuatrecasasiana
## 13783      Cinnamomum austrosinense Cinnamomum austro-sinensis
## 23095          Eschweilera jefensis      Eschweilera juruensis
## 160511      Matisia cuatrecasasiana       Matisia cuatrecasana
## 26790      Portulacaria carrissoana      Portulaca carrissoana
## 258615             Quadrella indica                Cordia myxa
## 31271           Quercus sontraensis    Lithocarpus sootepensis
##                               Old.name
## 23842  Capparidastrum cuatrecasasianum
## 13783                                 
## 23095                                 
## 160511                                
## 26790                                 
## 258615                  Quarena indica
## 31271              Quercus sootepensis

7 Standardize species names with the World Checklist of Vascular Plants

Instead of the taxonomic backbone of World Flora Online, we will now use the taxonomic backbone of the World Checklist of Vascular Plants (WCVP). Note that the file was first modified to replace instances of ' × ' by ' ×'.

7.1 Create a new backbone data set

# WCVP.file <- choose.files()
WCVP.file <- "E:\\Roeland\\WorldFloraOnline\\2023\\wcvp_names_RK.csv"
WCVP.data <- fread(WCVP.file, header=TRUE, encoding="UTF-8")
## Warning in fread(WCVP.file, header = TRUE, encoding = "UTF-8"): Found and
## resolved improper quoting out-of-sample. First healed line 11102: <<329638|
## 328672-2|Variety|Synonym|Lamiaceae||Ajuga||decumbens|var.|vegeta||Honda||Bot.
## Mag. (Tokyo)| 45: 299|(1931)||"Hondo: Agematsu, prov. Shinano", Japan, Eastern
## Asia, Asia-Temperate|||Ajuga decumbens var. vegeta|Honda|5322|||||328672-2||Y>>.
## If the fields are not quoted (e.g. field separator does not appear within any
## field), try quote="" to avoid this warning.
head(WCVP.data)
##    plant_name_id  ipni_id taxon_rank taxon_status    family genus_hybrid
## 1:        195508 243233-2    Species      Synonym Lamiaceae             
## 2:        197585 767122-1    Species      Synonym Rubiaceae             
## 3:         76791 595920-1    Species      Synonym Myrtaceae             
## 4:         74373 593644-1    Species      Synonym Myrtaceae             
## 5:        205204 884387-1 Subspecies      Synonym Lamiaceae             
## 6:        102745 326216-2      Genus      Synonym Lamiaceae             
##          genus species_hybrid     species infraspecific_rank infraspecies
## 1:     Stachys                  pustulosa                                
## 2: Stenostomum                 dichotomum                                
## 3:     Eugenia                   scoparia                                
## 4:     Eugenia                   areolata                                
## 5:      Thymus                pallasianus             subsp.   brachyodon
## 6:    Isanthus                                                           
##    parenthetical_author primary_author publication_author place_of_publication
## 1:                               Rydb.                               Brittonia
## 2:                                 DC.                                  Prodr.
## 3:                              Duthie         J.D.Hooker      Fl. Brit. India
## 4:                  DC.         Duthie         J.D.Hooker      Fl. Brit. India
## 5:               Borbás          Jalas                      Bot. J. Linn. Soc.
## 6:                              Michx.                          Fl. Bor.-Amer.
##    volume_and_page first_published nomenclatural_remarks geographic_area
## 1:           1: 95          (1931)                                      
## 2:          4: 461          (1830)                                      
## 3:          2: 489          (1878)                                      
## 4:          2: 490          (1878)                                      
## 5:         64: 262          (1971)                                      
## 6:            2: 3          (1803)                                      
##    lifeform_description climate_description
## 1:                                         
## 2:                                         
## 3:                                         
## 4:                                         
## 5:                                         
## 6:                                         
##                              taxon_name  taxon_authors accepted_plant_name_id
## 1:                    Stachys pustulosa          Rydb.                 195467
## 2:               Stenostomum dichotomum            DC.                 197582
## 3:                     Eugenia scoparia         Duthie                 199254
## 4:                     Eugenia areolata   (DC.) Duthie                 200472
## 5: Thymus pallasianus subsp. brachyodon (Borbás) Jalas                 204938
## 6:                             Isanthus         Michx.                 208472
##    basionym_plant_name_id replaced_synonym_author homotypic_synonym
## 1:                     NA                                          
## 2:                     NA                                          
## 3:                     NA                                          
## 4:                 199231                                          
## 5:                 204517                                          
## 6:                     NA                                          
##    parent_plant_name_id  powo_id hybrid_formula reviewed
## 1:                   NA 243233-2                       Y
## 2:                   NA 767122-1                       Y
## 3:                   NA 595920-1                       Y
## 4:                   NA 593644-1                       Y
## 5:                   NA 884387-1                       Y
## 6:                   NA 326216-2                       Y
tail(WCVP.data)
##    plant_name_id  ipni_id taxon_rank taxon_status     family genus_hybrid
## 1:       3297495          Subspecies     Accepted   Fabaceae             
## 2:        437859 204824-2    Species     Accepted    Poaceae             
## 3:       2510011 946536-1    Species     Accepted   Rutaceae             
## 4:       2580958 940927-1    Species     Accepted  Malvaceae             
## 5:       3052324 978137-1    Species     Accepted Asteraceae             
## 6:       3052326 985860-1    Species     Accepted Asteraceae             
##            genus species_hybrid        species infraspecific_rank infraspecies
## 1:    Onobrychis                          alba             subsp.    pentelica
## 2:       Poidium                     calotheca                                
## 3:      Melicope                 zahlbruckneri                                
## 4:     Sterculia                   tantraensis                                
## 5: Vernonanthura                santacruzensis                                
## 6: Vernonanthura                    schulziana                                
##    parenthetical_author          primary_author publication_author
## 1:             Hausskn.                   Nyman                   
## 2:                Trin.                 Matthei                   
## 3:                 Rock T.G.Hartley & B.C.Stone                   
## 4:                                        Morat                   
## 5:              Hieron.                  H.Rob.                   
## 6:              Cabrera                  H.Rob.                   
##                         place_of_publication   volume_and_page
## 1:                           Consp. Fl. Eur. , Suppl. 2(1): 99
## 2:                        Willdenowia, Beih.            8: 116
## 3:                                     Taxon           38: 122
## 4: Bull. Mus. Natl. Hist. Nat., B, Adansonia            8: 362
## 5:                                Phytologia            76: 29
## 6:                                Phytologia           78: 386
##      first_published nomenclatural_remarks
## 1:            (1889)                      
## 2:            (1975)                      
## 3:            (1989)                      
## 4: (1986 publ. 1987)                      
## 5:            (1994)                      
## 6:            (1995)                      
##                                    geographic_area lifeform_description
## 1:                           S. Italy, Balkan Pen.            perennial
## 2:      Colombia, SE. & S. Brazil to NE. Argentina            perennial
## 3:                                    Hawaiian Is.                 tree
## 4:                                        Papuasia                 tree
## 5:                            Bolivia (Santa Cruz)                     
## 6: Brazil (Rio Grande do Sul) to Argentina (Chaco)                     
##    climate_description                       taxon_name
## 1:           temperate Onobrychis alba subsp. pentelica
## 2:         subtropical                Poidium calotheca
## 3:        wet tropical           Melicope zahlbruckneri
## 4:        wet tropical            Sterculia tantraensis
## 5:                         Vernonanthura santacruzensis
## 6:                             Vernonanthura schulziana
##                     taxon_authors accepted_plant_name_id basionym_plant_name_id
## 1:               (Hausskn.) Nyman                3297495                2389037
## 2:                (Trin.) Matthei                 437859                 412327
## 3: (Rock) T.G.Hartley & B.C.Stone                2510011                2541635
## 4:                          Morat                2580958                2552044
## 5:               (Hieron.) H.Rob.                3052324                3052322
## 6:               (Cabrera) H.Rob.                3052326                3052325
##    replaced_synonym_author homotypic_synonym parent_plant_name_id   powo_id
## 1:                                                        2389084 3297495-4
## 2:                                                         437855  204824-2
## 3:                                                        2509107  946536-1
## 4:                Lauterb.                                2579857  940927-1
## 5:                                                        3134351  978137-1
## 6:                                                        3134351  985860-1
##    hybrid_formula reviewed
## 1:                       Y
## 2:                       Y
## 3:                       N
## 4:                       N
## 5:                       N
## 6:                       N
WCVP.data <- new.backbone(WCVP.data, 
                          taxonID="plant_name_id",
                          scientificName="taxon_name",
                          scientificNameAuthorship="taxon_authors",
                          acceptedNameUsageID = "accepted_plant_name_id",
                          taxonomicStatus = "taxon_status")
head(WCVP.data)
##    taxonID                       scientificName scientificNameAuthorship
## 1:  195508                    Stachys pustulosa                    Rydb.
## 2:  197585               Stenostomum dichotomum                      DC.
## 3:   76791                     Eugenia scoparia                   Duthie
## 4:   74373                     Eugenia areolata             (DC.) Duthie
## 5:  205204 Thymus pallasianus subsp. brachyodon           (Borbás) Jalas
## 6:  102745                             Isanthus                   Michx.
##    acceptedNameUsageID taxonomicStatus plant_name_id  ipni_id taxon_rank
## 1:              195467         Synonym        195508 243233-2    Species
## 2:              197582         Synonym        197585 767122-1    Species
## 3:              199254         Synonym         76791 595920-1    Species
## 4:              200472         Synonym         74373 593644-1    Species
## 5:              204938         Synonym        205204 884387-1 Subspecies
## 6:              208472         Synonym        102745 326216-2      Genus
##    taxon_status    family genus_hybrid       genus species_hybrid     species
## 1:      Synonym Lamiaceae                  Stachys                  pustulosa
## 2:      Synonym Rubiaceae              Stenostomum                 dichotomum
## 3:      Synonym Myrtaceae                  Eugenia                   scoparia
## 4:      Synonym Myrtaceae                  Eugenia                   areolata
## 5:      Synonym Lamiaceae                   Thymus                pallasianus
## 6:      Synonym Lamiaceae                 Isanthus                           
##    infraspecific_rank infraspecies parenthetical_author primary_author
## 1:                                                               Rydb.
## 2:                                                                 DC.
## 3:                                                              Duthie
## 4:                                                  DC.         Duthie
## 5:             subsp.   brachyodon               Borbás          Jalas
## 6:                                                              Michx.
##    publication_author place_of_publication volume_and_page first_published
## 1:                               Brittonia           1: 95          (1931)
## 2:                                  Prodr.          4: 461          (1830)
## 3:         J.D.Hooker      Fl. Brit. India          2: 489          (1878)
## 4:         J.D.Hooker      Fl. Brit. India          2: 490          (1878)
## 5:                      Bot. J. Linn. Soc.         64: 262          (1971)
## 6:                          Fl. Bor.-Amer.            2: 3          (1803)
##    nomenclatural_remarks geographic_area lifeform_description
## 1:                                                           
## 2:                                                           
## 3:                                                           
## 4:                                                           
## 5:                                                           
## 6:                                                           
##    climate_description                           taxon_name  taxon_authors
## 1:                                        Stachys pustulosa          Rydb.
## 2:                                   Stenostomum dichotomum            DC.
## 3:                                         Eugenia scoparia         Duthie
## 4:                                         Eugenia areolata   (DC.) Duthie
## 5:                     Thymus pallasianus subsp. brachyodon (Borbás) Jalas
## 6:                                                 Isanthus         Michx.
##    accepted_plant_name_id basionym_plant_name_id replaced_synonym_author
## 1:                 195467                   <NA>                        
## 2:                 197582                   <NA>                        
## 3:                 199254                   <NA>                        
## 4:                 200472                 199231                        
## 5:                 204938                 204517                        
## 6:                 208472                   <NA>                        
##    homotypic_synonym parent_plant_name_id  powo_id hybrid_formula reviewed
## 1:                                   <NA> 243233-2                       Y
## 2:                                   <NA> 767122-1                       Y
## 3:                                   <NA> 595920-1                       Y
## 4:                                   <NA> 593644-1                       Y
## 5:                                   <NA> 884387-1                       Y
## 6:                                   <NA> 326216-2                       Y
nrow(WCVP.data)
## [1] 1419265
nrow(WCVP.data[WCVP.data$taxon_rank == "Species", ])
## [1] 1028856
nrow(WCVP.data[WCVP.data$taxon_rank == "Species" & WCVP.data$acceptedNameUsageID == "", ])
## [1] 0

7.2 Change the acceptedNameUsageID where equal to the taxonID

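The WFO.match.fuzzyjoin() function runs faster when the acceptedNameUsageID field is left blank for names that are their own accepted name, i.e. where the acceptedNameUsageID is the same as the taxonID. In the WCVP, this applies to names with the status Accepted, Artificial Hybrid or Local Biotype.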

nrow(WCVP.data[WCVP.data$taxonomicStatus == "Accepted", ])
## [1] 425349
nrow(WCVP.data[WCVP.data$taxonID == WCVP.data$acceptedNameUsageID, ])
## [1] 430633
unique(WCVP.data[WCVP.data$taxonID == WCVP.data$acceptedNameUsageID, "taxonomicStatus"])
##      taxonomicStatus
## 1:          Accepted
## 2: Artificial Hybrid
## 3:     Local Biotype
head(WCVP.data[WCVP.data$taxonID == WCVP.data$acceptedNameUsageID & WCVP.data$taxonomicStatus == "Accepted", ])[, 1:5]
##    taxonID       scientificName scientificNameAuthorship acceptedNameUsageID
## 1:  425448              Narenga                      Bor              425448
## 2: 2503628               Malope                       L.             2503628
## 3: 2545738 Pouzolzia variifolia      Friis & Wilmot-Dear             2545738
## 4: 2677268            Blepharis                    Juss.             2677268
## 5: 2623037               Adonis                       L.             2623037
## 6: 2646596    Aragoa cupressina                    Kunth             2646596
##    taxonomicStatus
## 1:        Accepted
## 2:        Accepted
## 3:        Accepted
## 4:        Accepted
## 5:        Accepted
## 6:        Accepted
head(WCVP.data[WCVP.data$taxonID == WCVP.data$acceptedNameUsageID & WCVP.data$taxonomicStatus == "Artificial Hybrid", ])[, 1:5]
##    taxonID    scientificName scientificNameAuthorship acceptedNameUsageID
## 1: 2756546      × Daltadenia                  Wiehler             2756546
## 2:  372890       × Geoclades                   A.Chen              372890
## 3:  372411 × Christendoritis               J.M.H.Shaw              372411
## 4: 2750311    × Cylindrantha                    Y.Itô             2750311
## 5: 3293147          × Melara       M.H.J.van der Meer             3293147
## 6:  372403   × Constanciaara               J.M.H.Shaw              372403
##      taxonomicStatus
## 1: Artificial Hybrid
## 2: Artificial Hybrid
## 3: Artificial Hybrid
## 4: Artificial Hybrid
## 5: Artificial Hybrid
## 6: Artificial Hybrid
head(WCVP.data[WCVP.data$taxonID == WCVP.data$acceptedNameUsageID & WCVP.data$taxonomicStatus == "Local Biotype", ])[, 1:5]
##    taxonID        scientificName scientificNameAuthorship acceptedNameUsageID
## 1: 2976323       Rubus abietinus                    Sudre             2976323
## 2: 2976769         Rubus amoenus          Köhler ex Weihe             2976769
## 3: 2976814      Rubus amplifrons                    Sudre             2976814
## 4: 2976876    Rubus angustifrons                    Sudre             2976876
## 5: 2976888 Rubus anisacanthoides                    Sudre             2976888
## 6: 2977350     Rubus bakonyensis                    Gáyer             2977350
##    taxonomicStatus
## 1:   Local Biotype
## 2:   Local Biotype
## 3:   Local Biotype
## 4:   Local Biotype
## 5:   Local Biotype
## 6:   Local Biotype
WCVP.data[WCVP.data$taxonID == WCVP.data$acceptedNameUsageID, "acceptedNameUsageID"] <- ""

head(WCVP.data[WCVP.data$taxonomicStatus == "Accepted", ])[, 1:5]
##    taxonID       scientificName scientificNameAuthorship acceptedNameUsageID
## 1:  425448              Narenga                      Bor                    
## 2: 2503628               Malope                       L.                    
## 3: 2545738 Pouzolzia variifolia      Friis & Wilmot-Dear                    
## 4: 2677268            Blepharis                    Juss.                    
## 5: 2623037               Adonis                       L.                    
## 6: 2646596    Aragoa cupressina                    Kunth                    
##    taxonomicStatus
## 1:        Accepted
## 2:        Accepted
## 3:        Accepted
## 4:        Accepted
## 5:        Accepted
## 6:        Accepted
unique(WCVP.data[WCVP.data$taxonomicStatus == "Accepted", "acceptedNameUsageID"])
##    acceptedNameUsageID
## 1:
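
The blanking step can be tried on a small table first. This is a minimal sketch in base R, assuming the IDs are stored as character strings; the object name toy is only for illustration, and the IDs and statuses are copied from the output above.

```r
# Toy backbone with two self-referencing names and one synonym
toy <- data.frame(
  taxonID = c("425448", "195508", "2976323"),
  acceptedNameUsageID = c("425448", "195467", "2976323"),
  taxonomicStatus = c("Accepted", "Synonym", "Local Biotype"),
  stringsAsFactors = FALSE
)

# Rows where the name points to itself as the accepted name
self.ref <- toy$taxonID == toy$acceptedNameUsageID

# Blank the field for those rows only; the synonym keeps its pointer
toy$acceptedNameUsageID[self.ref] <- ""

toy
```

Only the synonym row still carries an acceptedNameUsageID afterwards, which is the pattern that the check with unique() above confirms for the full WCVP table.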

7.3 Use WFO.match.fuzzyjoin

We can now use scripts similar to those used above with the taxonomic backbone of World Flora Online, this time supplying WCVP.data as the WFO.data argument.

cuts <- cut(c(1:nrow(GTS)), breaks=20, labels=FALSE)
cut.i <- sort(unique(cuts))

start.time <- Sys.time()

for (i in 1:length(cut.i)) {

cat(paste("Cut: ", i, "\n"))  
    
GTS.i <- WFO.one(WFO.match.fuzzyjoin(spec.data=GTS[cuts==cut.i[i], ],
                                     WFO.data=WCVP.data,
                                     spec.name="TaxonName",
                                     Authorship="Author",
                                     fuzzydist.max=3),
                 verbose=FALSE)

if (i==1) {
  GTS.WCVP <- GTS.i
}else{
  GTS.WCVP <- rbind(GTS.WCVP, GTS.i)
}

}
## Cut:  1
## Checking for fuzzy matches for 16 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  2
## Checking for fuzzy matches for 17 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  3
## Checking for fuzzy matches for 16 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  4
## Checking for fuzzy matches for 19 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  5
## Checking for fuzzy matches for 14 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  6
## Checking for fuzzy matches for 16 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  7
## Checking for fuzzy matches for 21 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  8
## Checking for fuzzy matches for 10 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  9
## Checking for fuzzy matches for 22 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  10
## Checking for fuzzy matches for 19 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  11
## Checking for fuzzy matches for 19 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  12
## Checking for fuzzy matches for 83 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  13
## Checking for fuzzy matches for 16 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  14
## Checking for fuzzy matches for 16 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  15
## Checking for fuzzy matches for 22 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  16
## Checking for fuzzy matches for 35 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  17
## Checking for fuzzy matches for 31 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  18
## Checking for fuzzy matches for 23 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  19
## Checking for fuzzy matches for 18 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut:  20
## Checking for fuzzy matches for 18 records
## 
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
end.time <- Sys.time()
end.time - start.time
## Time difference of 57.85401 mins
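
The chunking pattern of the loop above can be checked quickly on toy data before committing to a run of almost an hour. This is a minimal sketch: match.chunk() is a hypothetical stand-in for the WFO.one(WFO.match.fuzzyjoin(...)) call, and the species list is made up. It also collects the chunks in a list and binds them once at the end, which is an alternative to growing the result with rbind() inside the loop.

```r
# Hypothetical stand-in for WFO.one(WFO.match.fuzzyjoin(...));
# it only tags each row so that the chunking logic can be run anywhere
match.chunk <- function(chunk) {
  data.frame(chunk, Matched = TRUE)
}

# Toy species list of 45 rows (not the real GTS data)
spec <- data.frame(TaxonName = paste("Genus", sprintf("species%02d", 1:45)),
                   stringsAsFactors = FALSE)

# Assign each row to one of 5 consecutive blocks
cuts <- cut(seq_len(nrow(spec)), breaks = 5, labels = FALSE)

# Process each block and collect the results in a list
results <- lapply(sort(unique(cuts)), function(k) {
  match.chunk(spec[cuts == k, , drop = FALSE])
})

# Bind once at the end
combined <- do.call(rbind, results)

# Every input row should come back exactly once, in the original order
nrow(combined) == nrow(spec)
identical(combined$TaxonName, spec$TaxonName)
```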

7.4 Breakdown of matches

The results can be subdivided into species that could not be matched, species that could be directly matched and species with fuzzy matches.

Considerably more species could not be matched with the WCVP than previously with WFO (47, compared with 8 for WFO), and more species also had fuzzy matches.

# not matched
nrow(GTS.WCVP[GTS.WCVP$Matched == FALSE, ])
## [1] 47
# directly matched
nrow(GTS.WCVP[GTS.WCVP$Matched == TRUE & GTS.WCVP$Fuzzy == FALSE, ])
## [1] 57471
GTS.fuzzy2 <- GTS.WCVP[GTS.WCVP$Fuzzy == TRUE, ]
nrow(GTS.fuzzy2)
## [1] 404
nrow(GTS.fuzzy2[GTS.fuzzy2$Fuzzy.dist == 1, ])
## [1] 220
nrow(GTS.fuzzy2[GTS.fuzzy2$Fuzzy.dist == 2, ])
## [1] 157
nrow(GTS.fuzzy2[GTS.fuzzy2$Fuzzy.dist == 3, ])
## [1] 27
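
The same breakdown can also be obtained in one step with table(). This is a minimal sketch on a toy data frame that reuses the column names Matched, Fuzzy and Fuzzy.dist from the output above; the values are made up, and it is an assumption that Fuzzy.dist is NA for rows without a fuzzy match.

```r
# Toy result with the same column names as the matching output
res <- data.frame(
  Matched = c(TRUE, TRUE, TRUE, TRUE, FALSE),
  Fuzzy = c(FALSE, TRUE, TRUE, FALSE, FALSE),
  Fuzzy.dist = c(NA, 1, 2, NA, NA)
)

# Label each row as unmatched, direct or fuzzy
res$category <- ifelse(!res$Matched, "unmatched",
                       ifelse(res$Fuzzy, "fuzzy", "direct"))

# Counts per category
table(res$category)

# Fuzzy matches broken down by distance
table(res$Fuzzy.dist[res$Fuzzy])
```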

7.5 Check for likely fuzzy matches

With the WFO.acceptable.match() function, fuzzy matches can be identified that correspond to differences in gender or differences only for vowels.

accept.var <- WFO.acceptable.match(GTS.fuzzy2,
                                   spec.name="TaxonName",
                                   no.vowels=TRUE)

GTS.fuzzy2 <- data.frame(GTS.fuzzy2,
                        acceptable=accept.var)

Many of the fuzzy matches (269 of 404) can be accepted.

nrow(GTS.fuzzy2[GTS.fuzzy2$acceptable == TRUE, ])
## [1] 269
head(GTS.fuzzy2[GTS.fuzzy2$acceptable == TRUE, 
               c("TaxonName", "scientificName", "Old.name")])
##                    TaxonName          scientificName Old.name
## 14     Abarema cochliocarpos   Abarema cochliacarpos         
## 417   Acacia macdonnellensis Acacia macdonnelliensis         
## 1013    Acropogon calcicolus     Acropogon calcicola         
## 1030 Acropogon sageniifolius  Acropogon sageniifolia         
## 1033 Acropogon schumannianus  Acropogon schumanniana         
## 1418      Aegiphila valerioi       Aegiphila valerii
tail(GTS.fuzzy2[GTS.fuzzy2$acceptable == TRUE, 
               c("TaxonName", "scientificName", "Old.name")])
##                     TaxonName          scientificName Old.name
## 157319    Vochysia antioquiae      Vochysia antioquia         
## 193220    Weinmannia trianaea      Weinmannia trianae         
## 214618 Wunderlichia crulsiana Wunderlichia cruelsiana         
## 265220       Yucca treculeana        Yucca treculiana         
## 265617     Zabelia tyaihyonii       Zabelia tyaihyoni         
## 293319        Zygia macbridii         Zygia macbridei
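
To illustrate what a difference "only for vowels" means, the following minimal sketch compares two names after removing the vowels a, e, i, o and u. The helper same.without.vowels() is hypothetical and is not the code used inside WFO.acceptable.match(); the name pairs are taken from the output above and from the fuzzy matches listed further down.

```r
# Hypothetical helper: TRUE when two names are identical once vowels are dropped
same.without.vowels <- function(a, b) {
  strip <- function(x) gsub("[aeiou]", "", tolower(x))
  strip(a) == strip(b)
}

# Pairs that differ only in vowels
same.without.vowels("Abarema cochliocarpos", "Abarema cochliacarpos")
same.without.vowels("Vochysia antioquiae", "Vochysia antioquia")

# A pair that differs in a consonant
same.without.vowels("Populus yuana", "Populus wuana")
```

The first two comparisons return TRUE and the third returns FALSE. Differences in gender endings, such as calcicolus versus calcicola, are not caught by this simple check.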

Many of the other fuzzy matches could also be accepted, but this definitely requires a manual check.

nrow(GTS.fuzzy2[GTS.fuzzy2$acceptable == FALSE, ])
## [1] 135
GTS.fuzzy2[GTS.fuzzy2$acceptable == FALSE & GTS.fuzzy2$Fuzzy.dist==1, 
               c("TaxonName", "scientificName", "Old.name")]
##                          TaxonName                  scientificName
## 803                Acer osmastonii                Acer ×osmastonii
## 1312            Adinandra milletii             Adinandra millettii
## 2379             Alnus mandshurica              Alnus mandschurica
## 3731    Antirhea novobritanniensis Achilleanthus novobrittaniensis
## 6731        Aralia castanopsiscola           Aralia castanopsicola
## 3412               Betula dahurica                 Betula davurica
## 5412              Betula murrayana                Betula ×purpusii
## 16552           Calodendrum eichii              Calodendrum eickii
## 25862       Carpinus turczaninowii          Carpinus turczaninovii
## 16253        Citharexylum mocinnoi            Citharexylum mocinoi
## 16714             Citrus deliciosa  Citrus ×aurantium f. deliciosa
## 10053            Corymbia paractia              Corymbia ×paractia
## 20294            Croton sarocarpus              Croton sarcocarpus
## 12475        Dicoryphe buddleoides          Dicoryphe buddlejoides
## 23866            Eucalyptus abdita              Eucalyptus ×abdita
## 241110        Eucalyptus angularis           Eucalyptus ×angularis
## 24796            Eucalyptus bunyip              Eucalyptus ×bunyip
## 24926          Eucalyptus calyerup            Eucalyptus ×calyerup
## 251010       Eucalyptus carolaniae          Eucalyptus ×carolaniae
## 251110       Eucalyptus castrensis          Eucalyptus ×castrensis
## 25585          Eucalyptus crispata            Eucalyptus ×crispata
## 262110      Eucalyptus erectifolia         Eucalyptus ×erectifolia
## 26956           Eucalyptus hawkeri          Eucalyptus ×silvestris
## 27086           Eucalyptus impensa             Eucalyptus ×impensa
## 27466        Eucalyptus lateritica          Eucalyptus ×lateritica
## 27496       Eucalyptus leprophloia         Eucalyptus ×leprophloia
## 28996           Eucalyptus phoenix             Eucalyptus ×phoenix
## 29296       Eucalyptus pruiniramis         Eucalyptus ×pruiniramis
## 3620         Eucalyptus silvestris          Eucalyptus ×silvestris
## 6677             Eugenia marchiana               Eugenia marshiana
## 14986          Eurya kueichowensis             Eurya kueichouensis
## 29853         Freziera monzonensis            Freziera monsonensis
## 4437   Garcinia xipshuanbannaensis      Garcinia xishuanbannaensis
## 11288          Gmelina leichardtii            Gmelina leichhardtii
## 148111 Graffenrieda conostegioides     Graffenrieda comostegioides
## 21597       Guettarda prenleloupii           Guettarda preneloupii
## 11059           Hopea tenuivervula              Hopea tenuinervula
## 18999               Ilex mathewsii                     Ilex ovalis
## 21689              Inga andersonii                Inga ×andersonii
## 243113    Inhambanella henriquezii        Inhambanella henriquesii
## 26949           Ixora longhanensis             Ixora longshanensis
## 13438        Kaunia camataguiensis           Kaunia camataquiensis
## 87310             Lebrunia bushaie                Lebrunia busbaie
## 158710     Linospadix monostachyos         Linospadix monostachyus
## 83011           Magnolia pilocarpa             Magnolia ×pilocarpa
## 95510             Majidea forsteri                 Majidea fosteri
## 112911            Malus floribunda               Malus ×floribunda
## 250610          Mespilus canescens            Crataegus ×canescens
## 261711        Mezoneuron kavaiense            Mezoneuron kauaiense
## 282910             Miconia doniana                 Miconia doriana
## 84214   Moldenhawera luschnathiana       Moldenhawera lushnathiana
## 142812       Mouriri retentipetala            Mouriri retenipetala
## 75013        Ocotea kostermanniana           Ocotea kostermansiana
## 182312          Ouratea littoralis               Ouratea litoralis
## 23680             Pauldopia ghorta                Pauldopia ghonta
## 48913       Peltophorum dasyrachis         Peltophorum dasyrhachis
## 87914      Phellocalyx vollescenii          Phellocalyx vollesenii
## 197118          Pisonia tahitensis                Ceodes taitensis
## 259512    Plinia spirito-santensis       Plinia spirito-sanctensis
## 261813        Plumeria trouinensis           Plumeria ×stenopetala
## 22690                Populus yuana                   Populus wuana
## 40715        Sideroxylon mirmulans           Sideroxylon mirmulano
## 111317           Sorbus arvonensis                  Aria avonensis
## 136717           Sorbus yondeensis        Griffitharia yongdeensis
## 170219     Stenostomum albobruneum        Stenostomum albobrunneum
## 219917              Styrax obassis                  Styrax obassia
## 229416       Swartzia brachyrachis          Swartzia brachyrhachis
## 246917        Swintonia schwenckii             Swintonia schwenkii
## 195018    Ternstroemia conicocarpa         Ternstroemia coniocarpa
## 208418          Tetralix moanensis               Tetralix moaensis
##                           Old.name
## 803                               
## 1312                              
## 2379                              
## 3731   Antirhea novobrittanniensis
## 6731                              
## 3412                              
## 5412             Betula ×murrayana
## 16552                             
## 25862                             
## 16253                             
## 16714            Citrus ×deliciosa
## 10053                             
## 20294                             
## 12475                             
## 23866                             
## 241110                            
## 24796                             
## 24926                             
## 251010                            
## 251110                            
## 25585                             
## 262110                            
## 26956          Eucalyptus ×hawkeri
## 27086                             
## 27466                             
## 27496                             
## 28996                             
## 29296                             
## 3620                              
## 6677                              
## 14986                             
## 29853                             
## 4437                              
## 11288                             
## 148111                            
## 21597                             
## 11059                             
## 18999              Ilex matthewsii
## 21689                             
## 243113                            
## 26949                             
## 13438                             
## 87310                             
## 158710                            
## 83011                             
## 95510                             
## 112911                            
## 250610         Mespilus ×canescens
## 261711                            
## 282910                            
## 84214                             
## 142812                            
## 75013                             
## 182312                            
## 23680                             
## 48913                             
## 87914                             
## 197118           Pisonia taitensis
## 259512                            
## 261813       Plumeria ×trouinensis
## 22690                             
## 40715                             
## 111317            Sorbus avonensis
## 136717          Sorbus yongdeensis
## 170219                            
## 219917                            
## 229416                            
## 246917                            
## 195018                            
## 208418
GTS.fuzzy2[GTS.fuzzy2$acceptable == FALSE & GTS.fuzzy2$Fuzzy.dist==2, 
               c("TaxonName", "scientificName", "Old.name")]
##                               TaxonName                   scientificName
## 5681                Aquilaria banaensis               Aquilaria banaense
## 22442              Canarium multinervis              Canarium multinerve
## 22802                 Canarium subtilis                 Canarium subtile
## 14003              Cinnamomum culilaban            Cinnamomum culitlawan
## 11845    Dichaetanthera tsaratananensis     Dichaetanthera tsaratanensis
## 13656            Empleurum unicapsulare          Empleurum unicapsularis
## 23825               Euadenia trifoliata                Crateva monticola
## 8129                  Eugenia poroensis                Eugenia pardensis
## 9657                 Eugenia tabouensis               Eugenia gabonensis
## 199111               Ficus binnendijkii                Ficus binnendykii
## 165210             Grewia mahafaliensis             Grewia sahafariensis
## 86211             Homalium brachystylum            Homalium brachystylis
## 21368              Indigofera ammoxylum             Indigofera ammoxylon
## 275210                    Ixora regalis                    Ixora rivalis
## 38510            Koanophyllon panamense          Koanophyllon panamensis
## 19959                  Litsea banaensis                 Litsea baviensis
## 247114              Lorostemon negrense             Lorostemon negrensis
## 257710                 Lunania sauvalii                Lunania sauvallei
## 8740                Machaerium quinatum               Machaerium lunatum
## 107212          Malouetia cuatrecasatis        Malouetia cuatrecasasatis
## 188011               Melanoxylon brauna               Melanoxylum brauna
## 218411            Memecylon arnhemensis             Memecylon arnhemense
## 284011             Miconia elaeodendron             Miconia elaeodendrum
## 109116            Monteverdia gonoclada              Maytenus gonoclados
## 113812         Monteverdia schummaniana            Maytenus schumanniana
## 290010                   Neolitsea chui                 Neolitsea chunii
## 126214               Opuntia monacantha               Opuntia mesacantha
## 193613              Piranhea trifoliata            Piranhea trifoliolata
## 232914           Platycelyphium voensis            Platycelyphium voense
## 234119            Pycnandra viridiflora                   Solanum viarum
## 289515             Quercus mangdenensis Quercus macrocarpa var. depressa
## 290515                   Quercus mexiae                 Quercus gambelii
## 79714        Rhododendron suoilenhensis        Rhododendron suoilenhense
## 36318            Syzygium kanneliyensis            Syzygium kanneliyense
## 51221                 Syzygium munronii                  Syzygium munroi
## 182616          Terminalia namorokensis         Terminalia narnorokensis
## 210021 Tetrapterocarpon septentrionalis  Tetrapterocarpon septentrionale
## 212418              Wrightia flavorosea            Wrightia flavidorosea
##                        Old.name
## 5681                           
## 22442                          
## 22802                          
## 14003                          
## 11845                          
## 13656                          
## 23825     Euadenia trifoliolata
## 8129                           
## 9657                           
## 199111                         
## 165210                         
## 86211                          
## 21368                          
## 275210                         
## 38510                          
## 19959                          
## 247114                         
## 257710                         
## 8740                           
## 107212                         
## 188011                         
## 218411                         
## 284011                         
## 109116   Monteverdia gonoclados
## 113812 Monteverdia schumanniana
## 290010                         
## 126214                         
## 193613                         
## 232914                         
## 234119    Pionandra viridiflora
## 289515      Quercus mandanensis
## 290515            Quercus media
## 79714                          
## 36318                          
## 51221                          
## 182616                         
## 210021                         
## 212418
GTS.fuzzy2[GTS.fuzzy2$acceptable == FALSE & GTS.fuzzy2$Fuzzy.dist==3, 
               c("TaxonName", "scientificName", "Old.name")]
##                           TaxonName                      scientificName
## 1106          Actinodaphne leiantha              Actinodaphne myriantha
## 18601            Ayenia cuatrecasae                Ayenia cuatrecasasii
## 28201             Bembicia uniflora                    Remijia uniflora
## 12382              Bursera zapoteca                      Bursera aptera
## 23362  Capparidastrum cuatrecasanum           Morisonia cuatrecasasiana
## 28704           Cybianthus pittieri               Lycianthes multiflora
## 10719              Drypetes louisii                     Drypetes dussii
## 13237       Euphorbia neospinescens Euphorbia cuneata subsp. spinescens
## 15798            Grewia androyensis                   Grewia angolensis
## 15878                Grewia barorum                      Grewia baronii
## 16588                Grewia milleri                    Cattleya milleri
## 19378          Guatteria esperanzae                Guatteria esmeraldae
## 20889             Litsea honbaensis                   Litsea tannaensis
## 65011          Magnolia cusucoensis                 Magnolia chocoensis
## 65411               Magnolia darioi                     Magnolia dandyi
## 89410              Magnolia talpana                    Magnolia caveana
## 90910            Magnolia veliziana                  Magnolia boliviana
## 274811        Miconia castaneiflora               Miconia castaneifolia
## 84715               Protium aidanum                    Protium bahianum
## 30381           Quercus sontraensis                  Quercus honbaensis
## 100616              Rinorea amietii                     Rinorea afzelii
## 103516             Rinorea dewildei                     Rinorea dewitii
## 104316               Rinorea faurei                Rinorea brachypetala
## 132415                Ruagea beckii                       Jungia beckii
## 133415               Ruagea obovata                      Kunzea obovata
## 212315        Saurauia cuatrecasana            Saurauia cuatrecasasiana
## 206518       Tetrachyron orizabense            Tetrachyron orizabaensis
##                               Old.name
## 1106                                  
## 18601                                 
## 28201                                 
## 12382                                 
## 23362  Capparidastrum cuatrecasasianum
## 28704              Lycianthes pittieri
## 10719                                 
## 13237             Euphorbia spinescens
## 15798                                 
## 15878                                 
## 16588                   Laelia milleri
## 19378                                 
## 20889                                 
## 65011                                 
## 65411                                 
## 89410                                 
## 90910                                 
## 274811                                
## 84715                                 
## 30381                                 
## 100616                                
## 103516                                
## 104316                   Rinorea dawei
## 132415                                
## 133415                                
## 212315                                
## 206518
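
Many of the fuzzy matches at distance 1 listed above differ from the submitted name only by the hybrid sign. As an aid for the manual check, these cases could be flagged separately. This is a minimal sketch on three rows copied from the output above; the column hybrid.only is hypothetical, and it is an assumption that Old.name holds the name that was originally matched whenever a synonym was replaced by its accepted name.

```r
# Three rows copied from the listing of fuzzy matches at distance 1
fuzzy <- data.frame(
  TaxonName = c("Acer osmastonii", "Betula murrayana", "Populus yuana"),
  scientificName = c("Acer ×osmastonii", "Betula ×purpusii", "Populus wuana"),
  Old.name = c("", "Betula ×murrayana", ""),
  stringsAsFactors = FALSE
)

# The name that was fuzzily matched: Old.name where available
matched.name <- ifelse(fuzzy$Old.name == "",
                       fuzzy$scientificName,
                       fuzzy$Old.name)

# TRUE where removing the hybrid sign recovers the submitted name
fuzzy$hybrid.only <- gsub("×", "", matched.name, fixed = TRUE) == fuzzy$TaxonName

fuzzy[, c("TaxonName", "scientificName", "hybrid.only")]
```

Whether a name listed as a species in GlobalTreeSearch should be accepted as a match for a hybrid in the backbone remains a taxonomic judgement; the flag only sorts the cases.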

8 Comparison of species that could not be matched

Only a very small number of species could not be matched with WFO (but note that not all fuzzy matches are acceptable).

As it happens, all of these species except one were matched in the WCVP.

WFO.unmatched <- GTS.WFO[GTS.WFO$Matched == FALSE, c("TaxonName", "Author")]
WFO.unmatched
##                         TaxonName                        Author
## 14608          Licuala heatubunii            Barfod & W.J.Baker
## 261215          Nahuatlea smithii (B.L.Rob. & Greenm.) V.A.Funk
## 112813       Olearia traversiorum            (F.Muell.) Hook.f.
## 227015    Pterospermum wilkieanum                        Doweld
## 31119          Ravenia swartziana        (Miers) Fawc. & Rendle
## 21900      Trilepisium gymnandrum                          D.C.
## 119618        Viburnum inopinatum                   W. G. Craib
## 269718 Zanthoxylum chuquisaquense                        Reynel
GTS.WCVP[GTS.WCVP$TaxonName %in% WFO.unmatched$TaxonName, 
         c("TaxonName", "scientificName", 
           "Author", "scientificNameAuthorship")]
##                         TaxonName             scientificName
## 143510         Licuala heatubunii                       <NA>
## 256712          Nahuatlea smithii          Nahuatlea smithii
## 111313       Olearia traversiorum       Olearia traversiorum
## 221217    Pterospermum wilkieanum    Pterospermum wilkieanum
## 31020          Ravenia swartziana         Ravenia swartziana
## 22000      Trilepisium gymnandrum     Trilepisium gymnandrum
## 117819        Viburnum inopinatum        Viburnum inopinatum
## 269017 Zanthoxylum chuquisaquense Zanthoxylum chuquisaquense
##                               Author      scientificNameAuthorship
## 143510            Barfod & W.J.Baker                          <NA>
## 256712 (B.L.Rob. & Greenm.) V.A.Funk (B.L.Rob. & Greenm.) V.A.Funk
## 111313            (F.Muell.) Hook.f.            (F.Muell.) Hook.f.
## 221217                        Doweld                        Doweld
## 31020         (Miers) Fawc. & Rendle        (Miers) Fawc. & Rendle
## 22000                           D.C.             (Baker) J.Gerlach
## 117819                   W. G. Craib                         Craib
## 269017                        Reynel                        Reynel

The one species that was matched in neither WFO nor WCVP is Licuala heatubunii, a palm species that was described in 2022.
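The same comparison can be done programmatically. The following is a minimal sketch, assuming result tables with the columns printed above. The data frames `wfo.unmatched` and `wcvp.result` are hypothetical stand-ins typed from three rows of that output, not the full `WFO.unmatched` and `GTS.WCVP` objects. It lists the names that have no match in either backbone and flags matched rows where the author strings differ.

```r
# Toy stand-ins copied from three rows of the printed output above;
# in the real workflow these would be WFO.unmatched and GTS.WCVP.
wfo.unmatched <- data.frame(
  TaxonName = c("Licuala heatubunii", "Trilepisium gymnandrum",
                "Viburnum inopinatum"),
  Author = c("Barfod & W.J.Baker", "D.C.", "W. G. Craib"),
  stringsAsFactors = FALSE
)
wcvp.result <- data.frame(
  TaxonName = wfo.unmatched$TaxonName,
  scientificName = c(NA, "Trilepisium gymnandrum", "Viburnum inopinatum"),
  Author = wfo.unmatched$Author,
  scientificNameAuthorship = c(NA, "(Baker) J.Gerlach", "Craib"),
  stringsAsFactors = FALSE
)

# Names unmatched in WFO that also have no match in WCVP
both.unmatched <- wcvp.result$TaxonName[is.na(wcvp.result$scientificName)]

# Matched rows where the GlobalTreeSearch author string is not identical
# to the WCVP author string (exact comparison, so spacing and
# abbreviation differences are flagged too)
matched <- wcvp.result[!is.na(wcvp.result$scientificName), ]
author.differs <- matched[matched$Author != matched$scientificNameAuthorship,
                          c("TaxonName", "Author", "scientificNameAuthorship")]

print(both.unmatched)
print(author.differs)
```

A differing author string does not by itself mean that a match is wrong. Rows flagged this way are only candidates for a manual check.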

9 Global Biodiversity Standard

This publication partly arose from ongoing work in a Darwin Initiative project (DAREX001) that is developing a Global Biodiversity Standard for tree planting. The GlobalUsefulNativeTrees and Tree Globally Observed Environmental Ranges databases were recently released by this project. Once the Global Biodiversity Standard scheme becomes operational, tree planting projects can use scripts such as the ones shown here to cross-check their species lists before applying.
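As an illustration of such a cross-check, here is a minimal base R sketch. Both `planting.list` and `checklist` are hypothetical tables invented for this example (the species names are borrowed from the output earlier in this document). In practice the reference table would come from a standardization step such as the ones shown in the previous sections.

```r
# Hypothetical species list submitted by a tree planting project
planting.list <- data.frame(
  TaxonName = c("Olearia traversiorum", "Viburnum inopinatum",
                "Licuala heatubunii"),
  stringsAsFactors = FALSE
)

# Hypothetical reference table of names that were standardized successfully
checklist <- data.frame(
  TaxonName = c("Olearia traversiorum", "Viburnum inopinatum"),
  scientificName = c("Olearia traversiorum", "Viburnum inopinatum"),
  stringsAsFactors = FALSE
)

# Left join: every project species is kept; species that are absent
# from the reference table get NA in scientificName
crosscheck <- merge(planting.list, checklist, by = "TaxonName", all.x = TRUE)

# Flag the species that need a manual review before applying
crosscheck$needs.review <- is.na(crosscheck$scientificName)
print(crosscheck)
```

Using `all.x = TRUE` keeps unmatched species visible in the result instead of silently dropping them, which is the point of a cross-check.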

10 Session Information

sessionInfo()
## R version 4.2.1 (2022-06-23 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19045)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United Kingdom.utf8 
## [2] LC_CTYPE=English_United Kingdom.utf8   
## [3] LC_MONETARY=English_United Kingdom.utf8
## [4] LC_NUMERIC=C                           
## [5] LC_TIME=English_United Kingdom.utf8    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] stringr_1.4.1     data.table_1.14.2 WorldFlora_1.13-2
## 
## loaded via a namespace (and not attached):
##  [1] bslib_0.4.0       compiler_4.2.1    pillar_1.9.0      jquerylib_0.1.4  
##  [5] tools_4.2.1       digest_0.6.29     jsonlite_1.8.0    evaluate_0.16    
##  [9] lifecycle_1.0.3   tibble_3.2.1      pkgconfig_2.0.3   rlang_1.1.1      
## [13] cli_3.4.1         rstudioapi_0.14   yaml_2.3.5        parallel_4.2.1   
## [17] fuzzyjoin_0.1.6   xfun_0.33         fastmap_1.1.1     withr_2.5.0      
## [21] dplyr_1.1.2       knitr_1.40        generics_0.1.3    vctrs_0.6.3      
## [25] sass_0.4.2        tidyselect_1.2.0  glue_1.6.2        R6_2.5.1         
## [29] fansi_1.0.3       rmarkdown_2.16    purrr_0.3.4       tidyr_1.2.1      
## [33] magrittr_2.0.3    htmltools_0.5.6   stringdist_0.9.10 utf8_1.2.2       
## [37] stringi_1.7.8     cachem_1.0.6