library(WorldFlora)
library(data.table)
library(stringr)
In previous posts (see here, here and here), I showed how the WorldFlora package (Kindt 2020) can be used to standardize names from GlobalTreeSearch with the taxonomic backbone data from World Flora Online (WFO) or the World Checklist of Vascular Plants (WCVP).
Here I show standardization results for the latest version of GlobalTreeSearch with the latest taxonomic backbone data sets of WFO and WCVP. Some of the scripts had to be adapted because of changes in the more recent versions of the backbone data sets.
In this document, I use the latest available versions of World Flora Online (WFO, version 2023.12 downloaded from Zenodo) and the World Checklist of Vascular Plants (WCVP, version 11 downloaded from the Kew data depository).
As shown previously for the WCVP, I recommend first using a text editor to replace instances of ' × ' by ' ×' in the WCVP file. I now recommend doing the same for the latest (2023) versions of WFO.
I used version 2023.12 of World Flora Online, the latest available version. The file was downloaded earlier, and its location was then provided to WFO.download via its argument 'WFO.file'.
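The same replacement can also be scripted instead of done in a text editor. The sketch below is only an illustration on a small character vector: the example strings are made up, the helper fix.hybrid() is hypothetical, and applying it to a full backbone file (for example after reading the lines with readLines()) is an assumption rather than the procedure used for this post.

```r
# Illustrative strings only; not taken from the WFO or WCVP files
x <- c("Sorbus \u00d7 avonensis", "Abarema abbottii")

# Hypothetical helper: replace ' × ' (spaces on both sides)
# by ' ×' (no space after the hybrid sign)
fix.hybrid <- function(x) gsub(" \u00d7 ", " \u00d7", x, fixed=TRUE)

x.fixed <- fix.hybrid(x)
print(x.fixed)
```

Names without a hybrid sign are returned unchanged, so the helper can be applied to every line of a file.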
(You could use WFO.remember(WFO.file=file.choose()) to be certain that the right version of the WFO backbone is used.)
In the taxonomic backbone, World Flora Online lists about half a million current species names.
WFO.remember()
## Data sourced from: E:\Roeland\WorldFloraOnline\2023\wfo_202312_RK.csv (Thu Dec 28 10:43:25 2023)
## Reading WFO data
## The WFO data is now available from WFO.data
nrow(WFO.data)
## [1] 1576062
nrow(WFO.data[WFO.data$taxonRank == "species", ])
## [1] 1164296
nrow(WFO.data[WFO.data$taxonRank == "species" & WFO.data$acceptedNameUsageID == "", ])
## [1] 517755
The downloadable complete list of tree species (version 1.7) was obtained from GlobalTreeSearch. The list includes close to 60,000 species names (roughly 11 percent of the current species names in World Flora Online).
# GTS.file <- choose.files()
GTS.file <- "E:\\Roeland\\WorldFloraOnline\\2023\\global_tree_search_trees_1_7.csv"
GTS <- fread(GTS.file, header=TRUE, encoding="UTF-8")
head(GTS)
## TaxonName Author V3 V4
## 1: Abarema abbottii (Rose & Leonard) Barneby & J.W.Grimes NA NA
## 2: Abarema acreana (J.F.Macbr.) L.Rico NA NA
## 3: Abarema adenophora (Ducke) Barneby & J.W.Grimes NA NA
## 4: Abarema alexandri (Urb.) Barneby & J.W.Grimes NA NA
## 5: Abarema asplenifolia (Griseb.) Barneby & J.W.Grimes NA NA
## 6: Abarema auriculata (Benth.) Barneby & J.W.Grimes NA NA
## Citation: GlobalTreeSearch online database. Botanic Gardens Conservation International. Richmond, U.K. Available at www.bgci.org. Accessed on DD/MM/YYYY. DOI: 10.13140/RG.2.2.34077.79847
## 1: DOI: 10.13140/RG.2.2.34077.79847
## 2:
## 3:
## 4:
## 5:
## 6:
GTS <- GTS[, c("TaxonName", "Author")]
nrow(GTS)
## [1] 57922
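As a quick check of how large GlobalTreeSearch is relative to the current species names in World Flora Online, the two counts reported in the output above can be divided. The variable names are chosen here for illustration only.

```r
# Counts copied from the output shown above
n.gts <- 57922           # rows in the GlobalTreeSearch list
n.wfo.current <- 517755  # current species names in WFO

# Share of current WFO species names, in percent
share <- 100 * n.gts / n.wfo.current
print(round(share, 1))
```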
Everything is in place now to start the matching process. To avoid a crash of the WFO.match.fuzzyjoin() function, however, the data needs to be split. This can be done relatively easily via the cut() function.
It takes about an hour for the matching to be completed…
# assign each row of GTS to one of 20 consecutive groups
cuts <- cut(c(1:nrow(GTS)), breaks=20, labels=FALSE)
cut.i <- sort(unique(cuts))
start.time <- Sys.time()
for (i in seq_along(cut.i)) {
    cat(paste("Cut: ", i, "\n"))
    # match one group at a time and keep a single best match per name
    GTS.i <- WFO.one(WFO.match.fuzzyjoin(spec.data=GTS[cuts==cut.i[i], ],
                                         WFO.data=WFO.data,
                                         spec.name="TaxonName",
                                         Authorship="Author",
                                         fuzzydist.max=3),
                     verbose=FALSE)
    # append the results of this group to those of the previous groups
    if (i==1) {
        GTS.WFO <- GTS.i
    }else{
        GTS.WFO <- rbind(GTS.WFO, GTS.i)
    }
}
## Cut: 1
## Checking for fuzzy matches for 20 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 2
## Checking for fuzzy matches for 25 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 3
## Checking for fuzzy matches for 21 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 4
## Checking for fuzzy matches for 32 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 5
## Checking for fuzzy matches for 17 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 6
## Checking for fuzzy matches for 19 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 7
## Checking for fuzzy matches for 21 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 8
## Checking for fuzzy matches for 30 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 9
## Checking for fuzzy matches for 24 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 10
## Checking for fuzzy matches for 23 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 11
## Checking for fuzzy matches for 16 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 12
## Checking for fuzzy matches for 26 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 13
## Checking for fuzzy matches for 14 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 14
## Checking for fuzzy matches for 34 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 15
## Checking for fuzzy matches for 25 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 16
## Checking for fuzzy matches for 53 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 17
## Checking for fuzzy matches for 50 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 18
## Checking for fuzzy matches for 32 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 19
## Checking for fuzzy matches for 30 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 20
## Checking for fuzzy matches for 34 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
end.time <- Sys.time()
end.time - start.time # 57.78924 mins
## Time difference of 59.00002 mins
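Growing a data frame with rbind() inside the loop works, but each iteration copies the results accumulated so far. A minimal alternative sketch, assuming that every batch returns a data frame with identical columns, keeps the batches in a list and binds them once at the end. The function match.batch() is a hypothetical stand-in for the WFO.one(WFO.match.fuzzyjoin(...)) call, and the toy data are made up.

```r
# Toy input standing in for GTS (made-up names)
toy <- data.frame(TaxonName=paste("Genus", sprintf("sp%03d", 1:100)),
                  stringsAsFactors=FALSE)

# Hypothetical stand-in for WFO.one(WFO.match.fuzzyjoin(...))
match.batch <- function(d) {
    data.frame(d, Matched=TRUE, stringsAsFactors=FALSE)
}

# Same splitting approach as above: 20 groups of consecutive rows
cuts <- cut(seq_len(nrow(toy)), breaks=20, labels=FALSE)

# Process each group and keep the results in a list
res.list <- lapply(sort(unique(cuts)), function(k) {
    match.batch(toy[cuts == k, , drop=FALSE])
})

# Bind all batches in a single step
res <- do.call(rbind, res.list)
nrow(res)
```

Because cut() assigns consecutive rows to consecutive groups, the combined result keeps the original row order.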
The results can be subdivided into species that could not be matched, species that could be directly matched and species with fuzzy matches.
# not matched
nrow(GTS.WFO[GTS.WFO$Matched == FALSE, ])
## [1] 8
# directly matched
nrow(GTS.WFO[GTS.WFO$Matched == TRUE & GTS.WFO$Fuzzy == FALSE, ])
## [1] 57376
GTS.fuzzy <- GTS.WFO[GTS.WFO$Fuzzy == TRUE, ]
nrow(GTS.fuzzy)
## [1] 538
nrow(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 1, ])
## [1] 369
nrow(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 2, ])
## [1] 160
nrow(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 3, ])
## [1] 9
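The repeated nrow() calls above could also be condensed with table(). The sketch below uses a made-up data frame with the same three column names; how the real output codes Fuzzy.dist for records without a fuzzy match is not shown here, so the NA values are an assumption.

```r
# Made-up result with the same three columns used above
toy <- data.frame(Matched=c(TRUE, TRUE, TRUE, TRUE, FALSE),
                  Fuzzy=c(FALSE, TRUE, TRUE, TRUE, FALSE),
                  Fuzzy.dist=c(NA, 1, 1, 2, NA))

# Cross-tabulate matched versus fuzzy records in one call
status.tab <- table(Matched=toy$Matched, Fuzzy=toy$Fuzzy)
status.tab

# Number of fuzzy matches per fuzzy distance
dist.tab <- table(toy$Fuzzy.dist[toy$Fuzzy])
dist.tab
```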
The WFO.acceptable.match() function identifies fuzzy matches where the matched name differs from the submitted name only in gender or only in vowels.
accept.var <- WFO.acceptable.match(GTS.fuzzy,
spec.name="TaxonName",
no.vowels=TRUE)
GTS.fuzzy <- data.frame(GTS.fuzzy,
acceptable=accept.var)
Most of the fuzzy matches can be accepted.
nrow(GTS.fuzzy[GTS.fuzzy$acceptable == TRUE, ])
## [1] 428
head(GTS.fuzzy[GTS.fuzzy$acceptable == TRUE,
c("TaxonName", "scientificName", "Old.name")])
## TaxonName scientificName Old.name
## 120 Abrahamia itremoensis Abrahamia itromoensis
## 1054 Acropogon calcicolus Acropogon calcicola
## 1239 Adelobotrys macranthus Adelobotrys macrantha
## 1488 Aeschynomene burttii Aeschynomene burttiie
## 1606 Ageratina urbani Ageratina urbanii
## 1757 Aidia congesta Aidia congestum
tail(GTS.fuzzy[GTS.fuzzy$acceptable == TRUE,
c("TaxonName", "scientificName", "Old.name")])
## TaxonName scientificName Old.name
## 253617 Xylopia subdehiscens Xylopia sub-dehiscens
## 258221 Xylosma kaalaensis Xylosma kaalensis
## 266618 Zabelia tyaihyonii Zabelia tyaihyoni
## 267816 Zanthoxylum amapaense Zanthoxylum amapense
## 286121 Ziziphus cambodiana Ziziphus cambodianus
## 296819 Zygocarpum caeruleum Zygocarpum coeruleum
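To illustrate what a difference "only for vowels" means, the sketch below compares two names after removing all vowels. This is a simplified base R illustration and not the implementation of WFO.acceptable.match(); the helper same.without.vowels() is hypothetical and ignores the gender rule entirely.

```r
# Hypothetical helper: TRUE if two names are identical once vowels are dropped
same.without.vowels <- function(a, b) {
    strip <- function(x) gsub("[aeiou]", "", tolower(x))
    strip(a) == strip(b)
}

# Pair from the accepted matches listed above: only one vowel differs
vowel.case <- same.without.vowels("Abrahamia itremoensis",
                                  "Abrahamia itromoensis")

# Pair that differs in a consonant, so the check fails
consonant.case <- same.without.vowels("Ilex linii", "Ilex limii")

c(vowel.case=vowel.case, consonant.case=consonant.case)
```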
Many of the remaining fuzzy matches could also be accepted after a manual check.
nrow(GTS.fuzzy[GTS.fuzzy$acceptable == FALSE, ])
## [1] 110
GTS.fuzzy[GTS.fuzzy$acceptable == FALSE & GTS.fuzzy$Fuzzy.dist==1,
c("TaxonName", "scientificName", "Old.name")]
## TaxonName scientificName
## 1348 Adinandra macquilingensis Adinandra maquilingensis
## 1428 Aegiphila luschnathii Aegiphila luschnatii
## 1492 Aeschynomene pararubrofarinacea Aeschynomene pararuhrofarinacea
## 6011 Arachnothryx chaconii Arachnothryx chaconis
## 10261 Ardisia labisiifolia Ardisia labrisiifolia
## 13301 Arytera litoralis Arytera littoralis
## 22671 Barringtonia augusta Barringtonia angusta
## 9882 Buchanania evrardii Buchanania ×evrardii
## 18613 Calophyllum touranense Calophyllum tournanense
## 6913 Castanopsis catappifolia Castanopsis catalpifolia
## 14114 Castanopsis motleyana Castanopsis mottleyana
## 14983 Cinnamomum malabatrum Cinnamomum malabathrum
## 16515 Citharexylum mocinnoi Citharexylum mocinoi
## 3154 Comocladia ebrenbergii Comocladia ehrenbergii
## 6054 Coprosma tahitensis Coprosma taitensis
## 9795 Dendropanax langsdorfii Dendropanax langsdorffii
## 12865 Dicoryphe buddleoides Dicoryphe buddlejoides
## 15735 Diospyros baloen-idjoek Diospyros baloen-ldjoek
## 17375 Diospyros exculpta Diospyros exsculpta
## 16420 Euclinia squamifera Euclinia ×squamifera
## 6836 Eugenia marshiana Eugenia marchiana
## 11267 Euodia nishimurae Melicope nishimurae
## 15157 Eurya leptanta Eurya leptantha
## 15696 Eurya sandwicensis Eurya sandwichensis
## 25606 Ficus scott-elliottii Ficus scott-elliotii
## 28186 Fordia brachybotrys Fordia brachybotrya
## 30602 Furcraea macdougallii Furcraea macdougalii
## 28228 Garcinia matsudae Garcinia matudae
## 29928 Garcinia moseleyana Garcinia moselleyana
## 3848 Garcinia schlecbteri Garcinia schlechteri
## 6388 Geissanthus serrulatus Geissanthus serulatus
## 11468 Gmelina leichardtii Gmelina leichhardtii
## 14997 Graffenrieda conostegioides Graffenrieda comostegioides
## 5449 Hibbertia wagapii Hibbertia wayapii
## 6189 Hibiscus tiliaceus Hibiscus tilliaceus
## 7068 Hirtella brachystachya Hirtella brachystachys
## 151310 Hymenodictyon perrieri Hymenodictyon perieri
## 18757 Ilex linii Ilex limii
## 21709 Inga andersonii Inga ×andersonii
## 138710 Licaria cannella Licaria canella
## 256114 Luehea conwentzii Luehea conwentsii
## 17180 Machilus minkweiensis Machilus mikweiensis
## 61213 Magnolia argyrothricha Magnolia argyrotricha
## 63811 Magnolia caricifragrans Magnolia caricifragans
## 77911 Magnolia macclurei Magnolia maclurei
## 189410 Melanochyla fulvinervis Melanochyla fulvinervia
## 266311 Mezoneuron kavaiense Mezoneuron kauaiense
## 277710 Miconia buddlejoides Miconia budlejoides
## 287810 Miconia doniana Miconia doriana
## 7850 Miconia paspaloides Miconia paspaploides
## 86214 Moldenhawera luschnathiana Moldenhawera lushnathiana
## 176116 Myrcia dolichopetala Myrcia dolicopetala
## 254313 Pancheria xaragurensis Pancheria xaraguensis
## 38611 Pavetta yambatensis Pavetta yamhatensis
## 88813 Phellocalyx vollescenii Phellocalyx vollesenii
## 96314 Phoebe mathewsii Phoebe matthewsii
## 237118 Plectroniella armata Canthium armatum
## 262514 Plumeria trouinensis Plumeria ×stenopetala
## 289215 Polyosma ilicifolia Polyosma illicifolia
## 203615 Psychotria uncumariana Psychotria ucumariana
## 256515 Pyrus tamamschiannae Pyrus tamamschianae
## 267119 Quararibea mayanum Quararibea mayarum
## 31141 Quercus shangxiensis Quercus shanxiensis
## 116315 Sorbus avonensis Sorbus ×avonensis
## 135816 Sorbus roopiana Sorbus ×roopiana
## 177316 Stenostomum albobruneum Stenostomum albobrunneum
## 228317 Styrax obassis Styrax obassia
## 169100 Syzygium dealbatum Syzygium dealatum
## 33017 Syzygium hullettianum Syzygium hulletianum
## 52517 Syzygium naiadum Syzygium najadum
## 55121 Syzygium normanbiensc Syzygium normanbiense
## 94616 Syzygium vanderwateri Syzygium vandewateri
## 96617 Syzygium vidalianum Syzygium vidaliarum
## 197219 Ternstroemia conicocarpa Ternstroemia coniocarpa
## 274016 Tricalysia elliottii Tricalysia elliotii
## 53718 Vaccinium dubiosum Vaccinium duhiosum
## 196519 Wendlandia buddleacea Wendlandia buddlejacea
## 226219 Xanthophyllum schizocarpon Xanthophyllum schixocarpon
## Old.name
## 1348
## 1428
## 1492
## 6011
## 10261
## 13301
## 22671
## 9882
## 18613
## 6913
## 14114
## 14983
## 16515
## 3154
## 6054
## 9795
## 12865
## 15735
## 17375
## 16420
## 6836
## 11267 Evodia nishimurae
## 15157
## 15696
## 25606
## 28186
## 30602
## 28228
## 29928
## 3848
## 6388
## 11468
## 14997
## 5449
## 6189
## 7068
## 151310
## 18757
## 21709
## 138710
## 256114
## 17180
## 61213
## 63811
## 77911
## 189410
## 266311
## 277710
## 287810
## 7850
## 86214
## 176116
## 254313
## 38611
## 88813
## 96314
## 237118 Plectroniella amata
## 262514 Plumeria trouinenais
## 289215
## 203615
## 256515
## 267119
## 31141
## 116315
## 135816
## 177316
## 228317
## 169100
## 33017
## 52517
## 55121
## 94616
## 96617
## 197219
## 274016
## 53718
## 196519
## 226219
GTS.fuzzy[GTS.fuzzy$acceptable == FALSE & GTS.fuzzy$Fuzzy.dist==2,
c("TaxonName", "scientificName", "Old.name")]
## TaxonName scientificName
## 25451 Beguea tsaratananensis Beguea tsaratanensis
## 10052 Buchanania sessifolia Buchanania sessilifolia
## 22902 Canarium multinervis Canarium multinerve
## 23282 Canarium subtilis Canarium subtile
## 26662 Coccoloba ramosissima Coccoloba ramosisissima
## 7295 Decarydendron ranomafanensis Decarydendron ranomafanense
## 18286 Eriocoelum dzangensis Eriocoelum dzangense
## 24127 Euadenia trifoliata Crateva monticola
## 154112 Grazielodendron riodocensis Grazielodendron riodocense
## 222411 Memecylon arnhemensis Memecylon arnhemense
## 116012 Monteverdia schummaniana Maytenus schumanniana
## 53511 Pentaceras australe Pentaceras australis
## 161915 Stadtmannia acuminata Stadmania acuminata
## 162017 Stadtmannia excelsa Stadmania excelsa
## 162121 Stadtmannia glauca Stadmania glauca
## 162217 Stadtmannia leandrii Stadmania leandrii
## 162317 Stadtmannia oppositifolia Stadmania oppositifolia
## 162417 Stadtmannia serrulata Stadmania serrulata
## 36617 Syzygium kanneliyensis Syzygium kanneliyense
## 202017 Ternstroemia oleifolia Ternstroemia alnifolia
## 208616 Tetrachyron orizabense Tetrachyron orizabaensis
## 270717 Trema orientale Trema orientalis
## 293219 Tridesmostemon congoensis Tridesmostemon congoense
## 186100 Turraea laciniata Turraea laciniosa
## 222219 Xanthophyllum laeve Xanthophyllum laevis
## Old.name
## 25451
## 10052
## 22902
## 23282
## 26662
## 7295
## 18286
## 24127 Euadenia trifoliolata
## 154112
## 222411
## 116012 Monteverdia schumanniana
## 53511
## 161915
## 162017
## 162121
## 162217
## 162317
## 162417
## 36617
## 202017
## 208616 Tetrachyron orizabensis
## 270717
## 293219
## 186100
## 222219
GTS.fuzzy[GTS.fuzzy$acceptable == FALSE & GTS.fuzzy$Fuzzy.dist==3,
c("TaxonName", "scientificName", "Old.name")]
## TaxonName scientificName
## 23842 Capparidastrum cuatrecasanum Morisonia cuatrecasasiana
## 13783 Cinnamomum austrosinense Cinnamomum austro-sinensis
## 23095 Eschweilera jefensis Eschweilera juruensis
## 160511 Matisia cuatrecasasiana Matisia cuatrecasana
## 26790 Portulacaria carrissoana Portulaca carrissoana
## 258615 Quadrella indica Cordia myxa
## 31271 Quercus sontraensis Lithocarpus sootepensis
## Old.name
## 23842 Capparidastrum cuatrecasasianum
## 13783
## 23095
## 160511
## 26790
## 258615 Quarena indica
## 31271 Quercus sootepensis
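Several of the distance-1 matches listed above differ from the submitted name only by the hybrid sign. One way to speed up the manual check, sketched here under the assumption that such cases can be reviewed as a group, is to flag pairs that become identical once the sign is removed. The column hybrid.only is a name introduced for this illustration.

```r
# A few TaxonName / scientificName pairs copied from the output above
pairs <- data.frame(
    TaxonName=c("Buchanania evrardii", "Sorbus avonensis", "Ilex linii"),
    scientificName=c("Buchanania \u00d7evrardii", "Sorbus \u00d7avonensis",
                     "Ilex limii"),
    stringsAsFactors=FALSE)

# Flag pairs that are identical once the hybrid sign is removed
no.hybrid <- gsub("\u00d7", "", pairs$scientificName, fixed=TRUE)
pairs$hybrid.only <- no.hybrid == pairs$TaxonName
pairs
```

Whether a flagged hybrid name is the right match for a species in GlobalTreeSearch still needs a taxonomic judgement; the flag only groups similar cases.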
Instead of the taxonomic backbone of World Flora Online, we now use the taxonomic backbone of the World Checklist of Vascular Plants (WCVP). Note that the file was modified beforehand to replace instances of ' × ' by ' ×'.
# WCVP.file <- choose.files()
WCVP.file <- "E:\\Roeland\\WorldFloraOnline\\2023\\wcvp_names_RK.csv"
WCVP.data <- fread(WCVP.file, header=TRUE, encoding="UTF-8")
## Warning in fread(WCVP.file, header = TRUE, encoding = "UTF-8"): Found and
## resolved improper quoting out-of-sample. First healed line 11102: <<329638|
## 328672-2|Variety|Synonym|Lamiaceae||Ajuga||decumbens|var.|vegeta||Honda||Bot.
## Mag. (Tokyo)| 45: 299|(1931)||"Hondo: Agematsu, prov. Shinano", Japan, Eastern
## Asia, Asia-Temperate|||Ajuga decumbens var. vegeta|Honda|5322|||||328672-2||Y>>.
## If the fields are not quoted (e.g. field separator does not appear within any
## field), try quote="" to avoid this warning.
head(WCVP.data)
## plant_name_id ipni_id taxon_rank taxon_status family genus_hybrid
## 1: 195508 243233-2 Species Synonym Lamiaceae
## 2: 197585 767122-1 Species Synonym Rubiaceae
## 3: 76791 595920-1 Species Synonym Myrtaceae
## 4: 74373 593644-1 Species Synonym Myrtaceae
## 5: 205204 884387-1 Subspecies Synonym Lamiaceae
## 6: 102745 326216-2 Genus Synonym Lamiaceae
## genus species_hybrid species infraspecific_rank infraspecies
## 1: Stachys pustulosa
## 2: Stenostomum dichotomum
## 3: Eugenia scoparia
## 4: Eugenia areolata
## 5: Thymus pallasianus subsp. brachyodon
## 6: Isanthus
## parenthetical_author primary_author publication_author place_of_publication
## 1: Rydb. Brittonia
## 2: DC. Prodr.
## 3: Duthie J.D.Hooker Fl. Brit. India
## 4: DC. Duthie J.D.Hooker Fl. Brit. India
## 5: Borbás Jalas Bot. J. Linn. Soc.
## 6: Michx. Fl. Bor.-Amer.
## volume_and_page first_published nomenclatural_remarks geographic_area
## 1: 1: 95 (1931)
## 2: 4: 461 (1830)
## 3: 2: 489 (1878)
## 4: 2: 490 (1878)
## 5: 64: 262 (1971)
## 6: 2: 3 (1803)
## lifeform_description climate_description
## 1:
## 2:
## 3:
## 4:
## 5:
## 6:
## taxon_name taxon_authors accepted_plant_name_id
## 1: Stachys pustulosa Rydb. 195467
## 2: Stenostomum dichotomum DC. 197582
## 3: Eugenia scoparia Duthie 199254
## 4: Eugenia areolata (DC.) Duthie 200472
## 5: Thymus pallasianus subsp. brachyodon (Borbás) Jalas 204938
## 6: Isanthus Michx. 208472
## basionym_plant_name_id replaced_synonym_author homotypic_synonym
## 1: NA
## 2: NA
## 3: NA
## 4: 199231
## 5: 204517
## 6: NA
## parent_plant_name_id powo_id hybrid_formula reviewed
## 1: NA 243233-2 Y
## 2: NA 767122-1 Y
## 3: NA 595920-1 Y
## 4: NA 593644-1 Y
## 5: NA 884387-1 Y
## 6: NA 326216-2 Y
tail(WCVP.data)
## plant_name_id ipni_id taxon_rank taxon_status family genus_hybrid
## 1: 3297495 Subspecies Accepted Fabaceae
## 2: 437859 204824-2 Species Accepted Poaceae
## 3: 2510011 946536-1 Species Accepted Rutaceae
## 4: 2580958 940927-1 Species Accepted Malvaceae
## 5: 3052324 978137-1 Species Accepted Asteraceae
## 6: 3052326 985860-1 Species Accepted Asteraceae
## genus species_hybrid species infraspecific_rank infraspecies
## 1: Onobrychis alba subsp. pentelica
## 2: Poidium calotheca
## 3: Melicope zahlbruckneri
## 4: Sterculia tantraensis
## 5: Vernonanthura santacruzensis
## 6: Vernonanthura schulziana
## parenthetical_author primary_author publication_author
## 1: Hausskn. Nyman
## 2: Trin. Matthei
## 3: Rock T.G.Hartley & B.C.Stone
## 4: Morat
## 5: Hieron. H.Rob.
## 6: Cabrera H.Rob.
## place_of_publication volume_and_page
## 1: Consp. Fl. Eur. , Suppl. 2(1): 99
## 2: Willdenowia, Beih. 8: 116
## 3: Taxon 38: 122
## 4: Bull. Mus. Natl. Hist. Nat., B, Adansonia 8: 362
## 5: Phytologia 76: 29
## 6: Phytologia 78: 386
## first_published nomenclatural_remarks
## 1: (1889)
## 2: (1975)
## 3: (1989)
## 4: (1986 publ. 1987)
## 5: (1994)
## 6: (1995)
## geographic_area lifeform_description
## 1: S. Italy, Balkan Pen. perennial
## 2: Colombia, SE. & S. Brazil to NE. Argentina perennial
## 3: Hawaiian Is. tree
## 4: Papuasia tree
## 5: Bolivia (Santa Cruz)
## 6: Brazil (Rio Grande do Sul) to Argentina (Chaco)
## climate_description taxon_name
## 1: temperate Onobrychis alba subsp. pentelica
## 2: subtropical Poidium calotheca
## 3: wet tropical Melicope zahlbruckneri
## 4: wet tropical Sterculia tantraensis
## 5: Vernonanthura santacruzensis
## 6: Vernonanthura schulziana
## taxon_authors accepted_plant_name_id basionym_plant_name_id
## 1: (Hausskn.) Nyman 3297495 2389037
## 2: (Trin.) Matthei 437859 412327
## 3: (Rock) T.G.Hartley & B.C.Stone 2510011 2541635
## 4: Morat 2580958 2552044
## 5: (Hieron.) H.Rob. 3052324 3052322
## 6: (Cabrera) H.Rob. 3052326 3052325
## replaced_synonym_author homotypic_synonym parent_plant_name_id powo_id
## 1: 2389084 3297495-4
## 2: 437855 204824-2
## 3: 2509107 946536-1
## 4: Lauterb. 2579857 940927-1
## 5: 3134351 978137-1
## 6: 3134351 985860-1
## hybrid_formula reviewed
## 1: Y
## 2: Y
## 3: N
## 4: N
## 5: N
## 6: N
WCVP.data <- new.backbone(WCVP.data,
taxonID="plant_name_id",
scientificName="taxon_name",
scientificNameAuthorship="taxon_authors",
acceptedNameUsageID = "accepted_plant_name_id",
taxonomicStatus = "taxon_status")
head(WCVP.data)
## taxonID scientificName scientificNameAuthorship
## 1: 195508 Stachys pustulosa Rydb.
## 2: 197585 Stenostomum dichotomum DC.
## 3: 76791 Eugenia scoparia Duthie
## 4: 74373 Eugenia areolata (DC.) Duthie
## 5: 205204 Thymus pallasianus subsp. brachyodon (Borbás) Jalas
## 6: 102745 Isanthus Michx.
## acceptedNameUsageID taxonomicStatus plant_name_id ipni_id taxon_rank
## 1: 195467 Synonym 195508 243233-2 Species
## 2: 197582 Synonym 197585 767122-1 Species
## 3: 199254 Synonym 76791 595920-1 Species
## 4: 200472 Synonym 74373 593644-1 Species
## 5: 204938 Synonym 205204 884387-1 Subspecies
## 6: 208472 Synonym 102745 326216-2 Genus
## taxon_status family genus_hybrid genus species_hybrid species
## 1: Synonym Lamiaceae Stachys pustulosa
## 2: Synonym Rubiaceae Stenostomum dichotomum
## 3: Synonym Myrtaceae Eugenia scoparia
## 4: Synonym Myrtaceae Eugenia areolata
## 5: Synonym Lamiaceae Thymus pallasianus
## 6: Synonym Lamiaceae Isanthus
## infraspecific_rank infraspecies parenthetical_author primary_author
## 1: Rydb.
## 2: DC.
## 3: Duthie
## 4: DC. Duthie
## 5: subsp. brachyodon Borbás Jalas
## 6: Michx.
## publication_author place_of_publication volume_and_page first_published
## 1: Brittonia 1: 95 (1931)
## 2: Prodr. 4: 461 (1830)
## 3: J.D.Hooker Fl. Brit. India 2: 489 (1878)
## 4: J.D.Hooker Fl. Brit. India 2: 490 (1878)
## 5: Bot. J. Linn. Soc. 64: 262 (1971)
## 6: Fl. Bor.-Amer. 2: 3 (1803)
## nomenclatural_remarks geographic_area lifeform_description
## 1:
## 2:
## 3:
## 4:
## 5:
## 6:
## climate_description taxon_name taxon_authors
## 1: Stachys pustulosa Rydb.
## 2: Stenostomum dichotomum DC.
## 3: Eugenia scoparia Duthie
## 4: Eugenia areolata (DC.) Duthie
## 5: Thymus pallasianus subsp. brachyodon (Borbás) Jalas
## 6: Isanthus Michx.
## accepted_plant_name_id basionym_plant_name_id replaced_synonym_author
## 1: 195467 <NA>
## 2: 197582 <NA>
## 3: 199254 <NA>
## 4: 200472 199231
## 5: 204938 204517
## 6: 208472 <NA>
## homotypic_synonym parent_plant_name_id powo_id hybrid_formula reviewed
## 1: <NA> 243233-2 Y
## 2: <NA> 767122-1 Y
## 3: <NA> 595920-1 Y
## 4: <NA> 593644-1 Y
## 5: <NA> 884387-1 Y
## 6: <NA> 326216-2 Y
nrow(WCVP.data)
## [1] 1419265
nrow(WCVP.data[WCVP.data$taxon_rank == "Species", ])
## [1] 1028856
nrow(WCVP.data[WCVP.data$taxon_rank == "Species" & WCVP.data$acceptedNameUsageID == "", ])
## [1] 0
The WFO.match.fuzzyjoin() function works faster when the acceptedNameUsageID field is left blank for records where it is identical to the taxonID.
nrow(WCVP.data[WCVP.data$taxonomicStatus == "Accepted", ])
## [1] 425349
nrow(WCVP.data[WCVP.data$taxonID == WCVP.data$acceptedNameUsageID, ])
## [1] 430633
unique(WCVP.data[WCVP.data$taxonID == WCVP.data$acceptedNameUsageID, "taxonomicStatus"])
## taxonomicStatus
## 1: Accepted
## 2: Artificial Hybrid
## 3: Local Biotype
head(WCVP.data[WCVP.data$taxonID == WCVP.data$acceptedNameUsageID & WCVP.data$taxonomicStatus == "Accepted", ])[, 1:5]
## taxonID scientificName scientificNameAuthorship acceptedNameUsageID
## 1: 425448 Narenga Bor 425448
## 2: 2503628 Malope L. 2503628
## 3: 2545738 Pouzolzia variifolia Friis & Wilmot-Dear 2545738
## 4: 2677268 Blepharis Juss. 2677268
## 5: 2623037 Adonis L. 2623037
## 6: 2646596 Aragoa cupressina Kunth 2646596
## taxonomicStatus
## 1: Accepted
## 2: Accepted
## 3: Accepted
## 4: Accepted
## 5: Accepted
## 6: Accepted
head(WCVP.data[WCVP.data$taxonID == WCVP.data$acceptedNameUsageID & WCVP.data$taxonomicStatus == "Artificial Hybrid", ])[, 1:5]
## taxonID scientificName scientificNameAuthorship acceptedNameUsageID
## 1: 2756546 × Daltadenia Wiehler 2756546
## 2: 372890 × Geoclades A.Chen 372890
## 3: 372411 × Christendoritis J.M.H.Shaw 372411
## 4: 2750311 × Cylindrantha Y.Itô 2750311
## 5: 3293147 × Melara M.H.J.van der Meer 3293147
## 6: 372403 × Constanciaara J.M.H.Shaw 372403
## taxonomicStatus
## 1: Artificial Hybrid
## 2: Artificial Hybrid
## 3: Artificial Hybrid
## 4: Artificial Hybrid
## 5: Artificial Hybrid
## 6: Artificial Hybrid
head(WCVP.data[WCVP.data$taxonID == WCVP.data$acceptedNameUsageID & WCVP.data$taxonomicStatus == "Local Biotype", ])[, 1:5]
## taxonID scientificName scientificNameAuthorship acceptedNameUsageID
## 1: 2976323 Rubus abietinus Sudre 2976323
## 2: 2976769 Rubus amoenus Köhler ex Weihe 2976769
## 3: 2976814 Rubus amplifrons Sudre 2976814
## 4: 2976876 Rubus angustifrons Sudre 2976876
## 5: 2976888 Rubus anisacanthoides Sudre 2976888
## 6: 2977350 Rubus bakonyensis Gáyer 2977350
## taxonomicStatus
## 1: Local Biotype
## 2: Local Biotype
## 3: Local Biotype
## 4: Local Biotype
## 5: Local Biotype
## 6: Local Biotype
WCVP.data[WCVP.data$taxonID == WCVP.data$acceptedNameUsageID, "acceptedNameUsageID"] <- ""
head(WCVP.data[WCVP.data$taxonomicStatus == "Accepted", ])[, 1:5]
## taxonID scientificName scientificNameAuthorship acceptedNameUsageID
## 1: 425448 Narenga Bor
## 2: 2503628 Malope L.
## 3: 2545738 Pouzolzia variifolia Friis & Wilmot-Dear
## 4: 2677268 Blepharis Juss.
## 5: 2623037 Adonis L.
## 6: 2646596 Aragoa cupressina Kunth
## taxonomicStatus
## 1: Accepted
## 2: Accepted
## 3: Accepted
## 4: Accepted
## 5: Accepted
## 6: Accepted
unique(WCVP.data[WCVP.data$taxonomicStatus == "Accepted", "acceptedNameUsageID"])
## acceptedNameUsageID
## 1:
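The blanking step above can be illustrated on a small made-up backbone. The names and IDs below are invented for the sketch, and a plain data frame is used instead of a data.table.

```r
# Made-up backbone with three names; IDs are illustrative only
toy <- data.frame(taxonID=c("1", "2", "3"),
                  scientificName=c("Aus bus", "Aus cus", "Aus dus"),
                  acceptedNameUsageID=c("1", "1", "3"),
                  stringsAsFactors=FALSE)

# Records that point to themselves get a blank accepted ID
self.ref <- toy$taxonID == toy$acceptedNameUsageID
toy$acceptedNameUsageID[self.ref] <- ""
toy
```

After this step only the synonym (taxonID 2) still points to another record, which mirrors the convention of the WFO backbone used earlier, where current names have an empty acceptedNameUsageID.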
We can now use scripts similar to those used above with the taxonomic backbone of World Flora Online.
# assign each row of GTS to one of 20 consecutive groups
cuts <- cut(c(1:nrow(GTS)), breaks=20, labels=FALSE)
cut.i <- sort(unique(cuts))
start.time <- Sys.time()
for (i in seq_along(cut.i)) {
    cat(paste("Cut: ", i, "\n"))
    # match one group at a time against the WCVP backbone
    GTS.i <- WFO.one(WFO.match.fuzzyjoin(spec.data=GTS[cuts==cut.i[i], ],
                                         WFO.data=WCVP.data,
                                         spec.name="TaxonName",
                                         Authorship="Author",
                                         fuzzydist.max=3),
                     verbose=FALSE)
    # append the results of this group to those of the previous groups
    if (i==1) {
        GTS.WCVP <- GTS.i
    }else{
        GTS.WCVP <- rbind(GTS.WCVP, GTS.i)
    }
}
## Cut: 1
## Checking for fuzzy matches for 16 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 2
## Checking for fuzzy matches for 17 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 3
## Checking for fuzzy matches for 16 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 4
## Checking for fuzzy matches for 19 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 5
## Checking for fuzzy matches for 14 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 6
## Checking for fuzzy matches for 16 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 7
## Checking for fuzzy matches for 21 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 8
## Checking for fuzzy matches for 10 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 9
## Checking for fuzzy matches for 22 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 10
## Checking for fuzzy matches for 19 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 11
## Checking for fuzzy matches for 19 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 12
## Checking for fuzzy matches for 83 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 13
## Checking for fuzzy matches for 16 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 14
## Checking for fuzzy matches for 16 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 15
## Checking for fuzzy matches for 22 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 16
## Checking for fuzzy matches for 35 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 17
## Checking for fuzzy matches for 31 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 18
## Checking for fuzzy matches for 23 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 19
## Checking for fuzzy matches for 18 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Cut: 20
## Checking for fuzzy matches for 18 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
end.time <- Sys.time()
end.time - start.time # 55.06009 mins
## Time difference of 57.85401 mins
The results can be subdivided into species that could not be matched, species that could be directly matched and species with fuzzy matches.
Considerably more species could not be matched than with WFO (47 versus 8), whereas fewer species had fuzzy matches (404 versus 538).
# not matched
nrow(GTS.WCVP[GTS.WCVP$Matched == FALSE, ])
## [1] 47
# directly matched
nrow(GTS.WCVP[GTS.WCVP$Matched == TRUE & GTS.WCVP$Fuzzy == FALSE, ])
## [1] 57471
GTS.fuzzy2 <- GTS.WCVP[GTS.WCVP$Fuzzy == TRUE, ]
nrow(GTS.fuzzy2)
## [1] 404
nrow(GTS.fuzzy2[GTS.fuzzy2$Fuzzy.dist == 1, ])
## [1] 220
nrow(GTS.fuzzy2[GTS.fuzzy2$Fuzzy.dist == 2, ])
## [1] 157
nrow(GTS.fuzzy2[GTS.fuzzy2$Fuzzy.dist == 3, ])
## [1] 27
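The counts for both backbones can be placed side by side to compare them and to confirm that they are internally consistent. All numbers below are copied from the output above; the data frame and its column names are introduced only for this sketch.

```r
# Counts copied from the WFO and WCVP output shown above
counts <- data.frame(
    backbone=c("WFO", "WCVP"),
    unmatched=c(8, 47),
    direct=c(57376, 57471),
    fuzzy=c(538, 404),
    fuzzy.dist1=c(369, 220),
    fuzzy.dist2=c(160, 157),
    fuzzy.dist3=c(9, 27))

# Each row should add up to the 57922 names in GlobalTreeSearch
counts$total <- counts$unmatched + counts$direct + counts$fuzzy

# The three fuzzy distances should add up to the fuzzy total
counts$dist.sum <- counts$fuzzy.dist1 + counts$fuzzy.dist2 + counts$fuzzy.dist3
counts
```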
The WFO.acceptable.match() function identifies fuzzy matches where the matched name differs from the submitted name only in gender or only in vowels.
accept.var <- WFO.acceptable.match(GTS.fuzzy2,
spec.name="TaxonName",
no.vowels=TRUE)
GTS.fuzzy2 <- data.frame(GTS.fuzzy2,
acceptable=accept.var)
Many of the fuzzy matches can be accepted.
nrow(GTS.fuzzy2[GTS.fuzzy2$acceptable == TRUE, ])
## [1] 269
head(GTS.fuzzy2[GTS.fuzzy2$acceptable == TRUE,
c("TaxonName", "scientificName", "Old.name")])
## TaxonName scientificName Old.name
## 14 Abarema cochliocarpos Abarema cochliacarpos
## 417 Acacia macdonnellensis Acacia macdonnelliensis
## 1013 Acropogon calcicolus Acropogon calcicola
## 1030 Acropogon sageniifolius Acropogon sageniifolia
## 1033 Acropogon schumannianus Acropogon schumanniana
## 1418 Aegiphila valerioi Aegiphila valerii
tail(GTS.fuzzy2[GTS.fuzzy2$acceptable == TRUE,
c("TaxonName", "scientificName", "Old.name")])
## TaxonName scientificName Old.name
## 157319 Vochysia antioquiae Vochysia antioquia
## 193220 Weinmannia trianaea Weinmannia trianae
## 214618 Wunderlichia crulsiana Wunderlichia cruelsiana
## 265220 Yucca treculeana Yucca treculiana
## 265617 Zabelia tyaihyonii Zabelia tyaihyoni
## 293319 Zygia macbridii Zygia macbridei
Many of the remaining fuzzy matches could also be accepted, but a manual check is definitely required.
nrow(GTS.fuzzy2[GTS.fuzzy2$acceptable == FALSE, ])
## [1] 135
GTS.fuzzy2[GTS.fuzzy2$acceptable == FALSE & GTS.fuzzy2$Fuzzy.dist==1,
c("TaxonName", "scientificName", "Old.name")]
## TaxonName scientificName
## 803 Acer osmastonii Acer ×osmastonii
## 1312 Adinandra milletii Adinandra millettii
## 2379 Alnus mandshurica Alnus mandschurica
## 3731 Antirhea novobritanniensis Achilleanthus novobrittaniensis
## 6731 Aralia castanopsiscola Aralia castanopsicola
## 3412 Betula dahurica Betula davurica
## 5412 Betula murrayana Betula ×purpusii
## 16552 Calodendrum eichii Calodendrum eickii
## 25862 Carpinus turczaninowii Carpinus turczaninovii
## 16253 Citharexylum mocinnoi Citharexylum mocinoi
## 16714 Citrus deliciosa Citrus ×aurantium f. deliciosa
## 10053 Corymbia paractia Corymbia ×paractia
## 20294 Croton sarocarpus Croton sarcocarpus
## 12475 Dicoryphe buddleoides Dicoryphe buddlejoides
## 23866 Eucalyptus abdita Eucalyptus ×abdita
## 241110 Eucalyptus angularis Eucalyptus ×angularis
## 24796 Eucalyptus bunyip Eucalyptus ×bunyip
## 24926 Eucalyptus calyerup Eucalyptus ×calyerup
## 251010 Eucalyptus carolaniae Eucalyptus ×carolaniae
## 251110 Eucalyptus castrensis Eucalyptus ×castrensis
## 25585 Eucalyptus crispata Eucalyptus ×crispata
## 262110 Eucalyptus erectifolia Eucalyptus ×erectifolia
## 26956 Eucalyptus hawkeri Eucalyptus ×silvestris
## 27086 Eucalyptus impensa Eucalyptus ×impensa
## 27466 Eucalyptus lateritica Eucalyptus ×lateritica
## 27496 Eucalyptus leprophloia Eucalyptus ×leprophloia
## 28996 Eucalyptus phoenix Eucalyptus ×phoenix
## 29296 Eucalyptus pruiniramis Eucalyptus ×pruiniramis
## 3620 Eucalyptus silvestris Eucalyptus ×silvestris
## 6677 Eugenia marchiana Eugenia marshiana
## 14986 Eurya kueichowensis Eurya kueichouensis
## 29853 Freziera monzonensis Freziera monsonensis
## 4437 Garcinia xipshuanbannaensis Garcinia xishuanbannaensis
## 11288 Gmelina leichardtii Gmelina leichhardtii
## 148111 Graffenrieda conostegioides Graffenrieda comostegioides
## 21597 Guettarda prenleloupii Guettarda preneloupii
## 11059 Hopea tenuivervula Hopea tenuinervula
## 18999 Ilex mathewsii Ilex ovalis
## 21689 Inga andersonii Inga ×andersonii
## 243113 Inhambanella henriquezii Inhambanella henriquesii
## 26949 Ixora longhanensis Ixora longshanensis
## 13438 Kaunia camataguiensis Kaunia camataquiensis
## 87310 Lebrunia bushaie Lebrunia busbaie
## 158710 Linospadix monostachyos Linospadix monostachyus
## 83011 Magnolia pilocarpa Magnolia ×pilocarpa
## 95510 Majidea forsteri Majidea fosteri
## 112911 Malus floribunda Malus ×floribunda
## 250610 Mespilus canescens Crataegus ×canescens
## 261711 Mezoneuron kavaiense Mezoneuron kauaiense
## 282910 Miconia doniana Miconia doriana
## 84214 Moldenhawera luschnathiana Moldenhawera lushnathiana
## 142812 Mouriri retentipetala Mouriri retenipetala
## 75013 Ocotea kostermanniana Ocotea kostermansiana
## 182312 Ouratea littoralis Ouratea litoralis
## 23680 Pauldopia ghorta Pauldopia ghonta
## 48913 Peltophorum dasyrachis Peltophorum dasyrhachis
## 87914 Phellocalyx vollescenii Phellocalyx vollesenii
## 197118 Pisonia tahitensis Ceodes taitensis
## 259512 Plinia spirito-santensis Plinia spirito-sanctensis
## 261813 Plumeria trouinensis Plumeria ×stenopetala
## 22690 Populus yuana Populus wuana
## 40715 Sideroxylon mirmulans Sideroxylon mirmulano
## 111317 Sorbus arvonensis Aria avonensis
## 136717 Sorbus yondeensis Griffitharia yongdeensis
## 170219 Stenostomum albobruneum Stenostomum albobrunneum
## 219917 Styrax obassis Styrax obassia
## 229416 Swartzia brachyrachis Swartzia brachyrhachis
## 246917 Swintonia schwenckii Swintonia schwenkii
## 195018 Ternstroemia conicocarpa Ternstroemia coniocarpa
## 208418 Tetralix moanensis Tetralix moaensis
## Old.name
## 803
## 1312
## 2379
## 3731 Antirhea novobrittanniensis
## 6731
## 3412
## 5412 Betula ×murrayana
## 16552
## 25862
## 16253
## 16714 Citrus ×deliciosa
## 10053
## 20294
## 12475
## 23866
## 241110
## 24796
## 24926
## 251010
## 251110
## 25585
## 262110
## 26956 Eucalyptus ×hawkeri
## 27086
## 27466
## 27496
## 28996
## 29296
## 3620
## 6677
## 14986
## 29853
## 4437
## 11288
## 148111
## 21597
## 11059
## 18999 Ilex matthewsii
## 21689
## 243113
## 26949
## 13438
## 87310
## 158710
## 83011
## 95510
## 112911
## 250610 Mespilus ×canescens
## 261711
## 282910
## 84214
## 142812
## 75013
## 182312
## 23680
## 48913
## 87914
## 197118 Pisonia taitensis
## 259512
## 261813 Plumeria ×trouinensis
## 22690
## 40715
## 111317 Sorbus avonensis
## 136717 Sorbus yongdeensis
## 170219
## 219917
## 229416
## 246917
## 195018
## 208418
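Many of the matches at distance 1 differ from the submitted name only by the hybrid sign (×). A manual check could start by flagging these cases. The sketch below uses a hypothetical helper, hybrid.only(), and a small data frame with names copied from the listing above; whether a match that differs only by the hybrid sign should be accepted remains a taxonomic decision.

```r
# Hypothetical helper (not part of WorldFlora): TRUE when two names are
# identical once the hybrid sign (written here as the escape \u00d7) is removed.
hybrid.only <- function(x, y) {
  drop.sign <- function(s) gsub("\u00d7", "", s, fixed = TRUE)
  drop.sign(x) == drop.sign(y)
}

# Names copied from the listing of matches with Fuzzy.dist of 1
review <- data.frame(
  TaxonName = c("Acer osmastonii", "Eucalyptus abdita", "Betula dahurica"),
  scientificName = c("Acer \u00d7osmastonii", "Eucalyptus \u00d7abdita",
                     "Betula davurica"),
  stringsAsFactors = FALSE
)

review$hybrid.only <- hybrid.only(review$TaxonName, review$scientificName)
review
```

Where the fuzzy match was made with a synonym (a non-empty Old.name, as for Eucalyptus hawkeri), the comparison would need to use Old.name instead of scientificName.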
GTS.fuzzy2[GTS.fuzzy2$acceptable == FALSE & GTS.fuzzy2$Fuzzy.dist == 2,
           c("TaxonName", "scientificName", "Old.name")]
## TaxonName scientificName
## 5681 Aquilaria banaensis Aquilaria banaense
## 22442 Canarium multinervis Canarium multinerve
## 22802 Canarium subtilis Canarium subtile
## 14003 Cinnamomum culilaban Cinnamomum culitlawan
## 11845 Dichaetanthera tsaratananensis Dichaetanthera tsaratanensis
## 13656 Empleurum unicapsulare Empleurum unicapsularis
## 23825 Euadenia trifoliata Crateva monticola
## 8129 Eugenia poroensis Eugenia pardensis
## 9657 Eugenia tabouensis Eugenia gabonensis
## 199111 Ficus binnendijkii Ficus binnendykii
## 165210 Grewia mahafaliensis Grewia sahafariensis
## 86211 Homalium brachystylum Homalium brachystylis
## 21368 Indigofera ammoxylum Indigofera ammoxylon
## 275210 Ixora regalis Ixora rivalis
## 38510 Koanophyllon panamense Koanophyllon panamensis
## 19959 Litsea banaensis Litsea baviensis
## 247114 Lorostemon negrense Lorostemon negrensis
## 257710 Lunania sauvalii Lunania sauvallei
## 8740 Machaerium quinatum Machaerium lunatum
## 107212 Malouetia cuatrecasatis Malouetia cuatrecasasatis
## 188011 Melanoxylon brauna Melanoxylum brauna
## 218411 Memecylon arnhemensis Memecylon arnhemense
## 284011 Miconia elaeodendron Miconia elaeodendrum
## 109116 Monteverdia gonoclada Maytenus gonoclados
## 113812 Monteverdia schummaniana Maytenus schumanniana
## 290010 Neolitsea chui Neolitsea chunii
## 126214 Opuntia monacantha Opuntia mesacantha
## 193613 Piranhea trifoliata Piranhea trifoliolata
## 232914 Platycelyphium voensis Platycelyphium voense
## 234119 Pycnandra viridiflora Solanum viarum
## 289515 Quercus mangdenensis Quercus macrocarpa var. depressa
## 290515 Quercus mexiae Quercus gambelii
## 79714 Rhododendron suoilenhensis Rhododendron suoilenhense
## 36318 Syzygium kanneliyensis Syzygium kanneliyense
## 51221 Syzygium munronii Syzygium munroi
## 182616 Terminalia namorokensis Terminalia narnorokensis
## 210021 Tetrapterocarpon septentrionalis Tetrapterocarpon septentrionale
## 212418 Wrightia flavorosea Wrightia flavidorosea
## Old.name
## 5681
## 22442
## 22802
## 14003
## 11845
## 13656
## 23825 Euadenia trifoliolata
## 8129
## 9657
## 199111
## 165210
## 86211
## 21368
## 275210
## 38510
## 19959
## 247114
## 257710
## 8740
## 107212
## 188011
## 218411
## 284011
## 109116 Monteverdia gonoclados
## 113812 Monteverdia schumanniana
## 290010
## 126214
## 193613
## 232914
## 234119 Pionandra viridiflora
## 289515 Quercus mandanensis
## 290515 Quercus media
## 79714
## 36318
## 51221
## 182616
## 210021
## 212418
GTS.fuzzy2[GTS.fuzzy2$acceptable == FALSE & GTS.fuzzy2$Fuzzy.dist == 3,
           c("TaxonName", "scientificName", "Old.name")]
## TaxonName scientificName
## 1106 Actinodaphne leiantha Actinodaphne myriantha
## 18601 Ayenia cuatrecasae Ayenia cuatrecasasii
## 28201 Bembicia uniflora Remijia uniflora
## 12382 Bursera zapoteca Bursera aptera
## 23362 Capparidastrum cuatrecasanum Morisonia cuatrecasasiana
## 28704 Cybianthus pittieri Lycianthes multiflora
## 10719 Drypetes louisii Drypetes dussii
## 13237 Euphorbia neospinescens Euphorbia cuneata subsp. spinescens
## 15798 Grewia androyensis Grewia angolensis
## 15878 Grewia barorum Grewia baronii
## 16588 Grewia milleri Cattleya milleri
## 19378 Guatteria esperanzae Guatteria esmeraldae
## 20889 Litsea honbaensis Litsea tannaensis
## 65011 Magnolia cusucoensis Magnolia chocoensis
## 65411 Magnolia darioi Magnolia dandyi
## 89410 Magnolia talpana Magnolia caveana
## 90910 Magnolia veliziana Magnolia boliviana
## 274811 Miconia castaneiflora Miconia castaneifolia
## 84715 Protium aidanum Protium bahianum
## 30381 Quercus sontraensis Quercus honbaensis
## 100616 Rinorea amietii Rinorea afzelii
## 103516 Rinorea dewildei Rinorea dewitii
## 104316 Rinorea faurei Rinorea brachypetala
## 132415 Ruagea beckii Jungia beckii
## 133415 Ruagea obovata Kunzea obovata
## 212315 Saurauia cuatrecasana Saurauia cuatrecasasiana
## 206518 Tetrachyron orizabense Tetrachyron orizabaensis
## Old.name
## 1106
## 18601
## 28201
## 12382
## 23362 Capparidastrum cuatrecasasianum
## 28704 Lycianthes pittieri
## 10719
## 13237 Euphorbia spinescens
## 15798
## 15878
## 16588 Laelia milleri
## 19378
## 20889
## 65011
## 65411
## 89410
## 90910
## 274811
## 84715
## 30381
## 100616
## 103516
## 104316 Rinorea dawei
## 132415
## 133415
## 212315
## 206518
Only a very small number of species (eight) could not be matched with WFO at all (but note that not all fuzzy matches are acceptable).
As it happens, all of these species except one were matched in the WCVP.
WFO.unmatched <- GTS.WFO[GTS.WFO$Matched == FALSE, c("TaxonName", "Author")]
WFO.unmatched
## TaxonName Author
## 14608 Licuala heatubunii Barfod & W.J.Baker
## 261215 Nahuatlea smithii (B.L.Rob. & Greenm.) V.A.Funk
## 112813 Olearia traversiorum (F.Muell.) Hook.f.
## 227015 Pterospermum wilkieanum Doweld
## 31119 Ravenia swartziana (Miers) Fawc. & Rendle
## 21900 Trilepisium gymnandrum D.C.
## 119618 Viburnum inopinatum W. G. Craib
## 269718 Zanthoxylum chuquisaquense Reynel
GTS.WCVP[GTS.WCVP$TaxonName %in% WFO.unmatched$TaxonName,
         c("TaxonName", "scientificName",
           "Author", "scientificNameAuthorship")]
## TaxonName scientificName
## 143510 Licuala heatubunii <NA>
## 256712 Nahuatlea smithii Nahuatlea smithii
## 111313 Olearia traversiorum Olearia traversiorum
## 221217 Pterospermum wilkieanum Pterospermum wilkieanum
## 31020 Ravenia swartziana Ravenia swartziana
## 22000 Trilepisium gymnandrum Trilepisium gymnandrum
## 117819 Viburnum inopinatum Viburnum inopinatum
## 269017 Zanthoxylum chuquisaquense Zanthoxylum chuquisaquense
## Author scientificNameAuthorship
## 143510 Barfod & W.J.Baker <NA>
## 256712 (B.L.Rob. & Greenm.) V.A.Funk (B.L.Rob. & Greenm.) V.A.Funk
## 111313 (F.Muell.) Hook.f. (F.Muell.) Hook.f.
## 221217 Doweld Doweld
## 31020 (Miers) Fawc. & Rendle (Miers) Fawc. & Rendle
## 22000 D.C. (Baker) J.Gerlach
## 117819 W. G. Craib Craib
## 269017 Reynel Reynel
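The output above also shows that a matched name does not guarantee matching authorship. A simple way to flag such cases is to compare the two author columns directly, as in the sketch below. The data frame authors is a hypothetical subset typed in from the output above, and the exact string comparison is deliberately strict, so that differences in initials are flagged as well.

```r
# Hypothetical subset, typed in from the output shown above
authors <- data.frame(
  TaxonName = c("Pterospermum wilkieanum", "Trilepisium gymnandrum",
                "Viburnum inopinatum"),
  Author = c("Doweld", "D.C.", "W. G. Craib"),
  scientificNameAuthorship = c("Doweld", "(Baker) J.Gerlach", "Craib"),
  stringsAsFactors = FALSE
)

# Strict comparison: any difference in the author string is flagged
authors$same.author <- authors$Author == authors$scientificNameAuthorship
authors[!authors$same.author, ]
```

This flags Trilepisium gymnandrum and Viburnum inopinatum. For Viburnum inopinatum the difference is only in the initials of the same author, whereas for Trilepisium gymnandrum the authorship is clearly different and deserves a closer look.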
The one species that was matched in neither WFO nor WCVP is a palm species that was described in 2022, Licuala heatubunii.
This publication grew partly out of ongoing work in a Darwin Initiative project (DAREX001) that is developing a Global Biodiversity Standard for tree planting. The GlobalUsefulNativeTrees and Tree Globally Observed Environmental Ranges databases were recently released by this project. Once the Global Biodiversity Standard scheme becomes operational, tree planting projects can use scripts such as the ones shown here to cross-check their species lists before applying.
sessionInfo()
## R version 4.2.1 (2022-06-23 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19045)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United Kingdom.utf8
## [2] LC_CTYPE=English_United Kingdom.utf8
## [3] LC_MONETARY=English_United Kingdom.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United Kingdom.utf8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] stringr_1.4.1 data.table_1.14.2 WorldFlora_1.13-2
##
## loaded via a namespace (and not attached):
## [1] bslib_0.4.0 compiler_4.2.1 pillar_1.9.0 jquerylib_0.1.4
## [5] tools_4.2.1 digest_0.6.29 jsonlite_1.8.0 evaluate_0.16
## [9] lifecycle_1.0.3 tibble_3.2.1 pkgconfig_2.0.3 rlang_1.1.1
## [13] cli_3.4.1 rstudioapi_0.14 yaml_2.3.5 parallel_4.2.1
## [17] fuzzyjoin_0.1.6 xfun_0.33 fastmap_1.1.1 withr_2.5.0
## [21] dplyr_1.1.2 knitr_1.40 generics_0.1.3 vctrs_0.6.3
## [25] sass_0.4.2 tidyselect_1.2.0 glue_1.6.2 R6_2.5.1
## [29] fansi_1.0.3 rmarkdown_2.16 purrr_0.3.4 tidyr_1.2.1
## [33] magrittr_2.0.3 htmltools_0.5.6 stringdist_0.9.10 utf8_1.2.2
## [37] stringi_1.7.8 cachem_1.0.6