library(WorldFlora)
library(data.table)
library(stringr)
In previous posts (see here and here), I showed how the WorldFlora package can be used to standardize names from GlobalTreeSearch with the taxonomic backbone data from World Flora Online or the World Checklist of Vascular Plants. (My original inspiration to develop WorldFlora came while writing R scripts to update the Agroforestry Species Switchboard; a new release of the ‘Switchboard’ was made available in 2022.)
Here I show how standardization of the GlobalTreeSearch names can be done more quickly via the new function WFO.match.fuzzyjoin. The new function comes at a cost: it cannot process larger data sets because of memory limits (on my machine, the error was Error: cannot allocate vector of size 6.9 Gb). This problem can be circumvented relatively easily by splitting the data into several smaller data sets, as shown here.
I also show how some of the manual checks that follow fuzzy matching can be facilitated by a customized function. One example is accepting fuzzy matches for species names that differ only in the endings -a, -us or -um (these endings indicate different grammatical genders; see the International Code of Botanical Nomenclature for various examples).
I used the latest available version of the World Flora Online taxonomic backbone, v.2022.12. The file had been downloaded earlier, and its location had been provided to WFO.download via its argument ‘WFO.file’.
In the taxonomic backbone, World Flora Online lists about half a million current species names.
WFO.remember()
## Data sourced from: E:\Roeland\WorldFloraOnline\WFO 2022\classification.txt (Sun Jan 8 08:49:17 2023)
## Reading WFO data
## Warning in data.table::fread(WFO.file1, encoding = "UTF-8"): Found and resolved
## improper quoting out-of-sample. First healed line 117: <<wfo-0000000117
## GCC-1AF25765-5E36-4AED-8F64-12BB4F58DEC0 Hieracium onosmoides subsp.
## sphaerianthum SUBSPECIES wfo-0000034880 (Arv.-Touv.) Zahn Asteraceae Hieracium
## onosmoides sphaerianthum subsp. "Zahn, in Engler, Pflanzenr. 82. 1923." 1676
## 1923 ACCEPTED wfo-0000118008 More details could be found in <a href=http://
## www.theplantlist.org/tpl1.1/record/gcc-10011 >The Plant List v.1.1.</a>
## Originally in <a href=http://www.theplantlist.org/tpl/record/gcc-10011 >The
## Plant List v.1.0</a> 2012-02->>. If the fields are not quoted (e.g. field
## separator does not appear within any field), try quote="" to avoid this warning.
## The WFO data is now available from WFO.data
nrow(WFO.data)
## [1] 1425061
nrow(WFO.data[WFO.data$taxonRank == "SPECIES", ])
## [1] 1096665
nrow(WFO.data[WFO.data$taxonRank == "SPECIES" & WFO.data$acceptedNameUsageID == "", ])
## [1] 501257
The downloadable complete list of tree species (version 1.6) was obtained from Global Tree Search. The list includes close to 60,000 species names (roughly 12 percent of the current species names in World Flora Online).
# GTS.file <- choose.files()
GTS.file <- "E:\\Roeland\\WorldFloraOnline\\WFO 2022\\global_tree_search_trees_1_6.csv"
GTS <- fread(GTS.file, header=TRUE, encoding="UTF-8")
head(GTS)
## TaxonName Author V3 V4
## 1: Abarema abbottii (Rose & Leonard) Barneby & J.W.Grimes NA NA
## 2: Abarema acreana (J.F.Macbr.) L.Rico NA NA
## 3: Abarema adenophora (Ducke) Barneby & J.W.Grimes NA NA
## 4: Abarema alexandri (Urb.) Barneby & J.W.Grimes NA NA
## 5: Abarema asplenifolia (Griseb.) Barneby & J.W.Grimes NA NA
## 6: Abarema auriculata (Benth.) Barneby & J.W.Grimes NA NA
## Citation: GlobalTreeSearch online database. Botanic Gardens Conservation International. Richmond, U.K. Available at www.bgci.org. Accessed on DD/MM/YYYY. DOI: 10.13140/RG.2.2.34206.61761
## 1: DOI: 10.13140/RG.2.2.34206.61761
## 2:
## 3:
## 4:
## 5:
## 6:
GTS <- GTS[, c("TaxonName", "Author")]
nrow(GTS)
## [1] 57958
Everything is in place now to start the matching process. To avoid a crash of the new WFO.match.fuzzyjoin function, however, the data needs to be split. This can be done relatively easily via the cut function.
It still takes over an hour for the matching to be completed, but this is considerably faster than previously.
# Assign each row of GTS to one of 10 roughly equal-sized chunks
cuts <- cut(seq_len(nrow(GTS)), breaks=10, labels=FALSE)
cut.i <- sort(unique(cuts))
start.time <- Sys.time()
for (i in seq_along(cut.i)) {
  cat(paste("Cut: ", i, "\n"))
  # Match one chunk, then reduce to one match per submitted name
  GTS.i <- WFO.one(WFO.match.fuzzyjoin(spec.data=GTS[cuts==cut.i[i], ],
                                       WFO.data=WFO.data,
                                       spec.name="TaxonName",
                                       Authorship="Author",
                                       fuzzydist.max=3),
                   verbose=FALSE)
  # Append the result for this chunk to the combined result
  if (i==1) {
    GTS.WFO <- GTS.i
  }else{
    GTS.WFO <- rbind(GTS.WFO, GTS.i)
  }
}
## Cut: 1
## Checking for fuzzy matches for 113 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 2
## Checking for fuzzy matches for 94 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 3
## Checking for fuzzy matches for 93 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 4
## Checking for fuzzy matches for 99 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 5
## Checking for fuzzy matches for 104 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 6
## Checking for fuzzy matches for 137 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 7
## Checking for fuzzy matches for 245 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 8
## Checking for fuzzy matches for 154 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 9
## Checking for fuzzy matches for 179 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 10
## Checking for fuzzy matches for 146 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
end.time <- Sys.time()
end.time - start.time
## Time difference of 1.22014 hours
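How cut distributes the rows over the chunks can be checked on a small example. The following is a minimal sketch in base R with a toy row count; the function process.chunk is a hypothetical placeholder that stands in for the matching call, and collecting the pieces in a list before binding them once is an alternative to growing the result with rbind inside the loop.

```r
# Toy illustration: 25 rows split into 5 chunks (the post uses 10 chunks)
n <- 25
cuts <- cut(seq_len(n), breaks = 5, labels = FALSE)
chunk.sizes <- table(cuts)
print(chunk.sizes)

# Hypothetical placeholder for the per-chunk work (not a WorldFlora function)
process.chunk <- function(idx) {
  data.frame(row = idx, chunk = cuts[idx])
}

# Process every chunk, keep the pieces in a list, then bind them once
pieces <- lapply(split(seq_len(n), cuts), process.chunk)
combined <- do.call(rbind, pieces)
print(nrow(combined))
```

Because split returns the chunks in order, the combined result keeps the row order of the original data.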
The results can be subdivided into species that could not be matched, species that could be directly matched and species with fuzzy matches.
# not matched
nrow(GTS.WFO[GTS.WFO$Matched == FALSE, ])
## [1] 410
# directly matched
nrow(GTS.WFO[GTS.WFO$Matched == TRUE & GTS.WFO$Fuzzy == FALSE, ])
## [1] 56594
GTS.fuzzy <- GTS.WFO[GTS.WFO$Fuzzy == TRUE, ]
nrow(GTS.fuzzy)
## [1] 954
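The three groups can also be tabulated in one step. This is a minimal sketch that uses an invented toy data frame in place of GTS.WFO; only the column names Matched and Fuzzy are taken from the output above, and the rows are made up for illustration.

```r
# Toy stand-in for GTS.WFO; the rows are invented for illustration
toy <- data.frame(
  TaxonName = c("Species a", "Species b", "Species c", "Species d"),
  Matched   = c(TRUE, TRUE, TRUE, FALSE),
  Fuzzy     = c(FALSE, TRUE, FALSE, FALSE)
)

# Label each record with one of the three groups
status <- ifelse(!toy$Matched, "not matched",
                 ifelse(toy$Fuzzy, "fuzzy match", "direct match"))
counts <- table(status)
print(counts)
```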
About 46 percent of the species with a fuzzy match had a matching distance of 1, indicating a difference of only one character.
nrow(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 1, ])
## [1] 439
head(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 1,
c("TaxonName", "scientificName", "Old.name", "taxonID")][1:30, ])
## TaxonName scientificName Old.name
## 40 Abarema microcalyx Abarema microcaly
## 252 Acacia cretacea Acacia creatacea
## 1319 Adinandra macquilingensis Adinandra maquilingensis
## 1396 Aegiphila luschnathii Aegiphila luschnatii
## 1456 Aeschynomene burttii Aeschynomene burttiie
## 1460 Aeschynomene pararubrofarinacea Aeschynomene pararuhrofarinacea
## taxonID
## 40 wfo-0000194017
## 252 wfo-0000201128
## 1319 wfo-0000520935
## 1396 wfo-0000811926
## 1456 wfo-0000173135
## 1460 wfo-0000173772
tail(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 1,
c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name taxonID
## 54807 Xylopia maccreae Xylopia maccreai wfo-0000428955
## 554113 Xylopia subdehiscens Xylopia sub-dehiscens wfo-0000428719
## 55879 Xylosma kaalaensis Xylosma kaalensis wfo-0001063072
## 56697 Zabelia tyaihyonii Zabelia tyaihyoni wfo-0000430178
## 56798 Zanthoxylum amapaense Zanthoxylum amapense wfo-0000430332
## 596211 Zygocarpum caeruleum Zygocarpum coeruleum wfo-0000430418
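A distance of 1 can arise from a deleted, an inserted or a substituted character. As a rough illustration, and assuming that Fuzzy.dist behaves like a Levenshtein-type edit distance (an assumption, since the metric is not stated here), the base R function adist gives a distance of 1 for three of the name pairs listed above:

```r
# Name pairs taken from the distance-1 listings above
submitted <- c("Abarema microcalyx",    # one character deleted in the match
               "Xylopia subdehiscens",  # one hyphen inserted in the match
               "Zygocarpum caeruleum")  # one character substituted
candidate <- c("Abarema microcaly",
               "Xylopia sub-dehiscens",
               "Zygocarpum coeruleum")

# adist() returns a matrix of all combinations; the diagonal holds the pairs
d <- diag(adist(submitted, candidate))
print(d)  # 1 1 1
```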
Just under 30 percent of the species with a fuzzy match had a distance of 2.
nrow(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 2, ])
## [1] 276
head(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 2,
c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name taxonID
## 1028 Acropogon calcicolus Acropogon calcicola wfo-0000506268
## 1210 Adelobotrys macranthus Adelobotrys macrantha wfo-0001080704
## 1721 Aidia congesta Aidia congestum wfo-0000931235
## 1743 Ailanthus excelsa Ailanthus excelsus wfo-0000524612
## 2770 Amphitecna kennedyae Amphitecna kennedyi wfo-0000780939
## 2954 Aniba canelilla Aniba canellila wfo-0000536813
tail(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 2,
c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name
## 404113 Vepris occidentalis Rotala ramosior Peplis occidentalis
## 42758 Villaria purpurea Viscaria purpurea
## 43098 Virola marleneae Virola marlenei
## 45569 Vitex urbanii Teijsmanniodendron ahernianum Vitex curranii
## 461112 Vochysia condorensis Vochysia guatemalensis Vochysia hondurensis
## 52259 Xanthophyllum laeve Xanthophyllum laevis
## taxonID
## 404113 wfo-0000404304
## 42758 wfo-0000422656
## 43098 wfo-0001085217
## 45569 wfo-0000321239
## 461112 wfo-0001146187
## 52259 wfo-0000428598
About a quarter of the species with a fuzzy match had a distance of 3.
nrow(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 3, ])
## [1] 239
GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 3,
c("TaxonName", "scientificName", "Old.name", "taxonID")]
## TaxonName scientificName
## 773 Acer iranicum Acer creticum
## 1783 Aiouea leptophylla Ocotea leptophylla
## 1849 Alangium denudatum Allium denudatum
## 1854 Alangium gracile Eryngium gracile
## 2418 Alnus lusitanica Prunus lusitanica
## 2430 Alnus rohlenae Rubus rohlenae
## 3186 Annona oleifolia Annona cordifolia
## 3772 Arawakia lanceolata Minuartia rupestris subsp. clementei
## 3776 Arawakia macrocarpa Minuartia macrocarpa
## 3783 Arawakia parvifolia Arenaria parvifolia
## 4314 Artocarpus montanus Gonocarpus montanus
## 4418 Aspidosperma huberianum Aspidosperma tomentosum
## 5318 Barringtonia magnifolia Barringtonia pinnifolia
## 5528 Beaucarnea olsonii Nolina watsonii
## 6931 Bribria apiculata Beyeria apiculata
## 6961 Bribria crenata Zyrphelis crenata
## 12311 Bursera zapoteca Bursera aptera
## 21331 Campomanesia sepalifolia Campomanesia pubescens
## 23741 Capparidastrum cuatrecasanum Morisonia cuatrecasasiana
## 24631 Caragana gobica Caragana sinica
## 27461 Casearia americana Discaria americana
## 28211 Casearia kigeri Casearia engleri
## 29271 Casearia yucatanensis Senna pallida var. gaumeri
## 43921 Cinnamomum austrosinense Cinnamomum austro-sinensis
## 57391 Coccoloba tunii Coccoloba buchii
## 8103 Cordia megiae Cordia mexiana
## 13772 Coussarea mexiae Coussarea mexicana
## 16292 Craterispermum capitatum Craterispermum aristatum
## 20362 Croton nirguensis Croton dinghuensis
## 20622 Croton perstipulatus Croton stipulatus
## 32192 Cyrtandra kinhoi Cyrtandra keithii
## 32262 Cyrtandra longistamina Cryptandra longistaminea
## 41662 Desmopsis wendtii Ipomopsis wendtii
## 45222 Diospyros agnitser Diospyros anitae
## 50902 Diospyros robolot Diospyros discolor
## 51292 Diospyros sennenii Diospyros senensis
## 51632 Diospyros subargentea Diospyros argentea
## 6080 Drypetes louisii Drypetes dussii
## 24313 Durio connatus Durio carinatus
## 7015 Elaeocarpus avium Elaeocarpus badius
## 13973 Endiandra teschneri Endiandra teschneriana
## 16273 Eremanthus reticulatus Eremanthus auriculatus
## 16332 Eremanthus syncephalus Chresta pycnocephala
## 22992 Eucalyptus alatissima Eucalyptus plenissima
## 23803 Eucalyptus bunyip Eucalyptus dunnii
## 28683 Eucalyptus revelata Eucalyptus rugulata
## 32603 Eugenia crispula Begonia crispula
## 35753 Eugenia marleneae Luma apiculata
## 35983 Eugenia miragoanae Eugenia magoana
## 36503 Eugenia ochracea Eugenia chartacea
## 36662 Eugenia pachyadenia Eugenia macradenia
## 40233 Eumachia montana Fumaria montana
## 59462 Freziera trollii Freziera neillii
## 15115 Garcinia leptophylla Garcinia terpnophylla
## 15604 Grewia milleri Grimmia milleri
## 16834 Guapira laxa Guapira noxia
## 19484 Guatteria turrialbana Stenostomum turrialbanum
## 19504 Guatteria vallensis Guatteria allenii
## 28233 Helicia kingiana Triunia youngiana
## 29117 Heliotropium filiflorum Heliotropium flaviflorum
## 30624 Herrania cuatrecasasiana Herrania cuatrecasana
## 32254 Hibiscus ankeranensis Hibiscus ankaramyensis
## 32434 Hibiscus cooperi Hibiscus coulteri
## 36204 Homalium ovatifolium Homalium myrtifolium
## 36483 Homalium serratum Homalium dentatum
## 43224 Hyptidendron roseum Hyptidendron arboreum
## 53594 Ixora kalehensis Ixora balinensis
## 58194 Keetia davidii Swertia davidii
## 13618 Kurrimia paniculata Turpinia occidentalis
## 40517 Lasianthus linearifolius Lysiana linearifolia
## 12305 Linochilus fosbergii Microchilus fosbergii
## 12365 Linochilus rupestris Appendicula rupestris
## 22445 Lychnophorella santosii Lychnophora santosii
## 27385 Machilus coriacea Machilus sericea
## 32134 Magnolia betuliensis Magnolia betongensis
## 32235 Magnolia brasiliensis Magnolia braianensis
## 33319 Magnolia jaenensis Magnolia narinensis
## 33419 Magnolia juninensis Magnolia narinensis
## 33475 Magnolia kachinensis Magnolia narinensis
## 33855 Magnolia manuensis Magnolia panamensis
## 34005 Magnolia mindoensis Magnolia sieboldii subsp. sinensis
## 34275 Magnolia ottoi Magnolia kwangtungensis
## 34595 Magnolia quangninhensis Magnolia guangnanensis
## 34945 Magnolia sonlaensis Magnolia shiluensis
## 42185 Matisia cuatrecasasiana Matisia cuatrecasana
## 48285 Memecylon biokoense Memecylon boinense
## 49655 Memecylon rovumense Memecylon korupense
## 53135 Miconia antillana Miconia angelana
## 53545 Miconia birimosa Miconia formosa
## 55125 Miconia galeottii Mimosa galeottii
## 55365 Miconia haemantha Miconia desmantha
## 55495 Miconia hirticaulis Miconia seticaulis
## 55795 Miconia kappellei Miconia kappleri
## 56665 Miconia neoamygdalina Miconia fasciculata
## 56775 Miconia ocampensis Miconia onaensis
## 57235 Miconia polyflora Miconia polyandra
## 58824 Miconia tricostata Miconia aristata
## 6520 Microtropis densiflora Microtis media subsp. densiflora
## 6296 Monoon pachypetalum Monoon pachyphyllum
## 6966 Monteverdia chiapensis Maytenus chapadensis
## 6996 Monteverdia crassipes Pontederia crassipes
## 7086 Monteverdia elongata Pontederia crassipes
## 7486 Monteverdia planifolia Maytenus ebenifolia
## 12006 Myoporum semotum Myoporum insulare
## 12456 Myrcia acutissima Pinalia acutissima
## 12485 Myrcia adunca Acacia adunca
## 12726 Myrcia amplifolia Myrcia ampliflora
## 12866 Myrcia arenicola Myrcia rupicola
## 13046 Myrcia barkeri Myrcia splendens
## 13256 Myrcia brevispicata Varronia curassavica
## 13325 Myrcia calyptrata Cordia dentata
## 13436 Myrcia celaenensis Myrcia petenensis
## 13466 Myrcia chionantha Myrcia monantha
## 13485 Myrcia chytraculia Calyptranthes chytraculia
## 13526 Myrcia clarendonensis Varronia clarendonensis
## 13726 Myrcia corticosa Myrcia tortuosa
## 13926 Myrcia cymatophylla Myrcia dermatophylla
## 13955 Myrcia decandra Cordia decandra
## 14365 Myrcia fasciculata Mycetia fasciculata
## 14506 Myrcia fawcettii Cordia elliptica
## 14686 Myrcia galanoana Myrcia gamaeana
## 14806 Myrcia glomerata Madia glomerata
## 15165 Myrcia hydrophila Myrcia petrophila
## 15286 Myrcia irregularis Kurzia irregularis
## 15365 Myrcia krugii Myrcia fascicularis
## 15526 Myrcia legrandii Myrcia grandis
## 15926 Myrcia mayarensis Myrcia amapensis
## 16246 Myrcia mornicola Myrcia citrifolia
## 16396 Myrcia neocapitata Myrcia capitata
## 16436 Myrcia neocollina Myrcia guianensis
## 16486 Myrcia neoelegans Myrcia guianensis
## 16535 Myrcia neograndis Myrcia grandis
## 16556 Myrcia neohotteana Myrcia hotteana
## 16586 Myrcia neoinvolucrata Myrcia felisberti
## 16655 Myrcia neomyrcioides Myrcia myrcioides
## 16696 Myrcia neopalustris Myrcia palustris
## 16736 Myrcia neorubella Myrcia myrtillifolia
## 16746 Myrcia neosalicifolia Myrcia salicifolia
## 16766 Myrcia neosintenisii Myrcia fenzliana
## 16776 Myrcia neosmithii Myrcia selloi
## 16946 Myrcia nodosa Myrcia cymosa
## 17076 Myrcia nummularia Vicia nummularia
## 17285 Myrcia ovoidea Myrcia ovina
## 17386 Myrcia peduncularis Passiflora peduncularis
## 17466 Myrcia petricola Myrcia pineticola
## 17556 Myrcia pitoniana Myrcia doniana
## 17756 Myrcia pozasiana Myrcia thomasiana
## 17796 Myrcia protracta Cordia protracta
## 18205 Myrcia rufotomentosa Myrcia albotomentosa
## 18446 Myrcia siberiensis Myrcia sabaraensis
## 18756 Myrcia subcapitata Myrcia capitata
## 18926 Myrcia tenuiclada Myrcia tenuiflora
## 19406 Myrcia wilsonii Acacia wilsonii
## 22166 Myrsine brassii Myrsine brownii
## 28156 Neonauclea kranjiensis Neonauclea kraboensis
## 29826 Nolina brandegeei Polemonium brandegeei
## 29886 Nolina orbicularis Hoita orbicularis
## 29936 Nolina rodriguezii Sagina maritima
## 30746 Noronhia richardsiae Noronhia richardii
## 38946 Olinia chimanimani Olea chimanimani
## 45176 Ostrya chinensis Eurya chinensis
## 46596 Ouratea robusta Jurinea robusta
## 48206 Pachira moreirae Pachira morae
## 52086 Palicourea osaensis Palicourea paraensis
## 52746 Palicourea sucllii Palicourea pullei
## 52806 Palicourea tatei Palicourea patens
## 55606 Pandanus martinianus Pandanus marginatus
## 14456 Pinus vallartensis Pinus dalatensis
## 14526 Piparea spruceana Pilea spruceana
## 16427 Piptostigma macrophyllum Piptostigma calophyllum
## 16937 Pisonia roqueae Pisonia rosea
## 21857 Pleroma canescens Peronema canescens
## 23147 Plinia rufiflora Plinia cauliflora
## 26347 Polyosma subintegrifolia Polyosma integrifolia
## 29828 Portulacaria carrissoana Portulaca carrissoana
## 334111 Praravinia nitida Praravinia mimica
## 357111 Protium balsamiferum Aeonium balsamiferum
## 44477 Psychotria hamifera Psychotria pilifera
## 46066 Psychotria ortiziana Psychotria orosiana
## 471111 Psychotria sublyrata Psychotria subcordata
## 49277 Pterospermum aureum Pterospermum fuscum
## 49446 Pterospermum havilandii Pterospermum harmandii
## 52867 Quadrella indica Cordia myxa
## 54457 Quercus baolamensis Quercus blaoensis
## 545111 Quercus barrancana Quercus arkansana
## 54597 Quercus bidoupensis Quercus ×idzuensis
## 56167 Quercus honbaensis Quercus donnaiensis
## 57035 Quercus melissae Quercus lancifolia
## 7828 Rehderodendron macrophyllum Ixora schomburgkiana
## 38329 Rhododendron leigongshanense Rhododendron gongshanense
## 48528 Rhododendron stanleyi Rhododendron baileyi
## 9048 Rondeletia roynaefolia Rondeletia royenifolia
## 101210 Ruagea beckii Jungia beckii
## 10187 Ruagea obovata Dalea obovata
## 17798 Saurauia chaiana Saurauia tafana
## 179112 Saurauia corneri Saurauia roemeri
## 18078 Saurauia graciliflora Scurrula parasitica var. graciliflora
## 181112 Saurauia hispidicalyx Saurauia lepidicalyx
## 18139 Saurauia iliasii Saurauia klinkii
## 18157 Saurauia jeisinii Saurauia scabrida
## 18178 Saurauia joelii Saurauia poolei
## 18229 Saurauia juliae Saurauia molinae
## 18348 Saurauia latifolia Saurauia ilicifolia
## 18408 Saurauia leeana Saurauia tafana
## 18508 Saurauia linusii Saurauia klinkii
## 18658 Saurauia minutiflora Saurauia nudiflora
## 19098 Saurauia runiae Saurauia rufa
## 191210 Saurauia sammanniana Saurauia schumanniana
## 19258 Saurauia speciosa Sobralia speciosa
## 20428 Schefflera beamanii Schefflera glauca
## 20457 Schefflera bifurcata Schefflera heterophylla
## 20788 Schefflera chanii Schefflera bangii
## 21078 Schefflera crenata Schefflera caudata
## 248210 Schinus weinmanniifolia Schinus weinmannifolius
## 36465 Sloanea cruciata Sloanea cruenta
## 36977 Sloanea jaramilloi Brownea jaramilloi
## 37257 Sloanea morii Sloanea lamii
## 410211 Sorbus acutiserrata Pyrus acutiserrata
## 43486 Sorbus sellii Sorbus beckii
## 43768 Sorbus thayensis Sorbus zayuensis
## 57538 Symplocos juiyenensis Symplocos guianensis
## 577210 Symplocos limonensis Symplocos moaensis
## 60883 Syzygium barotsense Syzygium baramense
## 41039 Syzygium bengkulense Syzygium benguellense
## 39230 Syzygium komatiense Syzygium kalahiense
## 55139 Syzygium niassense Syzygium inasense
## 13878 Tamarix minoa Tamarix ninae
## 26309 Tovomita nidiae Tovomita gracilipes
## 28549 Trichilia reynelii Trichilia pallida
## 30999 Tritaxis pauciflora Hesperantha pauciflora
## 37477 Vadensea tenuifolia Tillandsia flexuosa
## 381310 Vantanea maculicarpa Vantanea macrocarpa
## 42689 Villaria coriacea Olearia coriacea
## 42869 Virola allenii Virola marlenei
## 43009 Virola fosteri Hiraea fosteri
## 468211 Vochysia peruviana Vochysia leguiana
## 47809 Warneckea albiflora Warneckea cauliflora
## 549113 Xylopia muricata Xylopia africana
## 56569 Yucca pinicola Yucca rupicola
## Old.name taxonID
## 773 wfo-0000514162
## 1783 wfo-0000390566
## 1849 wfo-0000756094
## 1854 wfo-0000677939
## 2418 wfo-0000998700
## 2430 wfo-0000994316
## 3186 wfo-0000537718
## 3772 Arenaria lanceolata wfo-0000374850
## 3776 Arenaria macrocarpa wfo-0000374757
## 3783 wfo-0000546388
## 4314 wfo-0000706669
## 4418 Aspidosperma hilarianum wfo-0000291837
## 5318 wfo-0000922995
## 5528 Beaucarnea watsonii wfo-0000700468
## 6931 wfo-0000911620
## 6961 Mairia crenata wfo-0000118134
## 12311 wfo-0000576119
## 21331 Campomanesia ovalifolia wfo-0000793699
## 23741 Capparidastrum cuatrecasasianum wfo-0001423797
## 24631 wfo-0000186069
## 27461 wfo-0000651539
## 28211 wfo-0000923944
## 29271 Cassia yucatanensis wfo-0000175298
## 43921 wfo-0000604901
## 57391 wfo-0000613029
## 8103 wfo-0000620732
## 13772 wfo-0000926140
## 16292 wfo-0000926715
## 20362 wfo-0000927832
## 20622 wfo-0000932450
## 32192 wfo-0000635454
## 32262 wfo-0000627232
## 41662 wfo-0001286738
## 45222 wfo-0000648500
## 50902 Diospyros mabolo wfo-0000648780
## 51292 wfo-0000649737
## 51632 wfo-0000648512
## 6080 wfo-0000946459
## 24313 wfo-0000658061
## 7015 wfo-0000664050
## 13973 wfo-0000667714
## 16273 wfo-0000056834
## 16332 Eremanthus pycnocephalus wfo-0000087102
## 22992 wfo-0000955652
## 23803 wfo-0000954854
## 28683 wfo-0000336518
## 32603 wfo-0000823746
## 35753 Eugenia palenae wfo-0000231064
## 35983 wfo-0000957988
## 36503 wfo-0000956783
## 36662 wfo-0000957957
## 40233 wfo-0000693316
## 59462 wfo-0001345281
## 15115 wfo-0000694684
## 15604 wfo-0001214311
## 16834 wfo-0000710750
## 19484 Guettarda turrialbana wfo-0000921751
## 19504 wfo-0000711141
## 28233 Helicia youngiana wfo-0000454537
## 29117 wfo-0000718574
## 30624 wfo-0001140732
## 32254 wfo-0000722278
## 32434 wfo-0001077057
## 36204 wfo-0001063037
## 36483 wfo-0001062878
## 43224 wfo-0000216415
## 53594 wfo-0000218193
## 58194 wfo-0001063887
## 13618 Turpinia paniculata wfo-0000459223
## 40517 Loranthus linearifolius wfo-0000366774
## 12305 wfo-0000796905
## 12365 Podochilus rupestris wfo-0000252079
## 22445 wfo-0000138908
## 27385 wfo-0000373769
## 32134 wfo-0000233012
## 32235 wfo-0000464751
## 33319 wfo-0000233291
## 33419 wfo-0000233291
## 33475 wfo-0000233291
## 33855 wfo-0000233323
## 34005 Magnolia sinensis wfo-0000233398
## 34275 Magnolia moto wfo-0000233234
## 34595 wfo-0001283317
## 34945 wfo-0000465081
## 42185 wfo-0000369064
## 48285 wfo-0001081375
## 49655 wfo-0001343185
## 53135 wfo-0001247703
## 53545 wfo-0001079631
## 55125 wfo-0000169715
## 55365 wfo-0001079603
## 55495 wfo-0001082742
## 55795 wfo-0001079684
## 56665 Miconia amygdalina wfo-0001079622
## 56775 wfo-0001082540
## 57235 wfo-0001079799
## 58824 wfo-0001082152
## 6520 Microtis densiflora wfo-0000244311
## 6296 wfo-0001334309
## 6966 Monteverdia chapadensis wfo-0001421659
## 6996 wfo-0000501039
## 7086 Pontederia elongata wfo-0000501039
## 7486 Monteverdia ebenifolia wfo-0001292498
## 12006 Myoporum serratum wfo-0000448228
## 12456 Eria acutissima wfo-0000922247
## 12485 wfo-0000187961
## 12726 wfo-0001424924
## 12866 wfo-0000247851
## 13046 Myrcia berberis wfo-0000247907
## 13256 Cordia brevispicata wfo-0001350461
## 13325 Cordia calyptrata wfo-0000620413
## 13436 wfo-0001425001
## 13466 wfo-0001086476
## 13485 Myrtus chytraculia wfo-0000784639
## 13526 Cordia clarendonensis wfo-0000420852
## 13726 wfo-0000247959
## 13926 wfo-0000247388
## 13955 wfo-0000620411
## 14365 wfo-0000246885
## 14506 Cordia fawcettii wfo-0000620449
## 14686 wfo-0001086131
## 14806 wfo-0000039422
## 15165 wfo-0001318292
## 15286 wfo-0001213408
## 15365 Myrcia bangii wfo-0000247448
## 15526 wfo-0000247499
## 15926 wfo-0000247228
## 16246 Myrcia vernicosa wfo-0000247328
## 16396 wfo-0000247310
## 16436 Myrcia collina wfo-0000247506
## 16486 Myrcia elegans wfo-0000247506
## 16535 wfo-0000247499
## 16556 wfo-0000247531
## 16586 Myrcia involucrata wfo-0000247450
## 16655 wfo-0000913451
## 16696 wfo-0000247729
## 16736 Myrcia rubella wfo-0000247677
## 16746 wfo-0000247855
## 16766 Myrcia sintenisii wfo-0000247452
## 16776 Myrcia smithii wfo-0000247874
## 16946 wfo-0000247378
## 17076 wfo-0000191387
## 17285 wfo-0001318050
## 17386 Murucuia peduncularis wfo-0001090803
## 17466 wfo-0000247763
## 17556 wfo-0000247414
## 17756 wfo-0000247948
## 17796 wfo-0000620876
## 18205 wfo-0000247219
## 18446 wfo-0001086147
## 18756 wfo-0000247310
## 18926 wfo-0001425017
## 19406 wfo-0000202277
## 22166 wfo-0000449072
## 28156 wfo-0000250281
## 29826 Gilia brandegeei wfo-0001099791
## 29886 wfo-0000184634
## 29936 Sagina rodriguezii wfo-0000438539
## 30746 wfo-0001315433
## 38946 wfo-0000817257
## 45176 wfo-0000682973
## 46596 wfo-0000017627
## 48206 wfo-0000397401
## 52086 wfo-0000263105
## 52746 wfo-0001429074
## 52806 wfo-0000263108
## 55606 wfo-0000730074
## 14456 wfo-0000481267
## 14526 wfo-0000472416
## 16427 wfo-0001065979
## 16937 wfo-0000476609
## 21857 wfo-0000267573
## 23147 wfo-0000278887
## 26347 wfo-0001239572
## 29828 wfo-0000489461
## 334111 wfo-0000282261
## 357111 wfo-0000521612
## 44477 wfo-0000287126
## 46066 wfo-0000286935
## 471111 wfo-0000287705
## 49277 wfo-0000476018
## 49446 wfo-0000476031
## 52867 Quarena indica wfo-0000620765
## 54457 wfo-0000289777
## 545111 wfo-0000289626
## 54597 wfo-0001220454
## 56167 wfo-0000290543
## 57035 Quercus molinae wfo-0000291543
## 7828 Siderodendron macrophyllum wfo-0000218978
## 38329 wfo-0001229447
## 48528 wfo-0001048729
## 9048 wfo-0000297920
## 101210 wfo-0000011401
## 10187 wfo-0000169103
## 17798 wfo-0000433161
## 179112 wfo-0000500170
## 18078 Scurrula graciliflora wfo-0001075783
## 181112 wfo-0000501399
## 18139 wfo-0000501390
## 18157 Saurauia nelsonii wfo-0000500915
## 18178 wfo-0000433208
## 18229 wfo-0000493595
## 18348 wfo-0000493598
## 18408 wfo-0000433161
## 18508 wfo-0000501390
## 18658 wfo-0000500946
## 19098 wfo-0000500935
## 191210 wfo-0001328779
## 19258 wfo-0000311582
## 20428 Schefflera seemannii wfo-0000305901
## 20457 Schefflera biternata wfo-0000305936
## 20788 wfo-0000305668
## 21078 wfo-0000305737
## 248210 wfo-0001049834
## 36465 wfo-0001046584
## 36977 wfo-0001334440
## 37257 wfo-0000499163
## 410211 wfo-0000987853
## 43486 wfo-0000998681
## 43768 wfo-0001015965
## 57538 wfo-0000491350
## 577210 wfo-0000490926
## 60883 wfo-0000318296
## 41039 wfo-0000318306
## 39230 wfo-0000318821
## 55139 wfo-0000318785
## 13878 wfo-0000458647
## 26309 Tovomita duidae wfo-0000407114
## 28549 Trichilia weddelii wfo-0000455741
## 30999 Tritonia pauciflora wfo-0000782819
## 37477 Vriesea tenuifolia wfo-0000578510
## 381310 wfo-0001065600
## 42689 wfo-0000046138
## 42869 wfo-0001085217
## 43009 wfo-0001263739
## 468211 wfo-0001146184
## 47809 wfo-0001081608
## 549113 wfo-0000428870
## 56569 wfo-0000752219
GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 3 & GTS.fuzzy$Auth.dist < 6,
c("TaxonName", "Author", "scientificName", "scientificNameAuthorship")]
## TaxonName Author scientificName
## 43921 Cinnamomum austrosinense H.T.Chang Cinnamomum austro-sinensis
## 50902 Diospyros robolot B.Walln. Diospyros discolor
## 7015 Elaeocarpus avium Coode Elaeocarpus badius
## 23803 Eucalyptus bunyip Rule Eucalyptus dunnii
## 40517 Lasianthus linearifolius H.Zhu Lysiana linearifolia
## 42185 Matisia cuatrecasasiana Fern.Alonso Matisia cuatrecasana
## 9048 Rondeletia roynaefolia DC. Rondeletia royenifolia
## scientificNameAuthorship
## 43921 H.T.Chang
## 50902 Willd.
## 7015 Coode
## 23803 Maiden
## 40517 Tiegh.
## 42185 Fern.Alonso
## 9048 DC.
One reason why species could not be matched directly is grammatical gender. Botanical names differentiate between feminine, masculine and neuter forms (see the International Code of Botanical Nomenclature for various examples), and the gender used for the same species can differ between lists.
The following function checks whether names would match if the endings ‘um’ and ‘us’ were both replaced by ‘a’. Additional checks remove hyphens and test for matches after replacing ‘ii’ with ‘i’.
An option of the function is to check whether names would match if lowercase vowels and ‘y’ were ignored.
acceptable.match <- function(x, no.vowels=FALSE) {
  x$submitted <- x$TaxonName
  x$matched <- x$scientificName
  # Where a new accepted name was assigned, compare with the name that was
  # actually matched (Old.name) instead of the accepted name
  x[x$New.accepted == TRUE, "matched"] <- x[x$New.accepted == TRUE, "Old.name"]
  # Harmonize gender endings: -um and -us both become -a
  x$submitted <- str_replace(x$submitted, pattern="um$", replacement="a")
  x$matched <- str_replace(x$matched, pattern="um$", replacement="a")
  x$submitted <- str_replace(x$submitted, pattern="us$", replacement="a")
  x$matched <- str_replace(x$matched, pattern="us$", replacement="a")
  # Remove a hyphen (str_replace only removes the first one)
  x$submitted <- str_replace(x$submitted, pattern="-", replacement="")
  x$matched <- str_replace(x$matched, pattern="-", replacement="")
  # Treat 'ii' and 'i' as equivalent
  x$submitted <- str_replace_all(x$submitted, pattern="ii", replacement="i")
  x$matched <- str_replace_all(x$matched, pattern="ii", replacement="i")
  # Optionally ignore lowercase vowels and 'y'
  if (no.vowels == TRUE) {
    x$submitted <- str_replace_all(x$submitted, pattern="[aeiouy]", replacement="")
    x$matched <- str_replace_all(x$matched, pattern="[aeiouy]", replacement="")
  }
  return(x$submitted == x$matched)
}
GTS.acceptable <- acceptable.match(GTS.fuzzy)
GTS.acceptable2 <- acceptable.match(GTS.fuzzy, no.vowels=TRUE)
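The effect of the individual rules is easiest to see on single name pairs. The sketch below is a base R approximation of the same rules and is not part of WorldFlora; the helper names normalize.name and names.agree are hypothetical and only serve this illustration. The name pairs are taken from the listings in this post.

```r
# Hypothetical helper: apply the normalization rules to a character vector
normalize.name <- function(x, no.vowels = FALSE) {
  x <- sub("um$", "a", x)                # neuter ending -> -a
  x <- sub("us$", "a", x)                # masculine ending -> -a
  x <- sub("-", "", x, fixed = TRUE)     # remove the first hyphen
  x <- gsub("ii", "i", x, fixed = TRUE)  # 'ii' -> 'i'
  if (no.vowels) x <- gsub("[aeiouy]", "", x)  # drop lowercase vowels and 'y'
  x
}

# Hypothetical helper: do two names agree after normalization?
names.agree <- function(a, b, no.vowels = FALSE) {
  normalize.name(a, no.vowels) == normalize.name(b, no.vowels)
}

names.agree("Acropogon calcicolus", "Acropogon calcicola")              # TRUE
names.agree("Vitex rubroaurantiaca", "Vitex rubro-aurantiaca")          # TRUE
names.agree("Zabelia tyaihyonii", "Zabelia tyaihyoni")                  # TRUE
names.agree("Acacia cretacea", "Acacia creatacea")                      # FALSE
names.agree("Acacia cretacea", "Acacia creatacea", no.vowels = TRUE)    # TRUE
names.agree("Acer iranicum", "Acer creticum", no.vowels = TRUE)         # FALSE
```

The last pair shows why the vowel rule is still reasonably strict: a match to a different epithet is not accepted merely because the genus agrees.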
With the custom function, we can now check the species with fuzzy matches.
nrow(GTS.fuzzy[GTS.acceptable == TRUE, ])
## [1] 277
head(GTS.fuzzy[GTS.acceptable == TRUE, c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name taxonID
## 1028 Acropogon calcicolus Acropogon calcicola wfo-0000506268
## 1210 Adelobotrys macranthus Adelobotrys macrantha wfo-0001080704
## 1569 Ageratina urbani Ageratina urbanii wfo-0000072197
## 1721 Aidia congesta Aidia congestum wfo-0000931235
## 1743 Ailanthus excelsa Ailanthus excelsus wfo-0000524612
## 2087 Alectryon connatus Alectryon connatum wfo-0000525455
tail(GTS.fuzzy[GTS.acceptable == TRUE, c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name
## 45289 Vitex rubroaurantiaca Vitex rubro-aurantiaca
## 45579 Vitex vansteenisii Vitex vansteenisi
## 511210 Withania begoniifolia Mellissia begonifolia Withania begonifolia
## 51339 Wrightia flavorosea Wrightia flavo-rosea
## 554113 Xylopia subdehiscens Xylopia sub-dehiscens
## 56697 Zabelia tyaihyonii Zabelia tyaihyoni
## taxonID
## 45289 wfo-0000333420
## 45579 wfo-0000333528
## 511210 wfo-0001023587
## 51339 wfo-0000334528
## 554113 wfo-0000428719
## 56697 wfo-0000430178
nrow(GTS.fuzzy[GTS.acceptable == FALSE & GTS.acceptable2 == TRUE, ])
## [1] 206
head(GTS.fuzzy[GTS.acceptable == FALSE & GTS.acceptable2 == TRUE,
c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name taxonID
## 252 Acacia cretacea Acacia creatacea wfo-0000201128
## 1456 Aeschynomene burttii Aeschynomene burttiie wfo-0000173135
## 2770 Amphitecna kennedyae Amphitecna kennedyi wfo-0000780939
## 2954 Aniba canelilla Aniba canellila wfo-0000536813
## 2987 Aniba rosodora Aniba rosaeodora wfo-0000536890
## 3173 Annona neoecuadorensis Annona neoecuadoarensis wfo-0000506349
tail(GTS.fuzzy[GTS.acceptable == FALSE & GTS.acceptable2 == TRUE,
c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name taxonID
## 51548 Wunderlichia crulsiana Wunderlichia cruelsiana wfo-0000118141
## 531113 Xanthostemon grisii Xanthostemon grisei wfo-0000334842
## 54807 Xylopia maccreae Xylopia maccreai wfo-0000428955
## 55879 Xylosma kaalaensis Xylosma kaalensis wfo-0001063072
## 56798 Zanthoxylum amapaense Zanthoxylum amapense wfo-0000430332
## 596211 Zygocarpum caeruleum Zygocarpum coeruleum wfo-0000430418
nrow(GTS.fuzzy[GTS.acceptable2 == FALSE, ])
## [1] 471
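The three counts reported above can also be obtained in a single tabulation. The sketch below is only an illustration: the two short logical vectors are made-up stand-ins for GTS.acceptable and GTS.acceptable2, and the category labels are my own choice.

```r
# Toy stand-ins for GTS.acceptable and GTS.acceptable2 (values are made up)
acceptable  <- c(TRUE, TRUE, FALSE, FALSE, FALSE)
acceptable2 <- c(TRUE, TRUE, TRUE, FALSE, FALSE)

# Assign each fuzzy match to one review category (labels are hypothetical)
review <- ifelse(acceptable, "accepted (default rules)",
                 ifelse(acceptable2, "accepted (no.vowels=TRUE)", "manual check"))

# Count the number of species in each category
review.counts <- table(review)
print(review.counts)
```

With the real vectors, the same tabulation should reproduce the counts obtained above via the separate nrow calls.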
As it happens, about half of the species with fuzzy matches (277 + 206 = 483 of 954) could be accepted with the rules specified by the custom function. For the remaining species, I advise verifying the names manually. This leaves a considerable number of close to 500 species to be checked manually.
However, this is also less than 1 percent of the original list of species. More importantly, as will be shown below, for GlobalTreeSearch it is better to check first whether the chances of finding acceptable matches become greater with Kew’s World Checklist of Vascular Plants. The script below saves the results for the remaining species locally.
GTS.fuzzy.remain1 <- GTS.fuzzy[GTS.acceptable2 == FALSE, ]
nrow(GTS.fuzzy.remain1)
## [1] 471
GTS.fuzzy.remain1[, c("TaxonName", "scientificName", "Old.name", "taxonID")][1:30, ]
## TaxonName scientificName
## 40 Abarema microcalyx Abarema microcaly
## 773 Acer iranicum Acer creticum
## 1319 Adinandra macquilingensis Adinandra maquilingensis
## 1396 Aegiphila luschnathii Aegiphila luschnatii
## 1460 Aeschynomene pararubrofarinacea Aeschynomene pararuhrofarinacea
## 1783 Aiouea leptophylla Ocotea leptophylla
## 1849 Alangium denudatum Allium denudatum
## 1854 Alangium gracile Eryngium gracile
## 2418 Alnus lusitanica Prunus lusitanica
## 2430 Alnus rohlenae Rubus rohlenae
## 3186 Annona oleifolia Annona cordifolia
## 3633 Arachnothryx chaconii Arachnothryx chaconis
## 3772 Arawakia lanceolata Minuartia rupestris subsp. clementei
## 3776 Arawakia macrocarpa Minuartia macrocarpa
## 3783 Arawakia parvifolia Arenaria parvifolia
## 3793 Arbutus bicolor Comarostaphylis discolor
## 3811 Archidendron bauchei Archidendron baucheri
## 4058 Ardisia labisiifolia Ardisia labrisiifolia
## 4163 Ardisia silamensis Ardisia siamensis
## 4314 Artocarpus montanus Gonocarpus montanus
## 4364 Arytera litoralis Arytera littoralis
## 4418 Aspidosperma huberianum Aspidosperma tomentosum
## 5275 Barleria mirabilis Barleria mutabilis
## 5290 Barringtonia augusta Barringtonia angusta
## 5318 Barringtonia magnifolia Barringtonia pinnifolia
## 5528 Beaucarnea olsonii Nolina watsonii
## 5558 Beguea tsaratananensis Beguea tsaratanensis
## 5578 Beilschmiedia atra Beilschmiedia atrata
## 24100 Betula murrayana Betula ×purpusii
## 6931 Bribria apiculata Beyeria apiculata
## Old.name taxonID
## 40 wfo-0000194017
## 773 wfo-0000514162
## 1319 wfo-0000520935
## 1396 wfo-0000811926
## 1460 wfo-0000173772
## 1783 wfo-0000390566
## 1849 wfo-0000756094
## 1854 wfo-0000677939
## 2418 wfo-0000998700
## 2430 wfo-0000994316
## 3186 wfo-0000537718
## 3633 wfo-0000254992
## 3772 Arenaria lanceolata wfo-0000374850
## 3776 Arenaria macrocarpa wfo-0000374757
## 3783 wfo-0000546388
## 3793 Arbutus discolor wfo-0000615941
## 3811 wfo-0000199765
## 4058 wfo-0000544511
## 4163 wfo-0000545167
## 4314 wfo-0000706669
## 4364 wfo-0000550703
## 4418 Aspidosperma hilarianum wfo-0000291837
## 5275 wfo-0001342460
## 5290 wfo-0000774826
## 5318 wfo-0000922995
## 5528 Beaucarnea watsonii wfo-0000700468
## 5558 wfo-0001269042
## 5578 wfo-0000561803
## 24100 Betula ×murrayana wfo-0000336488
## 6931 wfo-0000911620
tail(GTS.fuzzy.remain1[, c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name
## 47809 Warneckea albiflora Warneckea cauliflora
## 49698 Wendlandia buddleacea Wendlandia buddlejacea
## 52259 Xanthophyllum laeve Xanthophyllum laevis
## 52659 Xanthophyllum schizocarpon Xanthophyllum schixocarpon
## 549113 Xylopia muricata Xylopia africana
## 56569 Yucca pinicola Yucca rupicola
## taxonID
## 47809 wfo-0001081608
## 49698 wfo-0000334069
## 52259 wfo-0000428598
## 52659 wfo-0000428323
## 549113 wfo-0000428870
## 56569 wfo-0000752219
file.save1 <- file.path(getwd(), "GTS_Fuzzy_WFO_remain.txt")
fwrite(GTS.fuzzy.remain1, file=file.save1, sep="|", row.names=FALSE)
Instead of using the taxonomic backbone of World Flora Online, we will now use the taxonomic backbone of the World Checklist of Vascular Plants (WCVP). Here I used the June 2022 version, downloaded via this link.
As shown previously, I recommend first using a text editor to replace instances of ’ × ’ by ’ ×’ in the WCVP.
Whereas World Flora Online listed over 500,000 current species names, WCVP has slightly fewer than 400,000 current species names.
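The replacement of the hybrid sign could possibly also be done within R instead of in a text editor. The sketch below is not what was done for this post; it applies gsub to a small made-up character vector. Applying the same call to the taxon_name column after reading the file would be an assumption that I did not test on the full WCVP.

```r
# Made-up example names; the first has a hybrid sign surrounded by spaces
taxon.names <- c("Betula \u00d7 purpusii", "Acer iranicum")

# Remove the space after the hybrid sign (" × " becomes " ×")
taxon.fixed <- gsub(" \u00d7 ", " \u00d7", taxon.names, fixed = TRUE)
print(taxon.fixed)
```

Using fixed = TRUE treats the pattern as a literal string, so names without a hybrid sign are returned unchanged.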
# WCVP.file <- choose.files()
WCVP.file <- "E:\\Roeland\\WorldFloraOnline\\WFO 2022\\wcvp_v9_jun_2022 x changed.txt"
WCVP.data <- fread(WCVP.file, header=TRUE, encoding="UTF-8", sep="|")
head(WCVP.data)
## kew_id family genus species infraspecies
## 1: 338-1 Acanthaceae Acanthodium
## 2: 44787-1 Acanthaceae Acanthodium angustum
## 3: 44788-1 Acanthaceae Acanthodium capense
## 4: 44789-1 Acanthaceae Acanthodium carduifolium
## 5: 44790-1 Acanthaceae Acanthodium delilii
## 6: 44792-1 Acanthaceae Acanthodium diversispinum
## taxon_name authors rank taxonomic_status
## 1: Acanthodium Delile GENUS Synonym
## 2: Acanthodium angustum Nees SPECIES Homotypic_Synonym
## 3: Acanthodium capense (L.f.) Nees SPECIES Homotypic_Synonym
## 4: Acanthodium carduifolium (L.f.) Nees SPECIES Homotypic_Synonym
## 5: Acanthodium delilii H.Buek SPECIES Synonym
## 6: Acanthodium diversispinum Nees SPECIES Homotypic_Synonym
## accepted_kew_id accepted_name accepted_authors parent_kew_id
## 1: 427-1 Blepharis Juss.
## 2: 46469-1 Blepharis angusta (Nees) T.Anderson
## 3: 46487-1 Blepharis capensis (L.f.) Pers.
## 4: 44830-1 Acanthopsis carduifolia (L.f.) Schinz
## 5: 46503-1 Blepharis edulis (Forssk.) Pers.
## 6: 46501-1 Blepharis diversispina (Nees) C.B.Clarke
## parent_name parent_authors reviewed
## 1: In review
## 2: In review
## 3: In review
## 4: In review
## 5: In review
## 6: In review
## publication original_name_id
## 1: Descr. Egypte, Hist. Nat. 2(Mém.): 241 (1813)
## 2: A.P.de Candolle, Prodr. 11: 273 (1847)
## 3: Linnaea 15: 361 (1841)
## 4: A.P.de Candolle, Prodr. 11: 278 (1847) 44848-1
## 5: Gen. Sp. Candoll. 3: 1 (1858)
## 6: A.P.de Candolle, Prodr. 11: 275 (1847)
WCVP.data <- new.backbone(WCVP.data,
taxonID="kew_id",
scientificName="taxon_name",
scientificNameAuthorship="authors",
acceptedNameUsageID = "accepted_kew_id",
taxonomicStatus = "taxonomic_status")
head(WCVP.data)
## taxonID scientificName scientificNameAuthorship
## 1: 338-1 Acanthodium Delile
## 2: 44787-1 Acanthodium angustum Nees
## 3: 44788-1 Acanthodium capense (L.f.) Nees
## 4: 44789-1 Acanthodium carduifolium (L.f.) Nees
## 5: 44790-1 Acanthodium delilii H.Buek
## 6: 44792-1 Acanthodium diversispinum Nees
## acceptedNameUsageID taxonomicStatus kew_id family genus
## 1: 427-1 Synonym 338-1 Acanthaceae Acanthodium
## 2: 46469-1 Homotypic_Synonym 44787-1 Acanthaceae Acanthodium
## 3: 46487-1 Homotypic_Synonym 44788-1 Acanthaceae Acanthodium
## 4: 44830-1 Homotypic_Synonym 44789-1 Acanthaceae Acanthodium
## 5: 46503-1 Synonym 44790-1 Acanthaceae Acanthodium
## 6: 46501-1 Homotypic_Synonym 44792-1 Acanthaceae Acanthodium
## species infraspecies taxon_name authors rank
## 1: Acanthodium Delile GENUS
## 2: angustum Acanthodium angustum Nees SPECIES
## 3: capense Acanthodium capense (L.f.) Nees SPECIES
## 4: carduifolium Acanthodium carduifolium (L.f.) Nees SPECIES
## 5: delilii Acanthodium delilii H.Buek SPECIES
## 6: diversispinum Acanthodium diversispinum Nees SPECIES
## taxonomic_status accepted_kew_id accepted_name accepted_authors
## 1: Synonym 427-1 Blepharis Juss.
## 2: Homotypic_Synonym 46469-1 Blepharis angusta (Nees) T.Anderson
## 3: Homotypic_Synonym 46487-1 Blepharis capensis (L.f.) Pers.
## 4: Homotypic_Synonym 44830-1 Acanthopsis carduifolia (L.f.) Schinz
## 5: Synonym 46503-1 Blepharis edulis (Forssk.) Pers.
## 6: Homotypic_Synonym 46501-1 Blepharis diversispina (Nees) C.B.Clarke
## parent_kew_id parent_name parent_authors reviewed
## 1: In review
## 2: In review
## 3: In review
## 4: In review
## 5: In review
## 6: In review
## publication original_name_id
## 1: Descr. Egypte, Hist. Nat. 2(Mém.): 241 (1813)
## 2: A.P.de Candolle, Prodr. 11: 273 (1847)
## 3: Linnaea 15: 361 (1841)
## 4: A.P.de Candolle, Prodr. 11: 278 (1847) 44848-1
## 5: Gen. Sp. Candoll. 3: 1 (1858)
## 6: A.P.de Candolle, Prodr. 11: 275 (1847)
nrow(WCVP.data)
## [1] 1232931
nrow(WCVP.data[WCVP.data$rank == "SPECIES", ])
## [1] 999556
nrow(WCVP.data[WCVP.data$rank == "SPECIES" & WCVP.data$acceptedNameUsageID == "", ])
## [1] 394407
We can now use scripts similar to those above.
cuts <- cut(c(1:nrow(GTS)), breaks=10, labels=FALSE)
cut.i <- sort(unique(cuts))
start.time <- Sys.time()
for (i in 1:length(cut.i)) {
cat(paste("Cut: ", i, "\n"))
GTS.i <- WFO.one(WFO.match.fuzzyjoin(spec.data=GTS[cuts==cut.i[i], ],
WFO.data=WCVP.data,
spec.name="TaxonName",
Authorship="Author",
fuzzydist.max=3),
verbose=FALSE)
if (i==1) {
GTS.WFO <- GTS.i
}else{
GTS.WFO <- rbind(GTS.WFO, GTS.i)
}
}
## Cut: 1
## Checking for fuzzy matches for 35 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 2
## Checking for fuzzy matches for 37 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 3
## Checking for fuzzy matches for 42 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 4
## Checking for fuzzy matches for 21 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 5
## Checking for fuzzy matches for 48 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 6
## Checking for fuzzy matches for 47 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 7
## Checking for fuzzy matches for 36 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 8
## Checking for fuzzy matches for 59 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 9
## Checking for fuzzy matches for 62 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
## Cut: 10
## Checking for fuzzy matches for 37 records
##
## Checking new accepted IDs
## Reached case # 1000
## Reached case # 2000
## Reached case # 3000
## Reached case # 4000
## Reached case # 5000
end.time <- Sys.time()
end.time - start.time
## Time difference of 58.9713 mins
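The pattern of the loop above (split the rows in chunks, process each chunk, bind the results) can be sketched on toy data with base R only. In this sketch, match.chunk is a hypothetical placeholder for the actual matching step and merely adds a column; the species names are taken from the output shown earlier. Collecting the chunk results in a list and binding them once at the end is an alternative to growing the data set with rbind inside the loop.

```r
# Toy species list standing in for GTS (names taken from the output above)
spec <- data.frame(TaxonName = c("Acer iranicum", "Alnus lusitanica",
                                 "Betula murrayana", "Yucca pinicola",
                                 "Xylopia muricata", "Zabelia tyaihyonii"),
                   stringsAsFactors = FALSE)

# Hypothetical placeholder for the matching step; it only adds a column
match.chunk <- function(x) {
  x$n.char <- nchar(x$TaxonName)
  x
}

# Assign each row to one of 3 chunks of (nearly) equal size
cuts <- cut(seq_len(nrow(spec)), breaks = 3, labels = FALSE)

# Process every chunk separately and keep the results in a list
results <- lapply(sort(unique(cuts)), function(i) {
  match.chunk(spec[cuts == i, , drop = FALSE])
})

# Bind all chunk results together in one step
combined <- do.call(rbind, results)
print(combined)
```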
Matching with the WCVP was considerably more successful than with World Flora Online. Whereas no matches were found for 410 species with World Flora Online, this number is now close to 50. The number of species with fuzzy matches also dropped, to about 40% of the earlier number and fewer than 400.
# not matched
nrow(GTS.WFO[GTS.WFO$Matched == FALSE, ])
## [1] 52
# directly matched
nrow(GTS.WFO[GTS.WFO$Matched == TRUE & GTS.WFO$Fuzzy == FALSE, ])
## [1] 57534
GTS.fuzzy <- GTS.WFO[GTS.WFO$Fuzzy == TRUE, ]
nrow(GTS.fuzzy)
## [1] 372
These are the species for which no matches could be found.
GTS.not2 <- GTS.WFO[GTS.WFO$Matched == FALSE, ]
GTS.not2[, c("TaxonName", "Author")]
## TaxonName
## 2483 Alseis sertaneja
## 2696 Amphitecna fonceti
## 4192 Artocarpus bergii
## 5619 Beilschmiedia obscura
## 5627 Beilschmiedia osacola
## 5694 Beilschmiedia tisseranti
## 39610 Bougainvillea fasciculata
## 20011 Camellia hengchunensis
## 42901 Cinnadenia liyuyingii
## 46051 Citharexylum ligustrinum
## 52311 Clusia aemygdioi
## 11222 Cotoneaster ellipticus
## 21352 Crudia bibundina
## 24972 Cryptocarya sheikelmudiyana
## 25792 Ctenodon molliculus
## 25802 Ctenodon monteiroi
## 31162 Cyrtandra balgooyi
## 37892 Deinbollia onanae
## 44422 Diospyros antakaranae
## 54182 Disepalum rawagambut
## 13893 Endiandra wongawallanensis
## 14183 Endlicheria goeldiana
## 14882 Englerodendron libassum
## 13034 Gordonia singaporeana
## 14824 Grewia delphinensis
## 15294 Grewia mansouriana
## 39564 Humiriastrum purusensis
## 55024 Jarandersonia pereirae
## 29795 Madhuca chia-ananii
## 33334 Magnolia llanganatensis
## 38118 Mangifera salomonensis
## 42895 Mediusella arenaria
## 53620 Petrea asperifolia
## 21577 Plerandra gordonii
## 21626 Plerandra moratiana
## 380111 Prunus klokovii
## 53907 Quercus centenaria
## 10258 Ruagea parvifructa
## 24888 Schizolaena noronhae
## 26608 Scyphostegia borneensis
## 47958 Sterculia multiovula
## 51229 Styrax cambodianus
## 58208 Synima cordieri
## 27739 Trichilia deminuta
## 35039 Uvariopsis dicaprio
## 402210 Vepris robertsoniae
## 40409 Vepris zapfackii
## 41269 Viburnum axillare
## 45689 Vochysia caroliae-scottii
## 46809 Volkameria emirnensis
## 46839 Volkameria grevei
## 57839 Zanthoxylum tenuipedicellatum
## Author
## 2483 L.Marinho & J.G.Jardim
## 2696 Ortiz-Rodr. & G\xf3mez-Dom\xednguez
## 4192 E.M.Gardner, Arifiani & Zerega
## 5619 (Stapf) Engl. ex A.Chev.
## 5627 Aguilar, D.Santam. & van der Werff
## 5694 A.Chev.
## 39610 Heimerl
## 20011 Chang
## 42901 (H.Liu) de Kok & Sengun
## 46051 (Thur. ex Decne.) Van Houtte
## 52311 Gomes da Silva & B.Weinberg
## 11222 (Lindl.) Loudon
## 21352 Harms
## 24972 A.K.H.Bachan & P.K.Fasila
## 25792 (Kunth) D.B.O.S.Cardoso, Filardi & H.C.Lima
## 25802 (A.Fern. & P.Bezerra) D.B.O.S.Cardoso, Filardi & H.C.Lima
## 31162 H.J.Atkins & Karton.
## 37892 Cheek
## 44422 Capuron ex G.E.Schatz & Lowry
## 54182 Randi, D.C.Thomas & Wijedasa
## 13893 L.Weber
## 14183 Vattimo
## 14882 Jongkind & Breteler
## 13034 (Dyer) Wall. ex Ridl.
## 14824 Capuron
## 15294 Abedin
## 39564 Prance
## 55024 S.K.Ganesan & R.C.K.Chung
## 29795 Chantar.
## 33334 A.V\xe1zquez & D.A.Neill
## 38118 C.T.White
## 42895 (F.G\xe9rard) Hong-Wa
## 53620 (Miranda) Hammel
## 21577 Lowry, G.M.Plunkett & Frodin
## 21626 Lowry & G.M.Plunkett
## 380111 (Sobko)
## 53907 L.M.Gonz\xe1lez
## 10258 T.D.Penn.
## 24888 (Tul.) G.E. Schatz & Lowry
## 26608 Stapf
## 47958 E.L. Taylor ex Mondrag\xf3n
## 51229 P.W.Fritsch
## 58208 (F.Muell.) Radlk.
## 27739 T.D.Penn.
## 35039 Cheek & Gosline
## 402210 Q. Luke
## 40409 Cheek & Onana
## 41269 Triana
## 45689 Marc.-Berti & Aymard
## 46809 (Bojer ex Hook.) Phillipson & Callm.
## 46839 (Moldenke) Phillipson & Callm.
## 57839 (Kokwaro) Vollesen
As before, we can check for species with matching distance equal to 1, …
nrow(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 1, ]) # 207
## [1] 207
head(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 1, c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name
## 14 Abarema cochliocarpos Abarema cochliacarpos
## 402 Acacia macdonnellensis Acacia macdonnelliensis
## 1298 Adinandra milletii Adinandra millettii
## 1402 Aegiphila valerioi Aegiphila valerii
## 2011 Aldina aquae-nigrae Aldina macrophylla Aldina aquae-negrae
## 2029 Aldina rio-negrae Aldina macrophylla Aldina rionegrae
## taxonID
## 14 1020872-2
## 402 470815-1
## 1298 828439-1
## 1402 5839-2
## 2011 473441-1
## 2029 473441-1
tail(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 1, c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name taxonID
## 45549 Vochysia antioquiae Vochysia antioquia 60457217-2
## 49008 Weinmannia silvicola Weinmannia sylvicola 795085-1
## 491113 Weinmannia trianaea Weinmannia trianae 268328-2
## 51249 Wunderlichia crulsiana Wunderlichia cruelsiana 260740-1
## 56398 Zabelia tyaihyonii Zabelia tyaihyoni 150127-1
## 591112 Zygia macbridii Zygia macbridei 962832-1
… and distances equal to 2 …
nrow(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 2, ]) # 137
## [1] 137
head(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 2, c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name
## 1001 Acropogon calcicolus Acropogon calcicola
## 1018 Acropogon sageniifolius Acropogon sageniifolia
## 1021 Acropogon schumannianus Acropogon schumanniana
## 3522 Apterosperma oblata Apterosperma oblatum
## 3527 Aquilaria banaensis Aquilaria banaense
## 3790 Archidendron oblonga Archidendropsis oblonga Archidendron oblongum
## taxonID
## 1001 77080941-1
## 1018 822024-1
## 1021 822025-1
## 3522 829855-1
## 3527 931120-1
## 3790 911899-1
tail(GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 2, c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name taxonID
## 26749 Trema cannabinum Trema cannabina 856736-1
## 29439 Trigonostemon detritiferus Trigonostemon detritifer 979028-1
## 320211 Turraeanthus africana Turraeanthus africanus 579875-1
## 510211 Wrightia flavorosea Wrightia flavidorosea 82849-1
## 51299 Wurdastom ecuadorense Wurdastom ecuadorensis 994547-1
## 58308 Ziziphus cambodianus Ziziphus cambodiana 719271-1
… and those equal to 3.
GTS.fuzzy[GTS.fuzzy$Fuzzy.dist == 3, c("TaxonName", "scientificName", "Old.name", "taxonID")]
## TaxonName scientificName
## 1092 Actinodaphne leiantha Actinodaphne myriantha
## 4261 Arytera collina Drosera collina
## 4817 Ayenia cuatrecasae Ayenia cuatrecasasii
## 5758 Bembicia uniflora Remijia uniflora
## 12111 Bursera zapoteca Bursera aptera
## 23381 Capparidastrum cuatrecasanum Morisonia cuatrecasasiana
## 28913 Cybianthus pittieri Lycianthes multiflora
## 31432 Cyrtandra longistamina Cryptandra longistaminea
## 50272 Diospyros sennenii Diospyros senensis
## 59661 Drypetes louisii Drypetes dussii
## 22642 Eucalyptus alatissima Eucalyptus plenissima
## 41773 Euphorbia neospinescens Euphorbia cuneata subsp. spinescens
## 14554 Grewia androyensis Grewia angolensis
## 14634 Grewia barorum Grewia baronii
## 15354 Grewia milleri Grewia bicolor
## 15718 Labordia triflora Gagea triflora
## 53355 Miconia castaneiflora Miconia castaneifolia
## 47096 Padus napaulensis Prunus napaulensis
## 28897 Populus hyrcana Populus haoana
## 35567 Protium aidanum Protium bahianum
## 37587 Prunus dielsiana Cotoneaster dielsianus
## 101210 Ruagea beckii Jungia beckii
## 10198 Ruagea obovata Dalea obovata
## 18139 Saurauia cuatrecasana Saurauia cuatrecasasiana
## 421112 Sorbus neglecta Lotus lancerottensis
## 42148 Sorbus obtusifolia Hesperomeles obtusifolia
## 47697 Sterculia holtzei Sterculia megistophylla
## 19659 Ternstroemia huberi Ternstroemia hosei
## Old.name taxonID
## 1092 462296-1
## 4261 77142060-1
## 4817 27519-2
## 5758 302939-2
## 12111 127080-1
## 23381 Capparidastrum cuatrecasasianum 77184037-1
## 28913 Lycianthes pittieri 146571-2
## 31432 717246-1
## 50272 323003-1
## 59661 85490-2
## 22642 593258-1
## 41773 Euphorbia spinescens 880546-1
## 14554 834371-1
## 14634 834077-1
## 15354 Grewia dinteri 834087-1
## 15718 Lloydia triflora 535691-1
## 53355 572249-1
## 47096 730017-1
## 28897 776705-1
## 35567 300177-2
## 37587 Pyrus dielsiana 722471-1
## 101210 315219-2
## 10198 76270-2
## 18139 228033-2
## 421112 Lotus neglecta 503717-1
## 42148 Pyrus obtusifolia 725484-1
## 47697 Sterculia hosei 825342-1
## 19659 830529-1
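To illustrate what these matching distances mean, the sketch below recalculates the distance for three pairs of names from the output above with the base R function adist, which computes a generalized Levenshtein (edit) distance. That this is the same measure as the one reported in Fuzzy.dist is an assumption here; it agrees with the three pairs shown, but I did not verify it for the full list.

```r
# Submitted and matched names taken from the output above
submitted <- c("Vochysia antioquiae", "Trema cannabinum", "Saurauia cuatrecasana")
matched   <- c("Vochysia antioquia", "Trema cannabina", "Saurauia cuatrecasasiana")

# adist() returns a matrix of generalized Levenshtein distances;
# the diagonal holds the distance for each submitted/matched pair
edit.dist <- diag(adist(submitted, matched))
print(data.frame(submitted, matched, edit.dist))
```

The three pairs were listed above under distances of 1, 2 and 3 respectively.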
The custom function can also be used now.
GTS.acceptable <- acceptable.match(GTS.fuzzy)
GTS.acceptable2 <- acceptable.match(GTS.fuzzy, no.vowels=TRUE)
Again, the results of the function suggest that many of the fuzzy matches can be accepted.
nrow(GTS.fuzzy[GTS.acceptable == TRUE, ]) # 130
## [1] 130
head(GTS.fuzzy[GTS.acceptable == TRUE, c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name taxonID
## 1001 Acropogon calcicolus Acropogon calcicola 77080941-1
## 1018 Acropogon sageniifolius Acropogon sageniifolia 822024-1
## 1021 Acropogon schumannianus Acropogon schumanniana 822025-1
## 2029 Aldina rio-negrae Aldina macrophylla Aldina rionegrae 473441-1
## 2046 Alectryon macrococcum Alectryon macrococcus 781658-1
## 2630 Ambavia gerrardii Ambavia gerrardi 72022-1
tail(GTS.fuzzy[GTS.acceptable == TRUE, c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name
## 11499 Tabernaemontana mocquerysii Tabernaemontana mocquerysi
## 13799 Tambourissa castri-delphinii Tambourissa castri-delphini
## 26749 Trema cannabinum Trema cannabina
## 320211 Turraeanthus africana Turraeanthus africanus
## 56398 Zabelia tyaihyonii Zabelia tyaihyoni
## 58308 Ziziphus cambodianus Ziziphus cambodiana
## taxonID
## 11499 82224-1
## 13799 582447-1
## 26749 856736-1
## 320211 579875-1
## 56398 150127-1
## 58308 719271-1
nrow(GTS.fuzzy[GTS.acceptable == FALSE & GTS.acceptable2 == TRUE, ]) # 113
## [1] 113
head(GTS.fuzzy[GTS.acceptable == FALSE & GTS.acceptable2 == TRUE,
c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name
## 14 Abarema cochliocarpos Abarema cochliacarpos
## 402 Acacia macdonnellensis Acacia macdonnelliensis
## 1402 Aegiphila valerioi Aegiphila valerii
## 2011 Aldina aquae-nigrae Aldina macrophylla Aldina aquae-negrae
## 3003 Annickia kummeriae Annickia kummerae
## 3977 Ardisia lancifolia Tapeinosperma lanceifolium Ardisia lanceifolia
## taxonID
## 14 1020872-2
## 402 470815-1
## 1402 5839-2
## 2011 473441-1
## 3003 948443-1
## 3977 590094-1
tail(GTS.fuzzy[GTS.acceptable == FALSE & GTS.acceptable2 == TRUE,
c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name
## 43879 Vitex carvalhi Vitex mossambicensis Vitex carvalhoi
## 45549 Vochysia antioquiae Vochysia antioquia
## 49008 Weinmannia silvicola Weinmannia sylvicola
## 491113 Weinmannia trianaea Weinmannia trianae
## 51249 Wunderlichia crulsiana Wunderlichia cruelsiana
## 591112 Zygia macbridii Zygia macbridei
## taxonID
## 43879 865886-1
## 45549 60457217-2
## 49008 795085-1
## 491113 268328-2
## 51249 260740-1
## 591112 962832-1
About 130 species remain that are not accepted with the rules of the custom function. Checking these species manually has now become a relatively easy task, so the script below saves the species and their matching details. But maybe we should first check whether any of these remaining species could have been matched with World Flora Online? This is done in the next section.
GTS.fuzzy.remain2 <- GTS.fuzzy[GTS.acceptable2 == FALSE, ]
nrow(GTS.fuzzy.remain2) # 129
## [1] 129
head(GTS.fuzzy.remain2[, c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName Old.name taxonID
## 1092 Actinodaphne leiantha Actinodaphne myriantha 462296-1
## 1298 Adinandra milletii Adinandra millettii 828439-1
## 2364 Alnus mandshurica Alnus mandschurica 107681-1
## 3336 Antirhea novobritanniensis Antirhea novobrittanniensis 969414-1
## 3527 Aquilaria banaensis Aquilaria banaense 931120-1
## 4261 Arytera collina Drosera collina 77142060-1
tail(GTS.fuzzy.remain2[, c("TaxonName", "scientificName", "Old.name", "taxonID")])
## TaxonName scientificName
## 19437 Ternstroemia conicocarpa Ternstroemia coniocarpa
## 19659 Ternstroemia huberi Ternstroemia hosei
## 20769 Tetralix moanensis Tetralix moaensis
## 209211 Tetrapterocarpon septentrionalis Tetrapterocarpon septentrionale
## 510211 Wrightia flavorosea Wrightia flavidorosea
## 51299 Wurdastom ecuadorense Wurdastom ecuadorensis
## Old.name taxonID
## 19437 1017244-1
## 19659 830529-1
## 20769 251704-2
## 209211 20004546-1
## 510211 82849-1
## 51299 994547-1
file.save2 <- file.path(getwd(), "GTS_Fuzzy_WCVP_remain.txt")
fwrite(GTS.fuzzy.remain2, file=file.save2, sep="|", row.names=FALSE)
As GlobalTreeSearch was compiled from many different information sources, it is possible that some species that could not be matched with the WCVP are included in World Flora Online. This is what we will check for here.
GTS.recheck <- rbind(GTS.not2[, c("TaxonName", "Author")],
GTS.fuzzy.remain2[, c("TaxonName", "Author")])
nrow(GTS.recheck)
## [1] 181
start.time <- Sys.time()
GTS.rechecked <- WFO.one(WFO.match.fuzzyjoin(spec.data=GTS.recheck,
WFO.data=WFO.data,
spec.name="TaxonName",
Authorship="Author",
fuzzydist.max=3),
verbose=FALSE)
## Checking for fuzzy matches for 72 records
##
## Checking new accepted IDs
end.time <- Sys.time()
end.time - start.time
## Time difference of 4.634521 mins
As the messages already indicated (fuzzy matching was needed for only 72 of the 181 records), some of the species could indeed be matched directly with World Flora Online. Among the species with fuzzy matches, many can also be accepted after a manual check.
nrow(GTS.rechecked[GTS.rechecked$Matched == TRUE, ])
## [1] 151
GTS.rechecked[GTS.rechecked$Matched == FALSE,
c("TaxonName", "scientificName")]
## TaxonName scientificName
## 2 Amphitecna fonceti <NA>
## 3 Artocarpus bergii <NA>
## 5 Beilschmiedia osacola <NA>
## 9 Cinnadenia liyuyingii <NA>
## 14 Cryptocarya sheikelmudiyana <NA>
## 15 Ctenodon molliculus <NA>
## 16 Ctenodon monteiroi <NA>
## 17 Cyrtandra balgooyi <NA>
## 18 Deinbollia onanae <NA>
## 19 Diospyros antakaranae <NA>
## 20 Disepalum rawagambut <NA>
## 21 Endiandra wongawallanensis <NA>
## 23 Englerodendron libassum <NA>
## 26 Grewia mansouriana <NA>
## 27 Humiriastrum purusensis <NA>
## 28 Jarandersonia pereirae <NA>
## 30 Magnolia llanganatensis <NA>
## 36 Prunus klokovii <NA>
## 37 Quercus centenaria <NA>
## 38 Ruagea parvifructa <NA>
## 42 Styrax cambodianus <NA>
## 44 Trichilia deminuta <NA>
## 45 Uvariopsis dicaprio <NA>
## 47 Vepris zapfackii <NA>
## 49 Vochysia caroliae-scottii <NA>
## 50 Volkameria emirnensis <NA>
## 51 Volkameria grevei <NA>
## 52 Zanthoxylum tenuipedicellatum <NA>
## 144 Monteverdia gonoclada <NA>
## 171 Rhododendron suoilenhensis <NA>
GTS.rechecked[GTS.rechecked$Fuzzy == TRUE,
c("TaxonName", "scientificName", "Old.name")]
## TaxonName scientificName
## 6 Beilschmiedia tisseranti Beilschmiedia tisserantii
## 29 Madhuca chia-ananii Madhuca chai-ananii
## 62 Betula murrayana Betula ×purpusii
## 63 Bursera zapoteca Bursera aptera
## 66 Canarium multinervis Canarium multinerve
## 67 Canarium subtilis Canarium subtile
## 68 Capparidastrum cuatrecasanum Morisonia cuatrecasasiana
## 71 Citharexylum mocinnoi Citharexylum mocinoi
## 78 Cyrtandra longistamina Cryptandra longistaminea
## 79 Dalbergia annamensis Dalbergia andapensis
## 82 Dicoryphe buddleoides Dicoryphe buddlejoides
## 84 Diospyros sennenii Diospyros senensis
## 85 Drypetes assymetricarpa Drypetes asymmetricarpa
## 86 Drypetes louisii Drypetes dussii
## 89 Escallonia myrtoides Escallonia myrtoidea
## 90 Euadenia trifoliata Crateva monticola
## 91 Eucalyptus alatissima Eucalyptus plenissima
## 94 Eugenia poroensis Eugenia pardensis
## 104 Gmelina leichardtii Gmelina leichhardtii
## 105 Graffenrieda conostegioides Graffenrieda comostegioides
## 112 Grewia milleri Grimmia milleri
## 114 Guettarda prenleloupii Guettarda preneloupii
## 115 Guettarda wayaensis Guettarda wagapensis
## 136 Memecylon arnhemensis Memecylon arnhemense
## 137 Memecylon plebeium Memecylon plebejum
## 139 Mezoneuron kavaiense Mezoneuron kauaiense
## 141 Miconia doniana Miconia doriana
## 143 Moldenhawera luschnathiana Moldenhawera lushnathiana
## 148 Neraudia melastomifolia Neraudia melastomatifolia
## 153 Pavieasia annamensis Pavieasia anamensis
## 155 Phellocalyx vollescenii Phellocalyx vollesenii
## 160 Plumeria trouinensis Plumeria ×stenopetala
## 170 Quercus mexiae Quercus gambelii
## 172 Ruagea beckii Jungia beckii
## 173 Ruagea obovata Dalea obovata
## 181 Sorbus arvonensis Sorbus arranensis
## 187 Stenostomum albobruneum Stenostomum albobrunneum
## 189 Styrax obassis Styrax obassia
## 190 Syzygium kanneliyensis Syzygium kanneliyense
## 192 Syzygium trukensis Syzygium trukense
## 194 Ternstroemia conicocarpa Ternstroemia coniocarpa
## 198 Wrightia flavorosea Wrightia flavo-rosea
## Old.name
## 6
## 29
## 62 Betula ×murrayana
## 63
## 66
## 67
## 68 Capparidastrum cuatrecasasianum
## 71
## 78
## 79
## 82
## 84
## 85
## 86
## 89
## 90 Euadenia trifoliolata
## 91
## 94
## 104
## 105
## 112
## 114
## 115
## 136
## 137
## 139
## 141
## 143
## 148
## 153
## 155
## 160 Plumeria ×trouinenais
## 170 Quercus media
## 172
## 173
## 181
## 187
## 189
## 190
## 192
## 194
## 198
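One way to prioritize the manual check of such fuzzy matches could be to flag cases where the genus of the matched name differs from the genus of the submitted name, as these are less likely to be simple spelling variants. The sketch below uses three rows copied from the output above. It assumes that Old.name, when not empty, holds the name that was actually matched (with scientificName then showing the accepted name); the helper genus and the column genus.changed are hypothetical names introduced for this illustration.

```r
# Toy rows copied from the fuzzy matches listed above
fuzzy <- data.frame(
  TaxonName      = c("Grewia milleri", "Euadenia trifoliata", "Styrax obassis"),
  scientificName = c("Grimmia milleri", "Crateva monticola", "Styrax obassia"),
  Old.name       = c("", "Euadenia trifoliolata", ""),
  stringsAsFactors = FALSE)

# Assumption: the matched name is Old.name when present, else scientificName
matched.name <- ifelse(fuzzy$Old.name != "", fuzzy$Old.name, fuzzy$scientificName)

# Hypothetical helper that extracts the genus as the first word of a name
genus <- function(x) sub(" .*$", "", x)

# Flag fuzzy matches where the genus of the matched name is different
fuzzy$genus.changed <- genus(fuzzy$TaxonName) != genus(matched.name)
print(fuzzy[, c("TaxonName", "genus.changed")])
```

In this toy example only Grewia milleri is flagged, as it was matched to a name in the genus Grimmia.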
This publication was initiated partially from ongoing work in a Darwin Initiative project (DAREX001) that develops a Global Biodiversity Standard for tree planting. Recently the GlobalUsefulNativeTrees database was released from this project. With scripts such as the ones shown here, when the Global Biodiversity Standard scheme becomes operational, tree planting projects can crosscheck lists of species before applying.
sessionInfo()
## R version 4.2.1 (2022-06-23 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19045)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United Kingdom.utf8
## [2] LC_CTYPE=English_United Kingdom.utf8
## [3] LC_MONETARY=English_United Kingdom.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United Kingdom.utf8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] stringr_1.4.1 data.table_1.14.2 WorldFlora_1.12
##
## loaded via a namespace (and not attached):
## [1] bslib_0.4.0 compiler_4.2.1 pillar_1.8.1 jquerylib_0.1.4
## [5] tools_4.2.1 digest_0.6.29 jsonlite_1.8.0 evaluate_0.16
## [9] lifecycle_1.0.3 tibble_3.1.8 pkgconfig_2.0.3 rlang_1.0.6
## [13] cli_3.4.1 rstudioapi_0.14 yaml_2.3.5 parallel_4.2.1
## [17] fuzzyjoin_0.1.6 xfun_0.33 fastmap_1.1.0 withr_2.5.0
## [21] dplyr_1.0.10 knitr_1.40 generics_0.1.3 vctrs_0.5.1
## [25] sass_0.4.2 tidyselect_1.2.0 glue_1.6.2 R6_2.5.1
## [29] fansi_1.0.3 rmarkdown_2.16 purrr_0.3.4 tidyr_1.2.1
## [33] magrittr_2.0.3 htmltools_0.5.3 stringdist_0.9.10 utf8_1.2.2
## [37] stringi_1.7.8 cachem_1.0.6