1 Packages needed

library(WorldFlora)
library(data.table)
library(dplyr)

2 Introduction

The WorldFlora package (Kindt 2020) was originally designed to use the taxonomic backbone data of World Flora Online (WFO; Borsch et al. 2020; www.worldfloraonline.org).

A new function, new.backbone, introduced in version 1.8 of the package, now allows alternative taxonomic backbone data sets to be used. The only requirement is that the new backbone data include a key subset of variables that correspond to variables of World Flora Online.

3 Prepare a taxonomic backbone database

The example here will standardize species names via the World Checklist of Vascular Plants (version 6 of September 2021; Govaerts et al. 2021).

WCVP (2021). World Checklist of Vascular Plants, version 2.0. Facilitated by the Royal Botanic Gardens, Kew. Published on the Internet; http://wcvp.science.kew.org/ Retrieved 21 September 2021

The checklist was downloaded to a local folder and then loaded via the following script:

data.dir <- "E:\\Roeland\\R\\World Flora Online\\2021"

WCVP <- fread(paste0(data.dir, "//wcvp_v6_sep_2021.txt"),
              encoding="UTF-8")
nrow(WCVP)
## [1] 1196875
head(WCVP)
##     kew_id      family       genus       species infraspecies
## 1:   338-1 Acanthaceae Acanthodium                           
## 2: 44787-1 Acanthaceae Acanthodium      angustum             
## 3: 44788-1 Acanthaceae Acanthodium       capense             
## 4: 44789-1 Acanthaceae Acanthodium  carduifolium             
## 5: 44790-1 Acanthaceae Acanthodium       delilii             
## 6: 44792-1 Acanthaceae Acanthodium diversispinum             
##                   taxon_name     authors    rank  taxonomic_status
## 1:               Acanthodium      Delile   GENUS           Synonym
## 2:      Acanthodium angustum        Nees SPECIES Homotypic_Synonym
## 3:       Acanthodium capense (L.f.) Nees SPECIES Homotypic_Synonym
## 4:  Acanthodium carduifolium (L.f.) Nees SPECIES Homotypic_Synonym
## 5:       Acanthodium delilii      H.Buek SPECIES           Synonym
## 6: Acanthodium diversispinum        Nees SPECIES Homotypic_Synonym
##    accepted_kew_id           accepted_name  accepted_authors parent_kew_id
## 1:           427-1               Blepharis             Juss.              
## 2:         46469-1       Blepharis angusta (Nees) T.Anderson              
## 3:         46487-1      Blepharis capensis      (L.f.) Pers.              
## 4:         44830-1 Acanthopsis carduifolia     (L.f.) Schinz              
## 5:         46503-1        Blepharis edulis   (Forssk.) Pers.              
## 6:         46501-1  Blepharis diversispina (Nees) C.B.Clarke              
##    parent_name parent_authors  reviewed
## 1:                            In review
## 2:                            In review
## 3:                            In review
## 4:                            In review
## 5:                            In review
## 6:                            In review
##                                      publication original_name_id
## 1: Descr. Egypte, Hist. Nat. 2(Mém.): 241 (1813)                 
## 2:        A.P.de Candolle, Prodr. 11: 273 (1847)                 
## 3:                        Linnaea 15: 361 (1841)                 
## 4:        A.P.de Candolle, Prodr. 11: 278 (1847)          44848-1
## 5:                 Gen. Sp. Candoll. 3: 1 (1858)                 
## 6:        A.P.de Candolle, Prodr. 11: 275 (1847)
tail(WCVP)
##        kew_id         family       genus     species infraspecies
## 1:   873720-1 Zygophyllaceae Zygophyllum  waterlotii             
## 2:   873721-1 Zygophyllaceae Zygophyllum   webbianum             
## 3: 77205534-1 Zygophyllaceae Zygophyllum xanthoxylum             
## 4:   873723-1 Zygophyllaceae Zygophyllum zanthoxylum             
## 5: 77184991-1 Zygophyllaceae Zygophyllum   zilloides             
## 6:   873724-1 Zygophyllaceae Zygophyllum  zonuzensis             
##                 taxon_name                     authors    rank
## 1:  Zygophyllum waterlotii                       Maire SPECIES
## 2:   Zygophyllum webbianum                       Coss. SPECIES
## 3: Zygophyllum xanthoxylum              (Bunge) Maxim. SPECIES
## 4: Zygophyllum zanthoxylum             Engl. ex Dippel SPECIES
## 5:   Zygophyllum zilloides (Humbert) Christenh. & Byng SPECIES
## 6:  Zygophyllum zonuzensis                      Sabeti SPECIES
##     taxonomic_status accepted_kew_id                          accepted_name
## 1: Homotypic_Synonym        967774-1 Zygophyllum gaetulum subsp. waterlotii
## 2:           Synonym        873565-1                 Zygophyllum fontanesii
## 3:          Accepted                                                       
## 4:           Synonym      77205534-1                Zygophyllum xanthoxylum
## 5:          Accepted                                                       
## 6:           Synonym                                                       
##                           accepted_authors parent_kew_id parent_name
## 1: (Maire) Dobignard, Jacquemoud & D.Jord.                          
## 2:                         Webb & Berthel.                          
## 3:                                               41749-1 Zygophyllum
## 4:                          (Bunge) Maxim.                          
## 5:                                               41749-1 Zygophyllum
## 6:                                                                  
##    parent_authors  reviewed                                     publication
## 1:                In review Bull. Soc. Hist. Nat. Afrique N. 28: 348 (1937)
## 2:                In review            Bull. Soc. Bot. France 2: 365 (1855)
## 3:             L. In review                         Fl. Tangut.: 103 (1889)
## 4:                In review                 Handb. Laubholzk. 2: 358 (1893)
## 5:             L. In review                         Global Fl. 4: 93 (2018)
## 6:                In review      Forests, Trees & Shrubs of Iran: 12 (1976)
##    original_name_id
## 1:                 
## 2:                 
## 3:         873370-1
## 4:                 
## 5:         873235-1
## 6:

3.1 Change format for hybrid names

In the WFO, the format for hybrid species names is ‘Fragaria ×ananassa’ instead of the format of ‘Fragaria × ananassa’ used in the WCVP. The format of WFO works better with the name matching algorithms of WorldFlora.

WCVP[WCVP$genus == "Fragaria" & WCVP$species == "ananassa", "taxon_name"]
##                                    taxon_name
## 1:                        Fragaria × ananassa
## 2: Fragaria × ananassa nothosubsp. cuneifolia

The following script does the job. However, this process proved to be very slow, so instead I replaced ‘ × ’ with ‘ ×’ in a text editor (EditPad) and then loaded the modified text file.

# the following code is not run

# pattern1 <- paste(" ", intToUtf8(215), " ", sep="")
# pattern2 <- paste(" ", intToUtf8(215), sep="")

#for (i in 1:nrow(WCVP)) {
#  if (round(i/100000) == i/100000) {cat(paste("Reached record ", i, "\n"))}
#  if (grepl(pattern=pattern1, x=WCVP[i, "taxon_name"]) == TRUE) {
#      WCVP[i, "taxon_name"] <- gsub(pattern=pattern1, 
#                                 replacement=pattern2, 
#                                 x=WCVP[i, "taxon_name"])
#  }
#}

# read in the modified text file

WCVP <- fread(paste0(data.dir, "//wcvp_v6_sep_2021 - no spacexspace.txt"),
              encoding="UTF-8")
nrow(WCVP)
## [1] 1196875
WCVP[WCVP$genus == "Fragaria" & WCVP$species == "ananassa", "taxon_name"]
##                                   taxon_name
## 1:                        Fragaria ×ananassa
## 2: Fragaria ×ananassa nothosubsp. cuneifolia
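
A possible alternative to both the row-by-row loop and the text editor: gsub is vectorized over its x argument, so the replacement can be applied to the whole column in a single call. The sketch below uses a small toy table (hypothetical, standing in for WCVP) and has not been timed on the full checklist.

```r
# Multiplication sign built from its code point, as in the script above
times <- intToUtf8(215)
pattern1 <- paste0(" ", times, " ")   # WCVP format: space, sign, space
pattern2 <- paste0(" ", times)        # WFO format: space, sign

# Toy table standing in for WCVP (illustration only)
toy <- data.frame(taxon_name = c(paste0("Fragaria ", times, " ananassa"),
                                 "Fragaria vesca"),
                  stringsAsFactors = FALSE)

# One vectorized call over the whole column; fixed = TRUE treats the
# pattern as a literal string rather than a regular expression
toy$taxon_name <- gsub(pattern1, pattern2, toy$taxon_name, fixed = TRUE)
toy$taxon_name
```

The same assignment should also work on the data.table returned by fread, with WCVP$taxon_name in place of toy$taxon_name.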

3.2 Make the backbone

The new.backbone function of WorldFlora can now be used to create a reference data set that can be used by WorldFlora. Each argument names the WCVP column that corresponds to the WFO variable of the same name.

WCVP.data <- new.backbone(WCVP, 
                          taxonID="kew_id",
                          scientificName="taxon_name",
                          scientificNameAuthorship="authors",
                          acceptedNameUsageID = "accepted_kew_id",
                          taxonomicStatus = "taxonomic_status")
head(WCVP.data)
##    taxonID            scientificName scientificNameAuthorship
## 1:   338-1               Acanthodium                   Delile
## 2: 44787-1      Acanthodium angustum                     Nees
## 3: 44788-1       Acanthodium capense              (L.f.) Nees
## 4: 44789-1  Acanthodium carduifolium              (L.f.) Nees
## 5: 44790-1       Acanthodium delilii                   H.Buek
## 6: 44792-1 Acanthodium diversispinum                     Nees
##    acceptedNameUsageID   taxonomicStatus  kew_id      family       genus
## 1:               427-1           Synonym   338-1 Acanthaceae Acanthodium
## 2:             46469-1 Homotypic_Synonym 44787-1 Acanthaceae Acanthodium
## 3:             46487-1 Homotypic_Synonym 44788-1 Acanthaceae Acanthodium
## 4:             44830-1 Homotypic_Synonym 44789-1 Acanthaceae Acanthodium
## 5:             46503-1           Synonym 44790-1 Acanthaceae Acanthodium
## 6:             46501-1 Homotypic_Synonym 44792-1 Acanthaceae Acanthodium
##          species infraspecies                taxon_name     authors    rank
## 1:                                          Acanthodium      Delile   GENUS
## 2:      angustum                   Acanthodium angustum        Nees SPECIES
## 3:       capense                    Acanthodium capense (L.f.) Nees SPECIES
## 4:  carduifolium               Acanthodium carduifolium (L.f.) Nees SPECIES
## 5:       delilii                    Acanthodium delilii      H.Buek SPECIES
## 6: diversispinum              Acanthodium diversispinum        Nees SPECIES
##     taxonomic_status accepted_kew_id           accepted_name  accepted_authors
## 1:           Synonym           427-1               Blepharis             Juss.
## 2: Homotypic_Synonym         46469-1       Blepharis angusta (Nees) T.Anderson
## 3: Homotypic_Synonym         46487-1      Blepharis capensis      (L.f.) Pers.
## 4: Homotypic_Synonym         44830-1 Acanthopsis carduifolia     (L.f.) Schinz
## 5:           Synonym         46503-1        Blepharis edulis   (Forssk.) Pers.
## 6: Homotypic_Synonym         46501-1  Blepharis diversispina (Nees) C.B.Clarke
##    parent_kew_id parent_name parent_authors  reviewed
## 1:                                          In review
## 2:                                          In review
## 3:                                          In review
## 4:                                          In review
## 5:                                          In review
## 6:                                          In review
##                                      publication original_name_id
## 1: Descr. Egypte, Hist. Nat. 2(Mém.): 241 (1813)                 
## 2:        A.P.de Candolle, Prodr. 11: 273 (1847)                 
## 3:                        Linnaea 15: 361 (1841)                 
## 4:        A.P.de Candolle, Prodr. 11: 278 (1847)          44848-1
## 5:                 Gen. Sp. Candoll. 3: 1 (1858)                 
## 6:        A.P.de Candolle, Prodr. 11: 275 (1847)
tail(WCVP.data)
##       taxonID          scientificName    scientificNameAuthorship
## 1:   873720-1  Zygophyllum waterlotii                       Maire
## 2:   873721-1   Zygophyllum webbianum                       Coss.
## 3: 77205534-1 Zygophyllum xanthoxylum              (Bunge) Maxim.
## 4:   873723-1 Zygophyllum zanthoxylum             Engl. ex Dippel
## 5: 77184991-1   Zygophyllum zilloides (Humbert) Christenh. & Byng
## 6:   873724-1  Zygophyllum zonuzensis                      Sabeti
##    acceptedNameUsageID   taxonomicStatus     kew_id         family       genus
## 1:            967774-1 Homotypic_Synonym   873720-1 Zygophyllaceae Zygophyllum
## 2:            873565-1           Synonym   873721-1 Zygophyllaceae Zygophyllum
## 3:                              Accepted 77205534-1 Zygophyllaceae Zygophyllum
## 4:          77205534-1           Synonym   873723-1 Zygophyllaceae Zygophyllum
## 5:                              Accepted 77184991-1 Zygophyllaceae Zygophyllum
## 6:                               Synonym   873724-1 Zygophyllaceae Zygophyllum
##        species infraspecies              taxon_name                     authors
## 1:  waterlotii               Zygophyllum waterlotii                       Maire
## 2:   webbianum                Zygophyllum webbianum                       Coss.
## 3: xanthoxylum              Zygophyllum xanthoxylum              (Bunge) Maxim.
## 4: zanthoxylum              Zygophyllum zanthoxylum             Engl. ex Dippel
## 5:   zilloides                Zygophyllum zilloides (Humbert) Christenh. & Byng
## 6:  zonuzensis               Zygophyllum zonuzensis                      Sabeti
##       rank  taxonomic_status accepted_kew_id
## 1: SPECIES Homotypic_Synonym        967774-1
## 2: SPECIES           Synonym        873565-1
## 3: SPECIES          Accepted                
## 4: SPECIES           Synonym      77205534-1
## 5: SPECIES          Accepted                
## 6: SPECIES           Synonym                
##                             accepted_name
## 1: Zygophyllum gaetulum subsp. waterlotii
## 2:                 Zygophyllum fontanesii
## 3:                                       
## 4:                Zygophyllum xanthoxylum
## 5:                                       
## 6:                                       
##                           accepted_authors parent_kew_id parent_name
## 1: (Maire) Dobignard, Jacquemoud & D.Jord.                          
## 2:                         Webb & Berthel.                          
## 3:                                               41749-1 Zygophyllum
## 4:                          (Bunge) Maxim.                          
## 5:                                               41749-1 Zygophyllum
## 6:                                                                  
##    parent_authors  reviewed                                     publication
## 1:                In review Bull. Soc. Hist. Nat. Afrique N. 28: 348 (1937)
## 2:                In review            Bull. Soc. Bot. France 2: 365 (1855)
## 3:             L. In review                         Fl. Tangut.: 103 (1889)
## 4:                In review                 Handb. Laubholzk. 2: 358 (1893)
## 5:             L. In review                         Global Fl. 4: 93 (2018)
## 6:                In review      Forests, Trees & Shrubs of Iran: 12 (1976)
##    original_name_id
## 1:                 
## 2:                 
## 3:         873370-1
## 4:                 
## 5:         873235-1
## 6:
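
The tail of the backbone shows a synonym (Zygophyllum zonuzensis) without an accepted ID. A quick sanity check of the links between synonyms and accepted names could look like the sketch below. It is not part of WorldFlora; the toy table copies three rows from the output above, and the object names (toy.backbone, dangling, unresolved) are only illustrative.

```r
# Toy backbone with the five WFO-style columns (rows copied from the tail above)
toy.backbone <- data.frame(
  taxonID = c("77205534-1", "873723-1", "873724-1"),
  scientificName = c("Zygophyllum xanthoxylum",
                     "Zygophyllum zanthoxylum",
                     "Zygophyllum zonuzensis"),
  scientificNameAuthorship = c("(Bunge) Maxim.", "Engl. ex Dippel", "Sabeti"),
  acceptedNameUsageID = c("", "77205534-1", ""),
  taxonomicStatus = c("Accepted", "Synonym", "Synonym"),
  stringsAsFactors = FALSE)

has.link <- toy.backbone$acceptedNameUsageID != ""

# Links that point to a taxonID that is absent from the backbone
dangling <- has.link &
  !(toy.backbone$acceptedNameUsageID %in% toy.backbone$taxonID)

# Synonyms that have no accepted name to resolve to
unresolved <- !has.link & grepl("Synonym", toy.backbone$taxonomicStatus)

sum(dangling)
toy.backbone[unresolved, "scientificName"]
```

Applied to WCVP.data, the same two logical vectors would indicate how many names cannot be traced to a current name.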

4 Checking GlobalTreeSearch species names

The GlobalTreeSearch database provides a global list of tree species names, compiled by a process documented in Beech et al. 2020. The version of GTS used here was version 1.5 of March 2021.

Before loading, the entry for Monoon shendurunii was edited to list only the first set of authors for this species.

GTS <- read.csv(paste0(data.dir, "//global_tree_search.csv"))
GTS <- GTS[, 1:2]
nrow(GTS)
## [1] 58497
head(GTS)
##              TaxonName                                Author
## 1     Abarema abbottii (Rose & Leonard) Barneby & J.W.Grimes
## 2      Abarema acreana                   (J.F.Macbr.) L.Rico
## 3   Abarema adenophora          (Ducke) Barneby & J.W.Grimes
## 4    Abarema alexandri           (Urb.) Barneby & J.W.Grimes
## 5 Abarema asplenifolia        (Griseb.) Barneby & J.W.Grimes
## 6   Abarema auriculata         (Benth.) Barneby & J.W.Grimes

The species names can now be checked via the WFO.match and WFO.one functions of WorldFlora. As a test, only the first 1000 names are checked here.

GTS.match.test <- WFO.one(WFO.match(GTS[1:1000, ], 
                               WFO.data = WCVP.data,
                               spec.name = "TaxonName",
                               Authorship = "Author"))
## Fuzzy matches for  Abarema cochliocarpos were:  Abarema cochliacarpos
## Fuzzy matches for  Acacia macdonnellensis were:  Olearia macdonnellensis, Acacia macdonnelliensis, Acacia macdonnelliensis subsp. macdonnelliensis, Acacia macdonnelliensis subsp. teretifolia
## Best fuzzy matches for  Acacia macdonnellensis were:  Acacia macdonnelliensis
## Fuzzy matches for  Acropogon calcicolus were:  Acropogon calcicola
## Fuzzy matches for  Acropogon sageniifolius were:  Acropogon sageniifolia
## Reached record # 1000
## Fuzzy matches for  Acropogon schumannianus were:  Acropogon schumanniana
## 
##  Checking new accepted IDs
## Reached record # 1000
## Different candidates for original record # 53, including Abies alba
## Found unique non-synonym case for record # 53
## Different candidates for original record # 54, including Abies lasiocarpa var. lasiocarpa
## Found unique non-synonym case for record # 54
## Different candidates for original record # 69, including Abies forrestii var. forrestii
## Found unique non-synonym case for record # 69
## Different candidates for original record # 80, including Abies grandis
## Found unique non-synonym case for record # 80
## Different candidates for original record # 119, including Abuta rufescens
## Found unique non-synonym case for record # 119
## Different candidates for original record # 149, including Acacia amoena
## Found unique non-synonym case for record # 149
## Different candidates for original record # 182, including Acacia binervata
## Found unique non-synonym case for record # 182
## Different candidates for original record # 198, including Acacia calamifolia
## Found unique non-synonym case for record # 198
## Different candidates for original record # 216, including Acacia conniana
## Found unique non-synonym case for record # 216
## Different candidates for original record # 218, including Acacia colletioides
## Found unique non-synonym case for record # 218
## Different candidates for original record # 249, including Acacia dealbata
## Found unique non-synonym case for record # 249
## Different candidates for original record # 270, including Acacia doratoxylon
## Found unique non-synonym case for record # 270
## Different candidates for original record # 277, including Albizia procera
## Found unique non-synonym case for record # 277
## Different candidates for original record # 289, including Acacia falcata
## Found unique non-synonym case for record # 289
## Different candidates for original record # 325, including Acacia koa
## Found unique non-synonym case for record # 325
## Different candidates for original record # 347, including Acacia julifera
## Found unique non-synonym case for record # 347
## Different candidates for original record # 359, including Acacia melanoxylon
## Found unique non-synonym case for record # 359
## Different candidates for original record # 374, including Acacia linearifolia
## Found unique non-synonym case for record # 374
## Different candidates for original record # 377, including Acacia dentifera
## Found unique non-synonym case for record # 377
## Different candidates for original record # 409, including Paraserianthes lophantha subsp. montana
## Found unique non-synonym case for record # 409
## Different candidates for original record # 410, including Senegalia montigena
## Found unique non-synonym case for record # 410
## Different candidates for original record # 447, including Senegalia hamulosa
## Found unique non-synonym case for record # 447
## Different candidates for original record # 476, including Acacia pubescens
## Found unique Accepted case for record # 476
## Different candidates for original record # 500, including Acacia rostellifera
## Found unique non-synonym case for record # 500
## Different candidates for original record # 577, including Acacia verticillata
## Found unique non-synonym case for record # 577
## Different candidates for original record # 616, including Acalypha glabrata
## Found unique non-synonym case for record # 616
## Different candidates for original record # 654, including Acalypha villosa
## Found unique non-synonym case for record # 654
## Different candidates for original record # 680, including Acer amoenum
## Found unique Accepted case for record # 680
## Different candidates for original record # 716, including Acer macrophyllum
## Smallest ID candidates for  716 were:  Acer campbellii subsp. flabellatum, Acer macrophyllum
## Selected record with smallest ID for record # 716
## Different candidates for original record # 772, including Acer palmatum
## Found unique non-synonym case for record # 772
## Different candidates for original record # 775, including Acer acuminatum
## Found unique non-synonym case for record # 775
## Different candidates for original record # 828, including Acer velutinum
## Found unique non-synonym case for record # 828
## Different candidates for original record # 895, including Sweetia fruticosa
## Found unique non-synonym case for record # 895
## Reached case # 1000
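
To get a feel for how close the fuzzy matches reported above are, the edit distance between a submitted name and its candidates can be calculated with adist from base R (generalized Levenshtein distance). This is only an illustration; that it reproduces exactly how WorldFlora ranks the candidates is an assumption.

```r
# Submitted name versus the single candidate reported above
d1 <- adist("Abarema cochliocarpos", "Abarema cochliacarpos")[1, 1]
d1

# One submitted name against two of the reported candidates
candidates <- c("Acacia macdonnelliensis", "Olearia macdonnellensis")
d2 <- adist("Acacia macdonnellensis", candidates)[1, ]
names(d2) <- candidates
d2

# Candidate with the smallest distance
names(d2)[which.min(d2)]
```

The candidate with the smallest distance is Acacia macdonnelliensis, which is also the name listed as the best fuzzy match in the output.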

When checking the full species list, details of the matching process are suppressed by setting the argument ‘verbose’ to FALSE. This is done here simply to reduce the size of the output file; in practical applications, typically with smaller data sets, I recommend displaying these details.

GTS.match <- WFO.one(WFO.match(GTS, 
                               WFO.data = WCVP.data,
                               spec.name = "TaxonName",
                               Authorship = "Author",
                               verbose = FALSE, 
                               counter = 5000),
                     verbose = FALSE, counter = 10000)
## Reached record # 5000
## Reached record # 10000
## Reached record # 15000
## Reached record # 20000
## Reached record # 25000
## Reached record # 30000
## Reached record # 35000
## Reached record # 40000
## Reached record # 45000
## Reached record # 50000
## Reached record # 55000
## 
##  Checking new accepted IDs
## Reached record # 5000
## Reached record # 10000
## Reached record # 15000
## Reached record # 20000
## Reached record # 25000
## Reached record # 30000
## Reached record # 35000
## Reached record # 40000
## Reached record # 45000
## Reached record # 50000
## Reached record # 55000
## Reached record # 60000
## Reached case # 10000
## Reached case # 20000
## Reached case # 30000
## Reached case # 40000
## Reached case # 50000

5 Analysis of matches

For the majority of GTS species, there was a direct match with a species name in WCVP.

nrow(GTS.match[GTS.match$Fuzzy == FALSE, ])
## [1] 57918
nrow(GTS.match[GTS.match$Fuzzy == FALSE, ]) / nrow(GTS.match)
## [1] 0.9901021
nrow(GTS.match[GTS.match$Fuzzy == TRUE, ])
## [1] 579
nrow(GTS.match[GTS.match$Fuzzy == TRUE, ]) / nrow(GTS.match)
## [1] 0.009897943

The following script saves copies of the matched names and details about the matching process.

GTS.m <- GTS.match[GTS.match$Fuzzy==FALSE, ]
GTS.m <- GTS.m[, c("TaxonName", "Author", 
                   "kew_id", "family", "taxon_name", "taxonomic_status",
                   "authors", "Auth.dist",
                   "New.accepted", "Old.status")]

GTS.m.file <- paste0(data.dir, "/GTS direct match.txt")
write.table(GTS.m, file=GTS.m.file, 
            quote=FALSE, sep="|",
            row.names=FALSE,
            fileEncoding="UTF-8")

GTS.nm <- GTS.match[GTS.match$Fuzzy==TRUE, ]

GTS.nm.file <- paste0(data.dir, "/GTS fuzzy match.txt")
write.table(GTS.nm, file=GTS.nm.file, 
            quote=FALSE, sep="|",
            row.names=FALSE,
            fileEncoding="UTF-8")
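
The files are written with ‘|’ as separator and without quotes. A plausible reason (an assumption) is that author and publication strings contain commas. The sketch below shows a round trip with matching settings on a toy table (two names and author strings taken from the WCVP output above, written to a temporary file), so that such fields are read back unchanged.

```r
# Toy table; the first author string contains a comma
toy <- data.frame(
  taxon_name = c("Zygophyllum gaetulum subsp. waterlotii", "Blepharis edulis"),
  authors = c("(Maire) Dobignard, Jacquemoud & D.Jord.", "(Forssk.) Pers."),
  stringsAsFactors = FALSE)

tmp <- tempfile(fileext = ".txt")
write.table(toy, file = tmp,
            quote = FALSE, sep = "|",
            row.names = FALSE,
            fileEncoding = "UTF-8")

# Read back with the same separator; quote = "" and comment.char = ""
# switch off quote and comment handling so fields are taken literally
back <- read.table(tmp, header = TRUE, sep = "|",
                   quote = "", comment.char = "",
                   stringsAsFactors = FALSE,
                   fileEncoding = "UTF-8")
identical(back$authors, toy$authors)
```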

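For two tree species, no matching names could be found. Note that ‘Fuzzy’ is recorded as FALSE for these records, indicating that no fuzzy match was ultimately retained. These two records are therefore included in the 57918 records counted above as not fuzzy.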

GTS.match[GTS.match$Matched == FALSE, ]
##                TaxonName    Author     TaxonName.ORIG Squished
## 24080    Gea crassifolia   Achille    Gea crassifolia    FALSE
## 24531 Glycomosis superba B.C.Stone Glycomosis superba    FALSE
##       Brackets.detected Number.detected Unique Matched Fuzzy Fuzzy.toomany
## 24080             FALSE           FALSE   TRUE   FALSE FALSE           585
## 24531             FALSE           FALSE   TRUE   FALSE FALSE             0
##       Fuzzy.two Fuzzy.one Fuzzy.dist Auth.dist OriSeq Subseq taxonID
## 24080     FALSE     FALSE         NA         7  24080      1        
## 24531     FALSE      TRUE         NA         9  24531      1        
##       scientificName scientificNameAuthorship acceptedNameUsageID
## 24080                                                            
## 24531                                                            
##       taxonomicStatus kew_id family genus species infraspecies taxon_name
## 24080                                                                    
## 24531                                                                    
##       authors rank taxonomic_status accepted_kew_id accepted_name
## 24080                                                            
## 24531                                                            
##       accepted_authors parent_kew_id parent_name parent_authors reviewed
## 24080                                                                   
## 24531                                                                   
##       publication original_name_id Hybrid New.accepted Old.status Old.ID
## 24080                                            FALSE                  
## 24531                                            FALSE                  
##       Old.name Old.author Old.author.dist One.Reason
## 24080                                               
## 24531
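
Because unmatched names also have ‘Fuzzy’ set to FALSE, a cross-tabulation of ‘Matched’ and ‘Fuzzy’ separates direct matches from names without any match. The sketch below uses a toy table with the same two column names; the flags for Abarema abbottii are illustrative, those of the other two names follow the output shown earlier.

```r
# Toy result with the two logical columns of interest
toy.match <- data.frame(
  TaxonName = c("Abarema abbottii", "Abarema cochliocarpos", "Gea crassifolia"),
  Matched = c(TRUE, TRUE, FALSE),
  Fuzzy = c(FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE)

# Cross-tabulation of the two flags
tab <- table(Matched = toy.match$Matched, Fuzzy = toy.match$Fuzzy)
tab

# Direct matches: matched and not fuzzy
direct <- toy.match$Matched & !toy.match$Fuzzy
sum(direct)
```

For the full result, subtracting the two unmatched names from the 57918 non-fuzzy records would leave 57916 direct matches.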

Over 3000 tree species were directly matched with a synonym.

nrow(GTS.m[GTS.m$New.accepted == TRUE, ])
## [1] 3052
GTS.m[GTS.m$New.accepted == TRUE, c("TaxonName", "taxon_name", 
                                    "taxonomic_status", "Old.status")][1:50, ]
##                         TaxonName                            taxon_name
## 2                 Abarema acreana                   Hydrochorea acreana
## 40                Abarema obovata                 Abarema brachystachya
## 129           Abutilon peruvianum                  Callianthe peruviana
## 130             Abutilon purpusii                   Callianthe purpusii
## 430                Acacia oliveri                     Senegalia senegal
## 440              Acacia ouyrarema                      Mimosa ouyrarema
## 456             Acacia pennivenia                  Vachellia pennivenia
## 602             Acalypha benensis                      Acalypha stricta
## 629           Acalypha multiflora                   Acalypha flagellata
## 652        Acalypha suirenbiensis                 Acalypha cardiophylla
## 675               Acca sellowiana                     Feijoa sellowiana
## 680.1                Acer amoenum                        Acer paihengii
## 682               Acer anhweiense                         Acer robustum
## 709                Acer divergens    Acer cappadocicum subsp. divergens
## 716              Acer flabellatum                     Acer macrophyllum
## 717               Acer floridanum      Acer saccharum subsp. floridanum
## 727            Acer heptaphlebium    Acer campbellii subsp. flabellatum
## 729                  Acer hookeri                       Acer sikkimense
## 734                Acer kawakamii                    Acer caudatifolium
## 744               Acer leucoderme      Acer saccharum subsp. leucoderme
## 745                  Acer lobelii      Acer cappadocicum subsp. lobelii
## 756              Acer miaotaiense       Acer miyabei subsp. miaotaiense
## 762                   Acer nigrum          Acer saccharum subsp. nigrum
## 766              Acer okamotoanum               Acer pictum subsp. mono
## 770               Acer osmastonii                       Acer calcaratum
## 790                Acer rubescens                     Acer morrisonense
## 812              Acer takesimense               Acer pseudosieboldianum
## 824            Acer turkestanicum Acer platanoides subsp. turkestanicum
## 860              Achuaria hirsuta                       Raputia hirsuta
## 886          Acnistus arborescens                  Iochroma arborescens
## 961            Acronychia porteri                Maclurodendron porteri
## 1036       Actinodaphne cuspidata                Actinodaphne notabilis
## 1052       Actinodaphne gullavara             Actinodaphne angustifolia
## 1100  Actinodaphne philippinensis                     Litsea whitfordii
## 1161    Adelobotrys antioquiensis                      Meriania hoyosii
## 1397     Aequatorium cajamarcense            Nordenstamia cajamarcensis
## 1398      Aequatorium carpishense             Nordenstamia carpishensis
## 1401        Aequatorium limonense               Nordenstamia limonensis
## 1403      Aequatorium rimachianum               Nordenstamia rimachiana
## 1404  Aequatorium stellatopilosum           Nordenstamia stellatopilosa
## 1406           Aeschrion cubensis                     Picrasma cubensis
## 1425              Aesculus wangii                     Aesculus assamica
## 1461             Afzelia coriacea                      Sindora coriacea
## 1474            Agapetes pilifera                   Vaccinium piliferum
## 1531          Aglaia bourdillonii                   Aglaia elaeagnoidea
## 1546             Aglaia dasyclada                    Aglaia spectabilis
## 1568                Aglaia iloilo                       Aglaia argentea
## 1705         Aiouea costaricensis                      Ocotea insularis
## 1716         Aiouea guatemalensis              Damburneya guatemalensis
## 1741            Aiouea parvissima                 Damburneya parvissima
##       taxonomic_status        Old.status
## 2             Accepted Homotypic_Synonym
## 40            Accepted           Synonym
## 129           Accepted Homotypic_Synonym
## 130           Accepted Homotypic_Synonym
## 430           Accepted           Synonym
## 440           Unplaced           Synonym
## 456           Accepted Homotypic_Synonym
## 602           Accepted           Synonym
## 629           Accepted           Synonym
## 652           Accepted           Synonym
## 675           Accepted Homotypic_Synonym
## 680.1         Accepted           Synonym
## 682           Accepted           Synonym
## 709           Accepted Homotypic_Synonym
## 716           Accepted           Synonym
## 717           Accepted Homotypic_Synonym
## 727           Accepted           Synonym
## 729           Accepted           Synonym
## 734           Accepted           Synonym
## 744           Accepted Homotypic_Synonym
## 745           Accepted Homotypic_Synonym
## 756           Accepted Homotypic_Synonym
## 762           Accepted Homotypic_Synonym
## 766           Accepted           Synonym
## 770           Accepted           Synonym
## 790           Accepted           Synonym
## 812           Accepted           Synonym
## 824           Accepted Homotypic_Synonym
## 860           Accepted Homotypic_Synonym
## 886           Accepted Homotypic_Synonym
## 961           Accepted Homotypic_Synonym
## 1036          Accepted           Synonym
## 1052          Accepted           Synonym
## 1100          Accepted Homotypic_Synonym
## 1161          Accepted           Synonym
## 1397          Accepted Homotypic_Synonym
## 1398          Accepted Homotypic_Synonym
## 1401          Accepted Homotypic_Synonym
## 1403          Accepted Homotypic_Synonym
## 1404          Accepted           Synonym
## 1406          Accepted Homotypic_Synonym
## 1425          Accepted           Synonym
## 1461          Accepted Homotypic_Synonym
## 1474          Accepted Homotypic_Synonym
## 1531          Accepted           Synonym
## 1546          Accepted           Synonym
## 1568          Accepted           Synonym
## 1705          Accepted           Synonym
## 1716          Accepted Homotypic_Synonym
## 1741          Accepted Homotypic_Synonym

Among the fuzzy matches, 32 tree species were matched with the name of a hybrid species.

nrow(GTS.nm[GTS.nm$Hybrid == "×", ])
## [1] 32
GTS.nm[GTS.nm$Hybrid == "×", c("TaxonName", "taxon_name", 
                               "taxonomic_status", "Old.status")]
##                      TaxonName                 taxon_name taxonomic_status
## 1176       Adenanthos sericeus       Adenanthos ×sericeus         Accepted
## 5926          Betula murrayana           Betula ×purpusii         Accepted
## 6744        Bruguiera hainesii        Bruguiera ×hainesii         Accepted
## 8568     Carramboa rodriguezii     Espeletia ×rodriguezii         Accepted
## 13234        Crataegus anamesa         Crataegus ×anamesa         Accepted
## 13283       Crataegus disperma        Crataegus ×disperma         Accepted
## 13284       Crataegus dispessa        Crataegus ×dispessa         Accepted
## 13308        Crataegus grandis        Crataegus ×disperma         Accepted
## 13311      Crataegus harveyana        Crataegus ×rupicola         Accepted
## 13328  Crataegus knieskerniana   Crataegus ×knieskerniana         Accepted
## 13410         Crataegus rufula          Crataegus ×rufula         Accepted
## 13425    Crataegus stenosepala     Crataegus ×stenosepala         Accepted
## 13438        Crataegus vailiae         Crataegus ×vailiae         Accepted
## 18957     Elizabetha grahamiae          Paloue ×grahamiae         Accepted
## 19109      Endiandra monothyra       Endiandra ×monothyra         Accepted
## 23726       Garcinia guacopary        Garcinia ×guacopary         Accepted
## 28309          Inga andersonii           Inga ×andersonii         Accepted
## 32128  Machaerium salvadorense   Machaerium ×salvadorense         Accepted
## 33105         Malus floribunda          Malus ×floribunda         Accepted
## 33116         Malus micromalus          Malus ×micromalus          Synonym
## 34442       Mespilus canescens × Crataemespilus canescens         Accepted
## 40926     Paulownia taiwaniana      Paulownia ×taiwaniana         Accepted
## 42790   Pittosporum obcordatum    Pittosporum ×obcordatum         Accepted
## 43311     Plumeria trouinensis      Plumeria ×stenopetala         Accepted
## 44465       Prosopis vinalillo        Prosopis ×vinalillo         Accepted
## 46011          Pyrus anatolica           Pyrus ×michauxii         Accepted
## 46059      Pyrus sinkiangensis       Pyrus ×sinkiangensis         Accepted
## 47374 Rhododendron wilhelminae  Rhododendron ×wilhelminae         Accepted
## 48255         Salix heteromera          Salix ×heteromera         Unplaced
## 50789   Sonneratia hainanensis    Sonneratia ×hainanensis         Accepted
## 50948       Sorbus leptophylla          Aria ×leptophylla         Accepted
## 58179             Yucca rigida              Yucca ×rigida          Synonym
##              Old.status
## 1176                   
## 5926            Synonym
## 6744                   
## 8568  Homotypic_Synonym
## 13234                  
## 13283                  
## 13284                  
## 13308           Synonym
## 13311           Synonym
## 13328                  
## 13410                  
## 13425                  
## 13438                  
## 18957 Homotypic_Synonym
## 19109                  
## 23726                  
## 28309                  
## 32128                  
## 33105                  
## 33116                  
## 34442 Homotypic_Synonym
## 40926                  
## 42790                  
## 43311           Synonym
## 44465                  
## 46011           Synonym
## 46059                  
## 47374                  
## 48255                  
## 50789                  
## 50948 Homotypic_Synonym
## 58179
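
Where the Hybrid column is not available, hybrid matches can also be flagged by searching the matched name for the multiplication sign. The following is a minimal sketch under that assumption; the data frame `toy` is hypothetical, holding three rows copied from the output above and one non-hybrid name added for contrast. The escape `"\u00d7"` is used instead of the literal sign to sidestep encoding problems in the script file.

```r
# Hypothetical toy data: three rows copied from the output above plus one
# non-hybrid name added for contrast.
toy <- data.frame(
  TaxonName  = c("Betula murrayana", "Malus floribunda",
                 "Mespilus canescens", "Acer campestre"),
  taxon_name = c("Betula \u00d7purpusii", "Malus \u00d7floribunda",
                 "\u00d7 Crataemespilus canescens", "Acer campestre"),
  stringsAsFactors = FALSE
)

# fixed = TRUE treats the pattern as a literal string, not a regular expression
toy$is.hybrid <- grepl("\u00d7", toy$taxon_name, fixed = TRUE)

toy[toy$is.hybrid, c("TaxonName", "taxon_name")]
sum(toy$is.hybrid)
```

Searching the whole name also catches intergeneric hybrids, where the sign precedes the genus rather than the epithet.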

In most cases where fuzzy matching was used (73%), the Levenshtein distance was 1 or 2.

nrow(GTS.nm[GTS.nm$Fuzzy.dist == 1, ])
## [1] 259
nrow(GTS.nm[GTS.nm$Fuzzy.dist == 1, ]) / nrow(GTS.match[GTS.match$Fuzzy == TRUE, ])
## [1] 0.447323
nrow(GTS.nm[GTS.nm$Fuzzy.dist < 3, ])
## [1] 423
nrow(GTS.nm[GTS.nm$Fuzzy.dist < 3, ]) / nrow(GTS.match[GTS.match$Fuzzy == TRUE, ])
## [1] 0.7305699
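
To make these distances concrete, the sketch below recomputes them for two name pairs listed in section 6, using `utils::adist` from base R, which calculates a generalized Levenshtein distance. That the Fuzzy.dist column is computed in exactly this way is an assumption here, although the values agree for these examples.

```r
# Names as submitted (TaxonName) and as matched in the WCVP (taxon_name),
# copied from the fuzzy-match listings in section 6.
submitted <- c("Bathysa mendoncaei", "Acropogon calcicolus")
matched   <- c("Bathysa mendoncae",  "Acropogon calcicola")

# adist() returns a matrix of pairwise distances; the diagonal holds the
# distance between each submitted name and its own match.
d <- diag(utils::adist(submitted, matched))
names(d) <- submitted
d
```

The first pair differs by one deleted letter (distance 1); the second needs one substitution and one deletion (distance 2).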

Where the Levenshtein distance was larger than 10, typically only a matching genus name could be found.

nrow(GTS.nm[GTS.nm$Fuzzy.dist > 10, ])
## [1] 64
nrow(GTS.nm[GTS.nm$Fuzzy.dist > 10, ]) / nrow(GTS.match[GTS.match$Fuzzy == TRUE, ])
## [1] 0.1105354
GTS.nm[GTS.nm$Fuzzy.dist > 10, c("TaxonName", "taxon_name", 
                                 "taxonomic_status", "rank", "Old.status")]
##                           TaxonName             taxon_name taxonomic_status
## 806                 Acer sosnowskyi                   Acer         Accepted
## 3094             Annona penicillata                 Annona         Accepted
## 5642       Beilschmiedia tisseranti          Beilschmiedia         Accepted
## 6249              Boscia mazzocchii                 Boscia         Accepted
## 6294      Bougainvillea fasciculata          Bougainvillea         Accepted
## 7966         Camellia hengchunensis               Camellia         Accepted
## 8987    Castanopsis damingshanensis            Castanopsis         Accepted
## 9010       Castanopsis globigemmata            Castanopsis         Accepted
## 9047         Castanopsis nigrescens            Castanopsis         Accepted
## 9078       Castanopsis subuliformis            Castanopsis         Accepted
## 10197         Cinnadenia liyuyingii             Cinnadenia         Accepted
## 12947        Cotoneaster ellipticus            Cotoneaster         Accepted
## 14375   Cryptocarya sheikelmudiyana            Cryptocarya         Accepted
## 14457           Ctenodon molliculus               Ctenodon         Accepted
## 14459         Ctenodon mucronulatus               Ctenodon         Accepted
## 15907          Desmopsis colombiana              Desmopsis         Accepted
## 19237      Englerodaphne ovalifolia          Englerodaphne         Accepted
## 23415           Freziera caesariata               Freziera         Accepted
## 23418          Freziera campanulata               Freziera         Accepted
## 23436          Freziera glabrescens               Freziera         Accepted
## 23713        Garcinia erythrosepala               Garcinia         Accepted
## 24891         Gordonia singaporeana               Gordonia         Accepted
## 25069           Grewia delphinensis                 Grewia         Accepted
## 25113            Grewia mansouriana                 Grewia         Accepted
## 27276          Hopea quisumbingiana                  Hopea         Accepted
## 28130            Ilex qingyuanensis                   Ilex         Accepted
## 32418           Madhuca chia-ananii                Madhuca         Accepted
## 33223        Mangifera salomonensis              Mangifera         Accepted
## 37300         Myrsine perpauciflora                Myrsine         Accepted
## 38153      Nototrichium sandwicense           Nototrichium         Accepted
## 38278           Ochrosia tahitensis               Ochrosia         Accepted
## 38482             Ocotea javitensis                 Ocotea         Accepted
## 39210            Ormosia boluoensis                Ormosia         Accepted
## 40235             Paloue guianensis                 Paloue         Accepted
## 40962   Pausinystalia brachythyrsum             Corynanthe         Accepted
## 42164       Piliocalyx ignambiensis               Syzygium         Accepted
## 42873     Plagioscyphus calciphilus          Plagioscyphus         Accepted
## 42875      Plagioscyphus danguyanus          Plagioscyphus         Accepted
## 42879    Plagioscyphus meridionalis          Plagioscyphus         Accepted
## 42880      Plagioscyphus unijugatus          Plagioscyphus         Accepted
## 43006         Plathymenia foliolosa Plathymenia reticulata         Accepted
## 43131       Plerandra letocartiorum              Plerandra         Accepted
## 43132          Plerandra longistyla              Plerandra         Accepted
## 43134        Plerandra memaoyaensis              Plerandra         Accepted
## 43145      Plerandra pouemboutensis              Plerandra         Accepted
## 43152          Plerandra tronchetii              Plerandra         Accepted
## 43850            Populus minhoensis                Populus         Accepted
## 46054              Pyrus sachokiana                  Pyrus         Accepted
## 46265            Quercus centenaria                Quercus         Accepted
## 47240   Rhododendron laojunshanense           Rhododendron         Accepted
## 47885            Ruagea parvifructa                 Ruagea         Accepted
## 48794          Scaphium scaphigerum               Scaphium         Accepted
## 49474       Scyphostegia borneensis           Scyphostelma         Accepted
## 49564       Seguieria paraguayensis              Seguieria         Accepted
## 50973            Sorbus obtusifolia                 Sorbus         Accepted
## 51545          Sterculia multiovula              Sterculia         Accepted
## 51572        Sterculia scortechinii              Sterculia         Accepted
## 51824   Stryphnodendron racemiferum        Stryphnodendron         Accepted
## 54123          Tapiscia yunnanensis               Tapiscia         Accepted
## 55621        Trilepisium gymnandrum            Trilepisium         Accepted
## 56275       Vachellia ormocarpoides              Vachellia         Accepted
## 56615           Vepris robertsoniae                 Vepris         Accepted
## 57242.1       Volkameria emirnensis             Volkameria         Accepted
## 57473        Weinmannia turckheimii             Weinmannia         Accepted
##            rank Old.status
## 806       GENUS           
## 3094      GENUS           
## 5642      GENUS           
## 6249      GENUS           
## 6294      GENUS           
## 7966      GENUS           
## 8987      GENUS           
## 9010      GENUS           
## 9047      GENUS           
## 9078      GENUS           
## 10197     GENUS           
## 12947     GENUS           
## 14375     GENUS           
## 14457     GENUS           
## 14459     GENUS           
## 15907     GENUS           
## 19237     GENUS           
## 23415     GENUS           
## 23418     GENUS           
## 23436     GENUS           
## 23713     GENUS           
## 24891     GENUS           
## 25069     GENUS           
## 25113     GENUS           
## 27276     GENUS           
## 28130     GENUS           
## 32418     GENUS           
## 33223     GENUS           
## 37300     GENUS           
## 38153     GENUS           
## 38278     GENUS           
## 38482     GENUS           
## 39210     GENUS           
## 40235     GENUS           
## 40962     GENUS    Synonym
## 42164     GENUS    Synonym
## 42873     GENUS           
## 42875     GENUS           
## 42879     GENUS           
## 42880     GENUS           
## 43006   SPECIES    Synonym
## 43131     GENUS           
## 43132     GENUS           
## 43134     GENUS           
## 43145     GENUS           
## 43152     GENUS           
## 43850     GENUS           
## 46054     GENUS           
## 46265     GENUS           
## 47240     GENUS           
## 47885     GENUS           
## 48794     GENUS           
## 49474     GENUS           
## 49564     GENUS           
## 50973     GENUS           
## 51545     GENUS           
## 51572     GENUS           
## 51824     GENUS           
## 54123     GENUS           
## 55621     GENUS           
## 56275     GENUS           
## 56615     GENUS           
## 57242.1   GENUS           
## 57473     GENUS
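
The large distances for genus-only matches follow directly from the length of the species epithet: turning a full name into its genus means deleting the space and every letter of the epithet. A minimal sketch with three names from the listing above, again assuming that `utils::adist` reflects how Fuzzy.dist is calculated:

```r
submitted <- c("Acer sosnowskyi", "Boscia mazzocchii",
               "Castanopsis damingshanensis")
genus <- sub(" .*$", "", submitted)   # text before the first space

# Distance from each full name to its genus-only match
d.genus <- diag(utils::adist(submitted, genus))

# Number of characters that have to be deleted: the space plus the epithet
n.deleted <- nchar(submitted) - nchar(genus)

data.frame(submitted, genus, d.genus, n.deleted)
```

Both columns give 11, 11 and 16, the Fuzzy.dist values reported for these names in section 6. A distance above 10 is therefore better read as "species not found" than as a close spelling variant.
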
nrow(GTS.nm[GTS.nm$Fuzzy.one == TRUE, ])
## [1] 127
nrow(GTS.nm[GTS.nm$Fuzzy.one == TRUE, ]) / nrow(GTS.match[GTS.match$Fuzzy == TRUE, ])
## [1] 0.2193437
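
The distance classes used in this section (1 or 2, 3 to 10, above 10) can also be tabulated in one step instead of with repeated `nrow` calls. The sketch below uses a hypothetical vector `fuzzy.dist` in place of `GTS.nm$Fuzzy.dist`; the class boundaries are those used in the text.

```r
# Hypothetical distances, for illustration only
fuzzy.dist <- c(1, 1, 2, 3, 7, 11, 12, 1, 2, 16)

# right = TRUE (the default) gives the intervals (0,2], (2,10] and (10,Inf]
dist.class <- cut(fuzzy.dist,
                  breaks = c(0, 2, 10, Inf),
                  labels = c("1-2", "3-10", ">10"))

counts <- table(dist.class)
counts
round(prop.table(counts), 3)
```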

6 Matching with World Flora Online

Species names that could not be matched directly with the WCVP are now matched against World Flora Online. Afterwards, the WCVP and WFO matches are compared.

Selecting species names that were not matched directly with the WCVP:

GTS.recheck1 <- GTS.nm[, c("TaxonName", "Author")]

GTS.recheck2 <- GTS.match[GTS.match$Matched == FALSE, c("TaxonName", "Author")]

GTS.recheck <- rbind(GTS.recheck1, GTS.recheck2)

Loading the World Flora Online taxonomic backbone:

WFO.remember()
## Data sourced from: C:\R-3.6.1\bin\x64\classification.txt (Wed Sep 22 14:00:37 2021)
## Reading WFO data
## Warning in data.table::fread(WFO.file1, encoding = "UTF-8"): Found and resolved
## improper quoting out-of-sample. First healed line 118: <<wfo-0000000117
## Hieracium onosmoides subsp. sphaerianthum SUBSPECIES wfo-0000034880 (Arv.-
## Touv.) Zahn Asteraceae Hieracium onosmoides sphaerianthum subsp. "Zahn, in
## Engler, Pflanzenr. 82. 1923." 1676 1923 Accepted 2012-02-11 2012-02-11 http://
## www.theplantlist.org/tpl1.1/record/gcc-10011>>. If the fields are not quoted
## (e.g. field separator does not appear within any field), try quote="" to avoid
## this warning.
## The WFO data is now available from WFO.data
GTS.WFO <- WFO.one(WFO.match(GTS.recheck, 
                               WFO.data = WFO.data,
                               spec.name = "TaxonName",
                               Authorship = "Author",
                               verbose = FALSE, 
                               counter = 100),
                     verbose = FALSE, counter = 100)
## Reached record # 100
## Reached record # 200
## Reached record # 300
## Reached record # 400
## Reached record # 500
## 
##  Checking new accepted IDs
## Reached record # 100
## Reached record # 200
## Reached record # 300
## Reached record # 400
## Reached record # 500
## Reached case # 100
## Reached case # 200
## Reached case # 300
## Reached case # 400
## Reached case # 500

Most species that could not be matched directly with the WCVP could be matched directly with WFO (478 of 581, or 82%).

nrow(GTS.WFO[GTS.WFO$Fuzzy == FALSE, ])
## [1] 478
nrow(GTS.WFO[GTS.WFO$Fuzzy == FALSE, ]) / nrow(GTS.WFO)
## [1] 0.8227194
nrow(GTS.WFO[GTS.WFO$Fuzzy == TRUE, ])
## [1] 103
nrow(GTS.WFO[GTS.WFO$Fuzzy == TRUE, ]) / nrow(GTS.WFO)
## [1] 0.1772806
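
Because Fuzzy is a logical column, the same counts and proportions can be obtained with `sum` and `mean`. A minimal sketch on a hypothetical stand-in for GTS.WFO:

```r
# Hypothetical stand-in for GTS.WFO with only the Fuzzy column
toy.WFO <- data.frame(Fuzzy = c(FALSE, FALSE, TRUE, FALSE, TRUE))

n.direct <- sum(!toy.WFO$Fuzzy)    # rows matched without fuzzy matching
p.direct <- mean(!toy.WFO$Fuzzy)   # same as n.direct / nrow(toy.WFO)
p.fuzzy  <- mean(toy.WFO$Fuzzy)

c(n.direct = n.direct, p.direct = p.direct, p.fuzzy = p.fuzzy)
```

If the column could contain missing values, `na.rm = TRUE` would be needed in both functions; the toy data here has none.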

Saving the results locally:

GTS.m2 <- GTS.WFO[GTS.WFO$Fuzzy==FALSE, ]
GTS.m2 <- GTS.m2[, c("TaxonName", "Author", 
                   "taxonID", "family", "scientificName", "taxonomicStatus",
                   "scientificNameAuthorship", "Auth.dist",
                   "New.accepted", "Old.status")]

GTS.m2.file <- paste0(data.dir, "/WFO direct match.txt")
write.table(GTS.m2, file=GTS.m2.file, 
            quote=FALSE, sep="|",
            row.names=FALSE,
            fileEncoding="UTF-8")

GTS.nm2 <- GTS.WFO[GTS.WFO$Fuzzy==TRUE, ]

GTS.nm2.file <- paste0(data.dir, "/WFO fuzzy match.txt")
write.table(GTS.nm2, file=GTS.nm2.file, 
            quote=FALSE, sep="|",
            row.names=FALSE,
            fileEncoding="UTF-8")
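
Files saved this way can be read back with matching settings. The sketch below is a round trip on a hypothetical two-row table written to a temporary file; the author strings are placeholders, and `read.table` from base R is used here only as one possible reader.

```r
# Hypothetical two-row table standing in for GTS.m2; authors are placeholders
toy <- data.frame(
  TaxonName = c("Acer sosnowskyi", "Boscia mazzocchii"),
  Author    = c("Author A", "Author B"),
  stringsAsFactors = FALSE
)

toy.file <- tempfile(fileext = ".txt")
write.table(toy, file = toy.file,
            quote = FALSE, sep = "|",
            row.names = FALSE,
            fileEncoding = "UTF-8")

# quote = "" and comment.char = "" stop read.table from treating apostrophes
# or "#" in names and author strings as special characters
toy.back <- read.table(toy.file, header = TRUE, sep = "|",
                       quote = "", comment.char = "",
                       fileEncoding = "UTF-8",
                       stringsAsFactors = FALSE)

identical(toy$TaxonName, toy.back$TaxonName)
```

The separator "|" works because it is unlikely to occur inside a taxon name or author string, whereas commas and spaces occur in both.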

WFO contains a species-level name for many of the cases where the WCVP match was possible only at the genus level.

GTS.comb <- dplyr::right_join(GTS.m2[, c("TaxonName", "scientificName")],
                             GTS.nm[, c("TaxonName", "taxon_name", "Fuzzy.dist")],
                             by="TaxonName")

GTS.comb[GTS.comb$Fuzzy.dist > 10, ]
##                       TaxonName              scientificName
## 3               Acer sosnowskyi             Acer sosnowskyi
## 29     Beilschmiedia tisseranti    Beilschmiedia tisseranti
## 33            Boscia mazzocchii           Boscia mazzocchii
## 34    Bougainvillea fasciculata   Bougainvillea fasciculata
## 43       Camellia hengchunensis         Camellia brevistyla
## 53  Castanopsis damingshanensis Castanopsis damingshanensis
## 54     Castanopsis globigemmata    Castanopsis globigemmata
## 55       Castanopsis nigrescens      Castanopsis nigrescens
## 56     Castanopsis subuliformis    Castanopsis subuliformis
## 83       Cotoneaster ellipticus      Cotoneaster ellipticus
## 126    Englerodaphne ovalifolia    Englerodaphne ovalifolia
## 139        Freziera campanulata        Freziera campanulata
## 140        Freziera glabrescens        Freziera glabrescens
## 145      Garcinia erythrosepala      Garcinia erythrosepala
## 156       Gordonia singaporeana       Gordonia singaporeana
## 161         Grewia delphinensis         Grewia delphinensis
## 183          Ilex qingyuanensis          Ilex qingyuanensis
## 212         Madhuca chia-ananii         Madhuca chia-ananii
## 222      Mangifera salomonensis      Mangifera salomonensis
## 253       Myrsine perpauciflora       Myrsine perpauciflora
## 261    Nototrichium sandwicense    Nototrichium sandwicense
## 262         Ochrosia tahitensis         Ochrosia tahitensis
## 264           Ocotea javitensis           Ocotea javitensis
## 271          Ormosia boluoensis          Ormosia boluoensis
## 274           Paloue guianensis           Paloue guianensis
## 285 Pausinystalia brachythyrsum Pausinystalia brachythyrsum
## 304   Plagioscyphus calciphilus   Plagioscyphus calciphilus
## 305    Plagioscyphus danguyanus    Plagioscyphus danguyanus
## 307  Plagioscyphus meridionalis  Plagioscyphus meridionalis
## 308    Plagioscyphus unijugatus    Plagioscyphus unijugatus
## 310       Plathymenia foliolosa       Plathymenia foliolosa
## 314     Plerandra letocartiorum     Plerandra letocartiorum
## 315        Plerandra longistyla        Plerandra longistyla
## 317      Plerandra memaoyaensis      Plerandra memaoyaensis
## 320    Plerandra pouemboutensis    Plerandra pouemboutensis
## 322        Plerandra tronchetii        Plerandra tronchetii
## 331          Populus minhoensis          Populus minhoensis
## 368            Pyrus sachokiana            Pyrus sachokiana
## 378 Rhododendron laojunshanense Rhododendron laojunshanense
## 397        Scaphium scaphigerum        Scaphium scaphigerum
## 410     Scyphostegia borneensis     Scyphostegia borneensis
## 411     Seguieria paraguayensis     Seguieria paraguayensis
## 420          Sorbus obtusifolia               Sorbus graeca
## 425        Sterculia multiovula        Sterculia multiovula
## 429 Stryphnodendron racemiferum Stryphnodendron racemiferum
## 442        Tapiscia yunnanensis        Tapiscia yunnanensis
## 459         Vepris robertsoniae         Vepris robertsoniae
## 467      Weinmannia turckheimii      Weinmannia turckheimii
## 484          Annona penicillata                        <NA>
## 495       Cinnadenia liyuyingii                        <NA>
## 499 Cryptocarya sheikelmudiyana                        <NA>
## 500         Ctenodon molliculus                        <NA>
## 502       Ctenodon mucronulatus                        <NA>
## 508        Desmopsis colombiana                        <NA>
## 519         Freziera caesariata                        <NA>
## 525          Grewia mansouriana                        <NA>
## 529        Hopea quisumbingiana                        <NA>
## 548     Piliocalyx ignambiensis                        <NA>
## 554          Quercus centenaria                        <NA>
## 559          Ruagea parvifructa                        <NA>
## 569      Sterculia scortechinii                        <NA>
## 574      Trilepisium gymnandrum                        <NA>
## 575     Vachellia ormocarpoides                        <NA>
## 577       Volkameria emirnensis                        <NA>
##                 taxon_name Fuzzy.dist
## 3                     Acer         11
## 29           Beilschmiedia         11
## 33                  Boscia         11
## 34           Bougainvillea         12
## 43                Camellia         14
## 53             Castanopsis         16
## 54             Castanopsis         13
## 55             Castanopsis         11
## 56             Castanopsis         13
## 83             Cotoneaster         11
## 126          Englerodaphne         11
## 139               Freziera         12
## 140               Freziera         12
## 145               Garcinia         14
## 156               Gordonia         13
## 161                 Grewia         13
## 183                   Ilex         14
## 212                Madhuca         12
## 222              Mangifera         13
## 253                Myrsine         14
## 261           Nototrichium         12
## 262               Ochrosia         11
## 264                 Ocotea         11
## 271                Ormosia         11
## 274                 Paloue         11
## 285             Corynanthe         14
## 304          Plagioscyphus         12
## 305          Plagioscyphus         11
## 307          Plagioscyphus         13
## 308          Plagioscyphus         11
## 310 Plathymenia reticulata         20
## 314              Plerandra         14
## 315              Plerandra         11
## 317              Plerandra         13
## 320              Plerandra         15
## 322              Plerandra         11
## 331                Populus         11
## 368                  Pyrus         11
## 378           Rhododendron         15
## 397               Scaphium         12
## 410           Scyphostelma         13
## 411              Seguieria         14
## 420                 Sorbus         12
## 425              Sterculia         11
## 429        Stryphnodendron         12
## 442               Tapiscia         12
## 459                 Vepris         13
## 467             Weinmannia         12
## 484                 Annona         12
## 495             Cinnadenia         11
## 499            Cryptocarya         16
## 500               Ctenodon         11
## 502               Ctenodon         13
## 508              Desmopsis         11
## 519               Freziera         11
## 525                 Grewia         12
## 529                  Hopea         15
## 548               Syzygium         13
## 554                Quercus         11
## 559                 Ruagea         12
## 569              Sterculia         13
## 574            Trilepisium         11
## 575              Vachellia         14
## 577             Volkameria         11
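
The `<NA>` entries in the listing above are a consequence of the join type: `right_join` keeps every row of the second table and fills in missing values where the first table has no counterpart. A minimal sketch with three rows taken from the listing (the table names `direct.WFO` and `fuzzy.WCVP` are hypothetical):

```r
library(dplyr)

# Miniature versions of the two tables, with values copied from the listing
direct.WFO <- data.frame(
  TaxonName      = c("Acer sosnowskyi", "Boscia mazzocchii"),
  scientificName = c("Acer sosnowskyi", "Boscia mazzocchii"),
  stringsAsFactors = FALSE
)
fuzzy.WCVP <- data.frame(
  TaxonName  = c("Acer sosnowskyi", "Boscia mazzocchii", "Annona penicillata"),
  taxon_name = c("Acer", "Boscia", "Annona"),
  Fuzzy.dist = c(11, 11, 12),
  stringsAsFactors = FALSE
)

# right_join keeps every row of the second table; names without a direct
# WFO match get NA in scientificName
comb <- right_join(direct.WFO, fuzzy.WCVP, by = "TaxonName")
comb

# Names without a direct match in either backbone
comb$TaxonName[is.na(comb$scientificName)]
```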

Since the number of species matched directly by neither the WCVP nor WFO is relatively small (103), these can now be checked manually for potentially correct species names.

GTS.comb <- dplyr::left_join(GTS.nm2[, c("TaxonName", "scientificName")],
                             GTS.nm[, c("TaxonName", "taxon_name", "Fuzzy.dist")],
                             by="TaxonName")

GTS.comb[GTS.comb$Fuzzy.dist == 1, ]
##                       TaxonName             scientificName
## 4           Aldina aquae-nigrae        Aldina aquae-negrae
## 12      Athenaea wettesteiniana                   Athenaea
## 13           Bathysa mendoncaei          Bathysa mendoncae
## 14             Betula murrayana          Betula ×murrayana
## 18          Celtis orthacanthos        Celtis orthocanthos
## 20        Citharexylum mocinnoi       Citharexylum mocinoi
## 22           Crataegus jonesiae          Crataegus jonesae
## 28            Cynometra dwyerii                  Cynometra
## 31        Dalbergia entadioides       Dalbergia entadoides
## 34       Diospyros apeibacarpos     Diospyros apeibocarpos
## 35           Diospyros martinii          Diospyros martini
## 39      Eucalyptus gregoryensis    Eucalyptus gregoriensis
## 40            Eugenia marchiana          Eugenia marshiana
## 45         Freziera monzonensis       Freziera monsonensis
## 46             Garcinia martini          Garcinia murtonii
## 48      Gigasiphon humblotianus    Gigasiphon humblotianum
## 51       Guettarda prenleloupii      Guettarda preneloupii
## 54         Ixora richard-longii                      Ixora
## 55             Lebrunia bushaie           Lebrunia busbaie
## 57      Machaerium robsonnianum                 Machaerium
## 59   Macrolobium longiracemosum Macrolobium longeracemosum
## 62           Memecylon plebeium         Memecylon plebejum
## 63         Mezoneuron kavaiense     Caesalpinia kauaiensis
## 64              Miconia doniana            Miconia doriana
## 67        Myrciaria vismiifolia      Myrciaria vismeifolia
## 70         Pavieasia annamensis        Pavieasia anamensis
## 71       Peltophorum dasyrachis   Peltophorum dasyrrhachis
## 73         Plumeria trouinensis      Plumeria ×trouinenais
## 84          Schefflera le-ratii         Schefflera le-rati
## 85    Schefflera longepetiolata  Schefflera longipetiolata
## 87       Sonneratia hainanensis    Sonneratia ×hainanensis
## 88            Sorbus arvonensis          Sorbus arranensis
## 89      Sorbus pseudosemiincisa  Sorbus ×pseudosemi-incisa
## 90            Sorbus semiincisa         Sorbus semi-incisa
## 91            Sorbus yondeensis                     Sorbus
## 92      Stenostomum albobruneum   Stenostomum albobrunneum
## 96  Tabernaemontana mocquerysii Tabernaemontana mocquerysi
## 100              Vitex carvalhi       Vitex mossambicensis
## 103       Waltheria cinerascens      Waltheria cinerescens
##                           taxon_name Fuzzy.dist
## 4                 Aldina macrophylla          1
## 12            Athenaea wettsteiniana          1
## 13                 Bathysa mendoncae          1
## 14                  Betula ×purpusii          1
## 18               Celtis orthocanthos          1
## 20              Citharexylum mocinoi          1
## 22                 Crataegus jonesae          1
## 28                  Cynometra dwyeri          1
## 31              Dalbergia entadoides          1
## 34            Diospyros apeibocarpos          1
## 35                 Diospyros martini          1
## 39           Eucalyptus gregoriensis          1
## 40                 Eugenia marshiana          1
## 45              Freziera monsonensis          1
## 46                 Garcinia martinii          1
## 48           Gigasiphon humblotianum          1
## 51             Guettarda preneloupii          1
## 54             Ixora richardi-longii          1
## 55                  Lebrunia busbaie          1
## 57            Machaerium robsonianum          1
## 59        Macrolobium longeracemosum          1
## 62                Memecylon plebejum          1
## 63              Mezoneuron kauaiense          1
## 64                   Miconia doriana          1
## 67             Myrciaria vismeifolia          1
## 70               Pavieasia anamensis          1
## 71           Peltophorum dasyrhachis          1
## 73             Plumeria ×stenopetala          1
## 84                Schefflera leratii          1
## 85      Didymopanax longe-petiolatus          1
## 87           Sonneratia ×hainanensis          1
## 88                    Aria avonensis          1
## 89  Karpatiosorbus pseudosemi-incisa          1
## 90        Karpatiosorbus semi-incisa          1
## 91          Griffitharia yongdeensis          1
## 92          Stenostomum albobrunneum          1
## 96        Tabernaemontana mocquerysi          1
## 100             Vitex mossambicensis          1
## 103            Waltheria cinerescens          1
GTS.comb[GTS.comb$Fuzzy.dist == 2, ]
##                     TaxonName          scientificName                taxon_name
## 1        Acropogon calcicolus     Acropogon calcicola       Acropogon calcicola
## 2     Acropogon sageniifolius  Acropogon sageniifolia    Acropogon sageniifolia
## 3     Acropogon schumannianus  Acropogon schumanniana    Acropogon schumanniana
## 9        Archidendron oblonga Archidendropsis oblonga   Archidendropsis oblonga
## 11   Aspidostemon triantherus Aspidostemon trianthera   Aspidostemon trianthera
## 16       Canarium subsidiarum    Canarium subsidarium      Canarium subsidarium
## 17       Catostemma digitatum     Catostemma digitata       Catostemma digitata
## 27     Cupania castaneaefolia        Cupania racemosa     Cupania castaneifolia
## 29        Dalbergia acarantha   Dalbergia acariiantha     Dalbergia acariiantha
## 30       Dalbergia annamensis    Dalbergia andapensis      Dalbergia andapensis
## 41          Eugenia poroensis       Eugenia pardensis         Eugenia pardensis
## 47   Garnieria spathulaefolia Garnieria spathulifolia   Garnieria spathulifolia
## 52        Guettarda wayaensis    Guettarda wagapensis      Guettarda wagapensis
## 61      Memecylon arnhemensis    Memecylon arnhemense      Memecylon arnhemense
## 65      Monteverdia gonoclada      Maytenus buxifolia       Maytenus gonoclados
## 66   Monteverdia schummaniana      Maytenus buxifolia     Maytenus schumanniana
## 68        Omalanthus stokesii    Homalanthus stokesii      Homalanthus stokesii
## 69   Pandanus callmanderianus Pandanus callmanderiana   Pandanus callmanderiana
## 74   Priogymnanthus saxicolus          Priogymnanthus   Priogymnanthus saxicola
## 76           Psychotria volii        Psychotria bonii          Psychotria bonii
## 77             Psydrax glabra                 Psydrax        Canthiumera glabra
## 79             Quercus mexiae        Quercus gambelii          Quercus gambelii
## 80 Rhododendron suoilenhensis            Rhododendron Rhododendron suoilenhense
## 86      Semecarpus calcicolus    Semecarpus calcicola      Semecarpus calcicola
## 95         Syzygium trukensis                Syzygium         Syzygium trukense
##    Fuzzy.dist
## 1           2
## 2           2
## 3           2
## 9           2
## 11          2
## 16          2
## 17          2
## 27          2
## 29          2
## 30          2
## 41          2
## 47          2
## 52          2
## 61          2
## 65          2
## 66          2
## 68          2
## 69          2
## 74          2
## 76          2
## 77          2
## 79          2
## 80          2
## 86          2
## 95          2
GTS.comb[GTS.comb$Fuzzy.dist > 2, ]
##                       TaxonName         scientificName             taxon_name
## 5              Alseis sertaneja                 Alseis                 Alseis
## 6            Amphitecna fonceti             Amphitecna             Amphitecna
## 7              Annona imparilis                 Annona                 Annona
## 8            Annona penicillata                 Annona                 Annona
## 10            Artocarpus bergii             Artocarpus             Artocarpus
## 15             Bursera zapoteca                Bursera                Bursera
## 19        Cinnadenia liyuyingii             Cinnadenia             Cinnadenia
## 21          Conchocarpus rubrus           Conchocarpus           Conchocarpus
## 23  Cryptocarya sheikelmudiyana            Cryptocarya            Cryptocarya
## 24          Ctenodon molliculus               Stenodon               Ctenodon
## 25           Ctenodon monteiroi               Stenodon               Ctenodon
## 26        Ctenodon mucronulatus               Stenodon               Ctenodon
## 32         Desmopsis colombiana              Desmopsis              Desmopsis
## 33            Desmopsis confusa              Desmopsis              Desmopsis
## 36             Drypetes louisii               Drypetes               Drypetes
## 37      Englerodendron libassum         Englerodendron         Englerodendron
## 38        Eucalyptus alatissima  Eucalyptus plenissima  Eucalyptus plenissima
## 42         Flindersia bennettii Flindersia bennettiana Flindersia bennettiana
## 43          Freziera caesariata               Freziera               Freziera
## 44             Freziera ciliata               Freziera               Freziera
## 49           Grewia mansouriana                 Grewia                 Grewia
## 50               Grewia milleri                 Grewia                 Grewia
## 53         Hopea quisumbingiana                  Hopea                  Hopea
## 56            Lindera oxyphylla                Lindera                Lindera
## 58            Machilus coriacea               Machilus               Machilus
## 60             Mangifera lambii              Mangifera              Mangifera
## 72      Piliocalyx ignambiensis             Piliocalyx               Syzygium
## 75              Prunus klokovii                 Prunus                 Prunus
## 78           Quercus centenaria                Quercus                Quercus
## 81                Ruagea beckii                 Ruagea                 Ruagea
## 82               Ruagea obovata                 Ruagea                 Ruagea
## 83           Ruagea parvifructa                 Ruagea                 Ruagea
## 93       Sterculia scortechinii              Sterculia              Sterculia
## 94        Symplocos juiyenensis   Symplocos guianensis   Symplocos guianensis
## 97           Trichilia deminuta              Trichilia              Trichilia
## 98       Trilepisium gymnandrum            Trilepisium            Trilepisium
## 99      Vachellia ormocarpoides              Vachellia              Vachellia
## 101       Volkameria emirnensis             Volkameria             Volkameria
## 102           Volkameria grevei             Volkameria             Volkameria
##     Fuzzy.dist
## 5           10
## 6            8
## 7           10
## 8           12
## 10           7
## 15           9
## 19          11
## 21           7
## 23          16
## 24          11
## 25          10
## 26          13
## 32          11
## 33           8
## 36           8
## 37           9
## 38           3
## 42           3
## 43          11
## 44           8
## 49          12
## 50           8
## 53          15
## 56          10
## 58           9
## 60           7
## 72          13
## 75           9
## 78          11
## 81           7
## 82           8
## 83          12
## 93          13
## 94           3
## 97           9
## 98          11
## 99          14
## 101         11
## 102          7
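
The Fuzzy.dist values for the three species-level fuzzy matches above (rows 38, 42 and 94) are consistent with a plain Levenshtein edit distance between the submitted name and the matched name. The sketch below checks this with adist from base R. It assumes that this is how the column should be interpreted; the vectors submitted, matched and edit.dist are illustrative names that are not part of the original workflow.

```r
# Submitted names and the names they were fuzzy-matched to
# (rows 38, 42 and 94 of the output above)
submitted <- c("Eucalyptus alatissima", "Flindersia bennettii",
               "Symplocos juiyenensis")
matched <- c("Eucalyptus plenissima", "Flindersia bennettiana",
             "Symplocos guianensis")

# adist() returns a matrix of all pairwise edit distances;
# the diagonal pairs each submitted name with its own match
edit.dist <- diag(adist(submitted, matched))

data.frame(submitted, matched, edit.dist)
```

Each of the three pairs differs by 3 single-character edits, which agrees with the Fuzzy.dist column. The much larger distances in the other rows belong to records where only the genus was matched, so such matches should be inspected manually before they are accepted.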

7 Session Information

sessionInfo()
## R version 4.0.2 (2020-06-22)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19042)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United Kingdom.1252 
## [2] LC_CTYPE=English_United Kingdom.1252   
## [3] LC_MONETARY=English_United Kingdom.1252
## [4] LC_NUMERIC=C                           
## [5] LC_TIME=English_United Kingdom.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] dplyr_1.0.2       data.table_1.12.8 WorldFlora_1.9   
## 
## loaded via a namespace (and not attached):
##  [1] crayon_1.3.4      digest_0.6.25     R6_2.4.1          lifecycle_0.2.0  
##  [5] magrittr_1.5      evaluate_0.14     pillar_1.4.4      rlang_0.4.8      
##  [9] stringi_1.4.6     ellipsis_0.3.1    generics_0.1.0    vctrs_0.3.4      
## [13] rmarkdown_2.3     tools_4.0.2       stringr_1.4.0     glue_1.4.1       
## [17] purrr_0.3.4       xfun_0.15         yaml_2.2.1        compiler_4.0.2   
## [21] pkgconfig_2.0.3   htmltools_0.5.1.1 tidyselect_1.1.0  knitr_1.28       
## [25] tibble_3.0.1