This is a web scraping example that retrieves data from the official website of the Environmental Conservation Online System (ECOS). The R script pulls data from the table of listed threatened and endangered plant species in the United States.
library(XML)
url <- "http://ecos.fws.gov/tess_public/reports/ad-hoc-species-report?kingdom=P&status=E&status=T&status=EmE&status=EmT&status=EXPE&status=EXPN&status=SAE&status=SAT&mapstatus=3&fcrithab=on&fstatus=on&fspecrule=on&finvpop=on&fgroup=on&ffamily=on&header=Listed+Plants"
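# Download and parse the report page into an internal XML document
html <- htmlTreeParse(url, useInternalNodes = TRUE)
# Extract every table cell and every column header as character vectors
raw <- xpathSApply(html, "//td", xmlValue)
names <- xpathSApply(html, "//th", xmlValue)
lngt <- length(raw)
# The cells arrive row by row and the table has 8 columns,
# so every 8th cell (starting at offset k) belongs to column k
table <- data.frame(raw[seq(1, lngt, 8)],
                    raw[seq(2, lngt, 8)],
                    raw[seq(3, lngt, 8)],
                    raw[seq(4, lngt, 8)],
                    raw[seq(5, lngt, 8)],
                    raw[seq(6, lngt, 8)],
                    raw[seq(7, lngt, 8)],
                    raw[seq(8, lngt, 8)]
)
colnames(table) <- names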
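The eight `seq()` calls hard-code the column count. A possible alternative, shown here only as a sketch on a small stand-in vector rather than the live page, is to reshape the flat vector of cells with `matrix(..., byrow = TRUE)` and take the column count from the header vector. The names `cells`, `col_names`, and `species` are illustrative and do not appear in the original script.

```r
# Toy stand-in for the scraped vectors: cells arrive row by row,
# in the same order xpathSApply(html, "//td", xmlValue) would return them.
col_names <- c("Scientific Name", "Family", "Federal Listing Status")
cells <- c("Abies guatemalensis", "Pinaceae",      "Threatened",
           "Abronia macrocarpa",  "Nyctaginaceae", "Endangered",
           "Acaena exigua",       "Rosaceae",      "Endangered")

# Fail early if the cells do not fill whole rows (e.g. the page layout changed)
stopifnot(length(cells) %% length(col_names) == 0)

# byrow = TRUE fills the matrix one table row at a time
species <- as.data.frame(
  matrix(cells, ncol = length(col_names), byrow = TRUE),
  stringsAsFactors = FALSE
)
colnames(species) <- col_names
```

With this approach the column count follows the number of `<th>` headers, so the reshaping step would not need editing if the report gained or lost a column.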
Only the first few rows of the columns of interest are shown here.
##           Scientific Name        Family Federal Listing Status
## 1     Abies guatemalensis      Pinaceae             Threatened
## 2      Abronia macrocarpa Nyctaginaceae             Endangered
## 3 Abutilon eremitopetalum     Malvaceae             Endangered
## 4      Abutilon menziesii     Malvaceae             Endangered
## 5    Abutilon sandwicense     Malvaceae             Endangered
## 6           Acaena exigua      Rosaceae             Endangered
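The code that produced this preview is not shown in the report. One way it could be done, as a minimal sketch, is to subset the data frame by column name and pass the result to `head()`. The data frame below is a stand-in built from the rows printed above; the `Other` column and its values are placeholders for the remaining scraped columns, and `species`, `cols`, and `preview` are illustrative names.

```r
# Stand-in for the scraped table; check.names = FALSE keeps the spaces
# in the column names, as in the headers taken from the page.
species <- data.frame(
  "Scientific Name" = c("Abies guatemalensis", "Abronia macrocarpa",
                        "Abutilon eremitopetalum", "Abutilon menziesii",
                        "Abutilon sandwicense", "Acaena exigua"),
  "Family" = c("Pinaceae", "Nyctaginaceae", "Malvaceae",
               "Malvaceae", "Malvaceae", "Rosaceae"),
  "Federal Listing Status" = c("Threatened", rep("Endangered", 5)),
  "Other" = rep("placeholder", 6),  # hypothetical extra column
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Keep only the columns of interest, then show the first rows
cols <- c("Scientific Name", "Family", "Federal Listing Status")
preview <- head(species[, cols], 6)
print(preview)
```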
The bar plot shows the number of threatened and endangered plant species.
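The plotting code is not included in the report. A chart of this kind could be drawn, for example, by counting the listing statuses with `table()` and passing the counts to base R's `barplot()`. The sketch below uses only the six statuses printed earlier, not the full scraped column, and the title, axis label, and colour are arbitrary choices.

```r
# Listing statuses of the six species shown in the preview
status <- c("Threatened", "Endangered", "Endangered",
            "Endangered", "Endangered", "Endangered")

# Count how many species fall under each status
status_counts <- table(status)

# One bar per status; bar height is the number of species
barplot(status_counts,
        main = "Listed plant species by status",
        ylab = "Number of species",
        col  = "darkgreen")
```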
This bar plot shows the most represented families among threatened and endangered species combined.
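As with the previous chart, the original plotting code is not shown. A ranking of families could be obtained, as a rough sketch, by counting the family column, sorting the counts in decreasing order, and keeping the first few entries. The cutoff `top_n` is a hypothetical parameter, and the data are again just the six families from the preview.

```r
# Families of the six species shown in the preview
family <- c("Pinaceae", "Nyctaginaceae", "Malvaceae",
            "Malvaceae", "Malvaceae", "Rosaceae")

# Count species per family and put the largest families first
family_counts <- sort(table(family), decreasing = TRUE)

# Keep at most top_n families (hypothetical cutoff for this sketch)
top_n <- 3
top_families <- family_counts[seq_len(min(top_n, length(family_counts)))]

# las = 2 turns the family names perpendicular to the axis so they fit
barplot(top_families,
        las  = 2,
        main = "Most represented families",
        ylab = "Number of species",
        col  = "darkgreen")
```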