Introduction
For this assignment I chose three favorite books on a single topic, city infrastructure, and stored the same table of book attributes in three equivalent files: HTML, XML, and JSON. Each file is then imported into R and displayed as a table.
Load Packages
suppressWarnings(library(knitr))
suppressWarnings(library(plyr))
suppressMessages(suppressWarnings(library(RCurl)))
suppressWarnings(library(htmltab))
suppressWarnings(library(XML))
suppressWarnings(library(xtable))
suppressWarnings(library(jsonlite))
Import HTML
urlhtml<-"https://raw.githubusercontent.com/spsstudent15/2016-01-607-08/master/books.html"
filehtml<-getURL(url=urlhtml)
thtml<-htmltab(doc=filehtml)
## Argument 'which' was left unspecified. Choosing first table.
## Neither <thead> nor <th> information found. Taking first table row for the header. If incorrect, specifiy header argument.
| 2 | The Power Broker | Robert Caro | 1974 | Adult | History |
| 3 | The Works: Anatomy of a City | Kate Ascher | 2007 | Young Adult | Engineering |
| 4 | Building Construction: Principles, Materials, and System | Madan Mehta, Walter R. Scarborough, Diane Armpriest | 2012 | Adult | Engineering |
Import XML
urlxml<-"https://raw.githubusercontent.com/spsstudent15/2016-01-607-08/master/books.xml"
filexml<-getURL(urlxml)
dfxml<-ldply(xmlToList(filexml),data.frame)
kable(dfxml)
| book | The Power Broker | Robert Caro | 1974 | Adult | History | 1 |
| book | The Works: Anatomy of a City | Kate Ascher | 2007 | Young Adult | Engineering | 2 |
| book | Building Construction: Principles, Materials, and Systems | Madan Mehta, Walter R. Scarborough, Diane Armpriest | 2012 | Adult | Engineering | 3 |
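The extra first and last columns above come from flattening the parsed list with ldply. As a possible alternative, the XML package's xmlToDataFrame builds one row per record and one column per child element, and does not include attributes. The sketch below is only an illustration: the inline XML, its element names (title, author, year), and the id attribute are assumptions, not the contents of books.xml.

```r
library(XML)

# Inline stand-in for books.xml; element and attribute names are assumptions.
xml_text <- '<books>
  <book id="1"><title>The Power Broker</title><author>Robert Caro</author><year>1974</year></book>
  <book id="2"><title>The Works: Anatomy of a City</title><author>Kate Ascher</author><year>2007</year></book>
</books>'

# Parse the string into an XML document.
doc <- xmlParse(xml_text, asText = TRUE)

# One row per <book>, one column per child element; attributes are not included.
dfxml2 <- xmlToDataFrame(doc, stringsAsFactors = FALSE)
```

With this approach the values are read as character strings, so a column such as year would still need converting if numbers are wanted.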
Import JSON
urljson<-"https://raw.githubusercontent.com/spsstudent15/2016-01-607-08/master/books3.json"
filejson<-getURL(url=urljson)
tjson<-fromJSON(filejson)
kable(tjson)
| The Power Broker | Robert Caro | 1974 | Adult | History |
| The Works: Anatomy of a City | Kate Ascher | 2007 | Young Adult | Engineering |
| Building Construction: Principles, Materials, and Systems | Madan Mehta, Walter R. Scarborough, Diane Armpriest | 2012 | Adult | Engineering |
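The JSON import needs no reshaping because fromJSON simplifies an array of objects with the same keys directly into a data frame. A minimal sketch of that behavior follows, using an inline string instead of the downloaded file. The key names (title, author, year) are assumptions for illustration; the actual structure of books3.json is not shown here.

```r
library(jsonlite)

# Inline stand-in for the JSON file; key names are assumptions.
json_text <- '[
  {"title": "The Power Broker", "author": "Robert Caro", "year": 1974},
  {"title": "The Works: Anatomy of a City", "author": "Kate Ascher", "year": 2007}
]'

# An array of objects with matching keys is simplified to a data frame,
# one row per object and one column per key.
tjson2 <- fromJSON(json_text)
```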
Conclusion
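The three imported tables were slightly different. The HTML table gained an index column whose row numbers start at 2, because htmltab took the first table row as the header. The XML table gained a leading column holding the element name (book) and a trailing .attrs column holding the values 1, 2, and 3, which count the rows. The JSON import seemed to make no changes to the intended table.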
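One way to check that the three imports hold the same data is to strip the differences noted above and compare the results. The sketch below uses small hypothetical stand-ins rather than the real imported objects: the column names (title, author, year), the leading column name .id, and the normalize helper are all assumptions made for illustration.

```r
# Hypothetical stand-in for the intended table; column names are assumptions.
books <- data.frame(
  title  = c("The Power Broker", "The Works: Anatomy of a City"),
  author = c("Robert Caro", "Kate Ascher"),
  year   = c("1974", "2007"),
  stringsAsFactors = FALSE
)

# HTML-style import: same data, but row names start at 2.
thtml <- books
rownames(thtml) <- 2:3

# XML-style import: extra leading and trailing columns around the data.
dfxml <- data.frame(.id = "book", books, .attrs = c("1", "2"),
                    stringsAsFactors = FALSE)

# JSON-style import: unchanged.
tjson <- books

# Hypothetical helper: drop the extra columns and reset row names.
normalize <- function(df) {
  df <- df[, setdiff(names(df), c(".id", ".attrs")), drop = FALSE]
  rownames(df) <- NULL
  df
}

same_html <- identical(normalize(thtml), normalize(tjson))
same_xml  <- identical(normalize(dfxml), normalize(tjson))
```

If both comparisons are TRUE, the differences between the imports are limited to the index and the extra columns. On the real data, column types and names would also need to match before identical returns TRUE.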