Introduction

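For this assignment I chose three favorite books on a single topic, city infrastructure, and stored the same table of book data in three formats: HTML, XML, and JSON. Each file is imported into R below so that the resulting tables can be compared.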

Load Packages

suppressWarnings(library(knitr))
suppressWarnings(library(plyr))
suppressMessages(suppressWarnings(library(RCurl)))
suppressWarnings(library(htmltab))
suppressWarnings(library(XML))
suppressWarnings(library(xtable))
suppressWarnings(library(jsonlite))

Import HTML

# Download the raw HTML file and parse its first table into a data frame
urlhtml <- "https://raw.githubusercontent.com/spsstudent15/2016-01-607-08/master/books.html"
filehtml <- getURL(url = urlhtml)
thtml <- htmltab(doc = filehtml)
## Argument 'which' was left unspecified. Choosing first table.
## Neither <thead> nor <th> information found. Taking first table row for the header. If incorrect, specifiy header argument.
kable(thtml)
|   |Title |Author |Year |Reading Level |Genre |
|:--|:-----|:------|:----|:-------------|:-----|
|2 |The Power Broker |Robert Caro |1974 |Adult |History |
|3 |The Works: Anatomy of a City |Kate Ascher |2007 |Young Adult |Engineering |
|4 |Building Construction: Principles, Materials, and System |Madan Mehta, Walter R. Scarborough, Diane Armpriest |2012 |Adult |Engineering |
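
The row names in the HTML table start at 2, which fits the message that htmltab took the first table row as the header. A minimal sketch of one way to reset them, using a small hypothetical stand-in (`thtml_demo`, with only two of the columns) rather than the downloaded data:

```r
# Hypothetical stand-in for thtml: row names start at "2", as in the output above
thtml_demo <- data.frame(
  Title = c("The Power Broker", "The Works: Anatomy of a City"),
  Year = c("1974", "2007"),
  row.names = c("2", "3"),
  stringsAsFactors = FALSE
)

# Assigning NULL restores the default 1..n row names
rownames(thtml_demo) <- NULL
print(thtml_demo)
```

If the goal is only to hide the index when printing, calling kable with row.names = FALSE should have a similar visual effect without modifying the data frame.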

Import XML

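# Download the XML file, convert it to a nested list, and bind each book element into one data frame row
urlxml <- "https://raw.githubusercontent.com/spsstudent15/2016-01-607-08/master/books.xml"
filexml <- getURL(urlxml)
dfxml <- ldply(xmlToList(filexml), data.frame)
kable(dfxml)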
|.id |title |author |year |readinglevel |genre |.attrs |
|:---|:-----|:------|:----|:------------|:-----|:------|
|book |The Power Broker |Robert Caro |1974 |Adult |History |1 |
|book |The Works: Anatomy of a City |Kate Ascher |2007 |Young Adult |Engineering |2 |
|book |Building Construction: Principles, Materials, and Systems |Madan Mehta, Walter R. Scarborough, Diane Armpriest |2012 |Adult |Engineering |3 |
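
The .id and .attrs columns are by-products of the conversion rather than part of the book data. A minimal sketch of how they could be dropped, using a hypothetical stand-in (`dfxml_demo`) with the same extra columns instead of the downloaded file:

```r
# Hypothetical stand-in for dfxml: .id and .attrs mimic the extra columns above
dfxml_demo <- data.frame(
  .id = c("book", "book"),
  title = c("The Power Broker", "The Works: Anatomy of a City"),
  year = c("1974", "2007"),
  .attrs = c("1", "2"),
  stringsAsFactors = FALSE
)

# Keep every column except the two helper columns
helper_cols <- c(".id", ".attrs")
dfxml_clean <- dfxml_demo[, !(names(dfxml_demo) %in% helper_cols), drop = FALSE]
print(dfxml_clean)
```

Selecting by name rather than by position keeps the step valid even if the column order changes.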

Import JSON

# Download the JSON file and parse it; the result is printed directly as a table
urljson <- "https://raw.githubusercontent.com/spsstudent15/2016-01-607-08/master/books3.json"
filejson <- getURL(url = urljson)
tjson <- fromJSON(filejson)
kable(tjson)
|title |author |year |readinglevel |genre |
|:-----|:------|:----|:------------|:-----|
|The Power Broker |Robert Caro |1974 |Adult |History |
|The Works: Anatomy of a City |Kate Ascher |2007 |Young Adult |Engineering |
|Building Construction: Principles, Materials, and Systems |Madan Mehta, Walter R. Scarborough, Diane Armpriest |2012 |Adult |Engineering |

Conclusion

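The three imported tables were slightly different. The HTML import added an index column of row names, numbered 2 through 4, apparently because htmltab used the first table row as the header; its column names are also capitalized ("Reading Level") where the other two use lowercase ("readinglevel"). The XML import added an .id column, which ldply fills with the name of each list element (book), and an .attrs column, which holds the attribute value of each book element (1 through 3) rather than a row count. The JSON import made no changes to the intended table.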
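
The comparison above was made by eye. A rough sketch of how it could be checked in code follows, assuming the only differences are column-name formatting, row names, and column types. The helper `normalize_table` and the two small demo data frames are hypothetical and stand in for the imported tables:

```r
# Hypothetical helper: lowercase the column names, strip non-letters,
# reset the row names, and coerce every column to character
normalize_table <- function(df) {
  names(df) <- tolower(gsub("[^A-Za-z]", "", names(df)))
  rownames(df) <- NULL
  df[] <- lapply(df, as.character)
  df
}

# Stand-in for the HTML import: capitalized names, row names starting at "2"
html_demo <- data.frame(
  Title = c("The Power Broker", "The Works: Anatomy of a City"),
  `Reading Level` = c("Adult", "Young Adult"),
  row.names = c("2", "3"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Stand-in for the JSON import: lowercase names, default row names
json_demo <- data.frame(
  title = c("The Power Broker", "The Works: Anatomy of a City"),
  readinglevel = c("Adult", "Young Adult"),
  stringsAsFactors = FALSE
)

# TRUE when the two tables agree after normalization
tables_match <- isTRUE(all.equal(normalize_table(html_demo), normalize_table(json_demo)))
print(tables_match)
```

The XML table would first need its .id and .attrs columns removed before the same check could be applied to it.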