Pick three of your favorite books on one of your favorite subjects. At least one of the books should have more than one author. For each book, include the title, authors, and two or three other attributes that you find interesting. Take the information that you’ve selected about these three books, and separately create three files which store the book’s information in HTML (using an html table), XML, and JSON formats (e.g. “books.html”, “books.xml”, and “books.json”). Create each of these files “by hand” unless you’re already very comfortable with the file formats. Your deliverable is the three source files and the R code.

Write R code, using your packages of choice, to load the information from each of the three sources into separate R data frames. Are the three data frames identical?

Load the required libraries.

library(httr)
library(XML)
library(jsonlite)
library(RCurl)
## Loading required package: bitops

Let's start by reading in the JSON file. I used https://www.w3schools.com/js/js_json_intro.asp to learn how to create a JSON file by hand.

json <- "https://raw.githubusercontent.com/vindication09/DATA-607-Week-7/master/books2.json"
json <- GET(json)
json <- rawToChar(json$content)
json <- fromJSON(json)
JSON <- data.frame(json)
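
As a side note on how the conversion works: fromJSON() from jsonlite simplifies a JSON array of objects into a data frame, one row per object and one column per key. The sketch below illustrates this with a short inline string. The sample is hypothetical and only assumes that books2.json is laid out as an array of records; the actual file may differ.

```r
library(jsonlite)

# Hypothetical inline sample (an assumption, not the contents of books2.json)
json_text <- '[
  {"title": "Real Analysis For Graduate Students", "author": "Richard F Bass"},
  {"title": "Modern Statistical Methods for Astronomy", "author": "Eric D. Feigelson, G. Jogesh Babu"}
]'

# An array of objects is simplified to a data frame by default
books_json <- fromJSON(json_text)

str(books_json)
```

Because the result is already a data frame, wrapping it in data.frame() again is harmless but not strictly needed in this case.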

Next, let's load the XML file. I used https://www.w3schools.com/xml/default.asp to learn how to create an XML file by hand.

xml <- "https://raw.githubusercontent.com/vindication09/DATA-607-Week-7/master/books.xml"
xml <- GET(xml)
xml <- rawToChar(xml$content)
xml <- xmlParse(xml)
XML <- xmlToDataFrame(xml)
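
To illustrate what xmlToDataFrame() expects, here is a minimal sketch with an inline document. It assumes a flat layout in which each child of the root is one record and each grandchild is one field; the element names books and book are hypothetical and may not match books.xml.

```r
library(XML)

# Hypothetical inline sample (an assumption, not the contents of books.xml)
xml_text <- paste0(
  "<books>",
  "<book><Title>Real Analysis For Graduate Students</Title>",
  "<Author>Richard F Bass</Author></book>",
  "<book><Title>Modern Statistical Methods for Astronomy</Title>",
  "<Author>Eric D. Feigelson, G. Jogesh Babu</Author></book>",
  "</books>"
)

# asText = TRUE tells the parser the input is XML content, not a file name
doc <- xmlParse(xml_text, asText = TRUE)

# Each <book> becomes a row; each child element becomes a column
books_xml <- xmlToDataFrame(doc, stringsAsFactors = FALSE)

str(books_xml)
```

The column names come straight from the element names, which is why the XML data frame keeps capitalized names such as Title and Author.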

Finally, let's load the HTML file. As with the two previous formats, I used https://www.w3schools.com/html/ to learn how to create an HTML file by hand.

html <- "https://raw.githubusercontent.com/vindication09/DATA-607-Week-7/master/books.html"
html <- GET(html)
html <- rawToChar(html$content)
html <- htmlParse(html)
html <- readHTMLTable(html)
HTML <- data.frame(html)
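
Note that readHTMLTable() returns a list of tables when applied to a whole document, and a table without an id attribute is labeled NULL in that list. Passing the list to data.frame() then prefixes every column name with that label. One possible alternative, sketched below on a hypothetical inline table rather than the real books.html, is to request a single table with the which argument.

```r
library(XML)

# Hypothetical inline sample (an assumption, not the contents of books.html)
html_text <- paste0(
  "<table>",
  "<tr><th>Title</th><th>Author</th></tr>",
  "<tr><td>Real Analysis For Graduate Students</td><td>Richard F Bass</td></tr>",
  "<tr><td>Modern Statistical Methods for Astronomy</td>",
  "<td>Eric D. Feigelson, G. Jogesh Babu</td></tr>",
  "</table>"
)

doc <- htmlParse(html_text, asText = TRUE)

# which = 1 returns the first table as a data frame instead of a list,
# so the column names are not prefixed with the list element's label
books_html <- readHTMLTable(doc, which = 1, stringsAsFactors = FALSE)

names(books_html)
```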

Let's view each of the data frames and see how similar or different they are.

JSON
##                                                                                                  title
## 1                                                                                               Title 
## 2                                                                 Real Analysis For Graduate Students 
## 3                                                            Modern Statistical Methods for Astronomy 
## 4 Numerical Differential Equations: Theory and Technique, ODE Methods, Finite Elements and Collocation
##                              author              subjects
## 1                           Author               Subject 
## 2                    Richard F Bass     Pure Mathematics 
## 3 Eric D. Feigelson, G. Jogesh Babu Statistics,Astronomy 
## 4                      John Loustau   Applied Mathematics
##                                                                       publisher
## 1                                                                    Publisher 
## 2 CreateSpace Independent Publishing Platform; Second edition (January 4, 2013)
## 3                       Cambridge University Press; 1 edition (August 27, 2012)
## 4                               World Scientific Publishing Co (March 14, 2016)
##         isbn
## 1    ISBN-10
## 2 1481869140
## 3 052176727X
## 4 9814719498
XML
##                                                                                                  Title
## 1                                                                 Real Analysis For Graduate Students 
## 2                                                            Modern Statistical Methods for Astronomy 
## 3 Numerical Differential Equations: Theory and Technique, ODE Methods, Finite Elements and Collocation
##                              Author               Subject
## 1                    Richard F Bass     Pure Mathematics 
## 2 Eric D. Feigelson, G. Jogesh Babu Statistics,Astronomy 
## 3                      John Loustau   Applied Mathematics
##                                                                       Publisher
## 1 CreateSpace Independent Publishing Platform; Second edition (January 4, 2013)
## 2                       Cambridge University Press; 1 edition (August 27, 2012)
## 3                               World Scientific Publishing Co (March 14, 2016)
##      ISBN-10
## 1 1481869140
## 2 052176727X
## 3 9814719498
HTML
##                                                                                             NULL.Title
## 1                                                                  Real Analysis For Graduate Students
## 2                                                             Modern Statistical Methods for Astronomy
## 3 Numerical Differential Equations: Theory and Technique, ODE Methods, Finite Elements and Collocation
##                         NULL.Author         NULL.Subject
## 1                    Richard F Bass     Pure Mathematics
## 2 Eric D. Feigelson, G. Jogesh Babu Statistics,Astronomy
## 3                      John Loustau  Applied Mathematics
##                                                                  NULL.Publisher
## 1 CreateSpace Independent Publishing Platform; Second edition (January 4, 2013)
## 2                       Cambridge University Press; 1 edition (August 27, 2012)
## 3                               World Scientific Publishing Co (March 14, 2016)
##   NULL.ISBN.10
## 1   1481869140
## 2   052176727X
## 3   9814719498

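Visual inspection shows that the three data frames hold the same book information, but they are not identical. The JSON data frame has four rows rather than three, apparently because a header-like record ("Title", "Author", and so on) was read as data, and its column names are lowercase. The HTML column names carry a NULL. prefix because readHTMLTable() returns a list of tables and data.frame() prepends the list element's label. The column holding the ISBN is also named differently in each case (isbn, ISBN-10, and NULL.ISBN.10), and some JSON and XML values keep trailing spaces that the HTML values do not. Even so, R gives us the functionality to turn three different file formats into workable data frames that need only light cleanup to match.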
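
A visual check can be backed up with identical(). The sketch below uses small stand-in data frames that mimic differences visible in the printed output (an extra header row in the JSON data, a NULL. prefix in the HTML names, trailing spaces in some values). The stand-ins and the helper normalize() are hypothetical; in the real script the same steps would be applied to JSON, XML, and HTML.

```r
# Stand-in data frames (hypothetical, reduced to two columns)
json_df <- data.frame(
  title  = c("Title ", "Real Analysis For Graduate Students ",
             "Modern Statistical Methods for Astronomy "),
  author = c("Author ", "Richard F Bass ",
             "Eric D. Feigelson, G. Jogesh Babu "),
  stringsAsFactors = FALSE
)
xml_df <- data.frame(
  Title  = c("Real Analysis For Graduate Students ",
             "Modern Statistical Methods for Astronomy "),
  Author = c("Richard F Bass ", "Eric D. Feigelson, G. Jogesh Babu "),
  stringsAsFactors = FALSE
)
html_df <- data.frame(
  NULL.Title  = c("Real Analysis For Graduate Students",
                  "Modern Statistical Methods for Astronomy"),
  NULL.Author = c("Richard F Bass", "Eric D. Feigelson, G. Jogesh Babu"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: bring a data frame into a common shape
normalize <- function(df, drop_first_row = FALSE) {
  # Optionally drop a header record that was read as data
  if (drop_first_row) df <- df[-1, , drop = FALSE]
  # Convert every column to character and strip surrounding whitespace
  df[] <- lapply(df, function(col) trimws(as.character(col)))
  # Remove the "NULL." prefix and lowercase the column names
  names(df) <- tolower(sub("^NULL\\.", "", names(df)))
  # Reset row names so they do not affect the comparison
  rownames(df) <- NULL
  df
}

# Raw comparison versus comparison after cleanup
raw_same   <- identical(json_df, xml_df)
clean_json <- normalize(json_df, drop_first_row = TRUE)
clean_xml  <- normalize(xml_df)
clean_html <- normalize(html_df)
all_same   <- identical(clean_json, clean_xml) &&
  identical(clean_xml, clean_html)

print(c(raw_same = raw_same, all_same = all_same))
```

Column names that differ by more than case or prefix, such as isbn, ISBN-10, and NULL.ISBN.10, would need an explicit rename before a comparison like this could succeed on the full data.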