than one author. For each book, include the title, authors, and two or three other attributes that you find interesting. Take the information that you’ve selected about these three books, and separately create three files which store the book’s information in HTML (using an HTML table), XML, and JSON formats (e.g. “books.html”, “books.xml”, and “books.json”).[…]
Write R code, using your packages of choice, to load the information from each of the three sources into separate R data frames. Are the three data frames identical?
library(XML)
library(plyr)
## Warning: package 'plyr' was built under R version 3.3.3
library(RJSONIO)
# Read the XML file
docxml <- readLines("https://raw.githubusercontent.com/ambra1982/W7ReadingXMLJsonHTML/master/books.xml")
docxml <- xmlParse(docxml, asText = TRUE)
# Flatten each <book> node into one data frame row
docxmldf <- ldply(xmlToList(docxml), data.frame)
docxmldf
## .id title pubdate author
## 1 book The Signature of All Things 2013 Elizabeth Gilbert
## 2 book Eucalyptus 1999 Murray Bail
## 3 book The Secret Life of Plants 1989 <NA>
## character author.main author.coauthor
## 1 Alma Whittaker <NA> <NA>
## 2 Ellen Holland <NA> <NA>
## 3 Plants Peter Tompkins Christopher Bird
# Read the HTML file
dochtml <- readLines("https://raw.githubusercontent.com/ambra1982/W7ReadingXMLJsonHTML/master/index.html")
# readHTMLTable() returns a list with one data frame per table on the page
dochtml1 <- readHTMLTable(dochtml)
dochtml1
## $`NULL`
## title pubdate author
## 1 The Signature of All Things 2013 Elizabeth Gilbert
## 2 Eucalyptus 1999 Murray Bail
## 3 The Secret Life of Plants 1989 Peter Tompkins & Christopher Bird
## character
## 1 Alma Whittaker
## 2 Ellen Holland
## 3 Plants
# Read the JSON file
library(jsonlite)
## Warning: package 'jsonlite' was built under R version 3.3.3
##
## Attaching package: 'jsonlite'
## The following objects are masked from 'package:RJSONIO':
##
## fromJSON, toJSON
docjson1 <- jsonlite::fromJSON("https://raw.githubusercontent.com/ambra1982/W7ReadingXMLJsonHTML/master/books.json", simplifyDataFrame = TRUE)
docjson1
## $botany_and_fiction_books
## $botany_and_fiction_books$book
## title pubdate author
## 1 The Signature of All Things 2013 Elizabeth Gilbert
## 2 Eucalyptus 1999 Murray Bail
## 3 The Secret Life of Plants 1989 Peter Tompkins, Christopher Bird
## character
## 1 Alma Whittaker
## 2 Ellen Holland
## 3 Plants
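The three data frames are not identical. The XML version differs the most: besides an extra .id column, it has two additional columns, author.main and author.coauthor, which hold the names for the co-authored book, while that book's author field is NA. The HTML and JSON versions share the same four columns (title, pubdate, author, character) but separate the co-authors differently ("&" in the HTML table, "," in the JSON file). Note also that readHTMLTable() returns a list of tables and fromJSON() returns a nested list, so in both cases the data frame has to be extracted from the result before it can be compared.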
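The comparison can also be checked programmatically. The sketch below is only an illustration. To run without network access, it uses small stand-in data frames (books_html, books_json, books_xml) transcribed by hand from the printed output above. These names, and the assumption that every column is character rather than factor, are not part of the original code. It checks whether the frames are identical, lists the columns that only the XML version has, and then shows one possible way to bring all three into the same shape.

```r
# Hypothetical stand-ins transcribed from the printed output above
# (assumption: all columns are character vectors, not factors).
books_html <- data.frame(
  title     = c("The Signature of All Things", "Eucalyptus",
                "The Secret Life of Plants"),
  pubdate   = c("2013", "1999", "1989"),
  author    = c("Elizabeth Gilbert", "Murray Bail",
                "Peter Tompkins & Christopher Bird"),
  character = c("Alma Whittaker", "Ellen Holland", "Plants"),
  stringsAsFactors = FALSE
)

# JSON version: same columns, co-authors separated by a comma.
books_json <- books_html
books_json$author[3] <- "Peter Tompkins, Christopher Bird"

# XML version: extra .id column, co-authors split into two columns.
books_xml <- data.frame(.id = rep("book", 3), books_html,
                        stringsAsFactors = FALSE)
books_xml$author[3]       <- NA
books_xml$author.main     <- c(NA, NA, "Peter Tompkins")
books_xml$author.coauthor <- c(NA, NA, "Christopher Bird")

# 1. Are the frames identical as loaded?
same_before <- identical(books_html, books_json)

# 2. Which columns exist only in the XML version?
extra_cols <- setdiff(names(books_xml), names(books_json))

# 3. One way to harmonise: use ", " as the co-author separator everywhere
#    and fold the XML-only columns back into a single author column.
html_norm <- books_html
html_norm$author <- sub(" & ", ", ", html_norm$author, fixed = TRUE)

xml_norm <- books_xml
multi <- is.na(xml_norm$author)
xml_norm$author[multi] <- paste(xml_norm$author.main[multi],
                                xml_norm$author.coauthor[multi], sep = ", ")
xml_norm <- xml_norm[names(books_json)]  # keep only the shared columns

same_after <- isTRUE(all.equal(html_norm, books_json)) &&
  isTRUE(all.equal(xml_norm, books_json))

print(same_before)
print(extra_cols)
print(same_after)
```

With the real objects, the data frames would first need to be pulled out of their containers, for example dochtml1[[1]] and docjson1$botany_and_fiction_books$book, and readHTMLTable() may return factor columns depending on the R version, so a conversion to character could be needed before comparing.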