Pick three of your favorite books on one of your favorite subjects. At least one of the books should have more
than one author. For each book, include the title, authors, and two or three other attributes that you find
interesting.
Take the information that you've selected about these three books, and separately create three files that
store the books' information in HTML (using an HTML table), XML, and JSON formats (e.g., "books.html",
"books.xml", and "books.json"). To help you better understand the different file structures, I'd prefer that you
create each of these files "by hand" unless you're already very comfortable with the file formats.
Write R code, using your packages of choice, to load the information from each of the three sources into
separate R data frames. Are the three data frames identical?
Read HTML
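suppressWarnings(library(XML))
# readHTMLTable() returns a list with one data frame per <table> element in the page
htmldoc <- readHTMLTable("books.html", skip.rows = 1, as.data.frame = TRUE)
# Keep the first table, which holds the book records
htmlDF <- htmldoc[[1]]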
htmlDF
##                                                           Book Name
## 1                          In Defense of Food: An Eater's Manifesto
## 2 Quiet: The Power of Introverts in a World That Can't Stop Talking
## 3                                           For the Love of Physics
##                        Author(s)      GENRES           ISBN  Price
## 1                 Michael Pollan   Self-Help 978-0143114963  $9.30
## 2                     Susan Cain Non-fiction 978-0307352156  $9.07
## 3 Walter Lewin, Warren Goldstein     science 978-1451607130 $13.53
Read XML
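# Parse the file into an XML tree (uses the XML package loaded above)
xmldoc <- xmlParse("books.xml")
# Each child node of the root element becomes one row of the data frame
xmlDF <- xmlToDataFrame(xmldoc)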
xmlDF
##                                                                name
## 1                          In Defense of Food: An Eater's Manifesto
## 2 Quiet: The Power of Introverts in a World That Can't Stop Talking
## 3                                           For the Love of Physics
##                           author      genres  price
## 1                 Michael Pollan Non-fiction  $9.30
## 2                     Susan Cain Non-fiction  $9.07
## 3 Walter Lewin, Warren Goldstein     science $13.53
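The contents of books.xml are not shown in this report, so the following is only a rough sketch of the kind of layout that xmlToDataFrame() turns into the data frame above. The XML text, the element names, and the object names (xml_text, sketch_doc, sketch_df) are assumptions made for illustration; only the book values are taken from the output.

```r
library(XML)

# Hypothetical stand-in for books.xml; the real file's layout is not shown here.
xml_text <- "<books>
  <book>
    <name>In Defense of Food: An Eater's Manifesto</name>
    <author>Michael Pollan</author>
    <genres>Non-fiction</genres>
    <price>$9.30</price>
  </book>
  <book>
    <name>For the Love of Physics</name>
    <author>Walter Lewin, Warren Goldstein</author>
    <genres>science</genres>
    <price>$13.53</price>
  </book>
</books>"

# asText = TRUE tells xmlParse() that the argument is XML content, not a file name
sketch_doc <- xmlParse(xml_text, asText = TRUE)

# Each <book> node becomes a row; its child elements become the columns
sketch_df <- xmlToDataFrame(sketch_doc, stringsAsFactors = FALSE)
names(sketch_df)
nrow(sketch_df)
```

In this layout both authors of a book sit in a single author element, which matches the comma-separated author string in the printed data frame.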
Read JSON
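# install.packages("jsonlite")  # run once if jsonlite is not installed
suppressWarnings(library(jsonlite))
# fromJSON() returns a list; its first element holds the book records
jsondoc <- fromJSON(txt = "books.json")
jsonDF <- data.frame(jsondoc[[1]])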
jsonDF
##                                                           book.name
## 1                          In Defense of Food: An Eater's Manifesto
## 2 Quiet: The Power of Introverts in a World That Can't Stop Talking
## 3                                           For the Love of Physics
##                      book.author book.genres book.price
## 1                 Michael Pollan Non-fiction      $9.30
## 2                     Susan Cain Non-fiction      $9.07
## 3 Walter Lewin, Warren Goldstein     science     $13.53
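The assignment asks whether the three data frames are identical. One way to check is sketched below. So that the example runs without the three files, the data frames are rebuilt by hand from the values printed above. The object names (html_df, xml_df, json_df and the *_cmp copies) are illustrative, and all columns are assumed to be character vectors; in a real session the parsers may return factors instead, depending on the R and package versions.

```r
# Values copied from the printed output above
titles <- c("In Defense of Food: An Eater's Manifesto",
            "Quiet: The Power of Introverts in a World That Can't Stop Talking",
            "For the Love of Physics")
authors <- c("Michael Pollan", "Susan Cain", "Walter Lewin, Warren Goldstein")
prices <- c("$9.30", "$9.07", "$13.53")

html_df <- data.frame(`Book Name` = titles, `Author(s)` = authors,
                      GENRES = c("Self-Help", "Non-fiction", "science"),
                      ISBN = c("978-0143114963", "978-0307352156", "978-1451607130"),
                      Price = prices,
                      check.names = FALSE, stringsAsFactors = FALSE)
xml_df <- data.frame(name = titles, author = authors,
                     genres = c("Non-fiction", "Non-fiction", "science"),
                     price = prices, stringsAsFactors = FALSE)
json_df <- data.frame(book.name = titles, book.author = authors,
                      book.genres = c("Non-fiction", "Non-fiction", "science"),
                      book.price = prices, stringsAsFactors = FALSE)

# Direct comparison: column names (and, for HTML, the column count) differ
raw_html_xml <- identical(html_df, xml_df)
raw_xml_json <- identical(xml_df, json_df)

# Align the shared columns under common names before comparing again
common <- c("name", "author", "genres", "price")
html_cmp <- setNames(html_df[c("Book Name", "Author(s)", "GENRES", "Price")], common)
json_cmp <- setNames(json_df, common)

same_xml_json <- identical(xml_df, json_cmp)
same_xml_html <- identical(xml_df, html_cmp)

# Locate any cells where the HTML and XML versions disagree (row and column index)
diff_cells <- which(html_cmp != xml_df, arr.ind = TRUE)
diff_cells
```

With the values shown in this report, the three data frames are not identical as loaded: the HTML table has an extra ISBN column and different column names, and the JSON columns carry a "book." prefix. Once the names are aligned, the XML and JSON frames agree, while the HTML frame still differs in the genre recorded for the first book ("Self-Help" versus "Non-fiction").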