TASK

Creating Files

I am an avid Romance and Mystery reader, which is why I chose two Romance novels and a Mystery. For each book I included the title, author, length, year published, and Amazon rating. Since one of the books had to have two authors, I chose the latest James Patterson book, part of a series that he co-authors with Michael Ledwidge. I relied on http://www.w3schools.com/xml/ and a few other sites for help creating the files. I found the HTML file the most straightforward to complete and the JSON file the most time consuming.

Loading HTML file

library(XML)
library(RCurl)
## Loading required package: bitops
books.html <- "https://raw.githubusercontent.com/komotunde/DATA607/master/Homework%204/books.html"
books.html <- getURL(books.html)
books.html <- readHTMLTable(books.html, header = TRUE)
books.html <- as.data.frame(books.html)
View(books.html)
I noticed immediately that my column names were preceded by X3’s. Renaming the columns takes care of this.
colnames(books.html) <- c("TITLE", "AUTHOR", "YEAR", "PAGES", "RATING")
View(books.html)
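One plausible source of prefixed column names, offered as an assumption rather than a diagnosis of this particular file: readHTMLTable returns a list with one data frame per table when no single table is selected, and as.data.frame on a list of data frames prepends each list element's name to its columns. The sketch below reproduces that effect with a stand-in list; the name `tbl` and the book values are placeholders, not the contents of books.html.

```r
# Stand-in for a readHTMLTable() result: a list holding one data frame per table.
# The list name "tbl" and the values are hypothetical, for illustration only.
tables <- list(
  tbl = data.frame(Title = c("Book A", "Book B"),
                   Pages = c(300, 250),
                   stringsAsFactors = FALSE)
)

# Converting the whole list prefixes every column with the list element's name.
flattened <- as.data.frame(tables)
print(names(flattened))

# Extracting the first element keeps the original column names.
first_table <- tables[[1]]
print(names(first_table))
```

With the real page, the equivalent step would be to take the first element of the list returned by readHTMLTable before calling as.data.frame, which may make the renaming unnecessary.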

Now for our XML file.

books.xml <- getURL("https://raw.githubusercontent.com/komotunde/DATA607/master/Homework%204/books.xml", ssl.verifyPeer = FALSE)
books.xml <- xmlTreeParse(books.xml, useInternalNodes = TRUE)
books.xml <- xmlToDataFrame(books.xml)
View(books.xml) 

I had an issue loading the XML file over https, but I found the workaround used above at: http://stackoverflow.com/questions/23584514/error-xml-content-does-not-seem-to-be-xml-r-3-1-0
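The workaround separates downloading from parsing: the XML arrives as a character string, and the parser then reads that string instead of the URL. A minimal sketch of the parsing half follows, with an inline string standing in for the downloaded text; the element names and values are hypothetical and are not taken from books.xml.

```r
library(XML)

# Hypothetical XML text standing in for what getURL() would return.
xml_text <- paste0(
  "<books>",
  "<book><title>Book A</title><pages>300</pages></book>",
  "<book><title>Book B</title><pages>250</pages></book>",
  "</books>"
)

# Parse the string itself; asText = TRUE tells the parser this is content, not a path.
doc <- xmlParse(xml_text, asText = TRUE)

# Each <book> becomes a row and each child element becomes a column.
df <- xmlToDataFrame(doc, stringsAsFactors = FALSE)
print(df)
```

Because each child element name becomes a column name directly, a flat layout like this one needs no renaming after conversion.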
Unlike the HTML table, the books.xml data frame did not pick up the X3 prefixes, so it needs no further editing. Finally, we will finish with our JSON file.
library(rjson)
books.json <- "https://raw.githubusercontent.com/komotunde/DATA607/master/Homework%204/books.json"
books.json <- fromJSON(paste(readLines(books.json), collapse=""))
books.json <- as.data.frame(books.json)
View(books.json)
This data frame was the least appealing of the three and would need the most editing to be readable.
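One way to tidy a result like this is to build one row per book instead of converting the whole parsed list at once. The sketch below assumes the parsed JSON is a list of book records; that structure, the field names, and the values are all hypothetical, since the layout of books.json is not shown here. It uses base R only.

```r
# Hypothetical parsed-JSON structure: one list per book.
books <- list(
  list(title = "Book A", authors = c("Author One", "Author Two"), pages = 300),
  list(title = "Book B", authors = "Author Three", pages = 250)
)

# Build a one-row data frame per book, collapsing multiple authors into one cell.
# unlist() covers the case where a parser returns the authors as a list.
rows <- lapply(books, function(b) {
  data.frame(title = b$title,
             authors = paste(unlist(b$authors), collapse = "; "),
             pages = b$pages,
             stringsAsFactors = FALSE)
})

# Stack the rows into a single data frame.
books_df <- do.call(rbind, rows)
print(books_df)
```

Collapsing the co-authors into a single string keeps one row per book, which matches the shape of the HTML and XML data frames.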