Pick three of your favorite books on one of your favorite subjects. At least one of the books should have more than one author. For each book, include the title, authors, and two or three other attributes that you find interesting.
Create a file representing the selected books in each of three formats: HTML, XML, and JSON.
Load each file into a separate R data frame. Are the three data frames identical?
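One way to produce the JSON representation is to build the table in R and serialize it. The sketch below assumes the `jsonlite` package; the three-column `books` data frame and the temporary file path are illustrative stand-ins, not the actual file used in this assignment.

```r
library(jsonlite)

# Illustrative subset of the book data (hypothetical; not the full set of fields)
books <- data.frame(
  title = c("Treasure", "Command Authority"),
  year  = c(1988L, 2013L),
  pages = c(539L, 736L),
  stringsAsFactors = FALSE
)

# Wrap the rows in a top-level "books" key and write them to a temporary file
json_path <- tempfile(fileext = ".json")
write_json(list(books = books), json_path, pretty = TRUE)

# Reading the file back gives a list whose `books` element is a data frame
round_trip <- fromJSON(json_path)$books
str(round_trip)
```

Wrapping the rows in a named key is a design choice: it keeps the file self-describing, at the cost of one extra `$books` step when reading it back.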
library(tidyverse)
library(XML)
library(jsonlite)
library(xml2)
library(rvest)
html_file <- 'http://raw.githubusercontent.com/dab31415/DATA607/main/Homework/Assignment_5/books.html'
html_df <- read_html(html_file) %>%
  html_elements('table') %>%
  html_table()
html_df
## [[1]]
## # A tibble: 3 x 6
##   Title                    Author                     Year Publisher Pages ISBN
##   <chr>                    <chr>                     <int> <chr>     <int> <chr>
## 1 Treasure                 Clive Cussler              1988 Simon an~   539 0-67~
## 2 The Hunt for Red October Tom Clancey                1984 Naval In~   387 0-87~
## 3 Command Authority        Tom Clancey, Mark Greaney  2013 G.P. Put~   736 978-~
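The `[[1]]` in the output shows that `html_table()` applied to the result of `html_elements()` returns a list of tibbles, one per matched table, rather than a single data frame. A minimal sketch of two ways to get the table itself, using an inline page built with `minimal_html()` as a stand-in for books.html (the two-column table is an assumption made for brevity):

```r
library(rvest)

# Inline stand-in for the real page (hypothetical, trimmed to two columns)
page <- minimal_html('
  <table>
    <tr><th>Title</th><th>Year</th></tr>
    <tr><td>Treasure</td><td>1988</td></tr>
    <tr><td>Command Authority</td><td>2013</td></tr>
  </table>')

# html_elements() (plural) matches every table, so html_table() returns a list
tables <- page %>% html_elements('table') %>% html_table()
books <- tables[[1]]          # pull the single tibble out of the list

# html_element() (singular) matches only the first table and yields a tibble
books_direct <- page %>% html_element('table') %>% html_table()
```

Either form gives a data frame that can be compared directly with the XML and JSON results.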
xml_file <- 'http://raw.githubusercontent.com/dab31415/DATA607/main/Homework/Assignment_5/books.xml'
xml_df <- read_xml(xml_file) %>%
  xmlParse() %>%
  xmlToDataFrame()
xml_df
##                      title                   authors year             publisher
## 1                 Treasure             Clive Cussler 1988    Simon and Schuster
## 2 The Hunt for Red October               Tom Clancey 1984 Naval Institute Press
## 3        Command Authority Tom Clancey, Mark Greaney 2013    G.P. Putnam's Sons
##   pages              ISBN
## 1   539       0-671-62132
## 2   387      0-870-212850
## 3   736 978-0-39-916077-9
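Unless `colClasses` is supplied, `xmlToDataFrame()` reads every element as text, so `year` and `pages` are not numeric here even though they print like numbers. A minimal sketch of converting them, using an inline XML string whose element names are assumptions (the structure of books.xml is not shown in this document):

```r
library(XML)

# Inline stand-in for books.xml (hypothetical element names, trimmed fields)
xml_text <- '<books>
  <book><title>Treasure</title><year>1988</year><pages>539</pages></book>
  <book><title>Command Authority</title><year>2013</year><pages>736</pages></book>
</books>'

doc <- xmlParse(xml_text, asText = TRUE)
raw_df <- xmlToDataFrame(doc, stringsAsFactors = FALSE)
sapply(raw_df, class)      # every column is character at this point

# Let R infer column types; as.is = TRUE keeps text columns as character
typed_df <- type.convert(raw_df, as.is = TRUE)
sapply(typed_df, class)
```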
json_file <- 'https://raw.githubusercontent.com/dab31415/DATA607/main/Homework/Assignment_5/books.json'
json_df <- fromJSON(json_file)
json_df
## $books
##                      title                   authors year             publisher
## 1                 Treasure             Clive Cussler 1988    Simon and Schuster
## 2 The Hunt for Red October               Tom Clancey 1984 Naval Institute Press
## 3        Command Authority Tom Clancey, Mark Greaney 2013    G.P. Putnam's Sons
##   pages              ISBN
## 1   539       0-671-62132
## 2   387      0-870-212850
## 3   736 978-0-39-916077-9
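The outputs above suggest that the three results are not identical as loaded: the HTML result is a list holding a tibble with capitalized column names, the XML result is a plain data frame, and the JSON result is a list with a `books` element. A minimal sketch of checking this and then harmonizing, using hand-built stand-ins rather than the real objects (the column types of the XML and JSON stand-ins are assumptions):

```r
# Hand-built stand-ins that mirror the structures printed above (two columns only)
html_like <- data.frame(Title = c("Treasure", "Command Authority"),
                        Year  = c(1988L, 2013L),
                        stringsAsFactors = FALSE)
xml_like  <- data.frame(title = c("Treasure", "Command Authority"),
                        year  = c("1988", "2013"),      # assumed: read as text
                        stringsAsFactors = FALSE)
json_like <- list(books = data.frame(title = c("Treasure", "Command Authority"),
                                     year  = c(1988L, 2013L),  # assumed integer
                                     stringsAsFactors = FALSE))

# As loaded, names and column types differ, so the frames do not match
as_loaded <- identical(html_like, xml_like)

# Harmonize: lower-case the column names and compare every column as text
harmonise <- function(df) {
  names(df) <- tolower(names(df))
  df[] <- lapply(df, as.character)
  df
}

html_vs_xml <- identical(harmonise(html_like), harmonise(xml_like))
xml_vs_json <- identical(harmonise(xml_like), harmonise(json_like$books))
```

With the real objects, the HTML column `Author` would also need renaming to `authors`, since lower-casing alone does not reconcile the two names.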