R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

# Load the necessary libraries
library(rvest)
library(jsonlite)
library(XML)
library(xml2)

# HTML book file

book_data <- "https://raw.githubusercontent.com/MRobinson112/assignment7/main/books1.html"

# warn = FALSE suppresses the "incomplete final line" warning for this file
html_file <- readLines(book_data, warn = FALSE)
# Join the lines into one string, separated by newlines only
html_file <- paste(html_file, collapse = "\n")

html_data <- read_html(html_file)
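# html_table() returns a list with one element per <table>; keep the first.
# The result is stored as html_tables so it does not shadow the html_table() function.
html_tables <- html_data %>% html_table(fill = TRUE)
html_data_frame <- as.data.frame(html_tables[[1]])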

colnames(html_data_frame) <- c("Title", "Author", "ISBN", "Pages", "Publisher", "Attributes")

print(html_data_frame)
##                     Title                          Author              ISBN
## 1  The Catcher in the Rye                   J.D. Salinger 978-0-316-76948-7
## 2 All the President's Men by Carl Bernstein, Bob Woodward 978-0-671-21781-5
## 3           The Alchemist                    Paulo Coelho 978-0-06-250217-9
##   Pages                 Publisher
## 1   277 Little, Brown and Company
## 2   349          Simon & Schuster
## 3   197                 HarperOne
##                                                        Attributes
## 1                        Coming of age, alienation, and identity.
## 2 Investigative journalism, Watergate scandal, political history.
## 3                                Quest, personal legend, destiny.
# XML book file

xml_url <- "https://raw.githubusercontent.com/MRobinson112/assignment7/main/books1.xml"

# Read the file as text and collapse it into a single string
xml_file <- readLines(xml_url, warn = FALSE)
xml_file <- paste(xml_file, collapse = "\n")

if (nzchar(xml_file)) {
  # asText = TRUE tells xmlParse() the input is XML content, not a file path
  xml_data <- xmlParse(xml_file, asText = TRUE)
  xml_data_frame <- xmlToDataFrame(xml_data)
  print(xml_data_frame)
} else {
  cat("XML content is empty.\n")
}
##                     title                          author              isbn
## 1  The Catcher in the Rye                   J.D. Salinger 978-0-316-76948-7
## 2 All the President's Men Carl Bernstein and Bob Woodward 978-0-671-21781-5
## 3           The Alchemist                    Paulo Coelho 978-0-06-231500-7
##   pages                 publisher
## 1   277 Little, Brown and Company
## 2   349        Simon and Schuster
## 3   208                 HarperOne
##                                                         Attributes
## 1                         Coming of age, alienation, and identity.
## 2  Investigative journalism, Watergate scandal, political history.
## 3                                 Quest, personal legend, destiny.
# JSON book file

json_data <- "https://raw.githubusercontent.com/MRobinson112/assignment7/main/book1.json"
json_file <- fromJSON(json_data)

json_data_frame <- as.data.frame(json_file$books)

print(json_data_frame)
##                     title                           author              isbn
## 1  The Catcher in the Rye                    J.D. Salinger 978-0-316-76948-7
## 2 All the President's Men  Carl Bernstein and Bob Woodward 978-0-671-21781-5
## 3           The Alchemist                     Paulo Coelho 978-0-06-231500-7
##   pages                 publisher
## 1   277 Little, Brown and Company
## 2   349          Simon & Schuster
## 3   208                 HarperOne
##                                                        Attributes
## 1                        Coming of age, alienation, and identity.
## 2 Investigative journalism, Watergate scandal, political history.
## 3                                Quest, personal legend, destiny.
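
The `json_file$books` step works because `fromJSON()` simplifies a JSON array of objects into a data frame by default. The sketch below illustrates this with a small inline string. It assumes the real file has a top-level `books` array of objects, which is what the code above implies; the names `json_text`, `parsed`, `books`, and `raw_list` are hypothetical.

```r
library(jsonlite)

# Hypothetical JSON with the layout assumed above: a top-level "books" array
json_text <- '{
  "books": [
    {"title": "The Catcher in the Rye", "author": "J.D. Salinger", "pages": 277},
    {"title": "The Alchemist", "author": "Paulo Coelho", "pages": 208}
  ]
}'

# With default settings, an array of objects is simplified to a data frame
parsed <- fromJSON(json_text)
books <- parsed$books
print(class(books))
print(books)

# With simplifyVector = FALSE the same input stays a nested list
raw_list <- fromJSON(json_text, simplifyVector = FALSE)
print(length(raw_list$books))
```

Because the simplified result is already a data frame, wrapping it in `as.data.frame()` is harmless but not strictly required.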

Conclusion

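After reading all three files into R, the resulting data frames have the same structure: three rows and the same six fields (title, author, ISBN, pages, publisher, attributes). The contents are not fully identical, though. In the HTML version the second author entry reads "by Carl Bernstein, Bob Woodward", and The Alchemist has a different ISBN (978-0-06-250217-9) and page count (197, against 208 in the other two files). The XML version spells the publisher "Simon and Schuster" where the HTML and JSON versions use "Simon & Schuster". All three printouts are numbered automatically with row numbers.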
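
One way to check how closely the three data frames match is a cell-by-cell comparison rather than reading the printouts by eye. The sketch below is a minimal illustration in base R. The values are transcribed from the printed output above (whitespace trimmed, Attributes column left out), and `diff_cells()` and the `*_df` names are hypothetical helpers, not part of the original code.

```r
# Values transcribed from the printed data frames above.
# All object names here are illustrative.
titles <- c("The Catcher in the Rye", "All the President's Men", "The Alchemist")

html_df <- data.frame(
  title = titles,
  author = c("J.D. Salinger", "by Carl Bernstein, Bob Woodward", "Paulo Coelho"),
  isbn = c("978-0-316-76948-7", "978-0-671-21781-5", "978-0-06-250217-9"),
  pages = c(277L, 349L, 197L),
  publisher = c("Little, Brown and Company", "Simon & Schuster", "HarperOne"),
  stringsAsFactors = FALSE
)

xml_df <- data.frame(
  title = titles,
  author = c("J.D. Salinger", "Carl Bernstein and Bob Woodward", "Paulo Coelho"),
  isbn = c("978-0-316-76948-7", "978-0-671-21781-5", "978-0-06-231500-7"),
  pages = c(277L, 349L, 208L),
  publisher = c("Little, Brown and Company", "Simon and Schuster", "HarperOne"),
  stringsAsFactors = FALSE
)

# The JSON printout matches the XML one except for the publisher spelling
json_df <- xml_df
json_df$publisher[2] <- "Simon & Schuster"

# List every cell where two data frames with the same shape and names differ
diff_cells <- function(a, b) {
  stopifnot(identical(dim(a), dim(b)), identical(names(a), names(b)))
  out <- list()
  for (col in names(a)) {
    # Compare as character so integer and character columns are handled alike
    rows <- which(as.character(a[[col]]) != as.character(b[[col]]))
    if (length(rows) > 0) {
      out[[col]] <- data.frame(
        row = rows,
        column = col,
        first = as.character(a[[col]][rows]),
        second = as.character(b[[col]][rows]),
        stringsAsFactors = FALSE
      )
    }
  }
  if (length(out) == 0) {
    return(data.frame(row = integer(0), column = character(0),
                      first = character(0), second = character(0)))
  }
  result <- do.call(rbind, out)
  rownames(result) <- NULL
  result
}

print(diff_cells(html_df, xml_df))
print(diff_cells(xml_df, json_df))
```

With the values as transcribed, the HTML and XML frames differ in four cells and the XML and JSON frames in one. In the real workflow the HTML column names would first need to be lowercased to match the other two, since `diff_cells()` requires identical names.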