library(XML)
library(xml2)
library(htmltab)
library(rvest)
library(jsonlite)
# Read the HTML page and convert its table to a data frame
webpage <- read_html("https://raw.githubusercontent.com/Patel-Krutika/Data_607/main/books.html")
webpage %>% html_table() %>% data.frame()
##         Book.Title              Author Year   Genre              Tag
## 1 The Great Gatsby F. Scott Fitzgerald 1925 Fiction            Novel
## 2     Little Women   Louisa May Alcott 1868 Fiction Domestic Fiction
## 3             1984       George Orwell 1949 Fiction  Science Fiction
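# Read the XML with xml2, then parse it with the XML package so that
# xmlToDataFrame() can flatten the records into a data frame
d <- xml2::read_xml("https://raw.githubusercontent.com/Patel-Krutika/Data_607/main/books.xml")
xmlParse(d) %>% xmlToDataFrame()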
##               name              author year   genre              tag
## 1 The Great Gatsby F. Scott Fitzgerald 1925 Fiction            Novel
## 2     Little Women   Louisa May Alcott 1868 Fiction Domestic Fiction
## 3             1984       George Orwell 1949 Fiction  Science Fiction
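# Parse the JSON file; data.frame() flattens the result and prefixes
# each column name with "books."
books_json <- fromJSON("https://raw.githubusercontent.com/Patel-Krutika/Data_607/main/books.json")
books_json %>% data.frame()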
##         books.name        books.author books.year books.genre        books.tag
## 1 The Great Gatsby F. Scott Fitzgerald       1925     Fiction            Novel
## 2     Little Women   Louisa May Alcott       1868     Fiction Domestic Fiction
## 3             1984       George Orwell       1949     Fiction  Science Fiction
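The three data frames hold the same values and look alike when printed, but they differ in two ways. First, the column names differ: the HTML table yields Book.Title, Author, Year, Genre, and Tag; the XML yields the lowercase names name, author, year, genre, and tag; and the JSON yields those same lowercase names prefixed with "books.". Second, the column types differ: the HTML table data frame preserved the integer type of the year column, whereas the XML and JSON data frames stored it as character.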
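The comparison can be checked in code rather than by eye. The following is a minimal sketch, not the original analysis: to run without network access it rebuilds three small stand-in data frames from the printed output instead of re-reading the files, and `harmonize()` is a hypothetical helper introduced here only for illustration.

```r
# Stand-ins for the three parsed results, rebuilt from the printed output.
# These are assumptions for illustration, not the objects created earlier.
titles  <- c("The Great Gatsby", "Little Women", "1984")
authors <- c("F. Scott Fitzgerald", "Louisa May Alcott", "George Orwell")
tags    <- c("Novel", "Domestic Fiction", "Science Fiction")

html_df <- data.frame(Book.Title = titles, Author = authors,
                      Year = c(1925L, 1868L, 1949L), Genre = "Fiction",
                      Tag = tags, stringsAsFactors = FALSE)
xml_df  <- data.frame(name = titles, author = authors,
                      year = c("1925", "1868", "1949"), genre = "Fiction",
                      tag = tags, stringsAsFactors = FALSE)
json_df <- xml_df
names(json_df) <- paste0("books.", names(xml_df))

# One column of classes per source; the third row is the year column.
col_types <- sapply(list(html = html_df, xml = xml_df, json = json_df),
                    function(df) unname(sapply(df, class)))
print(col_types)

# Hypothetical helper: give every data frame the same column names
# and an integer year so the three can be compared directly.
harmonize <- function(df) {
  names(df) <- c("name", "author", "year", "genre", "tag")
  df$year <- as.integer(df$year)
  df
}

html_h <- harmonize(html_df)
xml_h  <- harmonize(xml_df)
json_h <- harmonize(json_df)

# TRUE when the three sources agree after harmonizing names and types.
all_equal <- identical(html_h, xml_h) && identical(xml_h, json_h)
print(all_equal)
```

Applied to the real objects, the same two steps (inspect classes with `sapply(df, class)`, then rename and convert year) would confirm whether the sources differ only in names and types.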