Load the needed libraries

library(xml2)
library(jsonlite)
library(XML)

HTML

You can find the code I used to create the HTML table on my GitHub page! The data frame read from that table is called books_html in the comparisons below.

XML

Load the XML file I created with my favourite books into a data frame!

books_xml <- xmlToDataFrame("books.xml")
print(books_xml)
##                                       Title
## 1                           Thomas Calculus
## 2                Introduction to Algorithms
## 3 Discrete Mathematics and Its Applications
##                                                                  Author   Cost
## 1                      Joel R. Hass,Christopher E. Heil,Maurice D. Weir  95.99
## 2 Thomas H. Cormen,Charles E. Leiserson,Ronald L. Rivest,Clifford Stein  52.12
## 3                                                      Kenneth H. Rosen 129.15
##   Edition           ISBN Publication_Year
## 1     3rd 978-0137442997             2021
## 2     3rd 978-0262033848             2009
## 3     7th 978-0073383095             2011

JSON

Load the JSON file I created with my favourite books into a data frame!

books_json <- fromJSON("books.json")
print(books_json)
##                                       Title
## 1                           Thomas Calculus
## 2                Introduction to Algorithms
## 3 Discrete Mathematics and Its Applications
##                                                                  Author   Cost
## 1                      Joel R. Hass,Christopher E. Heil,Maurice D. Weir  95.99
## 2 Thomas H. Cormen,Charles E. Leiserson,Ronald L. Rivest,Clifford Stein  52.12
## 3                                                      Kenneth H. Rosen 129.15
##   Edition           ISBN Publication_Year
## 1     3rd 978-0137442997             2021
## 2     3rd 978-0262033848             2009
## 3     7th 978-0073383095             2011

I will use the identical() function to compare the 3 data frames pairwise. identical() tests two objects for exact equality, so for data frames it checks that the column names, the column types, the values in each cell, and the remaining attributes (such as row names) all match. If they do, identical() returns TRUE; otherwise it returns FALSE.

identical(books_html, books_xml)
## [1] FALSE
identical(books_html, books_json)
## [1] FALSE
identical(books_xml, books_json)
## [1] FALSE
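
To show how strict identical() is, here is a minimal sketch with small made-up data frames (a, b, c_num and a_copy are illustrative names, not objects from this analysis). A different column name or a different column type is enough to get FALSE.

```r
# Made-up data frames for illustration only.
a <- data.frame(Title = "Book A", Cost = "10.50", stringsAsFactors = FALSE)

# Same values, but the second column has a different name.
b <- data.frame(Title = "Book A", Price = "10.50", stringsAsFactors = FALSE)
identical(a, b)       # FALSE: column names differ

# Same names, but Cost is numeric here and character in `a`.
c_num <- data.frame(Title = "Book A", Cost = 10.5, stringsAsFactors = FALSE)
identical(a, c_num)   # FALSE: column types differ

# Same names, same types, same values.
a_copy <- data.frame(Title = "Book A", Cost = "10.50", stringsAsFactors = FALSE)
identical(a, a_copy)  # TRUE
```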

I’ll try to make them identical, but first I need to figure out where the problem is.

names(books_html)
## [1] "Title"            "Authors"          "Cost"             "Edition"         
## [5] "ISBN"             "Publication Year"
names(books_xml)
## [1] "Title"            "Author"           "Cost"             "Edition"         
## [5] "ISBN"             "Publication_Year"
names(books_json)
## [1] "Title"            "Author"           "Cost"             "Edition"         
## [5] "ISBN"             "Publication_Year"
identical(names(books_html), names(books_xml))
## [1] FALSE
identical(names(books_html), names(books_json))
## [1] FALSE
str(books_html)
## 'data.frame':    3 obs. of  6 variables:
##  $ Title           : chr  "Thomas Calculus" "Introduction to Algorithms" "Discrete Mathematics and Its Applications"
##  $ Authors         : chr  "Joel R. Hass,Christopher E. Heil,Maurice D. Weir" "Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein" "Kenneth H. Rosen"
##  $ Cost            : chr  "95.99" "52.12" "129.15"
##  $ Edition         : chr  "3rd" "3rd" "7th"
##  $ ISBN            : chr  "978-0137442997" "978-0262033848" "978-0073383095"
##  $ Publication Year: chr  "2021" "2009" "2011"
str(books_xml)
## 'data.frame':    3 obs. of  6 variables:
##  $ Title           : chr  "Thomas Calculus" "Introduction to Algorithms" "Discrete Mathematics and Its Applications"
##  $ Author          : chr  "Joel R. Hass,Christopher E. Heil,Maurice D. Weir" "Thomas H. Cormen,Charles E. Leiserson,Ronald L. Rivest,Clifford Stein" "Kenneth H. Rosen"
##  $ Cost            : chr  "95.99" "52.12" "129.15"
##  $ Edition         : chr  "3rd" "3rd" "7th"
##  $ ISBN            : chr  "978-0137442997" "978-0262033848" "978-0073383095"
##  $ Publication_Year: chr  "2021" "2009" "2011"
str(books_json) # Cost and Publication_Year are not character columns here, unlike in the other two data frames
## 'data.frame':    3 obs. of  6 variables:
##  $ Title           : chr  "Thomas Calculus" "Introduction to Algorithms" "Discrete Mathematics and Its Applications"
##  $ Author          : chr  "Joel R. Hass,Christopher E. Heil,Maurice D. Weir" "Thomas H. Cormen,Charles E. Leiserson,Ronald L. Rivest,Clifford Stein" "Kenneth H. Rosen"
##  $ Cost            : num  96 52.1 129.2
##  $ Edition         : chr  "3rd" "3rd" "7th"
##  $ ISBN            : chr  "978-0137442997" "978-0262033848" "978-0073383095"
##  $ Publication_Year: int  2021 2009 2011
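
Instead of reading the str() output by eye, the column types can also be compared programmatically. The sketch below is only an illustration: column_classes is a hypothetical helper, and from_xml and from_json are made-up one-row stand-ins for the real data frames.

```r
# Hypothetical helper: first class of every column, as a named character vector.
column_classes <- function(df) {
  vapply(df, function(col) class(col)[1], character(1))
}

# Made-up stand-ins: character columns on one side, numeric/integer on the other.
from_xml  <- data.frame(Cost = "95.99", Publication_Year = "2021",
                        stringsAsFactors = FALSE)
from_json <- data.frame(Cost = 95.99, Publication_Year = 2021L)

xml_classes  <- column_classes(from_xml)
json_classes <- column_classes(from_json)

# Side-by-side view of the classes, one row per column.
print(cbind(xml = xml_classes, json = json_classes))

# Names of the columns whose classes differ.
mismatched <- names(xml_classes)[xml_classes != json_classes]
print(mismatched)
```

This assumes both data frames have the same columns in the same order, which is the case for the XML and JSON data frames here.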

The problem is in the JSON data frame: Cost and Publication_Year are not character columns as they are in the other two data frames. Let’s change that!

books_json$Cost <- as.character(books_json$Cost)
books_json$Publication_Year <- as.character(books_json$Publication_Year)

identical(books_html, books_xml)
## [1] FALSE
identical(books_html, books_json)
## [1] FALSE
identical(books_xml, books_json)
## [1] TRUE

Conclusion

I managed to make 2 of the data frames identical to each other, but couldn’t figure out a way to make the third identical to the other two. By the looks of it, the HTML data frame differs from the other 2 in some way.
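
The names() and str() output above hints at two visible differences in the HTML data frame: its column names ("Authors" and "Publication Year" instead of "Author" and "Publication_Year"), and spaces after the commas in the second author list. The sketch below shows one possible way to align them. It uses stand-in data frames (html_like and xml_like, hypothetical names) with values copied from the str() output, limited to two columns for brevity. Whether the same steps make the real books_html identical to books_xml is an assumption, since differences that are not visible in the printed output would remain.

```r
# Stand-in for the HTML data frame, values copied from the str() output.
html_like <- data.frame(
  Authors = c("Joel R. Hass,Christopher E. Heil,Maurice D. Weir",
              "Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein",
              "Kenneth H. Rosen"),
  `Publication Year` = c("2021", "2009", "2011"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Stand-in for the XML data frame.
xml_like <- data.frame(
  Author = c("Joel R. Hass,Christopher E. Heil,Maurice D. Weir",
             "Thomas H. Cormen,Charles E. Leiserson,Ronald L. Rivest,Clifford Stein",
             "Kenneth H. Rosen"),
  Publication_Year = c("2021", "2009", "2011"),
  stringsAsFactors = FALSE
)

before <- identical(html_like, xml_like)

# Step 1: use the same column names (relies on the columns being in the same order).
names(html_like) <- names(xml_like)

# Step 2: drop the whitespace that follows each comma in the author lists.
html_like$Author <- gsub(",\\s+", ",", html_like$Author)

after <- identical(html_like, xml_like)
print(c(before = before, after = after))
```

If the real data frames still differ after these steps, all.equal(books_html, books_xml) describes the remaining differences instead of only returning FALSE.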