The objective of this assignment is to select three of your favorite books on one of your favorite subjects. At least one of the books should have more than one author. For each book, include the title, authors, and two or three other attributes that you find interesting.
Take the information that you’ve selected about these three books, and separately create three files which store the books’ information in HTML (using an HTML table), XML, and JSON formats (e.g. “books.html”, “books.xml”, and “books.json”).
Write R code, using your packages of choice, to load the information from each of the three sources into separate R data frames. Are the three data frames identical?
I selected three of my favorite books on the subject of race, each with more than one author or contributor:
“The Fire This Time: A New Generation Speaks about Race” by Jesmyn Ward (Editor), Clint Smith (Contributor), Kevin Young (Contributor), Mitchell S. Jackson (Contributor), Natasha Trethewey (Contributor), Daniel José Older (Contributor), Edwidge Danticat (Contributor), Honorée Fanonne Jeffers (Contributor), Claudia Rankine (Contributor), Isabel Wilkerson (Contributor)
Goodreads Rating: 4.3/5
Number of ratings: 14,000+
“The Race Card: How Bluffing About Bias Makes Race Relations Worse” by Richard Thompson Ford, Karen E. Fields
Amazon Rating: 4.1/5
Number of ratings: 50+
“This Bridge Called My Back: Writings by Radical Women of Color” by Cherríe L. Moraga (Editor), Gloria Anzaldúa (Editor), Toni Cade Bambara (Contributor), Audre Lorde (Contributor), Barbara Smith (Contributor), Ana Castillo (Contributor), Cherrie Moraga (Contributor), and others
Goodreads Rating: 4.4/5
Number of ratings: 10,000+
I created three files, “books.html”, “books.xml”, and “books.json”, each storing the information for these books in its respective format. Then I loaded each file into a separate R object.
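The loading step can be sketched as follows. This is a minimal sketch and not necessarily the code that produced the output below: it assumes the rvest, xml2, and jsonlite packages, and it writes small stand-in files (two books, shortened author lists) to a temporary directory so that it runs on its own. The element and key names (`books`, `book`, `title`, `authors`, `rating`, `num_ratings`) are taken from the output; the exact file contents, including whether the JSON rating is stored as a number, are assumptions.

```r
library(rvest)     # html_table()
library(xml2)      # read_html(), read_xml(), as_list()
library(jsonlite)  # fromJSON()

# Stand-in files in a temporary directory (assumed contents; the real
# books.* files hold the full author lists for all three books).
dir <- tempdir()

writeLines(c(
  "<table>",
  "<tr><th>Title</th><th>Authors</th><th>Rating</th><th>Number of Ratings</th></tr>",
  "<tr><td>The Fire This Time</td><td>Jesmyn Ward (Editor)</td><td>4.3</td><td>14000+</td></tr>",
  "<tr><td>The Race Card</td><td>Richard Thompson Ford, Karen E. Fields</td><td>4.1</td><td>50+</td></tr>",
  "</table>"
), file.path(dir, "books.html"))

writeLines(c(
  "<books>",
  "  <book><title>The Fire This Time</title><authors>Jesmyn Ward (Editor)</authors>",
  "    <rating>4.3</rating><num_ratings>14000+</num_ratings></book>",
  "  <book><title>The Race Card</title><authors>Richard Thompson Ford, Karen E. Fields</authors>",
  "    <rating>4.1</rating><num_ratings>50+</num_ratings></book>",
  "</books>"
), file.path(dir, "books.xml"))

writeLines(c(
  '{"books": [',
  '  {"title": "The Fire This Time", "authors": "Jesmyn Ward (Editor)",',
  '   "rating": 4.3, "num_ratings": "14000+"},',
  '  {"title": "The Race Card", "authors": "Richard Thompson Ford, Karen E. Fields",',
  '   "rating": 4.1, "num_ratings": "50+"}',
  ']}'
), file.path(dir, "books.json"))

# HTML: html_table() on a whole document returns a list of tibbles,
# one per <table> element.
html_books <- html_table(read_html(file.path(dir, "books.html")))

# XML: as_list() turns the parsed document into a nested list of strings.
xml_books <- as_list(read_xml(file.path(dir, "books.xml")))

# JSON: fromJSON() simplifies an array of objects into a data frame,
# stored here under the top-level "books" key.
json_books <- fromJSON(file.path(dir, "books.json"))

str(html_books[[1]])
str(xml_books$books[[1]])
str(json_books$books)
```

Note that none of the three results is a bare data frame: each parser wraps the table in its own container, which is why the printed output below starts with `[[1]]`, `$books$book`, and `$books` respectively.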
## [1] "HTML Data Frame:"
## [[1]]
## # A tibble: 3 × 4
##   Title                                       Authors Rating `Number of Ratings`
##   <chr>                                       <chr>    <dbl> <chr>
## 1 The Fire This Time: A New Generation Speak… Jesmyn…    4.3 14000+
## 2 The Race Card: How Bluffing About Bias Mak… Richar…    4.1 50+
## 3 This Bridge Called My Back: Writings by Ra… Cherrí…    4.4 10000+
## [1] "\nXML Data Frame:"
## $books
## $books$book
## $books$book$title
## $books$book$title[[1]]
## [1] "The Fire This Time: A New Generation Speaks about Race"
##
##
## $books$book$authors
## $books$book$authors[[1]]
## [1] "Jesmyn Ward (Editor), Clint Smith (Contributor), Kevin Young (Contributor), Mitchell S. Jackson (Contributor), Natasha Trethewey (Contributor), Daniel José Older (Contributor), Edwidge Danticat (Contributor), Honorée Fanonne Jeffers (Contributor), Claudia Rankine (Contributor), Isabel Wilkerson (Contributor)"
##
##
## $books$book$rating
## $books$book$rating[[1]]
## [1] "4.3"
##
##
## $books$book$num_ratings
## $books$book$num_ratings[[1]]
## [1] "14000+"
##
##
##
## $books$book
## $books$book$title
## $books$book$title[[1]]
## [1] "The Race Card: How Bluffing About Bias Makes Race Relations Worse"
##
##
## $books$book$authors
## $books$book$authors[[1]]
## [1] "Richard Thompson Ford, Karen E. Fields"
##
##
## $books$book$rating
## $books$book$rating[[1]]
## [1] "4.1"
##
##
## $books$book$num_ratings
## $books$book$num_ratings[[1]]
## [1] "50+"
##
##
##
## $books$book
## $books$book$title
## $books$book$title[[1]]
## [1] "This Bridge Called My Back: Writings by Radical Women of Color"
##
##
## $books$book$authors
## $books$book$authors[[1]]
## [1] "Cherríe L. Moraga (Editor), Gloria Anzaldúa (Editor), Toni Cade Bambara (Contributor), Audre Lorde (Contributor), Barbara Smith (Contributor), Ana Castillo (Contributor), Cherrie Moraga (Contributor), and others"
##
##
## $books$book$rating
## $books$book$rating[[1]]
## [1] "4.4"
##
##
## $books$book$num_ratings
## $books$book$num_ratings[[1]]
## [1] "10000+"
## [1] "\nJSON Data Frame:"
## $books
## title
## 1 The Fire This Time: A New Generation Speaks about Race
## 2 The Race Card: How Bluffing About Bias Makes Race Relations Worse
## 3 This Bridge Called My Back: Writings by Radical Women of Color
## authors
## 1 Jesmyn Ward (Editor), Clint Smith (Contributor), Kevin Young (Contributor), Mitchell S. Jackson (Contributor), Natasha Trethewey (Contributor), Daniel José Older (Contributor), Edwidge Danticat (Contributor), Honorée Fanonne Jeffers (Contributor), Claudia Rankine (Contributor), Isabel Wilkerson (Contributor)
## 2 Richard Thompson Ford, Karen E. Fields
## 3 Cherríe L. Moraga (Editor), Gloria Anzaldúa (Editor), Toni Cade Bambara (Contributor), Audre Lorde (Contributor), Barbara Smith (Contributor), Ana Castillo (Contributor), Cherrie Moraga (Contributor), and others
## rating num_ratings
## 1 4.3 14000+
## 2 4.1 50+
## 3 4.4 10000+
## logical(0)
## logical(0)
## logical(0)
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] FALSE
## [1] FALSE
## [1] FALSE
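No. The three objects are not identical as loaded, because each parser returns a different structure: the HTML table comes back as a list containing a tibble (columns Title, Authors, Rating, and Number of Ratings, with Rating parsed as a number), the XML as a nested list of character strings, and the JSON as a list containing a data frame (columns title, authors, rating, and num_ratings). The values themselves are the same in all three sources; only the structure, column names, and types differ.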
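One way to check that the contents match despite the structural differences is to coerce each object to a plain data frame with the same column names and types, and then compare the results with `identical()`. The following is a minimal sketch in base R, not the code used for the output above: the three input objects are hand-built stand-ins that mimic the printed shapes (two books, shortened author lists), and `normalize()` is a hypothetical helper introduced only for illustration.

```r
# Hand-built stand-ins mimicking the three shapes printed above (assumed values).
html_books <- list(data.frame(
  Title = c("The Fire This Time", "The Race Card"),
  Authors = c("Jesmyn Ward (Editor)", "Richard Thompson Ford, Karen E. Fields"),
  Rating = c(4.3, 4.1),
  `Number of Ratings` = c("14000+", "50+"),
  check.names = FALSE
))

xml_books <- list(books = list(
  book = list(title = list("The Fire This Time"),
              authors = list("Jesmyn Ward (Editor)"),
              rating = list("4.3"), num_ratings = list("14000+")),
  book = list(title = list("The Race Card"),
              authors = list("Richard Thompson Ford, Karen E. Fields"),
              rating = list("4.1"), num_ratings = list("50+"))
))

json_books <- list(books = data.frame(
  title = c("The Fire This Time", "The Race Card"),
  authors = c("Jesmyn Ward (Editor)", "Richard Thompson Ford, Karen E. Fields"),
  rating = c(4.3, 4.1),
  num_ratings = c("14000+", "50+")
))

cols <- c("title", "authors", "rating", "num_ratings")

# Hypothetical helper: give every data frame the same names, types, and
# row names. Assumes the columns already appear in the order of `cols`.
normalize <- function(df) {
  df <- as.data.frame(df, stringsAsFactors = FALSE)
  names(df) <- cols
  df$rating <- as.numeric(df$rating)
  df$num_ratings <- as.character(df$num_ratings)
  rownames(df) <- NULL
  df
}

html_df <- normalize(html_books[[1]])

# Each <book> is a list of one-element lists; unlist() flattens it into a
# named character vector, which becomes one row of the data frame.
xml_rows <- lapply(unname(xml_books$books), function(b) {
  as.data.frame(as.list(unlist(b)), stringsAsFactors = FALSE)
})
xml_df <- normalize(do.call(rbind, xml_rows))

json_df <- normalize(json_books$books)

# Raw comparison versus comparison after normalization.
print(identical(html_books[[1]], json_books$books))
print(identical(html_df, xml_df))
print(identical(xml_df, json_df))
```

With these stand-ins, the raw comparison fails because the column names differ, while the normalized data frames compare as identical. The same approach should carry over to the real objects, provided the column order matches across the three files.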