I utilized the htmlTable and kableExtra packages to create a visually appealing HTML table for my book list. The combination of these two packages allowed me to easily format the data and enhance the presentation with custom styles and features. This made it simple to organize the information effectively and present it in a user-friendly manner, as I am familiar with hmtl.
knitr::opts_chunk$set(echo = TRUE)
#install.packages("htmlTable")
#install.packages("magrittr")
library(htmlTable)
library(magrittr)
library(knitr)
library(kableExtra)
Manually I created the hmtl table by first include the table border of Title, Author, Genre, Publish, Good Reads Ranking, and Pages. Then it is followed by the actual books in the list.
knitr::opts_chunk$set(echo = TRUE)
# Create the HTML table manually
html_content <- "
<!DOCTYPE html>
<html>
<head>
<title>Book List</title>
</head>
<body>
<table border='1'>
<tr>
<th>Title</th>
<th>Author</th>
<th>Genre</th>
<th>Publish</th>
<th>Goodreads Ranking</th>
<th>Pages</th>
</tr>
<tr>
<td>Data Feminism</td>
<td>Catherine D'Ignazio, Lauren Klein</td>
<td>Reference work</td>
<td>February 21, 2020</td>
<td>4.3</td>
<td>328</td>
</tr>
<tr>
<td>The Rise of Kyoshi</td>
<td>F. C. Yee, Michael Dante DiMartino</td>
<td>Fantasy</td>
<td>July 16, 2019</td>
<td>4.4</td>
<td>442</td>
</tr>
<tr>
<td>The Shadow of Kyoshi</td>
<td>F. C. Yee, Michael Dante DiMartino</td>
<td>Fantasy</td>
<td>July 21, 2020</td>
<td>4.3</td>
<td>341</td>
</tr>
<tr>
<td>The Dawn of Yangchen</td>
<td>F. C. Yee, Michael Dante DiMartino</td>
<td>Fantasy</td>
<td>January 1, 2022</td>
<td>4.0</td>
<td>1136</td>
</tr>
<tr>
<td>The Legacy of Yangchen</td>
<td>F. C. Yee, Michael Dante DiMartino</td>
<td>Fantasy</td>
<td>July 18, 2023</td>
<td>4.1</td>
<td>336</td>
</tr>
<tr>
<td>The Reckoning of Roku</td>
<td>Randy Ribay</td>
<td>Fantasy</td>
<td>July 23, 2024</td>
<td>4.0</td>
<td>368</td>
</tr>
<tr>
<td>Playground</td>
<td>Aaron Beauregard</td>
<td>Horror</td>
<td>November 25, 2022</td>
<td>3.4</td>
<td>290</td>
</tr>
<tr>
<td>Demon Slayer Vol 1</td>
<td>Koyoharu Gotouge</td>
<td>Manga/Horror</td>
<td>June 6, 2016</td>
<td>4.5</td>
<td>192</td>
</tr>
<tr>
<td>Goodnight PunPun Vol 1</td>
<td>Inio Asano</td>
<td>Manga/Psychological</td>
<td>August 3, 2007</td>
<td>4.2</td>
<td>448</td>
</tr>
</table>
</body>
</html>
"
# Write to an HTML file
cat(html_content, file = "html_books.html")
I was able to create an HTML file using html_content. The table, which features a book list, is very clear, with column names and each cell matching the correct data entries, ensuring that the information is easy to read and interpret.
Reference: htmlTable.”htmlTable: A Package for Creating HTML Tables.”Accessed October 12, 2024. https://cran.r-project.org/web/packages/htmlTable/vignettes/general.html.
kableExtra. “kableExtra: Construct Complex Table with ‘kable’ and LaTeX.” Accessed October 12, 2024. https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html.
To convert the book list into a JSON file, I utilized the tidyjson and jsonlite packages. The tidyjson package allowed me to efficiently manipulate and structure the data, while jsonlite provided the functionality to convert the organized data frame into a well-formatted JSON file. This combination of packages ensured a smooth process in creating a JSON representation of the book list, making the data easy to share and integrate with other applications.
knitr::opts_chunk$set(echo = TRUE,collapse = T, comment = "#>")
options(tibble.print_min = 4L, tibble.print_max = 4L)
#install.packages("tidyjson")
#install.packages("dplyr")
#install.packages("jsonlite")
library(tidyjson)
##
## Attaching package: 'tidyjson'
## The following object is masked from 'package:stats':
##
## filter
library(dplyr)
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:kableExtra':
##
## group_rows
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(jsonlite)
##
## Attaching package: 'jsonlite'
## The following object is masked from 'package:tidyjson':
##
## read_json
To manually create a JSON file, the data must be structured line by line, which is quite different from the process for HTML. In HTML, I can simply label the columns and then list the books in subsequent lines. However, in JSON, the column name needs to precede the corresponding book information for each entry, requiring a more meticulous approach to ensure proper formatting.
knitr::opts_chunk$set(echo = TRUE,collapse = T, comment = "#>")
options(tibble.print_min = 4L, tibble.print_max = 4L)
# books for JSON collection
books <- c(
'{"Title": "Data Feminism", "Author": "Catherine D\'Ignazio, Lauren Klein", "Genre": "Reference work", "Publish": "February 21, 2020", "Goodreads ranking": 4.3, "Pages": 328}',
'{"Title": "The Rise of Kyoshi", "Author": "F. C. Yee, Michael Dante DiMartino", "Genre": "Fantasy", "Publish": "July 16, 2019", "Goodreads ranking": 4.4, "Pages": 442}',
'{"Title": "The Shadow of Kyoshi", "Author": "F. C. Yee, Michael Dante DiMartino", "Genre": "Fantasy", "Publish": "July 21, 2020", "Goodreads ranking": 4.3, "Pages": 341}',
'{"Title": "The Dawn of Yangchen", "Author": "F. C. Yee, Michael Dante DiMartino", "Genre": "Fantasy", "Publish": "January 1, 2022", "Goodreads ranking": 4.0, "Pages": 1136}',
'{"Title": "The Legacy of Yangchen", "Author": "F. C. Yee, Michael Dante DiMartino", "Genre": "Fantasy", "Publish": "July 18, 2023", "Goodreads ranking": 4.1, "Pages": 336}',
'{"Title": "The Reckoning of Roku", "Author": "Randy Ribay", "Genre": "Fantasy", "Publish": "July 23, 2024", "Goodreads ranking": 4.0, "Pages": 368}',
'{"Title": "Playground", "Author": "Aaron Beauregard", "Genre": "Horror", "Publish": "November 25, 2022", "Goodreads ranking": 3.4, "Pages": 290}',
'{"Title": "Demon Slayer Vol 1", "Author": "Koyoharu Gotouge", "Genre": "Manga/Horror", "Publish": "June 6, 2016", "Goodreads ranking": 4.5, "Pages": 192}',
'{"Title": "Goodnight PunPun Vol 1", "Author": "Inio Asano", "Genre": "Manga/Psychological", "Publish": "August 3, 2007", "Goodreads ranking": 4.2, "Pages": 448}'
)
# tidy the JSON data
books %>% spread_all
#> # A tbl_json: 9 x 8 tibble with a "JSON" attribute
#> ..JSON document.id Title Author Genre Publish `Goodreads ranking` Pages
#> <chr> <int> <chr> <chr> <chr> <chr> <dbl> <dbl>
#> 1 "{\"Title\":… 1 Data… Cathe… Refe… Februa… 4.3 328
#> 2 "{\"Title\":… 2 The … F. C.… Fant… July 1… 4.4 442
#> 3 "{\"Title\":… 3 The … F. C.… Fant… July 2… 4.3 341
#> 4 "{\"Title\":… 4 The … F. C.… Fant… Januar… 4 1136
#> # ℹ 5 more rows
# convert the character vector of JSON strings into a list of R objects
book_json<- lapply(books, fromJSON)
#createe df
df_json <- bind_rows(book_json)
# export JSON file
write_json(df_json, "JSON_books.json", pretty = TRUE)
print(df_json)
#> # A tibble: 9 × 6
#> Title Author Genre Publish `Goodreads ranking` Pages
#> <chr> <chr> <chr> <chr> <dbl> <int>
#> 1 Data Feminism Catherine D'Igna… Refe… Februa… 4.3 328
#> 2 The Rise of Kyoshi F. C. Yee, Micha… Fant… July 1… 4.4 442
#> 3 The Shadow of Kyoshi F. C. Yee, Micha… Fant… July 2… 4.3 341
#> 4 The Dawn of Yangchen F. C. Yee, Micha… Fant… Januar… 4 1136
#> # ℹ 5 more rows
The output of a JSON file is different from HTML in that it’s organized as key-value pairs, which you can easily view in a text editor like Notepad. In contrast, HTML is designed for the web, so it displays information in a structured format that looks good in a browser.
Reference: Arendt, Cole. “Visualizing JSON.” GitHub, last modified October 12, 2024. https://github.com/colearendt/tidyjson/blob/master/vignettes/visualizing-json.Rmd.
Initially, I wanted to use the xml2 package to create the book list in XML format. However, due to some issues I encountered, I opted for the XML package instead. This package offers robust tools for reading and writing XML files, making it easier to manipulate XML documents. The XML package allows users to construct and modify XML structures programmatically. I used functions like newXMLDoc() to create a new XML document and newXMLNode() to add child nodes. This enabled me to build a hierarchical structure for my book list, incorporating details such as title, author, genre, and more as child elements under each book entry.
knitr::opts_chunk$set(echo = TRUE)
#install.packages("XML")
library(XML)
# create XML document
books_xml <- newXMLDoc()
# root node
root <- newXMLNode("books", doc = books_xml)
# Function to add a book
add_book <- function(title, author, genre, publish, goodreads_ranking, pages) {
book_node <- newXMLNode("book", parent = root)
newXMLNode("title", title, parent = book_node)
newXMLNode("author", author, parent = book_node)
newXMLNode("genre", genre, parent = book_node)
newXMLNode("publish", publish, parent = book_node)
newXMLNode("goodreads_ranking", goodreads_ranking, parent = book_node)
newXMLNode("pages", pages, parent = book_node)
}
# Add book entries
add_book("Data Feminism", "Catherine D'Ignazio, Lauren Klein", "Reference work", "February 21, 2020", "4.3", "328")
#> <pages>328</pages>
add_book("The Rise of Kyoshi", "F. C. Yee, Michael Dante DiMartino", "Fantasy", "July 16, 2019", "4.4", "442")
#> <pages>442</pages>
add_book("The Shadow of Kyoshi", "F. C. Yee, Michael Dante DiMartino", "Fantasy", "July 21, 2020", "4.3", "341")
#> <pages>341</pages>
add_book("The Dawn of Yangchen", "F. C. Yee, Michael Dante DiMartino", "Fantasy", "January 1, 2022", "4.0", "1136")
#> <pages>1136</pages>
add_book("The Legacy of Yangchen", "F. C. Yee, Michael Dante DiMartino", "Fantasy", "July 18, 2023", "4.1", "336")
#> <pages>336</pages>
add_book("The Reckoning of Roku", "Randy Ribay", "Fantasy", "July 23, 2024", "4.0", "368")
#> <pages>368</pages>
add_book("Playground", "Aaron Beauregard", "Horror", "November 25, 2022", "3.4", "290")
#> <pages>290</pages>
add_book("Demon Slayer Vol 1", "Koyoharu Gotouge", "Manga/Horror", "June 6, 2016", "4.5", "192")
#> <pages>192</pages>
add_book("Goodnight PunPun Vol 1", "Inio Asano", "Manga/Psychological", "August 3, 2007", "4.2", "448")
#> <pages>448</pages>
# Save to XML file
saveXML(books_xml, "book_list.xml")
#> [1] "book_list.xml"
XML serves as a bridge between HTML and JSON, combining elements of both formats. Similar to HTML, I could define a function with columns to organize the book information effectively. However, like JSON, I had to explicitly write out each column alongside its corresponding book details. The exported format also resembled JSON, presenting the data as a list rather than a table. While I considered using the data.table package to create a tree structure in XML, I encountered several issues that made this approach challenging. Ultimately, I found that HTML provided a more concise and easily readable table format for my book list.
Reference: R Core Team. “xml2: Parse and Generate XML.” CRAN, 2024. https://cran.r-project.org/web/packages/xml2/xml2.pdf.