Below are three of my favorite books on one of my favorite subjects, each written by more than one author. For each book, the title, authors, genre and publication date were stored in three file formats (an HTML table, XML and JSON), and each file was then read into R.
Title | Author | Genre | Year Published |
---|---|---|---|
Good Omens | Terry Pratchett, Neil Gaiman | Mystery | 5/10/1990 |
Heads You Lose | Lisa Lutz, David Hayward | Homorous Fiction | 4/5/2011 |
Between The Lines | Jodi Picoult, Samantha van Leer | Fantasy Fiction | 6/26/2012 |
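library(XML)  # provides readHTMLTable() and xmlToDataFrame()
url <- "file:///C:/Users/newma/OneDrive/Desktop/MSDS%20Fall%202021/DATA%20607%20-%20Data%20Acquisition%20and%20Mgt/html%20course/Book.html"
# Read the first table in the HTML file into a data frame
Myhtml <- readHTMLTable(url, which = 1)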
#Display class
class(Myhtml)
## [1] "data.frame"
#Display data
Myhtml
## Title Author Genre
## 1 Good Omens Terry Pratchett, Neil Gaiman Mystery
## 2 Heads You Lose Lisa Lutz, David Hayward Homorous Fiction
## 3 Between The Lines Jodi Picoult, Samantha van Leer Fantasy Fiction
## Year Published
## 1 5/10/1990
## 2 4/5/2011
## 3 6/26/2012
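The "Year Published" column holds full dates as text rather than years. A minimal sketch of converting it follows, assuming the values follow the month/day/year pattern shown above. The `books` data frame and the new `Published` and `Year` columns are illustrative stand-ins and are not part of the original analysis.

```r
# Stand-in for the data frame read from the HTML table (values copied from the output above)
books <- data.frame(
  Title = c("Good Omens", "Heads You Lose", "Between The Lines"),
  `Year Published` = c("5/10/1990", "4/5/2011", "6/26/2012"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# as.character() guards against the column having been read as a factor
books$Published <- as.Date(as.character(books[["Year Published"]]), format = "%m/%d/%Y")

# Extract the four-digit year as an integer
books$Year <- as.integer(format(books$Published, "%Y"))

print(books[, c("Title", "Published", "Year")])
```

The same conversion should apply to the `Year_Published` column of the JSON and XML data frames, since they use the same date pattern.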
[
  {
    "title": "Good Omens",
    "authors": ["Neil Gaiman", "Terry Pratchett"],
    "Genre": "Homorous Fiction",
    "Year_Published": "5/10/1990"
  },
  {
    "title": "Heads You Lose",
    "authors": ["Lisa Lutz", "David Hayward"],
    "Genre": "Mystery",
    "Year_Published": "4/5/2011"
  },
  {
    "title": "Between The Lines",
    "authors": ["Jodi Picoult", "Samantha van Leer"],
    "Genre": "Fantasy Fiction",
    "Year_Published": "6/26/2012"
  }
]
library(jsonlite)  # provides fromJSON() with the txt argument
# Read the JSON file from its URL into a data frame
Myjson <- fromJSON(txt = "https://raw.githubusercontent.com/nnaemeka-git/global-datasets/main/Books.json")
# Display the class
class(Myjson)
## [1] "data.frame"
# Printing the result.
Myjson
## title authors Genre
## 1 Good Omens Neil Gaiman, Terry Pratchett Homorous Fiction
## 2 Heads You Lose Lisa Lutz, David Hayward Mystery
## 3 Between The Lines Jodi Picoult, Samantha van Leer Fantasy Fiction
## Year_Published
## 1 5/10/1990
## 2 4/5/2011
## 3 6/26/2012
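The JSON `authors` array appears in a single column that holds both names for each book. A minimal sketch of flattening such a column into plain text follows, assuming it is stored as a list column of character vectors (which is how jsonlite typically represents nested arrays). The `json_books` data frame is built by hand here as a stand-in, and `authors_flat` is a hypothetical column name.

```r
# Hand-built stand-in for the JSON data frame, with authors as a list column
json_books <- data.frame(
  title = c("Good Omens", "Heads You Lose", "Between The Lines"),
  stringsAsFactors = FALSE
)
json_books$authors <- list(
  c("Neil Gaiman", "Terry Pratchett"),
  c("Lisa Lutz", "David Hayward"),
  c("Jodi Picoult", "Samantha van Leer")
)

# Collapse each character vector into one comma-separated string
json_books$authors_flat <- vapply(
  json_books$authors, paste, character(1), collapse = ", "
)

print(json_books[, c("title", "authors_flat")])
```

A flattened character column is easier to compare with the `Author` column of the HTML data frame and to write out with functions such as `write.csv()`.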
<Person>
<book>
<title>Good Omens</title>
<first_author>Neil Gaiman</first_author>
<second_author>Terry Pratchett</second_author>
<Genre>Homorous Fiction</Genre>
<Year_Published>5/10/1990</Year_Published>
</book>
<book>
<title>Heads You Lose</title>
<first_author>Lisa Lutz</first_author>
<second_author>David Hayward</second_author>
<Genre>Mystery</Genre>
<Year_Published>4/5/2011</Year_Published>
</book>
<book>
<title>Between The Lines</title>
<first_author>Jodi Picoult</first_author>
<second_author>Samantha van Leer</second_author>
<Genre>Fantasy Fiction</Genre>
<Year_Published>6/26/2012</Year_Published>
</book>
</Person>
# Path to the local XML file
url <- "file:///C:/Users/newma/OneDrive/Desktop/MSDS%20Fall%202021/DATA%20607%20-%20Data%20Acquisition%20and%20Mgt/html%20course/Books.xml"
# Convert each <book> node into one row of a data frame
Myxml <- xmlToDataFrame(url)
#Display class
class(Myxml)
## [1] "data.frame"
#Printing the dataframe
print(Myxml)
## title first_author second_author Genre
## 1 Good Omens Neil Gaiman Terry Pratchett Homorous Fiction
## 2 Heads You Lose Lisa Lutz David Hayward Mystery
## 3 Between The Lines Jodi Picoult Samantha van Leer Fantasy Fiction
## Year_Published
## 1 5/10/1990
## 2 4/5/2011
## 3 6/26/2012
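All three files load as a data.frame with three rows, but the results are not identical. The HTML and JSON data frames both hold the authors of each book in a single column, although their column names differ (Title, Author and Year Published versus title, authors and Year_Published). The data frame from the XML file has five columns instead of four, because each author is stored in a separate column (first_author and second_author). The source files also disagree on content: relative to the JSON and XML files, the HTML table lists the authors of Good Omens in the opposite order and swaps the Genre values of the first two books.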
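The XML layout can be brought in line with the other two by merging the author columns. The sketch below is one possible approach, not part of the original analysis. `xml_books` is a hand-built stand-in for `Myxml`, and the `authors` column name is chosen here to match the JSON file.

```r
# Stand-in for the data frame returned by xmlToDataFrame() (values copied from the output above)
xml_books <- data.frame(
  title = c("Good Omens", "Heads You Lose", "Between The Lines"),
  first_author = c("Neil Gaiman", "Lisa Lutz", "Jodi Picoult"),
  second_author = c("Terry Pratchett", "David Hayward", "Samantha van Leer"),
  stringsAsFactors = FALSE
)

# Join the two author columns into one comma-separated column
xml_books$authors <- paste(xml_books$first_author, xml_books$second_author, sep = ", ")

# Keep only the columns shared with the single-author-column layout
combined <- xml_books[, c("title", "authors")]
print(combined)
```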