1) Pick 3 Favorite Books
I selected books by authors I have enjoyed in the past or whose works I consider literary masterpieces. I had to go outside my favorites to meet the requirement that at least one of the books have more than one author, so I settled for authors I admire or who wrote books I would like.
2) Take the information that you've selected about the 3 books and create three separate files in which to store the books' info: HTML, XML, and JSON.
3) Write R code, using your packages of choice, to load the information from each of the three sources into separate R data frames.
4) Conclusion: are the three data frames identical?
1) Pick 3 Favorite Books
- Weaverbird by Nigerian authors: Ayodele Arigbabu, Ayọ̀bámi Adébáyọ̀, Unoma Nguemo Azuah, Khalidah Aderonke
- Merry Sexy Christmas by Beverly Jenkins, Kayla Perrin, and Maureen Smith
- Sula by the literary genius Toni Morrison
2) These are the handwritten scripts for each format
HTML
<!DOCTYPE html>
<html>
<head>
<style>
table, th, td {
border: 1px solid #dddddd;
text-align: center;
border-collapse: collapse;
}
td, th {
padding: 6px;
}
</style>
</head>
<body>
<h2>Favorite Books</h2>
<table style="width:100%">
<tr>
<th>Book Number</th>
<th>Title</th>
<th>1ST Author </th>
<th>2ND Author </th>
<th>3RD Author </th>
<th>Subject(s)</th>
<th>Publication year</th>
<th>Goodreads rating</th>
</tr>
<tr>
<td> 1 </td>
<td>Weaverbird</td>
<td>Ayọ̀bámi Adébáyọ̀</td>
<td>Ayodele Arigbabu</td>
<td>Unoma Nguemo Azuah</td>
<td> Nigeria,Fiction,Anthologies,Challenging,Reflective</td>
<td> 2013 </td>
<td> 4.00 </td>
</tr>
<tr>
<td> 2 </td>
<td>Merry Sexy Christmas</td>
<td>Beverly Jenkins</td>
<td>Kayla Perrin</td>
<td>Maureen Smith</td>
<td>Romance,African American Romance, Anthologies,Holiday </td>
<td> 2012 </td>
<td> 4.18</td>
</tr>
<tr>
<td> 3 </td>
<td>Sula</td>
<td>Toni Morrison</td>
<td> </td>
<td> </td>
<td> Fiction,Classic,Historical Fiction, African American </td>
<td> 1973 </td>
<td> 4.01 </td>
</tr>
</table>
</body>
</html>
XML
<?xml version="1.0" encoding="UTF-8"?>
<FavoriteBooks>
<Book1>
<BookNumber>1</BookNumber>
<Title>Weaverbird</Title>
<Author1>Ayọ̀bámi Adébáyọ̀</Author1>
<Author2>Ayodele Arigbabu</Author2>
<Author3>Unoma Nguemo Azuah</Author3>
<Subjects>Nigeria,Fiction,Anthologies,Challenging,Reflective</Subjects>
<PublicationYR>2013</PublicationYR>
<Goodreads_rating>4.00</Goodreads_rating>
</Book1>
<Book2>
<BookNumber>2</BookNumber>
<Title>Merry Sexy Christmas</Title>
<Author1>Beverly Jenkins</Author1>
<Author2>Kayla Perrin</Author2>
<Author3>Maureen Smith</Author3>
<Subjects>Romance,African American Romance, Anthologies,Holiday </Subjects>
<PublicationYR>2012</PublicationYR>
<Goodreads_rating>4.18</Goodreads_rating>
</Book2>
<Book3>
<BookNumber>3</BookNumber>
<Title>Sula</Title>
<Author1>Toni Morrison</Author1>
<Author2> </Author2>
<Author3> </Author3>
<Subjects>Fiction,Classic,Historical Fiction, African American </Subjects>
<PublicationYR>1973</PublicationYR>
<Goodreads_rating>4.01</Goodreads_rating>
</Book3>
</FavoriteBooks>
JSON - The one that did not work well
{
"library": [
{
"Book Number": 1,
"Title": "Weaverbird",
"Author1": "Ayọ̀bámi Adébáyọ̀",
"Author2": "Ayodele Arigbabu",
"Author3": "Unoma Nguemo Azuah",
"Subjects": "Nigeria,Fiction,Anthologies,Challenging,Reflective",
"PublicationYR": 2013,
"Goodreads_rating": 4
},
{
"Book Number": 2,
"Title": "Merry Sexy Christmas",
"Author1": "Beverly Jenkins",
"Author2": "Ayodele Arigbabu",
"Author3": "Kayla Perrin",
"Subjects": "Romance,African American Romance, Anthologies,Holiday",
"PublicationYR": 2012,
"Goodreads_rating": 4.18
},
{
"Book Number": 3,
"Title": "Sula",
"Author1": "Toni Morrison",
"Author2": " ",
"Author3": " ",
"Subjects": "Fiction,Classic,Historical Fiction, African American",
"PublicationYR": 1973,
"Goodreads_rating": 4.01
}
]
}
JSON - The one that worked well
{
"library": [
{
"Book Number": [
1,
2,
3
],
"Title": [
"Weaverbird",
"Merry Sexy Christmas",
"Sula"
],
"Author1": [
"Ayọ̀bámi Adébáyọ̀",
"Beverly Jenkins",
"Toni Morrison"
],
"Author2": [
"Ayodele Arigbabu",
"Kayla Perrin",
" "
],
"Author3": [
"Unoma Nguemo Azuah",
"Maureen Smith",
" "
],
"Subjects": [
"Nigeria,Fiction,Anthologies,Challenging,Reflective",
"Romance,African American Romance, Anthologies,Holiday",
"Fiction,Classic,Historical Fiction, African American"
],
"PublicationYR": [
2013,
2012,
1973
],
"Goodreads_rating": [
4,
4.18,
4.01
]
}
]
}
PACKAGES
library(httr)
library(XML)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(xml2)
library(rvest)
HTMLurl <- "https://raw.githubusercontent.com/Zee-Mo/Data607/main/Assignment7_Data607_Practice_html"
HTMLTABLE = read_html(HTMLurl) %>% html_table(fill = TRUE)
HTMLTABLE = as.data.frame(HTMLTABLE, optional = TRUE)
HTMLTABLE
## Book Number Title 1ST Author 2ND Author 3RD Author
## 1 1 2 3 NA NA
## 2 Weaverbird Merry Sexy Christmas Sula NA NA
## 3 Ayọ̀bámi Adébáyọ̀ Beverly Jenkins Toni Morrison NA NA
## 4 Ayodele Arigbabu Kayla Perrin NA NA
## 5 Unoma Nguemo Azuah Maureen Smith NA NA
## Subject(s) Publication year Goodreads rating
## 1 NA NA NA
## 2 NA NA NA
## 3 NA NA NA
## 4 NA NA NA
## 5 NA NA NA
Instead of running down vertically, the column values came out horizontally.
Change it up: read in practice run 2
HTMLurl2 <- "https://raw.githubusercontent.com/Zee-Mo/Data607/main/Assignment7_Praticerun2"
HTMLTABLE2 = read_html(HTMLurl2) %>% html_table(fill = TRUE)
HTMLTABLE2 = as.data.frame(HTMLTABLE2, optional = TRUE)
HTMLTABLE2
## Book Number Title 1ST Author 2ND Author
## 1 1 Weaverbird Ayọ̀bámi Adébáyọ̀ Ayodele Arigbabu
## 2 2 Merry Sexy Christmas Beverly Jenkins Kayla Perrin
## 3 3 Sula Toni Morrison
## 3RD Author Subject(s) Publication year Goodreads rating
## 1 Unoma Nguemo Azuah NA NA NA
## 2 Maureen Smith NA NA NA
## 3 NA NA NA
theurl <- "https://raw.githubusercontent.com/Zee-Mo/Data607/main/Assign7_HTML_Script"
htmlbox = read_html(theurl) %>% html_table(fill = TRUE)
htmlbox = as.data.frame(htmlbox, optional = TRUE)
htmlbox
## Book Number Title 1ST Author 2ND Author
## 1 1 Weaverbird Ayọ̀bámi Adébáyọ̀ Ayodele Arigbabu
## 2 2 Merry Sexy Christmas Beverly Jenkins Kayla Perrin
## 3 3 Sula Toni Morrison
## 3RD Author Subject(s)
## 1 Unoma Nguemo Azuah Nigeria,Fiction,Anthologies,Challenging,Reflective
## 2 Maureen Smith Romance,African American Romance, Anthologies,Holiday
## 3 Fiction,Classic,Historical Fiction, African American
## Publication year Goodreads rating
## 1 2013 4.00
## 2 2012 4.18
## 3 1973 4.01
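The read step above can be sketched without the GitHub URL. This is a minimal sketch, assuming a recent rvest (1.0 or later, for `html_element`); `html_txt` and `books_html` are hypothetical names, and the inline table is a trimmed stand-in for the real file. It selects the single table node directly, so `html_table` returns one table rather than a list of tables.

```r
library(rvest)

# Hypothetical inline stand-in for the HTML file (three columns, two books)
html_txt <- '<table>
  <tr><th>Book Number</th><th>Title</th><th>Publication year</th></tr>
  <tr><td>1</td><td>Weaverbird</td><td>2013</td></tr>
  <tr><td>3</td><td>Sula</td><td>1973</td></tr>
</table>'

# read_html() accepts literal markup as well as a path or URL
page <- read_html(html_txt)

# Pick the first <table> node, convert it, and drop the tibble class
books_html <- as.data.frame(html_table(html_element(page, "table")))

# Inspect the column types that html_table() inferred
str(books_html)
```

Each `<tr>` becomes one row and each `<th>` in the first row becomes a column name, so a file that puts one book per `<tr>` should load one book per row.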
Load and read in the XML script
xml.url= "https://raw.githubusercontent.com/Zee-Mo/Data607/main/Assign7_XML_script"
xml.url = GET(xml.url)
xml.url
## Response [https://raw.githubusercontent.com/Zee-Mo/Data607/main/Assign7_XML_script]
## Date: 2023-10-16 03:31
## Status: 200
## Content-Type: text/plain; charset=utf-8
## Size: 1.26 kB
## <?xml version="1.0" encoding="UTF-8"?>
##
## <FavoriteBooks>
## <Book1>
## <BookNumber>1</BookNumber>
## <Title>Weaverbird</Title>
## <Author1>Ayọ̀bámi Adébáyọ̀</Author1>
## <Author2>Ayodele Arigbabu</Author2>
## <Author3>Unoma Nguemo Azuah</Author3>
## <Subjects>Nigeria,Fiction,Anthologies,Challenging,Reflective</Subjects>
## ...
I kept getting an error, so I broke the code in half.
I think I see the issue.
xml.url= "https://raw.githubusercontent.com/Zee-Mo/Data607/main/Assign7_XML_script"
xml.url = GET(xml.url)
xml.url
## Response [https://raw.githubusercontent.com/Zee-Mo/Data607/main/Assign7_XML_script]
## Date: 2023-10-16 03:31
## Status: 200
## Content-Type: text/plain; charset=utf-8
## Size: 1.26 kB
## <?xml version="1.0" encoding="UTF-8"?>
##
## <FavoriteBooks>
## <Book1>
## <BookNumber>1</BookNumber>
## <Title>Weaverbird</Title>
## <Author1>Ayọ̀bámi Adébáyọ̀</Author1>
## <Author2>Ayodele Arigbabu</Author2>
## <Author3>Unoma Nguemo Azuah</Author3>
## <Subjects>Nigeria,Fiction,Anthologies,Challenging,Reflective</Subjects>
## ...
xml.table= xmlParse(xml.url, useInternal=TRUE )
xml.table = xmlToDataFrame(xml.table)
xml.table
## BookNumber Title Author1 Author2
## 1 1 Weaverbird Ayọ̀bámi Adébáyọ̀ Ayodele Arigbabu
## 2 2 Merry Sexy Christmas Beverly Jenkins Kayla Perrin
## 3 3 Sula Toni Morrison
## Author3 Subjects
## 1 Unoma Nguemo Azuah Nigeria,Fiction,Anthologies,Challenging,Reflective
## 2 Maureen Smith Romance,African American Romance, Anthologies,Holiday
## 3 Fiction,Classic,Historical Fiction, African American
## PublicationYR Goodreads_rating
## 1 2013 4.00
## 2 2012 4.18
## 3 1973 4.01
#install.packages("rjson")
library(rjson)
json_url= "https://raw.githubusercontent.com/Zee-Mo/Data607/main/myJsonfileforWeek7"
json_df=fromJSON(file = json_url)
json_df=json_df[['library']]
json_df = as.data.frame(json_df, optional = TRUE)
json_df
## Book Number Title Author1 Author2 Author3
## 1 1 Weaverbird Ayọ̀bámi Adébáyọ̀ Ayodele Arigbabu Unoma Nguemo Azuah
## Subjects PublicationYR
## 1 Nigeria,Fiction,Anthologies,Challenging,Reflective 2013
## Goodreads_rating Book Number Title Author1
## 1 4 2 Merry Sexy Christmas Beverly Jenkins
## Author2 Author3
## 1 Ayodele Arigbabu Kayla Perrin
## Subjects PublicationYR
## 1 Romance,African American Romance, Anthologies,Holiday 2012
## Goodreads_rating Book Number Title Author1 Author2 Author3
## 1 4.18 3 Sula Toni Morrison
## Subjects PublicationYR
## 1 Fiction,Classic,Historical Fiction, African American 1973
## Goodreads_rating
## 1 4.01
#Change the format of my code
json_url2= "https://raw.githubusercontent.com/Zee-Mo/Data607/main/jsom_script_try2"
json_df2=fromJSON(file = json_url2)
json_df2=json_df2[['library']]
json_df2 = as.data.frame(json_df2, optional = TRUE)
json_df2
## Book Number Title Author1 Author2
## 1 1 Weaverbird Ayọ̀bámi Adébáyọ̀ Ayodele Arigbabu
## 2 2 Merry Sexy Christmas Beverly Jenkins Kayla Perrin
## 3 3 Sula Toni Morrison
## Author3 Subjects
## 1 Unoma Nguemo Azuah Nigeria,Fiction,Anthologies,Challenging,Reflective
## 2 Maureen Smith Romance,African American Romance, Anthologies,Holiday
## 3 Fiction,Classic,Historical Fiction, African American
## PublicationYR Goodreads_rating
## 1 2013 4.00
## 2 2012 4.18
## 3 1973 4.01
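The first JSON layout (one object per book) came out as a single wide row because `as.data.frame` on a list of three records binds the records side by side as columns. As a possible alternative to restructuring the file, each record can be converted to a one-row data frame and the rows stacked. This is only a sketch: `json_txt`, `records`, and `books_json` are hypothetical names, and the inline JSON is a trimmed stand-in for the real file.

```r
library(rjson)

# Hypothetical inline stand-in for the record-oriented file (two fields only)
json_txt <- '{"library": [
  {"Book Number": 1, "Title": "Weaverbird"},
  {"Book Number": 2, "Title": "Merry Sexy Christmas"},
  {"Book Number": 3, "Title": "Sula"}
]}'

# A list with one named list per book
records <- fromJSON(json_str = json_txt)[["library"]]

# Convert each record to a one-row data frame; optional = TRUE keeps
# names such as "Book Number" unchanged
rows <- lapply(records, function(rec) {
  as.data.frame(rec, optional = TRUE, stringsAsFactors = FALSE)
})

# Stack the one-row data frames into a single table, one row per book
books_json <- do.call(rbind, rows)
books_json
```

This assumes every record carries the same fields in the same order, as in the file above; `rbind` would fail on records with differing field names.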
library(stringr)
CONCLUSION
Are the three data frames identical? Not exactly. The layout and appearance are similar; the differences are in the column types. In the JSON data frame, Book Number, PublicationYR, and Goodreads_rating are all doubles. In the HTML data frame, Book Number and Publication year are integers, while Goodreads rating is a double. The XML data frame is completely different: BookNumber, PublicationYR, and Goodreads_rating are all read in as character strings.
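The type differences can be checked, and the XML strings converted, with base R alone. The sketch below uses small hypothetical stand-ins for the three data frames (`html_df`, `xml_df`, `json_df`, two columns each, with column names made uniform for the comparison); the real frames also differ slightly in column names, such as "Book Number" versus "BookNumber".

```r
# Hypothetical stand-ins mirroring the column types reported above
html_df <- data.frame(PublicationYR = c(2013L, 2012L, 1973L),
                      Rating = c(4.00, 4.18, 4.01))
xml_df  <- data.frame(PublicationYR = c("2013", "2012", "1973"),
                      Rating = c("4.00", "4.18", "4.01"),
                      stringsAsFactors = FALSE)
json_df <- data.frame(PublicationYR = c(2013, 2012, 1973),
                      Rating = c(4, 4.18, 4.01))

# Column classes side by side: rows are columns, columns are sources
classes <- sapply(list(html = html_df, xml = xml_df, json = json_df),
                  function(d) sapply(d, class))
print(classes)

# Let R infer natural types for the XML strings (as.is = TRUE avoids factors)
xml_fixed <- type.convert(xml_df, as.is = TRUE)

# all.equal() compares values with numeric tolerance;
# identical() also requires matching storage types
print(all.equal(html_df, xml_fixed))
print(identical(html_df, json_df))
```

After conversion the XML values should match the HTML ones, while the JSON frame still differs from the HTML frame under `identical` because its year column is stored as double rather than integer.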
As for the handwritten scripts, HTML and XML are similar because both are organized as a hierarchy: a root element with child elements nested under it. The JSON file that worked was organized strictly by category. I did not group the information by book but by field, such as "Author1", "Subjects", and "PublicationYR".