Pick three of your favorite books on one of your favorite subjects. At least one of the books should have more than one author. For each book, include the title, authors, and two or three other attributes that you find interesting. Take the information that you’ve selected about these three books, and separately create three files which store the book’s information in HTML (using an html table), XML, and JSON formats (e.g. “books.html”, “books.xml”, and “books.json”). To help you better understand the different file structures, I’d prefer that you create each of these files “by hand” unless you’re already very comfortable with the file formats. Write R code, using your packages of choice, to load the information from each of the three sources into separate R data frames. Are the three data frames identical? Your deliverable is the three source files and the R code. If you can, package your assignment solution up into an .Rmd file and publish to rpubs.com. [This will also require finding a way to make your three text files accessible from the web].
library(XML)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(rvest)
library(R2HTML)
library(jsonlite)
library(RCurl)
HTML stands for HyperText Markup Language and is mainly used to structure web pages. Its table element holds data in rows and columns, much like a data frame.
custom_html <- '<!DOCTYPE html>
<html>
<head>
<title>Book Table</title>
<style>
tr.spaceUnder>td {
padding-bottom: 1em;
text-align:center;
}
</style>
</head>
<body>
<table>
<thead>
<tr class="spaceUnder">
<th>Title</th>
<th>Author</th>
<th>Published</th>
<th>Detail</th>
</tr>
</thead>
<tbody>
<tr class="spaceUnder">
<td>Young Goodman Brown</td>
<td>Nathaniel Hawthorne</td>
<td>1835</td>
<td>"Young Goodman Brown" is a short story published in 1835 by American writer Nathaniel Hawthorne. The story takes place in 17th-century Puritan New England, a common setting for Hawthorne’s works.</td>
</tr>
<tr class="spaceUnder">
<td>Brain Rules</td>
<td>John Medina</td>
<td>February 2008</td>
<td>Most of us have no idea what’s really going on inside our heads. Yet brain scientists have uncovered details every business leader, parent, and teacher should know—like the need for physical activity to get your brain working its best. How do we learn? What exactly do sleep and stress do to our brains? Why is multi-tasking a myth? Why is it so easy to forget—and so important to repeat new knowledge? Is it true that men and women have different brains? In Brain Rules, Dr. John Medina, a molecular biologist, shares his lifelong interest in how the brain sciences might influence the way we teach our children and the way we work. In each chapter, he describes a brain rule—what scientists know for sure about how our brains work—and then offers transformative ideas for our daily lives. Medina’s fascinating stories and infectious sense of humor breathe life into brain science. You’ll learn why Michael Jordan was no good at baseball. You’ll peer over a surgeon’s shoulder as he proves that most of us have a Jennifer Aniston neuron. You’ll meet a boy who has an amazing memory for music but can’t tie his own shoes.</td>
</tr>
<tr class="spaceUnder">
<td>Python for Finance</td>
<td>Yves Hilpisch</td>
<td>December 2018</td>
<td>The financial industry has recently adopted Python at a tremendous rate, with some of the largest investment banks and hedge funds using it to build core trading and risk management systems. Updated for Python 3, the second edition of this hands-on book helps you get started with the language, guiding developers and quantitative analysts through Python libraries and tools for building financial applications and interactive financial analytics. Using practical examples throughout the book, author Yves Hilpisch also shows you how to develop a full-fledged framework for Monte Carlo simulation-based derivatives and risk analytics, based on a large, realistic case study. Much of the book uses interactive IPython Notebooks.</td>
</tr>
<tr class="spaceUnder">
<td>Victim F: From Crime Victims to Suspects to Survivors</td>
<td>Denise Huskins, Aaron Quinn, Nicole Weisensee Egan</td>
<td>June 2021</td>
<td>The shocking true story of a bizarre kidnapping and the victims’ re-victimization by the justice system. In March 2015, Denise Huskins and her boyfriend Aaron Quinn awoke from a sound sleep into a nightmare. Armed men bound and drugged them, then abducted Denise. Warned not to call the police or Denise would be killed, Aaron agonized about what to do. Finally he put his trust in law enforcement and dialed 911. But instead of searching for Denise, the police accused Aaron of her murder. His story, they told him, was just unbelievable. When Denise was released alive, the police turned their fire on her, dubbing her the "real-life Gone Girl" who had faked her own kidnapping. In Victim F, Aaron and Denise recount the horrific ordeal that almost cost them everything. Like too many victims of sexual violence, they were dismissed, disbelieved, and dragged through the mud. With no one to rely on except each other, they took on the victim blaming, harassment, misogyny, and abuse of power running rife in the criminal justice system. Their story is, in the end, a love story, but one that sheds necessary light on sexual assault and the abuse by law enforcement that all too frequently compounds crime victims’ suffering.</td>
</tr>
</tbody>
</table>
</body>
</html>'
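# Write the hand-built HTML to a local file.
writeLines(text = custom_html, con = "./html_table.html")
# Load the web-hosted file and collect its <table> nodes.
doc <- read_html("https://raw.githubusercontent.com/uriahman/607/main/data/Book.html",
encoding = "UTF-8")
doc_tables <- html_nodes(doc, "table")
# html_table() returns one data frame per table node; keep the first.
doc_html <- html_table(doc_tables, header = TRUE)
html_df <- doc_html[[1]]
html_df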
## # A tibble: 4 × 4
## Title Author Published Detail
## <chr> <chr> <chr> <chr>
## 1 Young Goodman Brown Nathan… 1835 "Youn…
## 2 Brain Rules John M… February… "Most…
## 3 Python for Finance Yves H… December… "The …
## 4 Victim F: From Crime Victims to Suspects to Survivors Denise… June 2021 "The …
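The mapping from an HTML table to a data frame is easier to see on a smaller scale. The sketch below is only an illustration: the two-row table and its placeholder titles and authors are made up, and it parses an inline string instead of the hosted file so that it runs without a network connection.

```r
library(rvest)

# A tiny, made-up table: the <th> cells become the column names
# and every later <tr> becomes one row.
page <- read_html('<table>
<tr><th>Title</th><th>Author</th></tr>
<tr><td>First Book</td><td>Author One</td></tr>
<tr><td>Second Book</td><td>Author Two, Author Three</td></tr>
</table>')

# Select the single table node and convert it to a data frame.
books <- html_table(html_node(page, "table"), header = TRUE)
books
```

Several authors stored in one cell come back as a single string, which is the same behavior seen for the fourth book above.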
XML stands for Extensible Markup Language. It looks very similar to HTML, but where HTML restricts authors to a fixed set of tag types, XML is more flexible.
For example, a table in HTML must use the predefined table tag, whereas an XML file can name its elements whatever suits the data.
XML also allows new elements to be declared formally through a document type definition (DTD), which specifies the document type and the meaning of the tags used in it. In short, each XML file can have its own tag names, which is something HTML does not allow.
custom_xml <- '<?xml version="1.0" encoding="UTF-8"?>
<books>
<book>
<title lang="en">Young Goodman Brown</title>
<author>Nathaniel Hawthorne</author>
<published>1835</published>
<detail>"Young Goodman Brown" is a short story published in 1835 by American writer Nathaniel Hawthorne. The story takes place in 17th-century Puritan New England, a common setting for Hawthorne’s works.</detail>
</book>
<book>
<title lang="en">Brain Rules</title>
<author>John Medina</author>
<published>February 2008</published>
<detail>Most of us have no idea what’s really going on inside our heads. Yet brain scientists have uncovered details every business leader, parent, and teacher should know—like the need for physical activity to get your brain working its best. How do we learn? What exactly do sleep and stress do to our brains? Why is multi-tasking a myth? Why is it so easy to forget—and so important to repeat new knowledge? Is it true that men and women have different brains? In Brain Rules, Dr. John Medina, a molecular biologist, shares his lifelong interest in how the brain sciences might influence the way we teach our children and the way we work. In each chapter, he describes a brain rule—what scientists know for sure about how our brains work—and then offers transformative ideas for our daily lives. Medina’s fascinating stories and infectious sense of humor breathe life into brain science. You’ll learn why Michael Jordan was no good at baseball. You’ll peer over a surgeon’s shoulder as he proves that most of us have a Jennifer Aniston neuron. You’ll meet a boy who has an amazing memory for music but can’t tie his own shoes.</detail>
</book>
<book>
<title lang="en">Python for Finance</title>
<author>Yves Hilpisch</author>
<published>December 2018</published>
<detail>The financial industry has recently adopted Python at a tremendous rate, with some of the largest investment banks and hedge funds using it to build core trading and risk management systems. Updated for Python 3, the second edition of this hands-on book helps you get started with the language, guiding developers and quantitative analysts through Python libraries and tools for building financial applications and interactive financial analytics. Using practical examples throughout the book, author Yves Hilpisch also shows you how to develop a full-fledged framework for Monte Carlo simulation-based derivatives and risk analytics, based on a large, realistic case study. Much of the book uses interactive IPython Notebooks.</detail>
</book>
<book>
<title lang="en">Victim F: From Crime Victims to Suspects to Survivors</title>
<author>Denise Huskins, Aaron Quinn, Nicole Weisensee Egan</author>
<published>June 2021</published>
<detail>The shocking true story of a bizarre kidnapping and the victims’ re-victimization by the justice system. In March 2015, Denise Huskins and her boyfriend Aaron Quinn awoke from a sound sleep into a nightmare. Armed men bound and drugged them, then abducted Denise. Warned not to call the police or Denise would be killed, Aaron agonized about what to do. Finally he put his trust in law enforcement and dialed 911. But instead of searching for Denise, the police accused Aaron of her murder. His story, they told him, was just unbelievable. When Denise was released alive, the police turned their fire on her, dubbing her the "real-life Gone Girl" who had faked her own kidnapping. In Victim F, Aaron and Denise recount the horrific ordeal that almost cost them everything. Like too many victims of sexual violence, they were dismissed, disbelieved, and dragged through the mud. With no one to rely on except each other, they took on the victim blaming, harassment, misogyny, and abuse of power running rife in the criminal justice system. Their story is, in the end, a love story, but one that sheds necessary light on sexual assault and the abuse by law enforcement that all too frequently compounds crime victims’ suffering.</detail>
</book>
</books>'
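# Write the hand-built XML to a local file.
writeLines(text = custom_xml, con = "./xml_table.xml")
# Download the web-hosted file as text, then parse that text as XML.
xml_link <- getURL('https://raw.githubusercontent.com/uriahman/607/main/data/Books.xml')
docx <- xmlParse(xml_link, asText = TRUE)
docx_root <- xmlRoot(docx)
# Each <book> child of the root becomes one row of the data frame.
doc_xml <- xmlToDataFrame(docx_root)
doc_xml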
## title
## 1 Young Goodman Brown
## 2 Brain Rules
## 3 Python for Finance
## 4 Victim F: From Crime Victims to Suspects to Survivors
## author published
## 1 Nathaniel Hawethorn 1835
## 2 John Medina February 2008
## 3 Yves Hilpsich December 2018
## 4 Denise Huskins, Aaron Quinn, Nicole Weisensee Egan June 2021
## detail
## 1 "Young Goodman Brown" is a short story published in 1835 by American writer Nathaniel Hawthorne. The story takes place in 17th-century Puritan New England, a common setting for Hawthornes works.'
## 2 Most of us have no idea what’s really goingon inside our heads. Yet brain scientists have uncovered details every business leader, parent, and teacher should know—like the need for physical activity to get your brain working its best. How do we learn? What exactly do sleep and stress do to our brains? Why is multi-tasking a myth? Why is it so easy to forget—and so important to repeat new knowledge? Is it true that men and women have different brains? In Brain Rules, Dr. John Medina, a molecular biologist, shares his lifelong interest in how the brain sciences might influence the way we teach our children and the way we work. In each chapter, he describes a brain rule—what scientists know for sure about how our brains work—and then offers transformative ideas for our daily lives. Medina’s fascinating stories and infectious sense of humor breathe life into brain science. You’ll learn why Michael Jordan was no good at baseball. You’ll peer over a surgeon’s shoulder as he proves that most of us have a Jennifer Aniston neuron. You’ll meet a boy who has an amazing memory for music but can’t tie his own shoes.
## 3 The financial industry has recently adopted Python at a tremendous rate, with some of the largest investment banks and hedge funds using it to build core trading and risk management systems. Updated for Python 3, the second edition of this hands-on book helps you get started with the language, guiding developers and quantitative analysts through Python libraries and tools for building financial applications and interactive financial analytics. Using practical examples throughout the book, author Yves Hilpisch also shows you how to develop a full-fledged framework for Monte Carlo simulation-based derivatives and risk analytics, based on a large, realistic case study. Much of the book uses interactive IPython Notebooks.
## 4 The shocking true story of a bizarre kidnapping and the victims re-victimization by the justice system. In March 2015, Denise Huskins and her boyfriend Aaron Quinn awoke from a sound sleep into a nightmare. Armed men bound and drugged them, then abducted Denise. Warned not to call the police or Denise would be killed. Aaron agonized about what to do. Finally he put his trust in law enforcement and dialed 911. But instead ofsearching for Denise, the police accused Aaron of her murder. His story, they told him, was just unbelievable. When Denise was released alive, the police turned their fire on her, dubbing her the "real-life Gone Girl" who had faked her own kidnapping. In Victim F , Aaron and Denise recount the horrific ordeal that almost cost them everything. Like too many victims of sexual violence, they were dismissed, disbelieved, and dragged through the mud. With no one to rely on except each other, they took on the victim blaming, harassment, misogyny, and abuse of power running rife in the criminal justice system. Their story is, in the end, a love story, but one that sheds necessary light on sexual assault and the abuse by law enforcement that all too frequently compounds crime victims suffering.
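One thing the data frame above does not show is the lang attribute on the title tags, because xmlToDataFrame() only keeps the text inside each element. The sketch below shows one possible way to recover an attribute with an XPath query. The two placeholder books are made up for illustration, and adding a lang column is my own assumption about how the attribute might be used, not part of the assignment.

```r
library(XML)

# A tiny, made-up document with custom tag names; only the first title has a lang attribute.
xml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<books>
<book>
<title lang="en">First Book</title>
<author>Author One</author>
</book>
<book>
<title>Second Book</title>
<author>Author Two, Author Three</author>
</book>
</books>'

doc <- xmlParse(xml_text, asText = TRUE)

# Element text becomes the columns; attributes are not included.
books <- xmlToDataFrame(doc, stringsAsFactors = FALSE)

# Pull the lang attribute from every title, using NA where it is missing.
langs <- xpathSApply(doc, "//book/title", xmlGetAttr, "lang", NA_character_)
books$lang <- langs
books
```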
JSON stands for JavaScript Object Notation and is meant to represent structured data. It is almost exactly like a Python dictionary, with a few minor differences, and it is a very common way of structuring and distributing data. In my opinion JSON is a little more complicated than both HTML and XML, but it is the least time-consuming of the three to write.
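To make the dictionary comparison concrete, here is a minimal sketch of how a JSON array of book records turns into a data frame. The records and field names are made-up placeholders and are not the contents of the hosted Books.json. Storing the authors as a JSON array is also an assumption of mine, shown here because it is a natural fit for a book with more than one author.

```r
library(jsonlite)

# Each {...} object is one book; "author" is an array so it can hold several names.
json_text <- '[
  {"title": "First Book", "author": ["Author One"], "published": "1835"},
  {"title": "Second Book", "author": ["Author Two", "Author Three"], "published": "June 2021"}
]'

# fromJSON() turns the array of objects into a data frame with one row per book.
books <- fromJSON(json_text)

# Arrays of different lengths arrive as a list column; collapse each one to a single string.
books$author_flat <- vapply(books$author, paste, character(1), collapse = ", ")
books$author_flat
```

Collapsing the list column gives the same comma-separated author string that the HTML and XML versions store directly.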
library(jsonlite)
# fromJSON() simplifies JSON records into a data frame where it can.
json_doc <- fromJSON('https://raw.githubusercontent.com/uriahman/607/main/data/Books.json',
simplifyDataFrame = TRUE)
jdoc <- as.data.frame(json_doc)
jdoc
## title
## 1 Young Goodman Brown
## 2 Brain Rules
## 3 Python for Finance
## 4 Victim F: From Crime Victims to Suspects to Survivors
## author published
## 1 Nathaniel Hawethorn 1835
## 2 John Medina February 2008
## 3 Yves Hilpsich December 2018
## 4 Denise Huskins,Aaron Quinn, Nicole Weisensee Egan June 2021
## details
## 1 "Young Goodman Brown" is a short story published in 1835 by American writer Nathaniel Hawthorne. The story takes place in 17th-century Puritan New England, a common setting for Hawthornes works.
## 2 Most of us have no idea what’s really goingon inside our heads. Yet brain scientists have uncovered details every business leader, parent, and teacher should know—like the need for physical activity to get your brain working its best. How do we learn? What exactly do sleep and stress do to our brains? Why is multi-tasking a myth? Why is it so easy to forget—and so important to repeat new knowledge? Is it true that men and women have different brains? In Brain Rules, Dr. John Medina, a molecular biologist, shares his lifelong interest in how the brain sciences might influence the way we teach our children and the way we work. In each chapter, he describes a brain rule—what scientists know for sure about how our brains work—and then offers transformative ideas for our daily lives. Medina’s fascinating stories and infectious sense of humor breathe life into brain science. You’ll learn why Michael Jordan was no good at baseball. You’ll peer over a surgeon’s shoulder as he proves that most of us have a Jennifer Aniston neuron. You’ll meet a boy who has an amazing memory for music but can’t tie his own shoes.
## 3 The financial industry has recently adopted Python at a tremendous rate, with some of the largest investment banks and hedge funds using it to build core trading and risk management systems. Updated for Python 3, the second edition of this hands-on book helps you get started with the language, guiding developers and quantitative analysts through Python libraries and tools for building financial applications and interactive financial analytics. Using practical examples throughout the book, author Yves Hilpisch also shows you how to develop a full-fledged framework for Monte Carlo simulation-based derivatives and risk analytics, based on a large, realistic case study. Much of the book uses interactive IPython Notebooks.
## 4 The shocking true story of a bizarre kidnapping and the victims re-victimization by the justice system. In March 2015, Denise Huskins and her boyfriend Aaron Quinn awoke from a sound sleep into a nightmare. Armed men bound and drugged them, then abducted Denise. Warned not to call the police or Denise would be killed. Aaron agonized about what to do. Finally he put his trust in law enforcement and dialed 911. But instead ofsearching for Denise, the police accused Aaron of her murder. His story, they told him, was just unbelievable. When Denise was released alive, the police turned their fire on her, dubbing her the "real-life Gone Girl" who had faked her own kidnapping. In Victim F , Aaron and Denise recount the horrific ordeal that almost cost them everything. Like too many victims of sexual violence, they were dismissed, disbelieved, and dragged through the mud. With no one to rely on except each other, they took on the victim blaming, harassment, misogyny, and abuse of power running rife in the criminal justice system. Their story is, in the end, a love story, but one that sheds necessary light on sexual assault and the abuse by law enforcement that all too frequently compounds crime victims suffering.
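All three formats load into R as the same four-column table of book data, although the resulting objects are not strictly identical: the HTML table arrives as a tibble with capitalized column names, while the XML and JSON files arrive as base data frames whose last column is named detail and details respectively. The only real differences between the formats are on the authoring side, when the files are created by hand. Each one has its own time and place for being used.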
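The question of whether the data frames are identical can also be checked in code. The sketch below is one possible approach, not part of the original analysis: normalize_books() is a hypothetical helper, and the two small data frames are made-up stand-ins that only mimic the column-name differences seen above.

```r
# Made-up stand-ins: same content, different column names.
html_like <- data.frame(Title = c("A", "B"), Detail = c("x", "y"),
                        stringsAsFactors = FALSE)
json_like <- data.frame(title = c("A", "B"), details = c("x", "y"),
                        stringsAsFactors = FALSE)

# Hypothetical helper: plain data frame, lowercase names,
# "details" renamed to "detail", and every column trimmed text.
normalize_books <- function(df) {
  df <- as.data.frame(df, stringsAsFactors = FALSE)
  names(df) <- tolower(names(df))
  names(df)[names(df) == "details"] <- "detail"
  df[] <- lapply(df, function(col) trimws(as.character(col)))
  df
}

raw_same <- identical(html_like, json_like)
clean_same <- identical(normalize_books(html_like), normalize_books(json_like))
c(raw = raw_same, normalized = clean_same)
```

The same helper could be applied to html_df, doc_xml, and jdoc. Any remaining mismatch would then point to differences in the text itself, such as spacing in the author list.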