Assignment week7

Eunkyu Hahm

10/11/2019

Loading packages

First, let’s load the necessary packages. RCurl provides getURL(), the XML package is used to parse XML and HTML files, and jsonlite is used to parse JSON files.
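The loading chunk itself is not shown in this report, so the following is only a sketch of what it might look like. The package list is inferred from the attach messages printed in this section; the order is an assumption.

```r
# Packages inferred from the attach messages in this report (order assumed).
library(RCurl)     # getURL(); loading it also attaches bitops
library(XML)       # readHTMLTable(), xmlParse(), getNodeSet()
library(jsonlite)  # fromJSON()
library(tidyr)
library(dplyr)
```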

## Loading required package: bitops
## 
## Attaching package: 'tidyr'
## The following object is masked from 'package:RCurl':
## 
##     complete
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

HTML

Let’s read the HTML file from GitHub using the getURL() function and then read the HTML table. Since the class of html_book is a list, I used the data.frame() function to convert it to a data frame.
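A minimal sketch of this step follows. The GitHub URL is not shown in the report, so an inline string stands in for the text that getURL() would return; its markup is an assumption modeled on the printed table. The name html_df is hypothetical.

```r
library(XML)

# Stand-in for the string returned by getURL(); the real URL is not shown,
# and this markup is an assumption based on the printed table.
html_text <- paste0(
  "<table>",
  "<tr><th>Title</th><th>Author</th><th>ISBN</th><th>Genre</th></tr>",
  "<tr><td>Harry potter and the philosopher's stone</td>",
  "<td>J. K. Rowling</td><td>0747532699</td><td>Fantasy</td></tr>",
  "<tr><td>Hitchhiker's guide to the galaxy book</td>",
  "<td>Douglas Adams</td><td>0330258648</td><td>Comic science fiction</td></tr>",
  "<tr><td>Good Omens: The Nice and Accurate Prophecies of Agnes Nutter, Witch</td>",
  "<td>Terry Pratchetth,Neil Gaiman</td><td>057504800X</td><td>Horror</td></tr>",
  "</table>"
)

# Parse the text as HTML, then extract every <table> it contains.
doc <- htmlParse(html_text, asText = TRUE)
html_book <- readHTMLTable(doc, stringsAsFactors = FALSE)

class(html_book)           # a list with one element per table

# Take the first (and only) table out of the list as a data frame.
html_df <- html_book[[1]]  # hypothetical name
```

Extracting the element with [[1]] is a swapped-in alternative to calling data.frame() on the list; it leaves the column names exactly as they appear in the table header.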

## $`NULL`
##                                                                 Title
## 1                            Harry potter and the philosopher's stone
## 2                               Hitchhiker's guide to the galaxy book
## 3 Good Omens: The Nice and Accurate Prophecies of Agnes Nutter, Witch
##                         Author       ISBN                 Genre
## 1                J. K. Rowling 0747532699               Fantasy
## 2                Douglas Adams 0330258648 Comic science fiction
## 3 Terry Pratchetth,Neil Gaiman 057504800X                Horror
## [1] "list"

Now let’s take a look at the HTML data table.

XML

Let’s get the XML file from GitHub using the getURL() function and then parse it with xmlParse(). I used the getNodeSet() function to find the nodes in the XML tree that match each field, and then converted the result to a data frame. Lastly, I set the column names for each field.
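The steps above can be sketched as follows. The structure of the XML file is not shown in the report, so the element names (books, book, title, author, isbn, genre) are assumptions, and the inline string stands in for the text returned by getURL(). The helper node_text() is hypothetical.

```r
library(XML)

# Stand-in for the string returned by getURL(); element names are assumed.
xml_text <- paste0(
  "<books>",
  "<book><title>Harry potter and the philosopher's stone</title>",
  "<author>J. K. Rowling</author><isbn>0747532699</isbn>",
  "<genre>Fantasy</genre></book>",
  "<book><title>Hitchhiker's guide to the galaxy book</title>",
  "<author>Douglas Adams</author><isbn>0330258648</isbn>",
  "<genre>Comic science fiction</genre></book>",
  "<book><title>Good Omens: The Nice and Accurate Prophecies of Agnes Nutter, Witch</title>",
  "<author>Terry Pratchetth,Neil Gaiman</author><isbn>057504800X</isbn>",
  "<genre>Horror</genre></book>",
  "</books>"
)

doc <- xmlParse(xml_text, asText = TRUE)

# Hypothetical helper: text of every node matching an XPath expression.
node_text <- function(xpath) sapply(getNodeSet(doc, xpath), xmlValue)

# One column per field, then set the column names.
xml_df <- data.frame(
  node_text("//book/title"),
  node_text("//book/author"),
  node_text("//book/isbn"),
  node_text("//book/genre"),
  stringsAsFactors = FALSE
)
names(xml_df) <- c("Title", "Author", "ISBN", "Genre")
```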

Now let’s take a look at the XML data table.

JSON

Let’s read the JSON file from GitHub using the getURL() function and then convert it from JSON to an R object using fromJSON(). Since the class of json_book is a list, I used the data.frame() function to convert it to a data frame.
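A minimal sketch of this step, under stated assumptions: the layout of the JSON file is not shown in the report, so the top-level key "books" and the field names are guesses, and the inline string stands in for the text returned by getURL().

```r
library(jsonlite)

# Stand-in for the string returned by getURL(); the "books" key is assumed.
json_text <- '{"books": [
  {"Title": "Harry potter and the philosopher\'s stone",
   "Author": "J. K. Rowling", "ISBN": "0747532699", "Genre": "Fantasy"},
  {"Title": "Hitchhiker\'s guide to the galaxy book",
   "Author": "Douglas Adams", "ISBN": "0330258648",
   "Genre": "Comic science fiction"},
  {"Title": "Good Omens: The Nice and Accurate Prophecies of Agnes Nutter, Witch",
   "Author": "Terry Pratchetth,Neil Gaiman", "ISBN": "057504800X",
   "Genre": "Horror"}
]}'

# fromJSON() turns the top-level object into a list and simplifies the
# array of records into a data frame.
json_book <- fromJSON(json_text)
class(json_book)

# Pull the data frame out of the list.
json_df <- json_book$books
```

Keeping the ISBN values as JSON strings preserves their leading zeros.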

Now let’s take a look at the JSON data table, json_df.

Conclusion

The three data frames built from the HTML, XML, and JSON files look identical.
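One way to check this claim rather than compare by eye is sketched below. The helper same_table() is hypothetical, and the small data frames are stand-ins using two rows from the printed table, not the full objects built in this report.

```r
# Stand-in data frames; in the report these would come from the three files.
html_df <- data.frame(
  Title = c("Harry potter and the philosopher's stone",
            "Hitchhiker's guide to the galaxy book"),
  ISBN  = c("0747532699", "0330258648"),
  stringsAsFactors = FALSE
)
xml_df  <- html_df
json_df <- html_df[, c("ISBN", "Title")]  # same data, columns reordered

# Hypothetical helper: TRUE when both tables hold the same columns and
# values, regardless of column order.
same_table <- function(a, b) {
  setequal(names(a), names(b)) && isTRUE(all.equal(a, b[names(a)]))
}

same_table(html_df, xml_df)
same_table(html_df, json_df)
```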