Parse HTML file to dataframe
library(XML)
library(RJSONIO)
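# readHTMLTable() returns a list of dataframes, one per table found in the file
authorsparse = readHTMLTable("authors.html")
# Flatten that list into a single dataframe, then set the column names explicitly
authorsparsedf = data.frame(authorsparse)
colnames(authorsparsedf) = c("First_Name","Last_Name","Title","Genre","Published","Living_Status")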
print(authorsparsedf)
##   First_Name Last_Name                        Title                 Genre
## 1     Albert     Camus                 The Stranger Philosophical Fiction
## 2  Friedrich Nietzsche On the Genealogy of Morality            Philosophy
## 3    Stephen      King                 The Talisman               Fantasy
## 4      Peter    Straub                 The Talisman               Fantasy
##   Published Living_Status
## 1      1942          Dead
## 2      1887          Dead
## 3      1984         Alive
## 4      1984         Alive
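The explicit colnames() call above is needed because flattening a list of tables with data.frame() changes the column names. The sketch below reproduces that effect in base R only, so it runs without the XML package. The list `tables` is a hypothetical stand-in for what readHTMLTable() returns, assuming the HTML table has no id attribute and is therefore stored under the name "NULL".

```r
# Hypothetical stand-in for the list returned by readHTMLTable():
# one element per table, here named "NULL" (assumed: the table has no id).
tables <- list(
  `NULL` = data.frame(
    First_Name = c("Albert", "Friedrich"),
    Last_Name  = c("Camus", "Nietzsche"),
    stringsAsFactors = FALSE
  )
)

# Flattening the whole list prefixes each column with the list element's name.
flattened <- data.frame(tables)
print(names(flattened))

# Extracting the first element instead keeps the original column names.
first_table <- tables[[1]]
print(names(first_table))
```

Taking the first list element is one possible alternative to renaming the columns by hand; which is preferable depends on whether the table's header row already holds usable names.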
Parse XML file to dataframe
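# Parse the file into an XML document tree
xmlauthors = xmlParse("authors.xml")
# Get the top-level node of the document
xmlrootauthors = xmlRoot(xmlauthors)
# Build a dataframe with one row per child node of the root
authorsxmldf = xmlToDataFrame(xmlrootauthors)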
print(authorsxmldf)
##   First_Name Last_Name                        Title      Genre Published
## 1     Albert     Camus                 The Stranger    Fiction      1942
## 2  Friedrich Nietzsche On the Genealogy of Morality Philosophy      1887
## 3    Stephen      King                 The Talisman    Fantasy      1984
## 4      Peter    Straub                 The Talisman    Fantasy      1984
##   Living_Status
## 1          Dead
## 2          Dead
## 3         Alive
## 4         Alive
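The printed output does not show column types. Assuming the XML import leaves every column as text (no colClasses were supplied above), a numeric field such as Published would need converting before it can be sorted or compared as a number. A minimal base R sketch, using a hypothetical two-row stand-in named `authors_text` rather than the parsed file:

```r
# Hypothetical stand-in for an all-text dataframe (assumed import result).
authors_text <- data.frame(
  Last_Name = c("Camus", "Nietzsche"),
  Published = c("1942", "1887"),
  stringsAsFactors = FALSE
)

# Convert the year column from text to integer; other columns are untouched.
authors_typed <- authors_text
authors_typed$Published <- as.integer(authors_typed$Published)

str(authors_typed)
```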
Parse JSON file to dataframe
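# Read the JSON file into a list with one element per author record
parseauthorjson = fromJSON(content = "authors.json")
# Turn each record into a one-row dataframe, then stack the rows
authorjsondf = do.call("rbind",lapply(parseauthorjson,data.frame,stringsAsFactors=FALSE))
# Number the rows 1..n without hard-coding the row count
rownames(authorjsondf) = seq_len(nrow(authorjsondf))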
print(authorjsondf)
##   First_Name Last_Name                        Title      Genre Published
## 1     Albert     Camus                 The Stranger    Fiction      1942
## 2  Friedrich Nietzsche On the Genealogy of Morality Philosophy      1887
## 3    Stephen      King                 The Talisman    Fantasy      1984
## 4      Peter    Straub                 The Talisman    Fantasy      1984
##   Living_Status
## 1          Dead
## 2          Dead
## 3         Alive
## 4         Alive
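The do.call("rbind", lapply(...)) step is the part of the JSON import that does the reshaping. The sketch below shows the same pattern in base R on an in-memory list, so it runs without RJSONIO. The list `records` is a hypothetical stand-in for the parsed JSON, assuming each author arrives as a named list of fields.

```r
# Hypothetical stand-in for parsed JSON: one named list per author record.
records <- list(
  list(First_Name = "Albert",    Last_Name = "Camus",     Published = 1942),
  list(First_Name = "Friedrich", Last_Name = "Nietzsche", Published = 1887)
)

# Each record becomes a one-row dataframe.
rows <- lapply(records, data.frame, stringsAsFactors = FALSE)

# rbind() stacks the one-row dataframes into a single dataframe.
authors <- do.call(rbind, rows)

# Drop whatever row names rbind() produced and fall back to 1..n.
rownames(authors) <- NULL

print(authors)
```

This pattern relies on every record having the same field names; rbind() stops with an error when the column names differ between rows.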
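Conclusion: The three dataframes come out nearly identical, with minor differences in how rows and columns are named on import. One difference in content is visible in the output: the HTML source lists the genre of The Stranger as "Philosophical Fiction", while the XML and JSON sources list it as "Fiction".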
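The comparison can also be made in code rather than by reading the printed output. The following is a minimal base R sketch, not part of the original analysis. The three dataframes are hypothetical two-row stand-ins for the parsed results, their column types are assumptions, and `normalize` is a helper introduced here for illustration.

```r
# Hypothetical stand-ins for the three parsed dataframes (types are assumed).
html_df <- data.frame(
  First_Name = c("Albert", "Friedrich"),
  Genre      = c("Philosophical Fiction", "Philosophy"),
  Published  = c("1942", "1887"),
  stringsAsFactors = FALSE
)
xml_df <- data.frame(
  First_Name = c("Albert", "Friedrich"),
  Genre      = c("Fiction", "Philosophy"),
  Published  = c("1942", "1887"),
  stringsAsFactors = FALSE
)
json_df <- data.frame(
  First_Name = c("Albert", "Friedrich"),
  Genre      = c("Fiction", "Philosophy"),
  Published  = c(1942, 1887),
  stringsAsFactors = FALSE
)
rownames(json_df) <- c("a", "b")  # mimic row names that differ by source

# Illustrative helper: all columns as text, default row names.
normalize <- function(df) {
  df[] <- lapply(df, as.character)
  rownames(df) <- NULL
  df
}

frames <- lapply(list(html = html_df, xml = xml_df, json = json_df), normalize)

# TRUE when the two dataframes match after normalization.
xml_vs_json <- isTRUE(all.equal(frames$xml, frames$json))

# Row and column positions of the cells where HTML and XML disagree.
html_vs_xml <- which(frames$html != frames$xml, arr.ind = TRUE)

print(xml_vs_json)
print(html_vs_xml)
```

Normalizing to text first keeps type and row-name differences from masking the comparison, so only differences in the stored values are reported.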