I created the HTML file manually and read its table into a data frame.
myurl <- "https://raw.githubusercontent.com/BanuB/Week7Assignment/master/Books2.html"
html<-tbl_df(as.data.frame(read_html(myurl) %>% html_table(header = NA, trim=TRUE, fill=TRUE)))
html %>% kable() %>% kable_styling() %>% scroll_box(width = "910px")
| SNO | Book | Subject | Authors | Cost | ISBN.10 | PrintLength | Format.Type | Publisher |
|---|---|---|---|---|---|---|---|---|
| 1 | SQL Server 2016 Developer’s Guide: Build efficient database applications for your organization with SQL Server 2016 Database | Database | Dejan Sarka (Author), Milos Radivojevic (Author), William Durkin (Author) | 37.67 | 1786465345 | 616 | Paperback | Packt Publishing (March 22, 2017) |
| 2 | SQL Server 2016 Developer’s Guide: Build efficient database applications for your organization with SQL Server 2016 Database | Database | Dejan Sarka (Author), Milos Radivojevic (Author), William Durkin (Author) | 55.99 | 1786465345 | 616 | Hardcover | Packt Publishing (March 22, 2017) |
| 3 | North and South (Wordsworth Classics) 2nd Revised ed. Edition | Classic Literature | Dejan Sarka (Author), Milos Radivojevic (Author), William Durkin (Author) | 55.99 | 1786465345 | 616 | Paperback | Packt Publishing (March 22, 2017) |
| 4 | Fashion in Detail: World Dress (paperback) | Fashion | Rosemary Crill, Jennifer Wearden and Verity Wilson | 25.08 | 1851775684 | 224 | Paperback | V & A Publishing; 1st edition (April 1, 2009) |
| 5 | Pride and Prejudice | Classic Literature | Jane Auston | 9.45 | 1503290565 | 226 | Paperback | CreateSpace Independent Publishing Platform (November 6, 2018) |
Created and read in an XML file
myXML <- getURL("https://raw.githubusercontent.com/BanuB/Week7Assignment/master/BooksXML2.xml")
xmlData<-xmlToDataFrame(myXML)
xmlData %>% kable() %>% kable_styling() %>% scroll_box(width = "910px")
| SNO | BookTitle | Subject | Authors | Cost | ISBN-10 | PrintLength | FormatType | Publisher |
|---|---|---|---|---|---|---|---|---|
| 1 | SQL Server 2016 Developer’s Guide: Build efficient database applications for your organization with SQL Server 2016 | Database | Dejan Sarka (Author), Milos Radivojevic (Author), William Durkin (Author) | 37.7 | 1786465345 | 616 | Paperback | Packt Publishing (March 22, 2017) |
| 2 | SQL Server 2016 Developer’s Guide: Build efficient database applications for your organization with SQL Server 2016 | Database | Dejan Sarka (Author), Milos Radivojevic (Author), William Durkin (Author) | 56 | 1786465345 | 616 | Hardcover | Packt Publishing (March 22, 2017) |
| 3 | North and South (Wordsworth Classics) 2nd Revised ed. Edition | Classic Literature | Elizabeth Gaskell | 3.95 | 1853260932 | 448 | Paperback | Wordsworth Editions; 2nd Revised ed. edition (April 1, 1998) |
| 4 | Fashion in Detail: World Dress (paperback) | Fashion | Rosemary Crill, Jennifer Wearden and Verity Wilson | 25.1 | 1851775684 | 616 | Paperback | V & A Publishing; 1st edition (April 1, 2009) |
| 5 | Pride and Prejudice | Classic Literature | Jane Auston | 9.45 | 1503290565 | 226 | Paperback | CreateSpace Independent Publishing Platform (November 6, 2018) |
Reading in a JSON file
myjson<-fromJSON(content = "https://raw.githubusercontent.com/BanuB/Week7Assignment/master/Booksjson.JSON")
myjsontb<-do.call("cbind", lapply(myjson[[1]], data.frame, stringsAsFactors = FALSE))
s <- t(myjsontb)
s %>% kable() %>% kable_styling() %>% scroll_box(width = "910px")
|  | SNO | BookTitle | Subject | Authors | Cost | ISBN-10 | PrintLength | FormatType | Publisher |
|---|---|---|---|---|---|---|---|---|---|
| X..i.. | 1 | SQL Server 2016 Developer’s Guide: Build efficient database applications for your organization with SQL Server 2016 | Database | Dejan Sarka (Author), Milos Radivojevic (Author), William Durkin  (Author) | 37.7 | 1786465345 | 616 | Paperback | Packt Publishing (March 22, 2017) |
| X..i.. | 2 | SQL Server 2016 Developer’s Guide: Build efficient database applications for your organization with SQL Server 2016 | Database | Dejan Sarka (Author), Milos Radivojevic (Author), William Durkin  (Author) | 56 | 1786465345 | 616 | Paperback | Packt Publishing (March 22, 2017) |
| X..i.. | 3 | North and South (Wordsworth Classics) 2nd Revised ed. Edition | Classic Literature | Elizabeth Gaskell | 3.95 | 1853260932 | 448 | Paperback | Wordsworth Editions; 2nd Revised ed. edition (April 1, 1998) |
| X..i.. | 4 | Fashion in Detail: World Dress (paperback) | Fashion | Rosemary Crill, Jennifer Wearden and Verity Wilson | 25.1 | 1851775684 | 616 | Paperback | V & A Publishing; 1st edition (April 1, 2009) |
| X..i.. | 5 | Pride and Prejudice | Classic Literature | Jane Auston | 9.45 | 1503290565 | 226 | Paperback | CreateSpace Independent Publishing Platform (November 6, 2018) |
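The cbind-and-transpose approach above works, but it leaves a character matrix with repeated "X..i.." row names. As a minimal sketch of one alternative, assuming each parsed JSON record is a named character vector (the `books` list below is a hypothetical stand-in, not the actual file contents), the records can be stacked with rbind and converted to a data frame directly:

```r
# Hypothetical stand-in for myjson[[1]]: a list of named character vectors,
# one per book record. The real file has nine fields per record.
books <- list(
  c(SNO = "1", BookTitle = "North and South", Cost = "3.95"),
  c(SNO = "2", BookTitle = "Pride and Prejudice", Cost = "9.45")
)

# Stack the records as rows of a character matrix, then convert to a data frame.
books_df <- as.data.frame(do.call(rbind, books), stringsAsFactors = FALSE)

# Every column is character at this point; convert the numeric ones explicitly.
books_df$SNO  <- as.integer(books_df$SNO)
books_df$Cost <- as.numeric(books_df$Cost)

str(books_df)
```

Stacking by row avoids the transpose step, and the result is a data frame rather than a matrix, so each column can hold its own type.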
str(xmlData)
## 'data.frame': 5 obs. of 9 variables:
## $ SNO : Factor w/ 5 levels "1","2","3","4",..: 1 2 3 4 5
## $ BookTitle : Factor w/ 4 levels "Fashion in Detail: World Dress (paperback)",..: 4 4 2 1 3
## $ Subject : Factor w/ 3 levels "Classic Literature",..: 2 2 1 3 1
## $ Authors : Factor w/ 4 levels "Dejan Sarka (Author), Milos Radivojevic (Author), William Durkin (Author)",..: 1 1 2 4 3
## $ Cost : Factor w/ 5 levels "25.1","3.95",..: 3 4 2 1 5
## $ ISBN-10 : Factor w/ 4 levels " 1503290565",..: 2 2 4 3 1
## $ PrintLength: Factor w/ 3 levels "226","448","616": 3 3 2 3 1
## $ FormatType : Factor w/ 2 levels "Hardcover","Paperback": 2 1 2 2 2
## $ Publisher : Factor w/ 4 levels "CreateSpace Independent Publishing Platform (November 6, 2018)",..: 2 2 4 3 1
str(html)
## Classes 'tbl_df', 'tbl' and 'data.frame': 5 obs. of 9 variables:
## $ SNO : int 1 2 3 4 5
## $ Book : chr "SQL Server 2016 Developer's Guide: Build efficient database applications for your organization with SQL Server 2016 Database" "SQL Server 2016 Developer's Guide: Build efficient database applications for your organization with SQL Server 2016 Database" "North and South (Wordsworth Classics) 2nd Revised ed. Edition" "Fashion in Detail: World Dress (paperback)" ...
## $ Subject : chr "Database" "Database" "Classic Literature" "Fashion" ...
## $ Authors : chr "Dejan Sarka (Author), Milos Radivojevic (Author), William Durkin (Author)" "Dejan Sarka (Author), Milos Radivojevic (Author), William Durkin (Author)" "Dejan Sarka (Author), Milos Radivojevic (Author), William Durkin (Author)" "Rosemary Crill, Jennifer Wearden and Verity Wilson" ...
## $ Cost : num 37.67 55.99 55.99 25.08 9.45
## $ ISBN.10 : int 1786465345 1786465345 1786465345 1851775684 1503290565
## $ PrintLength: int 616 616 616 224 226
## $ Format.Type: chr "Paperback" "Hardcover" "Paperback" "Paperback" ...
## $ Publisher : chr "Packt Publishing (March 22, 2017)" "Packt Publishing (March 22, 2017)" "Packt Publishing (March 22, 2017)" "V & A Publishing; 1st edition (April 1, 2009)" ...
str(s)
## chr [1:5, 1:9] "1" "2" "3" "4" "5" ...
## - attr(*, "dimnames")=List of 2
## ..$ : chr [1:5] "X..i.." "X..i.." "X..i.." "X..i.." ...
## ..$ : chr [1:9] "SNO" "BookTitle" "Subject" "Authors" ...
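The str(xmlData) output above shows every XML column stored as a factor, whereas the HTML columns came back as integer, numeric, and character. A minimal sketch of how the factor columns could be given inferred types, using a hypothetical two-row miniature (`xml_like`) instead of the real data:

```r
# Hypothetical miniature of xmlData: every column read in as a factor.
xml_like <- data.frame(
  SNO        = c("1", "2"),
  Cost       = c("37.7", "56"),
  FormatType = c("Paperback", "Hardcover"),
  stringsAsFactors = TRUE
)

# Convert each factor to character, then let type.convert() infer the type.
# as.is = TRUE keeps non-numeric columns as character instead of factor.
typed <- xml_like
typed[] <- lapply(xml_like, function(col) {
  type.convert(as.character(col), as.is = TRUE)
})

str(typed)
```

Going through as.character() first matters: calling as.numeric() directly on a factor returns the internal level codes, not the values shown in the table.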
I created both the HTML and XML files manually. In the XML file, an unescaped & in a value caused a parse error, so I had to replace it with the escaped equivalent, &amp;. The JSON file was read in as a list, so I needed extra lapply, cbind, and transpose steps to get it into the same layout as the HTML and XML results. The HTML and XML results are similar in structure: both are data frames with nine columns, although xmlToDataFrame returned every column as a factor, while html_table inferred integer, numeric, and character types. The JSON result is a character matrix rather than a data frame.
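The ampersand problem mentioned above can be reproduced in isolation. This is a small sketch using the xml2 package (an assumption: the report itself reads XML with xmlToDataFrame, and the `<Publisher>` element name here is only illustrative):

```r
library(xml2)  # assumption: xml2 is installed (rvest depends on it)

# A bare ampersand is not well-formed XML, so the parser raises an error.
bad <- "<Publisher>V & A Publishing</Publisher>"
bad_result <- tryCatch(read_xml(bad), error = function(e) NULL)
is.null(bad_result)

# Escaping it as &amp; lets the document parse; the text is unescaped on read.
good <- "<Publisher>V &amp; A Publishing</Publisher>"
xml_text(read_xml(good))
```

The same rule applies to < in text content, which must be escaped as &lt; in hand-written XML.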