My Approach
My approach is as follows:
- I created the three files by hand using the Brackets editor.
- I will use the rvest, XML and jsonlite packages to parse the HTML, XML and JSON files, respectively.
- Once imported, I will use the kableExtra package to display each of the files.
- I will use the all.equal and/or identical functions to determine whether the data frames are identical.
Import Data
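As a rough illustration of the import step, the sketch below parses a one-book table from inline strings with rvest, XML and jsonlite. The inline strings and the object names (html_txt, books_html and so on) are hypothetical stand-ins for the hand-built files, whose names and exact markup are not shown here.

```r
library(rvest)     # HTML parsing
library(XML)       # XML parsing
library(jsonlite)  # JSON parsing

# Hypothetical one-row stand-ins for the three hand-built files
html_txt <- "<html><body><table>
  <tr><th>Title</th><th>Year</th></tr>
  <tr><td>Efficiency of Racetrack Betting Markets</td><td>2012</td></tr>
</table></body></html>"

xml_txt <- "<books>
  <book><Title>Efficiency of Racetrack Betting Markets</Title><Year>2012</Year></book>
</books>"

json_txt <- '[{"Title": "Efficiency of Racetrack Betting Markets", "Year": 2012}]'

# html_table() returns one table per <table> element; take the first
books_html <- as.data.frame(html_table(read_html(html_txt))[[1]])

# xmlToDataFrame() reads every field as text
books_xml <- xmlToDataFrame(xmlParse(xml_txt, asText = TRUE),
                            stringsAsFactors = FALSE)

# fromJSON() simplifies an array of objects to a data frame
books_json <- fromJSON(json_txt)

# Inspect the column types each parser produced
str(books_html)
str(books_xml)
str(books_json)
```

In this sketch the XML columns come back as character while the HTML and JSON years are numeric, so column types can differ between parsers even when the underlying values match.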
1. HTML File
| Title | Author.s. | Subject | Publisher | Year | ISBN |
|---|---|---|---|---|---|
| Efficiency of Racetrack Betting Markets | Donald B. Hausch,Victor S.Y. Lo, William T. Ziemba | Academic Finance - Betting Markets | World Scientific | 2012 | 9.812819e+09 |
| Precision: Statistical and Mathematical Methods in Horse Racing | C X Wong | Quantitative Methods in Horse Racing | Outskirts Press | 2011 | 9.781433e+12 |
| The Odds Must Be Crazy: Beating the Races with the Man Who Revolutionized Handicapping | Len Ragozin | Handicapping-Figure Making | Little, Brown and Company | 1997 | 9.780317e+12 |
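The ISBN column in the HTML table prints in scientific notation, which suggests it was read as a number. The base R sketch below, which borrows the ISBN values displayed in the XML table purely for illustration, shows one way to print such numbers in full. Keeping ISBNs as character is a common alternative, since they are identifiers and not quantities. The object names are hypothetical.

```r
# ISBN values as displayed in the XML table, held here as numbers
isbn_num <- c(9812819185, 978143276852, 9781432768522)

# Ask for fixed notation explicitly when formatting for display
isbn_fixed <- format(isbn_num, scientific = FALSE, trim = TRUE)

# Or keep identifiers as text so display rounding cannot hide digits
isbn_chr <- c("9812819185", "978143276852", "9781432768522")

print(isbn_fixed)
print(identical(isbn_fixed, isbn_chr))
```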
2. XML File
| Title | Author | Subject | Publisher | Year | ISBN |
|---|---|---|---|---|---|
| Efficiency of Racetrack Betting Markets | Donald B. Hausch,Victor S.Y. Lo, William T. Ziemba | Academic Finance - Betting Markets | World Scientific | 2012 | 9812819185 |
| Precision: Statistical and Mathematical Methods in Horse Racing | C X Wong | Quantitative Methods in Horse Racing | Outskirts Press | 2011 | 978143276852 |
| The Odds Must Be Crazy: Beating the Races with the Man Who Revolutionized Handicapping | Len Ragozin | Handicapping-Figure Making | Little, Brown and Company | 1997 | 9781432768522 |
3. JSON File
| Title | Author | Subject | Publisher | Year | ISBN |
|---|---|---|---|---|---|
| Efficiency of Racetrack Betting Markets | Donald B. Hausch,Victor S.Y. Lo, William T. Ziemba | Academic Finance Betting Markets | World Scientific | 2012 | 9812819185 |
| Precision: Statistical and Mathematical Methods in Horse Racing | C X Wong | Quantitative Methods in Horse Racing | Outskirts Press | 2011 | 978143276852 |
| The Odds Must Be Crazy: Beating the Races with the Man Who Revolutionized Handicapping | Len Ragozin | Handicapping Figure Making | Little, Brown and Company | 1997 | 9781432768522 |
Are the Files Identical?
## [1] "Names: 1 string mismatch"
## [2] "Component \"Year\": Modes: numeric, character"
## [3] "Component \"Year\": target is numeric, current is character"
## [4] "Component \"ISBN\": Modes: numeric, character"
## [5] "Component \"ISBN\": target is numeric, current is character"
## [1] FALSE
The all.equal and identical functions indicate that the tables are not identical. The all.equal output reports one column name mismatch (most likely Author.s. versus Author, judging from the table headers) and shows that Year and ISBN are numeric in one data frame and character in the other. Renaming that column and coercing the character variables to numeric would resolve the differences reported here.
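A minimal base R sketch of this comparison and the suggested fix follows. The two small data frames (books_a, books_b) are hypothetical miniatures, not the imported tables, and are built only to reproduce the same kinds of mismatch: one differing column name and one column that is numeric on one side and character on the other.

```r
# Hypothetical miniature versions of two imported tables
books_a <- data.frame(Title = "Efficiency of Racetrack Betting Markets",
                      Author.s. = "Donald B. Hausch",
                      Year = 2012,
                      stringsAsFactors = FALSE)
books_b <- data.frame(Title = "Efficiency of Racetrack Betting Markets",
                      Author = "Donald B. Hausch",
                      Year = "2012",
                      stringsAsFactors = FALSE)

# all.equal() describes the differences; identical() only says TRUE or FALSE
print(all.equal(books_a, books_b))
print(identical(books_a, books_b))

# Harmonize the column name and coerce the character column to numeric
names(books_b)[names(books_b) == "Author"] <- "Author.s."
books_b$Year <- as.numeric(books_b$Year)

# Repeat the comparison after the fix
print(all.equal(books_a, books_b))
print(identical(books_a, books_b))
```

Coercion only removes type differences. If the underlying values differ between files, all.equal will still report them after the columns share a type.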