In search of financial databases that could be interesting as training material, ETF databases quickly came to mind. More specifically, the data available on the websites of the different ETF issuers, such as iShares, SPDR, Vanguard etc.
The iShares website has a nice overview of 328 ETFs (as of 13-Jan-2016). Website: https://www.ishares.com/us/products/etf-product-list#!type=ishares&tab=overview&view=list
Interestingly enough, as this file / report took shape, it showed how challenging it is to capture all the information properly in one graph. That confirms that this is a great exercise, and one that captures the essence of many R packages.
This is the start of a visual data exploration; much more can be done and will be done in the future, including a similar report with interactive graphics. Feel free to send comments / suggestions.
Some initial plots done with ggplot2 to get a quick sense of the data.
Initial segregation between the five main investment areas based on Net Assets
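A minimal sketch of how such a first plot could be built with dplyr and ggplot2. The data frame, its column names (`asset_class`, `net_assets`) and all values are made up for illustration; they are not the layout or content of the iShares spreadsheet.

```r
library(ggplot2)
library(dplyr)

# Hypothetical sample data: column names and values are assumptions,
# not taken from the iShares spreadsheet.
etf <- data.frame(
  ticker      = c("AAA", "BBB", "CCC", "DDD", "EEE", "FFF"),
  asset_class = c("Equity", "Equity", "Fixed Income",
                  "Fixed Income", "Commodity", "Real Estate"),
  net_assets  = c(120, 80, 60, 40, 10, 5),  # made-up, USD billions
  stringsAsFactors = FALSE
)

# Total net assets per asset class, largest first.
by_class <- etf %>%
  group_by(asset_class) %>%
  summarise(net_assets = sum(net_assets)) %>%
  arrange(desc(net_assets))

# Bar chart with the classes ordered by size.
p <- ggplot(by_class, aes(x = reorder(asset_class, -net_assets), y = net_assets)) +
  geom_bar(stat = "identity") +
  labs(x = "Asset class", y = "Net assets (USD bn, sample data)")
print(p)
```

Summarising before plotting keeps the aggregation step explicit, which makes it easy to reuse `by_class` for a table or a second chart.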
Now incorporating the Region
Here looking at the inception dates of all the ETFs
A treegraph would make sense to get an understanding of the different categories, but size does matter.
Maybe Split the whole into Regions
Number of ETFs per classification, using the waffle package.
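A waffle chart of this kind could be sketched as follows. The classification names and counts are placeholders, not the real iShares figures.

```r
library(waffle)

# Made-up counts per classification; names and numbers are placeholders.
etf_counts <- c("Equity" = 30, "Fixed Income" = 12,
                "Commodity" = 5, "Real Estate" = 3)

# One square per ETF, arranged in 5 rows.
p <- waffle(etf_counts, rows = 5,
            title = "Number of ETFs per classification (sample data)")
print(p)
```

With 50 ETFs and 5 rows the grid is exactly 10 columns wide; for totals that do not divide evenly, the last column is simply left partly empty.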
Treemap providing a quick overview on Performance 2015
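A rough sketch of a performance treemap with the treemap package. It assumes that tile size encodes net assets and tile colour encodes the 2015 return; the column names and all values are invented for illustration.

```r
library(treemap)

# Hypothetical sample data; column names and values are assumptions.
etf <- data.frame(
  region      = c("US", "US", "Europe", "Europe", "Global"),
  ticker      = c("AAA", "BBB", "CCC", "DDD", "EEE"),
  net_assets  = c(100, 50, 30, 20, 40),       # made-up tile sizes
  return_2015 = c(1.2, -3.5, 4.8, 2.1, -6.0), # made-up returns in %
  stringsAsFactors = FALSE
)

# Tiles grouped by region, then ticker; size = net assets,
# colour = 2015 return on a red-yellow-green scale.
tm_out <- treemap(
  etf,
  index   = c("region", "ticker"),
  vSize   = "net_assets",
  vColor  = "return_2015",
  type    = "value",
  palette = "RdYlGn",
  title   = "2015 performance (sample data)"
)
```

Using `type = "value"` maps the colour variable directly to the palette, so negative and positive returns land on opposite ends of the colour scale.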
Highlighting Regions other than the U.S. & Global. Europe has performed well for 2015.
The Global Performance Picture is not a pretty one. Best Performance goes to SCZ (iShares MSCI EAFE Small-Cap ETF) with 9.16% return.
US shows overall positive performance for Fixed Income
Performance According to Sub Asset Class
Looking into the other sub-categories, let's see how Smart-Beta performed
This is an interesting graphic as it captures the performance but also shows which Asset Classes are not present per Region!
A summary of each group can be achieved with Boxplots
And this is the result of the new geom “label”
Violin plot with a label to show the average Performance
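One way to combine a violin plot with `geom_label` is to compute the group means first and pass them to the label layer as a separate data frame. This is only a sketch: the column names (`asset_class`, `return_1y`) and the returns are invented.

```r
library(ggplot2)
library(dplyr)

# Hypothetical 1-year returns in %; names and values are assumptions.
perf <- data.frame(
  asset_class = rep(c("Equity", "Fixed Income"), each = 5),
  return_1y   = c(-4, -1, 2, 5, 8, -2, -1, 0, 1, 2),
  stringsAsFactors = FALSE
)

# Average return per asset class, used for the labels.
means <- perf %>%
  group_by(asset_class) %>%
  summarise(mean_return = mean(return_1y))

# Violin for the distribution, one label per group placed at its mean.
p <- ggplot(perf, aes(x = asset_class, y = return_1y)) +
  geom_violin() +
  geom_label(data = means,
             aes(y = mean_return, label = sprintf("%.1f%%", mean_return))) +
  labs(x = "Asset class", y = "1Y return (%, sample data)")
print(p)
```

The label layer inherits the `x` mapping from the main plot, so `means` only needs the grouping column and the value to display.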
Long Term Performance 3Y against 1Y Return
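A comparison of this kind could be drawn as a simple scatter plot. The sketch below uses invented tickers and returns, and the column names `return_1y` and `return_3y` are assumptions; whether the 3Y figure is annualised depends on the source data and is not addressed here.

```r
library(ggplot2)

# Hypothetical returns in %; tickers and values are made up.
ret <- data.frame(
  ticker    = c("AAA", "BBB", "CCC", "DDD"),
  return_1y = c(-5.0, 1.5, 9.0, -2.0),
  return_3y = c(3.0, 6.5, 12.0, -4.0),
  stringsAsFactors = FALSE
)

# 1Y return on the x-axis, 3Y return on the y-axis, with zero
# reference lines to separate the four quadrants.
p <- ggplot(ret, aes(x = return_1y, y = return_3y, label = ticker)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_vline(xintercept = 0, linetype = "dashed") +
  geom_point() +
  geom_text(vjust = -0.8) +
  labs(x = "1Y return (%)", y = "3Y return (%)")
print(p)
```

The quadrant lines make it easy to spot funds that are positive over three years but negative over the last one, and vice versa.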
Data: Spreadsheet from iShares
Plotting & Calculations done by open source software R
Main Packages used: ggplot2, waffle, treemap, RColorBrewer, dplyr, magrittr