Introduction


While searching for financial databases that could make interesting training material, ETF databases quickly came to mind, and more specifically the data available on the websites of the different ETF issuers, such as iShares, SPDR, Vanguard, etc.

The iShares website has a nice overview of 328 ETFs (as of 13-Jan-2016). Website: https://www.ishares.com/us/products/etf-product-list#!type=ishares&tab=overview&view=list

Interestingly enough, as this report took shape, it showed how challenging it is to capture all the information properly in one graph. That confirms this is a great exercise, and one that draws on the essence of many R packages.

This is only the start of a visual data exploration; much more can be done and will be done in the future, including a similar report with interactive graphics. Feel free to send comments or suggestions.

Segregation

Some initial plots made with ggplot2 to get a quick sense of the data. The first segregation is between the five main investment areas, based on Net Assets.

Now incorporating the Region
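As a minimal sketch of how the two plots described above could be built: the data frame, its column names (ticker, asset_class, region, net_assets) and every value in it are invented placeholders, not the iShares spreadsheet, and the original report may well have done this differently.

```r
library(ggplot2)

# Toy data: hypothetical tickers and invented figures, NOT iShares data.
etf <- data.frame(
  ticker      = c("AAA", "BBB", "CCC", "DDD", "EEE", "FFF"),
  asset_class = c("Equity", "Equity", "Fixed Income",
                  "Fixed Income", "Commodity", "Real Estate"),
  region      = c("US", "Europe", "US", "Global", "Global", "US"),
  net_assets  = c(1200, 300, 800, 150, 90, 60)  # USD millions, invented
)

# Sum net assets per asset class and region.
totals <- aggregate(net_assets ~ asset_class + region, data = etf, FUN = sum)

# Stacked bars: one bar per asset class (largest total first),
# with the fill colour adding the region dimension.
p <- ggplot(totals,
            aes(x = reorder(asset_class, -net_assets, FUN = sum),
                y = net_assets, fill = region)) +
  geom_col() +
  labs(x = "Asset class", y = "Net assets (USD millions)", fill = "Region")
print(p)
```

Dropping the fill aesthetic gives the plain split by investment area; keeping it incorporates the region in the same chart.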

Here we look at the inception dates of all the ETFs.


A treemap makes sense for getting an understanding of the different categories, but size does matter.

Perhaps split the whole into regions.


Number of ETFs per classification, using the waffle package.
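A waffle chart of this kind can be sketched as follows. The counts are made up for illustration; in the real report they would come from counting rows per classification in the spreadsheet.

```r
library(waffle)

# Hypothetical counts per classification (invented, not the iShares figures).
counts <- c("Equity" = 12, "Fixed Income" = 6, "Commodity" = 2)

# One square per ETF, laid out in 4 rows.
w <- waffle(counts, rows = 4,
            title = "Number of ETFs per classification (toy data)")
print(w)
```

With several hundred ETFs, one square per fund becomes crowded, so scaling the counts (for example one square per five ETFs) is a reasonable alternative.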


Performance

Treemap providing a quick overview of 2015 performance.
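One way such a performance treemap might be drawn with the treemap package is sketched here. The tickers, net assets and returns are invented, and the column names are assumptions rather than the spreadsheet's actual headers.

```r
library(treemap)

# Toy data: hypothetical tickers with invented net assets and 1-year returns.
perf <- data.frame(
  ticker      = c("AAA", "BBB", "CCC", "DDD", "EEE", "FFF"),
  asset_class = c("Equity", "Equity", "Equity",
                  "Fixed Income", "Fixed Income", "Commodity"),
  net_assets  = c(1200, 300, 450, 800, 150, 90),   # USD millions, invented
  return_1y   = c(1.2, -4.5, 9.0, 0.8, -1.1, -12.0) # percent, invented
)

# Rectangle area follows net assets; colour follows the 1-year return.
# type = "value" maps the colour variable onto a diverging palette.
treemap(perf,
        index   = c("asset_class", "ticker"),
        vSize   = "net_assets",
        vColor  = "return_1y",
        type    = "value",
        palette = "RdYlGn",
        title   = "1-year performance (toy data)")
```

Sizing by net assets keeps the largest funds visually dominant, while the red-to-green scale makes negative and positive returns easy to tell apart.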



Highlighting regions other than the U.S. and Global: Europe performed well in 2015.

The global performance picture is not a pretty one. The best performance goes to SCZ (iShares MSCI EAFE Small-Cap ETF), with a 9.16% return.

The U.S. shows overall positive performance for Fixed Income.


Performance According to Sub Asset Class

Looking into the other sub-categories, let's see how Smart Beta performed.


Performance - Alternatives to Barplots

This is an interesting graphic: it captures the performance and also shows which asset classes are not present in each region.


A summary of each group can be achieved with boxplots.

And this is the result of the new geom “label” (geom_label in ggplot2).


Violin plot with a label showing the average performance of each group.
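A violin plot with a mean label could be put together roughly like this. The returns are invented toy values and the column names are assumptions; the group means are precomputed and drawn with geom_label, which is one option among several.

```r
library(ggplot2)

# Toy data: invented 1-year returns (percent) for three regions.
perf <- data.frame(
  region    = rep(c("US", "Europe", "Global"), each = 5),
  return_1y = c(-2.0,  0.5,  1.5, 3.0, -1.0,
                 4.0,  6.5,  2.5, 8.0,  5.0,
                -6.0, -3.5, -8.0, 1.0, -4.5)
)

# Average return per region, formatted for display.
means <- aggregate(return_1y ~ region, data = perf, FUN = mean)
means$label <- sprintf("%.1f%%", means$return_1y)

# The violins show the distribution; each label sits at the group mean.
p <- ggplot(perf, aes(x = region, y = return_1y)) +
  geom_violin(fill = "grey85") +
  geom_label(data = means, aes(label = label)) +
  labs(x = "Region", y = "1-year return (%)")
print(p)
```

Replacing geom_violin with geom_boxplot gives the boxplot variant with the same labels.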

Long-term performance: 3Y against 1Y return


Data: Spreadsheet from iShares
Plotting and calculations done with the open-source software R
Main packages used: ggplot2, waffle, treemap, RColorBrewer, dplyr, magrittr