This is an investigation of government bond data between 1970 and 2020 - from https://www.kaggle.com/datasets/everget/government-bonds - completed as part of an interview process for the International Capital Market Association in April 2022.
The starting point questions/points to consider are:
..as well as additional/optional points:
I’ll try to answer these broadly in order, and explore any tangents or interesting bits I find or go down as I go. Write-ups like this in very similar ‘conversational’ formats are my preferred way of working for more ‘exploratory’ pieces of work, as they provide documentation, process notes, future avenues of exploration, as well as some notes about failed approaches or similar ‘blind alleys’ all in one. These give me a decently thorough ‘refresher’ if I need to come back to a piece of work weeks or months later, so I am not left wondering what I was trying to accomplish with a piece of code, or why I approached something in a certain way - there is at least a modicum of context.
‘Other Insights and Observations’ is covered more-or-less throughout, and will not have a separate section. For those with little interest in the data reformatting (which will be, I assume, most people), I would recommend skipping straight to the start of my discussion on Volatility.
Most of the ‘hard part’ in this exercise is going to be restructuring the data. Currently, there are two tables - prices and yields - each with one row per day (apparently excluding weekends and at least some bank holidays, presumably in whatever jurisdiction this was collated in/for), and one column per bond type per nation, leading to c.13000 rows and 216 columns, as below:
knitr::kable(head(yields))
| time | AU01 | AU02 | AU03 | AU05 | AU07 | AU10 | AU15 | AU20 | AU30 | BE01 | BE02 | BE03 | BE05 | BE07 | BE10 | BE15 | BE20 | BE30 | CA03M | CA06M | CA01 | CA02 | CA03 | CA05 | CA10 | CA30 | CN01 | CN02 | CN03 | CN05 | CN07 | CN10 | CN15 | CN20 | CN30 | DK02 | DK03 | DK05 | DK10 | DK20 | FR01 | FR02 | FR03 | FR05 | FR07 | FR10 | FR15 | FR20 | FR25 | FR30 | FR50 | DE01 | DE02 | DE03 | DE05 | DE07 | DE10 | DE15 | DE20 | DE25 | DE30 | GR05 | GR10 | GR15 | GR20 | GR25 | HK01 | HK02 | HK03 | HK05 | HK07 | HK10 | IN01 | IN02 | IN05 | IN07 | IN10 | IN20 | IN25 | ID01 | ID03 | ID05 | ID10 | ID15 | ID20 | ID25 | ID30 | IE01 | IE02 | IE03 | IE05 | IE10 | IE15 | IT01 | IT02 | IT03 | IT05 | IT07 | IT10 | IT15 | IT20 | IT25 | IT30 | JP01 | JP02 | JP03 | JP05 | JP07 | JP10 | JP15 | JP20 | JP30 | JP40 | KR01 | KR02 | KR03 | KR05 | KR10 | KR20 | KR30 | MY03 | MY05 | MY07 | MY10 | MY15 | MY20 | MY30 | NL01 | NL02 | NL03 | NL05 | NL07 | NL10 | NL15 | NL20 | NL30 | NZ02 | NZ05 | NZ10 | NZ15 | NZ20 | NO02 | NO05 | NO07 | NO10 | PL02 | PL05 | PL10 | PT02 | PT03 | PT05 | PT07 | PT10 | PT20 | SG02 | SG05 | SG10 | SG15 | SG20 | SG30 | ZA03 | ZA05 | ZA10 | ZA15 | ZA20 | ZA30 | ES02 | ES03 | ES05 | ES07 | ES10 | ES20 | ES30 | SE02 | SE05 | SE07 | SE10 | SE20 | TW02 | TW05 | TW10 | TW20 | TW30 | TH02 | TH05 | TH10 | TH15 | TR01 | TR02 | TR03 | TR05 | TR10 | GB01 | GB02 | GB03 | GB05 | GB07 | GB10 | GB15 | GB20 | GB25 | GB30 | GB50 | US01M | US02M | US03M | US06M | US01 | US02 | US03 | US05 | US07 | US10 | US20 | US30 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2000-05-05 | 6.280 | 6.34 | 6.460 | 6.52 | 0 | 6.56 | 6.56 | 0 | 0 | 2.86 | 4.931 | 5.114 | 5.419 | 0 | 5.728 | 5.838 | 5.657 | 0 | 5.58 | 5.89 | 6.19 | 6.251 | 6.395 | 6.375 | 6.285 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5.442 | 5.619 | 5.697 | 5.848 | 0 | 4.592 | 4.805 | 5.119 | 5.289 | 0 | 5.566 | 5.571 | 5.719 | 0 | 5.818 | 0 | 4.571 | 4.770 | 5.006 | 5.223 | 0 | 5.436 | 0 | 5.533 | 0 | 5.698 | 6.238 | 6.299 | 6.350 | 6.391 | 0 | 0 | 0 | 0 | 0 | 0 | 7.645 | 0 | 0 | 9.699 | 0 | 10.432 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.433 | 4.664 | 0 | 5.460 | 5.700 | 0 | 0 | 5.044 | 5.227 | 5.529 | 0 | 5.762 | 0 | 0 | 0 | 6.078 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8.28 | 8.720000 | 8.86 | 9.229999 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.927 | 5.142 | 5.373 | 0 | 5.603 | 4.259 | 0 | 5.808 | 0 | 0 | 0 | 0 | 0 | 0 | 6.49 | 0 | 6.34 | 0 | 0 | 0 | 5.237 | 0 | 5.538 | 0 | 5.774 | 0 | 3.089 | 3.88 | 4.388 | 0 | 0 | 0 | 0 | 13.85 | 14.41 | 0 | 14.33 | 0 | 4.545 | 5.151 | 5.413 | 0 | 5.717 | 0 | 5.933 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6.156 | 6.267 | 6.248 | 5.846 | 0 | 5.373 | 5.012 | 4.804 | 0 | 4.593 | 0 | 0 | 0 | 5.899 | 5.910 | 0 | 6.784 | 4.557 | 6.692 | 0 | 6.511 | 0 | 6.157 |
| 2000-05-08 | 6.260 | 6.34 | 6.415 | 6.52 | 0 | 6.56 | 6.49 | 0 | 0 | 2.86 | 4.946 | 5.115 | 5.419 | 0 | 5.716 | 5.830 | 5.657 | 0 | 5.58 | 5.89 | 6.19 | 6.251 | 6.395 | 6.375 | 6.285 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5.431 | 5.574 | 5.699 | 5.798 | 0 | 4.602 | 4.819 | 5.088 | 5.299 | 0 | 5.533 | 5.542 | 5.682 | 0 | 5.793 | 0 | 4.630 | 4.788 | 4.999 | 5.186 | 0 | 5.429 | 0 | 5.543 | 0 | 5.671 | 6.184 | 6.261 | 6.347 | 6.375 | 0 | 0 | 0 | 0 | 0 | 0 | 7.730 | 0 | 0 | 9.807 | 0 | 10.505 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.500 | 4.664 | 0 | 5.441 | 5.656 | 0 | 0 | 5.035 | 5.219 | 5.506 | 0 | 5.719 | 0 | 0 | 0 | 6.046 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8.28 | 8.720000 | 8.85 | 9.210000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.892 | 5.119 | 5.334 | 0 | 5.572 | 4.259 | 0 | 5.780 | 0 | 0 | 0 | 0 | 0 | 0 | 6.45 | 0 | 6.30 | 0 | 0 | 0 | 5.200 | 0 | 5.505 | 0 | 5.736 | 0 | 3.170 | 3.92 | 4.515 | 0 | 0 | 0 | 0 | 13.85 | 14.41 | 0 | 14.33 | 0 | 4.545 | 5.136 | 5.414 | 0 | 5.691 | 0 | 5.913 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6.166 | 6.301 | 6.257 | 5.839 | 0 | 5.340 | 4.974 | 4.771 | 0 | 4.551 | 0 | 0 | 0 | 5.899 | 5.910 | 0 | 6.784 | 4.557 | 6.692 | 0 | 6.568 | 0 | 6.157 |
| 2000-05-09 | 6.340 | 6.38 | 6.500 | 6.54 | 0 | 6.55 | 6.55 | 0 | 0 | 2.86 | 4.937 | 5.104 | 5.386 | 0 | 5.679 | 5.800 | 5.657 | 0 | 5.58 | 5.88 | 6.20 | 6.316 | 6.481 | 6.457 | 6.363 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5.393 | 5.526 | 5.658 | 5.751 | 0 | 4.569 | 4.801 | 5.060 | 5.267 | 0 | 5.496 | 5.494 | 5.631 | 0 | 5.732 | 0 | 4.545 | 4.753 | 4.966 | 5.139 | 0 | 5.382 | 0 | 5.491 | 0 | 5.617 | 6.184 | 6.248 | 6.355 | 6.386 | 0 | 0 | 0 | 0 | 0 | 0 | 7.761 | 0 | 0 | 9.786 | 0 | 10.514 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.480 | 4.664 | 0 | 5.382 | 5.617 | 0 | 0 | 5.027 | 5.220 | 5.494 | 0 | 5.693 | 0 | 0 | 0 | 5.983 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8.28 | 8.720000 | 8.87 | 9.210000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.870 | 5.099 | 5.307 | 0 | 5.537 | 4.259 | 0 | 5.725 | 0 | 0 | 0 | 0 | 0 | 0 | 6.50 | 0 | 6.30 | 0 | 0 | 0 | 5.140 | 0 | 5.461 | 0 | 5.696 | 0 | 3.159 | 3.99 | 4.519 | 0 | 0 | 0 | 0 | 13.85 | 14.41 | 0 | 14.33 | 0 | 4.545 | 5.118 | 5.387 | 0 | 5.657 | 0 | 5.880 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6.140 | 6.321 | 6.226 | 5.801 | 0 | 5.299 | 4.931 | 4.722 | 0 | 4.491 | 0 | 0 | 0 | 5.984 | 6.031 | 0 | 6.872 | 4.557 | 6.827 | 0 | 6.524 | 0 | 6.248 |
| 2000-05-10 | 6.240 | 6.34 | 6.390 | 6.51 | 0 | 6.54 | 6.50 | 0 | 0 | 2.86 | 4.885 | 5.040 | 5.338 | 0 | 5.644 | 5.748 | 5.657 | 0 | 5.67 | 5.90 | 6.23 | 6.306 | 6.462 | 6.430 | 6.323 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5.393 | 5.519 | 5.625 | 5.732 | 0 | 4.546 | 4.757 | 5.011 | 5.233 | 0 | 5.472 | 5.472 | 5.611 | 0 | 5.730 | 0 | 4.521 | 4.736 | 4.927 | 5.128 | 0 | 5.353 | 0 | 5.491 | 0 | 5.595 | 6.172 | 6.221 | 6.328 | 6.381 | 0 | 0 | 0 | 0 | 0 | 0 | 7.750 | 0 | 0 | 9.819 | 0 | 10.535 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.485 | 4.664 | 0 | 5.387 | 5.620 | 0 | 0 | 4.977 | 5.175 | 5.455 | 0 | 5.667 | 0 | 0 | 0 | 5.970 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8.29 | 8.729999 | 8.89 | 9.220000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.848 | 5.062 | 5.286 | 0 | 5.499 | 4.259 | 0 | 5.703 | 0 | 0 | 0 | 0 | 0 | 0 | 6.49 | 0 | 6.21 | 0 | 0 | 0 | 5.148 | 0 | 5.450 | 0 | 5.675 | 0 | 3.154 | 3.91 | 4.465 | 0 | 0 | 0 | 0 | 13.85 | 14.41 | 0 | 14.33 | 0 | 4.545 | 5.107 | 5.359 | 0 | 5.619 | 0 | 5.823 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6.132 | 6.291 | 6.214 | 5.818 | 0 | 5.291 | 4.915 | 4.705 | 0 | 4.469 | 0 | 0 | 0 | 6.115 | 6.140 | 0 | 6.851 | 4.557 | 6.788 | 0 | 6.457 | 0 | 6.216 |
| 2000-05-11 | 6.220 | 6.26 | 6.374 | 6.42 | 0 | 6.41 | 6.42 | 0 | 0 | 2.86 | 4.915 | 5.094 | 5.376 | 0 | 5.723 | 5.838 | 5.657 | 0 | 5.66 | 5.87 | 6.20 | 6.272 | 6.386 | 6.342 | 6.230 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5.512 | 5.638 | 5.740 | 5.825 | 0 | 4.600 | 4.842 | 5.100 | 5.201 | 0 | 5.532 | 5.543 | 5.710 | 0 | 5.799 | 0 | 4.717 | 4.754 | 4.976 | 5.238 | 0 | 5.395 | 0 | 5.554 | 0 | 5.693 | 6.148 | 6.230 | 6.342 | 6.381 | 0 | 0 | 0 | 0 | 0 | 0 | 7.750 | 0 | 0 | 9.809 | 0 | 10.527 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.509 | 4.664 | 0 | 5.475 | 5.686 | 0 | 0 | 5.044 | 5.214 | 5.491 | 0 | 5.758 | 0 | 0 | 0 | 6.075 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8.29 | 8.729999 | 8.89 | 9.220000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.969 | 5.171 | 5.389 | 0 | 5.585 | 4.259 | 0 | 5.795 | 0 | 0 | 0 | 0 | 0 | 0 | 6.51 | 0 | 6.27 | 0 | 0 | 0 | 5.258 | 0 | 5.556 | 0 | 5.767 | 0 | 3.198 | 3.96 | 4.487 | 0 | 0 | 0 | 0 | 13.85 | 14.41 | 0 | 14.33 | 0 | 4.545 | 5.130 | 5.402 | 0 | 5.668 | 0 | 5.885 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6.213 | 6.289 | 6.307 | 5.922 | 0 | 5.393 | 5.027 | 4.791 | 0 | 4.527 | 0 | 0 | 0 | 6.083 | 6.104 | 0 | 6.804 | 4.557 | 6.668 | 0 | 6.420 | 0 | 6.154 |
| 2000-05-12 | 6.243 | 6.26 | 6.374 | 6.42 | 0 | 6.45 | 6.43 | 0 | 0 | 2.86 | 5.022 | 5.155 | 5.457 | 0 | 5.714 | 5.822 | 5.657 | 0 | 5.65 | 5.89 | 6.22 | 6.304 | 6.406 | 6.370 | 6.227 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5.529 | 5.639 | 5.731 | 5.805 | 0 | 4.631 | 4.910 | 5.141 | 5.352 | 0 | 5.550 | 5.550 | 5.669 | 0 | 5.763 | 0 | 4.790 | 4.863 | 5.065 | 5.256 | 0 | 5.419 | 0 | 5.532 | 0 | 5.641 | 6.214 | 6.268 | 6.361 | 6.395 | 0 | 0 | 0 | 0 | 0 | 0 | 7.729 | 0 | 0 | 9.809 | 0 | 10.520 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.534 | 4.664 | 0 | 5.474 | 5.672 | 0 | 0 | 5.105 | 5.295 | 5.551 | 0 | 5.739 | 0 | 0 | 0 | 6.006 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8.30 | 8.750000 | 8.91 | 9.229999 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5.003 | 5.210 | 5.397 | 0 | 5.570 | 4.259 | 0 | 5.759 | 0 | 0 | 0 | 0 | 0 | 0 | 6.49 | 0 | 6.27 | 0 | 0 | 0 | 5.300 | 0 | 5.560 | 0 | 5.750 | 0 | 3.156 | 3.89 | 4.487 | 0 | 0 | 0 | 0 | 13.85 | 14.41 | 0 | 14.33 | 0 | 4.545 | 5.191 | 5.444 | 0 | 5.664 | 0 | 5.862 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6.244 | 6.378 | 6.346 | 5.933 | 0 | 5.390 | 5.035 | 4.802 | 0 | 4.545 | 0 | 0 | 0 | 6.017 | 6.019 | 0 | 6.835 | 4.557 | 6.668 | 0 | 6.524 | 0 | 6.143 |
…and so forth. It is worth mentioning at this point that I made the date into something human readable during import, converting from the UNIX epoch millisecond time stamp to an actual date.
Fundamentally, this format will be incredibly difficult to work with: if I want to pull a single nation, I have to find every bond type included for that nation to manually include them, and ditto for the prices table. First step for me will, therefore, be to break this down into a ‘tidy’ or ‘long’ data format with one observation per row, and far fewer columns. My target is to resemble:
| Date | Nation | Region | Bond Maturity | Price | Yield |
|---|---|---|---|---|---|
| 2000-05-05 | AU | Pacific | 01 | nnn | 6.28 |
| 2000-05-05 | AU | Pacific | 02 | nnn | 6.34 |
…and so forth. In this way I can effortlessly filter or compare by nation, bond types, regions, or any combination thereof.
To accomplish this, I will make one fairly ugly loop which will step through each column, split the title into nation and bond type, and generate one row per day in the dataset for that nation and bond type. The same loop, with very minor tweaks, will do the same job on the prices dataset. I’ll then join them on date, nation, and maturity to marry them into a single 3 million-ish row dataset, and then do a bit of cleanup work - I know there are some packages that do easy conversion from ISO2C nation identifiers to ISO3C and/or full names, as well as including region identifiers, so I will use one of those to make the data a bit easier to work with when it comes to visualisations, since I’ll want nicer looking names at that point.
#Create an empty tibble in the correct format to receive the reformatted data
TidyPrices <- tibble("Date"=as.Date("2022-01-01"),"Nation"="","Maturity"="","Price"=0.0)[-1,]
for(i in 2:ncol(prices)){
#The first two characters of the column name are taken as the nation identifier, and the 3..nth as the bond maturity
nation <- str_sub(names(prices)[i],1,2)
bond <- str_sub(names(prices)[i],3)
PTemp <- prices %>% select(time,"Price"=i)
#Generate a tibble and insert the first column data
TempPrices <- tibble("Date"=prices$time, "Nation"=nation,"Maturity"=as.character(bond)) %>%
left_join(PTemp, by=c("Date"="time"))
#Append the data to the 'master' variable
TidyPrices <- union(TidyPrices,TempPrices)
}
#The same again, but for Yields
TidyYields <- tibble("Date"=as.Date("2022-01-01"),"Nation"="","Maturity"="","Yield"=0.0)[-1,]
for(i in 2:ncol(yields)){
nation <- str_sub(names(yields)[i],1,2)
bond <- str_sub(names(yields)[i],3)
YTemp <- yields %>% select(time,"Yield"=i)
TempYields <- tibble("Date"=yields$time, "Nation"=nation,"Maturity"=as.character(bond)) %>%
left_join(YTemp, by=c("Date"="time"))
TidyYields <- union(TidyYields,TempYields)
}
TidyData <- TidyPrices %>%
full_join(TidyYields, by=c("Date"="Date","Nation"="Nation","Maturity"="Maturity"))
rm(TidyPrices,TidyYields,i,nation,bond,YTemp,TempYields,PTemp,TempPrices)
R isn’t massively thrilled about those loops: it takes about two minutes to run through, and I suspect there is a ‘nicer’ way to accomplish same. That being said, a few minutes for a pair of 3 million entry tables isn’t too unreasonable.
So. I am going to use the countrycode package to append the country names (ISO English short name, technically, so China rather than the People’s Republic of China, e.g) as well as regions. The regions are a bit awkward: the Kaggle page says ‘6 world regions’, but these six match neither the UN Geoscheme regions, nor the World Bank regions. That consistency doesn’t matter much as long as I take it into account going forward. In this case, I am happiest with simple continent groupings rather than going with any of the regions, though this is fairly arbitrary. These will mainly be important for the ‘correlation within regions’ question, but most of that will come down to ‘it depends how you define the regions’ anyway. For the sake of clarity, I’ll also rename the column ‘Nation’ to ‘CountryCode’, and reorder the columns to be in line with what I would expect, even though this makes absolutely no difference to how I will handle them going forward.
TidyData <- TidyData %>%
mutate("Region"=countrycode(Nation,"iso2c","continent"), "Country"=countrycode(TidyData$Nation,"iso2c","iso.name.en")) %>%
rename("CountryCode"=Nation)
TidyData <- TidyData[,c(1,7,2,3,4,5,6)]
This leaves me with a dataset that is easy to use, and easy to further transform based on whatever needs I work out going forward. The table, filtered for 10 year instruments arbitrarily (..and to exclude price=0, just so the result below looks ‘complete’), looks like:
knitr::kable(head(TidyData %>%
filter(Maturity=="10", !Price==0) %>%
arrange(desc(Date)),10))
| Date | Country | CountryCode | Maturity | Price | Yield | Region |
|---|---|---|---|---|---|---|
| 2020-07-24 | Australia | AU | 10 | 115.2130 | 0.8784 | Oceania |
| 2020-07-24 | Belgium | BE | 10 | 102.6260 | -0.1629 | Europe |
| 2020-07-24 | Canada | CA | 10 | 107.1910 | 0.5000 | Americas |
| 2020-07-24 | China | CN | 10 | 98.2916 | 2.8808 | Asia |
| 2020-07-24 | Denmark | DK | 10 | 107.6710 | -0.3116 | Europe |
| 2020-07-24 | France | FR | 10 | 101.5058 | -0.1446 | Europe |
| 2020-07-24 | Germany | DE | 10 | 104.5834 | -0.4450 | Europe |
| 2020-07-24 | Greece | GR | 10 | 103.9000 | 1.0820 | Europe |
| 2020-07-24 | Hong Kong | HK | 10 | 116.4500 | 0.3952 | Asia |
| 2020-07-24 | India | IN | 10 | 99.5400 | 5.8510 | Asia |
That is all for the starting point data transformation, but there is a fairly substantial ‘missingness’ problem with regards to the data. There are quite a few entries with 0 yield and 0 price:
knitr::kable(TidyData %>%
filter(Yield==0, Price==0) %>%
summarise("Rows with missing Yield and Price"=n()))
| Rows with missing Yield and Price |
|---|
| 1755644 |
I am going to make the bold assumption that no yield and no price probably means the instrument wasn’t available in that nation at that time. I think this is an interesting point to explore in some cases, such as if some bond types disappear for a few days or weeks around certain events, but it is unclear whether the longer periods of missingness are at all interesting.
Perhaps more interesting are the rows where Price is 0 but Yield is present:
knitr::kable(TidyData %>%
filter(!Yield==0, Price==0) %>%
summarise("Rows with Yield but 0 Price"=n()))
| Rows with Yield but 0 Price |
|---|
| 667324 |
This means that about 2.4 million of the 2.75 million rows contains some degree of missingness, which isn’t fantastic. A huge portion of this should simply correlate to unavailable instruments, so it may not impact analysis too badly.
After a little bit of googling about why prices might be 0 with listed yields, my best guess is that these were benchmark figures with no instruments actually being sold, assuming the data is correct. The more likely option is that the data is simply incomplete for one reason or another. The yield figures still seem in line with reality, so I think I am safe to leave these in when considering volatility and other questions, but in a ‘real life’ situation I would definitely be asking for guidance from a conveniently available subject matter expert.
Realistically, though, I can save myself a little bit of (processing) time by creating a filtered dataset omitting entries with 0 price and 0 yield.
TFData <- TidyData %>%
filter(!(Yield==0 & Price==0))
…leaving me with just under 1,000,000 rows to play with.
Slightly out of order, but I’ll handle the two volatility questions, and probably touch on
My starting point with volatility is simply: how do you define it?
First, it seems a bit more realistic to give period volatility (e.g over two weeks, a month, a year, etc) than to give volatility over a 50 year sample, so I will almost certainly break it down somewhat.
Second, and more importantly, what is it? I mean, I know the ‘lay’ definition, but I am not entirely certain in context. Min vs Max values over a period might be interesting or useful, and the simple delta gives an easily compared metric, but it seems overly simplistic. Amount of change over a period is an option: take the absolute of the delta of the day-to-day values and provide the mean, as a sort of proto-standard deviation.
After some reading (https://www.investopedia.com/terms/v/volatility.asp and https://onlinelibrary.wiley.com/doi/pdf/10.1002/9781118267059.app1) it seems like the ‘basic’ volatility measure is simple standard deviation, which is easy enough to accomplish. \(\beta\), which approximates the degree of movement of one security’s values versus a baseline, would be interesting and within my capabilities, but I am uncertain how to define a baseline appropriately - the mean of the sample, I suppose, but the more guesswork I apply in this way the less justifiable the outcome seems to me. For purpose, I am going to stick with simple standard deviation as a measure of volatility.
So, the ‘by nation’ portion of this question needs no thought or explaining, but I need to think about time frames at least a little bit. The first part of the first question is ‘Which countries have the most volatility in yield and which the least?’ - I think this can be reasonably answered with simple SD over the sample, but I’m not really thrilled with this: it feels simplistic, and I question whether anything meaningful could be garnered from SD over 50 years. However, that appears to be the request, so:
knitr::kable(TFData %>%
group_by(Country) %>%
summarise("Volatility"=sd(Yield)) %>%
arrange(desc(Volatility)))
| Country | Volatility |
|---|---|
| Greece | 11.5164471 |
| Spain | 10.3033781 |
| Turkey | 4.4130973 |
| Australia | 3.9365278 |
| Italy | 3.6655040 |
| Portugal | 3.5232880 |
| Korea (the Republic of) | 3.0438087 |
| United Kingdom | 3.0123540 |
| Canada | 2.9580026 |
| France | 2.8854052 |
| South Africa | 2.8824648 |
| United States of America (the) | 2.8006201 |
| Germany | 2.7477103 |
| Ireland | 2.5935859 |
| Denmark | 2.3436969 |
| Belgium | 2.2738368 |
| Indonesia | 2.2597620 |
| Netherlands (the) | 2.2506727 |
| Norway | 2.1616012 |
| Hong Kong | 1.9665356 |
| India | 1.4901352 |
| Thailand | 1.1865921 |
| Singapore | 0.9779651 |
| Japan | 0.8148027 |
| New Zealand | 0.8145194 |
| Poland | 0.7929027 |
| China | 0.7883345 |
| Sweden | 0.7659881 |
| Malaysia | 0.5977965 |
| Taiwan (Province of China) | 0.4360028 |
This is, of course, after removing the zero price/zero yield entries, but retaining the zero price/listed yield entries. To me, this illustrates two things. First, that SD over fifty years can give a relatively meaningful result: Greece and Spain with their semi-regular economic crises are extremely volatile, most ‘western’ or EU nations are clustered, and a group of nations that could be seen as being less economically free are grouped at the ‘low yield volatility’ end of the spectrum, which feels relatively consistent to me.
There are, of course, outliers: Sweden is the most ‘Western’ nation at the bottom end of the volatility spectrum, and I don’t have an instant explanation as to why. Poland’s yields are incredibly stable, which is at odds with my understanding of their post-Soviet economic history, with near hyperinflative conditions in the early 90s. I’ll have at least a quick look at both of these shortly. Frankly, I suspect that the ‘high outlier’ nature of Spain and Greece may be exasperated by the years in which data is available for them - the near inverse to Sweden and Poland, where
I will highlight that in a ‘real world’ situation, I would be exploring these data quality issues, coming up with a few quick charts or visualisations to illustrate the issue as I saw it, and then bringing that to the involved stakeholders to discuss options. In this context, I am going to answer the question ‘as written’ since direction cannot be obtained and to do anything else would simply be guessing, but caveat the answer and highlight the data issues. I will take this same approach when presenting.
To finish answering the first part of this question I can do a quick visualisation showing rates of volatility, giving an ‘at a glance’ comparison of the nations.
vol_full <- TFData %>%
group_by(Country) %>%
summarise("Volatility"=sd(Yield)) %>%
arrange(desc(Volatility)) %>%
ggplot(aes(x=Volatility,y=reorder(Country,Volatility)))+
geom_bar(stat="identity",fill="cornflowerblue")+
xlab("Volatility")+
ylab("")+
ggtitle("Yield Volatility 1970-2020")+
theme_minimal()
vol_full
‘Over Time’ is tied in with the data quality question, so I’ll take my little detour before moving on.
The biggest ‘problem’ with this dataset is one of missingness. This contains a nominal 50 years of data, but how many nations actually have 50 years (or anywhere close to that) of data other than rows of zeros?
TFData %>%
group_by(Country,Date) %>%
summarise(Present=TRUE,.groups="keep") %>%
ggplot(aes(x=Date,y=reorder(Country,desc(Country)), fill=Present))+
scale_fill_manual(values=c("cornflowerblue"))+
geom_tile()+
theme_minimal()+
ylab("")+
ggtitle("Periods with Data Present")+
theme(legend.position="none")
Stripiness due to trading days could be resolved, but takes an unreasonable amount of time to find a ‘good’ solution contextually, so I will leave this as a bit of an eyesore. It is, at least, illustrative of the problem: some nations which look EXTREMELY stable have only very recent data. To give a more complete view:
knitr::kable(TFData %>%
group_by(Country) %>%
summarise("Start Date"=min(Date),.groups="keep") %>%
arrange(`Start Date`))
| Country | Start Date |
|---|---|
| Canada | 1970-01-05 |
| Germany | 1979-12-24 |
| United Kingdom | 1979-12-24 |
| Australia | 1979-12-27 |
| United States of America (the) | 1980-01-02 |
| France | 1986-01-02 |
| Italy | 1988-09-01 |
| Netherlands (the) | 1993-01-04 |
| Belgium | 1993-12-06 |
| Spain | 1994-02-01 |
| Norway | 1994-02-02 |
| Denmark | 1994-02-14 |
| Portugal | 1994-05-04 |
| Korea (the Republic of) | 1995-05-02 |
| South Africa | 1995-05-19 |
| Singapore | 1997-04-18 |
| Hong Kong | 1998-01-09 |
| India | 1998-02-02 |
| Ireland | 1998-02-02 |
| Greece | 1998-12-22 |
| China | 2000-09-21 |
| Thailand | 2001-01-16 |
| Malaysia | 2001-10-15 |
| Indonesia | 2003-05-14 |
| Japan | 2006-02-06 |
| Turkey | 2006-07-03 |
| Sweden | 2012-08-14 |
| Poland | 2016-03-09 |
| New Zealand | 2016-04-26 |
| Taiwan (Province of China) | 2016-04-26 |
Sweden, Taiwan, New Zealand and Poland being especially notable, in that they all miss out on the 2008 financial crisis and aftermath completely. At the other extreme, Greece and Spain show up just in time to demonstrate their modern reputation of economic uncertainty. Most notably, this dataset starts at 1970 but has only Canadian data until 1979. The ‘overall’ figures should, therefore, be taken with a few tonnes of salt.
As an exercise, I will redo the ‘overall’ volatility measure using only the data since 2016-04-26, when the dataset becomes ‘complete’ for the first time:
knitr::kable(TFData %>%
filter(Date >= "2016-04-26") %>%
group_by(Country) %>%
summarise("Volatility"=sd(Yield),.groups="keep") %>%
arrange(desc(Volatility)))
| Country | Volatility |
|---|---|
| Turkey | 4.7266250 |
| Greece | 2.5102988 |
| Italy | 1.3493113 |
| South Africa | 1.1198848 |
| Portugal | 1.1098210 |
| Spain | 0.9891931 |
| Indonesia | 0.8801135 |
| Australia | 0.8401523 |
| United States of America (the) | 0.8400791 |
| Netherlands (the) | 0.8373112 |
| France | 0.8331294 |
| New Zealand | 0.8145194 |
| Poland | 0.7967633 |
| Belgium | 0.7569377 |
| India | 0.7293758 |
| Malaysia | 0.6603802 |
| China | 0.6505065 |
| Thailand | 0.6467702 |
| Germany | 0.6238885 |
| Ireland | 0.6233773 |
| Canada | 0.6141509 |
| United Kingdom | 0.5887705 |
| Singapore | 0.5660003 |
| Hong Kong | 0.5557842 |
| Korea (the Republic of) | 0.4440892 |
| Taiwan (Province of China) | 0.4360028 |
| Sweden | 0.4199198 |
| Norway | 0.4189074 |
| Denmark | 0.3712102 |
| Japan | 0.3614450 |
vol_filt <- TFData %>%
filter(Date >= "2016-04-26") %>%
group_by(Country) %>%
summarise("Volatility"=sd(Yield),.groups="keep") %>%
arrange(desc(Volatility)) %>%
ggplot(aes(x=Volatility,y=reorder(Country,Volatility)))+
geom_bar(stat="identity",fill="cornflowerblue",alpha=0.9)+
xlab("Volatility")+
ylab("")+
ggtitle("Yield Volatility Since 2016")+
theme_minimal()
vol_filt
..or with matching axes for a better side-by-side comparison (if one can excuse the horizontal compression):
grid.arrange(vol_full+xlim(0,12), vol_filt+xlim(0,12),ncol=2)
With a like-for-like comparison data completion-wise, most outliers disappear. Turkey’s yield volatility has barely changed with the date shift (from 2006 previously, now from 2016), implying relatively consistent market conditions (or, consistently inconsistent market conditions). Greece is less of an outlier, but still volatile especially when compared to other EU/EEA nations. The remaining high volatility nations are about as one would expect: Italy, Spain, South Africa, and Portugal being slightly higher than nations which would be viewed as being relatively stable (UK, Canada, Scandinavia, etc.) and similar, with Scandinavia and Japan rounding up the ‘stable’ end of the spectrum.
As illustrated above, especially with regard to the data start dates, change over the whole period is relatively meaningless. Prior to 1980, there is only really data from Canada, which goes up to 7 nations by 1990. For the sake of the exercise, I’m going to try a few graphs starting in 1980 to illustrate yield changes over time. It is worth noting here that there is what appears to be some garbage data in this dataset with regards to Spain in March 1994, where some days show in excess of 800% bond yields, throwing the volatility entirely out of line with reality, especially when broken down by month. Presumably, this is simple garbage data rather than having a real explanation, so I will filter it out.
VolAnum <- TFData %>%
filter(Date >= "1980-01-01", !Yield>=100) %>%
group_by(Country, Region, Year=year(Date)) %>%
summarise(Volatility=sd(Yield),.groups="keep")
VolMonthly <- TFData %>%
filter(Date >= "1980-01-01", !Yield>=100) %>%
group_by(Country, Region, Date=as.Date(paste(format(Date,'%Y-%m'),"-1",sep=""),"%Y-%m-%d")) %>%
summarise(Volatility=sd(Yield),.groups="keep")
VolAnum %>%
ggplot(aes(x=Year,y=Volatility,colour=Country))+
geom_line()+
theme_minimal()+
scale_y_log10()+
ylab("Volatility (log10)")+
xlab("")+
theme(legend.position="none")+
ggtitle("Yearly Volatility by Nation")
VolMonthly %>%
ggplot(aes(x=Date,y=Volatility,colour=Country))+
geom_line()+
theme_minimal()+
scale_y_log10()+
ylab("Volatility (log10)")+
xlab("")+
theme(legend.position="none")+
ggtitle("Monthly Volatility by Nation")
The inclusion of Greece alone makes this far less readable than it has any business being, though there are too many nations included for this to be particularly informative. I have deliberately omitted the legend to make the lines fractionally more readable, on the basis that the graph is so unreadable with so much colour overlap that even with a legend no sense could be made of it.
This being done, I don’t think lines or a candlestick graph will ever be massively informative here unless trying to look at one (or very few) nations, though with the annual breakdown one can start to see some fairly extreme correlations, both temporally and between nations.
I think a heatmap might display the ‘over time’ trends better:
VolAnum %>%
mutate(Outlier=ifelse(Volatility>=3,TRUE,FALSE), Volatility=ifelse(Volatility>=2,2,Volatility)) %>%
ggplot(aes(x=Year,y=reorder(Country,desc(Country)), fill=Volatility))+
geom_tile()+
theme_minimal()+
ylab("")+
theme(legend.position="none")+
ggtitle("Yield Volatility Since 1980")
This obviously compresses the level of volatility in no small way - without some compression the more volatile periods completely overwhelm everything else, and handling them more gracefully would require a decent amount of time expended on one small visual. To answer the question, though: yield volatility evolves massively over time - you can more or less pick out major global and regional events even on a simplistic graphic like this one.
For instance, looking at Europe:
VolAnum %>%
filter(Region=="Europe") %>%
mutate(Outlier=ifelse(Volatility>=3,TRUE,FALSE), Volatility=ifelse(Volatility>=2,2,Volatility)) %>%
ggplot(aes(x=Year,y=reorder(Country,desc(Country)), fill=Volatility))+
geom_tile()+
theme_minimal()+
ylab("")+
theme(legend.position="none")+
ggtitle("European Yield Volatility Since 1980")
…one can easily pick out the 2008 financial crash, the Eurozone debt crisis, and (at least in the case of Italy and Greece) what probably amounts to the beginnings of COVID-19 related volatility. Without looking at summary statistics or mathematically derived correlation coefficients, one can intuit with a reasonable degree of accuracy how linked or unlinked nations are economically - the EU nations are near carbon copies of one another, while Scandinavia remains clearly separated, but not entirely unaffected - and fairly strongly correlated with each other.
Looking at the United States only:
VolMonthly %>%
filter(Country=="United States of America (the)") %>%
ggplot(aes(x=Date,y=Volatility))+
geom_line()+
theme_minimal()+
ylab("Volatility")+
geom_vline(aes(xintercept=as.Date("1990-08-01"),colour="red"))+
geom_vline(aes(xintercept=as.Date("1994-01-01"),colour="red"))+
geom_vline(aes(xintercept=as.Date("2001-09-01"),colour="red"))+
geom_vline(aes(xintercept=as.Date("2003-03-01"),colour="red"))+
geom_vline(aes(xintercept=as.Date("2007-09-01"),colour="red"))+
theme(legend.position="none")+
ggtitle("USA Yield Volatility Since 1980")
..we can, even without the assistance of lines marking dates, pick out the Gulf War, start of NAFTA and localised shocks caused by the Mexican ‘Peso Crisis’, September 11 2001, followed closely by the second Iraq War, as well as the end of Q2 2007 leading into the sub-prime mortgage crisis. The chart ends, of course, with the beginnings of the COVID uncertainty.
Without the annotations, we can look at two quite close trade partners, France and Germany:
VolMonthly %>%
filter(Country %in% c("France","Germany"), Date >= "1990-01-01") %>%
ggplot(aes(x=Date,y=Volatility,colour=Country))+
geom_line()+
theme_minimal()+
ylab("")+
ggtitle("German and French Yield Volatility Since 1990")
..we can see that they track incredibly closely, especailly between 2000 and 2010, when they are almost indistinguishable. Since 2010, the shapes remain consistent, despite the divergence.
I don’t really have much of a ‘gut feeling’ of what the answer here will be, or why, but it is easy enough to look at:
knitr::kable(TFData %>%
group_by(Maturity) %>%
summarise(Volatility=sd(Yield), Sample=n(),.groups="keep") %>%
arrange(desc(Volatility)))
| Maturity | Volatility | Sample |
|---|---|---|
| 03 | 6.4294127 | 122351 |
| 20 | 5.0716333 | 74775 |
| 05 | 4.9837343 | 171897 |
| 25 | 3.8471081 | 12829 |
| 10 | 3.5408829 | 175350 |
| 02 | 3.4206638 | 138432 |
| 01 | 3.2596165 | 75990 |
| 03M | 3.0163429 | 16149 |
| 15 | 2.8719753 | 68090 |
| 30 | 2.5092943 | 74187 |
| 06M | 1.9384928 | 19422 |
| 07 | 1.6609227 | 28069 |
| 01M | 1.4703415 | 4676 |
| 50 | 1.1811947 | 7698 |
| 02M | 0.8622539 | 445 |
| 40 | 0.7026141 | 3201 |
So, the reaction looking at the entire dataset would be ‘it doesn’t’. However, we have tens or hundreds of thousands of examples of some maturities, and hundreds or low thousands of others. This may also change substantially by nation - if Greece is ‘responsible’ for the majority of the 3 or 20 year bonds, they would be a high outlier because of the association with Greece, not because of the Maturity.
Because of the data quality issues discussed before, I’m almost not certain how to approach this properly. This is a case where I would definitely be going back to a stakeholder and talking about the data issues and trying to work out what they were trying to figure out here to know what the most appropriate way to slice the data is.
I guess I can consider which nations use which maturities, but then there is the question of ‘when’. I think the best approach is to use the dataset containing all nations (2016 onward) to get the most consistent picture. I’m also going to exclude maturities with below 5000 samples, as they are likely to be issued by only one or two nations, and therefore not contributing at all to an overall picture of the effects of Maturity on Volatility.
knitr::kable(TFData %>%
filter(Date >= "2016-04-26") %>%
group_by(Maturity) %>%
summarise(Volatility=sd(Yield), Sample=n(),.groups="keep") %>%
filter(Sample >= 5000) %>%
arrange(desc(Volatility)))
| Maturity | Volatility | Sample |
|---|---|---|
| 01 | 4.022928 | 18190 |
| 03 | 3.614053 | 22470 |
| 05 | 3.239926 | 32100 |
| 02 | 3.058203 | 27820 |
| 25 | 3.011663 | 7490 |
| 10 | 3.011514 | 32100 |
| 20 | 2.845421 | 19170 |
| 15 | 2.705239 | 17307 |
| 30 | 2.478833 | 19137 |
| 07 | 1.882220 | 18190 |
I would say that, broadly speaking, the volatility of yield is inversely correlated with Maturity - the longer the Maturity, the lower the volatility. However, this is at odds with what a bit of reading about bonds tells me: it says that longer-term bonds should be MORE volatile than lower term bonds. I am going to assume, therefore, that there is something weird with the data: the sample of nations and distribution of maturities means that something ‘funny’ is going on.
I’ll try to demonstrate this by breaking the above down by nation:
knitr::kable(TFData %>%
filter(Date >= "2016-04-26") %>%
group_by(Country,Maturity) %>%
summarise(Volatility=sd(Yield), Sample=n(),.groups="keep") %>%
arrange(Country))
| Country | Maturity | Volatility | Sample |
|---|---|---|---|
| Australia | 01 | 0.5232951 | 1070 |
| Australia | 02 | 0.5740956 | 1070 |
| Australia | 03 | 0.6187630 | 1070 |
| Australia | 05 | 0.6845195 | 1070 |
| Australia | 07 | 0.7161627 | 1070 |
| Australia | 10 | 0.7023857 | 1070 |
| Australia | 15 | 0.7401314 | 1070 |
| Australia | 20 | 0.1154084 | 187 |
| Australia | 30 | 1.1069387 | 947 |
| Belgium | 01 | 0.0683349 | 1070 |
| Belgium | 02 | 0.0741974 | 1070 |
| Belgium | 03 | 0.1060574 | 1070 |
| Belgium | 05 | 0.2100796 | 1070 |
| Belgium | 07 | 0.2787462 | 1070 |
| Belgium | 10 | 0.3681651 | 1070 |
| Belgium | 15 | 0.4282925 | 1070 |
| Belgium | 20 | 0.4138677 | 1070 |
| Belgium | 30 | 0.4396111 | 1070 |
| Canada | 01 | 0.5814184 | 1070 |
| Canada | 02 | 0.5917615 | 1070 |
| Canada | 03 | 0.5871970 | 1070 |
| Canada | 03M | 0.5302934 | 1070 |
| Canada | 05 | 0.5742670 | 1070 |
| Canada | 06M | 0.5468774 | 1070 |
| Canada | 10 | 0.5126812 | 1070 |
| Canada | 30 | 0.4113583 | 1070 |
| China | 01 | 0.9756378 | 1070 |
| China | 02 | 0.5121565 | 1070 |
| China | 03 | 0.7504183 | 1070 |
| China | 05 | 0.4601611 | 1070 |
| China | 07 | 0.4035091 | 1070 |
| China | 10 | 0.3705094 | 1070 |
| China | 15 | 0.3397111 | 1070 |
| China | 20 | 0.3460935 | 1070 |
| China | 30 | 0.3249406 | 1070 |
| Denmark | 02 | 0.1223675 | 1070 |
| Denmark | 03 | 0.1488948 | 1070 |
| Denmark | 05 | 0.2253679 | 1070 |
| Denmark | 10 | 0.3707640 | 1070 |
| Denmark | 20 | 0.1412321 | 187 |
| France | 01 | 0.0603588 | 1070 |
| France | 02 | 0.0918562 | 1070 |
| France | 03 | 0.1491822 | 1070 |
| France | 05 | 0.2398681 | 1070 |
| France | 07 | 0.2875683 | 1070 |
| France | 10 | 0.3936286 | 1070 |
| France | 15 | 0.4359402 | 1070 |
| France | 20 | 0.4613718 | 1070 |
| France | 25 | 0.4502617 | 1070 |
| France | 30 | 0.4601811 | 1070 |
| France | 50 | 0.4795901 | 1070 |
| Germany | 01 | 0.0805831 | 1070 |
| Germany | 02 | 0.0872463 | 1070 |
| Germany | 03 | 0.1209972 | 1070 |
| Germany | 05 | 0.2086514 | 1070 |
| Germany | 07 | 0.2843686 | 1070 |
| Germany | 10 | 0.3711332 | 1070 |
| Germany | 15 | 0.4005665 | 1070 |
| Germany | 20 | 0.4845173 | 1070 |
| Germany | 25 | 0.4652682 | 1070 |
| Germany | 30 | 0.5348418 | 1070 |
| Greece | 05 | 3.3252171 | 1070 |
| Greece | 10 | 2.2246409 | 1070 |
| Greece | 15 | 2.3196618 | 1070 |
| Greece | 20 | 2.2389929 | 1070 |
| Greece | 25 | 2.1852659 | 1070 |
| Hong Kong | 01 | 0.5940577 | 1070 |
| Hong Kong | 02 | 0.5504825 | 1070 |
| Hong Kong | 03 | 0.5303794 | 1070 |
| Hong Kong | 05 | 0.5267569 | 1070 |
| Hong Kong | 07 | 0.5130899 | 1070 |
| Hong Kong | 10 | 0.4944564 | 1070 |
| India | 01 | 0.5957291 | 1070 |
| India | 02 | 0.8343062 | 1070 |
| India | 05 | 0.6576624 | 1070 |
| India | 07 | 0.5894984 | 1070 |
| India | 10 | 0.6195345 | 1070 |
| India | 20 | 0.4576384 | 1070 |
| India | 25 | 0.4500240 | 1070 |
| Indonesia | 01 | 0.8343626 | 1070 |
| Indonesia | 03 | 0.6450666 | 1070 |
| Indonesia | 05 | 0.6605422 | 1070 |
| Indonesia | 10 | 0.5589509 | 1070 |
| Indonesia | 15 | 0.4970779 | 1070 |
| Indonesia | 20 | 0.4578735 | 1070 |
| Indonesia | 25 | 0.4825076 | 1070 |
| Indonesia | 30 | 0.5026452 | 1070 |
| Ireland | 01 | 0.3905691 | 1070 |
| Ireland | 02 | 0.0783087 | 1070 |
| Ireland | 03 | 0.1024767 | 1070 |
| Ireland | 05 | 0.1749922 | 1070 |
| Ireland | 10 | 0.3783510 | 1070 |
| Ireland | 15 | 0.4190097 | 1070 |
| Italy | 01 | 0.3232781 | 1070 |
| Italy | 02 | 0.4644664 | 1070 |
| Italy | 03 | 0.5513359 | 1070 |
| Italy | 05 | 0.6174200 | 1070 |
| Italy | 07 | 0.6618042 | 1070 |
| Italy | 10 | 0.6441796 | 1070 |
| Italy | 15 | 0.9913044 | 1070 |
| Italy | 20 | 1.2828464 | 1070 |
| Italy | 25 | 0.9922118 | 1070 |
| Italy | 30 | 0.7849119 | 1070 |
| Japan | 01 | 0.0833918 | 1070 |
| Japan | 02 | 0.0565517 | 1070 |
| Japan | 03 | 0.0606268 | 1070 |
| Japan | 05 | 0.0720416 | 1070 |
| Japan | 07 | 0.0995204 | 1070 |
| Japan | 10 | 0.0945444 | 1070 |
| Japan | 15 | 0.1185438 | 1070 |
| Japan | 20 | 0.1632577 | 1070 |
| Japan | 30 | 0.2133646 | 1070 |
| Japan | 40 | 0.2601987 | 1070 |
| Korea (the Republic of) | 01 | 0.3242430 | 1070 |
| Korea (the Republic of) | 02 | 0.3572878 | 1070 |
| Korea (the Republic of) | 03 | 0.3772311 | 1070 |
| Korea (the Republic of) | 05 | 0.4164929 | 1070 |
| Korea (the Republic of) | 10 | 0.4380287 | 1070 |
| Korea (the Republic of) | 20 | 0.4175923 | 1070 |
| Korea (the Republic of) | 30 | 0.4112942 | 1070 |
| Malaysia | 03 | 0.6610088 | 1070 |
| Malaysia | 05 | 0.6618528 | 1070 |
| Malaysia | 07 | 0.3963499 | 1070 |
| Malaysia | 10 | 0.3923085 | 1070 |
| Malaysia | 15 | 0.4533075 | 1070 |
| Malaysia | 20 | 0.4502009 | 1070 |
| Malaysia | 30 | 0.3943682 | 1070 |
| Netherlands (the) | 01 | 0.0705049 | 1070 |
| Netherlands (the) | 02 | 0.0764310 | 1070 |
| Netherlands (the) | 03 | 0.1016692 | 1070 |
| Netherlands (the) | 05 | 0.1700871 | 1070 |
| Netherlands (the) | 07 | 0.2741386 | 1070 |
| Netherlands (the) | 10 | 0.3624083 | 1070 |
| Netherlands (the) | 15 | 1.4593897 | 1070 |
| Netherlands (the) | 20 | 0.3974531 | 1070 |
| Netherlands (the) | 30 | 0.4714278 | 1070 |
| New Zealand | 02 | 0.5654251 | 1070 |
| New Zealand | 05 | 0.6693993 | 1070 |
| New Zealand | 10 | 0.7591092 | 1070 |
| New Zealand | 15 | 1.3508619 | 187 |
| New Zealand | 20 | 1.1156280 | 187 |
| Norway | 02 | 0.3618590 | 1070 |
| Norway | 05 | 0.3345781 | 1070 |
| Norway | 07 | 0.3331710 | 1070 |
| Norway | 10 | 0.3558900 | 1070 |
| Poland | 02 | 0.5858118 | 1070 |
| Poland | 05 | 0.5512618 | 1070 |
| Poland | 10 | 0.6375984 | 1070 |
| Portugal | 02 | 0.3382581 | 1070 |
| Portugal | 03 | 0.4745268 | 1070 |
| Portugal | 05 | 0.7657995 | 1070 |
| Portugal | 07 | 1.1072724 | 1070 |
| Portugal | 10 | 1.1658532 | 1070 |
| Portugal | 20 | 0.2513097 | 187 |
| Singapore | 02 | 0.4721856 | 1070 |
| Singapore | 05 | 0.4319671 | 1070 |
| Singapore | 10 | 0.4181775 | 1070 |
| Singapore | 15 | 0.4416675 | 1070 |
| Singapore | 20 | 0.4220559 | 1070 |
| Singapore | 30 | 0.4289101 | 1070 |
| South Africa | 03 | 0.7521665 | 1070 |
| South Africa | 05 | 0.5390450 | 1070 |
| South Africa | 10 | 0.5152318 | 1070 |
| South Africa | 15 | 0.5424790 | 1070 |
| South Africa | 20 | 0.5799395 | 1070 |
| South Africa | 30 | 0.5535330 | 1070 |
| Spain | 02 | 0.1250253 | 1070 |
| Spain | 03 | 0.1657500 | 1070 |
| Spain | 05 | 0.2426607 | 1070 |
| Spain | 07 | 0.3539316 | 1070 |
| Spain | 10 | 0.4852516 | 1070 |
| Spain | 20 | 0.6372948 | 1070 |
| Spain | 30 | 0.6295438 | 1070 |
| Sweden | 02 | 0.1515283 | 1070 |
| Sweden | 05 | 0.2272169 | 1070 |
| Sweden | 07 | 0.2759229 | 1070 |
| Sweden | 10 | 0.3418934 | 1070 |
| Sweden | 20 | 0.1029414 | 187 |
| Taiwan (Province of China) | 02 | 0.0775297 | 1070 |
| Taiwan (Province of China) | 05 | 0.1280148 | 1070 |
| Taiwan (Province of China) | 10 | 0.2036157 | 1070 |
| Taiwan (Province of China) | 20 | 0.3571261 | 1070 |
| Taiwan (Province of China) | 30 | 0.3740729 | 1070 |
| Thailand | 02 | 0.3643583 | 1070 |
| Thailand | 05 | 0.4276580 | 1070 |
| Thailand | 10 | 0.5281419 | 1070 |
| Thailand | 15 | 0.6254620 | 1070 |
| Turkey | 01 | 5.8353690 | 1070 |
| Turkey | 02 | 5.3098306 | 1070 |
| Turkey | 03 | 4.8250379 | 1070 |
| Turkey | 05 | 3.9614126 | 1070 |
| Turkey | 10 | 3.0810486 | 1070 |
| United Kingdom | 01 | 0.2690984 | 1070 |
| United Kingdom | 02 | 0.2791825 | 1070 |
| United Kingdom | 03 | 0.2717020 | 1070 |
| United Kingdom | 05 | 0.3163380 | 1070 |
| United Kingdom | 07 | 0.3499090 | 1070 |
| United Kingdom | 10 | 0.3860596 | 1070 |
| United Kingdom | 15 | 0.4133474 | 1070 |
| United Kingdom | 20 | 0.4161874 | 1070 |
| United Kingdom | 25 | 0.4076955 | 1070 |
| United Kingdom | 30 | 0.3353617 | 1070 |
| United Kingdom | 50 | 0.3936867 | 1070 |
| United States of America (the) | 01 | 0.7812878 | 1070 |
| United States of America (the) | 01M | 0.7965811 | 1070 |
| United States of America (the) | 02 | 0.7636595 | 1070 |
| United States of America (the) | 02M | 0.8622539 | 445 |
| United States of America (the) | 03 | 0.7461904 | 1070 |
| United States of America (the) | 03M | 0.7902415 | 1070 |
| United States of America (the) | 05 | 0.7133855 | 1070 |
| United States of America (the) | 06M | 0.7859200 | 1070 |
| United States of America (the) | 07 | 0.6812258 | 1070 |
| United States of America (the) | 10 | 0.6597364 | 1070 |
| United States of America (the) | 20 | 0.0989086 | 45 |
| United States of America (the) | 30 | 0.5342911 | 1070 |
TFData %>%
filter(Date >= "2016-04-26", !Price==0) %>%
mutate(Maturity=as.numeric(Maturity)) %>%
na.omit() %>%
group_by(Country,Maturity,Region) %>%
summarise(Volatility=sd(Yield), Sample=n(),.groups="keep") %>%
filter(Sample >=1000) %>%
arrange(Country) %>%
ggplot(aes(x=Maturity,y=Volatility,colour=Country))+
geom_point()+
geom_line(stat="smooth",method = "lm", formula = y ~ x, se = FALSE, size=0.5, alpha=0.2)+
theme_minimal()+
ggtitle("Maturity vs Volatility")
So, it appears that this is lovely example of Simpson’s Paradox, one of my favourite statistical fallacies. It refers to a phenomenon where trends exist in groups of data, but disappear or reverse when the groups are combined. In this case, breaking this down into nations provides the ‘correct’ answer: that higher maturity bonds have higher yield volatility… but not by much, and not universally. In aggregate, the few outlier volatility nations with ‘incorrect’ best fit slopes (Greece in dark green and Turkey in pink) overwhelm the others. It may be fair to say that the less economic certainty there is in a nation, the higher the variance in Yield, and that makes a kind of intuitive sense to me, but this is where my lack of domain knowledge comes to the fore.
Edit 01/05/22: It is worth noting that there is a consistent drop in volatility with regards to the longest maturity instrument available in most jurisdictions. This is far more prominent in those where the longest maturity corresponds with a substantial drop in sample size. I do not know if this implies a lack of interest, or that the instrument is not consistently available as a primary market offering, but either of those could realistically cause a lack of apparent volatility among those products. It is unclear whether these should be omitted or included, but it is worth noting regardless.
Note 02/05/22: I think this section (and the bit above on volatility) have been a bit clumsily written to imply that the relationship between national yields and/or yield volatilities is causative rather than correlative. While it is very likely that it is somewhat causative, this should be read as (e.g in the case of France and Germany having near 98% correlation) that an event which affects one of these economies is likely to have a near-identical effect on the other.
This has been covered a little bit above with regard to yield volatility, but not with strict yield. I would expect the result to be similar: nations with close economic ties are impacted by fluctuations in each other’s economies, as well as by national, regional, and international events at least somewhat similarly. In this way, I would expect Japan to have a stronger correlation with the USA than with China, and Australia to be more closely linked to its Commonwealth peers than its near neighbours.
As a starting point, I’ll look at France and Germany again, as their yield volatility, at least, is closely correlated:
TFData %>%
filter(Country %in% c("France","Germany"), Date >= "1990-01-01") %>%
group_by(Country, Date) %>%
summarise(Yield=mean(Yield),.groups="keep") %>%
ggplot(aes(x=Date,y=Yield,colour=Country))+
geom_line()+
theme_minimal()+
ylab("")+
ggtitle("German and French Yields Since 1990")
I mean, I can do this mathematically as well, but those are nearly identically shaped even when they deviate. Correlation plots and matrices, especially with time series data, is not something I’ve spent a lot of time playing with, so we’ll see if I get to something useful.
What it seems like I need is to get the data from my current ‘tidy’ format into a slightly wider format, like:
| Date | DE_Yield | FR_Yield |
|---|---|---|
..which is easy enough, actually. I think from there I can use correlate() from the corrr package (which I haven’t actually used before..) to generate a correlation matrix. This should be replicable for the complete dataset easily enough.
FYield <- TFData %>%
filter(Country %in% "France", Date >= "1990-01-01") %>%
group_by(Date) %>%
summarise(FR_Yield=mean(Yield),.groups="keep")
GYield <- TFData %>%
filter(Country %in% "Germany", Date >= "1990-01-01") %>%
group_by(Date) %>%
summarise(DE_Yield=mean(Yield),.groups="keep")
GFYield <- full_join(FYield,GYield,by="Date")
knitr::kable(GFYield %>%
ungroup() %>%
select(FR_Yield,DE_Yield) %>%
correlate(quiet=TRUE))
| term | FR_Yield | DE_Yield |
|---|---|---|
| FR_Yield | NA | 0.9771743 |
| DE_Yield | 0.9771743 | NA |
Okay, cool. So, the Pearson’s Correlation Coefficient between the German and French yields are 0.977 - an absurdly closely correlated relationship. Pearson’s Correlation Coefficient provides an estimate of the slope of the relationship between the two factors, so long as they use the same scale. In this case, a one point fluctuation in the yield of one of these nations should provide an 0.977 point fluctuation in the same direction for the other.
I think I can use a reshaping package to do this in one go for the full dataset instead of doing this for 30 nations manually, and then create a little visualisation without too much trouble. Correlation functions handle null values very badly, so I am going to use the 2016 date where all of the nations are present in the data as a starting point. Realistically, I probably could have used melt() and cast() to do the initial cleanup more gracefully rather than use a loop, but oh well.
WideYield <- TFData %>%
filter(Date >= "2016-04-26") %>%
group_by(Date,CountryCode) %>%
summarise(Yield=mean(Yield),.groups="keep") %>%
pivot_wider(id_cols=Date,names_from=CountryCode,values_from=Yield)
YieldCorrs <- WideYield %>%
ungroup() %>%
select(-Date) %>%
correlate(quiet=TRUE) %>%
column_to_rownames(var="term")
knitr::kable(YieldCorrs)
| AU | BE | CA | CN | DE | DK | ES | FR | GB | GR | HK | ID | IE | IN | IT | JP | KR | MY | NL | NO | NZ | PL | PT | SE | SG | TH | TR | TW | US | ZA | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AU | NA | 0.8870312 | 0.3911642 | 0.6417824 | 0.9429857 | 0.7549636 | 0.8811956 | 0.9326234 | 0.8304877 | 0.6461212 | 0.3717444 | 0.1351675 | 0.7671742 | 0.6845899 | 0.3763824 | 0.7142192 | 0.8523321 | 0.8762465 | 0.8259843 | 0.4476247 | 0.9482942 | 0.7841900 | 0.5818301 | 0.6669701 | 0.7600473 | 0.8816584 | 0.0335420 | 0.9044513 | 0.4935790 | -0.4153050 |
| BE | 0.8870312 | NA | 0.4711414 | 0.6059199 | 0.9436634 | 0.8200378 | 0.8801707 | 0.9676223 | 0.8372466 | 0.4500394 | 0.4594577 | 0.2918290 | 0.8257587 | 0.7295274 | 0.6116291 | 0.7871695 | 0.8642044 | 0.8310146 | 0.7181063 | 0.4992508 | 0.7766363 | 0.6178124 | 0.4170695 | 0.8080461 | 0.7369825 | 0.8326243 | 0.2241310 | 0.7611191 | 0.5843807 | -0.2780308 |
| CA | 0.3911642 | 0.4711414 | NA | 0.6154976 | 0.4864123 | 0.1233518 | 0.1328027 | 0.3552329 | 0.6614010 | -0.2704249 | 0.9460893 | 0.1917836 | 0.0927301 | 0.6379730 | 0.4889613 | 0.4009744 | 0.6735412 | 0.5176527 | 0.0602138 | 0.8986183 | 0.2136858 | 0.3128744 | -0.4080698 | 0.3463706 | 0.7831740 | 0.4961926 | 0.6872376 | 0.2257962 | 0.9520739 | -0.2915587 |
| CN | 0.6417824 | 0.6059199 | 0.6154976 | NA | 0.6981484 | 0.4093182 | 0.4248279 | 0.6084594 | 0.6140995 | 0.0849612 | 0.5260864 | -0.2095985 | 0.2500815 | 0.4743627 | 0.2503735 | 0.6490446 | 0.8113094 | 0.6242709 | 0.3045604 | 0.4932474 | 0.5462258 | 0.5482670 | -0.0419342 | 0.5162376 | 0.6198703 | 0.5703326 | 0.2280510 | 0.5868611 | 0.6137073 | -0.4331410 |
| DE | 0.9429857 | 0.9436634 | 0.4864123 | 0.6981484 | NA | 0.8406511 | 0.8547517 | 0.9467046 | 0.8789453 | 0.5216559 | 0.4359689 | 0.0951142 | 0.7784952 | 0.7327440 | 0.4141212 | 0.7597975 | 0.9036164 | 0.8443313 | 0.8111676 | 0.5083664 | 0.8627683 | 0.7128413 | 0.4601613 | 0.7716760 | 0.7656342 | 0.8350607 | 0.0919476 | 0.8258343 | 0.5642985 | -0.4116513 |
| DK | 0.7549636 | 0.8200378 | 0.1233518 | 0.4093182 | 0.8406511 | NA | 0.7951999 | 0.8224070 | 0.6372000 | 0.4882066 | 0.1099711 | 0.0236656 | 0.8182273 | 0.5684379 | 0.2827398 | 0.6423529 | 0.6863332 | 0.5904540 | 0.8183082 | 0.1635314 | 0.7155954 | 0.4547212 | 0.5477329 | 0.8121138 | 0.4313770 | 0.5511314 | -0.1665369 | 0.6116830 | 0.2033908 | -0.2408408 |
| ES | 0.8811956 | 0.8801707 | 0.1328027 | 0.4248279 | 0.8547517 | 0.7951999 | NA | 0.9297748 | 0.7093239 | 0.7362352 | 0.1338097 | 0.2885410 | 0.9013996 | 0.6017404 | 0.4482302 | 0.6549443 | 0.6724485 | 0.7701143 | 0.8479878 | 0.1976253 | 0.8548744 | 0.6541161 | 0.7302855 | 0.6757845 | 0.5505193 | 0.7897732 | -0.0568263 | 0.8249878 | 0.2564142 | -0.1417553 |
| FR | 0.9326234 | 0.9676223 | 0.3552329 | 0.6084594 | 0.9467046 | 0.8224070 | 0.9297748 | NA | 0.8021916 | 0.5795105 | 0.3549778 | 0.2091545 | 0.8381453 | 0.6470834 | 0.4907413 | 0.7874904 | 0.8369840 | 0.8398088 | 0.7710074 | 0.4117826 | 0.8667711 | 0.7231295 | 0.5602838 | 0.7890622 | 0.6915880 | 0.8492330 | 0.0622053 | 0.8675629 | 0.4697444 | -0.3072362 |
| GB | 0.8304877 | 0.8372466 | 0.6614010 | 0.6140995 | 0.8789453 | 0.6372000 | 0.7093239 | 0.8021916 | NA | 0.3939936 | 0.6401657 | 0.2812246 | 0.7055754 | 0.8355429 | 0.4439865 | 0.5098200 | 0.8372463 | 0.8470582 | 0.7157414 | 0.7272158 | 0.7449323 | 0.6843325 | 0.3000069 | 0.6103911 | 0.9102522 | 0.8335089 | 0.3197094 | 0.7058767 | 0.7358572 | -0.3578252 |
| GR | 0.6461212 | 0.4500394 | -0.2704249 | 0.0849612 | 0.5216559 | 0.4882066 | 0.7362352 | 0.5795105 | 0.3939936 | NA | -0.2706130 | 0.1031037 | 0.7025122 | 0.3194746 | -0.0302860 | 0.1965682 | 0.2356404 | 0.5014677 | 0.8049060 | -0.1032516 | 0.7588498 | 0.6021429 | 0.9278676 | 0.2033026 | 0.2730120 | 0.5752330 | -0.3920566 | 0.7058846 | -0.1441591 | -0.2386630 |
| HK | 0.3717444 | 0.4594577 | 0.9460893 | 0.5260864 | 0.4359689 | 0.1099711 | 0.1338097 | 0.3549778 | 0.6401657 | -0.2706130 | NA | 0.3402767 | 0.1179660 | 0.6442931 | 0.5314668 | 0.3794933 | 0.6382622 | 0.5319335 | 0.0433025 | 0.9199561 | 0.2044794 | 0.3247795 | -0.3578746 | 0.3381692 | 0.8018852 | 0.5195183 | 0.7161885 | 0.2273320 | 0.9318048 | -0.1802882 |
| ID | 0.1351675 | 0.2918290 | 0.1917836 | -0.2095985 | 0.0951142 | 0.0236656 | 0.2885410 | 0.2091545 | 0.2812246 | 0.1031037 | 0.3402767 | NA | 0.3608218 | 0.3886371 | 0.6463048 | 0.0723546 | 0.0340262 | 0.3248384 | 0.1396119 | 0.3124090 | 0.0354513 | 0.0333344 | 0.1324365 | 0.0506745 | 0.3472447 | 0.4109520 | 0.5932707 | 0.0714957 | 0.3210029 | 0.3859559 |
| IE | 0.7671742 | 0.8257587 | 0.0927301 | 0.2500815 | 0.7784952 | 0.8182273 | 0.9013996 | 0.8381453 | 0.7055754 | 0.7025122 | 0.1179660 | 0.3608218 | NA | 0.6752144 | 0.3837843 | 0.4356998 | 0.5605316 | 0.6964962 | 0.8598236 | 0.1953743 | 0.7618993 | 0.5465415 | 0.7204500 | 0.6249735 | 0.5383200 | 0.7004328 | -0.0438032 | 0.6858203 | 0.2280601 | -0.1614131 |
| IN | 0.6845899 | 0.7295274 | 0.6379730 | 0.4743627 | 0.7327440 | 0.5684379 | 0.6017404 | 0.6470834 | 0.8355429 | 0.3194746 | 0.6442931 | 0.3886371 | 0.6752144 | NA | 0.5155226 | 0.3793774 | 0.7182927 | 0.7230827 | 0.6288284 | 0.6549026 | 0.5741039 | 0.4664353 | 0.2249105 | 0.4567032 | 0.8314620 | 0.7438017 | 0.4650625 | 0.4614727 | 0.7209303 | -0.2719622 |
| IT | 0.3763824 | 0.6116291 | 0.4889613 | 0.2503735 | 0.4141212 | 0.2827398 | 0.4482302 | 0.4907413 | 0.4439865 | -0.0302860 | 0.5314668 | 0.6463048 | 0.3837843 | 0.5155226 | NA | 0.5493995 | 0.4556152 | 0.4588377 | 0.1548424 | 0.4444365 | 0.1842722 | 0.1213302 | -0.0554801 | 0.4196564 | 0.4792528 | 0.5238000 | 0.6670654 | 0.1988683 | 0.5829604 | 0.1477253 |
| JP | 0.7142192 | 0.7871695 | 0.4009744 | 0.6490446 | 0.7597975 | 0.6423529 | 0.6549443 | 0.7874904 | 0.5098200 | 0.1965682 | 0.3794933 | 0.0723546 | 0.4356998 | 0.3793774 | 0.5493995 | NA | 0.7629559 | 0.5951764 | 0.4212975 | 0.3337002 | 0.5809748 | 0.4204909 | 0.1890006 | 0.7853377 | 0.4632148 | 0.5947643 | 0.1577777 | 0.5953644 | 0.4363745 | -0.1489002 |
| KR | 0.8523321 | 0.8642044 | 0.6735412 | 0.8113094 | 0.9036164 | 0.6863332 | 0.6724485 | 0.8369840 | 0.8372463 | 0.2356404 | 0.6382622 | 0.0340262 | 0.5605316 | 0.7182927 | 0.4556152 | 0.7629559 | NA | 0.8136326 | 0.5686665 | 0.6445228 | 0.7226666 | 0.6523179 | 0.1604115 | 0.7154380 | 0.8199423 | 0.7627421 | 0.2632737 | 0.7204986 | 0.7253244 | -0.3995203 |
| MY | 0.8762465 | 0.8310146 | 0.5176527 | 0.6242709 | 0.8443313 | 0.5904540 | 0.7701143 | 0.8398088 | 0.8470582 | 0.5014677 | 0.5319335 | 0.3248384 | 0.6964962 | 0.7230827 | 0.4588377 | 0.5951764 | 0.8136326 | NA | 0.6856801 | 0.5895543 | 0.8032857 | 0.7432153 | 0.4173800 | 0.5256810 | 0.8492449 | 0.9087671 | 0.2628979 | 0.8153338 | 0.6378713 | -0.3110147 |
| NL | 0.8259843 | 0.7181063 | 0.0602138 | 0.3045604 | 0.8111676 | 0.8183082 | 0.8479878 | 0.7710074 | 0.7157414 | 0.8049060 | 0.0433025 | 0.1396119 | 0.8598236 | 0.6288284 | 0.1548424 | 0.4212975 | 0.5686665 | 0.6856801 | NA | 0.1881831 | 0.8495014 | 0.6248745 | 0.7939812 | 0.5331859 | 0.5363621 | 0.7027485 | -0.2101456 | 0.7373591 | 0.1646139 | -0.3022398 |
| NO | 0.4476247 | 0.4992508 | 0.8986183 | 0.4932474 | 0.5083664 | 0.1635314 | 0.1976253 | 0.4117826 | 0.7272158 | -0.1032516 | 0.9199561 | 0.3124090 | 0.1953743 | 0.6549026 | 0.4444365 | 0.3337002 | 0.6445228 | 0.5895543 | 0.1881831 | NA | 0.3248484 | 0.4708639 | -0.1927064 | 0.3140852 | 0.8721695 | 0.6139877 | 0.6164304 | 0.3592907 | 0.9257208 | -0.3246456 |
| NZ | 0.9482942 | 0.7766363 | 0.2136858 | 0.5462258 | 0.8627683 | 0.7155954 | 0.8548744 | 0.8667711 | 0.7449323 | 0.7588498 | 0.2044794 | 0.0354513 | 0.7618993 | 0.5741039 | 0.1842722 | 0.5809748 | 0.7226666 | 0.8032857 | 0.8495014 | 0.3248484 | NA | 0.8140374 | 0.7097068 | 0.5798993 | 0.6573695 | 0.8061906 | -0.1727084 | 0.9114451 | 0.3069331 | -0.3847664 |
| PL | 0.7841900 | 0.6178124 | 0.3128744 | 0.5482670 | 0.7128413 | 0.4547212 | 0.6541161 | 0.7231295 | 0.6843325 | 0.6021429 | 0.3247795 | 0.0333344 | 0.5465415 | 0.4664353 | 0.1213302 | 0.4204909 | 0.6523179 | 0.7432153 | 0.6248745 | 0.4708639 | 0.8140374 | NA | 0.5424482 | 0.3345324 | 0.6975497 | 0.7908699 | -0.0390739 | 0.8760888 | 0.4110531 | -0.4538014 |
| PT | 0.5818301 | 0.4170695 | -0.4080698 | -0.0419342 | 0.4601613 | 0.5477329 | 0.7302855 | 0.5602838 | 0.3000069 | 0.9278676 | -0.3578746 | 0.1324365 | 0.7204500 | 0.2249105 | -0.0554801 | 0.1890006 | 0.1604115 | 0.4173800 | 0.7939812 | -0.1927064 | 0.7097068 | 0.5424482 | NA | 0.2636413 | 0.1649462 | 0.4864557 | -0.4981141 | 0.6678145 | -0.2776004 | -0.1426094 |
| SE | 0.6669701 | 0.8080461 | 0.3463706 | 0.5162376 | 0.7716760 | 0.8121138 | 0.6757845 | 0.7890622 | 0.6103911 | 0.2033026 | 0.3381692 | 0.0506745 | 0.6249735 | 0.4567032 | 0.4196564 | 0.7853377 | 0.7154380 | 0.5256810 | 0.5331859 | 0.3140852 | 0.5798993 | 0.3345324 | 0.2636413 | NA | 0.4400672 | 0.4590945 | -0.0154838 | 0.5278960 | 0.3511126 | -0.1526391 |
| SG | 0.7600473 | 0.7369825 | 0.7831740 | 0.6198703 | 0.7656342 | 0.4313770 | 0.5505193 | 0.6915880 | 0.9102522 | 0.2730120 | 0.8018852 | 0.3472447 | 0.5383200 | 0.8314620 | 0.4792528 | 0.4632148 | 0.8199423 | 0.8492449 | 0.5363621 | 0.8721695 | 0.6573695 | 0.6975497 | 0.1649462 | 0.4400672 | NA | 0.8647628 | 0.4816576 | 0.6601702 | 0.8688741 | -0.4000656 |
| TH | 0.8816584 | 0.8326243 | 0.4961926 | 0.5703326 | 0.8350607 | 0.5511314 | 0.7897732 | 0.8492330 | 0.8335089 | 0.5752330 | 0.5195183 | 0.4109520 | 0.7004328 | 0.7438017 | 0.5238000 | 0.5947643 | 0.7627421 | 0.9087671 | 0.7027485 | 0.6139877 | 0.8061906 | 0.7908699 | 0.4864557 | 0.4590945 | 0.8647628 | NA | 0.3263941 | 0.8333181 | 0.6561389 | -0.3545035 |
| TR | 0.0335420 | 0.2241310 | 0.6872376 | 0.2280510 | 0.0919476 | -0.1665369 | -0.0568263 | 0.0622053 | 0.3197094 | -0.3920566 | 0.7161885 | 0.5932707 | -0.0438032 | 0.4650625 | 0.6670654 | 0.1577777 | 0.2632737 | 0.2628979 | -0.2101456 | 0.6164304 | -0.1727084 | -0.0390739 | -0.4981141 | -0.0154838 | 0.4816576 | 0.3263941 | NA | -0.1320428 | 0.7274695 | 0.0973620 |
| TW | 0.9044513 | 0.7611191 | 0.2257962 | 0.5868611 | 0.8258343 | 0.6116830 | 0.8249878 | 0.8675629 | 0.7058767 | 0.7058846 | 0.2273320 | 0.0714957 | 0.6858203 | 0.4614727 | 0.1988683 | 0.5953644 | 0.7204986 | 0.8153338 | 0.7373591 | 0.3592907 | 0.9114451 | 0.8760888 | 0.6678145 | 0.5278960 | 0.6601702 | 0.8333181 | -0.1320428 | NA | 0.3479677 | -0.4432692 |
| US | 0.4935790 | 0.5843807 | 0.9520739 | 0.6137073 | 0.5642985 | 0.2033908 | 0.2564142 | 0.4697444 | 0.7358572 | -0.1441591 | 0.9318048 | 0.3210029 | 0.2280601 | 0.7209303 | 0.5829604 | 0.4363745 | 0.7253244 | 0.6378713 | 0.1646139 | 0.9257208 | 0.3069331 | 0.4110531 | -0.2776004 | 0.3511126 | 0.8688741 | 0.6561389 | 0.7274695 | 0.3479677 | NA | -0.3504164 |
| ZA | -0.4153050 | -0.2780308 | -0.2915587 | -0.4331410 | -0.4116513 | -0.2408408 | -0.1417553 | -0.3072362 | -0.3578252 | -0.2386630 | -0.1802882 | 0.3859559 | -0.1614131 | -0.2719622 | 0.1477253 | -0.1489002 | -0.3995203 | -0.3110147 | -0.3022398 | -0.3246456 | -0.3847664 | -0.4538014 | -0.1426094 | -0.1526391 | -0.4000656 | -0.3545035 | 0.0973620 | -0.4432692 | -0.3504164 | NA |
corrplot(as.matrix(YieldCorrs),method="circle",order="hclust",title="Yield Correlations",cl.pos="n",mar=c(0,0,2,0))
Just about understandable, and considering the number of variables, I’m
happy enough given the context of this activity.
This is pretty much as expected, though: The closer the trading partner, the more closely correlated the yields: ‘The West’ can be broadly construed as a cluster, with Australia more closely related to Europe than their near neighbours. South Africa is off on its own, with no other African nations in the sample. Surprisingly, to me, China is more heavily correlated with Korea and Japan than its larger trading partners, and even more closely correlated with Korea than Hong Kong. This is made up for, I guess, by Hong Kong having a closer correlation to the USA, Canada, and slightly strangely, Norway than to China, Korea, and Japan. I suppose this makes sense due to the international financial market centricity of Hong Kong, but it still surprises me.
Least correlated are, as expected, geographically distant nations with no real trading relationship. Greece and Portugal’s yields have no relationship at all with Norway, China, Taiwan, Hong Kong, Canada, or the American yields. South Africa has no real relationship with most of the other sampled nations, even having apparent negative correlations with most of them.
This is a quick answer: it is because Yield already accounts for the relationship between coupon rate and the trading price of the instrument. The price, in and of itself, tells us nothing about the yield or the coupon rate, only the trading price versus the par value of the bond.
The list is near endless.
Coupon Rate might be interesting, and could add value in places. This could almost certainly be worked out as a solid estimate from Yield and Price (probably with a little bit of rounding to smooth it out), but having it in the data would be useful.
The dataset being complete would be, of course, useful, but seems to be a bit of a pipe dream. Incomplete data limits what I can do with this, and severely limit the value of this dataset - unless one were interested in Britain, France, Canada, German, and the USA, this dataset is far more limited than it appears on its face.
National economic information would provide huge amounts of context and potential avenues of exploration: inflation (and inflation estimates), as well as interest rates (and the dates when rate rises are announced) would both add value to this dataset and analysis.
Dates of major national, regional, and international events would be valuable as well: the aforementioned rate rise announcements, economic policy/budget announcements, conflicts such as demonstrated on the United States Yield Variance graph above, election dates (especially contentious ones), smaller scale conflicts, disasters, and anything else that can cause economic upheaval.
The last one there is a bit of a catch-all, but it really could go on for absolutely ages.
RPI <- fread("RPI.csv") %>%
mutate(Date=ym(Year)) %>%
select(-Year) %>%
group_by(Date=as.Date(paste(format(Date,'%Y-%m'),"-1",sep=""),"%Y-%m-%d")) %>%
summarise(RPI=mean(RPI),.groups="keep") %>%
ungroup()
Interest <- fread("BOEInterestRate.csv") %>%
mutate(Date=dmy(`Date Changed`)) %>%
select(-`Date Changed`) %>%
group_by(Date=as.Date(paste(format(Date,'%Y-%m'),"-1",sep=""),"%Y-%m-%d")) %>%
summarise(Rate=mean(Rate),.groups="keep") %>%
ungroup()
UKExpanded <- TFData %>%
filter(CountryCode=="GB") %>%
group_by(Country, Region, Date=as.Date(paste(format(Date,'%Y-%m'),"-1",sep=""),"%Y-%m-%d")) %>%
summarise(Volatility=sd(Yield),.groups="keep",Yield=mean(Yield)) %>%
left_join(Interest, by="Date") %>%
left_join(RPI, by="Date") %>%
arrange(Date)
for(i in 2:nrow(UKExpanded)){
UKExpanded$Rate[i]=ifelse(is.na(UKExpanded$Rate[i]),UKExpanded$Rate[i-1],UKExpanded$Rate[i])
UKExpanded$RPI[i]=ifelse(is.na(UKExpanded$RPI[i]),UKExpanded$RPI[i-1],UKExpanded$RPI[i])
}
Graphing ^
colours <- c("Yield"="plum","Rate"="cornflowerblue","RPI"="gray70")
UKExpanded %>%
filter(!Rate==0 & !RPI==0) %>%
ggplot(aes(x=Date))+
theme_minimal()+
geom_ribbon(aes(y=Yield,ymin=Yield-Volatility,ymax=Yield+Volatility, fill="Yield"),alpha=.15)+
geom_line(aes(y=Yield,colour="Yield"),size=.75)+
geom_line(aes(y=RPI,colour="RPI"),size=.75)+
geom_line(aes(y=Rate,colour="Rate"),size=.75)+
scale_colour_manual(values=colours)+
ylab("")+
xlab("")+
scale_y_continuous(labels = scales::percent_format(scale = 1))+
ggtitle("UK Bond Yield vs Interest Rates and Reported RPI")+
guides(fill="none",colour=guide_legend(""))
TFData %>%
group_by(Country,Date=as.Date(paste(format(Date,'%Y'),"-1-1",sep=""),"%Y-%m-%d")) %>%
summarise(Present=TRUE,.groups="keep") %>%
ggplot(aes(x=Date,y=reorder(Country,desc(Country)), fill=Present,alpha=0.9))+
scale_fill_manual(values=c("cornflowerblue"))+
geom_tile()+
theme_minimal()+
ylab("")+
xlab("")+
ggtitle("Periods with Data Present")+
theme(legend.position="none")
EuroYield <- TFData %>%
filter(Date >= "2016-04-26",Region=="Europe") %>%
group_by(Date,CountryCode) %>%
summarise(Yield=mean(Yield),.groups="keep") %>%
pivot_wider(id_cols=Date,names_from=CountryCode,values_from=Yield) %>%
ungroup() %>%
select(-Date) %>%
correlate(quiet=TRUE) %>%
column_to_rownames(var="term")
corrplot(as.matrix(EuroYield),method="circle",order="hclust",title="European Yield Correlations", cl.pos="n", mar=c(0,0,2,0))
AsiaYield <- TFData %>%
filter(Date >= "2016-04-26",Region=="Asia") %>%
group_by(Date,CountryCode) %>%
summarise(Yield=mean(Yield),.groups="keep") %>%
pivot_wider(id_cols=Date,names_from=CountryCode,values_from=Yield) %>%
ungroup() %>%
select(-Date) %>%
correlate(quiet=TRUE) %>%
column_to_rownames(var="term")
corrplot(as.matrix(AsiaYield),method="circle",order="hclust",title="Asian Yield Correlations", cl.pos="n", mar=c(0,0,2,0))
NATOYield <- TFData %>%
filter(Date >= "2016-04-26",CountryCode %in% c("CA","BE","DK","FR","DE","GR","US","IT","NE","PL","PT","NO","GB","TK")) %>%
group_by(Date,CountryCode) %>%
summarise(Yield=mean(Yield),.groups="keep") %>%
pivot_wider(id_cols=Date,names_from=CountryCode,values_from=Yield) %>%
ungroup() %>%
select(-Date) %>%
correlate(quiet=TRUE) %>%
column_to_rownames(var="term")
corrplot(as.matrix(NATOYield),method="circle",order="hclust",title="NATO Member Yield Correlations",cl.pos="n",mar=c(0,0,2,0))
EuropeYields <- TFData %>%
filter(Region=="Europe") %>%
group_by(Country, Region, Date=as.Date(paste(format(Date,'%Y-%m'),"-1",sep=""),"%Y-%m-%d")) %>%
summarise(Volatility=sd(Yield),.groups="keep",Yield=mean(Yield),Price=mean(Price))
EuropeYields %>%
filter(Date >= "1997-01-01", Date <= "2010-06-01", !Country %in% c("Norway","Poland","Portugal","Netherlands (the)","Ireland")) %>%
ggplot(aes(x=Date,y=Yield,colour=Country))+
geom_line(size=0.75,alpha=0.75)+
geom_vline(xintercept=as.Date("1999-01-01"),alpha=0.5,size=.75,colour="grey70")+
geom_vline(xintercept=as.Date("2008-09-01"),alpha=0.5,size=.75,colour="grey70")+
theme_minimal()+
ylab("")+
xlab("")+
scale_y_continuous(labels = scales::percent_format(scale = 1))+
guides(fill="none",colour=guide_legend(""))+
ggtitle("Select European Bond Yields")
EuropeYields %>%
filter(Date >= "1997-01-01", Date <= "2010-06-01", !Country %in% c("Norway","Poland","Portugal","Netherlands (the)","Ireland")) %>%
ggplot(aes(x=Date,y=Volatility,colour=Country))+
geom_line(size=0.75,alpha=0.75)+
geom_vline(xintercept=as.Date("1999-01-01"),alpha=0.5,size=.75,colour="grey70")+
geom_vline(xintercept=as.Date("2008-09-01"),alpha=0.5,size=.75,colour="grey70")+
theme_minimal()+
ylab("")+
xlab("")+
guides(fill="none",colour=guide_legend(""))+
ggtitle("Select European Bond Volatilities")
AsianYields <- TFData %>%
filter(Region %in% c("Asia")) %>%
group_by(Country, Region, Date=as.Date(paste(format(Date,'%Y-%m'),"-1",sep=""),"%Y-%m-%d")) %>%
summarise(Volatility=sd(Yield),.groups="keep",Yield=mean(Yield))
AsianYields %>%
filter(Date >= "1997-01-01", Date <= "2010-01-01", !Country %in% c("Turkey","Indonesia")) %>%
ggplot(aes(x=Date,y=Yield,colour=Country))+
geom_line(size=0.75,alpha=0.75)+
theme_minimal()+
ylab("")+
xlab("")+
scale_y_continuous(labels = scales::percent_format(scale = 1))+
guides(fill="none",colour=guide_legend(""))+
ggtitle("Select Southeast Asian Bond Yields")
TFData %>%
filter(Date >= "2016-01-01", !Price==0) %>%
mutate(Maturity=as.numeric(Maturity)) %>%
na.omit() %>%
group_by(Country,Maturity,Region) %>%
summarise(Volatility=sd(Yield), Sample=n(),.groups="keep") %>%
filter(Sample >= 1000, !Country=="Turkey") %>%
arrange(Country) %>%
ggplot(aes(x=Maturity,y=Volatility))+
geom_point(aes(colour=Country),alpha=.25)+
geom_line(stat="smooth",method = "lm", formula = y ~ x, se = FALSE, size=0.5,alpha=.5,aes(colour=Country))+
geom_line(stat="smooth",method = "lm", formula = y ~ x, se = FALSE, size=1, alpha=.5)+
theme_minimal()+
ggtitle("Maturity vs Volatility")+
facet_wrap(vars(Region),scales="free")+
theme(legend.position="none")