Introduction

This is an investigation of government bond data between 1970 and 2020 - from https://www.kaggle.com/datasets/everget/government-bonds - completed as part of an interview process for the International Capital Market Association in April 2022.

The starting point questions/points to consider are:

..as well as additional/optional points:

I’ll try to answer these broadly in order, and explore any tangents or interesting bits I find or go down as I go. Write-ups like this in very similar ‘conversational’ formats are my preferred way of working for more ‘exploratory’ pieces of work, as they provide documentation, process notes, future avenues of exploration, as well as some notes about failed approaches or similar ‘blind alleys’ all in one. These give me a decently thorough ‘refresher’ if I need to come back to a piece of work weeks or months later, so I am not left wondering what I was trying to accomplish with a piece of code, or why I approached something in a certain way - there is at least a modicum of context.

‘Other Insights and Observations’ is covered more-or-less throughout, and will not have a separate section. For those with little interest in the data reformatting (which will be, I assume, most people), I would recommend skipping straight to the start of my discussion on Volatility.

Data Reformatting

Most of the ‘hard part’ in this exercise is going to be restructuring the data. Currently, there are two tables - prices and yields - each with one row per day (apparently excluding weekends and at least some bank holidays, presumably in whatever jurisdiction this was collated in/for), and one column per bond type per nation, leading to c.13000 rows and 216 columns, as below:

knitr::kable(head(yields))
time AU01 AU02 AU03 AU05 AU07 AU10 AU15 AU20 AU30 BE01 BE02 BE03 BE05 BE07 BE10 BE15 BE20 BE30 CA03M CA06M CA01 CA02 CA03 CA05 CA10 CA30 CN01 CN02 CN03 CN05 CN07 CN10 CN15 CN20 CN30 DK02 DK03 DK05 DK10 DK20 FR01 FR02 FR03 FR05 FR07 FR10 FR15 FR20 FR25 FR30 FR50 DE01 DE02 DE03 DE05 DE07 DE10 DE15 DE20 DE25 DE30 GR05 GR10 GR15 GR20 GR25 HK01 HK02 HK03 HK05 HK07 HK10 IN01 IN02 IN05 IN07 IN10 IN20 IN25 ID01 ID03 ID05 ID10 ID15 ID20 ID25 ID30 IE01 IE02 IE03 IE05 IE10 IE15 IT01 IT02 IT03 IT05 IT07 IT10 IT15 IT20 IT25 IT30 JP01 JP02 JP03 JP05 JP07 JP10 JP15 JP20 JP30 JP40 KR01 KR02 KR03 KR05 KR10 KR20 KR30 MY03 MY05 MY07 MY10 MY15 MY20 MY30 NL01 NL02 NL03 NL05 NL07 NL10 NL15 NL20 NL30 NZ02 NZ05 NZ10 NZ15 NZ20 NO02 NO05 NO07 NO10 PL02 PL05 PL10 PT02 PT03 PT05 PT07 PT10 PT20 SG02 SG05 SG10 SG15 SG20 SG30 ZA03 ZA05 ZA10 ZA15 ZA20 ZA30 ES02 ES03 ES05 ES07 ES10 ES20 ES30 SE02 SE05 SE07 SE10 SE20 TW02 TW05 TW10 TW20 TW30 TH02 TH05 TH10 TH15 TR01 TR02 TR03 TR05 TR10 GB01 GB02 GB03 GB05 GB07 GB10 GB15 GB20 GB25 GB30 GB50 US01M US02M US03M US06M US01 US02 US03 US05 US07 US10 US20 US30
2000-05-05 6.280 6.34 6.460 6.52 0 6.56 6.56 0 0 2.86 4.931 5.114 5.419 0 5.728 5.838 5.657 0 5.58 5.89 6.19 6.251 6.395 6.375 6.285 0 0 0 0 0 0 0 0 0 0 5.442 5.619 5.697 5.848 0 4.592 4.805 5.119 5.289 0 5.566 5.571 5.719 0 5.818 0 4.571 4.770 5.006 5.223 0 5.436 0 5.533 0 5.698 6.238 6.299 6.350 6.391 0 0 0 0 0 0 7.645 0 0 9.699 0 10.432 0 0 0 0 0 0 0 0 0 0 4.433 4.664 0 5.460 5.700 0 0 5.044 5.227 5.529 0 5.762 0 0 0 6.078 0 0 0 0 0 0 0 0 0 0 8.28 8.720000 8.86 9.229999 0 0 0 0 0 0 0 0 0 0 0 4.927 5.142 5.373 0 5.603 4.259 0 5.808 0 0 0 0 0 0 6.49 0 6.34 0 0 0 5.237 0 5.538 0 5.774 0 3.089 3.88 4.388 0 0 0 0 13.85 14.41 0 14.33 0 4.545 5.151 5.413 0 5.717 0 5.933 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6.156 6.267 6.248 5.846 0 5.373 5.012 4.804 0 4.593 0 0 0 5.899 5.910 0 6.784 4.557 6.692 0 6.511 0 6.157
2000-05-08 6.260 6.34 6.415 6.52 0 6.56 6.49 0 0 2.86 4.946 5.115 5.419 0 5.716 5.830 5.657 0 5.58 5.89 6.19 6.251 6.395 6.375 6.285 0 0 0 0 0 0 0 0 0 0 5.431 5.574 5.699 5.798 0 4.602 4.819 5.088 5.299 0 5.533 5.542 5.682 0 5.793 0 4.630 4.788 4.999 5.186 0 5.429 0 5.543 0 5.671 6.184 6.261 6.347 6.375 0 0 0 0 0 0 7.730 0 0 9.807 0 10.505 0 0 0 0 0 0 0 0 0 0 4.500 4.664 0 5.441 5.656 0 0 5.035 5.219 5.506 0 5.719 0 0 0 6.046 0 0 0 0 0 0 0 0 0 0 8.28 8.720000 8.85 9.210000 0 0 0 0 0 0 0 0 0 0 0 4.892 5.119 5.334 0 5.572 4.259 0 5.780 0 0 0 0 0 0 6.45 0 6.30 0 0 0 5.200 0 5.505 0 5.736 0 3.170 3.92 4.515 0 0 0 0 13.85 14.41 0 14.33 0 4.545 5.136 5.414 0 5.691 0 5.913 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6.166 6.301 6.257 5.839 0 5.340 4.974 4.771 0 4.551 0 0 0 5.899 5.910 0 6.784 4.557 6.692 0 6.568 0 6.157
2000-05-09 6.340 6.38 6.500 6.54 0 6.55 6.55 0 0 2.86 4.937 5.104 5.386 0 5.679 5.800 5.657 0 5.58 5.88 6.20 6.316 6.481 6.457 6.363 0 0 0 0 0 0 0 0 0 0 5.393 5.526 5.658 5.751 0 4.569 4.801 5.060 5.267 0 5.496 5.494 5.631 0 5.732 0 4.545 4.753 4.966 5.139 0 5.382 0 5.491 0 5.617 6.184 6.248 6.355 6.386 0 0 0 0 0 0 7.761 0 0 9.786 0 10.514 0 0 0 0 0 0 0 0 0 0 4.480 4.664 0 5.382 5.617 0 0 5.027 5.220 5.494 0 5.693 0 0 0 5.983 0 0 0 0 0 0 0 0 0 0 8.28 8.720000 8.87 9.210000 0 0 0 0 0 0 0 0 0 0 0 4.870 5.099 5.307 0 5.537 4.259 0 5.725 0 0 0 0 0 0 6.50 0 6.30 0 0 0 5.140 0 5.461 0 5.696 0 3.159 3.99 4.519 0 0 0 0 13.85 14.41 0 14.33 0 4.545 5.118 5.387 0 5.657 0 5.880 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6.140 6.321 6.226 5.801 0 5.299 4.931 4.722 0 4.491 0 0 0 5.984 6.031 0 6.872 4.557 6.827 0 6.524 0 6.248
2000-05-10 6.240 6.34 6.390 6.51 0 6.54 6.50 0 0 2.86 4.885 5.040 5.338 0 5.644 5.748 5.657 0 5.67 5.90 6.23 6.306 6.462 6.430 6.323 0 0 0 0 0 0 0 0 0 0 5.393 5.519 5.625 5.732 0 4.546 4.757 5.011 5.233 0 5.472 5.472 5.611 0 5.730 0 4.521 4.736 4.927 5.128 0 5.353 0 5.491 0 5.595 6.172 6.221 6.328 6.381 0 0 0 0 0 0 7.750 0 0 9.819 0 10.535 0 0 0 0 0 0 0 0 0 0 4.485 4.664 0 5.387 5.620 0 0 4.977 5.175 5.455 0 5.667 0 0 0 5.970 0 0 0 0 0 0 0 0 0 0 8.29 8.729999 8.89 9.220000 0 0 0 0 0 0 0 0 0 0 0 4.848 5.062 5.286 0 5.499 4.259 0 5.703 0 0 0 0 0 0 6.49 0 6.21 0 0 0 5.148 0 5.450 0 5.675 0 3.154 3.91 4.465 0 0 0 0 13.85 14.41 0 14.33 0 4.545 5.107 5.359 0 5.619 0 5.823 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6.132 6.291 6.214 5.818 0 5.291 4.915 4.705 0 4.469 0 0 0 6.115 6.140 0 6.851 4.557 6.788 0 6.457 0 6.216
2000-05-11 6.220 6.26 6.374 6.42 0 6.41 6.42 0 0 2.86 4.915 5.094 5.376 0 5.723 5.838 5.657 0 5.66 5.87 6.20 6.272 6.386 6.342 6.230 0 0 0 0 0 0 0 0 0 0 5.512 5.638 5.740 5.825 0 4.600 4.842 5.100 5.201 0 5.532 5.543 5.710 0 5.799 0 4.717 4.754 4.976 5.238 0 5.395 0 5.554 0 5.693 6.148 6.230 6.342 6.381 0 0 0 0 0 0 7.750 0 0 9.809 0 10.527 0 0 0 0 0 0 0 0 0 0 4.509 4.664 0 5.475 5.686 0 0 5.044 5.214 5.491 0 5.758 0 0 0 6.075 0 0 0 0 0 0 0 0 0 0 8.29 8.729999 8.89 9.220000 0 0 0 0 0 0 0 0 0 0 0 4.969 5.171 5.389 0 5.585 4.259 0 5.795 0 0 0 0 0 0 6.51 0 6.27 0 0 0 5.258 0 5.556 0 5.767 0 3.198 3.96 4.487 0 0 0 0 13.85 14.41 0 14.33 0 4.545 5.130 5.402 0 5.668 0 5.885 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6.213 6.289 6.307 5.922 0 5.393 5.027 4.791 0 4.527 0 0 0 6.083 6.104 0 6.804 4.557 6.668 0 6.420 0 6.154
2000-05-12 6.243 6.26 6.374 6.42 0 6.45 6.43 0 0 2.86 5.022 5.155 5.457 0 5.714 5.822 5.657 0 5.65 5.89 6.22 6.304 6.406 6.370 6.227 0 0 0 0 0 0 0 0 0 0 5.529 5.639 5.731 5.805 0 4.631 4.910 5.141 5.352 0 5.550 5.550 5.669 0 5.763 0 4.790 4.863 5.065 5.256 0 5.419 0 5.532 0 5.641 6.214 6.268 6.361 6.395 0 0 0 0 0 0 7.729 0 0 9.809 0 10.520 0 0 0 0 0 0 0 0 0 0 4.534 4.664 0 5.474 5.672 0 0 5.105 5.295 5.551 0 5.739 0 0 0 6.006 0 0 0 0 0 0 0 0 0 0 8.30 8.750000 8.91 9.229999 0 0 0 0 0 0 0 0 0 0 0 5.003 5.210 5.397 0 5.570 4.259 0 5.759 0 0 0 0 0 0 6.49 0 6.27 0 0 0 5.300 0 5.560 0 5.750 0 3.156 3.89 4.487 0 0 0 0 13.85 14.41 0 14.33 0 4.545 5.191 5.444 0 5.664 0 5.862 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6.244 6.378 6.346 5.933 0 5.390 5.035 4.802 0 4.545 0 0 0 6.017 6.019 0 6.835 4.557 6.668 0 6.524 0 6.143

…and so forth. It is worth mentioning at this point that I made the date into something human readable during import, converting from the UNIX epoch millisecond time stamp to an actual date.

Fundamentally, this format will be incredibly difficult to work with: if I want to pull a single nation, I have to find every bond type included for that nation to manually include them, and ditto for the prices table. First step for me will, therefore, be to break this down into a ‘tidy’ or ‘long’ data format with one observation per row, and far fewer columns. My target is to resemble:

Date Nation Region Bond Maturity Price Yield
2000-05-05 AU Pacific 01 nnn 6.28
2000-05-05 AU Pacific 02 nnn 6.34

…and so forth. In this way I can effortlessly filter or compare by nation, bond types, regions, or any combination thereof.

To accomplish this, I will make one fairly ugly loop which will step through each column, split the title into nation and bond type, and generate one row per day in the dataset for that nation and bond type. The same loop, with very minor tweaks, will do the same job on the prices dataset. I’ll then join them on date, nation, and maturity to marry them into a single 3 million-ish row dataset, and then do a bit of cleanup work - I know there are some packages that do easy conversion from ISO2C nation identifiers to ISO3C and/or full names, as well as including region identifiers, so I will use one of those to make the data a bit easier to work with when it comes to visualisations, since I’ll want nicer looking names at that point.

#Create an empty tibble in the correct format to receive the reformatted data

TidyPrices <- tibble("Date"=as.Date("2022-01-01"),"Nation"="","Maturity"="","Price"=0.0)[-1,]

for(i in 2:ncol(prices)){
  
    #The first two characters of the column name are taken as the nation identifier, and the 3..nth as the bond maturity
    nation <- str_sub(names(prices)[i],1,2)
    bond <- str_sub(names(prices)[i],3)
    PTemp <- prices %>% select(time,"Price"=i)
    
    #Generate a tibble and insert the first column data
    TempPrices <- tibble("Date"=prices$time, "Nation"=nation,"Maturity"=as.character(bond)) %>% 
      left_join(PTemp, by=c("Date"="time"))
    
    #Append the data to the 'master' variable
    TidyPrices <- union(TidyPrices,TempPrices)
  }

#The same again, but for Yields
TidyYields <- tibble("Date"=as.Date("2022-01-01"),"Nation"="","Maturity"="","Yield"=0.0)[-1,]

for(i in 2:ncol(yields)){
    nation <- str_sub(names(yields)[i],1,2)
    bond <- str_sub(names(yields)[i],3)
    YTemp <- yields %>% select(time,"Yield"=i)
    TempYields <- tibble("Date"=yields$time, "Nation"=nation,"Maturity"=as.character(bond)) %>% 
      left_join(YTemp, by=c("Date"="time"))
    TidyYields <- union(TidyYields,TempYields)
}

TidyData <- TidyPrices %>% 
  full_join(TidyYields, by=c("Date"="Date","Nation"="Nation","Maturity"="Maturity"))

rm(TidyPrices,TidyYields,i,nation,bond,YTemp,TempYields,PTemp,TempPrices)

R isn’t massively thrilled about those loops: it takes about two minutes to run through, and I suspect there is a ‘nicer’ way to accomplish same. That being said, a few minutes for a pair of 3 million entry tables isn’t too unreasonable.

So. I am going to use the countrycode package to append the country names (ISO English short name, technically, so China rather than the People’s Republic of China, e.g) as well as regions. The regions are a bit awkward: the Kaggle page says ‘6 world regions’, but these six match neither the UN Geoscheme regions, nor the World Bank regions. That consistency doesn’t matter much as long as I take it into account going forward. In this case, I am happiest with simple continent groupings rather than going with any of the regions, though this is fairly arbitrary. These will mainly be important for the ‘correlation within regions’ question, but most of that will come down to ‘it depends how you define the regions’ anyway. For the sake of clarity, I’ll also rename the column ‘Nation’ to ‘CountryCode’, and reorder the columns to be in line with what I would expect, even though this makes absolutely no difference to how I will handle them going forward.

TidyData <- TidyData %>% 
  mutate("Region"=countrycode(Nation,"iso2c","continent"), "Country"=countrycode(TidyData$Nation,"iso2c","iso.name.en")) %>% 
  rename("CountryCode"=Nation)
TidyData <- TidyData[,c(1,7,2,3,4,5,6)]

This leaves me with a dataset that is easy to use, and easy to further transform based on whatever needs I work out going forward. The table, filtered for 10 year instruments arbitrarily (..and to exclude price=0, just so the result below looks ‘complete’), looks like:

knitr::kable(head(TidyData %>% 
                    filter(Maturity=="10", !Price==0) %>% 
                    arrange(desc(Date)),10))
Date Country CountryCode Maturity Price Yield Region
2020-07-24 Australia AU 10 115.2130 0.8784 Oceania
2020-07-24 Belgium BE 10 102.6260 -0.1629 Europe
2020-07-24 Canada CA 10 107.1910 0.5000 Americas
2020-07-24 China CN 10 98.2916 2.8808 Asia
2020-07-24 Denmark DK 10 107.6710 -0.3116 Europe
2020-07-24 France FR 10 101.5058 -0.1446 Europe
2020-07-24 Germany DE 10 104.5834 -0.4450 Europe
2020-07-24 Greece GR 10 103.9000 1.0820 Europe
2020-07-24 Hong Kong HK 10 116.4500 0.3952 Asia
2020-07-24 India IN 10 99.5400 5.8510 Asia

That is all for the starting point data transformation, but there is a fairly substantial ‘missingness’ problem with regards to the data. There are quite a few entries with 0 yield and 0 price:

knitr::kable(TidyData %>% 
  filter(Yield==0, Price==0) %>% 
  summarise("Rows with missing Yield and Price"=n()))
Rows with missing Yield and Price
1755644

I am going to make the bold assumption that no yield and no price probably means the instrument wasn’t available in that nation at that time. I think this is an interesting point to explore in some cases, such as if some bond types disappear for a few days or weeks around certain events, but it is unclear whether the longer periods of missingness are at all interesting.

Perhaps more interesting are the rows where Price is 0 but Yield is present:

knitr::kable(TidyData %>% 
  filter(!Yield==0, Price==0) %>% 
  summarise("Rows with Yield but 0 Price"=n()))
Rows with Yield but 0 Price
667324

This means that about 2.4 million of the 2.75 million rows contains some degree of missingness, which isn’t fantastic. A huge portion of this should simply correlate to unavailable instruments, so it may not impact analysis too badly.

After a little bit of googling about why prices might be 0 with listed yields, my best guess is that these were benchmark figures with no instruments actually being sold, assuming the data is correct. The more likely option is that the data is simply incomplete for one reason or another. The yield figures still seem in line with reality, so I think I am safe to leave these in when considering volatility and other questions, but in a ‘real life’ situation I would definitely be asking for guidance from a conveniently available subject matter expert.

Realistically, though, I can save myself a little bit of (processing) time by creating a filtered dataset omitting entries with 0 price and 0 yield.

TFData <- TidyData %>% 
  filter(!(Yield==0 & Price==0))

…leaving me with just under 1,000,000 rows to play with.

Volatility

Slightly out of order, but I’ll handle the two volatility questions, and probably touch on

My starting point with volatility is simply: how do you define it?

First, it seems a bit more realistic to give period volatility (e.g over two weeks, a month, a year, etc) than to give volatility over a 50 year sample, so I will almost certainly break it down somewhat.

Second, and more importantly, what is it? I mean, I know the ‘lay’ definition, but I am not entirely certain in context. Min vs Max values over a period might be interesting or useful, and the simple delta gives an easily compared metric, but it seems overly simplistic. Amount of change over a period is an option: take the absolute of the delta of the day-to-day values and provide the mean, as a sort of proto-standard deviation.

After some reading (https://www.investopedia.com/terms/v/volatility.asp and https://onlinelibrary.wiley.com/doi/pdf/10.1002/9781118267059.app1) it seems like the ‘basic’ volatility measure is simple standard deviation, which is easy enough to accomplish. \(\beta\), which approximates the degree of movement of one security’s values versus a baseline, would be interesting and within my capabilities, but I am uncertain how to define a baseline appropriately - the mean of the sample, I suppose, but the more guesswork I apply in this way the less justifiable the outcome seems to me. For purpose, I am going to stick with simple standard deviation as a measure of volatility.

Yield Volatility by Nation

So, the ‘by nation’ portion of this question needs no thought or explaining, but I need to think about time frames at least a little bit. The first part of the first question is ‘Which countries have the most volatility in yield and which the least?’ - I think this can be reasonably answered with simple SD over the sample, but I’m not really thrilled with this: it feels simplistic, and I question whether anything meaningful could be garnered from SD over 50 years. However, that appears to be the request, so:

knitr::kable(TFData %>% 
               group_by(Country) %>% 
               summarise("Volatility"=sd(Yield)) %>% 
               arrange(desc(Volatility)))
Country Volatility
Greece 11.5164471
Spain 10.3033781
Turkey 4.4130973
Australia 3.9365278
Italy 3.6655040
Portugal 3.5232880
Korea (the Republic of) 3.0438087
United Kingdom 3.0123540
Canada 2.9580026
France 2.8854052
South Africa 2.8824648
United States of America (the) 2.8006201
Germany 2.7477103
Ireland 2.5935859
Denmark 2.3436969
Belgium 2.2738368
Indonesia 2.2597620
Netherlands (the) 2.2506727
Norway 2.1616012
Hong Kong 1.9665356
India 1.4901352
Thailand 1.1865921
Singapore 0.9779651
Japan 0.8148027
New Zealand 0.8145194
Poland 0.7929027
China 0.7883345
Sweden 0.7659881
Malaysia 0.5977965
Taiwan (Province of China) 0.4360028

This is, of course, after removing the zero price/zero yield entries, but retaining the zero price/listed yield entries. To me, this illustrates two things. First, that SD over fifty years can give a relatively meaningful result: Greece and Spain with their semi-regular economic crises are extremely volatile, most ‘western’ or EU nations are clustered, and a group of nations that could be seen as being less economically free are grouped at the ‘low yield volatility’ end of the spectrum, which feels relatively consistent to me.

There are, of course, outliers: Sweden is the most ‘Western’ nation at the bottom end of the volatility spectrum, and I don’t have an instant explanation as to why. Poland’s yields are incredibly stable, which is at odds with my understanding of their post-Soviet economic history, with near hyperinflative conditions in the early 90s. I’ll have at least a quick look at both of these shortly. Frankly, I suspect that the ‘high outlier’ nature of Spain and Greece may be exasperated by the years in which data is available for them - the near inverse to Sweden and Poland, where

I will highlight that in a ‘real world’ situation, I would be exploring these data quality issues, coming up with a few quick charts or visualisations to illustrate the issue as I saw it, and then bringing that to the involved stakeholders to discuss options. In this context, I am going to answer the question ‘as written’ since direction cannot be obtained and to do anything else would simply be guessing, but caveat the answer and highlight the data issues. I will take this same approach when presenting.

To finish answering the first part of this question I can do a quick visualisation showing rates of volatility, giving an ‘at a glance’ comparison of the nations.

vol_full <- TFData %>% 
  group_by(Country) %>% 
  summarise("Volatility"=sd(Yield)) %>% 
  arrange(desc(Volatility)) %>%
  ggplot(aes(x=Volatility,y=reorder(Country,Volatility)))+
    geom_bar(stat="identity",fill="cornflowerblue")+
    xlab("Volatility")+
    ylab("")+
    ggtitle("Yield Volatility 1970-2020")+
    theme_minimal()

vol_full

‘Over Time’ is tied in with the data quality question, so I’ll take my little detour before moving on.

Sweden, Poland, and Data Quality

The biggest ‘problem’ with this dataset is one of missingness. This contains a nominal 50 years of data, but how many nations actually have 50 years (or anywhere close to that) of data other than rows of zeros?

TFData %>% 
  group_by(Country,Date) %>% 
  summarise(Present=TRUE,.groups="keep") %>%
  ggplot(aes(x=Date,y=reorder(Country,desc(Country)), fill=Present))+
    scale_fill_manual(values=c("cornflowerblue"))+
    geom_tile()+
    theme_minimal()+
    ylab("")+
    ggtitle("Periods with Data Present")+
    theme(legend.position="none")

Stripiness due to trading days could be resolved, but takes an unreasonable amount of time to find a ‘good’ solution contextually, so I will leave this as a bit of an eyesore. It is, at least, illustrative of the problem: some nations which look EXTREMELY stable have only very recent data. To give a more complete view:

knitr::kable(TFData %>%
  group_by(Country) %>%
  summarise("Start Date"=min(Date),.groups="keep") %>%
  arrange(`Start Date`)) 
Country Start Date
Canada 1970-01-05
Germany 1979-12-24
United Kingdom 1979-12-24
Australia 1979-12-27
United States of America (the) 1980-01-02
France 1986-01-02
Italy 1988-09-01
Netherlands (the) 1993-01-04
Belgium 1993-12-06
Spain 1994-02-01
Norway 1994-02-02
Denmark 1994-02-14
Portugal 1994-05-04
Korea (the Republic of) 1995-05-02
South Africa 1995-05-19
Singapore 1997-04-18
Hong Kong 1998-01-09
India 1998-02-02
Ireland 1998-02-02
Greece 1998-12-22
China 2000-09-21
Thailand 2001-01-16
Malaysia 2001-10-15
Indonesia 2003-05-14
Japan 2006-02-06
Turkey 2006-07-03
Sweden 2012-08-14
Poland 2016-03-09
New Zealand 2016-04-26
Taiwan (Province of China) 2016-04-26

Sweden, Taiwan, New Zealand and Poland being especially notable, in that they all miss out on the 2008 financial crisis and aftermath completely. At the other extreme, Greece and Spain show up just in time to demonstrate their modern reputation of economic uncertainty. Most notably, this dataset starts at 1970 but has only Canadian data until 1979. The ‘overall’ figures should, therefore, be taken with a few tonnes of salt.

As an exercise, I will redo the ‘overall’ volatility measure using only the data since 2016-04-26, when the dataset becomes ‘complete’ for the first time:

knitr::kable(TFData %>% 
               filter(Date >= "2016-04-26") %>%
               group_by(Country) %>% 
               summarise("Volatility"=sd(Yield),.groups="keep") %>% 
               arrange(desc(Volatility)))
Country Volatility
Turkey 4.7266250
Greece 2.5102988
Italy 1.3493113
South Africa 1.1198848
Portugal 1.1098210
Spain 0.9891931
Indonesia 0.8801135
Australia 0.8401523
United States of America (the) 0.8400791
Netherlands (the) 0.8373112
France 0.8331294
New Zealand 0.8145194
Poland 0.7967633
Belgium 0.7569377
India 0.7293758
Malaysia 0.6603802
China 0.6505065
Thailand 0.6467702
Germany 0.6238885
Ireland 0.6233773
Canada 0.6141509
United Kingdom 0.5887705
Singapore 0.5660003
Hong Kong 0.5557842
Korea (the Republic of) 0.4440892
Taiwan (Province of China) 0.4360028
Sweden 0.4199198
Norway 0.4189074
Denmark 0.3712102
Japan 0.3614450
vol_filt <- TFData %>% 
  filter(Date >= "2016-04-26") %>%
  group_by(Country) %>% 
  summarise("Volatility"=sd(Yield),.groups="keep") %>% 
  arrange(desc(Volatility)) %>%
  ggplot(aes(x=Volatility,y=reorder(Country,Volatility)))+
    geom_bar(stat="identity",fill="cornflowerblue",alpha=0.9)+
    xlab("Volatility")+
    ylab("")+
    ggtitle("Yield Volatility Since 2016")+
    theme_minimal()

vol_filt

..or with matching axes for a better side-by-side comparison (if one can excuse the horizontal compression):

grid.arrange(vol_full+xlim(0,12), vol_filt+xlim(0,12),ncol=2)

With a like-for-like comparison data completion-wise, most outliers disappear. Turkey’s yield volatility has barely changed with the date shift (from 2006 previously, now from 2016), implying relatively consistent market conditions (or, consistently inconsistent market conditions). Greece is less of an outlier, but still volatile especially when compared to other EU/EEA nations. The remaining high volatility nations are about as one would expect: Italy, Spain, South Africa, and Portugal being slightly higher than nations which would be viewed as being relatively stable (UK, Canada, Scandinavia, etc.) and similar, with Scandinavia and Japan rounding up the ‘stable’ end of the spectrum.

Yield Volatility Over Time

As illustrated above, especially with regard to the data start dates, change over the whole period is relatively meaningless. Prior to 1980, there is only really data from Canada, which goes up to 7 nations by 1990. For the sake of the exercise, I’m going to try a few graphs starting in 1980 to illustrate yield changes over time. It is worth noting here that there is what appears to be some garbage data in this dataset with regards to Spain in March 1994, where some days show in excess of 800% bond yields, throwing the volatility entirely out of line with reality, especially when broken down by month. Presumably, this is simple garbage data rather than having a real explanation, so I will filter it out.

VolAnum <- TFData %>%
  filter(Date >= "1980-01-01", !Yield>=100) %>%
  group_by(Country, Region, Year=year(Date)) %>%
  summarise(Volatility=sd(Yield),.groups="keep")

VolMonthly <- TFData %>%
  filter(Date >= "1980-01-01", !Yield>=100) %>%
  group_by(Country, Region, Date=as.Date(paste(format(Date,'%Y-%m'),"-1",sep=""),"%Y-%m-%d")) %>%
  summarise(Volatility=sd(Yield),.groups="keep")

VolAnum %>%
  ggplot(aes(x=Year,y=Volatility,colour=Country))+
    geom_line()+
    theme_minimal()+
    scale_y_log10()+
    ylab("Volatility (log10)")+
    xlab("")+
    theme(legend.position="none")+
    ggtitle("Yearly Volatility by Nation")

VolMonthly %>%
  ggplot(aes(x=Date,y=Volatility,colour=Country))+
    geom_line()+
    theme_minimal()+
    scale_y_log10()+
    ylab("Volatility (log10)")+
    xlab("")+
    theme(legend.position="none")+
    ggtitle("Monthly Volatility by Nation")

The inclusion of Greece alone makes this far less readable than it has any business being, though there are too many nations included for this to be particularly informative. I have deliberately omitted the legend to make the lines fractionally more readable, on the basis that the graph is so unreadable with so much colour overlap that even with a legend no sense could be made of it.

This being done, I don’t think lines or a candlestick graph will ever be massively informative here unless trying to look at one (or very few) nations, though with the annual breakdown one can start to see some fairly extreme correlations, both temporally and between nations.

I think a heatmap might display the ‘over time’ trends better:

VolAnum %>%
  mutate(Outlier=ifelse(Volatility>=3,TRUE,FALSE), Volatility=ifelse(Volatility>=2,2,Volatility)) %>%
  ggplot(aes(x=Year,y=reorder(Country,desc(Country)), fill=Volatility))+
    geom_tile()+
    theme_minimal()+
    ylab("")+
    theme(legend.position="none")+
    ggtitle("Yield Volatility Since 1980")

This obviously compresses the level of volatility in no small way - without some compression the more volatile periods completely overwhelm everything else, and handling them more gracefully would require a decent amount of time expended on one small visual. To answer the question, though: yield volatility evolves massively over time - you can more or less pick out major global and regional events even on a simplistic graphic like this one.

For instance, looking at Europe:

VolAnum %>%
  filter(Region=="Europe") %>%
  mutate(Outlier=ifelse(Volatility>=3,TRUE,FALSE), Volatility=ifelse(Volatility>=2,2,Volatility)) %>%
  ggplot(aes(x=Year,y=reorder(Country,desc(Country)), fill=Volatility))+
    geom_tile()+
    theme_minimal()+
    ylab("")+
    theme(legend.position="none")+
    ggtitle("European Yield Volatility Since 1980")

…one can easily pick out the 2008 financial crash, the Eurozone debt crisis, and (at least in the case of Italy and Greece) what probably amounts to the beginnings of COVID-19 related volatility. Without looking at summary statistics or mathematically derived correlation coefficients, one can intuit with a reasonable degree of accuracy how linked or unlinked nations are economically - the EU nations are near carbon copies of one another, while Scandinavia remains clearly separated, but not entirely unaffected - and fairly strongly correlated with each other.

Looking at the United States only:

VolMonthly %>%
  filter(Country=="United States of America (the)") %>%
  ggplot(aes(x=Date,y=Volatility))+
    geom_line()+    
    theme_minimal()+
    ylab("Volatility")+
    geom_vline(aes(xintercept=as.Date("1990-08-01"),colour="red"))+
    geom_vline(aes(xintercept=as.Date("1994-01-01"),colour="red"))+
    geom_vline(aes(xintercept=as.Date("2001-09-01"),colour="red"))+
    geom_vline(aes(xintercept=as.Date("2003-03-01"),colour="red"))+
    geom_vline(aes(xintercept=as.Date("2007-09-01"),colour="red"))+
    theme(legend.position="none")+
    ggtitle("USA Yield Volatility Since 1980")

..we can, even without the assistance of lines marking dates, pick out the Gulf War, start of NAFTA and localised shocks caused by the Mexican ‘Peso Crisis’, September 11 2001, followed closely by the second Iraq War, as well as the end of Q2 2007 leading into the sub-prime mortgage crisis. The chart ends, of course, with the beginnings of the COVID uncertainty.

Without the annotations, we can look at two quite close trade partners, France and Germany:

VolMonthly %>%
  filter(Country %in% c("France","Germany"), Date >= "1990-01-01") %>%
  ggplot(aes(x=Date,y=Volatility,colour=Country))+
    geom_line()+    
    theme_minimal()+
    ylab("")+
    ggtitle("German and French Yield Volatility Since 1990")

..we can see that they track incredibly closely, especailly between 2000 and 2010, when they are almost indistinguishable. Since 2010, the shapes remain consistent, despite the divergence.

Maturity and Volatility

I don’t really have much of a ‘gut feeling’ of what the answer here will be, or why, but it is easy enough to look at:

knitr::kable(TFData %>% 
  group_by(Maturity) %>%
  summarise(Volatility=sd(Yield), Sample=n(),.groups="keep") %>% 
  arrange(desc(Volatility)))
Maturity Volatility Sample
03 6.4294127 122351
20 5.0716333 74775
05 4.9837343 171897
25 3.8471081 12829
10 3.5408829 175350
02 3.4206638 138432
01 3.2596165 75990
03M 3.0163429 16149
15 2.8719753 68090
30 2.5092943 74187
06M 1.9384928 19422
07 1.6609227 28069
01M 1.4703415 4676
50 1.1811947 7698
02M 0.8622539 445
40 0.7026141 3201

So, the reaction looking at the entire dataset would be ‘it doesn’t’. However, we have tens or hundreds of thousands of examples of some maturities, and hundreds or low thousands of others. This may also change substantially by nation - if Greece is ‘responsible’ for the majority of the 3 or 20 year bonds, they would be a high outlier because of the association with Greece, not because of the Maturity.

Because of the data quality issues discussed before, I’m almost not certain how to approach this properly. This is a case where I would definitely be going back to a stakeholder and talking about the data issues and trying to work out what they were trying to figure out here to know what the most appropriate way to slice the data is.

I guess I can consider which nations use which maturities, but then there is the question of ‘when’. I think the best approach is to use the dataset containing all nations (2016 onward) to get the most consistent picture. I’m also going to exclude maturities with below 5000 samples, as they are likely to be issued by only one or two nations, and therefore not contributing at all to an overall picture of the effects of Maturity on Volatility.

knitr::kable(TFData %>%
  filter(Date >= "2016-04-26") %>% 
  group_by(Maturity) %>%
  summarise(Volatility=sd(Yield), Sample=n(),.groups="keep") %>% 
  filter(Sample >= 5000) %>%
  arrange(desc(Volatility)))
Maturity Volatility Sample
01 4.022928 18190
03 3.614053 22470
05 3.239926 32100
02 3.058203 27820
25 3.011663 7490
10 3.011514 32100
20 2.845421 19170
15 2.705239 17307
30 2.478833 19137
07 1.882220 18190

I would say that, broadly speaking, the volatility of yield is inversely correlated with Maturity - the longer the Maturity, the lower the volatility. However, this is at odds with what a bit of reading about bonds tells me: it says that longer-term bonds should be MORE volatile than lower term bonds. I am going to assume, therefore, that there is something weird with the data: the sample of nations and distribution of maturities means that something ‘funny’ is going on.

I’ll try to demonstrate this by breaking the above down by nation:

knitr::kable(TFData %>%
  filter(Date >= "2016-04-26") %>% 
  group_by(Country,Maturity) %>%
  summarise(Volatility=sd(Yield), Sample=n(),.groups="keep") %>% 
  arrange(Country))
Country Maturity Volatility Sample
Australia 01 0.5232951 1070
Australia 02 0.5740956 1070
Australia 03 0.6187630 1070
Australia 05 0.6845195 1070
Australia 07 0.7161627 1070
Australia 10 0.7023857 1070
Australia 15 0.7401314 1070
Australia 20 0.1154084 187
Australia 30 1.1069387 947
Belgium 01 0.0683349 1070
Belgium 02 0.0741974 1070
Belgium 03 0.1060574 1070
Belgium 05 0.2100796 1070
Belgium 07 0.2787462 1070
Belgium 10 0.3681651 1070
Belgium 15 0.4282925 1070
Belgium 20 0.4138677 1070
Belgium 30 0.4396111 1070
Canada 01 0.5814184 1070
Canada 02 0.5917615 1070
Canada 03 0.5871970 1070
Canada 03M 0.5302934 1070
Canada 05 0.5742670 1070
Canada 06M 0.5468774 1070
Canada 10 0.5126812 1070
Canada 30 0.4113583 1070
China 01 0.9756378 1070
China 02 0.5121565 1070
China 03 0.7504183 1070
China 05 0.4601611 1070
China 07 0.4035091 1070
China 10 0.3705094 1070
China 15 0.3397111 1070
China 20 0.3460935 1070
China 30 0.3249406 1070
Denmark 02 0.1223675 1070
Denmark 03 0.1488948 1070
Denmark 05 0.2253679 1070
Denmark 10 0.3707640 1070
Denmark 20 0.1412321 187
France 01 0.0603588 1070
France 02 0.0918562 1070
France 03 0.1491822 1070
France 05 0.2398681 1070
France 07 0.2875683 1070
France 10 0.3936286 1070
France 15 0.4359402 1070
France 20 0.4613718 1070
France 25 0.4502617 1070
France 30 0.4601811 1070
France 50 0.4795901 1070
Germany 01 0.0805831 1070
Germany 02 0.0872463 1070
Germany 03 0.1209972 1070
Germany 05 0.2086514 1070
Germany 07 0.2843686 1070
Germany 10 0.3711332 1070
Germany 15 0.4005665 1070
Germany 20 0.4845173 1070
Germany 25 0.4652682 1070
Germany 30 0.5348418 1070
Greece 05 3.3252171 1070
Greece 10 2.2246409 1070
Greece 15 2.3196618 1070
Greece 20 2.2389929 1070
Greece 25 2.1852659 1070
Hong Kong 01 0.5940577 1070
Hong Kong 02 0.5504825 1070
Hong Kong 03 0.5303794 1070
Hong Kong 05 0.5267569 1070
Hong Kong 07 0.5130899 1070
Hong Kong 10 0.4944564 1070
India 01 0.5957291 1070
India 02 0.8343062 1070
India 05 0.6576624 1070
India 07 0.5894984 1070
India 10 0.6195345 1070
India 20 0.4576384 1070
India 25 0.4500240 1070
Indonesia 01 0.8343626 1070
Indonesia 03 0.6450666 1070
Indonesia 05 0.6605422 1070
Indonesia 10 0.5589509 1070
Indonesia 15 0.4970779 1070
Indonesia 20 0.4578735 1070
Indonesia 25 0.4825076 1070
Indonesia 30 0.5026452 1070
Ireland 01 0.3905691 1070
Ireland 02 0.0783087 1070
Ireland 03 0.1024767 1070
Ireland 05 0.1749922 1070
Ireland 10 0.3783510 1070
Ireland 15 0.4190097 1070
Italy 01 0.3232781 1070
Italy 02 0.4644664 1070
Italy 03 0.5513359 1070
Italy 05 0.6174200 1070
Italy 07 0.6618042 1070
Italy 10 0.6441796 1070
Italy 15 0.9913044 1070
Italy 20 1.2828464 1070
Italy 25 0.9922118 1070
Italy 30 0.7849119 1070
Japan 01 0.0833918 1070
Japan 02 0.0565517 1070
Japan 03 0.0606268 1070
Japan 05 0.0720416 1070
Japan 07 0.0995204 1070
Japan 10 0.0945444 1070
Japan 15 0.1185438 1070
Japan 20 0.1632577 1070
Japan 30 0.2133646 1070
Japan 40 0.2601987 1070
Korea (the Republic of) 01 0.3242430 1070
Korea (the Republic of) 02 0.3572878 1070
Korea (the Republic of) 03 0.3772311 1070
Korea (the Republic of) 05 0.4164929 1070
Korea (the Republic of) 10 0.4380287 1070
Korea (the Republic of) 20 0.4175923 1070
Korea (the Republic of) 30 0.4112942 1070
Malaysia 03 0.6610088 1070
Malaysia 05 0.6618528 1070
Malaysia 07 0.3963499 1070
Malaysia 10 0.3923085 1070
Malaysia 15 0.4533075 1070
Malaysia 20 0.4502009 1070
Malaysia 30 0.3943682 1070
Netherlands (the) 01 0.0705049 1070
Netherlands (the) 02 0.0764310 1070
Netherlands (the) 03 0.1016692 1070
Netherlands (the) 05 0.1700871 1070
Netherlands (the) 07 0.2741386 1070
Netherlands (the) 10 0.3624083 1070
Netherlands (the) 15 1.4593897 1070
Netherlands (the) 20 0.3974531 1070
Netherlands (the) 30 0.4714278 1070
New Zealand 02 0.5654251 1070
New Zealand 05 0.6693993 1070
New Zealand 10 0.7591092 1070
New Zealand 15 1.3508619 187
New Zealand 20 1.1156280 187
Norway 02 0.3618590 1070
Norway 05 0.3345781 1070
Norway 07 0.3331710 1070
Norway 10 0.3558900 1070
Poland 02 0.5858118 1070
Poland 05 0.5512618 1070
Poland 10 0.6375984 1070
Portugal 02 0.3382581 1070
Portugal 03 0.4745268 1070
Portugal 05 0.7657995 1070
Portugal 07 1.1072724 1070
Portugal 10 1.1658532 1070
Portugal 20 0.2513097 187
Singapore 02 0.4721856 1070
Singapore 05 0.4319671 1070
Singapore 10 0.4181775 1070
Singapore 15 0.4416675 1070
Singapore 20 0.4220559 1070
Singapore 30 0.4289101 1070
South Africa 03 0.7521665 1070
South Africa 05 0.5390450 1070
South Africa 10 0.5152318 1070
South Africa 15 0.5424790 1070
South Africa 20 0.5799395 1070
South Africa 30 0.5535330 1070
Spain 02 0.1250253 1070
Spain 03 0.1657500 1070
Spain 05 0.2426607 1070
Spain 07 0.3539316 1070
Spain 10 0.4852516 1070
Spain 20 0.6372948 1070
Spain 30 0.6295438 1070
Sweden 02 0.1515283 1070
Sweden 05 0.2272169 1070
Sweden 07 0.2759229 1070
Sweden 10 0.3418934 1070
Sweden 20 0.1029414 187
Taiwan (Province of China) 02 0.0775297 1070
Taiwan (Province of China) 05 0.1280148 1070
Taiwan (Province of China) 10 0.2036157 1070
Taiwan (Province of China) 20 0.3571261 1070
Taiwan (Province of China) 30 0.3740729 1070
Thailand 02 0.3643583 1070
Thailand 05 0.4276580 1070
Thailand 10 0.5281419 1070
Thailand 15 0.6254620 1070
Turkey 01 5.8353690 1070
Turkey 02 5.3098306 1070
Turkey 03 4.8250379 1070
Turkey 05 3.9614126 1070
Turkey 10 3.0810486 1070
United Kingdom 01 0.2690984 1070
United Kingdom 02 0.2791825 1070
United Kingdom 03 0.2717020 1070
United Kingdom 05 0.3163380 1070
United Kingdom 07 0.3499090 1070
United Kingdom 10 0.3860596 1070
United Kingdom 15 0.4133474 1070
United Kingdom 20 0.4161874 1070
United Kingdom 25 0.4076955 1070
United Kingdom 30 0.3353617 1070
United Kingdom 50 0.3936867 1070
United States of America (the) 01 0.7812878 1070
United States of America (the) 01M 0.7965811 1070
United States of America (the) 02 0.7636595 1070
United States of America (the) 02M 0.8622539 445
United States of America (the) 03 0.7461904 1070
United States of America (the) 03M 0.7902415 1070
United States of America (the) 05 0.7133855 1070
United States of America (the) 06M 0.7859200 1070
United States of America (the) 07 0.6812258 1070
United States of America (the) 10 0.6597364 1070
United States of America (the) 20 0.0989086 45
United States of America (the) 30 0.5342911 1070
TFData %>% 
  filter(Date >= "2016-04-26", !Price==0) %>%
  mutate(Maturity=as.numeric(Maturity)) %>%
  na.omit() %>%
  group_by(Country,Maturity,Region) %>%
  summarise(Volatility=sd(Yield), Sample=n(),.groups="keep") %>% 
  filter(Sample >=1000) %>%
  arrange(Country) %>%
  ggplot(aes(x=Maturity,y=Volatility,colour=Country))+
    geom_point()+
    geom_line(stat="smooth",method = "lm", formula = y ~ x, se = FALSE, size=0.5, alpha=0.2)+
    theme_minimal()+
    ggtitle("Maturity vs Volatility")

So, it appears that this is lovely example of Simpson’s Paradox, one of my favourite statistical fallacies. It refers to a phenomenon where trends exist in groups of data, but disappear or reverse when the groups are combined. In this case, breaking this down into nations provides the ‘correct’ answer: that higher maturity bonds have higher yield volatility… but not by much, and not universally. In aggregate, the few outlier volatility nations with ‘incorrect’ best fit slopes (Greece in dark green and Turkey in pink) overwhelm the others. It may be fair to say that the less economic certainty there is in a nation, the higher the variance in Yield, and that makes a kind of intuitive sense to me, but this is where my lack of domain knowledge comes to the fore.

Edit 01/05/22: It is worth noting that there is a consistent drop in volatility with regards to the longest maturity instrument available in most jurisdictions. This is far more prominent in those where the longest maturity corresponds with a substantial drop in sample size. I do not know if this implies a lack of interest, or that the instrument is not consistently available as a primary market offering, but either of those could realistically cause a lack of apparent volatility among those products. It is unclear whether these should be omitted or included, but it is worth noting regardless.

National Yield Correlations

Note 02/05/22: I think this section (and the bit above on volatility) have been a bit clumsily written to imply that the relationship between national yields and/or yield volatilities is causative rather than correlative. While it is very likely that it is somewhat causative, this should be read as (e.g in the case of France and Germany having near 98% correlation) that an event which affects one of these economies is likely to have a near-identical effect on the other.

This has been covered a little bit above with regard to yield volatility, but not with strict yield. I would expect the result to be similar: nations with close economic ties are impacted by fluctuations in each other’s economies, as well as by national, regional, and international events at least somewhat similarly. In this way, I would expect Japan to have a stronger correlation with the USA than with China, and Australia to be more closely linked to its Commonwealth peers than its near neighbours.

As a starting point, I’ll look at France and Germany again, as their yield volatility, at least, is closely correlated:

TFData %>%
  filter(Country %in% c("France","Germany"), Date >= "1990-01-01") %>%
  group_by(Country, Date) %>%
  summarise(Yield=mean(Yield),.groups="keep") %>%
  ggplot(aes(x=Date,y=Yield,colour=Country))+
    geom_line()+    
    theme_minimal()+
    ylab("")+
    ggtitle("German and French Yields Since 1990")

I mean, I can do this mathematically as well, but those are nearly identically shaped even when they deviate. Correlation plots and matrices, especially with time series data, is not something I’ve spent a lot of time playing with, so we’ll see if I get to something useful.

What it seems like I need is to get the data from my current ‘tidy’ format into a slightly wider format, like:

Date DE_Yield FR_Yield

..which is easy enough, actually. I think from there I can use correlate() from the corrr package (which I haven’t actually used before..) to generate a correlation matrix. This should be replicable for the complete dataset easily enough.

FYield <- TFData %>%
  filter(Country %in% "France", Date >= "1990-01-01") %>%
  group_by(Date) %>%
  summarise(FR_Yield=mean(Yield),.groups="keep")

GYield <- TFData %>%
  filter(Country %in% "Germany", Date >= "1990-01-01") %>%
  group_by(Date) %>%
  summarise(DE_Yield=mean(Yield),.groups="keep")

GFYield <- full_join(FYield,GYield,by="Date")

knitr::kable(GFYield %>% 
  ungroup() %>%
  select(FR_Yield,DE_Yield) %>%
  correlate(quiet=TRUE))
term FR_Yield DE_Yield
FR_Yield NA 0.9771743
DE_Yield 0.9771743 NA

Okay, cool. So, the Pearson’s Correlation Coefficient between the German and French yields are 0.977 - an absurdly closely correlated relationship. Pearson’s Correlation Coefficient provides an estimate of the slope of the relationship between the two factors, so long as they use the same scale. In this case, a one point fluctuation in the yield of one of these nations should provide an 0.977 point fluctuation in the same direction for the other.

I think I can use a reshaping package to do this in one go for the full dataset instead of doing this for 30 nations manually, and then create a little visualisation without too much trouble. Correlation functions handle null values very badly, so I am going to use the 2016 date where all of the nations are present in the data as a starting point. Realistically, I probably could have used melt() and cast() to do the initial cleanup more gracefully rather than use a loop, but oh well.

WideYield <- TFData %>%
  filter(Date >= "2016-04-26") %>%
  group_by(Date,CountryCode) %>%
  summarise(Yield=mean(Yield),.groups="keep") %>%
  pivot_wider(id_cols=Date,names_from=CountryCode,values_from=Yield)

YieldCorrs <- WideYield %>%
  ungroup()  %>%
  select(-Date) %>%
  correlate(quiet=TRUE) %>%
  column_to_rownames(var="term")

knitr::kable(YieldCorrs)
AU BE CA CN DE DK ES FR GB GR HK ID IE IN IT JP KR MY NL NO NZ PL PT SE SG TH TR TW US ZA
AU NA 0.8870312 0.3911642 0.6417824 0.9429857 0.7549636 0.8811956 0.9326234 0.8304877 0.6461212 0.3717444 0.1351675 0.7671742 0.6845899 0.3763824 0.7142192 0.8523321 0.8762465 0.8259843 0.4476247 0.9482942 0.7841900 0.5818301 0.6669701 0.7600473 0.8816584 0.0335420 0.9044513 0.4935790 -0.4153050
BE 0.8870312 NA 0.4711414 0.6059199 0.9436634 0.8200378 0.8801707 0.9676223 0.8372466 0.4500394 0.4594577 0.2918290 0.8257587 0.7295274 0.6116291 0.7871695 0.8642044 0.8310146 0.7181063 0.4992508 0.7766363 0.6178124 0.4170695 0.8080461 0.7369825 0.8326243 0.2241310 0.7611191 0.5843807 -0.2780308
CA 0.3911642 0.4711414 NA 0.6154976 0.4864123 0.1233518 0.1328027 0.3552329 0.6614010 -0.2704249 0.9460893 0.1917836 0.0927301 0.6379730 0.4889613 0.4009744 0.6735412 0.5176527 0.0602138 0.8986183 0.2136858 0.3128744 -0.4080698 0.3463706 0.7831740 0.4961926 0.6872376 0.2257962 0.9520739 -0.2915587
CN 0.6417824 0.6059199 0.6154976 NA 0.6981484 0.4093182 0.4248279 0.6084594 0.6140995 0.0849612 0.5260864 -0.2095985 0.2500815 0.4743627 0.2503735 0.6490446 0.8113094 0.6242709 0.3045604 0.4932474 0.5462258 0.5482670 -0.0419342 0.5162376 0.6198703 0.5703326 0.2280510 0.5868611 0.6137073 -0.4331410
DE 0.9429857 0.9436634 0.4864123 0.6981484 NA 0.8406511 0.8547517 0.9467046 0.8789453 0.5216559 0.4359689 0.0951142 0.7784952 0.7327440 0.4141212 0.7597975 0.9036164 0.8443313 0.8111676 0.5083664 0.8627683 0.7128413 0.4601613 0.7716760 0.7656342 0.8350607 0.0919476 0.8258343 0.5642985 -0.4116513
DK 0.7549636 0.8200378 0.1233518 0.4093182 0.8406511 NA 0.7951999 0.8224070 0.6372000 0.4882066 0.1099711 0.0236656 0.8182273 0.5684379 0.2827398 0.6423529 0.6863332 0.5904540 0.8183082 0.1635314 0.7155954 0.4547212 0.5477329 0.8121138 0.4313770 0.5511314 -0.1665369 0.6116830 0.2033908 -0.2408408
ES 0.8811956 0.8801707 0.1328027 0.4248279 0.8547517 0.7951999 NA 0.9297748 0.7093239 0.7362352 0.1338097 0.2885410 0.9013996 0.6017404 0.4482302 0.6549443 0.6724485 0.7701143 0.8479878 0.1976253 0.8548744 0.6541161 0.7302855 0.6757845 0.5505193 0.7897732 -0.0568263 0.8249878 0.2564142 -0.1417553
FR 0.9326234 0.9676223 0.3552329 0.6084594 0.9467046 0.8224070 0.9297748 NA 0.8021916 0.5795105 0.3549778 0.2091545 0.8381453 0.6470834 0.4907413 0.7874904 0.8369840 0.8398088 0.7710074 0.4117826 0.8667711 0.7231295 0.5602838 0.7890622 0.6915880 0.8492330 0.0622053 0.8675629 0.4697444 -0.3072362
GB 0.8304877 0.8372466 0.6614010 0.6140995 0.8789453 0.6372000 0.7093239 0.8021916 NA 0.3939936 0.6401657 0.2812246 0.7055754 0.8355429 0.4439865 0.5098200 0.8372463 0.8470582 0.7157414 0.7272158 0.7449323 0.6843325 0.3000069 0.6103911 0.9102522 0.8335089 0.3197094 0.7058767 0.7358572 -0.3578252
GR 0.6461212 0.4500394 -0.2704249 0.0849612 0.5216559 0.4882066 0.7362352 0.5795105 0.3939936 NA -0.2706130 0.1031037 0.7025122 0.3194746 -0.0302860 0.1965682 0.2356404 0.5014677 0.8049060 -0.1032516 0.7588498 0.6021429 0.9278676 0.2033026 0.2730120 0.5752330 -0.3920566 0.7058846 -0.1441591 -0.2386630
HK 0.3717444 0.4594577 0.9460893 0.5260864 0.4359689 0.1099711 0.1338097 0.3549778 0.6401657 -0.2706130 NA 0.3402767 0.1179660 0.6442931 0.5314668 0.3794933 0.6382622 0.5319335 0.0433025 0.9199561 0.2044794 0.3247795 -0.3578746 0.3381692 0.8018852 0.5195183 0.7161885 0.2273320 0.9318048 -0.1802882
ID 0.1351675 0.2918290 0.1917836 -0.2095985 0.0951142 0.0236656 0.2885410 0.2091545 0.2812246 0.1031037 0.3402767 NA 0.3608218 0.3886371 0.6463048 0.0723546 0.0340262 0.3248384 0.1396119 0.3124090 0.0354513 0.0333344 0.1324365 0.0506745 0.3472447 0.4109520 0.5932707 0.0714957 0.3210029 0.3859559
IE 0.7671742 0.8257587 0.0927301 0.2500815 0.7784952 0.8182273 0.9013996 0.8381453 0.7055754 0.7025122 0.1179660 0.3608218 NA 0.6752144 0.3837843 0.4356998 0.5605316 0.6964962 0.8598236 0.1953743 0.7618993 0.5465415 0.7204500 0.6249735 0.5383200 0.7004328 -0.0438032 0.6858203 0.2280601 -0.1614131
IN 0.6845899 0.7295274 0.6379730 0.4743627 0.7327440 0.5684379 0.6017404 0.6470834 0.8355429 0.3194746 0.6442931 0.3886371 0.6752144 NA 0.5155226 0.3793774 0.7182927 0.7230827 0.6288284 0.6549026 0.5741039 0.4664353 0.2249105 0.4567032 0.8314620 0.7438017 0.4650625 0.4614727 0.7209303 -0.2719622
IT 0.3763824 0.6116291 0.4889613 0.2503735 0.4141212 0.2827398 0.4482302 0.4907413 0.4439865 -0.0302860 0.5314668 0.6463048 0.3837843 0.5155226 NA 0.5493995 0.4556152 0.4588377 0.1548424 0.4444365 0.1842722 0.1213302 -0.0554801 0.4196564 0.4792528 0.5238000 0.6670654 0.1988683 0.5829604 0.1477253
JP 0.7142192 0.7871695 0.4009744 0.6490446 0.7597975 0.6423529 0.6549443 0.7874904 0.5098200 0.1965682 0.3794933 0.0723546 0.4356998 0.3793774 0.5493995 NA 0.7629559 0.5951764 0.4212975 0.3337002 0.5809748 0.4204909 0.1890006 0.7853377 0.4632148 0.5947643 0.1577777 0.5953644 0.4363745 -0.1489002
KR 0.8523321 0.8642044 0.6735412 0.8113094 0.9036164 0.6863332 0.6724485 0.8369840 0.8372463 0.2356404 0.6382622 0.0340262 0.5605316 0.7182927 0.4556152 0.7629559 NA 0.8136326 0.5686665 0.6445228 0.7226666 0.6523179 0.1604115 0.7154380 0.8199423 0.7627421 0.2632737 0.7204986 0.7253244 -0.3995203
MY 0.8762465 0.8310146 0.5176527 0.6242709 0.8443313 0.5904540 0.7701143 0.8398088 0.8470582 0.5014677 0.5319335 0.3248384 0.6964962 0.7230827 0.4588377 0.5951764 0.8136326 NA 0.6856801 0.5895543 0.8032857 0.7432153 0.4173800 0.5256810 0.8492449 0.9087671 0.2628979 0.8153338 0.6378713 -0.3110147
NL 0.8259843 0.7181063 0.0602138 0.3045604 0.8111676 0.8183082 0.8479878 0.7710074 0.7157414 0.8049060 0.0433025 0.1396119 0.8598236 0.6288284 0.1548424 0.4212975 0.5686665 0.6856801 NA 0.1881831 0.8495014 0.6248745 0.7939812 0.5331859 0.5363621 0.7027485 -0.2101456 0.7373591 0.1646139 -0.3022398
NO 0.4476247 0.4992508 0.8986183 0.4932474 0.5083664 0.1635314 0.1976253 0.4117826 0.7272158 -0.1032516 0.9199561 0.3124090 0.1953743 0.6549026 0.4444365 0.3337002 0.6445228 0.5895543 0.1881831 NA 0.3248484 0.4708639 -0.1927064 0.3140852 0.8721695 0.6139877 0.6164304 0.3592907 0.9257208 -0.3246456
NZ 0.9482942 0.7766363 0.2136858 0.5462258 0.8627683 0.7155954 0.8548744 0.8667711 0.7449323 0.7588498 0.2044794 0.0354513 0.7618993 0.5741039 0.1842722 0.5809748 0.7226666 0.8032857 0.8495014 0.3248484 NA 0.8140374 0.7097068 0.5798993 0.6573695 0.8061906 -0.1727084 0.9114451 0.3069331 -0.3847664
PL 0.7841900 0.6178124 0.3128744 0.5482670 0.7128413 0.4547212 0.6541161 0.7231295 0.6843325 0.6021429 0.3247795 0.0333344 0.5465415 0.4664353 0.1213302 0.4204909 0.6523179 0.7432153 0.6248745 0.4708639 0.8140374 NA 0.5424482 0.3345324 0.6975497 0.7908699 -0.0390739 0.8760888 0.4110531 -0.4538014
PT 0.5818301 0.4170695 -0.4080698 -0.0419342 0.4601613 0.5477329 0.7302855 0.5602838 0.3000069 0.9278676 -0.3578746 0.1324365 0.7204500 0.2249105 -0.0554801 0.1890006 0.1604115 0.4173800 0.7939812 -0.1927064 0.7097068 0.5424482 NA 0.2636413 0.1649462 0.4864557 -0.4981141 0.6678145 -0.2776004 -0.1426094
SE 0.6669701 0.8080461 0.3463706 0.5162376 0.7716760 0.8121138 0.6757845 0.7890622 0.6103911 0.2033026 0.3381692 0.0506745 0.6249735 0.4567032 0.4196564 0.7853377 0.7154380 0.5256810 0.5331859 0.3140852 0.5798993 0.3345324 0.2636413 NA 0.4400672 0.4590945 -0.0154838 0.5278960 0.3511126 -0.1526391
SG 0.7600473 0.7369825 0.7831740 0.6198703 0.7656342 0.4313770 0.5505193 0.6915880 0.9102522 0.2730120 0.8018852 0.3472447 0.5383200 0.8314620 0.4792528 0.4632148 0.8199423 0.8492449 0.5363621 0.8721695 0.6573695 0.6975497 0.1649462 0.4400672 NA 0.8647628 0.4816576 0.6601702 0.8688741 -0.4000656
TH 0.8816584 0.8326243 0.4961926 0.5703326 0.8350607 0.5511314 0.7897732 0.8492330 0.8335089 0.5752330 0.5195183 0.4109520 0.7004328 0.7438017 0.5238000 0.5947643 0.7627421 0.9087671 0.7027485 0.6139877 0.8061906 0.7908699 0.4864557 0.4590945 0.8647628 NA 0.3263941 0.8333181 0.6561389 -0.3545035
TR 0.0335420 0.2241310 0.6872376 0.2280510 0.0919476 -0.1665369 -0.0568263 0.0622053 0.3197094 -0.3920566 0.7161885 0.5932707 -0.0438032 0.4650625 0.6670654 0.1577777 0.2632737 0.2628979 -0.2101456 0.6164304 -0.1727084 -0.0390739 -0.4981141 -0.0154838 0.4816576 0.3263941 NA -0.1320428 0.7274695 0.0973620
TW 0.9044513 0.7611191 0.2257962 0.5868611 0.8258343 0.6116830 0.8249878 0.8675629 0.7058767 0.7058846 0.2273320 0.0714957 0.6858203 0.4614727 0.1988683 0.5953644 0.7204986 0.8153338 0.7373591 0.3592907 0.9114451 0.8760888 0.6678145 0.5278960 0.6601702 0.8333181 -0.1320428 NA 0.3479677 -0.4432692
US 0.4935790 0.5843807 0.9520739 0.6137073 0.5642985 0.2033908 0.2564142 0.4697444 0.7358572 -0.1441591 0.9318048 0.3210029 0.2280601 0.7209303 0.5829604 0.4363745 0.7253244 0.6378713 0.1646139 0.9257208 0.3069331 0.4110531 -0.2776004 0.3511126 0.8688741 0.6561389 0.7274695 0.3479677 NA -0.3504164
ZA -0.4153050 -0.2780308 -0.2915587 -0.4331410 -0.4116513 -0.2408408 -0.1417553 -0.3072362 -0.3578252 -0.2386630 -0.1802882 0.3859559 -0.1614131 -0.2719622 0.1477253 -0.1489002 -0.3995203 -0.3110147 -0.3022398 -0.3246456 -0.3847664 -0.4538014 -0.1426094 -0.1526391 -0.4000656 -0.3545035 0.0973620 -0.4432692 -0.3504164 NA
corrplot(as.matrix(YieldCorrs),method="circle",order="hclust",title="Yield Correlations",cl.pos="n",mar=c(0,0,2,0))

Just about understandable, and considering the number of variables, I’m happy enough given the context of this activity.

This is pretty much as expected, though: The closer the trading partner, the more closely correlated the yields: ‘The West’ can be broadly construed as a cluster, with Australia more closely related to Europe than their near neighbours. South Africa is off on its own, with no other African nations in the sample. Surprisingly, to me, China is more heavily correlated with Korea and Japan than its larger trading partners, and even more closely correlated with Korea than Hong Kong. This is made up for, I guess, by Hong Kong having a closer correlation to the USA, Canada, and slightly strangely, Norway than to China, Korea, and Japan. I suppose this makes sense due to the international financial market centricity of Hong Kong, but it still surprises me.

Least correlated are, as expected, geographically distant nations with no real trading relationship. Greece and Portugal’s yields have no relationship at all with Norway, China, Taiwan, Hong Kong, Canada, or the American yields. South Africa has no real relationship with most of the other sampled nations, even having apparent negative correlations with most of them.

Bond Prices vs Yields

This is a quick answer: it is because Yield already accounts for the relationship between coupon rate and the trading price of the instrument. The price, in and of itself, tells us nothing about the yield or the coupon rate, only the trading price versus the par value of the bond.

Additional Useful Data

The list is near endless.

Coupon Rate might be interesting, and could add value in places. This could almost certainly be worked out as a solid estimate from Yield and Price (probably with a little bit of rounding to smooth it out), but having it in the data would be useful.

The dataset being complete would be, of course, useful, but seems to be a bit of a pipe dream. Incomplete data limits what I can do with this, and severely limit the value of this dataset - unless one were interested in Britain, France, Canada, German, and the USA, this dataset is far more limited than it appears on its face.

National economic information would provide huge amounts of context and potential avenues of exploration: inflation (and inflation estimates), as well as interest rates (and the dates when rate rises are announced) would both add value to this dataset and analysis.

Dates of major national, regional, and international events would be valuable as well: the aforementioned rate rise announcements, economic policy/budget announcements, conflicts such as demonstrated on the United States Yield Variance graph above, election dates (especially contentious ones), smaller scale conflicts, disasters, and anything else that can cause economic upheaval.

The last one there is a bit of a catch-all, but it really could go on for absolutely ages.

RPI <- fread("RPI.csv") %>% 
  mutate(Date=ym(Year)) %>% 
  select(-Year) %>% 
  group_by(Date=as.Date(paste(format(Date,'%Y-%m'),"-1",sep=""),"%Y-%m-%d")) %>%
  summarise(RPI=mean(RPI),.groups="keep") %>%
  ungroup()

Interest <- fread("BOEInterestRate.csv") %>%
  mutate(Date=dmy(`Date Changed`)) %>% 
  select(-`Date Changed`) %>%
  group_by(Date=as.Date(paste(format(Date,'%Y-%m'),"-1",sep=""),"%Y-%m-%d")) %>%
  summarise(Rate=mean(Rate),.groups="keep") %>%
  ungroup()

UKExpanded <- TFData %>% 
  filter(CountryCode=="GB") %>% 
  group_by(Country, Region, Date=as.Date(paste(format(Date,'%Y-%m'),"-1",sep=""),"%Y-%m-%d")) %>%
  summarise(Volatility=sd(Yield),.groups="keep",Yield=mean(Yield)) %>%
  left_join(Interest, by="Date") %>% 
  left_join(RPI, by="Date") %>% 
  arrange(Date)


for(i in 2:nrow(UKExpanded)){
  UKExpanded$Rate[i]=ifelse(is.na(UKExpanded$Rate[i]),UKExpanded$Rate[i-1],UKExpanded$Rate[i])
  UKExpanded$RPI[i]=ifelse(is.na(UKExpanded$RPI[i]),UKExpanded$RPI[i-1],UKExpanded$RPI[i])
}

Graphing ^

colours <- c("Yield"="plum","Rate"="cornflowerblue","RPI"="gray70")
UKExpanded %>% 
  filter(!Rate==0 & !RPI==0) %>%
  ggplot(aes(x=Date))+ 
    theme_minimal()+
    geom_ribbon(aes(y=Yield,ymin=Yield-Volatility,ymax=Yield+Volatility, fill="Yield"),alpha=.15)+
    geom_line(aes(y=Yield,colour="Yield"),size=.75)+
    geom_line(aes(y=RPI,colour="RPI"),size=.75)+
    geom_line(aes(y=Rate,colour="Rate"),size=.75)+
    scale_colour_manual(values=colours)+  
    ylab("")+
    xlab("")+
    scale_y_continuous(labels = scales::percent_format(scale = 1))+
    ggtitle("UK Bond Yield vs Interest Rates and Reported RPI")+
    guides(fill="none",colour=guide_legend(""))

TFData %>% 
  group_by(Country,Date=as.Date(paste(format(Date,'%Y'),"-1-1",sep=""),"%Y-%m-%d")) %>%
  summarise(Present=TRUE,.groups="keep") %>%
  ggplot(aes(x=Date,y=reorder(Country,desc(Country)), fill=Present,alpha=0.9))+
    scale_fill_manual(values=c("cornflowerblue"))+
    geom_tile()+
    theme_minimal()+
    ylab("")+
    xlab("")+
    ggtitle("Periods with Data Present")+
    theme(legend.position="none")

EuroYield <- TFData %>%
  filter(Date >= "2016-04-26",Region=="Europe") %>%
  group_by(Date,CountryCode) %>%
  summarise(Yield=mean(Yield),.groups="keep") %>%
  pivot_wider(id_cols=Date,names_from=CountryCode,values_from=Yield) %>%
  ungroup() %>%
  select(-Date) %>%
  correlate(quiet=TRUE) %>%
  column_to_rownames(var="term")

corrplot(as.matrix(EuroYield),method="circle",order="hclust",title="European Yield Correlations", cl.pos="n", mar=c(0,0,2,0))

AsiaYield <- TFData %>%
  filter(Date >= "2016-04-26",Region=="Asia") %>%
  group_by(Date,CountryCode) %>%
  summarise(Yield=mean(Yield),.groups="keep") %>%
  pivot_wider(id_cols=Date,names_from=CountryCode,values_from=Yield) %>%
  ungroup() %>%
  select(-Date) %>%
  correlate(quiet=TRUE) %>%
  column_to_rownames(var="term")

corrplot(as.matrix(AsiaYield),method="circle",order="hclust",title="Asian Yield Correlations", cl.pos="n", mar=c(0,0,2,0))

NATOYield <- TFData %>%
  filter(Date >= "2016-04-26",CountryCode %in% c("CA","BE","DK","FR","DE","GR","US","IT","NE","PL","PT","NO","GB","TK")) %>%
  group_by(Date,CountryCode) %>%
  summarise(Yield=mean(Yield),.groups="keep") %>%
  pivot_wider(id_cols=Date,names_from=CountryCode,values_from=Yield) %>%
  ungroup() %>%
  select(-Date) %>%
  correlate(quiet=TRUE) %>%
  column_to_rownames(var="term")

corrplot(as.matrix(NATOYield),method="circle",order="hclust",title="NATO Member Yield Correlations",cl.pos="n",mar=c(0,0,2,0))

EuropeYields <- TFData %>% 
  filter(Region=="Europe") %>% 
  group_by(Country, Region, Date=as.Date(paste(format(Date,'%Y-%m'),"-1",sep=""),"%Y-%m-%d")) %>%
  summarise(Volatility=sd(Yield),.groups="keep",Yield=mean(Yield),Price=mean(Price))

EuropeYields %>%
  filter(Date >= "1997-01-01", Date <= "2010-06-01", !Country %in% c("Norway","Poland","Portugal","Netherlands (the)","Ireland")) %>%
  ggplot(aes(x=Date,y=Yield,colour=Country))+
    geom_line(size=0.75,alpha=0.75)+
    geom_vline(xintercept=as.Date("1999-01-01"),alpha=0.5,size=.75,colour="grey70")+
    geom_vline(xintercept=as.Date("2008-09-01"),alpha=0.5,size=.75,colour="grey70")+
    theme_minimal()+
    ylab("")+
    xlab("")+
    scale_y_continuous(labels = scales::percent_format(scale = 1))+
    guides(fill="none",colour=guide_legend(""))+
    ggtitle("Select European Bond Yields")

EuropeYields %>%
  filter(Date >= "1997-01-01", Date <= "2010-06-01", !Country %in% c("Norway","Poland","Portugal","Netherlands (the)","Ireland")) %>%
  ggplot(aes(x=Date,y=Volatility,colour=Country))+
    geom_line(size=0.75,alpha=0.75)+
    geom_vline(xintercept=as.Date("1999-01-01"),alpha=0.5,size=.75,colour="grey70")+
    geom_vline(xintercept=as.Date("2008-09-01"),alpha=0.5,size=.75,colour="grey70")+
    theme_minimal()+
    ylab("")+
    xlab("")+
    guides(fill="none",colour=guide_legend(""))+
    ggtitle("Select European Bond Volatilities")

AsianYields <- TFData %>% 
  filter(Region %in% c("Asia")) %>% 
  group_by(Country, Region, Date=as.Date(paste(format(Date,'%Y-%m'),"-1",sep=""),"%Y-%m-%d")) %>%
  summarise(Volatility=sd(Yield),.groups="keep",Yield=mean(Yield))

AsianYields %>%
  filter(Date >= "1997-01-01", Date <= "2010-01-01", !Country %in% c("Turkey","Indonesia")) %>%
  ggplot(aes(x=Date,y=Yield,colour=Country))+
    geom_line(size=0.75,alpha=0.75)+
    theme_minimal()+
    ylab("")+
    xlab("")+
    scale_y_continuous(labels = scales::percent_format(scale = 1))+
    guides(fill="none",colour=guide_legend(""))+
    ggtitle("Select Southeast Asian Bond Yields")

TFData %>%
  filter(Date >= "2016-01-01", !Price==0) %>%
  mutate(Maturity=as.numeric(Maturity)) %>%
  na.omit() %>%
  group_by(Country,Maturity,Region) %>%
  summarise(Volatility=sd(Yield), Sample=n(),.groups="keep") %>% 
  filter(Sample >= 1000, !Country=="Turkey") %>%
  arrange(Country) %>%
  ggplot(aes(x=Maturity,y=Volatility))+
    geom_point(aes(colour=Country),alpha=.25)+
    geom_line(stat="smooth",method = "lm", formula = y ~ x, se = FALSE, size=0.5,alpha=.5,aes(colour=Country))+
    geom_line(stat="smooth",method = "lm", formula = y ~ x, se = FALSE, size=1, alpha=.5)+
    theme_minimal()+
    ggtitle("Maturity vs Volatility")+
    facet_wrap(vars(Region),scales="free")+
    theme(legend.position="none")