Portfolio optimization with Bayesian optimization in R

Overview

Portfolio optimization is an important topic in finance, and Bayesian optimization has become more and more popular, partly because of its “Bayesian” tag (not really, but close😉). So it’s not that crazy to put the two together.

In particular, I would like to try two packages that caught my attention but that I haven’t managed to experiment with yet: tidyquant and mlrMBO. tidyquant provides a unified interface to some common packages for quantitative finance and time series analysis in general; it is pretty much what caret does for modelling, plus some “tidy” support. mlrMBO is a nice package for Bayesian optimization; it is, or at least looks that way to me, built on top of the machine learning meta-package mlr. Let’s dig into it.

Actual work

As always, load the packages we need.

library(tidyverse)
library(mlrMBO)
library(tidyquant)

Before actually doing any optimization, let’s first try out tidyquant. It has some convenient functions for getting data online, so let’s download daily prices for three stocks and calculate the monthly returns for each of them.

Ra <- c("A", "AVGO", "LUV") %>%
    tq_get(get  = "stock.prices",
           from = "2010-01-01",
           to   = "2018-12-31") %>%
    group_by(symbol) %>%
    tq_transmute(select     = adjusted, 
                 mutate_fun = periodReturn, 
                 period     = "monthly", 
                 col_rename = "Ra")

The idea of this optimization is simply to find the weights that maximize the Sharpe ratio of the portfolio. Let’s first try some naive weights to see the result. Clearly, the Sharpe ratio doesn’t look so good with these arbitrary weights, which motivates the optimization that follows.

wts <- c(0.5, 0.2, 0.3)

Ra %>%
  tq_portfolio(assets_col = symbol,
               returns_col = Ra,
               weights = wts,
               col_rename = "Ra") %>% 
  tq_performance(Ra, performance_fun = SharpeRatio, Rf = 0.02) %>% pull(1)

## [1] -0.01318146
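
For intuition about the quantity being maximized, here is a minimal base-R sketch of the standard-deviation form of the Sharpe ratio. The helper name sharpe_ratio and the sample returns are made up for illustration; the post itself relies on SharpeRatio from PerformanceAnalytics through tq_performance, which can also report other variants. The sketch assumes rf is a per-period rate on the same scale as the returns, so with monthly returns an Rf of 0.02 would mean 2% per month.

```r
# Hypothetical helper: standard-deviation Sharpe ratio of a return series.
# rf is assumed to be a per-period rate on the same scale as `returns`.
sharpe_ratio <- function(returns, rf = 0) {
  excess <- returns - rf
  mean(excess) / sd(excess)
}

# Made-up monthly returns, for illustration only
monthly_returns <- c(0.01, 0.03, 0.02)

sharpe_ratio(monthly_returns)             # mean 0.02 divided by sd 0.01
sharpe_ratio(monthly_returns, rf = 0.02)  # mean excess return is zero
```

A risk-free rate close to the average return pushes the ratio towards zero or below, which is worth keeping in mind when reading the negative value above.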

To run the optimization, we first need to set up an objective function, which is just a wrapper around the code that produces the Sharpe ratio for given weights. The tricky part is that mlrMBO doesn’t support constraints such as “the weights sum to one”, which can be problematic. A simple workaround is to treat each x as the proportion of whatever is left after the previous allocation: see the implementation below.

obj_func <- smoof::makeSingleObjectiveFunction(
  name = "portf", 
  fn = function(x) {
    wts <- c(x[1], (1-x[1])*x[2], 1-sum(x[1], (1-x[1])*x[2]))
    Ra %>%
      tq_portfolio(assets_col = symbol,
                   returns_col = Ra,
                   weights = wts,
                   col_rename = "Ra") %>% 
      tq_performance(Ra, performance_fun = SharpeRatio, Rf = 0.02) %>%
      pull(1)
  },
  par.set = makeParamSet(
    makeNumericVectorParam("x", len = 2L, lower = 0, upper = 1)
  ),
  minimize = FALSE
)
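
The reparametrization inside the objective can be checked in isolation. The sketch below pulls the same arithmetic into a small helper (to_weights is a hypothetical name, not part of any package) and shows that any point in the unit square maps to non-negative weights that sum to one.

```r
# Hypothetical helper mirroring the transformation inside obj_func:
# x[1] is the share given to asset 1, x[2] the share of the remainder
# given to asset 2, and asset 3 receives whatever is left.
to_weights <- function(x) {
  w1 <- x[1]
  w2 <- (1 - x[1]) * x[2]
  c(w1, w2, 1 - w1 - w2)
}

to_weights(c(0.5, 0.4))   # 0.5, 0.2, 0.3: the naive weights used earlier
to_weights(c(0, 0))       # everything in the third asset
to_weights(c(1, 0.7))     # everything in the first asset

# Any point in the unit square gives non-negative weights that sum to one
set.seed(1)
x <- runif(2)
w <- to_weights(x)
all(w >= 0)
sum(w)
```

One side effect of this design is that the search space is not symmetric in the three assets: a uniform draw of x does not give a uniform draw over the weights, so the initial design covers some portfolios more densely than others.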

Then we need some starting points; the function generateDesign does just that for us. The code below is basically copied from the vignette. Once we have the starting points, we calculate the objective value for each of them.

des = generateDesign(n = 5, par.set = getParamSet(obj_func), fun = lhs::randomLHS)

des$y = apply(des, 1, obj_func)
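
generateDesign is asked for a Latin hypercube sample here. To make the idea concrete, the sketch below builds one by hand in base R: with n points, the range of each coordinate is cut into n equal strata and every stratum is hit exactly once. The function simple_lhs is a hypothetical illustration of the technique, not how lhs::randomLHS is implemented.

```r
# Hypothetical illustration of Latin hypercube sampling on [0, 1]^k
simple_lhs <- function(n, k) {
  design <- matrix(NA_real_, nrow = n, ncol = k,
                   dimnames = list(NULL, paste0("x", seq_len(k))))
  for (j in seq_len(k)) {
    strata <- sample(n)                     # random order of the strata 1..n
    design[, j] <- (strata - runif(n)) / n  # one uniform draw inside each stratum
  }
  as.data.frame(design)
}

set.seed(42)
pts <- simple_lhs(n = 5, k = 2)
pts

# Each column visits every one of the 5 strata exactly once
sapply(pts, function(col) sort(ceiling(col * 5)))
```

Compared with plain uniform sampling, this spreads a handful of starting points more evenly along each axis, which matters when only five evaluations are spent on the initial design.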

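For Bayesian optimization to work, we need something called a surrogate model, which approximates the objective and is much cheaper to evaluate than the objective itself. In mlrMBO, the surrogate is defined as a regression learner from the mlr package. It is set to predict a standard error as well as a mean, because the infill criterion used below needs an uncertainty estimate.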

surr_km <- makeLearner("regr.randomForest", predict.type = "se", se.method = "jackknife")

Before running the optimization, we need to set some control parameters, such as the number of iterations. The most important one is the infill criterion, which proposes the next point to try.

makeMBOControl() %>% 
  setMBOControlTermination(iters = 30) %>% 
  setMBOControlInfill(crit = makeMBOInfillCritEI()) -> control
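
makeMBOInfillCritEI selects the expected improvement criterion. As a rough illustration of what that criterion rewards, the sketch below computes the textbook closed form for a maximization problem from a surrogate's predicted mean and standard error. expected_improvement is a hypothetical helper written for this explanation; it is not mlrMBO's internal code, which may differ in details such as sign conventions.

```r
# Hypothetical helper: textbook expected improvement for maximization.
# mu, se: surrogate mean and standard error at candidate points
# best:   best objective value observed so far
expected_improvement <- function(mu, se, best) {
  gain <- mu - best
  z <- ifelse(se > 0, gain / se, 0)
  ei <- gain * pnorm(z) + se * dnorm(z)
  # With no predictive uncertainty, the improvement is just the positive gain
  ifelse(se > 0, ei, pmax(gain, 0))
}

# Made-up predictions for three candidate points
mu   <- c(0.05, 0.07, 0.02)
se   <- c(0.01, 0.00, 0.10)
best <- 0.06

expected_improvement(mu, se, best)
```

In this made-up example the third candidate has the lowest predicted mean but the largest expected improvement, because its prediction is so uncertain. That trade-off between exploiting good predictions and exploring uncertain regions is what drives the choice of the next point.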

Now the most exciting part: collect what we have defined earlier, put it all together, and run the optimization.
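
The call that creates the run object used further down does not appear in the text. The sketch below shows what such a call can look like with mlrMBO's mbo function, using a toy objective so that it runs without downloading any prices. Every name starting with toy_ is hypothetical, and the commented-out last line is only an assumption about how the pieces defined above would be combined, not the original code.

```r
library(mlrMBO)  # attaches mlr, smoof and ParamHelpers as dependencies

# Toy objective, a hypothetical stand-in for obj_func: maximum at (0.3, 0.7)
toy_func <- smoof::makeSingleObjectiveFunction(
  name = "toy",
  fn = function(x) -sum((x - c(0.3, 0.7))^2),
  par.set = makeParamSet(
    makeNumericVectorParam("x", len = 2L, lower = 0, upper = 1)
  ),
  minimize = FALSE
)

# Initial design and its objective values, built the same way as above
toy_des <- generateDesign(n = 8, par.set = getParamSet(toy_func), fun = lhs::randomLHS)
toy_des$y <- apply(toy_des, 1, toy_func)

# Same kind of surrogate and control settings as above, with fewer iterations
toy_surr <- makeLearner("regr.randomForest", predict.type = "se", se.method = "jackknife")
toy_ctrl <- makeMBOControl()
toy_ctrl <- setMBOControlTermination(toy_ctrl, iters = 10)
toy_ctrl <- setMBOControlInfill(toy_ctrl, crit = makeMBOInfillCritEI())

toy_run <- mbo(toy_func, design = toy_des, learner = toy_surr,
               control = toy_ctrl, show.info = FALSE)

toy_run$x  # list holding the best parameter vector found
toy_run$y  # objective value at that point

# Assumption: with the objects defined in this post, the analogous call would be
# run <- mbo(obj_func, design = des, learner = surr_km, control = control)
```

The result exposes the best point as a list in $x and its objective value in $y, which is the shape the code below expects from run.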

Let’s put the result in a nice tibble. We can see that the Sharpe ratio is much better than the previous naive one.

weights_x <- run$x %>% pluck(1)
res_df <- tibble(
  w1 = weights_x[1],
  w2 = (1-weights_x[1])*weights_x[2],
  w3 = 1 - sum(weights_x[1], (1-weights_x[1])*weights_x[2]),
  shape_ratio = run$y %>% pluck(1)
)

res_df

## # A tibble: 1 x 4
##       w1    w2     w3 shape_ratio
##    <dbl> <dbl>  <dbl>       <dbl>
## 1 0.0288 0.961 0.0107      0.0740