1 Overview

The following notes cover multiple discussions related to portfolio asset allocation and its application to sector ETFs. The goal is to provide the reader with some of the science and the art behind tactical asset allocation. These notes also include code snippets written in R, along with visualizations, to help the reader better comprehend the theory and practice behind the discussed topics.

2 An Empirical Perspective on Mean-Variance Portfolios

2.1 Introduction

The conventional wisdom in finance implies that investors should make rational decisions in which risk is compensated by reward: in order to achieve a greater reward, investors need to bear more risk. This is the ideal view in financial economic thought and the foundation of Modern Portfolio Theory (MPT). In the following, I would like to address this view in the presence of uncertainty, which is an inevitable ingredient of day-to-day decision making.

Let us think about an investor who is interested in allocating his wealth among a set of assets. As a rational investor, he should choose an allocation that yields the best reward for the level of risk he is willing to take. By reward, I refer to how much he expects to earn on his portfolio decision, whereas risk denotes how volatile this prospect will be. Formally, the former is measured by the expected return of his portfolio, while the latter is proxied by the standard deviation of his portfolio return. This paradigm is known as the mean-variance (MV) model, pioneered by Harry Markowitz in the early 1950s.
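Put formally, for a vector of portfolio weights \(X\), the portfolio expected return and volatility are given by

\[\begin{equation} \mu_{p}=X^{\prime}\mu, \qquad \sigma_{p}=\sqrt{X^{\prime}\Sigma X} \end{equation}\]

where \(\mu\) and \(\Sigma\) denote the mean vector and the covariance matrix of the asset returns (the same notation used in the utility function introduced below).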

One of the underlying assumptions of the MV model is that the investor possesses full information about the underlying assets. Specifically, it assumes that he knows the model’s inputs without any uncertainty. In reality, however, as decision makers we can only form our views about these assets using historical data or through some speculation about the future of the underlying assets (or the sectors/market). This, inevitably, induces what is called estimation error into the asset allocation problem.

It has been well established in the recent MPT literature that estimation error impairs the performance of MV optimal portfolios. It appears that investors may be better off allocating their wealth equally among the underlying assets rather than trying to solve for an optimal allocation. This practice is known as the naive approach, since it does not incorporate any information about the underlying assets. Nonetheless, a more recent strand of the literature debates whether this naive strategy still outperforms MV portfolios once estimation error is properly accounted for in the portfolio optimization problem.

In this article, I will test the above implications using monthly asset returns for 9 sector ETFs dating back to Jan 1999. I exclude one sector, which is the XLRE ETF, due to limited data availability. I will demonstrate the impact of estimation error on constructing MV optimal portfolios and then refer the reader to possible remedies used in the literature. In doing so, I hope to address the importance of portfolio optimization in the presence of uncertainty and the significance of taking into account estimation error.

2.2 ETF Data

I use the quantmod package to download data on the 9 sector ETFs and the lubridate package to deal with date formats (highly recommended!):

library(lubridate)
library(quantmod)


v <- c("XLE","XLU","XLK","XLB","XLP","XLY","XLI","XLV","XLF")
t1 <- "1996-01-01" # a starting date
# download each ticker and keep only the adjusted closing prices
P.list <- lapply(v,function(x) get(getSymbols(x,from = t1)) )
P <- lapply(P.list,function(x) x[,grep("Adjusted",names(x))])
P <- Reduce(function(...) merge(...),P  )
names(P) <- v
# keep end-of-month prices and compute simple monthly returns
P <- apply.monthly(P,function(x) x[nrow(x),] )
R <- na.omit(P/lag(P)-1)
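A couple of quick sanity checks (output omitted here) confirm that R holds one column of monthly returns per ticker and covers the intended period:

dim(R)            # number of months by number of ETFs
head(round(R,4))  # first few monthly returns
range(date(R))    # first and last observation dates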

2.3 In and Out of Sample

I split the data into two parts: in-sample and out-of-sample. The former represents the window upon which the allocation decision is based, whereas the latter denotes the realization period.

in_index <- 1:floor(nrow(R)/2)
out_index <- (1:nrow(R))[!1:nrow(R) %in% in_index]
R_in <- R[in_index,]
R_out <- R[out_index,]

For the in-sample, we have

range(date(R_in))
## [1] "1999-01-29" "2009-02-27"

and for the out-of-sample

range(date(R_out))
## [1] "2009-03-31" "2019-04-02"

The parameters of the two periods differ significantly, especially since the first period contains the dot-com bubble and part of the recent financial crisis. Looking at the difference between the in-sample and out-of-sample parameters below, it is evident that estimation error in the mean returns is more severe than is the case for the second moments, i.e. the variances and covariances of asset returns:

M1 <- apply(R_in,2,mean)
M2 <- apply(R_out,2,mean)


S1 <- var(R_in);
S2 <- var(R_out)


plot(M1 ~ M2 , ylim = range(c(M1,S1)) , xlim = range(c(M2,S2)), pch = 20, xlab = "Out-of-Sample",ylab = "In-Sample")
points(diag(S1)~diag(S2), col = 2, pch =15)
points(S1[!row(S1) == col(S1)] ~  S2[!row(S2) == col(S2) ], col = 3, pch =18)
abline(a=0,b = 1)
legend("bottom" ,c("Mean","Variance","Covariance") , col = 1:3, pch = c(20,15,18))

This small piece of evidence is one of the main motivations behind the strand of portfolio practice that focuses on the global minimum variance (henceforth GMV) portfolio. The GMV portfolio requires only information about the volatilities and covariances of the asset returns, unlike efficient MV portfolios, which also require knowledge of the mean returns. I will get back to this issue later on.

2.4 MV Optimal Portfolios

There are a number of R packages available to perform portfolio optimization. Nevertheless, I will solve for optimal MV portfolios using a function that I designed myself, coded on top of a base R constrained optimization function. In doing so, I hope to provide the reader with some exposure to the underlying science behind the practice of portfolio optimization.

2.4.1 Objective Function

The objective function is defined in terms of expected utility (EU). A decision maker chooses an optimal allocation that maximizes the EU of his terminal wealth. Hence, such a function takes into account two main components: the portfolio mean return and the volatility of the portfolio return. This represents the reward-risk trade-off, in which the EU increases with the former but decreases with the latter, an environment where risk is non-preferable, i.e. investors are risk-averse. Put formally, the expected utility of a risk-averse investor is given by

\[\begin{equation} U(X)=X^{\prime}\mu-\frac{\kappa}{2}X^{\prime}\Sigma X \end{equation}\]

In R, the function is written as

EU <- function(X,M,S,k) {
  port_mean <- c(t(X)%*%M) 
  port_var <- c(t(X)%*%S%*%X)
  result <- port_mean - 0.5*k*port_var
  return(result)
} 

which takes 4 arguments. The first input is a vector X that denotes the allocation among the assets. The second and third inputs are the mean vector, M, and the covariance matrix, S, of the asset returns. These two represent the knowledge of the decision maker about the underlying assets. Finally, the fourth input is the risk aversion of the decision maker, denoted by k (the \(\kappa\) in the equation above).

The k parameter determines the preference of the decision maker in terms of risk tolerance. Let us consider two extreme cases. If the investor is only concerned with maximizing reward, then k is close to zero such that his utility is mainly determined by the portfolio expected return, regardless of the associated risk. On the other hand, if k goes to infinity, then it implies that the utility is mostly affected by the portfolio risk, whereas the utility derived from the portfolio expected return is trivial. In the former case, the investor chooses the asset with the highest mean return; while in the latter he would choose the GMV portfolio since risk is the main component that affects his utility.

We will consider k as given and let it range between 2.5 and 100; the larger k is, the more risk-averse the investor is. Additionally, in assessing the mean vector, M, and the covariance matrix, S, we will use the sample estimates for now. Hence, k, M, and S are treated as given, whereas the main control variable the investor needs to choose is the vector of portfolio weights, X. Therefore, for a given level of risk aversion and equipped with both M and S, the investor chooses the allocation that maximizes his EU.
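To get a feel for the role of k, the following small sketch (using the in-sample estimates M1 and S1 defined above, and two purely illustrative allocations) evaluates the utility of a concentrated position versus an equally weighted one under a very low and a very high level of risk aversion. With a small k the comparison is driven mainly by the mean terms, whereas a large k makes the variance terms dominate:

x_conc <- c(1,rep(0,8)) # all wealth in the first ETF (XLE)
x_eq <- rep(1/9,9)      # equally weighted across the 9 ETFs
c(EU(x_conc,M1,S1,k = 0.01), EU(x_eq,M1,S1,k = 0.01)) # mean terms dominate
c(EU(x_conc,M1,S1,k = 100), EU(x_eq,M1,S1,k = 100))   # variance terms dominate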

2.4.2 Optimization

I use numerical optimization to construct optimal portfolios. The base R constrOptim function allows users to find the minimum point given an initial guess of the control variable. In addition, a gradient can be supplied to make the optimization more efficient. Numerical optimization tools rely on iterative search algorithms to locate the minimum (optimal) point. Hence, if one is able to direct this search in a more informed way, the search becomes more efficient and reliable. This is where the gradient comes into the picture.

A unique portfolio solution is achieved if the (negated) objective function is convex. Because we are using a quadratic utility function, the Hessian of the negated objective is \(\kappa\Sigma\), which is positive definite for a non-degenerate covariance matrix; hence the problem is convex and a unique solution exists. To implement this, I define the following function, which takes four arguments: M, S, BC, and k. BC is a list that contains the budget constraints with respect to which the investor chooses his optimal allocation.

MV_portfolio <- function(M,S,BC,k) {
  d <- length(M) # number of assets
  X0 <- rep(1/d,d) # initial guess: equally weighted
  # define the (negative) utility as a function of the portfolio weights
  f <- function(x) -EU(x,M,S,k)
  # define the gradient function
  g <- function(X) -M + k*S%*%X
  A <- BC[[1]]
  B <- BC[[2]]
  X1 <- constrOptim(X0,f,grad = g,ui = A,ci = B)$par
  return(X1)
}

The basic budget constraint is that the investor allocates all of his wealth to the portfolio, such that the allocated proportions sum to 1. Additional constraints may include limits on positions in individual assets or the exclusion of short-sales. The latter are common in the practice of portfolio management, whereas the sum-to-one case alone is usually used in the theoretical literature to derive tractable analytical results. I define the following two BC items:

BC_f <- function(d) {
  # sum-to-one constraint: constrOptim handles inequality constraints only,
  # so the equality is approximated by 0.999 <= sum(x) <= 1.001
  A <- matrix(1,1,d)
  A <- rbind(A,-A)
  B <- c(0.999,-1.001)
  
  # short-sales constraints: each weight must be non-negative
  A2 <- diag(rep(1,d))
  B2 <- rep(0,d)
  A2 <- rbind(A,A2)
  B2 <- c(B,B2)
  
  # stack altogether in a list
  BC1 <- list(A,B)
  BC2 <- list(A2,B2)
  list(BC1,BC2)
}

Given information about the mean vector and the covariance matrix of the assets, the desired level of risk taking, and some budget constraints, the MV_portfolio function returns the optimal portfolio weights. It does so by initializing the vector of weights to an equally weighted portfolio, which also satisfies the budget constraints.
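As a quick check, a small sketch for the 9-asset case confirms that the equally weighted starting point lies strictly inside both constraint sets, which is what constrOptim requires of its starting value (ui %*% theta - ci must be strictly positive there):

X0_check <- rep(1/9,9)
BC_check <- BC_f(9)
all(BC_check[[1]][[1]] %*% X0_check - BC_check[[1]][[2]] > 0) # sum-to-one band only
all(BC_check[[2]][[1]] %*% X0_check - BC_check[[2]][[2]] > 0) # plus no short-sales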

To demonstrate the optimization problem, consider the case of two assets. The objective is to find the two weights that maximize the utility for a given mean vector, covariance matrix, and risk aversion. Given the budget constraint that the portfolio weights should sum to one, the optimal portfolio weights are computed as follows

d <- 2
mu <- M1[1:d]
Sig <- S1[1:d,1:d] 
k <- 5
x_opt_2 <- MV_portfolio(mu,Sig,BC_f(2)[[1]],k)
x_opt_2
## [1] 0.5479614 0.4510387

In the case of the two assets XLE and XLU, the in-sample expected utility is maximized when one invests roughly 55% in XLE and 45% in XLU. Both weights sum to 1 (up to the small numerical tolerance) and satisfy the budget constraint. In what follows, I would like to demonstrate in multiple steps the rationale behind this optimization.

Let’s consider different weights in XLE and XLU that range between -1 and 1. For each possible combination, we compute the expected utility. Finally, we plot the expected utility as a function of the two weights and try to pinpoint the optimal point. Put formally, we run the following commands

X1 <- seq(-1,1,length = 100)
X2 <- seq(-1,1,length = 100)
f_x <- function(x1,x2) EU(c(x1,x2),M1[1:2],S1[1:2,1:2],5)
eu_ds <- expand.grid(X1,X2)
eu_ds$eu <- sapply(1:nrow(eu_ds), function(i) f_x(eu_ds[i,1],eu_ds[i,2]  ) )
head(eu_ds)
##         Var1 Var2          eu
## 1 -1.0000000   -1 -0.03629802
## 2 -0.9797980   -1 -0.03552304
## 3 -0.9595960   -1 -0.03475687
## 4 -0.9393939   -1 -0.03399951
## 5 -0.9191919   -1 -0.03325095
## 6 -0.8989899   -1 -0.03251120

The above returns the expected utility for each possible combination of the portfolio weights. The optimal point is the one that returns the maximum EU value in the data:

max_point <- eu_ds[which.max(eu_ds$eu),]

Now, let’s draw the EU as a function of the two weights in a contour plot, in the following manner

Z <- matrix(eu_ds$eu,length(X1))
contour(X1,X2,Z,xlab = expression(x[1]) , ylab = expression(x[2]) )
points(max_point[1,1], max_point[1,2], pch = 20, cex = 2)

The maximum point in the above illustration is given by

max_point
##           Var1        Var2         eu
## 4571 0.4141414 -0.09090909 0.00155178

However, this is an unconstrained optimization problem. Given the contour above, we need to choose the point that satisfies the budget constraint that both weights sum to 1. Put formally, the BC states that \(x_1 + x_2 = 1\), or \(x_2 = 1- x_1\). Hence, this is reflected in a downward-sloping line on the above contour. Therefore, the optimal point with respect to the BC is the point at which the budget-constraint line is tangent to the highest attainable utility contour.

contour(X1,X2,Z,xlab = expression(x[1]) , ylab = expression(x[2]) )
points(max_point[1,1], max_point[1,2], pch = 20, cex = 2)
abline(a=1,b=-1,col = 2,lty = 2)
points(x_opt_2[1],x_opt_2[2],pch = 20, col = 2,cex = 2)

Obviously, the constrained optimization results in a lower utility at the optimum than the unconstrained one. Nonetheless, the red point above does satisfy the budget constraint and yields the maximum expected utility on the line. Also, note that if we were to add the short-sales constraints, this would not affect the result, since the red point already satisfies them (both weights are positive). We would expect the results to change only if the optimized portfolio did involve short positions.

The above grid-search procedure is not the most efficient, since it evaluates all possible combinations and then picks the maximum point. For this reason, relying on gradient-based methods can be significantly more efficient, especially when dealing with high dimensions.
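As a side note, the analytic gradient passed to constrOptim can be sanity-checked against a finite-difference approximation. A small sketch using the two-asset inputs from above (mu, Sig, and k); the helper names below are introduced purely for illustration:

num_grad <- function(f,x,h = 1e-6) {
  # central finite-difference approximation of the gradient of f at x
  sapply(seq_along(x), function(j) {
    e_j <- replace(rep(0,length(x)),j,h)
    (f(x + e_j) - f(x - e_j))/(2*h)
  })
}
f_neg <- function(x) -EU(x,mu,Sig,k)  # the objective minimized by constrOptim
g_neg <- function(x) -mu + k*Sig%*%x  # its analytic gradient
x_test <- c(0.5,0.5)
round(cbind(analytic = c(g_neg(x_test)), numerical = num_grad(f_neg,x_test)),6)

The two columns should agree up to the finite-difference error.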

Before we move to the construction of the mean-variance efficient frontier (MVEF) using all sector ETFs, I would like to demonstrate one more thing. We can view the above optimal decision making in terms of indifference. If the investor faces the same utility for different choices of \(x_1\), then he is indifferent among all of them. Hence, his choice of \(x_1\) and, hence, \(x_2\) is determined by the point at which the indifference curve is tangent to the budget constraint.

To demonstrate this, consider the EU in the case of two assets. Let \(X^*\) denote the optimal solution that maximizes the utility with respect to the budget constraint and \(U(X^*) = \theta\) represent the corresponding expected utility. In our example, this \(\theta\) is given by

theta <- EU(x_opt_2,mu,Sig,k)
theta
## [1] -0.001100614

hence, the indifference curve is the set of all \(X\) values that satisfy the condition \(U(X) = -0.001100614\). To trace this set over a grid of possible \(x_1\) and \(x_2\) values, we compute \(U(X)-\theta\) and plot its zero contour.

ind_curv <- function(x_1,x_2) EU(c(x_1,x_2),mu,Sig,k) - theta
eu_ds$ind_cont <- sapply(1:nrow(eu_ds), function(i) ind_curv(eu_ds[i,1],eu_ds[i,2]  ) )
Z2 <- matrix(eu_ds$ind_cont,length(X1))
contour(X1,X2,Z2, levels = 0,lwd = 2, 
        col = 2,xlim = c(0,1), ylim = c(0,1), 
        xlab = expression(x[1]) , ylab = expression(x[2]) )
abline(a = 1,b = -1,lty = 2)
points(x_opt_2[1],x_opt_2[2],pch = 20, col = 1,cex = 2)

The red line denotes the indifference curve, i.e. the set of all allocations for which the investor achieves the same utility, \(\theta\). On the other hand, given the budget constraint, there is a single point on the curve that overlaps with the BC line.

The latter visualization is more consistent with the economic interpretation, whereas the former is more aligned with the optimization view. In either example, we can see the implications of the optimization function written in this vignette. In what follows, this will be our building block to construct portfolios at the \(d\)-dimensional level. However, our visualization is limited to a maximum of three dimensions.

2.5 The MV Efficient Frontier

The ideal view is that investors should be compensated, in terms of portfolio expected return, for the additional risk they are willing to take. This results in the classical textbook parabola that captures the reward-risk trade-off, known as the MV efficient frontier. Such a parabola is the cornerstone of almost every Finance MBA class. Nonetheless, such a trade-off in practice, i.e. when investors face estimation error, is not as clear.

2.5.1 Basic Budget Constraints

I start with the MVEF without any short-sale constraints. To do so, I need to find the optimal portfolio with respect to different levels of risk preference. For each resulting portfolio, I compute the corresponding mean return and volatility. Eventually, plotting these means against the corresponding volatilities should trace out the MVEF. To do so, first, I need a function that computes the optimal portfolio weights for different values of k:

MV_portfolios <- function(M,S,BC) {
    d <- ncol(S) # number of assets
    e <- as.matrix(rep(1,d)) # vector of ones
    MV_portfolio_k <- function(k) MV_portfolio(M,S,BC,k)
    k.seq <- c(seq(2.5,5,length = 10),seq(5,10,length = 10),seq(10,100,length = 100))
    X.list <- lapply(k.seq,MV_portfolio_k)
    return(X.list)
}

Additionally, I write a function that finds the GMV portfolio, i.e. the weights that minimize the portfolio variance:

gmv_portfolio <- function(S,BC) {
  d <- ncol(S)
  portfolio_variance <- function(X) c(t(X)%*%S%*%X)
  X0 <- rep(1/d,d)
  A <- BC[[1]]
  B <- BC[[2]]
  # solving this should give the minimum variance portfolio (GMV)
  X_gmv <- constrOptim(X0,portfolio_variance,grad = NULL,ui = A,ci = B)
  return(X_gmv)
}

In the second step, we need a function to construct the frontier. We do so in two different ways. The first is the hypothetical one, which assumes that we know the out-of-sample mean vector and covariance matrix and constructs the portfolios using these inputs. This should correspond to the theoretical parabola, where each level of reward is attained with the lowest possible risk. The more realistic case, however, finds the portfolio weights from the in-sample inputs and computes the mean returns and volatilities using the out-of-sample inputs.

MVE_function <- function(BC) {  
  # OUT-OF-SAMPLE CONSTRUCTED PORTFOLIOS
  X.list <- MV_portfolios(M2,S2,BC)
  # IN-SAMPLE CONSTRUCTED PORTFOLIOS
  X2.list <- MV_portfolios(M1,S1,BC)

  # OUT-OF-SAMPLE FRONTIER (HYPOTHETICAL CASE)
  M_p <- sapply(X.list,function(X) t(X)%*%M2 )
  
  
  V_p <- sqrt(sapply(X.list,function(X) t(X)%*%S2%*%X))
  MVE <- data.frame(M = M_p,V = V_p)

  # IN-SAMPLE FRONTIER (REALISTIC CASE)
  M2_p <- sapply(X2.list,function(X) t(X)%*%M2 )
  V2_p <- sqrt(sapply(X2.list,function(X) t(X)%*%S2%*%X))
  MVE2 <- data.frame(M = M2_p,V = V2_p)

  # COMPUTE THE GMV PORTFOLIO
  X_gmv1 <- gmv_portfolio(S2,BC)$par
  X_gmv2 <- gmv_portfolio(S1,BC)$par

  # HIGHLIGHT THE GMV POINT
  M_0 <- t(X_gmv1)%*%M2
  V_0 <- sqrt(t(X_gmv1)%*%S2%*%X_gmv1)
  MVE <- rbind(MVE,c(M_0,V_0))

  M_02 <- t(X_gmv2)%*%M2
  V_02 <- sqrt(t(X_gmv2)%*%S2%*%X_gmv2)
  MVE2 <- rbind(MVE2,c(M_02,V_02))

  # ORDER THE FRONTIER
  MVE <- MVE[order(MVE$M),]
  MVE2 <- MVE2[order(MVE2$M),]

  # ADD THE NAIVE PORTFOLIO
  X_N <- rep(1/ncol(R),ncol(R))
  M_N <- t(X_N)%*%M2
  V_N <- sqrt(t(X_N)%*%S2%*%X_N)
  
 list(MVE2 = MVE2,MVE = MVE,NAIVE = c(M_N,V_N), GMV = c(M_02,V_02) )
}

d <- ncol(R)
BC1 <- BC_f(d)[[1]]

MVE_BC1 <- MVE_function(BC1)

The object MVE_BC1 contains a number of items. First, it returns the means and volatilities of all hypothetical portfolios

summary(MVE_BC1$MVE)
##        M                 V          
##  Min.   :0.01175   Min.   :0.02713  
##  1st Qu.:0.01298   1st Qu.:0.02745  
##  Median :0.01376   Median :0.02795  
##  Mean   :0.01711   Mean   :0.03584  
##  3rd Qu.:0.01676   3rd Qu.:0.03167  
##  Max.   :0.04926   Max.   :0.12344

which are summarized in a data frame. The second object corresponds to the more realistic case

summary(MVE_BC1$MVE2)
##        M                  V          
##  Min.   :0.001405   Min.   :0.02981  
##  1st Qu.:0.009809   1st Qu.:0.03031  
##  Median :0.010554   Median :0.03066  
##  Mean   :0.009720   Mean   :0.03374  
##  3rd Qu.:0.010750   3rd Qu.:0.03238  
##  Max.   :0.011074   Max.   :0.07110

We observe from the above summary that the mean returns in the latter case are smaller on average. In addition, the MVE_BC1 object returns the mean and volatility of the GMV and the naive portfolios.

Given all of this, I plot the MVEF below. The y-axis represents the portfolio mean return, while the x-axis denotes the portfolio risk, proxied by the standard deviation of the portfolio return. The figure has two lines. The solid line is the classical MV efficient frontier, which is constructed using the out-of-sample data. This is the hypothetical case, which serves as our benchmark. On the other hand, the dashed line represents the frontier for the in-sample case, which is the more realistic one.

MVE <- MVE_BC1$MVE
MVE2 <- MVE_BC1$MVE2
M_N <- MVE_BC1$NAIVE[1]
V_N <- MVE_BC1$NAIVE[2]
M_02 <- MVE_BC1$GMV[1]
V_02 <- MVE_BC1$GMV[2]

y.range <- range(c(MVE$M,MVE2$M))
x.range <-  range(c(MVE$V,MVE2$V))

plot(M~V,data = MVE,  ylim = y.range, xlim = x.range, pch = 1, cex = 0.3, col = 1, ylab = expression(mu[p]), xlab = expression(sigma[p]),type = "l", lwd = 2)
lines(M~V,data = MVE2, lty = 2,lwd = 2)
points(M_N~V_N,pch = 4,lwd = 2)
points(M_02~V_02,pch = 1,lwd = 2)
legend("topright",c("MVE Out","MVE In", "Naive","GMV") , lty = c(1:2,0,0),lwd = 2, pch = c(NA,NA,4,1)) 

The reward-risk trade-off is very evident in the solid line: if we could assess the future reward-risk trade-off, there would be an additional reward for tolerating more risk. However, the dashed line tells a different story. Specifically, it implies that investors get punished for taking more risk, something that contradicts the very foundation of financial economic thought.

The reason for the above evidence is the presence of estimation error. At the top left of the dashed line, I highlight the GMV portfolio constructed in-sample. In this case, I use the covariance matrix from the in-sample window to construct the portfolio that yields the lowest standard deviation. Nonetheless, as we move away from this point, the portfolio also relies on the assets’ mean returns, which are associated with greater estimation error. Clearly, this justifies the conventional wisdom that if you deviate from the GMV portfolio, your portfolio will suffer due to greater estimation error.

Standing next to the GMV portfolio is the naive one, denoted by an x-shaped dot. Clearly, the naive strategy dominates most of the MV portfolios (it lies above and to the left of most points on the dashed line). However, that does not hold true for the GMV portfolio. At the same time, the GMV portfolio does not dominate the naive one. In any case, we also observe that, if estimation error were absent (solid black line), the naive portfolio would be considered MV sub-optimal.

2.5.2 Adding Short-Sales Constraints

It is common in the practice of portfolio management to use ad-hoc techniques, such as limiting the exposure to a certain sector or avoiding short-sales altogether. While such practice seems sub-optimal from a theoretical point of view, it has important implications for estimation error.

I repeat the same exercise as before but with the addition of short-sales constraints. In the same fashion as the previous figure, I demonstrate the case where short-sales are not allowed. I highlight the new results in red and compare them with the previous ones as follows

BC2 <- BC_f(d)[[2]]
MVE_BC2 <- MVE_function(BC2)

# FINALLY PRODUCE PLOT
MVE_ss <- MVE_BC2$MVE
MVE2_ss <- MVE_BC2$MVE2
M_N_ss <- MVE_BC2$NAIVE[1]
V_N_ss <- MVE_BC2$NAIVE[2]
M_02_ss <- MVE_BC2$GMV[1]
V_02_ss <- MVE_BC2$GMV[2]


plot(M~V,data = MVE,  ylim = y.range, xlim = x.range, pch = 1, cex = 0.3, col = 1, ylab = expression(mu[p]), xlab = expression(sigma[p]),type = "l", lwd = 2)
lines(M~V,data = MVE2, lty = 2,lwd = 2)
points(M_N~V_N,pch = 4,lwd = 2)
points(M_02~V_02,pch = 1,lwd = 2)
lines(M~V,data = MVE_ss, lty = 1,lwd = 2, col = 2)
lines(M~V,data = MVE2_ss, lty = 2,lwd = 2, col = 2)
points(M_N_ss~V_N_ss,pch = 4,lwd = 2, col = 2)
points(M_02_ss~V_02_ss,pch = 1,lwd = 2, col = 2)
legend("topright",c("MVE Out","MVE In", "Naive","GMV") , lty = c(1:2,0,0),lwd = 2, pch = c(NA,NA,4,1)) 

In the absence of estimation error (i.e. the hypothetical case), it is clear that solving a constrained optimization problem results in a sub-optimal solution. In this case, the red solid line is below the black solid line, such that portfolio optimization that prohibits short-sales yields sub-optimal MV portfolios compared with the ones that impose no such constraints. Nevertheless, we also observe that short-sales constraints mitigate the risk exposure of the investor: for the same level of risk aversion, the investor ends up taking less risk. Alternatively, one can argue that no-short-sales investors are more risk-averse in nature.

Looking closer at the more realistic case, i.e. the presence of estimation error, it is clear from the red dashed line that short-sales constraints limit the exposure of the investor to excessive risk-taking. Nonetheless, we can still see that investors get punished for taking excessive risk for which they are not rewarded accordingly. On the other hand, we still observe that the GMV portfolio is not dominated by the naive one and that there is only a small change in the location of the GMV point.

In either case, we can argue that short-sales constraints limit the exposure of the investor to excessive risk. What is more interesting, nevertheless, is the following observation. While adding short-sales constraints seems MV sub-optimal from a full-information perspective (i.e. the red solid line versus the black solid line), this is not the case once we take estimation error into account. The red dashed line does not appear to be any more MV sub-optimal than the black dashed one. In fact, it appears that the former mitigates the underperformance due to estimation error.

2.6 Summary

Most of the recent literature on portfolio optimization proposes different ways to mitigate estimation error. Those approaches include Bayesian and shrinkage methods. In fact, it has been established that some shrinkage approaches are consistent with adding short-sales constraints. Nevertheless, due to the greater estimation error associated with the assets’ mean returns, the focus has largely been limited to the GMV portfolio alone. In the next chapter, I devote the discussion to the GMV portfolio and apply some of these techniques to yield estimation-error-robust portfolios.

3 Dynamic Asset Allocation for Sector ETFs

In the previous section, I addressed the issue of estimation error and its impact on the performance of mean-variance (MV) optimal portfolios. I demonstrated that MV optimization constructed in-sample does not necessarily imply good performance out-of-sample. I also illustrated that taking more risk in-sample does not necessarily translate into greater reward out-of-sample. These insights were derived from an empirical investigation using stock returns on 9 sector ETFs. In this section, we proceed with the same data sample and demonstrate how to construct optimal portfolios out-of-sample by taking into account estimation issues.

Recall that, due to the greater estimation error inherent in the mean returns of the assets, the global minimum variance (GMV) portfolio performs best out-of-sample in comparison with other portfolios on the MV efficient frontier (recall the MVEF plot). In this investigation, I will focus on the GMV portfolio and demonstrate its performance using back-testing. I will perform back-testing using a monthly rolling window, resembling an investor who updates his information/view about the assets and re-balances his portfolio on a monthly basis.

3.1 Rolling Window

One limitation of the demonstration in the first section is that the construction of the MV portfolio assumes a two-period model. In one period, you estimate the parameters, using which you find the MV portfolio weights, and in the second period you realize the returns. One remedy is to consider a multi-period model, in which you re-assess the parameters on a recurring basis. This allows one to incorporate recent information to a certain degree and discount older information when constructing the optimal portfolios. On the bright side, the portfolio weights reflect the most recent data. On the dark side, this also implies more turnover and, thus, transaction costs.

To demonstrate the idea behind a rolling window, let’s refer to the same sample used in the previous section. Recall that I download the data using the quantmod package. By construction, the output is given in the format of zoo and xts objects. These objects are extremely friendly for time series manipulation. For instance, suppose we are interested in computing the volatility using a moving average of 25 months. This can be achieved in a single command thanks to the rollapply function from the zoo library. The function performs a rolling window analysis for a given window length and function. As a small demonstration, let’s consider the financial sector:

XLF_vol <- rollapply(R[,"XLF"],25,sd)
{
plot(R[,"XLF"], legend = "XLF")
lines(XLF_vol, col = 2)
}

Clearly, there is an upward trend in the volatility as we head toward the recent financial crisis, whereas less uncertainty prevails after the crisis. Additionally, we observe that there has been an increase in the volatility over the last year, especially beginning from late 2016.

The above plot serves as a small example to demonstrate why dynamic asset allocation matters as new information becomes available. Clearly, a static portfolio, as the case of the naive portfolio, does not incorporate such information. Nevertheless, paying too much attention to historical data could also imply over-fitting on past trends and amplifying the estimation concerns. I will return to this later on.

3.2 Dynamic Asset Allocation

To move on, I would like to introduce some notation. Let \(X_t(N)\) denote the portfolio weights constructed at the end of month \(t\), based on a sample of \(N\) months (including month \(t\)). \(X_t(N)\) represents a rational decision function that takes a data sample consisting of \(N\) historical monthly returns as its main input. In our case, the investor evaluates the asset volatilities as well as the correlations among asset returns to determine his optimal allocation. Specifically, it follows that \(X_t(N)\) is a function of \(S_t(N)\), where \(S_t(N)\) denotes the estimated covariance matrix of the asset returns given \(N\) months of historical data. As a result, one can conjecture that a better assessment of \(S_t(N)\) should imply a more robust portfolio \(X_t(N)\).

In fact, without imposing any short-sale constraints, the GMV portfolio weights are given by \[\begin{equation} X_{t}(N)=\frac{\left[S_{t}(N)\right]^{-1}\mathbf{1}}{\mathbf{1}^{\prime}\left[S_{t}(N)\right]^{-1}\mathbf{1}} \end{equation}\]

Nonetheless, for our empirical exposition, we will refer to a slightly extended version of the gmv_portfolio function constructed in the previous section, which handles missing values and supplies an explicit gradient:

gmv_portfolio <- function(S,BC) {
  # check for any missing values due to rolling window
  if (any(is.na(S))) return(rep(NA,ncol(S)))
  d <- ncol(S) # number of assets

  # objective function
  f <- function(X) c(t(X)%*%S%*%X)

  # gradient
  g <- function(X) 2*S%*%X # and gradient
  X0 <- rep(1/d,d) # an initial guess
  
  # use budget constraints
  A <- BC[[1]]
  B <- BC[[2]]
  
  # use constrained optimization
  X_gmv <- constrOptim(X0,f,g,ui = A,ci = B)$par
  return(X_gmv)
}

The above function takes two main inputs. The first is the covariance matrix \(S_t(N)\), whereas the second is the set of budget constraints, which may or may not exclude short sales. Clearly, if there are no restrictions on the weight allocated to each asset beyond the sum-to-one constraint, then the solution of gmv_portfolio is consistent with the closed-form one.
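To see this, the following sketch compares the closed-form expression with the numerical solution under the sum-to-one constraint alone, using the in-sample covariance matrix S1 from Section 2; the two should agree up to the small tolerance built into the budget constraint:

e <- rep(1,ncol(S1))
X_closed <- c(solve(S1) %*% e)/c(t(e) %*% solve(S1) %*% e) # closed-form GMV weights
X_numeric <- gmv_portfolio(S1,BC_f(ncol(S1))[[1]])         # numerical solution
round(cbind(closed_form = X_closed, numerical = X_numeric),3)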

For each month \(t\) in the data, I compute the sample covariance matrix based on a sample of \(N\) months; this returns a list of \(S_t(N)\) for all \(t>N-1\). I set \(N=100\) and use the rollapply function (with a little trick suggested by Richard Herron):

library(plyr)
N <- 100
cov_roll <- rollapply(R,N, function(x) as.vector(var(x)),
by.column = F, align = "right")
cov_roll <- alply(cov_roll, 1, function(x) matrix(x, nrow = ncol(R) ) )
names(cov_roll) <- date(R)

The rollapply function is designed to run on a single column and return a single statistic by default. In our case, we need to apply it to 9 columns, whereas the output is a \(9 \times 9\) covariance matrix. To deal with this, the trick mentioned above returns a vector of \(9 \times 9=81\) elements for each month, such that the covariance matrix elements are stacked in a vector instead of a matrix. Thus, we need to stack these elements back into a matrix representation and combine all matrices in a list. This is where the alply function from the plyr package comes into the picture. Finally, I name each covariance matrix with respect to the date it was estimated.

round(cov_roll[[N-1]]*100,2)
##       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
##  [1,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
##  [2,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
##  [3,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
##  [4,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
##  [5,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
##  [6,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
##  [7,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
##  [8,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
##  [9,]   NA   NA   NA   NA   NA   NA   NA   NA   NA

Note that we need N = 100 months to estimate the first covariance matrix. The rollapply function assigns missing values to the first \(N-1\) periods, whereas for the \(N\)-th period we have

 round(cov_roll[[N]]*100,2)
##       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
##  [1,] 0.36 0.13 0.12 0.19 0.05 0.09 0.15 0.08 0.11
##  [2,] 0.13 0.22 0.02 0.10 0.07 0.07 0.10 0.06 0.11
##  [3,] 0.12 0.02 0.74 0.21 0.01 0.28 0.25 0.21 0.18
##  [4,] 0.19 0.10 0.21 0.38 0.06 0.21 0.25 0.14 0.18
##  [5,] 0.05 0.07 0.01 0.06 0.13 0.05 0.06 0.04 0.08
##  [6,] 0.09 0.07 0.28 0.21 0.05 0.27 0.19 0.14 0.18
##  [7,] 0.15 0.10 0.25 0.25 0.06 0.19 0.24 0.13 0.16
##  [8,] 0.08 0.06 0.21 0.14 0.04 0.14 0.13 0.16 0.12
##  [9,] 0.11 0.11 0.18 0.18 0.08 0.18 0.16 0.12 0.24

The diagonal elements represent the variances of the ETF returns, and each row and column number corresponds respectively to

names(R)
## [1] "XLE" "XLU" "XLK" "XLB" "XLP" "XLY" "XLI" "XLV" "XLF"

This implies that the XLF monthly volatility (measured as standard deviation of the stock returns) in late April 2007 was almost 5%

sqrt(cov_roll[[N]][9,9])*100
## [1] 4.873489
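As a sanity check on the rolling construction, the \(N\)-th estimate should coincide with the sample covariance matrix computed directly on the first \(N\) months of returns (a small sketch, ignoring the dimension names):

all.equal(cov_roll[[N]],
          matrix(var(R[1:N,]),nrow = ncol(R)),
          check.attributes = FALSE)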

The off-diagonal elements, on the other hand, denote the covariances between the ETF returns. One observes that the correlation between the sectors is high on average and, over time, the minimum correlation coefficient does not go below zero. This should raise a flag regarding the diversification benefits achievable here. The following code plots the average as well as the minimum correlation coefficient among the ETF returns

plot.new()
# look at correlation over time
cor_roll <- rollapply(R,N, function(x) as.vector(cor(x)), by.column = F, align = "right")
cor_roll <- alply(cor_roll, 1, function(x) matrix(x, nrow = ncol(R) )   )
names(cor_roll) <- date(R)

cor_mean <- na.omit(as.xts(sapply(cor_roll, function(x) mean(x[upper.tri(x)]) )))
cor_min <- na.omit(as.xts(sapply(cor_roll, function(x) min(x[upper.tri(x)]) )))

plot(merge(cor_mean,cor_min), ylim = range(c(cor_mean,cor_min))   )

The black (red) line denotes the average (minimum) correlation among the ETFs. Clearly, an investor should take into account the benefits of diversification, in terms of correlation, as he allocates his wealth among different assets. While this section is not concerned with asset picking, the correlation among the underlying assets matters for how much diversification one can achieve. I will leave this for now.

Given the covariance matrix for each month in the data, i.e. the list cov_roll, I can finally construct GMV portfolios on a dynamic basis, using the following commands

X_gmv_1 <- t(sapply(cov_roll, function(S)  gmv_portfolio(S,BC1)))
X_gmv_2 <- t(sapply(cov_roll, function(S)  gmv_portfolio(S,BC2)))
colnames(X_gmv_1) <- colnames(X_gmv_2)<- names(R)

X_gmv_1 <- as.xts(X_gmv_1)
X_gmv_2 <- as.xts(X_gmv_2)

The X_gmv_1 (respectively X_gmv_2) object stacks all GMV portfolios with (respectively without) short-sales in a time series. Looking at the last month in the data, one can see that the long-only portfolio is more conservative, with a lower weight allocated to financials (XLF) while putting more load on utilities and consumer discretionary.

port_sum_t <- data.frame(GMV =  t(round(tail(X_gmv_1*100,1),2)), GMV_no_short =  t(round(tail(X_gmv_2*100,1),2)) )
names(port_sum_t) <- c("GMV","GMV_no_short")
port_sum_t
##        GMV GMV_no_short
## XLE   0.36         0.79
## XLU  40.32        41.27
## XLK  18.55        16.22
## XLB  10.82         4.50
## XLP  21.43        18.76
## XLY   0.24         2.86
## XLI -11.50         0.64
## XLV  12.46        10.02
## XLF   7.22         4.84

Taking a closer look at the financial sector, I plot the weight allocated to XLF over time

XLF_weight <- na.omit(X_gmv_1[,"XLF"])
XLF_weight$hline <- 0
plot(XLF_weight,lty = 1:2)

Prior to the crisis, one can see that a greater weight was allocated to financials. This implies that the volatility of XLF was perceived to be lower. However, as the crisis unraveled, the sector went into turmoil and came to be perceived as very risky relative to the other sectors. This is manifested in the long horizon over which XLF is shorted. Nevertheless, over the last two years, a more optimistic view about the sector is observed.

3.3 Backtesting

Given the above dynamic asset allocations, how do the GMV portfolios perform over time? At the end of month \(t\), we have an allocation \(X_t(N)\), such that the portfolio return for the next month is given by \(X_t(N)^\prime R_{t+1}\), where \(^\prime\) denotes the transpose of a vector and \(R_{t+1}\) is the vector of asset returns for month \(t+1\). This implies that the portfolio return for the next period is given by the dot product between two vectors, i.e. the portfolio weights at month \(t\) and the realized returns at month \(t+1\).

Since the estimation sample consists of 100 months, the rest of the data will be used for testing, which leaves us with about 12 years for back-testing. The first estimated portfolio dates back to April 2007, so I use May 2007 as the first testing period. Since this vignette was created in early April, we drop the last month from the testing period.

R_test <- R[date(R)[101:(nrow(R)-1)],]
range(date(R_test))
## [1] "2007-05-31" "2019-03-29"

On the other hand, the portfolio dates should correspond to the previous month

X_gmv_1 <- X_gmv_1[date(R)[100:(nrow(R)-2)],]
X_gmv_2 <- X_gmv_2[date(R)[100:(nrow(R)-2)],]
range(date(X_gmv_1))
## [1] "2007-04-30" "2019-02-28"

Given the R_test, X_gmv_1, and X_gmv_2 objects, the return of each portfolio can be computed in the following manner

ret_gmv_1 <- sapply(1:nrow(R_test), function(i) (X_gmv_1[i,]) %*% t(R_test[i,]) )
ret_gmv_2 <- sapply(1:nrow(R_test), function(i) (X_gmv_2[i,]) %*% t(R_test[i,]) )

which follows from matrix multiplication. For each month, we compute the dot product \(X_t(N)^\prime R_{t+1}\). As a benchmark, we consider the naive portfolio, whose return is given by the simple mean return across assets in each period, i.e.

ret_naive <- apply(R_test,1,mean)

Note that the benchmark does not incorporate any prior information about the sectors and allocates equally among all of them regardless of market trends.

To gain a first perspective on each portfolio, we plot the performance of each using the cumulative return from the start until the end. This can be achieved easily using the PerformanceAnalytics library. Nonetheless, to do so, we need to stack the return of each portfolio in a merged xts object:

library(PerformanceAnalytics)

names(ret_gmv_1) <- names(ret_gmv_2) <- names(ret_naive) <- date(R_test)
ret_gmv_1 <- as.xts(ret_gmv_1)
ret_gmv_2 <- as.xts(ret_gmv_2)
ret_naive <- as.xts(ret_naive)
ds_port <- merge(ret_gmv_1,ret_gmv_2,ret_naive)
names(ds_port) <- c("GMV","GMV_no_short","Naive")
chart.CumReturns(ds_port,legend.loc = "topleft")

The black (red) line denotes the performance of the GMV portfolio with (without) short-sales. The green line denotes the naive strategy. Interestingly, the no-short-sales GMV portfolio does almost as well as the one with short-sales. On the other hand, we observe that both GMV portfolios outperform the naive strategy. However, most of this out-performance dates back to the financial crisis, where we expect the benefits of the GMV portfolio to prevail the most.

3.3.1 Performance Metrics

While the above plot provides a perspective on the portfolio performance, a more detailed summary is needed. To do so, I will summarize the performance of each portfolio with respect to the following metrics:

  1. Portfolio Return computes the annual mean return (monthly mean scaled by 12). Metric is reported in percentages.
  2. Portfolio Volatility computes the annual standard deviation (monthly standard deviation scaled by the square root of 12) - recall that the objective of GMV is to minimize volatility. Metric is reported in percentages.
  3. Sharpe-ratio (SR) captures the portfolio risk-return trade-off - computed as the ratio between the portfolio return and volatility.
  4. Turnover (TO) measures how stable the portfolio weights are over time (formalized right after this list). Metric is reported in percentages on a monthly basis.
  5. Equivalent Transaction Costs (TC) is a hypothetical measure that tries to reconcile turnover with performance (Sharpe-ratio). Metric is reported in percentages on a monthly basis.
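Consistent with the TO_f function used below, the turnover of a strategy with weight vector \(X_t\) in month \(t\) is computed as the average sum of absolute monthly weight changes (in percent),

\[\begin{equation} \mathrm{TO}=\frac{100}{T-1}\sum_{t=2}^{T}\sum_{i=1}^{d}\left|X_{i,t}-X_{i,t-1}\right| \end{equation}\]

For the naive benchmark, the change is measured relative to the weights implied by letting last month's equal allocation drift with the realized returns, as described below.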

Computing the first three metrics is straightforward. I stack the returns of the three portfolios in a list, run a specific summary function on the list using the sapply function, and stack all statistics in a matrix called M:

portfolio_ret <- list(ret_gmv_1,ret_gmv_2,ret_naive)
summary_ret <- function(x) c(100*mean(x)*12, 100*sd(x)*sqrt(12), (mean(x)/sd(x))*sqrt(12))

# summarize returns
M <- sapply(portfolio_ret,summary_ret)
colnames(M) <- c("GMV","GMV_no_short","Naive")

For TO, we need to consider the change in allocations over time. Given the X_gmv_1 and X_gmv_2 objects, TO can be easily computed for both GMV strategies. For the naive one, I consider the change in returns for each asset and then adjust the weights correspondingly. For instance, if the financial sector went up by 10% while the other 8 sectors stayed flat in month t+1, then I would need to sell part of the XLF position in order to restore an equally weighted portfolio. The following code returns the TO for all three strategies:

# GMV TO
portfolio_weights <- list(X_gmv_1,X_gmv_2)
TO_f <- function(X) apply(abs(as.matrix(X[-1,]) - as.matrix(X[-nrow(X),])),1,sum)
TO_list <- lapply(portfolio_weights,TO_f)

# NAIVE TO
TO_naive <- apply(R_test[-nrow(R_test),],1,function(x)  sum(abs((x+1)/sum(x+1) - 1/length(x)))  )     
TO_list[3] <- list(TO_naive)

TO <- sapply(TO_list,mean)*100
M <- rbind(M,TO)

The TO metric measures how much the strategy trades each month per dollar invested. Obviously, the larger the value, the less stable the strategy is, implying a larger amount of trading needed to maintain it. For TC, I run the following

TC_f <- function(TC,i) {
  R_i <- portfolio_ret[[i]][-1] - TO_list[[i]]*TC
  return(mean(R_i)/sd(R_i))
  }


# solve for TC that makes it equal
TC1 <- uniroot(function(TC) TC_f(TC,1) - TC_f(TC,3) ,c(-1,1))$root
TC2 <- uniroot(function(TC) TC_f(TC,2) - TC_f(TC,3) ,c(-1,1))$root
TC <- c(TC1,TC2,NA)*100

The TC metric penalizes the strategy’s performance (Sharpe-ratio in our case) for its TO. This metric, hence, captures the trade-off between TO and performance with respect to a benchmark (the naive portfolio in our case): if one strategy yields a higher SR than the benchmark, how much of this out-performance is eaten up by turnover? The TC1 and TC2 statistics, hence, denote the maximum transaction cost per $1 traded that strategy 1 and strategy 2, respectively, can bear without under-performing the benchmark. Clearly, if such a statistic is negative, then the strategy under-performs the benchmark regardless of transaction costs.
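Put formally, and consistent with the uniroot calls above, the TC statistic for a given strategy is the per-unit cost \(c\) that equates the net Sharpe-ratios of the strategy and the benchmark,

\[\begin{equation} \frac{\mathrm{mean}\left(R_{p,t}-c\,\mathrm{TO}_{p,t}\right)}{\mathrm{sd}\left(R_{p,t}-c\,\mathrm{TO}_{p,t}\right)}=\frac{\mathrm{mean}\left(R_{b,t}-c\,\mathrm{TO}_{b,t}\right)}{\mathrm{sd}\left(R_{b,t}-c\,\mathrm{TO}_{b,t}\right)} \end{equation}\]

where \(R_{p,t}\) and \(\mathrm{TO}_{p,t}\) denote the monthly return and turnover of the strategy, and \(R_{b,t}\) and \(\mathrm{TO}_{b,t}\) those of the naive benchmark.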

Finally, I stack all metrics in matrix M and summarize as follows

M <- rbind(M,TC)
rownames(M) <- c("Mean","Std","SR","TO","TC")
data.frame(M)
##             GMV GMV_no_short      Naive
## Mean  9.2902338    9.0313282  8.9442723
## Std  11.5203880   11.2884341 14.6874018
## SR    0.8064167    0.8000515  0.6089758
## TO   13.3117676    6.4004262  2.2814003
## TC    1.7180033    3.9580534         NA

3.3.2 Discussion of Results

Starting with the first three metrics, one discerns that both GMV portfolios outperform the Naive one. In particular, both GMV strategies yield a higher return, a lower volatility, and, thus, a higher Sharpe-ratio than the Naive one. This is consistent with the perspective concluded from the cumulative performance figure above. Nevertheless, such inference can be misleading without taking into account portfolio turnover (TO).

Looking at TO, we observe that the GMV with short-sales has the highest TO among the three, followed by the short-sales-constrained GMV portfolio and the naive one, respectively. This implies that, for a capital of $100K, the investor would need, on average, to trade $13.31K on a monthly basis to maintain such a strategy. The same figure for the other two portfolios corresponds to monthly trading of $6.4K and $2.28K, respectively. Clearly, the naive strategy would be the least costly in terms of transaction costs.

While both GMV portfolios yield a higher SR than the naive portfolio, the main question is whether the investor is better off choosing either GMV portfolio over the naive one. In other words, is optimization optimal? Looking at the TC metric, we see that for any monthly transaction cost larger than 1.72%, the GMV portfolio with short-sales does not yield a higher SR than the naive one. On the other hand, this metric looks better for the constrained GMV portfolio: for any monthly transaction cost lower than 3.96%, the no-short-sales GMV portfolio outperforms the benchmark.

3.4 Robustness

3.4.1 Sensitivity to Sample Size

As a robustness check, I repeat the same analysis with respect to different sample sizes. A small sample size implies less confidence in the estimates and, hence, noisier allocations. On the other hand, choosing a larger sample could mean assigning too much importance to past, less relevant data. To test these implications, I keep the same test sample but use different values of \(N=20,30,\ldots,100\) to construct the GMV portfolios. I generalize the computation with respect to \(N\) using the following function:

backtesting_N <- function(N) {
  if(N > 100) stop("sample size should not be greater than 100")

  # estimate covariances
  cov_roll <- rollapply(R,N, function(x) as.vector(var(x)), by.column = F, align = "right")
  cov_roll <- alply(cov_roll, 1, function(x) matrix(x, nrow = ncol(R) )   )
  names(cov_roll) <- date(R)
  
  # GMV portfolios
  X_gmv_1 <- t(sapply(cov_roll, function(S)  gmv_portfolio(S,BC1)))
  X_gmv_2 <- t(sapply(cov_roll, function(S)  gmv_portfolio(S,BC2)))
  colnames(X_gmv_1) <- colnames(X_gmv_2)<- names(R)
  X_gmv_1 <- as.xts(X_gmv_1)
  X_gmv_2 <- as.xts(X_gmv_2)
  
  # backtest
  R_test <- R[date(R)[101:(nrow(R)-1)],]
  X_gmv_1 <- X_gmv_1[date(R)[100:(nrow(R)-2)],]
  X_gmv_2 <- X_gmv_2[date(R)[100:(nrow(R)-2)],]
  
  ret_gmv_1 <- sapply(1:nrow(R_test), function(i) (X_gmv_1[i,]) %*% t(R_test[i,]) )
  ret_gmv_2 <- sapply(1:nrow(R_test), function(i) (X_gmv_2[i,]) %*% t(R_test[i,]) )
  ret_naive <- apply(R_test,1,mean)
  
  portfolio_ret <- list(ret_gmv_1,ret_gmv_2,ret_naive)
  summary_ret <- function(x) c(100*mean(x)*12,100*sd(x)*sqrt(12),(mean(x)/sd(x))*sqrt(12))
  
  # summarize returns
  M <- round(sapply(portfolio_ret,summary_ret),2)
  colnames(M) <- c("GMV","GMV_no_short","Naive")
  
  # summarize TO for GMV
  portfolio_weights <- list(X_gmv_1,X_gmv_2)
  TO_f <- function(X) apply(abs(as.matrix(X[-1,]) - as.matrix(X[-nrow(X),])),1,sum)
  TO_list <- lapply(portfolio_weights,TO_f)
  
  # NAIVE TO
  TO_naive <- apply(R_test[-nrow(R_test),],1,function(x)  sum(abs((x+1)/sum(x+1) - 1/length(x)))  )     
  TO_list[3] <- list(TO_naive)
  
  TO <- sapply(TO_list,mean)*100
  M <- rbind(M,TO)
  
  # summarize TC
  TC_f <- function(TC,i) {
    R_i <- portfolio_ret[[i]][-1] - TO_list[[i]]*TC
    return(mean(R_i)/sd(R_i))
  }
  
  # solve for TC that makes it equal
  TC1 <- uniroot(function(TC) TC_f(TC,1)  - TC_f(TC,3) ,c(-1,1))$root
  TC2 <- uniroot(function(TC) TC_f(TC,2)  - TC_f(TC,3),c(-1,1))$root
  TC <- c(TC1,TC2,NA)*100
  
  # finally summarize results altogether
  M <- rbind(M,TC)
  rownames(M) <- c("Mean","Std","SR","TO","TC")
  
  return(M)
}

Given the function, I run it over a sequence of sample sizes using lapply and extract the corresponding SR and TC:

N_seq <- seq(20,100,by = 10)
backtest_results <- lapply(N_seq,backtesting_N)
SR2 <- sapply(backtest_results , function(x) x["SR","GMV_no_short"])
TC2 <- sapply(backtest_results , function(x) x["TC","GMV_no_short"])

To summarize, I plot the following

{
plot(SR2~N_seq, pch = 20, ylab = "SR", xlab = "N")
lines(SR2~N_seq)
par(new = T)
plot(TC2~N_seq, pch = 15, col = 2, ylab = NA,xlab = NA,axes = F)
lines(TC2~N_seq, col = 2)
axis(side = 4)
mtext(side = 4, line = 0.5, 'TC')
legend("topleft",c("SR","TC"), col = 1:2,pch = c(20,15))
}

To some degree, we observe that the SR is an inverse U-shaped function of the sample size, with maximum performance achieved for \(N=90\) monthly observations. Similarly, the TC is, to some extent, an increasing function of \(N\), which also peaks at \(N = 90\). This is intuitive for the following reason: as the sample size increases, re-balancing gives less weight to the most recent monthly observations and, as a result, implies less sensitive portfolio weights (i.e. smoother allocations) over time.

In practice, one needs to pick the sample size ex-ante, unlike in the above case. To do so, one can use an ad-hoc sample size (e.g., pick \(N=100\)) or perform some cross-validation to determine the “optimal” sample size. Nonetheless, the message from the above plot is that the investor could be better off using a large sample size rather than a small one, whereas determining the optimal one remains an open question.

4 Concluding Remarks

Overall, this vignette provides an introduction to portfolio optimization with implications for tactical asset allocation using sector ETFs. At the same time, we discussed the challenges we face when it comes to estimation and implementation. The latter part of the article investigated the implications of optimization using back-testing. Consistent with the challenges we saw in Section 2, the results from the back-testing raise concerns about the optimality of MV portfolios.

The rationale behind the strategy used in this investigation is that estimation risk is less severe in the covariance matrix and, hence, the GMV portfolio is more robust out-of-sample. The above investigation relied on constructing portfolios using the sample covariance matrix as the main estimate of the covariance. Nonetheless, other methods have been proposed in the practice of portfolio optimization. Those include using option-implied data or shrinkage approaches based on a given prior or on factor models. I leave both for future investigation.