Efficient Frontier Plot in R

Importing packages required

We will use several packages to download stock data and to visualize simulated risks and returns: quantmod for the price data, ggplot2 for plotting, and dplyr for the pipe operator. The following code installs any of them that are not yet available, then loads them into the current R session.

## (1) Define the packages that will be needed
packages <- c('quantmod', 'ggplot2', 'dplyr')

## (2) Install them if not yet installed
installed_packages <- packages %in% rownames(installed.packages())
if (any(!installed_packages)) {
  install.packages(packages[!installed_packages])
}

## (3) Load the packages into R session
invisible(lapply(packages, library, character.only = TRUE))



Getting adjusted stock prices data

Let’s now load the stock data that we need. First, we define a variable called portfolio that holds the stock symbols we want. Then we use lapply to download the weekly data for each symbol and store the results as a list, which replaces the symbol vector in portfolio. Finally, we keep only the adjusted prices and merge them into a single xts object.

## Create a character vector that has the stock codes we need
portfolio <- c('AAPL', 'MSFT', 'GOOG', 'AMZN', 'JNJ')

## Load the stocks needed into R 
portfolio <- lapply(portfolio, function(x) {getSymbols(
        x, periodicity='weekly', auto.assign=FALSE)})

## Get adjusted prices of all stocks
portfolio_adjusted <- lapply(portfolio, Ad)

## Merge the list of adjusted prices into a single xts object
portfolio_adjusted <- do.call(merge, portfolio_adjusted)

## View the first few rows
head(portfolio_adjusted)
##            AAPL.Adjusted MSFT.Adjusted GOOG.Adjusted AMZN.Adjusted JNJ.Adjusted
## 2007-01-01      2.600933      21.62418      242.6853         38.37     42.64668
## 2007-01-08      2.893595      22.76959      251.5571         38.20     42.65948
## 2007-01-15      2.706438      22.69663      243.9606         37.02     43.37643
## 2007-01-22      2.611025      22.32455      246.9942         36.85     42.29460
## 2007-01-29      2.591759      22.02544      239.8510         37.39     42.62106
## 2007-02-05      2.546499      21.14266      230.0826         38.72     41.99372



Getting weekly log returns

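We’ll now take the weekly log return of each symbol, that is, the natural log of each week’s adjusted price divided by the previous week’s adjusted price.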

## Make a list that contains log weekly returns of each stock
portfolio_adjusted <- lapply(portfolio_adjusted, weeklyReturn, type='log')

## Transform into an xts object
portfolio_adjusted <- do.call(merge, portfolio_adjusted)

## Adjust the column names 
colnames(portfolio_adjusted) <- c('AAPL', 'MSFT', 'GOOG', 'AMZN', 'JNJ')

## Remove the first row, which has no previous week to compute a return from
portfolio_adjusted <- portfolio_adjusted[-1]

## View the first few rows of the log returns 
head(portfolio_adjusted)
##                    AAPL         MSFT        GOOG         AMZN           JNJ
## 2007-01-08  0.106629447  0.051613838  0.03590425 -0.004440336  0.0003002362
## 2007-01-15 -0.066866295 -0.003209463 -0.03066338 -0.031377235  0.0166667512
## 2007-01-22 -0.035890515 -0.016529488  0.01235824 -0.004602743 -0.0252569272
## 2007-01-29 -0.007406069 -0.013488686 -0.02934704  0.014547698  0.0076891967
## 2007-02-05 -0.017617321 -0.040905349 -0.04157947  0.034953026 -0.0148285558
## 2007-02-12  0.018560637 -0.008315604  0.01727824  0.040739370 -0.0013727184
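
As a quick check of what a weekly log return is, the sketch below recomputes the first two AAPL values by hand from the adjusted prices printed earlier. The vector name aapl_prices is introduced here for illustration only, and the prices are the rounded values from the output, so the results should agree with the table only to about five decimal places.

```r
## Adjusted AAPL prices for the first three weeks
## (rounded values copied from the printed output, not downloaded again)
aapl_prices <- c(2.600933, 2.893595, 2.706438)

## Log return = log(price this week / price last week),
## which equals the difference of consecutive log prices
aapl_log_returns <- diff(log(aapl_prices))

## Should be close to the first two AAPL log returns in the table
print(round(aapl_log_returns, 4))
```

The two values come out at roughly 0.1066 and -0.0669, in line with the 2007-01-08 and 2007-01-15 rows for AAPL.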



Creating variance-covariance matrix

Our goal is to plot the efficient frontier of the symbols we’ve defined above. We will simulate 10,000 combinations of random weights across the symbols. Then, we’ll calculate the risk and return that correspond to each combination of weights.

To do that, we need the variance-covariance matrix of the weekly returns, since it is part of the risk calculation. We can get it using the built-in var function.

## Get variance-covariance matrix
var_covar <- var(portfolio_adjusted)

## Print results
var_covar
##              AAPL         MSFT         GOOG         AMZN          JNJ
## AAPL 0.0018814637 0.0006536266 0.0009219622 0.0009696068 0.0003096191
## MSFT 0.0006536266 0.0012047967 0.0007166929 0.0007124102 0.0003549947
## GOOG 0.0009219622 0.0007166929 0.0014842451 0.0009823727 0.0003496576
## AMZN 0.0009696068 0.0007124102 0.0009823727 0.0022446234 0.0002901278
## JNJ  0.0003096191 0.0003549947 0.0003496576 0.0002901278 0.0005356508
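
To make the matrix easier to read, the sketch below turns it into weekly standard deviations and correlations. It is self-contained: the numbers are copied from the printed output above rather than recomputed, and the names symbols, var_covar_printed, weekly_sd and correlations are introduced here for illustration only.

```r
## Variance-covariance matrix copied from the printed output
symbols <- c('AAPL', 'MSFT', 'GOOG', 'AMZN', 'JNJ')
var_covar_printed <- matrix(
  c(0.0018814637, 0.0006536266, 0.0009219622, 0.0009696068, 0.0003096191,
    0.0006536266, 0.0012047967, 0.0007166929, 0.0007124102, 0.0003549947,
    0.0009219622, 0.0007166929, 0.0014842451, 0.0009823727, 0.0003496576,
    0.0009696068, 0.0007124102, 0.0009823727, 0.0022446234, 0.0002901278,
    0.0003096191, 0.0003549947, 0.0003496576, 0.0002901278, 0.0005356508),
  nrow = 5, byrow = TRUE, dimnames = list(symbols, symbols))

## The diagonal holds the variances, so its square root is the
## weekly standard deviation (volatility) of each stock
weekly_sd <- sqrt(diag(var_covar_printed))

## Rescale the covariances into correlations between -1 and 1
correlations <- cov2cor(var_covar_printed)

print(round(weekly_sd, 4))
print(round(correlations, 2))
```

The diagonal entries are the variances of each stock and the off-diagonal entries are the pairwise covariances. In this sample JNJ has the smallest variance, while AAPL's weekly standard deviation is roughly 4.3%.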

Creating random weights

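We can now generate the random weights. We draw five uniform random numbers per portfolio, then divide each row by its sum so that the weights of every portfolio add up to 1.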

## Set seed for reproducibility
set.seed(123)

## Get 50,000 random uniform numbers
random_numbers <- runif(50000)

## Transform random numbers into matrix to distribute across all symbols
all_weights <- matrix(random_numbers, nrow=10000, ncol=5)

## Add sixth column with just NAs
all_weights <- cbind(all_weights, rep(NA, 10000))

## Add names
colnames(all_weights) <- c('AAPL', 'MSFT', 'GOOG', 'AMZN', 'JNJ', 'total')

## Loop to convert into actual weights
for (i in 1:10000) {
        
        ## Get sum of random numbers in each row
        all_weights[i, 6] <- sum(all_weights[i, 1:5])
        
        ## Get the actual weights of the random numbers
        all_weights[i, 1:5] <- 
                all_weights[i, 1:5] / all_weights[i, 6]
}

## Delete total column
all_weights <- all_weights[, -6]
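
The same normalization can also be written without a loop. The sketch below is one possible vectorized version; the names n_portfolios, n_assets, random_matrix and weights_vectorized are introduced here for illustration. Given the same seed, it should produce the same weights as the loop above.

```r
## Same seed and same number of draws as in the loop version
set.seed(123)
n_portfolios <- 10000
n_assets <- 5

## One row per simulated portfolio, one column per asset
random_matrix <- matrix(runif(n_portfolios * n_assets),
                        nrow = n_portfolios, ncol = n_assets)

## rowSums() returns one total per row; dividing a matrix by a vector
## of length nrow recycles down the columns, so every element in
## row i is divided by the total of row i
weights_vectorized <- random_matrix / rowSums(random_matrix)
colnames(weights_vectorized) <- c('AAPL', 'MSFT', 'GOOG', 'AMZN', 'JNJ')

## Every row should now sum to 1
print(range(rowSums(weights_vectorized)))
```

Note that rescaling uniform draws this way does not sample all possible weight combinations equally often; it tends to favor balanced portfolios, which is fine for a demonstration plot.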



Calculating risks and returns

We can now compute the risk and return of the portfolio for each of the 10,000 sets of random weights. The formulas for the two metrics are as follows.

\[ P_{Rt} = \sum P_w * \mu \]

Where \(P_{Rt}\) is our portfolio return, \(P_w\) are the asset weights, and \(\mu\) are the mean weekly log returns of the individual assets. We get \(P_{Rt}\) by summing the products of each weight and the corresponding mean return.

\[ P_{Rs} = \sqrt{\sum (P_w X) * P_w} \]

Where \(P_{Rs}\) is our portfolio risk, \(P_w\) are the asset weights, and \(X\) is the variance-covariance matrix. We get \(P_{Rs}\) by taking the matrix product of \(P_w\) and \(X\) first, then multiplying the result element-wise by \(P_w\) and summing. Finally, we take the square root of the sum.

## Mean weekly log return of each asset
mean_returns <- colMeans(portfolio_adjusted, na.rm = TRUE)

## Create column placeholders
portfolio_risk <- rep(NA, 10000)
portfolio_returns <- rep(NA, 10000)

## Loop to calculate the risk and return for each set of weights
for (i in 1:10000) {
        weights <- all_weights[i, ]
        portfolio_risk[i] <- sqrt(sum((weights %*% var_covar) * weights))
        portfolio_returns[i] <- sum(weights * mean_returns)
}

## Make a data frame to be used for ggplot2
portfolio_df <- data.frame(portfolio_risk, portfolio_returns)
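
For readers who prefer matrix operations over a loop, the sketch below computes risk and return for all portfolios at once on a small toy example. It assumes the portfolio return is the weighted average of per-asset mean returns. All toy_* names and the mean returns are hypothetical values chosen for illustration, and the covariance values are rounded entries for AAPL, MSFT and JNJ from the printed matrix.

```r
set.seed(123)

## Toy inputs: 3 assets, 4 portfolios (hypothetical mean returns)
toy_cov <- matrix(c(0.0019, 0.0007, 0.0003,
                    0.0007, 0.0012, 0.0004,
                    0.0003, 0.0004, 0.0005), nrow = 3, byrow = TRUE)
toy_mean <- c(0.004, 0.003, 0.002)

## Random weights, one row per portfolio, each row summing to 1
toy_raw <- matrix(runif(12), nrow = 4, ncol = 3)
toy_weights <- toy_raw / rowSums(toy_raw)

## Risk: row i of (W %*% S) * W sums to w_i' S w_i, the portfolio variance
toy_risk <- sqrt(rowSums((toy_weights %*% toy_cov) * toy_weights))

## Return: weighted average of the mean returns for every row
toy_return <- as.vector(toy_weights %*% toy_mean)

## Cross-check the first portfolio against the loop-style formula
w1 <- toy_weights[1, ]
loop_risk <- sqrt(sum((w1 %*% toy_cov) * w1))
print(all.equal(toy_risk[1], loop_risk))
```

With 10,000 portfolios the loop is already fast, so the vectorized form is mainly a matter of taste.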



Make the efficient frontier plot

For the final step of this demo, let’s plot the risks and returns.

portfolio_df %>% 
        ggplot(aes(x=portfolio_risk, y=portfolio_returns)) + 
        geom_point(alpha=0.2) + 
        theme_minimal() +
        labs(
                title='Efficient Frontier graph of 5 assets',
                subtitle='AAPL, MSFT, GOOG, AMZN, JNJ')
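
Once the cloud of simulated portfolios is plotted, it can be useful to pick out individual points on it. The sketch below shows one way to find the lowest-risk portfolio and the portfolio with the highest return per unit of risk. It runs on a made-up stand-in data frame (toy_df) with the same column names as portfolio_df, and the return-to-risk ratio ignores any risk-free rate, which is a simplifying assumption.

```r
set.seed(123)

## Hypothetical stand-in for portfolio_df with the same column names
toy_df <- data.frame(
  portfolio_risk = runif(100, min = 0.02, max = 0.05),
  portfolio_returns = runif(100, min = 0.001, max = 0.005))

## Portfolio with the lowest risk (left-most point of the cloud)
min_risk_row <- toy_df[which.min(toy_df$portfolio_risk), ]

## Portfolio with the highest return per unit of risk
return_to_risk <- toy_df$portfolio_returns / toy_df$portfolio_risk
best_ratio_row <- toy_df[which.max(return_to_risk), ]

print(min_risk_row)
print(best_ratio_row)
```

Applied to portfolio_df, either row could be overlaid on the plot with an additional geom_point layer in a contrasting color.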