Pearl Fund: Portfolio Anlaysis & Optmization

Author

Navein Suresh & Suryaveer Singh

Published

March 21, 2024

Overview

In this project, our primary goal is to enhance the risk-adjusted returns of a carefully curated portfolio of stocks. The focus of our analysis is on a equity/fixed income portfolio, nicknamed — “Pearl Fund.” This portfolio comprises Stocks & ETFs from a diverse array of industries and asset classes, namely:

  1. Chevron Corporation (NYSE: CVX)
  2. Costco Wholesale Corporation (NASDAQ: COST)
  3. Ford Motor Company (NYSE: F)
  4. HP Inc. (NYSE: HPQ)
  5. Johnson & Johnson (NYSE: JNJ)
  6. JPMorgan Chase & Co. (NYSE: JPM)
  7. Microsoft Corporation (NASDAQ: MSFT)
  8. PepsiCo, Inc. (NASDAQ: PEP)
  9. Procter & Gamble Company (NYSE: PG)
  10. Raytheon Technologies Corporation (NYSE: RTX)
  11. iShares 20+ Year Treasury Bond ETF (NASDAQ: TLT)

Our approach involves the utilization of sophisticated financial models, which will be thoroughly explained later in this documentation. Furthermore, we leverage the power of various statistical and financial packages. We use th R programming language as our preferred tool for seamless portfolio optimization. Through a comprehensive analysis of historical data, we aim to evaluate and compare the performance of our portfolio against two benchmark index funds—S&P 500 and NASDAQ.

Tip

💡 Detailed Background Overview: Pearl Fund

Once you’re familiarized with the code, feel free to explore and add your own curated collections of assets to see how they perform! To this make sure to enter the exchange traded ticker symbols to the stock_tickers vector in the second cell below. You must ensure that the assets you choose were traded on an exchange from a minimum of 2004 onward.

Before proceeding, ensure that the following libraries are installed and imported for seamless execution of the project. Be sure to use install.packages() for any of the uninstalled packages below.
Code
suppressWarnings(suppressPackageStartupMessages({
  library(tidyverse)
  library(ggplot2)
  library(plotly)
  library(knitr)
  library(tidyquant)
  library(corrplot)
  library(quantmod)
  library(gganimate)
  library(PerformanceAnalytics)
  library(PortfolioAnalytics)
  library(DEoptim)
  library(ROI)
  library(tseries)
  library(TTR)
  library(kableExtra)
  library(fPortfolio)
  require(ROI.plugin.glpk)
  require(ROI.plugin.quadprog)
}))

Equity Analysis

Our analysis spans from January 1, 2004, until the latest market day close, providing a comprehensive view of various economic scenarios, including critical events such as the global financial crisis of 2008 and the SARS COVID-19 epidemic in 2020. We use the FRED economic data set to retrieve the most recent 10-year Treasury yields for our statistical testing we will run later.

Code
stock_tickers <- c("TLT", "MSFT", "HPQ", "COST", "F", "PEP", "JPM", "RTX", "PG", "CVX", "JNJ")

# Treasury Yields (FRED)
ten_year_treasury_yield_table = arrange(tq_get("DGS10", get = "economic.data"), desc(date))
latest_ten_year_treasury_yield = slice(ten_year_treasury_yield_table, 1)$price

cat("Latest 10-year Treasury Yield: ",latest_ten_year_treasury_yield,"%")
Latest 10-year Treasury Yield:  4.3 %

Monthly Returns (Table View)

We proceed to collect our monthly stock returns using the tidyquant functions. We split this data set into two sets, one set ranging from 2004 to 5 years prior to the current date and another one from the previous 5 years. After some data pre-processing, we display the 5 most recent monthly returns for each of our stocks for the entire timeline below.

Code
stock_returns_monthly_full_raw <- tq_get(stock_tickers, get  = "stock.prices", from = "2004-01-01", to = Sys.Date())

# Transformed Monthly Stock Returns (long format)
stock_returns_monthly_full_long = stock_returns_monthly_full_raw %>% group_by(symbol) %>% tq_transmute(select = adjusted, mutate_fun = periodReturn, period = "monthly", col_rename = "MonthlyReturnsFull")

# Transformed Monthly Stock Returns (wide format)
stock_returns_monthly_full_wide = pivot_wider(stock_returns_monthly_full_long, names_from=symbol, values_from=MonthlyReturnsFull)

# Displays the most recent 5 monthly stock returns
stock_returns_monthly_full_wide_decreasing =  stock_returns_monthly_full_wide[order(stock_returns_monthly_full_wide$date, decreasing = TRUE), ]
as.data.frame(head(stock_returns_monthly_full_wide_decreasing, 5)) %>%
  kbl(caption = "Returns for past 5 months") %>%
  kable_classic(full_width = F, html_font = "Cambria")
Returns for past 5 months
date TLT MSFT HPQ COST F PEP JPM RTX PG CVX JNJ
2024-03-20 -0.0105981 0.0280195 0.0571070 -0.0042614 0.0369775 0.0394339 0.0551973 0.0577674 0.0191897 0.0167095 -0.0348247
2024-02-29 -0.0225221 0.0423184 -0.0132358 0.0721039 0.0900739 -0.0114416 0.0671025 -0.0094345 0.0114548 0.0423163 0.0233186
2024-01-31 -0.0224513 0.0572812 -0.0458624 0.0527209 -0.0385561 -0.0077131 0.0313648 0.0829570 0.0791076 -0.0115983 0.0137807
2023-12-29 0.0868574 -0.0075744 0.0349681 0.1389319 0.1881091 0.0092102 0.0898258 0.0326460 -0.0454664 0.0387187 0.0134489
2023-11-30 0.0992387 0.1229454 0.1143182 0.0749183 0.0523077 0.0385427 0.1223932 0.0084852 0.0232620 -0.0042868 0.0509542

The data set below shows the last 5 monthly returns from 2004 to 5 years prior to today’s date.

Code
# Raw Stock Monthly Returns
stock_returns_monthly_raw <- tq_get(stock_tickers, get  = "stock.prices", from = "2004-01-01", to = Sys.Date() - 1815)

# Transformed Monthly Stock Returns (long format)
stock_returns_monthly_long = stock_returns_monthly_raw %>% group_by(symbol) %>% tq_transmute(select = adjusted, mutate_fun = periodReturn, period = "monthly", col_rename = "MonthlyReturns")

# Transformed Monthly Stock Returns (wide format)
stock_returns_monthly_wide = pivot_wider(stock_returns_monthly_long, names_from=symbol, values_from=MonthlyReturns)

# Displays the most recent 5 monthly stock returns
# stock_returns_monthly_wide_decreasing =  stock_returns_monthly_wide[order(stock_returns_monthly_wide$date, decreasing = TRUE), ]
# as.data.frame(head(stock_returns_monthly_wide_decreasing, 5)) %>% kbl(caption = "Returns for past 5 months") %>% kable_classic(full_width = F, html_font = "Cambria")

We repeat the above process for the 2nd batch of monthly stock returns which lasts for the last 5 years.

Code
stock_returns_monthly_test_raw <- tq_get(stock_tickers, get  = "stock.prices", from = Sys.Date() - 1815, to = Sys.Date())

stock_returns_monthly_test_long = stock_returns_monthly_test_raw %>% group_by(symbol) %>% tq_transmute(select = adjusted, mutate_fun = periodReturn, period = "monthly", col_rename = "MonthlyReturns2nd")

stock_returns_monthly_test_wide = pivot_wider(stock_returns_monthly_test_long, names_from=symbol, values_from=MonthlyReturns2nd)

# stock_returns_monthly_test_wide_decreasing =  stock_returns_monthly_test_wide[order(stock_returns_monthly_test_wide$date, decreasing = TRUE), ]
# as.data.frame(head(stock_returns_monthly_test_wide_decreasing, 5))

Stock Growth & Monthly Returns (Graph View)

Displayed below is a graphical representation of the total returns (%) for our chosen stocks covering the chosen time period. To facilitate accurate time series analysis, the data has been standardized.

Code
stock_returns_monthly_raw1 <- stock_returns_monthly_raw %>% 
  group_by(symbol) %>% 
  mutate(return = 100 * (close - first(close)) / first(close)) %>% 
  ggplot(aes(date, return, color = symbol)) + 
  geom_line() +
  labs(title = "Stock Performance Over Time Period",
       y = "Return (%)") +
  theme_minimal()
stock_returns_monthly_raw1

The individual graphs below show the total return (to date) for each asset on their own scales.

Code
stock_returns_monthly_raw %>% 
  group_by(symbol) %>% 
  mutate(return = 100 * (close - first(close)) / first(close)) %>% 
  ggplot(aes(date, return, color = symbol)) + 
  geom_line() +
  labs(title = "Stock Performance Over Time Period",
       y = "Return (%) on a log scale") +
  theme_minimal() +
  facet_wrap(~symbol, scales = "free_y")

Furthermore, within the same temporal framework, we present the monthly returns for our selected stocks. The graphical representation highlights the varying levels of volatility, with more erratic and spread-out patterns evident in stocks characterized by higher volatility.

Code
stock_returns_monthly_long %>% ggplot(aes(date, `MonthlyReturns`, color=symbol)) + 
  geom_line() +
  facet_wrap(~ symbol)

Statistical Metrics

We calculate the expected returns (based off historical returns), standard deviations, variance, and total count for each of our assets.

Code
Stock_Statistics = stock_returns_monthly_long %>% group_by(symbol)  %>% summarise(Historic_Expected_Returns = mean(`MonthlyReturns`), Standard_Deviation = sd(`MonthlyReturns`), Variance = VAR(`MonthlyReturns`), Count = n())
Stock_Statistics %>%
  kbl(caption = "Stock Statistics") %>%
  kable_classic(full_width = F, html_font = "Cambria")
Stock Statistics
symbol Historic_Expected_Returns Standard_Deviation Variance Count
COST 0.0134653 0.0525892 0.0027656 184
CVX 0.0103205 0.0563277 0.0031728 184
F 0.0077453 0.1424700 0.0202977 184
HPQ 0.0083149 0.0806987 0.0065123 184
JNJ 0.0085053 0.0391815 0.0015352 184
JPM 0.0108169 0.0762465 0.0058135 184
MSFT 0.0123813 0.0647541 0.0041931 184
PEP 0.0083090 0.0404144 0.0016333 184
PG 0.0072533 0.0420618 0.0017692 184
RTX 0.0088972 0.0533712 0.0028485 184
TLT 0.0057034 0.0368194 0.0013557 184
Code
animation1 = ggplot(Stock_Statistics, aes(x = Standard_Deviation, y = Historic_Expected_Returns, color = symbol)) +
geom_point(size = 5) +
theme_bw() + ggtitle(" Monthly Historic Risk-Return Tradeoff") +
xlab("Volatility") + ylab("Expected Returns")

ggplotly(animation1)

Calculation of Excess Returns

Moving forward, our analysis involves the computation of excess returns. This is achieved by subtracting our anticipated monthly return value from each historical monthly return value, resulting in an array of excess returns. This practice is widely adopted in risk-adjusted portfolio analysis to refine the assessment of portfolio performance.

Using our split data set, we calculate the historical expected mean from the first group and our excess returns using the monthly returns from the last 5 years with respect to the calculated expected returns to ensure the excess returns don’t contain future data priced in.

Code
excess_returns_table <- stock_returns_monthly_test_long %>% left_join(Stock_Statistics, by = "symbol") %>% mutate(Excess_Returns = `MonthlyReturns2nd` - Historic_Expected_Returns)

excess_returns_table <- excess_returns_table %>% select(`symbol`, `date`, `Excess_Returns`)

# Displays the most recent 5 monthly excess stock returns
excess_returns_table_decreasing =  excess_returns_table[order(excess_returns_table$date, decreasing = TRUE), ]
head(excess_returns_table_decreasing,length(stock_tickers))
# A tibble: 11 × 3
# Groups:   symbol [11]
   symbol date       Excess_Returns
   <chr>  <date>              <dbl>
 1 TLT    2024-03-20       -0.0163 
 2 MSFT   2024-03-20        0.0156 
 3 HPQ    2024-03-20        0.0488 
 4 COST   2024-03-20       -0.0177 
 5 F      2024-03-20        0.0292 
 6 PEP    2024-03-20        0.0311 
 7 JPM    2024-03-20        0.0444 
 8 RTX    2024-03-20        0.0489 
 9 PG     2024-03-20        0.0119 
10 CVX    2024-03-20        0.00639
11 JNJ    2024-03-20       -0.0433 

We compute two matrices essential for our optimization process: the variance-covariance matrix and the correlation matrix. Each value within these matrices is derived from the following respective equations:

Note

💡 Calculation of the correlation matrix necessitates the input of the variance-covariance matrix

Var-CoVar: \Sigma = \frac{X^T X}{n-1}

Corr: \frac{\Sigma}{\sigma^T \sigma}

Code
excess_returns_wider =  excess_returns_table |> pivot_wider(names_from = symbol, values_from = Excess_Returns)

M = as.matrix(excess_returns_wider[ ,2:12])

# Variance-Covariance Matrix
cat("Variance-Covariance Matrix")
Variance-Covariance Matrix
Code
cat("\n")
Code
round(var(M), 4)
         TLT   MSFT    HPQ   COST      F    PEP     JPM     RTX     PG     CVX
TLT   0.0022 0.0010 0.0003 0.0013 0.0006 0.0003 -0.0003 -0.0005 0.0001 -0.0008
MSFT  0.0010 0.0040 0.0027 0.0021 0.0032 0.0013  0.0018  0.0016 0.0012  0.0015
HPQ   0.0003 0.0027 0.0084 0.0016 0.0046 0.0014  0.0039  0.0031 0.0012  0.0035
COST  0.0013 0.0021 0.0016 0.0040 0.0037 0.0013  0.0014  0.0010 0.0012  0.0013
F     0.0006 0.0032 0.0046 0.0037 0.0170 0.0022  0.0064  0.0037 0.0011  0.0067
PEP   0.0003 0.0013 0.0014 0.0013 0.0022 0.0021  0.0014  0.0019 0.0016  0.0020
JPM  -0.0003 0.0018 0.0039 0.0014 0.0064 0.0014  0.0065  0.0042 0.0010  0.0050
RTX  -0.0005 0.0016 0.0031 0.0010 0.0037 0.0019  0.0042  0.0074 0.0017  0.0048
PG    0.0001 0.0012 0.0012 0.0012 0.0011 0.0016  0.0010  0.0017 0.0026  0.0010
CVX  -0.0008 0.0015 0.0035 0.0013 0.0067 0.0020  0.0050  0.0048 0.0010  0.0093
JNJ  -0.0001 0.0012 0.0012 0.0009 0.0015 0.0014  0.0016  0.0017 0.0013  0.0021
         JNJ
TLT  -0.0001
MSFT  0.0012
HPQ   0.0012
COST  0.0009
F     0.0015
PEP   0.0014
JPM   0.0016
RTX   0.0017
PG    0.0013
CVX   0.0021
JNJ   0.0024
Code
cat("\n")
Code
cor_mat = cor(M)

# Correlation Matrix with percentage 
cat("Correlation Matrix as percentages")
Correlation Matrix as percentages
Code
cat("\n")
Code
round(cor_mat * 100, 3)
         TLT    MSFT     HPQ    COST       F     PEP     JPM     RTX      PG
TLT  100.000  35.691   5.884  44.843  10.444  15.273  -7.472 -12.901   5.307
MSFT  35.691 100.000  46.591  51.866  39.353  44.602  35.509  30.196  37.342
HPQ    5.884  46.591 100.000  28.231  38.652  32.271  53.048  39.185  24.548
COST  44.843  51.866  28.231 100.000  44.992  44.736  27.024  17.946  35.447
F     10.444  39.353  38.652  44.992 100.000  36.022  61.151  33.232  16.523
PEP   15.273  44.602  32.271  44.736  36.022 100.000  39.099  47.534  67.470
JPM   -7.472  35.509  53.048  27.024  61.151  39.099 100.000  61.146  25.255
RTX  -12.901  30.196  39.185  17.946  33.232  47.534  61.146 100.000  37.405
PG     5.307  37.342  24.548  35.447  16.523  67.470  25.255  37.405 100.000
CVX  -17.856  24.318  39.559  21.939  53.208  44.191  63.968  57.650  19.771
JNJ   -5.198  38.035  25.759  27.932  22.869  63.425  41.258  38.864  49.622
         CVX     JNJ
TLT  -17.856  -5.198
MSFT  24.318  38.035
HPQ   39.559  25.759
COST  21.939  27.932
F     53.208  22.869
PEP   44.191  63.425
JPM   63.968  41.258
RTX   57.650  38.864
PG    19.771  49.622
CVX  100.000  44.750
JNJ   44.750 100.000

We run a Principal Component Analysis calculation to display the groupings and correlations among the various assets in the pearl fund. The graph below displays the groups of closely related or correlated associations with certain stocks. The horizontal axis (Comp.1) and the vertical axis (Comp.2) represent the first two principal components derived from the PCA. These are the two dimensions that capture the most variance in the data set. From an investment standpoint, this plot might help in understanding which stocks move together (correlated), and potentially, it might help in identifying which stocks contribute most to the variance in the portfolio. This could be used for risk management and diversification; for instance, once might choose to invest in stocks that are less correlated with each other according to the PCA results. For example, the biplot shows that TLT, which is an ETF that tracks long-term treasury bonds, has a very different loading compared to other assets, indicating that its price movements are likely less correlated with those of the stocks.

Code
pca = princomp(cor_mat, cor = TRUE)
pca_df = as.data.frame(cor(cor_mat, pca$scores))

suppressWarnings(suppressPackageStartupMessages({pca_df |>
  ggplot(aes(x = Comp.1, y = Comp.2)) +
  geom_point(alpha = 0.3) +
  geom_text(label = rownames(pca_df)) + 
  geom_vline(xintercept = 0) +
  geom_hline(yintercept = 0) +
  scale_x_continuous(limits = c(-1,1)) +
  scale_y_continuous(limits = c(-1,1)) +

  # Adding arrows from origin to points
  geom_segment(aes(x = 0, y = 0, 
                   xend = Comp.1, yend = Comp.2),
               arrow = arrow(length = unit(0.05, "inches")),
               color = "blue", 
               data = subset(pca_df, rownames(pca_df) %in% c("TLT", "MSFT", "HPQ", "COST", "F", "PEP", "JPM", "RTX", "PG", "CVX", "JNJ")),  # Add ticker symbols here
               size = 0.5, 
               lineend = "round")}))

The following visuals will help aid the user with further understanding the correlations between the certain assets and larger groupings. Below we include a correlogram & a heat-map. The correlogram displays the correlations of various stock pairs on a grid to display the direction (blue shows + and red shows -) and the intensity (increasing size/shade of dot shows a value closer to a magnitude of 1). A white box shows a correlation of 0, ruling out a linear association.

Code
corrplot(cor_mat, method = "circle")

The dendrogram below shows another display of correlation pairings as well as clustering of certain groups of assets to show similarities/relatedness.

Code
heatmap(cor_mat)

Benchmark Performance

Upon acquiring the historical monthly stock returns for our designated time frame, our focus shifts to the evaluation of the two ETFs of the comparison benchmark index funds—SPDR S&P 500 ETF Trust (NYSE: SPY) and NASDAQ COMPOSITE (NYSE: QQQ). We display the 5 most recent SPY & NASDAQ Monthly returns available.

S&P 500 (Monthly % Returns):

Code
# Raw S&P 500 Monthly Returns
spy_returns_monthly_raw <- tq_get("SPY", get = "stock.prices", from = "2004-01-01", to = "2024-01-01")

# Transformed S&P 500 Monthly Returns
spy_returns_monthly = tq_transmute(spy_returns_monthly_raw, select = adjusted, mutate_fun = periodReturn, period = "monthly", col_rename = "SPYMonthlyReturns")

# Transforms data into percentages 
spy_returns_monthly_percentages = data.frame("date" = spy_returns_monthly$date, "SPYPercentReturns" = spy_returns_monthly$SPYMonthlyReturns * 100)

# Displays the most recent 5 monthly SPY returns (%)
spy_returns_monthly_percentages_decreasing =  spy_returns_monthly_percentages[order(spy_returns_monthly_percentages$date, decreasing = TRUE), ]

head(spy_returns_monthly_percentages_decreasing[, c("date", "SPYPercentReturns")], 5)%>%
  kbl(caption = "SPY returns for past 5 months") %>%
  kable_classic(full_width = F, html_font = "Cambria")
SPY returns for past 5 months
date SPYPercentReturns
240 2023-12-29 4.565537
239 2023-11-30 9.134381
238 2023-10-31 -2.170861
237 2023-09-29 -4.743455
236 2023-08-31 -1.625196

NASDAQ (Monthly % Returns):

Code
# Raw NASDAQ Monthly Returns
nasdaq_returns_monthly_raw <- tq_get("QQQ", get = "stock.prices", from = "2004-01-01", to = "2024-01-01")

# Transformed NASDAQ Monthly Returns
nasdaq_returns_monthly = tq_transmute(nasdaq_returns_monthly_raw, select = adjusted, mutate_fun = periodReturn, period = "monthly", col_rename = "NASDAQMonthlyReturns")

# Transforms data into percentages 
nasdaq_returns_monthly_percentages = data.frame("date" = nasdaq_returns_monthly$date, "NASDAQPercentReturns" = nasdaq_returns_monthly$NASDAQMonthlyReturns * 100)

# Displays the most recent 5 monthly NASDAQ returns (%)
nasdaq_returns_monthly_percentages_decreasing =  nasdaq_returns_monthly_percentages[order(nasdaq_returns_monthly_percentages$date, decreasing = TRUE), ]

head(nasdaq_returns_monthly_percentages_decreasing[, c("date", "NASDAQPercentReturns")], 5)%>%
  kbl(caption = " NASDAQ returns for past 5 months") %>%
  kable_classic(full_width = F, html_font = "Cambria")
NASDAQ returns for past 5 months
date NASDAQPercentReturns
240 2023-12-29 5.586963
239 2023-11-30 10.818816
238 2023-10-31 -2.065479
237 2023-09-29 -5.079851
236 2023-08-31 -1.483020

The line graph below illustrates the price trends over our time period for the benchmark returns — SPY and QQQ. As before, we standardize the index prices to facilitate a more effective comparison.

Code
# Combining benchmark raw dataframes
benchmark_returns_raw = rbind(spy_returns_monthly_raw, nasdaq_returns_monthly_raw)

benchmark_returns_raw %>% group_by(symbol) %>% mutate(close = 100*(close - first(close))/first(close)) %>% ggplot(aes(date, close, color=symbol)) + geom_line()

Now, we proceed to consolidate the returns of the two baseline index funds into a unified table. Additionally, a graph is presented to visually depict the combined returns of the two benchmarks within a single frame.

Code
benchmark_combined_table_of_returns = left_join(spy_returns_monthly, nasdaq_returns_monthly, by="date")

benchmark_long = benchmark_combined_table_of_returns |> 
  pivot_longer(cols = 2:3, values_to = "return", names_to = "symbol") |>
  arrange(symbol)

benchmark_long |>
  ggplot(aes(x = date, y = return, color = symbol)) +
  geom_line() +
  facet_grid(symbol ~ .)


Portfolio Optimization

The groundwork laid by our analysis and statistical metrics sets the stage for the upcoming portfolio optimization process. Drawing inspiration from the tenets of Modern Portfolio Theory (MPT), our objective is to optimize the portfolio of assets. Employing a mean-variance framework analysis, we seek to maximize expected returns for a specified level of risk tolerance, represented by standard deviation. We will also be utilizing other constraints and objectives to customize what we aim with our financial returns. Our approach involves leveraging concepts such as the efficient set mathematics, efficient frontier model, and other relevant methodologies.

Equally Weighted Portfolio Initialization

Code
init_weights = rep((1/length(stock_tickers)), length(stock_tickers))
assets = setNames(init_weights, stock_tickers)
assets 
       TLT       MSFT        HPQ       COST          F        PEP        JPM 
0.09090909 0.09090909 0.09090909 0.09090909 0.09090909 0.09090909 0.09090909 
       RTX         PG        CVX        JNJ 
0.09090909 0.09090909 0.09090909 0.09090909 

Next, using our optimization libraries we create a portfolio object where we will input our asset, initialized weights, portfolio constraints, and our objectives. Our desired outcome will be a vector of weights which optimizes for our objectives under our desired constraints.

Comments on Constraints

  1. Weight Sum -> sets a range of values for the total weight of out portfolio assets
  2. Leverage -> sets a range of values to depict additional borrowing
  3. Box -> sets a range of values for which the assets can be weighted at
  4. Return -> sets a target return rate for our portfolio to achieve

Comments on Objectives

  1. Return -> aims to create a portfolio to maximize the return
  2. Risk Aversion Parameter -> sets the level of risk aversion for the optimization. A higher risk aversion value indicates a greater emphasis on minimizing risk in the portfolio (ranges from 1-10)
  3. Risk -> aims to create a portfolio which minimizes the risk (StdDev, var, etc..)

For the Pearl Fund, we will optimize for maximizing our expected return while minimizing our variance given a risk aversion (delta) value of 5. Feel free to use your own set of constraints and objectives for your own portfolio.

Code
pearlfund = portfolio.spec(assets = assets)
print(pearlfund)
**************************************************
PortfolioAnalytics Portfolio Specification 
**************************************************

Call:
portfolio.spec(assets = assets)

Number of assets: 11 
Asset Names
 [1] "TLT"  "MSFT" "HPQ"  "COST" "F"    "PEP"  "JPM"  "RTX"  "PG"   "CVX" 
More than 10 assets, only printing the first 10
Code
# CONSTRAINTS

pearlfund = add.constraint(portfolio=pearlfund, type="weight_sum", min_sum=1, max_sum=1)
# pearlfund <- add.constraint(portfolio=pearlfund, type="leverage", min_sum=0.99, max_sum=1.01)
pearlfund <- add.constraint(portfolio=pearlfund, type="box", min=0.00, max=1)
# pearlfund = add.constraint(portfolio=pearlfund, type="return", return_target=0.0150)

# OBJECTIVES

pearlfund = add.objective(portfolio=pearlfund, type='return', name='mean')
pearlfund = add.objective(portfolio = pearlfund, type='risk', name="var", risk_aversion=5)
# pearlfund = add.objective(pearlfund, type = "risk", name = "StdDev")

# OPTIMIZE.PORTFOLIO

opt_maxret <- optimize.portfolio(R=stock_returns_monthly_full_wide, portfolio=pearlfund, optimize_method="ROI", trace=TRUE)

# OPTIMAL WEIGHTS

print(opt_maxret)
***********************************
PortfolioAnalytics Optimization
***********************************

Call:
optimize.portfolio(R = stock_returns_monthly_full_wide, portfolio = pearlfund, 
    optimize_method = "ROI", trace = TRUE)

Optimal Weights:
   TLT   MSFT    HPQ   COST      F    PEP    JPM    RTX     PG    CVX    JNJ 
0.2224 0.1736 0.0000 0.3000 0.0000 0.0157 0.0547 0.0000 0.1195 0.0766 0.0375 

Objective Measure:
   mean 
0.01111 


 StdDev 
0.03244 
Code
# VISUALIZATIONS

plot(opt_maxret, risk.col="StdDev", return.col="mean", main="Maximum Return Optimization", chart.assets=TRUE, xlim=c(0, 0.18), ylim=c(0,0.04))

Our visual above displays a line graph of the associated weights of our assets and a scatter plot comparing the mean vs. StdDev of the individual portfolio compared to the mean vs. StdDev of the portfolio as an aggregate whole.

The following code extracts the weights from our portfolio object and stores it as a vector of weights.

Code
optimized_list_of_weights = opt_maxret[1]$weights
optimized_vector_of_weights = unlist(optimized_list_of_weights)
100 * round(optimized_vector_of_weights, 5)
   TLT   MSFT    HPQ   COST      F    PEP    JPM    RTX     PG    CVX    JNJ 
22.245 17.360  0.000 29.998  0.000  1.572  5.466  0.000 11.951  7.662  3.746 

Optimized Portfolio Visualizations

Fund Total Growth Return

The visual below displays how much you would have right now if you had invested $10,000 at the start of the time period till today.

*Please note that this does not account for tax consequences, which could reduce your final capital depending on tax policy and brackets

Code
# Data Retstructuring

# Pearl Fund

aggregate_portfolio_monthly_returns_growth <- stock_returns_monthly_full_long %>%
    tq_portfolio(assets_col = symbol, 
                 returns_col= `MonthlyReturnsFull`, 
                 weights = optimized_vector_of_weights,
                 col_rename   = "investment.growth",
                 wealth.index = TRUE) %>% 
  mutate(investment.growth = investment.growth * 10000)

# Pearl Fund - Equally Weighted 

equally_weighted_portfolio_monthly_returns_growth <- stock_returns_monthly_full_long %>%
    tq_portfolio(assets_col = symbol, 
                 returns_col= `MonthlyReturnsFull`, 
                 weights = init_weights,
                 col_rename   = "investment.growth",
                 wealth.index = TRUE) %>% 
  mutate(investment.growth = investment.growth * 10000)

# SPY Portfolio

vals1 <- rep("SPY", 240)
spy_returns_monthly_ticker = spy_returns_monthly
spy_returns_monthly_ticker$symbol <- vals1

spy_returns_monthly_ticker_growth <- spy_returns_monthly_ticker %>%
    tq_portfolio(assets_col = symbol, 
                 returns_col= `SPYMonthlyReturns`, 
                 weights = c(1),
                 col_rename   = "investment.growth",
                 wealth.index = TRUE) %>% 
  mutate(investment.growth = investment.growth * 10000)

# NASDAQ Portfolio

vals2 = rep("NASDAQ", 240)
nasdaq_returns_monthly_ticker = nasdaq_returns_monthly
nasdaq_returns_monthly_ticker$symbol <- vals2

nasdaq_returns_monthly_ticker_growth <- nasdaq_returns_monthly_ticker %>%
    tq_portfolio(assets_col = symbol, 
                 returns_col= `NASDAQMonthlyReturns`, 
                 weights = c(1),
                 col_rename   = "investment.growth",
                 wealth.index = TRUE) %>% 
  mutate(investment.growth = investment.growth * 10000)

# Pearl Fund Portfolio Growth
ggplot(aggregate_portfolio_monthly_returns_growth, aes(x = date, y = investment.growth)) +
  geom_line(linewidth = 2, color = "red") +
  geom_smooth(method = "loess") +
  labs(title = "Pearl Fund Portfolio Growth",
       subtitle = "Optimized Portfolio Growth",
       caption = "Performance growth over our 20-year horizon",
       x = "Year", y = "Portfolio Value") +
  scale_y_continuous(labels = scales::dollar)

Code
# SPY Portfolio Growth
ggplot(spy_returns_monthly_ticker_growth, aes(x = date, y = investment.growth)) +
  geom_line(linewidth = 2, color = "blue") +
  geom_smooth(method = "loess") +
  labs(title = "SPY Portfolio Growth",
       subtitle = "Portfolio Growth",
       caption = "Performance growth over our 20-year horizon",
       x = "Year", y = "Portfolio Value") +
  scale_y_continuous(labels = scales::dollar)

Code
# NASDAQ Portfolio Growth
ggplot(nasdaq_returns_monthly_ticker_growth, aes(x = date, y = investment.growth)) +
  geom_line(linewidth = 2, color = "green") +
  geom_smooth(method = "loess") +
  labs(title = "NASDAQ Portfolio Growth",
       subtitle = "Portfolio Growth",
       caption = "Performance growth over our 20-year horizon",
       x = "Year", y = "Portfolio Value") +
  scale_y_continuous(labels = scales::dollar)

Fund Monthly Returns

The following visual displays a scatter plot of the monthly returns throughout the time period of interest. A trend line (using “loess”) is fitted onto the graph to display the variation and direction at different regions of the scatter plot.

Code
aggregate_portfolio_monthly_returns = stock_returns_monthly_full_long %>% tq_portfolio(assets_col = symbol, returns_col = `MonthlyReturnsFull`, weights = optimized_vector_of_weights, col_rename = "AggregateMonthlyReturns")

aggregate_portfolio_monthly_returns %>% ggplot(aes(x = date, y = `AggregateMonthlyReturns`)) + geom_point() + labs(title = "Pearl Fund Portfolio Returns", subtitle = "Optimized Portfolio", caption = "Shows an above-zero trend indicating long-term positive returns", x = "", y = "Monthly Returns") + 
  geom_smooth(method = "loess") + 
  theme_tq() + scale_color_tq() + scale_y_continuous(labels = scales::percent)

Annualized Returns

To get a better appreciation for the fund’s return, we proceed to annualize the the monthly returns.

Code
portfolio_monthly_mean = unname(unlist(opt_maxret[2]$objective_measures$mean))

# Convert to annualized % return
annualized_return <- paste0(100*((1 + portfolio_monthly_mean)^12 - 1), "%")
names(annualized_return) = "Annualized Return"

cat("Annualized Return:", annualized_return)
Annualized Return: 14.1734654669547%

Comparative Statistical Analysis

To objectively test our portfolio, we will be calculating several financial metrics in relation to both our benchmarks — SPY & QQQ.

Valuable tq_performance tests…

table.Stats, table.CAPM, table.AnnualizedReturns, table.Correlation, table.DownsideRisk, table.DownsideRiskRatio, table.HigherMoments, table.InformationRatio, table.Variability, VaR, SharpeRatio

To begin, it is important to combine the benchmark index monthly returns with our portfolio monthly returns.

Code
# Portfolio + S&P500 Table
spy_combined_table_of_returns = left_join(aggregate_portfolio_monthly_returns, spy_returns_monthly, by="date")

# Portfolio + NASDAQ Table
nasdaq_combined_table_of_returns = left_join(aggregate_portfolio_monthly_returns, nasdaq_returns_monthly, by="date")

# Displays the most recent 5 monthly returns for the tables above
spy_combined_table_of_returns_decreasing =  spy_combined_table_of_returns[order(spy_combined_table_of_returns$date, decreasing = TRUE), ]
head(spy_combined_table_of_returns_decreasing, 5)
# A tibble: 5 × 3
  date       AggregateMonthlyReturns SPYMonthlyReturns
  <date>                       <dbl>             <dbl>
1 2024-03-20                 0.00754           NA     
2 2024-02-29                 0.0559            NA     
3 2024-01-31                 0.0483            NA     
4 2023-12-29                 0.0775             0.0457
5 2023-11-30                 0.0837             0.0913
Code
nasdaq_combined_table_of_returns_decreasing =  nasdaq_combined_table_of_returns[order(nasdaq_combined_table_of_returns$date, decreasing = TRUE), ]
head(nasdaq_combined_table_of_returns_decreasing, 5)
# A tibble: 5 × 3
  date       AggregateMonthlyReturns NASDAQMonthlyReturns
  <date>                       <dbl>                <dbl>
1 2024-03-20                 0.00754              NA     
2 2024-02-29                 0.0559               NA     
3 2024-01-31                 0.0483               NA     
4 2023-12-29                 0.0775                0.0559
5 2023-11-30                 0.0837                0.108 

The following code combines all the funds of interest that intend to be comparing into a single table (wide format) once again with the 5 most recent returns displayed.

Code
# Combined Table (3 Funds)
three_fund_combined_table_of_returns = left_join(spy_combined_table_of_returns, nasdaq_returns_monthly, by="date")

three_fund_combined_table_of_returns_decreasing <- arrange(three_fund_combined_table_of_returns, desc(date))

head(three_fund_combined_table_of_returns_decreasing, 5)%>%
  kbl(caption = "Combined returns for past 5 months") %>%
  kable_classic(full_width = F, html_font = "Cambria")
Combined returns for past 5 months
date AggregateMonthlyReturns SPYMonthlyReturns NASDAQMonthlyReturns
2024-03-20 0.0075420 NA NA
2024-02-29 0.0559100 NA NA
2024-01-31 0.0483276 NA NA
2023-12-29 0.0774682 0.0456554 0.0558696
2023-11-30 0.0837300 0.0913438 0.1081882

This is a great display of a side by side view of the monthly returns for our three funds of comparison for our time period of interest. The following visuals are a great measure of the overall volatility of our portfolio compared to the two benchmarks. Graphs that display more “ups” and “downs” with monthly returns indicate larger deviations from the mean. Given similar annualized returns and expected long term value, a risk-averse investor should prefer the portfolios/funds which minimize the volatility/risk for the given time period.

Code
three_fund_long = three_fund_combined_table_of_returns |> 
  pivot_longer(cols = 2:4, values_to = "return", names_to = "symbol") |>
  arrange(symbol)

three_fund_long |>
  ggplot(aes(x = date, y = return, color = symbol)) +
  geom_line() +
  facet_grid(symbol ~ .)

The tq_performance function in tidyquant allows us to produce statistical metrics for financial analysis for our Peal Fund against each of the benchmark index funds.

Code
# Pearl Fund vs. SPY
statisticsA = spy_combined_table_of_returns %>% tq_performance(Ra = `AggregateMonthlyReturns`, Rb = `SPYMonthlyReturns`, performance_fun = table.CAPM)

# Pearl Fund vs. NASDAQ
statisticsB = nasdaq_combined_table_of_returns %>% tq_performance(Ra = `AggregateMonthlyReturns`, Rb = `NASDAQMonthlyReturns`, performance_fun = table.CAPM)

as.data.frame(statisticsA) %>%
  kbl(caption = "Pearl Fund Against SPY") %>%
  kable_classic(full_width = F, html_font = "Cambria")
Pearl Fund Against SPY
ActivePremium Alpha AnnualizedAlpha Beta Beta- Beta+ Correlation Correlationp-value InformationRatio R-squared TrackingError TreynorRatio
0.0461 0.0062 0.0775 0.6477 0.6066 0.6869 0.7582 0 0.4707 0.5748 0.0979 0.2261
Code
as.data.frame(statisticsB) %>%
  kbl(caption = "Pearl Fund Against NASDAQ") %>%
  kable_classic(full_width = F, html_font = "Cambria")
Pearl Fund Against NASDAQ
ActivePremium Alpha AnnualizedAlpha Beta Beta- Beta+ Correlation Correlationp-value InformationRatio R-squared TrackingError TreynorRatio
0.0045 0.0056 0.0699 0.5047 0.4657 0.4766 0.7305 0 0.0361 0.5336 0.1255 0.2902

Mean-Variance Analysis (Efficient Frontier)

The efficient frontier is a concept in financial portfolio theory that represents the set of optimal portfolios that offer the highest expected return for a given level of risk, or the lowest risk for a given level of expected return. It is useful for investors and portfolio managers as it helps identify the optimal balance between risk and return, enabling them to construct portfolios that maximize potential gains while minimizing exposure to volatility.

To apply this framework to our portfolio, we first need to modify our monthly stock returns table so that the date column is removed with once again the 5 most recent returns displayed.

Code
stock_returns_monthly_wide_dateless = stock_returns_monthly_wide[, -1]

stock_returns_monthly_wide_decreasing <- arrange(stock_returns_monthly_wide, desc(date))

head(stock_returns_monthly_wide_decreasing[, -1], 5) %>%
  kbl() %>%
  kable_classic(full_width = F, html_font = "Cambria")
TLT MSFT HPQ COST F PEP JPM RTX PG CVX JNJ
-0.0142039 0.0091571 0.0319091 0.0084663 0.0227790 -0.0044880 0.0336855 0.0333616 -0.0039403 0.0142877 -0.0057942
0.0557169 0.0527540 -0.0068591 0.1069762 0.0011403 0.0597543 -0.0299923 0.0256229 0.0558092 0.0301057 0.0230530
-0.0137643 0.0773574 -0.1044036 0.0219021 -0.0034091 0.0346830 0.0083093 0.0706941 0.0215611 0.0534849 0.0335606
0.0037858 0.0281579 0.0767355 0.0536055 0.1703673 0.0198225 0.0688436 0.1088464 0.0578003 0.0538653 0.0312281
0.0585351 -0.0840474 -0.1040780 -0.1192059 -0.1870351 -0.0867865 -0.1220432 -0.1260669 -0.0274045 -0.0853373 -0.1215110

The next step is to simulate, using a loop, a series of random weights which we then apply certain matrices to. The result is a data frame which stores the list of returns and risk values for portfolios of different weights.

Code
np1 = 1000
ret2 = stock_returns_monthly_wide_dateless  #excluding dates
mu1 = colMeans(ret2)  #mean returns
na1 = ncol(ret2)  #number of assets
varc1 = cov(ret2)

riskp1 = NULL  #vector to store risk
retp1 = NULL  #vector to store returns
# using loops here (not aiming for efficiency but demonstration)

for (i in 1:np1) {
    w = diff(c(0, sort(runif(na1 - 1)), 1))  # random weights
    r1 = t(w) %*% mu1  #matrix multiplication
    sd1 = t(w) %*% varc1 %*% w
    retp1 = rbind(retp1, r1)
    riskp1 = rbind(riskp1, sd1)
}

# create a data frame of risk and return
d_p1 = data.frame(Ret = retp1, Risk = riskp1)

We use ggplot to and to visualize the data accordingly and as expected we achieve the curve as derived from the Modern Portfolio Framework.

Code
p1 = ggplot(d_p1, aes(Risk, Ret, colour = Ret))
# scatter plot
p1 = p1 + geom_point()
# scatter plot with density and identified port risk return (highest
# lowest returns and min risk)
p1 + geom_point() + geom_hline(yintercept = c(max(d_p1$Ret), median(d_p1$Ret),
    min(d_p1$Ret)), colour = c("darkgreen", "darkgray", "darkred"), size = 1) +
    geom_vline(xintercept = d_p1[(d_p1$Risk == min(d_p1$Risk)), ][, 2]) +
    labs(colour = "Portfolio Return", x = "Portfolio Risk", y = "Portfolio Return",
        title = "Random Feasible Portfolios") + theme_bw()

Here is another display of our plot using a different R frontier calculation package. To choose what you’d like to plot exactly in plot(), enter values from below as a vector:

Syntax for choosing vector in plot function below:

  1. Efficient Frontier
  2. Global Minimum Variance Portfolio
  3. Tangent (optimal portfolio)
  4. Risk/Return of each asset
  5. Equal Weights Portfolio
  6. Two Asset Frontier
  7. Monte Carlo
  8. Sharpe Ratio
Code
return.matrix = as.timeSeries(stock_returns_monthly_wide[, -1])

efficient.frontier = portfolioFrontier(return.matrix, `setRiskFreeRate<-`(portfolioSpec(), (latest_ten_year_treasury_yield/100)/12), constraints = "LongOnly" )

plot(efficient.frontier, c(1, 3, 7))

Code
plot(efficient.frontier, c(1,3,5,7))

Optimal portfolios for a given target risk are displayed by the dark colored dots on the border or frontier of the Monte-Carlo simulated data points. The capital allocation line (blue line) which intersects the efficient frontier at a point of tangency indicating the optimal portfolio for an investor at the current risk-free rate (10 year yield on government bond).

The following visual is another very useful metric to understand portfolio weightings based on chosen return/risk levels.

Interpretation

Pick a vertical line/column at a specified target return and target risk and you can find out the allocations of stocks in your portfolio (through the lengths of the appropriate colored bars) to achieve those chosen metrics.

Code
weightsPlot(efficient.frontier)

Here our goal is to produce a different type of frontier visual. Instead of plotting the traditional measures of risk (standard deviation or variance) against returns, we turn to a different measure of risk — expected tail loss (ETL). ETL is a another measure of risk which could be a better use in portfolio optimization in certain situations.

ETL, also known as conditional value at risk (CVaR), is a risk measure that quantifies the expected loss in the tail of a distribution of possible outcomes beyond a certain confidence level. ETL focuses specifically on the extreme or tail events, giving an indication of the potential loss in those extreme scenarios. The preference for ETL over variance in some cases arises because ETL puts more emphasis on extreme events, which can be crucial in risk assessment. Variance might not adequately capture the potential impact of rare but severe events, whereas ETL provides a clearer picture of the expected loss in the tail of the distribution.

The following code generates a series of ETL values for each of our portfolio weights at a chosen confidence level (95%).

Code
confidence_level <- 0.95
np1 <- length(retp1)
cvar_values <- numeric(np1)
returns <- rnorm(1000, mean(retp1), mean(riskp1))
etl_values <- numeric(np1)

for (i in 1:np1) {
  random_sample <- sample(returns, length(returns), replace = TRUE)
  sorted_returns <- sort(random_sample, decreasing = TRUE)
  tail_index <- floor((1 - confidence_level) * length(returns))
  tail_returns <- sorted_returns[1:tail_index]
  etl_values[i] <- mean(tail_returns)
}

We add these ETL values to our data frame of Returns and Risk for our various portfolios.

Code
# Add ETL to this table
d_p1 = mutate(d_p1, ETL = etl_values)
head(d_p1) %>%
  kbl(caption = "Monte Carlo Simulation") %>%
  kable_classic(full_width = F, html_font = "Cambria")
Monte Carlo Simulation
Ret Risk ETL
0.0091751 0.0016406 0.0128155
0.0085436 0.0014544 0.0125995
0.0083278 0.0011513 0.0129044
0.0102011 0.0016985 0.0130339
0.0097651 0.0010151 0.0127538
0.0101553 0.0012116 0.0126825

Finally, we plot these ETL values against our return values for our portfolios and are able to visualize the results. By plotting ETL vs. Expected Return, you can assess whether the expected returns of the portfolio adequately compensates for the potential magnitude of extreme losses.

Code
p1 <- ggplot(d_p1, aes(ETL, Ret, colour = Ret)) +
  geom_point() +
  geom_hline(yintercept = c(max(d_p1$Ret), median(d_p1$Ret), min(d_p1$Ret)),
             colour = c("darkgreen", "darkgray", "darkred"), size = 1) +
  geom_vline(xintercept = d_p1[(d_p1$Risk == min(d_p1$Risk)), ][, 2]) +
  labs(colour = "Portfolio Return",
       x = "Expected Tail Loss",
       y = "Portfolio Return",
       title = "Random Feasible Portfolios") +
  theme_bw()

p1 + xlim(min(d_p1$ETL), max(d_p1$ETL)) + ylim(min(d_p1$Ret), max(d_p1$Ret))


Conclusion

To conclude, we plot the Optimized Pearl Fund Portfolio (in red) versus the Equally Weighted Pearl Fund Portfolio (in blue). As evident in the graph below, the Optimized Portfolio generates higher returns than the Equally Weighted Portfolio, while rising more steadily.

Code
ggplot() +
  geom_line(data = aggregate_portfolio_monthly_returns_growth, aes(x = date, y = investment.growth), 
            linewidth = 2, color = "red") +
  geom_smooth(data = aggregate_portfolio_monthly_returns_growth, aes(x = date, y = investment.growth), 
              method = "loess", color = "red", se = FALSE) +  
  annotate(geom = "text", x = as.Date("2017-03-01"), y = 100000, 
           label = "Optimized\nPortfolio", color = "red") +
  geom_line(data = equally_weighted_portfolio_monthly_returns_growth, aes(x = date, y = investment.growth), 
            linewidth = 2, color = "blue") +
  geom_smooth(data = equally_weighted_portfolio_monthly_returns_growth, aes(x = date, y = investment.growth), 
              method = "loess", color = "blue", se = FALSE) +
  annotate(geom = "text", x = as.Date("2023-03-01"), y = 35000, 
           label = "Equally\nWeighted", color = "blue") +
  labs(title = "Pearl Fund Optimized Portfolio Growth vs. Equally Weighted Portfolio",
       subtitle = ,
       caption = "Performance growth over our 20-year horizon",
       x = "Year", y = "Portfolio Value") +
  scale_y_continuous(labels = scales::dollar) +
 theme_minimal()