Pearl Fund: Portfolio Anlaysis & Optmization

Author

Navein Suresh & Suryaveer Singh

Published

March 21, 2024

Overview

In this project, our primary goal is to enhance the risk-adjusted returns of a carefully curated portfolio of stocks. The focus of our analysis is on a equity/fixed income portfolio, nicknamed — “Pearl Fund.” This portfolio comprises Stocks & ETFs from a diverse array of industries and asset classes, namely:

Chevron Corporation (NYSE: CVX)
Costco Wholesale Corporation (NASDAQ: COST)
Ford Motor Company (NYSE: F)
HP Inc. (NYSE: HPQ)
Johnson & Johnson (NYSE: JNJ)
JPMorgan Chase & Co. (NYSE: JPM)
Microsoft Corporation (NASDAQ: MSFT)
PepsiCo, Inc. (NASDAQ: PEP)
Procter & Gamble Company (NYSE: PG)
Raytheon Technologies Corporation (NYSE: RTX)
iShares 20+ Year Treasury Bond ETF (NASDAQ: TLT)

Our approach involves the utilization of sophisticated financial models, which will be thoroughly explained later in this documentation. Furthermore, we leverage the power of various statistical and financial packages. We use th R programming language as our preferred tool for seamless portfolio optimization. Through a comprehensive analysis of historical data, we aim to evaluate and compare the performance of our portfolio against two benchmark index funds—S&P 500 and NASDAQ.

Tip

💡 Detailed Background Overview: Pearl Fund

Once you’re familiarized with the code, feel free to explore and add your own curated collections of assets to see how they perform! To this make sure to enter the exchange traded ticker symbols to the stock_tickers vector in the second cell below. You must ensure that the assets you choose were traded on an exchange from a minimum of 2004 onward.

Before proceeding, ensure that the following libraries are installed and imported for seamless execution of the project. Be sure to use install.packages() for any of the uninstalled packages below.

Code

suppressWarnings(suppressPackageStartupMessages({
  library(tidyverse)
  library(ggplot2)
  library(plotly)
  library(knitr)
  library(tidyquant)
  library(corrplot)
  library(quantmod)
  library(gganimate)
  library(PerformanceAnalytics)
  library(PortfolioAnalytics)
  library(DEoptim)
  library(ROI)
  library(tseries)
  library(TTR)
  library(kableExtra)
  library(fPortfolio)
  require(ROI.plugin.glpk)
  require(ROI.plugin.quadprog)
}))

Equity Analysis

Our analysis spans from January 1, 2004, until the latest market day close, providing a comprehensive view of various economic scenarios, including critical events such as the global financial crisis of 2008 and the SARS COVID-19 epidemic in 2020. We use the FRED economic data set to retrieve the most recent 10-year Treasury yields for our statistical testing we will run later.

Code

stock_tickers <- c("TLT", "MSFT", "HPQ", "COST", "F", "PEP", "JPM", "RTX", "PG", "CVX", "JNJ")

# Treasury Yields (FRED)
ten_year_treasury_yield_table = arrange(tq_get("DGS10", get = "economic.data"), desc(date))
latest_ten_year_treasury_yield = slice(ten_year_treasury_yield_table, 1)$price

cat("Latest 10-year Treasury Yield: ",latest_ten_year_treasury_yield,"%")

Latest 10-year Treasury Yield:  4.3 %

Monthly Returns (Table View)

We proceed to collect our monthly stock returns using the tidyquant functions. We split this data set into two sets, one set ranging from 2004 to 5 years prior to the current date and another one from the previous 5 years. After some data pre-processing, we display the 5 most recent monthly returns for each of our stocks for the entire timeline below.

Code

stock_returns_monthly_full_raw <- tq_get(stock_tickers, get  = "stock.prices", from = "2004-01-01", to = Sys.Date())

# Transformed Monthly Stock Returns (long format)
stock_returns_monthly_full_long = stock_returns_monthly_full_raw %>% group_by(symbol) %>% tq_transmute(select = adjusted, mutate_fun = periodReturn, period = "monthly", col_rename = "MonthlyReturnsFull")

# Transformed Monthly Stock Returns (wide format)
stock_returns_monthly_full_wide = pivot_wider(stock_returns_monthly_full_long, names_from=symbol, values_from=MonthlyReturnsFull)

# Displays the most recent 5 monthly stock returns
stock_returns_monthly_full_wide_decreasing =  stock_returns_monthly_full_wide[order(stock_returns_monthly_full_wide$date, decreasing = TRUE), ]
as.data.frame(head(stock_returns_monthly_full_wide_decreasing, 5)) %>%
  kbl(caption = "Returns for past 5 months") %>%
  kable_classic(full_width = F, html_font = "Cambria")

Returns for past 5 months
date	TLT	MSFT	HPQ	COST	F	PEP	JPM	RTX	PG	CVX	JNJ
2024-03-20	-0.0105981	0.0280195	0.0571070	-0.0042614	0.0369775	0.0394339	0.0551973	0.0577674	0.0191897	0.0167095	-0.0348247
2024-02-29	-0.0225221	0.0423184	-0.0132358	0.0721039	0.0900739	-0.0114416	0.0671025	-0.0094345	0.0114548	0.0423163	0.0233186
2024-01-31	-0.0224513	0.0572812	-0.0458624	0.0527209	-0.0385561	-0.0077131	0.0313648	0.0829570	0.0791076	-0.0115983	0.0137807
2023-12-29	0.0868574	-0.0075744	0.0349681	0.1389319	0.1881091	0.0092102	0.0898258	0.0326460	-0.0454664	0.0387187	0.0134489
2023-11-30	0.0992387	0.1229454	0.1143182	0.0749183	0.0523077	0.0385427	0.1223932	0.0084852	0.0232620	-0.0042868	0.0509542

The data set below shows the last 5 monthly returns from 2004 to 5 years prior to today’s date.

Code

# Raw Stock Monthly Returns
stock_returns_monthly_raw <- tq_get(stock_tickers, get  = "stock.prices", from = "2004-01-01", to = Sys.Date() - 1815)

# Transformed Monthly Stock Returns (long format)
stock_returns_monthly_long = stock_returns_monthly_raw %>% group_by(symbol) %>% tq_transmute(select = adjusted, mutate_fun = periodReturn, period = "monthly", col_rename = "MonthlyReturns")

# Transformed Monthly Stock Returns (wide format)
stock_returns_monthly_wide = pivot_wider(stock_returns_monthly_long, names_from=symbol, values_from=MonthlyReturns)

# Displays the most recent 5 monthly stock returns
# stock_returns_monthly_wide_decreasing =  stock_returns_monthly_wide[order(stock_returns_monthly_wide$date, decreasing = TRUE), ]
# as.data.frame(head(stock_returns_monthly_wide_decreasing, 5)) %>% kbl(caption = "Returns for past 5 months") %>% kable_classic(full_width = F, html_font = "Cambria")

We repeat the above process for the 2nd batch of monthly stock returns which lasts for the last 5 years.

Code

stock_returns_monthly_test_raw <- tq_get(stock_tickers, get  = "stock.prices", from = Sys.Date() - 1815, to = Sys.Date())

stock_returns_monthly_test_long = stock_returns_monthly_test_raw %>% group_by(symbol) %>% tq_transmute(select = adjusted, mutate_fun = periodReturn, period = "monthly", col_rename = "MonthlyReturns2nd")

stock_returns_monthly_test_wide = pivot_wider(stock_returns_monthly_test_long, names_from=symbol, values_from=MonthlyReturns2nd)

# stock_returns_monthly_test_wide_decreasing =  stock_returns_monthly_test_wide[order(stock_returns_monthly_test_wide$date, decreasing = TRUE), ]
# as.data.frame(head(stock_returns_monthly_test_wide_decreasing, 5))

Stock Growth & Monthly Returns (Graph View)

Displayed below is a graphical representation of the total returns (%) for our chosen stocks covering the chosen time period. To facilitate accurate time series analysis, the data has been standardized.

Code

stock_returns_monthly_raw1 <- stock_returns_monthly_raw %>% 
  group_by(symbol) %>% 
  mutate(return = 100 * (close - first(close)) / first(close)) %>% 
  ggplot(aes(date, return, color = symbol)) + 
  geom_line() +
  labs(title = "Stock Performance Over Time Period",
       y = "Return (%)") +
  theme_minimal()
stock_returns_monthly_raw1

The individual graphs below show the total return (to date) for each asset on their own scales.

Code

stock_returns_monthly_raw %>% 
  group_by(symbol) %>% 
  mutate(return = 100 * (close - first(close)) / first(close)) %>% 
  ggplot(aes(date, return, color = symbol)) + 
  geom_line() +
  labs(title = "Stock Performance Over Time Period",
       y = "Return (%) on a log scale") +
  theme_minimal() +
  facet_wrap(~symbol, scales = "free_y")

Furthermore, within the same temporal framework, we present the monthly returns for our selected stocks. The graphical representation highlights the varying levels of volatility, with more erratic and spread-out patterns evident in stocks characterized by higher volatility.

Code

stock_returns_monthly_long %>% ggplot(aes(date, `MonthlyReturns`, color=symbol)) + 
  geom_line() +
  facet_wrap(~ symbol)

Statistical Metrics

We calculate the expected returns (based off historical returns), standard deviations, variance, and total count for each of our assets.

Code

Stock_Statistics = stock_returns_monthly_long %>% group_by(symbol)  %>% summarise(Historic_Expected_Returns = mean(`MonthlyReturns`), Standard_Deviation = sd(`MonthlyReturns`), Variance = VAR(`MonthlyReturns`), Count = n())
Stock_Statistics %>%
  kbl(caption = "Stock Statistics") %>%
  kable_classic(full_width = F, html_font = "Cambria")

Stock Statistics
symbol	Historic_Expected_Returns	Standard_Deviation	Variance	Count
COST	0.0134653	0.0525892	0.0027656	184
CVX	0.0103205	0.0563277	0.0031728	184
F	0.0077453	0.1424700	0.0202977	184
HPQ	0.0083149	0.0806987	0.0065123	184
JNJ	0.0085053	0.0391815	0.0015352	184
JPM	0.0108169	0.0762465	0.0058135	184
MSFT	0.0123813	0.0647541	0.0041931	184
PEP	0.0083090	0.0404144	0.0016333	184
PG	0.0072533	0.0420618	0.0017692	184
RTX	0.0088972	0.0533712	0.0028485	184
TLT	0.0057034	0.0368194	0.0013557	184

Code

animation1 = ggplot(Stock_Statistics, aes(x = Standard_Deviation, y = Historic_Expected_Returns, color = symbol)) +
geom_point(size = 5) +
theme_bw() + ggtitle(" Monthly Historic Risk-Return Tradeoff") +
xlab("Volatility") + ylab("Expected Returns")

ggplotly(animation1)

Calculation of Excess Returns

Moving forward, our analysis involves the computation of excess returns. This is achieved by subtracting our anticipated monthly return value from each historical monthly return value, resulting in an array of excess returns. This practice is widely adopted in risk-adjusted portfolio analysis to refine the assessment of portfolio performance.

Using our split data set, we calculate the historical expected mean from the first group and our excess returns using the monthly returns from the last 5 years with respect to the calculated expected returns to ensure the excess returns don’t contain future data priced in.

Code

excess_returns_table <- stock_returns_monthly_test_long %>% left_join(Stock_Statistics, by = "symbol") %>% mutate(Excess_Returns = `MonthlyReturns2nd` - Historic_Expected_Returns)

excess_returns_table <- excess_returns_table %>% select(`symbol`, `date`, `Excess_Returns`)

# Displays the most recent 5 monthly excess stock returns
excess_returns_table_decreasing =  excess_returns_table[order(excess_returns_table$date, decreasing = TRUE), ]
head(excess_returns_table_decreasing,length(stock_tickers))

# A tibble: 11 × 3
# Groups:   symbol [11]
   symbol date       Excess_Returns
   <chr>  <date>              <dbl>
 1 TLT    2024-03-20       -0.0163 
 2 MSFT   2024-03-20        0.0156 
 3 HPQ    2024-03-20        0.0488 
 4 COST   2024-03-20       -0.0177 
 5 F      2024-03-20        0.0292 
 6 PEP    2024-03-20        0.0311 
 7 JPM    2024-03-20        0.0444 
 8 RTX    2024-03-20        0.0489 
 9 PG     2024-03-20        0.0119 
10 CVX    2024-03-20        0.00639
11 JNJ    2024-03-20       -0.0433

We compute two matrices essential for our optimization process: the variance-covariance matrix and the correlation matrix. Each value within these matrices is derived from the following respective equations:

Note

💡 Calculation of the correlation matrix necessitates the input of the variance-covariance matrix

Var-CoVar: \Sigma = \frac{X^T X}{n-1}

Corr: \frac{\Sigma}{\sigma^T \sigma}

Code

excess_returns_wider =  excess_returns_table |> pivot_wider(names_from = symbol, values_from = Excess_Returns)

M = as.matrix(excess_returns_wider[ ,2:12])

# Variance-Covariance Matrix
cat("Variance-Covariance Matrix")

Variance-Covariance Matrix

Code

cat("\n")

Code

round(var(M), 4)

         TLT   MSFT    HPQ   COST      F    PEP     JPM     RTX     PG     CVX
TLT   0.0022 0.0010 0.0003 0.0013 0.0006 0.0003 -0.0003 -0.0005 0.0001 -0.0008
MSFT  0.0010 0.0040 0.0027 0.0021 0.0032 0.0013  0.0018  0.0016 0.0012  0.0015
HPQ   0.0003 0.0027 0.0084 0.0016 0.0046 0.0014  0.0039  0.0031 0.0012  0.0035
COST  0.0013 0.0021 0.0016 0.0040 0.0037 0.0013  0.0014  0.0010 0.0012  0.0013
F     0.0006 0.0032 0.0046 0.0037 0.0170 0.0022  0.0064  0.0037 0.0011  0.0067
PEP   0.0003 0.0013 0.0014 0.0013 0.0022 0.0021  0.0014  0.0019 0.0016  0.0020
JPM  -0.0003 0.0018 0.0039 0.0014 0.0064 0.0014  0.0065  0.0042 0.0010  0.0050
RTX  -0.0005 0.0016 0.0031 0.0010 0.0037 0.0019  0.0042  0.0074 0.0017  0.0048
PG    0.0001 0.0012 0.0012 0.0012 0.0011 0.0016  0.0010  0.0017 0.0026  0.0010
CVX  -0.0008 0.0015 0.0035 0.0013 0.0067 0.0020  0.0050  0.0048 0.0010  0.0093
JNJ  -0.0001 0.0012 0.0012 0.0009 0.0015 0.0014  0.0016  0.0017 0.0013  0.0021
         JNJ
TLT  -0.0001
MSFT  0.0012
HPQ   0.0012
COST  0.0009
F     0.0015
PEP   0.0014
JPM   0.0016
RTX   0.0017
PG    0.0013
CVX   0.0021
JNJ   0.0024

Code

cat("\n")

Code

cor_mat = cor(M)

# Correlation Matrix with percentage 
cat("Correlation Matrix as percentages")

Correlation Matrix as percentages

Code

cat("\n")

Code

round(cor_mat * 100, 3)

         TLT    MSFT     HPQ    COST       F     PEP     JPM     RTX      PG
TLT  100.000  35.691   5.884  44.843  10.444  15.273  -7.472 -12.901   5.307
MSFT  35.691 100.000  46.591  51.866  39.353  44.602  35.509  30.196  37.342
HPQ    5.884  46.591 100.000  28.231  38.652  32.271  53.048  39.185  24.548
COST  44.843  51.866  28.231 100.000  44.992  44.736  27.024  17.946  35.447
F     10.444  39.353  38.652  44.992 100.000  36.022  61.151  33.232  16.523
PEP   15.273  44.602  32.271  44.736  36.022 100.000  39.099  47.534  67.470
JPM   -7.472  35.509  53.048  27.024  61.151  39.099 100.000  61.146  25.255
RTX  -12.901  30.196  39.185  17.946  33.232  47.534  61.146 100.000  37.405
PG     5.307  37.342  24.548  35.447  16.523  67.470  25.255  37.405 100.000
CVX  -17.856  24.318  39.559  21.939  53.208  44.191  63.968  57.650  19.771
JNJ   -5.198  38.035  25.759  27.932  22.869  63.425  41.258  38.864  49.622
         CVX     JNJ
TLT  -17.856  -5.198
MSFT  24.318  38.035
HPQ   39.559  25.759
COST  21.939  27.932
F     53.208  22.869
PEP   44.191  63.425
JPM   63.968  41.258
RTX   57.650  38.864
PG    19.771  49.622
CVX  100.000  44.750
JNJ   44.750 100.000

We run a Principal Component Analysis calculation to display the groupings and correlations among the various assets in the pearl fund. The graph below displays the groups of closely related or correlated associations with certain stocks. The horizontal axis (Comp.1) and the vertical axis (Comp.2) represent the first two principal components derived from the PCA. These are the two dimensions that capture the most variance in the data set. From an investment standpoint, this plot might help in understanding which stocks move together (correlated), and potentially, it might help in identifying which stocks contribute most to the variance in the portfolio. This could be used for risk management and diversification; for instance, once might choose to invest in stocks that are less correlated with each other according to the PCA results. For example, the biplot shows that TLT, which is an ETF that tracks long-term treasury bonds, has a very different loading compared to other assets, indicating that its price movements are likely less correlated with those of the stocks.

Code

pca = princomp(cor_mat, cor = TRUE)
pca_df = as.data.frame(cor(cor_mat, pca$scores))

suppressWarnings(suppressPackageStartupMessages({pca_df |>
  ggplot(aes(x = Comp.1, y = Comp.2)) +
  geom_point(alpha = 0.3) +
  geom_text(label = rownames(pca_df)) + 
  geom_vline(xintercept = 0) +
  geom_hline(yintercept = 0) +
  scale_x_continuous(limits = c(-1,1)) +
  scale_y_continuous(limits = c(-1,1)) +

  # Adding arrows from origin to points
  geom_segment(aes(x = 0, y = 0, 
                   xend = Comp.1, yend = Comp.2),
               arrow = arrow(length = unit(0.05, "inches")),
               color = "blue", 
               data = subset(pca_df, rownames(pca_df) %in% c("TLT", "MSFT", "HPQ", "COST", "F", "PEP", "JPM", "RTX", "PG", "CVX", "JNJ")),  # Add ticker symbols here
               size = 0.5, 
               lineend = "round")}))

The following visuals will help aid the user with further understanding the correlations between the certain assets and larger groupings. Below we include a correlogram & a heat-map. The correlogram displays the correlations of various stock pairs on a grid to display the direction (blue shows + and red shows -) and the intensity (increasing size/shade of dot shows a value closer to a magnitude of 1). A white box shows a correlation of 0, ruling out a linear association.

Code

corrplot(cor_mat, method = "circle")

The dendrogram below shows another display of correlation pairings as well as clustering of certain groups of assets to show similarities/relatedness.

Code

heatmap(cor_mat)

Benchmark Performance

Upon acquiring the historical monthly stock returns for our designated time frame, our focus shifts to the evaluation of the two ETFs of the comparison benchmark index funds—SPDR S&P 500 ETF Trust (NYSE: SPY) and NASDAQ COMPOSITE (NYSE: QQQ). We display the 5 most recent SPY & NASDAQ Monthly returns available.

S&P 500 (Monthly % Returns):

Code

# Raw S&P 500 Monthly Returns
spy_returns_monthly_raw <- tq_get("SPY", get = "stock.prices", from = "2004-01-01", to = "2024-01-01")

# Transformed S&P 500 Monthly Returns
spy_returns_monthly = tq_transmute(spy_returns_monthly_raw, select = adjusted, mutate_fun = periodReturn, period = "monthly", col_rename = "SPYMonthlyReturns")

# Transforms data into percentages 
spy_returns_monthly_percentages = data.frame("date" = spy_returns_monthly$date, "SPYPercentReturns" = spy_returns_monthly$SPYMonthlyReturns * 100)

# Displays the most recent 5 monthly SPY returns (%)
spy_returns_monthly_percentages_decreasing =  spy_returns_monthly_percentages[order(spy_returns_monthly_percentages$date, decreasing = TRUE), ]

head(spy_returns_monthly_percentages_decreasing[, c("date", "SPYPercentReturns")], 5)%>%
  kbl(caption = "SPY returns for past 5 months") %>%
  kable_classic(full_width = F, html_font = "Cambria")

SPY returns for past 5 months
	date	SPYPercentReturns
240	2023-12-29	4.565537
239	2023-11-30	9.134381
238	2023-10-31	-2.170861
237	2023-09-29	-4.743455
236	2023-08-31	-1.625196

NASDAQ (Monthly % Returns):

Code

# Raw NASDAQ Monthly Returns
nasdaq_returns_monthly_raw <- tq_get("QQQ", get = "stock.prices", from = "2004-01-01", to = "2024-01-01")

# Transformed NASDAQ Monthly Returns
nasdaq_returns_monthly = tq_transmute(nasdaq_returns_monthly_raw, select = adjusted, mutate_fun = periodReturn, period = "monthly", col_rename = "NASDAQMonthlyReturns")

# Transforms data into percentages 
nasdaq_returns_monthly_percentages = data.frame("date" = nasdaq_returns_monthly$date, "NASDAQPercentReturns" = nasdaq_returns_monthly$NASDAQMonthlyReturns * 100)

# Displays the most recent 5 monthly NASDAQ returns (%)
nasdaq_returns_monthly_percentages_decreasing =  nasdaq_returns_monthly_percentages[order(nasdaq_returns_monthly_percentages$date, decreasing = TRUE), ]

head(nasdaq_returns_monthly_percentages_decreasing[, c("date", "NASDAQPercentReturns")], 5)%>%
  kbl(caption = " NASDAQ returns for past 5 months") %>%
  kable_classic(full_width = F, html_font = "Cambria")

NASDAQ returns for past 5 months
	date	NASDAQPercentReturns
240	2023-12-29	5.586963
239	2023-11-30	10.818816
238	2023-10-31	-2.065479
237	2023-09-29	-5.079851
236	2023-08-31	-1.483020

The line graph below illustrates the price trends over our time period for the benchmark returns — SPY and QQQ. As before, we standardize the index prices to facilitate a more effective comparison.

Code

# Combining benchmark raw dataframes
benchmark_returns_raw = rbind(spy_returns_monthly_raw, nasdaq_returns_monthly_raw)

benchmark_returns_raw %>% group_by(symbol) %>% mutate(close = 100*(close - first(close))/first(close)) %>% ggplot(aes(date, close, color=symbol)) + geom_line()

Now, we proceed to consolidate the returns of the two baseline index funds into a unified table. Additionally, a graph is presented to visually depict the combined returns of the two benchmarks within a single frame.

Code

benchmark_combined_table_of_returns = left_join(spy_returns_monthly, nasdaq_returns_monthly, by="date")

benchmark_long = benchmark_combined_table_of_returns |> 
  pivot_longer(cols = 2:3, values_to = "return", names_to = "symbol") |>
  arrange(symbol)

benchmark_long |>
  ggplot(aes(x = date, y = return, color = symbol)) +
  geom_line() +
  facet_grid(symbol ~ .)

Portfolio Optimization

The groundwork laid by our analysis and statistical metrics sets the stage for the upcoming portfolio optimization process. Drawing inspiration from the tenets of Modern Portfolio Theory (MPT), our objective is to optimize the portfolio of assets. Employing a mean-variance framework analysis, we seek to maximize expected returns for a specified level of risk tolerance, represented by standard deviation. We will also be utilizing other constraints and objectives to customize what we aim with our financial returns. Our approach involves leveraging concepts such as the efficient set mathematics, efficient frontier model, and other relevant methodologies.

Equally Weighted Portfolio Initialization

Code

init_weights = rep((1/length(stock_tickers)), length(stock_tickers))
assets = setNames(init_weights, stock_tickers)
assets

       TLT       MSFT        HPQ       COST          F        PEP        JPM 
0.09090909 0.09090909 0.09090909 0.09090909 0.09090909 0.09090909 0.09090909 
       RTX         PG        CVX        JNJ 
0.09090909 0.09090909 0.09090909 0.09090909

Next, using our optimization libraries we create a portfolio object where we will input our asset, initialized weights, portfolio constraints, and our objectives. Our desired outcome will be a vector of weights which optimizes for our objectives under our desired constraints.

Comments on Constraints

Weight Sum -> sets a range of values for the total weight of out portfolio assets
Leverage -> sets a range of values to depict additional borrowing
Box -> sets a range of values for which the assets can be weighted at
Return -> sets a target return rate for our portfolio to achieve

Comments on Objectives

Return -> aims to create a portfolio to maximize the return
Risk Aversion Parameter -> sets the level of risk aversion for the optimization. A higher risk aversion value indicates a greater emphasis on minimizing risk in the portfolio (ranges from 1-10)
Risk -> aims to create a portfolio which minimizes the risk (StdDev, var, etc..)

For the Pearl Fund, we will optimize for maximizing our expected return while minimizing our variance given a risk aversion (delta) value of 5. Feel free to use your own set of constraints and objectives for your own portfolio.

Code

pearlfund = portfolio.spec(assets = assets)
print(pearlfund)

**************************************************
PortfolioAnalytics Portfolio Specification 
**************************************************

Call:
portfolio.spec(assets = assets)

Number of assets: 11 
Asset Names
 [1] "TLT"  "MSFT" "HPQ"  "COST" "F"    "PEP"  "JPM"  "RTX"  "PG"   "CVX" 
More than 10 assets, only printing the first 10

Code

# CONSTRAINTS

pearlfund = add.constraint(portfolio=pearlfund, type="weight_sum", min_sum=1, max_sum=1)
# pearlfund <- add.constraint(portfolio=pearlfund, type="leverage", min_sum=0.99, max_sum=1.01)
pearlfund <- add.constraint(portfolio=pearlfund, type="box", min=0.00, max=1)
# pearlfund = add.constraint(portfolio=pearlfund, type="return", return_target=0.0150)

# OBJECTIVES

pearlfund = add.objective(portfolio=pearlfund, type='return', name='mean')
pearlfund = add.objective(portfolio = pearlfund, type='risk', name="var", risk_aversion=5)
# pearlfund = add.objective(pearlfund, type = "risk", name = "StdDev")

# OPTIMIZE.PORTFOLIO

opt_maxret <- optimize.portfolio(R=stock_returns_monthly_full_wide, portfolio=pearlfund, optimize_method="ROI", trace=TRUE)

# OPTIMAL WEIGHTS

print(opt_maxret)

***********************************
PortfolioAnalytics Optimization
***********************************

Call:
optimize.portfolio(R = stock_returns_monthly_full_wide, portfolio = pearlfund, 
    optimize_method = "ROI", trace = TRUE)

Optimal Weights:
   TLT   MSFT    HPQ   COST      F    PEP    JPM    RTX     PG    CVX    JNJ 
0.2224 0.1736 0.0000 0.3000 0.0000 0.0157 0.0547 0.0000 0.1195 0.0766 0.0375 

Objective Measure:
   mean 
0.01111 


 StdDev 
0.03244

Code

# VISUALIZATIONS

plot(opt_maxret, risk.col="StdDev", return.col="mean", main="Maximum Return Optimization", chart.assets=TRUE, xlim=c(0, 0.18), ylim=c(0,0.04))

Our visual above displays a line graph of the associated weights of our assets and a scatter plot comparing the mean vs. StdDev of the individual portfolio compared to the mean vs. StdDev of the portfolio as an aggregate whole.

The following code extracts the weights from our portfolio object and stores it as a vector of weights.

Code

optimized_list_of_weights = opt_maxret[1]$weights
optimized_vector_of_weights = unlist(optimized_list_of_weights)
100 * round(optimized_vector_of_weights, 5)

   TLT   MSFT    HPQ   COST      F    PEP    JPM    RTX     PG    CVX    JNJ 
22.245 17.360  0.000 29.998  0.000  1.572  5.466  0.000 11.951  7.662  3.746

Optimized Portfolio Visualizations

Fund Total Growth Return

The visual below displays how much you would have right now if you had invested $10,000 at the start of the time period till today.

*Please note that this does not account for tax consequences, which could reduce your final capital depending on tax policy and brackets

Code

# Data Retstructuring

# Pearl Fund

aggregate_portfolio_monthly_returns_growth <- stock_returns_monthly_full_long %>%
    tq_portfolio(assets_col = symbol, 
                 returns_col= `MonthlyReturnsFull`, 
                 weights = optimized_vector_of_weights,
                 col_rename   = "investment.growth",
                 wealth.index = TRUE) %>% 
  mutate(investment.growth = investment.growth * 10000)

# Pearl Fund - Equally Weighted 

equally_weighted_portfolio_monthly_returns_growth <- stock_returns_monthly_full_long %>%
    tq_portfolio(assets_col = symbol, 
                 returns_col= `MonthlyReturnsFull`, 
                 weights = init_weights,
                 col_rename   = "investment.growth",
                 wealth.index = TRUE) %>% 
  mutate(investment.growth = investment.growth * 10000)

# SPY Portfolio

vals1 <- rep("SPY", 240)
spy_returns_monthly_ticker = spy_returns_monthly
spy_returns_monthly_ticker$symbol <- vals1

spy_returns_monthly_ticker_growth <- spy_returns_monthly_ticker %>%
    tq_portfolio(assets_col = symbol, 
                 returns_col= `SPYMonthlyReturns`, 
                 weights = c(1),
                 col_rename   = "investment.growth",
                 wealth.index = TRUE) %>% 
  mutate(investment.growth = investment.growth * 10000)

# NASDAQ Portfolio

vals2 = rep("NASDAQ", 240)
nasdaq_returns_monthly_ticker = nasdaq_returns_monthly
nasdaq_returns_monthly_ticker$symbol <- vals2

nasdaq_returns_monthly_ticker_growth <- nasdaq_returns_monthly_ticker %>%
    tq_portfolio(assets_col = symbol, 
                 returns_col= `NASDAQMonthlyReturns`, 
                 weights = c(1),
                 col_rename   = "investment.growth",
                 wealth.index = TRUE) %>% 
  mutate(investment.growth = investment.growth * 10000)

# Pearl Fund Portfolio Growth
ggplot(aggregate_portfolio_monthly_returns_growth, aes(x = date, y = investment.growth)) +
  geom_line(linewidth = 2, color = "red") +
  geom_smooth(method = "loess") +
  labs(title = "Pearl Fund Portfolio Growth",
       subtitle = "Optimized Portfolio Growth",
       caption = "Performance growth over our 20-year horizon",
       x = "Year", y = "Portfolio Value") +
  scale_y_continuous(labels = scales::dollar)

Code

# SPY Portfolio Growth
ggplot(spy_returns_monthly_ticker_growth, aes(x = date, y = investment.growth)) +
  geom_line(linewidth = 2, color = "blue") +
  geom_smooth(method = "loess") +
  labs(title = "SPY Portfolio Growth",
       subtitle = "Portfolio Growth",
       caption = "Performance growth over our 20-year horizon",
       x = "Year", y = "Portfolio Value") +
  scale_y_continuous(labels = scales::dollar)

Code

# NASDAQ Portfolio Growth
ggplot(nasdaq_returns_monthly_ticker_growth, aes(x = date, y = investment.growth)) +
  geom_line(linewidth = 2, color = "green") +
  geom_smooth(method = "loess") +
  labs(title = "NASDAQ Portfolio Growth",
       subtitle = "Portfolio Growth",
       caption = "Performance growth over our 20-year horizon",
       x = "Year", y = "Portfolio Value") +
  scale_y_continuous(labels = scales::dollar)

Fund Monthly Returns

The following visual displays a scatter plot of the monthly returns throughout the time period of interest. A trend line (using “loess”) is fitted onto the graph to display the variation and direction at different regions of the scatter plot.

Code

aggregate_portfolio_monthly_returns = stock_returns_monthly_full_long %>% tq_portfolio(assets_col = symbol, returns_col = `MonthlyReturnsFull`, weights = optimized_vector_of_weights, col_rename = "AggregateMonthlyReturns")

aggregate_portfolio_monthly_returns %>% ggplot(aes(x = date, y = `AggregateMonthlyReturns`)) + geom_point() + labs(title = "Pearl Fund Portfolio Returns", subtitle = "Optimized Portfolio", caption = "Shows an above-zero trend indicating long-term positive returns", x = "", y = "Monthly Returns") + 
  geom_smooth(method = "loess") + 
  theme_tq() + scale_color_tq() + scale_y_continuous(labels = scales::percent)

Annualized Returns

To get a better appreciation for the fund’s return, we proceed to annualize the the monthly returns.

Code

portfolio_monthly_mean = unname(unlist(opt_maxret[2]$objective_measures$mean))

# Convert to annualized % return
annualized_return <- paste0(100*((1 + portfolio_monthly_mean)^12 - 1), "%")
names(annualized_return) = "Annualized Return"

cat("Annualized Return:", annualized_return)

Annualized Return: 14.1734654669547%

Comparative Statistical Analysis

To objectively test our portfolio, we will be calculating several financial metrics in relation to both our benchmarks — SPY & QQQ.

Valuable tq_performance tests…: table.Stats, table.CAPM, table.AnnualizedReturns, table.Correlation, table.DownsideRisk, table.DownsideRiskRatio, table.HigherMoments, table.InformationRatio, table.Variability, VaR, SharpeRatio

To begin, it is important to combine the benchmark index monthly returns with our portfolio monthly returns.

Code

# Portfolio + S&P500 Table
spy_combined_table_of_returns = left_join(aggregate_portfolio_monthly_returns, spy_returns_monthly, by="date")

# Portfolio + NASDAQ Table
nasdaq_combined_table_of_returns = left_join(aggregate_portfolio_monthly_returns, nasdaq_returns_monthly, by="date")

# Displays the most recent 5 monthly returns for the tables above
spy_combined_table_of_returns_decreasing =  spy_combined_table_of_returns[order(spy_combined_table_of_returns$date, decreasing = TRUE), ]
head(spy_combined_table_of_returns_decreasing, 5)

# A tibble: 5 × 3
  date       AggregateMonthlyReturns SPYMonthlyReturns
  <date>                       <dbl>             <dbl>
1 2024-03-20                 0.00754           NA     
2 2024-02-29                 0.0559            NA     
3 2024-01-31                 0.0483            NA     
4 2023-12-29                 0.0775             0.0457
5 2023-11-30                 0.0837             0.0913

Code

nasdaq_combined_table_of_returns_decreasing =  nasdaq_combined_table_of_returns[order(nasdaq_combined_table_of_returns$date, decreasing = TRUE), ]
head(nasdaq_combined_table_of_returns_decreasing, 5)

# A tibble: 5 × 3
  date       AggregateMonthlyReturns NASDAQMonthlyReturns
  <date>                       <dbl>                <dbl>
1 2024-03-20                 0.00754              NA     
2 2024-02-29                 0.0559               NA     
3 2024-01-31                 0.0483               NA     
4 2023-12-29                 0.0775                0.0559
5 2023-11-30                 0.0837                0.108

The following code combines all the funds of interest that intend to be comparing into a single table (wide format) once again with the 5 most recent returns displayed.

Code

# Combined Table (3 Funds)
three_fund_combined_table_of_returns = left_join(spy_combined_table_of_returns, nasdaq_returns_monthly, by="date")

three_fund_combined_table_of_returns_decreasing <- arrange(three_fund_combined_table_of_returns, desc(date))

head(three_fund_combined_table_of_returns_decreasing, 5)%>%
  kbl(caption = "Combined returns for past 5 months") %>%
  kable_classic(full_width = F, html_font = "Cambria")

Combined returns for past 5 months
date	AggregateMonthlyReturns	SPYMonthlyReturns	NASDAQMonthlyReturns
2024-03-20	0.0075420	NA	NA
2024-02-29	0.0559100	NA	NA
2024-01-31	0.0483276	NA	NA
2023-12-29	0.0774682	0.0456554	0.0558696
2023-11-30	0.0837300	0.0913438	0.1081882

This is a great display of a side by side view of the monthly returns for our three funds of comparison for our time period of interest. The following visuals are a great measure of the overall volatility of our portfolio compared to the two benchmarks. Graphs that display more “ups” and “downs” with monthly returns indicate larger deviations from the mean. Given similar annualized returns and expected long term value, a risk-averse investor should prefer the portfolios/funds which minimize the volatility/risk for the given time period.

Code

three_fund_long = three_fund_combined_table_of_returns |> 
  pivot_longer(cols = 2:4, values_to = "return", names_to = "symbol") |>
  arrange(symbol)

three_fund_long |>
  ggplot(aes(x = date, y = return, color = symbol)) +
  geom_line() +
  facet_grid(symbol ~ .)

The tq_performance function in tidyquant allows us to produce statistical metrics for financial analysis for our Peal Fund against each of the benchmark index funds.

Code

# Pearl Fund vs. SPY
statisticsA = spy_combined_table_of_returns %>% tq_performance(Ra = `AggregateMonthlyReturns`, Rb = `SPYMonthlyReturns`, performance_fun = table.CAPM)

# Pearl Fund vs. NASDAQ
statisticsB = nasdaq_combined_table_of_returns %>% tq_performance(Ra = `AggregateMonthlyReturns`, Rb = `NASDAQMonthlyReturns`, performance_fun = table.CAPM)

as.data.frame(statisticsA) %>%
  kbl(caption = "Pearl Fund Against SPY") %>%
  kable_classic(full_width = F, html_font = "Cambria")

Pearl Fund Against SPY
ActivePremium	Alpha	AnnualizedAlpha	Beta	Beta-	Beta+	Correlation	Correlationp-value	InformationRatio	R-squared	TrackingError	TreynorRatio
0.0461	0.0062	0.0775	0.6477	0.6066	0.6869	0.7582	0	0.4707	0.5748	0.0979	0.2261

Code

as.data.frame(statisticsB) %>%
  kbl(caption = "Pearl Fund Against NASDAQ") %>%
  kable_classic(full_width = F, html_font = "Cambria")

Pearl Fund Against NASDAQ
ActivePremium	Alpha	AnnualizedAlpha	Beta	Beta-	Beta+	Correlation	Correlationp-value	InformationRatio	R-squared	TrackingError	TreynorRatio
0.0045	0.0056	0.0699	0.5047	0.4657	0.4766	0.7305	0	0.0361	0.5336	0.1255	0.2902

Mean-Variance Analysis (Efficient Frontier)

The efficient frontier is a concept in financial portfolio theory that represents the set of optimal portfolios that offer the highest expected return for a given level of risk, or the lowest risk for a given level of expected return. It is useful for investors and portfolio managers as it helps identify the optimal balance between risk and return, enabling them to construct portfolios that maximize potential gains while minimizing exposure to volatility.

To apply this framework to our portfolio, we first need to modify our monthly stock returns table so that the date column is removed with once again the 5 most recent returns displayed.

Code

stock_returns_monthly_wide_dateless = stock_returns_monthly_wide[, -1]

stock_returns_monthly_wide_decreasing <- arrange(stock_returns_monthly_wide, desc(date))

head(stock_returns_monthly_wide_decreasing[, -1], 5) %>%
  kbl() %>%
  kable_classic(full_width = F, html_font = "Cambria")

TLT	MSFT	HPQ	COST	F	PEP	JPM	RTX	PG	CVX	JNJ
-0.0142039	0.0091571	0.0319091	0.0084663	0.0227790	-0.0044880	0.0336855	0.0333616	-0.0039403	0.0142877	-0.0057942
0.0557169	0.0527540	-0.0068591	0.1069762	0.0011403	0.0597543	-0.0299923	0.0256229	0.0558092	0.0301057	0.0230530
-0.0137643	0.0773574	-0.1044036	0.0219021	-0.0034091	0.0346830	0.0083093	0.0706941	0.0215611	0.0534849	0.0335606
0.0037858	0.0281579	0.0767355	0.0536055	0.1703673	0.0198225	0.0688436	0.1088464	0.0578003	0.0538653	0.0312281
0.0585351	-0.0840474	-0.1040780	-0.1192059	-0.1870351	-0.0867865	-0.1220432	-0.1260669	-0.0274045	-0.0853373	-0.1215110

The next step is to simulate, using a loop, a series of random weights which we then apply certain matrices to. The result is a data frame which stores the list of returns and risk values for portfolios of different weights.

Code

np1 = 1000
ret2 = stock_returns_monthly_wide_dateless  #excluding dates
mu1 = colMeans(ret2)  #mean returns
na1 = ncol(ret2)  #number of assets
varc1 = cov(ret2)

riskp1 = NULL  #vector to store risk
retp1 = NULL  #vector to store returns
# using loops here (not aiming for efficiency but demonstration)

for (i in 1:np1) {
    w = diff(c(0, sort(runif(na1 - 1)), 1))  # random weights
    r1 = t(w) %*% mu1  #matrix multiplication
    sd1 = t(w) %*% varc1 %*% w
    retp1 = rbind(retp1, r1)
    riskp1 = rbind(riskp1, sd1)
}

# create a data frame of risk and return
d_p1 = data.frame(Ret = retp1, Risk = riskp1)

We use ggplot to and to visualize the data accordingly and as expected we achieve the curve as derived from the Modern Portfolio Framework.

Code

p1 = ggplot(d_p1, aes(Risk, Ret, colour = Ret))
# scatter plot
p1 = p1 + geom_point()
# scatter plot with density and identified port risk return (highest
# lowest returns and min risk)
p1 + geom_point() + geom_hline(yintercept = c(max(d_p1$Ret), median(d_p1$Ret),
    min(d_p1$Ret)), colour = c("darkgreen", "darkgray", "darkred"), size = 1) +
    geom_vline(xintercept = d_p1[(d_p1$Risk == min(d_p1$Risk)), ][, 2]) +
    labs(colour = "Portfolio Return", x = "Portfolio Risk", y = "Portfolio Return",
        title = "Random Feasible Portfolios") + theme_bw()

Here is another display of our plot using a different R frontier calculation package. To choose what you’d like to plot exactly in plot(), enter values from below as a vector:

Syntax for choosing vector in plot function below:

Efficient Frontier

Global Minimum Variance Portfolio

Tangent (optimal portfolio)

Risk/Return of each asset

Equal Weights Portfolio

Two Asset Frontier

Monte Carlo

Sharpe Ratio

Code

return.matrix = as.timeSeries(stock_returns_monthly_wide[, -1])

efficient.frontier = portfolioFrontier(return.matrix, `setRiskFreeRate<-`(portfolioSpec(), (latest_ten_year_treasury_yield/100)/12), constraints = "LongOnly" )

plot(efficient.frontier, c(1, 3, 7))

Code

plot(efficient.frontier, c(1,3,5,7))

Optimal portfolios for a given target risk are displayed by the dark colored dots on the border or frontier of the Monte-Carlo simulated data points. The capital allocation line (blue line) which intersects the efficient frontier at a point of tangency indicating the optimal portfolio for an investor at the current risk-free rate (10 year yield on government bond).

The following visual is another very useful metric to understand portfolio weightings based on chosen return/risk levels.

Interpretation

Pick a vertical line/column at a specified target return and target risk and you can find out the allocations of stocks in your portfolio (through the lengths of the appropriate colored bars) to achieve those chosen metrics.

Code

weightsPlot(efficient.frontier)

Here our goal is to produce a different type of frontier visual. Instead of plotting the traditional measures of risk (standard deviation or variance) against returns, we turn to a different measure of risk — expected tail loss (ETL). ETL is a another measure of risk which could be a better use in portfolio optimization in certain situations.

ETL, also known as conditional value at risk (CVaR), is a risk measure that quantifies the expected loss in the tail of a distribution of possible outcomes beyond a certain confidence level. ETL focuses specifically on the extreme or tail events, giving an indication of the potential loss in those extreme scenarios. The preference for ETL over variance in some cases arises because ETL puts more emphasis on extreme events, which can be crucial in risk assessment. Variance might not adequately capture the potential impact of rare but severe events, whereas ETL provides a clearer picture of the expected loss in the tail of the distribution.

The following code generates a series of ETL values for each of our portfolio weights at a chosen confidence level (95%).

Code

confidence_level <- 0.95
np1 <- length(retp1)
cvar_values <- numeric(np1)
returns <- rnorm(1000, mean(retp1), mean(riskp1))
etl_values <- numeric(np1)

for (i in 1:np1) {
  random_sample <- sample(returns, length(returns), replace = TRUE)
  sorted_returns <- sort(random_sample, decreasing = TRUE)
  tail_index <- floor((1 - confidence_level) * length(returns))
  tail_returns <- sorted_returns[1:tail_index]
  etl_values[i] <- mean(tail_returns)
}

We add these ETL values to our data frame of Returns and Risk for our various portfolios.

Code

# Add ETL to this table
d_p1 = mutate(d_p1, ETL = etl_values)
head(d_p1) %>%
  kbl(caption = "Monte Carlo Simulation") %>%
  kable_classic(full_width = F, html_font = "Cambria")

Monte Carlo Simulation
Ret	Risk	ETL
0.0091751	0.0016406	0.0128155
0.0085436	0.0014544	0.0125995
0.0083278	0.0011513	0.0129044
0.0102011	0.0016985	0.0130339
0.0097651	0.0010151	0.0127538
0.0101553	0.0012116	0.0126825

Finally, we plot these ETL values against our return values for our portfolios and are able to visualize the results. By plotting ETL vs. Expected Return, you can assess whether the expected returns of the portfolio adequately compensates for the potential magnitude of extreme losses.

Code

p1 <- ggplot(d_p1, aes(ETL, Ret, colour = Ret)) +
  geom_point() +
  geom_hline(yintercept = c(max(d_p1$Ret), median(d_p1$Ret), min(d_p1$Ret)),
             colour = c("darkgreen", "darkgray", "darkred"), size = 1) +
  geom_vline(xintercept = d_p1[(d_p1$Risk == min(d_p1$Risk)), ][, 2]) +
  labs(colour = "Portfolio Return",
       x = "Expected Tail Loss",
       y = "Portfolio Return",
       title = "Random Feasible Portfolios") +
  theme_bw()

p1 + xlim(min(d_p1$ETL), max(d_p1$ETL)) + ylim(min(d_p1$Ret), max(d_p1$Ret))

Conclusion

To conclude, we plot the Optimized Pearl Fund Portfolio (in red) versus the Equally Weighted Pearl Fund Portfolio (in blue). As evident in the graph below, the Optimized Portfolio generates higher returns than the Equally Weighted Portfolio, while rising more steadily.

Code

ggplot() +
  geom_line(data = aggregate_portfolio_monthly_returns_growth, aes(x = date, y = investment.growth), 
            linewidth = 2, color = "red") +
  geom_smooth(data = aggregate_portfolio_monthly_returns_growth, aes(x = date, y = investment.growth), 
              method = "loess", color = "red", se = FALSE) +  
  annotate(geom = "text", x = as.Date("2017-03-01"), y = 100000, 
           label = "Optimized\nPortfolio", color = "red") +
  geom_line(data = equally_weighted_portfolio_monthly_returns_growth, aes(x = date, y = investment.growth), 
            linewidth = 2, color = "blue") +
  geom_smooth(data = equally_weighted_portfolio_monthly_returns_growth, aes(x = date, y = investment.growth), 
              method = "loess", color = "blue", se = FALSE) +
  annotate(geom = "text", x = as.Date("2023-03-01"), y = 35000, 
           label = "Equally\nWeighted", color = "blue") +
  labs(title = "Pearl Fund Optimized Portfolio Growth vs. Equally Weighted Portfolio",
       subtitle = ,
       caption = "Performance growth over our 20-year horizon",
       x = "Year", y = "Portfolio Value") +
  scale_y_continuous(labels = scales::dollar) +
 theme_minimal()