1 Introduction

Portfolio optimization addresses the weight allocation problem: given a selected set of securities, how much capital should be assigned to each? This assignment uses the mean-variance framework to find weight combinations that are not dominated by alternatives: no other portfolio offers a higher return for the same risk, or lower risk for the same return.

Rather than analytical quadratic optimization, we used a simulation approach: we generated 100N random weight vectors, computed each portfolio’s expected return and risk, and identified the Optimal Portfolio (highest Sharpe ratio) and the Minimum Variance Portfolio (MVP, lowest standard deviation).
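
For reference, the quantities computed for each simulated weight vector can be written as follows. The notation here is an assumption chosen to match the code later in the report: \(w\) is the weight vector, \(\bar{r}\) the vector of mean monthly stock returns (meanret), \(\Sigma\) the covariance matrix (covar), and \(r_f\) the risk-free rate expressed on a monthly basis.

\[
\bar{r}_p = w^\top \bar{r}, \qquad
\sigma_p = \sqrt{w^\top \Sigma \, w}, \qquad
\text{Sharpe}_p = \frac{\bar{r}_p - r_f}{\sigma_p}, \qquad
\sum_{i=1}^{N} w_i = 1, \; w_i \ge 0
\]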

Data retrieval was performed using the yfR package, which provides adjusted closing prices by default and supports batch downloading of multiple tickers in a single call, offering a more robust and up-to-date alternative to quantmod.

2 Required Libraries

library(yfR)      # Stock price data (adjusted prices by default)
library(tidyr)    # pivot_wider for reshaping yfR long format output
library(ggplot2)  # Plotting
library(dplyr)    # Data manipulation
library(knitr)    # kable for formatted table output

3 The myMeanVarPort Function

The function accepts four inputs: tickers, begin_date, end_date, and rf (the annual risk-free rate). It returns a named list containing the vector of stock means, the covariance matrix, and a data frame of simulated portfolio results (weights, mean, sigma, Sharpe ratio).

yfR::yf_get() retrieves all tickers in a single call and returns monthly arithmetic returns on adjusted prices directly, removing the need for per-ticker loops or manual adjusted price extraction. The long format output is reshaped to wide format via pivot_wider() before any matrix calculations are performed.

myMeanVarPort <- function(tickers, begin_date, end_date, rf) {

  # Download monthly adjusted returns via yfR
  df_long <- yf_get(tickers    = tickers,
                    first_date  = begin_date,
                    last_date   = end_date,
                    freq_data   = "monthly",
                    type_return = "arit",
                    be_quiet    = TRUE)

  # Reshape from long to wide format, preserve ticker order, drop NAs
  retout <- df_long %>%
    select(ref_date, ticker, ret_adjusted_prices) %>%
    pivot_wider(names_from  = ticker,
                values_from = ret_adjusted_prices) %>%
    arrange(ref_date) %>%
    select(all_of(tickers)) %>%
    na.omit()

  # Aggregate means and covariance matrix over the full period
  meanret <- as.matrix(colMeans(retout))
  covar   <- var(retout)

  # Simulate 100*N random weight vectors (seed = 12)
  n     <- length(tickers)
  niter <- 100 * n
  set.seed(12)
  randomnums <- data.frame(replicate(n, runif(niter, 0, 1)))
  wt_sim     <- randomnums / rowSums(randomnums)

  # Convert annual rf to monthly before computing Sharpe ratios
  rf_monthly <- rf / 12

  # Compute portfolio mean, sigma, and Sharpe ratio for each simulation
  weight  <- matrix(NA, nrow = n, ncol = 1)
  Results <- matrix(NA, nrow = niter, ncol = n + 3)

  for (i in 1:niter) {
    for (k in 1:n) {
      Results[i, k] <- weight[k, 1] <- wt_sim[i, k]
    }
    port_mean         <- t(weight) %*% meanret
    port_sigma        <- sqrt(t(weight) %*% covar %*% weight)
    sharpe            <- (port_mean - rf_monthly) / port_sigma
    Results[i, n + 1] <- port_mean
    Results[i, n + 2] <- port_sigma
    Results[i, n + 3] <- sharpe
  }

  colnames(Results) <- c(paste0(tickers, "_wt"), "PortMean", "PortSigma", "Sharpe")

  list(
    stock_means    = round(meanret, 6),
    cov_matrix     = round(covar, 8),
    sim_portfolios = as.data.frame(Results)
  )
}
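
The double loop inside myMeanVarPort computes one portfolio at a time. As a minimal sketch, the same three statistics could also be computed for all weight vectors at once with matrix algebra. The helper name portfolio_stats and the two-asset toy inputs below are hypothetical and are not part of the original function.

```r
# Hypothetical helper (not in the original report): vectorized portfolio statistics.
# W:          niter x n matrix (or data frame) of weights, each row summing to 1
# mu:         length-n vector of mean returns
# Sigma:      n x n covariance matrix
# rf_monthly: risk-free rate on the same (monthly) basis as mu
portfolio_stats <- function(W, mu, Sigma, rf_monthly) {
  W     <- as.matrix(W)
  Sigma <- as.matrix(Sigma)
  port_mean  <- as.vector(W %*% as.numeric(mu))
  # rowSums((W %*% Sigma) * W) evaluates w' Sigma w for every row of W
  port_sigma <- sqrt(rowSums((W %*% Sigma) * W))
  data.frame(PortMean  = port_mean,
             PortSigma = port_sigma,
             Sharpe    = (port_mean - rf_monthly) / port_sigma)
}

# Toy check with two made-up assets (illustrative numbers only)
mu_toy    <- c(0.01, 0.02)
Sigma_toy <- matrix(c(0.0025, 0.0005,
                      0.0005, 0.0100), nrow = 2, byrow = TRUE)
W_toy     <- rbind(c(1.0, 0.0),   # all capital in asset 1
                   c(0.5, 0.5))   # equal weights
stats_toy <- portfolio_stats(W_toy, mu_toy, Sigma_toy, rf_monthly = 0.02 / 12)
print(stats_toy)
```

With only 700 portfolios the loop is fast enough, so this is a readability and scalability choice rather than a correction.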

4 Results

4.1 Function Inputs

Per the assignment, the function was run with the required inputs below. With 7 securities, 700 portfolios were simulated (100 × 7).

tickers    <- c("GE", "XOM", "GBX", "SBUX", "PFE", "HMC", "NVDA")
begin_date <- "2021-01-01"
end_date   <- "2024-12-31"
rf         <- 0.02

port_output <- myMeanVarPort(tickers, begin_date, end_date, rf)
Results     <- port_output$sim_portfolios

4.2 Stock Means and Covariance Matrix

The mean returns and covariance matrix are single aggregate values computed over the full sample period. The diagonal elements of the covariance matrix are individual stock variances; off-diagonal elements measure pairwise co-movement and drive diversification benefits.

cat("Mean Monthly Returns (2021-2024):\n")
## Mean Monthly Returns (2021-2024):
print(port_output$stock_means)
##           [,1]
## GE    0.029508
## XOM   0.025301
## GBX   0.023086
## SBUX  0.003455
## PFE  -0.000358
## HMC   0.006483
## NVDA  0.063030
cat("Variance-Covariance Matrix:\n")
## Variance-Covariance Matrix:
print(port_output$cov_matrix)
##               GE         XOM        GBX       SBUX         PFE         HMC
## GE    0.00909667  0.00394022 0.00641849 0.00314594 -0.00075865  0.00270523
## XOM   0.00394022  0.00711813 0.00387242 0.00071499 -0.00066581  0.00141213
## GBX   0.00641849  0.00387242 0.02247839 0.00226764  0.00175634  0.00220912
## SBUX  0.00314594  0.00071499 0.00226764 0.00623296  0.00093618  0.00024185
## PFE  -0.00075865 -0.00066581 0.00175634 0.00093618  0.00566406 -0.00046056
## HMC   0.00270523  0.00141213 0.00220912 0.00024185 -0.00046056  0.00396152
## NVDA  0.00739327  0.00005447 0.00755342 0.00277184  0.00288029  0.00331455
##            NVDA
## GE   0.00739327
## XOM  0.00005447
## GBX  0.00755342
## SBUX 0.00277184
## PFE  0.00288029
## HMC  0.00331455
## NVDA 0.02393420
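
To make the diagonal and off-diagonal reading concrete, the sketch below re-enters a three-stock subset (GE, XOM, PFE) of the matrix printed above and converts it to standard deviations and correlations. The choice of subset and the object names are assumptions made for brevity; cov2cor() is from base R's stats package.

```r
# Three-stock subset of the covariance matrix printed in this section
assets  <- c("GE", "XOM", "PFE")
cov_sub <- matrix(c( 0.00909667,  0.00394022, -0.00075865,
                     0.00394022,  0.00711813, -0.00066581,
                    -0.00075865, -0.00066581,  0.00566406),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(assets, assets))

# Diagonal elements are variances, so their square roots are monthly std devs
monthly_sd <- sqrt(diag(cov_sub))

# Off-diagonal elements rescaled to correlations in [-1, 1]
cor_sub <- cov2cor(cov_sub)

print(round(monthly_sd, 4))
print(round(cor_sub, 3))
```

PFE's negative covariance with GE and XOM shows up here as negative correlation, which is the source of the diversification benefit when it is combined with those two stocks.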

4.3 Simulated Portfolios (First 10 Rows)

cat("Total portfolios simulated:", nrow(Results), "\n\n")
## Total portfolios simulated: 700
kable(round(head(Results, 10), 5),
      caption = "First 10 Simulated Portfolios",
      align   = "r")
First 10 Simulated Portfolios

| GE_wt | XOM_wt | GBX_wt | SBUX_wt | PFE_wt | HMC_wt | NVDA_wt | PortMean | PortSigma | Sharpe |
|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| 0.02200 | 0.12060 | 0.10444 | 0.14058 | 0.26511 | 0.21988 | 0.12739 | 0.01596 | 0.05211 | 0.27424 |
| 0.17490 | 0.09083 | 0.16895 | 0.18331 | 0.12098 | 0.20000 | 0.06103 | 0.01709 | 0.05882 | 0.26227 |
| 0.24676 | 0.01779 | 0.26135 | 0.04117 | 0.14936 | 0.11392 | 0.16966 | 0.02529 | 0.07712 | 0.30625 |
| 0.08228 | 0.22659 | 0.06260 | 0.01112 | 0.27532 | 0.07600 | 0.26609 | 0.02681 | 0.06470 | 0.38862 |
| 0.05806 | 0.20295 | 0.28035 | 0.08887 | 0.16370 | 0.06580 | 0.14027 | 0.02284 | 0.07026 | 0.30132 |
| 0.00897 | 0.11912 | 0.24735 | 0.18626 | 0.10236 | 0.10331 | 0.23262 | 0.02493 | 0.07284 | 0.31936 |
| 0.04826 | 0.19277 | 0.22443 | 0.16279 | 0.26723 | 0.10418 | 0.00034 | 0.01265 | 0.05669 | 0.19366 |
| 0.20001 | 0.15077 | 0.26378 | 0.01574 | 0.18823 | 0.17998 | 0.00148 | 0.01705 | 0.06508 | 0.23643 |
| 0.00667 | 0.10845 | 0.27122 | 0.09188 | 0.25678 | 0.14664 | 0.11837 | 0.01784 | 0.06533 | 0.24757 |
| 0.00282 | 0.21481 | 0.24811 | 0.17142 | 0.14310 | 0.11865 | 0.10109 | 0.01893 | 0.06292 | 0.27435 |

4.4 Optimal and Minimum Variance Portfolios

opt_idx  <- which.max(Results$Sharpe)
opt_port <- Results[opt_idx, ]

mvp_idx  <- which.min(Results$PortSigma)
mvp_port <- Results[mvp_idx, ]

cat("OPTIMAL PORTFOLIO (Highest Sharpe)\n")
## OPTIMAL PORTFOLIO (Highest Sharpe)
cat(sprintf("  Portfolio Mean:  %.5f\n", opt_port$PortMean))
##   Portfolio Mean:  0.03802
cat(sprintf("  Portfolio Sigma: %.5f\n", opt_port$PortSigma))
##   Portfolio Sigma: 0.08575
cat(sprintf("  Sharpe Ratio:    %.5f\n", opt_port$Sharpe))
##   Sharpe Ratio:    0.42395
cat("  Weights:\n")
##   Weights:
for (t in tickers) cat(sprintf("    %-4s : %.2f%%\n", t,
                                opt_port[[paste0(t, "_wt")]] * 100))
##     GE   : 14.04%
##     XOM  : 18.07%
##     GBX  : 2.32%
##     SBUX : 8.09%
##     PFE  : 1.31%
##     HMC  : 12.22%
##     NVDA : 43.95%
cat("\nMINIMUM VARIANCE PORTFOLIO\n")
## 
## MINIMUM VARIANCE PORTFOLIO
cat(sprintf("  Portfolio Mean:  %.5f\n", mvp_port$PortMean))
##   Portfolio Mean:  0.01091
cat(sprintf("  Portfolio Sigma: %.5f\n", mvp_port$PortSigma))
##   Portfolio Sigma: 0.04371
cat(sprintf("  Sharpe Ratio:    %.5f\n", mvp_port$Sharpe))
##   Sharpe Ratio:    0.21155
cat("  Weights:\n")
##   Weights:
for (t in tickers) cat(sprintf("    %-4s : %.2f%%\n", t,
                                mvp_port[[paste0(t, "_wt")]] * 100))
##     GE   : 7.49%
##     XOM  : 19.45%
##     GBX  : 2.44%
##     SBUX : 23.79%
##     PFE  : 20.96%
##     HMC  : 24.46%
##     NVDA : 1.40%
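
As a quick sanity check, the reported Sharpe ratios can be recomputed from the printed (rounded) means and sigmas using the monthly risk-free rate of 0.02 / 12. This is a sketch that re-enters the printed figures by hand; the data frame name reported is hypothetical, and small differences are expected because the inputs are rounded to five decimals.

```r
# Monthly risk-free rate, as in the function (annual 2% divided by 12)
rf_monthly <- 0.02 / 12

# Figures copied from the output above (rounded to 5 decimals)
reported <- data.frame(
  portfolio = c("Optimal", "MVP"),
  mean      = c(0.03802, 0.01091),
  sigma     = c(0.08575, 0.04371),
  sharpe    = c(0.42395, 0.21155)
)

# Sharpe ratio = (portfolio mean - risk-free rate) / portfolio sigma
reported$recomputed <- (reported$mean - rf_monthly) / reported$sigma
reported$abs_diff   <- abs(reported$recomputed - reported$sharpe)

print(reported)
```

Both recomputed values agree with the reported ones to within rounding error. Note that these are monthly Sharpe ratios, since the mean, sigma, and risk-free rate are all monthly.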

5 Plots

5.1 Feasible Set

The simulated portfolios form the characteristic “bullet” shape of the feasible set. Every plotted point is an achievable portfolio; the upper-left boundary is the efficient frontier.

ggplot(Results, aes(x = PortSigma, y = PortMean)) +
  geom_point(colour = "steelblue", alpha = 0.3, size = 1.2) +
  labs(
    title    = "Feasible Set of Simulated Portfolios",
    subtitle = "GE, XOM, GBX, SBUX, PFE, HMC, NVDA | 2021-2024 | 700 Simulations",
    x        = "Portfolio Sigma (Monthly Std Dev)",
    y        = "Portfolio Mean (Monthly Return)"
  ) +
  theme_minimal(base_size = 12) +
  theme(plot.title = element_text(face = "bold"))

5.2 Optimal & Minimum Variance Portfolios

The frontier is traced by binning the simulated portfolios by sigma, taking the maximum-return portfolio within each bin, and smoothing the resulting upper envelope with a loess fit. Only the segment above the MVP (green triangle) is efficient; the portion below it is dominated and therefore inefficient. The Optimal Portfolio (gold diamond) is the simulated portfolio with the highest Sharpe ratio, which approximates the tangency point.

# Bin by sigma (x-axis) and take the MAX return per bin -> upper envelope
seqsig <- seq(min(Results$PortSigma),
              max(Results$PortSigma),
              length.out = 150)

optim_frontier <- Results %>%
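  # include.lowest = TRUE keeps the minimum-sigma portfolio (the MVP) in the first bin;
  # without it, cut() uses (a, b] intervals and assigns NA to the smallest sigma
  mutate(ints = cut(PortSigma, breaks = seqsig, include.lowest = TRUE)) %>%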
  filter(!is.na(ints)) %>%
  group_by(ints) %>%
  summarise(sig_optim  = mean(PortSigma),
            retn_optim = max(PortMean),
            .groups    = "drop") %>%
  arrange(sig_optim) %>%
  # Keep only the efficient portion: above the MVP return
  filter(retn_optim >= mvp_port$PortMean)

# Smooth with a high span for a clean arc (no dips/peaks)
smooth_fit <- loess(retn_optim ~ sig_optim, data = optim_frontier, span = 0.6)

smooth_curve <- data.frame(
  sig_optim = seq(min(optim_frontier$sig_optim),
                  max(optim_frontier$sig_optim),
                  length.out = 300)
)

smooth_curve$retn_optim <- predict(smooth_fit, smooth_curve$sig_optim)

# Plot
ggplot() +
  geom_point(data = Results,
             aes(x = PortSigma, y = PortMean),
             colour = "steelblue", alpha = 0.2, size = 1.2) +
  # Smooth frontier curve
  geom_line(data = smooth_curve,
            aes(x = sig_optim, y = retn_optim),
            colour = "darkred", linewidth = 1.3) +
  geom_point(data = opt_port,
             aes(x = PortSigma, y = PortMean),
             colour = "gold", size = 5, shape = 18) +
  geom_point(data = mvp_port,
             aes(x = PortSigma, y = PortMean),
             colour = "green4", size = 5, shape = 17) +
  annotate("text",
           x     = opt_port$PortSigma + 0.001,
           y     = opt_port$PortMean,
           label = sprintf("Optimal Portfolio\n(Sharpe = %.3f)", opt_port$Sharpe),
           hjust = 0, size = 3.2, colour = "darkgoldenrod") +
  annotate("text",
           x     = mvp_port$PortSigma + 0.001,
           y     = mvp_port$PortMean,
           label = sprintf("Min Variance Portfolio\n(\u03c3 = %.4f)", mvp_port$PortSigma),
           hjust = 0, size = 3.2, colour = "green4") +
  labs(
    title    = "Mean-Variance Portfolio Optimization via Simulation",
    subtitle = "GE, XOM, GBX, SBUX, PFE, HMC, NVDA | 2021-2024 | rf = 2% p.a.",
    x        = "Portfolio Sigma (Monthly Std Dev)",
    y        = "Portfolio Mean (Monthly Return)",
    caption  = "Gold diamond = Optimal Portfolio (Max Sharpe) | Green triangle = Min Variance Portfolio"
  ) +
  theme_minimal(base_size = 12) +
  theme(plot.title   = element_text(face = "bold", size = 13),
        plot.caption = element_text(size = 9, hjust = 0))

6 Analysis and Discussion

6.1 Interpreting the Outputs

Stock means reflect aggregate performance over 2021–2024. NVDA leads all stocks with a monthly mean of ~6.3%, driven by the AI semiconductor boom. GE (~3.0%) and XOM (~2.5%) also post strong positive means, reflecting industrial recovery and energy cycle tailwinds respectively. GBX (~2.3%) and HMC (~0.6%) offer more moderate returns along with cross-sector and international diversification. SBUX (~0.3%) and PFE (~-0.04%) post near-zero or slightly negative means over this period, reflecting consumer headwinds and post-pandemic pharmaceutical softness.

The 700 simulated portfolios sample the feasible set. Its upper-left boundary is the efficient frontier: no portfolio on this boundary can be improved in return without taking on more risk, or in risk without giving up return. The Optimal Portfolio approximates the tangency point where a line from \(r_f\) touches the frontier, maximising the Sharpe ratio \(\frac{\bar{r}_p - r_f}{\sigma_p}\). Note that only the frontier segment above the MVP is truly non-dominated; boundary portfolios below the MVP offer lower return for the same risk and are therefore inefficient.
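
The dominance rule described above can also be applied directly to a table of simulated portfolios, as an alternative to binning and smoothing. The sketch below assumes a data frame with PortSigma and PortMean columns, as in Results; the function name efficient_subset and the toy data are hypothetical.

```r
# Hypothetical helper: keep only non-dominated portfolios.
# A portfolio is kept if no portfolio with lower-or-equal sigma has an
# equal-or-higher mean return.
efficient_subset <- function(df) {
  # Sort by risk ascending; break ties by return descending
  df <- df[order(df$PortSigma, -df$PortMean), ]
  # Best return seen so far among portfolios with lower-or-equal sigma
  best_so_far <- cummax(df$PortMean)
  # Keep a point only when it sets a new running maximum
  keep <- df$PortMean == best_so_far & !duplicated(best_so_far)
  df[keep, ]
}

# Toy example with made-up numbers
toy <- data.frame(PortSigma = c(0.05, 0.04, 0.06, 0.07),
                  PortMean  = c(0.010, 0.012, 0.011, 0.020))
frontier_toy <- efficient_subset(toy)
print(frontier_toy)
```

In the toy data, the portfolios with sigma 0.05 and 0.06 are dropped because the sigma 0.04 portfolio offers a higher return at lower risk. The first row of the result is always the minimum-sigma portfolio, so the filter retains only the segment at and above the MVP.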

The MVP sits at the leftmost tip of the feasible set and is suited to investors who prioritise capital preservation above return.

6.2 Pros and Cons of the Simulation Method

| Criterion | Simulation | Quadratic Optimization |
|---|---|---|
| Precision | Approximate | Exact |
| Interpretability | High: full feasible set visible | Low: solution only |
| Solver required | No | Yes (quadprog, PortfolioAnalytics) |
| Scalability | Poor for large N | Better |
| Constraints | Simple (filter results) | Requires explicit specification |

The key advantage of simulation is transparency; every portfolio is visible and comparable, and no black-box optimizer is needed. The key limitation is approximation: the portfolio identified as optimal depends on the random seed and simulation count, and random sampling may miss the true optimum in high-dimensional weight spaces. Critically, both methods share the same vulnerability to estimation risk. Small errors in mean returns or the covariance matrix can shift results substantially, since both rely on historical data that may not be predictive of future performance.
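
The seed dependence noted above can be illustrated with a small experiment. The sketch below is not a rerun of the report's analysis: it assumes a three-stock subset (GE, PFE, NVDA) with means and covariances re-entered from Section 4.2, uses 300 draws per seed (100 × 3), and the function name best_sharpe is hypothetical.

```r
# Means and covariances for a three-stock subset, copied from Section 4.2
mu <- c(GE = 0.029508, PFE = -0.000358, NVDA = 0.063030)
Sigma <- matrix(c( 0.00909667, -0.00075865, 0.00739327,
                  -0.00075865,  0.00566406, 0.00288029,
                   0.00739327,  0.00288029, 0.02393420),
                nrow = 3, byrow = TRUE,
                dimnames = list(names(mu), names(mu)))
rf_monthly <- 0.02 / 12

# Highest simulated Sharpe ratio for a given seed
best_sharpe <- function(seed, niter = 300) {
  set.seed(seed)
  raw <- matrix(runif(niter * length(mu)), nrow = niter)
  W   <- raw / rowSums(raw)                     # normalise each row to sum to 1
  port_mean  <- as.vector(W %*% mu)
  port_sigma <- sqrt(rowSums((W %*% Sigma) * W))
  max((port_mean - rf_monthly) / port_sigma)
}

# Compare the best Sharpe ratio found under several seeds
seeds  <- c(12, 1, 2, 3)
result <- sapply(seeds, best_sharpe)
print(data.frame(seed = seeds, best_sharpe = round(result, 5)))
```

Any spread across seeds in this output is sampling noise from the simulation alone; it is separate from, and comes on top of, the estimation risk in the means and covariances.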

7 Conclusion

This report demonstrates simulation-based mean-variance optimization using seven stocks (GE, XOM, GBX, SBUX, PFE, HMC, NVDA) over January 2021 to December 2024. Data retrieval was performed using the yfR package, which provides adjusted closing prices by default and enables batch downloading in a single function call. Generating 700 random portfolios, with all return, sigma, and Sharpe values computed as aggregate figures across the full period, allowed us to visualize the feasible set, trace the efficient frontier, and identify both the Optimal Portfolio and the Minimum Variance Portfolio. The simulation approach is intuitive and requires no specialized solvers, making it a practical alternative to quadratic optimization at the cost of an approximate rather than exact solution.