Portfolio optimization addresses the weight allocation problem: given a selected set of securities, how much capital should be assigned to each? This assignment uses the mean-variance framework to find weight combinations that are not dominated by alternatives. That is to say, no other portfolio offers higher return for the same risk or lower risk for the same return.
Rather than solving the quadratic optimization problem analytically, we used a simulation approach: we generated 100 × N random weight vectors (N being the number of securities), computed each portfolio’s expected return and risk, and identified the Optimal Portfolio (highest Sharpe ratio) and the Minimum Variance Portfolio (MVP, lowest standard deviation).
Data retrieval was performed using the yfR package,
which provides adjusted closing prices by default and supports batch
downloading of multiple tickers in a single call, offering a more robust
and up-to-date alternative to quantmod.
The function myMeanVarPort() accepts four inputs (tickers,
begin_date, end_date, and rf, the annual risk-free rate)
and returns a named list containing the stock means, the covariance
matrix, and a data frame of simulated portfolio results (weights, mean,
sigma, Sharpe ratio).
yfR::yf_get() retrieves all tickers in a single call and
returns monthly arithmetic returns on adjusted prices directly, removing
the need for per-ticker loops or manual adjusted price extraction. The
long format output is reshaped to wide format via
pivot_wider() before any matrix calculations are
performed.
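To illustrate the reshape step in isolation, the sketch below builds a small long-format data frame with the same three columns the function selects (ref_date, ticker, ret_adjusted_prices) and pivots it to one column per ticker. The tickers AAA and BBB and all return values are invented for illustration; they are not yfR output.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format returns: one row per (date, ticker) pair.
# Column names mirror those used in the report; the values are made up.
toy_long <- data.frame(
  ref_date            = rep(as.Date(c("2021-01-01", "2021-02-01", "2021-03-01")), times = 2),
  ticker              = rep(c("AAA", "BBB"), each = 3),
  ret_adjusted_prices = c(NA, 0.02, -0.01, NA, 0.03, 0.01)
)

toy_wide <- toy_long %>%
  pivot_wider(names_from = ticker, values_from = ret_adjusted_prices) %>%
  arrange(ref_date) %>%
  select(all_of(c("AAA", "BBB"))) %>%  # keep ticker columns in the requested order
  na.omit()                            # drop rows where any return is missing

print(toy_wide)
```

In this toy input the first month has no return for either ticker, so na.omit() leaves two rows, one per remaining month.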
library(yfR)    # yf_get()
library(dplyr)  # %>%, select(), arrange()
library(tidyr)  # pivot_wider()

myMeanVarPort <- function(tickers, begin_date, end_date, rf) {
  # Download monthly adjusted returns via yfR
  df_long <- yf_get(tickers     = tickers,
                    first_date  = begin_date,
                    last_date   = end_date,
                    freq_data   = "monthly",
                    type_return = "arit",
                    be_quiet    = TRUE)
  # Reshape from long to wide format, preserve ticker order, drop NAs
  retout <- df_long %>%
    select(ref_date, ticker, ret_adjusted_prices) %>%
    pivot_wider(names_from  = ticker,
                values_from = ret_adjusted_prices) %>%
    arrange(ref_date) %>%
    select(all_of(tickers)) %>%
    na.omit()
  # Aggregate means and covariance matrix over the full period
  meanret <- as.matrix(colMeans(retout))
  covar   <- var(retout)
  # Simulate 100*N random weight vectors (seed = 12)
  n     <- length(tickers)
  niter <- 100 * n
  set.seed(12)
  randomnums <- data.frame(replicate(n, runif(niter, 0, 1)))
  wt_sim     <- randomnums / rowSums(randomnums)
  # Convert annual rf to monthly before computing Sharpe ratios
  rf_monthly <- rf / 12
  # Compute portfolio mean, sigma, and Sharpe ratio for each simulation
  weight  <- matrix(NA, nrow = n, ncol = 1)
  Results <- matrix(NA, nrow = niter, ncol = n + 3)
  for (i in 1:niter) {
    for (k in 1:n) {
      Results[i, k] <- weight[k, 1] <- wt_sim[i, k]
    }
    port_mean  <- t(weight) %*% meanret
    port_sigma <- sqrt(t(weight) %*% covar %*% weight)
    sharpe     <- (port_mean - rf_monthly) / port_sigma
    Results[i, n + 1] <- port_mean
    Results[i, n + 2] <- port_sigma
    Results[i, n + 3] <- sharpe
  }
  colnames(Results) <- c(paste0(tickers, "_wt"), "PortMean", "PortSigma", "Sharpe")
  list(
    stock_means    = round(meanret, 6),
    cov_matrix     = round(covar, 8),
    sim_portfolios = as.data.frame(Results)
  )
}
Per the assignment, the function was run on seven securities (GE, XOM, GBX, SBUX, PFE, HMC, NVDA) from January 2021 to December 2024 with an annual risk-free rate of 2%. With 7 securities, 700 portfolios were simulated (100 × 7).
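The double loop in myMeanVarPort() can also be expressed with matrix operations. The sketch below is an alternative formulation, not the code that produced the reported results. It assumes a three-asset subset (GE, XOM, PFE) of the means and covariances reported in this section and the 2% annual risk-free rate; the names W, mu, and Sigma are illustrative.

```r
# Three-asset subset (GE, XOM, PFE) of the reported monthly means and covariances
mu <- c(GE = 0.029508, XOM = 0.025301, PFE = -0.000358)
Sigma <- matrix(c( 0.00909667,  0.00394022, -0.00075865,
                   0.00394022,  0.00711813, -0.00066581,
                  -0.00075865, -0.00066581,  0.00566406),
                nrow = 3, byrow = TRUE,
                dimnames = list(names(mu), names(mu)))
rf_monthly <- 0.02 / 12   # annual rf converted to monthly

set.seed(12)
niter <- 100 * length(mu)
W <- matrix(runif(niter * length(mu)), nrow = niter)
W <- W / rowSums(W)       # each row is a long-only weight vector summing to 1

# Vectorised: one matrix product for the means, one row-wise quadratic form for sigma
port_mean  <- as.vector(W %*% mu)
port_sigma <- sqrt(rowSums((W %*% Sigma) * W))
sharpe     <- (port_mean - rf_monthly) / port_sigma

# Cross-check: sigma of the first portfolio computed the loop way
w1 <- W[1, ]
loop_sigma <- as.numeric(sqrt(t(w1) %*% Sigma %*% w1))
```

The row-wise expression rowSums((W %*% Sigma) * W) evaluates the quadratic form for every weight vector at once, which avoids filling the results matrix element by element.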
The mean returns and covariance matrix are single aggregate values computed over the full sample period. The diagonal elements of the covariance matrix are individual stock variances; off-diagonal elements measure pairwise co-movement and drive diversification benefits.
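As a worked illustration of how an off-diagonal term affects portfolio risk, the sketch below combines GE and PFE using the two variances and the covariance from the matrix printed in this section. The 50/50 split is an arbitrary choice for illustration, not an allocation taken from the results.

```r
# Monthly variances and covariance for GE and PFE, taken from the reported matrix
var_ge     <- 0.00909667
var_pfe    <- 0.00566406
cov_ge_pfe <- -0.00075865

w <- c(0.5, 0.5)   # illustrative equal weights

# Two-asset portfolio variance: w1^2 * s1^2 + w2^2 * s2^2 + 2 * w1 * w2 * s12
port_var   <- w[1]^2 * var_ge + w[2]^2 * var_pfe + 2 * w[1] * w[2] * cov_ge_pfe
port_sigma <- sqrt(port_var)

# Benchmark: weighted average of the two stand-alone standard deviations
avg_sigma <- w[1] * sqrt(var_ge) + w[2] * sqrt(var_pfe)

cat(sprintf("Portfolio sigma:        %.4f\n", port_sigma))
cat(sprintf("Weighted-average sigma: %.4f\n", avg_sigma))
```

Because the GE/PFE covariance is negative, the portfolio sigma (about 0.058) is well below the weighted average of the individual sigmas (about 0.085).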
## Mean Monthly Returns (2021-2024):
## [,1]
## GE 0.029508
## XOM 0.025301
## GBX 0.023086
## SBUX 0.003455
## PFE -0.000358
## HMC 0.006483
## NVDA 0.063030
## Variance-Covariance Matrix:
##               GE         XOM        GBX       SBUX         PFE         HMC       NVDA
## GE    0.00909667  0.00394022 0.00641849 0.00314594 -0.00075865  0.00270523 0.00739327
## XOM   0.00394022  0.00711813 0.00387242 0.00071499 -0.00066581  0.00141213 0.00005447
## GBX   0.00641849  0.00387242 0.02247839 0.00226764  0.00175634  0.00220912 0.00755342
## SBUX  0.00314594  0.00071499 0.00226764 0.00623296  0.00093618  0.00024185 0.00277184
## PFE  -0.00075865 -0.00066581 0.00175634 0.00093618  0.00566406 -0.00046056 0.00288029
## HMC   0.00270523  0.00141213 0.00220912 0.00024185 -0.00046056  0.00396152 0.00331455
## NVDA  0.00739327  0.00005447 0.00755342 0.00277184  0.00288029  0.00331455 0.02393420
## Total portfolios simulated: 700
| GE_wt | XOM_wt | GBX_wt | SBUX_wt | PFE_wt | HMC_wt | NVDA_wt | PortMean | PortSigma | Sharpe |
|---|---|---|---|---|---|---|---|---|---|
| 0.02200 | 0.12060 | 0.10444 | 0.14058 | 0.26511 | 0.21988 | 0.12739 | 0.01596 | 0.05211 | 0.27424 |
| 0.17490 | 0.09083 | 0.16895 | 0.18331 | 0.12098 | 0.20000 | 0.06103 | 0.01709 | 0.05882 | 0.26227 |
| 0.24676 | 0.01779 | 0.26135 | 0.04117 | 0.14936 | 0.11392 | 0.16966 | 0.02529 | 0.07712 | 0.30625 |
| 0.08228 | 0.22659 | 0.06260 | 0.01112 | 0.27532 | 0.07600 | 0.26609 | 0.02681 | 0.06470 | 0.38862 |
| 0.05806 | 0.20295 | 0.28035 | 0.08887 | 0.16370 | 0.06580 | 0.14027 | 0.02284 | 0.07026 | 0.30132 |
| 0.00897 | 0.11912 | 0.24735 | 0.18626 | 0.10236 | 0.10331 | 0.23262 | 0.02493 | 0.07284 | 0.31936 |
| 0.04826 | 0.19277 | 0.22443 | 0.16279 | 0.26723 | 0.10418 | 0.00034 | 0.01265 | 0.05669 | 0.19366 |
| 0.20001 | 0.15077 | 0.26378 | 0.01574 | 0.18823 | 0.17998 | 0.00148 | 0.01705 | 0.06508 | 0.23643 |
| 0.00667 | 0.10845 | 0.27122 | 0.09188 | 0.25678 | 0.14664 | 0.11837 | 0.01784 | 0.06533 | 0.24757 |
| 0.00282 | 0.21481 | 0.24811 | 0.17142 | 0.14310 | 0.11865 | 0.10109 | 0.01893 | 0.06292 | 0.27435 |
# Results is assumed to hold the sim_portfolios data frame returned by myMeanVarPort()
opt_idx  <- which.max(Results$Sharpe)
opt_port <- Results[opt_idx, ]
mvp_idx  <- which.min(Results$PortSigma)
mvp_port <- Results[mvp_idx, ]
cat("OPTIMAL PORTFOLIO (Highest Sharpe)\n")
## OPTIMAL PORTFOLIO (Highest Sharpe)
## Portfolio Mean: 0.03802
## Portfolio Sigma: 0.08575
## Sharpe Ratio: 0.42395
## Weights:
## GE : 14.04%
## XOM : 18.07%
## GBX : 2.32%
## SBUX : 8.09%
## PFE : 1.31%
## HMC : 12.22%
## NVDA : 43.95%
##
## MINIMUM VARIANCE PORTFOLIO
## Portfolio Mean: 0.01091
## Portfolio Sigma: 0.04371
## Sharpe Ratio: 0.21155
## Weights:
## GE : 7.49%
## XOM : 19.45%
## GBX : 2.44%
## SBUX : 23.79%
## PFE : 20.96%
## HMC : 24.46%
## NVDA : 1.40%
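The reported Optimal Portfolio figures can be checked by hand from the printed inputs. The sketch below recomputes the mean, sigma, and Sharpe ratio from the rounded weights, means, and covariance matrix shown in this section, assuming the 2% annual risk-free rate; small differences from the reported values are expected because the inputs are rounded.

```r
tickers <- c("GE", "XOM", "GBX", "SBUX", "PFE", "HMC", "NVDA")

# Reported monthly means and covariance matrix (rounded as printed)
mu <- c(0.029508, 0.025301, 0.023086, 0.003455, -0.000358, 0.006483, 0.063030)
Sigma <- matrix(c(
   0.00909667,  0.00394022, 0.00641849, 0.00314594, -0.00075865,  0.00270523, 0.00739327,
   0.00394022,  0.00711813, 0.00387242, 0.00071499, -0.00066581,  0.00141213, 0.00005447,
   0.00641849,  0.00387242, 0.02247839, 0.00226764,  0.00175634,  0.00220912, 0.00755342,
   0.00314594,  0.00071499, 0.00226764, 0.00623296,  0.00093618,  0.00024185, 0.00277184,
  -0.00075865, -0.00066581, 0.00175634, 0.00093618,  0.00566406, -0.00046056, 0.00288029,
   0.00270523,  0.00141213, 0.00220912, 0.00024185, -0.00046056,  0.00396152, 0.00331455,
   0.00739327,  0.00005447, 0.00755342, 0.00277184,  0.00288029,  0.00331455, 0.02393420),
  nrow = 7, byrow = TRUE, dimnames = list(tickers, tickers))

# Reported Optimal Portfolio weights (percentages converted to fractions)
w_opt <- c(14.04, 18.07, 2.32, 8.09, 1.31, 12.22, 43.95) / 100

rf_monthly <- 0.02 / 12   # 2% per year converted to a monthly rate

port_mean  <- sum(w_opt * mu)
port_sigma <- sqrt(as.numeric(t(w_opt) %*% Sigma %*% w_opt))
sharpe     <- (port_mean - rf_monthly) / port_sigma

print(round(c(mean = port_mean, sigma = port_sigma, sharpe = sharpe), 5))
```

With these rounded inputs the recomputed values agree with the reported 0.03802, 0.08575, and 0.42395 to within rounding.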
The simulated portfolios form the characteristic “bullet” shape of the feasible set. Every plotted point is an achievable portfolio; the upper-left boundary is the efficient frontier.
ggplot(Results, aes(x = PortSigma, y = PortMean)) +
  geom_point(colour = "steelblue", alpha = 0.3, size = 1.2) +
  labs(
    title    = "Feasible Set of Simulated Portfolios",
    subtitle = "GE, XOM, GBX, SBUX, PFE, HMC, NVDA | 2021-2024 | 700 Simulations",
    x        = "Portfolio Sigma (Monthly Std Dev)",
    y        = "Portfolio Mean (Monthly Return)"
  ) +
  theme_minimal(base_size = 12) +
  theme(plot.title = element_text(face = "bold"))
The efficient frontier is traced by binning the simulated portfolios by sigma and keeping the highest return within each bin. The segment above the MVP (green triangle) constitutes the efficient frontier; the portion below it is dominated and therefore inefficient. The Optimal Portfolio (gold diamond) is the simulated portfolio with the highest Sharpe ratio, which approximates the tangency point.
# Bin by sigma (x-axis) and take the MAX return per bin -> upper envelope
seqsig <- seq(min(Results$PortSigma),
              max(Results$PortSigma),
              length.out = 150)
optim_frontier <- Results %>%
  # include.lowest = TRUE keeps the lowest-sigma portfolio (the MVP) in the first bin
  mutate(ints = cut(PortSigma, breaks = seqsig, include.lowest = TRUE)) %>%
  filter(!is.na(ints)) %>%
  group_by(ints) %>%
  summarise(sig_optim  = mean(PortSigma),
            retn_optim = max(PortMean),
            .groups    = "drop") %>%
  arrange(sig_optim) %>%
  # Keep only the efficient portion: above the MVP return
  filter(retn_optim >= mvp_port$PortMean)
# Smooth with a high span for a clean arc (no dips/peaks)
smooth_fit <- loess(retn_optim ~ sig_optim, data = optim_frontier, span = 0.6)
smooth_curve <- data.frame(
  sig_optim = seq(min(optim_frontier$sig_optim),
                  max(optim_frontier$sig_optim),
                  length.out = 300)
)
smooth_curve$retn_optim <- predict(smooth_fit, smooth_curve$sig_optim)
# Plot
ggplot() +
  geom_point(data = Results,
             aes(x = PortSigma, y = PortMean),
             colour = "steelblue", alpha = 0.2, size = 1.2) +
  # Smooth frontier curve
  geom_line(data = smooth_curve,
            aes(x = sig_optim, y = retn_optim),
            colour = "darkred", linewidth = 1.3) +
  geom_point(data = opt_port,
             aes(x = PortSigma, y = PortMean),
             colour = "gold", size = 5, shape = 18) +
  geom_point(data = mvp_port,
             aes(x = PortSigma, y = PortMean),
             colour = "green4", size = 5, shape = 17) +
  annotate("text",
           x = opt_port$PortSigma + 0.001,
           y = opt_port$PortMean,
           label = sprintf("Optimal Portfolio\n(Sharpe = %.3f)", opt_port$Sharpe),
           hjust = 0, size = 3.2, colour = "darkgoldenrod") +
  annotate("text",
           x = mvp_port$PortSigma + 0.001,
           y = mvp_port$PortMean,
           label = sprintf("Min Variance Portfolio\n(\u03c3 = %.4f)", mvp_port$PortSigma),
           hjust = 0, size = 3.2, colour = "green4") +
  labs(
    title    = "Mean-Variance Portfolio Optimization via Simulation",
    subtitle = "GE, XOM, GBX, SBUX, PFE, HMC, NVDA | 2021-2024 | rf = 2% p.a.",
    x        = "Portfolio Sigma (Monthly Std Dev)",
    y        = "Portfolio Mean (Monthly Return)",
    caption  = "Gold diamond = Optimal Portfolio (Max Sharpe) | Green triangle = Min Variance Portfolio"
  ) +
  theme_minimal(base_size = 12) +
  theme(plot.title   = element_text(face = "bold", size = 13),
        plot.caption = element_text(size = 9, hjust = 0))
Stock means reflect aggregate performance over 2021–2024. NVDA leads all stocks with a monthly mean of ~6.3%, driven by the AI semiconductor boom. GE (~3.0%) and XOM (~2.5%) also post strong positive means, reflecting industrial recovery and energy cycle tailwinds respectively. GBX (~2.3%) and HMC (~0.6%) offer positive returns with cross-sector and international diversification. SBUX (~0.3%) and PFE (about -0.04%) post near-zero or negative means over this period, reflecting consumer headwinds and post-pandemic pharmaceutical softness.
The 700 simulated portfolios sample the feasible set. Its upper-left boundary is the efficient frontier: for a portfolio on this boundary, no feasible alternative offers a higher return at the same risk or a lower risk at the same return. The Optimal Portfolio approximates the tangency point, where a line drawn from the monthly risk-free rate \(r_f\) touches the frontier, maximising the Sharpe ratio \(\frac{\bar{r}_p - r_f}{\sigma_p}\). Only the boundary segment above the MVP is non-dominated; portfolios on the lower branch offer a lower return for the same risk and are therefore inefficient.
The MVP sits at the leftmost tip of the feasible set and is suited to investors who prioritise capital preservation above return.
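The dominance criterion can be applied mechanically to any set of simulated portfolios. The sketch below does so for the ten portfolios listed in the table earlier in this section, using only their PortMean and PortSigma columns. The helper is_dominated() is a hypothetical name introduced for this illustration and is not part of the report's code.

```r
# PortMean and PortSigma of the ten simulated portfolios listed in the table
sims <- data.frame(
  PortMean  = c(0.01596, 0.01709, 0.02529, 0.02681, 0.02284,
                0.02493, 0.01265, 0.01705, 0.01784, 0.01893),
  PortSigma = c(0.05211, 0.05882, 0.07712, 0.06470, 0.07026,
                0.07284, 0.05669, 0.06508, 0.06533, 0.06292)
)

# A portfolio is dominated if some other portfolio has return >= and risk <=,
# with at least one of the two strictly better.
is_dominated <- function(i, df) {
  no_worse        <- df$PortMean >= df$PortMean[i] & df$PortSigma <= df$PortSigma[i]
  strictly_better <- df$PortMean >  df$PortMean[i] | df$PortSigma <  df$PortSigma[i]
  any(no_worse & strictly_better)
}

sims$dominated <- vapply(seq_len(nrow(sims)), is_dominated, logical(1), df = sims)
efficient_rows <- which(!sims$dominated)
print(sims[efficient_rows, ])
```

Among these ten rows, portfolios 1, 2, 4, and 10 are not dominated by any of the others; the remaining six are each beaten on return, risk, or both by at least one listed portfolio.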
| | Simulation | Quadratic Optimization |
|---|---|---|
| Precision | Approximate | Exact |
| Interpretability | High (full feasible set visible) | Low (solution only) |
| Solver required | No | Yes (quadprog, PortfolioAnalytics) |
| Scalability | Poor for large N | Better |
| Constraints | Simple (filter results) | Requires explicit specification |
The key advantage of simulation is transparency; every portfolio is visible and comparable, and no black-box optimizer is needed. The key limitation is approximation: the portfolio identified as optimal depends on the random seed and simulation count, and random sampling may miss the true optimum in high-dimensional weight spaces. Critically, both methods share the same vulnerability to estimation risk. Small errors in mean returns or the covariance matrix can shift results substantially, since both rely on historical data that may not be predictive of future performance.
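The dependence on the random seed can be made concrete with a small experiment. The sketch below reruns a 700-draw simulation under five arbitrary seeds, using the rounded means and covariance matrix reported earlier and the 2% annual risk-free rate, and compares the best simulated Sharpe ratio with an analytical upper bound. The bound allows short positions, so it is an assumption-laden ceiling rather than the long-only optimum; best_sharpe() is a hypothetical helper, not the report's function.

```r
# Reported monthly means and covariance matrix (rounded as printed)
mu <- c(0.029508, 0.025301, 0.023086, 0.003455, -0.000358, 0.006483, 0.063030)
Sigma <- matrix(c(
   0.00909667,  0.00394022, 0.00641849, 0.00314594, -0.00075865,  0.00270523, 0.00739327,
   0.00394022,  0.00711813, 0.00387242, 0.00071499, -0.00066581,  0.00141213, 0.00005447,
   0.00641849,  0.00387242, 0.02247839, 0.00226764,  0.00175634,  0.00220912, 0.00755342,
   0.00314594,  0.00071499, 0.00226764, 0.00623296,  0.00093618,  0.00024185, 0.00277184,
  -0.00075865, -0.00066581, 0.00175634, 0.00093618,  0.00566406, -0.00046056, 0.00288029,
   0.00270523,  0.00141213, 0.00220912, 0.00024185, -0.00046056,  0.00396152, 0.00331455,
   0.00739327,  0.00005447, 0.00755342, 0.00277184,  0.00288029,  0.00331455, 0.02393420),
  nrow = 7, byrow = TRUE)
rf_monthly <- 0.02 / 12

# Best Sharpe ratio found by one long-only random-weight simulation
best_sharpe <- function(seed, mu, Sigma, rf, niter = 100 * length(mu)) {
  set.seed(seed)
  W <- matrix(runif(niter * length(mu)), nrow = niter)
  W <- W / rowSums(W)
  port_mean  <- as.vector(W %*% mu)
  port_sigma <- sqrt(rowSums((W %*% Sigma) * W))
  max((port_mean - rf) / port_sigma)
}

seeds    <- 1:5   # arbitrary seeds chosen for illustration
sim_best <- vapply(seeds, best_sharpe, numeric(1),
                   mu = mu, Sigma = Sigma, rf = rf_monthly)

# Upper bound on the Sharpe ratio of any weight vector (short positions allowed):
# sqrt(e' Sigma^{-1} e), where e is the vector of excess returns
excess <- mu - rf_monthly
bound  <- sqrt(as.numeric(t(excess) %*% solve(Sigma, excess)))

print(round(sim_best, 4))
print(round(bound, 4))
```

The spread across seeds shows how much the "optimal" portfolio depends on the draw, and the gap to the bound gives a rough sense of how far a 700-draw sample may sit from what an exact solver could reach.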
This report demonstrates simulation-based mean-variance optimization
using seven stocks (GE, XOM, GBX, SBUX, PFE, HMC, NVDA) over January
2021 to December 2024. Data retrieval was performed using the
yfR package, which provides adjusted closing prices by
default and enables batch downloading in a single function call.
Generating 700 random portfolios, with all return, sigma, and Sharpe
values treated as aggregate figures across the full period, allowed
us to visualize the feasible set, trace the efficient frontier, and
identify both the Optimal Portfolio and the Minimum Variance Portfolio.
The simulation approach is intuitive and requires no specialized
solvers, making it a practical alternative to quadratic optimization at
the cost of an approximate rather than exact solution.