Statistical Resampling Methods: Part 1


Monte Carlo Simulations

Monte Carlo Simulations is a method of estimating the value of an unknown quantity using the principles of inferential statistics. The intuition is that a random sample tends to exhibit the same properties as the population from which it is drawn.

In applied statistics, Monte Carlo methods are generally used for two purposes:

  1. To compare competing statistics for small samples under realistic data conditions. Although Type I error and power properties of statistics can be calculated for data drawn from classical theoretical distributions (e.g., normal curve, Cauchy distribution) for asymptotic conditions (i.e., infinite sample size and infinitesimally small treatment effect), real data often do not have such distributions.

  2. To provide implementations of hypothesis tests that are more efficient than exact tests such as permutation tests (which are often impossible to compute) while being more accurate than critical values for asymptotic distributions.


Monte Carlo simulations sample from a probability distribution for each variable to produce hundreds or thousands of possible outcomes. The results are analyzed to get probabilities of different outcomes occurring.

Example: Characterizing Variability in Returns for Financial Portfolios

What happens with riskless asset over a 40 year trading period?

library(foreach)

#creating a riskless asset
#Horizon = number of years
Horizon = 40
ReturnAvg = 0.05

#initial investment value
Wealth = 10000

#for loop updates the value of wealth over 40 yrs
for(year in 1:Horizon){
  ThisReturn = ReturnAvg
  Wealth = Wealth * (1 + ThisReturn)
}

Wealth
## [1] 70399.89
#Compare result with the usual compound interest formula 
10000 * (1 + ReturnAvg)^40
## [1] 70399.89

So, an initial investment of $10,000 with an average annual return of 5% over 40 years generates a portfolio value of $70,399.89

This value does not change when the projection is repeated because there is no uncertainty

The graph below shows change in wealth over time, which is an exponential growth curve with no uncertainty:

#resetting initial wealth
Wealth = 10000
WealthOverTime = rep(0, Horizon)

#for loop updates the value of wealth over 40 years 
for(year in 1:Horizon) {
  ThisReturn = ReturnAvg
  Wealth = Wealth * (1 + ThisReturn)
  WealthOverTime[year] = Wealth

}

plot(WealthOverTime, type = "l", xlab = "Years", ylab = "Total Portfolio Value", main = "Portfolio Value of a Riskless Asset", col = "dark green", lwd = 10, cex.lab = 1.5, cex.main = 1.5)

plot of chunk unnamed-chunk-2

Injecting some Noise/Variability:

#Simulating a risky asset with a positive expected return
ReturnAvg = 0.05 
ReturnSD = 0.05
Horizon = 40

#resetting initial wealth
Wealth = 10000
#Allocating space to variable
WealthOverTime = rep(0, Horizon)

for(year in 1:Horizon){
  ThisReturn = rnorm(1, ReturnAvg, ReturnSD)
  Wealth = Wealth * (1 + ThisReturn)
  WealthOverTime[year] = Wealth
}

Wealth
## [1] 51246.92
plot(WealthOverTime, type = "l", xlab = "Years", ylab = "Total Portfolio Value", main = "Portfolio Value of a Risky Asset", col = "pink", lwd = "10", cex.lab = 1.5, cex.main = 1.5)

plot of chunk unnamed-chunk-3
After adding in a standard deviation of 5% to the average annual return rate, we see a different terminal wealth value with high variability each time you run this simulation.

To quantify this variabilty, we can use a monte carlo simulation:

ReturnAvg = 0.05
ReturnSD = 0.05
Horizon = 40
sim1 = foreach(i=1:1000, .combine='c') %do% {
 Wealth = 10000 # Reset initial wealth
 # Sweep through each year and update the value of wealth
 for(year in 1:Horizon) {
 # Generate a random return
 ThisReturn = rnorm(1, ReturnAvg, ReturnSD)

 # Update wealth
 Wealth = Wealth * (1 + ThisReturn)
 }
 # Output the value of wealth for each simulated scenario
 Wealth
}

hist(sim1, breaks = 20, main = "Sampling Distribution of Terminal Wealth Values", xlab = "Value ($)", cex.lab = 1.5, cex.main = 1.5)
abline(v = mean(sim1), col = "red", lwd =10)

plot of chunk unnamed-chunk-4
Using rnorm, we obtain a yearly return that is normally distributed. The monte carlo simulation repeats 1,000 times and returns the sampling distribution of the terminal wealth over 40 years.

The expected value of the monte carlo simulation terminal wealth mean is similar to the terminal wealth of a riskless asset. However, by adding variability into the average return rate, we see a large spread of possibilities for the terminal wealth:

mean(sim1)
## [1] 69268.62
sd(sim1)
## [1] 20466.69
ReturnAvg = 0.05
ReturnSD = 0.05
Horizon = 40
sim1 = foreach(i=1:500, .combine='rbind') %do% {
 Wealth = 10000 # Reset initial wealth
 WealthOverTime = rep(0, Horizon) # Allocate some space
 # Sweep through each year and update the value of wealth
 for(year in 1:Horizon) {
    ThisReturn = rnorm(1, ReturnAvg, ReturnSD)
    Wealth = Wealth * (1 + ThisReturn)
    WealthOverTime[year] = Wealth
 }

}
 plot(x = WealthOverTime, type = "l", col = "blue", ylab = "Total Portfolio Value", xlab = "Years", main = "Monte Carlo Simulation of Portfolio Returns", lwd = 10, cex.lab = 1.5, cex.main = 1.5)

plot of chunk unnamed-chunk-5

Professor Scott's graph:


Summary: