This week, one of the topics of interest is variance reduction. Discuss a method of variance reduction in simulation (see Rizzo’s SCR chapter 5). Demonstrate the method. Discuss when it might be used.

Increasing Replications:

See: Rizzo pg 126, Chapter: 5.3.

\[Var(\hat{\theta})=Var\overline Y=\frac{Var_{f}g(X)}{m}\]

In general, if standard error should be at most \(e\) and \(Var_{f}g(X)=\sigma^2\), then:
\[ m \geq \lceil \sigma^2/e^2 \rceil \]

Given the Exponential Distribution Function:

\(f_{R}(x) = \begin{cases} \lambda e^{-\lambda x}, & x \geq 0\\ 0, & x < 0 \end{cases}\)

And it’s CDF:

\(F_{R}(x) = \begin{cases} 1-\lambda e^{-\lambda x}, & x \geq 0\\ 0, & x < 0 \end{cases}\)

Setting a CDF function for a given distribution \(F(x)\) equal to an RV \(R\) where \(R=U[0,1]\), I can generated random values from the exponential function:

\(F(X)=1-\lambda e^{-\lambda x}=R\). After solving for \(X\), we get: \(X=-\frac{1}{\lambda}\ln(1-R)\) - my random variate generator for which we’ll look at different variances:

First, my function to return a value for \(R\), the RV for the uniform distribution on \([0,1]\) through the above mentioned function:

# Produces Exponentially Distributed RV:  
my_rvg_expo <- function(lam){
  x <- runif(1)
  return((-1/lam)*log(1-x))
}

Demonstration of the poor returns in Variance Reduction from increased replication:

First I create the function that will run my Monte Carlo simulations for this exponential variable with \(\lambda = 1\).

my_replicate <- function(n){
  # replicates the generation of n
  # exponentially distributed RVs
  # and returns a DF with SD, Var, and n
  q <- NULL
  for(i in 2:n){
    g <- my_rvg_expo(1)
    q <- append(q, g)
  }
  
  Sd_val <- sd(q)
  Var_val <- Sd_val^2
  Qty_val <- n
  
  result_val <- data.frame(n = c(n),
                           SD = c(Sd_val),
                           Var = c(Var_val))
  
  return(result_val)

}

As shown below, I have plotted runs from 100 to 50,000 in jumps of 500 to illustrate how little variance is reduced as n increases:

n <- seq(100,50000, 500)
final_results <- NULL

for(i in n){
  interm <-  my_replicate(i)
  final_results <- rbind(final_results,interm)
}

ggplot(final_results, aes(x=n, y=Var)) + 
geom_point() + 
geom_smooth(method = 'loess')

Note only that. This took 5minutes to run.