The central limit theorem tells us that, as the sample size grows, the maximum likelihood estimator is asymptotically normal. In practice the sample is always finite, so the question becomes: which estimator of the asymptotic variance should you use? Here we run a Monte Carlo experiment for the probit model to compare three estimators of the parameter variance: one based on the outer product of the gradient (J1), one based on the Hessian of the log-likelihood (J2), and one based on the expected value of the Hessian (J3).
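For reference (my notation, not from the original script): writing $s_i(\theta)$ for the score contribution of observation $i$ and $\ell_i(\theta)$ for its log-likelihood, the three information matrix estimates, whose inverses estimate the asymptotic variance, are

$$
\hat{J}_1 = \sum_{i=1}^{n} s_i(\hat\theta)\,s_i(\hat\theta)', \qquad
\hat{J}_2 = -\sum_{i=1}^{n} \frac{\partial^2 \ell_i(\hat\theta)}{\partial\theta\,\partial\theta'}, \qquad
\hat{J}_3 = \sum_{i=1}^{n} \mathrm{E}\!\left[-\frac{\partial^2 \ell_i(\hat\theta)}{\partial\theta\,\partial\theta'}\,\Big|\,x_i\right].
$$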
J_1 returns an estimate of the inverse of the asymptotic variance matrix, based on the outer product of the score (gradient) contributions
J_1 <- function(y, x, par) {
  Phi = pnorm(x %*% par)   # Phi(x'par): probability that y = 1
  phi = dnorm(x %*% par)   # standard normal density at x'par
  n = length(y)            # sample size
  k = length(par)          # number of parameters
  # Each row of g is the score contribution of one observation
  g = matrix(rep(y*phi/Phi - (1-y)*phi/(1-Phi), k), nrow = n) * x
  # Sum of the outer products of the score contributions
  f = t(g) %*% g
  return(f)
}
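As a small sanity check (not part of the original script), J_1 should equal the sum over observations of the outer products of the individual score contributions. check_J1 below is a hypothetical helper, assuming y, x and par are available.

# Hypothetical check: rebuild J_1 observation by observation.
# The score contribution of observation i is w_i * x_i, so its outer product is w_i^2 * x_i x_i'.
check_J1 <- function(y, x, par) {
  Phi <- pnorm(x %*% par)
  phi <- dnorm(x %*% par)
  w   <- as.vector(y * phi / Phi - (1 - y) * phi / (1 - Phi))
  S   <- Reduce(`+`, lapply(seq_along(w), function(i) w[i]^2 * x[i, ] %*% t(x[i, ])))
  all.equal(S, J_1(y, x, par))  # should be TRUE
}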
Probit_LL_h returns the Hessian of the (negative) log-likelihood function of the probit model, which serves as the J2 estimate of the inverse of the asymptotic variance matrix
Probit_LL_h <- function(y, x, par) {
  Phi = pnorm(x %*% par)   # Phi(x'par): probability that y = 1
  phi = dnorm(x %*% par)   # standard normal density at x'par
  n = length(y)            # sample size
  k = length(par)          # number of parameters
  # Computing the Hessian (matrix of second partial derivatives)
  h = t(x) %*% (matrix(rep((1-y)*(phi/(1-Phi)*(x%*%par) - (phi^2)/(1-Phi)^2), k), nrow = n) * x) +
      t(x) %*% (matrix(rep((y)*(phi/Phi*(-x%*%par) - (phi^2)/Phi^2), k), nrow = n) * x)
  # Flip the sign to get the Hessian of the negative log-likelihood
  h = -h
  return(h)
}
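The terms inside the two matrix products follow from differentiating the score once more. Writing $z_i = x_i'\theta$ and using $\varphi'(z) = -z\,\varphi(z)$,

$$
\frac{d}{dz}\frac{\varphi(z)}{\Phi(z)} = -z\,\frac{\varphi(z)}{\Phi(z)} - \frac{\varphi(z)^2}{\Phi(z)^2},
\qquad
\frac{d}{dz}\frac{\varphi(z)}{1-\Phi(z)} = -z\,\frac{\varphi(z)}{1-\Phi(z)} + \frac{\varphi(z)^2}{\bigl(1-\Phi(z)\bigr)^2},
$$

so the Hessian of the log-likelihood is

$$
\sum_{i=1}^{n}\left[\, y_i\left(-z_i\frac{\varphi_i}{\Phi_i} - \frac{\varphi_i^2}{\Phi_i^2}\right) + (1-y_i)\left(z_i\frac{\varphi_i}{1-\Phi_i} - \frac{\varphi_i^2}{(1-\Phi_i)^2}\right)\right] x_i x_i',
$$

which is exactly what the code computes before flipping the sign.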
J_3 returns an estimate of the inverse of the asymptotic variance matrix, based on the expected value of the Hessian
J_3 <- function(y, x, par) {
  Phi = pnorm(x %*% par)   # Phi(x'par): probability that y = 1
  phi = dnorm(x %*% par)   # standard normal density at x'par
  n = length(y)            # sample size
  k = length(par)          # number of parameters
  # Expected (negative) Hessian: weight phi^2/(Phi*(1-Phi)) on each x_i x_i'
  f = t(x) %*% (matrix(rep(phi/Phi * phi/(1-Phi), k), nrow = n) * x)
  return(f)
}
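The weight $\varphi_i^2/\bigl(\Phi_i(1-\Phi_i)\bigr)$ used in J_3 is the conditional expectation of the outer product of the score (equivalently, of minus the Hessian). Since the score contribution can be written as $s_i = \frac{(y_i-\Phi_i)\,\varphi_i}{\Phi_i(1-\Phi_i)}\,x_i$ and $\mathrm{Var}(y_i \mid x_i) = \Phi_i(1-\Phi_i)$,

$$
\mathrm{E}\!\left[s_i s_i' \mid x_i\right]
= \frac{\varphi_i^2}{\Phi_i^2(1-\Phi_i)^2}\,\mathrm{Var}(y_i \mid x_i)\,x_i x_i'
= \frac{\varphi_i^2}{\Phi_i(1-\Phi_i)}\,x_i x_i'.
$$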
Probit_LL returns the (negative) log-likelihood function of the probit model
Probit_LL <- function(y, x, par) {
  Phi = pnorm(x %*% par)   # Phi(x'par): probability that y = 1
  phi = dnorm(x %*% par)   # standard normal density at x'par (not used here)
  n = length(y)            # sample size
  k = length(par)          # number of parameters
  # Computing the log-likelihood
  f = sum(y*log(Phi)) + sum((1-y)*log(1-Phi))
  # Return the negative log-likelihood, since optim() minimizes
  f = -f
  return(f)
}
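As a quick cross-check (not part of the original script), the minimizer of Probit_LL should agree, up to numerical tolerance, with R's built-in probit fit from glm. The snippet below is a sketch that assumes y and x have already been generated as in the Monte Carlo loop further down.

# Hypothetical cross-check, assuming simulated y and x as in the loop below.
# x already contains a column of ones, so "0 +" drops glm's own intercept.
fit_optim <- optim(par = rep(0, ncol(x)), Probit_LL, y = y, x = x, method = "BFGS")
fit_glm   <- glm(as.numeric(y) ~ 0 + x, family = binomial(link = "probit"))
cbind(optim = fit_optim$par, glm = coef(fit_glm))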
Probit_LL_g returns the gradient of the (negative) log-likelihood function of the probit model
Probit_LL_g <- function(y, x, par) {
  Phi = pnorm(x %*% par)   # cumulative probability Phi(x'par)
  phi = dnorm(x %*% par)   # probability density at x'par
  n = length(y)            # sample size
  k = length(par)          # number of coefficients
  # Computing the gradient: the derivative with respect to theta of the
  # log density of y_i given x_i (see Wooldridge, equation 13.6)
  g = t(matrix(rep(phi/Phi, k), nrow = n) * x) %*% y -
      t(matrix(rep(phi/(1-Phi), k), nrow = n) * x) %*% (1-y)
  # Flip the sign to get the gradient of the negative log-likelihood
  g = -g
  return(g)
}
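Because the script loads numDeriv below precisely for its grad and hessian functions, the analytic derivatives can be cross-checked numerically. This is a sketch, assuming y, x and a parameter vector par are available; numDeriv differentiates with respect to the first argument, hence the wrapper.

library(numDeriv)
# Cross-check the analytic gradient and Hessian against numerical derivatives
# (hypothetical check, assuming y, x and par exist).
f_wrap   <- function(p) Probit_LL(y, x, p)
grad_num <- grad(f_wrap, par)
hess_num <- hessian(f_wrap, par)
all.equal(as.vector(Probit_LL_g(y, x, par)), grad_num, tolerance = 1e-6)  # should be TRUE
all.equal(Probit_LL_h(y, x, par), hess_num, tolerance = 1e-6)             # should be TRUE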
We build a loop around the data generating process and the estimation step, and compute the average of the three variance estimates based on J1, J2, and J3, which are coded in the separate functions J_1, Probit_LL_h (for J2), and J_3.
library(numDeriv) # loads the functions grad and hessian which numerically evaluate the gradient and hessian
rm(list=ls()) # Clear workspace
# Load the log-likelihood, its derivative, and the hessian
source("Probit_LL.R")
source("Probit_LL_g.R")
source("Probit_LL_h.R")
source("J_1.R")
source("J_3.R")
set.seed(2) # Set seed for random number generator
n = 300 # Set the sample size
theta0 = c(1,1,1) # True parameter value
k = length(theta0) # Number of parameters
num = 1000 # Number of Monte Carlo iterations
theta_hat_vec = matrix(0,num,k) # num x k matrix (1000x3) to store the estimated parameters
J_1_inv = matrix(0,k,k)         # k x k (3x3) accumulator for the variance estimates based on J1
J_2_inv = matrix(0,k,k)         # k x k (3x3) accumulator for the variance estimates based on J2
J_3_inv = matrix(0,k,k)         # k x k (3x3) accumulator for the variance estimates based on J3
Monte Carlo loop:
# Loop over num = 1000 Monte Carlo iterations
for (it in 1:num) {
  # Data generating process
  x = cbind(matrix(1,n,1), matrix(rnorm(2*n,0,1), ncol=2)) # regressors: a constant and two standard normals
  u = rnorm(n,0,1)                      # error terms
  y_star = x %*% theta0 + u             # latent "utility" (unobserved)
  y = ceiling(y_star/max(abs(y_star)))  # observed outcome (y=1 if y_star>0, otherwise y=0)
  dat = data.frame(x,y)                 # Create data frame
  # Find the parameters that minimize the negative log-likelihood for the simulated data
  result <- optim(par = theta0, Probit_LL, y = y, x = x, gr = Probit_LL_g,
                  method = "BFGS", control = list(reltol = 1e-9))
  # Store this iteration's estimates; after the loop we have num sets of theta_hat
  theta_hat_vec[it,1:k] = result$par
  # Accumulate the inverses of the three information matrix estimates
  J_1_inv = J_1_inv + solve(J_1(y,x,result$par))
  J_2_inv = J_2_inv + solve(Probit_LL_h(y,x,result$par))
  J_3_inv = J_3_inv + solve(J_3(y,x,result$par))
}
Variance of theta_hat over Monte Carlo iterations
var(theta_hat_vec)
## [,1] [,2] [,3]
## [1,] 0.01758578 0.01102783 0.01101053
## [2,] 0.01102783 0.02116617 0.01212997
## [3,] 0.01101053 0.01212997 0.02279791
Average of variance estimate based on J_1
J_1_inv/num
## [,1] [,2] [,3]
## [1,] 0.016575690 0.009627735 0.009638856
## [2,] 0.009627735 0.020458742 0.009568992
## [3,] 0.009638856 0.009568992 0.020447097
Average of variance estimate based on J_2
J_2_inv/num
## [,1] [,2] [,3]
## [1,] 0.015767336 0.008933802 0.008978805
## [2,] 0.008933802 0.019363556 0.009129810
## [3,] 0.008978805 0.009129810 0.019434550
Average of variance estimate based on J_3
J_3_inv/num
## [,1] [,2] [,3]
## [1,] 0.016587160 0.009633201 0.009643042
## [2,] 0.009633201 0.020470625 0.009572925
## [3,] 0.009643042 0.009572925 0.020459840
With a sample size of 300, the variance estimate based on J3 is the closest to the variance obtained from the Monte Carlo iterations.
Repeating the same exercise with a sample size of 1000 instead of 300, the estimate based on J1 turns out to be the closest.
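One way to run that comparison is to wrap the whole experiment in a function of the sample size. The helper run_mc below is a sketch; its name and structure are mine, not part of the original script, and it generates the outcome with as.numeric(. > 0), which is equivalent to the ceiling trick used above.

# Hypothetical wrapper: rerun the Monte Carlo experiment for a given sample size n
# and return the Monte Carlo variance alongside the three averaged estimates.
run_mc <- function(n, num = 1000, theta0 = c(1, 1, 1)) {
  k <- length(theta0)
  theta_hat_vec <- matrix(0, num, k)
  J_1_inv <- J_2_inv <- J_3_inv <- matrix(0, k, k)
  for (it in 1:num) {
    x <- cbind(1, matrix(rnorm(2 * n), ncol = 2))          # constant plus two regressors
    y <- as.numeric(x %*% theta0 + rnorm(n) > 0)           # observed binary outcome
    result <- optim(par = theta0, Probit_LL, y = y, x = x, gr = Probit_LL_g,
                    method = "BFGS", control = list(reltol = 1e-9))
    theta_hat_vec[it, ] <- result$par
    J_1_inv <- J_1_inv + solve(J_1(y, x, result$par))
    J_2_inv <- J_2_inv + solve(Probit_LL_h(y, x, result$par))
    J_3_inv <- J_3_inv + solve(J_3(y, x, result$par))
  }
  list(mc_var = var(theta_hat_vec),
       J1 = J_1_inv / num, J2 = J_2_inv / num, J3 = J_3_inv / num)
}
# Usage: compare the results for the two sample sizes
# run_mc(300); run_mc(1000)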