Comments on Wald Statistics

To produce the Wald statistics, the covariance matrix is required. As we know, the Wald test is appropraite in the case of large sample. Hence, I argue that it will be better to run the Wald test based on MLE estimator, since the MLE is pretty good at achieving efficiency. Below is an example, illustraing the approach to run Wald test in R.

data("iris")
colnames(iris) <- c('x1','x2','x3','y','m')  # dataset

library(bbmle)

## Warning: package 'bbmle' was built under R version 3.3.1

## Loading required package: stats4

LL <- function(beta0, beta1, beta2, beta3, mu, sigma) {
  R = y - beta0 - beta1 * x1 - beta2 * x2 - beta3 * x3
  score = suppressWarnings(dnorm(R, mu, sigma, log = TRUE))
  -sum(score)
}

fit <- mle2(LL, start = list(beta0 = 0, beta1 = 0, beta2 = 0, beta3 = 0, mu = 0, sigma = 1), fixed = list(mu = 0), data = iris) # MLE estimation
summary(fit)

## Maximum likelihood estimation
## 
## Call:
## mle2(minuslogl = LL, start = list(beta0 = 0, beta1 = 0, beta2 = 0, 
##     beta3 = 0, mu = 0, sigma = 1), fixed = list(mu = 0), data = iris)
## 
## Coefficients:
##        Estimate Std. Error z value     Pr(z)    
## beta0 -0.240307   0.175980 -1.3655    0.1721    
## beta1 -0.207266   0.046870 -4.4222 9.771e-06 ***
## beta2  0.222828   0.048282  4.6151 3.929e-06 ***
## beta3  0.524083   0.024163 21.6894 < 2.2e-16 ***
## sigma  0.189395   0.010935 17.3199 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## -2 log L: -73.50218

fit_coef <- as.matrix(fit@coef[-5], ncol = 1) # get rid of estimate of sigma
par_cov <- fit@vcov[-5, -5] # get rid of estimate of sigma

R <- matrix(c(0, 1, 0, 2, 
              0, 2, 1, 0), nrow = 2) # two constraints

res_cov <-R %*% par_cov %*% t(R) # restricted covariance matrix

wald_stat <- t(R %*% fit_coef - matrix(c(1,1), ncol = 1)) %*% solve(res_cov) %*% (R %*% fit_coef - matrix(c(1,1), ncol = 1))
pchisq(wald_stat, df = 2, lower.tail = F)

##              [,1]
## [1,] 4.202846e-90

Note that the constraints are \(0*c + 1*x_{1} + 0*x_{2} + 2*x_{3} = 1\) and \(0*c + 2*x_{1} + 1*x_{2} + 0*x_{3} = 1\). I just made them up. Hence, I obtain a very low p-value.
Last, an advantage of Wald test is that the restricted model does not need to be estimated. By using the first order derivative matrix, one can test on non-linear constraints as well, simply by changing the R matrix (the delta method).

Comments on Wald Statistics

Bowen Chen

September 7, 2016