We continue with the fish market model used in previous tutorials. Load the fulton data, and re-create the lags and holiday dummy variable if necessary (see tutorial 1). Additionally, we'll be working with three packages in this tutorial: two of which you're already familiar with from earlier tutorials, and the sem package (which stands for 'structural equation model').

rm(list=ls())
fulton <- read.delim2("./fulton.txt")

lagpad <- function(x, k) {
  if (!is.vector(x)) 
    stop('x must be a vector')
  if (!is.numeric(x)) 
    stop('x must be numeric')
  if (!is.numeric(k))
    stop('k must be numeric')
  if (1 != length(k))
    stop('k must be a single number')
  c(rep(NA, k), x)[1 : length(x)] 
}

# Creating lags
fulton$LogPriceL1 <- lagpad(fulton$LogPrice,1)
fulton$LogQuantityL1 <- lagpad(fulton$LogQuantity,1)
fulton$Holiday <- 0
fulton$Holiday[c(18,34,95)] <- c(1,1,1)

Additionally, run the following to load the packages we’ll use later:

library('pacman')  # install.packages('pacman') first if you don't have it
p_load(lmtest, systemfit, bbmle, mvtnorm, sem, tidyverse, huxtable, broom)
# p_load() installs (if necessary) and attaches every listed package, so
# separate library() calls for these packages are not needed

The vars package is great, until you want to add restrictions. Then the systemfit package is your friend. Remember, you can always find out more about systemfit by typing help(package = "systemfit") into your command line, which will return information pertaining to all of systemfit's functions. Also, if you haven't already noticed this, text in boldface refers to packages, whereas text in typewriter font refers to commands/functions and R code more generally. This is useful when two things have the same name: for instance, the systemfit package encompasses the systemfit function as well as other accompanying functions. This is the notation commonly used on R help forums. systemfit (the command) allows us to estimate generalized systems of equations with a high degree of flexibility, but it requires a little more work than the vars package. Specifically, we must first create a set of formulae (objects of class formula contain the symbol ~) that we want R to evaluate, embed them in a list, and pass them to systemfit. This is done as follows:

pEqn1 <- formula(LogPrice ~ 1 + LogPriceL1 + LogQuantityL1 + Stormy + Mixed + Holiday)
qEqn1 <- formula(LogQuantity ~  1 + LogPriceL1 + LogQuantityL1 + Stormy + Mixed + Holiday)
sysEqn1 <- list(pEqn1,qEqn1)
var1 <- systemfit(sysEqn1, data=fulton)
var1 %>%
  tidy() %>%
  as_huxtable() %>%
  set_bold(1, everywhere, TRUE) %>%
  set_caption("System of Equations Estimation Results")
System of Equations Estimation Results

term               estimate  std.error  statistic  p.value    conf.low  conf.high
eq1_(Intercept)      0.0962     0.276       0.349  0.728       -0.451     0.643
eq1_LogPriceL1       0.653      0.0662      9.87   0            0.521     0.784
eq1_LogQuantityL1   -0.0321     0.0325     -0.988  0.325       -0.0965    0.0323
eq1_Stormy           0.252      0.0584      4.32   3.58e-05     0.137     0.368
eq1_Mixed            0.139      0.0546      2.55   0.0123       0.0308    0.247
eq1_Holiday          0.094      0.142       0.662  0.509       -0.188     0.376
eq2_(Intercept)      7.64       0.723      10.6    0            6.2       9.07
eq2_LogPriceL1       0.45       0.173       2.6    0.0108       0.106     0.793
eq2_LogQuantityL1    0.148      0.085       1.75   0.0837      -0.0201    0.317
eq2_Stormy          -0.567      0.153      -3.71   0.000335    -0.871    -0.264
eq2_Mixed           -0.25       0.143      -1.75   0.0834      -0.534     0.0336
eq2_Holiday         -2.02       0.372      -5.44   3.62e-07    -2.76     -1.28
# This is the same as the output we got using the vars package (see also Table 14.1 in HN)

Now that we have our VAR in an object of class systemfit, we can easily impose linear restrictions on our model. One way to add (exclusion) restrictions – shown in the next code chunk – is to simply omit variables from the equations. Before that, note that adding unity to these formulae is actually redundant: R includes a constant (represented by unity) by default, and you can exclude the constant by adding zero to your formulae, as the short illustration below shows.
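For example, the first two formulae below describe the same model (a quick sketch using a hypothetical subset of our regressors), while the third drops the constant:

# These two formulae are equivalent - the constant is implicit:
f1 <- formula(LogPrice ~ 1 + LogPriceL1 + Stormy)
f2 <- formula(LogPrice ~ LogPriceL1 + Stormy)
# Adding zero (or equivalently -1) suppresses the constant:
f3 <- formula(LogPrice ~ 0 + LogPriceL1 + Stormy)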

# One way to add restrictions is to simply omit variables from the equations;
# this is given in Table 14.5 of HN
pEqn2 <- formula(LogPrice ~ 1 + LogPriceL1 + Stormy + Mixed + Holiday)
qEqn2 <- formula(LogQuantity ~  1 + LogPriceL1 + Stormy + Mixed + Holiday)
sysEqn2 <- list(pEqn2,qEqn2)
var2 <- systemfit(sysEqn2, data=fulton)
var2 %>%
  tidy() %>%
  as_huxtable() %>%
  set_bold(1, everywhere, TRUE) %>%
  set_caption("System of Equations Estimation Results with Cross Eq. R")
System of Equations Estimation Results with Cross Eq. R

term             estimate  std.error  statistic  p.value    conf.low  conf.high
eq1_(Intercept)   -0.174     0.0412     -4.21    5.44e-05    -0.255    -0.0918
eq1_LogPriceL1     0.67      0.0637     10.5     0            0.544     0.797
eq1_Stormy         0.251     0.0584      4.3     3.87e-05     0.135     0.367
eq1_Mixed          0.137     0.0546      2.51    0.0137       0.0286    0.245
eq1_Holiday        0.124     0.139       0.895   0.373       -0.151     0.399
eq2_(Intercept)    8.88      0.109      81.5     0            8.67      9.1
eq2_LogPriceL1     0.368     0.168       2.18    0.0312       0.0339    0.702
eq2_Stormy        -0.561     0.154      -3.63    0.000435    -0.867    -0.255
eq2_Mixed         -0.239     0.144      -1.66    0.1         -0.525     0.0467
eq2_Holiday       -2.16      0.367      -5.89    4.68e-08    -2.89     -1.43

Alternatively, systemfit allows you to impose restrictions manually via matrix algebra (using the restrict.matrix and restrict.rhs arguments):

# Alternatively, you can impose restrictions manually via matrix algebra:

resMat <- matrix(0,2,var1$rank)
resMat[1,3] <- 1
resMat[2,9] <- 1
resEql <- c(0,0)
var3 <- systemfit(sysEqn1, data=fulton, restrict.matrix=resMat, restrict.rhs = resEql)
lrtest(var1,var3)
#Df  LogLik  Df  Chisq  Pr(>Chisq)
 13   -83.1
 11   -84.7  -2   3.24       0.198
-2*(logLik(var3) - logLik(var1))
## 'log Lik.' 3.24298 (df=11)

Those of you familiar with the matrix algebra behind the imposition of restrictions will see that R's syntax derives from this – our matrix of restrictions has a number of rows equal to the number of restrictions (2 in this case), and a number of columns equal to the number of coefficients in the model (var1$rank).
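If you're ever unsure which column corresponds to which coefficient, you can print the names of the stacked coefficient vector; in the sketch below (assuming var1 is still in memory), positions 3 and 9 should correspond to eq1_LogQuantityL1 and eq2_LogQuantityL1, the coefficients we restricted above:

# Inspect the ordering of the stacked coefficient vector
names(coef(var1))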

Once we've estimated our restricted and unrestricted VARs (corresponding to var3 or var2 and var1 respectively), we can test these restrictions via lmtest's function lrtest, which conducts the usual likelihood ratio test. My understanding is that it's good practice to order your inputs into lrtest with the restricted model entering first and the unrestricted model second (otherwise your degrees of freedom will be negative), but the ordering does not affect the result of the test – for instance, whether you enter lrtest(var3, var1) or lrtest(var1, var3), you should find that both tests yield a p-value of 0.197604 (indicating that we cannot reject the restriction).
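To see this for yourself, compare the two orderings – both calls should return the same Chisq and p-value, with only the sign of the Df column flipped:

lrtest(var3, var1)  # restricted first: Df = 2
lrtest(var1, var3)  # unrestricted first: Df = -2, same Chisq and p-value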

In addition to the above, an alternative syntax using the names of variables (as stored in our object var1 – you can obtain these by simply typing var1 into the command line) is also available via car’s function linearHypothesis. The syntax works as follows:

# Reduced form VAR
linearHypothesis(var1, c("eq1_LogQuantityL1 = 0", "eq2_LogQuantityL1 = 0") , test="Chisq")
Res.Df  Df  Chisq  Pr(>Chisq)
   210
   208   2   4.03       0.134

which again shows that we cannot reject the exclusion restriction. This approach is advantageous in that it does not require you to estimate the restricted model.

The functions used above are also useful when working with single equation models, such as the conditional model for LogQuantity. For instance, we can estimate the conditional model and impose restrictions via linearHypothesis (much like the function plot, linearHypothesis has methods for many different sorts of objects) as we did above. Alternatively, we can estimate a restricted and unrestricted conditional model and construct a likelihood ratio test. Both methods are shown here:

# Modelling the conditional equation with and without restrictions
condEqnUR <- lm(LogQuantity ~ 1 + LogPrice + LogPriceL1 
                            + LogQuantityL1 + Stormy + Mixed + Holiday, fulton)
condEqnR <- lm(LogQuantity ~ 1 + LogPrice + LogPriceL1 
                            + Stormy + Mixed + Holiday, fulton)
linearHypothesis(condEqnUR, "LogQuantityL1 = 0", test="F")
Res.Df   RSS  Df  Sum of Sq     F  Pr(>F)
   104  32.9
   103  32.3   1      0.657  2.09   0.151
-2*(logLik(condEqnR) - logLik(condEqnUR))
## 'log Lik.' 2.214794 (df=7)
lrtest(condEqnUR, condEqnR)
#Df  LogLik  Df  Chisq  Pr(>Chisq)
  8   -88.7
  7   -89.8  -1   2.21       0.137

Both of these methods are just as applicable to marginal equations:

# Modelling the marginal equation with and without restrictions
margEqnUR <- lm(LogPrice ~ 1 + LogPriceL1 + LogQuantityL1 + Stormy + Mixed + Holiday, fulton)
margEqnR <-  lm(LogPrice ~ 1 + LogPriceL1 + Stormy + Mixed + Holiday, fulton)
-2*(logLik(margEqnR) - logLik(margEqnUR))
## 'log Lik.' 1.028186 (df=6)
linearHypothesis(margEqnUR, "LogQuantityL1 = 0", test="F")
Res.Df   RSS  Df  Sum of Sq      F  Pr(>F)
   105  5.88
   104  5.82   1     0.0547  0.977   0.325
lrtest(margEqnUR, margEqnR)
#Df  LogLik  Df  Chisq  Pr(>Chisq)
  7    5.54
  6    5.03  -1   1.03       0.311

As noted in the lecture, cross-equation restrictions violate weak exogeneity. Under these conditions we can no longer estimate the system of equations via OLS (because the parameters of the conditional and marginal equations are no longer independent and hence cannot be estimated separately). Fortunately, maximum likelihood is not contingent on these assumptions – MLE is perfectly capable of estimating systems of equations where parameters are shared across models. As demonstrated in the R crash course, this is achieved by constructing a likelihood function and telling R to find its maximum via a numerical optimization routine. The intuition behind this routine is simple: starting with a set of user-specified initial values, R is going to plug values for our coefficients into the likelihood function, assess the likelihood associated with these values, try another set, compare, and adjust until it cannot find a set of values that returns a higher likelihood. The final set of values will be our ML coefficient estimates.
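As a toy illustration of what the optimizer is doing (this sketch is separate from the fish market analysis), the following code recovers the mean and standard deviation of a simulated normal sample by minimizing the negative log-likelihood:

# Toy example: recover mu and sigma by numerical optimization
set.seed(123)
x <- rnorm(100, mean = 2, sd = 3)
negLL <- function(par) -sum(dnorm(x, mean = par[1], sd = par[2], log = TRUE))
optim(c(0, 1), negLL)$par  # should be close to c(2, 3)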

For the estimators presented in this instance, attaching our data will make things a lot easier. You can do this via the command attach(fulton[2:dim(fulton)[1], ]). Note that this command actually attaches a subset of our data which excludes the first row of observations – this is done to omit the NAs produced when we created the lag variables. If we did not exclude these observations, feeding them into the likelihood function would produce something that cannot be evaluated by R's numerical optimizers, and MLE would fail (it would give you an error saying that the likelihood function has produced non-finite values).
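If you would rather not use attach() (it is easy to forget the matching detach() later), one alternative sketch is to keep a complete-cases copy of the data and refer to it explicitly:

# Alternative to attach(): drop the NA rows (here only the first row) once
fultonC <- na.omit(fulton)
# columns can then be referenced via with(), e.g. with(fultonC, mean(LogPrice))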

To obtain a working example, let's see that we can replicate what we got when we estimated the marginal model for LogPrice via OLS. Step 1 of the MLE estimation process is to construct the appropriate likelihood function. This is done as follows:

# In this instance, attaching our data makes things a lot easier.
attach(fulton[2:dim(fulton)[1], ])

# As a first exercise, let's see that we can replicate what we got via OLS
LL <- function(b10=0, b11=0, b12=0, b13=0, b14=0, b15=0, Sigma1=0.5) {
  # Each line of the residual expression must END with an operator ('-') so
  # that R knows the expression continues on the next line
  R <- LogPrice - b10 - LogPriceL1 * b11 - LogQuantityL1 * b12 -
    Stormy * b13 - Mixed * b14 - Holiday * b15
  R <- suppressWarnings(dnorm(R, 0, Sigma1))
  -sum(log(R))
}

This code does the following: it creates a function LL that takes the parameters in our model as arguments (note that we've included the standard deviation of the error term as a parameter to be estimated). Setting these parameters equal to values in the first line of code establishes default values – these are the values R will try first when it attempts to maximize this likelihood function. A word of caution: most of the time it will not matter what values you select here, but they can be important. If you select values that are too far from the actual parameters of the model, R might not find the true parameters (especially given that R's optimizers are by default set to a finite number of iterations). We're going to use mle2 from the package bbmle specifically because it's more robust to the set of starting values we specify than other ML estimators in R (surprised to hear there are others?). In a multi-equation context, you should also be careful to ensure that your initial values for the variance-covariance matrix produce a non-singular matrix (i.e. a matrix that can be inverted) – setting the variances to one and the covariances to zero will ensure that your matrix is not singular.

The second line in this code chunk specifies the functional form of the model by computing its residuals. Remember that for MLE we assume that once we subtract the RHS of our model from the dependent variable, the stuff that's left should be a normally distributed structural error. This is what's captured in the third line of code: the function dnorm is the normal density with the specified mean and standard deviation (arguments 2 and 3) – the first argument in dnorm is a point on the x-axis of the normal distribution, and what comes out when we feed dnorm these x-axis values is the likelihood. This line of code tells R to calculate the likelihood of observing each of the residuals of our model (contained in the vector R), assuming that they are normally distributed with mean zero and standard deviation Sigma1. Finally, the last line of code sums the logs of these likelihoods and flips the sign, yielding the negative log-likelihood – this is done because numerical optimization routines are typically set to minimize rather than maximize.
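To make the role of dnorm concrete, here is a minimal demonstration evaluating the likelihood of a few hypothetical residuals under an assumed error distribution:

# Likelihoods of three hypothetical residuals under N(0, sd = 0.25):
dnorm(c(-0.1, 0, 0.5), mean = 0, sd = 0.25)
# residuals near zero receive high likelihoods; residuals far from zero do not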

With the likelihood function above, we can easily obtain the MLE for our model via the command:

z1 <- mle2(LL)
summary(z1)
## Maximum likelihood estimation
## 
## Call:
## mle2(minuslogl = LL)
## 
## Coefficients:
##           Estimate  Std. Error z value  Pr(z)    
## b10     2.3951e-01  2.8404e-01  0.8432 0.3991    
## b11     7.4461e-01  6.6429e-02 11.2092 <2e-16 ***
## b12    -3.3044e-02  3.3601e-02 -0.9834 0.3254    
## b13     0.0000e+00  5.2641e-13  0.0000 1.0000    
## b14     0.0000e+00  5.2633e-13  0.0000 1.0000    
## b15     0.0000e+00  5.2633e-13  0.0000 1.0000    
## Sigma1  2.5092e-01  1.6917e-02 14.8322 <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## -2 log L: 7.9875

If you compare z1 and the first equation from var1, you'll see that these estimates are identical (barring rounding errors in the coefficients – which can be eliminated by decreasing the step size in the numerical optimizer – and the smaller standard errors that result from modelling the variance of the error term directly). From the code above, you can easily obtain the MLE estimate for the model for LogQuantity by setting LogQuantity as the dependent variable in the likelihood function (don't forget to define and run the new function – a sketch follows).
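As a sketch, the modified likelihood function would look as follows (LLq and zq are names chosen here for illustration):

# Same structure, with LogQuantity as the dependent variable
LLq <- function(b20=0, b21=0, b22=0, b23=0, b24=0, b25=0, Sigma2=0.5) {
  R <- LogQuantity - b20 - LogPriceL1 * b21 - LogQuantityL1 * b22 -
    Stormy * b23 - Mixed * b24 - Holiday * b25
  -sum(suppressWarnings(dnorm(R, 0, Sigma2, log = TRUE)))
}
zq <- mle2(LLq)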

To extend MLE to a multi-equation context, all we need to do is add additional formulae for each equation and re-specify the distribution of the error term. On this last point, a two-equation VAR will have two error terms (one for each equation) which we will assume are bivariate normally distributed. The command dnorm is not helpful to us here – it handles a single normally distributed variable – but the mvtnorm package provides us with everything we need to specify joint-normal distributions over any number of dimensions (and hence equations). We loaded this package at the beginning of the tutorial, so we can use it without issues – it allows us to use the command dmvnorm, which takes as its arguments a matrix of variables (corresponding to the errors in each of our equations), a vector of means (which we've assumed to be zero), and a variance-covariance matrix. The dimensions of these objects are determined by the number of equations in our model: a three-equation model requires a matrix with three columns for the first argument, a vector of length three for the second, and a three-by-three variance-covariance matrix for the third.
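Before using it in anger, here is a quick sketch of dmvnorm's interface for a two-equation case:

# dmvnorm evaluates the joint-normal density at each row of its first argument
err <- cbind(c(-0.1, 0.2), c(0.3, -0.4))  # two hypothetical pairs of residuals
vc  <- diag(2)                            # identity variance-covariance matrix
dmvnorm(err, mean = c(0, 0), sigma = vc)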

# We can also model multiple equations simultaneously - just like we did with our VAR using OLS:
LL <- function(b10=0, b11=0, b12=0, b13=0, b14=0, b15=0, b20=0, b21=0,
               b22=0, b23=0, b24=0, b25=0, Sigma1=1, Sigma2=1, cov=0) {
  # Again, end each line with '-' so the residual expressions continue correctly
  R1 <- LogPrice    - b10 - LogPriceL1 * b11 - LogQuantityL1 * b12 -
    Stormy * b13 - Mixed * b14 - Holiday * b15
  R2 <- LogQuantity - b20 - LogPriceL1 * b21 - LogQuantityL1 * b22 -
    Stormy * b23 - Mixed * b24 - Holiday * b25
  sigma <- matrix(c(Sigma1, cov, cov, Sigma2), 2, 2)
  R3 <- suppressWarnings(dmvnorm(cbind(R1, R2), c(0, 0), sigma))
  -sum(log(R3))
}
z2 <- mle2(LL, fixed=list(b12=0,b22=0), control=list(maxit=1000))
z2
## 
## Call:
## mle2(minuslogl = LL, fixed = list(b12 = 0, b22 = 0), control = list(maxit = 1000))
## 
## Coefficients:
##         b10         b11         b12         b13         b14         b15 
## -0.03854013  0.76283582  0.00000000  0.00000000  0.00000000  0.00000000 
##         b20         b21         b22         b23         b24         b25 
##  8.54470235  0.12741208  0.00000000  0.00000000  0.00000000  0.00000000 
##      Sigma1      Sigma2         cov 
##  0.06352548  0.54577308 -0.09156235 
## 
## Log-likelihood: -112.03

In addition to estimating a multi-equation model, the code chunk above demonstrates two additional features of mle2. Firstly, we've used fixed to add exclusion restrictions on the parameters associated with LogQuantityL1 in both equations. We can test these restrictions using any of the tests used earlier for systemfit objects (the bbmle package adds methods for mle2 objects to these and other functions). Additionally, we've reset the maximum number of iterations mle2 is allowed to go through before it gives up on finding the maximum of the likelihood – this may be useful when your starting values are far from the true values, as is the case here.
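For instance, a likelihood ratio test of the exclusion restrictions might look like the following sketch, where z2u is a name chosen here for the unrestricted fit (bbmle supplies an anova method for mle2 objects):

# Fit the unrestricted two-equation model and compare via a likelihood ratio test
z2u <- mle2(LL, control = list(maxit = 1000))
anova(z2u, z2)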

This seems like a lot of effort to get slightly inaccurate estimates of the OLS parameters of our model – but now we can also impose cross-equation restrictions by literally forcing the equations in our model to share parameters. This is shown in the code chunk below. Look closely, and notice that b23 and b24 in the second equation have been replaced by b13 and b14 (forcing the equations to share these parameters), and we have a new parameter, ba, which scales both of these coefficients:

# But now we can also add cross-equation restrictions
# - notice that b23 and b24 are no longer included, and we have a new parameter, ba:
LL <- function(b10=0, b11=0, b12=0, b13=0, b14=0, b15=0, b20=0, b21=0, b22=0,
               b25=0, Sigma1=1, Sigma2=1, cov=0, ba=1) {
  R1 <- LogPrice    - b10 - LogPriceL1 * b11 - LogQuantityL1 * b12 -
    Stormy * b13 - Mixed * b14 - Holiday * b15
  R2 <- LogQuantity - b20 - LogPriceL1 * b21 - LogQuantityL1 * b22 -
    (Stormy * b13 + Mixed * b14) * ba - Holiday * b25
  sigma <- matrix(c(Sigma1, cov, cov, Sigma2), 2, 2)
  R3 <- suppressWarnings(dmvnorm(cbind(R1, R2), c(0, 0), sigma))
  -sum(log(R3))
}
z3 <- mle2(LL, fixed=list(b12=0,b22=0))
z3
## 
## Call:
## mle2(minuslogl = LL, fixed = list(b12 = 0, b22 = 0))
## 
## Coefficients:
##         b10         b11         b12         b13         b14         b15 
## -0.03854013  0.76283582  0.00000000  0.00000000  0.00000000  0.00000000 
##         b20         b21         b22         b25      Sigma1      Sigma2 
##  8.54470235  0.12741208  0.00000000  0.00000000  0.06352548  0.54577308 
##         cov          ba 
## -0.09156235  1.00000000 
## 
## Log-likelihood: -112.03

If you compare this with the result in your textbook, you'll see that we've obtained identical values (barring rounding errors). A word of caution: be careful when imposing restrictions to ensure that the signs on your coefficients are correct – many hours have been lost to this tut due to a sign error! Again, we can assess this model using the same commands as before. You might imagine that setting up an MLE estimator will become very tedious as the number of equations and parameters grows. For that reason, I've included a working example of these same estimators specified in matrix notation. For a two-equation model this might seem redundant, but it will save you a lot of time as the number of equations and parameters in your model grows.

# No Restrictions - matrix notation
LL <-function(b10=0,b11=0,b12=0,b13=0,b14=0,b15=0,Sigma1=1,
              b20=0,b21=0,b22=0,b23=0,b24=0,b25=0,Sigma2=1,cov=0) {    
  B <- cbind(c(b10,b11,b12,b13,b14,b15), c(b20,b21,b22,b23,b24,b25))
  X <- matrix(c(rep(1,110), LogPriceL1, LogQuantityL1, Stormy, Mixed, Holiday),110)
  Y <- cbind(LogPrice, LogQuantity) 
  R <- Y - X %*% B 
  sigma <- matrix(c(Sigma1,cov,cov,Sigma2),2,2)
  R = suppressWarnings(dmvnorm(R, c(0,0), sigma))   
  -sum(log(R))
}  
UR <- mle2(LL, fixed = list(b12=0,b22=0)) 

# With restrictions - matrix notation
LL <-function(b10=0,b11=0,b12=0,b13=0,b14=0,b15=0,Sigma1=1,
              b20=0,b21=0,b22=0,ba=1,b25=0,Sigma2=1,cov=0) {    
  B <- cbind(c(b10,b11,b12,b13,b14,b15), 
             c(b20,b21,b22,b13*ba ,b14*ba ,b25))
  X <- matrix(c(rep(1,110), LogPriceL1, LogQuantityL1, Stormy, Mixed, Holiday),110)
  Y <- cbind(LogPrice, LogQuantity) 
  R <- Y - X %*% B 
  sigma <- matrix(c(Sigma1,cov,cov,Sigma2),2,2)
  R = suppressWarnings(dmvnorm(R, c(0,0), sigma))   
  -sum(log(R))
}  
R <- mle2(LL, fixed = list(b12=0,b22=0)) 
# Assess the relative fit of the models
anova(UR, R)
## Likelihood Ratio Tests
## Model 1: UR, [LL]: b10+b11+b13+b14+b15+Sigma1+b20+b21+b23+b24+b25+
##           Sigma2+cov
## Model 2: R, [LL]: b10+b11+b13+b14+b15+Sigma1+b20+b21+ba+b25+Sigma2+
##           cov
##   Tot Df Deviance  Chisq Df Pr(>Chisq)
## 1     13   169.47                     
## 2     12   169.75 0.2848  1     0.5936

Finally, to fully replicate the output in your textbook we need to calculate the standard deviations and the correlation of our model errors. These may be obtained from the variance and covariance parameters estimated by mle2 (stored here in the restricted-model object R) as follows:

# calculate rho (correlation rather than covariance)
sqrt(R@fullcoef[["Sigma1"]])
## [1] 0.2312324
sqrt(R@fullcoef[["Sigma2"]])
## [1] 0.6115173
R@fullcoef[["cov"]]/(sqrt(R@fullcoef[["Sigma1"]])*sqrt(R@fullcoef[["Sigma2"]]))
## [1] -0.4444474


Recall that we have already modelled and tested the reduced-form model, i.e. the two-equation VAR. Re-estimate the VAR model, as shown in Table 15.1 (at the end of this tutorial I add additional notes on systemfit and cross-equation restrictions):

pEqn1 <- formula(LogPrice ~ LogPriceL1 +  Stormy + Mixed + Holiday)
qEqn1 <- formula(LogQuantity ~ LogPriceL1 + Stormy + Mixed + Holiday)
sysEqn1 <- list(pEqn1,qEqn1)
var1 <- systemfit(sysEqn1, "OLS", data=fulton, maxit=100)
summary(var1)
## 
## systemfit results 
## method: OLS 
## 
##          N  DF     SSR  detRCov   OLS-R2 McElroy-R2
## system 220 210 46.9678 0.017561 0.384224   0.566266
## 
##       N  DF      SSR      MSE     RMSE       R2   Adj R2
## eq1 110 105  5.87769 0.055978 0.236597 0.632401 0.618397
## eq2 110 105 41.09007 0.391334 0.625567 0.318399 0.292434
## 
## The covariance matrix of the residuals
##            eq1        eq2
## eq1  0.0559780 -0.0659171
## eq2 -0.0659171  0.3913340
## 
## The correlations of the residuals
##           eq1       eq2
## eq1  1.000000 -0.445364
## eq2 -0.445364  1.000000
## 
## 
## OLS estimates for 'eq1' (equation 1)
## Model Formula: LogPrice ~ LogPriceL1 + Stormy + Mixed + Holiday
## 
##               Estimate Std. Error  t value   Pr(>|t|)    
## (Intercept) -0.1735018  0.0412272 -4.20843 5.4358e-05 ***
## LogPriceL1   0.6703401  0.0636805 10.52662 < 2.22e-16 ***
## Stormy       0.2509784  0.0583986  4.29768 3.8697e-05 ***
## Mixed        0.1368354  0.0545688  2.50757   0.013688 *  
## Holiday      0.1240545  0.1386492  0.89474   0.372975    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.236597 on 105 degrees of freedom
## Number of observations: 110 Degrees of Freedom: 105 
## SSR: 5.877692 MSE: 0.055978 Root MSE: 0.236597 
## Multiple R-Squared: 0.632401 Adjusted R-Squared: 0.618397 
## 
## 
## OLS estimates for 'eq2' (equation 2)
## Model Formula: LogQuantity ~ LogPriceL1 + Stormy + Mixed + Holiday
## 
##              Estimate Std. Error  t value   Pr(>|t|)    
## (Intercept)  8.883876   0.109006 81.49928 < 2.22e-16 ***
## LogPriceL1   0.367745   0.168373  2.18411  0.0311790 *  
## Stormy      -0.560951   0.154407 -3.63293  0.0004354 ***
## Mixed       -0.239404   0.144281 -1.65929  0.1000412    
## Holiday     -2.159510   0.366592 -5.89078 4.6841e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.625567 on 105 degrees of freedom
## Number of observations: 110 Degrees of Freedom: 105 
## SSR: 41.090073 MSE: 0.391334 Root MSE: 0.625567 
## Multiple R-Squared: 0.318399 Adjusted R-Squared: 0.292434

Recall that this model is congruent with the data (what does congruent mean, again?).

This tutorial deals with the identification problem, i.e. the problem of inferring structural parameters from the reduced-form model. In particular, suppose we seek to identify the price elasticity of demand. It is possible to formulate a structural model, where an instrumental variable is added to the supply equation, which would enable exact identification of the price elasticity of demand.

Note, however, that the reduced-form model associated with this structural model has the same structure as the reduced-form model associated with an under-identified model. So there is no empirical test one can perform to compare the two structural models. A choice between these two structural models has to be based on an economic-theory based belief about which model is right (see page 225).

Suppose we believe that the exactly-identified demand equation has support in theory; we can then estimate the structural parameters in R. As it happens, there is more than one way to estimate an exactly identified structural model in R (surprised?). One option is to continue using systemfit: respecify the marginal equation for LogQuantity as a conditional equation with an exclusion restriction on Stormy, and change the estimation method to 2SLS as follows:

pEqn2 <- formula(LogPrice ~ LogPriceL1 + Stormy + Mixed + Holiday)
qEqn2 <- formula(LogQuantity ~ LogPrice + LogPriceL1 + Mixed + Holiday)
sysEqn2 <- list(pEqn2,qEqn2)
var2 <- systemfit(sysEqn2, "2SLS", data=fulton, inst = ~ Stormy + Mixed + Holiday + LogPriceL1)
summary(var2)
## 
## systemfit results 
## method: 2SLS 
## 
##          N  DF     SSR  detRCov OLS-R2 McElroy-R2
## system 220 210 45.3907 0.017561 0.4049   0.645601
## 
##       N  DF      SSR      MSE     RMSE       R2   Adj R2
## eq1 110 105  5.87769 0.055978 0.236597 0.632401 0.618397
## eq2 110 105 39.51299 0.376314 0.613445 0.344560 0.319591
## 
## The covariance matrix of the residuals
##          eq1      eq2
## eq1 0.055978 0.059197
## eq2 0.059197 0.376314
## 
## The correlations of the residuals
##          eq1      eq2
## eq1 1.000000 0.407864
## eq2 0.407864 1.000000
## 
## 
## 2SLS estimates for 'eq1' (equation 1)
## Model Formula: LogPrice ~ LogPriceL1 + Stormy + Mixed + Holiday
## Instruments: ~Stormy + Mixed + Holiday + LogPriceL1
## 
##               Estimate Std. Error  t value   Pr(>|t|)    
## (Intercept) -0.1735018  0.0412272 -4.20843 5.4358e-05 ***
## LogPriceL1   0.6703401  0.0636805 10.52662 < 2.22e-16 ***
## Stormy       0.2509784  0.0583986  4.29768 3.8697e-05 ***
## Mixed        0.1368354  0.0545688  2.50757   0.013688 *  
## Holiday      0.1240545  0.1386492  0.89474   0.372975    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.236597 on 105 degrees of freedom
## Number of observations: 110 Degrees of Freedom: 105 
## SSR: 5.877692 MSE: 0.055978 Root MSE: 0.236597 
## Multiple R-Squared: 0.632401 Adjusted R-Squared: 0.618397 
## 
## 
## 2SLS estimates for 'eq2' (equation 2)
## Model Formula: LogQuantity ~ LogPrice + LogPriceL1 + Mixed + Holiday
## Instruments: ~Stormy + Mixed + Holiday + LogPriceL1
## 
##               Estimate Std. Error   t value   Pr(>|t|)    
## (Intercept)  8.4960897  0.0845071 100.53702 < 2.22e-16 ***
## LogPrice    -2.2350569  0.6032993  -3.70472 0.00033955 ***
## LogPriceL1   1.8659930  0.4838677   3.85641 0.00019870 ***
## Mixed        0.0664304  0.1281158   0.51852 0.60518756    
## Holiday     -1.8822409  0.3680398  -5.11423 1.4259e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.613445 on 105 degrees of freedom
## Number of observations: 110 Degrees of Freedom: 105 
## SSR: 39.512993 MSE: 0.376314 Root MSE: 0.613445 
## Multiple R-Squared: 0.34456 Adjusted R-Squared: 0.319591

If you compare this code chunk with the earlier chunk wherein we modelled the reduced form VAR, you'll see that we've added LogPrice to the RHS of the equation for LogQuantity (changing it from a marginal to a conditional equation), we've changed the estimator from OLS to 2SLS, and we've specified a formula containing our instruments. Among the instruments, you'll see we've listed LogPriceL1, Mixed and Holiday – these variables enter both equations in our model and are assumed to be exogenous, hence we list them as instruments so that they can serve as instruments for themselves. The variable Stormy is also listed as an instrument; notice however that Stormy enters the conditional equation for LogPrice but does not enter the conditional equation for LogQuantity. In the marginal equation for LogPrice, Stormy serves as its own instrument, but in the conditional equation for LogQuantity, Stormy serves as the instrument for LogPrice (remember, instrumenting for a variable effectively means we use the predicted values from the endogenous variable regressed on the exogenous instruments as ``the variable'' in our regression, with an adjustment to our standard errors to reflect that the predicted values of the endogenous variable are a generated regressor).
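To see this mechanically, here is a minimal sketch of the two stages done by hand (LogPriceHat is a name chosen here). The stage-2 coefficients should match those of the second equation in var2, but the stage-2 standard errors would be wrong – which is exactly why we let systemfit or tsls make the adjustment:

# Stage 1: regress the endogenous variable on all the instruments
stage1 <- lm(LogPrice ~ LogPriceL1 + Stormy + Mixed + Holiday, data = fulton)
fulton$LogPriceHat <- predict(stage1, newdata = fulton)  # the NA row stays NA
# Stage 2: replace LogPrice with its fitted values
stage2 <- lm(LogQuantity ~ LogPriceHat + LogPriceL1 + Mixed + Holiday, data = fulton)
coef(stage2)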

Though a full discussion of the relationship between instrumental variable analysis and structural equation modelling is beyond the scope of this tutorial, it is worth mentioning that structural equation modelling inevitably collapses into a form of instrumental variable analysis. Given the necessary restrictions, it is always possible to model each equation in a system of structural equations separately, but modelling them together can be beneficial. Specifically, you can model systems of equations simultaneously to achieve gains in efficiency over equation-by-equation estimation, provided that the system of equations is identified (i.e. you have a sufficient number of instruments). In the instance above, we've specified one instrument (Stormy) for one of these equations – hence we should not expect to see any efficiency gains from modelling the system of equations compared with modelling them separately. This can be seen by modelling the conditional equation for LogQuantity via two-stage least squares (using the function tsls from the sem package) as follows:

m1 <- tsls(LogQuantity ~ LogPrice + LogPriceL1 + Mixed + Holiday, data=fulton, 
           instruments= ~ LogPriceL1 + Stormy + Mixed + Holiday)
summary(m1)
## 
##  2SLS Estimates
## 
## Model Formula: LogQuantity ~ LogPrice + LogPriceL1 + Mixed + Holiday
## 
## Instruments: ~LogPriceL1 + Stormy + Mixed + Holiday
## 
## Residuals:
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -1.90505 -0.26684  0.03909  0.00000  0.41329  1.23120 
## 
##                Estimate  Std. Error   t value   Pr(>|t|)    
## (Intercept)  8.49608968  0.08450707 100.53702 < 2.22e-16 ***
## LogPrice    -2.23505692  0.60329931  -3.70472 0.00033955 ***
## LogPriceL1   1.86599297  0.48386774   3.85641 0.00019870 ***
## Mixed        0.06643043  0.12811583   0.51852 0.60518756    
## Holiday     -1.88224090  0.36803981  -5.11423 1.4259e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6134446 on 105 degrees of freedom

If you compare the coefficient estimates and standard errors of this model with those of the conditional equation for LogQuantity obtained in the system of equations stored under var2, you'll see that these are identical (i.e. it is no more efficient to model the system of equations than it is to model the single identified structural equation). You may also compare the output for the last equation to that of Table 15.2, page 225. In particular, we find that the elasticity is -2.2, which is exactly what we would find if we had followed the slope calculation approach explained in the slides.

While we are in familiar territory, let us introduce the sem package, which offers another approach to modelling systems of (reduced form or structural) equations. Of course, you've just been shown that we already have two estimators which give us identical results under exact identification – you may thus (rightly) ask why we would bother introducing a third option. Well, the first reason is that sem estimates systems of equations via maximum likelihood (using the same numerical optimizers we used in tutorial 3). This can be highly advantageous in terms of efficiency when estimating systems of equations wherein all equations are identified. Additionally (though less importantly), sem is worth seeing because its setup derives from the broader literature on structural equation modelling (SEM), which spans economics, sociology and psychology (to name a few disciplines wherein SEMs have been popular). In each of these disciplines (including economics), systems of structural equations are often set up in ``path diagrams'' which show the causal connections between variables. In these diagrams, arrows point from causal (though not necessarily exogenous) variables to affected (dependent) variables, illustrating the set of causal channels characterizing a system of equations. This idea is instantiated in the syntax for the sem function, as shown below:

fimlRaM <- matrix(c(
 "LogPrice <- LogPriceL1",   "b11", NA,
 "LogPrice <- Stormy",       "b12", NA,
 "LogPrice <- Mixed",        "b13", NA,
 "LogPrice <- Holiday",      "b14", NA,
 "LogQuantity <- LogPriceL1","b22", NA,
 "LogQuantity <- Stormy",    "b23", NA,
 "LogQuantity <- Mixed",     "b24", NA,
 "LogQuantity <- Holiday",   "b25", NA,
 "LogPrice <-> LogPrice",       "sig1", 1,
 "LogQuantity <-> LogQuantity", "sig2", 1),
ncol=3, byrow=T)
class(fimlRaM) <- "mod"
exogVar <- c("LogPriceL1", "Stormy", "Mixed", "Holiday")
endogVar <- c("LogPrice", "LogQuantity")
allVar <- c(exogVar, endogVar)
fimlResult <- sem(model=fimlRaM, S = cov(fulton[-1, allVar]),
                  N = (nrow(fulton) -1), fixed.x = exogVar)
summary(fimlResult)
## 
##  Model Chisquare =  24.09798   Df =  1 Pr(>Chisq) = 9.155634e-07
##  AIC =  44.09798
##  BIC =  19.3975
## 
##  Normalized Residuals
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -2.3242  0.0000  0.0000 -0.1291  0.0000  0.0000 
## 
##  R-square for Endogenous Variables
##    LogPrice LogQuantity 
##      0.6324      0.3184 
## 
##  Parameter Estimates
##      Estimate    Std Error   z value    Pr(>|z|)    
## b11   0.67034011 0.062501111 10.7252511 7.747638e-27
## b12   0.25097841 0.057317077  4.3787719 1.193500e-05
## b13   0.13683538 0.053558189  2.5548919 1.062208e-02
## b14   0.12405447 0.136081418  0.9116195 3.619691e-01
## b22   0.36774468 0.165254303  2.2253259 2.605937e-02
## b23  -0.56095102 0.151547605 -3.7014839 2.143422e-04
## b24  -0.23940444 0.141609021 -1.6906016 9.091292e-02
## b25  -2.15950971 0.359802242 -6.0019351 1.949796e-09
## sig1  0.05392378 0.007304359  7.3824115 1.554478e-13
## sig2  0.37697315 0.051063686  7.3824115 1.554478e-13
##                                  
## b11  LogPrice <--- LogPriceL1    
## b12  LogPrice <--- Stormy        
## b13  LogPrice <--- Mixed         
## b14  LogPrice <--- Holiday       
## b22  LogQuantity <--- LogPriceL1 
## b23  LogQuantity <--- Stormy     
## b24  LogQuantity <--- Mixed      
## b25  LogQuantity <--- Holiday    
## sig1 LogPrice <--> LogPrice      
## sig2 LogQuantity <--> LogQuantity
## 
##  Iterations =  1

In the code chunk above, the first 12 lines of code create a matrix that fully describes our structural model. The first column contains a causal relationship, where, for example, the first entry states that LogPriceL1 has a causal impact on LogPrice. The second column contains the name of the parameter that relates these variables, which you specify (e.g., b11 for the first relation). Finally, the third column takes an optional set of starting values as arguments – entering NA allows sem to use a set of default starting values. Also take note of the last two lines of this matrix – double-headed arrows are used to denote variances, which we must also specify. Over the course of the next few lines of code, we coerce the matrix to take on the class mod, which is one class compatible with sem (there are functions that create objects of this class automatically, but for our purposes it's easier to specify this matrix directly). Thereafter, we create two character vectors which respectively contain the sets of exogenous and endogenous variables, as well as a third character vector containing all of the variable names. Finally, these objects (and other relevant information – see the help file for sem) are fed into sem via the appropriate arguments. If you compare the summary results from this code chunk with those of the reduced form VAR – var1 – you'll see that the coefficient estimates are identical (remember, MLE collapses to OLS in some situations).
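As an aside, the sem package can also draw the path diagram implied by a fitted model. The sketch below prints the diagram in Graphviz's dot language (exact behavior depends on your sem version and whether Graphviz is installed):

# Print/render the path diagram for the fitted model
pathDiagram(fimlResult)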

We can also obtain an identical set of results to those of var2 by respecifying our path diagram to exclude Stormy from the equation for LogQuantity – not the equation for LogPrice – and adding LogPrice to the equation for LogQuantity. This is obtained as follows:

fimlRaM2 <- matrix(c(
  "LogPrice <- LogPriceL1",   "b11", NA,
  "LogPrice <- Stormy",       "b12", NA,
  "LogPrice <- Mixed",        "b13", NA,
  "LogPrice <- Holiday",      "b14", NA,
  "LogQuantity <- LogPrice",  "b21", NA,
  "LogQuantity <- LogPriceL1","b22", NA,
  "LogQuantity <- Mixed",     "b24", NA,
  "LogQuantity <- Holiday",   "b25", NA,
  "LogPrice <-> LogQuantity", "cov1", NA,
  "LogPrice <-> LogPrice",       "sig1", 1,
  "LogQuantity <-> LogQuantity", "sig2", 1),
  ncol=3, byrow=T)
class(fimlRaM2) <- "mod"
exogVar <- c("LogPriceL1", "Stormy", "Mixed", "Holiday")
endogVar <- c("LogPrice", "LogQuantity")
allVar <- c(exogVar, endogVar)
fimlResult2 <- sem(model=fimlRaM2, S = cov(fulton[-1, allVar]), 
                   N = (nrow(fulton) -1), fixed.x = exogVar)
summary(fimlResult2)
## 
##  Model Chisquare =  2.96243e-11   Df =  0 Pr(>Chisq) = NA
##  AIC =  22
##  BIC =  2.96243e-11
## 
##  Normalized Residuals
##       Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
## -1.521e-06  0.000e+00  0.000e+00  5.215e-08  4.511e-07  1.775e-06 
## 
##  R-square for Endogenous Variables
##    LogPrice LogQuantity 
##      0.6324      0.3446 
## 
##  Parameter Estimates
##      Estimate    Std Error   z value    Pr(>|z|)    
## b11   0.67034006 0.062501099 10.7252524 7.747527e-27
## b12   0.25097837 0.057317066  4.3787721 1.193498e-05
## b13   0.13683533 0.053558179  2.5548914 1.062209e-02
## b14   0.12405458 0.136081392  0.9116205 3.619685e-01
## b21  -2.23505714 0.592126201 -3.7746297 1.602455e-04
## b22   1.86599291 0.474906466  3.9291798 8.523608e-05
## b24   0.06643058 0.125743104  0.5283040 5.972884e-01
## b25  -1.88224038 0.361223676 -5.2107337 1.880953e-07
## cov1  0.05702456 0.035052429  1.6268363 1.037719e-01
## sig1  0.05392376 0.007304356  7.3824115 1.554478e-13
## sig2  0.36250451 0.083496611  4.3415476 1.414826e-05
##                                  
## b11  LogPrice <--- LogPriceL1    
## b12  LogPrice <--- Stormy        
## b13  LogPrice <--- Mixed         
## b14  LogPrice <--- Holiday       
## b21  LogQuantity <--- LogPrice   
## b22  LogQuantity <--- LogPriceL1 
## b24  LogQuantity <--- Mixed      
## b25  LogQuantity <--- Holiday    
## cov1 LogQuantity <--> LogPrice   
## sig1 LogPrice <--> LogPrice      
## sig2 LogQuantity <--> LogQuantity
## 
##  Iterations =  26

We follow similar steps to estimate the over-identified demand equation, i.e. where we have two instrumental variables, but also a restriction to ensure uniqueness (the so-called 'over-identifying restriction'). This is where sem becomes useful – it allows us to estimate this over-identified system of equations via full information maximum likelihood. Intuitively, all we need to do to implement our over-identifying restriction is omit our additional instrument from the conditional equation for LogQuantity – in this instance, we've chosen to omit Mixed.

fimlRaM3 <- matrix(c(
  "LogPrice <- LogPriceL1",   "b11", NA,
  "LogPrice <- Stormy",       "b12", NA,
  "LogPrice <- Mixed",        "b13", NA,
  "LogPrice <- Holiday",      "b14", NA,
  "LogQuantity <- LogPrice",  "b21", NA,
  "LogQuantity <- LogPriceL1","b22", NA,
  "LogQuantity <- Holiday",   "b25", NA,
  "LogPrice <-> LogQuantity", "cov1", NA,
  "LogPrice <-> LogPrice",       "sig1", 1,
  "LogQuantity <-> LogQuantity", "sig2", 1),
  ncol=3, byrow=T)
class(fimlRaM3) <- "mod"
exogVar <- c("LogPriceL1", "Stormy", "Mixed", "Holiday")
endogVar <- c("LogPrice", "LogQuantity")
allVar <- c(exogVar, endogVar)

fimlResult3 <- sem(model=fimlRaM3, S = cov(fulton[-1, allVar]), 
                   N = (nrow(fulton) -1), fixed.x = exogVar)
summary(fimlResult3)
## 
##  Model Chisquare =  0.2822383   Df =  1 Pr(>Chisq) = 0.5952379
##  AIC =  20.28224
##  BIC =  -4.418242
## 
##  Normalized Residuals
##       Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
## -1.424e-01 -3.100e-07  0.000e+00  1.139e-02  0.000e+00  2.811e-01 
## 
##  R-square for Endogenous Variables
##    LogPrice LogQuantity 
##      0.6323      0.3495 
## 
##  Parameter Estimates
##      Estimate    Std Error   z value    Pr(>|z|)    
## b11   0.67026923 0.062513748 10.7219490 8.029325e-27
## b12   0.25224324 0.057279278  4.4037433 1.063988e-05
## b13   0.12729468 0.050465053  2.5224322 1.165464e-02
## b14   0.12423250 0.136108830  0.9127439 3.613773e-01
## b21  -2.20233261 0.583709961 -3.7729913 1.613019e-04
## b22   1.84359618 0.469316746  3.9282557 8.556421e-05
## b25  -1.88514306 0.359756422 -5.2400540 1.605297e-07
## cov1  0.05540325 0.034608546  1.6008547 1.094091e-01
## sig1  0.05394584 0.007307346  7.3824115 1.554478e-13
## sig2  0.35976031 0.080982536  4.4424431 8.894314e-06
##                                  
## b11  LogPrice <--- LogPriceL1    
## b12  LogPrice <--- Stormy        
## b13  LogPrice <--- Mixed         
## b14  LogPrice <--- Holiday       
## b21  LogQuantity <--- LogPrice   
## b22  LogQuantity <--- LogPriceL1 
## b25  LogQuantity <--- Holiday    
## cov1 LogQuantity <--> LogPrice   
## sig1 LogPrice <--> LogPrice      
## sig2 LogQuantity <--> LogQuantity
## 
##  Iterations =  25

The second equation again provides structural estimates – compare with Table 15.5, page 231. In the model summary, take note of the model Chi-squared statistic – this is a test of the over-identifying restriction (the null hypothesis being that the restriction is valid, i.e. that the over-identified model fits as well as the exactly-identified one). According to this statistic, we cannot reject the over-identifying restriction.

The FIML approach involves, firstly, estimating a congruent reduced-form system and, secondly, formulating the structural model. As you might have guessed, one can avoid having to estimate the reduced-form system and instead estimate the structural equation 'directly' using a two-stage least-squares method. This is done as follows:

# Via 2SLS
m1 <- tsls(LogQuantity ~ LogPrice + LogPriceL1 + Holiday, data=fulton, 
           instruments = ~ LogPriceL1 + Stormy + Mixed + Holiday)
summary(m1)
## 
##  2SLS Estimates
## 
## Model Formula: LogQuantity ~ LogPrice + LogPriceL1 + Holiday
## 
## Instruments: ~LogPriceL1 + Stormy + Mixed + Holiday
## 
## Residuals:
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -1.90560 -0.28511  0.02252  0.00000  0.39484  1.18308 
## 
##                Estimate  Std. Error   t value   Pr(>|t|)    
## (Intercept)  8.51910711  0.07113679 119.75670 < 2.22e-16 ***
## LogPrice    -2.18757674  0.58990595  -3.70835 0.00033390 ***
## LogPriceL1   1.83237703  0.47435558   3.86288 0.00019323 ***
## Holiday     -1.88708663  0.36397008  -5.18473 1.0429e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6068569 on 106 degrees of freedom

These results are identical to those reported in Table 15.9. Also notice that the estimates from this approach are slightly different from the estimates obtained using the FIML approach – for over-identified models, 2SLS and FIML are not necessarily identical. Again, a full discussion of the difference between these estimation approaches is beyond the scope of this tutorial, but note that this single-equation approach is more robust but less efficient than FIML, as it does not require the full set of MLE distributional assumptions for consistency.