Interaction Effects in Econometric Models

An earlier video (video #18) on interpreting the coefficients of a multiple regression model explained that each coefficient measures the effect of a regressor after eliminating (partialling out, netting out) the effect of the other variables from both this regressor and the dependent variable. In this post I explain why we need an interaction term in the model.

\(y_i=\beta_0+\beta_1x_{1i}+\beta_2x_{2i}+\beta_3x_{1i}x_{2i}+\epsilon_i\) 
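To see what the interaction term does in this equation, it helps to differentiate with respect to each regressor. The short derivation below uses only the symbols of the model above; the comparison with the case \(\beta_3=0\) is added here for illustration.

```latex
% Marginal effect of x_1 in the model with an interaction term
\frac{\partial y_i}{\partial x_{1i}} = \beta_1 + \beta_3 x_{2i}

% By symmetry, the marginal effect of x_2
\frac{\partial y_i}{\partial x_{2i}} = \beta_2 + \beta_3 x_{1i}

% Without the interaction term (\beta_3 = 0), the effect of x_1 is the constant \beta_1
```

So \(\beta_1\) alone is the effect of \(x_1\) only when \(x_2=0\); at any other value of \(x_2\) the effect shifts by \(\beta_3x_2\).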

This post uses the data set Growth described in an earlier video, excluding the data for Malta, to run five regressions of Growth on different sets of regressors.

The following code chunk loads all the libraries and the data. This has already been discussed in a video.

#install.packages(c('rmarkdown', 'tinytex'))
#tinytex::install_tinytex()
library(readxl)
library(dplyr)
library(moderndive)
library(gridExtra)
library(ggplot2)
library(stargazer)

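# Read the Growth data (the path is specific to the author's machine)
Growth <- read_excel("C:/Users/hp/Dropbox/Applied Economics/Growth.xlsx")

# Drop Malta, as the exercise requires
growth_malta <- Growth %>% filter(country_name != "Malta")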

Using the data set Growth (a description is provided in the class notes), excluding the data for Malta, run the following regressions: Growth on
(1) TradeShare and YearsSchool;
(2) TradeShare and ln(YearsSchool);
(3) TradeShare, ln(YearsSchool), Rev_Coups, Assassinations and ln(RGDP60);
(4) TradeShare, ln(YearsSchool), Rev_Coups, Assassinations, ln(RGDP60), and TradeShare*ln(YearsSchool).

Model 4 is special in the sense that it has an interaction term. In models 2 and 3, the coefficient on TradeShare measures the effect of TradeShare after partialling out (netting out) the effect of the other variables from both TradeShare and Growth, that is, holding all other variables constant. But it may be the case that TradeShare and YearsSchool are both important and that the effect of one variable depends on the level of the other. An interaction term is what captures this.

To elaborate my point, I am going to take a simple example:
\(wage=\beta_0+\beta_1Gender+\beta_2Graduate+\epsilon\)
where Gender is 1 for male and 0 for female, while Graduate is 1 for a graduate degree and 0 otherwise.

In this model, \(\beta_1\) measures the effect of being male rather than female after eliminating the effect of Graduate from both Gender and wage. The same is the case with \(\beta_2\), which measures the effect of being a graduate rather than a non-graduate after eliminating the effect of Gender from both Graduate and wage. If one is interested in measuring the effect of an individual being both male and a graduate, or female and a graduate, then this model does not work. One has to introduce an interaction term, which measures this impact:
\(wage=\beta_0+\beta_1Gender+\beta_2Graduate+\beta_3(Gender:Graduate)+\epsilon\)

Interpreting an interaction term is simple when both variables are binary, a bit tricky when one variable is binary and the other is continuous, and has to be done point by point when both variables are continuous. Models that include an interaction term, or the square or log of a variable, are called models that are non-linear in the variables. Non-linear models do not have a single standard interpretation like linear models, because the same change in an independent variable at different points does not lead to the same change in the response variable, as it does in linear models.
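The binary case can be checked with a small simulation. The sketch below is not based on any real wage data: the sample size, the coefficient values and the data frame `sim` are all made up for illustration. It shows that, with two binary regressors and their interaction, the four group means of wage can be read directly off the coefficients.

```r
# Simulated data; every number here is invented for illustration only
set.seed(1)
n <- 400
gender   <- rbinom(n, 1, 0.5)   # 1 = male, 0 = female
graduate <- rbinom(n, 1, 0.5)   # 1 = graduate, 0 = non-graduate
wage <- 10 + 2 * gender + 5 * graduate + 3 * gender * graduate + rnorm(n)
sim <- data.frame(wage, gender, graduate)

# gender * graduate expands to gender + graduate + gender:graduate
fit <- lm(wage ~ gender * graduate, data = sim)
b <- coef(fit)

# Mean wage of each group, built from the estimated coefficients
from_coefs <- c(
  female_nongrad = unname(b["(Intercept)"]),
  male_nongrad   = unname(b["(Intercept)"] + b["gender"]),
  female_grad    = unname(b["(Intercept)"] + b["graduate"]),
  male_grad      = unname(sum(b))
)

# Mean wage of each group, computed directly from the data
cell_means <- with(sim, c(
  female_nongrad = mean(wage[gender == 0 & graduate == 0]),
  male_nongrad   = mean(wage[gender == 1 & graduate == 0]),
  female_grad    = mean(wage[gender == 0 & graduate == 1]),
  male_grad      = mean(wage[gender == 1 & graduate == 1])
))

print(rbind(from_coefs, cell_means))
```

The two rows agree, so in this setting \(\beta_3\) is the extra wage difference for male graduates beyond the sum of the separate Gender and Graduate effects.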

If the interaction term is significant, then retain both main variables in the model even if they are individually non-significant.

In model 4 there is an interaction term, but it is non-significant, so one should drop it from the model. Similarly, Assassinations is not significant and should be dropped from the model. If I drop the non-significant variables from the model (using a joint significance F-test), I get another model, which is model 5.
It is interesting that TradeShare, which was significant in models 1 and 2, is insignificant in models 3 to 5. This is why one should not draw very strong conclusions from a model unless the issue has been explored in all possible dimensions. Anyhow, my purpose here is not to settle the role of TradeShare in economic growth.
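A joint significance F-test of this kind compares a restricted model with an unrestricted one. The sketch below uses simulated data rather than the Growth data; the names `x1`, `x2`, `x3`, `restricted` and `unrestricted` are hypothetical. It relies on base R's `anova()`, which reports the F-test for two nested `lm` fits, and repeats the calculation by hand.

```r
# Simulated data for illustration only; x3 and x1:x2 play no role in y
set.seed(42)
n <- 64
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 1 + 0.5 * sim$x1 + 1.5 * sim$x2 + rnorm(n)

unrestricted <- lm(y ~ x1 + x2 + x3 + x1:x2, data = sim)
restricted   <- lm(y ~ x1 + x2, data = sim)

# Joint F-test of H0: the coefficients on x3 and x1:x2 are both zero
ftest <- anova(restricted, unrestricted)
print(ftest)

# The same F statistic from the residual sums of squares
rss_r <- sum(resid(restricted)^2)
rss_u <- sum(resid(unrestricted)^2)
q     <- 2                              # number of restrictions
f_manual <- ((rss_r - rss_u) / q) / (rss_u / df.residual(unrestricted))
print(f_manual)
```

If the p-value in the `anova()` table is large, the dropped terms are jointly non-significant and the restricted model can be kept.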

Note: Often one switches from the main variable of interest (TradeShare) to other variables while interpreting the results. As in forecasting, economics students formulate a set of hypotheses and justify their findings on the basis of whichever of these hypotheses are rejected. I shall reiterate here that in economics and the social sciences there is usually one main variable of interest. So if someone wants to design a policy around the question of whether economic growth can be increased by increasing trade share, then the main variable of interest is TradeShare.

But it often happens that, if TradeShare is non-significant, one leaves this actual debate and starts highlighting the importance of other variables such as schooling, coups and so on. Shifting from the main hypothesis to other predictors while interpreting the results is not what an econometric model is for. The other variables are added to the model in order to measure the effect of TradeShare correctly. To understand this, please listen to "4 objectives of econometrics".
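The role of the other variables can be checked numerically. The sketch below uses simulated data, not the Growth data, and the names `x1`, `x2` and `y` are hypothetical. It shows that the coefficient on `x1` in a multiple regression equals the coefficient obtained after partialling `x2` out of both `y` and `x1`, and that leaving `x2` out altogether gives a different answer.

```r
# Simulated data for illustration only
set.seed(123)
n  <- 200
x2 <- rnorm(n)
x1 <- 0.6 * x2 + rnorm(n)               # x1 is correlated with x2
y  <- 1 + 2 * x1 - 1 * x2 + rnorm(n)

# Multiple regression: effect of x1 with x2 in the model
full <- lm(y ~ x1 + x2)

# Partial x2 out of both y and x1, then regress residuals on residuals
y_res   <- resid(lm(y ~ x2))
x1_res  <- resid(lm(x1 ~ x2))
partial <- lm(y_res ~ x1_res)

# Simple regression that ignores x2
naive <- lm(y ~ x1)

b_full    <- unname(coef(full)["x1"])
b_partial <- unname(coef(partial)["x1_res"])
b_naive   <- unname(coef(naive)["x1"])
print(c(full = b_full, partialled_out = b_partial, naive = b_naive))
```

The first two numbers are identical, while the naive estimate differs because it also picks up part of the effect of `x2`.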

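# Model 1: TradeShare and YearsSchool in levels
mod.lm1 <- lm(growth ~ tradeshare + yearsschool, data = growth_malta)
# Log transformations used in models 2 to 5
growth_malta <- growth_malta %>% mutate(lnYearSch = log(yearsschool), lnrgdp60 = log(rgdp60))
# Model 2: YearsSchool enters in logs
mod.lm2 <- lm(growth ~ tradeshare + lnYearSch, data = growth_malta)
# Model 3: adds the control variables
mod.lm3 <- lm(growth ~ tradeshare + lnYearSch + rev_coups + assasinations + lnrgdp60, data = growth_malta)
# Model 4: adds the interaction; note that it uses yearsschool in levels, not lnYearSch
mod.lm4 <- lm(growth ~ tradeshare + lnYearSch + rev_coups + assasinations + lnrgdp60 + tradeshare:yearsschool, data = growth_malta)
# Model 5: model 4 without assasinations and the interaction
mod.lm5 <- lm(growth ~ tradeshare + lnYearSch + rev_coups + lnrgdp60, data = growth_malta)
#stargazer(mod.lm1,mod.lm2,mod.lm3,mod.lm4,mod.lm5,type = "text")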

library(gt)
library(huxtable)
huxreg(mod.lm1,mod.lm2,mod.lm3,mod.lm4,mod.lm5)
                        (1)          (2)          (3)          (4)          (5)
(Intercept)             -0.122       -0.186       11.746 ***   10.484 **    11.615 ***
                        (0.663)      (0.564)      (2.920)      (3.039)      (2.891)
tradeshare              1.898 *      1.749 *      1.104        2.230        0.990
                        (0.936)      (0.860)      (0.833)      (1.163)      (0.800)
yearsschool             0.243 **
                        (0.084)
lnYearSch                            1.016 ***    2.161 ***    2.487 ***    2.151 ***
                                     (0.223)      (0.363)      (0.431)      (0.360)
rev_coups                                         -2.300 *     -2.315 *     -2.048 *
                                                  (1.004)      (0.997)      (0.878)
assasinations                                     0.228        0.117
                                                  (0.434)      (0.438)
lnrgdp60                                          -1.621 ***   -1.488 ***   -1.592 ***
                                                  (0.399)      (0.407)      (0.392)
tradeshare:yearsschool                                         -0.330
                                                               (0.240)
N                       64           64           64           64           64
R2                      0.161        0.287        0.453        0.471        0.451
logLik                  -122.897     -117.667     -109.184     -108.136     -109.335
AIC                     253.794      243.334      232.367      232.272      230.671
Standard errors in parentheses. *** p < 0.001; ** p < 0.01; * p < 0.05.
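Two short calculations with the point estimates in the table show what "non-linear in the variables" means in practice. The YearsSchool values used (2, 3, 8, 10 and 11 years) are chosen here only for illustration. Since the interaction coefficient in model 4 is not statistically significant, the first calculation should be read as a mechanical illustration and not as a finding.

```latex
% Model 4: the effect of TradeShare depends on YearsSchool
\frac{\partial\, Growth}{\partial\, TradeShare} = 2.230 - 0.330 \times YearsSchool

% At YearsSchool = 2:  2.230 - 0.330 \times 2 = 1.570
% At YearsSchool = 8:  2.230 - 0.330 \times 8 = -0.410
% The effect is zero at YearsSchool = 2.230 / 0.330 \approx 6.76

% Model 2: the effect of one more year of schooling depends on the starting point
\Delta Growth = 1.016 \times [\ln(3) - \ln(2)] \approx 0.412     % from 2 to 3 years
\Delta Growth = 1.016 \times [\ln(11) - \ln(10)] \approx 0.097   % from 10 to 11 years
```

In both cases the same one-unit change gives a different change in Growth depending on where it starts, which is why these models have to be interpreted point by point.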