ECNM Discussion_7

Author

Bryan Calderon

Loading Packages

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.3     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Please cite as: 


 Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.

 R package version 5.2.3. https://CRAN.R-project.org/package=stargazer 



Attaching package: 'kableExtra'


The following object is masked from 'package:dplyr':

    group_rows



Attaching package: 'scales'


The following object is masked from 'package:purrr':

    discard


The following object is masked from 'package:readr':

    col_factor



Attaching package: 'psych'


The following objects are masked from 'package:scales':

    alpha, rescale


The following objects are masked from 'package:ggplot2':

    %+%, alpha



Attaching package: 'reshape2'


The following object is masked from 'package:tidyr':

    smiths
Warning: package 'corrplot' was built under R version 4.3.3
corrplot 0.94 loaded

Attaching package: 'gridExtra'

The following object is masked from 'package:dplyr':

    combine
Warning: package 'plm' was built under R version 4.3.2

Attaching package: 'plm'

The following objects are masked from 'package:dplyr':

    between, lag, lead

Loading required package: carData

Attaching package: 'car'

The following object is masked from 'package:psych':

    logit

The following object is masked from 'package:dplyr':

    recode

The following object is masked from 'package:purrr':

    some

Bringing in Data

setwd("~/Desktop/Grad School/Econometrics/Discussion_7")

# Guns Data
guns  <-  read_csv("Guns Data.csv") 
Rows: 1173 Columns: 13
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (2): state, law
dbl (11): year, violent, murder, robbery, prisoners, afam, cauc, male, popul...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Part 1

Overview

I chose the Guns data set, which reports crime and related statistics for U.S. states from 1977 to 1999, along with an indicator for whether each state had a shall-carry law in effect in a given year.

The time dimension is the year (1977 to 1999), and the entity dimension is the state: the 50 U.S. states plus the District of Columbia.

# Summary table
kable(head(guns), caption = "Guns Dataset sample")
Guns Dataset sample
year  violent  murder  robbery  prisoners  afam      cauc      male      population  income    density    state    law
1977  414.4    14.2    96.8     83         8.384873  55.12291  18.17441  3.780403    9563.148  0.0745524  Alabama  no
1978  419.1    13.3    99.1     94         8.352101  55.14367  17.99408  3.831838    9932.000  0.0755667  Alabama  no
1979  413.3    13.2    109.5    144        8.329575  55.13586  17.83934  3.866248    9877.028  0.0762453  Alabama  no
1980  448.5    13.2    132.1    141        8.408386  54.91259  17.73420  3.900368    9541.428  0.0768288  Alabama  no
1981  470.5    11.9    126.5    149        8.483435  54.92513  17.67372  3.918531    9548.351  0.0771866  Alabama  no
1982  447.7    10.6    112.0    183        8.514000  54.89621  17.51052  3.925229    9478.919  0.0773185  Alabama  no

Checking for balance

As shown below, the data is indeed balanced.

# Converting into a panel data frame
guns_panel <- pdata.frame(guns, 
                          index = c("state", "year"))

# Checking if the dataset is balanced --> using PLM package
is_balanced <- is.pbalanced(guns_panel)

# Print the result
print(is_balanced)
[1] TRUE
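
A balanced panel has exactly one row for every state-year pair. The sketch below shows the same idea in base R, without plm. It runs on a small made-up data frame (`toy`, not the Guns data), and the helper name `is_balanced_base` is hypothetical.

```r
# Toy panel: 3 hypothetical states observed over 4 years (not the Guns data)
toy <- expand.grid(state = c("A", "B", "C"), year = 1977:1980,
                   stringsAsFactors = FALSE)

# Hypothetical helper: TRUE when every state-year cell appears exactly once
is_balanced_base <- function(df) {
  counts <- table(df$state, df$year)
  all(counts == 1)
}

balanced_full <- is_balanced_base(toy)        # complete panel
balanced_drop <- is_balanced_base(toy[-1, ])  # one state-year row removed

print(c(full = balanced_full, dropped = balanced_drop))
```

For the Guns data, 51 entities times 23 years gives 1,173 rows, which matches the row count reported by `read_csv()` above.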

Part 2

OLS

\(\text{Violent}_{it} = \beta_0 + \beta_1 \text{law}_{it} + \beta_2 \text{prisoners}_{it} + \beta_3 \text{income}_{it} + \beta_4 \text{population}_{it} + \beta_5 \text{male}_{it} + \beta_6 \text{cauc}_{it} + \beta_7 \text{afam}_{it} + u_{it}\)

Description of Variables in the Guns Dataset
Variable    Description
state       Factor indicating state.
year        Factor indicating year.
violent     Violent crime rate (incidents per 100,000 members of the population).
murder      Murder rate (incidents per 100,000).
robbery     Robbery rate (incidents per 100,000).
prisoners   Incarceration rate in the state in the previous year (sentenced prisoners per 100,000 residents; value for the previous year).
afam        Percent of state population that is African-American, ages 10 to 64.
cauc        Percent of state population that is Caucasian, ages 10 to 64.
male        Percent of state population that is male, ages 10 to 29.
population  State population, in millions of people.
income      Real per capita personal income in the state (US dollars).
density     Population per square mile of land area, divided by 1,000.
law         Factor. Does the state have a shall carry law in effect in that year?
# OLS model
ols_model <- lm(violent ~ law + prisoners + income + population + male + afam + cauc, 
                data = guns)

# summary
stargazer(ols_model, type = "text")

===============================================
                        Dependent variable:    
                    ---------------------------
                              violent          
-----------------------------------------------
lawyes                      -87.029***         
                             (15.067)          
                                               
prisoners                    1.102***          
                              (0.046)          
                                               
income                       0.021***          
                              (0.003)          
                                               
population                   13.782***         
                              (1.144)          
                                               
male                         32.176***         
                              (4.764)          
                                               
afam                         -17.653**         
                              (7.477)          
                                               
cauc                        -15.237***         
                              (3.701)          
                                               
Constant                     456.713*          
                             (248.122)         
                                               
-----------------------------------------------
Observations                   1,173           
R2                             0.652           
Adjusted R2                    0.649           
Residual Std. Error     197.926 (df = 1165)    
F Statistic          311.140*** (df = 7; 1165) 
===============================================
Note:               *p<0.1; **p<0.05; ***p<0.01

Interpreting the results

  • Direction: Most coefficients have the expected sign, with the exception of income. I would have expected the violent crime rate to fall as income rises. That said, greater wealth could also raise the payoff to some violent crimes, such as robbery.

  • Magnitude: The coefficients seem reasonable given the scale of the variables. The most interesting effects come from male and lawyes. Each additional percentage point of the state population that is male (ages 10 to 29) is associated with roughly 32 more violent crimes per 100,000 people. Similarly, states with a shall-carry law in effect have, on average, a violent crime rate about 87 incidents per 100,000 lower, holding the other variables constant.

  • Statistical Significance: Based on the p-values, all explanatory variables are statistically significant at the 5% level, and all except afam are significant at the 1% level.

  • Omitted Variable Bias: There could be omitted variable bias from unobserved state-level characteristics, such as cultural factors or state-specific policies, that affect crime rates and are correlated with shall-carry laws. Including fixed effects can help control for these unobserved factors, as long as they are constant over time.
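
The omitted variable concern can be illustrated with a small simulation. This is a sketch on made-up data, not the Guns data: the panel size, the true slope of 2, and the strength of the correlation between the regressor and the unobserved state effect are all assumptions chosen for illustration.

```r
set.seed(1)

n_states <- 50   # hypothetical number of entities
n_years  <- 20   # hypothetical number of periods
state <- rep(seq_len(n_states), each = n_years)

# Unobserved, time-invariant state characteristic
alpha <- rnorm(n_states, sd = 5)[state]

# Regressor that is correlated with the unobserved state effect
x <- 0.5 * alpha + rnorm(n_states * n_years)

# True slope on x is 2; alpha also drives the outcome
true_beta <- 2
y <- true_beta * x + 3 * alpha + rnorm(n_states * n_years)

# Pooled OLS omits alpha, so its slope absorbs part of alpha's effect
beta_pooled <- unname(coef(lm(y ~ x))["x"])

# State dummies absorb alpha, so the slope is estimated from within-state variation
beta_fe <- unname(coef(lm(y ~ x + factor(state)))["x"])

print(c(true = true_beta, pooled = beta_pooled, fixed_effects = beta_fe))
```

In this setup the pooled slope should land well away from 2, while the estimate with state dummies should be close to it.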

Part 3: Fixed Effects Models

Dummy Variables

\[ \begin{align*}\text{Violent}_{it} &= \beta_0 + \beta_1 \text{law}_{it} + \beta_2 \text{prisoners}_{it} + \beta_3 \text{income}_{it} + \beta_4 \text{population}_{it} \\&+ \beta_5 \text{male}_{it} + \beta_6 \text{cauc}_{it} + \beta_7 \text{afam}_{it} + \delta_2 \text{State} \ 2_i + \delta_3 \text{State} \ 3_i + \dots + \delta_n \text{State} \ n_i + u_{it}\end{align*} \]

# OLS model with a dummy variable for each state
ols_model_dummy <- lm(violent ~ law + prisoners + income + population + male + afam + cauc + state, 
                data = guns)

Demeaned Regression Method

\[ \begin{align*}\text{Violent}_{it} - \overline{\text{Violent}}_{i} &= \beta_1 (\text{law}_{it} - \overline{\text{law}}_{i}) \\&\quad + \beta_2 (\text{prisoners}_{it} - \overline{\text{prisoners}}_{i}) \\&\quad + \beta_3 (\text{income}_{it} - \overline{\text{income}}_{i}) \\&\quad + \beta_4 (\text{population}_{it} - \overline{\text{population}}_{i}) \\&\quad + \beta_5 (\text{male}_{it} - \overline{\text{male}}_{i}) \\&\quad + \beta_6 (\text{cauc}_{it} - \overline{\text{cauc}}_{i}) \\&\quad + \beta_7 (\text{afam}_{it} - \overline{\text{afam}}_{i}) \\&\quad + (u_{it} - \overline{u}_{i})\end{align*} \]

# Convert to numeric
guns <- guns %>%
  mutate(law_numeric = ifelse(law == "yes", 1, 0))

# Demean the data --> subtract each state's mean from every variable
guns_demeaned <- with(
  data = guns,
  expr = data.frame(
    violent    = violent     - ave(x = violent, state),
    lawyes     = law_numeric - ave(x = law_numeric, state),
    prisoners  = prisoners   - ave(x = prisoners, state),
    income     = income      - ave(x = income, state),
    population = population  - ave(x = population, state),
    male       = male        - ave(x = male, state),
    afam       = afam        - ave(x = afam, state),
    cauc       = cauc        - ave(x = cauc, state)
  )
)

# OLS regression without intercept
ols_model_demeaned <- lm(violent ~ lawyes + prisoners + income + population + male + afam + cauc - 1,
                         data = guns_demeaned)

Within Estimator Method

\[ \begin{align*} \text{Violent}_{it} - \overline{\text{Violent}}_{i} &= \beta_1 (\text{law}_{it} - \overline{\text{law}}_{i}) \\ &\quad + \beta_2 (\text{prisoners}_{it} - \overline{\text{prisoners}}_{i}) \\ &\quad + \beta_3 (\text{income}_{it} - \overline{\text{income}}_{i}) \\ &\quad + \beta_4 (\text{population}_{it} - \overline{\text{population}}_{i}) \\ &\quad + \beta_5 (\text{male}_{it} - \overline{\text{male}}_{i}) \\ &\quad + \beta_6 (\text{cauc}_{it} - \overline{\text{cauc}}_{i}) \\ &\quad + \beta_7 (\text{afam}_{it} - \overline{\text{afam}}_{i}) \\ &\quad + (u_{it} - \overline{u}_{i}) \end{align*} \]

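# Within estimator via plm(); the state effects come from `index`,
# so no separate state term is needed in the formula
ols_model_within <- plm(formula = violent ~ law + prisoners + income + population + male + afam + cauc, 
                data = guns,
                index   = c("state", "year"), 
                model   = "within")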

Comparing the three methods

# Generate a summary table with custom column labels
stargazer(ols_model_dummy,
          ols_model_demeaned,
          ols_model_within,
          type = "text",
          column.labels = c("Dummy Fixed Effects", "Demeaned Fixed Effects", 
                            "Within Fixed Effects (PLM)"),
          title = "Regression Results: Comparison of Fixed Effects Models",
          dep.var.labels = "Violent Crime Rate",
          covariate.labels = c("Shall Carry Law", "Prisoners", "Income", "Population", 
                               "Male (10-29)", "African American", "Caucasian"),
          no.space = TRUE, omit = c("state"))

Regression Results: Comparison of Fixed Effects Models
==================================================================================================
                                                 Dependent variable:                              
                    ------------------------------------------------------------------------------
                                                  Violent Crime Rate                              
                                            OLS                                   panel           
                                                                                  linear          
                       Dummy Fixed Effects      Demeaned Fixed Effects  Within Fixed Effects (PLM)
                               (1)                       (2)                       (3)            
--------------------------------------------------------------------------------------------------
Shall Carry Law              -21.991*                  -21.991*                  -21.991*         
                             (11.548)                  (11.292)                  (11.548)         
Prisoners                    0.212***                  0.212***                  0.212***         
                             (0.043)                   (0.042)                   (0.043)          
Income                        -0.004                    -0.004                    -0.004          
                             (0.004)                   (0.004)                   (0.004)          
Population                   10.143*                   10.143*                   10.143*          
                             (5.319)                   (5.202)                   (5.319)          
Male (10-29)                -21.564***                -21.564***                -21.564***        
                             (3.862)                   (3.777)                   (3.862)          
African American              -3.949                    -3.949                    -3.949          
                             (9.364)                   (9.157)                   (9.364)          
Caucasian                    9.644***                  9.644***                  9.644***         
                             (3.113)                   (3.044)                   (3.113)          
Constant                     342.495                                                              
                            (237.027)                                                             
--------------------------------------------------------------------------------------------------
Observations                  1,173                     1,173                     1,173           
R2                            0.917                     0.197                     0.197           
Adjusted R2                   0.913                     0.193                     0.156           
Residual Std. Error     98.856 (df = 1115)        96.670 (df = 1166)                              
F Statistic         215.544*** (df = 57; 1115) 40.970*** (df = 7; 1166)  39.178*** (df = 7; 1115) 
==================================================================================================
Note:                                                                  *p<0.1; **p<0.05; ***p<0.01

Results

Do the coefficients change?

  • The coefficients match across the three methods, because each is designed to account for time-invariant characteristics of the entities (in this case, states).

    • They remove the fixed effects in different ways but yield the same estimates for the explanatory variables.
  • That said, they do differ from the pooled OLS estimates in Part 2, because the models now control for those time-invariant characteristics. The law coefficient, for example, shrinks from -87.029 to -21.991.
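
The standard errors in column (2) are slightly smaller than in columns (1) and (3), even though the coefficients agree. A likely explanation is degrees of freedom: `lm()` on the demeaned data counts only the 7 slopes (df = 1166), while the dummy and plm versions also account for the 51 state effects (df = 1115). The arithmetic below uses only numbers reported in the table and rescales the column (2) standard error for the law coefficient by the ratio of the two degrees of freedom.

```r
# Values copied from the comparison table
se_demeaned <- 11.292   # law SE, column (2)
df_demeaned <- 1166     # residual df reported for column (2)
df_within   <- 1115     # residual df reported for columns (1) and (3)

# Rescale the residual variance from the naive df to the within df
se_corrected <- se_demeaned * sqrt(df_demeaned / df_within)

print(round(se_corrected, 3))
```

The result should land very close to the 11.548 reported in columns (1) and (3), up to rounding in the table.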

What are the fixed effects controlling for?

The fixed effects control for time-invariant characteristics of each state, meaning any factors that are constant over time within a state. Examples include:

  • Cultural differences

  • Geography

  • Historical crime rates

  • Long-term institutional differences between states

This allows the model to focus on the relationship between the explanatory variables (law, prisoners, income) and violent crime rates within each state over time.
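
To see why time-invariant characteristics drop out, consider what the within transformation does to a variable that never changes inside a state. The sketch below uses a made-up three-state panel; the `geography` and `income` values are invented for illustration.

```r
# Toy panel: 3 hypothetical states, 4 years each
state <- rep(c("A", "B", "C"), each = 4)

# A made-up time-invariant state trait (for example, a geography index)
geography <- unname(c(A = 1.5, B = -0.25, C = 4.0)[state])

# A made-up time-varying regressor
income <- c(10, 11, 12, 13,  20, 19, 21, 22,  5, 6, 6, 8)

# Within transformation: subtract each state's own mean
geography_within <- geography - ave(geography, state)
income_within    <- income - ave(income, state)

print(geography_within)  # every entry is zero
print(income_within)     # within-state variation survives
```

Because the demeaned `geography` column is all zeros, it cannot explain any variation in the demeaned outcome, which is why fixed effects absorb such characteristics without needing to observe them.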

Do you get the same coefficient if you specify the Fixed Effect in an alternative way?

Yes. All three methods are mathematically equivalent ways of estimating fixed effects in a linear panel data model, so they should give the same coefficients. The comparison above confirms this: each of the three methods produced the same estimates.
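
The equivalence follows from the Frisch-Waugh-Lovell theorem: regressing on entity dummies gives the same slopes as first removing entity means from every variable. A minimal sketch on simulated data (the panel size and the true slope of 1.5 are assumptions, not taken from the Guns data):

```r
set.seed(42)

# Simulated balanced panel: 10 hypothetical entities, 8 periods each
id <- factor(rep(1:10, each = 8))
x  <- rnorm(80) + as.numeric(id)            # regressor correlated with entity
y  <- 1.5 * x + 2 * as.numeric(id) + rnorm(80)

# Method 1: least squares with entity dummies
b_dummy <- unname(coef(lm(y ~ x + id))["x"])

# Method 2: demean y and x by entity, then regress without an intercept
y_dm <- y - ave(y, id)
x_dm <- x - ave(x, id)
b_demeaned <- unname(coef(lm(y_dm ~ x_dm - 1))["x_dm"])

print(c(dummy = b_dummy, demeaned = b_demeaned))
```

The two slopes should agree up to floating-point error.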

Two-way fixed effects (state and time)

The models above account only for entity fixed effects, but including both time and entity fixed effects can be beneficial:

  • Time fixed effects: Control for factors that change over time but affect all entities (macroeconomic impacts or national policy changes).

  • Entity fixed effects: Control for characteristics that are constant over time within each entity (state).

Once year effects are included, the coefficients change from the one-way model, because we are now also controlling for factors that vary over time but are common to all states. The magnitude of the law coefficient shrinks substantially, from -21.991 to -1.707, and it is no longer statistically significant even at the 10% level.
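
For a balanced panel, two-way fixed effects can also be written as a demeaning step: subtract the entity mean and the time mean, then add back the grand mean. The sketch below checks this against the dummy-variable version on simulated data. The panel size and true slope of 1 are assumptions, and `two_way_demean` is a hypothetical helper name. As far as I know, `plm()` also accepts `effect = "twoways"` as an alternative to adding year dummies by hand, though that is not what this document uses.

```r
set.seed(7)

# Simulated balanced panel: 12 hypothetical entities, 6 periods
n_id <- 12
n_t  <- 6
id   <- factor(rep(seq_len(n_id), each = n_t))
time <- factor(rep(seq_len(n_t), times = n_id))

# Regressor and outcome with both entity and time shocks; true slope is 1
x <- rnorm(n_id * n_t) + as.numeric(id) + 0.5 * as.numeric(time)
y <- 1 * x + 2 * as.numeric(id) - 3 * as.numeric(time) + rnorm(n_id * n_t)

# Two-way within transformation (this simple form assumes a balanced panel)
two_way_demean <- function(v) v - ave(v, id) - ave(v, time) + mean(v)

# Version 1: entity and time dummies
b_dummies <- unname(coef(lm(y ~ x + id + time))["x"])

# Version 2: regress the two-way demeaned outcome on the demeaned regressor
y_tw <- two_way_demean(y)
x_tw <- two_way_demean(x)
b_demeaned <- unname(coef(lm(y_tw ~ x_tw - 1))["x_tw"])

print(c(dummies = b_dummies, two_way_demeaned = b_demeaned))
```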

# Two-way fixed effects via OLS with state and year dummies
ols_model_SY <- lm(formula = violent ~ law + prisoners + 
                     income + population + male + afam + cauc + 
                     as.factor(state) + as.factor(year),
                   data    = guns)

# estimate the fixed effects regression with plm()
ols_model_withinY <- plm(formula = violent ~ law + prisoners + 
                           income + population + male + afam + 
                           cauc + as.factor(year), 
                data = guns,
                index   = c("state", "year"), 
                model   = "within")

# summary

stargazer(ols_model_within,     # State - One Way FE (plm)
          ols_model_SY,         # State - Time (Two Way) OLS with dummies
          ols_model_withinY,    # State - Time (Two Way) FE (plm)
          type  = "text",
          column.labels = c("FE - One way (state)",
                            "FE - Two way (state-time) OLS",
                            "FE - Two way (state-time) PLM"),
          title = "2 way Regression Results: Comparison of Fixed Effects Models",
          omit = c("state","year"))

2 way Regression Results: Comparison of Fixed Effects Models
========================================================================================================
                                                    Dependent variable:                                 
                    ------------------------------------------------------------------------------------
                                                          violent                                       
                             panel                        OLS                          panel            
                             linear                                                   linear            
                      FE - One way (state)   FE - Two way (state-time) OLS FE - Two way (state-time) PLM
                              (1)                         (2)                           (3)             
--------------------------------------------------------------------------------------------------------
lawyes                      -21.991*                    -1.707                        -1.707            
                            (11.548)                   (10.410)                      (10.410)           
                                                                                                        
prisoners                   0.212***                   0.267***                      0.267***           
                            (0.043)                     (0.041)                       (0.041)           
                                                                                                        
income                       -0.004                     0.008**                       0.008**           
                            (0.004)                     (0.004)                       (0.004)           
                                                                                                        
population                  10.143*                      1.205                         1.205            
                            (5.319)                     (4.759)                       (4.759)           
                                                                                                        
male                       -21.564***                  45.848***                     45.848***          
                            (3.862)                     (9.459)                       (9.459)           
                                                                                                        
afam                         -3.949                   -42.638***                    -42.638***          
                            (9.364)                    (13.406)                      (13.406)           
                                                                                                        
cauc                        9.644***                   -10.033**                     -10.033**          
                            (3.113)                     (4.753)                       (4.753)           
                                                                                                        
Constant                                                379.268                                         
                                                       (286.651)                                        
                                                                                                        
--------------------------------------------------------------------------------------------------------
Observations                 1,173                       1,173                         1,173            
R2                           0.197                       0.940                         0.418            
Adjusted R2                  0.156                       0.935                         0.376            
Residual Std. Error                               85.026 (df = 1093)                                    
F Statistic         39.178*** (df = 7; 1115)  215.469*** (df = 79; 1093)     27.067*** (df = 29; 1093)  
========================================================================================================
Note:                                                                        *p<0.1; **p<0.05; ***p<0.01
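
The change in significance for lawyes between the one-way and two-way models can be checked by hand from the reported coefficients and standard errors. The sketch below assumes a two-sided t test with the residual degrees of freedom shown in the F statistic rows; `p_from_t` is a hypothetical helper name.

```r
# Coefficients, standard errors, and residual df copied from the table above
law_one_way <- c(coef = -21.991, se = 11.548, df = 1115)
law_two_way <- c(coef = -1.707,  se = 10.410, df = 1093)

# Two-sided p-value from a t statistic
p_from_t <- function(m) {
  t_stat <- m[["coef"]] / m[["se"]]
  2 * pt(-abs(t_stat), df = m[["df"]])
}

p_one_way <- p_from_t(law_one_way)
p_two_way <- p_from_t(law_two_way)

print(round(c(one_way = p_one_way, two_way = p_two_way), 3))
```

The one-way p-value should fall between 0.05 and 0.10, consistent with the single star in column (1), while the two-way p-value should be far above 0.10, consistent with no stars in columns (2) and (3).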