The objective of this report is to replicate a controversial finding from the article “The Impact of Legalized Abortion on Crime”, written by S. Levitt and J. Donohue in 2001. The article offers evidence that legalized abortion contributed significantly to recent crime reductions.
To begin the analysis, the required packages are loaded into R. The dataset is then imported so that the data cleaning can start.
# Set working directory
setwd("C:/Users/olive/Desktop/R Assignment")
# Load packages
library(dplyr)
library(plm)
library(stargazer)
# Read dataset
df <- read.table("levitt_ex.dat",
                 header = TRUE,
                 sep = "\t")
# Check dataset basic info
str(df)
## 'data.frame': 1734 obs. of 17 variables:
## $ statenum: int 1 1 1 1 1 1 1 1 1 1 ...
## $ year : int 66 67 68 69 70 71 72 73 74 75 ...
## $ popul : num NA NA NA NA 3452 ...
## $ lpc_viol: num NA NA NA NA NA ...
## $ lpc_prop: num NA NA NA NA NA ...
## $ lpc_murd: num NA NA NA NA NA ...
## $ efamurd : num NA NA NA NA NA NA NA NA NA NA ...
## $ efaviol : num NA NA NA NA NA NA NA NA NA NA ...
## $ efaprop : num NA NA NA NA NA NA NA NA NA NA ...
## $ xxprison: num NA NA NA NA NA ...
## $ xxpolice: num NA NA NA NA NA ...
## $ xxunemp : num NA NA NA NA NA ...
## $ xxincome: num NA NA NA NA NA ...
## $ xxpover : num NA NA NA NA NA NA NA NA NA NA ...
## $ xxafdc15: num NA NA NA NA NA NA NA NA NA NA ...
## $ xxgunlaw: int 1 1 1 1 1 1 1 1 1 1 ...
## $ xxbeer : num NA NA NA NA NA NA NA NA NA NA ...
head(df)
##   statenum year popul lpc_viol lpc_prop lpc_murd efamurd efaviol efaprop
## 1        1   66    NA       NA       NA       NA      NA      NA      NA
## 2        1   67    NA       NA       NA       NA      NA      NA      NA
## 3        1   68    NA       NA       NA       NA      NA      NA      NA
## 4        1   69    NA       NA       NA       NA      NA      NA      NA
## 5        1   70  3452       NA       NA       NA      NA      NA      NA
## 6        1   71  3487       NA       NA       NA      NA      NA      NA
##   xxprison xxpolice xxunemp xxincome xxpover xxafdc15 xxgunlaw xxbeer
## 1       NA       NA      NA       NA      NA       NA        1     NA
## 2       NA       NA      NA       NA      NA       NA        1     NA
## 3       NA       NA      NA       NA      NA       NA        1     NA
## 4       NA       NA      NA       NA      NA       NA        1     NA
## 5       NA       NA      NA       NA      NA       NA        1     NA
## 6       NA       NA      NA       NA      NA       NA        1     NA
# Check for missing values
summary(is.na(df))
##  statenum        year            popul           lpc_viol
##  Mode :logical   Mode :logical   Mode :logical   Mode :logical
##  FALSE:1734      FALSE:1734      FALSE:1530      FALSE:1320
##                                  TRUE :204       TRUE :414
##  lpc_prop        lpc_murd        efamurd         efaviol
##  Mode :logical   Mode :logical   Mode :logical   Mode :logical
##  FALSE:1320      FALSE:1320      FALSE:663       FALSE:663
##  TRUE :414       TRUE :414       TRUE :1071      TRUE :1071
##  efaprop         xxprison        xxpolice        xxunemp
##  Mode :logical   Mode :logical   Mode :logical   Mode :logical
##  FALSE:663       FALSE:1275      FALSE:1275      FALSE:1326
##  TRUE :1071      TRUE :459       TRUE :459       TRUE :408
##  xxincome        xxpover         xxafdc15        xxgunlaw
##  Mode :logical   Mode :logical   Mode :logical   Mode :logical
##  FALSE:1326      FALSE:918       FALSE:714       FALSE:1683
##  TRUE :408       TRUE :816       TRUE :1020      TRUE :51
##  xxbeer
##  Mode :logical
##  FALSE:714
##  TRUE :1020
# Data cleaning: keep only the years 1985 to 1997 (coded 85 to 97)
df <- filter(df, year >= 85 & year <= 97)
# Check for missing values
summary(is.na(df))
## statenum year popul lpc_viol
## Mode :logical Mode :logical Mode :logical Mode :logical
## FALSE:663 FALSE:663 FALSE:663 FALSE:663
## lpc_prop lpc_murd efamurd efaviol
## Mode :logical Mode :logical Mode :logical Mode :logical
## FALSE:663 FALSE:663 FALSE:663 FALSE:663
## efaprop xxprison xxpolice xxunemp
## Mode :logical Mode :logical Mode :logical Mode :logical
## FALSE:663 FALSE:663 FALSE:663 FALSE:663
## xxincome xxpover xxafdc15 xxgunlaw
## Mode :logical Mode :logical Mode :logical Mode :logical
## FALSE:663 FALSE:663 FALSE:663 FALSE:663
## xxbeer
## Mode :logical
## FALSE:663
The dataset contains 17 variables, all of them numeric. The independent variables represent prisoners and police per capita, economic conditions, lagged welfare generosity, concealed handgun laws, beer consumption and the unemployment rate.
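After filtering, every column has 663 non-missing rows. Since 663 = 51 × 13, this is consistent with 51 values of `statenum` each observed in all 13 years, although the output above does not show the number of distinct states directly. A minimal sketch of how such a balance check could be written, using a small made-up panel (the `toy` data frame and its 3 states are assumptions for illustration, not the report's data):

```r
# Hypothetical toy panel: 3 made-up states observed in the years 85 to 97
toy <- expand.grid(statenum = 1:3, year = 85:97)

# Count the rows available for each state
rows_per_state <- as.integer(table(toy$statenum))

n_states <- length(rows_per_state)
n_years <- length(unique(toy$year))

# Balanced panel: same row count in every state, and rows = states * years
is_balanced <- length(unique(rows_per_state)) == 1 &&
  nrow(toy) == n_states * n_years
print(is_balanced)
```

Replacing `toy` with `df` would run the same check on the filtered dataset.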
In this part of the project, the regression models shown in Table 4 of the article are replicated. Six models are estimated: one with and one without controls for each crime category. The crime categories are violent crime per capita, property crime per capita and murder per capita.
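As a reading aid, the specification behind these six models can be written as a two-way fixed effects regression. The mapping of symbols to columns is an assumption based on the variable names in the dataset: the outcome is taken to be one of `lpc_viol`, `lpc_prop` or `lpc_murd`, the abortion measure the matching `efaviol`, `efaprop` or `efamurd`, and the controls the `xx*` columns.

```latex
$$
\ln(\mathrm{CRIME}_{st}) = \beta_1 \,\mathrm{ABORT}_{st} + X_{st}\Theta + \gamma_s + \lambda_t + \varepsilon_{st}
$$
```

Here $s$ indexes states and $t$ years, $\gamma_s$ and $\lambda_t$ are state and year fixed effects, and $X_{st}$ is the vector of controls. The models without controls drop the $X_{st}\Theta$ term.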
# Violent crime per capita regression models
dfViolent <-
subset(df, select = -c(lpc_murd, lpc_prop, efamurd, efaprop))
# Apply regression
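# Two-way (state and year) fixed effects, no controls.
# plm expects index = c(individual, time).
modelViolent_1 <-
  plm(
    lpc_viol ~ efaviol,
    data = dfViolent,
    model = "within",
    effect = "twoways",
    index = c("statenum", "year")
  )
# Same specification with the control variables added. statenum and year
# are not listed as regressors: the fixed effects already absorb them.
modelViolent_2 <-
  plm(
    lpc_viol ~ efaviol + xxprison + xxpolice + xxunemp + xxincome +
      xxpover + xxafdc15 + xxgunlaw + xxbeer,
    data = dfViolent,
    model = "within",
    effect = "twoways",
    index = c("statenum", "year")
  )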
# Property crime per capita regression models
dfProperty <-
subset(df, select = -c(lpc_murd, lpc_viol, efamurd, efaviol))
# Apply regression models
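# Two-way (state and year) fixed effects, no controls.
# plm expects index = c(individual, time).
modelProperty_1 <-
  plm(
    lpc_prop ~ efaprop,
    data = dfProperty,
    model = "within",
    effect = "twoways",
    index = c("statenum", "year")
  )
# Same specification with the control variables added. statenum and year
# are not listed as regressors: the fixed effects already absorb them.
modelProperty_2 <-
  plm(
    lpc_prop ~ efaprop + xxprison + xxpolice + xxunemp + xxincome +
      xxpover + xxafdc15 + xxgunlaw + xxbeer,
    data = dfProperty,
    model = "within",
    effect = "twoways",
    index = c("statenum", "year")
  )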
# Murder per capita regression models
dfMurder <-
subset(df, select = -c(lpc_viol, lpc_prop, efaviol, efaprop))
# Apply regression
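# Two-way (state and year) fixed effects, no controls.
# plm expects index = c(individual, time).
modelMurder_1 <-
  plm(
    lpc_murd ~ efamurd,
    data = dfMurder,
    model = "within",
    effect = "twoways",
    index = c("statenum", "year")
  )
# Same specification with the control variables added. statenum and year
# are not listed as regressors: the fixed effects already absorb them.
modelMurder_2 <-
  plm(
    lpc_murd ~ efamurd + xxprison + xxpolice + xxunemp + xxincome +
      xxpover + xxafdc15 + xxgunlaw + xxbeer,
    data = dfMurder,
    model = "within",
    effect = "twoways",
    index = c("statenum", "year")
  )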
# Print results
stargazer(modelViolent_1,
modelProperty_1,
modelMurder_1,
type = "text")
##
## =======================================================
##                                Dependent variable:
##                           -----------------------------
##                           lpc_viol  lpc_prop  lpc_murd
##                              (1)       (2)       (3)
## -------------------------------------------------------
## efaviol                   -0.039***
##                            (0.012)
##
## efaprop                             -0.016**
##                                      (0.006)
##
## efamurd                                         0.022
##                                                (0.025)
##
## -------------------------------------------------------
## Observations                 663       663       663
## R2                          0.018     0.010     0.001
## Adjusted R2                -0.086    -0.094    -0.104
## F Statistic (df = 1; 599) 10.719***  6.306**    0.780
## =======================================================
## Note:                       *p<0.1; **p<0.05; ***p<0.01
stargazer(modelViolent_2,
modelProperty_2,
modelMurder_2,
type = "text")
##
## ==========================================================
##                                  Dependent variable:
##                            -------------------------------
##                            lpc_viol   lpc_prop   lpc_murd
##                               (1)        (2)        (3)
## ----------------------------------------------------------
## efaviol                     -0.018
##                             (0.014)
##
## efaprop                                -0.006
##                                        (0.007)
##
## efamurd                                            0.039
##                                                   (0.030)
##
## xxprison                    -0.050    -0.158***   -0.153*
##                             (0.043)    (0.027)    (0.083)
##
## xxpolice                     0.040      0.005      0.216*
##                             (0.062)    (0.039)    (0.121)
##
## xxunemp                     -0.104     1.662***   -0.764
##                             (0.530)    (0.333)    (1.027)
##
## xxincome                    0.620***   0.622***   -0.066
##                             (0.193)    (0.121)    (0.375)
##
## xxpover                     -0.002     -0.001      0.002
##                             (0.003)    (0.002)    (0.005)
##
## xxafdc15                   0.00001**  0.00001**  -0.00001
##                            (0.00001)  (0.00000)  (0.00001)
##
## xxgunlaw                     0.005     0.049***    0.030
##                             (0.018)    (0.011)    (0.034)
##
## xxbeer                      0.014***   0.014***    0.006
##                             (0.004)    (0.002)    (0.007)
##
## ----------------------------------------------------------
## Observations                  663        663        663
## R2                           0.114      0.254      0.021
## Adjusted R2                  0.007      0.165     -0.097
## F Statistic (df = 9; 591)   8.439***  22.416***    1.408
## ==========================================================
## Note:                          *p<0.1; **p<0.05; ***p<0.01
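The models above are estimated with `model = "within"` and `effect = "twoways"`. For a balanced panel, this estimator gives the same slope as ordinary least squares with one dummy per unit and per period. The sketch below illustrates that equivalence in base R on simulated data. The `toy` data frame, the `demean2` helper and all numbers are hypothetical and are not part of the report's dataset.

```r
set.seed(1)

# Hypothetical toy panel: 5 units observed over 4 periods (balanced)
toy <- expand.grid(unit = 1:5, period = 1:4)
toy$x <- rnorm(nrow(toy))
toy$y <- 0.5 * toy$x + toy$unit + 0.1 * toy$period +
  rnorm(nrow(toy), sd = 0.1)

# (a) Least-squares dummy variables: one dummy per unit and per period
lsdv <- lm(y ~ x + factor(unit) + factor(period), data = toy)

# (b) Two-way within transformation for a balanced panel:
# subtract unit means and period means, then add back the grand mean
demean2 <- function(v, g1, g2) v - ave(v, g1) - ave(v, g2) + mean(v)
toy$x_w <- demean2(toy$x, toy$unit, toy$period)
toy$y_w <- demean2(toy$y, toy$unit, toy$period)
within_fit <- lm(y_w ~ x_w - 1, data = toy)

# The slope on x should agree between the two approaches
b_lsdv <- unname(coef(lsdv)["x"])
b_within <- unname(coef(within_fit)["x_w"])
print(all.equal(b_lsdv, b_within))
```

The dummy-variable form makes explicit why `statenum` and `year` cannot enter as extra regressors: they are constant within each fixed-effect group and are fully absorbed.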
LASSO is a regression analysis method that performs both variable selection and regularization in order to enhance the prediction accuracy and interpretability of the statistical model.
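For reference, the LASSO estimate in its usual textbook form is shown below. This is the generic definition, not a specification taken from the article. $n$ is the number of observations, $p$ the number of predictors and $\lambda \ge 0$ the penalty weight.

```latex
$$
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta_0,\,\beta}
\left\{ \frac{1}{2n} \sum_{i=1}^{n} \bigl( y_i - \beta_0 - x_i^{\top}\beta \bigr)^2
+ \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\}
$$
```

The absolute-value penalty can shrink some coefficients exactly to zero, which is what gives the method its variable selection property. Larger values of $\lambda$ remove more variables.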