HW5 Answers

PAF 573

Elaine MacPherson

Load libraries

library(tidyverse)
library(jtools)
library(gridExtra)  # contains the grid.arrange function

Read in data

# read in data 
URL <- "https://raw.githubusercontent.com/spiromar/files/main/paf573/data-crime-levitt.csv"
crime <- read.csv( URL )

Problem 1

Import the crime data from Levitt’s paper, entitled “Using Electoral Cycles in Police Hiring to Estimate the Effect of Police on Crime.” In this problem we are going to again examine the relationship between the size of the police force and the murder rate, only this time we will use regressions that log-transform some of the key variables.

1.1

Estimate and plot the following regression using only the 1992 data:

workingData <- crime %>% filter(year==92)
mLevelLevel <- lm(murder ~ sworn , data=workingData)
summ(mLevelLevel, robust=T, digits = 4)
## MODEL INFO:
## Observations: 57 (2 missing obs. deleted)
## Dependent Variable: murder
## Type: OLS linear regression 
## 
## MODEL FIT:
## F(1,55) = 60.3490, p = 0.0000
## R² = 0.5232
## Adj. R² = 0.5145 
## 
## Standard errors: Robust, type = HC3
## -------------------------------------------------------
##                        Est.     S.E.    t val.        p
## ----------------- --------- -------- --------- --------
## (Intercept)         -3.7655   2.7476   -1.3705   0.1761
## sworn                0.1039   0.0118    8.8149   0.0000
## -------------------------------------------------------
effect_plot(mLevelLevel, pred=sworn, plot.points = T) # plot regression

## My answer, copied from my script, is below. By the way, it took me an hour to figure out why my code wasn’t working. I had to check the R version, close and reopen R, update it, and copy, paste, and re-run all of the packages. After all of that, the problem was that the package name (sandwich) just needed quotes around it inside the parentheses.

MODEL INFO:
Observations: 57 (2 missing obs. deleted)
Dependent Variable: murder
Type: OLS linear regression

MODEL FIT:
F(1,55) = 60.3490, p = 0.0000
R² = 0.5232
Adj. R² = 0.5145

Standard errors: Robust, type = HC3

                 Est.     S.E.    t val.        p
(Intercept)   -3.7655   2.7476   -1.3705   0.1761
sworn          0.1039   0.0118    8.8149   0.0000

Interpret the coefficient on sworn.

ANSWER: In the 1992 data, each one-unit increase in sworn is associated with an increase of about 0.10 (0.1039) in murder, on average. The coefficient is statistically significant (the reported p-value is 0.0000).

1.2

Create two new variables in the crime dataset: one new variable for the natural log of murder (call it ln_murder) and another new variable for the natural log of sworn (call it ln_sworn). Note: You will need to recreate your workingData in order for it to include the new variables you create in the crime dataset.

crime <- crime %>% mutate(ln_murder = log(murder),ln_sworn = log(sworn))
workingData <- crime %>% filter(year==92)

Use ggplot to create a histogram of each of the 2 new variables, as well as 2 other histograms for the untransformed versions of the variables. Use only the 1992 data. Use grid.arrange to show all 4 histograms together in one figure.

ANSWER: Can you see the image in this knit? I ran the code in my R script, so I’m not sure how it would show up here. I will attach both so you can look.
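The histogram code itself is not shown in this knit. The following is a minimal sketch of one way the figure in 1.2 could be produced, assuming the same URL and column names used earlier. The helper name plot_hist and the choice of 15 bins are assumptions made for illustration, not part of the assignment.

```r
library(tidyverse)
library(gridExtra)

# Same data source as in the "Read in data" step
URL <- "https://raw.githubusercontent.com/spiromar/files/main/paf573/data-crime-levitt.csv"
crime <- read.csv(URL) %>%
  mutate(ln_murder = log(murder), ln_sworn = log(sworn))
workingData <- crime %>% filter(year == 92)

# Hypothetical helper: draw one histogram for the column named in `var`
plot_hist <- function(data, var) {
  ggplot(data, aes(x = .data[[var]])) +
    geom_histogram(bins = 15, na.rm = TRUE) +  # 15 bins is an arbitrary choice
    labs(title = var, x = var, y = "Count")
}

# Untransformed and logged versions of each variable
vars <- c("murder", "ln_murder", "sworn", "ln_sworn")
hists <- lapply(vars, plot_hist, data = workingData)

# Arrange all 4 histograms in a 2 x 2 grid
fig <- grid.arrange(grobs = hists, ncol = 2)
```

Placing this code in a chunk of the .Rmd file, rather than running it only in a separate R script, is what makes the figure appear in the knitted document.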

1.3

Estimate and plot the same regression as in 1.1 using only the 1992 data, only this time use the natural log of sworn:

mLevelLog <- lm(murder ~ ln_sworn , data=workingData)
summ(mLevelLog, robust=T, digits = 4)
## MODEL INFO:
## Observations: 57 (2 missing obs. deleted)
## Dependent Variable: murder
## Type: OLS linear regression 
## 
## MODEL FIT:
## F(1,55) = 49.3250, p = 0.0000
## R² = 0.4728
## Adj. R² = 0.4632 
## 
## Standard errors: Robust, type = HC3
## ----------------------------------------------------------
##                          Est.      S.E.    t val.        p
## ----------------- ----------- --------- --------- --------
## (Intercept)         -133.0899   31.3279   -4.2483   0.0001
## ln_sworn              28.4859    5.8527    4.8672   0.0000
## ----------------------------------------------------------
effect_plot(mLevelLog, pred=ln_sworn, plot.points = T)

MODEL INFO:
Observations: 1332 (25 missing obs. deleted)
Dependent Variable: murder
Type: OLS linear regression

MODEL FIT:
F(1,1330) = 677.8101, p = 0.0000
R² = 0.3376
Adj. R² = 0.3371

Standard errors: Robust, type = HC3

                  Est.     S.E.     t val.        p
(Intercept)   -83.3621   4.5746   -18.2227   0.0000
ln_sworn       18.9031   0.8705    21.7154   0.0000

Interpret the coefficient on sworn.

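ANSWER: I was confused at first, because I expected the log transformation to give a coefficient between 0 and 1, but nothing requires that. When only the predictor is logged, the coefficient is read per percent change in the predictor: a 1% increase in sworn is associated with a change in murder of roughly the coefficient divided by 100. Using the 1992 regression (coefficient 28.4859), that is an increase of about 0.28 in murder for a 1% increase in sworn. Note that the output I copied from my script has 1332 observations, so it was run on all years rather than on 1992 only; with that coefficient (18.9031) the increase would be about 0.19. It does not mean that murder increases almost 19 times with each additional sworn police officer.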
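The arithmetic behind reading a coefficient on a logged predictor can be sketched as follows. This is only an illustration: it hardcodes the ln_sworn coefficient from the 1992 output above, and the variable names are made up for the example.

```r
# Coefficient on ln_sworn, copied from the 1992 regression output
b_ln_sworn <- 28.4859

# Exact change in predicted murder when sworn rises by 1%:
# b * [log(1.01 * sworn) - log(sworn)] = b * log(1.01)
change_1pct <- b_ln_sworn * log(1.01)

# Common shortcut: divide the coefficient by 100
shortcut <- b_ln_sworn / 100

round(c(exact = change_1pct, shortcut = shortcut), 4)
```

Both versions come out to roughly 0.28, which is why the divide-by-100 rule is usually good enough for small percent changes.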

1.4

Note that the regression plot in 1.3 uses ln_sworn on the x-axis. To get a better understanding of what the log transformation of the sworn predictor did to the regression, it is useful to recreate the plot using the same regression model, only this time keeping the x-axis on the same scale as sworn. (Do you mean keeping the x-axis on the non-log sworn scale instead of the ln_sworn scale? This was confusing.) effect_plot will do this for you if you give it a regression in which you used the I() function to transform the variable, rather than the ln_sworn variable. The steps are as follows:

mLevelLog2 <- lm(murder ~ I(log(sworn)) , data=workingData)
summ(mLevelLog2, robust=T, digits = 4)
## MODEL INFO:
## Observations: 57 (2 missing obs. deleted)
## Dependent Variable: murder
## Type: OLS linear regression 
## 
## MODEL FIT:
## F(1,55) = 49.3250, p = 0.0000
## R² = 0.4728
## Adj. R² = 0.4632 
## 
## Standard errors: Robust, type = HC3
## ------------------------------------------------------------
##                            Est.      S.E.    t val.        p
## ------------------- ----------- --------- --------- --------
## (Intercept)           -133.0899   31.3279   -4.2483   0.0001
## I(log(sworn))           28.4859    5.8527    4.8672   0.0000
## ------------------------------------------------------------
# notice that you have to give it the data argument 
effect_plot(mLevelLog2, pred=sworn, plot.points = T, data=workingData)

What did taking the natural log of sworn do to the modeling relationship between sworn and murder?

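ANSWER: It made the fitted line a curve! The model is still an OLS linear regression (not a generalized linear model), because it is linear in log(sworn). Plotted against sworn itself, though, a straight line in log(sworn) becomes a logarithmic curve rather than an s-curve. The slope changes at each point: it equals the coefficient divided by sworn, so each additional unit of sworn is associated with a smaller increase in murder as sworn gets larger.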
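To see why the plot bends, it may help to compute the slope at a few points. The sketch below hardcodes the coefficient from the 1.4 output; the three sworn values are hypothetical and chosen only for illustration.

```r
# Coefficient on I(log(sworn)), copied from the output above
b_log_sworn <- 28.4859

# Hypothetical values of sworn, each double the previous one
sworn_values <- c(150, 300, 600)

# For murder = a + b * log(sworn), the slope with respect to sworn is b / sworn
slope <- b_log_sworn / sworn_values

# Each doubling of sworn adds the same amount to predicted murder: b * log(2)
gain_per_doubling <- b_log_sworn * log(2)

data.frame(sworn = sworn_values, slope = slope)
gain_per_doubling
```

The slope shrinks as sworn grows, while every doubling of sworn adds the same amount to predicted murder. That pattern is the curve that effect_plot draws on the original sworn scale.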

1.5

Estimate the same regression as in 1.1 using the 1992 data, only this time use the natural log of murder. Interpret the coefficient on sworn.

#output copied below:

MODEL INFO:
Observations: 57 (2 missing obs. deleted)
Dependent Variable: ln_murder
Type: OLS linear regression

MODEL FIT:
F(1,55) = 34.8776, p = 0.0000
R² = 0.3881
Adj. R² = 0.3769

Standard errors: Robust, type = HC3

                Est.     S.E.   t val.        p
(Intercept)   1.8663   0.2577   7.2425   0.0000
sworn         0.0041   0.0010   3.9826   0.0002

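ANSWER: The coefficient falls from 0.1039 to 0.0041, but the two are not directly comparable because the dependent variable is now ln_murder rather than murder. With a logged outcome, the coefficient is read as an approximate percent change: each one-unit increase in sworn is associated with about a 0.41% increase in murder (100 × 0.0041), on average.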
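The percent-change reading of a coefficient in a regression with a logged outcome can be checked with a short calculation. This sketch hardcodes the sworn coefficient from the output above; the variable names are illustrative only.

```r
# Coefficient on sworn when the outcome is ln_murder (from the output above)
b_sworn <- 0.0041

# Exact percent change in murder for a one-unit increase in sworn
pct_exact <- 100 * (exp(b_sworn) - 1)

# Approximation that works well when the coefficient is small
pct_approx <- 100 * b_sworn

round(c(exact = pct_exact, approx = pct_approx), 3)
```

With a coefficient this small the exact and approximate values are nearly identical, both about 0.41%.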

1.6

Estimate and plot the same regression as in 1.1 using the 1992 data, only this time use both the natural log of murder and the natural log of sworn. Interpret the coefficient on sworn.

#output copied below:

MODEL INFO:
Observations: 57 (2 missing obs. deleted)
Dependent Variable: ln_murder
Type: OLS linear regression

MODEL FIT:
F(1,55) = 39.3092, p = 0.0000
R² = 0.4168
Adj. R² = 0.4062

Standard errors: Robust, type = HC3

                 Est.     S.E.    t val.        p
(Intercept)   -3.7422   1.1484   -3.2585   0.0019
ln_sworn       1.2153   0.2073    5.8622   0.0000

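ANSWER: Now that both variables are logged, the coefficient is an elasticity: a 1% increase in sworn is associated with about a 1.22% increase in murder, on average. It is not a 1.21-unit increase in murder for each additional sworn officer.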
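The code for 1.5 and 1.6 is not shown in this knit, only the copied output. The following is a sketch of calls that would be expected to produce output of that form, assuming the same URL and variables used earlier. The model names mLogLevel and mLogLog are hypothetical, chosen to match the naming pattern of mLevelLevel and mLevelLog.

```r
library(tidyverse)
library(jtools)

# Same data source and log variables as in 1.2
URL <- "https://raw.githubusercontent.com/spiromar/files/main/paf573/data-crime-levitt.csv"
crime <- read.csv(URL) %>%
  mutate(ln_murder = log(murder), ln_sworn = log(sworn))
workingData <- crime %>% filter(year == 92)

# 1.5: logged outcome, predictor in levels
mLogLevel <- lm(ln_murder ~ sworn, data = workingData)
summ(mLogLevel, robust = TRUE, digits = 4)

# 1.6: logged outcome and logged predictor
mLogLog <- lm(ln_murder ~ ln_sworn, data = workingData)
summ(mLogLog, robust = TRUE, digits = 4)
effect_plot(mLogLog, pred = ln_sworn, plot.points = TRUE)

# Elasticity reading: percent change in murder for a 1% increase in sworn
b_elasticity <- coef(mLogLog)[["ln_sworn"]]
pct_change <- 100 * (1.01^b_elasticity - 1)
pct_change
```

The last lines turn the ln_sworn coefficient into the percent change in murder associated with a 1% increase in sworn, which for a coefficient near 1.2 is close to the coefficient itself.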

Problem 2

Please redo the in-class activity on interaction models on your own, including recreating your own data from the simulation. You do not need to turn this in.