PAF 573
Elaine MacPherson
Import the crime data from Levitt’s paper, entitled “Using Electoral Cycles in Police Hiring to Estimate the Effect of Police on Crime.” In this problem we are going to again examine the relationship between the size of the police force and the murder rate, only this time we will use regressions that log-transform some of the key variables.
Estimate and plot the following regression using only the 1992 data:
workingData <- crime %>% filter(year==92)
mLevelLevel <- lm(murder ~ sworn , data=workingData)
summ(mLevelLevel, robust=T, digits = 4)
## MODEL INFO:
## Observations: 57 (2 missing obs. deleted)
## Dependent Variable: murder
## Type: OLS linear regression
##
## MODEL FIT:
## F(1,55) = 60.3490, p = 0.0000
## R² = 0.5232
## Adj. R² = 0.5145
##
## Standard errors: Robust, type = HC3
## -------------------------------------------------------
## Est. S.E. t val. p
## ----------------- --------- -------- --------- --------
## (Intercept) -3.7655 2.7476 -1.3705 0.1761
## sworn 0.1039 0.0118 8.8149 0.0000
## -------------------------------------------------------
##
My answer, copied from my script, is below. By the way, it took me AN
HOUR to figure out why my code wasn’t working. I had to check the R
version, close and reopen it, update it, and copy, paste, and re-run ALL
of the packages, and after ALL OF THAT the problem was that the package
name (sandwich) just needed quotes around it inside the parentheses.
MODEL INFO: Observations: 57 (2 missing obs. deleted) Dependent Variable: murder Type: OLS linear regression
MODEL FIT: F(1,55) = 60.3490, p = 0.0000 R² = 0.5232 Adj. R² = 0.5145
| | Est. | S.E. | t val. | p |
|---|---|---|---|---|
| (Intercept) | -3.7655 | 2.7476 | -1.3705 | 0.1761 |
| sworn | 0.1039 | 0.0118 | 8.8149 | 0.0000 |
Interpret the coefficient on sworn.
ANSWER: Based on the coefficient listed here, in the 1992 data each one-unit increase in sworn is associated with an increase of about .10 (0.1039) in murder, on average.
Create two new variables in the crime dataset: one new variable for the natural log of murder (call it ln_murder) and another new variable for the natural log of sworn (call it ln_sworn). Note: You will need to recreate your workingData in order for it to include the new variables you create in the crime dataset.
crime <- crime %>% mutate(ln_murder = log(murder),ln_sworn = log(sworn))
workingData <- crime %>% filter(year==92)
Use ggplot to create a
histogram of each of the 2 new variables, as well as 2 other histograms
for the untransformed versions of the variables. Use only the 1992 data.
Use grid.arrange to show all 4 histograms together in one
figure.
ANSWER: Can you see the image in this knit? I ran the code in my R script, so I’m not sure whether it shows up here. I will attach both so you can look.
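One way the four histograms could be built and combined is sketched below. This is only an illustration: the real crime data is not reproduced here, so demoData is a simulated stand-in with made-up values, and the helper makeHist is a hypothetical name. It assumes the ggplot2 and gridExtra packages are installed.

```r
library(ggplot2)
library(gridExtra)

# Simulated stand-in for the 1992 rows of the crime data (values are made up).
set.seed(573)
demoData <- data.frame(sworn = rlnorm(57, meanlog = 5.3, sdlog = 0.4))
demoData$murder <- exp(-3.7 + 1.2 * log(demoData$sworn) + rnorm(57, sd = 0.5))
demoData$ln_sworn <- log(demoData$sworn)
demoData$ln_murder <- log(demoData$murder)

# Hypothetical helper: one histogram for the column named v.
# .data[[v]] looks the column up by its name.
makeHist <- function(v) {
  ggplot(demoData, aes(x = .data[[v]])) +
    geom_histogram(bins = 15) +
    labs(title = v)
}
histList <- lapply(c("murder", "ln_murder", "sworn", "ln_sworn"), makeHist)

# Show all four plots together in a 2 x 2 grid.
grid.arrange(grobs = histList, ncol = 2)
```

Putting this code inside a chunk of the .Rmd file (rather than only running it in a separate script) is what makes the figure appear in the knitted document.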
Estimate and plot the same regression as in 1.1 using only the 1992 data, only this time use the natural log of sworn:
## MODEL INFO:
## Observations: 57 (2 missing obs. deleted)
## Dependent Variable: murder
## Type: OLS linear regression
##
## MODEL FIT:
## F(1,55) = 49.3250, p = 0.0000
## R² = 0.4728
## Adj. R² = 0.4632
##
## Standard errors: Robust, type = HC3
## ----------------------------------------------------------
## Est. S.E. t val. p
## ----------------- ----------- --------- --------- --------
## (Intercept) -133.0899 31.3279 -4.2483 0.0001
## ln_sworn 28.4859 5.8527 4.8672 0.0000
## ----------------------------------------------------------
MODEL INFO: Observations: 1332 (25 missing obs. deleted) Dependent Variable: murder Type: OLS linear regression
MODEL FIT: F(1,1330) = 677.8101, p = 0.0000 R² = 0.3376 Adj. R² = 0.3371
| | Est. | S.E. | t val. | p |
|---|---|---|---|---|
| (Intercept) | -83.3621 | 4.5746 | -18.2227 | 0.0000 |
| ln_sworn | 18.9031 | 0.8705 | 21.7154 | 0.0000 |
Interpret the coefficient on sworn.
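ANSWER: I was confused by this at first, because I expected the log transformation to produce a coefficient between 0 and 1. It does not have to: in a level-log model, the coefficient divided by 100 is the approximate change in murder for a 1% increase in sworn. My copied output gives 18.9, but it used all 1332 observations rather than only the 1992 data; with that model, a 1% increase in sworn is associated with an increase of about 0.19 in murder. The 1992-only output above gives 28.49, or about 0.28 per 1% increase in sworn.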
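To make the level-log reading concrete, here is a small arithmetic sketch using the two ln_sworn coefficients reported above (28.4859 from the 57-observation output and 18.9031 from the 1332-observation output). The helper name levelLogChange is hypothetical, not something from jtools.

```r
# Coefficients copied from the two level-log outputs above.
b_1992     <- 28.4859   # 1992 only, 57 observations
b_allYears <- 18.9031   # all years, 1332 observations

# In murder = a + b * log(sworn), raising sworn by pct percent changes the
# predicted murder by b * log(1 + pct/100), which is close to b * pct / 100.
levelLogChange <- function(b, pct = 1) b * log(1 + pct / 100)

levelLogChange(b_1992)       # exact change in murder for a 1% increase in sworn
b_1992 / 100                 # the usual "divide by 100" approximation
levelLogChange(b_allYears)
b_allYears / 100
```

The coefficient itself is not bounded between 0 and 1; it is the change in murder for a one-unit change in ln(sworn), which is a very large proportional change in sworn.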
Note that the regression plot in 1.3 uses ln_sworn on the x-axis. To
get a better understanding of what the log transformation of the sworn
predictor did to the regression, it is useful to recreate the plot using
the same regression model, only this time keep the x-axis on the same
scale as sworn. (Do you mean keep the x-axis of the NON-log
sworn the same as the log-sworn? Confusing.) effect_plot
will do this for you if you give it a regression where you used the
I() function to transform the variable rather than the
ln_sworn variable. The steps are as follows:
## MODEL INFO:
## Observations: 57 (2 missing obs. deleted)
## Dependent Variable: murder
## Type: OLS linear regression
##
## MODEL FIT:
## F(1,55) = 49.3250, p = 0.0000
## R² = 0.4728
## Adj. R² = 0.4632
##
## Standard errors: Robust, type = HC3
## ------------------------------------------------------------
## Est. S.E. t val. p
## ------------------- ----------- --------- --------- --------
## (Intercept) -133.0899 31.3279 -4.2483 0.0001
## I(log(sworn)) 28.4859 5.8527 4.8672 0.0000
## ------------------------------------------------------------
# notice that you have to give it the data argument
effect_plot(mLevelLog2, pred=sworn, plot.points = T, data=workingData)
What did taking the natural log of sworn do to the modeling
relationship between sworn and murder?
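ANSWER: It made it have a curve! The model is still an ordinary linear regression (it is linear in ln(sworn), so it is not a “generalized linear model”), but when the fit is plotted against sworn in its original units it is a logarithmic curve rather than a straight line or an s-curve. The slope at any point is the coefficient divided by sworn, so it changes at each point and flattens as sworn increases.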
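The shape of the fitted line on the original sworn scale can be checked numerically. The sketch below uses simulated data as a stand-in for workingData (the values, and the names demoData, mDemo, and newSworn, are made up for illustration) and fits the same kind of model with I(log(sworn)).

```r
set.seed(573)
# Simulated stand-in for workingData (values are made up).
demoData <- data.frame(sworn = runif(57, 100, 500))
demoData$murder <- -133 + 28.5 * log(demoData$sworn) + rnorm(57, sd = 5)

# Same form as the I(log(sworn)) regression: linear in log(sworn).
mDemo <- lm(murder ~ I(log(sworn)), data = demoData)
b <- unname(coef(mDemo)[2])

# Predicted murder on an evenly spaced grid of sworn in original units.
newSworn <- data.frame(sworn = seq(100, 500, by = 100))
newSworn$fitted <- predict(mDemo, newdata = newSworn)
newSworn$slope <- b / newSworn$sworn   # d(murder)/d(sworn) = b / sworn

print(newSworn)
# Equal 100-unit steps in sworn give smaller and smaller gains in fitted murder.
diff(newSworn$fitted)
```

Because the slope is b divided by sworn, the curve is steep at low values of sworn and flattens at high values, with no second bend.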
Estimate the same regression as in 1.1 using the 1992 data, only
this time use the natural log of murder. Interpret the coefficient on
sworn.
#output copied below: MODEL INFO: Observations: 57 (2 missing obs. deleted) Dependent Variable: ln_murder Type: OLS linear regression
MODEL FIT: F(1,55) = 34.8776, p = 0.0000 R² = 0.3881 Adj. R² = 0.3769
| | Est. | S.E. | t val. | p |
|---|---|---|---|---|
| (Intercept) | 1.8663 | 0.2577 | 7.2425 | 0.0000 |
| sworn | 0.0041 | 0.0010 | 3.9826 | 0.0002 |
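ANSWER: The coefficient drops from .10 to .0041, but the two are not directly comparable because the dependent variable is now on the log scale. In this log-level model, each additional unit of sworn is associated with an increase in murder of about 0.41% (100 × 0.0041).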
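For a log-level model, the coefficient on sworn converts to a percent change in murder. A short arithmetic sketch using the 0.0041 estimate from the table above (the variable names are just for illustration):

```r
# Coefficient on sworn from the ln_murder ~ sworn output above.
b <- 0.0041

# ln(murder) = a + b * sworn, so one more unit of sworn multiplies the
# predicted murder by exp(b).
exactPct  <- 100 * (exp(b) - 1)   # exact percent change per one-unit increase
approxPct <- 100 * b              # approximation, good when b is small
exactPct
approxPct

# The same logic for a ten-unit increase in sworn.
100 * (exp(10 * b) - 1)
```

The exact and approximate values are nearly identical here because the coefficient is so close to zero.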
Estimate and plot the same regression as in 1.1 using the 1992 data,
only this time use both the natural log of murder and the natural log
of sworn. Interpret the coefficient on sworn.
#output copied below: MODEL INFO: Observations: 57 (2 missing obs. deleted) Dependent Variable: ln_murder Type: OLS linear regression
MODEL FIT: F(1,55) = 39.3092, p = 0.0000 R² = 0.4168 Adj. R² = 0.4062
| | Est. | S.E. | t val. | p |
|---|---|---|---|---|
| (Intercept) | -3.7422 | 1.1484 | -3.2585 | 0.0019 |
| ln_sworn | 1.2153 | 0.2073 | 5.8622 | 0.0000 |
ANSWER: Now the coefficient is an elasticity, because both variables are logged: a 1% increase in sworn is associated with about a 1.22% increase in murder.
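In a log-log model the coefficient relates percent changes to percent changes. The sketch below works through that arithmetic with the 1.2153 estimate from the table above; logLogPct is a hypothetical helper name.

```r
# Coefficient on ln_sworn from the ln_murder ~ ln_sworn output above.
b <- 1.2153

# ln(murder) = a + b * ln(sworn), so multiplying sworn by (1 + pct/100)
# multiplies the predicted murder by (1 + pct/100)^b.
logLogPct <- function(b, pct) 100 * ((1 + pct / 100)^b - 1)

logLogPct(b, 1)    # percent change in murder for a 1% increase in sworn
logLogPct(b, 10)   # percent change in murder for a 10% increase in sworn
```

For small percent changes the result is close to b itself, which is why the coefficient is usually read directly as "percent change in murder per 1% change in sworn."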
Please redo the in-class activity on interaction models on your own, including recreating your own data from the simulation. You do not need to turn this in.