These are the summary statistics from the Burkhardt paper:
include_graphics("sum_burk.png")
These are my means:
mean(crime$mean)
## [1] 21.13227
sd(crime$mean)
## [1] 8.084782
min(crime$mean)
## [1] 1.91396
max(crime$mean)
## [1] 74.125
mean(crime$mean_oz, na.rm = TRUE)
## [1] 12.55691
sd(crime$mean_oz, na.rm = TRUE)
## [1] 5.575342
min(crime$mean_oz, na.rm = TRUE)
## [1] 0.2885415
max(crime$mean_oz, na.rm = TRUE)
## [1] 85.61695
I divided ozone by 1000, to put it in the same scale as the means in the Burkhardt paper. I do not know why my ozone means are so high (maybe not in parts per million, but parts per thousand? Another possibility is that Medellin has dramatically higher ozone levels than the US, or as a city vs the widespread counties they use in the paper). In any case, for this exploration that will do, with the understanding that I need to find out what the problem is here.
I will use mean PM25 since that is the one they use in the paper.
This is their specification, in a Poisson Quasi-Maximum Likelihood:
\[ crime_{ct}^t = \gamma^j_{PM}PM25_{ct} + \gamma^j_{o}ozone_{ct} + X_{ct}\beta^j + \phi_{ct} + \epsilon_{ct}^j \] These are the results of their regressions:
include_graphics("reg_burk.png")
This is my approximation, using a Poisson Quasi-Maximum Likelihood estimation:
NOTE: I omit the coefficients for barrio, month and weekday, for readibility.
Model 1: crime = pm25 + oz
Model 2: crime = pm25 + oz with barrio Fixed Effects
Model 3: crime = pm25 + oz with barrio and year Fixed Effects
Model 4: crime = pm25 + oz with barrio, year and month Fixed Effects
Model 5: crime = pm25 + oz with barrio, year, month and day of the week Fixed Effects
| Dependent variable: | |||||
| crime | |||||
| (1) | (2) | (3) | (4) | (5) | |
| pm25 | 0.024*** | 0.009* | 0.014*** | 0.023*** | 0.015** |
| (0.005) | (0.005) | (0.005) | (0.006) | (0.007) | |
| mean_oz | -0.030*** | -0.012 | -0.018* | -0.033*** | -0.023** |
| (0.011) | (0.009) | (0.010) | (0.011) | (0.011) | |
| factor(year)2018 | 0.324*** | 0.429*** | 0.349*** | ||
| (0.115) | (0.121) | (0.129) | |||
| factor(year)2019 | 0.071 | 0.727** | 0.587 | ||
| (0.289) | (0.348) | (0.373) | |||
| Constant | -3.104*** | -1.706*** | -1.922*** | -2.635*** | -2.538*** |
| (0.199) | (0.190) | (0.208) | (0.318) | (0.364) | |
| Observations | 15,798 | 15,798 | 15,798 | 15,798 | 15,798 |
| Note: | p<0.1; p<0.05; p<0.01 | ||||
Now, given that I don’t really trust my Ozone estimations, I’ll run the same models but without Ozone. Note that these have MUCH HIGHER, sample sizes, since I lose a lot of observations given how many NAs there are in the Ozone observations.
Model 1: crime = pm25
Model 2: crime = pm25 with barrio Fixed Effects
Model 3: crime = pm25 with barrio and year Fixed Effects
Model 4: crime = pm25 with barrio, year and month Fixed Effects
Model 5: crime = pm25 with barrio, year, month and day of the week Fixed Effects
| Dependent variable: | |||||
| crime | |||||
| (1) | (2) | (3) | (4) | (5) | |
| pm25 | 0.009*** | 0.004** | 0.005*** | 0.016*** | 0.006** |
| (0.002) | (0.002) | (0.002) | (0.002) | (0.002) | |
| factor(year)2018 | 0.259*** | 0.321*** | 0.287*** | ||
| (0.037) | (0.041) | (0.042) | |||
| factor(year)2019 | -0.259*** | -0.021 | -0.083 | ||
| (0.081) | (0.099) | (0.103) | |||
| Constant | -2.421*** | -2.121*** | -2.332*** | -2.762*** | -2.536*** |
| (0.049) | (0.127) | (0.131) | (0.148) | (0.160) | |
| Observations | 81,913 | 81,913 | 81,913 | 81,913 | 81,913 |
| Note: | p<0.1; p<0.05; p<0.01 | ||||
I couldn’t replicate his model because the amount of FEs crashes the computer.
Model 1: crime = pm25 with barrio-year-month-dow Fixed Effects Didn’t work
Model 1: crime = pm25 + oxone with barrio-year-month-dow Fixed Effects Didn’t work
Full FE Model without ozone different independent variables:
Model 1: crime = pm25 with barrio, year, month and day of the week Fixed Effects
Model 2: property_posession = pm25 with barrio, year, month and day of the week Fixed Effects
Model 3: life_integrity = pm25 with barrio, year, month and day of the week Fixed Effects
Model 4: economic_activity = pm25 with barrio, year, month and day of the week Fixed Effects
Model 5: public_spaceint = pm25 with barrio, year, month and day of the week Fixed Effects
Model 6: urban_prop = pm25 with barrio, year, month and day of the week Fixed Effects
stargazer(model1c, model2c,model3c, model4c, model5c, model6c, type = 'html', omit = c("barrio", "month", "weekday"))
| Dependent variable: | ||||||
| crime | property_posession | life_integrity | economic_activity | public_spaceint | urban_prop | |
| (1) | (2) | (3) | (4) | (5) | (6) | |
| pm25 | 0.006** | 0.022*** | 0.005 | 0.012* | -0.001 | 0.001 |
| (0.002) | (0.005) | (0.004) | (0.006) | (0.008) | (0.004) | |
| factor(year)2018 | 0.287*** | 0.679*** | 0.211*** | 0.294*** | 0.144 | 0.179** |
| (0.042) | (0.090) | (0.064) | (0.103) | (0.130) | (0.073) | |
| factor(year)2019 | -0.083 | 0.345 | -0.162 | 0.210 | -0.045 | -0.325* |
| (0.103) | (0.258) | (0.150) | (0.276) | (0.332) | (0.167) | |
| Constant | -2.536*** | -4.174*** | -4.794*** | -23.218 | -22.519 | -20.293 |
| (0.160) | (0.264) | (0.419) | (1,142.936) | (1,316.274) | (507.553) | |
| Observations | 81,913 | 81,913 | 81,913 | 81,913 | 81,913 | 81,913 |
| Note: | p<0.1; p<0.05; p<0.01 | |||||
model1d <- glm(crime ~ pm25 + mean_oz + factor(barrio) + factor(year) + factor(month) + factor(weekday), data = crime,family = stats::quasipoisson(link = "log"))
model2d <- glm(property_posession ~ pm25 + mean_oz + factor(barrio) + factor(year) + factor(month) + factor(weekday), data = crime,family = stats::quasipoisson(link = "log"))
model3d <- glm(life_integrity ~ pm25 + mean_oz + factor(barrio) + factor(year) + factor(month) + factor(weekday), data = crime,family = stats::quasipoisson(link = "log"))
model4d <- glm(economic_activity ~ pm25 + mean_oz + factor(barrio) + factor(year) + factor(month) + factor(weekday), data = crime,family = stats::quasipoisson(link = "log"))
model5d <- glm(public_spaceint ~ pm25 + mean_oz + factor(barrio) + factor(year) + factor(month) + factor(weekday), data = crime,family = stats::quasipoisson(link = "log"))
model6d <- glm(urban_prop ~ pm25 + mean_oz + factor(barrio) + factor(year) + factor(month) + factor(weekday), data = crime,family = stats::quasipoisson(link = "log"))
Same models as above but with ozone:
| Dependent variable: | ||||||
| crime | property_posession | life_integrity | economic_activity | public_spaceint | urban_prop | |
| (1) | (2) | (3) | (4) | (5) | (6) | |
| pm25 | 0.015** | -0.014 | 0.018 | 0.041*** | 0.014 | 0.007 |
| (0.007) | (0.012) | (0.012) | (0.015) | (0.017) | (0.010) | |
| mean_oz | -0.023** | -0.009 | -0.014 | -0.055** | -0.038 | -0.023 |
| (0.011) | (0.019) | (0.018) | (0.028) | (0.028) | (0.015) | |
| factor(year)2018 | 0.349*** | -0.014 | 0.126 | 0.330 | 0.276 | 0.695*** |
| (0.129) | (0.252) | (0.218) | (0.277) | (0.272) | (0.201) | |
| factor(year)2019 | 0.587 | -15.870 | -0.040 | 0.656 | 0.599 | 1.583*** |
| (0.373) | (1,575.941) | (0.805) | (0.791) | (0.828) | (0.425) | |
| Constant | -2.538*** | -5.663*** | -4.348*** | -5.217*** | -4.283*** | -5.344*** |
| (0.364) | (0.693) | (0.653) | (0.750) | (0.859) | (0.557) | |
| Observations | 15,798 | 15,798 | 15,798 | 15,798 | 15,798 | 15,798 |
| Note: | p<0.1; p<0.05; p<0.01 | |||||