Murders and poverty, Part I. The following regression output is for predicting annual murders per million from percentage living in poverty in a random sample of 20 metropolitan areas.
murders <- read.csv("C:/Users/eptrs/Desktop/CUNY/Data606/chapter7/HW/murders.csv")
m_murders_poverty <- lm(murders$annual_murders_per_mil ~ murders$perc_pov)
summary(m_murders_poverty)
##
## Call:
## lm(formula = murders$annual_murders_per_mil ~ murders$perc_pov)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.1663 -2.5613 -0.9552 2.8887 12.3475
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -29.901 7.789 -3.839 0.0012 **
## murders$perc_pov 2.559 0.390 6.562 3.64e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.512 on 18 degrees of freedom
## Multiple R-squared: 0.7052, Adjusted R-squared: 0.6889
## F-statistic: 43.06 on 1 and 18 DF, p-value: 3.638e-06
library(xtable)
print(xtable(summary(m_murders_poverty), digits = 3), type="html")

| | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | -29.901 | 7.789 | -3.839 | 0.001 |
| murders$perc_pov | 2.559 | 0.390 | 6.562 | 0.000 |
(a) Write out the linear model.
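The regression output gives the estimates of \(\beta_0\) and \(\beta_1\) in the column titled “Estimate”: the intercept in the first row and the slope in the second.
\(\hat{y} = -29.901 + 2.559 \cdot x\), where \(\hat{y}\) is the predicted annual murders per million and \(x\) is the percentage living in poverty.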
m_murders_poverty <- lm(murders$annual_murders_per_mil ~ murders$perc_pov)
m_murders_poverty
##
## Call:
## lm(formula = murders$annual_murders_per_mil ~ murders$perc_pov)
##
## Coefficients:
## (Intercept) murders$perc_pov
## -29.901 2.559
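As a quick illustration of how the fitted line is used, the following sketch evaluates the model by hand. The helper `predict_murders` and the 20% poverty rate are hypothetical choices made for this example; only the two coefficients come from the output above.

```r
# Coefficients copied from the regression output above
b0 <- -29.901  # intercept
b1 <- 2.559    # slope on percentage living in poverty

# Hypothetical helper: predicted annual murders per million
predict_murders <- function(perc_pov) b0 + b1 * perc_pov

# Illustrative poverty rate of 20% (an assumed value, not taken from the data)
pred_20 <- predict_murders(20)
pred_20
```

With these rounded coefficients the prediction is about 21.3 murders per million.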
(b) Interpret the intercept.
m_murders_poverty$coefficients
## (Intercept) murders$perc_pov
## -29.90116 2.55939
The intercept is -29.901: the model predicts about -29.9 murders per million in a metropolitan area where 0% of residents live in poverty. A negative murder rate is impossible, so the intercept has no meaningful interpretation on its own here.
It serves only to adjust the height of the regression line.
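To see how far the negative predictions extend, one can solve for the poverty percentage at which the fitted line crosses zero. This is a minimal sketch using only the rounded coefficients from the output; the variable name `zero_crossing` is introduced here for illustration.

```r
# Coefficients copied from the regression output
b0 <- -29.901  # intercept
b1 <- 2.559    # slope

# Poverty percentage at which the fitted line predicts zero murders per million
zero_crossing <- -b0 / b1
zero_crossing
```

The line crosses zero at roughly 11.7% poverty, so the model returns negative predictions for any poverty rate below that.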
(c) Interpret the slope.
m_murders_poverty$coefficients
## (Intercept) murders$perc_pov
## -29.90116 2.55939
For each additional percentage point of the population living in poverty, the model predicts an increase of 2.559 annual murders per million, on average.
(d) Interpret \(R^2\).
summary(m_murders_poverty)$r.squared
## [1] 0.7052275
According to this linear model, the poverty level explains 70.52% of the variability in murder rates in metropolitan areas.
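As a consistency check, in a regression with a single predictor \(R^2\) can be recovered from the slope's t statistic and the residual degrees of freedom via \(R^2 = t^2 / (t^2 + df)\). The sketch below uses the rounded values printed in the output, so it agrees with the reported \(R^2\) only to about four decimal places; the variable names are chosen for this example.

```r
# Values copied from the regression output
t_slope  <- 6.562  # t value for the slope
df_resid <- 18     # residual degrees of freedom

# In simple linear regression, F = t^2 and R^2 = t^2 / (t^2 + df)
f_from_t  <- t_slope^2
r2_from_t <- t_slope^2 / (t_slope^2 + df_resid)

f_from_t
r2_from_t
```

Both values line up with the F-statistic (43.06) and Multiple R-squared (0.7052) reported in the summary.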
(e) Calculate the correlation coefficient.
Since we know \(R^2\), the correlation coefficient is its square root, taking the sign of the slope (positive here).
R2 <- 0.7052
R <- sqrt(R2)
sqrt(summary(m_murders_poverty)$r.squared)
## [1] 0.8397782
The correlation coefficient is approximately 0.84 (0.8397619 from the rounded \(R^2\) of 0.7052, or 0.8397782 from the unrounded value).
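The square root alone gives only the magnitude of the correlation, so the sign has to come from the slope. A minimal sketch of that step, using the rounded values from the output (the variable names are chosen for this example):

```r
# Values copied from the regression output
r_squared <- 0.7052  # Multiple R-squared
b1        <- 2.559   # slope estimate

# The correlation takes the sign of the slope
r <- sign(b1) * sqrt(r_squared)
r
```

Had the slope been negative, the same \(R^2\) would correspond to a correlation of about -0.84.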