Murders and poverty, Part I. The following regression output is for predicting annual murders per million from percentage living in poverty in a random sample of 20 metropolitan areas.

murders <- read.csv("C:/Users/eptrs/Desktop/CUNY/Data606/chapter7/HW/murders.csv")

m_murders_poverty <- lm(murders$annual_murders_per_mil ~ murders$perc_pov)

summary(m_murders_poverty)
## 
## Call:
## lm(formula = murders$annual_murders_per_mil ~ murders$perc_pov)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.1663 -2.5613 -0.9552  2.8887 12.3475 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       -29.901      7.789  -3.839   0.0012 ** 
## murders$perc_pov    2.559      0.390   6.562 3.64e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.512 on 18 degrees of freedom
## Multiple R-squared:  0.7052, Adjusted R-squared:  0.6889 
## F-statistic: 43.06 on 1 and 18 DF,  p-value: 3.638e-06
library(xtable)  # provides xtable() for the HTML coefficient table
print(xtable(summary(m_murders_poverty), digits = 3), type="html")
                  Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)        -29.901       7.789   -3.839     0.001
murders$perc_pov     2.559       0.390    6.562     0.000

Answer:

(a) Write out the linear model.

The regression output gives the estimates of \(\beta_0\) (the intercept) and \(\beta_1\) (the slope) in the “Estimate” column of the coefficients table.

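\(\hat{y} = -29.901 + 2.559 \cdot x\), where \(\hat{y}\) is the predicted annual murders per million and \(x\) is the percentage living in poverty.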
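As a rough illustration of how the fitted equation is used, the sketch below plugs a poverty rate into the rounded coefficients from the output. The helper name `predict_murders` and the 20% poverty rate are hypothetical, chosen only for illustration; they are not part of the exercise.

```r
# Rounded coefficients copied from the regression output above
b0 <- -29.901  # intercept
b1 <- 2.559    # slope for percentage living in poverty

# Hypothetical helper: predicted annual murders per million
predict_murders <- function(perc_pov) {
  b0 + b1 * perc_pov
}

# Hypothetical poverty rate of 20%, for illustration only
pred_20 <- predict_murders(20)
print(pred_20)
```

With these rounded coefficients the prediction is about 21.3 murders per million. Calling `predict()` on the fitted model would use the unrounded coefficients and could differ slightly in the later decimals.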

m_murders_poverty <- lm(murders$annual_murders_per_mil ~ murders$perc_pov)
m_murders_poverty
## 
## Call:
## lm(formula = murders$annual_murders_per_mil ~ murders$perc_pov)
## 
## Coefficients:
##      (Intercept)  murders$perc_pov  
##          -29.901             2.559

(b) Interpret the intercept.

m_murders_poverty$coefficients
##      (Intercept) murders$perc_pov 
##        -29.90116          2.55939

The intercept is -29.901. Taken literally, the model predicts -29.901 annual murders per million in a metropolitan area where 0% of residents live in poverty. A murder rate cannot be negative, so the intercept has no meaningful interpretation on its own.

It only serves to adjust the height of the regression line.
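To see how far the line has to be extended before the intercept matters, one can solve \(\hat{y} = 0\) for the poverty rate. This is only a sketch using the rounded coefficients from the output; the variable name `zero_crossing` is an illustrative choice.

```r
# Rounded coefficients copied from the regression output above
b0 <- -29.901  # intercept
b1 <- 2.559    # slope

# Poverty rate (in percent) at which the fitted line crosses
# zero murders per million: solve 0 = b0 + b1 * x for x
zero_crossing <- -b0 / b1
print(round(zero_crossing, 2))
```

The fitted line crosses zero at a poverty rate of roughly 11.7%, so any lower poverty rate, including 0%, yields a negative and therefore meaningless prediction.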

(c) Interpret the slope.

m_murders_poverty$coefficients
##      (Intercept) murders$perc_pov 
##        -29.90116          2.55939

For each additional percentage point in the poverty rate, the model predicts, on average, 2.559 more annual murders per million.

(d) Interpret \(R^2\).

summary(m_murders_poverty)$r.squared
## [1] 0.7052275

According to this linear model, the poverty level explains 70.52% of the variability in murder rates in metropolitan areas.
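As a cross-check, \(R^2\) can be recovered from the F-statistic and degrees of freedom printed in the summary, and the adjusted \(R^2\) follows from it. The sketch below assumes the standard identities for a least-squares fit with an intercept and uses only the rounded values from the output, so the last digit may differ from what `summary()` reports.

```r
# Values copied from the summary() output above
f_stat <- 43.06  # F-statistic
df_num <- 1      # numerator degrees of freedom (one predictor)
df_den <- 18     # residual degrees of freedom
n      <- 20     # sample size

# For a regression with an intercept:
# R^2 = F * df_num / (F * df_num + df_den)
r2_from_f <- (f_stat * df_num) / (f_stat * df_num + df_den)

# Adjusted R^2 penalises for the number of predictors
adj_r2 <- 1 - (1 - r2_from_f) * (n - 1) / df_den

print(round(r2_from_f, 4))
print(round(adj_r2, 4))
```

Both values agree with the reported 0.7052 and 0.6889 to about three decimal places; the small gap comes from starting with a rounded F-statistic.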

(e) Calculate the correlation coefficient.

Since we know \(R^2\), the correlation coefficient is its square root, taking the sign of the slope. The slope here is positive, so the correlation is positive.

R2 <- 0.7052

R <- sqrt(R2)

sqrt(summary(m_murders_poverty)$r.squared)
## [1] 0.8397782

The correlation coefficient is approximately 0.84. The rounded \(R^2 = 0.7052\) gives 0.8397619, while the unrounded value gives 0.8397782.
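Because a square root alone loses the direction of the relationship, a slightly more careful version attaches the sign of the slope explicitly. This is a minimal sketch using the values printed above; the variable names are illustrative.

```r
# Values copied from the output above
r_squared <- 0.7052275  # summary(m_murders_poverty)$r.squared
slope     <- 2.55939    # coefficient on murders$perc_pov

# In simple linear regression the correlation has the same sign as the slope
r <- sign(slope) * sqrt(r_squared)
print(round(r, 4))
```

With a negative slope the same code would return a negative correlation, which a bare `sqrt()` would miss.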