Use RMarkdown to complete each of the following problems. Submit an HTML or PDF document.

Unless otherwise stated, use a significance level of \(\alpha = 0.05\).

Problem 1

Murders and Poverty: The data set \(\texttt{Murders.csv}\) contains the following information on 20 randomly selected metropolitan areas:

\(\texttt{population}\): Population

\(\texttt{perc_pov}\): Percent in poverty

\(\texttt{perc_unemp}\): Percent unemployed

\(\texttt{annual_murders_per_mil}\): Number of murders per year per million people.

The goal is to use these data to predict annual murders per million for a metropolitan city from percentage living in poverty.

  1. Create a scatter plot of murders per million (y) vs poverty percentage (x). Include the line of best fit.
# Insert R code for problem 1a here
library(ggplot2)
library(car)
## Loading required package: carData
Murders <- read.csv("Murders.csv")  # assumes the file is in the working directory
ggplot(Murders, aes(x = perc_pov, y = annual_murders_per_mil)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +  # least-squares line, no confidence band
  labs(title = "Murders per million vs. percent in poverty",
       x = "Percent in poverty", y = "Annual murders per million")
## `geom_smooth()` using formula = 'y ~ x'

  1. Write the estimated linear model. \(\hat{y} = -29.901 + 2.559 \times \texttt{perc_pov}\), where \(\hat{y}\) is the predicted number of annual murders per million.

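  2. Interpret the intercept. When the poverty rate is 0%, the predicted number of annual murders per million is -29.901. A negative murder rate is impossible, so the intercept has no practical interpretation here; it only sets the height of the line.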

  3. Interpret the slope. For each one percentage point increase in the poverty rate, the predicted number of annual murders per million increases by 2.559, on average.

  4. Interpret \(R^2\). With \(R^2 = 0.7052\), about 70.52% of the variability in annual murders per million is explained by the linear relationship with the percentage living in poverty.

  5. Calculate the correlation coefficient.

# Insert R code for problem 1f here
cor(Murders$perc_pov, Murders$annual_murders_per_mil)
## [1] 0.8397782
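As a quick consistency check, in simple linear regression \(R^2\) equals the square of the correlation coefficient. The sketch below uses only the correlation printed above and compares its square with the \(R^2\) of 0.7052 quoted earlier; the variable names are illustrative and not part of the assignment template.

```r
# Correlation copied from the cor() output above
r <- 0.8397782

# In simple linear regression, R^2 is the squared correlation
r_squared <- r^2
round(r_squared, 4)  # 0.7052
```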
  1. Predict the average number of annual murders per million residents for a city that has a poverty percentage of 25%.
# Insert R code for problem 1g here
-29.901+2.559*25
## [1] 34.074
The predicted average is about 34.07 annual murders per million residents.
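The same prediction can be wrapped in a small function so that other poverty percentages are easy to try. This is a minimal sketch that assumes the rounded coefficients from the estimated model; `predict_murders` is a hypothetical helper name, not something supplied with the assignment. With the fitted model object available, `predict()` with a `newdata` data frame containing `perc_pov` should give a similar value from the unrounded coefficients.

```r
# Rounded coefficients copied from the estimated linear model
b0 <- -29.901  # intercept
b1 <- 2.559    # slope: change in murders per million per percentage point

# Hypothetical helper: predicted annual murders per million
predict_murders <- function(perc_pov) {
  b0 + b1 * perc_pov
}

predict_murders(25)                        # 34.074

# Raising poverty by one percentage point changes the prediction by the slope
predict_murders(26) - predict_murders(25)
```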
  1. Do the assumptions appear to be met? Include visual checks.
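# Insert R code for problem 1h here
slr <- lm(annual_murders_per_mil ~ perc_pov, data = Murders)
# Histogram of the residuals (normality check)
hist(resid(slr), main = "Histogram of residuals", xlab = "Residual")

# Residuals vs. predictor (linearity and constant-variance check)
plot(resid(slr) ~ Murders$perc_pov, pch = 20, xlab = "Percent in poverty", ylab = "Residual")
abline(h = 0, lty = 3)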

Not entirely. The histogram of the residuals does not look approximately normal, so the normality assumption does not appear to be met.
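A histogram of 20 residuals can be hard to read, so a normal Q-Q plot and a Shapiro-Wilk test are common supplementary checks. The sketch below is one possible way to bundle them; `residual_checks` is a hypothetical helper, and the demonstration uses R's built-in `cars` data purely as a stand-in so the block runs without `Murders.csv`. For this problem the call would be `residual_checks(slr)`.

```r
# Hypothetical helper (not part of the assignment template): draws a normal
# Q-Q plot and a residuals-vs-fitted plot, then returns a Shapiro-Wilk test.
residual_checks <- function(model) {
  res <- resid(model)
  fit <- fitted(model)

  op <- par(mfrow = c(1, 2))   # two plots side by side
  on.exit(par(op))             # restore plotting settings afterwards

  qqnorm(res, pch = 20)        # points near the line suggest normal residuals
  qqline(res, lty = 3)

  plot(fit, res, pch = 20, xlab = "Fitted values", ylab = "Residuals")
  abline(h = 0, lty = 3)       # look for curvature or a fan shape

  invisible(shapiro.test(res)) # small p-value suggests non-normal residuals
}

# Stand-in demonstration on the built-in cars data (NOT the Murders data)
demo_fit <- lm(dist ~ speed, data = cars)
sw <- residual_checks(demo_fit)
sw$p.value
```

The test result is only a supplement to the plots: with a sample as small as 20 observations, a formal normality test has limited power, so the visual checks still carry most of the weight.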