R-Report 3

Problem: fit a simple linear regression relating two county-level variables: the percentage of individuals in the county with at least a high-school diploma (column dip), used as the predictor, and the crime rate per 100,000 residents (column rate), used as the response.

Scatter plot:

The estimated regression line:

## 
## Call:
## lm(formula = crime$rate ~ crime$dip)
## 
## Coefficients:
## (Intercept)    crime$dip  
##     20517.6       -170.6

Since the slope is negative, the fitted line indicates that a one-percentage-point increase in the high-school graduation rate is associated with an estimated decrease of about 170.6 in the crime rate per 100,000 residents.

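To make the interpretation of the slope concrete, the sketch below plugs the reported coefficients into the fitted line. The helper name predict_rate and the input value of 80% are illustrative assumptions, not part of the original analysis.

```r
# Coefficients copied from the lm() output reported above
intercept <- 20517.6
slope <- -170.6

# Hypothetical helper: predicted crime rate per 100,000 residents
# for a county with the given diploma percentage
predict_rate <- function(dip) intercept + slope * dip

# Predicted rate at an illustrative diploma percentage of 80 (about 6869.6)
predict_rate(80)

# Raising dip by one percentage point changes the prediction by the slope
predict_rate(81) - predict_rate(80)
```

The second expression returns the slope itself (up to floating-point rounding), which is the sense in which the slope measures the change in predicted crime rate per percentage point.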
Outliers (dip, rate): (78, 14016) and (77, 2105).

QQ plot of the residuals:

As can be seen from the QQ plot, the residuals do not appear to be normally distributed.
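A visual reading of a QQ plot can be supplemented with a formal test. The sketch below runs a Shapiro-Wilk test on the residuals of a fitted model. It is not part of the original analysis: because crime.csv is not included here, it uses R's built-in cars dataset as a stand-in, and the name stand_in_model is an assumption for illustration.

```r
# Stand-in data: R's built-in `cars` dataset, used only so the sketch runs
# without crime.csv. In the report the model is lm(crime$rate ~ crime$dip).
stand_in_model <- lm(dist ~ speed, data = cars)

# Shapiro-Wilk test: the null hypothesis is that the residuals are normal
sw <- shapiro.test(residuals(stand_in_model))

# A small p-value (for example, below 0.05) is evidence against normality
sw$p.value
```

For the report's model, the same call would presumably be shapiro.test(the.model$residuals), using the object defined in the appendix.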

Plot of the errors vs. the fitted values:

According to the graph, the variance of the errors does not appear to be constant.

95% confidence interval for the slope:

##                  2.5 %      97.5 %
## (Intercept) 13997.3245 27037.87538
## crime$dip    -253.2798   -87.87061

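I am 95% confident that a one-percentage-point increase in the high-school graduation rate is associated with a decrease in the crime rate of between about 88 and 253 per 100,000 residents. Because the interval excludes zero, the data support a negative slope. However, the interval is wide (the bounds are about 165 apart), so the slope is estimated imprecisely, and the interval by itself does not indicate a strong linear relationship.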
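As a quick consistency check, a confidence interval of the form estimate ± margin should be centered on the slope estimate. The sketch below recovers the center and the margin from the reported bounds. The variable names are illustrative.

```r
# Bounds for the slope copied from the confint() output above
lower <- -253.2798
upper <- -87.87061

# Center of the interval: about -170.58, matching the reported slope of -170.6
midpoint <- (lower + upper) / 2

# Margin of error (half the interval width): about 82.70
half_width <- (upper - lower) / 2

midpoint
half_width
```

The margin is nearly half the size of the slope estimate itself, which is one way to quantify how imprecisely the slope is estimated.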

Appendix Code

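# Read the county-level data; the file is expected to contain columns dip and rate
crime = read.table("~/Desktop/crime.csv", header = TRUE, sep = ",")
#1.a Scatter plot with the fitted regression line
plot(crime$dip, crime$rate, xlab = "diploma %", ylab = "crime rate", main = "Diploma vs Crime rate", pch = 20, col = "darkblue")
the.model = lm(crime$rate ~ crime$dip)
abline(the.model, lwd = 2)
#1.b Estimated coefficients
the.model
#1.d

#1.e QQ plot of the residuals with a reference line
qqnorm(the.model$residuals, col = "darkblue")
qqline(the.model$residuals)
#1.f Residuals against fitted values, to check for constant variance
plot(the.model$fitted.values, the.model$residuals, xlab = "Fitted Values", ylab = "errors", main = "Constancy of variance of the errors", col = "darkblue")

#1.g 95% confidence intervals for the intercept and slope
confint(the.model, level = 0.95)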

Eduard Kachan

3/16/2019
