R Markdown

Assignment Overview

Using the “cars” dataset in R, build a linear model for stopping distance as a function of speed and replicate the analysis from Chapter 3 of your textbook (visualization, quality evaluation of the model, and residual analysis).

data(cars)

LINEAR REGRESSION

Stopping distance: dependent variable (y) or response variable

Speed: independent variable (x) or predictor variable

Visualization

plot(cars$speed,cars$dist, main="Cars",
xlab="Speed", ylab="Dist")

### Linear Model

Based on the output of the linear model function, the equation of the fitted line is:

f(x) = 3.932x - 17.579

# y (dependent variable) ~ x (independent variable)
attach(cars)  # makes the columns dist and speed available by name
cars.lm <- lm(dist~speed)
cars.lm
## 
## Call:
## lm(formula = dist ~ speed)
## 
## Coefficients:
## (Intercept)        speed  
##     -17.579        3.932
plot(speed,dist,main="Cars Linear Regression Model")
abline(cars.lm)
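
The fitted line can also be used to predict stopping distance at a given speed. The following is a minimal sketch, not part of the original analysis: the names `fit`, `new_speeds`, `pred`, and `by_hand` are illustrative, the model is refit with an explicit `data` argument so the block runs on its own, and the three speeds are arbitrary example values. It compares `predict()` with the same calculation done by hand from the intercept and slope.

```r
# Refit the model with an explicit data argument so the example is self-contained.
data(cars)
fit <- lm(dist ~ speed, data = cars)

# Hypothetical speeds (mph) at which to predict stopping distance (ft).
new_speeds <- data.frame(speed = c(10, 15, 20))

# Predictions from the fitted model.
pred <- predict(fit, newdata = new_speeds)

# The same values computed by hand as intercept + slope * speed.
b <- coef(fit)
by_hand <- b[["(Intercept)"]] + b[["speed"]] * new_speeds$speed

print(cbind(new_speeds, predicted = pred, by_hand = by_hand))
```

Both columns should agree, since `predict()` evaluates the same linear equation. Predictions are only meaningful within the range of speeds observed in the data; at very low speeds the negative intercept would give a negative stopping distance.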

### Evaluating the model

summary(cars.lm)
## 
## Call:
## lm(formula = dist ~ speed)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -29.069  -9.525  -2.272   9.215  43.201 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -17.5791     6.7584  -2.601   0.0123 *  
## speed         3.9324     0.4155   9.464 1.49e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 15.38 on 48 degrees of freedom
## Multiple R-squared:  0.6511, Adjusted R-squared:  0.6438 
## F-statistic: 89.57 on 1 and 48 DF,  p-value: 1.49e-12
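
In the summary above, the slope for speed is highly significant (p = 1.49e-12), and the multiple R-squared of 0.6511 means that speed explains roughly 65% of the variance in stopping distance. As a rough sketch of where these figures come from, the block below extracts them from the summary object and recomputes R-squared and the residual standard error by hand. The names `fit`, `s`, `ss_res`, `ss_tot`, `r2_manual`, and `rse_manual` are illustrative and not part of the original analysis.

```r
data(cars)
fit <- lm(dist ~ speed, data = cars)
s <- summary(fit)

# Coefficient table: estimates, standard errors, t values, p-values.
print(coef(s))

# R-squared recomputed as 1 - SS_res / SS_tot.
ss_res <- sum(resid(fit)^2)
ss_tot <- sum((cars$dist - mean(cars$dist))^2)
r2_manual <- 1 - ss_res / ss_tot

# Residual standard error: sqrt(SS_res / residual degrees of freedom).
rse_manual <- sqrt(ss_res / df.residual(fit))

print(c(r.squared = s$r.squared, r2_manual = r2_manual,
        sigma = s$sigma, rse_manual = rse_manual))

# 95% confidence intervals for the intercept and slope.
print(confint(fit, level = 0.95))
```

The hand-computed values should match `s$r.squared` and `s$sigma`, which correspond to the "Multiple R-squared" and "Residual standard error" lines of the printed summary.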

Residual Analysis

plot(fitted(cars.lm),resid(cars.lm), main="Residuals")
abline(0, 0)

qqnorm(resid(cars.lm))
qqline(resid(cars.lm))

plot(cars.lm)

From the residuals vs. fitted values plot, we can see that there is no clear pattern in the residuals, so the assumptions of random residuals and constant variance (homoscedasticity) appear to be satisfied.

From the normal Q-Q plot, we can see that the residuals are approximately normally distributed.
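
The two plots discussed here are the first two of the diagnostic plots that `plot()` produces for a fitted `lm` object. As a small sketch, assuming only base R graphics, they can be drawn side by side with the `which` argument. The names `fit` and `old_par` are illustrative.

```r
data(cars)
fit <- lm(dist ~ speed, data = cars)

# Place two plots side by side and restore the previous settings afterwards.
old_par <- par(mfrow = c(1, 2))
plot(fit, which = 1)  # residuals vs. fitted values
plot(fit, which = 2)  # normal Q-Q plot of standardized residuals
par(old_par)
```

Viewing them together makes it easier to judge the spread of the residuals and their normality at the same time.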

Overall, we can say that the model fits the data reasonably well, since the assumptions of the linear regression model appear to be satisfied here.