Week 11 Assignment

Exploration: Stopping distance as a function of speed.

As speed increases, so does stopping distance. But by how much?

First we’ll do some data investigation.

library(car)    # provides Boxplot()
library(psych)  # provides describe()
attach(cars)

plot(speed, dist, col='blue')
abline(lm(dist~speed, data=cars), col='red', lwd=3)

Distributions

Speed looks roughly normal; dist is right-skewed and looks more like a gamma.

par(mfrow=c(1,2))
hist(speed, freq=F, col='lightblue', breaks=10)
lines(density(speed), col='red', lwd=3)
hist(dist, freq=F, col='lightblue', breaks=10)
lines(density(dist), col='red', lwd=3)
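
The visual read of the two histograms can be backed up with a formal normality test. The following is a minimal sketch, assuming a Shapiro-Wilk test (base R's `shapiro.test`) is an acceptable check here; it was not part of the original analysis, and the names `sw_speed` and `sw_dist` are made up for illustration.

```r
# Shapiro-Wilk tests on the two variables in the built-in cars data.
# A small p-value is evidence against normality.
sw_speed <- shapiro.test(cars$speed)
sw_dist  <- shapiro.test(cars$dist)

# Collect the W statistics and p-values side by side.
data.frame(variable = c("speed", "dist"),
           W        = unname(c(sw_speed$statistic, sw_dist$statistic)),
           p_value  = c(sw_speed$p.value, sw_dist$p.value))
```

With only 50 observations the test has limited power, so it is a complement to the histograms rather than a replacement for them.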

What is observation 49? Appears to be an outlier.

par(mfrow=c(1,2))
Boxplot(speed, data=cars, col='lightblue')
Boxplot(dist, data=cars, col='lightblue')

## [1] 49

Oh yeah. 120 feet to stop from 24 mph? It’s on the long side.

cars[which(cars$speed >20 & cars$speed<30), ]

##    speed dist
## 44    22   66
## 45    23   54
## 46    24   70
## 47    24   92
## 48    24   93
## 49    24  120
## 50    25   85

describe(cars)[2,-c(1,6,7,10)]

##       n  mean    sd median min max skew kurtosis   se
## dist 50 42.98 25.77     36   2 120 0.76     0.12 3.64

Lose observation 49 and see the difference.

describe(cars[-49,])[2,-c(1,6,7,10)]

##       n  mean    sd median min max skew kurtosis   se
## dist 49 41.41 23.49     36   2  93  0.5    -0.64 3.36

Meh. Not much: the mean drops from 42.98 to 41.41 and the median stays at 36. Leave it in.
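
The same drop-one comparison can be run on the regression itself, not just the summary statistics. This is a rough sketch, not something the original analysis did; `fit_all` and `fit_no49` are illustrative names.

```r
# Fit the same simple regression with and without row 49.
fit_all  <- lm(dist ~ speed, data = cars)
fit_no49 <- lm(dist ~ speed, data = cars[-49, ])

# Side-by-side coefficients: intercept and slope for each fit.
rbind(all_rows = coef(fit_all), without_49 = coef(fit_no49))
```

Because observation 49 sits at a high speed with a large positive residual, dropping it pulls the slope down somewhat; whether that shift matters is a judgment call.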

Now for that regression model. Speed matters!

For every 1 mph increase in speed, stopping distance increases about 3.9 feet. (At 60 mph the fitted line gives roughly 218 feet, about three-quarters of a football field or 17 car lengths, though that extrapolates well past the 4 to 25 mph in the data.) The model explains only about 64 percent of the variation in the data (adjusted R² = 0.6438).
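
The 60 mph figure can be reproduced with `predict`. A minimal sketch, assuming the same `dist ~ speed` model fitted on the built-in `cars` data; the prediction interval is an addition here, included only to show how wide the uncertainty gets this far outside the data.

```r
fit <- lm(dist ~ speed, data = cars)

# Observed speed range: 60 mph is far outside it, so this is an extrapolation.
range(cars$speed)

# Predicted stopping distance (ft) at 60 mph, with a 95% prediction interval.
pred60 <- predict(fit, newdata = data.frame(speed = 60),
                  interval = "prediction")
pred60
```

The point estimate is the intercept plus 60 times the slope, i.e. -17.5791 + 60 × 3.9324 ≈ 218.4 feet.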

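Then there is that standard error. For speed it is 0.4155, about 9.46 times smaller than the coefficient (3.9324). That ratio is the t value, and with p = 1.49e-12 the slope is clearly different from zero, so that part is actually fine.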
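
The ratio of coefficient to standard error can be pulled straight from the fitted model. A small sketch, assuming the same `dist ~ speed` fit; `coefs` and `ratio` are illustrative names.

```r
fit <- lm(dist ~ speed, data = cars)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|).
coefs <- coef(summary(fit))

# Estimate divided by its standard error is exactly the reported t value.
ratio <- coefs["speed", "Estimate"] / coefs["speed", "Std. Error"]
ratio
coefs["speed", "t value"]
```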

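The residual distribution looks reasonable: the 1st and 3rd quartiles (-9.525 and 9.215) are roughly the same magnitude, and close to the ±10.4 (about 0.67 times the residual standard error of 15.38) that normal residuals would give. The median (-2.272) is a little below zero and the max (43.201) is well beyond the min (-29.069), so there is some right skew.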
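
One way to put a number on the quartile check is to compare the observed residual quartiles with the quartiles of a normal distribution that has the same residual standard error. This is a sketch under the assumption that a normal benchmark is the right yardstick; `q` and `normal_q` are illustrative names.

```r
fit <- lm(dist ~ speed, data = cars)

# Observed 1st and 3rd quartiles of the residuals.
q <- quantile(resid(fit), c(0.25, 0.75))

# Quartiles of a normal distribution with mean 0 and sd equal to the
# residual standard error (qnorm(0.75) is about 0.674).
normal_q <- qnorm(c(0.25, 0.75)) * sigma(fit)

rbind(observed = q, normal = normal_q)
```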

fit <- lm(dist~speed, data=cars)
summary(fit)

## 
## Call:
## lm(formula = dist ~ speed, data = cars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -29.069  -9.525  -2.272   9.215  43.201 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -17.5791     6.7584  -2.601   0.0123 *  
## speed         3.9324     0.4155   9.464 1.49e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 15.38 on 48 degrees of freedom
## Multiple R-squared:  0.6511, Adjusted R-squared:  0.6438 
## F-statistic: 89.57 on 1 and 48 DF,  p-value: 1.49e-12

Here is a plot of the model (same as above, really).

plot(cars$speed, cars$dist, main='Stopping distance',
     col='blue', xlab='Speed (mph)', ylab='Distance (ft)')
abline(fit, col='red', lwd=3)

Here’s a look at the residuals. They appear to be fairly evenly scattered around zero, though observation 49 (residual 43.2) stands out.

par(mfrow=c(2,2))
plot(fit)
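
To see which point drives the largest residual in those diagnostic plots, the residuals can be inspected directly. A minimal sketch, assuming the same `dist ~ speed` fit; `res` and `worst` are illustrative names, and the standardized residual is an extra not used in the original write-up.

```r
fit <- lm(dist ~ speed, data = cars)
res <- resid(fit)

# Row with the largest absolute residual, and the residual itself (ft).
worst <- which.max(abs(res))
worst
res[worst]

# Same point on the standardized scale used in the diagnostic plots.
rstandard(fit)[worst]
```

This points back at observation 49, the same row flagged by the boxplot.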

Tom Detzel

4/22/2018
