Stopping distance prediction

Data science track, level 9 - Developing data products. Course project.

Fedor Duzhin
Coursera student

Main info about the app

The app is available here:

https://theodor.shinyapps.io/theodor-stopping-distance/

It predicts the stopping distance of an old car depending on the speed at the moment the brake is applied. The speed can be entered in miles per hour, kilometers per hour or meters per second. The output can be displayed either in feet or in meters.
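The unit handling described above can be sketched as two small conversion helpers. This is only an illustration, not the app's actual code: the function names `to_mph` and `from_feet` and the unit labels are hypothetical, and the sketch assumes the model works internally in miles per hour and feet, the units of the `cars` dataset.

```r
# Hypothetical helpers; the names are illustrative and not taken from the app.

# Convert a speed given in mph, km/h or m/s to miles per hour.
to_mph <- function(speed, unit = c("mph", "kmh", "ms")) {
  unit <- match.arg(unit)
  # 1 mile = 1.609344 km; 1 mph = 0.44704 m/s
  k <- c(mph = 1, kmh = 1 / 1.609344, ms = 1 / 0.44704)
  speed * k[[unit]]
}

# Convert a distance in feet to the requested output unit.
from_feet <- function(dist_ft, unit = c("ft", "m")) {
  unit <- match.arg(unit)
  # 1 ft = 0.3048 m
  k <- c(ft = 1, m = 0.3048)
  dist_ft * k[[unit]]
}

# Example: 100 km/h expressed in mph, and 100 ft expressed in meters.
to_mph(100, "kmh")
from_feet(100, "m")
```

Keeping a single internal unit system and converting only at the input and output boundaries means the fitted coefficient never has to be rescaled.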

The story

We began with the dataset cars from the R package datasets. It contains 50 observations of the stopping distance of cars (in feet) as a function of speed (in miles per hour). The observations were made in the 1920s with cars equipped with drum brakes only, so our model cannot be applied to a modern car.

Physics

The kinetic energy of a moving object is given by \[E_k=\frac{1}{2}mv^2,\] where $E_k$ is the kinetic energy, $m$ is the mass, and $v$ is the speed. Braking is essentially the work of a friction force that converts all the kinetic energy into thermal energy. If the braking force is constant, the work it does equals the force times the distance travelled, so the stopping distance is proportional to the kinetic energy, i.e., to the square of the speed.
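
As a short worked step, assume a constant braking force $F$ (a symbol introduced here for illustration) acting over the stopping distance $D$. Equating the work done by the brakes to the initial kinetic energy gives \[ FD=\frac{1}{2}mv^2 \quad\Longrightarrow\quad D=\frac{m}{2F}\,v^2, \] so under this assumption the constant of proportionality between the stopping distance and the squared speed is $m/(2F)$.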

Mathematics

Given 50 observations, i.e., points $(V_i,D_i)$, where $V_i$ is the observed speed and $D_i$ is the observed stopping distance in the $i$th experiment, we predict the stopping distance to be $aV_i^2$ and find the value of $a$ to minimize the sum of squared errors \[ \sum_{i=1}^{50}\left(aV_i^2-D_i\right)^2. \]
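
This one-parameter minimization has a closed-form solution. As a sketch of the standard calculus argument, setting the derivative with respect to $a$ to zero gives \[ \frac{d}{da}\sum_{i=1}^{50}\left(aV_i^2-D_i\right)^2 = 2\sum_{i=1}^{50}V_i^2\left(aV_i^2-D_i\right)=0 \quad\Longrightarrow\quad a=\frac{\sum_{i=1}^{50}V_i^2D_i}{\sum_{i=1}^{50}V_i^4}. \]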

Regression model

If $V$ is the speed of a car and $D$ is the stopping distance, we fit a linear model with $V^2$ as the regressor and $D$ as the dependent variable. We assume a zero intercept since the stopping distance of a stationary car is $0$. Thus, \[ D_i=aV_i^2+\varepsilon_i, \] where $a$ is the unknown regression coefficient and $\varepsilon_i$ is random noise coming from things like the weight of the car, road conditions, brake quality, etc.

R-code

Here is the model construction at the core of the app:

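library(datasets); data(cars)
# Regressor: the squared speed
cars$sq <- cars$speed^2
# "- 1" removes the intercept, forcing the fit through the origin
fit <- lm(dist ~ sq - 1, data = cars)
coef(fit)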

##     sq 
## 0.1534

It means that the model is $D=0.15V^2+\varepsilon$, where $D$ is the stopping distance, $V$ is the speed, and $\varepsilon$ is random noise.
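
To show how such a fit can be turned into a prediction, here is a minimal sketch. The helper name `predict_stop` is hypothetical and not taken from the app; the snippet re-fits the model so that it runs on its own, and it assumes the dataset's units (miles per hour in, feet out). It also checks the fitted coefficient against the least-squares slope of a no-intercept regression, `sum(x * y) / sum(x^2)` with `x` being the squared speed.

```r
library(datasets)
data(cars)

# Re-fit the zero-intercept model so that this snippet is self-contained.
cars$sq <- cars$speed^2
fit <- lm(dist ~ sq - 1, data = cars)

# Hypothetical helper: predicted stopping distance (ft) for a speed in mph.
predict_stop <- function(speed_mph) {
  unname(predict(fit, newdata = data.frame(sq = speed_mph^2)))
}

# Closed-form least-squares slope, for comparison with lm().
a_closed <- sum(cars$speed^2 * cars$dist) / sum(cars$speed^4)
all.equal(unname(coef(fit)), a_closed)

# Predictions at 10 and 20 mph.
predict_stop(c(10, 20))
```

Because the model has no intercept, doubling the speed quadruples the predicted stopping distance.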

The rest is bells and whistles, like plotting the data and converting between different units.

P.S. I honestly tried to tinker with the styles and wasted an hour or so, but the default style is by far the best.