Anat Kedem
April 25, 2015
Prepared as an assignment in the Coursera course Developing Data Products.
The Shiny page includes static and interactive elements:
The lm function creates a linear regression model that can be used to predict new values; the fitted model also provides the Residual Standard Error (RSE) and the coefficients of the prediction equation.
RSE equation: \( S_{y}=\sqrt{\frac{\sum_{i=1}^{n} (y_{i}-\hat{y}_{i})^2}{n-2}}=5.91 \)
prediction: \( \hat{y}=10.73\cdot x^* + 33.47 \)
where \( y_{i} \) is each waiting value in faithful, \( \hat{y}_{i} \) is the corresponding predicted waiting time, \( n-2 \) is the degrees of freedom, and \( x^* \) is the new eruption value to predict with. See code below:
library(datasets) ; data(faithful)
modFit <- lm(waiting~eruptions, data=faithful)
summary(modFit)$coefficients[1:2]
[1] 33.47440 10.72964
summary(modFit)$sigma #model RSE
[1] 5.914009
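As a cross-check of the RSE equation, the value can be recomputed directly from the residuals. This is a minimal sketch, assuming the same `modFit` model as above; the names `n`, `res`, and `rse_manual` are introduced here for illustration only.

```r
library(datasets)
data(faithful)

# Same model as in the slide above
modFit <- lm(waiting ~ eruptions, data = faithful)

n <- nrow(faithful)                          # number of observations
res <- faithful$waiting - fitted(modFit)     # y_i - y_hat_i
rse_manual <- sqrt(sum(res^2) / (n - 2))     # RSE equation with n - 2 degrees of freedom

rse_manual
# Compare against the value reported by summary()
all.equal(rse_manual, summary(modFit)$sigma)
```

The comparison with `summary(modFit)$sigma` should agree with the 5.914009 shown above.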
The predict function uses the model coefficients, the mean of the eruptions, the RSE, etc. to predict the waiting time, and an interval around it, for a new eruption.
Because the Shiny application interactively shows one predicted value at a time, it presents the prediction interval (tolerance) for that single value rather than a confidence interval for the mean.
Tolerance equation:
\[ \hat{y}\pm t_{n-2}^*\cdot S_{y}\cdot\sqrt{1+\frac{1}{n}+\frac{(x^*-\bar{x})^2}{(n-1)S_x^2}}=55.88 \text{ to } 75.45 \]
for \( x^*=3 \) (new value) and 0.9 tolerance, where \( t_{n-2}^* \) is the t value for the given tolerance and degrees of freedom (270), \( \bar{x} \) is the mean of eruptions in the faithful data, and \( S_x \) is their sample standard deviation. See code below:
predict(modFit, data.frame(eruptions=3), interval="prediction", level=0.9)
       fit      lwr     upr
1 65.66332 55.88094 75.4457
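The tolerance equation can also be evaluated term by term and compared with the output of predict. This is a hedged sketch, not the application's actual code; the variable names (`x_new`, `level`, `t_star`, `half_width`, `manual`) are assumptions made for illustration.

```r
library(datasets)
data(faithful)
modFit <- lm(waiting ~ eruptions, data = faithful)

x_new <- 3                                   # new eruption value x*
level <- 0.9                                 # tolerance level
n     <- nrow(faithful)
x     <- faithful$eruptions

s_y    <- summary(modFit)$sigma              # RSE, S_y
y_hat  <- unname(coef(modFit)[1] + coef(modFit)[2] * x_new)
t_star <- qt(1 - (1 - level) / 2, df = n - 2)  # two-sided t quantile

# (n - 1) * var(x) equals the sum of squared deviations of x from its mean
half_width <- t_star * s_y *
  sqrt(1 + 1 / n + (x_new - mean(x))^2 / ((n - 1) * var(x)))

manual <- c(fit = y_hat, lwr = y_hat - half_width, upr = y_hat + half_width)
manual
```

The three values should agree with the fit, lwr, and upr columns returned by predict above.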
National Park Service site, you can also click on the links below: