library(tidyverse)
library(scales)
data(SaratogaHouses, package="mosaicData")
houses_lm <- lm(price ~ lotSize + age + landValue +
                  livingArea + bedrooms + bathrooms +
                  waterfront, 
                data = SaratogaHouses)

# View summary of model 1
summary(houses_lm)
## 
## Call:
## lm(formula = price ~ lotSize + age + landValue + livingArea + 
##     bedrooms + bathrooms + waterfront, data = SaratogaHouses)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -220208  -35416   -5443   27570  464320 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   1.399e+05  1.647e+04   8.491  < 2e-16 ***
## lotSize       7.501e+03  2.075e+03   3.615 0.000309 ***
## age          -1.360e+02  5.416e+01  -2.512 0.012099 *  
## landValue     9.093e-01  4.583e-02  19.841  < 2e-16 ***
## livingArea    7.518e+01  4.158e+00  18.080  < 2e-16 ***
## bedrooms     -5.767e+03  2.388e+03  -2.414 0.015863 *  
## bathrooms     2.455e+04  3.332e+03   7.366 2.71e-13 ***
## waterfrontNo -1.207e+05  1.560e+04  -7.738 1.70e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 59370 on 1720 degrees of freedom
## Multiple R-squared:  0.6378, Adjusted R-squared:  0.6363 
## F-statistic: 432.6 on 7 and 1720 DF,  p-value: < 2.2e-16

Interpretation

Q1 Build a regression model to predict the volume of trail users using hightemp and precip.

Hint: The variables are available in the RailTrail data set from the mosaicData package.

data(RailTrail, package="mosaicData")
railtrail_lm <- lm(volume ~ hightemp + precip,
                   data = RailTrail)

# View summary of model 1
summary(railtrail_lm)
## 
## Call:
## lm(formula = volume ~ hightemp + precip, data = RailTrail)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -271.311  -56.545    5.915   48.962  296.453 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -31.5197    55.2383  -0.571  0.56973    
## hightemp       6.1177     0.7941   7.704 1.97e-11 ***
## precip      -153.2608    39.3071  -3.899  0.00019 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 96.68 on 87 degrees of freedom
## Multiple R-squared:  0.4377, Adjusted R-squared:  0.4247 
## F-statistic: 33.85 on 2 and 87 DF,  p-value: 1.334e-11

Q2 Is the coefficient of hightemp statistically significant at 5%?

Yes. The p-value for hightemp (1.97e-11) is far below 0.05, so the coefficient is statistically significant at the 5% level.
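The same check can be done in code instead of by reading the printed table. The following is a minimal sketch, assuming the mosaicData package is installed as in Q1; the object names `coef_table` and `p_values` are introduced here for illustration only.

```r
# Assumes the mosaicData package is installed, as in Q1.
data(RailTrail, package = "mosaicData")
railtrail_lm <- lm(volume ~ hightemp + precip, data = RailTrail)

# coef(summary(...)) returns the coefficient table as a numeric matrix
coef_table <- coef(summary(railtrail_lm))
p_values <- coef_table[, "Pr(>|t|)"]

# TRUE where a coefficient is significant at the 5% level
p_values < 0.05
```

The same vector also covers the intercept, so it can be reused for Q4.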

Q3 Interpret the coefficient of hightemp?

Holding precip constant, the predicted number of trail users increases by about 6.12 for each one-degree Fahrenheit increase in the daily high temperature.
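One way to see what the coefficient means is to compare predictions for two days that differ only in hightemp. This is a sketch under stated assumptions: the temperatures 70 and 71 and the data frame `new_days` are hypothetical values chosen for illustration, not observations from the data.

```r
# Assumes the mosaicData package is installed, as in Q1.
data(RailTrail, package = "mosaicData")
railtrail_lm <- lm(volume ~ hightemp + precip, data = RailTrail)

# Two hypothetical days that differ only by 1 degree F in high temperature
new_days <- data.frame(hightemp = c(70, 71), precip = c(0, 0))
predicted <- predict(railtrail_lm, newdata = new_days)

# The gap between the two predictions equals the hightemp coefficient
diff(predicted)
coef(railtrail_lm)["hightemp"]
```

Because the model is linear, the gap is the same for any pair of temperatures one degree apart, as long as precip is held fixed.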

Q4 Is the intercept statistically significant at 5%?

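No. The p-value for the intercept (0.56973) is larger than 0.05, so the intercept is not statistically significant at the 5% level. The estimate of about -31 users is also not a possible count, but that concerns interpretation rather than significance.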

Q5 Interpret the intercept?

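The intercept is -31.52. It is the predicted number of trail users on a day with a high temperature of 0 degrees Fahrenheit and no precipitation. A negative count is impossible, so the intercept has no practical meaning here; it only anchors the fitted model.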
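For Q5, the intercept can be reproduced as the prediction at hightemp = 0 and precip = 0. The sketch below assumes the mosaicData package is installed; `zero_day` is a hypothetical input created for illustration. Printing the observed temperature range helps judge how far such a day lies from the data.

```r
# Assumes the mosaicData package is installed, as in Q1.
data(RailTrail, package = "mosaicData")
railtrail_lm <- lm(volume ~ hightemp + precip, data = RailTrail)

# The prediction at hightemp = 0 and precip = 0 reproduces the intercept
zero_day <- data.frame(hightemp = 0, precip = 0)
intercept_prediction <- predict(railtrail_lm, newdata = zero_day)
intercept_prediction

# Observed temperature range, to judge how far 0 lies from the data
range(RailTrail$hightemp)
```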

Q6 Interpret the reported residual standard error.

The residual standard error is 96.68. The typical difference between the actual trail volume and the volume predicted by the model is about 97 users. In other words, the model's predicted volume misses the actual volume by about 97 users on average.

Q7 Interpret the reported adjusted R squared.

The reported adjusted R^2 of the model is 0.4247. It means that about 42% of the variability in trail volume can be explained by the model, after adjusting for the number of predictors.
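The residual standard error (Q6) and adjusted R squared (Q7) can both be recomputed from their standard definitions and compared with what summary() reports. This is a minimal sketch, assuming the mosaicData package is installed; the names `rse_manual` and `adj_r2_manual` are introduced here for illustration.

```r
# Assumes the mosaicData package is installed, as in Q1.
data(RailTrail, package = "mosaicData")
railtrail_lm <- lm(volume ~ hightemp + precip, data = RailTrail)
model_summary <- summary(railtrail_lm)

# Residual standard error: sqrt(residual sum of squares / residual df)
rss <- sum(residuals(railtrail_lm)^2)
rse_manual <- sqrt(rss / df.residual(railtrail_lm))
rse_manual
model_summary$sigma          # value reported by summary()

# Adjusted R squared penalizes R squared for the number of predictors
n <- nobs(railtrail_lm)      # observations used in the fit
p <- 2                       # predictors: hightemp and precip
r2 <- model_summary$r.squared
adj_r2_manual <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
adj_r2_manual
model_summary$adj.r.squared  # value reported by summary()
```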

Q8 Hide the messages, but display the code and its results on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
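The three options named in the hint can be set per chunk in the chunk header, or once for the whole document. The sketch below shows the document-wide form using `knitr::opts_chunk$set()`. It assumes the page is rendered with knitr and that the call sits in a setup chunk near the top of the file; the values are one plausible choice for this question, not the only one.

```r
# Document-wide chunk defaults, typically placed in a setup chunk.
# Assumes the document is rendered with knitr (the usual R Markdown engine).
knitr::opts_chunk$set(
  message = FALSE,      # hide messages such as package start-up notes
  echo = TRUE,          # display the code
  results = "markup"    # display the results as regular output
)
```

The same options can instead be written in an individual chunk header, which overrides the defaults for that chunk only.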

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.