library(tidyverse)
library(scales)
options(scipen=999)
data(SaratogaHouses, package="mosaicData")
houses_lm <- lm(price ~ lotSize + age + landValue +
                  livingArea + bedrooms + bathrooms +
                  waterfront, 
                data = SaratogaHouses)

# View summary of model 1
summary(houses_lm)
## 
## Call:
## lm(formula = price ~ lotSize + age + landValue + livingArea + 
##     bedrooms + bathrooms + waterfront, data = SaratogaHouses)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -220208  -35416   -5443   27570  464320 
## 
## Coefficients:
##                   Estimate    Std. Error t value             Pr(>|t|)    
## (Intercept)   139878.80484   16472.92736   8.491 < 0.0000000000000002 ***
## lotSize         7500.79232    2075.13554   3.615             0.000309 ***
## age             -136.04011      54.15794  -2.512             0.012099 *  
## landValue          0.90931       0.04583  19.841 < 0.0000000000000002 ***
## livingArea        75.17866       4.15811  18.080 < 0.0000000000000002 ***
## bedrooms       -5766.75988    2388.43256  -2.414             0.015863 *  
## bathrooms      24547.10644    3332.26775   7.366    0.000000000000271 ***
## waterfrontNo -120726.62066   15600.82783  -7.738    0.000000000000017 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 59370 on 1720 degrees of freedom
## Multiple R-squared:  0.6378, Adjusted R-squared:  0.6363 
## F-statistic: 432.6 on 7 and 1720 DF,  p-value: < 0.00000000000000022

Interpretation

Q1 Build a regression model to predict the volume of trail users using hightemp and precip.

Hint: The variables are available in the RailTrail data set from the mosaicData package.

data(RailTrail, package="mosaicData")
railtrail_lm <- lm(volume ~ hightemp + precip, 
                data = RailTrail)

# View summary of the trail volume model
summary(railtrail_lm)
## 
## Call:
## lm(formula = volume ~ hightemp + precip, data = RailTrail)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -271.311  -56.545    5.915   48.962  296.453 
## 
## Coefficients:
##              Estimate Std. Error t value        Pr(>|t|)    
## (Intercept)  -31.5197    55.2383  -0.571         0.56973    
## hightemp       6.1177     0.7941   7.704 0.0000000000197 ***
## precip      -153.2608    39.3071  -3.899         0.00019 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 96.68 on 87 degrees of freedom
## Multiple R-squared:  0.4377, Adjusted R-squared:  0.4247 
## F-statistic: 33.85 on 2 and 87 DF,  p-value: 0.00000000001334

Q2 Is the coefficient of hightemp statistically significant at 5%?

Yes. The p-value for hightemp (0.0000000000197) is far smaller than 0.05, so the coefficient is statistically significant at the 5% level.
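
The same check can be made in code rather than by reading the printed table. This is a minimal sketch that refits the model from Q1; the names coef_table and p_hightemp are introduced here for illustration only.

```r
# Refit the Q1 model so this chunk runs on its own
data(RailTrail, package = "mosaicData")
railtrail_lm <- lm(volume ~ hightemp + precip, data = RailTrail)

# coef(summary(...)) returns the coefficient table as a numeric matrix
coef_table <- coef(summary(railtrail_lm))

# Pull the p-value for hightemp and compare it with the 5% level
p_hightemp <- coef_table["hightemp", "Pr(>|t|)"]
p_hightemp < 0.05
```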

Q3 Interpret the coefficient of hightemp.

Holding precipitation constant, the predicted number of trail users increases by about 6.1 for each one-degree Fahrenheit increase in the daily high temperature.
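
One way to see this interpretation is to predict volume for two days that differ only by one degree of high temperature. The sketch below assumes two hypothetical days (70 and 71 degrees, no precipitation); new_days and preds are illustrative names, not part of the assignment.

```r
# Refit the Q1 model so this chunk runs on its own
data(RailTrail, package = "mosaicData")
railtrail_lm <- lm(volume ~ hightemp + precip, data = RailTrail)

# Two hypothetical days: same precipitation, high temperature one degree apart
new_days <- data.frame(hightemp = c(70, 71), precip = c(0, 0))
preds <- predict(railtrail_lm, newdata = new_days)

# The difference in predictions equals the hightemp coefficient
diff(preds)
coef(railtrail_lm)["hightemp"]
```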

Q4 Is the intercept statistically significant at 5%?

No. The p-value for the intercept (0.56973) is larger than 0.05, so the intercept is not statistically significant at the 5% level.
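
An equivalent way to check this is with a 95% confidence interval: a coefficient that is not significant at the 5% level has an interval that contains zero. A minimal sketch, refitting the Q1 model, with ci_intercept as an illustrative name:

```r
# Refit the Q1 model so this chunk runs on its own
data(RailTrail, package = "mosaicData")
railtrail_lm <- lm(volume ~ hightemp + precip, data = RailTrail)

# 95% confidence interval for the intercept only (a 1 x 2 matrix)
ci_intercept <- confint(railtrail_lm, parm = "(Intercept)", level = 0.95)
ci_intercept

# TRUE when the interval spans zero, matching a p-value above 0.05
ci_intercept[1, 1] < 0 & ci_intercept[1, 2] > 0
```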

Q5 Interpret the intercept.

The intercept has no practical meaning. It is the predicted volume when hightemp and precip are both zero, about -31 users, and a negative number of people is impossible.

Q6 Interpret the reported residual standard error.

On average, the model's predictions miss the actual number of trail users by about 97 people.
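
The residual standard error can also be recomputed by hand, which shows where the number comes from: the square root of the residual sum of squares divided by the residual degrees of freedom. This is a sketch; rse_manual is an illustrative name.

```r
# Refit the Q1 model so this chunk runs on its own
data(RailTrail, package = "mosaicData")
railtrail_lm <- lm(volume ~ hightemp + precip, data = RailTrail)

# Residual standard error as reported by R
sigma(railtrail_lm)

# Same quantity by hand: sqrt(RSS / residual degrees of freedom)
rse_manual <- sqrt(sum(residuals(railtrail_lm)^2) / df.residual(railtrail_lm))
rse_manual
```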

Q7 Interpret the reported adjusted R squared.

About 42 percent of the variation in trail volume is explained by the model, after adjusting for the number of predictors.
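
To make the adjustment concrete, adjusted R squared can be rebuilt from the ordinary R squared, the sample size, and the number of predictors. A minimal sketch, with n, p, r2, and adj_r2_manual as illustrative names:

```r
# Refit the Q1 model so this chunk runs on its own
data(RailTrail, package = "mosaicData")
railtrail_lm <- lm(volume ~ hightemp + precip, data = RailTrail)

model_summary <- summary(railtrail_lm)
n  <- nobs(railtrail_lm)              # number of observations
p  <- length(coef(railtrail_lm)) - 1  # number of predictors, excluding the intercept
r2 <- model_summary$r.squared

# Adjusted R squared penalizes R squared for each added predictor
adj_r2_manual <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
adj_r2_manual
model_summary$adj.r.squared
```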

Q8 Hide the messages, but display the code and its results on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
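
As a sketch of one possible setup, the three options named in the hint can be set once for the whole document in a setup chunk with knitr::opts_chunk$set(). The values below are an assumption about what the question intends; the same options can instead be written in an individual chunk header.

```r
# Set default chunk options for every chunk in the document.
# Per-chunk equivalent: {r, message=FALSE, echo=TRUE, results='markup'}
knitr::opts_chunk$set(
  message = FALSE,    # hide messages, such as package startup messages
  echo    = TRUE,     # display the code
  results = "markup"  # display the results as regular output
)
```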

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.