library(tidyverse)
library(scales)
data(SaratogaHouses, package="mosaicData")
houses_lm <- lm(price ~ lotSize + age + landValue +
                  livingArea + bedrooms + bathrooms +
                  waterfront,
                data = SaratogaHouses)
# View summary of model 1
summary(houses_lm)
##
## Call:
## lm(formula = price ~ lotSize + age + landValue + livingArea +
## bedrooms + bathrooms + waterfront, data = SaratogaHouses)
##
## Residuals:
## Min 1Q Median 3Q Max
## -220208 -35416 -5443 27570 464320
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.399e+05 1.647e+04 8.491 < 2e-16 ***
## lotSize 7.501e+03 2.075e+03 3.615 0.000309 ***
## age -1.360e+02 5.416e+01 -2.512 0.012099 *
## landValue 9.093e-01 4.583e-02 19.841 < 2e-16 ***
## livingArea 7.518e+01 4.158e+00 18.080 < 2e-16 ***
## bedrooms -5.767e+03 2.388e+03 -2.414 0.015863 *
## bathrooms 2.455e+04 3.332e+03 7.366 2.71e-13 ***
## waterfrontNo -1.207e+05 1.560e+04 -7.738 1.70e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 59370 on 1720 degrees of freedom
## Multiple R-squared: 0.6378, Adjusted R-squared: 0.6363
## F-statistic: 432.6 on 7 and 1720 DF, p-value: < 2.2e-16
Interpretation
The stars on the Intercept line, for example, indicate that the coefficient is significant at the 0.1% significance level (its p-value is below 0.001). It means there is strong evidence that the intercept differs from zero. On the other hand, the variable age has only one star. Its coefficient is significant at the 5% level but not at the 1% level, so the evidence that age is meaningful in explaining home prices is weaker. If a variable had no star (and no dot), its p-value would be above 0.1, and we could not conclude that changes in that variable are meaningful in explaining changes in home prices.
The coefficient of living area is 75.18. It means that an increase of one square foot of living area is associated with a home price increase of about $75, holding the other variables constant. When interpreting coefficients, make sure to check the units of the variables in the data.
The intercept is the predicted home price when the predictors are zero (for example, living area = 0). Of course, living area can't be zero. Often, the intercept has no meaningful interpretation.
Build a regression model to predict volume using hightemp and precip.
Hint: The variables are available in the RailTrail data set from the mosaicData package.
data(RailTrail, package="mosaicData")
railtrail_lm <- lm(volume ~ hightemp + precip,
                   data = RailTrail)
# View summary of model 1
summary(railtrail_lm)
##
## Call:
## lm(formula = volume ~ hightemp + precip, data = RailTrail)
##
## Residuals:
## Min 1Q Median 3Q Max
## -271.311 -56.545 5.915 48.962 296.453
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -31.5197 55.2383 -0.571 0.56973
## hightemp 6.1177 0.7941 7.704 1.97e-11 ***
## precip -153.2608 39.3071 -3.899 0.00019 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 96.68 on 87 degrees of freedom
## Multiple R-squared: 0.4377, Adjusted R-squared: 0.4247
## F-statistic: 33.85 on 2 and 87 DF, p-value: 1.334e-11
Is the coefficient of hightemp statistically significant at 5%?
Yes. It is significant at the 5% level because its p-value (1.97e-11) is smaller than 0.05.
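The p-value for hightemp can be reproduced from its estimate and standard error. The following is a minimal sketch that only uses numbers copied from the summary output above (estimate 6.1177, standard error 0.7941, 87 residual degrees of freedom); the variable names are illustrative and not part of the assignment.

```r
# Values copied from the summary(railtrail_lm) output above
estimate  <- 6.1177   # coefficient of hightemp
std_error <- 0.7941   # standard error of that coefficient
df_resid  <- 87       # residual degrees of freedom

# t statistic: how many standard errors the estimate is away from zero
t_value <- estimate / std_error

# Two-sided p-value from the t distribution
p_value <- 2 * pt(abs(t_value), df = df_resid, lower.tail = FALSE)

t_value          # should be close to the reported 7.704
p_value          # should be close to the reported 1.97e-11
p_value < 0.05   # TRUE means significant at the 5% level
```

Small differences from the printed table are expected because the inputs are rounded.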
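How should the coefficient of hightemp be interpreted?
The coefficient of hightemp is 6.12. It means that a one-degree Fahrenheit increase in the high temperature is associated with about 6 more trail users, holding precip constant.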
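One way to see what "per degree, holding precip constant" means is to compare two predictions that differ only in hightemp. This is a rough sketch using the coefficients copied from the output above; predict_volume is a hypothetical helper written for this illustration, and the temperatures 70 and 71 are arbitrary example values.

```r
# Coefficients copied from the summary(railtrail_lm) output above
b0 <- -31.5197    # intercept
b1 <- 6.1177      # hightemp
b2 <- -153.2608   # precip

# Hypothetical helper: predicted volume for a given hightemp and precip
predict_volume <- function(hightemp, precip) {
  b0 + b1 * hightemp + b2 * precip
}

# Two days with the same precipitation, one degree Fahrenheit apart
day_70 <- predict_volume(hightemp = 70, precip = 0)
day_71 <- predict_volume(hightemp = 71, precip = 0)

# The difference equals the hightemp coefficient (about 6 trail users)
day_71 - day_70
```

The same difference appears for any starting temperature and any fixed value of precip, because the model is linear in both predictors.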
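No. The intercept is not statistically significant at 5% because its p-value (0.57) is larger than 0.05. Its estimate is about -31.5 trail users when hightemp and precip are both zero, which is not possible, so the intercept has no meaningful interpretation here.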
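The conclusion about the intercept can also be checked with a 95% confidence interval: if the interval contains zero, the coefficient is not significant at the 5% level. This is a sketch based only on the numbers printed in the output above (estimate -31.5197, standard error 55.2383, 87 residual degrees of freedom), with illustrative variable names.

```r
# Values copied from the (Intercept) row of the summary output above
estimate  <- -31.5197
std_error <- 55.2383
df_resid  <- 87

# Critical t value for a two-sided 95% interval
t_crit <- qt(0.975, df = df_resid)

# Confidence interval: estimate plus or minus t_crit standard errors
ci <- c(lower = estimate - t_crit * std_error,
        upper = estimate + t_crit * std_error)
ci

# TRUE here means zero lies inside the interval
ci[["lower"]] < 0 && ci[["upper"]] > 0
```

With the fitted model object available, confint(railtrail_lm) gives the intervals for all coefficients at once.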
The reported R^2 of the model is 0.4377. It means that about 44% of the variability in trail volume can be explained by the model.
The reported residual standard error is 96.68. It is the typical difference between the actual trail volume and the volume predicted by the model. In other words, the model's predicted volume misses the actual volume by about 97 trail users on average.
## Q7 Interpret the reported adjusted R squared.
The reported adjusted R^2 of the model is 0.4247. It means that about 42% of the variability in trail volume can be explained by the model, after adjusting for the number of predictors.
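Adjusted R squared can be recomputed from the multiple R squared with the standard formula 1 - (1 - R^2)(n - 1)/(n - p - 1). The sketch below assumes the numbers from the output above: multiple R squared 0.4377, two predictors, and 90 observations (87 residual degrees of freedom plus 3 estimated coefficients). The variable names are illustrative.

```r
# Values taken from the summary(railtrail_lm) output above
r_squared <- 0.4377   # multiple R-squared
n_pred    <- 2        # predictors: hightemp and precip
n_obs     <- 87 + 3   # residual df plus the 3 estimated coefficients

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r_squared <- 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_pred - 1)

adj_r_squared   # should be close to the reported 0.4247
```

The result may differ in the last digit from the printed value because the multiple R squared used here is already rounded.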
Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.