Load the mosaic package first:
library(mosaic)
In this exercise you will study the data described in Agresti EXAMPLE 9.10.
You are studying house sales in Gainesville, Florida, where among other things the data contain the selling price (Price), property taxes (Taxes) and house size (Size).
HousePrices <- read.table("http://asta.math.aau.dk/dan/static/datasets?file=HousePrice.dat", header=TRUE)
head(HousePrices)
##   Taxes  Price Size
## 1  3104 279900 2048
## 2  1173 146500  912
## 3  3076 237700 1654
## 4  1608 200000 2068
## 5  1454 159900 1477
## 6  2997 499900 3153
plot(HousePrices)
Test for correlation between Taxes and Size.
cor.test(~ Size + Taxes, data = HousePrices)
##
## Pearson's product-moment correlation
##
## data: Size and Taxes
## t = 14.119, df = 98, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.7416554 0.8745614
## sample estimates:
## cor
## 0.8187958
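The t statistic in this output can be recovered from the sample correlation alone. The following is a minimal sketch that assumes only the printed values (cor = 0.8187958, df = 98); the names r, df, t_stat and p_value are illustrative and not part of the exercise.

```r
# Values copied from the cor.test output above
r  <- 0.8187958   # sample correlation between Size and Taxes
df <- 98          # degrees of freedom, n - 2

# Test statistic for H0: the true correlation is 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with df degrees of freedom
p_value <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

round(t_stat, 3)
p_value < 2.2e-16
```

Because the p-value is far below any usual significance level, the hypothesis of no correlation between Size and Taxes is rejected.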
Fit a multiple regression model of Price with Taxes and Size as predictors.
model <- lm(Price ~ Taxes + Size, data = HousePrices)
summary(model)
##
## Call:
## lm(formula = Price ~ Taxes + Size, data = HousePrices)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -188027  -26138     347   22944  200114
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -28608.744  13519.096  -2.116   0.0369 *
## Taxes           39.601      6.917   5.725 1.16e-07 ***
## Size            66.512     12.817   5.189 1.16e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 48830 on 97 degrees of freedom
## Multiple R-squared: 0.7722, Adjusted R-squared: 0.7675
## F-statistic: 164.4 on 2 and 97 DF, p-value: < 2.2e-16
Here model is the fitted multiple regression model. An explanation of its summary should, as a minimum, cover the t value and the determination and interpretation of the p-value. Each t value is the estimate divided by its standard error:
tval1 = -28608.7 / 13519.1
tval2 = 39.6 / 6.9
tval3 = 66.5 / 12.8
tval1
## [1] -2.116169
tval2
## [1] 5.73913
tval3
## [1] 5.195312
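The t values computed by hand can be turned into p-values by looking them up in a t distribution with the residual degrees of freedom (97, from the summary). This is a sketch using base R only; the names tvals, df_resid and pvals are illustrative.

```r
# t values as computed by hand above (rounded estimate / rounded standard error)
tvals <- c(Intercept = -28608.7 / 13519.1,
           Taxes     = 39.6 / 6.9,
           Size      = 66.5 / 12.8)

# Residual degrees of freedom reported by summary(model)
df_resid <- 97

# Two-sided p-value: probability of a |t| at least this large when the
# true coefficient is 0
pvals <- 2 * pt(abs(tvals), df = df_resid, lower.tail = FALSE)
signif(pvals, 3)
```

The results agree with the Pr(>|t|) column up to the rounding of the inputs: all three p-values are below 0.05, so each coefficient differs significantly from 0 at the 5% level.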
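(8) Interpretation of Multiple R-squared: R^2 = (TSS - SSE) / TSS, where TSS is the total sum of squares of Price and SSE is the sum of squared residuals. It is the proportion of the total variation in Price that the model explains. Here R^2 = 0.7722, so about 77% of the variation in Price is explained by Taxes and Size.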
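The Multiple R-squared also determines the other two numbers on the last lines of the summary. As a sketch, assuming n = 100 observations (97 residual degrees of freedom plus 3 estimated coefficients) and k = 2 predictors, the adjusted R-squared and the F-statistic can be reproduced from R^2 = 0.7722; the variable names are illustrative.

```r
# Values read off summary(model)
r2 <- 0.7722   # Multiple R-squared
n  <- 100      # observations: 97 residual df + 3 coefficients
k  <- 2        # predictors: Taxes and Size

# Adjusted R-squared penalises for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Overall F-statistic: explained variation per predictor divided by
# unexplained variation per residual degree of freedom
f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))

round(adj_r2, 4)
round(f_stat, 1)
```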
95% confidence intervals computed by hand as est ± t*se, for comparison with confint:
t = qt(0.025, df = 97, lower.tail = FALSE)
-28608.7 + (13519.1)*(t)
## [1] -1777.029
-28608.7 - (13519.1)*(t)
## [1] -55440.37
39.601 + (6.9)*(t)
## [1] 53.29559
39.601 - (6.9)*(t)
## [1] 25.90641
66.5 + (12.8)*(t)
## [1] 91.90446
66.5 - (12.8)*(t)
## [1] 41.09554
confint(model)
##                    2.5 %      97.5 %
## (Intercept) -55440.40818 -1777.08054
## Taxes           25.87192    53.32920
## Size            41.07304    91.95066
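The hand calculations differ slightly from confint because they use rounded estimates and standard errors. The sketch below wraps the same est ± t*se formula in a small helper, here called ci_by_hand (a hypothetical name, not a mosaic or base R function), and feeds it the coefficients at the precision printed by summary(model).

```r
# Confidence interval est +/- t * se for a vector of coefficients
ci_by_hand <- function(est, se, df, level = 0.95) {
  # Critical value leaving (1 - level) / 2 in each tail
  t_crit <- qt(1 - (1 - level) / 2, df = df)
  out <- cbind(lower = est - t_crit * se,
               upper = est + t_crit * se)
  rownames(out) <- names(est)
  out
}

# Estimates and standard errors copied from summary(model)
est <- c("(Intercept)" = -28608.744, Taxes = 39.601, Size = 66.512)
se  <- c(13519.096, 6.917, 12.817)

ci <- ci_by_hand(est, se, df = 97)
ci
```

With the less heavily rounded inputs, the limits match the confint output to about two decimal places.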
Fit a model with interaction between the effect of Taxes and the effect of Size as predictors of Price.
model2 <- lm(Price ~ Taxes * Size, data = HousePrices)
summary(model2)
##
## Call:
## lm(formula = Price ~ Taxes * Size, data = HousePrices)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -202902  -23642    -224   20081  213409
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.396e+04  2.450e+04   0.978   0.3305
## Taxes       1.991e+01  1.026e+01   1.941   0.0551 .
## Size        3.329e+01  1.806e+01   1.844   0.0683 .
## Taxes:Size  1.036e-02  4.072e-03   2.544   0.0126 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 47510 on 96 degrees of freedom
## Multiple R-squared: 0.7866, Adjusted R-squared: 0.7799
## F-statistic: 117.9 on 3 and 96 DF, p-value: < 2.2e-16
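In the interaction model the effect of Size on Price is no longer a single number: the estimated slope for Size is the Size coefficient plus the Taxes:Size coefficient times Taxes. The sketch below evaluates that slope from the printed estimates; the function name size_slope and the Taxes values 1000, 2000 and 3000 are chosen here for illustration only.

```r
# Estimates copied from summary(model2)
b_size        <- 3.329e+01   # Size
b_interaction <- 1.036e-02   # Taxes:Size

# Estimated change in Price per extra unit of Size, at a given level of Taxes
size_slope <- function(taxes) {
  b_size + b_interaction * taxes
}

size_slope(c(1000, 2000, 3000))
```

The slope for Size grows with Taxes, which is what the positive and significant Taxes:Size coefficient (p = 0.0126) expresses.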