We used the data from https://www.lock5stat.com/datapage3e.html
These questions are to be addressed using the data provided above:
Use the data only for California. How much does the size of a home influence its price?
Use the data only for California. How does the number of bedrooms of a home influence its price?
Use the data only for California. How does the number of bathrooms of a home influence its price?
Use the data only for California. How do the size, the number of bedrooms, and the number of bathrooms of a home jointly influence its price?
Are there significant differences in home prices among the four states (CA, NY, NJ, PA)? This will help you determine if the state in which a home is located has a significant impact on its price. All data should be used.
We will use different statistical methods to analyze our data.
home <- read.csv("https://www.lock5stat.com/datasets3e/HomesForSale.csv")
head(home)
## State Price Size Beds Baths
## 1 CA 533 1589 3 2.5
## 2 CA 610 2008 3 2.0
## 3 CA 899 2380 5 3.0
## 4 CA 929 1868 3 3.0
## 5 CA 210 1360 2 2.0
## 6 CA 268 2131 3 2.0
CAdata = subset(home, State == "CA")
lm(Price ~ Size, data = CAdata)
##
## Call:
## lm(formula = Price ~ Size, data = CAdata)
##
## Coefficients:
## (Intercept) Size
## -56.8167 0.3392
Based on the regression model, each additional square foot is associated with an estimated increase of about 0.34 in Price. Because Price is recorded in thousands of dollars, that is roughly 340 dollars per square foot.
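The conversion from the fitted slope to dollars can be sketched as follows. This is a minimal illustration that hard-codes the coefficients printed above and assumes Price is recorded in thousands of dollars (as the roughly 340 dollar figure implies); the helper name `predict_price` and the 2000 square foot example are hypothetical and not part of the original analysis.

```r
# Coefficients copied from the lm(Price ~ Size) output above
intercept <- -56.8167
slope <- 0.3392

# Assumption: Price is in thousands of dollars, so multiply by 1000
dollars_per_sqft <- slope * 1000
print(dollars_per_sqft)

# Hypothetical helper: fitted price (in thousands of dollars) for a given size
predict_price <- function(size_sqft) {
  intercept + slope * size_sqft
}

# Fitted asking price for an illustrative 2000 square foot home
print(predict_price(2000))
```

The negative intercept has no practical meaning here, since no home has a size of zero square feet; the line is only meant to describe homes within the range of sizes in the data.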
CA = subset(home, State == "CA")
lm(Price ~ Beds, data = CA)
##
## Call:
## lm(formula = Price ~ Beds, data = CA)
##
## Coefficients:
## (Intercept) Beds
## 269.76 84.77
model1 <- lm(Price ~ Beds, data = CA)
summary(model1)
##
## Call:
## lm(formula = Price ~ Beds, data = CA)
##
## Residuals:
## Min 1Q Median 3Q Max
## -413.83 -236.62 29.94 197.69 570.94
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 269.76 233.62 1.155 0.258
## Beds 84.77 72.91 1.163 0.255
##
## Residual standard error: 267.6 on 28 degrees of freedom
## Multiple R-squared: 0.04605, Adjusted R-squared: 0.01198
## F-statistic: 1.352 on 1 and 28 DF, p-value: 0.2548
The number of bedrooms does not have a statistically significant effect on price in the California data: the slope p-value (0.255) is well above 0.05, and bedrooms explain under 5% of the variation in price (R-squared = 0.04605).
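As a cross-check, the t statistic, p-value, and an approximate 95% confidence interval for the Beds slope can be rebuilt from the estimate and standard error in the summary above. This is a sketch using the rounded values from the printed output, so the results match the report only up to rounding; the variable names are illustrative.

```r
# Values copied from summary(model1) above
estimate <- 84.77
std_error <- 72.91
df_resid <- 28

# t statistic and two-sided p-value for H0: slope = 0
t_value <- estimate / std_error
p_value <- 2 * pt(-abs(t_value), df = df_resid)
print(c(t = t_value, p = p_value))

# 95% confidence interval for the slope
t_crit <- qt(0.975, df = df_resid)
ci <- c(lower = estimate - t_crit * std_error,
        upper = estimate + t_crit * std_error)
print(ci)
```

The interval runs from roughly -65 to 234, so it contains zero, which agrees with the conclusion that the bedroom effect is not statistically significant.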
CA = subset(home, State == "CA")
lm(Price ~ Baths, data = CA)
##
## Call:
## lm(formula = Price ~ Baths, data = CA)
##
## Coefficients:
## (Intercept) Baths
## 90.71 194.74
model2 <- lm(Price ~ Baths, data = CA)
summary(model2)
##
## Call:
## lm(formula = Price ~ Baths, data = CA)
##
## Residuals:
## Min 1Q Median 3Q Max
## -374.93 -181.56 -2.74 152.31 614.81
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 90.71 148.57 0.611 0.54641
## Baths 194.74 62.28 3.127 0.00409 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 235.8 on 28 degrees of freedom
## Multiple R-squared: 0.2588, Adjusted R-squared: 0.2324
## F-statistic: 9.779 on 1 and 28 DF, p-value: 0.004092
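The number of bathrooms does have a statistically significant effect on price in California: the slope p-value (0.00409) is well below 0.05, and each additional bathroom is associated with an estimated increase of about 195 in Price (recorded in thousands of dollars).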
CA = subset(home, State == "CA")
lm(Price ~ Size + Beds + Baths, data = CA)
##
## Call:
## lm(formula = Price ~ Size + Beds + Baths, data = CA)
##
## Coefficients:
## (Intercept) Size Beds Baths
## -41.5608 0.2811 -33.7036 83.9844
model3 <- lm(Price ~ Size + Beds + Baths, data = CA)
summary(model3)
##
## Call:
## lm(formula = Price ~ Size + Beds + Baths, data = CA)
##
## Residuals:
## Min 1Q Median 3Q Max
## -415.47 -130.32 19.64 154.79 384.94
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -41.5608 210.3809 -0.198 0.8449
## Size 0.2811 0.1189 2.364 0.0259 *
## Beds -33.7036 67.9255 -0.496 0.6239
## Baths 83.9844 76.7530 1.094 0.2839
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 221.8 on 26 degrees of freedom
## Multiple R-squared: 0.3912, Adjusted R-squared: 0.3209
## F-statistic: 5.568 on 3 and 26 DF, p-value: 0.004353
Because the overall F-test p-value (0.004353) is well below 0.05, size, bedrooms, and bathrooms jointly have a statistically significant relationship with the price of a home. However, after adjusting for the other predictors, neither Beds (p = 0.6239) nor Baths (p = 0.2839) is individually significant. Only Size (p = 0.0259) remains significant at the 0.05 level, and it has by far the smallest p-value of the three.
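The overall F-test p-value and the adjusted R-squared can be reproduced from the numbers in the summary above. This is a sketch based on the rounded printed values, and it assumes n = 30 California homes (26 residual degrees of freedom plus 4 estimated coefficients).

```r
# Values copied from summary(model3) above
f_stat <- 5.568
df_model <- 3
df_resid <- 26
r_squared <- 0.3912

# Upper-tail probability of the F distribution gives the overall p-value
p_overall <- pf(f_stat, df_model, df_resid, lower.tail = FALSE)
print(p_overall)

# Adjusted R-squared penalizes the fit for the number of predictors
n <- df_resid + df_model + 1
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / df_resid
print(adj_r_squared)
```

The joint test asks whether at least one slope is nonzero, while each t-test asks whether a predictor adds information beyond the other two, which is why the overall test can be significant even when two of the three individual tests are not.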
aov(Price ~ State, data = home)
## Call:
## aov(formula = Price ~ State, data = home)
##
## Terms:
## State Residuals
## Sum of Squares 1198169 6299266
## Deg. of Freedom 3 116
##
## Residual standard error: 233.0322
## Estimated effects may be unbalanced
model4 <- aov(Price ~ State, data = home)
summary(model4)
## Df Sum Sq Mean Sq F value Pr(>F)
## State 3 1198169 399390 7.355 0.000148 ***
## Residuals 116 6299266 54304
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Since the p-value is 0.000148, far below 0.05, the ANOVA provides strong evidence that the mean home price differs for at least one of the four states. The test by itself does not identify which states differ.
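The entries of the ANOVA table can be cross-checked by hand, and the sums of squares also give a rough effect size. The sketch below uses only the numbers printed above; eta-squared (the share of total variation in Price attributable to State) is an addition for illustration and was not computed in the original analysis.

```r
# Values copied from summary(model4) above
ss_state <- 1198169
ss_resid <- 6299266
df_state <- 3
df_resid <- 116

# Mean squares and the F statistic
ms_state <- ss_state / df_state
ms_resid <- ss_resid / df_resid
f_stat <- ms_state / ms_resid
p_value <- pf(f_stat, df_state, df_resid, lower.tail = FALSE)
print(c(F = f_stat, p = p_value))

# Eta-squared: proportion of total variation explained by State
eta_squared <- ss_state / (ss_state + ss_resid)
print(eta_squared)
```

Under these numbers, State accounts for about 16% of the variation in asking price, so the difference between states is statistically clear but leaves most of the variation unexplained.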
In conclusion, we addressed all five questions, used p-values to assess whether each factor has a statistically significant relationship with home price, and learned more about Posit.