We used the data from https://www.lock5stat.com/datapage3e.html
We will explore the following questions.
Use the data only for California. How much does the size of a home influence its price?
Use the data only for California. How does the number of bedrooms of a home influence its price?
Use the data only for California. How does the number of bathrooms of a home influence its price?
Use the data only for California. How do the size, the number of bedrooms, and the number of bathrooms of a home jointly influence its price?
Are there significant differences in home prices among the four states (CA, NY, NJ, PA)? This will help you determine if the state in which a home is located has a significant impact on its price. All data should be used.
Here we will explore the questions in detail.
Home = read.csv("https://www.lock5stat.com/datasets3e/HomesForSale.csv")
head(Home)
##   State Price Size Beds Baths
## 1    CA   533 1589    3   2.5
## 2    CA   610 2008    3   2.0
## 3    CA   899 2380    5   3.0
## 4    CA   929 1868    3   3.0
## 5    CA   210 1360    2   2.0
## 6    CA   268 2131    3   2.0
HomeCA = read.csv("https://www.lock5stat.com/datasets3e/HomesForSaleCA.csv")
head(HomeCA)
##   State Price Size Beds Baths
## 1    CA   533 1589    3   2.5
## 2    CA   610 2008    3   2.0
## 3    CA   899 2380    5   3.0
## 4    CA   929 1868    3   3.0
## 5    CA   210 1360    2   2.0
## 6    CA   268 2131    3   2.0
model1=lm(HomeCA$Price ~ HomeCA$Size, data=HomeCA)
plot(model1, 2, main="Size of California Homes and Their Price")
The fitted slope for Size is positive, so larger homes in California
tend to cost more. Note that plot(model1, 2) draws a normal Q-Q plot of
the residuals, which checks the normality assumption rather than showing
how price changes with size. The slope estimate and its p-value would be
reported by summary(model1), which is not shown here.
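As a minimal sketch of how the size and price relationship could be displayed directly, the example below draws a scatterplot with the fitted least-squares line. It uses only the six California rows printed by head(HomeCA) above, stored in a hypothetical data frame named HomeCA_head, so its coefficients are illustrative and will not match a fit on the full data set.

```r
# Illustrative data: ONLY the six rows shown by head(HomeCA) above.
# HomeCA_head and fit_head are hypothetical names used for this sketch.
HomeCA_head <- data.frame(
  Price = c(533, 610, 899, 929, 210, 268),
  Size  = c(1589, 2008, 2380, 1868, 1360, 2131)
)

# Simple linear regression of price on size
fit_head <- lm(Price ~ Size, data = HomeCA_head)

# Scatterplot of the raw data with the fitted line overlaid
plot(Price ~ Size, data = HomeCA_head,
     main = "Price vs. Size (first six CA homes)",
     xlab = "Size", ylab = "Price")
abline(fit_head, col = "blue")

# Intercept and slope of the fitted line
coef(fit_head)
```

With the full data, the same calls would use HomeCA in place of HomeCA_head.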
model2=lm(HomeCA$Price ~ HomeCA$Beds, data=HomeCA)
plot(model2, 2, main="Number of Bedrooms in California Homes and Their Price")
summary(model2)
##
## Call:
## lm(formula = HomeCA$Price ~ HomeCA$Beds, data = HomeCA)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -413.83 -236.62   29.94  197.69  570.94
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   269.76     233.62   1.155    0.258
## HomeCA$Beds    84.77      72.91   1.163    0.255
##
## Residual standard error: 267.6 on 28 degrees of freedom
## Multiple R-squared: 0.04605, Adjusted R-squared: 0.01198
## F-statistic: 1.352 on 1 and 28 DF, p-value: 0.2548
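The Q-Q plot checks whether the residuals are approximately normal; points that follow the reference line support that assumption. It does not show the effect of bedrooms on price. That effect is given by the slope: each additional bedroom is associated with an estimated increase of 84.77 in price.
A p-value below 0.05 indicates statistically significant evidence of a linear relationship; it does not measure how strong the relationship is. Here the p-value is 0.2548, so there is no significant evidence that the number of bedrooms is related to price in California homes. Bedrooms explain only about 4.6% of the variation in price (R-squared = 0.04605).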
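The t value and p-value in the coefficient table follow from the estimate and its standard error. The sketch below recomputes them from the rounded numbers printed by summary(model2); the variable names are chosen for illustration, and the results should agree with the reported values only up to rounding.

```r
# Values copied from the summary(model2) output above
estimate  <- 84.77   # slope for Beds
std_error <- 72.91   # standard error of the slope
df_resid  <- 28      # residual degrees of freedom

# t statistic: estimate divided by its standard error
t_value <- estimate / std_error

# Two-sided p-value from the t distribution with 28 degrees of freedom
p_value <- 2 * pt(-abs(t_value), df = df_resid)

round(c(t = t_value, p = p_value), 3)
```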
model3=lm(HomeCA$Price ~ HomeCA$Baths, data=HomeCA)
plot(model3, 2, main="Number of Bathrooms in California Homes and Their Price")
summary(model3)
##
## Call:
## lm(formula = HomeCA$Price ~ HomeCA$Baths, data = HomeCA)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -374.93 -181.56   -2.74  152.31  614.81
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)     90.71     148.57   0.611  0.54641
## HomeCA$Baths   194.74      62.28   3.127  0.00409 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 235.8 on 28 degrees of freedom
## Multiple R-squared: 0.2588, Adjusted R-squared: 0.2324
## F-statistic: 9.779 on 1 and 28 DF, p-value: 0.004092
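The Q-Q plot assesses the normality of the residuals rather than the relationship itself. The slope is positive (an estimated 194.74 increase in price per additional bathroom) and its p-value is 0.004092, below 0.05, so the number of bathrooms is a statistically significant predictor of price in California homes. The relationship is moderate: bathrooms explain about 26% of the variation in price (R-squared = 0.2588).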
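To separate significance from strength, the sketch below computes a 95% confidence interval for the bathroom slope and the correlation implied by R-squared, using only the rounded values printed by summary(model3). The variable names are illustrative; with the fitted model available, confint(model3) would give the interval directly.

```r
# Values copied from the summary(model3) output above
estimate  <- 194.74   # slope for Baths
std_error <- 62.28    # standard error of the slope
df_resid  <- 28       # residual degrees of freedom
r_squared <- 0.2588   # multiple R-squared

# 95% confidence interval: estimate +/- critical t value * standard error
t_crit   <- qt(0.975, df = df_resid)
ci_slope <- estimate + c(-1, 1) * t_crit * std_error

# In simple regression the correlation is the square root of R-squared,
# taking the sign of the slope (positive here)
r <- sign(estimate) * sqrt(r_squared)

round(ci_slope, 2)
round(r, 3)
```

An interval that excludes zero is consistent with the p-value below 0.05, while a correlation of this size points to a moderate rather than a strong linear relationship.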
model4=lm(HomeCA$Price ~ HomeCA$Size + HomeCA$Beds +HomeCA$Baths, data=HomeCA)
plot(model4, 2, main="Size, Beds, and Baths of California Homes and Their Price")
summary(model4)
##
## Call:
## lm(formula = HomeCA$Price ~ HomeCA$Size + HomeCA$Beds + HomeCA$Baths,
## data = HomeCA)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -415.47 -130.32   19.64  154.79  384.94
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -41.5608   210.3809  -0.198   0.8449
## HomeCA$Size    0.2811     0.1189   2.364   0.0259 *
## HomeCA$Beds  -33.7036    67.9255  -0.496   0.6239
## HomeCA$Baths  83.9844    76.7530   1.094   0.2839
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 221.8 on 26 degrees of freedom
## Multiple R-squared: 0.3912, Adjusted R-squared: 0.3209
## F-statistic: 5.568 on 3 and 26 DF, p-value: 0.004353
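P_Size = 0.0259, P_Beds = 0.6239, P_Baths = 0.2839, P_overall (F-test) = 0.004353
Size is the only predictor that is statistically significant at the 0.05 level once the other two are in the model (p = 0.0259): each additional unit of size is associated with an estimated increase of 0.2811 in price, holding bedrooms and bathrooms fixed. Neither the number of bathrooms (p = 0.2839) nor the number of bedrooms (p = 0.6239) adds significant information beyond size. The overall F-test (p = 0.004353) shows that the three predictors jointly explain a significant share of the variation in price, about 39% (R-squared = 0.3912). P-values measure the strength of evidence, not the size of an effect, so they do not rank the predictors by importance.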
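The overall F-statistic and the adjusted R-squared can both be recovered from R-squared, the sample size, and the number of predictors. The sketch below does this with the rounded values printed by summary(model4); the variable names are illustrative, and n = 30 is inferred from the 26 residual degrees of freedom plus the four estimated coefficients.

```r
# Values copied from the summary(model4) output above
r_squared <- 0.3912   # multiple R-squared
k         <- 3        # number of predictors (Size, Beds, Baths)
df_resid  <- 26       # residual degrees of freedom
n         <- df_resid + k + 1   # inferred sample size (30)

# Overall F statistic: explained variation per predictor
# divided by unexplained variation per residual degree of freedom
f_value <- (r_squared / k) / ((1 - r_squared) / df_resid)

# Upper-tail p-value of the F distribution with (3, 26) degrees of freedom
p_overall <- pf(f_value, df1 = k, df2 = df_resid, lower.tail = FALSE)

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / df_resid

round(c(F = f_value, p = p_overall, adj_R2 = adj_r_squared), 4)
```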
model5=aov(Home$Price ~ Home$State , data=Home)
plot(model5, 1, main="Differences between home prices in CA, NY, NJ, PA")
summary(model5)
##              Df  Sum Sq Mean Sq F value   Pr(>F)
## Home$State    3 1198169  399390   7.355 0.000148 ***
## Residuals   116 6299266   54304
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
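There are significant differences in mean home prices among CA, NY, NJ, and PA. The ANOVA p-value is 0.000148, well below 0.05, so the mean price differs for at least one state. The test by itself does not identify which states differ from one another.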
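The F value in the ANOVA table is the ratio of the between-state mean square to the residual mean square. The sketch below recomputes it, together with its p-value, from the sums of squares and degrees of freedom printed by summary(model5); the variable names are illustrative.

```r
# Values copied from the summary(model5) output above
ss_state <- 1198169   # sum of squares between states
df_state <- 3         # four states minus one
ss_resid <- 6299266   # residual (within-state) sum of squares
df_resid <- 116       # residual degrees of freedom

# Mean squares: sum of squares divided by degrees of freedom
ms_state <- ss_state / df_state
ms_resid <- ss_resid / df_resid

# F statistic and its upper-tail p-value
f_value <- ms_state / ms_resid
p_value <- pf(f_value, df1 = df_state, df2 = df_resid, lower.tail = FALSE)

round(c(MS_state = ms_state, MS_resid = ms_resid, F = f_value), 3)
signif(p_value, 3)
```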