Introduction

This report analyzes how home characteristics relate to asking price, using the HomesForSale dataset. It addresses the following five questions:

  1. Use the data only for California. How much does the size of a home influence its price?
  2. Use the data only for California. How does the number of bedrooms of a home influence its price?
  3. Use the data only for California. How does the number of bathrooms of a home influence its price?
  4. Use the data only for California. How do the size, the number of bedrooms, and the number of bathrooms of a home jointly influence its price?
  5. Use all of the data. Are there significant differences in home prices among the four states (CA, NY, NJ, PA)? This determines whether the state in which a home is located has a significant impact on its price.

Data

The HomesForSale dataset is a data frame with 120 observations on the following 5 variables:

  • State: Location of the home (CA, NJ, NY, or PA)
  • Price: Asking price (in $1,000s)
  • Size: Area of all rooms (in sq. ft., as the values in the output below show, e.g. 1589)
  • Beds: Number of bedrooms
  • Baths: Number of bathrooms

Analysis

Each question is analyzed below using R. The data are read directly from the Lock5 website, and the California homes are subset for Questions 1 through 4.

home = read.csv("https://www.lock5stat.com/datasets3e/HomesForSale.csv")
head(home)
##   State Price Size Beds Baths
## 1    CA   533 1589    3   2.5
## 2    CA   610 2008    3   2.0
## 3    CA   899 2380    5   3.0
## 4    CA   929 1868    3   3.0
## 5    CA   210 1360    2   2.0
## 6    CA   268 2131    3   2.0
# Keep only the California homes for Questions 1-4
CA = subset(home, State == "CA")

Q1: How much does the size of a home influence its price? (California)

model1 = lm(Price ~ Size, data = CA)
summary(model1)
## 
## Call:
## lm(formula = Price ~ Size, data = CA)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -462.55 -139.69   39.24  147.65  352.21 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -56.81675  154.68102  -0.367 0.716145    
## Size          0.33919    0.08558   3.963 0.000463 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 219.3 on 28 degrees of freedom
## Multiple R-squared:  0.3594, Adjusted R-squared:  0.3365 
## F-statistic: 15.71 on 1 and 28 DF,  p-value: 0.0004634
plot(CA$Size, CA$Price,
     main = "Home Size vs Price (California)",
     xlab = "Size (sq ft)",
     ylab = "Price ($1,000s)",
     pch = 19)

abline(model1, col = "red", lwd = 2)
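The slope in the output above is easier to read once it is converted to dollars and given a confidence interval. The following is a minimal sketch that uses only the estimate, standard error, and residual degrees of freedom printed by summary(model1); it assumes Size is recorded in square feet, as the head(home) output suggests. The variable names are illustrative and not part of the original analysis.

```r
# Values copied from the summary(model1) output above
slope    <- 0.33919   # change in Price ($1,000s) per unit of Size
se_slope <- 0.08558   # standard error of the slope
df_resid <- 28        # residual degrees of freedom

# 95% confidence interval for the slope: estimate +/- t* x SE
t_crit <- qt(0.975, df_resid)
ci     <- slope + c(-1, 1) * t_crit * se_slope

# Convert from $1,000s per sq. ft. to dollars per sq. ft.
# (assumes Size is in square feet)
dollars_per_sqft <- slope * 1000
ci_dollars       <- ci * 1000

round(c(estimate = dollars_per_sqft,
        lower    = ci_dollars[1],
        upper    = ci_dollars[2]), 1)
```

With the fitted model object available, confint(model1) should give the same interval on the original scale without copying numbers by hand.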

Q2: How does the number of bedrooms of a home influence its price? (California)

model2 = lm(Price ~ Beds, data = CA)
summary(model2)
## 
## Call:
## lm(formula = Price ~ Beds, data = CA)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -413.83 -236.62   29.94  197.69  570.94 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   269.76     233.62   1.155    0.258
## Beds           84.77      72.91   1.163    0.255
## 
## Residual standard error: 267.6 on 28 degrees of freedom
## Multiple R-squared:  0.04605,    Adjusted R-squared:  0.01198 
## F-statistic: 1.352 on 1 and 28 DF,  p-value: 0.2548
boxplot(Price ~ Beds, data = CA,
        main = "Bedrooms vs Price (California)",
        xlab = "Number of Bedrooms",
        ylab = "Price ($1,000s)")

Q3: How does the number of bathrooms of a home influence its price? (California)

model3 = lm(Price ~ Baths, data = CA)
summary(model3)
## 
## Call:
## lm(formula = Price ~ Baths, data = CA)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -374.93 -181.56   -2.74  152.31  614.81 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)    90.71     148.57   0.611  0.54641   
## Baths         194.74      62.28   3.127  0.00409 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 235.8 on 28 degrees of freedom
## Multiple R-squared:  0.2588, Adjusted R-squared:  0.2324 
## F-statistic: 9.779 on 1 and 28 DF,  p-value: 0.004092
boxplot(Price ~ Baths, data = CA,
        main = "Bathrooms vs Price (California)",
        xlab = "Number of Bathrooms",
        ylab = "Price ($1,000s)")

Q4: How do the size, the number of bedrooms, and the number of bathrooms of a home jointly influence its price? (California)

model4 = lm(Price ~ Size + Beds + Baths, data = CA)
summary(model4)
## 
## Call:
## lm(formula = Price ~ Size + Beds + Baths, data = CA)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -415.47 -130.32   19.64  154.79  384.94 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -41.5608   210.3809  -0.198   0.8449  
## Size          0.2811     0.1189   2.364   0.0259 *
## Beds        -33.7036    67.9255  -0.496   0.6239  
## Baths        83.9844    76.7530   1.094   0.2839  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 221.8 on 26 degrees of freedom
## Multiple R-squared:  0.3912, Adjusted R-squared:  0.3209 
## F-statistic: 5.568 on 3 and 26 DF,  p-value: 0.004353
plot(model4$fitted.values, model4$residuals,
     main = "Residual Plot for Multiple Regression (California)",
     xlab = "Fitted Values",
     ylab = "Residuals",
     pch = 19)
abline(h = 0, col = "red", lwd = 2)
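One way to check whether Beds and Baths add anything beyond Size is a partial F-test comparing the size-only model with the three-predictor model. The sketch below reconstructs that test from the R-squared values printed above rather than from the data, so its result is only as precise as those rounded figures; the variable names are illustrative.

```r
# R-squared values copied from the summary() outputs above
r2_reduced <- 0.3594   # model1: Price ~ Size
r2_full    <- 0.3912   # model4: Price ~ Size + Beds + Baths
q          <- 2        # number of predictors added (Beds, Baths)
df_resid   <- 26       # residual degrees of freedom of the full model

# Partial F statistic: gain in R-squared per added predictor,
# relative to the unexplained variation per residual df
f_stat <- ((r2_full - r2_reduced) / q) / ((1 - r2_full) / df_resid)
p_val  <- pf(f_stat, q, df_resid, lower.tail = FALSE)

round(c(F = f_stat, p = p_val), 3)
```

The resulting p-value is well above 0.05, which agrees with the individual t-tests: once Size is in the model, Beds and Baths together do not significantly improve the fit. With the fitted objects available, anova(model1, model4) performs the same comparison directly.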

Q5: Are there significant differences in home prices among the four states (CA, NY, NJ, PA)?

model5 = aov(Price ~ State, data = home)
summary(model5)
##              Df  Sum Sq Mean Sq F value   Pr(>F)    
## State         3 1198169  399390   7.355 0.000148 ***
## Residuals   116 6299266   54304                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
boxplot(Price ~ State, data = home,
        main = "Home Prices by State",
        xlab = "State",
        ylab = "Price ($1,000s)")
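The ANOVA table can be checked by hand, and it also yields a simple effect size. The sketch below recomputes the F statistic and p-value from the printed sums of squares and adds eta-squared, the share of total price variation attributable to State. Eta-squared is not part of the original analysis and is included here only as an illustration.

```r
# Sums of squares and degrees of freedom copied from summary(model5)
ss_state <- 1198169
ss_resid <- 6299266
df_state <- 3
df_resid <- 116

# Mean squares and the F statistic
ms_state <- ss_state / df_state
ms_resid <- ss_resid / df_resid
f_stat   <- ms_state / ms_resid
p_val    <- pf(f_stat, df_state, df_resid, lower.tail = FALSE)

# Eta-squared: proportion of total variation explained by State
eta_sq <- ss_state / (ss_state + ss_resid)

c(F = round(f_stat, 3), p = signif(p_val, 3), eta_sq = round(eta_sq, 3))
```

A significant F-test says only that the state means are not all equal. To see which pairs of states differ, one option is a post-hoc comparison such as TukeyHSD(model5) on the fitted aov object.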

Summary

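1. Size and price in CA

Larger homes tend to be significantly more expensive in California. In the simple regression, each additional square foot is associated with an increase of about $339 in asking price (slope 0.339 in $1,000s, p = 0.0005), and size alone explains about 36% of the variation in price.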

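2. Number of bedrooms and price in CA

The number of bedrooms alone does not meaningfully predict home prices in California. The estimated slope of 84.77 ($1,000s per bedroom) is not statistically significant (p = 0.255), and the model explains less than 5% of the variation in price.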

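3. Number of bathrooms and price in CA

Homes with more bathrooms tend to be more expensive in California. Each additional bathroom is associated with an increase of about $195,000 in asking price (p = 0.004), and the number of bathrooms alone explains about 26% of the variation in price.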

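4. Size, bedrooms, and bathrooms jointly in CA

When all three variables are considered together, size is the only significant predictor of California home prices at the 5% level (p = 0.026). Bedrooms (p = 0.62) and bathrooms (p = 0.28) are not significant once size is accounted for. The model as a whole is significant (F = 5.568, p = 0.004) and explains about 39% of the variation in price, but its adjusted R-squared (0.321) is slightly below that of the size-only model (0.337).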

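5. Price differences among CA, NY, NJ, and PA

The one-way ANOVA shows significant differences in mean home price among the four states (F = 7.355, p = 0.00015), so the state in which a home is located has a significant effect on its price. The test indicates that at least one state's mean differs from the others, but not which ones.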

References