Introduction

We used the data from https://www.lock5stat.com/datapage3e.html

We will explore the following questions.

  1. Use the data only for California. How much does the size of a home influence its price?

  2. Use the data only for California. How does the number of bedrooms of a home influence its price?

  3. Use the data only for California. How does the number of bathrooms of a home influence its price?

  4. Use the data only for California. How do the size, the number of bedrooms, and the number of bathrooms of a home jointly influence its price?

  5. Are there significant differences in home prices among the four states (CA, NY, NJ, PA)? This will help you determine if the state in which a home is located has a significant impact on its price. All data should be used.

Analysis

Here we will explore the questions in detail.

Home = read.csv("https://www.lock5stat.com/datasets3e/HomesForSale.csv")
head(Home)
##   State Price Size Beds Baths
## 1    CA   533 1589    3   2.5
## 2    CA   610 2008    3   2.0
## 3    CA   899 2380    5   3.0
## 4    CA   929 1868    3   3.0
## 5    CA   210 1360    2   2.0
## 6    CA   268 2131    3   2.0
HomeCA = read.csv("https://www.lock5stat.com/datasets3e/HomesForSaleCA.csv")
head(HomeCA)
##   State Price Size Beds Baths
## 1    CA   533 1589    3   2.5
## 2    CA   610 2008    3   2.0
## 3    CA   899 2380    5   3.0
## 4    CA   929 1868    3   3.0
## 5    CA   210 1360    2   2.0
## 6    CA   268 2131    3   2.0

Question 1: How much does the size of a home influence its price in California?

model1=lm(HomeCA$Price ~ HomeCA$Size, data=HomeCA) 
plot(model1, 2, main="Size of California Homes and Their Price")

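Note that plot(model1, 2) draws the normal Q-Q plot of the residuals. It checks whether the residuals are approximately normal; it does not show how price changes with size. The influence of size is measured by the slope of model1, and the coefficient table from summary(model1) is needed to report its value and p-value. The multiple regression in Question 4 gives a positive and significant coefficient for Size (0.2811, p = 0.0259), which supports the conclusion that larger homes in California tend to be more expensive.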
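To see the relationship itself rather than a residual diagnostic, a scatterplot with the fitted line and the coefficient table could be produced along the following lines. This is a minimal sketch that assumes the CSV URL used in the Analysis section is reachable; the names HomeCA_sketch and model1_sketch are introduced here for illustration only.

```r
# Sketch: scatterplot of price against size with the least-squares line.
# Assumes the dataset URL from the Analysis section is reachable.
HomeCA_sketch = read.csv("https://www.lock5stat.com/datasets3e/HomesForSaleCA.csv")

# Bare column names together with data= keep the formula readable
model1_sketch = lm(Price ~ Size, data = HomeCA_sketch)

plot(Price ~ Size, data = HomeCA_sketch,
     main = "Price vs. size of California homes",
     xlab = "Size", ylab = "Price")
abline(model1_sketch)   # add the fitted regression line

# Slope estimate, standard error, t value, and p-value for Size
coef(summary(model1_sketch))
```

The Size row of the printed table gives the estimated change in Price per unit of Size, which is the direct answer to "how much".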

Question 2: How does the number of bedrooms of a home influence its price in California?

model2=lm(HomeCA$Price ~ HomeCA$Beds, data=HomeCA) 
plot(model2, 2, main="Number of Bedrooms in California Homes and Their Price")

summary(model2)
## 
## Call:
## lm(formula = HomeCA$Price ~ HomeCA$Beds, data = HomeCA)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -413.83 -236.62   29.94  197.69  570.94 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   269.76     233.62   1.155    0.258
## HomeCA$Beds    84.77      72.91   1.163    0.255
## 
## Residual standard error: 267.6 on 28 degrees of freedom
## Multiple R-squared:  0.04605,    Adjusted R-squared:  0.01198 
## F-statistic: 1.352 on 1 and 28 DF,  p-value: 0.2548

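The plot is a normal Q-Q plot of the residuals: points that lie close to a straight line suggest the residuals are approximately normal, but the plot says nothing about whether bedrooms affect price. The estimated slope is 84.77, so each additional bedroom is associated with an average increase of about 84.77 in Price (in the units of the dataset).

A p-value below 0.05 indicates statistically significant evidence of a linear relationship; it does not measure how strong the relationship is. Here the p-value is 0.2548, which is above 0.05, so there is no significant evidence that the number of bedrooms is linearly related to price in California homes. The R-squared of 0.04605 shows that bedrooms explain only about 4.6% of the variation in price.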
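As a check on how the p-value in the table arises, the t statistic and two-sided p-value can be recomputed from the printed estimate and standard error. The sketch below uses only numbers copied from the summary(model2) output; the variable names are illustrative, and small differences from the printed values come from rounding.

```r
# Values copied from the summary(model2) output above
estimate  = 84.77     # slope for Beds
std_error = 72.91     # standard error of the slope
df_resid  = 28        # residual degrees of freedom

# Test statistic for H0: slope = 0, and its two-sided p-value
t_value = estimate / std_error
p_value = 2 * pt(-abs(t_value), df = df_resid)

# In simple regression, the correlation is the square root of R-squared
# (with the sign of the slope, which is positive here)
r_squared = 0.04605
round(c(t = t_value, p = p_value, r = sqrt(r_squared)), 3)
```

The correlation computed on the last line is what describes the strength of the relationship; the p-value only says whether the slope can be distinguished from zero.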

Question 3: How does the number of bathrooms of a home influence its price in California?

model3=lm(HomeCA$Price ~ HomeCA$Baths, data=HomeCA) 
plot(model3, 2, main="Number of Bathrooms in California Homes and Their Price")

summary(model3)
## 
## Call:
## lm(formula = HomeCA$Price ~ HomeCA$Baths, data = HomeCA)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -374.93 -181.56   -2.74  152.31  614.81 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)   
## (Intercept)     90.71     148.57   0.611  0.54641   
## HomeCA$Baths   194.74      62.28   3.127  0.00409 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 235.8 on 28 degrees of freedom
## Multiple R-squared:  0.2588, Adjusted R-squared:  0.2324 
## F-statistic: 9.779 on 1 and 28 DF,  p-value: 0.004092

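The estimated slope is 194.74, so each additional bathroom is associated with an average increase of about 194.74 in Price. The p-value of 0.004092 is below 0.05, so there is significant evidence of a positive linear relationship between price and the number of bathrooms in California homes. The R-squared of 0.2588 shows that bathrooms explain about 26% of the variation in price, a moderate relationship. As before, the plot is a normal Q-Q plot of the residuals and serves only as a check of the normality assumption.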
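A confidence interval for the slope shows the range of plausible effects better than the p-value alone. The sketch below builds a 95% interval from the estimate and standard error printed in summary(model3); the variable names are illustrative. On the fitted model itself, confint(model3) is the built-in way to get the same kind of interval.

```r
# Values copied from the summary(model3) output above
estimate  = 194.74    # slope for Baths
std_error = 62.28     # standard error of the slope
df_resid  = 28        # residual degrees of freedom

# 95% confidence interval: estimate +/- t critical value * standard error
t_crit = qt(0.975, df = df_resid)
ci = estimate + c(-1, 1) * t_crit * std_error
ci
```

An interval that excludes zero agrees with the significant p-value, and its width shows how imprecisely the slope is estimated from 30 homes.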

Question 4: How do the size, the number of bedrooms, and the number of bathrooms of a home jointly influence its price in California?

model4=lm(HomeCA$Price ~ HomeCA$Size + HomeCA$Beds +HomeCA$Baths, data=HomeCA) 
plot(model4, 2, main="Size, Beds, and Baths of California Homes and Their Price")

summary(model4)
## 
## Call:
## lm(formula = HomeCA$Price ~ HomeCA$Size + HomeCA$Beds + HomeCA$Baths, 
##     data = HomeCA)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -415.47 -130.32   19.64  154.79  384.94 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  -41.5608   210.3809  -0.198   0.8449  
## HomeCA$Size    0.2811     0.1189   2.364   0.0259 *
## HomeCA$Beds  -33.7036    67.9255  -0.496   0.6239  
## HomeCA$Baths  83.9844    76.7530   1.094   0.2839  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 221.8 on 26 degrees of freedom
## Multiple R-squared:  0.3912, Adjusted R-squared:  0.3209 
## F-statistic: 5.568 on 3 and 26 DF,  p-value: 0.004353

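P_Size = 0.0259, P_Beds = 0.6239, P_Baths = 0.2839, P_overall (F-test) = 0.004353

The overall F-test (p = 0.004353) shows that the three predictors together are significantly related to price, and the model explains about 39% of the variation in price (R-squared = 0.3912). With the other two predictors held fixed, only Size is significant at the 0.05 level (p = 0.0259): each additional unit of Size is associated with an increase of about 0.28 in Price. Beds (p = 0.6239) and Baths (p = 0.2839) are not significant once Size is in the model, and the Beds coefficient is negative (-33.70). Baths was significant on its own in Question 3, so its loss of significance here suggests that it overlaps with Size. Note that a p-value measures the strength of evidence, not the size of an effect, so the p-values alone do not rank how much each variable affects price.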
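The last p-value in the summary belongs to the overall F-test for the whole model, not to any single slope. As a sketch, the F statistic can be recovered from the printed R-squared and degrees of freedom; the variable names are illustrative, and the result differs slightly from the printed 5.568 because R-squared is rounded.

```r
# Values copied from the summary(model4) output above
r_squared = 0.3912    # multiple R-squared
k         = 3         # number of predictors (Size, Beds, Baths)
df_resid  = 26        # residual degrees of freedom

# F = (explained variation per predictor) / (unexplained variation per residual df)
f_value   = (r_squared / k) / ((1 - r_squared) / df_resid)
p_overall = pf(f_value, df1 = k, df2 = df_resid, lower.tail = FALSE)
c(F = f_value, p = p_overall)
```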

Question 5: Are there significant differences in home prices among the four states (CA, NY, NJ, PA)? This will help you determine if the state in which a home is located has a significant impact on its price. All data should be used.

model5=aov(Home$Price ~ Home$State , data=Home) 
plot(model5, 1, main="Differences between home prices in CA, NY, NJ, PA")

summary(model5)
##              Df  Sum Sq Mean Sq F value   Pr(>F)    
## Home$State    3 1198169  399390   7.355 0.000148 ***
## Residuals   116 6299266   54304                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

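There are significant differences in mean home prices among CA, NY, NJ, and PA. The ANOVA F-test gives F = 7.355 with a p-value of 0.000148, so at least one state's mean price differs from the others. The test does not show which states differ; that requires a follow-up comparison of the state pairs. The plot from plot(model5, 1) shows residuals against fitted values and is used to check the equal-variance assumption.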
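One common follow-up to a significant ANOVA is Tukey's honest significant difference procedure, which compares every pair of states while controlling the overall error rate. This was not part of the original analysis; the sketch below assumes the CSV URL used in the Analysis section is reachable, and the names Home_sketch, model5_sketch, and tukey are introduced for illustration only.

```r
# Sketch: pairwise comparisons of mean price between states.
# Assumes the dataset URL from the Analysis section is reachable.
Home_sketch = read.csv("https://www.lock5stat.com/datasets3e/HomesForSale.csv")

# TukeyHSD() needs the grouping variable to be a factor
Home_sketch$State = factor(Home_sketch$State)

model5_sketch = aov(Price ~ State, data = Home_sketch)
tukey = TukeyHSD(model5_sketch)

# One row per pair of states: difference in means, interval, adjusted p-value
tukey$State
```

Pairs whose adjusted p-value is below 0.05 are the ones whose mean prices differ significantly.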