Introduction

This report explores housing market data using the dataset available at: https://www.lock5stat.com/datasets3e/HomesForSale.csv. With 120 observations and five variables, the dataset captures key characteristics of homes for sale in California (CA), New Jersey (NJ), New York (NY), and Pennsylvania (PA) in 2019.

The goal of this analysis is to investigate how factors such as home size, number of bedrooms, and number of bathrooms influence the asking price of homes, both individually and collectively. In addition, we examine whether there are significant differences in home prices across the four states. The study uses simple and multiple linear regression models, ANOVA, and graphical visualizations to answer five core research questions.

Analysis

home <- read.csv("https://www.lock5stat.com/datasets3e/HomesForSale.csv")
CA_home <- subset(home, State == "CA")
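
Before modeling, it may help to confirm that the loaded data match the description in the introduction. The following is a minimal sketch, assuming the CSV is reachable at the URL above and that its columns are named as in the plotting code of this report (State, Price, Size, Beds, Baths).

```r
# Assumption: same data source as used throughout this report.
home <- read.csv("https://www.lock5stat.com/datasets3e/HomesForSale.csv")

# Dimensions: number of observations (rows) and variables (columns).
dim(home)

# Variable names and types.
str(home)

# Number of listings per state.
table(home$State)
```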

Q1: Use the data only for California. How much does the size of a home influence its price?

library(ggplot2)

ggplot(CA_home, aes(x = Size, y = Price)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE) +
  labs(title = "Home Price vs Size in California",
       x = "Size (1,000 sq. ft.)",
       y = "Price ($1,000's)")
## `geom_smooth()` using formula = 'y ~ x'

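Size has a significant positive association with price. Because Size is measured in thousands of square feet and Price in thousands of dollars, the slope of the regression of Price on Size estimates the change in asking price, in $1,000s, associated with each additional 1,000 sq. ft.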
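
The slope and its p-value can be read from a simple linear regression. The sketch below is one way to obtain them, assuming the same data source and California subset as above; the object name `fit_size` is introduced here for illustration only.

```r
# Assumption: same data source and California subset as earlier in this report.
home <- read.csv("https://www.lock5stat.com/datasets3e/HomesForSale.csv")
CA_home <- subset(home, State == "CA")

# Simple linear regression of asking price on home size.
fit_size <- lm(Price ~ Size, data = CA_home)

# Coefficient table: estimate, standard error, t value, and p-value.
summary(fit_size)$coefficients

# 95% confidence interval for the slope.
confint(fit_size, "Size", level = 0.95)
```

The `Size` row of the coefficient table gives the estimated slope, and its last column gives the p-value for the test that the slope is zero.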

Q2: Use the data only for California. How does the number of bedrooms of a home influence its price?

ggplot(CA_home, aes(x = factor(Beds), y = Price)) +
  geom_violin(fill = "lightblue") +
  geom_boxplot(width = 0.1) +
  labs(title = "Price Distribution by Number of Bedrooms",
       x = "Number of Bedrooms",
       y = "Price ($1,000's)")

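The price distributions overlap heavily across bedroom counts, so any effect of bedrooms on price appears weak. Statistical significance is judged from the p-value of the Beds coefficient in a regression of Price on Beds: a p-value below 0.05 together with a positive slope would indicate that more bedrooms are associated with higher prices.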
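
The p-value referred to here comes from a fitted model rather than from the plot. A minimal sketch, assuming the same California subset as above (the name `fit_beds` is hypothetical):

```r
# Assumption: same data source and California subset as earlier in this report.
home <- read.csv("https://www.lock5stat.com/datasets3e/HomesForSale.csv")
CA_home <- subset(home, State == "CA")

# Simple linear regression of asking price on number of bedrooms.
fit_beds <- lm(Price ~ Beds, data = CA_home)

# Slope estimate and p-value for Beds.
summary(fit_beds)$coefficients

# Median price by bedroom count, to compare with the plot.
tapply(CA_home$Price, CA_home$Beds, median)
```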

Q3: Use the data only for California. How does the number of bathrooms of a home influence its price?

ggplot(CA_home, aes(x = factor(Baths), y = Price)) +
  geom_boxplot(fill = "lightgreen") +
  labs(title = "Price by Number of Bathrooms",
       x = "Number of Bathrooms",
       y = "Price ($1,000's)")

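Homes with more bathrooms generally have higher prices, although prices vary widely within each bathroom count. A p-value below 0.05 for the Baths coefficient in a regression of Price on Baths would indicate that the number of bathrooms is significantly associated with price.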
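
As with bedrooms, the test for bathrooms can be sketched with a simple regression. This assumes the same California subset as above; `fit_baths` is a name chosen here for illustration.

```r
# Assumption: same data source and California subset as earlier in this report.
home <- read.csv("https://www.lock5stat.com/datasets3e/HomesForSale.csv")
CA_home <- subset(home, State == "CA")

# Simple linear regression of asking price on number of bathrooms.
fit_baths <- lm(Price ~ Baths, data = CA_home)

# Slope estimate and p-value for Baths.
summary(fit_baths)$coefficients

# Share of price variation explained by Baths alone.
summary(fit_baths)$r.squared
```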

Q4: Use the data only for California. How do the size, the number of bedrooms, and the number of bathrooms of a home jointly influence its price?

library(GGally)
## Registered S3 method overwritten by 'GGally':
##   method from   
##   +.gg   ggplot2
ggpairs(CA_home, columns = c("Size", "Beds", "Baths", "Price"))

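The pairs plot shows only pairwise relationships among Size, Beds, Baths, and Price. In a multiple regression of Price on all three predictors, Size and Baths influence price more strongly than Beds once the other variables are held fixed.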
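
Statements about an effect "after controlling for other factors" rest on a multiple regression, not on the pairs plot. The following sketch fits such a model, assuming the same California subset as above; `fit_all` is a hypothetical name.

```r
# Assumption: same data source and California subset as earlier in this report.
home <- read.csv("https://www.lock5stat.com/datasets3e/HomesForSale.csv")
CA_home <- subset(home, State == "CA")

# Multiple linear regression of price on size, bedrooms, and bathrooms.
fit_all <- lm(Price ~ Size + Beds + Baths, data = CA_home)

# Each row is a partial effect: the change in price per unit of that
# predictor with the other two held fixed.
summary(fit_all)$coefficients

# Adjusted R-squared for the joint model.
summary(fit_all)$adj.r.squared
```

Because Size, Beds, and Baths tend to be correlated with one another, individual coefficients in this model can look weaker than in the single-predictor fits even when the predictors are jointly useful.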

Q5: Are there significant differences in home prices among the four states (CA, NY, NJ, PA)? This will help you determine if the state in which a home is located has a significant impact on its price. All data should be used.

ggplot(home, aes(x = State, y = Price, fill = State)) +
  geom_boxplot() +
  labs(title = "Price Differences Among States",
       x = "State",
       y = "Price ($1,000's)") +
  theme(legend.position = "none")

In the boxplots, California (CA) and New York (NY) appear to have higher median home prices than New Jersey (NJ) and Pennsylvania (PA). A one-way ANOVA of Price on State tests whether mean prices differ: a p-value below 0.05 indicates that at least one state's mean price differs significantly from the others.
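
The ANOVA described here can be sketched as follows, using all 120 observations. The Tukey follow-up is an addition for illustration, not a step stated in this report; `fit_state` and `tukey` are hypothetical names.

```r
# Assumption: same data source as used throughout this report.
home <- read.csv("https://www.lock5stat.com/datasets3e/HomesForSale.csv")

# Treat State as a categorical factor.
home$State <- factor(home$State)

# One-way ANOVA: does mean asking price differ by state?
fit_state <- aov(Price ~ State, data = home)
summary(fit_state)

# Pairwise comparisons of state means with Tukey's HSD adjustment.
tukey <- TukeyHSD(fit_state)
tukey
```

The overall F test only says that some difference exists; the pairwise table indicates which pairs of states differ.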

Summary

The data analysis conducted in this report provides insights into the relationships between home characteristics and their asking prices. Using statistical models and visualization techniques, we addressed the five research questions posed at the start of the project. Major findings are summarized below:

Q1. Home Size and Price: In California, larger home sizes are significantly associated with higher asking prices.

Q2. Bedrooms and Price: The number of bedrooms shows some positive association with price, although the effect is weaker than that of home size.

Q3. Bathrooms and Price: The number of bathrooms has a significant positive relationship with home price, suggesting that homes with more bathrooms are listed at considerably higher prices.

Q4. Combined Effect of Size, Bedrooms, and Bathrooms: When analyzed together, home size and number of bathrooms remain strong predictors of price, while the effect of bedrooms diminishes after controlling for size and baths.

Q5. Price Differences Among States: Significant differences in average home prices exist across the four states. California and New York homes generally command higher prices compared to homes in New Jersey and Pennsylvania.

These findings highlight how specific home features contribute to price variation within states and across markets. Understanding these factors can be valuable for buyers, sellers, and real estate professionals aiming to make informed housing decisions.