Mini Project 1

import

library(tidyverse)

load data

housing <- read.csv("pierce_county_house_sales.csv", header = TRUE)

———————————

bedrooms analysis

hist(housing$bedrooms, main = "Home Bedroom Count", xlab = "Bedrooms")
# Notes: All homes have between 1 and 10 bedrooms. The vast majority have fewer than 5 bedrooms.

year_built analysis box plot

boxplot(housing$year_built, xlab = "Year Built", notch = TRUE, horizontal = TRUE)
# Notes: Most homes were built between 1960 and 2010. The median year built was around 1998.

house_square_feet analysis scatter plot

plot(housing$house_square_feet, housing$sale_price, ylab = "Sale Price", xlab = "Square Footage", main = "Sale Price by Square Footage")
# Notes: Smaller homes generally sold for less, with some exceptions. Most homes sold for less than $1 million and have less than 4,000 square feet.

hvac_description analysis

ggplot(housing, aes(x = hvac_description, y = sale_price)) +
  geom_point(alpha = 0.5) +
  scale_y_continuous(labels = scales::dollar_format()) +
  labs(title = "Price vs HVAC Style", x = "HVAC Description", y = "Sale Price") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
# Notes: Homes with Heat Pumps show the most variability in sale price. Homes with Floor/Wall Furnaces show the least variability in sale price.
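The variability notes above are read off the scatter plot by eye. One way to check them numerically is to compute the standard deviation of sale price for each HVAC type. This is a minimal sketch, not the original analysis: the toy data frame and its values are hypothetical stand-ins for housing, and only the column names come from the script above. It uses base R so it runs without extra packages.

```r
# Hypothetical toy data standing in for `housing`; the values are made up.
toy <- data.frame(
  hvac_description = c("Heat Pump", "Heat Pump", "Heat Pump",
                       "Floor/Wall Furnace", "Floor/Wall Furnace", "Floor/Wall Furnace"),
  sale_price = c(300000, 600000, 900000, 200000, 210000, 220000)
)

# Standard deviation and number of sales per HVAC type.
price_sd <- tapply(toy$sale_price, toy$hvac_description, sd)
price_n  <- tapply(toy$sale_price, toy$hvac_description, length)

spread <- data.frame(
  hvac = names(price_sd),
  sd   = as.numeric(price_sd),
  n    = as.integer(price_n)
)

# Sort so the most variable HVAC type comes first.
spread <- spread[order(-spread$sd), ]
print(spread)
```

Keeping the group size n next to the standard deviation helps flag HVAC types that look stable only because they have very few sales.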

summary statistics

housing |>
  summarise(mean_bedroom_count = mean(bedrooms),
            mean_square_feet = mean(house_square_feet),
            mean_year_built = mean(year_built),
            mean_sale_price = mean(sale_price))
# Notes: On average, homes were built around 1980, sold for approximately $460,000, had about 1,800 square feet, and had 3 bedrooms.
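A few very expensive sales can pull a mean upward, so it may be worth reporting the median next to the mean. The sketch below illustrates the gap on hypothetical toy values, not the Pierce County data; the column name sale_price is the only thing taken from the script.

```r
# Hypothetical sale prices; one very expensive home skews the mean.
toy <- data.frame(sale_price = c(250000, 300000, 350000, 400000, 2500000))

center <- data.frame(
  mean_sale_price   = mean(toy$sale_price),    # 760000 for these toy values
  median_sale_price = median(toy$sale_price)   # 350000 for these toy values
)
print(center)
```

If the mean and median of the real data are far apart, the median is probably the better description of a typical home.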

———————————

houses with sale_price above 1 million

housing_1M <- housing |> filter(sale_price > 1000000)
View(housing_1M)

bedrooms analysis

hist(housing_1M$bedrooms, main = "Million Dollar Home Bedroom Count", xlab = "Bedrooms")
# Notes: Homes that sold for more than $1 million most often had 3-4 bedrooms. The distribution is roughly even.

year_built analysis box plot

boxplot(housing_1M$year_built, xlab = "Year Built", notch = TRUE, horizontal = TRUE)
# Notes: Most million-dollar homes were built between 1980 and 2010. The median year built was around 1998.

house_square_feet analysis scatter plot

plot(housing_1M$house_square_feet, housing_1M$sale_price, ylab = "Sale Price", xlab = "Square Footage", main = "Sale Price by Square Footage")
# Notes: The relationship between a house's square footage and its sale price is less consistent among homes that sold for more than $1 million. The most expensive homes did not always have the most square feet.

hvac_description analysis

ggplot(housing_1M, aes(x = hvac_description, y = sale_price)) +
  geom_point(alpha = 0.5) +
  scale_y_continuous(labels = scales::dollar_format()) +
  labs(title = "Price vs HVAC Style", x = "HVAC Description", y = "Sale Price") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
# Notes: Homes with Heat Pumps and Warm and Cool Air Zones showed the most variability in sale price. Homes with Hot Water Baseboards showed the least variability in sale price.

summary statistics

housing_1M |>
  summarise(mean_bedroom_count = mean(bedrooms),
            mean_square_feet = mean(house_square_feet),
            mean_year_built = mean(year_built),
            mean_sale_price = mean(sale_price))
# Notes: On average, million-dollar homes were built around 1984, sold for approximately $1,480,000, had about 3,000 square feet, and had 3 bedrooms.

———————————

Comparative Analysis

Very few million-dollar homes had Floor/Wall Furnaces compared with homes that sold for less than 1 million dollars. Despite the higher price, most million-dollar homes still had 3-4 bedrooms.
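The HVAC comparison above could be backed by the share of each HVAC type within each price group, since the two groups differ greatly in size. The following is a rough sketch under stated assumptions: the toy rows and the price_group labels are hypothetical, while the 1,000,000 cutoff and the column names follow the script above.

```r
# Hypothetical toy data standing in for `housing`; the values are made up.
toy <- data.frame(
  sale_price = c(400000, 450000, 500000, 550000, 1200000, 1500000),
  hvac_description = c("Floor/Wall Furnace", "Heat Pump", "Floor/Wall Furnace",
                       "Hot Water Baseboard", "Heat Pump", "Hot Water Baseboard")
)

# Same cutoff as housing_1M: strictly above 1,000,000.
toy$price_group <- ifelse(toy$sale_price > 1000000, "over_1M", "1M_or_less")

# Row-wise proportions: each row (price group) sums to 1.
shares <- prop.table(table(toy$price_group, toy$hvac_description), margin = 1)
print(round(shares, 2))
```

Comparing proportions rather than raw counts keeps the much larger group of homes at or under 1 million dollars from dominating the comparison.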