I will test how four variables affect sale price: lot area, year built, quality rating, and zoning. To do so, I will create a new quality rating column, fit a linear regression model, and analyze the statistical significance of each variable.
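As a rough illustration of the data preparation step, the sketch below adds a numeric quality rating column and turns zoning into 0/1 indicator columns. The text-to-number scale, the column names, the zoning codes, and the two example houses are all assumptions made for illustration; the report does not state how the quality rating column was actually built.

```python
# Hypothetical scale: the report does not list the actual quality labels or values.
QUALITY_SCALE = {"Po": 1, "Fa": 2, "TA": 3, "Gd": 4, "Ex": 5}


def add_quality_rating(rows, source_key="quality_label", target_key="quality_rating"):
    """Return copies of the rows with a numeric quality rating column added."""
    rated = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left unchanged
        new_row[target_key] = QUALITY_SCALE[row[source_key]]
        rated.append(new_row)
    return rated


def zoning_dummies(zone, levels, base):
    """Build one 0/1 indicator per non-base zoning level.

    A house in the base level gets 0 in every indicator, which is why each
    zoning coefficient is read as a difference from the base level.
    """
    return {f"zoning_{level}": int(zone == level) for level in levels if level != base}


# Made-up example rows, not taken from the data set used in the report.
houses = [
    {"lot_area": 8450, "year_built": 2003, "quality_label": "Gd", "zoning": "RL"},
    {"lot_area": 6000, "year_built": 1950, "quality_label": "TA", "zoning": "RM"},
]

rated = add_quality_rating(houses)
print(rated[0]["quality_rating"])
print(zoning_dummies("RM", levels=["RL", "RM", "RH"], base="RL"))
```

Dropping the base level from the indicators is the usual way to code a categorical variable in a linear regression, and it is what makes the zoning results readable as "less than the base zoning."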
The model estimates the impact of four variables on sale price: lot area in square feet, the year the house was built, the numerical quality rating, and the zoning classification. The RMSE of 44,338.77 measures the typical size of the model's prediction error, meaning predicted prices typically miss the actual sale price by about $44,338.77. The R-squared of the model is .69, meaning that a high share (69%) of the variance in sale price is explained by the model.
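To make the two fit statistics concrete, here is a minimal sketch of how RMSE and R-squared are computed from actual and predicted prices. The four prices are made up for illustration and are not from the data set in this report, so the printed values will not match 44,338.77 or .69.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: square root of the average squared miss."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(actual, predicted):
    """Share of the variance in the actual values explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # total
    return 1 - ss_res / ss_tot


# Made-up sale prices and predictions, for illustration only.
actual = [200000, 150000, 310000, 120000]
predicted = [210000, 140000, 300000, 130000]

model_rmse = rmse(actual, predicted)
model_r2 = r_squared(actual, predicted)
print(round(model_rmse, 2))
print(round(model_r2, 4))
```

RMSE is in the same units as sale price (dollars), while R-squared is a unitless share between 0 and 1 for a model fit this way.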
The analysis of each variable against sale price shows that lot area, quality rating, year built, and zoning are all associated with sale price. Quality rating and lot area have the smallest p-values, so the evidence that they affect price is the strongest. More recently built homes also tend to have higher sale prices.
Lot Area: The larger the lot area, the higher the sale price. For every additional square foot of lot area, the predicted price increases by $1.80, holding the other variables constant. This is highly statistically significant (p < 2e-16).
Year Built: Newer homes typically sell for more. For each year later that a house was built, the predicted sale price increases by $301.50, and this effect is statistically significant (p = 1.11e-15).
Quality Rating: Homes with a higher quality rating tend to have higher sale prices. For every one-point increase in quality rating, the predicted sale price increases by $40,130, and this effect is highly statistically significant (p < 2e-16).
MS Zoning: Zoning also affects sale price, depending on density. For example, medium-density homes tend to sell for $64,910 less than homes in the base zoning category, and this difference is statistically significant (p = .0409). The high-density estimate is less statistically significant, so there is less confidence in it, but the model still predicts a sale price $59,030 lower than the base zoning category. Overall, the less common zoning categories tend to sell for less than the base zoning category.
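The coefficients above can be combined to compare two houses. The sketch below uses the reported estimates to compute the difference in predicted sale price between two hypothetical houses. The intercept is not reported here, so only differences are computed (the intercept cancels out). The dictionary keys, the zoning labels, and both example houses are assumptions for illustration.

```python
# Coefficient estimates as reported above (dollars per unit).
COEFS = {
    "lot_area": 1.80,            # per square foot
    "year_built": 301.50,        # per year
    "quality_rating": 40130.0,   # per rating point
}

# Zoning shifts relative to the base category; labels are hypothetical names.
ZONING_OFFSETS = {
    "base": 0.0,
    "medium_density": -64910.0,
    "high_density": -59030.0,
}


def predicted_price_difference(house_a, house_b):
    """Predicted price of house_a minus house_b under the linear model."""
    diff = sum(COEFS[key] * (house_a[key] - house_b[key]) for key in COEFS)
    diff += ZONING_OFFSETS[house_a["zoning"]] - ZONING_OFFSETS[house_b["zoning"]]
    return diff


# Two made-up houses, for illustration only.
house_a = {"lot_area": 10000, "year_built": 2005, "quality_rating": 7, "zoning": "base"}
house_b = {"lot_area": 9000, "year_built": 1995, "quality_rating": 6, "zoning": "medium_density"}

difference = predicted_price_difference(house_a, house_b)
print(round(difference, 2))
```

In this made-up comparison the zoning and quality rating terms account for most of the gap, while 1,000 extra square feet of lot area adds only $1,800. This is why a small p-value should not be read as a large effect on its own.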