Introduction
We use the data to predict the sales price of a house
(in thousands). The model created shows that the year a remodel was
added, the size of the above-ground living area, the number of cars the
garage can hold, and the overall quality of the house all contribute to the
sales price of the home.
Data Description
- RMSE: The root mean squared error (RMSE) is 45.65. That means that the
model's predictions typically miss the actual sales price by about
45.65k.
- Adjusted R2: The adjusted R2 is 0.6729. This means that the model
explains about 67.29% of the variation in sales price.
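To make the two metrics concrete, here is a minimal sketch of how they can be computed. It is written in Python for illustration only, the prices and predictions are made up, and it assumes the common definition of RMSE that divides by the number of observations (some software divides by the residual degrees of freedom instead).

```python
import math

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the response."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def adjusted_r2(actual, predicted, n_predictors):
    """Adjusted R2: share of variance explained, penalized for model size."""
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained variation
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # total variation
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

# Made-up sale prices (in thousands) and made-up predictions, for illustration only.
actual = [120.0, 150.0, 200.0, 250.0, 310.0, 180.0, 95.0, 275.0]
predicted = [130.0, 140.0, 210.0, 240.0, 300.0, 190.0, 105.0, 265.0]

model_rmse = rmse(actual, predicted)
model_adj_r2 = adjusted_r2(actual, predicted, n_predictors=4)
print(f"RMSE: {model_rmse:.2f} (thousands)")
print(f"Adjusted R2: {model_adj_r2:.4f}")
```

Because the RMSE is in the same units as the response, a value of 45.65 reads directly as thousands, while the adjusted R2 is a unitless share between 0 and 1 for a reasonable model.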
Method
My method was to look at how different variables affected sales
price. I converted sales price into thousands to make a smaller number
that was easier for me to follow. I thought quality would be a good
factor, so I chose that as my categorical variable, and decided to make
it more concise by showing low quality vs high quality. I chose to do
this because it was similar to what we did in class when looking at
different grades. I tried to use variables that did not correlate much
with each other but at least had some sort of correlation with sales
price in thousands. I made a correlation plot with ggcorrplot to see
how the variables correlated with each other and with sales price.
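The preparation steps above can be sketched roughly as follows. This is an illustration in plain Python, not the code used for the report: the rows are made up, the helper `pearson`, the column `price_k`, and the constant `HIGH_QUALITY_CUTOFF` are hypothetical names, and the cutoff of 7 assumes the quality rating is numeric with 7 meaning "good".

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up rows for illustration only; column names follow the report.
houses = [
    {"SalePrice": 125000, "OverallQual": 5, "GrLivArea": 1050, "GarageCars": 1},
    {"SalePrice": 181500, "OverallQual": 6, "GrLivArea": 1262, "GarageCars": 2},
    {"SalePrice": 223500, "OverallQual": 7, "GrLivArea": 1786, "GarageCars": 2},
    {"SalePrice": 310000, "OverallQual": 8, "GrLivArea": 2198, "GarageCars": 3},
    {"SalePrice": 98000, "OverallQual": 4, "GrLivArea": 900, "GarageCars": 0},
]

# Assumption: ratings of 7 ("good") and above count as high quality.
HIGH_QUALITY_CUTOFF = 7

for h in houses:
    h["price_k"] = h["SalePrice"] / 1000  # sale price in thousands
    # Collapse the rating into a binary flag: 1 = low quality, 0 = high quality.
    h["low_quality"] = int(h["OverallQual"] < HIGH_QUALITY_CUTOFF)

# Correlation of each candidate predictor with sale price in thousands.
price_k = [h["price_k"] for h in houses]
for name in ("GrLivArea", "GarageCars", "low_quality"):
    r = pearson([h[name] for h in houses], price_k)
    print(f"{name} vs price_k: r = {r:.2f}")
```

The same `pearson` helper can be applied to pairs of predictors to check that they do not correlate too strongly with each other, which is what the correlation plot shows at a glance.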
Key Findings
- YearRemodAdd: This is the year a remodel was added to the house.
The model shows that the more recently a remodel was added, the
higher the sales price tends to be.
- GrLivArea: This is the size of the above-ground living area. The
model shows that the larger this area is, the higher the sales price
tends to be.
- GarageCars: This is the number of cars the garage can hold. The
model shows that the more cars the garage can hold, the higher the
sales price tends to be.
- low_quality: This variable indicates the overall quality of the home.
It is a collapsed version of the categorical quality rating provided.
Any house that is not rated good, very good, excellent, or very
excellent is treated as not high quality, or for the purposes of naming
the variable, low quality. The model shows that if the house is
considered low quality, the sales price is lower.
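
The findings above can be summarized in one equation. This is a sketch that assumes the model is an ordinary multiple linear regression; the symbols (the betas, the error term, and the name SalePrice_k for sales price in thousands) are notation introduced here for illustration, and no coefficient values are implied.

```latex
\[
\text{SalePrice}_{k} = \beta_0
  + \beta_1\,\text{YearRemodAdd}
  + \beta_2\,\text{GrLivArea}
  + \beta_3\,\text{GarageCars}
  + \beta_4\,\text{low\_quality}
  + \varepsilon,
\qquad \beta_1, \beta_2, \beta_3 > 0,\quad \beta_4 < 0
\]
```

Under this form, each of the first three coefficients is the expected change in sales price (in thousands) for a one-unit increase in that variable with the others held fixed, and the low_quality coefficient is the expected difference between a low quality house and an otherwise identical high quality one.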