Link to HTML version published on RPubs.com: http://rpubs.com/lucian_lee/362414
data %>%
  select(SalePrice, LotArea, PoolArea, GarageArea) %>%
  cor() %>%
  round(2) %>%
  corrplot(type="upper", order="hclust",
           tl.col="black", tl.srt=45)
We will use the categorical variable HouseStyle, which is the style of dwelling.
table(data$HouseStyle)
##
## 1.5Fin 1.5Unf 1Story 2.5Fin 2.5Unf 2Story SFoyer   SLvl
##    154     14    726      8     11    445     37     65
barplot(table(data$HouseStyle), main="Bar plot of style of dwelling",
        xlab="Style of dwelling")
boxplot(SalePrice~HouseStyle, data=data, main="Sale prices for each style of dwelling",
        xlab="Style of dwelling", ylab="Sale price ($)")
The median sale price is higher for finished houses than for unfinished houses, and higher for 2-story houses than for 1-story houses.
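The medians read off the boxplot can also be tabulated directly. The following is a minimal sketch, assuming `data` has the `SalePrice` and `HouseStyle` columns used above; the helper name `median_price_by_style` and the `toy` data frame are hypothetical and only illustrate the call.

```r
# Hypothetical helper: median sale price per dwelling style, highest first.
median_price_by_style <- function(df) {
  med <- tapply(df$SalePrice, df$HouseStyle, median, na.rm = TRUE)
  sort(med, decreasing = TRUE)
}

# Toy values for illustration only (NOT taken from the dataset).
toy <- data.frame(
  HouseStyle = c("1Story", "1Story", "2Story", "2Story", "2Story"),
  SalePrice  = c(100000, 140000, 150000, 210000, 190000)
)
median_price_by_style(toy)

# With the real data, the call would be: median_price_by_style(data)
```

Sorting the result makes the ordering of styles easy to compare against the boxplot.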
We will use the categorical variable KitchenQual, which is the kitchen quality.
summarise(group_by(data,KitchenQual),
          meanprice=mean(SalePrice,na.rm=TRUE))
## # A tibble: 4 x 2
##   KitchenQual meanprice
##   <fct>           <dbl>
## 1 Ex             328555
## 2 Fa             105565
## 3 Gd             212116
## 4 TA             139963
sm.density.compare(data$SalePrice, data$KitchenQual,
                   xlab="Sale price ($)")
title(main="Density plots of sales prices for each type of kitchen quality")
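The mean sale price rises with kitchen quality: it is lowest for fair (Fa) kitchens, followed by typical/average (TA) and good (Gd), and highest for excellent (Ex) kitchens.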
Garage area is strongly and positively correlated with sale price. Sale prices are also higher for finished houses than for unfinished houses, for 2-story houses than for 1-story houses, and for houses with better kitchen quality.
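The correlation statement can be checked with a single coefficient rather than read off the plot. This is a minimal sketch, assuming `data` contains numeric `SalePrice` and `GarageArea` columns; the helper name `price_correlation` and the `toy_cor` data frame are hypothetical.

```r
# Hypothetical helper: Pearson correlation between SalePrice and one numeric column.
price_correlation <- function(df, var) {
  cor(df$SalePrice, df[[var]], use = "complete.obs")
}

# Toy values for illustration only (NOT taken from the dataset);
# they are perfectly linear, so the coefficient is 1.
toy_cor <- data.frame(
  GarageArea = c(200, 400, 600),
  SalePrice  = c(100000, 150000, 200000)
)
price_correlation(toy_cor, "GarageArea")

# With the real data, the call would be: price_correlation(data, "GarageArea")
```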