This brief analysis examines the data in HousePrices.csv. In particular, it looks at price and square footage, both overall and by neighborhood, using the summary statistics and graphical summaries covered in Chapter 2.
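library(lattice)  # provides histogram(), densityplot(), barchart()
library(ggplot2)  # provides ggplot(), geom_boxplot()
# Assumes HousePrices.csv is in the working directory
HousePrices <- read.csv("HousePrices.csv")
summary(HousePrices$Price)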
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   69100  111325  125950  130427  148250  211200
summary(HousePrices$SqFt)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##    1450    1880    2000    2001    2140    2590
stem(HousePrices$Price)
##
## The decimal point is 4 digit(s) to the right of the |
##
## 6 | 9
## 7 |
## 8 | 124
## 9 | 01124589
## 10 | 13333346667778899
## 11 | 001122444455566677778889
## 12 | 001133455666678
## 13 | 0001133445778
## 14 | 0133344566777889
## 15 | 0011222345678
## 16 | 1156777
## 17 | 1377
## 18 | 12488
## 19 |
## 20 | 0
## 21 | 1
stem(HousePrices$SqFt)
##
## The decimal point is 2 digit(s) to the right of the |
##
## 14 | 5
## 15 | 26
## 16 | 01559
## 17 | 000122344488899
## 18 | 12344667889
## 19 | 0000111222223333344557889999
## 20 | 000001112334445566788889
## 21 | 001113334445555666999
## 22 | 01124555568899
## 23 |
## 24 | 11244
## 25 | 39
histogram(HousePrices$Price)
histogram(HousePrices$SqFt)
densityplot(HousePrices$Price)
densityplot(HousePrices$SqFt)
house.reg <- table(HousePrices$Neighborhood)
house.reg
##
##  East North  West
##    45    44    39
barchart(house.reg, ylab = "Neighborhood", col = "black")
ggplot(data = HousePrices) +
  geom_boxplot(mapping = aes(x = Neighborhood, y = Price))
ggplot(data = HousePrices) +
  geom_boxplot(mapping = aes(x = Neighborhood, y = SqFt))
Summary statistics were computed for price and square footage. House prices ranged from $69,100 to $211,200, with a mean of $130,427. Square footage ranged from 1,450 to 2,590 square feet, with a mean of 2,001 square feet.

The stem-and-leaf plots for price and square footage showed that both distributions are approximately normal. The histograms and density plots supported this, as they all showed a similar shape.

The bar chart for neighborhood showed that the houses are spread fairly evenly across the three neighborhoods, with the East having the most (45), followed by the North (44) and the West (39). Box plots were made for price vs. neighborhood and square footage vs. neighborhood. The box plot for price vs. neighborhood showed that housing prices in the East and West neighborhoods are skewed: the West is skewed to the right and the East is skewed to the left. Prices in the North neighborhood are approximately normal but have a few outliers, one around $70,000 and two around $150,000. The box plot for square footage vs. neighborhood showed that square footage in the East neighborhood is approximately normal, while the North and West show some skewness, with the North slightly skewed to the left and the West skewed to the right.
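
The skewness and outlier readings above come from looking at the box plots. A rough numeric check could look like the sketch below. It is not part of the original analysis: `group_summary` and `group_outliers` are hypothetical helper names, the skewness column is a simple moment-based coefficient (positive for a right skew, negative for a left skew), and the outlier rule is the usual 1.5 × IQR fence computed from quartiles, which should agree closely with what `geom_boxplot()` draws.

```r
# Sketch only: numeric companions to the box plots.
# group_summary() and group_outliers() are hypothetical helpers,
# not functions used in the original analysis.

# Per-group count, five-number summary, mean, and moment skewness.
group_summary <- function(df, value, group) {
  pieces <- split(df[[value]], df[[group]])
  rows <- lapply(pieces, function(x) {
    m <- mean(x)
    q <- quantile(x, c(0.25, 0.75), names = FALSE)
    c(n = length(x), min = min(x), q1 = q[1], median = median(x),
      mean = m, q3 = q[2], max = max(x),
      # third central moment scaled by variance^(3/2)
      skew = mean((x - m)^3) / mean((x - m)^2)^(3 / 2))
  })
  do.call(rbind, rows)  # one row per group
}

# Values outside the 1.5 * IQR fences within each group.
group_outliers <- function(df, value, group, coef = 1.5) {
  lapply(split(df[[value]], df[[group]]), function(x) {
    q <- quantile(x, c(0.25, 0.75), names = FALSE)
    fence <- coef * (q[2] - q[1])
    x[x < q[1] - fence | x > q[2] + fence]
  })
}

# Usage, assuming HousePrices has already been loaded as above.
if (exists("HousePrices")) {
  print(group_summary(HousePrices, "Price", "Neighborhood"))
  print(group_outliers(HousePrices, "Price", "Neighborhood"))
  print(group_summary(HousePrices, "SqFt", "Neighborhood"))
}
```

Comparing the mean with the median in each row gives a second quick check on the direction of skew: a mean well above the median suggests a right skew, and a mean well below it suggests a left skew.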