Boston Housing Data Analysis

Author

Tagg Shetty

Introduction

This blog analyzes the real estate market in Boston. Boston has a rich history and diverse neighborhoods, which makes it a fascinating case study for real estate trends. This project explores how variables such as crime rates, local education quality, and the physical characteristics of homes affect housing prices in Boston.

Dataset Introduction

I will be using the Boston Housing Dataset for my analysis. The data was collected by the U.S. Census Service and concerns housing in the area of Boston, Massachusetts.

https://github.com/selva86/datasets/blob/master/BostonHousing.csv
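A minimal sketch of reading the file with Python's standard library, assuming the CSV has a header row and only numeric values. The function name `load_rows` is my own, chosen for illustration, and the header is upper-cased so the keys match the data dictionary whether the file spells a column `medv` or `MEDV`.

```python
import csv
import io

def load_rows(text):
    """Parse CSV text into a list of dicts with upper-cased keys and float values."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for record in reader:
        # Assumes every field is numeric; a non-numeric cell would raise ValueError.
        rows.append({key.strip().upper(): float(value) for key, value in record.items()})
    return rows

# Two made-up rows with a subset of columns, only to show the shape of the result.
sample = "crim,chas,rm,rad,medv\n0.1,0,6.0,1,24.0\n5.0,1,5.5,24,15.0\n"
rows = load_rows(sample)
print(rows[0]["MEDV"])
```

For the real data, the same function could be given the contents of a downloaded copy of the file, for example `load_rows(open("BostonHousing.csv").read())`, where the local file name is an assumption.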

Data Dictionary

CRIM - per capita crime rate by town

ZN - proportion of residential land zoned for lots over 25,000 sq.ft.

INDUS - proportion of non-retail business acres per town.

CHAS - Charles River dummy variable (1 if tract bounds river; 0 otherwise)

NOX - nitric oxides concentration (parts per 10 million)

RM - average number of rooms per dwelling

AGE - proportion of owner-occupied units built prior to 1940

DIS - weighted distances to five Boston employment centres

RAD - index of accessibility to radial highways

TAX - full-value property-tax rate per $10,000

PTRATIO - pupil-teacher ratio by town

B - 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town

LSTAT - % lower status of the population

MEDV - Median value of owner-occupied homes in $1000’s

Analysis

Histogram of Housing Prices

This histogram gives us a better understanding of where home prices typically fall in the Boston housing market. Because MEDV is recorded in thousands of dollars, the typical median home value is a little over $20,000 (a MEDV of roughly 20). The data also appears to cap values at $50,000 (a MEDV of 50), which is important to keep in mind when analyzing the rest of the data.
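The counts behind a histogram like this can be reproduced without a plotting library. The sketch below bins values by width and measures how many observations sit at the cap. The helper names and the sample values are hypothetical, and `cap=50.0` assumes MEDV is expressed in thousands of dollars, as in the data dictionary.

```python
from collections import Counter

def histogram_counts(values, bin_width=5):
    """Count values per bin; each key is the lower edge of a bin of width bin_width."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

def share_at_cap(values, cap=50.0):
    """Fraction of observations at or above the top-coded value."""
    return sum(1 for v in values if v >= cap) / len(values)

# Hypothetical MEDV values in $1000s, for illustration only (not real data).
medv = [12.0, 18.5, 21.0, 22.4, 23.9, 31.0, 50.0, 50.0]
print(histogram_counts(medv))
print(share_at_cap(medv))
```

A noticeable share of values at the cap would suggest that the most expensive areas are understated, so averages and trends at the high end should be read with caution.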

Scatter Plot of Crime Rate by Housing Prices

There seems to be a clear negative relationship between crime rate and housing prices. The higher the crime rate of the town, the cheaper the houses tend to be, so homes in low-crime areas are worth more on average.
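One way to put a number on the pattern in the scatter plot is the Pearson correlation coefficient, where a value near -1 means prices fall steadily as crime rises. This is a sketch using invented sample values, not the result of the analysis above, and `pearson` is a helper written here for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical CRIM and MEDV pairs, for illustration only (not real data).
crim = [0.1, 0.5, 2.0, 8.0, 20.0]
medv = [30.0, 25.0, 22.0, 15.0, 10.0]
r = pearson(crim, medv)
print(round(r, 3))
```

Pearson correlation only captures straight-line relationships, so it may understate the link if crime rates are concentrated near zero with a few very high values.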

Box Plot of Home Price by Charles River Proximity

This chart shows that homes near the Charles River tend to be worth more on average. While there are some outliers, the difference could be due to homes near the Charles River being in nicer neighborhoods.
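The center line of each box in a box plot is the group median, so the comparison can be checked numerically. The sketch below groups rows by a column and reports the group size and median. The function name and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import median

def median_by_group(rows, group_key, value_key):
    """Return {group: (count, median)} for value_key within each group_key value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {g: (len(vals), median(vals)) for g, vals in sorted(groups.items())}

# Hypothetical rows, for illustration only (not real data).
rows = [
    {"CHAS": 0, "MEDV": 20.0}, {"CHAS": 0, "MEDV": 22.0},
    {"CHAS": 0, "MEDV": 24.0}, {"CHAS": 0, "MEDV": 18.0},
    {"CHAS": 1, "MEDV": 28.0}, {"CHAS": 1, "MEDV": 30.0},
]
summary = median_by_group(rows, "CHAS", "MEDV")
print(summary)
```

Reporting the count next to the median helps show when a group is small enough that its box should be read cautiously. The same helper would work for any other categorical column, such as RAD.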

Number of Rooms vs Housing Price

This visualization shows that the more rooms a house has, the more it tends to be worth. This is not the case for every home, since other factors also influence price, but it does show that the size of a home can be used to make a general approximation of its value.
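The idea of using rooms as a rough predictor can be made concrete with a least-squares line, whose slope estimates how much value changes per additional room. This is a sketch with invented numbers, and `fit_line` is a helper named here for illustration rather than the method used to produce the chart.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical RM and MEDV pairs, for illustration only (not real data).
rm = [5.0, 6.0, 7.0, 8.0]
medv = [15.0, 21.0, 30.0, 42.0]
slope, intercept = fit_line(rm, medv)
print(slope, intercept)
```

With MEDV in thousands of dollars, the slope reads as thousands of dollars per additional room. A single-variable fit ignores every other factor, so it is only a first approximation.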

Highway Proximity to Home Value

Next we have a box plot showing proximity to the highway and how it affects average home price. This box plot shows that homes right next to the highway are not worth as much, likely due to the noise and traffic. Homes that have access to the highway but are not too close are worth more on average, as this is probably ideal for people who commute regularly. There is then a drop-off in home value for RAD values 4, 5, and 6, but it spikes up again at RAD values 7 and 8. This could reflect 7 and 8 being a distance favored by people who are wealthy and do not care as much about being close to the city.

Sentiment Analysis

I scraped 20 reviews about the city of Boston to get an idea of why people want to live there. I collected the reviews from the URL below.

https://www.niche.com/places-to-live/boston-suffolk-ma/reviews/

This shows that people seem to like to live in Boston for a multitude of reasons. The opportunities the city presents seem to be a major influence on people living here. The beauty of the city also seems to attract people as the words “beautiful” and “charming” both came up. The city also seems like an entertaining place to live as “vibrant”, “enjoy”, and “attractions” also all came up.
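A simple way to surface recurring words like these is a word-frequency count over the review text. The sketch below is one possible approach, not necessarily the method used for the results above. The stop-word list is a small hand-picked assumption, and the two review sentences are invented placeholders rather than text from the site.

```python
import re
from collections import Counter

# Small hand-picked stop-word list; a real analysis would likely use a fuller one.
STOP_WORDS = {"the", "a", "and", "is", "to", "of", "in", "it", "i", "with", "for"}

def top_words(reviews, n=5):
    """Return the n most common non-stop-words across all reviews."""
    words = []
    for review in reviews:
        tokens = re.findall(r"[a-z']+", review.lower())
        words.extend(w for w in tokens if w not in STOP_WORDS)
    return Counter(words).most_common(n)

# Invented placeholder reviews, for illustration only.
reviews = [
    "Boston is a beautiful city with charming streets.",
    "A vibrant city, beautiful in the fall.",
]
print(top_words(reviews, 2))
```

Word counts show what reviewers talk about but not whether the tone is positive or negative, so they work best alongside a reading of the reviews themselves.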

Conclusion

There are a lot of different factors that go into making a house valuable in Boston. If we were to predict that a house is valuable based on the data, we would say it has the following characteristics.

Area with low crime rates

Close to the Charles River

Has lots of rooms, ideally at least 9

Is close enough to easily access the highway but not so close that it is right on top of it.

While it's not guaranteed that this house would have a high value, since some variables are not included in the data, based on what we have to work with this would be a valuable house in Boston.