My Boston House Value Predictor App

T. Roelofs
17 June

What is the app about?

The Boston Housing data set was gathered by D. Harrison and D. L. Rubinfeld for their 1978 study 'Hedonic prices and the demand for clean air' (Journal of Environmental Economics and Management, vol. 5, pp. 81-102). It records several variables that influence house prices in Boston, many of which are relevant to house prices elsewhere as well.

  • 506 rows of measurements
  • 14 variables describing houses and environmental factors
  • The 14th variable is the median value of owner-occupied homes, in units of $1000

The data set can be obtained at archive.ics.uci.edu/ml/datasets/housing. It is also included in the R package MASS, which is the source used in the App.
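As a minimal sketch, the data can be loaded from MASS and inspected as shown here. MASS ships the data with lower-case column names, while the code in this presentation uses upper-case names such as MEDV and a data frame called Boston_data. The renaming step is therefore an assumption about how the App prepares its data, not a confirmed part of it.

```r
# Sketch: load the Boston data from MASS and inspect its shape.
library(MASS)

data(Boston)
dim(Boston)    # number of rows and columns
names(Boston)  # MASS uses lower-case names such as "crim" and "medv"

# Assumption: the App works on a renamed copy with upper-case column names,
# in the same column order as the MASS data frame.
Boston_data <- Boston
names(Boston_data) <- c("CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                        "DIS", "RAD", "TAX", "PRATIO", "B", "LSTAT", "MEDV")
str(Boston_data)
```

With the names set this way, the formula MEDV ~ . treats the first 13 columns as predictors and the median home value as the outcome.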

What can the user do with the App?

The user of the App can do the following:

  • Adjust the input variables to see how they influence the predicted median house price
  • Compare the dollar-value predictions of the PLS (partial least squares) and GBM (gradient boosting machine) models
  • See the percentage difference between the predictions of the two models

Slide With Example Plot Generated in the App

[Plot: example output generated in the App]

Slide With Code Examples from the App

The following code is used to generate the models:

library(caret)  # provides train()
model_gbm <- train(MEDV ~ ., method = "gbm", data = Boston_data, verbose = FALSE)
model_pls <- train(MEDV ~ ., method = "pls", data = Boston_data, verbose = FALSE)

The predictions can be generated with input data frames:

input_values <- data.frame(50, 12, 12, 0, 0.5, 6, 70, 5, 10, 500, 20, 300, 13)
colnames(input_values) <- c("CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE", "DIS", "RAD", "TAX", "PRATIO", "B", "LSTAT")
pred_pls <- predict(model_pls, newdata = input_values)
pred_gbm <- predict(model_gbm, newdata = input_values)
print(pred_pls)
[1] 19.86858
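
The dollar and percentage differences between the two predictions could be computed along the following lines. This is only a sketch: the helper pct_diff is hypothetical, the choice to express the difference relative to the PLS prediction is an assumption, and the GBM value is made up for illustration because its output is not shown here.

```r
# Percentage difference of `a` relative to `b` (hypothetical helper).
pct_diff <- function(a, b) {
  100 * (a - b) / b
}

pred_pls <- 19.86858  # PLS prediction printed above, in $1000s
pred_gbm <- 21.5      # hypothetical GBM prediction, for illustration only

# Predictions are in units of $1000, so scale to dollars.
diff_dollars <- (pred_gbm - pred_pls) * 1000
diff_percent <- pct_diff(pred_gbm, pred_pls)

cat(sprintf("Difference: $%.0f (%.1f%% relative to PLS)\n",
            diff_dollars, diff_percent))
```

Taking the difference relative to one model keeps the calculation simple. Dividing by the mean of the two predictions instead would give a symmetric measure.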