Developing Data Products, Course Project

Brad Dietz
10/22/15

Boston Housing Data

This presentation introduces a Shiny application that lets the user see how selected variables affect the median value of Boston homes. The application predicts the median home value using five common regression methods.

Additionally, the Root Mean Squared Error (RMSE) is calculated once, at startup, for each regression type. A lower RMSE indicates a more accurate model.
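As a reminder of what RMSE measures, here is a minimal sketch in base R. The helper name `rmse` and the toy numbers are illustrative assumptions, not taken from the application's source, and the slides do not say whether the app computes RMSE on the training data or on held-out data.

```r
# Root mean squared error between observed and predicted values.
# `rmse` is a hypothetical helper name used only for this sketch.
rmse <- function(observed, predicted) {
  stopifnot(length(observed) == length(predicted))
  sqrt(mean((observed - predicted)^2))
}

# Toy values made up for illustration (not from the housing data).
# Every prediction is off by 1, so the RMSE is approximately 1.
observed  <- c(24.0, 21.6, 34.7, 33.4)
predicted <- c(25.0, 20.6, 33.7, 34.4)
toy_rmse <- rmse(observed, predicted)
print(toy_rmse)
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones.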

The application is available on

Note that it may take a few seconds to load, so please be patient! :)

The Data

This application is based on data originally published by Harrison, D. and Rubinfeld, D.L. 'Hedonic prices and the demand for clean air', J. Environ. Economics & Management, vol.5, 81-102, 1978.

The original dataset was obtained from the UCI Machine Learning Repository and processed as part of a peer assignment for the Coursera course Developing Data Products.

The dataset is described in the UCI MLR Archive.

Source code is available on GitHub.

Regressions

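library(MASS)   # rlm: robust linear regression
library(earth)  # earth: multivariate adaptive regression splines
library(pls)    # mvr: principal component / partial least squares regression
library(e1071)  # svm: support vector regression

Linear_model <- lm(MEDV ~ CRIM + ZN + INDUS + CHAS + NOX + RM + AGE +
    DIS + RAD + TAX + PTRATIO + LSTAT, data = housing)

RLM_model <- rlm(MEDV ~ CRIM + ZN + INDUS + CHAS + NOX + RM + AGE +
    DIS + RAD + TAX + PTRATIO + LSTAT, data = housing)

EARTH_model <- earth(MEDV ~ CRIM + ZN + INDUS + CHAS + NOX + RM + AGE +
    DIS + RAD + TAX + PTRATIO + LSTAT, data = housing)

MVR_model <- mvr(MEDV ~ CRIM + ZN + INDUS + CHAS + NOX + RM + AGE +
    DIS + RAD + TAX + PTRATIO + LSTAT, data = housing)

SVM_model <- svm(MEDV ~ CRIM + ZN + INDUS + CHAS + NOX + RM + AGE +
    DIS + RAD + TAX + PTRATIO + LSTAT, data = housing)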
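To illustrate how a fitted model can show the impact of one variable, the sketch below fits only the linear model and compares two predictions that differ in RM (average rooms per dwelling). It rests on assumptions: `MASS::Boston` stands in for the `housing` data frame (with column names upper-cased to match the formula), and the "baseline" input row built from column medians is hypothetical, not the app's actual input handling.

```r
# Assumption: MASS::Boston is used as a stand-in for `housing`;
# its columns are lower-case, so upper-case them to match the formula.
library(MASS)
housing <- Boston
names(housing) <- toupper(names(housing))

Linear_model <- lm(MEDV ~ CRIM + ZN + INDUS + CHAS + NOX + RM + AGE +
                     DIS + RAD + TAX + PTRATIO + LSTAT, data = housing)

# Hypothetical user input: every predictor at its median value.
predictors <- c("CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                "DIS", "RAD", "TAX", "PTRATIO", "LSTAT")
baseline <- as.data.frame(lapply(housing[predictors], median))

# Same input, but with one more room, as a slider change might produce.
more_rooms <- baseline
more_rooms$RM <- baseline$RM + 1

pred_baseline   <- predict(Linear_model, newdata = baseline)
pred_more_rooms <- predict(Linear_model, newdata = more_rooms)

# For a linear model the difference equals the RM coefficient.
print(unname(pred_more_rooms - pred_baseline))
```

The other four models accept the same `predict(model, newdata = ...)` pattern, although for them the effect of a one-unit change generally depends on the values of the remaining inputs.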

Summary of SVM_model


Call:
svm(formula = MEDV ~ CRIM + ZN + INDUS + CHAS + NOX + RM + AGE + 
    DIS + RAD + TAX + PTRATIO + LSTAT, data = housing)


Parameters:
   SVM-Type:  eps-regression 
 SVM-Kernel:  radial 
       cost:  1 
      gamma:  0.08333333 
    epsilon:  0.1 


Number of Support Vectors:  338
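The gamma value in the summary above is consistent with the e1071 default of one over the number of predictor columns. The short check below assumes all 12 predictors enter the model as single numeric columns (in particular that CHAS is numeric rather than a factor); `housing_formula` is a name introduced only for this sketch.

```r
# Count the predictors in the model formula and derive the default gamma.
housing_formula <- MEDV ~ CRIM + ZN + INDUS + CHAS + NOX + RM + AGE +
  DIS + RAD + TAX + PTRATIO + LSTAT

# all.vars() lists the response plus the predictors, so drop one.
n_predictors <- length(all.vars(housing_formula)) - 1
default_gamma <- 1 / n_predictors

print(n_predictors)
print(round(default_gamma, 8))
```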