Table of Contents

  • Introduction
  • Exploratory Data Analysis
  • Performing Multiple Linear Regression
  • Bootstrap Multiple Linear Regression
  • Final Model
  • Conclusion & Discussion

Introduction

  • Data set related to homes on the Melbourne housing market
    • 34,857 total observations and 21 different variables
    • 8 categorical variables (e.g. Suburb, Address, Method of Sale) and 13 numerical variables (e.g. Number of Rooms, Selling Price, Distance From Central Business District)
  • Exploratory Data Analysis & Multiple Linear Regression (MLR) with Selling Price as response variable
    • Best MLR model chosen from several candidate models

Exploratory Data Analysis

  • Group location variable created using latitude/longitude data
    • Factor variable main.group: TRUE (1) = in main cluster of houses, FALSE (0) = not in main cluster
## Warning: NAs introduced by coercion
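The slides do not say how the main cluster was delimited, so the following is only a minimal sketch of one way such a flag could be derived from latitude/longitude. It assumes a rectangular bounding box; the names `MAIN_BOX` and `in_main_group`, the record keys `lat` and `lon`, and the box limits themselves are hypothetical and not taken from the analysis.

```python
# Hypothetical bounding box for the "main cluster" of houses.
# These limits are illustrative assumptions, not values from the analysis.
MAIN_BOX = {"lat_min": -38.0, "lat_max": -37.6, "lon_min": 144.7, "lon_max": 145.3}


def in_main_group(lat, lon, box=MAIN_BOX):
    """Return True if the point lies inside the box, False if outside.

    Missing coordinates (None) give None, so the flag stays missing too.
    """
    if lat is None or lon is None:
        return None
    return (box["lat_min"] <= lat <= box["lat_max"]
            and box["lon_min"] <= lon <= box["lon_max"])


# Made-up records, used only to show the flag being attached to each home.
homes = [
    {"id": 1, "lat": -37.80, "lon": 144.99},
    {"id": 2, "lat": -36.90, "lon": 145.10},
    {"id": 3, "lat": None, "lon": 145.00},
]
for home in homes:
    home["main_group"] = in_main_group(home["lat"], home["lon"])
    print(home["id"], home["main_group"])
```

A distance threshold from a chosen centre point, or a clustering algorithm, would be equally plausible ways to define the group; the box is used here only because it is the simplest to read.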

Initial Full Model

Term                 Estimate    Std. Error      t value   Pr(>|t|)
(Intercept)      341598.26372  35441.748430    9.6383017  0.0000000
Rooms            191050.76381  20835.328167    9.1695587  0.0000000
Typet           -330310.57353  19774.823888  -16.7035912  0.0000000
Typeu           -397338.37437  17058.481520  -23.2927165  0.0000000
Bedroom2         -10296.44660  20751.931053   -0.4961681  0.6197876
Bathroom         259327.02431   9512.669773   27.2612243  0.0000000
Car               38405.27626   6011.405633    6.3887348  0.0000000
Landsize             23.36419      4.723378    4.9464991  0.0000008
BuildingArea         34.83513     11.819982    2.9471386  0.0032153
main.groupTRUE   204924.13563  23100.687825    8.8709106  0.0000000
Distance.num     -36587.73611   1000.907423  -36.5545657  0.0000000
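To make the coefficient table concrete, the sketch below evaluates the fitted equation for a single home. The estimates are copied from the table; the function name `predict_price`, its argument names, and the example home are hypothetical. It also assumes that the Type level absent from the table is the baseline category (written here as "h"), so both Type dummies are zero for it.

```python
# Coefficient estimates copied from the Initial Full Model table.
COEF = {
    "(Intercept)": 341598.26372,
    "Rooms": 191050.76381,
    "Typet": -330310.57353,
    "Typeu": -397338.37437,
    "Bedroom2": -10296.44660,
    "Bathroom": 259327.02431,
    "Car": 38405.27626,
    "Landsize": 23.36419,
    "BuildingArea": 34.83513,
    "main.groupTRUE": 204924.13563,
    "Distance.num": -36587.73611,
}


def predict_price(rooms, home_type, bedroom2, bathroom, car,
                  landsize, building_area, in_main_group, distance):
    """Fitted selling price: intercept plus the sum of coefficient * value."""
    terms = {
        "(Intercept)": 1.0,
        "Rooms": rooms,
        # Dummy variables: any type other than "t" or "u" is treated as the
        # baseline level (an assumption; the slides do not name the baseline).
        "Typet": 1.0 if home_type == "t" else 0.0,
        "Typeu": 1.0 if home_type == "u" else 0.0,
        "Bedroom2": bedroom2,
        "Bathroom": bathroom,
        "Car": car,
        "Landsize": landsize,
        "BuildingArea": building_area,
        "main.groupTRUE": 1.0 if in_main_group else 0.0,
        "Distance.num": distance,
    }
    return sum(COEF[name] * value for name, value in terms.items())


# Made-up example home, for illustration only.
example = predict_price(rooms=3, home_type="h", bedroom2=3, bathroom=2, car=1,
                        landsize=500, building_area=150,
                        in_main_group=True, distance=10)
print(f"Fitted price for the example home: {example:,.2f}")
```

Read this way, each estimate is the change in fitted price for a one-unit increase in that variable with the others held fixed: for example, an extra bathroom adds 259,327.02 to the fitted price, and a home of type "u" is fitted 397,338.37 below an otherwise identical baseline-type home.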