Keavatey Srun, S3767615
Last updated: 24 October, 2019
[Photo: Melbourne’s Skyline (The Australian 2018)]
[Photo: House in North Melbourne (realestate.com.au 2019)]
Suburb | Address | Rooms | Type | Price | Day | Month | Year | Distance | Postcode | Bathroom | Car | YearBuilt |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Abbotsford | 68 Studley St | 2 | h | NA | 3 | 09 | 2016 | 2.5 | 3067 | 1 | 1 | NA |
Abbotsford | 85 Turner St | 2 | h | 1480000 | 3 | 12 | 2016 | 2.5 | 3067 | 1 | 1 | NA |
Abbotsford | 25 Bloomburg St | 2 | h | 1035000 | 4 | 02 | 2016 | 2.5 | 3067 | 1 | 0 | 1900 |
Abbotsford | 18/659 Victoria St | 3 | u | NA | 4 | 02 | 2016 | 2.5 | 3067 | 2 | 1 | NA |
Abbotsford | 5 Charles St | 3 | h | 1465000 | 4 | 03 | 2017 | 2.5 | 3067 | 2 | 0 | 1900 |
Abbotsford | 40 Federation La | 3 | h | 850000 | 4 | 03 | 2017 | 2.5 | 3067 | 2 | 1 | NA |
Suburb | Address | Rooms | Type | Price | Day | Month | Year | Distance | Postcode | Bathroom | Car | YearBuilt |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Epping | 10A Cabot Dr | 2 | h | 432000 | 6 | 01 | 2018 | 19.6 | 3076 | 1 | 1 | 1995 |
Fawkner | 1/1 Clara St | 2 | u | 412000 | 6 | 01 | 2018 | 13.1 | 3060 | 1 | 1 | NA |
Glenroy | 70 Beatty Av | 2 | t | 530000 | 6 | 01 | 2018 | 11.2 | 3046 | 1 | 2 | 2010 |
Glenroy | 43 Bindi St | 2 | h | 637000 | 6 | 01 | 2018 | 11.2 | 3046 | 1 | 2 | NA |
Glenroy | 181 Daley St | 2 | h | 628000 | 6 | 01 | 2018 | 11.2 | 3046 | 1 | 4 | NA |
Glenroy | 36 Gladstone Pde | 2 | h | 1245000 | 6 | 01 | 2018 | 11.2 | 3046 | 1 | 2 | NA |
## [1] 248 13
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 385000 788125 979775 977615 1102000 2220000
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.600 5.100 6.700 7.865 10.600 19.900
plot(Price ~ Distance, data = Houseprice0_clean,
     xlab = "Distance from CBD (km)", ylab = "House Price")
\(H_0\): The data does not fit the linear regression model.
\(H_A\): The data fits the linear regression model.
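An F-test will be used to test the overall model.
The assumptions of linear regression (independence, linearity, normality of residuals, homoscedasticity and the absence of influential cases) will also be checked.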
##
## Call:
## lm(formula = Price ~ Distance, data = Houseprice0_clean)
##
## Residuals:
## Min 1Q Median 3Q Max
## -592417 -190017 -35996 115621 1342832
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1110740 41858 26.536 < 2e-16 ***
## Distance -16926 4753 -3.561 0.000443 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 296500 on 246 degrees of freedom
## Multiple R-squared: 0.04902, Adjusted R-squared: 0.04516
## F-statistic: 12.68 on 1 and 246 DF, p-value: 0.0004433
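As a cross-check on the summary above: in a regression with a single predictor, the overall F statistic is the square of the slope's t statistic, and its p-value comes from the upper tail of the F distribution. The sketch below uses only numbers copied from the printed output; the variable names are illustrative.

```r
# Values copied from the regression output (t value taken at full printed precision)
t_slope  <- -3.561088   # t value for the Distance coefficient
df_resid <- 246         # residual degrees of freedom

# With one predictor, F equals the squared t statistic of the slope
f_stat <- t_slope^2

# Upper-tail probability of F on 1 and 246 degrees of freedom
p_value <- pf(f_stat, df1 = 1, df2 = df_resid, lower.tail = FALSE)

round(f_stat, 2)     # compare with the reported F-statistic of 12.68
signif(p_value, 4)   # compare with the reported p-value of 0.0004433
```

Because the p-value is well below 0.05, the F-test and the t-test on the slope lead to the same decision here.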
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1110740.24 41857.879 26.535989 3.859886e-74
## Distance -16925.56 4752.918 -3.561088 4.432840e-04
## 2.5 % 97.5 %
## (Intercept) 1028294.69 1193185.787
## Distance -26287.17 -7563.955
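The 95% interval for the slope can be rebuilt by hand from the estimate and its standard error, which may help when reading the output above. This is a minimal sketch using the printed values; the variable names are illustrative.

```r
# Values copied from the coefficient table
estimate  <- -16925.56   # slope for Distance
std_error <- 4752.918    # standard error of the slope
df_resid  <- 246         # residual degrees of freedom

# Critical t value for a two-sided 95% interval
t_crit <- qt(0.975, df = df_resid)

# Estimate plus or minus the margin of error
ci_slope <- estimate + c(-1, 1) * t_crit * std_error
round(ci_slope, 2)   # compare with the reported -26287.17 and -7563.955
```

The interval does not contain zero, which agrees with the significant t-test for Distance.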
The line of best fit: House Price = 1110740 - 16926 × Distance from CBD
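To make the fitted equation concrete, the sketch below evaluates it at two distances that lie inside the observed range (1.6 km to 19.9 km). The helper name `predict_price` is hypothetical, and the coefficients are the rounded values from the equation above.

```r
# Rounded coefficients from the fitted line
intercept <- 1110740
slope     <- -16926

# Hypothetical helper: predicted price for a given distance from the CBD (km)
predict_price <- function(distance_km) {
  intercept + slope * distance_km
}

predict_price(c(5, 10))   # 1026110 and 941480
```

Each additional kilometre from the CBD lowers the predicted price by about 16,926, although the low R-squared means individual prices vary widely around this line.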
Summarise Linear Relationship in a plot
Before the final regression model can be reported, all of the assumptions for linear regression mentioned earlier must be validated.
Independence is checked through the research design. Since the data set is cross-sectional, with each observation collected at one point in time, the independence of residuals is assumed to be met.
The plot shows a very slight curve, but the residuals are spread evenly around the horizontal line without a distinct pattern (the trend is close to flat). This is a good indication that the relationship is linear.
As seen in the plot above, the residuals lie close to a straight line, except towards the end, where they move away from it. This is a fairly good indication that the residuals are approximately normally distributed.
The residuals are reasonably well spread above and below a fairly horizontal line. At the beginning of the line there are fewer points along and below it, so the variance is slightly smaller there. However, homoscedasticity can still be assumed.
No values fall outside the bands; therefore, there is no evidence of influential cases.
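The four checks described above correspond to the standard diagnostic plots that base R draws for an `lm` object. The sketch below shows one way to produce them. Since the cleaned 248-row sample is not included here, it uses only the ten rows with a recorded price from the two table excerpts near the top of the report; the name `sample_rows` is hypothetical, and the resulting plots and coefficients will differ from those discussed in the text.

```r
# Ten rows with a recorded Price, copied from the table excerpts in this report.
# This is NOT the cleaned 248-row sample, so results will differ from the report.
sample_rows <- data.frame(
  Price    = c(1480000, 1035000, 1465000, 850000,
               432000, 412000, 530000, 637000, 628000, 1245000),
  Distance = c(2.5, 2.5, 2.5, 2.5,
               19.6, 13.1, 11.2, 11.2, 11.2, 11.2)
)

# Simple linear regression of price on distance
fit <- lm(Price ~ Distance, data = sample_rows)

# Arrange the four diagnostic plots in a 2 x 2 grid
old_par <- par(mfrow = c(2, 2))
plot(fit, which = 1)   # Residuals vs Fitted: linearity
plot(fit, which = 2)   # Normal Q-Q: normality of residuals
plot(fit, which = 3)   # Scale-Location: homoscedasticity
plot(fit, which = 5)   # Residuals vs Leverage: influential cases
par(old_par)
```

Replacing `sample_rows` with the full cleaned data frame should give the plots the interpretations above refer to.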
## [1] -0.2214115
## [1] -0.33669245 -0.09959113
## [1] 0.04902305
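The three values above appear to be the correlation coefficient r, its 95% confidence interval, and R-squared. The sketch below shows how they relate, assuming the interval is based on the Fisher z transformation with n = 248 observations. That assumption is not stated in the report, but it gives bounds that agree with the printed ones.

```r
# Values copied from the output above and from the data dimensions
r <- -0.2214115   # correlation between Price and Distance
n <- 248          # number of observations

# In simple linear regression, R-squared is the squared correlation
r_squared <- r^2

# Fisher z transformation: atanh(r) is approximately normal with SE 1 / sqrt(n - 3)
z    <- atanh(r)
se_z <- 1 / sqrt(n - 3)

# 95% interval on the z scale, transformed back to the r scale
ci_r <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)

round(r_squared, 5)   # compare with the reported 0.04902305
round(ci_r, 4)        # compare with the reported -0.33669245 and -0.09959113
```

The interval excludes zero, so the negative correlation is statistically significant, but an R-squared of about 0.049 means distance explains only about 4.9% of the variation in price.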