MRT stands for Mass Rapid Transit; distance_MRT is the distance to the subway, and its minimum is 0. House age ranges from 0 to 43.8 years. The five-number summary of the price per unit area is (7.60, 27.70, 38.45, 46.60, 117.50):

##     0%    25%    50%    75%   100% 
##   7.60  27.70  38.45  46.60 117.50

summary(model_estate)

## 
## Call:
## lm(formula = price_per_unit_area ~ house_age + stores + distance_MRT, 
##     data = Real_estate)
## 
## Coefficients:
##  (Intercept)     house_age        stores  distance_MRT  
##    42.977286     -0.252856      1.297442     -0.005379
## 
## Call:
## lm(formula = price_per_unit_area ~ house_age + stores + distance_MRT, 
##     data = Real_estate)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -37.304  -5.430  -1.738   4.325  77.315 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  42.977286   1.384542  31.041  < 2e-16 ***
## house_age    -0.252856   0.040105  -6.305 7.47e-10 ***
## stores        1.297443   0.194290   6.678 7.91e-11 ***
## distance_MRT -0.005379   0.000453 -11.874  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.251 on 410 degrees of freedom
## Multiple R-squared:  0.5411, Adjusted R-squared:  0.5377 
## F-statistic: 161.1 on 3 and 410 DF,  p-value: < 2.2e-16
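
The coefficients above define a fitted equation that can be evaluated by hand. The sketch below copies the estimates from the summary output; the helper `predict_price()` and the example house are hypothetical and only illustrate how each coefficient enters the prediction.

```r
# Estimates copied from the summary output above
coefs <- c(intercept = 42.977286, house_age = -0.252856,
           stores = 1.297443, distance_MRT = -0.005379)

# Hypothetical helper: predicted price per unit area for one house
predict_price <- function(house_age, stores, distance_MRT) {
  unname(coefs["intercept"] +
           coefs["house_age"] * house_age +
           coefs["stores"] * stores +
           coefs["distance_MRT"] * distance_MRT)
}

# Hypothetical house: 10 years old, 5 stores nearby, distance_MRT of 500
predict_price(house_age = 10, stores = 5, distance_MRT = 500)
```

For this made-up house the prediction is about 44.25. Each extra year of age lowers it by about 0.25, each extra store raises it by about 1.30, and each extra unit of distance lowers it by about 0.005.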

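The R-squared is 0.5411, so the model explains about 54% of the variation in price per unit area, which is a reasonable fit. The adjusted R-squared (0.5377) is only slightly lower, because it penalizes the number of predictors and all three predictors are significant. The p-value of the F-test is less than 2.2e-16, which is effectively zero, so the predictors together are useful for explaining changes in the value being predicted. The residual points lie near the zero line, which also suggests that the quality of the model is good.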
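
As a quick check on the adjusted R-squared, it can be recomputed from the numbers printed in the summary. This is a minimal sketch that assumes the standard adjustment formula and takes the sample size from the residual degrees of freedom; the variable names are illustrative.

```r
# Values read from the summary output
r2       <- 0.5411   # Multiple R-squared
df_resid <- 410      # residual degrees of freedom
p        <- 3        # number of predictors
n        <- df_resid + p + 1   # number of observations (414)

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
round(adj_r2, 4)
```

The result rounds to 0.5377, matching the value reported by `summary()`.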

##                     house_age distance_MRT stores price_per_unit_area
## house_age                1.00         0.03   0.05               -0.21
## distance_MRT             0.03         1.00  -0.60               -0.67
## stores                   0.05        -0.60   1.00                0.57
## price_per_unit_area     -0.21        -0.67   0.57                1.00

This is the correlation matrix of the variables. The number variable was removed because it had a higher p-value. There is a negative correlation (-0.60) between distance_MRT and the number of stores.

## 
## Call:
## lm(formula = price_per_unit_area ~ distance_MRT + house_age, 
##     data = Real_estate)
## 
## Coefficients:
##  (Intercept)  distance_MRT     house_age  
##    49.885586     -0.007209     -0.231027
## 
## Call:
## lm(formula = price_per_unit_area ~ distance_MRT + house_age, 
##     data = Real_estate)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -36.032  -4.742  -1.037   4.533  71.930 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  49.8855858  0.9677644  51.547  < 2e-16 ***
## distance_MRT -0.0072086  0.0003795 -18.997  < 2e-16 ***
## house_age    -0.2310266  0.0420383  -5.496 6.84e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.73 on 411 degrees of freedom
## Multiple R-squared:  0.4911, Adjusted R-squared:  0.4887 
## F-statistic: 198.3 on 2 and 411 DF,  p-value: < 2.2e-16
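
The two fits can be compared using only the printed R-squared values. The sketch below assumes the usual partial F-test for nested linear models; because it works from rounded output, the result is approximate, and the variable names are illustrative.

```r
# Values read from the two summary outputs
r2_full    <- 0.5411   # model with house_age, stores, distance_MRT
r2_reduced <- 0.4911   # model with distance_MRT and house_age only
df_full    <- 410      # residual degrees of freedom of the full model
q          <- 1        # number of predictors dropped (stores)

# Partial F statistic for the dropped predictor
f_partial <- ((r2_full - r2_reduced) / q) / ((1 - r2_full) / df_full)

# Upper-tail p-value from the F distribution
p_value <- pf(f_partial, q, df_full, lower.tail = FALSE)

f_partial
p_value
```

The statistic comes out near 44.7, close to the square of the t value for stores in the first model (6.678), so dropping stores costs a significant amount of explained variation. This agrees with the fall in adjusted R-squared from 0.5377 to 0.4887.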