This study assesses the heating load and cooling load requirements of buildings (that is, their energy efficiency) as a function of building parameters.
library(readxl)
# Read the energy-efficiency data from the Excel workbook
Energy <- read_xlsx("ENB2012_data.xlsx")
head(Energy)
## # A tibble: 6 × 10
## X1 X2 X3 X4 X5 X6 X7 X8 Y1 Y2
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0.98 514. 294 110. 7 2 0 0 15.6 21.3
## 2 0.98 514. 294 110. 7 3 0 0 15.6 21.3
## 3 0.98 514. 294 110. 7 4 0 0 15.6 21.3
## 4 0.98 514. 294 110. 7 5 0 0 15.6 21.3
## 5 0.9 564. 318. 122. 7 2 0 0 20.8 28.3
## 6 0.9 564. 318. 122. 7 3 0 0 21.5 25.4
This dataset contains 10 variables: 8 predictors and 2 responses.
- X1: Relative Compactness
- X2: Surface Area
- X3: Wall Area
- X4: Roof Area
- X5: Overall Height
- X6: Orientation
- X7: Glazing Area
- X8: Glazing Area Distribution
- Y1: Heating Load
- Y2: Cooling Load
Fit two multiple regression models to predict heating load and cooling load using all 8 predictors, write out the linear expressions, and explain the coefficients of the models.
# Heating load (Y1) on all predictors; Y2 is excluded because it is the other response
HL <- lm(formula = Y1 ~ . - Y2, data = Energy)
summary(HL)
##
## Call:
## lm(formula = Y1 ~ . - Y2, data = Energy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.8965 -1.3196 -0.0252 1.3532 7.7052
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 84.013418 19.033613 4.414 1.16e-05 ***
## X1 -64.773432 10.289448 -6.295 5.19e-10 ***
## X2 -0.087289 0.017075 -5.112 4.04e-07 ***
## X3 0.060813 0.006648 9.148 < 2e-16 ***
## X4 NA NA NA NA
## X5 4.169954 0.337990 12.338 < 2e-16 ***
## X6 -0.023330 0.094705 -0.246 0.80548
## X7 19.932736 0.813986 24.488 < 2e-16 ***
## X8 0.203777 0.069918 2.915 0.00367 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.934 on 760 degrees of freedom
## Multiple R-squared: 0.9162, Adjusted R-squared: 0.9154
## F-statistic: 1187 on 7 and 760 DF, p-value: < 2.2e-16
# Cooling load (Y2) on all predictors; Y1 is excluded because it is the other response
CL <- lm(formula = Y2 ~ . - Y1, data = Energy)
summary(CL)
##
## Call:
## lm(formula = Y2 ~ . - Y1, data = Energy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.6940 -1.5606 -0.2668 1.3968 11.1775
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 97.245749 20.764711 4.683 3.34e-06 ***
## X1 -70.787707 11.225269 -6.306 4.85e-10 ***
## X2 -0.088245 0.018628 -4.737 2.59e-06 ***
## X3 0.044682 0.007253 6.161 1.17e-09 ***
## X4 NA NA NA NA
## X5 4.283843 0.368730 11.618 < 2e-16 ***
## X6 0.121510 0.103318 1.176 0.240
## X7 14.717068 0.888018 16.573 < 2e-16 ***
## X8 0.040697 0.076277 0.534 0.594
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.201 on 760 degrees of freedom
## Multiple R-squared: 0.8878, Adjusted R-squared: 0.8868
## F-statistic: 859.1 on 7 and 760 DF, p-value: < 2.2e-16
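The linear expressions requested above can be read directly off the two coefficient tables. The following is a transcription of the estimates, rounded to three or four significant digits, where $\hat{Y}_1$ and $\hat{Y}_2$ denote the fitted heating and cooling loads (this notation is introduced here and is not in the original output):

$$
\begin{aligned}
\hat{Y}_1 &= 84.013 - 64.773\,X_1 - 0.0873\,X_2 + 0.0608\,X_3 + 4.170\,X_5 - 0.0233\,X_6 + 19.933\,X_7 + 0.2038\,X_8 \\
\hat{Y}_2 &= 97.246 - 70.788\,X_1 - 0.0882\,X_2 + 0.0447\,X_3 + 4.284\,X_5 + 0.1215\,X_6 + 14.717\,X_7 + 0.0407\,X_8
\end{aligned}
$$

Each coefficient is the expected change in the load for a one-unit increase in that predictor, with the other predictors held fixed. For example, a one-unit increase in Glazing Area (X7) is associated with an increase of about 19.93 in predicted heating load, and a 0.1 increase in Relative Compactness (X1) with a decrease of about 6.48. X4 has no coefficient because R reports it as not defined due to a singularity, meaning it is an exact linear combination of the other predictors. The rows shown by head(Energy) appear consistent, up to the rounding of the tibble display, with X2 = X3 + 2·X4, which would explain this; that relation is an inference from the displayed rows, not something the output states.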
In the models above, some variables are not significant and X4 cannot be estimated because of the singularity, so I drop X4, X6, and X8 to obtain simpler models. For which predictors can we reject the null hypothesis that the coefficient is zero?
# Reduced heating-load model: drop X4 (not estimable), X6, and X8
HL2 <- lm(formula = Y1 ~ . - Y2 - X8 - X6 - X4, data = Energy)
summary(HL2)
##
## Call:
## lm(formula = Y1 ~ . - Y2 - X8 - X6 - X4, data = Energy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.3862 -1.3667 -0.0142 1.3162 7.5555
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 84.386471 19.111765 4.415 1.15e-05 ***
## X1 -64.773432 10.333611 -6.268 6.11e-10 ***
## X2 -0.087289 0.017149 -5.090 4.51e-07 ***
## X3 0.060813 0.006676 9.109 < 2e-16 ***
## X5 4.169954 0.339441 12.285 < 2e-16 ***
## X7 20.437968 0.798727 25.588 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.947 on 762 degrees of freedom
## Multiple R-squared: 0.9153, Adjusted R-squared: 0.9147
## F-statistic: 1646 on 5 and 762 DF, p-value: < 2.2e-16
# Reduced cooling-load model: drop X4 (not estimable), X6, and X8
CL2 <- lm(formula = Y2 ~ . - Y1 - X8 - X6 - X4, data = Energy)
summary(CL2)
##
## Call:
## lm(formula = Y2 ~ . - Y1 - X8 - X6 - X4, data = Energy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.7240 -1.6017 -0.2631 1.3417 11.3251
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 97.761848 20.756339 4.710 2.94e-06 ***
## X1 -70.787707 11.222822 -6.307 4.80e-10 ***
## X2 -0.088245 0.018624 -4.738 2.57e-06 ***
## X3 0.044682 0.007251 6.162 1.16e-09 ***
## X5 4.283843 0.368650 11.620 < 2e-16 ***
## X7 14.817971 0.867458 17.082 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.2 on 762 degrees of freedom
## Multiple R-squared: 0.8876, Adjusted R-squared: 0.8868
## F-statistic: 1203 on 5 and 762 DF, p-value: < 2.2e-16
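The variables that have a significant effect on energy efficiency, for both Heating Load (HL) and Cooling Load (CL), are Relative Compactness (X1), Surface Area (X2), Wall Area (X3), Overall Height (X5), and Glazing Area (X7); for each of these we reject the null hypothesis (p < 0.001 in both reduced models). The direction of each variable's relationship with HL or CL is given by the sign of its estimate: X1 and X2 have negative estimates, so the loads decrease as they increase, while X3, X5, and X7 have positive estimates. Glazing Area Distribution (X8) was significant for heating load in the full model (p = 0.00367) but not for cooling load (p = 0.594), and it was dropped from both reduced models. Dropping X6 and X8 leaves the fit almost unchanged: adjusted R-squared goes from 0.9154 to 0.9147 for HL and stays at 0.8868 for CL.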
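For reference, the reduced models HL2 and CL2 can be written out in the same way. This is a transcription of the estimates in the two summaries above, rounded to three or four significant digits, with $\hat{Y}_1$ and $\hat{Y}_2$ used here as assumed notation for the fitted heating and cooling loads:

$$
\begin{aligned}
\hat{Y}_1 &= 84.386 - 64.773\,X_1 - 0.0873\,X_2 + 0.0608\,X_3 + 4.170\,X_5 + 20.438\,X_7 \\
\hat{Y}_2 &= 97.762 - 70.788\,X_1 - 0.0882\,X_2 + 0.0447\,X_3 + 4.284\,X_5 + 14.818\,X_7
\end{aligned}
$$

The estimates for X1, X2, X3, and X5 are identical to those of the full models in the output above; only the intercept and the X7 coefficient change after X6 and X8 are removed.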