data <- read.csv("Household energy bill data.csv")

I. Introduction

Background of the Study

Electricity is a fundamental input for powering the systems within residential properties. Access to electricity improves a household's standard of living, but at a cost to the environment (Zhang et al., 2017). To combat the potential effects of climate change, governments need to enact relevant energy regulations that encourage efficient energy consumption (Zhang et al., 2017). Nilsson et al. (2017) reported that a third of global electricity demand is consumed by residential households, while the services and industrial sectors account for 19% and 53%, respectively. Although the industrial sector holds the largest share of electricity consumption, companies have their own incentives to reduce electricity consumption voluntarily. The same applies to the services sector, where lower consumption supports the objective of growing profit. The residential sector, by contrast, does not necessarily reduce electricity consumption voluntarily, so there is a need to study it. Understanding the factors that influence household electricity consumption can help policymakers craft relevant energy regulations to encourage reduced energy consumption.

The utilization of data-driven approaches, such as regression modeling, offers valuable insights into the factors influencing residential energy consumption patterns. By identifying significant predictors, policymakers can tailor interventions to promote energy efficiency and reduce electricity bills for households. Thus, investigating the relationship between household characteristics and electricity consumption aligns with broader efforts to achieve sustainable energy practices.

Objective

This study aims to achieve the following objectives:
1. To create descriptive and statistical visuals, such as scatter plots and correlation plots, that provide a comprehensive understanding of the relationships between the variables.
2. To create a model using stepwise regression that predicts the monthly electricity bill (amount_paid) from the independent variables.
3. To create a model using multiple linear regression that examines the relationship between the independent variables and the dependent variable.
4. To assess the accuracy of the models created.

Statement of the Problem

The study aims to investigate the factors influencing monthly electricity consumption in households. It seeks to analyze the impact of various household and house characteristics, including the number of rooms, house area, presence of appliances, average monthly income, and number of children, on the monthly energy bill. By examining these factors, this study aims to address the following main research question: How do different household and house characteristics influence the monthly electricity consumption of households?

Hypotheses

Null Hypothesis (H0): There is no significant relationship between the dependent variable (amount_paid) and the independent variables (num_people, housearea, is_tv, is_ac, is_flat, ave_monthly_income, num_children, is_urban).

Alternative Hypothesis (H1): There is a significant relationship between the dependent variable (amount_paid) and the independent variables (num_people, housearea, is_tv, is_ac, is_flat, ave_monthly_income, num_children, is_urban).
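
A joint hypothesis of this kind is commonly evaluated with the overall F-test of a linear model. The sketch below illustrates the idea on simulated data; the data frame toy and its columns x1, x2, and y are hypothetical stand-ins, not the study's variables.

```r
# Minimal sketch on simulated data (toy, x1, x2, y are illustrative names only)
set.seed(1)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rbinom(n, 1, 0.5))
toy$y <- 100 + 20 * toy$x1 + 50 * toy$x2 + rnorm(n, sd = 10)

fit <- lm(y ~ x1 + x2, data = toy)

# summary() stores the overall F statistic with its degrees of freedom
f <- summary(fit)$fstatistic
p_value <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

# Reject H0 (no relationship with any predictor) when p_value < 0.05
p_value < 0.05
```

The same p-value is printed on the last line of summary(fit), so computing it by hand is only needed when it has to be used programmatically.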

III. Descriptive Statistics

class(data)
## [1] "data.frame"
str(data)
## 'data.frame':    1000 obs. of  10 variables:
##  $ num_rooms         : int  3 1 3 0 1 0 4 3 2 1 ...
##  $ num_people        : int  3 5 1 5 8 5 5 4 4 6 ...
##  $ housearea         : num  743 953 761 861 732 ...
##  $ is_ac             : int  1 0 1 1 0 0 0 0 1 0 ...
##  $ is_tv             : int  1 1 1 1 1 1 1 0 0 0 ...
##  $ is_flat           : int  1 0 1 0 0 1 0 1 0 0 ...
##  $ ave_monthly_income: num  9676 35065 22292 12139 17230 ...
##  $ num_children      : int  2 1 0 0 2 2 1 2 0 2 ...
##  $ is_urban          : int  0 1 0 0 1 1 1 1 1 1 ...
##  $ amount_paid       : num  560 633 512 333 658 ...
dplyr::glimpse(data)
## Rows: 1,000
## Columns: 10
## $ num_rooms          <int> 3, 1, 3, 0, 1, 0, 4, 3, 2, 1, 2, 1, 0, 2, 2, 1, 3, …
## $ num_people         <int> 3, 5, 1, 5, 8, 5, 5, 4, 4, 6, 6, 6, 3, 4, 7, 5, 5, …
## $ housearea          <dbl> 742.57, 952.99, 761.44, 861.32, 731.61, 837.24, 679…
## $ is_ac              <int> 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, …
## $ is_tv              <int> 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, …
## $ is_flat            <int> 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, …
## $ ave_monthly_income <dbl> 9675.93, 35064.79, 22292.44, 12139.08, 17230.10, 24…
## $ num_children       <int> 2, 1, 0, 0, 2, 2, 1, 2, 0, 2, 0, 0, 0, 0, 1, 0, 3, …
## $ is_urban           <int> 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, …
## $ amount_paid        <dbl> 560.4814, 633.2837, 511.8792, 332.9920, 658.2856, 7…
summary(data)
##    num_rooms        num_people       housearea          is_ac      
##  Min.   :-1.000   Min.   :-1.000   Min.   : 244.4   Min.   :0.000  
##  1st Qu.: 1.000   1st Qu.: 4.000   1st Qu.: 691.0   1st Qu.:0.000  
##  Median : 2.000   Median : 5.000   Median : 790.0   Median :0.000  
##  Mean   : 1.962   Mean   : 4.897   Mean   : 794.7   Mean   :0.376  
##  3rd Qu.: 3.000   3rd Qu.: 6.000   3rd Qu.: 893.0   3rd Qu.:1.000  
##  Max.   : 5.000   Max.   :11.000   Max.   :1189.1   Max.   :1.000  
##      is_tv          is_flat      ave_monthly_income  num_children  
##  Min.   :0.000   Min.   :0.000   Min.   :-1576      Min.   :0.000  
##  1st Qu.:1.000   1st Qu.:0.000   1st Qu.:18037      1st Qu.:0.000  
##  Median :1.000   Median :0.000   Median :24743      Median :1.000  
##  Mean   :0.798   Mean   :0.477   Mean   :24685      Mean   :1.078  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:31402      3rd Qu.:2.000  
##  Max.   :1.000   Max.   :1.000   Max.   :56531      Max.   :4.000  
##     is_urban      amount_paid     
##  Min.   :0.000   Min.   :  87.85  
##  1st Qu.:0.000   1st Qu.: 475.07  
##  Median :1.000   Median : 598.33  
##  Mean   :0.608   Mean   : 600.40  
##  3rd Qu.:1.000   3rd Qu.: 729.93  
##  Max.   :1.000   Max.   :1102.99

Tables for Categorical Variables

table(data$is_ac); table(data$is_tv)
## 
##   0   1 
## 624 376
## 
##   0   1 
## 202 798
table(data$is_flat); table(data$is_urban)
## 
##   0   1 
## 523 477
## 
##   0   1 
## 392 608

The tables above summarize the categorical variables in the dataset. A value of 1 indicates that the household has the given characteristic, while 0 indicates that it does not. For the appliances that contribute to the bill, 376 households have an air conditioner and 624 do not, whereas TV ownership is far more common: 798 households have a TV and only 202 do not. Regarding the dwelling, 523 households do not live in a flat and 477 do. Lastly, more households are located in urban areas (608) than outside them (392).
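
The counts above can also be expressed as percentages, which makes the four variables easier to compare. The sketch below re-enters the counts printed by table() by hand; the object names counts and shares are illustrative.

```r
# Counts copied from the table() output above (rows: variable, columns: 0 = no, 1 = yes)
counts <- rbind(
  is_ac    = c(no = 624, yes = 376),
  is_tv    = c(no = 202, yes = 798),
  is_flat  = c(no = 523, yes = 477),
  is_urban = c(no = 392, yes = 608)
)

# Row-wise proportions, expressed as percentages
shares <- round(100 * prop.table(counts, margin = 1), 1)
print(shares)
```

With the data frame loaded, prop.table(table(data$is_ac)) would give the same proportions for a single variable without retyping the counts.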

psych::describe(data)
psych::describeBy(data, data$num_people)
## 
##  Descriptive statistics by group 
## group: -1
##                    vars n     mean      sd   median  trimmed     mad      min
## num_rooms             1 4     2.50    1.29     2.50     2.50    1.48     1.00
## num_people            2 4    -1.00    0.00    -1.00    -1.00    0.00    -1.00
## housearea             3 4   782.63  202.96   738.69   782.63  154.82   594.82
## is_ac                 4 4     0.25    0.50     0.00     0.25    0.00     0.00
## is_tv                 5 4     0.75    0.50     1.00     0.75    0.00     0.00
## is_flat               6 4     0.50    0.58     0.50     0.50    0.74     0.00
## ave_monthly_income    7 4 27033.30 7701.81 28037.55 27033.30 7255.75 17182.76
## num_children          8 4     1.25    1.89     0.50     1.25    0.74     0.00
## is_urban              9 4     0.75    0.50     1.00     0.75    0.00     0.00
## amount_paid          10 4   538.23   50.79   534.69   538.23   54.93   484.60
##                         max    range  skew kurtosis      se
## num_rooms              4.00     3.00  0.00    -2.08    0.65
## num_people            -1.00     0.00   NaN      NaN    0.00
## housearea           1058.32   463.50  0.39    -1.94  101.48
## is_ac                  1.00     1.00  0.75    -1.69    0.25
## is_tv                  1.00     1.00 -0.75    -1.69    0.25
## is_flat                1.00     1.00  0.00    -2.44    0.29
## ave_monthly_income 34875.33 17692.57 -0.23    -2.04 3850.91
## num_children           4.00     4.00  0.62    -1.79    0.95
## is_urban               1.00     1.00 -0.75    -1.69    0.25
## amount_paid          598.94   114.34  0.11    -2.15   25.39
## ------------------------------------------------------------ 
## group: 0
##                    vars  n     mean      sd   median  trimmed     mad      min
## num_rooms             1 13     2.62    0.77     3.00     2.64    0.00     1.00
## num_people            2 13     0.00    0.00     0.00     0.00    0.00     0.00
## housearea             3 13   840.34  183.83   797.67   836.26  131.79   549.85
## is_ac                 4 13     0.46    0.52     0.00     0.45    0.00     0.00
## is_tv                 5 13     0.77    0.44     1.00     0.82    0.00     0.00
## is_flat               6 13     0.62    0.51     1.00     0.64    0.00     0.00
## ave_monthly_income    7 13 24772.57 9160.53 24223.22 24505.63 9135.83 12317.68
## num_children          8 13     1.15    0.90     1.00     1.18    1.48     0.00
## is_urban              9 13     0.85    0.38     1.00     0.91    0.00     0.00
## amount_paid          10 13   697.10  155.81   709.07   724.80   78.88   238.90
##                         max    range  skew kurtosis      se
## num_rooms              4.00     3.00 -0.36    -0.52    0.21
## num_people             0.00     0.00   NaN      NaN    0.00
## housearea           1175.71   625.86  0.24    -1.16   50.99
## is_ac                  1.00     1.00  0.14    -2.13    0.14
## is_tv                  1.00     1.00 -1.13    -0.76    0.12
## is_flat                1.00     1.00 -0.42    -1.96    0.14
## ave_monthly_income 40163.85 27846.17  0.29    -1.22 2540.67
## num_children           2.00     2.00 -0.27    -1.80    0.25
## is_urban               1.00     1.00 -1.70     0.99    0.10
## amount_paid          850.62   611.72 -1.79     2.91   43.21
## ------------------------------------------------------------ 
## group: 1
##                    vars  n     mean      sd   median  trimmed     mad     min
## num_rooms             1 33     2.06    0.97     2.00     2.11    1.48    0.00
## num_people            2 33     1.00    0.00     1.00     1.00    0.00    1.00
## housearea             3 33   798.39  147.98   808.88   801.53  140.46  515.34
## is_ac                 4 33     0.33    0.48     0.00     0.30    0.00    0.00
## is_tv                 5 33     0.73    0.45     1.00     0.78    0.00    0.00
## is_flat               6 33     0.45    0.51     0.00     0.44    0.00    0.00
## ave_monthly_income    7 33 22898.67 8572.32 22346.64 22789.89 8787.93 7525.82
## num_children          8 33     0.94    0.83     1.00     0.89    1.48    0.00
## is_urban              9 33     0.52    0.51     1.00     0.52    0.00    0.00
## amount_paid          10 33   531.72  152.01   535.70   530.02  183.13  272.87
##                         max    range  skew kurtosis      se
## num_rooms              4.00     4.00 -0.32    -0.65    0.17
## num_people             1.00     0.00   NaN      NaN    0.00
## housearea           1077.31   561.97 -0.20    -0.97   25.76
## is_ac                  1.00     1.00  0.68    -1.59    0.08
## is_tv                  1.00     1.00 -0.97    -1.08    0.08
## is_flat                1.00     1.00  0.17    -2.03    0.09
## ave_monthly_income 40260.49 32734.67  0.10    -0.64 1492.25
## num_children           3.00     3.00  0.43    -0.70    0.14
## is_urban               1.00     1.00 -0.06    -2.06    0.09
## amount_paid          813.74   540.88  0.13    -1.06   26.46
## ------------------------------------------------------------ 
## group: 2
##                    vars  n     mean      sd   median  trimmed     mad    min
## num_rooms             1 72     1.79    1.01     2.00     1.81    1.48  -1.00
## num_people            2 72     2.00    0.00     2.00     2.00    0.00   2.00
## housearea             3 72   796.60  130.24   767.43   788.63  127.17 516.37
## is_ac                 4 72     0.44    0.50     0.00     0.43    0.00   0.00
## is_tv                 5 72     0.82    0.39     1.00     0.90    0.00   0.00
## is_flat               6 72     0.51    0.50     1.00     0.52    0.00   0.00
## ave_monthly_income    7 72 23761.73 8843.90 24221.15 23924.47 8881.03  37.78
## num_children          8 72     1.11    0.99     1.00     1.03    1.48   0.00
## is_urban              9 72     0.60    0.49     1.00     0.62    0.00   0.00
## amount_paid          10 72   604.92  162.57   598.19   601.90  137.79 224.59
##                         max    range  skew kurtosis      se
## num_rooms              4.00     5.00 -0.07    -0.02    0.12
## num_people             2.00     0.00   NaN      NaN    0.00
## housearea           1136.00   619.63  0.51    -0.29   15.35
## is_ac                  1.00     1.00  0.22    -1.98    0.06
## is_tv                  1.00     1.00 -1.63     0.66    0.05
## is_flat                1.00     1.00 -0.05    -2.02    0.06
## ave_monthly_income 46993.00 46955.22 -0.09    -0.10 1042.26
## num_children           4.00     4.00  0.48    -0.50    0.12
## is_urban               1.00     1.00 -0.39    -1.87    0.06
## amount_paid         1018.81   794.22  0.18    -0.06   19.16
## ------------------------------------------------------------ 
## group: 3
##                    vars  n     mean      sd   median  trimmed      mad     min
## num_rooms             1 96     2.01    1.07     2.00     2.00     1.48    0.00
## num_people            2 96     3.00    0.00     3.00     3.00     0.00    3.00
## housearea             3 96   790.93  151.63   776.86   791.93   147.74  443.63
## is_ac                 4 96     0.34    0.48     0.00     0.31     0.00    0.00
## is_tv                 5 96     0.82    0.38     1.00     0.90     0.00    0.00
## is_flat               6 96     0.42    0.50     0.00     0.40     0.00    0.00
## ave_monthly_income    7 96 24262.36 9544.91 24666.29 24105.45 11005.15 1220.02
## num_children          8 96     1.16    0.97     1.00     1.09     1.48    0.00
## is_urban              9 96     0.60    0.49     1.00     0.63     0.00    0.00
## amount_paid          10 96   579.97  173.64   562.67   585.17   158.04   97.54
##                         max    range  skew kurtosis     se
## num_rooms              5.00     5.00  0.13    -0.34   0.11
## num_people             3.00     0.00   NaN      NaN   0.00
## housearea           1118.29   674.66  0.00    -0.47  15.48
## is_ac                  1.00     1.00  0.65    -1.60   0.05
## is_tv                  1.00     1.00 -1.67     0.78   0.04
## is_flat                1.00     1.00  0.33    -1.91   0.05
## ave_monthly_income 49294.85 48074.83  0.11    -0.38 974.17
## num_children           3.00     3.00  0.24    -1.08   0.10
## is_urban               1.00     1.00 -0.42    -1.84   0.05
## amount_paid          915.47   817.93 -0.20    -0.20  17.72
## ------------------------------------------------------------ 
## group: 4
##                    vars   n     mean       sd   median  trimmed      mad
## num_rooms             1 187     1.92     0.97     2.00     1.89     1.48
## num_people            2 187     4.00     0.00     4.00     4.00     0.00
## housearea             3 187   807.87   144.54   794.74   804.18   142.54
## is_ac                 4 187     0.30     0.46     0.00     0.25     0.00
## is_tv                 5 187     0.77     0.42     1.00     0.83     0.00
## is_flat               6 187     0.44     0.50     0.00     0.42     0.00
## ave_monthly_income    7 187 25480.34 11028.43 25193.60 25445.58 11493.49
## num_children          8 187     1.10     0.88     1.00     1.06     1.48
## is_urban              9 187     0.68     0.47     1.00     0.73     0.00
## amount_paid          10 187   603.38   172.81   598.52   600.53   159.63
##                         min      max    range  skew kurtosis     se
## num_rooms              0.00     4.00     4.00  0.23    -0.18   0.07
## num_people             4.00     4.00     0.00   NaN      NaN   0.00
## housearea            485.29  1185.36   700.07  0.24    -0.27  10.57
## is_ac                  0.00     1.00     1.00  0.87    -1.25   0.03
## is_tv                  0.00     1.00     1.00 -1.27    -0.38   0.03
## is_flat                0.00     1.00     1.00  0.25    -1.95   0.04
## ave_monthly_income -1177.42 56531.08 57708.50  0.08    -0.07 806.48
## num_children           0.00     3.00     3.00  0.28    -0.81   0.06
## is_urban               0.00     1.00     1.00 -0.79    -1.39   0.03
## amount_paid          152.81  1022.60   869.80  0.16    -0.19  12.64
## ------------------------------------------------------------ 
## group: 5
##                    vars   n     mean      sd   median  trimmed     mad      min
## num_rooms             1 224     1.90    1.09     2.00     1.92    1.48    -1.00
## num_people            2 224     5.00    0.00     5.00     5.00    0.00     5.00
## housearea             3 224   778.91  157.42   778.52   778.46  161.20   244.40
## is_ac                 4 224     0.38    0.49     0.00     0.36    0.00     0.00
## is_tv                 5 224     0.80    0.40     1.00     0.88    0.00     0.00
## is_flat               6 224     0.50    0.50     0.50     0.50    0.74     0.00
## ave_monthly_income    7 224 25106.43 9302.86 26217.44 25180.88 9178.12 -1013.85
## num_children          8 224     1.04    0.94     1.00     0.94    1.48     0.00
## is_urban              9 224     0.56    0.50     1.00     0.58    0.00     0.00
## amount_paid          10 224   588.22  207.94   582.97   585.93  230.75    87.85
##                         max    range  skew kurtosis     se
## num_rooms              5.00     6.00  0.01    -0.40   0.07
## num_people             5.00     0.00   NaN      NaN   0.00
## housearea           1189.12   944.72 -0.11     0.14  10.52
## is_ac                  1.00     1.00  0.47    -1.78   0.03
## is_tv                  1.00     1.00 -1.52     0.31   0.03
## is_flat                1.00     1.00  0.00    -2.01   0.03
## ave_monthly_income 47904.37 48918.22 -0.14    -0.15 621.57
## num_children           3.00     3.00  0.55    -0.63   0.06
## is_urban               1.00     1.00 -0.25    -1.95   0.03
## amount_paid         1094.76  1006.91  0.10    -0.67  13.89
## ------------------------------------------------------------ 
## group: 6
##                    vars   n     mean      sd   median  trimmed     mad      min
## num_rooms             1 174     1.98    1.01     2.00     1.96    1.48     0.00
## num_people            2 174     6.00    0.00     6.00     6.00    0.00     6.00
## housearea             3 174   789.20  143.57   788.26   790.31  142.49   361.13
## is_ac                 4 174     0.40    0.49     0.00     0.38    0.00     0.00
## is_tv                 5 174     0.78    0.42     1.00     0.84    0.00     0.00
## is_flat               6 174     0.50    0.50     0.50     0.50    0.74     0.00
## ave_monthly_income    7 174 23787.30 9439.65 22769.45 23566.13 9867.36 -1576.44
## num_children          8 174     1.11    1.01     1.00     0.99    1.48     0.00
## is_urban              9 174     0.58    0.49     1.00     0.60    0.00     0.00
## amount_paid          10 174   602.76  186.09   603.33   604.65  191.61   175.89
##                         max    range  skew kurtosis     se
## num_rooms              5.00     5.00  0.23    -0.04   0.08
## num_people             6.00     0.00   NaN      NaN   0.00
## housearea           1129.49   768.36 -0.10     0.06  10.88
## is_ac                  1.00     1.00  0.40    -1.85   0.04
## is_tv                  1.00     1.00 -1.31    -0.28   0.03
## is_flat                1.00     1.00  0.00    -2.01   0.04
## ave_monthly_income 47483.47 49059.91  0.19    -0.30 715.62
## num_children           4.00     4.00  0.65    -0.26   0.08
## is_urban               1.00     1.00 -0.32    -1.91   0.04
## amount_paid         1102.99   927.11 -0.01    -0.42  14.11
## ------------------------------------------------------------ 
## group: 7
##                    vars   n     mean      sd   median  trimmed      mad     min
## num_rooms             1 104     2.09    0.97     2.00     2.12     1.48    0.00
## num_people            2 104     7.00    0.00     7.00     7.00     0.00    7.00
## housearea             3 104   797.56  142.35   806.61   799.24   157.42  478.33
## is_ac                 4 104     0.37    0.48     0.00     0.33     0.00    0.00
## is_tv                 5 104     0.85    0.36     1.00     0.93     0.00    0.00
## is_flat               6 104     0.48    0.50     0.00     0.48     0.00    0.00
## ave_monthly_income    7 104 25605.47 9763.12 25669.60 25539.26 10163.16 -526.54
## num_children          8 104     0.97    0.88     1.00     0.90     1.48    0.00
## is_urban              9 104     0.64    0.48     1.00     0.68     0.00    0.00
## amount_paid          10 104   618.82  162.06   630.76   621.70   161.90  226.46
##                         max    range  skew kurtosis     se
## num_rooms              4.00     4.00 -0.23    -0.44   0.09
## num_people             7.00     0.00   NaN      NaN   0.00
## housearea           1130.50   652.17 -0.09    -0.61  13.96
## is_ac                  1.00     1.00  0.55    -1.71   0.05
## is_tv                  1.00     1.00 -1.89     1.59   0.04
## is_flat                1.00     1.00  0.08    -2.01   0.05
## ave_monthly_income 49063.95 49590.49  0.02    -0.24 957.35
## num_children           3.00     3.00  0.49    -0.67   0.09
## is_urban               1.00     1.00 -0.59    -1.66   0.05
## amount_paid          976.82   750.36 -0.16    -0.47  15.89
## ------------------------------------------------------------ 
## group: 8
##                    vars  n     mean      sd   median  trimmed     mad     min
## num_rooms             1 54     2.02    1.14     2.00     2.07    1.48   -1.00
## num_people            2 54     8.00    0.00     8.00     8.00    0.00    8.00
## housearea             3 54   784.06  137.94   792.98   782.72  162.11  518.49
## is_ac                 4 54     0.54    0.50     1.00     0.55    0.00    0.00
## is_tv                 5 54     0.85    0.36     1.00     0.93    0.00    0.00
## is_flat               6 54     0.43    0.50     0.00     0.41    0.00    0.00
## ave_monthly_income    7 54 23038.59 9078.73 22968.41 22768.94 9951.20 4577.68
## num_children          8 54     1.07    0.93     1.00     0.98    1.48    0.00
## is_urban              9 54     0.57    0.50     1.00     0.59    0.00    0.00
## amount_paid          10 54   639.65  171.78   670.07   645.37  179.43  273.60
##                         max    range  skew kurtosis      se
## num_rooms              4.00     5.00 -0.48     0.15    0.16
## num_people             8.00     0.00   NaN      NaN    0.00
## housearea           1055.17   536.68  0.07    -0.98   18.77
## is_ac                  1.00     1.00 -0.14    -2.02    0.07
## is_tv                  1.00     1.00 -1.93     1.74    0.05
## is_flat                1.00     1.00  0.29    -1.95    0.07
## ave_monthly_income 50110.52 45532.84  0.36    -0.01 1235.46
## num_children           4.00     4.00  0.83     0.53    0.13
## is_urban               1.00     1.00 -0.29    -1.95    0.07
## amount_paid          961.68   688.08 -0.31    -0.56   23.38
## ------------------------------------------------------------ 
## group: 9
##                    vars  n     mean      sd   median  trimmed     mad      min
## num_rooms             1 28     1.79    1.03     2.00     1.75    1.48     0.00
## num_people            2 28     9.00    0.00     9.00     9.00    0.00     9.00
## housearea             3 28   860.13  137.06   850.00   858.29  152.42   572.76
## is_ac                 4 28     0.39    0.50     0.00     0.38    0.00     0.00
## is_tv                 5 28     0.82    0.39     1.00     0.88    0.00     0.00
## is_flat               6 28     0.54    0.51     1.00     0.54    0.00     0.00
## ave_monthly_income    7 28 26224.54 7141.07 26320.66 26245.03 5404.08 10770.83
## num_children          8 28     1.25    0.93     1.00     1.21    1.48     0.00
## is_urban              9 28     0.61    0.50     1.00     0.62    0.00     0.00
## amount_paid          10 28   632.07  179.83   615.61   633.67  190.37   262.13
##                         max    range  skew kurtosis      se
## num_rooms              4.00     4.00  0.22    -0.23    0.19
## num_people             9.00     0.00   NaN      NaN    0.00
## housearea           1123.53   550.77  0.08    -0.71   25.90
## is_ac                  1.00     1.00  0.42    -1.89    0.09
## is_tv                  1.00     1.00 -1.59     0.55    0.07
## is_flat                1.00     1.00 -0.14    -2.05    0.10
## ave_monthly_income 44083.19 33312.36  0.17     0.16 1349.54
## num_children           3.00     3.00  0.05    -1.12    0.18
## is_urban               1.00     1.00 -0.42    -1.89    0.09
## amount_paid          992.32   730.19 -0.14    -0.69   33.98
## ------------------------------------------------------------ 
## group: 10
##                    vars n     mean       sd   median  trimmed      mad     min
## num_rooms             1 9     2.00     1.22     2.00     2.00     1.48    0.00
## num_people            2 9    10.00     0.00    10.00    10.00     0.00   10.00
## housearea             3 9   808.36   188.37   858.17   808.36   239.40  496.23
## is_ac                 4 9     0.33     0.50     0.00     0.33     0.00    0.00
## is_tv                 5 9     0.67     0.50     1.00     0.67     0.00    0.00
## is_flat               6 9     0.44     0.53     0.00     0.44     0.00    0.00
## ave_monthly_income    7 9 23171.91 12738.47 24743.19 23171.91 17065.19 6031.98
## num_children          8 9     1.00     0.50     1.00     1.00     0.00    0.00
## is_urban              9 9     0.44     0.53     0.00     0.44     0.00    0.00
## amount_paid          10 9   569.14   161.90   505.92   569.14    99.42  408.76
##                         max    range  skew kurtosis      se
## num_rooms              3.00     3.00 -0.73    -1.22    0.41
## num_people            10.00     0.00   NaN      NaN    0.00
## housearea           1064.73   568.50 -0.25    -1.41   62.79
## is_ac                  1.00     1.00  0.59    -1.81    0.17
## is_tv                  1.00     1.00 -0.59    -1.81    0.17
## is_flat                1.00     1.00  0.19    -2.17    0.18
## ave_monthly_income 40972.21 34940.23  0.09    -1.71 4246.16
## num_children           2.00     2.00  0.00     0.56    0.17
## is_urban               1.00     1.00  0.19    -2.17    0.18
## amount_paid          819.69   410.93  0.59    -1.53   53.97
## ------------------------------------------------------------ 
## group: 11
##                    vars n     mean       sd   median  trimmed      mad      min
## num_rooms             1 2     2.00     1.41     2.00     2.00     1.48     1.00
## num_people            2 2    11.00     0.00    11.00    11.00     0.00    11.00
## housearea             3 2   751.18   276.56   751.18   751.18   289.94   555.62
## is_ac                 4 2     0.00     0.00     0.00     0.00     0.00     0.00
## is_tv                 5 2     0.50     0.71     0.50     0.50     0.74     0.00
## is_flat               6 2     1.00     0.00     1.00     1.00     0.00     1.00
## ave_monthly_income    7 2 40792.07 16932.43 40792.07 40792.07 17751.22 28819.03
## num_children          8 2     1.00     1.41     1.00     1.00     1.48     0.00
## is_urban              9 2     1.00     0.00     1.00     1.00     0.00     1.00
## amount_paid          10 2   605.70   175.92   605.70   605.70   184.42   481.31
##                         max    range skew kurtosis       se
## num_rooms              3.00     2.00    0    -2.75     1.00
## num_people            11.00     0.00  NaN      NaN     0.00
## housearea            946.74   391.12    0    -2.75   195.56
## is_ac                  0.00     0.00  NaN      NaN     0.00
## is_tv                  1.00     1.00    0    -2.75     0.50
## is_flat                1.00     0.00  NaN      NaN     0.00
## ave_monthly_income 52765.10 23946.07    0    -2.75 11973.03
## num_children           2.00     2.00    0    -2.75     1.00
## is_urban               1.00     0.00  NaN      NaN     0.00
## amount_paid          730.09   248.78    0    -2.75   124.39

The psych package is well suited to describing this household monthly electricity bill dataset because of its versatile tools for statistical analysis. Several key variables provide insight into the characteristics of the households. On average, households have approximately 1.96 rooms, with values ranging from -1 to 5 and a median of 2. The average household size is about 4.90 people, ranging from -1 to 11, with a median of 5. The average house area is 794.70 square units, varying between 244.40 and 1189.12 square units, with a median of 789.97. About 38% of households have air conditioning, and around 80% have a TV. Approximately 48% of households live in flats. The average monthly income is $24,684.99, with incomes ranging from -$1,576.44 to $56,531.08 and a median of $24,742.57. On average, households have around 1.08 children, ranging from 0 to 4, with a median of 1. About 61% of households are located in urban areas. Lastly, the average monthly bill payment is $600.40, ranging from $87.85 to $1,102.99, with a median of $598.33. The negative minimum values for the number of rooms, the number of people, and income are not physically meaningful and probably reflect data-entry errors or noise in the dataset.
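
The summary() output reports minimum values of -1 for num_rooms and num_people and a negative minimum for ave_monthly_income. A quick screen for such values might look like the sketch below. It runs on a small hand-made data frame (toy), and whether the affected rows should be dropped, recoded, or kept is a decision the original analysis does not state, so the filtering step is only one possible choice.

```r
# Hand-made example rows; only the column names mirror the dataset
toy <- data.frame(
  num_rooms          = c(3, -1, 2, 0),
  num_people         = c(4, 5, -1, 6),
  ave_monthly_income = c(9675.93, 24742.57, 18037, -1576.44)
)

check_cols <- c("num_rooms", "num_people", "ave_monthly_income")

# Number of negative entries per column
negative_counts <- colSums(toy[check_cols] < 0)
print(negative_counts)

# Rows with at least one negative entry (assumption: treat these as invalid)
invalid_rows <- rowSums(toy[check_cols] < 0) > 0
clean <- toy[!invalid_rows, ]
```

Applying the same lines to the full data frame would show how many of the 1000 observations are affected before any modeling decision is made.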

cor(data$num_children, data$amount_paid)
## [1] 0.4475123
psych::pairs.panels(data)

The pairs.panels correlation analysis shows that is_urban, a binary (categorical) variable, has the highest correlation with the dependent variable amount_paid, at 0.65. Among the numerical variables, num_children has the highest correlation with the DV, at 0.45. The number of rooms has the weakest correlation with the DV, at only -0.02.
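
Reading the ranking off a large panel plot can be error-prone, so the correlations with the dependent variable can also be sorted numerically. The sketch below uses simulated data whose column names mirror the dataset; the coefficients used to simulate it are arbitrary assumptions. Note that the Pearson correlation between a 0/1 variable and a numeric variable is the point-biserial correlation.

```r
# Simulated stand-in data (not the study's data)
set.seed(42)
n <- 300
toy <- data.frame(
  is_urban     = rbinom(n, 1, 0.6),
  num_children = rpois(n, 1),
  num_rooms    = sample(0:5, n, replace = TRUE)
)
toy$amount_paid <- 250 * toy$is_urban + 90 * toy$num_children + rnorm(n, sd = 60)

# Correlation of every predictor with the dependent variable
cors <- cor(toy)[, "amount_paid"]
cors <- cors[names(cors) != "amount_paid"]

# Rank by absolute strength, strongest first
ranked <- cors[order(abs(cors), decreasing = TRUE)]
print(round(ranked, 2))
```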

# Drop the binary columns by position: 4:6 = is_ac, is_tv, is_flat; 9 = is_urban
bill1 <- data[, -c(4:6, 9)]
psych::pairs.panels(bill1)

Here we create a new dataset, bill1, that contains only the numerical variables, which makes it easier to inspect how the numerical predictors relate to the dependent variable. Repeating the same analysis on bill1 yields the same correlations as before, restricted to the numerical variables.
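
Dropping columns by position works here but breaks silently if the column order ever changes. A name-based selection is a possible alternative, sketched below on a one-row data frame built from the first observation shown in the str() output; toy, binary_cols, by_name, and by_position are illustrative names.

```r
# One-row example using the first observation printed by str(data)
toy <- data.frame(
  num_rooms = 3, num_people = 3, housearea = 742.57,
  is_ac = 1, is_tv = 1, is_flat = 1,
  ave_monthly_income = 9675.93, num_children = 2,
  is_urban = 0, amount_paid = 560.4814
)

binary_cols <- c("is_ac", "is_tv", "is_flat", "is_urban")

# Select by name versus by position
by_name     <- toy[, setdiff(names(toy), binary_cols)]
by_position <- toy[, -c(4:6, 9)]

# Both approaches keep the same six numerical columns
identical(names(by_name), names(by_position))
```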

Plotting the Data

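# Scatterplot of the bill against the number of children.
# with() evaluates the call inside the data frame, so attach()/detach() are not needed.
with(data, plot(num_children, amount_paid,
                main = "scatterplot of num_children and amount_paid"))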

The scatterplot of the number of children against the amount paid by the household shows that the electricity bill tends to increase with the number of children: the range of amounts paid shifts upward with each additional child.
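
Because num_children takes only a few integer values, the upward shift can also be checked with group means. The sketch below uses simulated data; the intercept, slope, and noise level are assumptions chosen for illustration and are not estimates from the study.

```r
# Simulated stand-in data with a discrete predictor
set.seed(7)
toy <- data.frame(num_children = sample(0:4, 200, replace = TRUE))
toy$amount_paid <- 400 + 90 * toy$num_children + rnorm(200, sd = 60)

# Mean bill for each number of children
group_means <- aggregate(amount_paid ~ num_children, data = toy, FUN = mean)
print(group_means)
```

The same formula, amount_paid ~ num_children, can be passed to boxplot() to show the spread within each group rather than only the mean.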

IV. Model Creation Using Backward Stepwise Regression

library(olsrr)
## 
## Attaching package: 'olsrr'
## The following object is masked from 'package:datasets':
## 
##     rivers
mod1 <- lm(amount_paid ~., data = data)
summary(mod1)
## 
## Call:
## lm(formula = amount_paid ~ ., data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -120.294  -53.617   -0.841   52.174  120.337 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         1.208e+02  1.463e+01   8.253 4.89e-16 ***
## num_rooms          -1.095e-01  1.939e+00  -0.056  0.95496    
## num_people          4.828e+00  9.948e-01   4.853 1.41e-06 ***
## housearea           3.919e-02  1.359e-02   2.883  0.00403 ** 
## is_ac               1.632e+02  4.129e+00  39.525  < 2e-16 ***
## is_tv               7.548e+01  4.992e+00  15.121  < 2e-16 ***
## is_flat             5.976e+01  3.995e+00  14.957  < 2e-16 ***
## ave_monthly_income  1.034e-03  2.065e-04   5.008 6.52e-07 ***
## num_children        9.037e+01  2.140e+00  42.228  < 2e-16 ***
## is_urban            2.500e+02  4.098e+00  61.013  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 63.04 on 990 degrees of freedom
## Multiple R-squared:  0.8803, Adjusted R-squared:  0.8792 
## F-statistic: 809.2 on 9 and 990 DF,  p-value: < 2.2e-16
jtools::summ(mod1)
## MODEL INFO:
## Observations: 1000
## Dependent Variable: amount_paid
## Type: OLS linear regression 
## 
## MODEL FIT:
## F(9,990) = 809.20, p = 0.00
## R² = 0.88
## Adj. R² = 0.88 
## 
## Standard errors: OLS
## ---------------------------------------------------------
##                              Est.    S.E.   t val.      p
## ------------------------ -------- ------- -------- ------
## (Intercept)                120.77   14.63     8.25   0.00
## num_rooms                   -0.11    1.94    -0.06   0.95
## num_people                   4.83    0.99     4.85   0.00
## housearea                    0.04    0.01     2.88   0.00
## is_ac                      163.21    4.13    39.53   0.00
## is_tv                       75.48    4.99    15.12   0.00
## is_flat                     59.76    4.00    14.96   0.00
## ave_monthly_income           0.00    0.00     5.01   0.00
## num_children                90.37    2.14    42.23   0.00
## is_urban                   250.01    4.10    61.01   0.00
## ---------------------------------------------------------
(bwdfit.p <- ols_step_backward_p(mod1, pent = .05, prem = .05, details = TRUE))
## Backward Elimination Method 
## ---------------------------
## 
## Candidate Terms: 
## 
## 1 . num_rooms 
## 2 . num_people 
## 3 . housearea 
## 4 . is_ac 
## 5 . is_tv 
## 6 . is_flat 
## 7 . ave_monthly_income 
## 8 . num_children 
## 9 . is_urban 
## 
## We are eliminating variables based on p value...
## 
## - num_rooms 
## 
## Backward Elimination: Step 1 
## 
##  Variable num_rooms Removed 
## 
##                          Model Summary                           
## ----------------------------------------------------------------
## R                       0.938       RMSE                 63.007 
## R-Squared               0.880       Coef. Var            10.494 
## Adj. R-Squared          0.879       MSE                3969.919 
## Pred R-Squared          0.878       MAE                  53.638 
## ----------------------------------------------------------------
##  RMSE: Root Mean Square Error 
##  MSE: Mean Square Error 
##  MAE: Mean Absolute Error 
## 
##                                   ANOVA                                    
## --------------------------------------------------------------------------
##                     Sum of                                                
##                    Squares         DF    Mean Square       F         Sig. 
## --------------------------------------------------------------------------
## Regression    28941114.476          8    3617639.309    911.263    0.0000 
## Residual       3934190.219        991       3969.919                      
## Total         32875304.695        999                                     
## --------------------------------------------------------------------------
## 
##                                        Parameter Estimates                                         
## --------------------------------------------------------------------------------------------------
##              model       Beta    Std. Error    Std. Beta      t        Sig       lower      upper 
## --------------------------------------------------------------------------------------------------
##        (Intercept)    120.523        13.975                  8.624    0.000     93.098    147.948 
##         num_people      4.828         0.994        0.053     4.856    0.000      2.877      6.779 
##          housearea      0.039         0.014        0.032     2.888    0.004      0.013      0.066 
##              is_ac    163.208         4.127        0.436    39.551    0.000    155.110    171.305 
##              is_tv     75.486         4.989        0.167    15.130    0.000     65.696     85.277 
##            is_flat     59.755         3.993        0.165    14.964    0.000     51.919     67.591 
## ave_monthly_income      0.001         0.000        0.055     5.010    0.000      0.001      0.001 
##       num_children     90.374         2.138        0.465    42.279    0.000     86.179     94.569 
##           is_urban    250.012         4.095        0.673    61.050    0.000    241.976    258.048 
## --------------------------------------------------------------------------------------------------
## 
## 
## 
## No more variables satisfy the condition of p value = 0.05
## 
## 
## Variables Removed: 
## 
## - num_rooms 
## 
## 
## Final Model Output 
## ------------------
## 
##                          Model Summary                           
## ----------------------------------------------------------------
## R                       0.938       RMSE                 63.007 
## R-Squared               0.880       Coef. Var            10.494 
## Adj. R-Squared          0.879       MSE                3969.919 
## Pred R-Squared          0.878       MAE                  53.638 
## ----------------------------------------------------------------
##  RMSE: Root Mean Square Error 
##  MSE: Mean Square Error 
##  MAE: Mean Absolute Error 
## 
##                                   ANOVA                                    
## --------------------------------------------------------------------------
##                     Sum of                                                
##                    Squares         DF    Mean Square       F         Sig. 
## --------------------------------------------------------------------------
## Regression    28941114.476          8    3617639.309    911.263    0.0000 
## Residual       3934190.219        991       3969.919                      
## Total         32875304.695        999                                     
## --------------------------------------------------------------------------
## 
##                                        Parameter Estimates                                         
## --------------------------------------------------------------------------------------------------
##              model       Beta    Std. Error    Std. Beta      t        Sig       lower      upper 
## --------------------------------------------------------------------------------------------------
##        (Intercept)    120.523        13.975                  8.624    0.000     93.098    147.948 
##         num_people      4.828         0.994        0.053     4.856    0.000      2.877      6.779 
##          housearea      0.039         0.014        0.032     2.888    0.004      0.013      0.066 
##              is_ac    163.208         4.127        0.436    39.551    0.000    155.110    171.305 
##              is_tv     75.486         4.989        0.167    15.130    0.000     65.696     85.277 
##            is_flat     59.755         3.993        0.165    14.964    0.000     51.919     67.591 
## ave_monthly_income      0.001         0.000        0.055     5.010    0.000      0.001      0.001 
##       num_children     90.374         2.138        0.465    42.279    0.000     86.179     94.569 
##           is_urban    250.012         4.095        0.673    61.050    0.000    241.976    258.048 
## --------------------------------------------------------------------------------------------------
## 
## 
##                             Elimination Summary                              
## ----------------------------------------------------------------------------
##         Variable                   Adj.                                         
## Step     Removed     R-Square    R-Square     C(p)        AIC         RMSE      
## ----------------------------------------------------------------------------
##    1    num_rooms      0.8803      0.8794    8.0032    11135.3374    63.0073    
## ----------------------------------------------------------------------------
modfinal <-  lm(amount_paid ~ num_people + housearea + is_tv + is_ac + is_flat + ave_monthly_income + num_children + is_urban, data = data)
summary(modfinal)
## 
## Call:
## lm(formula = amount_paid ~ num_people + housearea + is_tv + is_ac + 
##     is_flat + ave_monthly_income + num_children + is_urban, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -120.18  -53.58   -0.87   52.09  120.33 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.205e+02  1.398e+01   8.624  < 2e-16 ***
## num_people         4.828e+00  9.943e-01   4.856 1.39e-06 ***
## housearea          3.921e-02  1.358e-02   2.888  0.00397 ** 
## is_tv              7.549e+01  4.989e+00  15.130  < 2e-16 ***
## is_ac              1.632e+02  4.127e+00  39.551  < 2e-16 ***
## is_flat            5.975e+01  3.993e+00  14.964  < 2e-16 ***
## ave_monthly_income 1.034e-03  2.064e-04   5.010 6.43e-07 ***
## num_children       9.037e+01  2.138e+00  42.279  < 2e-16 ***
## is_urban           2.500e+02  4.095e+00  61.050  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 63.01 on 991 degrees of freedom
## Multiple R-squared:  0.8803, Adjusted R-squared:  0.8794 
## F-statistic: 911.3 on 8 and 991 DF,  p-value: < 2.2e-16
jtools::summ(modfinal)
## MODEL INFO:
## Observations: 1000
## Dependent Variable: amount_paid
## Type: OLS linear regression 
## 
## MODEL FIT:
## F(8,991) = 911.26, p = 0.00
## R² = 0.88
## Adj. R² = 0.88 
## 
## Standard errors: OLS
## ---------------------------------------------------------
##                              Est.    S.E.   t val.      p
## ------------------------ -------- ------- -------- ------
## (Intercept)                120.52   13.98     8.62   0.00
## num_people                   4.83    0.99     4.86   0.00
## housearea                    0.04    0.01     2.89   0.00
## is_tv                       75.49    4.99    15.13   0.00
## is_ac                      163.21    4.13    39.55   0.00
## is_flat                     59.75    3.99    14.96   0.00
## ave_monthly_income           0.00    0.00     5.01   0.00
## num_children                90.37    2.14    42.28   0.00
## is_urban                   250.01    4.10    61.05   0.00
## ---------------------------------------------------------
confint(modfinal)
##                           2.5 %       97.5 %
## (Intercept)        9.309783e+01 1.479476e+02
## num_people         2.876889e+00 6.779198e+00
## housearea          1.256295e-02 6.585822e-02
## is_tv              6.569557e+01 8.527666e+01
## is_ac              1.551098e+02 1.713054e+02
## is_flat            5.191862e+01 6.759105e+01
## ave_monthly_income 6.292315e-04 1.439435e-03
## num_children       8.617924e+01 9.456867e+01
## is_urban           2.419757e+02 2.580482e+02
library(performance)
citation("performance")
## To cite package 'performance' in publications use:
## 
##   Lüdecke et al., (2021). performance: An R Package for Assessment,
##   Comparison and Testing of Statistical Models. Journal of Open Source
##   Software, 6(60), 3139. https://doi.org/10.21105/joss.03139
## 
## A BibTeX entry for LaTeX users is
## 
##   @Article{,
##     title = {{performance}: An {R} Package for Assessment, Comparison and Testing of Statistical Models},
##     author = {Daniel Lüdecke and Mattan S. Ben-Shachar and Indrajeet Patil and Philip Waggoner and Dominique Makowski},
##     year = {2021},
##     journal = {Journal of Open Source Software},
##     volume = {6},
##     number = {60},
##     pages = {3139},
##     doi = {10.21105/joss.03139},
##   }
compare_performance(mod1, modfinal, rank = 1)
anova(mod1)
anova(modfinal)

V. Discussion of Results

Discussion of the Final Model

The objective of this study is to identify the factors that influence household energy bills. We fitted a linear regression model using ordinary least squares (OLS) to predict the dependent variable amount_paid. The predictors in the model are num_people, housearea, is_ac, is_tv, is_flat, ave_monthly_income, num_children, and is_urban. The final formula for this model is amount_paid ~ num_people + housearea + is_ac + is_tv + is_flat + ave_monthly_income + num_children + is_urban. The model explains a statistically significant and large proportion of the variance in amount_paid (R² = 0.880, F(8, 991) = 911.26, p < .001, adj. R² = 0.879). The variable num_rooms was removed because it had the highest p-value in the full model (p = 0.955), well above the 0.05 removal threshold.
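The elimination rule can be sketched by hand in base R. This is a minimal illustration of p-value-based backward elimination, not the olsrr implementation; the function name `backward_p`, the simulated `toy` data, and the variable names x1 to x3 are hypothetical, and the sketch assumes all predictors are numeric so that coefficient names match column names.

```r
# Hypothetical stand-in data: x3 has no true effect on y.
set.seed(42)
n <- 300
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$y <- 1 + 2 * toy$x1 - 1.5 * toy$x2 + rnorm(n)

# Backward elimination on p-values: refit after dropping the least
# significant term until every remaining p-value is at or below `prem`.
backward_p <- function(data, response, prem = 0.05) {
  terms_left <- setdiff(names(data), response)
  repeat {
    fit <- lm(reformulate(terms_left, response), data = data)
    pvals <- coef(summary(fit))[, "Pr(>|t|)"]
    pvals <- pvals[names(pvals) != "(Intercept)"]
    worst <- names(which.max(pvals))
    # Stop when the weakest term is significant or only one term is left.
    if (pvals[[worst]] <= prem || length(terms_left) == 1) break
    terms_left <- setdiff(terms_left, worst)
  }
  fit
}

final_fit <- backward_p(toy, "y")
print(formula(final_fit))
```

A predictor with no true effect will usually, though not always, be dropped under this rule, since its p-value falls below 0.05 by chance about 5% of the time.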

The model’s intercept, corresponding to num_people = 0, housearea = 0, ave_monthly_income = 0, num_children = 0, is_ac = 0, is_tv = 0, is_flat = 0, and is_urban = 0, is 120.52 (95% CI [93.10, 147.95], t(991) = 8.62, p < .001). Within this model:

  • the effect of num_people is statistically significant and positive (beta = 4.83, 95% CI [2.88, 6.78], t(991) = 4.86, p < .001)
  • the effect of housearea is statistically significant and positive (beta = 0.04, 95% CI [0.01, 0.07], t(991) = 2.89, p = .004)
  • the effect of is_tv is statistically significant and positive (beta = 75.49, 95% CI [65.70, 85.28], t(991) = 15.13, p < .001)
  • the effect of is_ac is statistically significant and positive (beta = 163.21, 95% CI [155.11, 171.31], t(991) = 39.55, p < .001)
  • the effect of is_flat is statistically significant and positive (beta = 59.75, 95% CI [51.92, 67.59], t(991) = 14.96, p < .001)
  • the effect of ave_monthly_income is statistically significant and positive (beta = 0.001, 95% CI [0.0006, 0.0014], t(991) = 5.01, p < .001)
  • the effect of num_children is statistically significant and positive (beta = 90.37, 95% CI [86.18, 94.57], t(991) = 42.28, p < .001)
  • the effect of is_urban is statistically significant and positive (beta = 250.01, 95% CI [241.98, 258.05], t(991) = 61.05, p < .001)
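Each confidence interval follows from the estimate, its standard error, and the residual degrees of freedom. As a worked check, the sketch below recomputes the num_people interval from the values printed in the modfinal summary; it should agree with the confint(modfinal) output up to rounding.

```r
# Values copied from the modfinal summary for num_people.
estimate  <- 4.828
std_error <- 0.9943
df_resid  <- 991   # residual degrees of freedom of modfinal

# 95% confidence interval: estimate +/- t critical value * standard error.
t_crit <- qt(0.975, df = df_resid)
ci <- estimate + c(-1, 1) * t_crit * std_error
print(round(ci, 2))
```

The same arithmetic applies to every coefficient, which also shows why the degrees of freedom in each t-statistic are 991 (1000 observations minus 9 estimated parameters).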

Each predictor variable in the final model is statistically significant, indicating that all the independent variables are significantly associated with the total amount paid for energy bills. For example, for every additional person in the household, the predicted energy bill increases by $4.83, ceteris paribus. For every one-unit increase in house area, the predicted bill increases by $0.04, ceteris paribus. Households with a television are predicted to pay $75.49 more than those without one, ceteris paribus. Similarly, households with an air conditioner are predicted to pay $163.21 more, and households living in a flat $59.75 more, ceteris paribus. For every one-unit increase in average monthly income, the predicted bill increases by about $0.001, ceteris paribus. This coefficient rounds to $0 at two decimal places, but it is statistically significant and is small only because income is measured in single currency units: an income that is 10,000 units higher corresponds to a bill roughly $10.34 higher. For each additional child, the predicted bill increases by $90.37, ceteris paribus. Lastly, households in urban areas are predicted to pay $250.01 more than non-urban households, ceteris paribus.
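To make the interpretation concrete, the fitted equation can be evaluated for a single household. The sketch below uses coefficients rounded from the modfinal summary; the household profile is hypothetical and chosen only for illustration.

```r
# Coefficients rounded from the modfinal summary.
b <- c(intercept = 120.523, num_people = 4.828, housearea = 0.03921,
       is_tv = 75.486, is_ac = 163.208, is_flat = 59.755,
       ave_monthly_income = 0.001034, num_children = 90.374,
       is_urban = 250.012)

# Hypothetical household (values chosen only for illustration).
household <- c(intercept = 1, num_people = 4, housearea = 800,
               is_tv = 1, is_ac = 1, is_flat = 0,
               ave_monthly_income = 20000, num_children = 2,
               is_urban = 1)

# Predicted bill: sum of coefficient * value over all terms.
predicted_bill <- sum(b * household[names(b)])

# Effect of a 10,000-unit difference in income, other things equal.
income_effect_per_10000 <- b[["ave_monthly_income"]] * 10000

print(round(predicted_bill, 2))
print(round(income_effect_per_10000, 2))
```

With the fitted model object available, predict(modfinal, newdata = ...) would give the same kind of result without copying coefficients by hand.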

Comparison of the Models

The compare_performance table reports AIC, BIC, R², adjusted R², RMSE, and Sigma for both models. The AIC and BIC values are lower for the final model, so modfinal is preferred over mod1. Because the two models have nearly identical R² and adjusted R² values (about 0.88), they explain a comparable amount of variance in the dependent variable, and the preference for modfinal reflects its greater parsimony rather than a better fit.
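Because one model is nested in the other, the comparison can also be made with information criteria and a partial F-test in base R. The following is a sketch on simulated data; `toy`, `full`, `reduced`, and the variable named `noise` are hypothetical stand-ins for the study's data, mod1, modfinal, and num_rooms.

```r
# Hypothetical stand-in data: `noise` has no true effect on y.
set.seed(7)
n <- 300
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
toy$y <- 3 + 2 * toy$x1 + toy$x2 + rnorm(n)

full    <- lm(y ~ x1 + x2 + noise, data = toy)
reduced <- lm(y ~ x1 + x2, data = toy)

# Information criteria: lower is better. A common rule of thumb treats
# differences of less than about 2 as negligible.
ic <- data.frame(
  model = c("full", "reduced"),
  AIC   = c(AIC(full), AIC(reduced)),
  BIC   = c(BIC(full), BIC(reduced))
)
print(ic)

# Partial F-test: does the extra term in `full` improve the fit?
partial_f <- anova(reduced, full)
print(partial_f)
```

A large p-value in the partial F-test supports keeping the smaller model.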

The ANOVA tables show that the independent variables in the final model are statistically significant predictors of the dependent variable (amount_paid). The low p-values indicate that F-statistics this large would be very unlikely if the independent variables had no relationship with amount_paid. Likewise, the high F-statistic values indicate a significant relationship between the independent variables and the dependent variable.

VI. Model Diagnostics

plot(modfinal)

plot(modfinal, which = 1)

car::crPlots(modfinal)

The residuals-versus-fitted plot and the component-plus-residual plots suggest a linear relationship between the dependent variable and the independent variables. The points are scattered randomly around the reference line, and the spread of the residuals stays roughly constant across the fitted values, which is consistent with the assumption of homoscedasticity.

H0: The residual variance is constant.

lmtest::bptest(modfinal)
## 
##  studentized Breusch-Pagan test
## 
## data:  modfinal
## BP = 3.4156, df = 8, p-value = 0.9056
plot(modfinal, 2)

We fail to reject the null hypothesis because the p-value (0.9056) exceeds the significance level (alpha) of 0.05. Rejecting the null hypothesis would have indicated non-constant residual variance, or heteroscedasticity, in the final model. Since there is no evidence against homoscedasticity, the standard errors of the regression coefficients can be treated as reliable. The Normal Q-Q plot suggests that the residuals are spread fairly evenly across their range, resembling a uniform distribution, so extreme outliers are unlikely.

performance::check_heteroscedasticity(modfinal)
## OK: Error variance appears to be homoscedastic (p = 0.956).

The result shows that the final model appears to be homoscedastic, with a p-value of 0.956. Since the p-value is greater than the significance level of 0.05, there is no significant evidence of heteroscedasticity in the final model. The assumption of homoscedasticity is therefore met, which means that the standard errors of the regression model are reliable.
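The studentized Breusch-Pagan statistic can be computed by hand as n times the R² of an auxiliary regression of the squared residuals on the predictors. The sketch below does this on simulated data with constant error variance; `toy`, `fit`, and `aux` are hypothetical names, and the data are not the study's.

```r
# Hypothetical stand-in data with constant error variance.
set.seed(3)
n <- 500
toy <- data.frame(x1 = rnorm(n), x2 = runif(n))
toy$y <- 1 + 2 * toy$x1 + 3 * toy$x2 + rnorm(n)

fit <- lm(y ~ x1 + x2, data = toy)

# Auxiliary regression of squared residuals on the same predictors.
toy$sq_resid <- resid(fit)^2
aux <- lm(sq_resid ~ x1 + x2, data = toy)

# Studentized (Koenker) Breusch-Pagan statistic: n * R^2, compared
# against a chi-squared distribution with one df per predictor.
bp_stat   <- n * summary(aux)$r.squared
bp_df     <- length(coef(aux)) - 1
bp_pvalue <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

print(c(BP = bp_stat, df = bp_df, p_value = bp_pvalue))
```

If the predictors explain little of the variation in the squared residuals, the statistic is small and the p-value large, as in the bptest output for modfinal.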

ols_plot_resid_hist(modfinal) 

ols_test_correlation(modfinal)
## [1] 0.983584

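The value of 0.983584 is the correlation between the observed residuals and the residuals expected under normality, not a correlation between the independent variables and the dependent variable. A value this close to 1 suggests that the residuals do not depart strongly from a normal distribution.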
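A statistic of this kind can be sketched as the correlation between the sorted residuals and the normal quantiles expected at the same ranks. The plotting positions used below, (k - 0.375) / (n + 0.25), are an assumption about how such a test is typically implemented rather than something stated in this report, and `normality_cor` is a hypothetical helper name.

```r
# Correlation between ordered values and expected normal order statistics.
# Plotting positions (k - 0.375) / (n + 0.25) are an assumed convention.
normality_cor <- function(res) {
  k <- seq_along(res)
  expected <- qnorm((k - 0.375) / (length(res) + 0.25))
  cor(sort(res), expected)
}

# Hypothetical residuals: one normal sample and one uniform sample.
set.seed(11)
n <- 1000
normal_resid  <- rnorm(n)
uniform_resid <- runif(n, -1, 1)

print(normality_cor(normal_resid))
print(normality_cor(uniform_resid))
```

Normally distributed residuals give a value very close to 1, while flatter distributions such as the uniform give a somewhat lower value, so the statistic is best read alongside the Q-Q plot and the residual histogram.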

car::vif(modfinal)
##         num_people          housearea              is_tv              is_ac 
##           1.002674           1.013272           1.010714           1.006384 
##            is_flat ave_monthly_income       num_children           is_urban 
##           1.002059           1.004492           1.003553           1.006827

All the independent variables have VIF values of approximately 1 (between 1.00 and 1.01), indicating that each predictor is at most weakly correlated with the others. The model therefore shows no sign of multicollinearity.
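The VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the others. The sketch below computes it by hand on simulated data in which one predictor is deliberately collinear, to show what a problematic value looks like; `toy` and its variables are hypothetical.

```r
# Hypothetical predictors: x3 is deliberately collinear with x1.
set.seed(5)
n <- 400
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$x3 <- 0.9 * toy$x1 + rnorm(n, sd = 0.3)

# VIF for each predictor: regress it on the others, then 1 / (1 - R^2).
vif_by_hand <- sapply(names(toy), function(v) {
  others <- setdiff(names(toy), v)
  r2 <- summary(lm(reformulate(others, v), data = toy))$r.squared
  1 / (1 - r2)
})

print(round(vif_by_hand, 2))
```

Here x2 stays near 1 while x1 and x3 are strongly inflated, in contrast to modfinal, where every value is close to 1.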

VII. Conclusion and Recommendations

Conclusion

In summary, our case study investigates the factors influencing monthly household electricity bills. The dataset covers household attributes such as the number of rooms, number of people, presence of amenities like air conditioning and television, average monthly income, number of children, urban location status, and the amount paid for monthly bills. Using these data, a backward stepwise regression model was employed to identify the significant predictors of household monthly bills. The results show which factors are most strongly associated with household monthly bills, notably urban location and the number of children in the household. By systematically removing non-significant predictors, the procedure identified the subset of variables that best predicts the variation in monthly bills. This helps in understanding which factors have the greatest impact on household expenditures.

Recommendations

We recommend conducting further research to explore additional factors that may influence household monthly bills, such as regional differences, lifestyle choices, or seasonal variations. This could provide a more comprehensive understanding of the determinants of household expenditures and inform future policy interventions. Additionally, policymakers can use the insights gained from the regression model to design targeted policies aimed at reducing household expenses. For instance, subsidies or incentives can be provided for energy-efficient upgrades, or for public transportation to reduce transportation costs for urban households. Such measures can encourage households to engage in effective financial planning based on their average monthly income and other socioeconomic factors. Providing resources or workshops on budgeting and financial management can help households better allocate their resources and prioritize spending. Overall, the study demonstrated the utility of a backward stepwise regression model in identifying the key determinants of household monthly bills. By pinpointing the most influential factors, policymakers and households can better understand and manage monthly expenses, leading to improved financial planning and decision-making.

Bibliography

Bartiaux, F., & Gram-Hanssen, K. (2005). Socio-political factors influencing household consumption of electricity: a comparison between Denmark and Belgium. Print: Eceee summer study proceedings energy savings what works & who deliver.

Bartusch, C., Odlare, M., Wallin, F., & Wester, L. (2012). Exploring variance in residential electricity consumption: Household features and building properties. Applied Energy, 92, 637–643. https://doi.org/10.1016/j.apenergy.2011.04.034

Bedir, M., Hasselaar, E., & Itard, L. (2013). Determinants of electricity consumption in Dutch dwellings. Energy and Buildings, 58, 194–207. https://doi.org/10.1016/j.enbuild.2012.10.016

Brounen, D., Kok, N., & Quigley, J. M. (2012). Residential energy use and conservation: Economics and demographics. European Economic Review, 56(5), 931-945. https://doi.org/10.1016/j.euroecorev.2012.02.007

Confalonieri, U., Menne, B., Akhtar, R., Ebi, K. L., Hauengue, M., Kovats, R. S., … & Woodward, A. (2007). Human health. Climate change 2007: impacts, adaptation and vulnerability: contribution of Working Group II to the fourth assessment report of the Intergovernmental Panel on Climate Change.

Cramer, J. C., Miller, N., Craig, P., Hackett, B. M., Dietz, T. M., Vine, E. L., … & Kowalczyk, D. J. (1985). Social and engineering determinants and their equity implications in residential electricity use. Energy, 10(12), 1283-1291. https://doi.org/10.1016/0360-5442(85)90139-2

Dixon, G. N., Deline, M. B., McComas, K., Chambliss, L., & Hoffmann, M. (2015). Saving energy at the workplace: The salience of behavioral antecedents and sense of community. Energy Research & Social Science, 6, 121-127. https://doi.org/10.1016/j.erss.2015.01.004

Fu, K. S., Allen, M. R., & Archibald, R. K. (2015). Evaluating the relationship between the population trends, prices, heat waves, and the demands of energy consumption in cities. Sustainability, 7(11), 15284-15301. https://doi.org/10.3390/su71115284

Guo, Z., Zhou, K., Zhang, C., Lu, X., Chen, W., & Yang, S. (2018). Residential electricity consumption behavior: Influencing factors, related theories and intervention strategies. Renewable and Sustainable Energy Reviews, 81, 399-412. https://doi.org/10.1016/j.rser.2017.07.046

Kavousian, A., Rajagopal, R., & Fischer, M. (2013). Determinants of residential electricity consumption: Using smart meter data to examine the effect of climate, building characteristics, appliance stock, and occupants’ behavior. Energy, 55, 184-194. https://doi.org/10.1016/j.energy.2013.03.086

Leahy, E., & Lyons, S. (2010). Energy use and appliance ownership in Ireland. Energy Policy, 38(8), 4265-4279. https://doi.org/10.1016/j.enpol.2010.03.056

Littleford, C., Ryley, T. J., & Firth, S. K. (2014). Context, control and the spillover of energy use behaviours between office and home settings. Journal of Environmental Psychology, 40, 157-166. https://doi.org/10.1016/j.jenvp.2014.06.002

McLoughlin, F., Duffy, A., & Conlon, M. (2012). Characterising domestic electricity consumption patterns by dwelling and occupant socio-economic variables: An Irish case study. Energy and buildings, 48, 240-248. https://doi.org/10.1016/j.enbuild.2012.01.037

Nakamura, H. (2013). Effects of social participation and the emergence of voluntary social interactions on household power-saving practices in post-disaster Kanagawa, Japan. Energy policy, 54, 397-403. https://doi.org/10.1016/j.enpol.2012.11.041

Nilsson, A., Stoll, P., & Brandt, N. (2017). Assessing the impact of real-time price visualization on residential electricity consumption, costs, and carbon emissions. Resources, Conservation and Recycling, 124, 152-161. https://doi.org/10.1016/j.resconrec.2015.10.007

Sundram, G. (2020). Household monthly electricity bill. Retrieved from https://www.kaggle.com/datasets/gireeshs/household-monthly-electricity-bill/data

Özkan, H. A. (2016). Appliance based control for home power management systems. Energy, 114, 693-707. https://doi.org/10.1016/j.energy.2016.08.016

Yohanis, Y. G., Mondol, J. D., Wright, A., & Norton, B. (2008). Real-life energy use in the UK: How occupancy and dwelling characteristics affect domestic electricity use. Energy and buildings, 40(6), 1053-1059. https://doi.org/10.1016/j.enbuild.2007.09.001

Zhang, C., Zhang, M., & Zhang, N. (2017). CO2 Emissions from the Power Industry in the China’s Beijing-Tianjin-Hebei Region: Decomposition and Policy Analysis. Polish Journal of Environmental Studies, 26(2), 903-916. https://doi.org/10.15244/pjoes/66718

Zhang, C., Zhou, K., Yang, S., & Shao, Z. (2017). On electricity consumption and economic growth in China. Renewable and Sustainable Energy Reviews, 76, 353-368. https://doi.org/10.1016/j.rser.2017.03.071