HOUSEHOLD ENERGY OPTIMIZATION is the process of managing and controlling how energy is used in the home in order to reduce waste, improve efficiency, and lower electricity costs.

This involves analyzing energy-consuming appliances to identify excessive or unnecessary usage (finding out where and when electricity is used more than necessary in the household) and using energy-saving devices (appliances or equipment designed to use less electricity while still performing the same function as regular devices). Together these measures also help prevent energy shortages.

QUESTIONS TO SOLVE UNDER THIS TOPIC:

1. How can wasteful or unnecessary energy consumption patterns within the household be detected?

2. How does energy usage vary across different time periods (hourly, weekly, or monthly)?

3. How do environmental factors such as temperature and humidity influence energy consumption levels?

4. Are there differences in energy consumption patterns between weekdays and weekends?

5. What recommendations can be made to help households reduce energy waste and improve energy efficiency?

6. Which households have the highest energy usage?

THE AIM OF ENERGY OPTIMIZATION:

The major aim of this project is to analyze and optimize energy usage within the household in order to minimize energy consumption, reduce electricity costs, and improve overall energy efficiency.

SOME OF THE MAIN OBJECTIVES ARE:

  1. To detect wasteful or unnecessary energy consumption patterns within the household.

  2. To examine energy usage patterns across different time periods (hourly, weekly, or monthly).

  3. To analyze environmental factors (such as temperature or humidity) that influence energy consumption levels.

  4. To analyze differences in energy consumption patterns between weekdays and weekends.

  5. To provide recommendations for households on reducing energy waste and improving efficiency.

THE EXPLANATION OF THIS DATASET: I obtained the energy optimization dataset from Kaggle. It contains time-stamped measurements of household environmental conditions, appliance energy usage, and weather-related variables (temperature, humidity, and wind speed). It is designed to support analysis of consumption patterns, environmental impacts, and opportunities for reducing energy waste. The dataset has thousands of observations (15,788 rows), and the timestamps fall on 10-minute marks rather than whole hours.

The key variables I will be working with in this dataset: ID, DATE/TIME, APPLIANCES, LIGHTS, T1-T9, RH1-RH9, T_OUT, RH_OUT

THESE ARE THE LIBRARIES NECESSARY FOR MY DATA ANALYSIS

library(tidyverse) # data manipulation (also attaches dplyr and ggplot2)
library(lubridate) # working with dates and times
library(ggplot2)   # visualization (plots)
library(reshape2)  # reshaping and aggregation
library(plotly)    # interactive visualization (plots)

LOADING THE DATASET (ENERGY OPTIMIZATION)

# Loading the dataset
# read.csv() imports data from a CSV file into a data frame
Energy_optm <- read.csv("C:/Users/user/Downloads/train (1).csv") # load the data
# head() shows the first few rows of the data set
head(Energy_optm) # the first six rows of the dataset
##      ID                date lights       t1  rh_1       t2     rh_2       t3
## 1  2133 2016-01-26 12:30:00      0 19.89000 45.50 19.20000 45.09000 20.39000
## 2 19730 2016-05-27 17:20:00      0 25.56667 46.56 25.89000 42.02571 27.20000
## 3  3288 2016-02-03 13:00:00      0 22.50000 44.43 21.53333 42.59000 21.96333
## 4  7730 2016-03-05 09:20:00      0 19.79000 38.06 17.20000 40.93333 20.60000
## 5  8852 2016-03-13 04:20:00      0 20.60000 35.29 17.10000 39.79000 20.29000
## 6   425 2016-01-14 15:50:00      0 21.53333 40.00 21.07500 38.99750 21.32333
##       rh_3    t4     rh_4       t5     rh_5         t6     rh_6    t7  rh_7
## 1 44.29000 19.10 46.70000 17.51111 53.00000 11.1000000 98.43333 17.50 43.50
## 2 41.16333 24.70 45.59000 23.20000 52.40000 24.7966667  1.00000 24.50 44.50
## 3 44.55500 22.00 40.46667 19.10000 55.32667  6.5300000 61.46333 19.29 34.32
## 4 37.16333 18.39 37.00000 18.29000 42.26000  2.7900000 79.93333 18.10 32.00
## 5 37.00000 19.50 34.50000 18.20000 49.00000 -0.6666667 68.53000 20.70 33.59
## 6 41.43333 18.76 42.36333 17.10000 53.50000  5.3666667 85.53000 17.39 37.90
##         t8     rh_8       t9     rh_9      t_out press_mm_hg   rh_out windspeed
## 1 18.11111 50.00000 17.16667 48.70000 10.3000000    761.9000 85.50000  7.500000
## 2 24.70000 50.07400 23.20000 46.79000 22.7333333    755.2000 55.66667  3.333333
## 3 20.56667 41.33111 18.60000 45.53000  6.6000000    760.2000 64.00000  8.000000
## 4 20.50000 42.59000 18.39000 40.72333  2.1000000    741.5333 94.33333  1.000000
## 5 22.70000 39.26000 18.92667 40.09000 -0.8666667    768.2667 92.33333  1.666667
## 6 17.89000 44.70000 17.10000 44.96667  5.4166667    747.5667 79.83333  6.000000
##   visibility tdewpoint      rv1      rv2 appliances
## 1   23.50000  7.950000 39.24086 39.24086   3.912023
## 2   23.66667 13.333333 43.09681 43.09681   4.605170
## 3   40.00000  0.200000 42.05466 42.05466   4.248495
## 4   48.66667  1.233333 12.61586 12.61586   3.688879
## 5   34.00000 -1.933333 10.89793 10.89793   3.688879
## 6   40.00000  2.166667 20.28818 20.28818   3.688879
# View() opens the dataset in a spreadsheet-style viewer for a clearer look
View(Energy_optm) # open the data viewer
dim(Energy_optm) # the number of rows and columns of the dataset
## [1] 15788    30

Exploring the dataset to understand the behavior of energy consumption across appliances and time periods

# convert the date column to a proper date-time format
# (tz = "UTC" is set explicitly so that no timestamp is lost to a local daylight-saving gap)
Energy_optm$date <- as.POSIXct(Energy_optm$date, format="%Y-%m-%d %H:%M:%S", tz = "UTC")
# extract useful time features (format() returns character values)
Energy_optm$hour <- format(Energy_optm$date, "%H") # hour of the day, e.g. "12"
Energy_optm$day  <- format(Energy_optm$date, "%A") # weekday name, e.g. "Tuesday"
Energy_optm$month <- format(Energy_optm$date, "%Y-%m") # year and month, e.g. "2016-01"
Energy_optm <- Energy_optm %>%
  mutate(
    # with lubridate's default week start, wday() returns 1 for Sunday and 7 for Saturday
    day_type = ifelse(wday(date) %in% c(1, 7), "Weekend", "Weekday")
  )
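The weekday/weekend split above relies on `wday()` numbering Sunday as 1 and Saturday as 7, which is lubridate's default but can be changed through its week-start setting. As a cross-check, here is a minimal base-R sketch that uses the ISO weekday number from `format(x, "%u")` (Monday = 1, Sunday = 7) instead. The three timestamps are taken from the first rows printed earlier; the variable names `dates`, `iso_day`, and `day_type` are illustrative only.

```r
# Three timestamps from the head() output, parsed in UTC
dates <- as.POSIXct(
  c("2016-01-26 12:30:00", "2016-03-05 09:20:00", "2016-03-13 04:20:00"),
  format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
)

# "%u" gives the ISO weekday number: Monday = 1, ..., Sunday = 7
iso_day <- as.integer(format(dates, "%u"))

# Saturday (6) and Sunday (7) are treated as the weekend
day_type <- ifelse(iso_day >= 6, "Weekend", "Weekday")

print(data.frame(dates, iso_day, day_type))
```

If both approaches give the same labels for the same rows, the `day_type` column can be trusted regardless of locale or week-start settings.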

Checking the structure and the summary of the dataset

# str() shows the structure of the data
str(Energy_optm) # this gives a skeleton view of the data: column names, types, and first values
## 'data.frame':    15788 obs. of  34 variables:
##  $ ID         : int  2133 19730 3288 7730 8852 425 10277 15028 16841 1890 ...
##  $ date       : POSIXct, format: "2016-01-26 12:30:00" "2016-05-27 17:20:00" ...
##  $ lights     : int  0 0 0 0 0 0 0 0 0 30 ...
##  $ t1         : num  19.9 25.6 22.5 19.8 20.6 ...
##  $ rh_1       : num  45.5 46.6 44.4 38.1 35.3 ...
##  $ t2         : num  19.2 25.9 21.5 17.2 17.1 ...
##  $ rh_2       : num  45.1 42 42.6 40.9 39.8 ...
##  $ t3         : num  20.4 27.2 22 20.6 20.3 ...
##  $ rh_3       : num  44.3 41.2 44.6 37.2 37 ...
##  $ t4         : num  19.1 24.7 22 18.4 19.5 ...
##  $ rh_4       : num  46.7 45.6 40.5 37 34.5 ...
##  $ t5         : num  17.5 23.2 19.1 18.3 18.2 ...
##  $ rh_5       : num  53 52.4 55.3 42.3 49 ...
##  $ t6         : num  11.1 24.797 6.53 2.79 -0.667 ...
##  $ rh_6       : num  98.4 1 61.5 79.9 68.5 ...
##  $ t7         : num  17.5 24.5 19.3 18.1 20.7 ...
##  $ rh_7       : num  43.5 44.5 34.3 32 33.6 ...
##  $ t8         : num  18.1 24.7 20.6 20.5 22.7 ...
##  $ rh_8       : num  50 50.1 41.3 42.6 39.3 ...
##  $ t9         : num  17.2 23.2 18.6 18.4 18.9 ...
##  $ rh_9       : num  48.7 46.8 45.5 40.7 40.1 ...
##  $ t_out      : num  10.3 22.733 6.6 2.1 -0.867 ...
##  $ press_mm_hg: num  762 755 760 742 768 ...
##  $ rh_out     : num  85.5 55.7 64 94.3 92.3 ...
##  $ windspeed  : num  7.5 3.33 8 1 1.67 ...
##  $ visibility : num  23.5 23.7 40 48.7 34 ...
##  $ tdewpoint  : num  7.95 13.33 0.2 1.23 -1.93 ...
##  $ rv1        : num  39.2 43.1 42.1 12.6 10.9 ...
##  $ rv2        : num  39.2 43.1 42.1 12.6 10.9 ...
##  $ appliances : num  3.91 4.61 4.25 3.69 3.69 ...
##  $ hour       : chr  "12" "17" "13" "09" ...
##  $ day        : chr  "Tuesday" "Friday" "Wednesday" "Saturday" ...
##  $ month      : chr  "2016-01" "2016-05" "2016-02" "2016-03" ...
##  $ day_type   : chr  "Weekday" "Weekday" "Weekday" "Weekend" ...
summary(Energy_optm) # gives the min, 1st quartile, median, mean, 3rd quartile, and max of each numeric variable, and the class and length of each character variable
##        ID             date                         lights             t1       
##  Min.   :    1   Min.   :2016-01-11 17:10:00   Min.   : 0.000   Min.   :16.79  
##  1st Qu.: 4923   1st Qu.:2016-02-14 21:27:30   1st Qu.: 0.000   1st Qu.:20.78  
##  Median : 9908   Median :2016-03-20 12:15:00   Median : 0.000   Median :21.60  
##  Mean   : 9873   Mean   :2016-03-20 06:26:24   Mean   : 3.809   Mean   :21.69  
##  3rd Qu.:14821   3rd Qu.:2016-04-23 15:12:30   3rd Qu.: 0.000   3rd Qu.:22.60  
##  Max.   :19734   Max.   :2016-05-27 18:00:00   Max.   :70.000   Max.   :26.26  
##       rh_1             t2             rh_2             t3       
##  Min.   :27.02   Min.   :16.10   Min.   :20.46   Min.   :17.20  
##  1st Qu.:37.40   1st Qu.:18.82   1st Qu.:37.90   1st Qu.:20.79  
##  Median :39.66   Median :20.00   Median :40.50   Median :22.10  
##  Mean   :40.27   Mean   :20.35   Mean   :40.43   Mean   :22.27  
##  3rd Qu.:43.06   3rd Qu.:21.50   3rd Qu.:43.29   3rd Qu.:23.29  
##  Max.   :57.42   Max.   :29.86   Max.   :54.77   Max.   :29.24  
##       rh_3             t4             rh_4             t5       
##  Min.   :28.77   Min.   :15.10   Min.   :27.66   Min.   :15.33  
##  1st Qu.:36.90   1st Qu.:19.53   1st Qu.:35.59   1st Qu.:18.29  
##  Median :38.56   Median :20.63   Median :38.46   Median :19.39  
##  Mean   :39.25   Mean   :20.85   Mean   :39.05   Mean   :19.60  
##  3rd Qu.:41.76   3rd Qu.:22.10   3rd Qu.:42.19   3rd Qu.:20.63  
##  Max.   :50.16   Max.   :26.20   Max.   :51.09   Max.   :25.80  
##       rh_5             t6              rh_6             t7       
##  Min.   :30.17   Min.   :-6.030   Min.   : 1.00   Min.   :15.39  
##  1st Qu.:45.43   1st Qu.: 3.595   1st Qu.:29.99   1st Qu.:18.70  
##  Median :49.08   Median : 7.300   Median :55.30   Median :20.06  
##  Mean   :50.95   Mean   : 7.914   Mean   :54.64   Mean   :20.27  
##  3rd Qu.:53.70   3rd Qu.:11.263   3rd Qu.:83.30   3rd Qu.:21.60  
##  Max.   :96.32   Max.   :28.290   Max.   :99.90   Max.   :25.96  
##       rh_7             t8             rh_8             t9       
##  Min.   :23.23   Min.   :16.31   Min.   :29.60   Min.   :14.89  
##  1st Qu.:31.50   1st Qu.:20.79   1st Qu.:39.09   1st Qu.:18.00  
##  Median :34.90   Median :22.10   Median :42.43   Median :19.39  
##  Mean   :35.41   Mean   :22.03   Mean   :42.96   Mean   :19.49  
##  3rd Qu.:39.02   3rd Qu.:23.39   3rd Qu.:46.56   3rd Qu.:20.60  
##  Max.   :51.33   Max.   :27.23   Max.   :58.78   Max.   :24.50  
##       rh_9           t_out         press_mm_hg        rh_out      
##  Min.   :29.17   Min.   :-5.000   Min.   :729.3   Min.   : 24.00  
##  1st Qu.:38.53   1st Qu.: 3.633   1st Qu.:750.9   1st Qu.: 70.33  
##  Median :40.93   Median : 6.933   Median :756.1   Median : 84.00  
##  Mean   :41.57   Mean   : 7.418   Mean   :755.5   Mean   : 79.82  
##  3rd Qu.:44.36   3rd Qu.:10.417   3rd Qu.:760.9   3rd Qu.: 91.67  
##  Max.   :53.33   Max.   :26.100   Max.   :772.3   Max.   :100.00  
##    windspeed        visibility      tdewpoint            rv1           
##  Min.   : 0.000   Min.   : 1.00   Min.   :-6.6000   Min.   : 0.006033  
##  1st Qu.: 2.000   1st Qu.:29.00   1st Qu.: 0.9333   1st Qu.:12.510037  
##  Median : 3.667   Median :40.00   Median : 3.4333   Median :24.912220  
##  Mean   : 4.031   Mean   :38.33   Mean   : 3.7814   Mean   :25.027694  
##  3rd Qu.: 5.500   3rd Qu.:40.00   3rd Qu.: 6.6000   3rd Qu.:37.665543  
##  Max.   :14.000   Max.   :66.00   Max.   :15.4000   Max.   :49.996530  
##       rv2              appliances        hour               day           
##  Min.   : 0.006033   Min.   :2.303   Length:15788       Length:15788      
##  1st Qu.:12.510037   1st Qu.:3.912   Class :character   Class :character  
##  Median :24.912220   Median :4.094   Mode  :character   Mode  :character  
##  Mean   :25.027694   Mean   :4.305                                        
##  3rd Qu.:37.665543   3rd Qu.:4.605                                        
##  Max.   :49.996530   Max.   :6.985                                        
##     month             day_type        
##  Length:15788       Length:15788      
##  Class :character   Class :character  
##  Mode  :character   Mode  :character  
##                                       
##                                       
## 

Check for missing values

# check whether the dataset contains any missing values
anyNA(Energy_optm)
## [1] FALSE
# anyNA() returns TRUE if at least one value is missing (NA); all(is.na(...)) would only be TRUE if every value were missing
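A single TRUE/FALSE answer does not say where any gaps are. The following base-R sketch counts missing values per column and the number of complete rows. It runs on a small made-up data frame (`toy`), not on the real dataset, so the numbers are purely illustrative.

```r
# Toy data frame standing in for Energy_optm (values are made up)
toy <- data.frame(
  lights     = c(0, 10, NA, 20),
  t1         = c(19.9, NA, 22.5, NA),
  appliances = c(3.91, 4.61, 4.25, 3.69)
)

# TRUE if at least one value anywhere in the data frame is missing
any_missing <- anyNA(toy)

# Number of missing values in each column
missing_per_column <- colSums(is.na(toy))

# Number of rows that contain no missing values at all
complete_rows <- sum(complete.cases(toy))

print(any_missing)
print(missing_per_column)
print(complete_rows)
```

On the real data, `colSums(is.na(Energy_optm))` would be expected to return zero for every column, given that `summary()` reports no NA counts.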

While cleaning the data I found that the rows are not in chronological order, so I sorted them by date. This gives cleaner time-based plots and makes the visualizations easier to interpret.

# Arrange the rows in ascending date order
sorted <- Energy_optm[ # store the sorted copy in `sorted`
  order(Energy_optm$date),
  ] # order() returns the row positions that sort the dates

One objective of energy optimization is to find where energy is wasted or consumed unnecessarily in the household. Before checking that, I calculated the total amount of energy consumed across all households.

# Total energy used by appliances across all households
total_appliances <- sum(Energy_optm$appliances, na.rm = TRUE)
total_appliances
## [1] 67971.71
# Total energy used by lights across all households
total_lights <- sum(Energy_optm$lights, na.rm = TRUE)
total_lights
## [1] 60130
# Total energy used (appliances plus lights)
total_energy <- total_appliances + total_lights
total_energy
## [1] 128101.7
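One caveat about these totals: in the summary output, `appliances` runs from 2.303 to 6.985, and ln(10) ≈ 2.303, which is what natural-log-transformed Wh readings would look like, while `lights` runs from 0 to 70 on its raw scale. The sketch below assumes (this is an assumption, not something confirmed here) that `appliances` is stored as ln(Wh), and shows how `exp()` would put both columns on one scale before adding them. The `toy` values are made up.

```r
# Made-up readings: appliances on an assumed natural-log scale, lights in raw units
toy <- data.frame(
  appliances = log(c(50, 60, 100, 40)),  # assumed to be ln(Wh)
  lights     = c(0, 10, 0, 30)           # assumed to be Wh
)

# Undo the assumed log transform so both columns share one unit
toy$appliances_wh <- exp(toy$appliances)

# Totals on the common scale
total_appliances_wh <- sum(toy$appliances_wh)
total_lights_wh     <- sum(toy$lights)
total_energy_wh     <- total_appliances_wh + total_lights_wh

print(total_energy_wh)
```

If the assumption holds, sums and averages computed this way are in Wh, whereas adding the log values directly to `lights` gives a number that is dominated by the lights column.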

Calculating the total energy used by each household

# new variable: the sum of lights and appliances
Energy_optm$total_energy_used <- Energy_optm$lights +
             Energy_optm$appliances
tibble(Energy_optm$ID,Energy_optm$total_energy_used)
## # A tibble: 15,788 × 2
##    `Energy_optm$ID` `Energy_optm$total_energy_used`
##               <int>                           <dbl>
##  1             2133                            3.91
##  2            19730                            4.61
##  3             3288                            4.25
##  4             7730                            3.69
##  5             8852                            3.69
##  6              425                            3.69
##  7            10277                            4.09
##  8            15028                            4.09
##  9            16841                            4.50
## 10             1890                           34.6 
## # ℹ 15,778 more rows
View(Energy_optm) # to check that the new variable has been added

Distribution of household energy consumption

ggplot(Energy_optm, aes(x = appliances)) + 
  geom_histogram(fill="red") + # fill the bars with red
  labs(
    title = "Distribution of Household Energy Consumption", # the title
    x = "Energy Consumption", # x-axis label
    y = "Count" # y-axis label: number of observations in each bin
  ) +
  theme_minimal()
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.

The distribution of household energy usage shows how energy consumption is spread across households. Most households use a moderate amount of energy, while a few have unusually high or low usage. Understanding this distribution helps identify wasteful consumption.

Using a boxplot to detect wasteful and unusual energy usage

ggplot(Energy_optm, aes(y = appliances)) + # aes() maps appliances to the y-axis
  geom_boxplot(fill = "yellow") + # draw the boxplot
  labs(title = "Appliances Outliers (Unusual and Wasteful)", # title of the graph
       y = "Appliances (Wh)") # y-axis label

Wasteful energy use occurs when appliances are left on when not needed, e.g. lights, heaters, AC units, or other appliances running longer than necessary. The minimum energy usage occurs when appliances are turned off or unused, typically at night.
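The points a boxplot draws as outliers are conventionally those lying more than 1.5 × IQR beyond the quartiles. A minimal base-R sketch of that rule, applied to made-up readings (the vector `usage` is illustrative and is not taken from the dataset):

```r
# Made-up appliance readings; the last value is deliberately extreme
usage <- c(3.7, 3.9, 3.9, 4.1, 4.1, 4.2, 4.6, 4.6, 6.9)

# First and third quartiles and the interquartile range
q1  <- quantile(usage, 0.25, names = FALSE)
q3  <- quantile(usage, 0.75, names = FALSE)
iqr <- q3 - q1

# Fences: anything outside them is flagged as an outlier
lower_fence <- q1 - 1.5 * iqr
upper_fence <- q3 + 1.5 * iqr

outliers <- usage[usage < lower_fence | usage > upper_fence]
print(c(lower = lower_fence, upper = upper_fence))
print(outliers)
```

Computing the fences explicitly makes it possible to filter the flagged rows and inspect their timestamps, which a plot alone does not allow. Plotting libraries may compute quartiles slightly differently, so a point near a fence can be classified differently from one tool to another.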

Interactive plot to identify unusual or wasteful energy consumption

# Create a Plotly boxplot to identify unusual or wasteful energy consumption
plot_ly(
  data = Energy_optm,   # Use the Energy_optm dataset
  y = ~appliances,   # Plot the appliances energy usage on the y-axis
  type = "box",   # Specify that this is a boxplot
  boxpoints = "outliers",  # Show only outlier points (unusual high/low usage)
  marker = list(size = 6),   # Set the size of the outlier points
  color = I("yellow"),   # I() uses the literal colour instead of mapping "yellow" as a data value
  line = list(width = 2),  # Adjust the thickness of the boxplot lines
  name = "Appliances"       # Name the box in case of multiple boxplots
) %>%
  # Customize the layout of the plot
  plotly::layout(
    title = "Appliances Outliers (Unusual and Wasteful)",  # Title of the chart
    yaxis = list(title = "Appliances (Wh)"),    # Label for y-axis
    showlegend = FALSE     # Hide the legend
  )

Examining usage patterns across different time periods (hours, weeks, and months)

# Derive numeric time features from the date column, replacing the earlier character versions
Energy_optm <- Energy_optm %>%
  mutate(
    hour  = hour(date),  # hour of the day (0-23)
    day   = day(date),   # day of the month (1-31)
    week  = week(date),  # week of the year
    month = month(date)  # month number (1-12)
  )

# Calculate average energy used per hour
hourly <- Energy_optm %>% 
  group_by(hour) %>% # group the data by hour of the day
  summarise(avg_energy = mean(total_energy_used, na.rm = TRUE)) # mean total energy used within each hour

# Plot for the average energy usage by hour
ggplot(hourly, aes(x=hour, y=avg_energy, fill=avg_energy)) +
  geom_bar(stat="identity") +
  labs(title="The Average Energy Usage by Hour",
       x="Hours", # xlabel
       y="Energy consumption (Wh)") + # ylabel
  scale_fill_gradient(low = "green", high = "red") +
  theme_minimal()

This plot shows how energy is used across the hours of the day. Taking human activity as an example, energy usage is typically low in the morning when people are out of the house, moderate in the afternoon when some people are home, and high in the evening when most people are home and using multiple appliances such as AC, television, water heater, space heater, and small electronic devices.
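The peak and off-peak hours can also be read off numerically rather than by eye. Here is a minimal base-R sketch that averages usage by hour and picks out the highest and lowest hours. The data frame `toy` and its values are made up for illustration.

```r
# Made-up readings: two observations for each of three hours of the day
toy <- data.frame(
  hour              = c(7, 7, 13, 13, 19, 19),
  total_energy_used = c(4, 6, 10, 12, 30, 34)
)

# Average energy used in each hour
hourly <- aggregate(total_energy_used ~ hour, data = toy, FUN = mean)
names(hourly)[2] <- "avg_energy"

# Hours with the highest and lowest average usage
peak_hour <- hourly$hour[which.max(hourly$avg_energy)]
low_hour  <- hourly$hour[which.min(hourly$avg_energy)]

print(hourly)
print(c(peak = peak_hour, low = low_hour))
```

The same two lines with `which.max()` and `which.min()` would work on the `hourly` summary computed from the real data.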

Plot for the weekly usage

weekly <- Energy_optm %>% 
  group_by(week) %>% # grouping the data by week
  summarise(avg_energy = mean(total_energy_used, na.rm = TRUE)) # mean total energy used within each week
ggplot(weekly, aes(x=week, y=avg_energy, fill=avg_energy)) +
  geom_bar(stat="identity") +
  labs(title="The Average Energy Usage by Week", # title for the graph
       x="Week of the year", # x-axis label
       y="Energy consumption (Wh)") + # y-axis label
  scale_fill_gradient(low = "green", high = "red") + # gradient colour to highlight high (red) vs low (green) usage
  theme_minimal()

Household Energy Consumption by Month

monthly <- Energy_optm %>% 
  group_by(month) %>% # grouping the data by month
  summarise(avg_energy = mean(total_energy_used, na.rm = TRUE)) # mean total energy used within each month
# Create a bar plot comparing energy usage between months
ggplot(monthly, aes(x=month, y=avg_energy, fill=avg_energy)) +
  geom_bar(stat="identity") +
  labs(title="The Average Energy Usage by Month",
       x="Month", # x-axis label
       y="Energy consumption (Wh)") + # y-axis label
  scale_fill_gradient(low = "green", high = "red") + # gradient colour to highlight high (red) vs low (green) usage
  theme_minimal()

February had the highest energy consumption, probably because cold weather led to increased use of heating systems. May had the lowest, probably because temperatures were moderate and comfortable, so heating was rarely needed and cooling was not yet in full use.

Household energy consumption patterns between weekdays and weekends

avg_energy_by_day <- Energy_optm %>%
  group_by(day_type) %>% # grouping the data by day type
  summarise(avg_usage = mean(appliances, na.rm = TRUE)) %>% # Compute the mean appliances usage for each group
  arrange(desc(avg_usage)) # Arrange the results from highest to lowest average usage
# Create a bar plot comparing energy usage between day types
ggplot(avg_energy_by_day, aes(x = day_type, y = avg_usage, fill = avg_usage)) +
  geom_bar(stat = "identity") +   # Draw bars where the height represents the average usage
  labs(
    title = "Energy Consumption Patterns Between Weekdays and Weekends", # add the title
    x = "Day type", # label for x-axis (Weekday or Weekend)
    y = "Energy Usage (Appliances)" # label for y-axis
  ) +
  scale_fill_gradient(low = "green", high = "red") + # gradient colour to highlight high (red) vs low (green) usage
  theme_minimal()

Weekends, and Saturdays in particular, are often days when all family members are home. More people at home usually means more use of appliances such as lights, fans, air conditioning, washing machines, ovens, televisions, heaters, and other electronics.
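Whether a gap between two bar heights is more than noise can be checked with a two-sample test. The sketch below applies Welch's t-test (`t.test()` in base R) to made-up weekday and weekend readings. The data frame `toy` is hypothetical, and because real readings taken minutes apart are not independent, a p-value obtained this way on the actual dataset should be treated as optimistic.

```r
# Made-up appliance readings for the two day types
toy <- data.frame(
  day_type   = rep(c("Weekday", "Weekend"), each = 5),
  appliances = c(4.0, 4.1, 4.2, 4.3, 4.4,   # weekday readings
                 4.5, 4.6, 4.7, 4.8, 4.9)   # weekend readings
)

# Mean usage in each group
group_means <- tapply(toy$appliances, toy$day_type, mean)

# Welch two-sample t-test (does not assume equal variances)
result <- t.test(appliances ~ day_type, data = toy)

print(group_means)
print(result$p.value)
```

A small p-value alongside a meaningful difference in `group_means` would support the claim that weekend usage differs from weekday usage.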

Environmental factors that influence energy usage (temperature, humidity, and wind speed)

The effect of Temperature

# the effect of temperature on energy usage
ggplot(Energy_optm, aes(t1, appliances)) +
  geom_point(alpha=0.4) + # add each points
  geom_smooth(color="red") + # to add the temperature trend line
  labs(title="Effect of Temperature on Energy usage",
       x="Temperature (°C)", # label for x-axis
       y="Energy Usage (Wh)") # label for the y-axis
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'

Points below the trend line show lower-than-typical energy usage and points above it show higher-than-typical usage at a given temperature. Note that t1 appears to be one of the indoor sensor readings (T1-T9), while the outdoor temperature is stored in t_out. In general, hot weather tends to raise appliance use through cooling (AC), and cold weather tends to raise it through heating.

The influence of humidity on energy usage

# the effect of humidity on energy usage
ggplot(Energy_optm, aes(rh_1,appliances)) +
  geom_point(alpha=0.4) + # add the scatter points
  geom_smooth(color="blue") + # add the trend line to show the general relationship
  labs(title="Effect of Humidity on Energy used", # the title for the plot
       x="Humidity (%)", # label for x-axis
       y="Energy Usage (Wh)") # label for the y-axis
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'

Low humidity often occurs during cold weather. When it is cold and dry, households may use heating systems more, which increases energy usage, while high humidity can increase the use of cooling or dehumidifying equipment.

The influence of windspeed on energy usage

ggplot(Energy_optm, aes(windspeed, appliances)) +
  geom_point(alpha=0.4) + # add the scatter points
  geom_smooth(color="blue") + # add the trend line to show the general relationship
  labs(title="Effect of Wind Speed on Energy used", # the title for the plot
       x="Wind speed (m/s)", # label for x-axis
       y="Energy Usage (Wh)") # label for the y-axis
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'

Wind speed contributes to household energy usage, but its effect is smaller than that of temperature and humidity. For example, in warm weather wind can help cool the house naturally, reducing AC use.
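The relative strength of these relationships can be put into numbers with a rank correlation, which does not assume a straight-line relationship. A minimal base-R sketch using Spearman's correlation on a made-up data frame (`toy` and all of its values are illustrative; the column names mirror those in the dataset):

```r
# Made-up weather and usage readings
toy <- data.frame(
  t_out      = c(2, 5, 8, 11, 14, 17),
  rh_out     = c(95, 90, 80, 75, 60, 55),
  windspeed  = c(3, 1, 4, 2, 5, 3),
  appliances = c(4.8, 4.6, 4.3, 4.2, 4.0, 3.9)
)

# Spearman rank correlation of each environmental variable with appliance usage
env_vars <- c("t_out", "rh_out", "windspeed")
env_cor <- sapply(toy[env_vars], function(x) {
  cor(x, toy$appliances, method = "spearman")
})

# Rank the variables by the absolute strength of their correlation
env_cor_ranked <- env_cor[order(abs(env_cor), decreasing = TRUE)]
print(round(env_cor_ranked, 2))
```

Values near +1 or -1 indicate a strong monotonic relationship and values near 0 a weak one, so running this on the real columns would show whether wind speed really matters less than temperature and humidity. A U-shaped relationship (high usage at both cold and hot extremes) is not monotonic and would show up as a weak correlation, so the scatter plots remain necessary.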

Top ten households with the highest energy usage

top10_households <- Energy_optm %>%
  group_by(ID) %>%                       # Group by household
  summarise(total_usage = sum(total_energy_used, na.rm = TRUE)) %>%  # Total energy used
  arrange(desc(total_usage)) %>%               # Sort from highest to lowest
  slice(1:10)    
top10_households
## # A tibble: 10 × 2
##       ID total_usage
##    <int>       <dbl>
##  1    10        75.4
##  2    11        66.4
##  3    12        56.1
##  4  2197        55.9
##  5  1313        55.8
##  6  2888        54.9
##  7  8933        54.5
##  8  8934        54.4
##  9     7        54.1
## 10  1837        46.3
# Plot top 10 households 
ggplot(top10_households, aes(x = reorder(ID, total_usage), y = total_usage, fill = total_usage)) +
  geom_bar(stat = "identity") + # Draw bars with fill mapped to total_usage 
  # Flip coordinates to make it horizontal
  coord_flip() +
  labs(
    title = "Top 10 Households with Highest Energy Usage",   # Add title and axis labels
    x = "Household ID",
    y = "Total Energy Usage (Wh)"
  ) +
  scale_fill_gradient(low = "orange", high = "red") +   # Apply a colour gradient from orange (low) to red (high)
  theme_minimal(base_size = 14)

The top ten households may use the most energy because they need more heating or cooling, use inefficient appliances, or do not schedule their appliance use well. Larger household size and poor building insulation could also increase their energy use, although the dataset does not record either, so these explanations remain hypotheses. Weather conditions such as temperature and wind, together with daily usage habits, can further raise their energy consumption.
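Before reading too much into a ranking by ID, it is worth checking how many rows each ID has. If every ID appears only once, each "total" is a single reading rather than an accumulated household total. A minimal base-R sketch of that check and of the ranking itself, on a made-up data frame (`toy` and its values are illustrative only):

```r
# Made-up rows: one reading per ID
toy <- data.frame(
  ID                = c(101, 102, 103, 104, 105),
  total_energy_used = c(12.5, 48.0, 7.3, 33.1, 20.9)
)

# TRUE when no ID is repeated, i.e. each ID identifies exactly one row
ids_are_unique <- anyDuplicated(toy$ID) == 0

# Largest number of rows recorded for any single ID
max_rows_per_id <- max(table(toy$ID))

# Top three rows by energy used, highest first
top3 <- head(toy[order(toy$total_energy_used, decreasing = TRUE), ], 3)

print(ids_are_unique)
print(max_rows_per_id)
print(top3)
```

Running `anyDuplicated(Energy_optm$ID)` on the real data would show which case applies; a result of 0 would mean the top-ten table ranks individual time-stamped readings.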

After visualizing the data, the results show how energy is used over time (by hour, week, and month). They also suggest that temperature and humidity affect energy use, with higher consumption during extreme weather. These visualizations help identify wasteful energy use and provide insights that can guide households to reduce unnecessary consumption and improve energy efficiency.

My advice as a data analyst to households is to turn off appliances such as AC, lights, and other devices when not in use, to maximize natural light during the day, to maintain moderate indoor temperature and humidity to reduce HVAC load, and to reduce usage during peak hours (evenings and weekends), when appliances consume the most energy.