Big news about a new project! Hi,
I have exciting news!
A large regional residential developer is designing a large ‘Smart Home’ apartment housing development and is looking for evidence or positive reasons for adopting the use of electrical sub-metering devices used for power management in Smart Homes.
Installing these sub-meters could be a big step towards the developer’s goal of offering highly efficient Smart Homes that provide owners with power usage analytics - a large piece of their planned marketing efforts.
We have been asked to find enough evidence to help support the developer’s marketing claims of sub-meters providing owners with ‘useful’ power usage analytics; we need to do this by performing an ‘analytical deep dive’ of sub-metering generated data and producing high quality visualizations that support a positive narrative or story around your findings (you will create this narrative).
We have also been asked to demonstrate that we can predict future energy consumption from the same data; this would help further support their adoption of sub-meters from the viewpoint of possible energy consumption reduction and a monetary savings as a result.,
As a starting point, they have provided us with a very large data set that contains 47 months of energy usage data from these devices. Our job over the next few weeks will be to analyze this data to determine what kind of analytics and visualizations can be created that would empower Smart Home owners with greater understanding and control of their power usage.
Since you are new at IOT Analytics, I’ll explain our onboarding process for new clients.
Conduct research to become informed on the client’s business. In this case, you will need to get up to speed on the domains of Smart Homes, sub-meters and household power consumption.
Identify any analytic skill/knowledge gaps foreseen for the project and plug those gaps with self-learning. With a data set this big, data munging and sub-setting will be essential to the analytic process. Working with DateTime and Time Series also needs to be mastered.
Perform an initial exploration of the data. This exploration should be used to understand any potential issues, conduct early preprocessing, note summary statistics and identify any early recommendations about improvements to future data collection.
Hold a project kick-off meeting with the client to close the deal. This meeting will center around a presentation that contains all of the project details and our initial exploration of the data.
Your initial task is to work through the onboarding process and produce a PowerPoint presentation that will be delivered to the home developer’s management team during the kick off meeting. This report will give them confidence in our process and convince them that this project is relevant to their business needs.
It is important to remember that we do not know what the data will show, and neither does the developer; it is your job to tell them how we will conduct the analysis and what they are likely to gain, so be precise and accurate with your findings and any initial recommendations. However, keep in mind that you will be presenting your report to business rather than technical people.
Good luck,
Kathy
VP, IOT Analytics
The dataset contains 47 months of energy usage data from sub-meters in a house located in Sceaux, France. The data includes measurements of electric power consumption from three sub-meters with a one-minute sampling rate.
library(RMySQL)
## Loading required package: DBI
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(lubridate)
##
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
##
## date, intersect, setdiff, union
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.2 ✔ tibble 3.2.1
## ✔ purrr 1.0.1 ✔ tidyr 1.3.0
## ✔ readr 2.1.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(plotly)
##
## Attaching package: 'plotly'
##
## The following object is masked from 'package:ggplot2':
##
## last_plot
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following object is masked from 'package:graphics':
##
## layout
library(ggplot2)
library(ggfortify)
library(fracdiff)
library(plotly)
library(tidyquant)
## Loading required package: PerformanceAnalytics
## Loading required package: xts
## Loading required package: zoo
##
## Attaching package: 'zoo'
##
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
##
##
## ######################### Warning from 'xts' package ##########################
## # #
## # The dplyr lag() function breaks how base R's lag() function is supposed to #
## # work, which breaks lag(my_xts). Calls to lag(my_xts) that you type or #
## # source() into this session won't work correctly. #
## # #
## # Use stats::lag() to make sure you're not using dplyr::lag(), or you can add #
## # conflictRules('dplyr', exclude = 'lag') to your .Rprofile to stop #
## # dplyr from breaking base R's lag() function. #
## # #
## # Code in packages is not affected. It's protected by R's namespace mechanism #
## # Set `options(xts.warn_dplyr_breaks_lag = FALSE)` to suppress this warning. #
## # #
## ###############################################################################
##
## Attaching package: 'xts'
##
## The following objects are masked from 'package:dplyr':
##
## first, last
##
##
## Attaching package: 'PerformanceAnalytics'
##
## The following object is masked from 'package:graphics':
##
## legend
##
## Loading required package: quantmod
## Loading required package: TTR
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
library(lubridate)
library(forecast)
## Registered S3 methods overwritten by 'forecast':
## method from
## autoplot.Arima ggfortify
## autoplot.acf ggfortify
## autoplot.ar ggfortify
## autoplot.bats ggfortify
## autoplot.decomposed.ts ggfortify
## autoplot.ets ggfortify
## autoplot.forecast ggfortify
## autoplot.stl ggfortify
## autoplot.ts ggfortify
## fitted.ar ggfortify
## fortify.ts ggfortify
## residuals.ar ggfortify
library(zoo)
library(dplyr)
library(fpp2)
## ── Attaching packages ────────────────────────────────────────────── fpp2 2.5 ──
## ✔ fma 2.5 ✔ expsmooth 2.3
library(shiny)
library(textshaping)
library(inline)
##
## Attaching package: 'inline'
##
## The following object is masked from 'package:shiny':
##
## code
library(pryr)
##
## Attaching package: 'pryr'
##
## The following objects are masked from 'package:purrr':
##
## compose, partial
##
## The following object is masked from 'package:dplyr':
##
## where
library(stats)
## Create a database connection
#con = dbConnect(MySQL(), user='deepAnalytics', password='Sqltask1234!', dbname='dataanalytics2018', #host='data-analytics-2018.cbrosir2cswx.us-east-1.rds.amazonaws.com')
## Create a database connection
#y2007 <- dbGetQuery(con, "SELECT * FROM yr_2007")
#y2008 <- dbGetQuery(con, "SELECT * FROM yr_2008")
#y2009 <- dbGetQuery(con, "SELECT * FROM yr_2009")
# Load required libraries
library(shiny)
library(DBI)
library(dplyr)
# Define UI for the Shiny app
ui <- fluidPage(
# Application title
titlePanel("Database Connection"),
# Sidebar layout
sidebarLayout(
# Sidebar panel
sidebarPanel(
# Database connection inputs
textInput("user_input", "Username", value = ""),
passwordInput("password_input", "Password", value = ""),
textInput("dbname_input", "Database Name", value = ""),
textInput("host_input", "Host", value = ""),
# Action button to trigger connection and data retrieval
actionButton("connect_button", "Connect to Database")
),
# Main panel
mainPanel(
# Output for status message
verbatimTextOutput("status_output"),
# Output for downloaded data
downloadButton("download_button", "Download Data")
)
)
)
# Define server logic for the Shiny app
server <- function(input, output) {
# Reactive values for storing the database connection and retrieved data
con <- reactiveValues()
data <- reactiveValues()
# Connect to the database when the "Connect to Database" button is clicked
observeEvent(input$connect_button, {
# Retrieve user inputs
user <- input$user_input
password <- input$password_input
dbname <- input$dbname_input
host <- input$host_input
# Create a database connection
con$connection <- dbConnect(MySQL(), user = user, password = password, dbname = dbname, host = host)
# Check if the connection was successful
if (!is.null(con$connection)) {
output$status_output <- renderPrint("Database connection successful.")
} else {
output$status_output <- renderPrint("Database connection failed.")
}
})
# Retrieve data from the database and store it when the connection is successful
observeEvent(con$connection, {
if (!is.null(con$connection)) {
# Retrieve data from the database
y2007 <- dbGetQuery(con$connection, "SELECT * FROM yr_2007")
y2008 <- dbGetQuery(con$connection, "SELECT * FROM yr_2008")
y2009 <- dbGetQuery(con$connection, "SELECT * FROM yr_2009")
# Combine the data into a single dataframe
data$df <- bind_rows(y2007, y2008, y2009)
}
})
# Download the data as a CSV file
output$download_button <- downloadHandler(
filename = "data.csv",
content = function(file) {
if (!is.null(data$df)) {
write.csv(data$df, file, row.names = FALSE)
}
}
)
}
# Run the Shiny app
#shinyApp(ui = ui, server = server)
This project aims to analyze and forecast power consumption data from a large ‘Smart Home’ apartment housing development. The goal is to provide evidence for the adoption of electrical sub-metering devices used for power management in Smart Homes, as well as empower Smart Home owners with greater understanding and control of their power usage.
We have been tasked with conducting an analytical deep dive into sub-metering generated data and producing high-quality visualizations that support a positive narrative around the benefits of sub-meters. Additionally, we need to demonstrate the ability to predict future energy consumption based on the same data.
# Read data.csv
data <- read.csv("data.csv")
# Convert "Date" column to datetime
data$Date <- as.POSIXct(data$Date)
# Select specific years
y2007 <- data[format(data$Date, "%Y") == "2007", ]
y2008 <- data[format(data$Date, "%Y") == "2008", ]
y2009 <- data[format(data$Date, "%Y") == "2009", ]
## 2007
str(y2007)
## 'data.frame': 521669 obs. of 10 variables:
## $ id : num 1 2 3 4 5 6 7 8 9 10 ...
## $ Date : POSIXct, format: "2007-01-01" "2007-01-01" ...
## $ Time : chr "00:00:00" "00:01:00" "00:02:00" "00:03:00" ...
## $ Global_active_power : num 2.58 2.55 2.55 2.55 2.55 ...
## $ Global_reactive_power: num 0.136 0.1 0.1 0.1 0.1 0.1 0.096 0 0 0 ...
## $ Global_intensity : num 10.6 10.4 10.4 10.4 10.4 10.4 10.4 10.2 10.2 10.2 ...
## $ Voltage : num 242 242 242 242 242 ...
## $ Sub_metering_1 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Sub_metering_2 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Sub_metering_3 : int 0 0 0 0 0 0 0 0 0 0 ...
summary(y2007)
## id Date Time
## Min. : 1 Min. :2007-01-01 00:00:00.00 Length:521669
## 1st Qu.:130423 1st Qu.:2007-04-01 00:00:00.00 Class :character
## Median :264606 Median :2007-07-03 00:00:00.00 Mode :character
## Mean :263456 Mean :2007-07-02 11:19:19.42
## 3rd Qu.:395178 3rd Qu.:2007-10-02 00:00:00.00
## Max. :525600 Max. :2007-12-31 00:00:00.00
## Global_active_power Global_reactive_power Global_intensity Voltage
## Min. : 0.082 Min. :0.0000 Min. : 0.400 Min. :223.5
## 1st Qu.: 0.278 1st Qu.:0.0000 1st Qu.: 1.200 1st Qu.:236.9
## Median : 0.504 Median :0.1000 Median : 2.400 Median :239.7
## Mean : 1.117 Mean :0.1174 Mean : 4.764 Mean :239.4
## 3rd Qu.: 1.548 3rd Qu.:0.1860 3rd Qu.: 6.400 3rd Qu.:241.8
## Max. :10.670 Max. :1.1480 Max. :46.400 Max. :252.1
## Sub_metering_1 Sub_metering_2 Sub_metering_3
## Min. : 0.000 Min. : 0.000 Min. : 0.000
## 1st Qu.: 0.000 1st Qu.: 0.000 1st Qu.: 0.000
## Median : 0.000 Median : 0.000 Median : 0.000
## Mean : 1.232 Mean : 1.638 Mean : 5.795
## 3rd Qu.: 0.000 3rd Qu.: 1.000 3rd Qu.:17.000
## Max. :78.000 Max. :78.000 Max. :20.000
head(y2007)
## id Date Time Global_active_power Global_reactive_power
## 1 1 2007-01-01 00:00:00 2.580 0.136
## 2 2 2007-01-01 00:01:00 2.552 0.100
## 3 3 2007-01-01 00:02:00 2.550 0.100
## 4 4 2007-01-01 00:03:00 2.550 0.100
## 5 5 2007-01-01 00:04:00 2.554 0.100
## 6 6 2007-01-01 00:05:00 2.550 0.100
## Global_intensity Voltage Sub_metering_1 Sub_metering_2 Sub_metering_3
## 1 10.6 241.97 0 0 0
## 2 10.4 241.75 0 0 0
## 3 10.4 241.64 0 0 0
## 4 10.4 241.71 0 0 0
## 5 10.4 241.98 0 0 0
## 6 10.4 241.83 0 0 0
tail(y2007)
## id Date Time Global_active_power Global_reactive_power
## 521664 525595 2007-12-31 23:54:00 1.648 0.102
## 521665 525596 2007-12-31 23:55:00 1.746 0.204
## 521666 525597 2007-12-31 23:56:00 1.732 0.210
## 521667 525598 2007-12-31 23:57:00 1.732 0.210
## 521668 525599 2007-12-31 23:58:00 1.684 0.144
## 521669 525600 2007-12-31 23:59:00 1.628 0.072
## Global_intensity Voltage Sub_metering_1 Sub_metering_2 Sub_metering_3
## 521664 6.8 241.56 0 0 18
## 521665 7.2 242.41 0 0 18
## 521666 7.2 242.42 0 0 18
## 521667 7.2 242.50 0 0 18
## 521668 7.0 242.18 0 0 18
## 521669 6.6 241.79 0 0 18
## 2008
str(y2008)
## 'data.frame': 526905 obs. of 10 variables:
## $ id : num 1 2 3 4 5 6 7 8 9 10 ...
## $ Date : POSIXct, format: "2008-01-01" "2008-01-01" ...
## $ Time : chr "00:00:00" "00:01:00" "00:02:00" "00:03:00" ...
## $ Global_active_power : num 1.62 1.63 1.62 1.61 1.61 ...
## $ Global_reactive_power: num 0.07 0.072 0.072 0.07 0.07 0 0 0 0 0 ...
## $ Global_intensity : num 6.6 6.6 6.6 6.6 6.6 6.4 6.4 6.4 6.4 6.4 ...
## $ Voltage : num 241 242 242 241 241 ...
## $ Sub_metering_1 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Sub_metering_2 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Sub_metering_3 : int 18 18 18 18 18 17 18 18 18 18 ...
summary(y2008)
## id Date Time
## Min. : 1 Min. :2008-01-01 00:00:00.00 Length:526905
## 1st Qu.:131732 1st Qu.:2008-04-01 00:00:00.00 Class :character
## Median :263461 Median :2008-07-01 00:00:00.00 Mode :character
## Mean :263474 Mean :2008-07-01 11:39:11.64
## 3rd Qu.:395191 3rd Qu.:2008-10-01 00:00:00.00
## Max. :527040 Max. :2008-12-31 00:00:00.00
## Global_active_power Global_reactive_power Global_intensity Voltage
## Min. : 0.076 Min. :0.0000 Min. : 0.200 Min. :224.6
## 1st Qu.: 0.300 1st Qu.:0.0460 1st Qu.: 1.400 1st Qu.:238.9
## Median : 0.566 Median :0.0940 Median : 2.400 Median :240.7
## Mean : 1.072 Mean :0.1171 Mean : 4.552 Mean :240.6
## 3rd Qu.: 1.518 3rd Qu.:0.1840 3rd Qu.: 6.400 3rd Qu.:242.5
## Max. :10.348 Max. :1.3900 Max. :44.600 Max. :250.9
## Sub_metering_1 Sub_metering_2 Sub_metering_3
## Min. : 0.00 Min. : 0.000 Min. : 0.000
## 1st Qu.: 0.00 1st Qu.: 0.000 1st Qu.: 0.000
## Median : 0.00 Median : 0.000 Median : 1.000
## Mean : 1.11 Mean : 1.256 Mean : 6.034
## 3rd Qu.: 0.00 3rd Qu.: 1.000 3rd Qu.:17.000
## Max. :80.00 Max. :76.000 Max. :31.000
head(y2008)
## id Date Time Global_active_power Global_reactive_power
## 521670 1 2008-01-01 00:00:00 1.620 0.070
## 521671 2 2008-01-01 00:01:00 1.626 0.072
## 521672 3 2008-01-01 00:02:00 1.622 0.072
## 521673 4 2008-01-01 00:03:00 1.612 0.070
## 521674 5 2008-01-01 00:04:00 1.612 0.070
## 521675 6 2008-01-01 00:05:00 1.546 0.000
## Global_intensity Voltage Sub_metering_1 Sub_metering_2 Sub_metering_3
## 521670 6.6 241.25 0 0 18
## 521671 6.6 241.74 0 0 18
## 521672 6.6 241.52 0 0 18
## 521673 6.6 240.82 0 0 18
## 521674 6.6 240.80 0 0 18
## 521675 6.4 240.66 0 0 17
tail(y2008)
## id Date Time Global_active_power Global_reactive_power
## 1048569 527035 2008-12-31 23:54:00 0.484 0.064
## 1048570 527036 2008-12-31 23:55:00 0.484 0.064
## 1048571 527037 2008-12-31 23:56:00 0.482 0.064
## 1048572 527038 2008-12-31 23:57:00 0.482 0.064
## 1048573 527039 2008-12-31 23:58:00 0.480 0.064
## 1048574 527040 2008-12-31 23:59:00 0.482 0.062
## Global_intensity Voltage Sub_metering_1 Sub_metering_2 Sub_metering_3
## 1048569 2.2 248.15 0 0 0
## 1048570 2.2 247.69 0 0 0
## 1048571 2.2 247.35 0 0 0
## 1048572 2.2 246.99 0 0 0
## 1048573 2.2 246.52 0 0 0
## 1048574 2.2 246.97 0 0 0
## 2009
str(y2009)
## 'data.frame': 521320 obs. of 10 variables:
## $ id : num 1 2 3 4 5 6 7 8 9 10 ...
## $ Date : POSIXct, format: "2009-01-01" "2009-01-01" ...
## $ Time : chr "00:00:00" "00:01:00" "00:02:00" "00:03:00" ...
## $ Global_active_power : num 0.484 0.484 0.482 0.482 0.482 0.57 0.59 0.588 0.586 0.586 ...
## $ Global_reactive_power: num 0.062 0.062 0.062 0.06 0.062 0 0.078 0.078 0.078 0.078 ...
## $ Global_intensity : num 2.2 2.2 2.2 2.2 2.2 2.6 2.6 2.6 2.6 2.6 ...
## $ Voltage : num 248 248 248 248 247 ...
## $ Sub_metering_1 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Sub_metering_2 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Sub_metering_3 : int 0 0 0 0 0 0 0 0 0 0 ...
summary(y2009)
## id Date Time
## Min. : 1 Min. :2009-01-01 00:00:00.00 Length:521320
## 1st Qu.:130398 1st Qu.:2009-04-01 00:00:00.00 Class :character
## Median :264038 Median :2009-07-03 00:00:00.00 Mode :character
## Mean :262890 Mean :2009-07-02 01:54:31.30
## 3rd Qu.:395266 3rd Qu.:2009-10-02 00:00:00.00
## Max. :525600 Max. :2009-12-31 00:00:00.00
## Global_active_power Global_reactive_power Global_intensity Voltage
## Min. : 0.122 Min. :0.0000 Min. : 0.400 Min. :223.2
## 1st Qu.: 0.318 1st Qu.:0.0520 1st Qu.: 1.400 1st Qu.:240.1
## Median : 0.622 Median :0.1060 Median : 2.800 Median :241.9
## Mean : 1.079 Mean :0.1314 Mean : 4.555 Mean :241.9
## 3rd Qu.: 1.514 3rd Qu.:0.2060 3rd Qu.: 6.200 3rd Qu.:243.6
## Max. :11.122 Max. :1.2400 Max. :48.400 Max. :254.2
## Sub_metering_1 Sub_metering_2 Sub_metering_3
## Min. : 0.000 Min. : 0.000 Min. : 0.000
## 1st Qu.: 0.000 1st Qu.: 0.000 1st Qu.: 0.000
## Median : 0.000 Median : 0.000 Median : 1.000
## Mean : 1.137 Mean : 1.136 Mean : 6.823
## 3rd Qu.: 0.000 3rd Qu.: 1.000 3rd Qu.:18.000
## Max. :82.000 Max. :77.000 Max. :31.000
head(y2009)
## id Date Time Global_active_power Global_reactive_power
## 1048575 1 2009-01-01 00:00:00 0.484 0.062
## 1048576 2 2009-01-01 00:01:00 0.484 0.062
## 1048577 3 2009-01-01 00:02:00 0.482 0.062
## 1048578 4 2009-01-01 00:03:00 0.482 0.060
## 1048579 5 2009-01-01 00:04:00 0.482 0.062
## 1048580 6 2009-01-01 00:05:00 0.570 0.000
## Global_intensity Voltage Sub_metering_1 Sub_metering_2 Sub_metering_3
## 1048575 2.2 247.86 0 0 0
## 1048576 2.2 247.72 0 0 0
## 1048577 2.2 247.75 0 0 0
## 1048578 2.2 247.52 0 0 0
## 1048579 2.2 246.94 0 0 0
## 1048580 2.6 246.94 0 0 0
tail(y2009)
## id Date Time Global_active_power Global_reactive_power
## 1569889 525595 2009-12-31 23:54:00 1.704 0.128
## 1569890 525596 2009-12-31 23:55:00 1.746 0.158
## 1569891 525597 2009-12-31 23:56:00 1.786 0.234
## 1569892 525598 2009-12-31 23:57:00 1.784 0.232
## 1569893 525599 2009-12-31 23:58:00 1.792 0.236
## 1569894 525600 2009-12-31 23:59:00 1.792 0.238
## Global_intensity Voltage Sub_metering_1 Sub_metering_2 Sub_metering_3
## 1569889 7.0 239.43 0 0 18
## 1569890 7.2 239.95 0 0 18
## 1569891 7.4 240.09 0 0 19
## 1569892 7.4 239.99 0 0 18
## 1569893 7.4 240.62 0 0 18
## 1569894 7.4 240.82 0 0 19
df<-(bind_rows(y2007, y2008, y2009))
str(df)
## 'data.frame': 1569894 obs. of 10 variables:
## $ id : num 1 2 3 4 5 6 7 8 9 10 ...
## $ Date : POSIXct, format: "2007-01-01" "2007-01-01" ...
## $ Time : chr "00:00:00" "00:01:00" "00:02:00" "00:03:00" ...
## $ Global_active_power : num 2.58 2.55 2.55 2.55 2.55 ...
## $ Global_reactive_power: num 0.136 0.1 0.1 0.1 0.1 0.1 0.096 0 0 0 ...
## $ Global_intensity : num 10.6 10.4 10.4 10.4 10.4 10.4 10.4 10.2 10.2 10.2 ...
## $ Voltage : num 242 242 242 242 242 ...
## $ Sub_metering_1 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Sub_metering_2 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Sub_metering_3 : int 0 0 0 0 0 0 0 0 0 0 ...
summary(df)
## id Date Time
## Min. : 1 Min. :2007-01-01 00:00:00.00 Length:1569894
## 1st Qu.:130851 1st Qu.:2007-10-03 00:00:00.00 Class :character
## Median :264035 Median :2008-07-01 00:00:00.00 Mode :character
## Mean :263274 Mean :2008-07-01 14:19:46.66
## 3rd Qu.:395212 3rd Qu.:2009-03-31 00:00:00.00
## Max. :527040 Max. :2009-12-31 00:00:00.00
## Global_active_power Global_reactive_power Global_intensity Voltage
## Min. : 0.076 Min. :0.0000 Min. : 0.200 Min. :223.2
## 1st Qu.: 0.300 1st Qu.:0.0460 1st Qu.: 1.400 1st Qu.:238.7
## Median : 0.566 Median :0.1000 Median : 2.600 Median :240.8
## Mean : 1.089 Mean :0.1219 Mean : 4.624 Mean :240.6
## 3rd Qu.: 1.524 3rd Qu.:0.1920 3rd Qu.: 6.400 3rd Qu.:242.8
## Max. :11.122 Max. :1.3900 Max. :48.400 Max. :254.2
## Sub_metering_1 Sub_metering_2 Sub_metering_3
## Min. : 0.000 Min. : 0.000 Min. : 0.000
## 1st Qu.: 0.000 1st Qu.: 0.000 1st Qu.: 0.000
## Median : 0.000 Median : 0.000 Median : 1.000
## Mean : 1.159 Mean : 1.343 Mean : 6.216
## 3rd Qu.: 0.000 3rd Qu.: 1.000 3rd Qu.:17.000
## Max. :82.000 Max. :78.000 Max. :31.000
head(df)
## id Date Time Global_active_power Global_reactive_power
## 1 1 2007-01-01 00:00:00 2.580 0.136
## 2 2 2007-01-01 00:01:00 2.552 0.100
## 3 3 2007-01-01 00:02:00 2.550 0.100
## 4 4 2007-01-01 00:03:00 2.550 0.100
## 5 5 2007-01-01 00:04:00 2.554 0.100
## 6 6 2007-01-01 00:05:00 2.550 0.100
## Global_intensity Voltage Sub_metering_1 Sub_metering_2 Sub_metering_3
## 1 10.6 241.97 0 0 0
## 2 10.4 241.75 0 0 0
## 3 10.4 241.64 0 0 0
## 4 10.4 241.71 0 0 0
## 5 10.4 241.98 0 0 0
## 6 10.4 241.83 0 0 0
tail(df)
## id Date Time Global_active_power Global_reactive_power
## 1569889 525595 2009-12-31 23:54:00 1.704 0.128
## 1569890 525596 2009-12-31 23:55:00 1.746 0.158
## 1569891 525597 2009-12-31 23:56:00 1.786 0.234
## 1569892 525598 2009-12-31 23:57:00 1.784 0.232
## 1569893 525599 2009-12-31 23:58:00 1.792 0.236
## 1569894 525600 2009-12-31 23:59:00 1.792 0.238
## Global_intensity Voltage Sub_metering_1 Sub_metering_2 Sub_metering_3
## 1569889 7.0 239.43 0 0 18
## 1569890 7.2 239.95 0 0 18
## 1569891 7.4 240.09 0 0 19
## 1569892 7.4 239.99 0 0 18
## 1569893 7.4 240.62 0 0 18
## 1569894 7.4 240.82 0 0 19
df$datetime <- as.POSIXct(paste(df$Date, df$Time), format = "%Y-%m-%d %H:%M:%S")
df <- df[,c(ncol(df), 1:(ncol(df)-1))]
## Analize by year (Lubridate)
df$year <- year(df$datetime)
sum(is.na(df))
## [1] 360
df<-na.omit(df)
duplicated_rows <- subset(df, duplicated(df))
print(duplicated_rows)
## [1] datetime id Date
## [4] Time Global_active_power Global_reactive_power
## [7] Global_intensity Voltage Sub_metering_1
## [10] Sub_metering_2 Sub_metering_3 year
## <0 rows> (or 0-length row.names)
str(df)
## 'data.frame': 1569714 obs. of 12 variables:
## $ datetime : POSIXct, format: "2007-01-01 00:00:00" "2007-01-01 00:01:00" ...
## $ id : num 1 2 3 4 5 6 7 8 9 10 ...
## $ Date : POSIXct, format: "2007-01-01" "2007-01-01" ...
## $ Time : chr "00:00:00" "00:01:00" "00:02:00" "00:03:00" ...
## $ Global_active_power : num 2.58 2.55 2.55 2.55 2.55 ...
## $ Global_reactive_power: num 0.136 0.1 0.1 0.1 0.1 0.1 0.096 0 0 0 ...
## $ Global_intensity : num 10.6 10.4 10.4 10.4 10.4 10.4 10.4 10.2 10.2 10.2 ...
## $ Voltage : num 242 242 242 242 242 ...
## $ Sub_metering_1 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Sub_metering_2 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Sub_metering_3 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ year : num 2007 2007 2007 2007 2007 ...
## - attr(*, "na.action")= 'omit' Named int [1:180] 119637 119638 119639 119640 119641 119642 119643 119644 119645 119646 ...
## ..- attr(*, "names")= chr [1:180] "119637" "119638" "119639" "119640" ...
summary(df)
## datetime id
## Min. :2007-01-01 00:00:00.00 Min. : 1
## 1st Qu.:2007-10-03 06:54:15.00 1st Qu.:130896
## Median :2008-07-01 20:35:30.00 Median :264065
## Mean :2008-07-02 02:35:23.74 Mean :263290
## 3rd Qu.:2009-03-31 13:17:45.00 3rd Qu.:395227
## Max. :2009-12-31 23:59:00.00 Max. :527040
## Date Time Global_active_power
## Min. :2007-01-01 00:00:00.00 Length:1569714 Min. : 0.076
## 1st Qu.:2007-10-03 00:00:00.00 Class :character 1st Qu.: 0.300
## Median :2008-07-01 00:00:00.00 Mode :character Median : 0.566
## Mean :2008-07-01 14:35:36.68 Mean : 1.089
## 3rd Qu.:2009-03-31 00:00:00.00 3rd Qu.: 1.524
## Max. :2009-12-31 00:00:00.00 Max. :11.122
## Global_reactive_power Global_intensity Voltage Sub_metering_1
## Min. :0.0000 Min. : 0.200 Min. :223.2 Min. : 0.000
## 1st Qu.:0.0460 1st Qu.: 1.400 1st Qu.:238.7 1st Qu.: 0.000
## Median :0.1000 Median : 2.600 Median :240.8 Median : 0.000
## Mean :0.1219 Mean : 4.623 Mean :240.6 Mean : 1.159
## 3rd Qu.:0.1920 3rd Qu.: 6.400 3rd Qu.:242.8 3rd Qu.: 0.000
## Max. :1.3900 Max. :48.400 Max. :254.2 Max. :82.000
## Sub_metering_2 Sub_metering_3 year
## Min. : 0.000 Min. : 0.000 Min. :2007
## 1st Qu.: 0.000 1st Qu.: 0.000 1st Qu.:2007
## Median : 0.000 Median : 1.000 Median :2008
## Mean : 1.343 Mean : 6.216 Mean :2008
## 3rd Qu.: 1.000 3rd Qu.:17.000 3rd Qu.:2009
## Max. :78.000 Max. :31.000 Max. :2009
df$year <- year(df$datetime)
df$month <-month(df$datetime)
df$week <-week(df$datetime)
df$day<- day(df$datetime)
df$hour<-hour(df$datetime)
df$minute<-minute(df$datetime)
df$weekday <- weekdays(df$datetime)
df$weekday <- factor(df$weekday, levels = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"))
df$season <- df %>%
mutate(season = case_when(
month %in% 9:11 ~ "Fall",
month %in% c(12, 1, 2) ~ "Winter",
month %in% 3:5 ~ "Spring",
TRUE ~ "Summer"))
# Now, let's create a sample dataframe with 1000 rows from the larger dataframe
set.seed(123) # set seed for reproducibility
sample_size <- 20000
sample_indices <- sample(1:nrow(df), sample_size, replace = FALSE)
df_sample <- df[sample_indices, ]
head(df_sample)
## datetime id Date Time Global_active_power
## 1237698 2009-05-12 09:13:00 189194 2009-05-12 09:13:00 1.374
## 134118 2007-04-04 03:22:00 134123 2007-04-04 03:22:00 0.318
## 1172718 2009-03-28 06:10:00 124211 2009-03-28 06:10:00 0.354
## 685405 2008-04-23 17:00:00 163741 2008-04-23 17:00:00 0.426
## 1275074 2009-06-07 08:12:00 226573 2009-06-07 08:12:00 0.222
## 1413965 2009-09-14 17:04:00 369665 2009-09-14 17:04:00 0.368
## Global_reactive_power Global_intensity Voltage Sub_metering_1
## 1237698 0.000 5.6 239.61 0
## 134118 0.110 1.4 241.87 0
## 1172718 0.100 1.6 243.52 0
## 685405 0.048 2.0 243.82 0
## 1275074 0.000 0.8 240.12 0
## 1413965 0.256 1.8 243.51 0
## Sub_metering_2 Sub_metering_3 year month week day hour minute weekday
## 1237698 0 18 2009 5 19 12 9 13 Tuesday
## 134118 0 0 2007 4 14 4 3 22 Wednesday
## 1172718 1 0 2009 3 13 28 6 10 Saturday
## 685405 0 1 2008 4 17 23 17 0 Wednesday
## 1275074 0 0 2009 6 23 7 8 12 Sunday
## 1413965 0 0 2009 9 37 14 17 4 Monday
## season.datetime season.id season.Date season.Time
## 1237698 2009-05-12 09:13:00 189194 2009-05-12 09:13:00
## 134118 2007-04-04 03:22:00 134123 2007-04-04 03:22:00
## 1172718 2009-03-28 06:10:00 124211 2009-03-28 06:10:00
## 685405 2008-04-23 17:00:00 163741 2008-04-23 17:00:00
## 1275074 2009-06-07 08:12:00 226573 2009-06-07 08:12:00
## 1413965 2009-09-14 17:04:00 369665 2009-09-14 17:04:00
## season.Global_active_power season.Global_reactive_power
## 1237698 1.374 0.000
## 134118 0.318 0.110
## 1172718 0.354 0.100
## 685405 0.426 0.048
## 1275074 0.222 0.000
## 1413965 0.368 0.256
## season.Global_intensity season.Voltage season.Sub_metering_1
## 1237698 5.6 239.61 0
## 134118 1.4 241.87 0
## 1172718 1.6 243.52 0
## 685405 2.0 243.82 0
## 1275074 0.8 240.12 0
## 1413965 1.8 243.51 0
## season.Sub_metering_2 season.Sub_metering_3 season.year season.month
## 1237698 0 18 2009 5
## 134118 0 0 2007 4
## 1172718 1 0 2009 3
## 685405 0 1 2008 4
## 1275074 0 0 2009 6
## 1413965 0 0 2009 9
## season.week season.day season.hour season.minute season.weekday
## 1237698 19 12 9 13 Tuesday
## 134118 14 4 3 22 Wednesday
## 1172718 13 28 6 10 Saturday
## 685405 17 23 17 0 Wednesday
## 1275074 23 7 8 12 Sunday
## 1413965 37 14 17 4 Monday
## season.season
## 1237698 Spring
## 134118 Spring
## 1172718 Spring
## 685405 Spring
## 1275074 Summer
## 1413965 Fall
write.csv(df_sample, file = "df_sample.csv", row.names = FALSE)
by_year <- df %>% group_by(year)
by_year %>% summarise(
Sub_metering_1 = mean(Sub_metering_1),
Sub_metering_2 = mean(Sub_metering_2),
Sub_metering_3 = mean(Sub_metering_3)
)
## # A tibble: 3 × 4
## year Sub_metering_1 Sub_metering_2 Sub_metering_3
## <dbl> <dbl> <dbl> <dbl>
## 1 2007 1.23 1.64 5.80
## 2 2008 1.11 1.26 6.03
## 3 2009 1.14 1.14 6.82
by_month <- df %>% group_by(month)
by_month %>% summarise(
Sub_metering_1 = mean(Sub_metering_1),
Sub_metering_2 = mean(Sub_metering_2),
Sub_metering_3 = mean(Sub_metering_3)
)
## # A tibble: 12 × 4
## month Sub_metering_1 Sub_metering_2 Sub_metering_3
## <dbl> <dbl> <dbl> <dbl>
## 1 1 1.44 1.61 7.25
## 2 2 1.10 1.41 6.72
## 3 3 1.41 1.74 6.71
## 4 4 1.14 1.29 6.30
## 5 5 1.29 1.35 6.21
## 6 6 1.27 1.29 5.85
## 7 7 0.812 1.07 4.26
## 8 8 0.557 0.828 3.74
## 9 9 1.21 1.28 6.23
## 10 10 1.06 1.53 6.42
## 11 11 1.31 1.41 7.07
## 12 12 1.32 1.31 7.87
by_week<- df %>% group_by(week)
by_week %>% summarise(
Sub_metering_1 = mean(Sub_metering_1),
Sub_metering_2 = mean(Sub_metering_2),
Sub_metering_3 = mean(Sub_metering_3)
)
## # A tibble: 53 × 4
## week Sub_metering_1 Sub_metering_2 Sub_metering_3
## <dbl> <dbl> <dbl> <dbl>
## 1 1 1.03 1.73 6.15
## 2 2 1.74 1.25 7.50
## 3 3 1.67 1.81 7.91
## 4 4 1.44 1.61 7.17
## 5 5 1.33 1.59 8.25
## 6 6 1.47 1.66 7.69
## 7 7 1.08 1.25 7.02
## 8 8 0.990 1.53 5.94
## 9 9 0.658 1.19 3.75
## 10 10 1.64 2.08 6.51
## # ℹ 43 more rows
by_day<- df %>% group_by(day)
by_day %>% summarise(
Sub_metering_1 = mean(Sub_metering_1),
Sub_metering_2 = mean(Sub_metering_2),
Sub_metering_3 = mean(Sub_metering_3)
)
## # A tibble: 31 × 4
## day Sub_metering_1 Sub_metering_2 Sub_metering_3
## <int> <dbl> <dbl> <dbl>
## 1 1 1.27 1.00 5.93
## 2 2 0.844 1.27 5.90
## 3 3 1.14 1.14 5.47
## 4 4 1.10 1.98 6.25
## 5 5 1.27 1.60 6.26
## 6 6 1.08 1.29 6.02
## 7 7 1.22 1.47 6.45
## 8 8 1.43 1.49 6.90
## 9 9 1.08 1.17 6.21
## 10 10 1.28 1.22 6.43
## # ℹ 21 more rows
by_hour<- df %>% group_by(day)
by_hour %>% summarise(
Sub_metering_1 = mean(Sub_metering_1),
Sub_metering_2 = mean(Sub_metering_2),
Sub_metering_3 = mean(Sub_metering_3)
)
## # A tibble: 31 × 4
## day Sub_metering_1 Sub_metering_2 Sub_metering_3
## <int> <dbl> <dbl> <dbl>
## 1 1 1.27 1.00 5.93
## 2 2 0.844 1.27 5.90
## 3 3 1.14 1.14 5.47
## 4 4 1.10 1.98 6.25
## 5 5 1.27 1.60 6.26
## 6 6 1.08 1.29 6.02
## 7 7 1.22 1.47 6.45
## 8 8 1.43 1.49 6.90
## 9 9 1.08 1.17 6.21
## 10 10 1.28 1.22 6.43
## # ℹ 21 more rows
by_minute<- df %>% group_by(day)
by_minute %>% summarise(
Sub_metering_1 = mean(Sub_metering_1),
Sub_metering_2 = mean(Sub_metering_2),
Sub_metering_3 = mean(Sub_metering_3)
)
## # A tibble: 31 × 4
## day Sub_metering_1 Sub_metering_2 Sub_metering_3
## <int> <dbl> <dbl> <dbl>
## 1 1 1.27 1.00 5.93
## 2 2 0.844 1.27 5.90
## 3 3 1.14 1.14 5.47
## 4 4 1.10 1.98 6.25
## 5 5 1.27 1.60 6.26
## 6 6 1.08 1.29 6.02
## 7 7 1.22 1.47 6.45
## 8 8 1.43 1.49 6.90
## 9 9 1.08 1.17 6.21
## 10 10 1.28 1.22 6.43
## # ℹ 21 more rows
#LEGEND: #SUB_METER 1-> KITCHEN #SUB_METER_2-> LAUNDRY ROOM #SUB_METER_3-> WATER HEATER & AC
#1. SUM ALL 3 SUBMETERS
df$total_sub_meter<-df$Sub_metering_1 + df$Sub_metering_2 + df$Sub_metering_3
#2.WEEKLY SUB_METER_MEANS
#2.1 WEEKLY/WEEKDAY/MONTHLY/HOURLY MEAN SUB_METER_1 KITCHEN for 2007 & 2008 & 2009
mean_per_week_kitchen <- df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, week) %>%
summarise(mean = mean(Sub_metering_1), .groups = "drop") %>%
arrange(year, week)
mean_per_weekday_kitchen <- df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, weekday) %>%
summarise(mean = mean(Sub_metering_1), .groups = "drop") %>%
arrange(year, weekday)
mean_per_month_kitchen <- df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, month, week, day, weekday) %>%
summarise(mean = mean(Sub_metering_1), .groups = "drop") %>%
arrange(year, month, week, day, weekday)
mean_per_hour_kitchen <- df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, hour) %>%
summarise(mean = mean(Sub_metering_1), .groups = "drop") %>%
arrange(year, hour)
mean_per_week_laundry <- df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, week) %>%
summarise(mean = mean(Sub_metering_2), .groups = "drop") %>%
arrange(year)
mean_per_weekday_laundry<- df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, weekday) %>%
summarise(mean = mean(Sub_metering_2), .groups = "drop") %>%
arrange(year, weekday)
mean_per_month_laundry <- df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, month, week, day, weekday) %>%
summarise(mean = mean(Sub_metering_2), .groups = "drop") %>%
arrange(year, month, week, day, weekday)
mean_per_hour_laundry <- df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, hour) %>%
summarise(mean = mean(Sub_metering_2), .groups = "drop") %>%
arrange(year, hour)
mean_per_week_water <- df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, week) %>%
summarise(mean = mean(Sub_metering_3), .groups = "drop") %>%
arrange(year, week)
mean_per_weekday_water <- df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, weekday) %>%
summarise(mean = mean(Sub_metering_3), .groups = "drop") %>%
arrange(year, weekday)
mean_per_month_water <- df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, month, week, day, weekday) %>%
summarise(mean = mean(Sub_metering_3), .groups = "drop") %>%
arrange(year, month, week, day, weekday)
mean_per_water <- df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, weekday, hour) %>%
summarise(mean = mean(Sub_metering_3), .groups = "drop") %>%
arrange(year, weekday, hour)
mean_per_hour_water <- df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, hour) %>%
summarise(mean = mean(Sub_metering_3), .groups = "drop") %>%
arrange(year, hour)
mean_per_week_total <-df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, week) %>%
summarise(mean = mean(total_sub_meter), .groups = "drop") %>%
arrange(year, week)
mean_per_weekday_total <- df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, weekday) %>%
summarise(mean = mean(Sub_metering_1+Sub_metering_2+Sub_metering_3), .groups = "drop") %>%
arrange(year, weekday)
mean_per_month_total <-df %>%
filter(year %in% c(2007, 2008, 2009)) %>%
group_by(year, month, week, day, weekday) %>%
summarise(mean = mean(total_sub_meter), .groups = "drop") %>%
arrange(year, month, week, day, weekday)
mean_2007 <- df %>%
filter(year %in% c(2007)) %>%
group_by(weekday) %>%
summarise('2007' = mean(total_sub_meter), .groups = "drop") %>%
arrange(weekday)
mean_2008 <- df %>%
filter(year %in% c(2008)) %>%
group_by(weekday) %>%
summarise('2008' = mean(total_sub_meter), .groups = "drop") %>%
arrange(weekday)
mean_2009 <- df %>%
filter(year %in% c(2009)) %>%
group_by(year, weekday) %>%
summarise('2009' = mean(total_sub_meter), .groups = "drop") %>%
arrange(year, weekday)
#3. PER SEASON and SUB-METER AND WEEKDAY
#3.1. SUB-METER 1 - KITCHEN
season_1 <-df %>%
group_by(year, month, weekday) %>%
summarise(mean = mean(Sub_metering_1)) %>%
arrange(year, month)
## `summarise()` has grouped output by 'year', 'month'. You can override using the
## `.groups` argument.
season_1<- season_1 %>%
mutate(season = case_when(
month %in% 9:11 ~ "Fall",
month %in% c(12, 1, 2) ~ "Winter",
month %in% 3:5 ~ "Spring",
TRUE ~ "Summer"))
season_1 <- subset(season_1, select = - c(month))
season_1
## # A tibble: 252 × 4
## # Groups: year [3]
## year weekday mean season
## <dbl> <fct> <dbl> <chr>
## 1 2007 Monday 0.455 Winter
## 2 2007 Tuesday 0.677 Winter
## 3 2007 Wednesday 1.07 Winter
## 4 2007 Thursday 0.623 Winter
## 5 2007 Friday 0.824 Winter
## 6 2007 Saturday 2.54 Winter
## 7 2007 Sunday 3.05 Winter
## 8 2007 Monday 0.826 Winter
## 9 2007 Tuesday 0.563 Winter
## 10 2007 Wednesday 1.76 Winter
## # ℹ 242 more rows
season_1_1 <-season_1 %>%
filter(season %in% c("Winter")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean)) %>%
arrange(year, weekday)
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
season_1_2 <-season_1 %>%
filter(season %in% c("Spring")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean)) %>%
arrange(year, weekday)
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
season_1_3 <-season_1 %>%
filter(season %in% c("Summer")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean)) %>%
arrange(year, weekday)
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
season_1_4 <-season_1 %>%
filter(season %in% c("Fall")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean)) %>%
arrange(year, weekday)
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
season_kitchen<-season_1_1 %>%
left_join(season_1_2, by = c("year","weekday")) %>%
left_join(season_1_3, by = c("year","weekday")) %>%
left_join(season_1_4, by = c("year","weekday"))
seasonal_kitchen <- season_kitchen %>%
rename(mean_per_kitchen_winter = mean.x,
mean_per_kitchen_spring = mean.y,
mean_per_kitchen_summer = mean.x.x,
mean_per_kitchen_fall = mean.y.y)
#3.1. SUB-METER 2 - LAUNDRY
season_2 <-df %>%
group_by(year, month, weekday) %>%
summarise(mean = mean(Sub_metering_2)) %>%
arrange(year, month)
## `summarise()` has grouped output by 'year', 'month'. You can override using the
## `.groups` argument.
season_2<- season_2 %>%
mutate(season = case_when(
month %in% 9:11 ~ "Fall",
month %in% c(12, 1, 2) ~ "Winter",
month %in% 3:5 ~ "Spring",
TRUE ~ "Summer"))
season_2 <- subset(season_2, select = - c(month))
season_2
## # A tibble: 252 × 4
## # Groups: year [3]
## year weekday mean season
## <dbl> <fct> <dbl> <chr>
## 1 2007 Monday 0.880 Winter
## 2 2007 Tuesday 1.08 Winter
## 3 2007 Wednesday 2.63 Winter
## 4 2007 Thursday 1.80 Winter
## 5 2007 Friday 0.245 Winter
## 6 2007 Saturday 2.17 Winter
## 7 2007 Sunday 3.81 Winter
## 8 2007 Monday 1.08 Winter
## 9 2007 Tuesday 1.52 Winter
## 10 2007 Wednesday 1.73 Winter
## # ℹ 242 more rows
season_2_1 <-season_2 %>%
filter(season %in% c("Winter")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean)) %>%
arrange(year, weekday)
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
season_2_2 <-season_2 %>%
filter(season %in% c("Spring")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean)) %>%
arrange(year, weekday)
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
season_2_3 <-season_2 %>%
filter(season %in% c("Summer")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean)) %>%
arrange(year, weekday)
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
season_2_4 <-season_2 %>%
filter(season %in% c("Fall")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean)) %>%
arrange(year, weekday)
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
season_laundry<-season_2_1 %>%
left_join(season_2_2, by = c("year","weekday")) %>%
left_join(season_2_3, by = c("year","weekday")) %>%
left_join(season_2_4, by = c("year","weekday"))
seasonal_laundry <- season_laundry %>%
rename(mean_per_laundry_winter = mean.x,
mean_per_laundry_spring = mean.y,
mean_per_laundry_summer = mean.x.x,
mean_per_laundry_fall = mean.y.y)
#3.3. SUB-METER 3 - WATER
season_3 <-df %>%
group_by(year, month, weekday) %>%
summarise(mean = mean(Sub_metering_3)) %>%
arrange(year, month)
## `summarise()` has grouped output by 'year', 'month'. You can override using the
## `.groups` argument.
season_3<- season_3 %>%
mutate(season = case_when(
month %in% 9:11 ~ "Fall",
month %in% c(12, 1, 2) ~ "Winter",
month %in% 3:5 ~ "Spring",
TRUE ~ "Summer"))
season_3 <- subset(season_3, select = - c(month))
season_3
## # A tibble: 252 × 4
## # Groups: year [3]
## year weekday mean season
## <dbl> <fct> <dbl> <chr>
## 1 2007 Monday 9.06 Winter
## 2 2007 Tuesday 5.02 Winter
## 3 2007 Wednesday 8.14 Winter
## 4 2007 Thursday 8.34 Winter
## 5 2007 Friday 5.43 Winter
## 6 2007 Saturday 7.09 Winter
## 7 2007 Sunday 8.58 Winter
## 8 2007 Monday 6.13 Winter
## 9 2007 Tuesday 3.89 Winter
## 10 2007 Wednesday 6.44 Winter
## # ℹ 242 more rows
season_3_1 <-season_3 %>%
filter(season %in% c("Winter")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean)) %>%
arrange(year, weekday)
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
season_3_2 <-season_3 %>%
filter(season %in% c("Spring")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean)) %>%
arrange(year, weekday)
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
season_3_3 <-season_3 %>%
filter(season %in% c("Summer")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean)) %>%
arrange(year, weekday)
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
season_3_4 <-season_3 %>%
filter(season %in% c("Fall")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean)) %>%
arrange(year, weekday)
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
season_water<-season_3_1 %>%
left_join(season_3_2, by = c("year","weekday")) %>%
left_join(season_3_3, by = c("year","weekday")) %>%
left_join(season_3_4, by = c("year","weekday"))
seasonal_water <- season_water %>%
rename(mean_per_water_winter = mean.x,
mean_per_water_spring = mean.y,
mean_per_water_summer = mean.x.x,
mean_per_water_fall = mean.y.y)
#3. PER HOUR and SUB-METER AND WEEKDAY
#3.1. SUB-METER 1 - KITCHEN
hour_kitchen <-df %>%
group_by(year, weekday, hour) %>%
summarise(mean = mean(Sub_metering_1), .groups = "drop") %>%
arrange(year, weekday, hour)
hour_kitchen<- hour_kitchen %>%
mutate(timezone = case_when(
hour %in% 6:13 ~ "Morning",
hour %in% 13:15 ~ "MidDay",
hour %in% 15:19 ~ "Afternoon",
hour %in% 19:23 ~ "Noon",
hour %in% c(23, 0, 1, 2, 3, 4, 5, 6) ~ "Night"))
hour_morning_kitchen <-hour_kitchen %>%
filter(timezone%in% c("Morning")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_midday_kitchen <-hour_kitchen %>%
filter(timezone %in% c("MidDay")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_afternoon_kitchen <-hour_kitchen %>%
filter(timezone %in% c("Afternoon")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_noon_kitchen <-hour_kitchen %>%
filter(timezone %in% c("Noon")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_night_kitchen <-hour_kitchen %>%
filter(timezone %in% c("Night")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_kitchen<-hour_morning_kitchen %>%
left_join(hour_midday_kitchen, by = c("year","weekday")) %>%
left_join(hour_afternoon_kitchen, by = c("year","weekday")) %>%
left_join(hour_noon_kitchen, by = c("year","weekday")) %>%
left_join(hour_night_kitchen, by = c("year","weekday"))
hour_kitchen <- hour_kitchen %>%
rename(morning_kitchen = mean.x,
midday_kitchen = mean.y,
afternoon_kitchen = mean.x.x,
noon_kitchen = mean.y.y,
night_kitchen = mean)
#3.2. SUB-METER 2 - LAUNDRY
hour_laundry <-df %>%
group_by(year, weekday, hour) %>%
summarise(mean = mean(Sub_metering_2), .groups = "drop") %>%
arrange(year, weekday, hour)
hour_laundry<- hour_laundry %>%
mutate(timezone = case_when(
hour %in% 6:13 ~ "Morning",
hour %in% 13:15 ~ "MidDay",
hour %in% 15:19 ~ "Afternoon",
hour %in% 19:23 ~ "Noon",
hour %in% c(23, 0, 1, 2, 3, 4, 5, 6) ~ "Night"))
hour_morning_laundry <-hour_laundry %>%
filter(timezone%in% c("Morning")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_midday_laundry <-hour_laundry %>%
filter(timezone %in% c("MidDay")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_afternoon_laundry <-hour_laundry %>%
filter(timezone %in% c("Afternoon")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_noon_laundry <-hour_laundry %>%
filter(timezone %in% c("Noon")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_night_laundry <-hour_laundry %>%
filter(timezone %in% c("Night")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_laundry<-hour_morning_laundry %>%
left_join(hour_midday_laundry, by = c("year","weekday")) %>%
left_join(hour_afternoon_laundry, by = c("year","weekday")) %>%
left_join(hour_noon_laundry, by = c("year","weekday")) %>%
left_join(hour_night_laundry, by = c("year","weekday"))
hour_laundry <- hour_laundry %>%
rename(morning_laundry = mean.x,
midday_laundry = mean.y,
afternoon_laundry = mean.x.x,
noon_laundry = mean.y.y,
night_laundry = mean)
#3.3. SUB-METER 3 - WATER
hour_water <-df %>%
group_by(year, weekday, hour) %>%
summarise(mean = mean(Sub_metering_3), .groups = "drop") %>%
arrange(year, weekday, hour)
hour_water<- hour_water %>%
mutate(timezone = case_when(
hour %in% 6:13 ~ "Morning",
hour %in% 13:15 ~ "MidDay",
hour %in% 15:19 ~ "Afternoon",
hour %in% 19:23 ~ "Noon",
hour %in% c(23, 0, 1, 2, 3, 4, 5, 6) ~ "Night"))
hour_morning_water <-hour_water %>%
filter(timezone%in% c("Morning")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_midday_water <-hour_water %>%
filter(timezone %in% c("MidDay")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_afternoon_water <-hour_water %>%
filter(timezone %in% c("Afternoon")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_noon_water <-hour_water %>%
filter(timezone %in% c("Noon")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_night_water <-hour_water %>%
filter(timezone %in% c("Night")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_water<-hour_morning_water %>%
left_join(hour_midday_water, by = c("year","weekday")) %>%
left_join(hour_afternoon_water, by = c("year","weekday")) %>%
left_join(hour_noon_water, by = c("year","weekday")) %>%
left_join(hour_night_water, by = c("year","weekday"))
hour_water <- hour_water %>%
rename(morning_water = mean.x,
midday_water = mean.y,
afternoon_water = mean.x.x,
noon_water = mean.y.y,
night_water = mean)
#3.1. SUB-METER 1 - KITCHEN
hour_week_kitchen <-df %>%
group_by(year, week, hour) %>%
summarise(mean = mean(Sub_metering_1), .groups = "drop") %>%
arrange(year, week, hour)
hour_week_kitchen<- hour_week_kitchen %>%
mutate(timezone = case_when(
hour %in% 6:13 ~ "Morning",
hour %in% 13:15 ~ "MidDay",
hour %in% 15:19 ~ "Afternoon",
hour %in% 19:23 ~ "Noon",
hour %in% c(23, 0, 1, 2, 3, 4, 5, 6) ~ "Night"))
hour_morning_week_kitchen <-hour_week_kitchen %>%
filter(timezone%in% c("Morning")) %>%
group_by(year, week) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, week)
hour_midday_week_kitchen <-hour_week_kitchen %>%
filter(timezone %in% c("MidDay")) %>%
group_by(year, week) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, week)
hour_afternoon_week_kitchen <-hour_week_kitchen %>%
filter(timezone %in% c("Afternoon")) %>%
group_by(year, week) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, week)
hour_noon_week_kitchen <-hour_week_kitchen %>%
filter(timezone %in% c("Noon")) %>%
group_by(year, week) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, week)
hour_night_week_kitchen<-kitchen <-hour_week_kitchen %>%
filter(timezone %in% c("Night")) %>%
group_by(year, week) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, week)
hour_week_kitchen<-hour_morning_week_kitchen %>%
left_join(hour_midday_week_kitchen, by = c("year","week")) %>%
left_join(hour_afternoon_week_kitchen, by = c("year","week")) %>%
left_join(hour_noon_week_kitchen, by = c("year","week")) %>%
left_join(hour_night_week_kitchen, by = c("year","week"))
hour_week_kitchen <- hour_week_kitchen %>%
rename(morning_week_kitchen = mean.x,
midday_weeK_kitchen = mean.y,
afternoon_week_kitchen = mean.x.x,
noon_week_kitchen = mean.y.y,
night_week_kitchen = mean)
#3.2. SUB-METER 2 - LAUNDRY
hour_laundry <-df %>%
group_by(year, weekday, hour) %>%
summarise(mean = mean(Sub_metering_2), .groups = "drop") %>%
arrange(year, weekday, hour)
hour_laundry<- hour_laundry %>%
mutate(timezone = case_when(
hour %in% 6:13 ~ "Morning",
hour %in% 13:15 ~ "MidDay",
hour %in% 15:19 ~ "Afternoon",
hour %in% 19:23 ~ "Noon",
hour %in% c(23, 0, 1, 2, 3, 4, 5, 6) ~ "Night"))
hour_morning_laundry <-hour_laundry %>%
filter(timezone%in% c("Morning")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_midday_laundry <-hour_laundry %>%
filter(timezone %in% c("MidDay")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_afternoon_laundry <-hour_laundry %>%
filter(timezone %in% c("Afternoon")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_noon_laundry <-hour_laundry %>%
filter(timezone %in% c("Noon")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_night_laundry <-hour_laundry %>%
filter(timezone %in% c("Night")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_laundry<-hour_morning_laundry %>%
left_join(hour_midday_laundry, by = c("year","weekday")) %>%
left_join(hour_afternoon_laundry, by = c("year","weekday")) %>%
left_join(hour_noon_laundry, by = c("year","weekday")) %>%
left_join(hour_night_laundry, by = c("year","weekday"))
hour_laundry <- hour_laundry %>%
rename(morning_laundry = mean.x,
midday_laundry = mean.y,
afternoon_laundry = mean.x.x,
noon_laundry = mean.y.y,
night_laundry = mean)
#3.3. SUB-METER 3 - WATER
hour_water <-df %>%
group_by(year, weekday, hour) %>%
summarise(mean = mean(Sub_metering_3), .groups = "drop") %>%
arrange(year, weekday, hour)
hour_water<- hour_water %>%
mutate(timezone = case_when(
hour %in% 6:13 ~ "Morning",
hour %in% 13:15 ~ "MidDay",
hour %in% 15:19 ~ "Afternoon",
hour %in% 19:23 ~ "Noon",
hour %in% c(23, 0, 1, 2, 3, 4, 5, 6) ~ "Night"))
hour_morning_water <-hour_water %>%
filter(timezone%in% c("Morning")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_midday_water <-hour_water %>%
filter(timezone %in% c("MidDay")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_afternoon_water <-hour_water %>%
filter(timezone %in% c("Afternoon")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_noon_water <-hour_water %>%
filter(timezone %in% c("Noon")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_night_water <-hour_water %>%
filter(timezone %in% c("Night")) %>%
group_by(year, weekday) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, weekday)
hour_water<-hour_morning_water %>%
left_join(hour_midday_water, by = c("year","weekday")) %>%
left_join(hour_afternoon_water, by = c("year","weekday")) %>%
left_join(hour_noon_water, by = c("year","weekday")) %>%
left_join(hour_night_water, by = c("year","weekday"))
hour_water <- hour_water %>%
rename(morning_water = mean.x,
midday_water = mean.y,
afternoon_water = mean.x.x,
noon_water = mean.y.y,
night_water = mean)
#4.1. SUB-METER 1 - KITCHEN BY HOUR AND MONTH
month_kitchen <-df %>%
group_by(year, month, hour) %>%
summarise(mean = mean(Sub_metering_1), .groups = "drop") %>%
arrange(year, month, hour)
month_kitchen<- month_kitchen %>%
mutate(timezone = case_when(
hour %in% 6:13 ~ "Morning",
hour %in% 13:19 ~ "Afternoon",
hour %in% 19:23 ~ "Dusk",
hour %in% c(23, 0, 1, 2, 3, 4, 5, 6) ~ "Night"))
month_kitchen<- month_kitchen %>%
mutate(season = case_when(
month %in% 9:11 ~ "Fall",
month %in% c(12, 1, 2) ~ "Winter",
month %in% 3:5 ~ "Spring",
TRUE ~ "Summer"))
month_morning_kitchen <-month_kitchen %>%
filter(timezone %in% c("Morning")) %>%
group_by(year, month) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, month)
month_afternoon_kitchen <-month_kitchen %>%
filter(timezone %in% c("Afternoon")) %>%
group_by(year, month) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, month)
month_noon_kitchen <-month_kitchen %>%
filter(timezone %in% c("Afternoon")) %>%
group_by(year, month) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, month)
month_dusk_kitchen <-month_kitchen %>%
filter(timezone %in% c("Dusk")) %>%
group_by(year, month) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, month)
month_night_kitchen <-month_kitchen %>%
filter(timezone %in% c("Night")) %>%
group_by(year, month) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, month)
month_kitchen<-month_morning_kitchen %>%
left_join(month_afternoon_kitchen, by = c("year","month")) %>%
left_join(month_dusk_kitchen, by = c("year","month")) %>%
left_join(month_night_kitchen, by = c("year","month"))
month_kitchen <- month_kitchen %>%
rename(morning_kitchen = mean.x,
afternoon_kitchen = mean.y,
dusk_kitchen = mean.x.x,
night_kitchen = mean.y.y)
#3.4. SUB-METER 2 - LAUNDRY
month_laundry <-df %>%
group_by(year, month, hour) %>%
summarise(mean = mean(Sub_metering_1), .groups = "drop") %>%
arrange(year, month, hour)
month_laundry<- month_laundry %>%
mutate(timezone = case_when(
hour %in% 6:13 ~ "Morning",
hour %in% 13:19 ~ "Afternoon",
hour %in% 19:23 ~ "Dusk",
hour %in% c(23, 0, 1, 2, 3, 4, 5, 6) ~ "Night"))
month_laundry<- month_laundry %>%
mutate(season = case_when(
month %in% 9:11 ~ "Fall",
month %in% c(12, 1, 2) ~ "Winter",
month %in% 3:5 ~ "Spring",
TRUE ~ "Summer"))
month_morning_laundry <-month_laundry %>%
filter(timezone %in% c("Morning")) %>%
group_by(year, month) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, month)
month_afternoon_laundry <-month_laundry %>%
filter(timezone %in% c("Afternoon")) %>%
group_by(year, month) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, month)
month_dusk_laundry <-month_laundry %>%
filter(timezone %in% c("Dusk")) %>%
group_by(year, month) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, month)
month_night_laundry <-month_laundry %>%
filter(timezone %in% c("Night")) %>%
group_by(year, month) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, month)
month_laundry<-month_morning_laundry %>%
left_join(month_afternoon_laundry, by = c("year","month")) %>%
left_join(month_dusk_laundry, by = c("year","month")) %>%
left_join(month_night_laundry, by = c("year","month"))
month_laundry <- month_laundry %>%
rename(morning_laundry = mean.x,
afternoon_laundry = mean.y,
dusk_laundry = mean.x.x,
night_laundry = mean.y.y)
#3.3. SUB-METER 3 - WATER
month_water <-df %>%
group_by(year, month, hour) %>%
summarise(mean = mean(Sub_metering_1), .groups = "drop") %>%
arrange(year, month, hour)
month_water<- month_water %>%
mutate(timezone = case_when(
hour %in% 6:13 ~ "Morning",
hour %in% 13:19 ~ "Afternoon",
hour %in% 19:23 ~ "Dusk",
hour %in% c(23, 0, 1, 2, 3, 4, 5, 6) ~ "Night"))
month_water<- month_water %>%
mutate(season = case_when(
month %in% 9:11 ~ "Fall",
month %in% c(12, 1, 2) ~ "Winter",
month %in% 3:5 ~ "Spring",
TRUE ~ "Summer"))
month_morning_water <-month_water %>%
filter(timezone %in% c("Morning")) %>%
group_by(year, month) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, month)
month_afternoon_water <-month_water %>%
filter(timezone %in% c("Afternoon")) %>%
group_by(year, month) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, month)
month_dusk_water <-month_water %>%
filter(timezone %in% c("Dusk")) %>%
group_by(year, month) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, month)
month_night_water <-month_water %>%
filter(timezone %in% c("Night")) %>%
group_by(year, month) %>%
summarise(mean = mean(mean), .groups = "drop") %>%
arrange(year, month)
month_water<-month_morning_water %>%
left_join(month_afternoon_water, by = c("year","month")) %>%
left_join(month_dusk_water, by = c("year","month")) %>%
left_join(month_night_water, by = c("year","month"))
month_water <- month_water %>%
rename(morning_water = mean.x,
afternoon_water = mean.y,
dusk_water = mean.x.x,
night_water = mean.y.y)
total <- mean_per_week_water %>%
left_join(mean_per_week_kitchen, by = c("year", "week")) %>%
left_join(mean_per_week_laundry, by = c("year", "week")) %>%
left_join(mean_per_week_total, by = c("year", "week"))
total <- total %>%
rename(mean_per_week_water = mean.x,
mean_per_week_kitchen = mean.y,
mean_per_week_laundry = mean.x.x,
mean_per_week_total = mean.y.y)
total
## # A tibble: 159 × 6
## year week mean_per_week_water mean_per_week_kitchen mean_per_week_laundry
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2007 1 5.38 0.581 1.75
## 2 2007 2 8.27 1.33 1.88
## 3 2007 3 8.09 2.32 1.99
## 4 2007 4 7.15 1.14 1.48
## 5 2007 5 9.45 1.15 1.74
## 6 2007 6 7.12 1.50 1.74
## 7 2007 7 7.18 1.44 2.03
## 8 2007 8 6.21 0.856 1.56
## 9 2007 9 2.45 0.146 1.06
## 10 2007 10 6.55 1.89 3.09
## # ℹ 149 more rows
## # ℹ 1 more variable: mean_per_week_total <dbl>
total_month<-mean_per_month_water %>%
left_join(mean_per_month_kitchen, by = c("year", "month")) %>%
left_join(mean_per_month_laundry, by = c("year", "month")) %>%
left_join(mean_per_month_total, by = c("year", "month"))
## Warning in left_join(., mean_per_month_kitchen, by = c("year", "month")): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 1 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
## "many-to-many"` to silence this warning.
## Warning in left_join(., mean_per_month_laundry, by = c("year", "month")): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 1 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
## "many-to-many"` to silence this warning.
## Warning in left_join(., mean_per_month_total, by = c("year", "month")): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 1 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
## "many-to-many"` to silence this warning.
total_month <- total_month %>%
rename(mean_per_month_water = mean.x,
mean_per_month_kitchen = mean.y,
mean_per_month_laundry = mean.x.x,
mean_per_month_total= mean.y.y)
total_month <- mean_per_month_water %>%
left_join(mean_per_month_kitchen, by = c("year", "month", "week", "day", "weekday")) %>%
left_join(mean_per_month_laundry, by = c("year", "month", "week", "day", "weekday")) %>%
left_join(mean_per_month_total, by = c("year", "month", "week", "day", "weekday"))
total_month <- total_month %>%
rename(mean_per_month_water = mean.x,
mean_per_month_kitchen = mean.y,
mean_per_month_laundry = mean.x.x,
mean_per_month_total= mean.y.y)
total_month
## # A tibble: 1,094 × 9
## year month week day weekday mean_per_month_water mean_per_month_kitchen
## <dbl> <dbl> <dbl> <int> <fct> <dbl> <dbl>
## 1 2007 1 1 1 Monday 4.08 0
## 2 2007 1 1 2 Tuesday 4.56 0
## 3 2007 1 1 3 Wednesday 3.31 0
## 4 2007 1 1 4 Thursday 7.57 0.730
## 5 2007 1 1 5 Friday 5.28 1.03
## 6 2007 1 1 6 Saturday 3.94 0.928
## 7 2007 1 1 7 Sunday 8.90 1.38
## 8 2007 1 2 8 Monday 12.2 0
## 9 2007 1 2 9 Tuesday 6.80 1.17
## 10 2007 1 2 10 Wednesday 7.65 0.535
## # ℹ 1,084 more rows
## # ℹ 2 more variables: mean_per_month_laundry <dbl>, mean_per_month_total <dbl>
total_year<-mean_2007 %>%
left_join(mean_2008, by = c("weekday")) %>%
left_join(mean_2009, by = c("weekday"))
total_year
## # A tibble: 7 × 5
## weekday `2007` `2008` year `2009`
## <fct> <dbl> <dbl> <dbl> <dbl>
## 1 Monday 8.04 7.04 2009 7.47
## 2 Tuesday 7.63 9.37 2009 8.38
## 3 Wednesday 8.86 8.31 2009 9.82
## 4 Thursday 7.40 6.41 2009 7.86
## 5 Friday 8.30 8.79 2009 8.51
## 6 Saturday 10.2 9.93 2009 11.8
## 7 Sunday 10.3 8.92 2009 9.92
total_season<-season_1 %>%
left_join(season_2, by = c("year","season", "weekday")) %>%
left_join(season_3, by = c("year", "season", "weekday"))
## Warning in left_join(., season_2, by = c("year", "season", "weekday")): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 1 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
## "many-to-many"` to silence this warning.
## Warning in left_join(., season_3, by = c("year", "season", "weekday")): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 1 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
## "many-to-many"` to silence this warning.
total_season <- total_season %>%
rename(mean_per_season_kitchen = mean.x,
mean_per_season_laundry = mean.y,
mean_per_season_water = mean)
total_season
## # A tibble: 2,268 × 6
## # Groups: year [3]
## year weekday mean_per_season_kitchen season mean_per_season_laundry
## <dbl> <fct> <dbl> <chr> <dbl>
## 1 2007 Monday 0.455 Winter 0.880
## 2 2007 Monday 0.455 Winter 0.880
## 3 2007 Monday 0.455 Winter 0.880
## 4 2007 Monday 0.455 Winter 1.08
## 5 2007 Monday 0.455 Winter 1.08
## 6 2007 Monday 0.455 Winter 1.08
## 7 2007 Monday 0.455 Winter 0.916
## 8 2007 Monday 0.455 Winter 0.916
## 9 2007 Monday 0.455 Winter 0.916
## 10 2007 Tuesday 0.677 Winter 1.08
## # ℹ 2,258 more rows
## # ℹ 1 more variable: mean_per_season_water <dbl>
total_hour<-mean_per_hour_kitchen %>%
left_join(mean_per_hour_laundry, by = c("year", "hour")) %>%
left_join(mean_per_hour_water, by = c("year", "hour"))
total_hour <- total_hour %>%
rename(mean_per_hour_kitchen = mean.x,
mean_per_hour_laundry = mean.y,
mean_per_hour_water = mean)
total_hour
## # A tibble: 72 × 5
## year hour mean_per_hour_kitchen mean_per_hour_laundry mean_per_hour_water
## <dbl> <int> <dbl> <dbl> <dbl>
## 1 2007 0 0.522 0.809 3.13
## 2 2007 1 0.223 0.381 2.12
## 3 2007 2 0.125 0.339 1.28
## 4 2007 3 0.0230 0.322 0.720
## 5 2007 4 0 0.316 0.785
## 6 2007 5 0 0.295 1.03
## 7 2007 6 0.0271 0.316 3.35
## 8 2007 7 0.187 0.552 8.83
## 9 2007 8 1.72 1.15 11.4
## 10 2007 9 1.61 1.24 11.2
## # ℹ 62 more rows
total_weekday<-mean_per_weekday_total %>%
left_join(mean_per_weekday_kitchen, by = c("year", "weekday")) %>%
left_join(mean_per_weekday_laundry, by = c("year", "weekday")) %>%
left_join(mean_per_weekday_water, by = c("year", "weekday"))
total_weekday <- total_weekday %>%
rename(mean_per_weekday_total = mean.x,
mean_per_weekday_kitchen = mean.y,
mean_per_weekday_laundry = mean.x.x,
mean_per_weekday_water= mean.y.y)
total_weekday
## # A tibble: 21 × 6
## year weekday mean_per_weekday_total mean_per_weekday_kitchen
## <dbl> <fct> <dbl> <dbl>
## 1 2007 Monday 8.04 0.930
## 2 2007 Tuesday 7.63 0.893
## 3 2007 Wednesday 8.86 1.12
## 4 2007 Thursday 7.40 0.951
## 5 2007 Friday 8.30 0.930
## 6 2007 Saturday 10.2 1.79
## 7 2007 Sunday 10.3 2.05
## 8 2008 Monday 7.04 0.710
## 9 2008 Tuesday 9.37 0.927
## 10 2008 Wednesday 8.31 1.13
## # ℹ 11 more rows
## # ℹ 2 more variables: mean_per_weekday_laundry <dbl>,
## # mean_per_weekday_water <dbl>
weekly_water <- ts(mean_per_week_water$mean, frequency=52.25, start=c(2007,1))
weekly_water<-(weekly_water)
autoplot(weekly_water) + xlab("Week") + ylab("Mean") +
ggtitle("Water Heater & AC")
boxplot(mean_per_week_water ~ week,
data=total,
main="Water & AC",
ylab="Mean", xlab="week")
class(total)
## [1] "tbl_df" "tbl" "data.frame"
weekday_water<-ts(mean_per_weekday_kitchen$mean, frequency=52.25, start=c(2007, 1))
plot(weekday_water)
autoplot(weekday_water) + xlab("Day of the Week") + ylab("Mean") +
ggtitle("Water Heater & AC")
weekly_kitchen <- ts(mean_per_week_kitchen$mean, frequency=52.25, start=c(2007,1))
plot(weekly_kitchen)
autoplot(weekly_kitchen) + xlab("Week") + ylab("Mean") +
ggtitle("Kitchen")
weekly_laundry<- ts(mean_per_week_laundry$mean, frequency=52.25, start=c(2007,1))
plot(weekly_laundry)
autoplot(weekly_laundry) + xlab("Week") + ylab("Mean") +
ggtitle("Laundry Room")
weekly_total<- ts(mean_per_week_total$mean, frequency=52.25, start=c(2007,1))
plot(weekly_total)
by_year<-plot_ly(total_year, x = ~total_year$weekday, y = ~total_year$"2007",
name = '2007', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~total_year$"2008" , name = '2008', mode = 'lines') %>%
add_trace(y = ~total_year$"2009", name = '2009', mode = 'lines') %>%
layout(title = "Total Power Consumption by Year",
xaxis = list(title = "Weekday"),
yaxis = list (title = "Power (watt-hours)"))
by_year
weekdayy_2007 <- filter(total_weekday, (year == 2007))
weekday_2007<-plot_ly(weekdayy_2007, x = ~weekdayy_2007$weekday, y = ~weekdayy_2007$"mean_per_weekday_water",
name = 'mean_per_weekday_water ', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~weekdayy_2007$"mean_per_weekday_kitchen" , name = 'mean_per_weekday_kitchen ', mode = 'lines') %>%
add_trace(y = ~weekdayy_2007$"mean_per_weekday_laundry", name = 'mean_per_weekday_laundry ', mode = 'lines') %>%
layout(title = "Power Consumption by Sub-meter 2007",
xaxis = list(title = "Weekday"),
yaxis = list (title = "Power (watt-hours)"))
weekday_2007
weekdayy_2008 <- filter(total_weekday, (year == 2008))
weekday_2008<-plot_ly(weekdayy_2008, x = ~weekdayy_2008$weekday, y = ~weekdayy_2008$"mean_per_weekday_water",
name = 'mean_per_weekday_water ', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~weekdayy_2008$"mean_per_weekday_kitchen" , name = 'mean_per_weekday_kitchen ', mode = 'lines') %>%
add_trace(y = ~weekdayy_2008$"mean_per_weekday_laundry", name = 'mean_per_weekday_laundry ', mode = 'lines') %>%
layout(title = "Power Consumption by Sub-meter 2008",
xaxis = list(title = "Weekday"),
yaxis = list (title = "Power (watt-hours)"))
weekday_2008
weekdayy_2009 <- filter(total_weekday, (year == 2009))
weekday_2009<-plot_ly(weekdayy_2009, x = ~weekdayy_2009$weekday, y = ~weekdayy_2009$"mean_per_weekday_water",
name = 'mean_per_weekday_water ', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~weekdayy_2009$"mean_per_weekday_kitchen" , name = 'mean_per_weekday_kitchen ', mode = 'lines') %>%
add_trace(y = ~weekdayy_2009$"mean_per_weekday_laundry", name = 'mean_per_weekday_laundry ', mode = 'lines') %>%
layout(title = "Power Consumption by Sub-meter 2009",
xaxis = list(title = "Weekday"),
yaxis = list (title = "Power (watt-hours)"))
weekday_2009
kitchen_season <- filter(seasonal_kitchen, (year == 2007))
kitchen_season_2007<-plot_ly(kitchen_season, x = ~kitchen_season$weekday, y = ~kitchen_season$"mean_per_kitchen_winter",
name = 'WINTER', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~kitchen_season$"mean_per_kitchen_spring" , name = 'SPRING', mode = 'lines') %>%
add_trace(y = ~kitchen_season$"mean_per_kitchen_summer", name = 'SUMMER', mode = 'lines') %>%
add_trace(y = ~kitchen_season$"mean_per_kitchen_fall", name = 'FALL', mode = 'lines') %>%
layout(title = "Power Consumption kitchen 2007",
xaxis = list(title = "Weekday"),
yaxis = list (title = "Power (watt-hours)"))
kitchen_seasonn <- filter(seasonal_kitchen, (year == 2008))
kitchen_season_2008<-plot_ly(kitchen_seasonn, x = ~kitchen_seasonn$weekday, y = ~kitchen_seasonn$"mean_per_kitchen_winter",
name = 'WINTER', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~kitchen_seasonn$"mean_per_kitchen_spring" , name = 'SPRING', mode = 'lines') %>%
add_trace(y = ~kitchen_seasonn$"mean_per_kitchen_summer", name = 'SUMMER', mode = 'lines') %>%
add_trace(y = ~kitchen_seasonn$"mean_per_kitchen_fall", name = 'FALL', mode = 'lines') %>%
layout(title = "Power Consumption kitchen 2008",
xaxis = list(title = "Weekday"),
yaxis = list (title = "Power (watt-hours)"))
kitchen_seasonnn <- filter(seasonal_kitchen, (year == 2009))
kitchen_season_2009<-plot_ly(kitchen_seasonnn, x = ~kitchen_seasonnn$weekday, y = ~kitchen_seasonnn$"mean_per_kitchen_winter",
name = 'WINTER', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~kitchen_seasonnn$"mean_per_kitchen_spring" , name = 'SPRING', mode = 'lines') %>%
add_trace(y = ~kitchen_seasonnn$"mean_per_kitchen_summer", name = 'SUMMER', mode = 'lines') %>%
add_trace(y = ~kitchen_seasonnn$"mean_per_kitchen_fall", name = 'FALL', mode = 'lines') %>%
layout(title = "Power Consumption kitchen 2009",
xaxis = list(title = "Weekday"),
yaxis = list (title = "Power (watt-hours)"))
kitchen_season_2007
laundry_season <- filter(seasonal_laundry, (year == 2007))
laundry_season_2007<-plot_ly(laundry_season, x = ~laundry_season$weekday, y = ~laundry_season$"mean_per_laundry_winter",
name = 'WINTER', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~laundry_season$"mean_per_laundry_spring" , name = 'SPRING', mode = 'lines') %>%
add_trace(y = ~laundry_season$"mean_per_laundry_summer", name = 'SUMMER', mode = 'lines') %>%
add_trace(y = ~laundry_season$"mean_per_laundry_fall", name = 'FALL', mode = 'lines') %>%
layout(title = "Power Consumption LAUNDRY 2007",
xaxis = list(title = "Weekday"),
yaxis = list (title = "Power (watt-hours)"))
laundry_seasonn <- filter(seasonal_laundry, (year == 2008))
laundry_season_2008<-plot_ly(laundry_seasonn, x = ~laundry_seasonn$weekday, y = ~laundry_seasonn$"mean_per_laundry_winter",
name = 'WINTER', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~laundry_seasonn$"mean_per_laundry_spring" , name = 'SPRING', mode = 'lines') %>%
add_trace(y = ~laundry_seasonn$"mean_per_laundry_summer", name = 'SUMMER', mode = 'lines') %>%
add_trace(y = ~laundry_seasonn$"mean_per_laundry_fall", name = 'FALL', mode = 'lines') %>%
layout(title = "Power Consumption LAUNDRY 2008",
xaxis = list(title = "Weekday"),
yaxis = list (title = "Power (watt-hours)"))
laundry_seasonnn <- filter(seasonal_laundry, (year == 2009))
laundry_season_2009<-plot_ly(laundry_seasonnn, x = ~laundry_seasonnn$weekday, y = ~laundry_seasonnn$"mean_per_laundry_winter",
name = 'WINTER', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~laundry_seasonnn$"mean_per_laundry_spring" , name = 'SPRING', mode = 'lines') %>%
add_trace(y = ~laundry_seasonnn$"mean_per_laundry_summer", name = 'SUMMER', mode = 'lines') %>%
add_trace(y = ~laundry_seasonnn$"mean_per_laundry_fall", name = 'FALL', mode = 'lines') %>%
layout(title = "Power Consumption LAUNDRY 2009",
xaxis = list(title = "Weekday"),
yaxis = list (title = "Power (watt-hours)"))
laundry_season_2007
water_season <- filter(seasonal_water, (year == 2007))
water_season_2007<-plot_ly(water_season, x = ~water_season$weekday, y = ~water_season$"mean_per_water_winter",
name = 'WINTER', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~water_season$"mean_per_water_spring" , name = 'SPRING', mode = 'lines') %>%
add_trace(y = ~water_season$"mean_per_water_summer", name = 'SUMMER', mode = 'lines') %>%
add_trace(y = ~water_season$"mean_per_water_fall", name = 'FALL', mode = 'lines') %>%
layout(title = "Power Consumption WATER 2007",
xaxis = list(title = "Weekday"),
yaxis = list (title = "Power (watt-hours)"))
water_seasonn <- filter(seasonal_water, (year == 2008))
water_season_2008<-plot_ly(water_seasonn, x = ~water_seasonn$weekday, y = ~water_seasonn$"mean_per_water_winter",
name = 'WINTER', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~water_seasonn$"mean_per_water_spring" , name = 'SPRING', mode = 'lines') %>%
add_trace(y = ~water_seasonn$"mean_per_water_summer", name = 'SUMMER', mode = 'lines') %>%
add_trace(y = ~water_seasonn$"mean_per_water_fall", name = 'FALL', mode = 'lines') %>%
layout(title = "Power Consumption WATER 2008",
xaxis = list(title = "Weekday"),
yaxis = list (title = "Power (watt-hours)"))
water_seasonnn <- filter(seasonal_water, (year == 2009))
water_season_2009<-plot_ly(water_seasonnn, x = ~water_seasonnn$weekday, y = ~water_seasonnn$"mean_per_water_winter",
name = 'WINTER', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~water_seasonnn$"mean_per_water_spring" , name = 'SPRING', mode = 'lines') %>%
add_trace(y = ~water_seasonnn$"mean_per_water_summer", name = 'SUMMER', mode = 'lines') %>%
add_trace(y = ~water_seasonnn$"mean_per_water_fall", name = 'FALL', mode = 'lines') %>%
layout(title = "Power Consumption WATER 2009",
xaxis = list(title = "Weekday"),
yaxis = list (title = "Power (watt-hours)"))
water_season_2007
#6. PLOT BY HOUR
hour_207 <- filter(total_hour, (year == 2007))
hour_2007<-plot_ly(hour_207, x = ~hour_207$hour, y = ~hour_207$"mean_per_hour_kitchen",
name = 'KITCHEN', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~hour_207$"mean_per_hour_laundry" , name = 'LAUNDRY', mode = 'lines') %>%
add_trace(y = ~hour_207$"mean_per_hour_water", name = 'WATER', mode = 'lines') %>%
layout(title = "Power Consumption WATER by hour 2007",
xaxis = list(title = "hour"),
yaxis = list (title = "Power (watt-hours)"))
hour_208 <- filter(total_hour, (year == 2008))
hour_2008<-plot_ly(hour_208, x = ~hour_208$hour, y = ~hour_208$"mean_per_hour_kitchen",
name = 'KITCHEN', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~hour_208$"mean_per_hour_laundry" , name = 'LAUNDRY', mode = 'lines') %>%
add_trace(y = ~hour_208$"mean_per_hour_water", name = 'WATER', mode = 'lines') %>%
layout(title = "Power Consumption by hour 2008",
xaxis = list(title = "hour"),
yaxis = list (title = "Power (watt-hours)"))
hour_209 <- filter(total_hour, (year == 2009))
hour_2009<-plot_ly(hour_209, x = ~hour_209$hour, y = ~hour_209$"mean_per_hour_kitchen",
name = 'KITCHEN', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~hour_209$"mean_per_hour_laundry" , name = 'LAUNDRY', mode = 'lines') %>%
add_trace(y = ~hour_209$"mean_per_hour_water", name = 'WATER', mode = 'lines') %>%
layout(title = "Power Consumption by hour 2009",
xaxis = list(title = "hour"),
yaxis = list (title = "Power (watt-hours)"))
hour_2007
hour_2008
hour_2009
hourly_kitchen <- filter(hour_kitchen, (year == 2007))
ly_2007<-plot_ly(hourly_kitchen, x = ~hourly_kitchen$weekday, y = ~hourly_kitchen$"morning_kitchen",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~hourly_kitchen$"midday_kitchen" , name = 'MidDay', mode = 'lines') %>%
add_trace(y = ~hourly_kitchen$"afternoon_kitchen", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~hourly_kitchen$"noon_kitchen", name = 'Noon', mode = 'lines') %>%
add_trace(y = ~hourly_kitchen$"night_kitchen", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour Kitchen 2007",
xaxis = list(title = "hour"),
yaxis = list (title = "Power (watt-hours)"))
hourlyy_kitchen <- filter(hour_kitchen, (year == 2008))
ly_2008<-plot_ly(hourlyy_kitchen, x = ~hourlyy_kitchen$weekday, y = ~hourlyy_kitchen$"morning_kitchen",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~hourlyy_kitchen$"midday_kitchen" , name = 'MidDay', mode = 'lines') %>%
add_trace(y = ~hourlyy_kitchen$"afternoon_kitchen", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~hourlyy_kitchen$"noon_kitchen", name = 'Noon', mode = 'lines') %>%
add_trace(y = ~hourlyy_kitchen$"night_kitchen", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour Kitchen 2008",
xaxis = list(title = "hour"),
yaxis = list (title = "Power (watt-hours)"))
hourlyyy_kitchen <- filter(hour_kitchen, (year == 2009))
ly_2009<-plot_ly(hourlyyy_kitchen, x = ~hourlyyy_kitchen$weekday, y = ~hourlyyy_kitchen$"morning_kitchen",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~hourlyyy_kitchen$"midday_kitchen" , name = 'MidDay', mode = 'lines') %>%
add_trace(y = ~hourlyyy_kitchen$"afternoon_kitchen", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~hourlyyy_kitchen$"noon_kitchen", name = 'Noon', mode = 'lines') %>%
add_trace(y = ~hourlyyy_kitchen$"night_kitchen", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour Kitchen 2009",
xaxis = list(title = "hour"),
yaxis = list (title = "Power (watt-hours)"))
ly_2007
ly_2008
ly_2009
hourly_laundry <- filter(hour_laundry, (year == 2007))
lyy_2007<-plot_ly(hourly_laundry, x = ~hourly_laundry$weekday, y = ~hourly_laundry$"morning_laundry",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~hourly_laundry$"midday_laundry" , name = 'MidDay', mode = 'lines') %>%
add_trace(y = ~hourly_laundry$"afternoon_laundry", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~hourly_laundry$"noon_laundry", name = 'Noon', mode = 'lines') %>%
add_trace(y = ~hourly_laundry$"night_laundry", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour Laundry Room 2007",
xaxis = list(title = "hour"),
yaxis = list (title = "Power (watt-hours)"))
hourlyy_laundry <- filter(hour_laundry, (year == 2008))
lyy_2008<-plot_ly(hourlyy_laundry, x = ~hourlyy_laundry$weekday, y = ~hourlyy_laundry$"morning_laundry",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~hourlyy_laundry$"midday_laundry" , name = 'MidDay', mode = 'lines') %>%
add_trace(y = ~hourlyy_laundry$"afternoon_laundry", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~hourlyy_laundry$"noon_laundry", name = 'Noon', mode = 'lines') %>%
add_trace(y = ~hourlyy_laundry$"night_laundry", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour Laundry Room 2008",
xaxis = list(title = "hour"),
yaxis = list (title = "Power (watt-hours)"))
hourlyyy_laundry <- filter(hour_laundry, (year == 2009))
lyy_2009<-plot_ly(hourlyyy_laundry, x = ~hourlyyy_laundry$weekday, y = ~hourlyyy_laundry$"morning_laundry",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~hourlyyy_laundry$"midday_laundry" , name = 'MidDay', mode = 'lines') %>%
add_trace(y = ~hourlyyy_laundry$"afternoon_laundry", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~hourlyyy_laundry$"noon_laundry", name = 'Noon', mode = 'lines') %>%
add_trace(y = ~hourlyyy_laundry$"night_laundry", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour Laundry Room 2009",
xaxis = list(title = "hour"),
yaxis = list (title = "Power (watt-hours)"))
lyy_2007
lyy_2008
lyy_2009
hourly_water <- filter(hour_water, (year == 2007))
lyyy_2007<-plot_ly(hourly_water, x = ~hourly_water$weekday, y = ~hourly_water$"morning_water",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~hourly_water$"midday_water" , name = 'MidDay', mode = 'lines') %>%
add_trace(y = ~hourly_water$"afternoon_water", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~hourly_water$"noon_water", name = 'Noon', mode = 'lines') %>%
add_trace(y = ~hourly_water$"night_water", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour Water Heater & AC 2007",
xaxis = list(title = "hour"),
yaxis = list (title = "Power (watt-hours)"))
hourlyy_water <- filter(hour_water, (year == 2008))
lyyy_2008<-plot_ly(hourlyy_water, x = ~hourlyy_water$weekday, y = ~hourlyy_water$"morning_water",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~hourlyy_water$"midday_water" , name = 'MidDay', mode = 'lines') %>%
add_trace(y = ~hourlyy_water$"afternoon_water", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~hourlyy_water$"noon_water", name = 'Noon', mode = 'lines') %>%
add_trace(y = ~hourlyy_water$"night_water", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour Water Heater & AC 2008",
xaxis = list(title = "hour"),
yaxis = list (title = "Power (watt-hours)"))
hourlyyy_water <- filter(hour_water, (year == 2009))
lyyy_2009<-plot_ly(hourlyyy_water, x = ~hourlyyy_water$weekday, y = ~hourlyyy_water$"morning_water",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~hourlyyy_water$"midday_water" , name = 'MidDay', mode = 'lines') %>%
add_trace(y = ~hourlyyy_water$"afternoon_water", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~hourlyyy_water$"noon_water", name = 'Noon', mode = 'lines') %>%
add_trace(y = ~hourlyyy_water$"night_water", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour Water Heater & AC 2009",
xaxis = list(title = "hour"),
yaxis = list (title = "Power (watt-hours)"))
lyyy_2007
lyyy_2008
lyyy_2009
monthly_kitchen <- filter(month_kitchen, (year == 2007))
thly_2007<-plot_ly(monthly_kitchen , x = ~monthly_kitchen$month, y = ~monthly_kitchen$"morning_kitchen",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~monthly_kitchen$"afternoon_kitchen", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~monthly_kitchen$"dusk_kitchen", name = 'Dusk', mode = 'lines') %>%
add_trace(y = ~monthly_kitchen$"night_kitchen", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour Kitchen 2007",
xaxis = list(title = "Month"),
yaxis = list (title = "Power (watt-hours)"))
monthly_kitchen <- filter(month_kitchen, (year == 2008))
thly_2008<-plot_ly(monthly_kitchen , x = ~monthly_kitchen$month, y = ~monthly_kitchen$"morning_kitchen",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~monthly_kitchen$"afternoon_kitchen", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~monthly_kitchen$"dusk_kitchen", name = 'Dusk', mode = 'lines') %>%
add_trace(y = ~monthly_kitchen$"night_kitchen", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour Kitchen 2008",
xaxis = list(title = "Month"),
yaxis = list (title = "Power (watt-hours)"))
monthly_kitchen <- filter(month_kitchen, (year == 2009))
thly_2009<-plot_ly(monthly_kitchen , x = ~monthly_kitchen$month, y = ~monthly_kitchen$"morning_kitchen",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~monthly_kitchen$"afternoon_kitchen", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~monthly_kitchen$"dusk_kitchen", name = 'Dusk', mode = 'lines') %>%
add_trace(y = ~monthly_kitchen$"night_kitchen", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour Kitchen 2009",
xaxis = list(title = "Month"),
yaxis = list (title = "Power (watt-hours)"))
thly_2007
thly_2008
thly_2009
monthly_laundry <- filter(month_laundry, (year == 2007))
thlyy_2007<-plot_ly(monthly_laundry , x = ~monthly_laundry$month, y = ~monthly_laundry$"morning_laundry",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~monthly_laundry$"afternoon_laundry", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~monthly_laundry$"dusk_laundry", name = 'Dusk', mode = 'lines') %>%
add_trace(y = ~monthly_laundry$"night_laundry", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour laundry 2007",
xaxis = list(title = "Month"),
yaxis = list (title = "Power (watt-hours)"))
monthly_laundry <- filter(month_laundry, (year == 2008))
thlyy_2008<-plot_ly(monthly_laundry , x = ~monthly_laundry$month, y = ~monthly_laundry$"morning_laundry",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~monthly_laundry$"afternoon_laundry", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~monthly_laundry$"dusk_laundry", name = 'Dusk', mode = 'lines') %>%
add_trace(y = ~monthly_laundry$"night_laundry", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour laundry 2008",
xaxis = list(title = "Month"),
yaxis = list (title = "Power (watt-hours)"))
monthly_laundry <- filter(month_laundry, (year == 2009))
thlyy_2009<-plot_ly(monthly_laundry , x = ~monthly_laundry$month, y = ~monthly_laundry$"morning_laundry",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~monthly_laundry$"afternoon_laundry", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~monthly_laundry$"dusk_laundry", name = 'Dusk', mode = 'lines') %>%
add_trace(y = ~monthly_laundry$"night_laundry", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour laundry 2009",
xaxis = list(title = "Month"),
yaxis = list (title = "Power (watt-hours)"))
thlyy_2007
thlyy_2008
thlyy_2009
monthly_water <- filter(month_water, (year == 2007))
thlyyy_2007<-plot_ly(monthly_water , x = ~monthly_water$month, y = ~monthly_water$"morning_water",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~monthly_water$"afternoon_water", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~monthly_water$"dusk_water", name = 'Dusk', mode = 'lines') %>%
add_trace(y = ~monthly_water$"night_water", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour water 2007",
xaxis = list(title = "Month"),
yaxis = list (title = "Power (watt-hours)"))
monthly_water <- filter(month_water, (year == 2008))
thlyyy_2008<-plot_ly(monthly_water , x = ~monthly_water$month, y = ~monthly_water$"morning_water",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~monthly_water$"afternoon_water", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~monthly_water$"dusk_water", name = 'Dusk', mode = 'lines') %>%
add_trace(y = ~monthly_water$"night_water", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour water 2008",
xaxis = list(title = "Month"),
yaxis = list (title = "Power (watt-hours)"))
monthly_water <- filter(month_water, (year == 2009))
thlyyy_2009<-plot_ly(monthly_water , x = ~monthly_water$month, y = ~monthly_water$"morning_water",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~monthly_water$"afternoon_water", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~monthly_water$"dusk_water", name = 'Dusk', mode = 'lines') %>%
add_trace(y = ~monthly_water$"night_water", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour water 2009",
xaxis = list(title = "Month"),
yaxis = list (title = "Power (watt-hours)"))
thlyyy_2007
thlyyy_2008
thlyyy_2009
hourly_week_kitchen <- filter(hour_week_kitchen, (year == 2007))
ly_week_2007<-plot_ly(hourly_week_kitchen, x = ~hourly_week_kitchen$week, y = ~hourly_week_kitchen$"morning_week_kitchen",
name = 'Morning', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~hourly_week_kitchen$"midday_weeK_kitchen" , name = 'MidDay', mode = 'lines') %>%
add_trace(y = ~hourly_week_kitchen$"afternoon_week_kitchen", name = 'Afternoon', mode = 'lines') %>%
add_trace(y = ~hourly_week_kitchen$"noon_week_kitchen", name = 'Noon', mode = 'lines') %>%
add_trace(y = ~hourly_week_kitchen$"night_week_kitchen", name = 'Night', mode = 'lines') %>%
layout(title = "Power Consumption by hour Kitchen 2007",
xaxis = list(title = "hour"),
yaxis = list (title = "Power (watt-hours)"))
ly_week_2007
all_2007 <- filter(total, (year == 2007))
## Plot sub-meter 1, 2 and 3 with title, legend and labels
year_2007<-plot_ly(all_2007, x = ~all_2007$week, y = ~all_2007$mean_per_week_water, name = 'Water Heater & AC', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~all_2007$mean_per_week_kitchen , name = 'Kitchen', mode = 'lines') %>%
add_trace(y = ~all_2007$mean_per_week_laundry, name = 'Laundry RoomC', mode = 'lines') %>%
layout(title = "Power Consumption 2007",
xaxis = list(title = "Week"),
yaxis = list (title = "Power (watt-hours)"))
year_2007
all_2008 <- filter(total, (year == 2008))
## Plot sub-meter 1, 2 and 3 with title, legend and labels
year_2008<-plot_ly(all_2008, x = ~all_2008$week, y = ~all_2008$mean_per_week_water, name = 'Water Heater & AC', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~all_2008$mean_per_week_kitchen , name = 'Kitchen', mode = 'lines') %>%
add_trace(y = ~all_2008$mean_per_week_laundry, name = 'Laundry Room', mode = 'lines') %>%
layout(title = "Power Consumption 2008",
xaxis = list(title = "Week"),
yaxis = list (title = "Power (watt-hours)"))
year_2008
all_2009 <- filter(total, (year == 2009))
## Plot sub-meter 1, 2 and 3 with title, legend and labels
year_2009<-plot_ly(all_2009, x = ~all_2009$week, y = ~all_2009$mean_per_week_water, name = 'Water Heater & AC', type = 'scatter', mode = 'lines') %>%
add_trace(y = ~all_2009$mean_per_week_kitchen , name = 'Kitchen', mode = 'lines') %>%
add_trace(y = ~all_2009$mean_per_week_laundry, name = 'Laundry Room', mode = 'lines') %>%
layout(title = "Power Consumption 2009",
xaxis = list(title = "Week"),
yaxis = list (title = "Power (watt-hours)"))
year_2009
Based on the analysis of the power consumption data, the following insights and recommendations were derived:
decomp_water <- decompose(weekly_water)
plot(decomp_water)
decomp_kitchen<- decompose(weekly_kitchen)
plot(decomp_kitchen)
decomp_laundry<- decompose(weekly_laundry)
plot(decomp_laundry)
decomp_total<- decompose(weekly_total)
plot(decomp_total)
weekly <- total[, c("mean_per_week_total", "mean_per_week_water",
"mean_per_week_kitchen", "mean_per_week_laundry")]
datacheck<- total[, c("mean_per_week_total")]
The forecasting and analysis process included cleaning and preprocessing the dataset, followed by exploratory data analysis. The findings can be summarized as follows:
#TAKING AWAY THE TREND FOR TOTAL SUB_METER:
weekly_total <- total[, c("mean_per_week_total")]
weekly_total_frequency<-ts(weekly_total, start=c(2007, 1), frequency=52)
DY<-diff(weekly_total_frequency)
autoplot(DY)
ggseasonplot(DY)
ggsubseriesplot(DY)
fit<-snaive(DY)
print(summary(fit))
##
## Forecast method: Seasonal naive method
##
## Model Information:
## Call: snaive(y = DY)
##
## Residual sd: 2.9364
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set -0.0006972522 2.936427 2.114023 275.0587 785.2804 1 -0.279661
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2010.058 0.852302032 -2.9108804 4.6154844 -4.902989 6.607593
## 2010.077 1.880832888 -1.8823495 5.6440153 -3.874458 7.636124
## 2010.096 -1.691765873 -5.4549483 2.0714165 -7.447057 4.063525
## 2010.115 1.318258809 -2.4449236 5.0814412 -4.437032 7.073550
## 2010.135 0.938388016 -2.8247944 4.7015704 -4.816903 6.693679
## 2010.154 -4.567146891 -8.3303293 -0.8039645 -10.322438 1.188144
## 2010.173 0.171809589 -3.5913728 3.9349920 -5.583481 5.927100
## 2010.192 1.347579607 -2.4156028 5.1107620 -4.407711 7.102870
## 2010.212 1.122856901 -2.6403255 4.8860393 -4.632434 6.878148
## 2010.231 -0.413461127 -4.1766435 3.3497213 -6.168752 5.341830
## 2010.250 -1.488523000 -5.2517054 2.2746594 -7.243814 4.266768
## 2010.269 0.591057765 -3.1721246 4.3542402 -5.164233 6.346349
## 2010.288 -0.246811734 -4.0099941 3.5163707 -6.002103 5.508479
## 2010.308 2.350598841 -1.4125835 6.1137812 -3.404692 8.105890
## 2010.327 -2.661213920 -6.4243963 1.1019685 -8.416505 3.094077
## 2010.346 -0.126289683 -3.8894721 3.6368927 -5.881580 5.629001
## 2010.365 0.847916667 -2.9152657 4.6110991 -4.907374 6.603207
## 2010.385 -0.524796252 -4.2879786 3.2383861 -6.280087 5.230495
## 2010.404 0.631542284 -3.1316401 4.3947247 -5.123749 6.386833
## 2010.423 -0.777805304 -4.5409877 2.9853771 -6.533096 4.977486
## 2010.442 -0.662075648 -4.4252580 3.1011067 -6.417366 5.093215
## 2010.462 0.175496032 -3.5876864 3.9386784 -5.579795 5.930787
## 2010.481 -0.543146562 -4.3063290 3.2200358 -6.298437 5.212144
## 2010.500 0.134217990 -3.6289644 3.8974004 -5.621073 5.889509
## 2010.519 -0.418154762 -4.1813372 3.3450276 -6.173446 5.337136
## 2010.538 -0.274900794 -4.0380832 3.4882816 -6.030192 5.480390
## 2010.558 -2.178714193 -5.9418966 1.5844682 -7.934005 3.576577
## 2010.577 -0.705908823 -4.4690912 3.0572736 -6.461200 5.049382
## 2010.596 0.542261905 -3.2209205 4.3054443 -5.213029 6.297553
## 2010.615 -0.833928571 -4.5971110 2.9292538 -6.589219 4.921362
## 2010.635 0.741567460 -3.0216149 4.5047498 -5.013723 6.496858
## 2010.654 1.650652469 -2.1125299 5.4138349 -4.104638 7.405943
## 2010.673 0.855200706 -2.9079817 4.6183831 -4.900090 6.610492
## 2010.692 -0.048511905 -3.8116943 3.7146705 -5.803803 5.706779
## 2010.712 0.901884921 -2.8612975 4.6650673 -4.853406 6.657176
## 2010.731 1.685409609 -2.0777728 5.4485920 -4.069881 7.440700
## 2010.750 -0.022512784 -3.7856952 3.7406696 -5.777804 5.732778
## 2010.769 -0.681006877 -4.4441893 3.0821755 -6.436298 5.074284
## 2010.788 -0.272564552 -4.0357469 3.4906178 -6.027855 5.482726
## 2010.808 1.145347036 -2.6178354 4.9085294 -4.609944 6.900638
## 2010.827 0.061101377 -3.7020810 3.8242838 -5.694189 5.816392
## 2010.846 -0.295039683 -4.0582221 3.4681427 -6.050330 5.460251
## 2010.865 -0.051289683 -3.8144721 3.7118927 -5.806580 5.704001
## 2010.885 0.003553219 -3.7596292 3.7667356 -5.751738 5.758844
## 2010.904 1.628490432 -2.1346920 5.3916728 -4.126800 7.383781
## 2010.923 -0.510317460 -4.2734998 3.2528649 -6.265608 5.244973
## 2010.942 -0.981150794 -4.7443332 2.7820316 -6.736442 4.774140
## 2010.962 0.059226190 -3.7039562 3.8224086 -5.696065 5.814517
## 2010.981 0.368924925 -3.3942575 4.1321073 -5.386366 6.124216
## 2011.000 -0.628151116 -4.3913335 3.1350313 -6.383442 5.127140
## 2011.019 4.067361111 0.3041787 7.8305435 -1.687930 9.822652
## 2011.038 -0.073611111 -3.8367935 3.6895713 -5.828902 5.681680
## 2011.058 0.852302032 -4.4696415 6.1742456 -7.286908 8.991512
## 2011.077 1.880832888 -3.4411107 7.2027765 -6.258377 10.020043
## 2011.096 -1.691765873 -7.0137094 3.6301777 -9.830976 6.447444
## 2011.115 1.318258809 -4.0036848 6.6402024 -6.820952 9.457469
## 2011.135 0.938388016 -4.3835556 6.2603316 -7.200822 9.077598
## 2011.154 -4.567146891 -9.8890905 0.7547967 -12.706357 3.572063
## 2011.173 0.171809589 -5.1501340 5.4937532 -7.967401 8.311020
## 2011.192 1.347579607 -3.9743640 6.6695232 -6.791631 9.486790
## 2011.212 1.122856901 -4.1990867 6.4448005 -7.016353 9.262067
## 2011.231 -0.413461127 -5.7354047 4.9084824 -8.552671 7.725749
## 2011.250 -1.488523000 -6.8104666 3.8334206 -9.627733 6.650687
## 2011.269 0.591057765 -4.7308858 5.9130013 -7.548153 8.730268
## 2011.288 -0.246811734 -5.5687553 5.0751318 -8.386022 7.892399
## 2011.308 2.350598841 -2.9713447 7.6725424 -5.788611 10.489809
## 2011.327 -2.661213920 -7.9831575 2.6607297 -10.800424 5.477996
## 2011.346 -0.126289683 -5.4482333 5.1956539 -8.265500 8.012921
## 2011.365 0.847916667 -4.4740269 6.1698602 -7.291294 8.987127
## 2011.385 -0.524796252 -5.8467398 4.7971473 -8.664007 7.614414
## 2011.404 0.631542284 -4.6904013 5.9534859 -7.507668 8.770753
## 2011.423 -0.777805304 -6.0997489 4.5441383 -8.917016 7.361405
## 2011.442 -0.662075648 -5.9840192 4.6598679 -8.801286 7.477135
## 2011.462 0.175496032 -5.1464475 5.4974396 -7.963714 8.314706
## 2011.481 -0.543146562 -5.8650901 4.7787970 -8.682357 7.596064
## 2011.500 0.134217990 -5.1877256 5.4561616 -8.004992 8.273428
## 2011.519 -0.418154762 -5.7400983 4.9037888 -8.557365 7.721056
## 2011.538 -0.274900794 -5.5968444 5.0470428 -8.414111 7.864310
## 2011.558 -2.178714193 -7.5006578 3.1432294 -10.317925 5.960496
## 2011.577 -0.705908823 -6.0278524 4.6160347 -8.845119 7.433301
## 2011.596 0.542261905 -4.7796817 5.8642055 -7.596948 8.681472
## 2011.615 -0.833928571 -6.1558721 4.4880150 -8.973139 7.305282
## 2011.635 0.741567460 -4.5803761 6.0635110 -7.397643 8.880778
## 2011.654 1.650652469 -3.6712911 6.9725960 -6.488558 9.789863
## 2011.673 0.855200706 -4.4667429 6.1771443 -7.284010 8.994411
## 2011.692 -0.048511905 -5.3704555 5.2734317 -8.187722 8.090698
## 2011.712 0.901884921 -4.4200587 6.2238285 -7.237325 9.041095
## 2011.731 1.685409609 -3.6365340 7.0073532 -6.453801 9.824620
## 2011.750 -0.022512784 -5.3444564 5.2994308 -8.161723 8.116698
## 2011.769 -0.681006877 -6.0029504 4.6409367 -8.820217 7.458203
## 2011.788 -0.272564552 -5.5945081 5.0493790 -8.411775 7.866646
## 2011.808 1.145347036 -4.1765965 6.4672906 -6.993863 9.284557
## 2011.827 0.061101377 -5.2608422 5.3830449 -8.078109 8.200312
## 2011.846 -0.295039683 -5.6169833 5.0269039 -8.434250 7.844171
## 2011.865 -0.051289683 -5.3732333 5.2706539 -8.190500 8.087921
## 2011.885 0.003553219 -5.3183904 5.3254968 -8.135657 8.142764
## 2011.904 1.628490432 -3.6934531 6.9504340 -6.510720 9.767701
## 2011.923 -0.510317460 -5.8322610 4.8116261 -8.649528 7.628893
## 2011.942 -0.981150794 -6.3030944 4.3407928 -9.120361 7.158060
## 2011.962 0.059226190 -5.2627174 5.3811698 -8.079984 8.198437
## 2011.981 0.368924925 -4.9530186 5.6908685 -7.770285 8.508135
## 2012.000 -0.628151116 -5.9500947 4.6937925 -8.767361 7.511059
## 2012.019 4.067361111 -1.2545825 9.3893047 -4.071849 12.206571
## 2012.038 -0.073611111 -5.3955547 5.2483325 -8.212821 8.065599
checkresiduals(fit)
##
## Ljung-Box test
##
## data: Residuals from Seasonal naive method
## Q* = 21.867, df = 32, p-value = 0.9109
##
## Model df: 0. Total lags used: 32
viewer <- function() {
a <- readline(prompt = "Select from the following: Year / Seasonality / Week / Hour / Month")
if (a == "Year") {
b <- readline(prompt = "By which year? (2007 / 2008 / 2009) ")
if (b == "2007") {
print(year_2007)
} else if (b == "2008") {
print(year_2008)
} else if (b == "2009") {
print(year_2009)
}
}
if (a == "Seasonality") {
c <- readline(prompt = "By which sub-meter? (1 - Kitchen, 2 - Laundry or 3 - Water & AC) ")
d <- readline(prompt = "By which year? (2007, 2008 or 2009) ")
if (c == "1" && d == "2007") {
print(kitchen_season_2007)
} else if (c == "2" && d == "2007") {
print(laundry_season_2007)
} else if (c == "3" && d == "2007") {
print(water_season_2007)
} else if (c == "1" && d == "2008") {
print(kitchen_season_2008)
} else if (c == "2" && d == "2008") {
print(laundry_season_2008)
} else if (c == "3" && d == "2008") {
print(water_season_2008)
} else if (c == "1" && d == "2009") {
print(kitchen_season_2009)
} else if (c == "2" && d == "2009") {
print(laundry_season_2009)
} else if (c == "3" && d == "2009") {
print(water_season_2009)
}
}
if (a == "Week") {
d <- readline(prompt="By what year? (2007 / 2008 / 2009) ")
if (d == "2007") {
print(weekday_2007)
} else if (d == "2008") {
print(weekday_2008)
} else if (d == "2009") {
print(weekday_2009)
}
}
if(a == "Hour") {
e <- readline(prompt="By what year? (2007 / 2008 / 2009) ")
f <- readline(prompt="By which Sub-meter? (1-Kitchen, 2-Laundry or 3-Water) ")
if (e == "2007") {
print(hour_2007)
} else if (e == "2008") {
print(hour_2008)
} else if (e == "2009") {
print(hour_2009)
}
if (e == "2007") {
if (f=="1-Kitchen") {
print(ly_2007)
} else if (f=="2-Laundry") {
print(lyy_2007)
} else if (f=="3-Water") {
print(lyyy_2007)
}
} else if (e == "2008") {
if (f=="1-Kitchen") {
print(ly_2008)
} else if (f=="2-Laundry") {
print(lyy_2008)
} else if (f=="3-Water") {
print(lyyy_2008)
}
} else if (e == "2009")
if (f=="1-Kitchen") {
print(ly_2009)
} else if (f=="2-Laundry") {
print(lyy_2009)
} else if (f=="3-Water") {
print(lyyy_2009)
}
}
if (a == "Month") {
g <- readline(prompt = "By which sub-meter? (1 - Kitchen, 2 - Laundry or 3 - Water & AC) ")
h <- readline(prompt = "By which year? (2007, 2008 or 2009) ")
if (g == "1 - Kitchen" && h == "2007") {
print(thly_2007)
} else if (g == "2 - Laundry" && h == "2007") {
print(thlyy_2007)
} else if (g == "3 - Water & AC" && h == "2007") {
print(thlyyy_2007)
} else if (g == "1 - Kitchen" && h == "2008") {
print(thly_2008)
} else if (g == "2 - Laundry" && h == "2008") {
print(thlyy_2008)
} else if (g == "3 - Water & AC" && h == "2008") {
print(thlyyy_2008)
} else if (g == "1 - Kitchen" && h == "2009") {
print(thly_2009)
} else if (g == "2 - Laundry" && h == "2009") {
print(thlyy_2009)
} else if (g == "3 - Water & AC" && h == "2009") {
print(thlyyy_2009)
}
}
}
viewer()
v## Conclusion
This project aims to provide insights and forecasting capabilities for a sub-metering company operating in the Smart Home sector. By analyzing power consumption data, we can empower Smart Home owners with better understanding and control of their energy usage. The recommendations provided will assist the developer in offering highly efficient Smart Homes that provide owners with valuable power usage analytics.
The project also proposes the development of a dashboard application that enables homeowners to manage their power consumption effectively. The dashboard app will have the following features:
Visualization of Power Consumption: The app will provide intuitive visualizations that allow homeowners to monitor their power consumption patterns over time. This includes daily, monthly, and yearly views, as well as comparisons between different sub-meters and appliances.
Real-time Monitoring: Homeowners will be able to track their power consumption in real-time, allowing them to identify any sudden spikes or unusual patterns. This feature promotes awareness and encourages energy-saving habits.
Customizable Alerts: The dashboard app will allow homeowners to set custom thresholds for power consumption. When the consumption exceeds the specified threshold, the app will send alerts and reminders to prompt energy-saving actions.
Recommendations and Tips: Based on the analysis of power consumption data, the app will provide personalized recommendations and energy-saving tips to homeowners. This will help them make informed decisions and optimize their power usage.
Usage Comparison: The app will enable homeowners to compare their current power consumption with previous periods, such as the previous month or year. This feature will help identify behaviors or changes that lead to lower power usage and encourage sustainable practices.
Overall, the dashboard app will serve as a powerful tool for homeowners to manage their power consumption efficiently, promote energy-saving behaviors, and contribute to a more sustainable and environmentally friendly Smart Home ecosystem.
#LEGEND: #SUB_METER 1-> KITCHEN #SUB_METER_2-> LAUNDRY ROOM #SUB_METER_3-> WATER HEATER & AC