Individual Household Electric Power Consumption

Synopsis

The dataset consists of measurements of electric power consumption in one household, sampled once per minute over a period of almost 4 years. The goal is to analyze how the household's energy use varies over a 2-day period in February 2007 (1 and 2 February).

Data Processing

Loading the data and subsetting the two days to be analyzed

file <- "household_power_consumption.txt"
## Read the semicolon-separated file; "?" marks missing values
householdpower <- read.table(file, header = TRUE, sep = ";", na.strings = "?", 
    colClasses = c(rep("character", 2), rep("numeric", 7)))
## Keep only 1 and 2 February 2007 (dates are stored as d/m/yyyy)
householdpower <- householdpower[householdpower$Date %in% c("1/2/2007", "2/2/2007"), 
    ]
head(householdpower)
##           Date     Time Global_active_power Global_reactive_power Voltage
## 66637 1/2/2007 00:00:00               0.326                 0.128   243.2
## 66638 1/2/2007 00:01:00               0.326                 0.130   243.3
## 66639 1/2/2007 00:02:00               0.324                 0.132   243.5
## 66640 1/2/2007 00:03:00               0.324                 0.134   243.9
## 66641 1/2/2007 00:04:00               0.322                 0.130   243.2
## 66642 1/2/2007 00:05:00               0.320                 0.126   242.3
##       Global_intensity Sub_metering_1 Sub_metering_2 Sub_metering_3
## 66637              1.4              0              0              0
## 66638              1.4              0              0              0
## 66639              1.4              0              0              0
## 66640              1.4              0              0              0
## 66641              1.4              0              0              0
## 66642              1.4              0              0              0

Formatting the data

Create a 'Date_Time' column by combining 'Date' and 'Time' and converting the result to a date-time value

## Combine Date and Time, then parse as POSIXct (day/month/year order)
householdpower$Date_Time <- as.POSIXct(paste(householdpower$Date, householdpower$Time), 
    format = "%d/%m/%Y %H:%M:%S")
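
The format string reads dates day-first, so "1/2/2007" means 1 February, not 2 January. The sketch below checks this on two toy strings laid out like the dataset's Date and Time columns, and confirms that two full days at one reading per minute span 2880 minutes. The `tz = "UTC"` argument and the variable names are assumptions made here so that the arithmetic is deterministic; they are not part of the original analysis.

```r
# Toy strings in the same layout as the dataset's Date and Time columns
date_chr <- c("1/2/2007", "2/2/2007")
time_chr <- c("00:00:00", "23:59:00")

# Parse day/month/year; tz = "UTC" is an assumption for reproducibility
stamp <- as.POSIXct(paste(date_chr, time_chr),
                    format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

# Print in an unambiguous year-month-day layout
stamp_iso <- format(stamp, "%Y-%m-%d %H:%M")
print(stamp_iso)

# Number of one-minute readings from the first to the last stamp, inclusive
n_minutes <- as.numeric(difftime(stamp[2], stamp[1], units = "mins")) + 1
print(n_minutes)
```

If the subset really covers both days without gaps, `nrow(householdpower)` should match this count.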

Histogram of household global minute-averaged active power (in kilowatts)

with(householdpower, hist(Global_active_power, main = "Global Active Power", 
    xlab = "Global Active Power (kilowatts)", ylab = "Frequency", col = "red"))

Time series plot of global minute-averaged active power (in kilowatts)

plot(x = householdpower$Date_Time, y = householdpower$Global_active_power, type = "l", 
    xlab = "", ylab = "Global Active Power (kilowatts)")

Time series plot of energy usage across various sections of the house.

Sub_metering_1 corresponds to the kitchen, containing mainly a dishwasher, an oven and a microwave (the hot plates are gas powered, not electric). Sub_metering_2 corresponds to the laundry room, containing a washing machine, a tumble drier, a refrigerator and a light. Sub_metering_3 corresponds to an electric water heater and an air conditioner.

All sub-metering measurements are in watt-hours of active energy.
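
Because global active power is a minute-averaged value in kilowatts while the sub-meterings are in watt-hours per minute, the two are only comparable after a unit conversion. The dataset's documentation describes the remainder, `Global_active_power * 1000 / 60` minus the three sub-meterings, as the energy used by equipment not covered by any sub-meter. A minimal worked example, using the values from the first row of the `head()` output above (the variable names are illustrative):

```r
# Values taken from the first row of the head() output
global_active_power_kw <- 0.326
sub_metering_wh <- c(0, 0, 0)  # Sub_metering_1, _2, _3

# A power of P kW sustained for one minute is P * 1000 / 60 watt-hours
total_wh <- global_active_power_kw * 1000 / 60

# Energy not attributed to any of the three sub-meters
unmetered_wh <- total_wh - sum(sub_metering_wh)
print(round(unmetered_wh, 3))
```

For this minute all three sub-meters read zero, so the whole of the roughly 5.4 Wh consumed comes from unmetered equipment.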

plot(x = householdpower$Date_Time, y = householdpower$Sub_metering_1, type = "n", 
    xlab = "", ylab = "Energy sub metering")
lines(x = householdpower$Date_Time, y = householdpower$Sub_metering_1)
lines(x = householdpower$Date_Time, y = householdpower$Sub_metering_2, col = "red")
lines(x = householdpower$Date_Time, y = householdpower$Sub_metering_3, col = "blue")
legend("topright", legend = c("Sub_metering_1", "Sub_metering_2", "Sub_metering_3"), 
    lty = 1, col = c("black", "red", "blue"))

Time series plots of global active power, global reactive power, voltage and energy sub-metering

## Split the device into four plotting regions
par(mfcol = c(2, 2))

## Create the four plots (with mfcol, the panels are filled column by column)

## Create the 'Global Active Power' vs 'Time' plot
plot(x = householdpower$Date_Time, y = householdpower$Global_active_power, type = "l", 
    xlab = "", ylab = "Global Active Power")

## Create the 'Energy sub metering' vs 'Time' plot
plot(x = householdpower$Date_Time, y = householdpower$Sub_metering_1, type = "n", 
    xlab = "", ylab = "Energy sub metering")
lines(x = householdpower$Date_Time, y = householdpower$Sub_metering_1)
lines(x = householdpower$Date_Time, y = householdpower$Sub_metering_2, col = "red")
lines(x = householdpower$Date_Time, y = householdpower$Sub_metering_3, col = "blue")
legend("topright", legend = c("Sub_metering_1", "Sub_metering_2", "Sub_metering_3"), 
    lty = 1, col = c("black", "red", "blue"), bty = "n")

## Create the 'Voltage' vs 'Time' plot
plot(x = householdpower$Date_Time, y = householdpower$Voltage, type = "l", xlab = "datetime", 
    ylab = "Voltage")

## Create the 'Global_reactive_power' vs 'Time' plot
plot(x = householdpower$Date_Time, y = householdpower$Global_reactive_power, 
    type = "l", xlab = "datetime", ylab = "Global Reactive Power")
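
To save a multi-panel figure like this one to a file, one option is to open a graphics device before plotting and close it afterwards. The sketch below is a minimal illustration under stated assumptions: the PNG format, the file name `four_panels.png`, the 480 x 480 pixel size and the placeholder series are all choices made here, not taken from the original analysis.

```r
# Hypothetical output path; tempdir() keeps the example self-contained
out_file <- file.path(tempdir(), "four_panels.png")

# Open a PNG device; everything plotted until dev.off() goes to the file
png(out_file, width = 480, height = 480)

# Layout must be set after the device is opened
par(mfcol = c(2, 2))

# Placeholder series standing in for the household measurements
x <- seq(as.POSIXct("2007-02-01 00:00:00", tz = "UTC"), by = "min", length.out = 60)
y <- sin(seq_along(x) / 10)
for (i in 1:4) {
    plot(x, y, type = "l", xlab = "", ylab = paste("Panel", i))
}

# Close the device so the file is written to disk
invisible(dev.off())
print(file.exists(out_file))
```

Setting `par()` after `png()` matters because layout settings belong to the active device, not to the session.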

Results

Power usage in one household was analyzed over a 2-day period. Kitchen consumption (Sub_metering_1) rose sharply for a brief time during those two days, possibly because the dishwasher was running.
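
One way to pin down when such a spike occurs is to look up the row with the largest kitchen reading and the time range during which readings exceed a threshold. The sketch below uses a small made-up data frame named `toy` in place of `householdpower`; its values and the threshold of 10 Wh are illustrative assumptions, not results from the dataset.

```r
# Made-up data with the same column names as the analysis data frame
toy <- data.frame(
    Date_Time = seq(as.POSIXct("2007-02-01 00:00:00", tz = "UTC"),
                    by = "min", length.out = 6),
    Sub_metering_1 = c(0, 0, 1, 37, 36, 0)
)

# Row holding the largest kitchen reading
peak <- toy[which.max(toy$Sub_metering_1), ]
print(peak)

# Time stamps at which the reading exceeds an arbitrary threshold
threshold <- 10
active <- toy$Date_Time[toy$Sub_metering_1 > threshold]
print(range(active))
```

Applied to `householdpower`, the same two steps would give the time of the peak and the approximate duration of the kitchen activity.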

Reference

Link to data source: https://archive.ics.uci.edu/ml/machine-learning-databases/00235/