Introduction

The Individual household electric power consumption Data Set is available in the UCI Machine Learning Repository. It contains 2,075,259 instances and 9 attributes, and this large number of instances makes the analysis both more precise and more challenging. The dataset records the electric power consumption of a single household over a period of almost 4 years, sampled once per minute.

Objective

Our objective is to analyse the power consumption of an individual household over a period of almost 4 years, from December 2006 to November 2010. We can find the periods of peak usage as well as the periods when no electricity was consumed at all. We can plot various graphs to explore the data and then build predictive analyses on it. The dataset lends itself to both regression and clustering.
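As a minimal sketch of the first two goals, the block below finds the peak reading and the minutes with zero consumption. It runs on a small made-up data frame (the `readings` object and its values are hypothetical, not taken from the dataset); only the column name Global_active_power follows the dataset.

```r
# Hypothetical toy data standing in for the real readings (values are made up).
readings <- data.frame(
  timestamp = as.POSIXct("2006-12-16 17:24:00", tz = "UTC") + 60 * 0:5,
  Global_active_power = c(4.216, 5.360, 0, 5.388, 0, 3.520)
)

# Row with the highest minute-averaged active power.
peak <- readings[which.max(readings$Global_active_power), ]

# Minutes in which no active power was drawn; which() skips missing values.
idle <- readings[which(readings$Global_active_power == 0), ]

print(peak)
print(idle$timestamp)
```

On the full dataset the same two expressions would apply once the power column has been converted to numeric.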

Description of the Dataset

1.date: Date in format dd/mm/yyyy

2.time: time in format hh:mm:ss

3.global_active_power: household global minute-averaged active power (in kilowatt). Global active power is the real power drawn by the whole household, that is, by the sub-metered appliances and all other electrical equipment together. It is also called wattful power.

4.global_reactive_power: household global minute-averaged reactive power (in kilowatt). Reactive power bounces back and forth between the source and the load without being consumed. It is the imaginary component of the power consumption and is also called wattless power.

5.voltage: minute-averaged voltage (in volt)

6.global_intensity: household global minute-averaged current intensity (in ampere). Intensity is the magnitude of the current drawn by the household, also called the strength of the current.

7.sub_metering_1: energy sub-metering No. 1 (in watt-hour of active energy). It corresponds to the kitchen, containing mainly a dishwasher, an oven and a microwave (hot plates are not electric but gas powered).

8.sub_metering_2: energy sub-metering No. 2 (in watt-hour of active energy). It corresponds to the laundry room, containing a washing-machine, a tumble-drier, a refrigerator and a light.

9.sub_metering_3: energy sub-metering No. 3 (in watt-hour of active energy). It corresponds to an electric water-heater and an air-conditioner.
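The active power and the three sub-meters use different units (kilowatt versus watt-hour), so comparing them needs a conversion. The sketch below works through one minute with illustrative numbers: it converts the minute-averaged power into watt-hours and subtracts the three sub-meter readings to get the energy used by equipment that is not sub-metered. The variable names `total_wh` and `unmetered_wh` are introduced here for illustration only.

```r
# One example minute; the numbers are illustrative.
global_active_power <- 4.216  # kW, averaged over the minute
sub_metering_1 <- 0           # Wh, kitchen
sub_metering_2 <- 1           # Wh, laundry room
sub_metering_3 <- 17          # Wh, water heater and air conditioner

# kW averaged over one minute -> Wh used in that minute:
# kW * 1000 W/kW * (1/60) h
total_wh <- global_active_power * 1000 / 60

# Energy used by everything that is not on one of the three sub-meters.
unmetered_wh <- total_wh - sub_metering_1 - sub_metering_2 - sub_metering_3
print(round(unmetered_wh, 2))  # [1] 52.27
```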

Models we can use

With this dataset we can use both Regression and Clustering.

1. Regression: We can use the graphs to predict what the power consumption will be for the next hour. Since electricity consumption is a numeric quantity, a regression model is appropriate.

2. Clustering: We can also divide the data into smaller groups. For example, the electricity consumption data for 3 months can be clustered so that we obtain a more effective predictive model.
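Both ideas can be sketched on synthetic data. The block below is a minimal illustration, not the model used in this analysis. It assumes the minute readings have already been averaged per hour, invents 30 days of hourly values, fits a linear regression with lm() that predicts the next hour from the current hour and the same hour on the previous day, and then groups the daily profiles with kmeans(). The choice of predictors, of two clusters, and of daily rather than 3-month groups are all assumptions made for the example.

```r
set.seed(1)  # reproducible toy data

# Hypothetical hourly mean active power (kW) for 30 days: daily cycle plus noise.
hours  <- 0:(24 * 30 - 1)
hourly <- 1.2 + 0.8 * sin(2 * pi * (hours %% 24) / 24) +
  rnorm(length(hours), sd = 0.1)
n <- length(hourly)

# Regression: one training row per hour i = 26..n, with the value at i as target,
# the value at i - 1 as "current" and the value at i - 24 as "yesterday".
train <- data.frame(
  next_hour = hourly[26:n],
  current   = hourly[25:(n - 1)],
  yesterday = hourly[2:(n - 24)]
)
fit <- lm(next_hour ~ current + yesterday, data = train)

# Forecast for hour n + 1, using hour n and hour (n + 1) - 24.
forecast <- predict(
  fit,
  newdata = data.frame(current = hourly[n], yesterday = hourly[n - 23])
)
print(forecast)

# Clustering: one row per day, 24 hourly values per row.
profiles <- matrix(hourly, ncol = 24, byrow = TRUE)
clusters <- kmeans(profiles, centers = 2)
print(table(clusters$cluster))
```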

Reading the csv file

Elec <- read.csv("C:\\Users\\Damodaran\\Desktop\\New folder\\Electricity Dataset.csv", header = TRUE)
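The call above reads a CSV copy of the data. If one works instead from the text file as distributed by the repository, the fields are separated by semicolons and missing readings are marked with "?", so the call needs `sep` and `na.strings`. A minimal sketch, using three inline lines in that layout rather than the real file (the object names `raw` and `Elec_raw` are hypothetical, and the rows are illustrative):

```r
# Header plus two rows in the semicolon-separated layout; the second row
# shows a minute with missing readings.
raw <- "Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3
16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000
28/4/2007;00:21:00;?;?;?;?;?;?;"

# Treat both "?" and empty fields as NA so the measurement columns parse as numbers.
Elec_raw <- read.csv(text = raw, sep = ";", na.strings = c("?", ""),
                     stringsAsFactors = FALSE)
str(Elec_raw)
```

Declaring the missing-value marker at read time means the measurement columns arrive as numeric, so no later conversion is needed for them.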

Processing of data

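# Keep only the readings for the two dates of interest
# (the dd-mm-yyyy strings are assumed to match how dates are stored in this CSV file)
Attrib <- Elec[(Elec$Date=="16-12-2006") | (Elec$Date=="13-12-2008"),]
# Convert via as.character() so that the values themselves are parsed as numbers
Attrib$Global_active_power <- as.numeric(as.character(Attrib$Global_active_power))
Attrib$Global_reactive_power <- as.numeric(as.character(Attrib$Global_reactive_power))
Attrib$Voltage <- as.numeric(as.character(Attrib$Voltage))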
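The conversion goes through as.character() because as.numeric() applied directly to a factor returns the internal level codes rather than the printed values. The sketch below shows the difference on a made-up vector, assuming the column was read as a factor. In R 4.0.0 and later, read.csv() returns character columns by default, in which case the extra as.character() is harmless.

```r
# A column read as a factor because it contains the non-numeric marker "?".
x <- factor(c("4.216", "0.418", "?", "4.216"))

# Level codes (small integers), not the readings.
codes <- as.numeric(x)

# The readings themselves; "?" cannot be parsed and becomes NA (with a warning,
# suppressed here to keep the output clean).
vals <- suppressWarnings(as.numeric(as.character(x)))

print(codes)
print(vals)
```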

Histogram on Global Active Power

plot1 <- function() {
    out_file <- "1_Hist.png"  # name of the PNG written to the working directory
    hist(Attrib$Global_active_power, main = "Global Active Power", col="yellow", xlab="Global Active Power (kilowatts)")
    # Copy the on-screen plot to a PNG device, then close that device to write the file
    dev.copy(png, file=out_file, width=480, height=480)
    dev.off()
    cat(out_file, "has been saved in", getwd(), "\n")
}
plot1()
## 1_Hist.png has been saved in C:/Users/Damodaran/Desktop

Histogram on Global Reactive Power

plot2 <- function() {
    out_file <- "2_Hist.png"  # name of the PNG written to the working directory
    hist(Attrib$Global_reactive_power, main = "Global Reactive Power", col="Green", xlab="Global Reactive Power (kilowatts)")
    # Copy the on-screen plot to a PNG device, then close that device to write the file
    dev.copy(png, file=out_file, width=480, height=480)
    dev.off()
    cat(out_file, "has been saved in", getwd(), "\n")
}
plot2()
## 2_Hist.png has been saved in C:/Users/Damodaran/Desktop

Histogram on Voltage

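plot3 <- function() {
    out_file <- "3_Hist.png"  # name of the PNG written to the working directory
    hist(Attrib$Voltage, main = "Voltage (minute averaged)", col="Red", xlab="Voltage (volt)")
    # Copy the on-screen plot to a PNG device, then close that device to write the file
    dev.copy(png, file=out_file, width=480, height=480)
    dev.off()
    cat(out_file, "has been saved in", getwd(), "\n")
}
plot3()
## 3_Hist.png has been saved in C:/Users/Damodaran/Desktop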
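The three plotting functions differ only in the column, title, colour and file name, so they could be folded into one helper. The sketch below is one possible variant, not the approach used above: the helper name `save_hist` and its parameters are hypothetical, it opens the PNG device directly with png() instead of copying from the screen device, and it is exercised on made-up voltage values written to a temporary directory.

```r
# Hypothetical helper: draws a histogram straight into a PNG file
# and returns the file path invisibly.
save_hist <- function(values, title, colour, xlab, file) {
  png(filename = file, width = 480, height = 480)
  on.exit(dev.off())  # close the device even if hist() fails
  hist(values, main = title, col = colour, xlab = xlab)
  invisible(file)
}

# Made-up voltage readings so the sketch runs without the dataset.
set.seed(42)
toy_voltage <- rnorm(500, mean = 240, sd = 3)

out <- save_hist(toy_voltage, "Voltage (minute averaged)", "red",
                 "Voltage (volt)", file.path(tempdir(), "3_Hist.png"))
cat(basename(out), "has been saved in", dirname(out), "\n")
```

Writing to the file device directly avoids depending on an open screen device, which matters when the script runs non-interactively.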