This project uses data from the UC Irvine Machine Learning Repository, a popular repository for machine learning datasets. In particular, I used the “Individual household electric power consumption Data Set,” which is available at the location below:
https://d396qusza40orc.cloudfront.net/exdata%2Fdata%2Fhousehold_power_consumption.zip
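If the text file is not already in the working directory, fetching it could look roughly like the sketch below. The helper name fetch_power_data and the local zip file name are my own placeholders, not part of the original analysis; download.file() and unzip() are standard R functions.

```r
# Hypothetical helper: download and unzip the data set only if it is missing.
# The function name and the zip file name are illustrative assumptions.
fetch_power_data <- function(
    url = "https://d396qusza40orc.cloudfront.net/exdata%2Fdata%2Fhousehold_power_consumption.zip",
    zip_path = "household_power_consumption.zip",
    txt_path = "household_power_consumption.txt") {
  if (!file.exists(txt_path)) {
    if (!file.exists(zip_path)) {
      # mode = "wb" keeps the binary zip intact on Windows
      download.file(url, destfile = zip_path, mode = "wb")
    }
    unzip(zip_path, exdir = dirname(txt_path))
  }
  invisible(txt_path)
}

# Usage (needs network access on the first call):
# fetch_power_data()
```

Skipping the download when the file already exists avoids re-fetching the archive on every run.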
I am looking at several common measures of power usage: global active power in kilowatts, frequency of use, and sub-metering.
Here I read the data into a table and then subset it to two specific days, February 1 and 2, 2007.
housepower <- read.table("household_power_consumption.txt", sep = ";", na.strings = "?", header = TRUE)
housepower$Date <- as.Date(housepower$Date, format="%d/%m/%Y")
febdata <- subset(housepower, subset=(Date >= "2007-02-01" & Date <= "2007-02-02"))
rm(housepower)
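# Date is already a Date object, so it pastes as YYYY-MM-DD next to the time string
datetime <- paste(febdata$Date, febdata$Time)
febdata$Datetime <- as.POSIXct(datetime, format = "%Y-%m-%d %H:%M:%S")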
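To make the parsing and subsetting steps easier to follow, here is a minimal sketch that runs the same logic on four made-up rows laid out like the real file (semicolon-separated, day/month/year dates, "?" for missing values). The values and the demo_rows and demo_feb names are illustrative assumptions, and I fix the time zone to UTC only so the result is reproducible.

```r
# Four made-up rows in the same layout as household_power_consumption.txt
demo_text <- "Date;Time;Global_active_power
31/1/2007;23:59:00;0.326
1/2/2007;00:00:00;?
2/2/2007;23:59:00;3.680
3/2/2007;00:00:00;3.614"

demo_rows <- read.table(text = demo_text, sep = ";", na.strings = "?",
                        header = TRUE, stringsAsFactors = FALSE)

# Dates are stored as day/month/year, so the format string must say so
demo_rows$Date <- as.Date(demo_rows$Date, format = "%d/%m/%Y")

# Keep only February 1 and 2, 2007
demo_feb <- subset(demo_rows,
                   Date >= as.Date("2007-02-01") & Date <= as.Date("2007-02-02"))

# Combine date and time into a single timestamp for plotting
demo_feb$Datetime <- as.POSIXct(paste(demo_feb$Date, demo_feb$Time),
                                format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

Only the two middle rows survive the subset, and the "?" entry is read as NA rather than turning the whole column into text.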
The overall goal here is to examine how household energy usage varies over a two-day period. The plots cover the frequency distribution of global active power, its use over the two days, and the sub-metering readings.
hist(febdata$Global_active_power, col = "red", main = "Global Active Power", xlab = "Global Active Power (kilowatts)")
plot(febdata$Global_active_power~febdata$Datetime, type="l", main = "Global Active Power Over Days", ylab="Global Active Power (kilowatts)", xlab="")
plot(febdata$Datetime, febdata$Sub_metering_1, type="l", main = "Sub-metering of Power Over Days", ylab="Energy sub metering", xlab="")
lines(febdata$Datetime, febdata$Sub_metering_2, col="red")
lines(febdata$Datetime, febdata$Sub_metering_3, col="blue")
legend("topright", col=c("black", "red", "blue"), lty=1, lwd=2, legend=c("Sub_metering_1", "Sub_metering_2", "Sub_metering_3"))
par(mfrow=c(2,2), mar=c(4,4,2,1))
plot(febdata$Global_active_power~febdata$Datetime, type="l", main = "Combined Measurements Over Days", ylab="Global Active Power (kilowatts)", xlab="")
plot(febdata$Voltage~febdata$Datetime, type="l", ylab="Voltage", xlab="datetime")
plot(febdata$Datetime, febdata$Sub_metering_1, type="l", ylab="Energy sub metering", xlab="")
lines(febdata$Datetime, febdata$Sub_metering_2, col="red")
lines(febdata$Datetime, febdata$Sub_metering_3, col="blue")
legend("topright", col=c("black", "red", "blue"), lty=1, lwd=2, legend=c("Sub_metering_1", "Sub_metering_2", "Sub_metering_3"), bty = "n")
plot(febdata$Global_reactive_power~febdata$Datetime, type="l", ylab="Global_reactive_power", xlab="datetime")
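One thing to keep in mind with the 2x2 panel is that par(mfrow = ...) stays in effect for later plots on the same device. The sketch below shows one way to save and restore the previous settings. It uses synthetic minute-level values in a data frame I call demo (not the real measurements) and a null PDF device so it runs without writing a file.

```r
# Synthetic stand-in data: two days of minute-level timestamps (2 * 1440 rows)
demo <- data.frame(
  Datetime = seq(as.POSIXct("2007-02-01 00:00:00", tz = "UTC"),
                 by = "min", length.out = 2880)
)
demo$Global_active_power <- 2 + sin(seq_len(2880) / 100)  # illustrative values only

pdf(NULL)  # null device: nothing is written to disk

# par() returns the previous values of the settings it changes
old_par <- par(mfrow = c(2, 2), mar = c(4, 4, 2, 1))
for (i in 1:4) {
  plot(demo$Datetime, demo$Global_active_power, type = "l",
       xlab = "", ylab = "Global Active Power (kilowatts)")
}
grid_layout <- par("mfrow")      # layout while the panel is active

par(old_par)                     # restore the earlier layout and margins
restored_layout <- par("mfrow")  # layout after restoring

invisible(dev.off())
```

After par(old_par), the next plot fills the whole device again instead of landing in one cell of the grid.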