Analysis of Storm Data

To focus on high-impact events, the data is subset to events with large casualty counts or large recorded damage figures. Graphs are then created to show the harm to people and the damage to property.

Finally, we examine whether the events that cause the most harm to people change over time.

Set up the environment

library(lattice)
setwd("~/Coursera_DataScience/RR/Pa2")

Load the storm data

storm_data <- read.csv("repdata-data-StormData.csv")

Data Processing

Convert the begin and end dates from text to date-time values

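# strptime() returns POSIXlt; as.POSIXct() is the usual type for a data frame column
storm_data$BGN_DATE <- as.POSIXct(strptime(storm_data$BGN_DATE, format = "%m/%d/%Y %H:%M:%S"))
storm_data$END_DATE <- as.POSIXct(strptime(storm_data$END_DATE, format = "%m/%d/%Y %H:%M:%S"))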

Create a year column for later analysis

storm_data$YEAR <- format(storm_data$BGN_DATE, "%Y")
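
As a quick check of the date handling above, the same parsing can be tried on a couple of toy strings. This is only a sketch: the sample values are hypothetical and merely follow the "month/day/year hour:minute:second" layout implied by the format string, and the explicit `tz = "UTC"` is an assumption added to keep the result independent of the local time zone.

```r
# Hypothetical sample strings in the layout implied by the format string above
sample_dates <- c("4/18/1950 0:00:00", "11/28/2011 0:00:00")

# Parse the text; tz = "UTC" is an assumption to avoid local time zone effects
parsed <- as.POSIXct(strptime(sample_dates, format = "%m/%d/%Y %H:%M:%S", tz = "UTC"))

# format() with "%Y" returns the year as a character string
years <- format(parsed, "%Y")

# Convert to integer if the year is needed as a number rather than a label
years_num <- as.integer(years)
```

Note that the year produced this way is text, which is convenient as a panel label in a conditioned plot but needs `as.integer()` before any arithmetic.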

Subset the data to events with more than 100 injuries and fatalities combined

people_storm_data <- storm_data[storm_data$INJURIES + storm_data$FATALITIES > 100, ]

Subset the data to events whose recorded property and crop damage figures sum to more than 1000

cost_storm_data <- storm_data[storm_data$PROPDMG + storm_data$CROPDMG > 1000, ]
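
The threshold above is applied to the raw PROPDMG and CROPDMG figures. In the NOAA storm data these figures are usually paired with magnitude codes, so a dollar comparison may need them scaled first. The sketch below is not part of the original analysis: it assumes companion columns named PROPDMGEXP and CROPDMGEXP holding the codes K, M and B, uses hypothetical toy rows, and treats any other code as a multiplier of 1.

```r
# Hypothetical toy rows; PROPDMGEXP and CROPDMGEXP are assumed column names
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1),
  PROPDMGEXP = c("K", "M", "B"),
  CROPDMG    = c(0, 10, 0),
  CROPDMGEXP = c("", "K", ""),
  stringsAsFactors = FALSE
)

# Map a magnitude code to a multiplier.
# Unrecognised or empty codes fall back to 1 (an assumption, not a NOAA rule).
exp_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(as.character(code))]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Total damage in dollars for each toy row
toy$TOTAL_DMG <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP) +
  toy$CROPDMG * exp_multiplier(toy$CROPDMGEXP)
```

With a column like TOTAL_DMG, a cost threshold could be stated directly in dollars instead of in raw mantissa units.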

Results

Injuries and fatalities by event type, across all years

xyplot(people_storm_data$EVTYPE ~ people_storm_data$INJURIES + people_storm_data$FATALITIES)

[Figure: injuries and fatalities by event type, all years]

Has this changed over time?

xyplot(people_storm_data$EVTYPE ~ people_storm_data$INJURIES + people_storm_data$FATALITIES | 
    people_storm_data$YEAR)

[Figure: injuries and fatalities by event type, one panel per year]
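
A panel per year can be hard to read at a glance, so a small summary table is one way to check the same question numerically. The following is a minimal sketch on hypothetical toy rows, not the storm data itself; the HARM column and the `by_year` name are introduced here purely for illustration.

```r
# Hypothetical toy rows with the same column names as the storm data
toy <- data.frame(
  YEAR       = c("1990", "1990", "2000", "2000"),
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "TORNADO"),
  INJURIES   = c(100, 20, 150, 50),
  FATALITIES = c(10, 5, 20, 0),
  stringsAsFactors = FALSE
)

# Combined harm to people per event (illustrative helper column)
toy$HARM <- toy$INJURIES + toy$FATALITIES

# Sum the combined harm for each year and event type
by_year <- aggregate(HARM ~ YEAR + EVTYPE, data = toy, FUN = sum)

# For each year, pick the event type with the largest total
top_per_year <- sapply(split(by_year, by_year$YEAR),
                       function(d) d$EVTYPE[which.max(d$HARM)])
```

Applied to `people_storm_data`, the same pattern would list the most harmful event type for each year, which can then be compared across years.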

Cost overall

xyplot(cost_storm_data$EVTYPE ~ cost_storm_data$PROPDMG + cost_storm_data$CROPDMG)

[Figure: property and crop damage by event type, all years]