Synopsis

Data Processing

Read the data from the csv.bz2 file.

storms <- read.csv(bzfile("repdata-data-StormData.csv.bz2", "r"))
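The read step can be tried on a small file before loading the full data set. The following is a minimal sketch of a round trip through a bzip2 connection; the toy data frame, its values, and the temporary path are illustrative assumptions, not part of the storm data.

```r
# Toy data frame standing in for the storm data (values are made up).
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD", "TORNADO"),
                  FATALITIES = c(2, 0, 1),
                  INJURIES = c(10, 3, 0))

# Write the CSV through a bzip2 connection to a temporary file.
path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(path, "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv opens an unopened connection itself and closes it when done.
toy_back <- read.csv(bzfile(path), stringsAsFactors = FALSE)
str(toy_back)
```

Passing `stringsAsFactors = FALSE` keeps `EVTYPE` as character regardless of the R version; the grouping functions used later accept either characters or factors.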

Create a new variable that sums FATALITIES and INJURIES, as a measure of the impact on population health in the USA.

# New variable: people affected per record (fatalities plus injuries)
storms$People_affected <- with(storms, FATALITIES + INJURIES)
# Sum by event type and sort in decreasing order.
total_PA_Event <- sort(with(storms, tapply(People_affected, EVTYPE, sum)),
                       decreasing = TRUE)
# Build a dataset to use in ggplot graphics
data1 <- data.frame(Event = factor(names(total_PA_Event),
                                   levels = names(total_PA_Event)),
                    Affected = total_PA_Event)
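To see what the aggregation and the factor construction do, here is a small sketch on made-up data. The toy values and the names `toy`, `totals`, and `ranked` are assumptions for illustration only.

```r
# Made-up records: two FLOOD rows, two TORNADO rows, one HEAT row.
toy <- data.frame(EVTYPE = c("FLOOD", "TORNADO", "HEAT", "TORNADO", "FLOOD"),
                  FATALITIES = c(1, 5, 2, 3, 0),
                  INJURIES = c(4, 20, 1, 10, 2))
toy$People_affected <- toy$FATALITIES + toy$INJURIES

# tapply sums within each event type; sort orders the totals, largest first.
totals <- sort(tapply(toy$People_affected, toy$EVTYPE, sum), decreasing = TRUE)

# Setting the factor levels to the sorted names fixes the bar order in a plot;
# without it the levels would default to alphabetical order.
ranked <- data.frame(Event = factor(names(totals), levels = names(totals)),
                     Affected = as.vector(totals))
print(ranked)
```

The explicit `levels` argument is what makes a later bar chart show events from the most to the least affected instead of alphabetically.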

For the question about the economic consequences, a similar transformation is applied to the data.

# Sum the property and crop damage figures.
storms$Damage <- with(storms, PROPDMG + CROPDMG)
# Total damage for each event type, sorted in decreasing order.
total_EC_Event <- sort(with(storms, tapply(Damage, EVTYPE, sum)),
                       decreasing = TRUE)
# Build a data set for the ggplot graphics.
data2 <- data.frame(Event = factor(names(total_EC_Event),
                                   levels = names(total_EC_Event)),
                    Damage = total_EC_Event)
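The sum above adds the raw PROPDMG and CROPDMG figures. In the NOAA storm data these figures are normally paired with magnitude columns (assumed here to be named PROPDMGEXP and CROPDMGEXP, with codes such as "K", "M", and "B"). If that holds for this file, the raw sum mixes units. The sketch below shows one possible way to scale the figures first; the helper `exp_to_multiplier`, the code-to-multiplier mapping, and the toy values are assumptions, not part of the original analysis.

```r
# Hypothetical helper: map a magnitude code to a numeric multiplier.
# Assumption: H = hundreds, K = thousands, M = millions, B = billions;
# any other code (blank, digits, symbols, NA) is treated as 1.
exp_to_multiplier <- function(code) {
  code <- toupper(as.character(code))
  mult <- rep(1, length(code))
  mult[code %in% "H"] <- 1e2
  mult[code %in% "K"] <- 1e3
  mult[code %in% "M"] <- 1e6
  mult[code %in% "B"] <- 1e9
  mult
}

# Made-up records to exercise the helper.
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD", "HAIL"),
                  PROPDMG = c(25, 2.5, 10),
                  PROPDMGEXP = c("K", "M", ""),
                  CROPDMG = c(0, 1, 5),
                  CROPDMGEXP = c("", "B", "k"),
                  stringsAsFactors = FALSE)

# Damage in dollars after scaling each figure by its own multiplier.
toy$Damage_usd <- with(toy,
                       PROPDMG * exp_to_multiplier(PROPDMGEXP) +
                       CROPDMG * exp_to_multiplier(CROPDMGEXP))
print(toy[, c("EVTYPE", "Damage_usd")])
```

With scaled values, the same `tapply` and `sort` steps could be applied to `Damage_usd` in place of `Damage`; whether that changes the ranking for the real data would need to be checked against the full file.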

Results

Here are the events that affected more than 5,000 people, counting both injuries and fatalities.

require(ggplot2)
## Loading required package: ggplot2
g1 <- ggplot(subset(data1, Affected > 5000), aes(Event, Affected))
g1 + geom_bar(stat = "identity") +
  theme(axis.text.x=element_text(angle=-90))

[Figure: bar chart of Affected by Event, for events with more than 5,000 people affected]

The tornado is the most harmful event in the USA in terms of fatalities and injuries.

Now for the economic consequences.

require(ggplot2)
g2 <- ggplot(subset(data2, Damage > 500000), aes(Event, Damage / 1000))
g2 + geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = -90)) +
  ylab("Damage in thousands of dollars")

[Figure: bar chart of Damage (in thousands of dollars) by Event, for events with Damage above 500,000]

Again, the tornado is the most costly event in the USA.