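This analysis explores the NOAA Storm Database to answer two questions: which types of events are most harmful to population health, and which types of events have the greatest economic consequences.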

The analysis consists of two steps:

Data Processing

Load the data

First we load the data.

if(!exists("storm_data")) {
        storm_data <- read.csv("repdata_data_StormData.csv")
}

Calculate the number of people whose health was affected

We add fatalities and injuries to get the total number of people whose health was affected by each recorded event.

touched <- storm_data[c("EVTYPE", "FATALITIES", "INJURIES")]
touched$total <- touched$FATALITIES + touched$INJURIES

We sum the number by the type of event.

touched_by_event <- aggregate(total ~ EVTYPE, data = touched, sum)

We sort the dataset by decreasing number of affected people.

touched_sorted <- touched_by_event[order(-touched_by_event$total),]

and take the first 10 rows.

most_harmful <- touched_sorted[1:10,]
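
The aggregate, sort, and truncate steps above can be checked on a small hand-made table. The sketch below uses made-up numbers (not taken from the NOAA database), and the `toy_*` names are hypothetical. It uses `head()` rather than `[1:10,]` only because `head()` does not pad the result with `NA` rows when fewer rows are available.

```r
# Made-up toy data, not taken from the NOAA database
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT"),
  FATALITIES = c(2, 1, 0, 4),
  INJURIES   = c(10, 0, 5, 1)
)

# Total number of affected people per record
toy$total <- toy$FATALITIES + toy$INJURIES

# Sum per event type: TORNADO = 17, FLOOD = 1, HEAT = 5
toy_by_event <- aggregate(total ~ EVTYPE, data = toy, sum)

# Sort by decreasing total and keep the top two event types
toy_sorted <- toy_by_event[order(-toy_by_event$total), ]
toy_top <- head(toy_sorted, 2)
```

Here `toy_top` contains TORNADO (17) followed by HEAT (5).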

Calculate the damage

First we define a function to convert the damage exponents into a number:

convert_exp_to_number <- function(x) {
        switch (x,
                "B" = 1000000000,
                "8" = 100000000,
                "7" = 10000000,
                "M" = 1000000,
                "m" = 1000000,
                "6" = 1000000,
                "5" = 100000,
                "4" = 10000,
                "K" = 1000,
                "k" = 1000,
                "3" = 1000,
                "H" = 100,
                "h" = 100,
                "2" = 100,
                "1" = 10,
                0
        )
}
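
As an aside, the same mapping could also be written as a named-vector lookup, which converts a whole column in one call instead of one value at a time. This is only a sketch of an alternative, not the method used in this analysis. The names `exp_multipliers` and `lookup_multiplier` are hypothetical, and unmatched symbols are assumed to fall back to 0, as in the `switch()` default above.

```r
# Hypothetical lookup table with the same symbol-to-multiplier mapping as the switch()
exp_multipliers <- c(
  B = 1e9, "8" = 1e8, "7" = 1e7,
  M = 1e6, m = 1e6, "6" = 1e6,
  "5" = 1e5, "4" = 1e4,
  K = 1e3, k = 1e3, "3" = 1e3,
  H = 1e2, h = 1e2, "2" = 1e2,
  "1" = 10
)

# Hypothetical helper: converts a vector of exponent symbols in one step
lookup_multiplier <- function(exp_symbols) {
  # Indexing by name returns NA for any symbol that is not in the table
  mult <- unname(exp_multipliers[as.character(exp_symbols)])
  # Unmatched symbols (blank, "?", "+", "0", ...) fall back to 0
  mult[is.na(mult)] <- 0
  mult
}

# "K" -> 1000, "m" -> 1e6, "B" -> 1e9, blank and "?" -> 0
example_multipliers <- lookup_multiplier(c("K", "m", "B", "", "?"))
```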

We create a data frame with the damage columns:

damage <- storm_data[c("EVTYPE", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]

and use the conversion function to convert the exponents into numbers:

prodmgmult = sapply(as.character(damage$PROPDMGEXP), convert_exp_to_number)
cropdmgmult = sapply(as.character(damage$CROPDMGEXP), convert_exp_to_number)

Now we can calculate the total damage:

damage$total <- damage$PROPDMG * prodmgmult  + damage$CROPDMG * cropdmgmult
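
To make the arithmetic concrete, here is a small worked example on made-up rows (not taken from the NOAA database). The multipliers are written out by hand, and the `toy_damage` table with its `prop_mult` and `crop_mult` columns is hypothetical. The point is that each damage figure is scaled by the multiplier from its own exponent column before the two are added.

```r
# Made-up toy rows; multipliers are written out by hand for clarity
toy_damage <- data.frame(
  PROPDMG   = c(25, 1.5),
  prop_mult = c(1e3, 1e6),  # property exponents "K" and "M"
  CROPDMG   = c(2, 0),
  crop_mult = c(1e3, 0)     # crop exponents "K" and blank (treated as 0)
)

# Row 1: 25 * 1e3 + 2 * 1e3 = 27000
# Row 2: 1.5 * 1e6 + 0 * 0  = 1500000
toy_damage$total <- toy_damage$PROPDMG * toy_damage$prop_mult +
  toy_damage$CROPDMG * toy_damage$crop_mult
```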

We sum the damage by the type of event.

damage_by_event <- aggregate(total ~ EVTYPE, data = damage, sum)

We sort the dataset by decreasing total damage.

damage_sorted <- damage_by_event[order(-damage_by_event$total),]

and take the first 10 rows.

highest_damage <- damage_sorted[1:10,]

Results

Which types of events are most harmful with respect to population health?

Which types of events have the greatest economic consequences?