This analysis explores the NOAA Storm Database to answer the following questions:
The analysis consists of two steps:
First we load the data.
if (!exists("storm_data")) {
  storm_data <- read.csv("repdata_data_StormData.csv")
}
We add fatalities and injuries to get the total number of people whose health was affected.
touched <- storm_data[c("EVTYPE", "FATALITIES", "INJURIES")]
touched$total <- touched$FATALITIES + touched$INJURIES
We sum this total by event type.
touched_by_event <- aggregate(total ~ EVTYPE, data = touched, sum)
We sort the result by decreasing number of affected people
touched_sorted <- touched_by_event[order(-touched_by_event$total),]
and take the first 10 rows.
most_harmful <- touched_sorted[1:10,]
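The steps above can be tried on a small made-up data frame before running them on the full dataset. This is only a sketch: the `toy` data frame and its values are invented for illustration and do not come from the NOAA data.

```r
# Toy data standing in for storm_data (values are made up for illustration)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(2, 0, 1, 4, 1),
  INJURIES   = c(10, 3, 5, 0, 2)
)

# Same steps as above: total per row, sum per event type, sort, keep the top rows
toy$total <- toy$FATALITIES + toy$INJURIES
toy_by_event <- aggregate(total ~ EVTYPE, data = toy, sum)
toy_sorted <- toy_by_event[order(-toy_by_event$total), ]
toy_top <- head(toy_sorted, 2)
print(toy_top)
```

Here `head(toy_sorted, 2)` plays the same role as indexing with `[1:10,]`, with the difference that it does not produce `NA` rows when fewer rows are available.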
First we define a function to convert the damage exponents into a number:
convert_exp_to_number <- function(x) {
  switch(x,
    "B" = 1000000000,
    "8" = 100000000,
    "7" = 10000000,
    "M" = 1000000,
    "m" = 1000000,
    "6" = 1000000,
    "5" = 100000,
    "4" = 10000,
    "K" = 1000,
    "k" = 1000,
    "3" = 1000,
    "H" = 100,
    "h" = 100,
    "2" = 100,
    "1" = 10,
    0
  )
}
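As an alternative to `switch`, the same mapping could be written as a named vector and applied to a whole column at once. This is a sketch of a different technique, not the function used in this analysis: the names `exp_multiplier` and `lookup_exp` are hypothetical, and unknown or empty codes are assumed to map to 0, as in the default branch above.

```r
# Hypothetical lookup table mirroring the switch() cases above
exp_multiplier <- c(
  "B" = 1e9, "8" = 1e8, "7" = 1e7,
  "M" = 1e6, "m" = 1e6, "6" = 1e6,
  "5" = 1e5, "4" = 1e4,
  "K" = 1e3, "k" = 1e3, "3" = 1e3,
  "H" = 1e2, "h" = 1e2, "2" = 1e2,
  "1" = 10
)

# Vectorised conversion: index the table by code, then replace
# codes that are not in the table (NA after indexing) with 0
lookup_exp <- function(codes) {
  mult <- unname(exp_multiplier[as.character(codes)])
  mult[is.na(mult)] <- 0
  mult
}

example_mult <- lookup_exp(c("K", "m", "", "?", "5"))
print(example_mult)
```

Because the lookup works on a whole vector, it needs no `sapply` call, which may be noticeably faster on a dataset of this size.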
We create a data frame with the damage columns:
damage <- storm_data[c("EVTYPE", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]
and use the conversion function to convert the exponents into numbers
prodmgmult <- sapply(as.character(damage$PROPDMGEXP), convert_exp_to_number)
cropdmgmult <- sapply(as.character(damage$CROPDMGEXP), convert_exp_to_number)
Now we can calculate the total damage:
damage$total <- damage$PROPDMG * prodmgmult + damage$CROPDMG * cropdmgmult
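To make the formula concrete, here is a small worked example with invented values: a record with property damage 25 and exponent "K", and crop damage 1.5 and exponent "M". The variable names are hypothetical and the multipliers are written out by hand rather than taken from the dataset.

```r
# Made-up record: 25 K of property damage, 1.5 M of crop damage
prop_dmg  <- 25
prop_mult <- 1000      # multiplier for exponent "K"
crop_dmg  <- 1.5
crop_mult <- 1000000   # multiplier for exponent "M"

# Same formula as above: each amount is scaled by its own multiplier
example_total <- prop_dmg * prop_mult + crop_dmg * crop_mult
print(example_total)
```

The property part contributes 25,000 and the crop part 1,500,000, so each amount has to be scaled by the multiplier of its own exponent column.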
We sum the damage by the type of event.
damage_by_event <- aggregate(total ~ EVTYPE, data = damage, sum)
We sort the result by decreasing total damage
damage_sorted <- damage_by_event[order(-damage_by_event$total),]
and take the first 10 rows.
highest_damage <- damage_sorted[1:10,]