Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Using the storm database of the U.S. National Oceanic and Atmospheric Administration (NOAA), this analysis determines which event types cause the most economic damage and the most damage to public health.
Fatalities are caused mostly by tornados and excessive heat, while injuries are caused mostly by tornados. Economic damage, the sum of crop and property damage, is caused mostly by floods, followed by hurricanes/typhoons, tornados and storm surges.
First, required libraries are loaded.
library(knitr)
library(ggplot2)
library(data.table)
library(dplyr)
library(gridExtra)
Then the storm data file is read as a CSV file, and only the columns required for the analysis are kept.
stormdata <- read.csv("repdata-data-StormData.csv.bz2")
stormdata <- stormdata[, c('EVTYPE', 'FATALITIES', 'INJURIES', 'PROPDMG',
                           'PROPDMGEXP', 'CROPDMG', 'CROPDMGEXP')]
The EVTYPE column, which stands for event type, can be loaded as a factor. The magnitudes of the PROPDMG and CROPDMG columns are determined by PROPDMGEXP and CROPDMGEXP, respectively. In the two EXP columns, K stands for thousands, M for millions and B for billions; values with any other code are left unscaled.
The columns PROPDMG and CROPDMG are transformed to plain USD amounts using their EXP columns as described above.
stormdata$EVTYPE <- as.factor(stormdata$EVTYPE)
stormdata$PROPDMG <- ifelse(stormdata$PROPDMGEXP == 'K',
                            stormdata$PROPDMG*1000,
                            ifelse(stormdata$PROPDMGEXP == 'M',
                                   stormdata$PROPDMG*1000000,
                                   ifelse(stormdata$PROPDMGEXP == 'B',
                                          stormdata$PROPDMG*1000000000,
                                          stormdata$PROPDMG)))
stormdata$CROPDMG <- ifelse(stormdata$CROPDMGEXP == 'K',
                            stormdata$CROPDMG*1000,
                            ifelse(stormdata$CROPDMGEXP == 'M',
                                   stormdata$CROPDMG*1000000,
                                   ifelse(stormdata$CROPDMGEXP == 'B',
                                          stormdata$CROPDMG*1000000000,
                                          stormdata$CROPDMG)))
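The nested ifelse() calls can also be written as a lookup table, which keeps the mapping from code to multiplier in one place. The following is a minimal sketch and not the code used for this report. The helper name scale_damage is hypothetical, and it assumes the same rule as the code above: only upper-case K, M and B are recognised, and every other code leaves the value unscaled.

```r
# Hypothetical helper (not part of the original analysis):
# scale damage values by the multiplier that their EXP code stands for.
scale_damage <- function(value, exp) {
  multipliers <- c(K = 1e3, M = 1e6, B = 1e9)
  # match() gives the position of each code in the table, or NA if absent
  idx <- match(as.character(exp), names(multipliers))
  m <- unname(multipliers[idx])
  # Codes that are not K, M or B leave the value unchanged
  m[is.na(m)] <- 1
  value * m
}

# Toy input, not taken from the storm data
scaled <- scale_damage(c(25, 2.5, 1, 10), c("K", "M", "B", ""))
print(scaled)
```

Under these assumptions, the helper could be used in place of the ifelse() calls, for example as scale_damage(stormdata$PROPDMG, stormdata$PROPDMGEXP). It should not be applied on top of them, since that would scale the values twice.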
The table and plots below show that tornados and excessive heat cause the most fatalities, while tornados cause the most injuries.
The data is analysed to determine which types of events are most harmful to public health and which cause the greatest economic damage.
First, the fatalities and injuries are summed by event type. Next, the two sums are added into a combined total, and the result is ordered in descending order to give a quick overview of the events that cause the most fatalities and injuries.
agg <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data=stormdata, sum)
totalAgg <- agg
totalAgg$TOTAL <- totalAgg$FATALITIES+totalAgg$INJURIES
totalAgg <- totalAgg[order(-totalAgg$TOTAL),]
totalAgg <- head(totalAgg, 10)
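To make the aggregation step concrete, here is a small sketch that applies the same aggregate() and order() pattern to a toy data frame. The data frame toy and its values are made up for illustration and do not come from the storm database.

```r
# Toy data (invented values), with the same column names as stormdata
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT"),
  FATALITIES = c(2, 1, 3, 4),
  INJURIES   = c(10, 0, 5, 1)
)

# Sum both columns per event type; cbind() on the left-hand side
# aggregates several columns in one call
toy_agg <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, sum)

# Combined total, sorted from largest to smallest
toy_agg$TOTAL <- toy_agg$FATALITIES + toy_agg$INJURIES
toy_agg <- toy_agg[order(-toy_agg$TOTAL), ]
print(toy_agg)
# TORNADO comes first (5 fatalities + 15 injuries = 20),
# then HEAT (5) and FLOOD (1)
```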
The contents of totalAgg can be seen below.
kable(totalAgg)
| | EVTYPE | FATALITIES | INJURIES | TOTAL |
|---|---|---|---|---|
| 830 | TORNADO | 5633 | 91346 | 96979 |
| 123 | EXCESSIVE HEAT | 1903 | 6525 | 8428 |
| 854 | TSTM WIND | 504 | 6957 | 7461 |
| 164 | FLOOD | 470 | 6789 | 7259 |
| 452 | LIGHTNING | 816 | 5230 | 6046 |
| 269 | HEAT | 937 | 2100 | 3037 |
| 147 | FLASH FLOOD | 978 | 1777 | 2755 |
| 424 | ICE STORM | 89 | 1975 | 2064 |
| 759 | THUNDERSTORM WIND | 133 | 1488 | 1621 |
| 972 | WINTER STORM | 206 | 1321 | 1527 |
fatalitiesOrdered <- agg[order(-agg$FATALITIES),]
fatalitiesOrdered <- head(fatalitiesOrdered, 10)
injuriesOrdered <- agg[order(-agg$INJURIES),]
injuriesOrdered <- head(injuriesOrdered, 10)
To get ordered bar plots, the EVTYPE factor levels of both tables are reordered by their fatality and injury counts.
fatalitiesOrdered$EVTYPE <- factor(fatalitiesOrdered$EVTYPE,
                                   levels=fatalitiesOrdered[
                                     order(fatalitiesOrdered$FATALITIES), "EVTYPE"])
injuriesOrdered$EVTYPE <- factor(injuriesOrdered$EVTYPE,
                                 levels=injuriesOrdered[
                                   order(injuriesOrdered$INJURIES), "EVTYPE"])
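The reason for rebuilding the factor is that ggplot2 draws a discrete axis in the order of the factor levels, not in the order of the rows. The sketch below shows the effect on the levels for three event types, using fatality counts from the table above. The data frame counts is a hypothetical stand-in for fatalitiesOrdered, and the reorder() call is shown only as a possible base R alternative, not as what this report uses.

```r
# Three rows with fatality counts taken from the table above
counts <- data.frame(
  EVTYPE     = c("FLOOD", "TORNADO", "HEAT"),
  FATALITIES = c(470, 5633, 937)
)

# Same pattern as in the report: levels sorted by ascending count
counts$EVTYPE <- factor(counts$EVTYPE,
                        levels = counts[order(counts$FATALITIES), "EVTYPE"])
print(levels(counts$EVTYPE))
# "FLOOD" "HEAT" "TORNADO"

# Possible alternative: stats::reorder() sorts levels by a summary of a
# second variable (the mean by default)
alt <- reorder(c("FLOOD", "TORNADO", "HEAT"), c(470, 5633, 937))
print(levels(alt))
```

With the levels in ascending order, the largest bar is drawn last, which puts it at the top of the plot once the axes are flipped with coord_flip().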
Furthermore, plots of top fatalities and injuries by event type are created.
fatalitiesPlot <- ggplot(fatalitiesOrdered, aes(x=EVTYPE, y=FATALITIES)) +
  geom_bar(stat='identity', position='dodge') +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ggtitle('Top total fatalities by event') +
  coord_flip()
injuriesPlot <- ggplot(injuriesOrdered, aes(x=EVTYPE, y=INJURIES)) +
  geom_bar(stat='identity') +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ggtitle('Top total injuries by event') +
  coord_flip()
grid.arrange(fatalitiesPlot, injuriesPlot, ncol=2)
The left plot shows the top total fatalities by event type and the right plot shows the top total injuries by event type. In each plot, the horizontal axis shows the number of fatalities or injuries and the vertical axis shows the type of event that caused them.
The damage is aggregated by event type and then the property damage and crop damage are summed to get the total economic damage by event type.
economic <- aggregate(cbind(PROPDMG, CROPDMG) ~ EVTYPE, data=stormdata, sum)
economic$TOTAL <- economic$PROPDMG + economic$CROPDMG
economic <- economic[order(-economic$TOTAL),]
economic <- head(economic, 10)
The contents of the variable economic can be shown in a table.
kable(economic)
| | EVTYPE | PROPDMG | CROPDMG | TOTAL |
|---|---|---|---|---|
| 164 | FLOOD | 144657709807 | 5661968450 | 150319678257 |
| 406 | HURRICANE/TYPHOON | 69305840000 | 2607872800 | 71913712800 |
| 830 | TORNADO | 56925660790 | 414953270 | 57340614060 |
| 666 | STORM SURGE | 43323536000 | 5000 | 43323541000 |
| 238 | HAIL | 15727367053 | 3025537890 | 18752904943 |
| 147 | FLASH FLOOD | 16140812067 | 1421317100 | 17562129167 |
| 88 | DROUGHT | 1046106000 | 13972566000 | 15018672000 |
| 397 | HURRICANE | 11868319010 | 2741910000 | 14610229010 |
| 586 | RIVER FLOOD | 5118945500 | 5029459000 | 10148404500 |
| 424 | ICE STORM | 3944927860 | 5022113500 | 8967041360 |
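The raw USD totals are hard to compare at a glance. As an illustration, the sketch below recomputes TOTAL for the first three rows of the table and expresses it in billions of USD. The data frame top3 and the column TOTAL_BN are hypothetical names introduced here; the input values are copied from the table.

```r
# First three rows of the table above (values copied from the report)
top3 <- data.frame(
  EVTYPE  = c("FLOOD", "HURRICANE/TYPHOON", "TORNADO"),
  PROPDMG = c(144657709807, 69305840000, 56925660790),
  CROPDMG = c(5661968450, 2607872800, 414953270)
)

# Total economic damage is property damage plus crop damage
top3$TOTAL <- top3$PROPDMG + top3$CROPDMG

# Same totals in billions of USD, rounded to one decimal
top3$TOTAL_BN <- round(top3$TOTAL / 1e9, 1)
print(top3[, c("EVTYPE", "TOTAL_BN")])
# FLOOD 150.3, HURRICANE/TYPHOON 71.9, TORNADO 57.3
```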
Next, the plot of top economic damage by event type is created.
economic$EVTYPE <- factor(economic$EVTYPE,
                          levels=economic[order(economic$TOTAL),
                                          "EVTYPE"])
ggplot(economic, aes(x=EVTYPE, y=TOTAL)) +
  geom_bar(stat='identity', position='dodge') +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ggtitle('Top economic damage by event type') +
  coord_flip()
This plot shows the total economic damage per event type. The horizontal axis shows the total damage in USD and the vertical axis (EVTYPE) shows the type of event that caused it.
As seen in the table and plot above, floods and hurricanes/typhoons cause the most economic damage, followed by tornados, storm surges and hail.