1. Synopsis
NOAA stands for the National Oceanic and Atmospheric Administration, a United States agency that publishes weather data and research. Its public storm dataset records weather data per storm event, dating back over 50 years. These data show which weather events have occurred and what they imply for the safety and well-being of the surrounding communities. The harm to people, property, and crops can be summarised and visualised by storm event type.
In this dataset, some event types proved far more dangerous than others. The event type most harmful to public health is the tornado: after aggregating and visualising the data, tornadoes account for the highest number of fatalities (5,633). For economic damage, the costliest events are floods, droughts and hurricanes, but for different reasons. For example, the biggest risk to crops is drought, whereas the biggest threat to property is flooding.
2. Data Processing
2.1 Loading dataset and libraries
library(ggplot2)
library(dplyr)
library(knitr)
df <- read.csv('repdata_data_StormData.csv')
2.2 Selecting important attributes
new_df <- select(df,'EVTYPE','FATALITIES','INJURIES','PROPDMG','PROPDMGEXP','CROPDMG','CROPDMGEXP')
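Before recoding the exponent columns, it can help to list which codes actually occur, so that none is missed. The following is a minimal sketch on a small invented data frame (`toy` and its values are assumptions made for illustration, not rows from the storm dataset); the same calls could be applied to `new_df$PROPDMGEXP` and `new_df$CROPDMGEXP`.

```r
# Toy stand-in for the selected columns; values are invented for illustration.
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 0),
  PROPDMGEXP = c("K", "M", "", "?"),
  stringsAsFactors = FALSE
)

# Count how often each exponent code occurs, including the empty string.
exp_counts <- table(toy$PROPDMGEXP, useNA = "ifany")
print(exp_counts)

# List the distinct codes that the recoding step has to handle.
exp_codes <- unique(toy$PROPDMGEXP)
print(exp_codes)
```

Any code that appears here but is not handled during cleaning would leave an `NA` multiplier, and therefore an `NA` damage value.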
2.3 Cleaning the dataset
new_df$PROPEXP[new_df$PROPDMGEXP == "K"] <- 1000
new_df$PROPEXP[new_df$PROPDMGEXP == "M"] <- 1e+06
new_df$PROPEXP[new_df$PROPDMGEXP == ""] <- 1
new_df$PROPEXP[new_df$PROPDMGEXP == "B"] <- 1e+09
new_df$PROPEXP[new_df$PROPDMGEXP == "m"] <- 1e+06
new_df$PROPEXP[new_df$PROPDMGEXP == "0"] <- 1
new_df$PROPEXP[new_df$PROPDMGEXP == "5"] <- 1e+05
new_df$PROPEXP[new_df$PROPDMGEXP == "6"] <- 1e+06
new_df$PROPEXP[new_df$PROPDMGEXP == "4"] <- 10000
new_df$PROPEXP[new_df$PROPDMGEXP == "2"] <- 100
new_df$PROPEXP[new_df$PROPDMGEXP == "3"] <- 1000
new_df$PROPEXP[new_df$PROPDMGEXP == "h"] <- 100
new_df$PROPEXP[new_df$PROPDMGEXP == "7"] <- 1e+07
new_df$PROPEXP[new_df$PROPDMGEXP == "H"] <- 100
new_df$PROPEXP[new_df$PROPDMGEXP == "1"] <- 10
new_df$PROPEXP[new_df$PROPDMGEXP == "8"] <- 1e+08
new_df$PROPEXP[new_df$PROPDMGEXP == "+"] <- 0
new_df$PROPEXP[new_df$PROPDMGEXP == "-"] <- 0
new_df$PROPEXP[new_df$PROPDMGEXP == "?"] <- 0
new_df$PROPDMGVAL <- new_df$PROPDMG * new_df$PROPEXP
new_df$CROPEXP[new_df$CROPDMGEXP == "M"] <- 1e+06
new_df$CROPEXP[new_df$CROPDMGEXP == "K"] <- 1000
new_df$CROPEXP[new_df$CROPDMGEXP == "m"] <- 1e+06
new_df$CROPEXP[new_df$CROPDMGEXP == "B"] <- 1e+09
new_df$CROPEXP[new_df$CROPDMGEXP == "0"] <- 1
new_df$CROPEXP[new_df$CROPDMGEXP == "k"] <- 1000
new_df$CROPEXP[new_df$CROPDMGEXP == "2"] <- 100
new_df$CROPEXP[new_df$CROPDMGEXP == ""] <- 1
new_df$CROPEXP[new_df$CROPDMGEXP == "?"] <- 0
new_df$CROPDMGVAL <- new_df$CROPDMG * new_df$CROPEXP
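The same recoding can also be written as a single lookup table, which is shorter and keeps the code-to-multiplier mapping in one place. This is a sketch of an alternative, not the method used above: `exp_lookup`, `to_multiplier` and the `toy` data frame are hypothetical names introduced here, and the multipliers simply repeat the values assigned above.

```r
# Hypothetical lookup table: exponent code -> multiplier (same values as above).
exp_lookup <- c(
  "K" = 1e3, "k" = 1e3, "M" = 1e6, "m" = 1e6, "B" = 1e9,
  "H" = 1e2, "h" = 1e2,
  "0" = 1,   "1" = 1e1, "2" = 1e2, "3" = 1e3, "4" = 1e4,
  "5" = 1e5, "6" = 1e6, "7" = 1e7, "8" = 1e8,
  "+" = 0,   "-" = 0,   "?" = 0
)

# Translate a vector of exponent codes into numeric multipliers.
to_multiplier <- function(codes) {
  codes <- as.character(codes)          # also works if the column is a factor
  mult <- unname(exp_lookup[codes])     # NA for any code missing from the table
  # The empty string cannot be used as a vector name, so handle it separately.
  mult[!is.na(codes) & codes == ""] <- 1
  mult
}

# Toy data, invented for illustration only.
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 3),
  PROPDMGEXP = c("K", "M", "", "?"),
  stringsAsFactors = FALSE
)
toy$PROPDMGVAL <- toy$PROPDMG * to_multiplier(toy$PROPDMGEXP)
print(toy)
```

A code that is not in the table yields `NA` rather than a silent default, which makes unexpected codes easy to spot.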
2.4 Top 10 events by fatalities
fatal <- aggregate(FATALITIES~EVTYPE,new_df,sum)
fatal <- fatal[order(-fatal$FATALITIES),]
top_fatal <- fatal[1:10,]
top_fatal
## EVTYPE FATALITIES
## 834 TORNADO 5633
## 130 EXCESSIVE HEAT 1903
## 153 FLASH FLOOD 978
## 275 HEAT 937
## 464 LIGHTNING 816
## 856 TSTM WIND 504
## 170 FLOOD 470
## 585 RIP CURRENT 368
## 359 HIGH WIND 248
## 19 AVALANCHE 224
barplot(top_fatal$FATALITIES, las = 3, names.arg = top_fatal$EVTYPE, main = "Top 10 Fatalities by Weather Events", ylab = "Total Fatalities", col = "green")
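The ranking above uses fatalities only, although `INJURIES` was also selected. One possible extension, sketched here on invented data, aggregates both columns and ranks events by their sum. The `toy` data frame, the `CASUALTIES` column and the choice to weight deaths and injuries equally are all assumptions made for illustration.

```r
# Toy data, invented for illustration only.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(2, 1, 4, 0),
  INJURIES   = c(10, 5, 1, 3),
  stringsAsFactors = FALSE
)

# Sum fatalities and injuries per event type in one call.
health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, toy, sum)

# Hypothetical combined measure: fatalities plus injuries, equally weighted.
health$CASUALTIES <- health$FATALITIES + health$INJURIES

# Rank by the combined measure and keep at most the top 10 rows.
health <- health[order(-health$CASUALTIES), ]
top_health <- head(health, 10)
print(top_health)
```

Using `head(health, 10)` instead of `health[1:10, ]` avoids rows of `NA` when fewer than ten event types are present.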
2.5 Top 10 economic damages
property_damage <- aggregate(PROPDMGVAL~EVTYPE,new_df,sum)
property_damage <- property_damage[order(-property_damage$PROPDMGVAL),]
property_damage <- property_damage[1:10,]
property_damage
## EVTYPE PROPDMGVAL
## 170 FLOOD 144657709807
## 411 HURRICANE/TYPHOON 69305840000
## 834 TORNADO 56947380617
## 670 STORM SURGE 43323536000
## 153 FLASH FLOOD 16822673979
## 244 HAIL 15735267513
## 402 HURRICANE 11868319010
## 848 TROPICAL STORM 7703890550
## 972 WINTER STORM 6688497251
## 359 HIGH WIND 5270046260
barplot(property_damage$PROPDMGVAL/(10^9), las = 3, names.arg = property_damage$EVTYPE, main = "Top 10 Property Damages by Weather Events", ylab = "In Billions", col = "blue")
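Crop damage can be ranked the same way as property damage, using the `CROPDMGVAL` column computed during cleaning. The sketch below mirrors the property-damage steps on a small invented data frame; `toy` and its dollar values are assumptions for illustration and say nothing about the real totals.

```r
# Toy data, invented for illustration only.
toy <- data.frame(
  EVTYPE     = c("DROUGHT", "FLOOD", "DROUGHT", "HAIL"),
  CROPDMGVAL = c(5e6, 2e6, 1e6, 5e5),
  stringsAsFactors = FALSE
)

# Total crop damage per event type.
crop_damage <- aggregate(CROPDMGVAL ~ EVTYPE, toy, sum)

# Sort from largest to smallest and keep at most the top 10 rows.
crop_damage <- crop_damage[order(-crop_damage$CROPDMGVAL), ]
crop_damage <- head(crop_damage, 10)
print(crop_damage)
```

On the real data, replacing `toy` with `new_df` would give the crop ranking, and the result could be passed to `barplot()` in the same way as `property_damage`.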
3. Conclusion
By fatalities, the most harmful events are tornado, excessive heat, flash flood, heat and lightning. By property damage, the costliest events are flood, hurricane/typhoon, tornado, storm surge, flash flood and hail, with winter storm also in the top ten.