Determines which weather events are most harmful to population health and which have the greatest economic consequences, based on the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm data.
This report analyzes the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database to determine which event types are most harmful to population health and which have the greatest economic consequences. Harm to population health is measured as the total number of fatalities and injuries recorded for each event type. Economic consequences are measured as the total property and crop damage in dollars.
This report loads the raw NOAA storm data directly from the Coursera cloudfront.net URL. The R code below determines the events with the greatest impact on human health and on the economy in the United States. It groups the data by event type, then sums fatalities and injuries for the health ranking and property and crop damage for the economic ranking.
library(dplyr)
library(ggplot2)
library(knitr)
setwd("/Users/lazar/Code/R/ReproducibleResearch-project2")
mytmpdir = tempdir()
temp <- tempfile(tmpdir = mytmpdir)
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", temp)
rawData <- read.csv(temp, header = TRUE, sep = ",")
# delete the downloaded file; unlink() on a directory does nothing without recursive = TRUE
unlink(temp)
# convert to a tibble (tbl_df() is deprecated in current dplyr)
table1 <- as_tibble(rawData)
# total injuries and fatalities to determine health impact
table2 <- mutate(table1, num_people_impacted = INJURIES + FATALITIES)
# total property and crop damage to determine total economic impact
# exponent codes: "" = dollars, H = hundreds, K = thousands, M = millions, B = billions;
# any other code is counted as zero
table2 <- mutate(table2, econ_impact_prop =
    ifelse(PROPDMGEXP == "", PROPDMG * 1,
    ifelse(PROPDMGEXP %in% c("H", "h"), PROPDMG * 100,
    ifelse(PROPDMGEXP %in% c("K", "k"), PROPDMG * 1000,
    ifelse(PROPDMGEXP %in% c("M", "m"), PROPDMG * 1e+06,
    ifelse(PROPDMGEXP %in% c("B", "b"), PROPDMG * 1e+09,
    0))))))
table2 <- mutate(table2, econ_impact_crop =
    ifelse(CROPDMGEXP == "", CROPDMG * 1,
    ifelse(CROPDMGEXP %in% c("H", "h"), CROPDMG * 100,
    ifelse(CROPDMGEXP %in% c("K", "k"), CROPDMG * 1000,
    ifelse(CROPDMGEXP %in% c("M", "m"), CROPDMG * 1e+06,
    ifelse(CROPDMGEXP %in% c("B", "b"), CROPDMG * 1e+09,
    0))))))
table2 <- mutate(table2, econ_impact_in_dollars = econ_impact_prop + econ_impact_crop)
table2 <- mutate(table2, event_type = EVTYPE)
# Transform the data by grouping the data by event types
grouped_by_events <- group_by(table2, event_type)
# Determine most human health impactful events
health_summary <- summarize(grouped_by_events, tot_num_people_impacted=sum(num_people_impacted))
top_health_impact_table <- arrange(health_summary, desc(tot_num_people_impacted))
top_health_impact_table <- filter(top_health_impact_table, tot_num_people_impacted > 0)
top5_health_impact_events <- head(top_health_impact_table, 5)
top10_health_impact_events <- head(top_health_impact_table, 10)
# Determine most economic impactful events
econ_summary <- summarize(grouped_by_events, tot_econ_impact_in_dollars=sum(econ_impact_in_dollars))
top_econ_impact_table <- arrange(econ_summary, desc(tot_econ_impact_in_dollars))
top_econ_impact_table <- filter(top_econ_impact_table, tot_econ_impact_in_dollars > 0)
top5_econ_impact_events <- head(top_econ_impact_table, 5)
top10_econ_impact_events <- head(top_econ_impact_table, 10)
# Top 5 events by impact on human health
ggplot(top5_health_impact_events, aes(x = event_type, y = tot_num_people_impacted)) +
    geom_bar(stat = "identity", colour = "#FF0000", fill = "#FF0000") +
    xlab("Event Types") +
    ylab("Number of people impacted by injury/fatality") +
    labs(title = "Top 5 Events by Impact on Human Health") +
    theme(text = element_text(size = 7))
kable(top10_health_impact_events, caption = "Top 10 events by impact on human health")
| event_type | tot_num_people_impacted |
|---|---|
| TORNADO | 96979 |
| EXCESSIVE HEAT | 8428 |
| TSTM WIND | 7461 |
| FLOOD | 7259 |
| LIGHTNING | 6046 |
| HEAT | 3037 |
| FLASH FLOOD | 2755 |
| ICE STORM | 2064 |
| THUNDERSTORM WIND | 1621 |
| WINTER STORM | 1527 |
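The ranking in this table comes from a group-and-sum over event types. As a minimal sketch of that step, the snippet below applies the same idea to a few made-up rows using base R's `aggregate()` rather than the dplyr calls used in the report. The data frame `toy` and its values are invented for illustration and are not NOAA records.

```r
# Made-up rows for illustration only; these are not NOAA records.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  INJURIES   = c(10, 2, 5, 0, 1),
  FATALITIES = c(1, 0, 2, 4, 0)
)

# Per-row health impact, as in the report: injuries plus fatalities.
toy$num_people_impacted <- toy$INJURIES + toy$FATALITIES

# Sum the per-row impact within each event type.
totals <- aggregate(num_people_impacted ~ EVTYPE, data = toy, FUN = sum)

# Rank event types from most to least impactful.
totals <- totals[order(-totals$num_people_impacted), ]
print(totals)
```

In this toy data TORNADO ranks first with a total of 18, followed by HEAT (4) and FLOOD (3).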
# Top 5 events by economic consequences
ggplot(top5_econ_impact_events, aes(x = event_type, y = tot_econ_impact_in_dollars)) +
    geom_bar(stat = "identity", colour = "#808000", fill = "#808000") +
    xlab("Event Types") +
    ylab("Total dollar impact from property/crop damage") +
    labs(title = "Top 5 Events by Economic Impact") +
    theme(text = element_text(size = 6))
kable(top10_econ_impact_events, caption = "Top 10 events by economic consequences")
| event_type | tot_econ_impact_in_dollars |
|---|---|
| FLOOD | 150319678257 |
| HURRICANE/TYPHOON | 71913712800 |
| TORNADO | 57352113593 |
| STORM SURGE | 43323541000 |
| HAIL | 18758221730 |
| FLASH FLOOD | 17562128817 |
| DROUGHT | 15018672000 |
| HURRICANE | 14610229010 |
| RIVER FLOOD | 10148404500 |
| ICE STORM | 8967041310 |
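The dollar totals in this table depend on how the exponent codes in `PROPDMGEXP` and `CROPDMGEXP` are decoded. The sketch below expresses the same mapping as the report's nested `ifelse()` chain, but as a named lookup vector. The names `exp_multiplier` and `damage_in_dollars` are hypothetical helpers and are not part of the original analysis. As in the report, a blank code means the amount is already in dollars and any unrecognized code is counted as zero.

```r
# Multipliers for the exponent codes handled in the report's ifelse() chain.
exp_multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper (not part of the original analysis).
damage_in_dollars <- function(amount, exp_code) {
  code <- toupper(as.character(exp_code))   # treat "k" and "K" alike
  mult <- unname(exp_multiplier[code])      # NA for codes not in the lookup
  mult[which(code == "")] <- 1              # blank code: amount is in dollars
  mult[is.na(mult)] <- 0                    # any other code counts as zero
  amount * mult
}

# Made-up amounts and codes for illustration.
damage_in_dollars(c(25, 2.5, 1, 3, 7), c("K", "m", "B", "", "?"))
```

The call returns 25000, 2500000, 1e+09, 3 and 0 for the five inputs. A lookup like this keeps the code-to-multiplier mapping in one place, which may be easier to audit than two parallel `ifelse()` chains.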
Tornadoes have by far the greatest impact on human health in the United States. With 96,979 recorded fatalities and injuries, their total is more than 11 times that of the next event type, excessive heat (8,428). Excessive heat, thunderstorm wind, floods and lightning also have a significant impact on human health, but their combined total (29,194) is less than a third of the tornado total.
Floods have the greatest economic consequences in the United States, with about $150 billion in combined property and crop damage. Hurricanes/typhoons (about $72 billion) and tornadoes (about $57 billion) rank second and third, and each accounts for less than half of the flood total.
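The comparisons in these conclusions can be checked with a few lines of arithmetic. The sketch below copies the relevant totals by hand from the two tables above. The vector names `health` and `econ` are introduced here for illustration only.

```r
# Totals copied from the two top-10 tables in this report.
health <- c("TORNADO" = 96979, "EXCESSIVE HEAT" = 8428, "TSTM WIND" = 7461,
            "FLOOD" = 7259, "LIGHTNING" = 6046)
econ <- c("FLOOD" = 150319678257, "HURRICANE/TYPHOON" = 71913712800,
          "TORNADO" = 57352113593)

# How many times larger is the tornado total than the runner-up?
tornado_ratio <- health[["TORNADO"]] / health[["EXCESSIVE HEAT"]]

# Combined total of the second- through fifth-ranked event types.
others_combined <- sum(health[-1])   # 29194

# Second- and third-ranked economic totals as a share of the flood total.
econ_share <- econ[-1] / econ[["FLOOD"]]

print(round(tornado_ratio, 1))
print(others_combined)
print(round(econ_share, 2))
```

The ratio comes out between 11 and 12, the combined total of the other four event types is well below the tornado total, and both economic shares fall below one half.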