This analysis reviews the effect of event types on population health, measured as the combined count of injuries and fatalities, and on economic consequences, measured as the sum of property and crop damage.
Population health first needs a definition. The data contains two variables, FATALITIES and INJURIES, which can be used as markers for “population health”.
In the following example, a new variable, totalIncidents, is created to represent the total impact on the population; it is the sum of FATALITIES and INJURIES.
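The ratio of totalIncidents to the number of events (incidentRatio) is then calculated to highlight the average impact of a single event. Where fatalities occurred, this ratio is multiplied by the number of fatalities per event (sevRatio) to give incSevWeighting. The output data frame is then sorted in descending order by incSevWeighting, with the top ten presented.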
Economic consequences are also determined, using the PROPDMG and CROPDMG variables.
The dplyr package is used to group the events and then summarise the data within each of the groupings.
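As a small worked example of the metrics described above, the sketch below applies the same grouping and ratio to a made-up data frame. The `toy` data and its values are invented for illustration and are not drawn from the storm data; only the column names mirror the real variables.

```r
library(dplyr)

# Hypothetical toy data: three observations across two event types
toy <- data.frame(
  EVTYPE     = c("A", "A", "B"),
  FATALITIES = c(1, 0, 0),
  INJURIES   = c(3, 0, 5)
)

toy_summary <- toy %>%
  group_by(EVTYPE) %>%
  summarise(
    totalEvents    = n(),                          # observations per event type
    totalIncidents = sum(FATALITIES + INJURIES),   # combined health impact
    incidentRatio  = totalIncidents / totalEvents  # average impact per event
  ) %>%
  arrange(desc(incidentRatio))

print(toy_summary)
```

Here type A has 4 incidents over 2 events (ratio 2) and type B has 5 incidents over 1 event (ratio 5), so B sorts first even though A has more observations.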
library(dplyr)    # data manipulation: mutate(), group_by(), summarise()
library(ggplot2)  # plotting

# Calculate event duration in hours from the begin and end date/time fields
storm_df$BGN_DATE_TIME <- as.POSIXct(paste(as.Date(storm_df$BGN_DATE, "%m/%d/%Y"),
                                           as.character(storm_df$BGN_TIME)),
                                     format = "%Y-%m-%d %H%M")
storm_df$END_DATE_TIME <- as.POSIXct(paste(as.Date(storm_df$END_DATE, "%m/%d/%Y"),
                                           as.character(storm_df$END_TIME)),
                                     format = "%Y-%m-%d %H%M")
storm_df$DURATION <- difftime(storm_df$END_DATE_TIME,
                              storm_df$BGN_DATE_TIME,
                              units = "hours")
# Summarise health impact (and damage) by event type
uniqueEVTYPE <- storm_df %>%
  mutate(impactArea = LENGTH * WIDTH) %>%                  # Impact area per observation
  group_by(EVTYPE) %>%
  summarise(totalEvents = n(),                             # Number of observations in the group
            fatalities = sum(FATALITIES),
            injuries = sum(INJURIES),
            totalIncidents = sum(FATALITIES + INJURIES),   # Total health impact
            incidentRatio = totalIncidents / totalEvents,  # Incidents per event
            sevRatio = fatalities / totalEvents,           # Severity ratio: fatalities per event
            incSevWeighting = if_else(sevRatio == 0,       # Weight by severity where fatalities occurred
                                      incidentRatio,
                                      incidentRatio * sevRatio),
            propDamage = sum(PROPDMG),
            cropDamage = sum(CROPDMG),
            totalDamage = sum(PROPDMG + CROPDMG),          # Total damage
            incidentRatio_DMG = totalDamage / totalEvents, # Damage per event
            totalArea = sum(impactArea)) %>%
  filter(totalIncidents > 0) %>%                           # Keep event types with any health impact
  arrange(desc(incSevWeighting))                           # Sort by the most "severe"
# Summarise economic damage by event type
uniqueEVTYPE_DMG <- storm_df %>%
  mutate(impactArea = LENGTH * WIDTH) %>%                  # Impact area per observation
  group_by(EVTYPE) %>%
  summarise(totalEvents = n(),                             # Number of observations in the group
            propDamage = sum(PROPDMG),
            cropDamage = sum(CROPDMG),
            totalDamage = sum(PROPDMG + CROPDMG),          # Total damage
            incidentRatio_DMG = totalDamage / totalEvents, # Damage per event
            totalArea = sum(impactArea)) %>%
  filter(totalDamage > 0) %>%                              # Keep event types with any damage
  arrange(desc(totalDamage))                               # Sort by the largest total damage
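The severity weighting used in the health summary above (incSevWeighting) treats event types with no fatalities differently from those with some. The sketch below applies the same `if_else` rule to two invented rows to show the effect. The `demo` data frame and its values are hypothetical.

```r
library(dplyr)

# Hypothetical per-type values, invented to illustrate the weighting rule
demo <- data.frame(
  EVTYPE        = c("no fatalities", "some fatalities"),
  incidentRatio = c(2, 2),
  sevRatio      = c(0, 0.5)
) %>%
  mutate(incSevWeighting = if_else(sevRatio == 0,
                                   incidentRatio,              # left unweighted
                                   incidentRatio * sevRatio))  # scaled by fatalities per event

print(demo)
```

With equal incidentRatio values, the row with no fatalities keeps a weighting of 2, while the row with 0.5 fatalities per event is scaled to 1. Under this rule an event type with a sevRatio below 1 is therefore ranked below a fatality-free type with the same incidentRatio, which is worth keeping in mind when reading the ranking.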
Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
# Plot the top 10 event types with the highest effect on population health
head(uniqueEVTYPE, 10) %>%
  ggplot() +
  geom_col(aes(x = reorder(EVTYPE, -incSevWeighting), y = incSevWeighting), fill = "yellow") +
  labs(title = "Event By Incident Severity",
       x = "Event",
       y = "Incident Severity") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
The chart above shows that, on this severity-weighted measure, tornadoes have the highest impact on population health.
Across the United States, which types of events have the greatest economic consequences?
# Plot the top 10 event types by total property and crop damage
head(uniqueEVTYPE_DMG, 10) %>%
  ggplot() +
  geom_col(aes(x = reorder(EVTYPE, -totalDamage),
               y = totalDamage),
           fill = "red") +
  labs(title = "Event By Damage Severity",
       x = "Event",
       y = "Damage Severity") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
The plot above shows that tornadoes cause the most damage, with flash flooding second. Note that Thunderstorm Winds, Thunderstorm Wind and TSTM Wind appear as separate event types in this figure; their combined total would place them second.
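One possible way to combine the three thunderstorm wind labels before grouping is sketched below. The helper `normalise_evtype` is hypothetical and not part of the original analysis. It only merges the three spellings named above and leaves every other label unchanged apart from upper-casing and trimming whitespace.

```r
# Hypothetical helper: collapse the three thunderstorm wind spellings
# named in the text into a single label
normalise_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))
  variants <- c("THUNDERSTORM WINDS", "THUNDERSTORM WIND", "TSTM WIND")
  x[x %in% variants] <- "THUNDERSTORM WIND"
  x
}

# Illustrative labels only
labels <- c("TSTM WIND", "Thunderstorm Winds", "THUNDERSTORM WIND", "TORNADO")
cleaned <- normalise_evtype(labels)
print(cleaned)
```

Applied to the storm data, this would be a `mutate(EVTYPE = normalise_evtype(EVTYPE))` step placed before `group_by(EVTYPE)`. The resulting rankings would need to be recomputed rather than assumed.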
Overall it appears that Tornadoes and other high wind events have the largest impact on population health and economic outcomes.