SYNOPSIS This document will provide historic data for civilian casualities and economic cost to help Federal Emergency Management Agency (FEMA) prepare for future natural disasters. Damage and casualties results are based from the National Weather Service Storm DATA taken from 1950-2011.
Casulaties
Data results for casuality are based from Storm Data FATALITIES column(23) and INJURIES column (24). Since combining both these factors will not yield which factor has more impact on the population health, fatalities and injuries columns were tabulated separately. Each factor or column were summed up and at other instances averaged.
NOTE: there is no distinction between direct and inderect fatalities and injuries.
Economic Damage
Data results for economic damage are based from Storm Data PRPDMG column (25) and CRPDMG column (27). Both columns were summed up to get the total damage.
NOTE: results are not adjusted for inflation
Data Processing
Strom Data was processed using read_csv from the readr package and then loaded as data.table from the datatable package for easy data manipulation and viewing.
library(readr)
library(dplyr)
library(ggplot2)
library(data.table)
library(yarrr)
library(stats)
download.file('https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2','FStormData.csv.bz2')
dataRead <- read_csv("FStormData.csv.bz2")
rawdata <- data.table(dataRead)
Result: Storm Data
head(rawdata,5)
## STATE__ BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 1: 1 4/18/1950 0:00:00 0130 CST 97 MOBILE AL
## 2: 1 4/18/1950 0:00:00 0145 CST 3 BALDWIN AL
## 3: 1 2/20/1951 0:00:00 1600 CST 57 FAYETTE AL
## 4: 1 6/8/1951 0:00:00 0900 CST 89 MADISON AL
## 5: 1 11/15/1951 0:00:00 1500 CST 43 CULLMAN AL
## EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END
## 1: TORNADO 0 NA NA NA NA 0
## 2: TORNADO 0 NA NA NA NA 0
## 3: TORNADO 0 NA NA NA NA 0
## 4: TORNADO 0 NA NA NA NA 0
## 5: TORNADO 0 NA NA NA NA 0
## COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES
## 1: NA 0 NA NA 14.0 100 3 0 0
## 2: NA 0 NA NA 2.0 150 2 0 0
## 3: NA 0 NA NA 0.1 123 2 0 0
## 4: NA 0 NA NA 0.0 100 2 0 0
## 5: NA 0 NA NA 0.0 150 2 0 0
## INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES
## 1: 15 25.0 K 0 NA NA NA NA
## 2: 0 2.5 K 0 NA NA NA NA
## 3: 2 25.0 K 0 NA NA NA NA
## 4: 2 2.5 K 0 NA NA NA NA
## 5: 2 2.5 K 0 NA NA NA NA
## LATITUDE LONGITUDE LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1: 3040 8812 3051 8806 NA 1
## 2: 3042 8755 0 0 NA 2
## 3: 3340 8742 0 0 NA 3
## 4: 3458 8626 0 0 NA 4
## 5: 3412 8642 0 0 NA 5
Most Fatalities by Event
Fatalities results are based from adding the total numbers from Fatalities column per event.
All other columns from raw data were discarded except for EVTYPE,FATALITIES and INJURIES columns.
NOTE: Observation column added to quantify number of records entered per event
PopHealth <- rawdata %>%
select(EVTYPE,FATALITIES,INJURIES) %>%
group_by(EVTYPE)%>%
summarise(
Observation = n(),
TotalFatalities = sum(FATALITIES),
TotalInjuries = sum(INJURIES)
)
MostFatalities <- PopHealth %>% #code for getting the top ten fatalities data per event
select(-TotalInjuries, -Observation) %>%
arrange(desc(TotalFatalities))%>%
head(10)
mycolors <- piratepal("basel")#color pallete
ggplot(MostFatalities, aes(x= reorder(MostFatalities$EVTYPE,1:10 ),MostFatalities$TotalFatalities)) + geom_col(fill=mycolors) + theme(axis.text.x = element_text(angle = 70, vjust = 0.5)) + labs(title= "Top 10 Events with Most Fatalities", x = "Natural Disasters", y= "Total Fatalities")
Table Results: Top 10 Events with Most Fatalities
print(MostFatalities)
## # A tibble: 10 × 2
## EVTYPE TotalFatalities
## <chr> <dbl>
## 1 TORNADO 5633
## 2 EXCESSIVE HEAT 1903
## 3 FLASH FLOOD 978
## 4 HEAT 937
## 5 LIGHTNING 816
## 6 TSTM WIND 504
## 7 FLOOD 470
## 8 RIP CURRENT 368
## 9 HIGH WIND 248
## 10 AVALANCHE 224
Most Fatalities per Incident
Fatalities per incident results are based from adding the total numbers from Fatalities column per event divided by the number of observations or incidents.
MostFatal.Incident <- PopHealth %>% #code for getting the top 10 fatalities per reported incident
mutate( Fatality.incident = TotalFatalities / Observation) %>%
select(-TotalFatalities,-Observation, -TotalInjuries) %>%
arrange(desc(Fatality.incident)) %>%
head(10)
ggplot(MostFatal.Incident, aes(x= MostFatal.Incident$EVTYPE,MostFatal.Incident$Fatality.incident),MostFatalities$TotalFatalities) + geom_col(fill=mycolors) + theme(axis.text.x = element_text(angle = 70, vjust = 0.5)) + labs(title= "Top 10 Events with Most Fatalies Per Incident", x = "Natural Disasters", y= "Ave. Fatalities per incident")
Table Results: Top 10 Events with Most Fatalities Per Incident
print(MostFatal.Incident)
## # A tibble: 10 × 2
## EVTYPE Fatality.incident
## <chr> <dbl>
## 1 TORNADOES, TSTM WIND, HAIL 25.000000
## 2 COLD AND SNOW 14.000000
## 3 TROPICAL STORM GORDON 8.000000
## 4 RECORD/EXCESSIVE HEAT 5.666667
## 5 EXTREME HEAT 4.363636
## 6 HEAT WAVE DROUGHT 4.000000
## 7 HIGH WIND/SEAS 4.000000
## 8 MARINE MISHAP 3.500000
## 9 WINTER STORMS 3.333333
## 10 Heavy surf and wind 3.000000
Most Injuries by Event
Injuries results are based from adding the total numbers from Injuries column per event.
MostInjuries <- PopHealth %>% #code for getting the top 10 injuries by event
select(-TotalFatalities, -Observation) %>%
arrange(desc(TotalInjuries, order))%>%
head(10)
ggplot(MostInjuries, aes(x= reorder(MostInjuries$EVTYPE,1:10 ),MostInjuries$TotalInjuries)) + geom_col(fill=mycolors) + theme(axis.text.x = element_text(angle = 70, vjust = 0.5)) + labs(title= "Top 10 Events with Most Injuries", x = "Natural Disasters", y= "Total Injuries")
Table Results: Top 10 Events with Most Injuries
print(MostInjuries)
## # A tibble: 10 × 2
## EVTYPE TotalInjuries
## <chr> <dbl>
## 1 TORNADO 91346
## 2 TSTM WIND 6957
## 3 FLOOD 6789
## 4 EXCESSIVE HEAT 6525
## 5 LIGHTNING 5230
## 6 HEAT 2100
## 7 ICE STORM 1975
## 8 FLASH FLOOD 1777
## 9 THUNDERSTORM WIND 1488
## 10 HAIL 1361
Most Injuries by Incident
Injuries results are based from adding the total numbers from Injuries column per event divided the number of observations or incidents.
MostInjury.Incident <- PopHealth %>% #code for getting the top 10 injuries per reported incident
mutate( Injury.incident = TotalInjuries / Observation) %>%
select(-TotalInjuries,-Observation, -TotalFatalities) %>%
arrange(desc(Injury.incident)) %>%
head(10)
ggplot(MostInjuries, aes(MostInjury.Incident$EVTYPE,MostInjury.Incident$Injury.incident))+ geom_col(fill=mycolors) + theme(axis.text.x = element_text(angle = 70, vjust = 0.5)) + labs(title= "Top 10 Events with Most Injuries per Incident", x = "Natural Disasters", y= "Ave. Injuries per incident")
Table Results: Top 10 Events with Most Injuries per Incident
print(MostInjury.Incident)
## # A tibble: 10 × 2
## EVTYPE Injury.incident
## <chr> <dbl>
## 1 Heat Wave 70.00000
## 2 TROPICAL STORM GORDON 43.00000
## 3 WILD FIRES 37.50000
## 4 THUNDERSTORMW 27.00000
## 5 HIGH WIND AND SEAS 20.00000
## 6 SNOW/HIGH WINDS 18.00000
## 7 GLAZE/ICE STORM 15.00000
## 8 HEAT WAVE DROUGHT 15.00000
## 9 WINTER STORM HIGH WINDS 15.00000
## 10 HURRICANE/TYPHOON 14.48864
Most Property Damage by Event
Total economic cost damage were divided by $10,000.
NOTE: Observation column added to quantify number of records entered per event
EcoDamage <-rawdata %>%
select(EVTYPE,PROPDMG,CROPDMG) %>%
group_by(EVTYPE)%>%
summarise(
Observation = n(),
TotalProp.Damage = sum(PROPDMG),
TotalCrop.Damage = sum(CROPDMG)
)%>%
mutate(TotalDamage = floor((TotalProp.Damage + TotalCrop.Damage)/ 10000) ) #Cost diveded by 10,000
EcoTotalDamage <- EcoDamage %>%
select (-Observation,-TotalCrop.Damage, -TotalProp.Damage)%>%
arrange(desc(TotalDamage)) %>%
head(10)
ggplot(EcoTotalDamage, aes(x= reorder(EcoTotalDamage$EVTYPE,1:10),EcoTotalDamage$TotalDamage))+ geom_col(fill=mycolors) + theme(axis.text.x = element_text(angle = 70, vjust = 0.5)) + labs(title= "Top 10 Events with Most Economic Damage", x = "Natural Disasters", y= "Damage in $10k units ")
Table Results: Top 10 Events with Most Economic Damage
print(EcoTotalDamage)
## # A tibble: 10 × 2
## EVTYPE TotalDamage
## <chr> <dbl>
## 1 TORNADO 331
## 2 FLASH FLOOD 159
## 3 TSTM WIND 144
## 4 HAIL 126
## 5 FLOOD 106
## 6 THUNDERSTORM WIND 94
## 7 LIGHTNING 60
## 8 THUNDERSTORM WINDS 46
## 9 HIGH WIND 34
## 10 WINTER STORM 13
Events that are most harmful to population health: Tornadoes,Thunderstorm and Hail
Event that have the greatest economic consequences: Tornadoes and FlashFlood