Introduction:
Severe weather phenomena can harm both public health and the economy, damaging assets
such as crops and property. They result in fatalities, injuries, and property damage. Limiting
the harm from these natural disasters is in the public interest.
Through this project, we explore the U.S. National Oceanic and Atmospheric
Administration’s (NOAA) storm database. The database tracks characteristics of major storms and weather events in the United States, including data on fatalities, injuries, and property damage.
Synopsis:
This report explores the effect of natural disasters on public health (injuries and fatalities) and
the economy (crop damage and property damage). The report analyses the NOAA storm database,
which contains data on extreme weather events collected from 1950 through 2011.
This analysis aims to answer the following two questions:
1) Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
2) Across the United States, which types of events have the greatest economic consequences?
Main conclusions of the study:
1) Tornadoes caused the most fatalities and injuries, with more than 5,600 deaths and 91,300 injuries.
2) Floods caused the greatest economic damage, more than 157 billion USD.
The data columns of interest:
* EVTYPE -> Type of event
* FATALITIES -> Number of fatalities
* INJURIES -> Number of injuries
* PROPDMG -> Property damage amount, to be scaled by PROPDMGEXP
* PROPDMGEXP -> Order of magnitude for property damage (e.g. K for thousands)
* CROPDMG -> Crop damage amount, to be scaled by CROPDMGEXP
* CROPDMGEXP -> Order of magnitude for crop damage (e.g. M for millions)
Loading the data and reading it into the data frame storm:
library(dplyr) # provides %>%, select, filter, group_by, summarise and arrange used below
setwd("/Users/kareena_610/https:/github.com/k-610z/Rep_Data-Storm-Project2")
storm <- read.csv(bzfile("StormData.csv.bz2"), sep = ",", header = TRUE, stringsAsFactors = FALSE)
1) Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
summ_pophealthdmg<- storm %>%
select(EVTYPE,FATALITIES,INJURIES) %>%
filter(complete.cases(.)) %>%
group_by(EVTYPE) %>%
summarise(sum_fatalities=sum(FATALITIES),sum_injuries=sum(INJURIES))
summ_pophealthdmg%>%
select(EVTYPE, sum_fatalities,sum_injuries) %>%
filter(sum_fatalities==max(sum_fatalities))
## # A tibble: 1 × 3
## EVTYPE sum_fatalities sum_injuries
## <chr> <dbl> <dbl>
## 1 TORNADO 5633 91346
summ_pophealthdmg%>%
select(EVTYPE, sum_fatalities,sum_injuries) %>%
filter(sum_injuries==max(sum_injuries))
## # A tibble: 1 × 3
## EVTYPE sum_fatalities sum_injuries
## <chr> <dbl> <dbl>
## 1 TORNADO 5633 91346
# Summary of EVTYPE with the highest fatalities and injuries
MaxPopFat<-summ_pophealthdmg%>%
select(EVTYPE, sum_fatalities,sum_injuries) %>%
arrange(desc(sum_fatalities))
#Lets take the 10 highest fatality count events
MaxPopFat<-MaxPopFat[1:10,]
MaxPopFat # Displays result
## # A tibble: 10 × 3
## EVTYPE sum_fatalities sum_injuries
## <chr> <dbl> <dbl>
## 1 TORNADO 5633 91346
## 2 EXCESSIVE HEAT 1903 6525
## 3 FLASH FLOOD 978 1777
## 4 HEAT 937 2100
## 5 LIGHTNING 816 5230
## 6 TSTM WIND 504 6957
## 7 FLOOD 470 6789
## 8 RIP CURRENT 368 232
## 9 HIGH WIND 248 1137
## 10 AVALANCHE 224 170
MaxPopInj<-summ_pophealthdmg%>%
select(EVTYPE, sum_fatalities,sum_injuries) %>%
arrange(desc(sum_injuries))
#Let's take the 10 highest injury count events
MaxPopInj<-MaxPopInj[1:10,]
MaxPopInj # Displays result
## # A tibble: 10 × 3
## EVTYPE sum_fatalities sum_injuries
## <chr> <dbl> <dbl>
## 1 TORNADO 5633 91346
## 2 TSTM WIND 504 6957
## 3 FLOOD 470 6789
## 4 EXCESSIVE HEAT 1903 6525
## 5 LIGHTNING 816 5230
## 6 HEAT 937 2100
## 7 ICE STORM 89 1975
## 8 FLASH FLOOD 978 1777
## 9 THUNDERSTORM WIND 133 1488
## 10 HAIL 15 1361
Observation: Tornadoes cause the most harm to public health, ranking first in both fatalities and injuries.
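The group, sum and rank steps used above can also be sketched in base R. The data frame `toy` and the helper `top_n_by` below are hypothetical names introduced only for illustration, and the numbers are made up rather than taken from the NOAA data.

```r
# Toy event records (made-up numbers, not taken from the NOAA data).
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "TORNADO", "FLOOD", "HEAT"),
  FATALITIES = c(3, 5, 2, 1, 4),
  INJURIES   = c(40, 10, 15, 7, 0),
  stringsAsFactors = FALSE
)

# Sum fatalities and injuries per event type.
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Hypothetical helper: order by one column, largest first, keep at most n rows.
top_n_by <- function(df, column, n = 10) {
  ranked <- df[order(df[[column]], decreasing = TRUE), ]
  head(ranked, n)
}

top_fatal <- top_n_by(totals, "FATALITIES", n = 2)
print(top_fatal)
```

One reason to prefer `head(ranked, n)` over `ranked[1:n, ]` is that `head` returns only the rows that exist, whereas indexing past the end of a data frame pads the result with `NA` rows.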
Data Visualization of Total fatalities and Total Injuries caused by Severe Weather Events.
par(mfrow = c(1, 2), mar = c(15, 4, 3, 2), mgp = c(3, 1, 0), cex =1.0)
barplot(MaxPopFat$sum_fatalities, las = 3, names.arg = MaxPopFat$EVTYPE, main = "Weather Events With\n The Top 10 Highest Fatalities", ylab = "Number of Fatalities", col = "grey")
barplot(MaxPopInj$sum_injuries, las = 3, names.arg =MaxPopInj$EVTYPE , main = "Weather Events With\n The Top 10 Highest Injuries", ylab = "Number of Injuries", col = "seagreen")
2) Across the United States, which types of events have the greatest economic consequences?
eventsummary<- storm %>%
select(EVTYPE,PROPDMG,PROPDMGEXP,CROPDMG,CROPDMGEXP) %>%
filter(complete.cases(.))
row<-nrow(eventsummary)
Converting the exponent codes ‘K’, ‘M’ and ‘B’ into numeric multipliers and applying them to the damage amounts.
# Damage exponents may be H, K, M or B; only K, M and B are scaled here
# seq_len(row) visits every row (length(row) is always 1, since row is a single number)
for (i in seq_len(row)) {
  cropval <- 1
  propval <- 1
  if (eventsummary$CROPDMGEXP[i] == "K") {
    cropval <- 1000
  } else if (eventsummary$CROPDMGEXP[i] == "M") {
    cropval <- 1000000
  } else if (eventsummary$CROPDMGEXP[i] == "B") {
    cropval <- 1000000000
  }
  eventsummary$CROPDMG[i] <- eventsummary$CROPDMG[i] * cropval
  if (eventsummary$PROPDMGEXP[i] == "K") {
    propval <- 1000
  } else if (eventsummary$PROPDMGEXP[i] == "M") {
    propval <- 1000000
  } else if (eventsummary$PROPDMGEXP[i] == "B") {
    propval <- 1000000000
  }
  eventsummary$PROPDMG[i] <- eventsummary$PROPDMG[i] * propval
}
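The exponent conversion can also be written without a row-by-row loop, which tends to be much faster on a data set of this size. The following is a minimal sketch on toy data. The names `toy`, `multiplier` and `to_dollars` are hypothetical, and two choices are assumptions rather than part of the analysis above: treating lowercase codes like their uppercase forms, and scaling `H` by 100.

```r
# Toy rows standing in for the storm data (values are made up for illustration).
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Named lookup vector: exponent code -> multiplier.
# Assumption: H means hundreds; the loop above does not scale H.
multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Convert vectors of amounts and exponent codes into dollar amounts.
# Codes missing from `multiplier` (including the empty string) leave the
# amount unscaled. Assumption: lowercase codes are treated like uppercase.
to_dollars <- function(amount, code) {
  m <- multiplier[toupper(code)]
  m[is.na(m)] <- 1
  amount * unname(m)
}

toy$PROPDMG_USD <- to_dollars(toy$PROPDMG, toy$PROPDMGEXP)
print(toy)
```

The same function could be applied to `CROPDMG` and `CROPDMGEXP`, so both damage columns share one set of conversion rules.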
Summarising the data by event type to find the highest damage to crops, to property, and overall.
The combined total represents the damage to the economy of the United States.
summary_ecodmg<- eventsummary %>%
select(EVTYPE,PROPDMG,CROPDMG) %>%
filter(complete.cases(.)) %>%
group_by(EVTYPE) %>%
summarise(sum_propdmg=sum(PROPDMG),sum_cropdmg=sum(CROPDMG))
#Event causing highest Property Damage
summary_ecodmg%>%
select(EVTYPE, sum_propdmg,sum_cropdmg) %>%
filter(sum_propdmg==max(sum_propdmg))
## # A tibble: 1 × 3
## EVTYPE sum_propdmg sum_cropdmg
## <chr> <dbl> <dbl>
## 1 TORNADO 3237233. 100019.
#Event causing highest Crop Damage
summary_ecodmg%>%
select(EVTYPE, sum_propdmg,sum_cropdmg) %>%
filter(sum_cropdmg==max(sum_cropdmg))
## # A tibble: 1 × 3
## EVTYPE sum_propdmg sum_cropdmg
## <chr> <dbl> <dbl>
## 1 HAIL 688693. 579596.
#Summarizing data
MaxPropDMG<-summary_ecodmg%>%
select(EVTYPE, sum_propdmg,sum_cropdmg) %>%
arrange(desc(sum_propdmg))
# Lets take the first 10 highest Property Damages:
MaxPropDMG<-MaxPropDMG[1:10,]
MaxPropDMG #Display result
## # A tibble: 10 × 3
## EVTYPE sum_propdmg sum_cropdmg
## <chr> <dbl> <dbl>
## 1 TORNADO 3237233. 100019.
## 2 FLASH FLOOD 1420125. 179200.
## 3 TSTM WIND 1335966. 109203.
## 4 FLOOD 899938. 168038.
## 5 THUNDERSTORM WIND 876844. 66791.
## 6 HAIL 688693. 579596.
## 7 LIGHTNING 603352. 3581.
## 8 THUNDERSTORM WINDS 446293. 18685.
## 9 HIGH WIND 324732. 17283.
## 10 WINTER STORM 132721. 1979.
MaxCropDMG<-summary_ecodmg%>%
select(EVTYPE, sum_propdmg,sum_cropdmg) %>%
arrange(desc(sum_cropdmg))
# Lets take the first 10 highest Crop damages:
MaxCropDMG<-MaxCropDMG[1:10,]
MaxCropDMG#Display result
## # A tibble: 10 × 3
## EVTYPE sum_propdmg sum_cropdmg
## <chr> <dbl> <dbl>
## 1 HAIL 688693. 579596.
## 2 FLASH FLOOD 1420125. 179200.
## 3 FLOOD 899938. 168038.
## 4 TSTM WIND 1335966. 109203.
## 5 TORNADO 3237233. 100019.
## 6 THUNDERSTORM WIND 876844. 66791.
## 7 DROUGHT 4099. 33899.
## 8 THUNDERSTORM WINDS 446293. 18685.
## 9 HIGH WIND 324732. 17283.
## 10 HEAVY RAIN 50842. 11123.
Observation: Property damage is highest for tornadoes, but crop damage is highest for hail.
Next, we add the two kinds of damage together to find the event type that causes the highest combined damage to property and crops,
and then list the 10 event types with the highest total economic damage.
MaxTotEcoDMG<-summary_ecodmg %>%
select(EVTYPE, sum_propdmg,sum_cropdmg) %>%
group_by(EVTYPE) %>%
summarize(tot_ecodmg=sum_propdmg+sum_cropdmg) %>%
arrange(desc(tot_ecodmg))
MaxTotEcoDMG<-MaxTotEcoDMG[1:10, ]
MaxTotEcoDMG#Display result
## # A tibble: 10 × 2
## EVTYPE tot_ecodmg
## <chr> <dbl>
## 1 TORNADO 3337252.
## 2 FLASH FLOOD 1599325.
## 3 TSTM WIND 1445168.
## 4 HAIL 1268290.
## 5 FLOOD 1067976.
## 6 THUNDERSTORM WIND 943636.
## 7 LIGHTNING 606932.
## 8 THUNDERSTORM WINDS 464978.
## 9 HIGH WIND 342015.
## 10 WINTER STORM 134700.
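Observation:
Adding property and crop damage together, the event types with the highest totals in the output above are
1) Tornado
2) Flash Flood
The printed totals are on the scale of the raw PROPDMG and CROPDMG values rather than converted dollar amounts, so they give a ranking only and do not reproduce the flood figure of more than 157 billion USD stated in the main conclusions.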
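The top-10 table lists TSTM WIND, THUNDERSTORM WIND and THUNDERSTORM WINDS as separate event types, so the ranking depends on how EVTYPE labels are spelled. The sketch below shows one way such labels could be merged before summing. The function `normalise_evtype` and its two rewrite rules are hypothetical and were not part of the analysis above; the totals are copied from the table.

```r
# Totals copied from the top-10 table for the three thunderstorm-wind labels and hail.
totals <- data.frame(
  EVTYPE     = c("TSTM WIND", "THUNDERSTORM WIND", "THUNDERSTORM WINDS", "HAIL"),
  tot_ecodmg = c(1445168, 943636, 464978, 1268290),
  stringsAsFactors = FALSE
)

# Hypothetical normalisation rules: upper-case and trim the label,
# expand a leading TSTM, and drop the plural on a trailing WINDS.
normalise_evtype <- function(x) {
  x <- toupper(trimws(x))
  x <- sub("^TSTM", "THUNDERSTORM", x)
  x <- sub("WINDS$", "WIND", x)
  x
}

totals$EVTYPE <- normalise_evtype(totals$EVTYPE)

# Re-sum per normalised label and sort from largest to smallest.
merged <- aggregate(tot_ecodmg ~ EVTYPE, data = totals, FUN = sum)
merged <- merged[order(merged$tot_ecodmg, decreasing = TRUE), ]
print(merged)
```

Whether labels like these should be merged is a judgment call about the data. The sketch only shows the mechanics of doing so.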
Data Visualization of Total Property Damages, Total Crop Damages and Total Economic Damages caused by these Severe Weather Events
par(mfrow = c(1, 3), mar = c(15, 4, 3, 2), mgp = c(3, 1, 0), cex =0.6)
barplot(MaxPropDMG$sum_propdmg/(10^6), las = 3, names.arg = MaxPropDMG$EVTYPE , main = "Top 10 Events with\n Greatest Property Damages", ylab = "Cost of damages ($ million)", col = "lightblue")
barplot(MaxCropDMG$sum_cropdmg/(10^6), las = 3, names.arg = MaxCropDMG$EVTYPE , main = "Top 10 Events With\n Greatest Crop Damages", ylab = "Cost of damages ($ million)", col = "lightgreen")
barplot(MaxTotEcoDMG$tot_ecodmg/(10^6), las = 3, names.arg = MaxTotEcoDMG$EVTYPE, main = "Top 10 Events With\n Greatest Economic Damages", ylab = "Cost of damages ($ million)", col = "lightpink")
Results:
1) Tornadoes cause the most harm to public health.
2.1) Property damage is highest for tornadoes, but crop damage is highest for hail.
2.2) Adding property and crop damage together, the highest totals in the output above are for
Tornado
Flash Flood