With the data provided by U.S. National Oceanic and Atmospheric Administration on storm, we have performed analysis on the impact of weather event on population health(fatalities and injuries) and economy.
The results of the analysis points to Tornado and Hail as the biggest weather events to have an impact on health and economy. Tornado has had the highest fatalities and injuries.
Similarly, Hail has impacted the most on Crop Damages, which Tornado has biggest effect on Property Damages.
if(!file.exists("StormData.csv.bz2"))
{
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2","StormData.csv.bz2")
}
storm.data <- read.csv("StormData.csv.bz2", header = T)
storm.data <- storm.data[,c("EVTYPE","FATALITIES","INJURIES","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP")]
Units (H, K, M and B) are defined in the two fields (PROPDMGEXP, CROPDMGEXP). It is important to convert them into numbers for comparision. E.g. H - 100, K - 1000, M - 1000000, B - 1000000000
replace_units <- function(x)
{
ifelse(is.na(x),return(1),
ifelse(x=='K'|x=='k',return(1000),
ifelse(x=='H'|x=='h',return(100),
ifelse(x=='M'|x=='m',return(1000000),
ifelse(x=='B'|x=='b',return(1000000000),1)
)
)
)
)
return(1)
}
Update Crop Damage and Property damage appropriately.
storm.data$PROPDMGEXP <- replace_units(storm.data$PROPDMGEXP)
storm.data$CROPDMGEXP <- replace_units(storm.data$CROPDMGEXP)
storm.data$PROPDMG <- storm.data$PROPDMG * storm.data$PROPDMGEXP
storm.data$CROPDMG <- storm.data$CROPDMG * storm.data$CROPDMGEXP
storm.data$PROPDMGEXP <- NULL
storm.data$CROPDMGEXP <- NULL
#Update column names of the dataframe
colnames(storm.data) <- c("EventType","Fatalities","Injuries","PropertyDamage","CropDamage")
Summarize the sum of fatalities by event type and look at top 10 events
fatalitiest.eventtype <- aggregate(Fatalities ~ EventType, data = storm.data, FUN = sum)
top.event.fatalities <- head(fatalitiest.eventtype[order(-fatalitiest.eventtype$Fatalities),],10)
ggplot(top.event.fatalities) + aes(x = EventType, y=Fatalities, fill=EventType) + geom_bar(stat="identity") +
xlab("Weather Events") + ylab("Fatalities") + theme(axis.text.x = element_text(angle = 90, hjust = 1))
Summarize the sum of injuries by event type and look at top 10 events
injuries.eventtype <- aggregate(Injuries ~ EventType, data = storm.data, FUN = sum)
top.event.injuries <- head(injuries.eventtype[order(-injuries.eventtype$Injuries),],10)
ggplot(top.event.injuries) + aes(x = EventType, y=Injuries, fill=EventType) + geom_bar(stat="identity") +
xlab("Weather Events") + ylab("Injuries") + theme(axis.text.x = element_text(angle = 90, hjust = 1))
Summarize the sum of property damage by event type and look at top 10 events
property.damage.eventtype <- aggregate(PropertyDamage ~ EventType, data = storm.data, FUN = sum)
top.event.prop.damage <- head(property.damage.eventtype[order(-property.damage.eventtype$PropertyDamage),],10)
head(top.event.prop.damage, 10)
## EventType PropertyDamage
## 834 TORNADO 3212258160
## 153 FLASH FLOOD 1420124590
## 856 TSTM WIND 1335965610
## 170 FLOOD 899938480
## 760 THUNDERSTORM WIND 876844170
## 244 HAIL 688693380
## 464 LIGHTNING 603351780
## 786 THUNDERSTORM WINDS 446293180
## 359 HIGH WIND 324731560
## 972 WINTER STORM 132720590
Summarize the sum of crop damage by event type and look at top 10 events
crop.damage.eventtype <- aggregate(CropDamage ~ EventType, data = storm.data, FUN = sum)
top.event.crop.damage <- head(crop.damage.eventtype[order(-crop.damage.eventtype$CropDamage),],10)
head(top.event.crop.damage, 10)
## EventType CropDamage
## 244 HAIL 579596280
## 153 FLASH FLOOD 179200460
## 170 FLOOD 168037880
## 856 TSTM WIND 109202600
## 834 TORNADO 100018520
## 760 THUNDERSTORM WIND 66791450
## 95 DROUGHT 33898620
## 786 THUNDERSTORM WINDS 18684930
## 359 HIGH WIND 17283210
## 290 HEAVY RAIN 11122800
Summarize the sum of amount of crop and food damage by event type and look at top 5 events
storm.data$CropPropertyDamage <- storm.data$CropDamage + storm.data$PropertyDamage
damage.eventtype <- aggregate(CropPropertyDamage ~ EventType, data = storm.data, FUN = sum)
top.event.damage <- head(damage.eventtype[order(-damage.eventtype$CropPropertyDamage),],10)
ggplot(top.event.damage) + aes(x = EventType, y=CropPropertyDamage, fill=EventType) + geom_bar(stat="identity") +
xlab("Weather Events") + ylab("Crop and Property Damage") + theme(axis.text.x = element_text(angle = 90, hjust = 1))
The final analysis suggests Tornado being the biggest impactful weather event impacting both health and economy. TSTM Wind, Flash Flood and Hail also has a high impact on economy.
Similarly, Tornado is the highes fatalities and has a similar impact on number of injuries.