This analysis evaluates the storm database of the U.S. National Oceanic and Atmospheric Administration (NOAA). The database records weather events of all kinds and their effects on population health and the economy in the United States. Population health effects are fatalities and injuries; economic effects are weather-related damage to property and crops.
For more information on the storm data set, please see the NOAA documentation and the FAQ on the weather event types.
The first objective of this investigation is to determine which weather event types cause the most human fatalities and injuries. This is measured by the total number of human casualties.
Secondly, this evaluation shows which event types cause the most property and crop damage. For this, the economic loss in U.S. dollars is chosen as the measure.
First, the storm data set of the U.S. National Oceanic and Atmospheric Administration (NOAA) is loaded, followed by the required R packages. The package plyr enables fast data processing, ggplot2 creates easy-to-understand plots, and xtable constructs readable tables. After that, the data needed to analyze human casualties, economic loss, and the corresponding weather event types is subsetted, cleaned, and adjusted. The adjustment converts the damage unit codes into numeric multipliers so that the damage figures can be computed in dollars.
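## fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
## download.file(fileUrl, destfile = "stormdata.csv.bz2", method = "curl")
## read.csv() reads the bzip2-compressed file directly; stringsAsFactors = TRUE
## keeps the unit columns as factors, which the levels() calls below rely on
stormdata <- read.csv("stormdata.csv.bz2", header = TRUE, stringsAsFactors = TRUE)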
library(plyr)
library(ggplot2)
library(xtable)
# Convert the property damage unit codes (B, M, K, H) into numeric multipliers
levels(stormdata$PROPDMGEXP)[levels(stormdata$PROPDMGEXP) == "B"] <- 10^9
levels(stormdata$PROPDMGEXP)[levels(stormdata$PROPDMGEXP) == "h"] <- 10^2
levels(stormdata$PROPDMGEXP)[levels(stormdata$PROPDMGEXP) == "H"] <- 10^2
levels(stormdata$PROPDMGEXP)[levels(stormdata$PROPDMGEXP) == "M"] <- 10^6
levels(stormdata$PROPDMGEXP)[levels(stormdata$PROPDMGEXP) == "m"] <- 10^6
levels(stormdata$PROPDMGEXP)[levels(stormdata$PROPDMGEXP) == "K"] <- 10^3
## The first 13 levels are the remaining unknown unit codes; their multiplier is set to 0
levels(stormdata$PROPDMGEXP)[1:13] <- 0
# Convert the crop damage unit codes (B, M, K) into numeric multipliers
levels(stormdata$CROPDMGEXP)[levels(stormdata$CROPDMGEXP) == "B"] <- 10^9
levels(stormdata$CROPDMGEXP)[levels(stormdata$CROPDMGEXP) == "M"] <- 10^6
levels(stormdata$CROPDMGEXP)[levels(stormdata$CROPDMGEXP) == "m"] <- 10^6
levels(stormdata$CROPDMGEXP)[levels(stormdata$CROPDMGEXP) == "K"] <- 10^3
levels(stormdata$CROPDMGEXP)[levels(stormdata$CROPDMGEXP) == "k"] <- 10^3
## The first 4 levels are the remaining unknown unit codes; their multiplier is set to 0
levels(stormdata$CROPDMGEXP)[1:4] <- 0
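The unit conversion above renames factor levels one at a time and relies on the position of the unknown codes. As a minimal sketch of the same idea, the mapping can also be written as a named lookup vector. The helper `exp_to_multiplier` and the toy input are hypothetical and not part of the original analysis; treating lower-case and upper-case codes alike is an assumption made here for brevity.

```r
# Hypothetical helper (not part of the original report): map a damage unit
# code to its numeric multiplier. Unrecognised codes get 0, mirroring the
# report's treatment of unknown units.
exp_to_multiplier <- function(code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  # Assumption: lower-case and upper-case codes mean the same unit
  mult <- lookup[toupper(as.character(code))]
  mult[is.na(mult)] <- 0
  unname(mult)
}

# Toy input, invented for illustration only
codes <- c("K", "m", "B", "", "?", "h", "5")
multipliers <- exp_to_multiplier(codes)
print(multipliers)
```

Because the lookup works on the code values rather than on level positions, it does not depend on how the factor levels happen to be sorted.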
To display the event types with effects on human health, the storm data is subsetted and aggregated by event type.
stormdata.casualties <- subset(stormdata, select=c("EVTYPE","FATALITIES","INJURIES"))
casualties <- ddply(stormdata.casualties, c("EVTYPE"),summarise, casualtieshuman=sum(FATALITIES)+sum(INJURIES))
casualties <- casualties[order(casualties$casualtieshuman, decreasing=TRUE),]
row.names(casualties) <- seq_along(casualties$EVTYPE)
names(casualties) <- c("event.type","casualties.human")
casualties <- casualties[1:10,]
To display the event types with property and crop damage, the storm data is subsetted and aggregated by event type.
stormdata.damage <- subset(stormdata, select=c("EVTYPE","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP"))
damage <- ddply(stormdata.damage, "EVTYPE", summarise,
                damage.propcrop = sum(PROPDMG * as.numeric(as.character(PROPDMGEXP))) +
                                  sum(CROPDMG * as.numeric(as.character(CROPDMGEXP))))
damage <- damage[order(damage$damage.propcrop, decreasing=TRUE),]
names(damage) <- c("event.type","damage.propcrop")
row.names(damage) <- seq_along(damage$event.type)
damage <- damage[1:10,]
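The damage total per event type is the sum of each record's damage value times its unit multiplier, for property and crops together. The following sketch works through that arithmetic on three invented records using base R's `aggregate` instead of `ddply`. The data frame `toy` and the columns `PROPMULT`, `CROPMULT`, and `loss` are assumptions for illustration and do not appear in the storm data.

```r
# Toy records, invented for illustration; the multipliers are already numeric
toy <- data.frame(
  EVTYPE   = c("FLOOD", "FLOOD", "HAIL"),
  PROPDMG  = c(2.5, 10, 4),
  PROPMULT = c(1e6, 1e3, 1e3),
  CROPDMG  = c(1, 0, 3),
  CROPMULT = c(1e3, 0, 1e6),
  stringsAsFactors = FALSE
)

# Per-record loss in dollars: property damage plus crop damage
toy$loss <- toy$PROPDMG * toy$PROPMULT + toy$CROPDMG * toy$CROPMULT

# Sum the loss per event type and rank from highest to lowest
totals <- aggregate(loss ~ EVTYPE, data = toy, FUN = sum)
totals <- totals[order(totals$loss, decreasing = TRUE), ]
print(totals)
```

In this toy case the single hail record outranks the two flood records because its crop damage is recorded in millions.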
With the data set processed, the results can be evaluated. First, the impact of weather conditions on the population health of the United States is considered.
For a quick and clear overview, a chart and then a table of the top 10 weather event types are created. These 10 event types had the worst impact on population health, measured by the total number of fatalities and injuries. The term human casualties covers both injuries and fatalities.
plot1 <- ggplot(casualties, aes(x = reorder(event.type, -casualties.human), y = casualties.human, fill = reorder(event.type, -casualties.human)))
plot1 + geom_bar(width = 0.5, stat = "identity") +
  labs(title = "Human Casualties Caused By Top 10 Weather Conditions") +
  labs(y = "Human Casualties") +
  labs(x = "Event Types") +
  coord_flip() +
  # "none" hides the fill legend; FALSE is deprecated in recent ggplot2 versions
  guides(fill = "none") +
  theme_bw()
print(xtable(data.frame("Event Type"=casualties$event.type, "Human Casualties"=casualties$casualties.human), digits=c(0,0,0)), type = "html")
| | Event.Type | Human.Casualties |
|---|---|---|
| 1 | TORNADO | 96979 |
| 2 | EXCESSIVE HEAT | 8428 |
| 3 | TSTM WIND | 7461 |
| 4 | FLOOD | 7259 |
| 5 | LIGHTNING | 6046 |
| 6 | HEAT | 3037 |
| 7 | FLASH FLOOD | 2755 |
| 8 | ICE STORM | 2064 |
| 9 | THUNDERSTORM WIND | 1621 |
| 10 | WINTER STORM | 1527 |
The tornado is by a wide margin the most harmful weather event for population health: tornadoes caused 96,979 human fatalities and injuries. The second-ranked event type is excessive heat with 8,428 human casualties, less than a tenth of the tornado figure.
Next, the impact of weather conditions on the U.S. economy is analyzed. Again, for a quick and clear overview, a chart and a table of the top 10 weather events are created.
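The size of the gap between tornadoes and the other event types can be checked from the table values. The sketch below recomputes the tornado-to-excessive-heat ratio and the tornado share of the top 10 total; the vector `top10` simply restates the table, and the variable names are chosen here for illustration.

```r
# Casualty counts copied from the top 10 table above
top10 <- c(TORNADO = 96979, "EXCESSIVE HEAT" = 8428, "TSTM WIND" = 7461,
           FLOOD = 7259, LIGHTNING = 6046, HEAT = 3037, "FLASH FLOOD" = 2755,
           "ICE STORM" = 2064, "THUNDERSTORM WIND" = 1621, "WINTER STORM" = 1527)

# How many times more casualties tornadoes caused than excessive heat
ratio <- unname(top10["TORNADO"] / top10["EXCESSIVE HEAT"])

# Tornado share of all casualties within the top 10 event types
share <- unname(top10["TORNADO"] / sum(top10))

print(round(ratio, 1))
print(round(100 * share, 1))
```

The ratio comes out at about 11.5, and tornadoes account for roughly 71 percent of the casualties within these 10 event types.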
plot2 <- ggplot(damage, aes(x = reorder(event.type, -damage.propcrop), y = damage.propcrop * 10^-9, fill = reorder(event.type, -damage.propcrop)))
plot2 + geom_bar(width = 0.5, stat = "identity") +
  labs(title = "Economic Loss Caused By Top 10 Weather Conditions") +
  labs(y = "Economic Loss In Billion U.S. Dollars") +
  labs(x = "Event Types") +
  # "none" hides the fill legend; FALSE is deprecated in recent ggplot2 versions
  guides(fill = "none") +
  coord_flip() +
  theme_bw()
print(xtable(data.frame("Event Type"=damage$event.type, "Damage In Billions"=damage$damage.propcrop*10^-9)), type = "html")
| | Event.Type | Damage.In.Billions |
|---|---|---|
| 1 | FLOOD | 150.32 |
| 2 | HURRICANE/TYPHOON | 71.91 |
| 3 | TORNADO | 57.35 |
| 4 | STORM SURGE | 43.32 |
| 5 | HAIL | 18.76 |
| 6 | FLASH FLOOD | 17.56 |
| 7 | DROUGHT | 15.02 |
| 8 | HURRICANE | 14.61 |
| 9 | RIVER FLOOD | 10.15 |
| 10 | ICE STORM | 8.97 |
With a total economic loss of 150.32 billion U.S. dollars, the flood is by far the most damaging weather condition for crops and property. The hurricane/typhoon follows in second place with damage of 71.91 billion U.S. dollars. The tornado ranks third with damage of 57.35 billion U.S. dollars.
In summary, the tornado is overall the most damaging weather condition. On the one hand, it is by far the worst weather event for human health, with 96,979 fatalities and injuries. On the other hand, tornadoes are responsible for a total economic loss of 57.35 billion U.S. dollars. Although the flood causes an economic loss roughly 2.6 times as high, it is the combination of human casualties and property and crop damage that makes the tornado the single worst weather event in the NOAA data records.
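The comparison between flood and tornado losses in the summary can be verified from the damage table. This is a small arithmetic check; the variable names are chosen here for illustration.

```r
# Economic loss in billion U.S. dollars, copied from the damage table
flood_loss   <- 150.32
tornado_loss <- 57.35

# How many times higher the flood loss is than the tornado loss
loss_ratio <- flood_loss / tornado_loss
print(round(loss_ratio, 1))
```

The flood loss is about 2.6 times the tornado loss.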
Date: 27. July 2014 | Author: M.H. Nierhoff