Extreme events, such as storms, droughts, and other severe weather, can cause both public health and economic problems for communities and municipalities. Many severe events result in fatalities, injuries, and property damage. To help prevent such outcomes, the U.S. National Oceanic and Atmospheric Administration (NOAA) maintains a storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. In this exercise, part of this database is used to explore which types of extreme events are the most harmful to property and human health. The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.
The data for this assignment come in the form of a comma-separated-value file compressed via the bzip2 algorithm to reduce its size. You can download the file from the course web site:
data <- read.csv("repdata%2Fdata%2FStormData.csv.bz2", header = TRUE, sep = ",")
General processing: Calculation of some statistics for extreme events
et <- unique(data$EVTYPE) # All types of extreme events
EN <- integer() # Number of recorded events of each type
EI <- numeric() # Total number of injuries caused by each event type
EF <- numeric() # Total number of fatalities caused by each event type
EP <- numeric() # Total property damage (economic consequences) caused by each event type
EPm <- numeric() # Average property damage per event for each event type
EIm <- numeric() # Average number of injuries per event for each event type
EFm <- numeric() # Average number of fatalities per event for each event type
# Calculate the statistics for each type of extreme event
for (d in seq_along(et)) {
# total number of each event
EN[d] <- sum(data$EVTYPE %in% et[d])
# total number of Injuries caused by each type of events
EI[d] <- sum(data[data$EVTYPE %in% et[d],24])
# total number of Fatalities caused by each type of events
EF[d] <- sum(data[data$EVTYPE %in% et[d],23])
# total damage to property caused by each type of events
EP[d] <- sum(data[data$EVTYPE %in% et[d],25])
# average number of Injuries per event
EIm[d] <- EI[d]/EN[d]
# average number of Fatalities per event
EFm[d] <- EF[d]/EN[d]
# average damage to property per event
EPm[d] <- EP[d]/EN[d]
}
# Make a dataframe containing all calculated values (1) Name of events, (2) total number of each event, (3) total number of Injuries, (4) average number of Injuries, (5) total number of Fatalities, (6) average number of Fatalities, (7) total damages to economy, and (8) average economic damage per event.
exdata <- data.frame(et, EN, EI, EIm, EF, EFm, EP, EPm)
str(exdata)
## 'data.frame': 985 obs. of 8 variables:
## $ et : Factor w/ 985 levels " HIGH SURF ADVISORY",..: 834 856 244 201 629 429 657 972 409 786 ...
## $ EN : int 60652 219940 288661 250 587 1 7 11433 1 20843 ...
## $ EI : num 91346 6957 1361 23 29 ...
## $ EIm: num 1.50607 0.03163 0.00471 0.092 0.0494 ...
## $ EF : num 5633 504 15 7 5 ...
## $ EFm: num 0.092874 0.002292 0.000052 0.028 0.008518 ...
## $ EP : num 3212258 1335966 688693 2917 3004 ...
## $ EPm: num 52.96 6.07 2.39 11.67 5.12 ...
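The per-type totals and averages computed by the loop can also be obtained without an explicit loop. The sketch below uses base R's `aggregate()` and `table()` on a small made-up data frame (`toy`, with invented values) that mimics the `EVTYPE`, `FATALITIES`, and `INJURIES` columns. It is an illustrative alternative under those assumptions, not the code that produced the results in this report.

```r
# Toy stand-in for the storm data; all values are invented for illustration.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HAIL", "TORNADO", "FLOOD", "HAIL"),
  FATALITIES = c(2, 0, 1, 3, 0),
  INJURIES   = c(10, 1, 5, 0, 0),
  stringsAsFactors = FALSE
)

# Total fatalities and injuries per event type.
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Number of records per event type, looked up by name so that the counts
# line up with the rows of `totals`.
counts <- table(toy$EVTYPE)
totals$N <- as.integer(counts[as.character(totals$EVTYPE)])

# Average injuries and fatalities per recorded event.
totals$INJURIES_MEAN   <- totals$INJURIES / totals$N
totals$FATALITIES_MEAN <- totals$FATALITIES / totals$N

# Up to ten event types with the most injuries, in descending order.
top_injuries <- head(totals[order(-totals$INJURIES), ], 10)
print(top_injuries)
```

On the full data set the same pattern would apply with `data` in place of `toy`, provided the columns carry these names.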
List and plot 10 types of extreme events that are the most harmful to the population health
10 types of events that cause the highest number of injuries
# Sort the data frame in descending order of EI (total injuries per event type)
data1 <- exdata[order(-exdata$EI),]
# Sort the data frame in descending order of EIm (average injuries per event)
data2 <- exdata[order(-exdata$EIm),]
par(mfrow=c(2,1),mar=c(1, 4, 1, 1) + 0.9)
barplot(data1[c(1:10),3],horiz=TRUE, main="Total injuries",
names.arg=data1[c(1:10),1], cex.names=0.4, las=1)
barplot(data2[c(1:10),4],horiz=TRUE, main="Average injuries per event",
names.arg=data2[c(1:10),1], cex.names=0.4, las=1)
10 types of events that cause the highest number of fatalities
# Sort the data frame in descending order of EF (total fatalities per event type)
data3 <- exdata[order(-exdata$EF),]
# Sort the data frame in descending order of EFm (average fatalities per event)
data4 <- exdata[order(-exdata$EFm),]
par(mfrow=c(2,1),mar=c(1, 4, 1, 1) + 0.9)
barplot(data3[c(1:10),5],horiz=TRUE, main="Total fatalities",
names.arg=data3[c(1:10),1], cex.names=0.4, las=1)
barplot(data4[c(1:10),6],horiz=TRUE, main="Average fatalities per event",
names.arg=data4[c(1:10),1], cex.names=0.4, las=1)
10 types of extreme events that cause the greatest economic consequences
# Sort the data frame in descending order of EP (total economic damage per event type)
data5 <- exdata[order(-exdata$EP),]
# Sort the data frame in descending order of EPm (average economic damage per event)
data6 <- exdata[order(-exdata$EPm),]
par(mfrow=c(2,1),mar=c(1, 4, 1, 1) + 0.9)
barplot(data5[c(1:10),7],horiz=TRUE, main="Total economic damage",
names.arg=data5[c(1:10),1], cex.names=0.4, las=1)
barplot(data6[c(1:10),8],horiz=TRUE, main="Average economic damage per event",
names.arg=data6[c(1:10),1], cex.names=0.4, las=1)
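The economic totals in this section sum column 25 of the data as it is stored. In the NOAA file that column (`PROPDMG`) is commonly described as being paired with a magnitude code in `PROPDMGEXP` (for example `K` for thousands, `M` for millions, `B` for billions). Assuming that coding, a dollar amount could be derived before summing, as in the sketch below. The data frame `toy`, the lookup vector `mult`, and the choice to treat unrecognized codes as a multiplier of 1 are all assumptions made for illustration, and the values are invented.

```r
# Toy stand-in for the damage columns; all values are invented for illustration.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
  PROPDMG    = c(25, 1.5, 2.5, 10),
  PROPDMGEXP = c("K", "B", "M", ""),
  stringsAsFactors = FALSE
)

# Assumed mapping from magnitude code to multiplier.
mult <- c(K = 1e3, M = 1e6, B = 1e9)

# Look up the multiplier for each record; codes missing from the lookup
# (including the empty string) come back as NA and are treated as 1 here.
code <- toupper(toy$PROPDMGEXP)
factor_usd <- unname(mult[code])
factor_usd[is.na(factor_usd)] <- 1

# Damage expressed in dollars under the assumed coding.
toy$PROPDMG_USD <- toy$PROPDMG * factor_usd

# Total dollar damage per event type, largest first.
damage <- sort(tapply(toy$PROPDMG_USD, toy$EVTYPE, sum), decreasing = TRUE)
print(damage)
```

With the multipliers applied, a single event recorded in billions can outweigh many events recorded in thousands, so a ranking built this way may differ from one built on the raw column.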
In general, it seems that tornadoes are the type of event that causes the greatest damage to both population health and the economy.