Synopsis

After downloading the StormData.csv.bz2 file, I loaded it into R and processed the data as follows. I summed Fatalities and Injuries, and separately Property Damages and Crop Damages, by type of weather event, collected the sums in dataframes, and sorted the dataframes in descending order of the sums. I then selected the event types whose cumulative impact stays within 95% of the total of either Fatalities and Injuries or Property and Crop Damages. These are the most dangerous events for population health (Figure 1) or financial damages (Figure 2). The data are plotted as horizontal stacked bar plots, one bar per event type, with the two components of each bar coloured according to the included legend.

Data Processing

I downloaded the file and loaded it into a dataframe called StormData2.

A) Calculating the impact of weather events on population health. I reordered the data in descending order of Fatalities and Injuries in a new dataframe called StormData3. I copied a subset of its columns, namely the event type, Fatalities and Injuries, into a new dataframe called StormData4, and added a column named SUMFATAL holding the sum of Fatalities and Injuries for each record. Using the “tapply” function, I calculated the total of fatalities and injuries, as well as the separate totals of fatalities and of injuries, for each type of severe weather event. The outputs were combined in a new dataframe together with the names of the corresponding event types, and sorted in descending order of the total. I calculated 95% of the overall total of fatalities and injuries and, using that value as a threshold, extracted the event types whose cumulative total stays below it. These events represent the most dangerous events for population health. The data were placed in a matrix called FINAL.

# load the data
StormData2<-read.csv("StormData.csv.bz2")
#order the data
StormData3<-StormData2[order(StormData2$FATALITIES, StormData2$INJURIES, decreasing=T), ]
# head(StormData3)
# Make a new table containing only the event-type and the caused fatalities and injuries
StormData4<-subset(StormData3, select=c(EVTYPE, FATALITIES,INJURIES))
# head(StormData4)
#Add the sum of the fatalities and injuries to the new dataset as a new column
StormData4$SUMFATAL<-StormData4$FATALITIES + StormData4$INJURIES
# head(StormData4)
#Calculate Sum of Fatalities, Injuries and their sum using tapply
TypeOfFatal1<-unlist(tapply(StormData4$SUMFATAL, StormData4$EVTYPE, sum, simplify=F))
TypeOfFatal2<-unlist(tapply(StormData4$FATALITIES, StormData4$EVTYPE, sum, simplify=F))
TypeOfFatal3<-unlist(tapply(StormData4$INJURIES, StormData4$EVTYPE, sum, simplify=F))
# Collecting the output in a dataframe
FinalData<-as.data.frame(cbind(Total=as.numeric(unname(TypeOfFatal1)), Fatal=as.numeric(unname(TypeOfFatal2)), Injuries=as.numeric(unname(TypeOfFatal3))))
# add the event-type names as the first column
FinalData<-cbind(EVTYPE=names(TypeOfFatal1),FinalData)
# order the dataframe by total fatalities and injuries
FinalData<-FinalData[order(FinalData$Total, decreasing=T), ]
# calculate the 95% cutoff
x<-0.95*sum(FinalData$Total)
# keep the event types whose cumulative total stays below the 95% cutoff
FinalData2<-FinalData[cumsum(FinalData$Total)<x, ]
# put these components in a matrix
FINAL<-matrix(data = c(FinalData2$Fatal, FinalData2$Injuries), nrow = 2, ncol = length(FinalData2$Total), byrow = T,
dimnames = NULL)
#name the matrix
dimnames(FINAL)=list(colnames(FinalData2)[3:4], FinalData2$EVTYPE)
print (FINAL)
##          TORNADO EXCESSIVE HEAT TSTM WIND FLOOD LIGHTNING HEAT FLASH FLOOD
## Fatal       5633           1903       504   470       816  937         978
## Injuries   91346           6525      6957  6789      5230 2100        1777
##          ICE STORM THUNDERSTORM WIND WINTER STORM HIGH WIND HAIL
## Fatal           89               133          206       248   15
## Injuries      1975              1488         1321      1137 1361
##          HURRICANE/TYPHOON HEAVY SNOW WILDFIRE THUNDERSTORM WINDS BLIZZARD
## Fatal                   64        127       75                 64      101
## Injuries              1275       1021      911                908      805
##          FOG RIP CURRENT WILD/FOREST FIRE RIP CURRENTS
## Fatal     62         368               12          204
## Injuries 734         232              545          297
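
The aggregation and cutoff steps above can be illustrated on a small example. This is a minimal sketch on made-up numbers, not the storm data: the data frame `toy` and the names `per_type`, `cutoff` and `selected` are hypothetical, and `aggregate` is used here as a compact alternative to the three `tapply` calls.

```r
# Toy data (hypothetical values, not taken from the storm database)
toy <- data.frame(
  EVTYPE     = c("A", "B", "A", "C", "B", "D"),
  FATALITIES = c(10, 2, 5, 1, 0, 0),
  INJURIES   = c(50, 20, 15, 3, 3, 3)
)

# Per-event-type sums of fatalities and injuries in one call
per_type <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)
per_type$Total <- per_type$FATALITIES + per_type$INJURIES

# Sort by total impact, largest first
per_type <- per_type[order(per_type$Total, decreasing = TRUE), ]

# Keep the event types whose running total stays below 95% of the grand total
cutoff <- 0.95 * sum(per_type$Total)
selected <- per_type[cumsum(per_type$Total) < cutoff, ]
print(selected)
```

Because the comparison is a strict `<`, the event type whose contribution pushes the running total past the cutoff is itself left out, so the selected types account for somewhat less than 95% of the total.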
B) Calculating the impact of weather events on financial damages. I reordered the data in descending order of Property and Crop Damages in a new dataframe called StormData3d. I copied a subset of its columns, namely the event type, Property Damages and Crop Damages, into a new dataframe called StormData4d, and added a column named SUMDMG holding the sum of Property and Crop Damages for each record. Using the “tapply” function, I calculated the total of Property and Crop Damages, as well as the separate totals of Property Damages and of Crop Damages, for each type of severe weather event. The outputs were combined in a new dataframe together with the names of the corresponding event types, and sorted in descending order of the total. I calculated 95% of the overall total of Property and Crop Damages and, using that value as a threshold, extracted the event types whose cumulative total stays below it. These events represent the most dangerous events from the perspective of their economic consequences. The data were placed in a matrix called FINALd.
##Calculate damages
StormData3d<-StormData2[order(StormData2$PROPDMG, StormData2$CROPDMG, decreasing=T), ]
# head(StormData3d)

#Make a new table containing only the event-type and the caused property damages and crop damages
StormData4d<-subset(StormData3d, select=c(EVTYPE, PROPDMG,CROPDMG))
# head(StormData4d)
#Add the sum of the property damages and crop damages to the new dataset as a new column
StormData4d$SUMDMG<-StormData4d$PROPDMG + StormData4d$CROPDMG
# head(StormData4d)
#Calculate sums of property damages, crop damages and their total using tapply
TypeOfFatal1d<-unlist(tapply(StormData4d$SUMDMG, StormData4d$EVTYPE, sum, simplify=F))
TypeOfFatal2d<-unlist(tapply(StormData4d$PROPDMG, StormData4d$EVTYPE, sum, simplify=F))
TypeOfFatal3d<-unlist(tapply(StormData4d$CROPDMG, StormData4d$EVTYPE, sum, simplify=F))
# Collecting the output in a dataframe
FinalDatad<-as.data.frame(cbind(Total=as.numeric(unname(TypeOfFatal1d)), PROPDMG=as.numeric(unname(TypeOfFatal2d)), CROPDMG=as.numeric(unname(TypeOfFatal3d))))
# add the event-type names as the first column
FinalDatad<-cbind(EVTYPE=names(TypeOfFatal1d),FinalDatad)
#order the dataframe according to total damages
FinalDatad<-FinalDatad[order(FinalDatad$Total, decreasing=T), ]
#calculate 95% cutoff
y<-0.95*sum(FinalDatad$Total)
FinalData2d<-FinalDatad[cumsum(FinalDatad$Total)<y, ]
#Put these components in a matrix
FINALd<-matrix(data = c(FinalData2d$PROPDMG, FinalData2d$CROPDMG), nrow =2, ncol = length(FinalData2d$Total), byrow = T,
dimnames = NULL)
#name the matrix
dimnames(FINALd)=list(c("Property Damages", "Crop Damages"), FinalData2d$EVTYPE)
print (FINALd)
##                  TORNADO FLASH FLOOD TSTM WIND   HAIL  FLOOD
## Property Damages 3212258     1420125   1335966 688693 899938
## Crop Damages      100019      179200    109203 579596 168038
##                  THUNDERSTORM WIND LIGHTNING THUNDERSTORM WINDS HIGH WIND
## Property Damages            876844    603352             446293    324732
## Crop Damages                 66791      3581              18685     17283
##                  WINTER STORM HEAVY SNOW WILDFIRE ICE STORM STRONG WIND
## Property Damages       132721     122252    84459     66001       62994
## Crop Damages             1979       2166     4364      1689        1617
##                  HEAVY RAIN
## Property Damages      50842
## Crop Damages          11123
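
The sums above add PROPDMG and CROPDMG exactly as recorded. In the NOAA storm database each of these values has a companion magnitude column (PROPDMGEXP and CROPDMGEXP) with codes such as K, M and B, so expressing the totals in dollars would need a scaling step before summing. A minimal sketch on toy data, assuming only those three codes matter and treating any other code as a multiplier of 1 (both are assumptions of this sketch, and `multiplier`, `mult`, `PROPDMG_USD` and `dollars_by_type` are hypothetical names):

```r
# Toy records (hypothetical values); column names follow the storm database
toy <- data.frame(
  EVTYPE     = c("FLOOD", "HAIL", "FLOOD"),
  PROPDMG    = c(2.5, 500, 10),
  PROPDMGEXP = c("M", "K", "B"),
  stringsAsFactors = FALSE
)

# Assumed lookup: K = thousands, M = millions, B = billions
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Look up each code case-insensitively; codes outside the lookup fall back to 1
mult <- unname(multiplier[toupper(toy$PROPDMGEXP)])
mult[is.na(mult)] <- 1

# Scale the recorded magnitudes, then sum per event type as before
toy$PROPDMG_USD <- toy$PROPDMG * mult
dollars_by_type <- tapply(toy$PROPDMG_USD, toy$EVTYPE, sum)
print(dollars_by_type)
```

The same lookup could be applied to CROPDMG with CROPDMGEXP before the two columns are added together.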

Results

Figure 1 shows, as a horizontal stacked bar plot, the total Fatalities and Injuries caused by the most dangerous types of severe weather events. These event types were selected because together they account for up to 95% of the total sum of Fatalities and Injuries. Each bar is split into Fatalities and Injuries, as indicated by the colour code in the included legend. The most harmful event type is Tornado.

#plot the results as stacked bar plot to show the individual contribution of fatalities and injuries
par(oma=c(0,2,2,2), ps=8)
barplot(FINAL, main="Figure 1: Distribution of event types accounting for\n 95% of total fatalities and injuries", names.arg= FinalData2$EVTYPE, col=c("blue","red"), xlab="Number of incidents", las=1, horiz=T, xlim=c(0, 95000), legend=T)

[Figure 1: Distribution of event types accounting for 95% of total fatalities and injuries]
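
For a stacked plot, `barplot` expects a matrix whose rows are the stacked segments and whose columns are the bars. With `horiz = TRUE` the first column is drawn at the bottom. The following is a minimal sketch on a toy matrix (the values and the names `m`, `m_rev` and `bar_pos` are hypothetical) showing one possible way to put the largest bar at the top by reversing the column order:

```r
# Toy matrix (hypothetical counts): rows are stacked segments, columns are bars
m <- matrix(c(15,  2, 1,
              65, 23, 3),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("Fatal", "Injuries"), c("A", "B", "C")))

# With horiz = TRUE the first column is drawn at the bottom,
# so reverse the column order to put the largest bar on top
m_rev <- m[, ncol(m):1]

# Draw to a null PDF device so the sketch also runs without a display;
# in a knitted report this pdf()/dev.off() pair would be dropped
pdf(NULL)
bar_pos <- barplot(m_rev, horiz = TRUE, las = 1, col = c("blue", "red"),
                   legend.text = TRUE, xlab = "Number of fatalities and injuries")
dev.off()
```

`barplot` returns the midpoints of the bars (`bar_pos` here), which can be used to add text or other annotations next to each bar.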

Figure 2 shows, as a horizontal stacked bar plot, the total Property and Crop Damages caused by the most dangerous types of severe weather events. These event types were selected because together they account for up to 95% of the total sum of Property and Crop Damages. Each bar is split into Property Damages and Crop Damages, as indicated by the colour code in the included legend. The most harmful event type is Tornado.

#plot the results as stacked bar plot to show the individual contribution of property damages and crop damages
par(oma=c(0,6,2,2), ps=12)
barplot(FINALd, main="Figure 2: Distribution of event types accounting for\n 95% of total property and crop damages", names.arg= FinalData2d$EVTYPE, col=c("blue","red"), xlab="Dollars", las=1, horiz=T, xlim=c(0, 4000000), legend=T)

[Figure 2: Distribution of event types accounting for 95% of total property and crop damages]