After downloading the StormData.csv.bz2 file, I loaded it into R and processed the data as follows. I summed Fatalities and Injuries, and separately Property Damages and Crop Damages, by type of weather event, collected the sums in dataframes, and sorted the dataframes in descending order of the totals. I then selected the event types whose cumulative impact stays within 95% of the total of either Fatalities and Injuries or Property and Crop Damages. These are the most dangerous events for population health (Figure 1) or for financial damages (Figure 2). The data are plotted as horizontal stacked bar plots, one bar per event type, with the components coloured according to the included legend.
I downloaded the file and loaded it into a dataframe called StormData2.
A) Calculating the impact of weather events on population health. I sorted the data in descending order of Fatalities and Injuries into a new dataframe called StormData3. I copied a subset of its columns, namely the event type, Fatalities and Injuries, into a new dataframe called StormData4, and added a column named SUMFATAL holding the sum of Fatalities and Injuries for each record. Using the “tapply” function, I calculated the total of fatalities and injuries, as well as the separate sums of fatalities and of injuries, for each type of severe weather event. The outputs were combined in a new dataframe (FinalData) together with the names of the corresponding event types. I calculated 95% of the overall total of fatalities and injuries and, using that threshold, extracted the event types whose cumulative impact stays below it. These events represent the most dangerous events for population health. The data were placed in a matrix called FINAL.
B) Calculating the impact of weather events on property and crops. The same steps were applied to the PROPDMG and CROPDMG columns, giving the matrix FINALd.
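The threshold rule described above can be illustrated on a toy table. This is a minimal sketch, not the code used for the report: the data frame `toy`, its values, and the names `per_type`, `below` and `reaching` are hypothetical, and `aggregate` is used in place of `tapply` for brevity. It also shows that a strict `<` comparison leaves out the event type that crosses the 95% line, while an inclusive variant keeps it.

```r
# Toy data: hypothetical event types and casualty counts (not from StormData)
toy <- data.frame(
  EVTYPE     = c("A", "B", "A", "C", "D", "B"),
  FATALITIES = c(50, 10, 20, 5, 1, 5),
  INJURIES   = c(0, 5, 0, 3, 1, 0)
)
toy$TOTAL <- toy$FATALITIES + toy$INJURIES

# Sum per event type and sort in descending order of the total
per_type <- aggregate(TOTAL ~ EVTYPE, data = toy, FUN = sum)
per_type <- per_type[order(per_type$TOTAL, decreasing = TRUE), ]

# Cumulative share of the grand total: 0.70, 0.90, 0.98, 1.00
per_type$CUMSHARE <- cumsum(per_type$TOTAL) / sum(per_type$TOTAL)

# Strict rule: keep types whose running total stays below 95% (A and B)
below <- per_type[per_type$CUMSHARE < 0.95, ]

# Inclusive alternative: also keep the type that crosses 95% (A, B and C)
n_keep   <- which(per_type$CUMSHARE >= 0.95)[1]
reaching <- per_type[seq_len(n_keep), ]
```

With the strict rule the selected types account for somewhat less than 95% of the total; which variant is preferable depends on how the threshold is meant to be read.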
# load the data
StormData2<-read.csv("StormData.csv.bz2")
#order the data
StormData3<-StormData2[order(StormData2$FATALITIES, StormData2$INJURIES, decreasing=T), ]
# head(StormData3)
# Make a new table containing only the event-type and the caused fatalities and injuries
StormData4<-subset(StormData3, select=c(EVTYPE, FATALITIES,INJURIES))
# head(StormData4)
#Add the sum of the fatalities and injuries to the new dataset as a new column
StormData4$SUMFATAL<-StormData4$FATALITIES + StormData4$INJURIES
# head(StormData4)
#Calculate Sum of Fatalities, Injuries and their sum using tapply
TypeOfFatal1<-unlist(tapply(StormData4$SUMFATAL, StormData4$EVTYPE, sum, simplify=F))
TypeOfFatal2<-unlist(tapply(StormData4$FATALITIES, StormData4$EVTYPE, sum, simplify=F))
TypeOfFatal3<-unlist(tapply(StormData4$INJURIES, StormData4$EVTYPE, sum, simplify=F))
# Collecting the output in a dataframe
FinalData<-as.data.frame(cbind(Total=as.numeric(unname(TypeOfFatal1)), Fatal=as.numeric(unname(TypeOfFatal2)), Injuries=as.numeric(unname(TypeOfFatal3))))
# name the dataframe rows
FinalData<-cbind(EVTYPE=names(TypeOfFatal1),FinalData)
#order the dataframe according to total damages
FinalData<-FinalData[order(FinalData$Total, decreasing=T), ]
#calculate 95% cutoff
x<-0.95*sum(FinalData$Total)
# identify the event types whose cumulative total stays below the 95% cutoff
FinalData2<-FinalData[cumsum(FinalData$Total)<x, ]
#Put these components in a matrix
FINAL<-matrix(data = c(FinalData2$Fatal, FinalData2$Injuries), nrow = 2, ncol = length(FinalData2$Total), byrow = T,
dimnames = NULL)
#name the matrix
dimnames(FINAL)=list(colnames(FinalData2)[3:4], FinalData2$EVTYPE)
print (FINAL)
##          TORNADO EXCESSIVE HEAT TSTM WIND FLOOD LIGHTNING HEAT FLASH FLOOD
## Fatal       5633           1903       504   470       816  937         978
## Injuries   91346           6525      6957  6789      5230 2100        1777
##          ICE STORM THUNDERSTORM WIND WINTER STORM HIGH WIND HAIL
## Fatal           89               133          206       248   15
## Injuries      1975              1488         1321      1137 1361
##          HURRICANE/TYPHOON HEAVY SNOW WILDFIRE THUNDERSTORM WINDS BLIZZARD
## Fatal                   64        127       75                 64      101
## Injuries              1275       1021      911                908      805
##          FOG RIP CURRENT WILD/FOREST FIRE RIP CURRENTS
## Fatal     62         368               12          204
## Injuries 734         232              545          297
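The output above lists several labels that look like variants of one event type (TSTM WIND, THUNDERSTORM WIND and THUNDERSTORM WINDS; RIP CURRENT and RIP CURRENTS; WILDFIRE and WILD/FOREST FIRE). The following is a minimal sketch of how such labels could be merged before summing. The function `normalise_evtype` and its mapping are hypothetical, and treating these labels as equivalent is an assumption that is not made in this report.

```r
# Hypothetical helper: map a few variant spellings to one label each.
# Assumption: the listed variants describe the same kind of event.
normalise_evtype <- function(x) {
  x <- toupper(trimws(x))        # unify case, drop leading/trailing blanks
  x <- gsub("\\s+", " ", x)      # collapse repeated whitespace
  x <- sub("^TSTM WIND$", "THUNDERSTORM WIND", x)
  x <- sub("^THUNDERSTORM WINDS$", "THUNDERSTORM WIND", x)
  x <- sub("^RIP CURRENTS$", "RIP CURRENT", x)
  x <- sub("^WILD/FOREST FIRE$", "WILDFIRE", x)
  x
}

labels  <- c("TSTM WIND", "Thunderstorm Winds", " RIP CURRENTS",
             "WILD/FOREST FIRE", "TORNADO")
cleaned <- normalise_evtype(labels)
```

Applied to the EVTYPE column before the `tapply` calls, a mapping of this kind would combine the split rows, which could change both the ranking and the set of event types below the cutoff.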
##Calculate damages
StormData3d<-StormData2[order(StormData2$PROPDMG, StormData2$CROPDMG, decreasing=T), ]
# head(StormData3d)
#Make a new table containing only the event-type and the caused property damages and crop damages
StormData4d<-subset(StormData3d, select=c(EVTYPE, PROPDMG,CROPDMG))
# head(StormData4d)
#Add the sum of the property damages and crop damages to the new dataset as a new column
StormData4d$SUMDMG<-StormData4d$PROPDMG + StormData4d$CROPDMG
# head(StormData4d)
#Calculate Sum of Property Damages, Crop Damages and their sum using tapply
TypeOfFatal1d<-unlist(tapply(StormData4d$SUMDMG, StormData4d$EVTYPE, sum, simplify=F))
TypeOfFatal2d<-unlist(tapply(StormData4d$PROPDMG, StormData4d$EVTYPE, sum, simplify=F))
TypeOfFatal3d<-unlist(tapply(StormData4d$CROPDMG, StormData4d$EVTYPE, sum, simplify=F))
# Collecting the output in a dataframe
FinalDatad<-as.data.frame(cbind(Total=as.numeric(unname(TypeOfFatal1d)), PROPDMG=as.numeric(unname(TypeOfFatal2d)), CROPDMG=as.numeric(unname(TypeOfFatal3d))))
# name the dataframe rows
FinalDatad<-cbind(EVTYPE=names(TypeOfFatal1d),FinalDatad)
#order the dataframe according to total damages
FinalDatad<-FinalDatad[order(FinalDatad$Total, decreasing=T), ]
#calculate 95% cutoff
y<-0.95*sum(FinalDatad$Total)
# identify the event types whose cumulative damages stay below the 95% cutoff
FinalData2d<-FinalDatad[cumsum(FinalDatad$Total)<y, ]
#Put these components in a matrix
FINALd<-matrix(data = c(FinalData2d$PROPDMG, FinalData2d$CROPDMG), nrow =2, ncol = length(FinalData2d$Total), byrow = T,
dimnames = NULL)
#name the matrix
dimnames(FINALd)=list(c("Property Damages", "Crop Damages"), FinalData2d$EVTYPE)
print (FINALd)
##                  TORNADO FLASH FLOOD TSTM WIND   HAIL  FLOOD
## Property Damages 3212258     1420125   1335966 688693 899938
## Crop Damages      100019      179200    109203 579596 168038
##                  THUNDERSTORM WIND LIGHTNING THUNDERSTORM WINDS HIGH WIND
## Property Damages            876844    603352             446293    324732
## Crop Damages                 66791      3581              18685     17283
##                  WINTER STORM HEAVY SNOW WILDFIRE ICE STORM STRONG WIND
## Property Damages       132721     122252    84459     66001       62994
## Crop Damages             1979       2166     4364      1689        1617
##                  HEAVY RAIN
## Property Damages      50842
## Crop Damages          11123
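The totals above are sums of the PROPDMG and CROPDMG columns as recorded. The dataset also carries magnitude codes in the PROPDMGEXP and CROPDMGEXP columns (K for thousands, M for millions, B for billions), which the code above does not apply. The following is a minimal sketch, on hypothetical toy rows, of how dollar amounts could be derived from those codes. The helper `exp_multiplier` is hypothetical, and treating every code other than K, M or B as "no scaling" is an assumption.

```r
# Hypothetical helper: translate a magnitude code into a multiplier.
# Assumption: any code other than K, M or B (including blank) means 1.
exp_multiplier <- function(code) {
  code <- toupper(as.character(code))
  mult <- c(K = 1e3, M = 1e6, B = 1e9)[code]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Toy rows (not from StormData)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "FLOOD", "HAIL"),
  PROPDMG    = c(25, 2.5, 500, 10),
  PROPDMGEXP = c("K", "B", "K", ""),
  CROPDMG    = c(0, 1, 0, 5),
  CROPDMGEXP = c("", "M", "", "K")
)
toy$PROP_USD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
toy$CROP_USD <- toy$CROPDMG * exp_multiplier(toy$CROPDMGEXP)

# Dollar totals per event type
totals <- aggregate(cbind(PROP_USD, CROP_USD) ~ EVTYPE, data = toy, FUN = sum)
```

In this toy table a single FLOOD record coded B outweighs all other rows, which shows how applying the codes can reorder event types compared with summing the recorded values directly.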
Figure 1 is a horizontal stacked bar plot of the total Fatalities and Injuries caused by the most dangerous types of severe weather events. The events shown are those whose cumulative impact stays below 95% of the total sum of Fatalities and Injuries. Each bar is split into Fatalities and Injuries, as indicated by the colour code in the included legend. The most harmful event is Tornado.
#plot the results as stacked bar plot to show the individual contribution of fatalities and injuries
par(oma=c(0,2,2,2), ps=8)
barplot(FINAL, main="Figure 1: Distribution of event types accounting for\n 95% of total fatalities and injuries", names.arg=FinalData2$EVTYPE, col=c("blue","red"), xlab="Number of fatalities and injuries", las=1, horiz=TRUE, xlim=c(0, 100000), legend.text=TRUE)
Figure 2 is a horizontal stacked bar plot of the total Property and Crop Damages caused by the most damaging types of severe weather events. The events shown are those whose cumulative impact stays below 95% of the total sum of Property and Crop Damages. The values are sums of the PROPDMG and CROPDMG columns as recorded in the dataset, without applying the PROPDMGEXP and CROPDMGEXP magnitude codes. Each bar is split into Property damages and Crop damages, as indicated by the colour code in the included legend. The most harmful event is Tornado.
#plot the results as stacked bar plot to show the individual contribution of property damages and crop damages
par(oma=c(0,6,2,2), ps=12)
barplot(FINALd, main="Figure 2: Distribution of event types accounting for\n 95% of total property and crop damages", names.arg=FinalData2d$EVTYPE, col=c("blue","red"), xlab="Damages (PROPDMG + CROPDMG as recorded)", las=1, horiz=TRUE, xlim=c(0, 4000000), legend.text=TRUE)
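As a sketch of the plotting step, the axis range can be derived from the data instead of being typed in by hand, so that the longest bar always fits. The example uses the first three columns of the FINAL matrix printed earlier. The names `m` and `mids`, the margin sizes, and the choice to draw on a null PDF device are assumptions made so that the snippet runs on its own.

```r
# First three event types from the FINAL matrix printed earlier
m <- rbind(Fatal    = c(5633, 1903, 504),
           Injuries = c(91346, 6525, 6957))
colnames(m) <- c("TORNADO", "EXCESSIVE HEAT", "TSTM WIND")

pdf(NULL)                           # assumption: null device, no file is written
old <- par(mar = c(4, 8, 3, 1))     # wider left margin for long event names

# Horizontal bars are drawn bottom-up, so reverse the columns to put the
# largest event type at the top; xlim leaves 5% headroom past the longest bar.
mids <- barplot(m[, ncol(m):1], horiz = TRUE, las = 1,
                col = c("blue", "red"), legend.text = TRUE,
                xlim = c(0, 1.05 * max(colSums(m))),
                xlab = "Number of fatalities and injuries")

par(old)
dev.off()
```

The longest bar here is the TORNADO total of 5633 + 91346 = 96979, which a fixed upper limit of 95000 would not cover.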