An exploratory analysis of the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database to identify the weather event types that cause the most harm to population health and to the economy across the United States. The NOAA database tracks major storms and weather events in the United States, including estimates of any fatalities, injuries, and property damage. This small analysis suggests that tornadoes have the greatest impact on population health and floods have the greatest impact on the economy.
library(ggplot2)  # required for the plots below
record <- read.csv("./repdata-data-StormData.csv.bz2", header = TRUE, sep = ",", quote = "\"")
### extract relevant information
weather<-record[,c(8,23,24,25,26,27,28)]
colnames(weather) <- c("EVTYPE", "FATALITIES","INJURIES","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP")
#### This extract contains the relevant variables: EVTYPE (event type); FATALITIES (number of deaths); INJURIES (number of injuries); PROPDMG (property damage); PROPDMGEXP (unit of property damage, e.g. K, M, B); CROPDMG (crop damage); CROPDMGEXP (unit of crop damage, e.g. K, M). For the damage units, only K, M, and B are used, as the meaning of the other indicators (1-9, "", h, m) is not clear.
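The K, M, and B codes described above can be turned into numeric multipliers with a named-vector lookup. The following is a minimal sketch rather than the method used later in this analysis (which relies on `merge()`); the helper name `exp_to_multiplier` is hypothetical, and mapping every other code to `NA` is an assumption made so that unclear units can be excluded explicitly.

```r
# Hypothetical helper: map exponent codes to dollar multipliers.
# Codes other than K, M, B return NA so they can be filtered out explicitly.
exp_to_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  unname(lookup[as.character(code)])
}

# Example codes, including some of the unclear ones
codes <- c("K", "M", "B", "", "h", "5")
mult <- exp_to_multiplier(codes)
print(mult)
```

Because the lookup is case-sensitive, lowercase codes such as `m` are treated as unknown here, which matches the choice stated above to use only K, M, and B.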
Weather events can cause deaths (FATALITIES) or injuries (INJURIES).
#### (A-1) Analysis on fatalities
## Summing fatalities by event type
Fatalities<- aggregate(FATALITIES ~ EVTYPE,weather,sum)
## Keeping event types whose total fatalities exceed the mean across event types
Fatalities<-Fatalities[Fatalities$FATALITIES>(mean(Fatalities$FATALITIES)),]
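To make the two steps above concrete, here is a small sketch on made-up data. The `toy` data frame is invented for illustration and does not come from the NOAA file; it only shows how `aggregate()` followed by a mean threshold narrows the list of event types.

```r
# Toy data, invented for illustration only (not NOAA records)
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD", "HAIL"),
  FATALITIES = c(10, 3, 5, 0, 1, 1)
)

# Sum fatalities per event type
totals <- aggregate(FATALITIES ~ EVTYPE, toy, sum)

# Keep only event types whose total exceeds the mean of the per-type totals
above_mean <- totals[totals$FATALITIES > mean(totals$FATALITIES), ]
print(above_mean)
```

Since a few event types dominate the totals, the mean is pulled upward and only a handful of types pass the threshold, which keeps the bar plot readable.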
## Bar plot of total fatalities by event type
ft<-ggplot(aes(x=EVTYPE,y=FATALITIES),data=Fatalities)+
geom_bar(stat = "identity",aes(fill = FATALITIES))+
theme(axis.text.x=element_text(angle=90, hjust=1))+
labs(title="Total Fatalities by Event Type", x="", y="Total Fatalities")
print(ft)
#### (A-2) Analysis on injuries
## Summing injuries by event type
Injuries<- aggregate(INJURIES ~ EVTYPE,weather,sum)
## Keeping event types whose total injuries exceed the mean across event types
Injuries<-Injuries[Injuries$INJURIES>(mean(Injuries$INJURIES)),]
## Bar plot of total injuries by event type
inj<-ggplot(aes(x=EVTYPE,y=INJURIES),data=Injuries)+
geom_bar(stat = "identity",aes(fill = INJURIES))+
theme(axis.text.x=element_text(angle=90, hjust=1))+
labs(title="Total Injuries by Event Type",x="",y="Total Injuries")
print(inj)
paste("For both fatalities (deaths) and injuries, tornadoes caused by far the greatest harm, with",max(Fatalities$FATALITIES),"fatalities and",max(Injuries$INJURIES),"injuries")
## [1] "For both fatalities (deaths) and injuries, tornadoes caused by far the greatest harm, with 5633 fatalities and 91346 injuries"
To answer this question, the columns PROPDMG, PROPDMGEXP, CROPDMG, and CROPDMGEXP are used. Damage amounts are marked with “K”, “M”, and “B” (in PROPDMGEXP and CROPDMGEXP), meaning thousands, millions, and billions of dollars. One sub-dataset is created for the actual damage in dollars and one for the total damage per weather type. A plot is then created to determine which weather type has the greatest impact on the economy.
## Keeping rows with non-zero damage and a K, M, or B unit (element-wise | instead of ||)
eco<-weather[((weather$PROPDMG>0|weather$CROPDMG>0))&((weather$PROPDMGEXP %in% c("K","M","B"))|(weather$CROPDMGEXP %in% c("K","M","B"))),c("EVTYPE","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP")]
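Row filters of this kind need the element-wise operators `&` and `|`. The operators `&&` and `||` are meant for single logical values in control flow, and recent versions of R raise an error when they receive a vector longer than one. A minimal sketch on invented values:

```r
# Invented values for illustration (not NOAA records)
propdmg <- c(25, 0, 0)
cropdmg <- c(0, 0, 4)

# Element-wise OR returns one logical per row, suitable for subsetting
keep <- propdmg > 0 | cropdmg > 0
print(keep)
```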
## Calculating actual damages in billions of dollars
multiplier1<-data.frame(PROPDMGEXP=c("K","M","B"),multiplier1=c(1000,1000000,1000000000))
multiplier2<-data.frame(CROPDMGEXP=c("K","M","B"),multiplier2=c(1000,1000000,1000000000))
eco<-merge(eco,multiplier1,by= "PROPDMGEXP")
eco<-merge(eco,multiplier2,by= "CROPDMGEXP")
eco$DAMAGE<-(eco$PROPDMG*eco$multiplier1/1000000000+eco$CROPDMG*eco$multiplier2/1000000000)
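Because `merge()` defaults to an inner join, the two joins above discard any row whose exponent is missing from either lookup table, even if the other damage column is valid. The sketch below checks this on invented rows and shows one possible alternative based on `match()`, which keeps every row. Treating an unmatched exponent as contributing zero is an assumption made for illustration, not something the source data documents.

```r
# Invented rows for illustration (not NOAA records)
toy <- data.frame(
  PROPDMG = c(2, 5, 1),
  PROPDMGEXP = c("K", "M", ""),
  CROPDMG = c(0, 3, 4),
  CROPDMGEXP = c("", "K", "K"),
  stringsAsFactors = FALSE
)
lookup <- data.frame(EXP = c("K", "M", "B"), mult = c(1e3, 1e6, 1e9),
                     stringsAsFactors = FALSE)

# Inner joins on both exponent columns keep only rows matched in both
joined <- merge(toy, lookup, by.x = "PROPDMGEXP", by.y = "EXP")
joined <- merge(joined, lookup, by.x = "CROPDMGEXP", by.y = "EXP")
print(nrow(joined))

# Alternative: look up multipliers in place; unmatched codes become 0
prop_mult <- lookup$mult[match(toy$PROPDMGEXP, lookup$EXP)]
crop_mult <- lookup$mult[match(toy$CROPDMGEXP, lookup$EXP)]
prop_mult[is.na(prop_mult)] <- 0
crop_mult[is.na(crop_mult)] <- 0
toy$DAMAGE <- toy$PROPDMG * prop_mult + toy$CROPDMG * crop_mult
print(toy$DAMAGE)
```

In this toy example only one of the three rows survives both joins, while the `match()` version keeps all three. Switching the real analysis to such an approach would change the totals reported here, so it is offered only as a point of comparison.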
##creating summaries for EVTYPE
ecodamage<- aggregate(DAMAGE ~ EVTYPE,eco, sum)
ecodamage<-ecodamage[ecodamage$DAMAGE>(mean(ecodamage$DAMAGE)),]
##plot for damages by event type
ggplot(aes(x=EVTYPE,y=DAMAGE),data=ecodamage)+
geom_bar(stat = "identity",aes(fill = DAMAGE))+
theme(axis.text.x=element_text(angle=90, hjust=1)) +
labs(title="Total Damage($ billion) by Event Type", x="", y="Total Damages(B$)")
### conclusion for economic damage
paste("FLOODS have the largest impact on economy with",max(ecodamage$DAMAGE),"billion dollars.")
## [1] "FLOODS have the largest impact on economy with 138.0074445 billion dollars."