SYNOPSIS:

This project explores the U.S. National Oceanic and Atmospheric Administration’s storm database. The database tracks characteristics of major storms and weather events in the U.S., including when and where they occur and the resulting fatalities, property damage, and crop damage.

1- Loading and Processing Raw Data:

Download the data file to local storage and make sure the working directory is set to the directory that contains it.

storm<-read.csv("stormdata.csv",header = TRUE)
names(storm)
##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"   
## [11] "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE"  "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"     
## [21] "F"          "MAG"        "FATALITIES" "INJURIES"   "PROPDMG"   
## [26] "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC"
## [31] "ZONENAMES"  "LATITUDE"   "LONGITUDE"  "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS"    "REFNUM"
str(storm)
## 'data.frame':    902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : Factor w/ 16335 levels "1/1/1966 0:00:00",..: 6523 6523 4242 11116 2224 2224 2260 383 3980 3980 ...
##  $ BGN_TIME  : Factor w/ 3608 levels "00:00:00 AM",..: 272 287 2705 1683 2584 3186 242 1683 3186 3186 ...
##  $ TIME_ZONE : Factor w/ 22 levels "ADT","AKS","AST",..: 7 7 7 7 7 7 7 7 7 7 ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: Factor w/ 29601 levels "","5NM E OF MACKINAC BRIDGE TO PRESQUE ISLE LT MI",..: 13513 1873 4598 10592 4372 10094 1973 23873 24418 4598 ...
##  $ STATE     : Factor w/ 72 levels "AK","AL","AM",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ EVTYPE    : Factor w/ 985 levels "   HIGH SURF ADVISORY",..: 834 834 834 834 834 834 834 834 834 834 ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : Factor w/ 35 levels "","  N"," NW",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_LOCATI: Factor w/ 54429 levels ""," Christiansburg",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_DATE  : Factor w/ 6663 levels "","1/1/1993 0:00:00",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_TIME  : Factor w/ 3647 levels ""," 0900CST",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : Factor w/ 24 levels "","E","ENE","ESE",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_LOCATI: Factor w/ 34506 levels ""," CANTON"," TULIA",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: Factor w/ 19 levels "","-","?","+",..: 17 17 17 17 17 17 17 17 17 17 ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: Factor w/ 9 levels "","?","0","2",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ WFO       : Factor w/ 542 levels ""," CI","%SD",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ STATEOFFIC: Factor w/ 250 levels "","ALABAMA, Central",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ ZONENAMES : Factor w/ 25112 levels "","                                                                                                                               "| __truncated__,..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : Factor w/ 436781 levels "","\t","\t\t",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...
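
Only a handful of the 37 columns are used in the analysis that follows. As a minimal sketch, assuming a small made-up data frame (storm_toy) in place of the full data set, the relevant columns could be kept like this. The names storm_toy, keep and storm_small are illustrative and not part of the original analysis.

```r
# Toy stand-in for the full `storm` data frame (all values are made up).
storm_toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "HAIL"),
  FATALITIES = c(1, 0, 0),
  INJURIES   = c(15, 0, 2),
  PROPDMG    = c(25, 2.5, 0),
  PROPDMGEXP = c("K", "M", ""),
  CROPDMG    = c(0, 10, 5),
  CROPDMGEXP = c("", "K", "K"),
  REMARKS    = c("", "", ""),
  stringsAsFactors = FALSE
)

# Columns needed for the health and economic questions.
keep <- c("EVTYPE", "FATALITIES", "INJURIES",
          "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")

# Drop everything else to keep the working data frame small.
storm_small <- storm_toy[, keep]
names(storm_small)
```

Working with the reduced data frame makes later calls to subset() and aggregate() cheaper on the full 902297-row data set.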

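Next, subset the data by fatality, aggregate by event type, and sort to get the 10 event types with the highest number of fatalities. Note that the filter FATALITIES==1 keeps only records with exactly one fatality; a filter of FATALITIES>0 would include every fatal event. The results shown below were produced with the filter as written.

# Keep records with exactly one fatality (see note above)
fatality<-subset(storm,storm$FATALITIES==1)
# Total fatalities per event type
fatality<-aggregate(FATALITIES~EVTYPE,data=fatality,FUN = sum)
# Sort in descending order and keep the top 10 event types
fatal10<- fatality[order(-fatality$FATALITIES), ][1:10, ]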
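
The choice of filter changes the totals. The following is a small sketch on made-up data (toy_fatal is hypothetical, not taken from the storm database) that compares FATALITIES == 1 with FATALITIES > 0:

```r
# Made-up records: one tornado killed 5 people, another killed 1.
toy_fatal <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "HEAT", "HEAT", "FLOOD"),
  FATALITIES = c(1, 5, 1, 0, 0)
)

# Filter used in the analysis: only single-fatality records are counted.
only_one <- aggregate(FATALITIES ~ EVTYPE,
                      data = subset(toy_fatal, FATALITIES == 1),
                      FUN = sum)

# Alternative filter: every record with at least one fatality is counted.
any_fatal <- aggregate(FATALITIES ~ EVTYPE,
                       data = subset(toy_fatal, FATALITIES > 0),
                       FUN = sum)

only_one
any_fatal
```

In this toy data the 5-fatality tornado is dropped by the first filter, so TORNADO totals 1 instead of 6. On the real data, the ranking of event types could shift in the same way.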

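Subset the data to records with non-zero property damage, aggregate by event type, and sort to get the 10 event types with the highest property damage value. PROPDMG is summed as recorded, without applying the magnitude code in PROPDMGEXP.

# Keep records with non-zero property damage
propdmg<-subset(storm,storm$PROPDMG!=0)
# Total PROPDMG per event type
propdmg<-aggregate(PROPDMG~EVTYPE,data = propdmg,FUN = sum)
# Sort in descending order and keep the top 10 event types
propdmg10<- propdmg[order(-propdmg$PROPDMG), ][1:10, ]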
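
PROPDMG is paired with a magnitude code in PROPDMGEXP, so a dollar figure requires multiplying the two. The sketch below assumes the common reading of the codes (K for thousands, M for millions, B for billions) and, as a further assumption, treats every other code (blank, "?", "+", digits and so on) as a multiplier of 1. The helper exp_multiplier and the toy data are hypothetical.

```r
# Made-up records for illustration only.
toy_damage <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "HURRICANE", "HAIL"),
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Map a magnitude code to a numeric multiplier.
# Assumption: K/M/B only; anything else counts as 1.
exp_multiplier <- function(code) {
  code   <- toupper(as.character(code))
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult   <- lookup[match(code, names(lookup))]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Damage in dollars, then totals per event type in descending order.
toy_damage$PROP_USD <- toy_damage$PROPDMG * exp_multiplier(toy_damage$PROPDMGEXP)
prop_usd <- aggregate(PROP_USD ~ EVTYPE, data = toy_damage, FUN = sum)
prop_usd <- prop_usd[order(-prop_usd$PROP_USD), ]
prop_usd
```

The same helper could be applied to CROPDMG and CROPDMGEXP. How the undocumented codes should be handled is a judgment call, so the fallback of 1 used here is only one option.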

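Subset the data to records with non-zero crop damage, aggregate by event type, and sort to get the 10 event types with the highest crop damage value. As with property damage, CROPDMG is summed as recorded, without applying the magnitude code in CROPDMGEXP.

# Keep records with non-zero crop damage
cropdmg<-subset(storm,storm$CROPDMG!=0)
# Total CROPDMG per event type
cropdmg<-aggregate(CROPDMG~EVTYPE,data = cropdmg,FUN = sum)
# Sort in descending order and keep the top 10 event types
cropdmg10<- cropdmg[order(-cropdmg$CROPDMG), ][1:10, ]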
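
The same aggregate-and-sort steps are repeated for fatalities, property damage, and crop damage. One possible way to avoid the repetition is a small helper, sketched here on made-up data. The function name top_events and the toy data frame are hypothetical.

```r
# Sum `value_col` per event type and return the n largest totals.
top_events <- function(data, value_col, n = 10) {
  totals <- aggregate(data[[value_col]],
                      by = list(EVTYPE = data$EVTYPE),
                      FUN = sum)
  names(totals)[2] <- value_col          # aggregate() names this column "x"
  head(totals[order(-totals[[value_col]]), ], n)
}

# Made-up records for illustration only.
toy_crop <- data.frame(
  EVTYPE  = c("DROUGHT", "HAIL", "DROUGHT", "FLOOD"),
  CROPDMG = c(1, 5, 2, 0)
)

top2 <- top_events(toy_crop, "CROPDMG", n = 2)
top2
```

With the full data, a call such as top_events(storm, "CROPDMG") would be one way to produce a table like cropdmg10. Because rows with zero damage do not change a sum, the explicit subset step is not needed in this variant.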

Results:

First, consider which types of events are most harmful with respect to population health. The plot below shows the 10 event types with the highest fatality totals.

barplot(fatal10$FATALITIES, las = 3, names.arg = fatal10$EVTYPE,
        main = "Events\nWith The Top 10 Highest Fatalities",
        ylab = "Number of Fatalities", col = "red")

Top 10 events that cause the highest number of fatalities

fatal10$EVTYPE
##  [1] TORNADO        LIGHTNING      FLASH FLOOD    EXCESSIVE HEAT
##  [5] TSTM WIND      RIP CURRENT    FLOOD          RIP CURRENTS  
##  [9] HIGH WIND      AVALANCHE     
## 985 Levels:    HIGH SURF ADVISORY  COASTAL FLOOD ... WND
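
The list above contains both RIP CURRENT and RIP CURRENTS, and the factor levels include labels with leading spaces, so the same kind of event can be split across several EVTYPE values. A minimal sketch of label clean-up before aggregating follows. The function normalize_evtype and its small recode table are illustrative assumptions, not a complete mapping of the 985 levels.

```r
# Normalize event-type labels: trim whitespace, upper-case,
# and merge a few known variants into one label.
normalize_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))
  # Illustrative recode table (assumption: these pairs mean the same event).
  recode <- c("RIP CURRENTS" = "RIP CURRENT",
              "TSTM WIND"    = "THUNDERSTORM WIND")
  hit <- x %in% names(recode)
  x[hit] <- unname(recode[x[hit]])
  x
}

cleaned <- normalize_evtype(c("  rip currents", "TSTM WIND", "Flood ", "RIP CURRENT"))
cleaned
```

Applying a step like this to storm$EVTYPE before calling aggregate() would combine the totals of merged labels, which may change which event types appear in the top 10.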

Next, consider which types of events have the greatest economic consequences, measured by property damage and crop damage.

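# Two plots side by side, with a large bottom margin for the rotated labels
par(mfrow = c(1, 2), mar = c(12, 4, 3, 2), mgp = c(3, 1, 0), cex = 0.8)
# Note: PROPDMG and CROPDMG are summed without their PROPDMGEXP/CROPDMGEXP
# multipliers, so the plotted values are not true dollar amounts
barplot(propdmg10$PROPDMG/(10^6), las = 3, names.arg = propdmg10$EVTYPE,
        main = "Events\nWith The Top 10 Property Damage",
        ylab = "Property Damage (million $)", col = "red")
barplot(cropdmg10$CROPDMG/(10^6), las = 3, names.arg = cropdmg10$EVTYPE,
        main = "Events\nWith The Top 10 Crop Damage",
        ylab = "Crop Damage (million $)", col = "red")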

CONCLUSION #1:

As seen in the fatalities graph above, TORNADO, LIGHTNING, FLASH FLOOD, EXCESSIVE HEAT, THUNDERSTORM WIND (recorded as TSTM WIND), RIP CURRENT (also recorded as RIP CURRENTS), FLOOD, HIGH WIND, and AVALANCHE are among the most harmful events with respect to human health.

CONCLUSION #2:

As seen in the property and crop damage graphs above, Tornado, Flash Flood, TSTM Wind, Flood, Thunderstorm, Hail, Lightning, High Wind, and Winter Storm are among the events with the greatest economic consequences in the form of property damage and crop damage.