Storm and Weather Events Analysis

The following analysis covers storms and other weather events recorded between 1950 and 2011.

The data is available at this URL: https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2

#load the packages used below
library(dplyr)
library(ggplot2)
library(gridExtra)

Data

  • The first step is to download and decompress the data.
  • The second step is to load the data into a data frame.
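#download the compressed file to a temporary location
temp <- tempfile()
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", temp, mode = "wb")
#read the bzip2-compressed CSV into a data frame and summarise every column
storm <- read.csv(bzfile(temp))
summary(storm)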
##     STATE__                  BGN_DATE             BGN_TIME     
##  Min.   : 1.0   5/25/2011 0:00:00:  1202   12:00:00 AM: 10163  
##  1st Qu.:19.0   4/27/2011 0:00:00:  1193   06:00:00 PM:  7350  
##  Median :30.0   6/9/2011 0:00:00 :  1030   04:00:00 PM:  7261  
##  Mean   :31.2   5/30/2004 0:00:00:  1016   05:00:00 PM:  6891  
##  3rd Qu.:45.0   4/4/2011 0:00:00 :  1009   12:00:00 PM:  6703  
##  Max.   :95.0   4/2/2006 0:00:00 :   981   03:00:00 PM:  6700  
##                 (Other)          :895866   (Other)    :857229  
##    TIME_ZONE          COUNTY           COUNTYNAME         STATE       
##  CST    :547493   Min.   :  0.0   JEFFERSON :  7840   TX     : 83728  
##  EST    :245558   1st Qu.: 31.0   WASHINGTON:  7603   KS     : 53440  
##  MST    : 68390   Median : 75.0   JACKSON   :  6660   OK     : 46802  
##  PST    : 28302   Mean   :100.6   FRANKLIN  :  6256   MO     : 35648  
##  AST    :  6360   3rd Qu.:131.0   LINCOLN   :  5937   IA     : 31069  
##  HST    :  2563   Max.   :873.0   MADISON   :  5632   NE     : 30271  
##  (Other):  3631                   (Other)   :862369   (Other):621339  
##                EVTYPE         BGN_RANGE           BGN_AZI      
##  HAIL             :288661   Min.   :   0.000          :547332  
##  TSTM WIND        :219940   1st Qu.:   0.000   N      : 86752  
##  THUNDERSTORM WIND: 82563   Median :   0.000   W      : 38446  
##  TORNADO          : 60652   Mean   :   1.484   S      : 37558  
##  FLASH FLOOD      : 54277   3rd Qu.:   1.000   E      : 33178  
##  FLOOD            : 25326   Max.   :3749.000   NW     : 24041  
##  (Other)          :170878                      (Other):134990  
##          BGN_LOCATI                  END_DATE             END_TIME     
##               :287743                    :243411              :238978  
##  COUNTYWIDE   : 19680   4/27/2011 0:00:00:  1214   06:00:00 PM:  9802  
##  Countywide   :   993   5/25/2011 0:00:00:  1196   05:00:00 PM:  8314  
##  SPRINGFIELD  :   843   6/9/2011 0:00:00 :  1021   04:00:00 PM:  8104  
##  SOUTH PORTION:   810   4/4/2011 0:00:00 :  1007   12:00:00 PM:  7483  
##  NORTH PORTION:   784   5/30/2004 0:00:00:   998   11:59:00 PM:  7184  
##  (Other)      :591444   (Other)          :653450   (Other)    :622432  
##    COUNTY_END COUNTYENDN       END_RANGE           END_AZI      
##  Min.   :0    Mode:logical   Min.   :  0.0000          :724837  
##  1st Qu.:0    NA's:902297    1st Qu.:  0.0000   N      : 28082  
##  Median :0                   Median :  0.0000   S      : 22510  
##  Mean   :0                   Mean   :  0.9862   W      : 20119  
##  3rd Qu.:0                   3rd Qu.:  0.0000   E      : 20047  
##  Max.   :0                   Max.   :925.0000   NE     : 14606  
##                                                 (Other): 72096  
##            END_LOCATI         LENGTH              WIDTH         
##                 :499225   Min.   :   0.0000   Min.   :   0.000  
##  COUNTYWIDE     : 19731   1st Qu.:   0.0000   1st Qu.:   0.000  
##  SOUTH PORTION  :   833   Median :   0.0000   Median :   0.000  
##  NORTH PORTION  :   780   Mean   :   0.2301   Mean   :   7.503  
##  CENTRAL PORTION:   617   3rd Qu.:   0.0000   3rd Qu.:   0.000  
##  SPRINGFIELD    :   575   Max.   :2315.0000   Max.   :4400.000  
##  (Other)        :380536                                         
##        F               MAG            FATALITIES          INJURIES        
##  Min.   :0.0      Min.   :    0.0   Min.   :  0.0000   Min.   :   0.0000  
##  1st Qu.:0.0      1st Qu.:    0.0   1st Qu.:  0.0000   1st Qu.:   0.0000  
##  Median :1.0      Median :   50.0   Median :  0.0000   Median :   0.0000  
##  Mean   :0.9      Mean   :   46.9   Mean   :  0.0168   Mean   :   0.1557  
##  3rd Qu.:1.0      3rd Qu.:   75.0   3rd Qu.:  0.0000   3rd Qu.:   0.0000  
##  Max.   :5.0      Max.   :22000.0   Max.   :583.0000   Max.   :1700.0000  
##  NA's   :843563                                                           
##     PROPDMG          PROPDMGEXP        CROPDMG          CROPDMGEXP    
##  Min.   :   0.00          :465934   Min.   :  0.000          :618413  
##  1st Qu.:   0.00   K      :424665   1st Qu.:  0.000   K      :281832  
##  Median :   0.00   M      : 11330   Median :  0.000   M      :  1994  
##  Mean   :  12.06   0      :   216   Mean   :  1.527   k      :    21  
##  3rd Qu.:   0.50   B      :    40   3rd Qu.:  0.000   0      :    19  
##  Max.   :5000.00   5      :    28   Max.   :990.000   B      :     9  
##                    (Other):    84                     (Other):     9  
##       WFO                                       STATEOFFIC    
##         :142069                                      :248769  
##  OUN    : 17393   TEXAS, North                       : 12193  
##  JAN    : 13889   ARKANSAS, Central and North Central: 11738  
##  LWX    : 13174   IOWA, Central                      : 11345  
##  PHI    : 12551   KANSAS, Southwest                  : 11212  
##  TSA    : 12483   GEORGIA, North and Central         : 11120  
##  (Other):690738   (Other)                            :595920  
##                                ZONENAMES     
##                                                                 :594029  
##                                                                 :205988  
##  GREATER RENO / CARSON CITY / M - GREATER RENO / CARSON CITY / M:   639  
##  GREATER LAKE TAHOE AREA - GREATER LAKE TAHOE AREA              :   592  
##  JEFFERSON - JEFFERSON                                          :   303  
##  MADISON - MADISON                                              :   302  
##  (Other)                                                        :100444  
##     LATITUDE      LONGITUDE        LATITUDE_E     LONGITUDE_    
##  Min.   :   0   Min.   :-14451   Min.   :   0   Min.   :-14455  
##  1st Qu.:2802   1st Qu.:  7247   1st Qu.:   0   1st Qu.:     0  
##  Median :3540   Median :  8707   Median :   0   Median :     0  
##  Mean   :2875   Mean   :  6940   Mean   :1452   Mean   :  3509  
##  3rd Qu.:4019   3rd Qu.:  9605   3rd Qu.:3549   3rd Qu.:  8735  
##  Max.   :9706   Max.   : 17124   Max.   :9706   Max.   :106220  
##  NA's   :47                      NA's   :40                     
##                                            REMARKS           REFNUM      
##                                                :287433   Min.   :     1  
##                                                : 24013   1st Qu.:225575  
##  Trees down.\n                                 :  1110   Median :451149  
##  Several trees were blown down.\n              :   568   Mean   :451149  
##  Trees were downed.\n                          :   446   3rd Qu.:676723  
##  Large trees and power lines were blown down.\n:   432   Max.   :902297  
##  (Other)                                       :588295
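
The download-and-read step can be tried offline on a small file before fetching the full data set. The sketch below is an illustration only: the `toy` data frame and the temporary file are made up here and are not part of the storm data. It writes a bzip2-compressed CSV and reads it back through a `bzfile` connection, the same way the report reads StormData.csv.bz2.

```r
# Toy data standing in for the storm file (hypothetical values).
toy <- data.frame(EVTYPE = c("HAIL", "TORNADO", "FLOOD"),
                  FATALITIES = c(0, 3, 1))

# Write the toy data as a bzip2-compressed CSV in a temporary file.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# Read it back through a bzfile connection, as done for the real file.
toy_back <- read.csv(bzfile(tmp))
unlink(tmp)

print(toy_back)
```

If the round trip works, `toy_back` has the same columns and rows as `toy`, which suggests the same call should handle the real compressed file without a separate decompression step.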

Calculate property damage and crop damage

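  • The third step is to calculate the economic damage to property and crops: each damage value (PROPDMG or CROPDMG) is multiplied by the factor encoded in its exponent column (PROPDMGEXP or CROPDMGEXP), where K = 1,000, M = 1,000,000 and B = 1,000,000,000. Any other exponent code is treated as 0. The results are then converted to millions.
  • The fourth step is to sum the PROP_VALUE, CROP_VALUE, INJURIES and FATALITIES columns by event type and keep the top 10 event types for each.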
#calculate crop and prop value: K = thousand, M = million, B = billion, any other code = 0
storm$PROP_VALUE <- ifelse(toupper(storm$PROPDMGEXP) == "K", 1000,
                    ifelse(toupper(storm$PROPDMGEXP) == "M", 1000000,
                    ifelse(toupper(storm$PROPDMGEXP) == "B", 1000000000, 0))) * storm$PROPDMG
storm$CROP_VALUE <- ifelse(toupper(storm$CROPDMGEXP) == "K", 1000,
                    ifelse(toupper(storm$CROPDMGEXP) == "M", 1000000,
                    ifelse(toupper(storm$CROPDMGEXP) == "B", 1000000000, 0))) * storm$CROPDMG

#express both values in millions
storm$PROP_VALUE <- storm$PROP_VALUE/1000000
storm$CROP_VALUE <- storm$CROP_VALUE/1000000

#total property damage by EVTYPE, sorted in descending order, top 10 kept
prop_dmg <- aggregate(PROP_VALUE ~ EVTYPE, data = storm, FUN = sum)
prop_dmg <- arrange(prop_dmg, desc(PROP_VALUE))
top10_prop_dmg <- prop_dmg[1:10,]

#total crop damage by EVTYPE, sorted in descending order, top 10 kept
crop_dmg <- aggregate(CROP_VALUE ~ EVTYPE, data = storm, FUN = sum)
crop_dmg <- arrange(crop_dmg, desc(CROP_VALUE))
top10_crop_dmg <- crop_dmg[1:10,]

#total fatalities by EVTYPE, sorted in descending order, top 10 kept
fatalities <- aggregate(FATALITIES ~ EVTYPE, data = storm, FUN = sum)
fatalities <- arrange(fatalities, desc(FATALITIES))
top10_fatalities <- fatalities[1:10,]

#total injuries by EVTYPE, sorted in descending order, top 10 kept
injuries <- aggregate(INJURIES ~ EVTYPE, data = storm, FUN = sum)
injuries <- arrange(injuries, desc(INJURIES))
top10_injuries <- injuries[1:10,]
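
The multiply-then-sum logic above can be checked by hand on a few rows. The sketch below is a minimal illustration on made-up data. The `multiplier` lookup vector and the `top_n_by_event` helper are hypothetical names introduced here, not part of the original analysis. A named vector stands in for the nested `ifelse` calls, and any code other than K, M or B is treated as 0, as in the report.

```r
# Made-up rows: damage values and their exponent codes.
toy <- data.frame(
  EVTYPE     = c("FLOOD", "FLOOD", "HAIL", "TORNADO", "HAIL"),
  PROPDMG    = c(2.5, 10, 500, 1, 3),
  PROPDMGEXP = c("M", "k", "K", "B", ""),
  stringsAsFactors = FALSE
)

# Hypothetical lookup vector: exponent code -> multiplier.
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Codes missing from the lookup give NA, which is replaced by 0.
factor_by_row <- unname(multiplier[toupper(toy$PROPDMGEXP)])
factor_by_row[is.na(factor_by_row)] <- 0

# Damage in millions, matching PROP_VALUE in the report.
toy$PROP_VALUE <- factor_by_row * toy$PROPDMG / 1e6

# Hypothetical helper: total of one column per event type, largest first.
top_n_by_event <- function(data, column, n = 10) {
  totals <- aggregate(data[[column]], by = list(EVTYPE = data$EVTYPE), FUN = sum)
  names(totals)[2] <- column
  totals <- totals[order(-totals[[column]]), ]
  head(totals, n)
}

top2 <- top_n_by_event(toy, "PROP_VALUE", n = 2)
print(top2)
```

On this toy data TORNADO comes first with 1000 (one billion expressed in millions), followed by FLOOD with 2.51. The HAIL row with an empty exponent contributes nothing, which mirrors how the report discards damage values whose exponent is not K, M or B.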

Analysis of the top 10 event types by health damage

These plots show the 10 event types that caused the most injuries and the 10 that caused the most fatalities.

#plot injuries and fatalities
options(scipen=10)
injPlot <- ggplot(top10_injuries, aes(x = reorder(EVTYPE, -INJURIES), y = INJURIES)) +
  geom_bar(stat = "identity") +
  xlab("Weather Event Type") +
  ylab("Injuries") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  ggtitle('Top 10 Injuries')
fatPlot <- ggplot(top10_fatalities, aes(x = reorder(EVTYPE, -FATALITIES), y = FATALITIES)) +
  geom_bar(stat = "identity") +
  xlab("Weather Event Type") +
  ylab("Fatalities") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  ggtitle('Top 10 Fatalities')
grid.arrange(injPlot, fatPlot, nrow = 1)
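
The bars are sorted by `reorder(EVTYPE, -INJURIES)`, which orders the factor levels by the negated count so that the event type with the largest value comes first. A small sketch on made-up counts (the `toy` data frame is hypothetical) shows the resulting level order without drawing a plot:

```r
# Made-up injury counts per event type.
toy <- data.frame(EVTYPE = c("FLOOD", "TORNADO", "HEAT"),
                  INJURIES = c(50, 900, 200))

# reorder() sorts levels by increasing value of the second argument,
# so negating INJURIES puts the largest count first.
ordered_levels <- levels(reorder(toy$EVTYPE, -toy$INJURIES))
print(ordered_levels)
```

Here the levels come out as TORNADO, HEAT, FLOOD. ggplot2 draws a discrete x axis in level order, so the tallest bar ends up on the left.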

Analysis of the top 10 event types by economic damage

These plots show the 10 event types that caused the most property damage and the 10 that caused the most crop damage, in millions.

#plot property and crop damage
options(scipen=10)
propPlot <- ggplot(top10_prop_dmg, aes(x = reorder(EVTYPE, -PROP_VALUE), y = PROP_VALUE)) +
  geom_bar(stat = "identity") +
  xlab("Weather Event Type") +
  ylab("Property Damage (millions)") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  ggtitle('Top 10 Property Damage')
cropPlot <- ggplot(top10_crop_dmg, aes(x = reorder(EVTYPE, -CROP_VALUE), y = CROP_VALUE)) +
  geom_bar(stat = "identity") +
  xlab("Weather Event Type") +
  ylab("Crop Damage (millions)") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  ggtitle('Top 10 Crop Damage')
grid.arrange(propPlot, cropPlot, nrow = 1)
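
Both plotting blocks set `options(scipen=10)`, which penalises scientific notation when R formats numbers and so typically keeps large axis values in fixed notation. The effect can be seen on a single number. This is a sketch, with `sci_label` and `fixed_label` as illustrative names:

```r
# Remember the current setting so it can be restored afterwards.
old <- options(scipen = 0)
sci_label <- format(100000000)

# With a penalty of 10, fixed notation is preferred for this value.
options(scipen = 10)
fixed_label <- format(100000000)

options(old)
print(c(sci_label, fixed_label))
```

With the default penalty the value is formatted as 1e+08. With `scipen = 10` it is written out as 100000000.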