1. Synopsis:

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

This report studies the effect of weather events on both population health (fatalities and injuries) and the economy (property and crop damage). Bar plots show the top 10 weather event types that cause the most fatalities and the most injuries; the results indicate that tornadoes caused the most of both. Bar plots also show the top 10 event types that cause the highest property damage and the highest crop damage.

2. Data Processing:

  1. The data for this project comes as a CSV file compressed in bz2 format to reduce its size.
  2. The data was obtained from the National Weather Service Storm Data.
  3. First, we read the data into R to analyze it.

2.1 Reading the storm data into R

stormdata <- read.csv("data.csv.bz2",sep = ",",header = TRUE)
head(stormdata)
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL
##    EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END
## 1 TORNADO         0                                               0
## 2 TORNADO         0                                               0
## 3 TORNADO         0                                               0
## 4 TORNADO         0                                               0
## 5 TORNADO         0                                               0
## 6 TORNADO         0                                               0
##   COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES
## 1         NA         0                      14.0   100 3   0          0
## 2         NA         0                       2.0   150 2   0          0
## 3         NA         0                       0.1   123 2   0          0
## 4         NA         0                       0.0   100 2   0          0
## 5         NA         0                       0.0   150 2   0          0
## 6         NA         0                       1.5   177 2   0          0
##   INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES
## 1       15    25.0          K       0                                    
## 2        0     2.5          K       0                                    
## 3        2    25.0          K       0                                    
## 4        2     2.5          K       0                                    
## 5        2     2.5          K       0                                    
## 6        6     2.5          K       0                                    
##   LATITUDE LONGITUDE LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1     3040      8812       3051       8806              1
## 2     3042      8755          0          0              2
## 3     3340      8742          0          0              3
## 4     3458      8626          0          0              4
## 5     3412      8642          0          0              5
## 6     3450      8748          0          0              6
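
As a small check on why no explicit decompression step is needed, the sketch below writes a made-up two-row table through a bzip2 connection and reads it back with a plain read.csv() call. The temporary file and the toy values are assumptions for illustration only; they are not part of the storm data.

```r
# Write a tiny CSV through a bzip2 connection, then read it back.
# The file and its contents are made up for illustration only.
tmp <- tempfile(fileext = ".csv.bz2")
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(3, 1))

con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() opens the file in read mode, where R detects the
# compression from the file contents and decompresses on the fly.
toy_back <- read.csv(tmp, stringsAsFactors = FALSE)
print(toy_back)
```

The same mechanism is what lets read.csv("data.csv.bz2") work directly on the compressed storm file.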

2.2 Processing the data:

  1. To find out which storm events cause the most fatalities and injuries, we total the fatalities and injuries by event type (EVTYPE) and store the results in the variables “fatal” and “injured”.
# dplyr provides %>%, group_by() and summarise()
library(dplyr)

fatal <- stormdata %>% group_by(EVTYPE) %>% summarise(FATALITIES = sum(FATALITIES))

injured <- stormdata %>% group_by(EVTYPE) %>% summarise(INJURIES = sum(INJURIES))
  1. Next, we sort the aggregated data in decreasing order of fatalities and injuries, so that the top 10 event types appear first.
fatal <- fatal[order(fatal$FATALITIES,decreasing = TRUE),]

injured <- injured[order(injured$INJURIES,decreasing = TRUE ),]
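
The group-then-sort pattern used above can be tried on a small table. The sketch below is a base-R equivalent (aggregate() instead of the dplyr pipeline) on made-up numbers, so it runs without any packages; the toy data frame and the name top2 are assumptions for illustration.

```r
# Made-up records: several rows per event type
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD", "TORNADO"),
  FATALITIES = c(2, 1, 5, 4, 0, 1)
)

# Sum fatalities within each event type (base-R stand-in for
# group_by() + summarise())
totals <- aggregate(FATALITIES ~ EVTYPE, data = toy, FUN = sum)

# Sort so the deadliest event type comes first, then keep the top rows
totals <- totals[order(totals$FATALITIES, decreasing = TRUE), ]
top2 <- head(totals, 2)
print(top2)
```

On the real data, head(fatal, 10) and head(injured, 10) would give the same rows that the plots later select with [1:10].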
  1. To account for the economic damage caused by these weather events (EVTYPE in the dataset), we extract the property damage and crop damage columns along with the event type and preprocess them for analysis.

  2. Processing for property damage costs

data <- stormdata[c( "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP","EVTYPE")]
# Assigning values for the property exponent data 
data$PROPEXP[data$PROPDMGEXP == "K"] <- 1000
data$PROPEXP[data$PROPDMGEXP == "M"] <- 1e+06
data$PROPEXP[data$PROPDMGEXP == ""] <- 1
data$PROPEXP[data$PROPDMGEXP == "B"] <- 1e+09
data$PROPEXP[data$PROPDMGEXP == "m"] <- 1e+06
data$PROPEXP[data$PROPDMGEXP == "0"] <- 1
data$PROPEXP[data$PROPDMGEXP == "5"] <- 1e+05
data$PROPEXP[data$PROPDMGEXP == "6"] <- 1e+06
data$PROPEXP[data$PROPDMGEXP == "4"] <- 10000
data$PROPEXP[data$PROPDMGEXP == "2"] <- 100
data$PROPEXP[data$PROPDMGEXP == "3"] <- 1000
data$PROPEXP[data$PROPDMGEXP == "h"] <- 100
data$PROPEXP[data$PROPDMGEXP == "7"] <- 1e+07
data$PROPEXP[data$PROPDMGEXP == "H"] <- 100
data$PROPEXP[data$PROPDMGEXP == "1"] <- 10
data$PROPEXP[data$PROPDMGEXP == "8"] <- 1e+08
# Assigning '0' to invalid exponent data
data$PROPEXP[data$PROPDMGEXP == "+"] <- 0
data$PROPEXP[data$PROPDMGEXP == "-"] <- 0
data$PROPEXP[data$PROPDMGEXP == "?"] <- 0
# Calculating the property damage value
data$PROPDMGVAL <- data$PROPDMG * data$PROPEXP
  1. Processing for costs of crop damage.
data$CROPEXP[data$CROPDMGEXP == "M"] <- 1e+06
data$CROPEXP[data$CROPDMGEXP == "K"] <- 1000
data$CROPEXP[data$CROPDMGEXP == "m"] <- 1e+06
data$CROPEXP[data$CROPDMGEXP == "B"] <- 1e+09
data$CROPEXP[data$CROPDMGEXP == "0"] <- 1
data$CROPEXP[data$CROPDMGEXP == "k"] <- 1000
data$CROPEXP[data$CROPDMGEXP == "2"] <- 100
data$CROPEXP[data$CROPDMGEXP == ""] <- 1
# Assigning '0' to invalid exponent data
data$CROPEXP[data$CROPDMGEXP == "?"] <- 0
# calculating the crop damage value
data$CROPDMGVAL <- data$CROPDMG * data$CROPEXP
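
The two exponent mappings above follow the same rules, so they could also be expressed once as a lookup. The sketch below is one possible way to do that; the function name exp_to_multiplier and the lookup vector are hypothetical and not part of the original analysis. It assumes the same interpretation as the assignments above: letters are magnitude suffixes, a digit n means 10^n, an empty code means 1, and "+", "-" and "?" are treated as invalid (0).

```r
# Hypothetical lookup for the letter codes (same values as used above)
exp_lookup <- c(H = 1e2, h = 1e2, K = 1e3, k = 1e3,
                M = 1e6, m = 1e6, B = 1e9)

# Hypothetical helper: convert exponent codes to numeric multipliers
exp_to_multiplier <- function(code) {
  code <- as.character(code)
  mult <- unname(exp_lookup[code])          # NA for codes not in the lookup
  digit <- grepl("^[0-8]$", code)           # "0".."8" mean 10^n
  mult[digit] <- 10^as.numeric(code[digit])
  mult[code %in% ""] <- 1                   # no exponent given
  mult[code %in% c("+", "-", "?")] <- 0     # invalid codes
  mult
}

# Worked example: 25.0 with code "K" is 25 * 1000 = 25000
print(25.0 * exp_to_multiplier("K"))
```

With such a helper, the two value columns would reduce to data$PROPDMG * exp_to_multiplier(data$PROPDMGEXP) and the crop equivalent. Any code that is not covered comes back as NA, which makes unexpected codes easy to spot.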
  1. After processing the data, we total the property damage and the crop damage for each event type.
  2. Each table is then sorted so that the event type with the highest economic damage comes first and the one with the least comes last.
propdmg <- data %>% group_by(EVTYPE) %>% summarise(PROPDMGVAL = sum(PROPDMGVAL))

cropdmg <- data %>% group_by(EVTYPE) %>% summarise(CROPDMGVAL = sum(CROPDMGVAL))

propdmg <- propdmg[order(propdmg$PROPDMGVAL,decreasing = TRUE),]
cropdmg <- cropdmg[order(cropdmg$CROPDMGVAL,decreasing = TRUE),]

3. Plotting the data

3.1 Plotting the data for the top ten events causing extensive property and crop damage over the time span of the data collected.

par(mfrow = c(1,2),mar = c(12,4,3,2),mgp = c(3,1,0),cex= 0.8)

barplot(propdmg$PROPDMGVAL[1:10],names.arg =  propdmg$EVTYPE[1:10],col = "steel blue",las = 3,ylab = "Property damage value" , main = "Highest Property damage by top 10 Events")

barplot(cropdmg$CROPDMGVAL[1:10],names.arg =  cropdmg$EVTYPE[1:10],col = "steel blue",las = 3,ylab = "Crop damage value" , main = "Highest crop damage by top 10 Events")

3.2 Plotting the data for the top 10 events causing the most injuries and fatalities over the time span of the data collected.

par(mfrow = c(1,2),mar = c(12,4,3,2),mgp = c(3,1,0),cex= 0.8)

barplot(fatal$FATALITIES[1:10],names.arg = fatal$EVTYPE[1:10],col = "steel blue",las = 3,ylab = "No. of fatalities" , main = "Highest Fatalities by top 10 Events")

barplot(injured$INJURIES[1:10],names.arg = injured$EVTYPE[1:10],las = 3 , main = "Highest Injuries by top 10 Events" ,col = "Steel Blue",ylab = "No. of Injuries")
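
The four barplot() calls above differ only in the table, the value column and the labels. As a sketch, they could be wrapped in a small helper; plot_top_n is a hypothetical name, not part of the original analysis, and the toy numbers below are made up for illustration. It assumes the table is already sorted in decreasing order and has an EVTYPE column.

```r
# Hypothetical helper: bar plot of the first n rows of a sorted summary table.
# Extra arguments (ylab, main, ...) are passed through to barplot().
plot_top_n <- function(df, value_col, n = 10, ...) {
  top <- head(df, n)
  barplot(top[[value_col]],
          names.arg = as.character(top$EVTYPE),
          col = "steelblue",
          las = 3,          # rotate axis labels so long event names fit
          ...)
}

# Made-up, already-sorted totals
toy <- data.frame(EVTYPE = c("TORNADO", "EXCESSIVE HEAT", "FLASH FLOOD"),
                  FATALITIES = c(30, 12, 7))

# barplot() returns the bar midpoints, one per bar drawn
mids <- plot_top_n(toy, "FATALITIES", n = 2,
                   ylab = "No. of fatalities", main = "Toy example")
```

On the real tables this would be called as, for example, plot_top_n(fatal, "FATALITIES", ylab = "No. of fatalities", main = "Highest Fatalities by top 10 Events") after the par() call that sets up the two-panel layout.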

4. Results

The bar plots indicate that tornadoes caused the most fatalities and the most injuries, followed by Excessive Heat for fatalities and Thunderstorm Wind for injuries. Floods caused the most property damage, followed by Hurricane/Typhoon, while droughts caused the most crop damage, followed by floods.