The Effect of Storms on the U.S. Economy and Its People

Peer assignment 2

Introduction

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration's (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

Reading Raw Data

This document assumes the user has the NOAA storm database saved as a raw CSV file. This is a very large data file, so reading all of it into memory can take a couple of minutes; note that the code below reads only the first 100,000 rows (nrows = 1e+05). The code chunks in this document are echoed (echo = TRUE) so that the analysis can be easily replicated by the user.

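library(ggplot2)  # qplot()
library(lattice)  # dotplot()
library(knitr)
library(plyr)
# Read only the first 100,000 rows of the raw CSV to limit load time
data2 <- read.csv("repdata_data_StormData.csv", nrows = 1e+05, header = TRUE, sep = ",")
data <- data.frame(data2)
# Calendar year of each event, parsed from BGN_DATE (e.g. "2/13/1952 0:00:00")
year <- as.numeric(format(as.Date(data$BGN_DATE, format = "%m/%d/%Y"), "%Y"))
# Event type of each record
newdata <- data$EVTYPE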

Health Consequences

In this analysis, health consequences refer to fatalities due to weather-related events. This study focuses on the 10 events associated with the most fatalities. This involves sorting the data frame by the number of fatalities and then subsetting the sorted data frame.

health.events <- data[order(-data$FATALITIES), ]
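
The sort-then-subset step described above can be sketched on a small made-up data frame. The `storm` data frame, its values, and the names `sorted.events` and `top.events` are illustrative assumptions, not NOAA data or part of the original analysis; with the real data, the same two calls would be applied to `data`.

```r
# Toy stand-in for the storm data (made-up values, not NOAA records).
storm <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "TORNADO", "FLOOD", "HEAT", "HAIL"),
  FATALITIES = c(7, 12, 3, 2, 5, 0),
  stringsAsFactors = FALSE
)

# Sort rows so the records with the most fatalities come first.
sorted.events <- storm[order(-storm$FATALITIES), ]

# Keep the first n rows (n = 3 here; the analysis in the text uses 10).
top.events <- head(sorted.events, 3)
print(top.events)
```

Because this ranks individual records, the same event type can appear more than once in the result.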

Winds

In this analysis, the types of wind events were examined along with the injuries and fatalities they caused.
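
One possible way to isolate wind events and total their injuries and fatalities is sketched below. It assumes that a wind event is any record whose EVTYPE contains the text "WIND" (case-insensitive); that rule, the toy `storm` data frame, and the names `is.wind`, `wind.events`, and `wind.totals` are assumptions made for illustration and are not taken from the original analysis.

```r
# Toy stand-in for the storm data (made-up values, not NOAA records).
storm <- data.frame(
  EVTYPE     = c("TSTM WIND", "HIGH WIND", "TORNADO", "Strong Winds", "HAIL"),
  FATALITIES = c(1, 2, 7, 0, 0),
  INJURIES   = c(4, 0, 26, 1, 0),
  stringsAsFactors = FALSE
)

# Assumption: a record is a wind event if its type mentions "WIND".
is.wind <- grepl("WIND", storm$EVTYPE, ignore.case = TRUE)
wind.events <- storm[is.wind, ]

# Total fatalities and injuries for each wind event type.
wind.totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE,
                         data = wind.events, FUN = sum)
print(wind.totals)
```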

Economic Consequences

In this analysis, economic consequences were measured using the PROPDMG and PROPDMGEXP variables. PROPDMG holds the numeric estimate of property damage, and PROPDMGEXP is a factor variable giving the monetary unit (scale) of that estimate. More specifically, the factor levels “K”, “M”, and “B” denote estimates in thousands of dollars, millions of dollars, and billions of dollars, respectively. Other factor levels are included in PROPDMGEXP, but these represent only a small percentage of the overall data points and their meaning is not contained in the documentation. Therefore, these other levels are excluded from the current analysis.
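
As a minimal sketch of how the scale codes relate to dollar amounts, the snippet below maps "K", "M", and "B" to multipliers and drops every other level, as described above. The toy `storm` data frame and the names `multiplier`, `scale`, and `damage.usd` are hypothetical and are used only for illustration.

```r
# Toy stand-in for the storm data (made-up values, not NOAA records).
storm <- data.frame(
  PROPDMG    = c(250, 2.5, 1.2, 10),
  PROPDMGEXP = c("K", "M", "B", "?"),
  stringsAsFactors = FALSE
)

# Multipliers for the three documented scale codes.
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Look up each record's multiplier; undocumented codes become NA.
storm$scale <- unname(multiplier[toupper(as.character(storm$PROPDMGEXP))])

# Convert to dollars and exclude records with an undocumented code.
storm$damage.usd <- storm$PROPDMG * storm$scale
storm <- storm[!is.na(storm$damage.usd), ]
print(storm)
```

Expressing every estimate in dollars would allow events recorded in different units to be compared directly, whereas the analysis in this report restricts attention to the "B" level.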

In this study, events associated with damages estimated in billions of dollars are considered to have the greatest economic consequences. The data frame is subset to include only those observations whose property damage is measured in billions of dollars. Among these, the ten costliest events are identified by ordering the data frame by the numeric estimate of property damage (PROPDMG) and then subsetting the ordered data frame.

Results

The following figures show the top 10 weather events associated with the greatest health and economic impacts. As indicated in the following figure, tornadoes were the most frequent weather-related event, representing over half of the top 10 events associated with fatalities. Heat and excessive heat accounted for the most fatalities. However, it is unclear whether a single heat event actually accounted for nearly 600 deaths. This issue cannot be resolved with the currently available documentation for the data source.

Winds


qplot(as.numeric(year), newdata, data = data, group = FATALITIES, color = FATALITIES, 
    geom = c("point", "line"), ylab = "Event type", xlab = "Year", main = "Fatalities by event type and year")

[Figure: plot produced by the qplot call above]

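# Rank records by fatalities, then injuries (largest first)
data1 <- data[order(data$FATALITIES, data$INJURIES, decreasing = TRUE), ]
# Rank records by the unscaled PROPDMG and CROPDMG values (largest first)
data3 <- data[order(data$PROPDMG, data$CROPDMG, decreasing = TRUE), ]
head(data3)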
##     STATE__          BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 13        1 2/13/1952 0:00:00     2130       CST     73  JEFFERSON    AL
## 36        1  5/1/1953 0:00:00     1930       CST     27       CLAY    AL
## 48        1 12/5/1954 0:00:00     1200       CST     81        LEE    AL
## 49        1 12/5/1954 0:00:00     1330       CST     15    CALHOUN    AL
## 121       1  4/8/1957 0:00:00      946       CST     93     MARION    AL
## 124       1  4/8/1957 0:00:00     1030       CST     43    CULLMAN    AL
##      EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END
## 13  TORNADO         0      NA         NA       NA       NA          0
## 36  TORNADO         0      NA         NA       NA       NA          0
## 48  TORNADO         0      NA         NA       NA       NA          0
## 49  TORNADO         0      NA         NA       NA       NA          0
## 121 TORNADO         0      NA         NA       NA       NA          0
## 124 TORNADO         0      NA         NA       NA       NA          0
##     COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES
## 13          NA         0      NA         NA    0.0   200 3   0          1
## 36          NA         0      NA         NA   12.1   440 4   0          7
## 48          NA         0      NA         NA   19.4   100 3   0          0
## 49          NA         0      NA         NA   24.7   100 3   0          0
## 121         NA         0      NA         NA   51.4   100 3   0          0
## 124         NA         0      NA         NA   16.3    33 3   0          0
##     INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC
## 13        26     250          K       0         NA             NA
## 36        12     250          K       0         NA             NA
## 48         4     250          K       0         NA             NA
## 49        26     250          K       0         NA             NA
## 121        0     250          K       0         NA             NA
## 124        0     250          K       0         NA             NA
##     ZONENAMES LATITUDE LONGITUDE LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 13         NA     3336      8656          0          0      NA     13
## 36         NA     3313      8556       3318       8545      NA     36
## 48         NA     3241      8525       3240       8505      NA     48
## 49         NA     3347      8600       3355       8536      NA     49
## 121        NA     3407      8759       3419       8707      NA    121
## 124        NA     3418      8636       3423       8620      NA    124

Economic Consequences

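# Keep only records whose property damage is expressed in billions of dollars
economic.events <- subset(data, subset = PROPDMGEXP == "B")
# Order by the numeric damage estimate and keep the ten costliest records
economic.events <- economic.events[order(-economic.events$PROPDMG), ]
economic.events <- head(economic.events, 10)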

Plotting


dotplot(EVTYPE ~ FATALITIES, data = health.events, xlab = "Weather-related fatalities", 
    main = "Weather-related Fatalities", ylab = "Event type")

[Figure: plot produced by the dotplot call above]

Events

# The event type is stored in the EVTYPE column of each data frame
health.events$event.type <- as.character(health.events$EVTYPE)
economic.events$event.type <- as.character(economic.events$EVTYPE)

Plotting

The following figure shows the amount of property damage in billions of dollars. It should be noted that the dollar values were log-transformed because they are highly skewed. These data suggest that warm-weather storms with both wind and rain are associated with the greatest amount of property damage. Only one of the events was a winter storm.

qplot(as.numeric(year), INJURIES, data = data, group = FATALITIES, 
    color = FATALITIES, geom = c("point", "line"), ylab = "Injuries", 
    xlab = "Year", main = "Injuries by year, colored by fatalities")

[Figure: plot produced by the qplot call above]
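
The log transformation of property damage mentioned in this section could be sketched as follows. The `costly` data frame and its dollar values are made up purely for illustration and are not results from the NOAA data; the base-10 logarithm is also an assumption, since the text does not say which base was used.

```r
library(lattice)

# Made-up damage estimates in billions of dollars (not NOAA results).
costly <- data.frame(
  EVTYPE  = c("HURRICANE", "FLOOD", "TORNADO", "WINTER STORM"),
  PROPDMG = c(16.9, 5.0, 2.8, 1.0),
  stringsAsFactors = FALSE
)

# Log-transform the skewed damage values (base 10 assumed here).
costly$log.damage <- log10(costly$PROPDMG)

# Dot plot of log damage, with event types ordered by damage.
p <- dotplot(reorder(EVTYPE, log.damage) ~ log.damage, data = costly,
             xlab = "log10(property damage in billions of dollars)",
             ylab = "Event type")
print(p)
```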