Severe Weather Impacts on Society

Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

Data Processing

The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.

*National Weather Service Storm Data Documentation

*National Climatic Data Center Storm Events FAQ

p2 <- read.csv("~/repdata-data-StormData.csv")
p2.1<-p2[,c(8,22:27)]
p2.1$EVTYPE<-as.factor(p2.1$EVTYPE)
library(dplyr)

## Warning: package 'dplyr' was built under R version 3.2.3

## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

p2.1grp<-group_by(p2.1,as.factor(p2.1$EVTYPE))
p2.2grp<-p2.1grp[,c(2:8)]

The preceding code reads the storm data into R and stores it in the variable p2. The data is then reduced to 7 columns: EVTYPE, MAG, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, and CROPDMG. The reduced data set is stored as p2.1, so that p2 remains unchanged. The EVTYPE variable is also converted to a factor so that it can be used for grouping.

Next, the “dplyr” package is loaded and the group_by() function is called. The new data set, grouped by EVTYPE, is stored as p2.1grp. The final line drops the original EVTYPE column, which duplicates the grouping column added by group_by(), and stores the result as p2.2grp.
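The same subsetting and grouping can also be written with column names instead of column positions, which keeps the code readable if the column order changes. The following is a minimal sketch on a small made-up data frame; storm_toy, storm_small, and storm_grouped are illustrative names that do not appear in the analysis above.

```r
library(dplyr)

# Toy stand-in for the storm data; all values are made up for illustration.
storm_toy <- data.frame(
  STATE      = c("AL", "AL", "TX", "TX"),
  EVTYPE     = c("TORNADO", "HAIL", "TORNADO", "FLOOD"),
  MAG        = c(0, 75, 0, 0),
  FATALITIES = c(2, 0, 1, 0),
  INJURIES   = c(10, 0, 4, 1),
  PROPDMG    = c(25, 0.5, 250, 5),
  PROPDMGEXP = c("K", "K", "K", "M"),
  CROPDMG    = c(0, 1, 0, 10),
  stringsAsFactors = FALSE
)

# Select by name, so the result does not depend on column positions.
storm_small <- storm_toy %>%
  select(EVTYPE, MAG, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG) %>%
  mutate(EVTYPE = as.factor(EVTYPE))

# Grouping on the existing column avoids creating a duplicate EVTYPE column.
storm_grouped <- group_by(storm_small, EVTYPE)
```

Because group_by() is given the bare column name here, no extra column is added and no later step is needed to remove a duplicate.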

summary(p2.2grp)

##       MAG            FATALITIES          INJURIES        
##  Min.   :    0.0   Min.   :  0.0000   Min.   :   0.0000  
##  1st Qu.:    0.0   1st Qu.:  0.0000   1st Qu.:   0.0000  
##  Median :   50.0   Median :  0.0000   Median :   0.0000  
##  Mean   :   46.9   Mean   :  0.0168   Mean   :   0.1557  
##  3rd Qu.:   75.0   3rd Qu.:  0.0000   3rd Qu.:   0.0000  
##  Max.   :22000.0   Max.   :583.0000   Max.   :1700.0000  
##                                                          
##     PROPDMG          PROPDMGEXP        CROPDMG       
##  Min.   :   0.00          :465934   Min.   :  0.000  
##  1st Qu.:   0.00   K      :424665   1st Qu.:  0.000  
##  Median :   0.00   M      : 11330   Median :  0.000  
##  Mean   :  12.06   0      :   216   Mean   :  1.527  
##  3rd Qu.:   0.50   B      :    40   3rd Qu.:  0.000  
##  Max.   :5000.00   5      :    28   Max.   :990.000  
##                    (Other):    84                    
##        as.factor(p2.1$EVTYPE)
##  HAIL             :288661    
##  TSTM WIND        :219940    
##  THUNDERSTORM WIND: 82563    
##  TORNADO          : 60652    
##  FLASH FLOOD      : 54277    
##  FLOOD            : 25326    
##  (Other)          :170878

From the summary above, there appear to be extreme outliers: for many variables the first three quartiles are 0 or very small, while the maximum is far larger and pulls the mean upward. This may imply that some specific event types have the potential to do far more damage than others.

Therefore further investigation is required.
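One way to quantify the skew noted above is to compare the share of zero values, the median, and the mean. The following is a small sketch on a made-up vector that mimics the pattern in the summary; injuries_toy and the derived names are illustrative only.

```r
# Made-up injury counts with one extreme event, mimicking the pattern above.
injuries_toy <- c(0, 0, 0, 0, 0, 0, 0, 1, 2, 1700)

zero_share   <- mean(injuries_toy == 0)  # proportion of events with no injuries
median_value <- median(injuries_toy)     # unaffected by the single extreme value
mean_value   <- mean(injuries_toy)       # pulled upward by the extreme value

# Upper quantiles show where the bulk of the data ends and the tail begins.
upper_quantiles <- quantile(injuries_toy, probs = c(0.5, 0.9, 0.99))
```

A large gap between the median and the mean, together with a high share of zeros, points to a few events driving the totals.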

tapply(X = p2.2grp$FATALITIES,INDEX = p2.2grp[,7],FUN = mean)->p2.2grp_fatal_means
tapply(X = p2.2grp$INJURIES,INDEX = p2.2grp[,7],FUN = mean)->p2.2grp_injury_means
tapply(X = p2.2grp$PROPDMG,INDEX = p2.2grp[,7],FUN = mean)->p2.2grp_propdmg_means
tapply(X = p2.2grp$CROPDMG,INDEX = p2.2grp[,7],FUN = mean)->p2.2grp_cropdmg_means
cbind(unique(p2.2grp$`as.factor(p2.1$EVTYPE)`),p2.2grp_fatal_means,p2.2grp_injury_means,p2.2grp_propdmg_means,p2.2grp_cropdmg_means)->bindedmeans
evnt_means_df<-as.data.frame(bindedmeans)
evnt_means_df<-evnt_means_df[,-c(1)]
evnt_avg_sum<-as.data.frame(cbind(rowSums(x = evnt_means_df[,c(1,2)]),rowSums(x = evnt_means_df[,c(3,4)]),rowSums(x = evnt_means_df)))
names(evnt_avg_sum)<-c("Average_Population_Casualties","Average_Economic_Loss","Average_Total")
evnt_avg_sum$EventType<-row.names(x = evnt_avg_sum)
arrange(evnt_avg_sum,desc(evnt_avg_sum$Average_Economic_Loss))->Economy
arrange(evnt_avg_sum,desc(evnt_avg_sum$Average_Population_Casualties))->Casualties

The above code will do the following:

*calculate means indexed by event type

+fatality count means

+injury count means

+property damage means ($)

+crop damage means ($)

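The next step binds the mean vectors into a single data frame named evnt_means_df. Row sums then give the combined casualty average (fatalities plus injuries) and the combined economic-loss average (property plus crop damage) for each event type, and arrange() sorts the event types by each of these in descending order.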
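The steps above can also be sketched as a single dplyr pipeline. This is an alternative formulation on made-up data, not the code used for the results below; storm_toy and event_means are illustrative names.

```r
library(dplyr)

# Toy data; all values are made up for illustration.
storm_toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "HAIL", "HAIL", "FLOOD"),
  FATALITIES = c(2, 0, 0, 0, 1),
  INJURIES   = c(10, 4, 0, 0, 1),
  PROPDMG    = c(25, 75, 1, 3, 50),
  CROPDMG    = c(0, 0, 2, 4, 10)
)

event_means <- storm_toy %>%
  group_by(EVTYPE) %>%
  # One mean per event type for each outcome variable.
  summarise(
    fatal_mean   = mean(FATALITIES),
    injury_mean  = mean(INJURIES),
    propdmg_mean = mean(PROPDMG),
    cropdmg_mean = mean(CROPDMG)
  ) %>%
  # Combine the means into casualty and economic-loss averages.
  mutate(
    Average_Population_Casualties = fatal_mean + injury_mean,
    Average_Economic_Loss         = propdmg_mean + cropdmg_mean
  ) %>%
  # Sort so the event type with the highest casualty average comes first.
  arrange(desc(Average_Population_Casualties))
```

Keeping the event type as a regular column throughout avoids relying on row names to carry it.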

Results

We now observe the following:

library(lattice)

Casualties$EventType[1:10]->x1
Casualties$Average_Population_Casualties[1:10]->y1
names(y1)<-x1
barchart(y1,main = "Average Combined Fatality and Injury Count (per event)",xlab = "")

Economy$EventType[1:10]->x2
Economy$Average_Economic_Loss[1:10]->y2
names(y2)<-x2

barchart(y2,main = "Combined Property and Crop Damage (per $1000)", xlab = "")
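The chart title reads the PROPDMG values as thousands of dollars. The PROPDMGEXP column in the summary output also lists M and B codes alongside K, so a dollar-scaled variant could be sketched as follows. The sketch assumes that K, M, and B stand for thousands, millions, and billions, and that any other code (blank, digits, and so on) leaves the value unscaled; that second rule is a simplifying assumption. damage_toy, multiplier, and PROPDMG_USD are illustrative names.

```r
# Toy rows; all values are made up for illustration.
damage_toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Assumed multipliers: K = thousand, M = million, B = billion.
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Look up each code; codes missing from the table come back as NA.
mult <- unname(multiplier[toupper(damage_toy$PROPDMGEXP)])

# Assumption: unrecognised codes leave the value unscaled.
mult[is.na(mult)] <- 1

damage_toy$PROPDMG_USD <- damage_toy$PROPDMG * mult
```

Averages computed on a scaled column like PROPDMG_USD would be in dollars, which makes event types comparable even when their records use different magnitude codes.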