Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

This report uses these data to answer two questions: “Across the United States, which types of events are most harmful with respect to population health?” and “Across the United States, which types of events have the greatest economic consequences?” To address the first question, a bar plot shows the number of injuries and fatalities for the ten highest-ranked event types. To address the second, another bar plot shows the amount of property and crop damage for the ten highest-ranked event types.

Data Processing

library(ggplot2)
library(reshape2)
library(knitr)
data <- read.csv("repdata_data_StormData.csv")
head(data)
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE  EVTYPE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL TORNADO
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL TORNADO
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL TORNADO
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL TORNADO
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL TORNADO
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL TORNADO
##   BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END COUNTYENDN
## 1         0                                               0         NA
## 2         0                                               0         NA
## 3         0                                               0         NA
## 4         0                                               0         NA
## 5         0                                               0         NA
## 6         0                                               0         NA
##   END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES INJURIES PROPDMG
## 1         0                      14.0   100 3   0          0       15    25.0
## 2         0                       2.0   150 2   0          0        0     2.5
## 3         0                       0.1   123 2   0          0        2    25.0
## 4         0                       0.0   100 2   0          0        2     2.5
## 5         0                       0.0   150 2   0          0        2     2.5
## 6         0                       1.5   177 2   0          0        6     2.5
##   PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES LATITUDE LONGITUDE
## 1          K       0                                         3040      8812
## 2          K       0                                         3042      8755
## 3          K       0                                         3340      8742
## 4          K       0                                         3458      8626
## 5          K       0                                         3412      8642
## 6          K       0                                         3450      8748
##   LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1       3051       8806              1
## 2          0          0              2
## 3          0          0              3
## 4          0          0              4
## 5          0          0              5
## 6          0          0              6
summary(data)
##     STATE__       BGN_DATE           BGN_TIME          TIME_ZONE        
##  Min.   : 1.0   Length:902297      Length:902297      Length:902297     
##  1st Qu.:19.0   Class :character   Class :character   Class :character  
##  Median :30.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :31.2                                                           
##  3rd Qu.:45.0                                                           
##  Max.   :95.0                                                           
##                                                                         
##      COUNTY       COUNTYNAME           STATE              EVTYPE         
##  Min.   :  0.0   Length:902297      Length:902297      Length:902297     
##  1st Qu.: 31.0   Class :character   Class :character   Class :character  
##  Median : 75.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :100.6                                                           
##  3rd Qu.:131.0                                                           
##  Max.   :873.0                                                           
##                                                                          
##    BGN_RANGE          BGN_AZI           BGN_LOCATI          END_DATE        
##  Min.   :   0.000   Length:902297      Length:902297      Length:902297     
##  1st Qu.:   0.000   Class :character   Class :character   Class :character  
##  Median :   0.000   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :   1.484                                                           
##  3rd Qu.:   1.000                                                           
##  Max.   :3749.000                                                           
##                                                                             
##    END_TIME           COUNTY_END COUNTYENDN       END_RANGE       
##  Length:902297      Min.   :0    Mode:logical   Min.   :  0.0000  
##  Class :character   1st Qu.:0    NA's:902297    1st Qu.:  0.0000  
##  Mode  :character   Median :0                   Median :  0.0000  
##                     Mean   :0                   Mean   :  0.9862  
##                     3rd Qu.:0                   3rd Qu.:  0.0000  
##                     Max.   :0                   Max.   :925.0000  
##                                                                   
##    END_AZI           END_LOCATI            LENGTH              WIDTH         
##  Length:902297      Length:902297      Min.   :   0.0000   Min.   :   0.000  
##  Class :character   Class :character   1st Qu.:   0.0000   1st Qu.:   0.000  
##  Mode  :character   Mode  :character   Median :   0.0000   Median :   0.000  
##                                        Mean   :   0.2301   Mean   :   7.503  
##                                        3rd Qu.:   0.0000   3rd Qu.:   0.000  
##                                        Max.   :2315.0000   Max.   :4400.000  
##                                                                              
##        F               MAG            FATALITIES           INJURIES        
##  Min.   :0.00     Min.   :    0.0   Min.   :  0.00000   Min.   :   0.0000  
##  1st Qu.:0.00     1st Qu.:    0.0   1st Qu.:  0.00000   1st Qu.:   0.0000  
##  Median :1.00     Median :   50.0   Median :  0.00000   Median :   0.0000  
##  Mean   :0.91     Mean   :   46.9   Mean   :  0.01678   Mean   :   0.1557  
##  3rd Qu.:1.00     3rd Qu.:   75.0   3rd Qu.:  0.00000   3rd Qu.:   0.0000  
##  Max.   :5.00     Max.   :22000.0   Max.   :583.00000   Max.   :1700.0000  
##  NA's   :843563                                                            
##     PROPDMG         PROPDMGEXP           CROPDMG         CROPDMGEXP       
##  Min.   :   0.00   Length:902297      Min.   :  0.000   Length:902297     
##  1st Qu.:   0.00   Class :character   1st Qu.:  0.000   Class :character  
##  Median :   0.00   Mode  :character   Median :  0.000   Mode  :character  
##  Mean   :  12.06                      Mean   :  1.527                     
##  3rd Qu.:   0.50                      3rd Qu.:  0.000                     
##  Max.   :5000.00                      Max.   :990.000                     
##                                                                           
##      WFO             STATEOFFIC         ZONENAMES            LATITUDE   
##  Length:902297      Length:902297      Length:902297      Min.   :   0  
##  Class :character   Class :character   Class :character   1st Qu.:2802  
##  Mode  :character   Mode  :character   Mode  :character   Median :3540  
##                                                           Mean   :2875  
##                                                           3rd Qu.:4019  
##                                                           Max.   :9706  
##                                                           NA's   :47    
##    LONGITUDE        LATITUDE_E     LONGITUDE_       REMARKS         
##  Min.   :-14451   Min.   :   0   Min.   :-14455   Length:902297     
##  1st Qu.:  7247   1st Qu.:   0   1st Qu.:     0   Class :character  
##  Median :  8707   Median :   0   Median :     0   Mode  :character  
##  Mean   :  6940   Mean   :1452   Mean   :  3509                     
##  3rd Qu.:  9605   3rd Qu.:3549   3rd Qu.:  8735                     
##  Max.   : 17124   Max.   :9706   Max.   :106220                     
##                   NA's   :40                                        
##      REFNUM      
##  Min.   :     1  
##  1st Qu.:225575  
##  Median :451149  
##  Mean   :451149  
##  3rd Qu.:676723  
##  Max.   :902297  
## 
data_sub <- subset(data, select = c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP"))

Question 1: Across the United States, which types of events are most harmful with respect to population health?

# Data prepping: total fatalities and injuries per event type
events <- data_sub$EVTYPE
events_factors <- factor(events)
fatalities <- data_sub$FATALITIES
injuries <- data_sub$INJURIES
fatalities_sum <- aggregate(fatalities, list(events_factors), sum)
injuries_sum <- aggregate(injuries, list(events_factors), sum)
names(fatalities_sum) <- c("Event", "Count")
names(injuries_sum) <- c("Event", "Count")
# Both aggregates are grouped by the same factor, so their rows line up by event
pop_health <- data.frame(fatalities_sum$Event, injuries_sum$Count, fatalities_sum$Count)
names(pop_health) <- c("Event", "Injuries", "Fatalities")
# Keep the 10 event types with the most injuries (ties broken by fatalities)
pop_health <- pop_health[with(pop_health, order(-Injuries, -Fatalities)), ][1:10, ]
# Reshape to long format for a stacked bar plot
pop_health <- melt(pop_health, id.vars = "Event")
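
The per-event summing step above can also be written as a single formula-based `aggregate` call. The sketch below is only an illustration in base R: the data frame `toy` and its values are made up and are not taken from the NOAA database.

```r
# Toy data for illustration only; these values are not from the NOAA database.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(2, 1, 3, 4, 0),
  INJURIES   = c(10, 0, 5, 1, 2)
)

# Sum both outcome columns per event type in one call.
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Rank by injuries first, then fatalities, mirroring the ordering used above.
totals <- totals[order(-totals$INJURIES, -totals$FATALITIES), ]
print(totals)
```

Summing both columns in one call keeps fatalities and injuries in the same row for each event, so no separate alignment step is needed.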

Results

ggplot(data = pop_health, aes(x = Event, y = value, fill = variable)) +
  geom_bar(stat = "identity") +
  labs(title = "Harmful Weather Measured by Fatalities & Injuries 1950 - 2011",
       y = "Number of People", x = "Weather Event", fill = "Harm") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

The plot shows that tornadoes cause far more injuries and fatalities than any other type of weather event.

Question 2: Across the United States, which types of events have the greatest economic consequences?

# Data prepping: scale a damage value by its exponent code
# (K = thousands, M = millions, B = billions; anything else is left unscaled)
convertUnits <- function(coeff, expon) {
  if (is.na(expon)) {
    as.numeric(coeff)
  } else if (toupper(expon) == "K") {
    as.numeric(coeff) * 10^3
  } else if (toupper(expon) == "M") {
    as.numeric(coeff) * 10^6
  } else if (toupper(expon) == "B") {
    as.numeric(coeff) * 10^9
  } else {
    as.numeric(coeff)
  }
}

# Convert each row's damage value to dollars
property_dam <- apply(data_sub[, c('PROPDMG', 'PROPDMGEXP')], 1, function(y) convertUnits(y['PROPDMG'], y['PROPDMGEXP']))
crop_dam <- apply(data_sub[, c('CROPDMG', 'CROPDMGEXP')], 1, function(y) convertUnits(y['CROPDMG'], y['CROPDMGEXP']))
# Total property and crop damage per event type
property_dam_sum <- aggregate(property_dam, list(events_factors), sum)
crop_dam_sum <- aggregate(crop_dam, list(events_factors), sum)
names(property_dam_sum) <- c("Event", "Count")
names(crop_dam_sum) <- c("Event", "Count")
economics <- data.frame(property_dam_sum$Event, crop_dam_sum$Count, property_dam_sum$Count)
names(economics) <- c("Event", "Crop_Damage", "Property_Damage")
# Keep the 10 event types with the most crop damage (ties broken by property damage)
economics <- economics[with(economics, order(-Crop_Damage, -Property_Damage)), ][1:10, ]
# Reshape to long format for a stacked bar plot
economics <- melt(economics, id.vars = "Event")
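
Calling `convertUnits` once per row through `apply` can be slow on roughly 900,000 records. A vectorized lookup is one possible alternative. The sketch below assumes the same K/M/B rules as `convertUnits`; the name `convert_units_vec` is hypothetical and not part of the original analysis.

```r
# Hypothetical vectorized alternative to convertUnits(); same K/M/B rules.
convert_units_vec <- function(coeff, expon) {
  # Named lookup: K = thousands, M = millions, B = billions.
  multipliers <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- multipliers[toupper(expon)]
  # Missing, empty, or unrecognized codes leave the value unscaled.
  mult[is.na(mult)] <- 1
  as.numeric(coeff) * unname(mult)
}

# Small made-up example: 25K, 2.5M, 1 with no code, 3B.
example_damage <- convert_units_vec(c(25, 2.5, 1, 3), c("K", "m", "", "B"))
print(example_damage)
```

Under these assumptions, `convert_units_vec(data_sub$PROPDMG, data_sub$PROPDMGEXP)` would play the role of the `apply` call for property damage, and likewise for the crop columns.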

Results

ggplot(data = economics, aes(x = Event, y = value / 10^9, fill = variable)) +
  geom_bar(stat = "identity") +
  labs(title = "Harmful Weather Measured by Property & Crop Damage",
       y = "Damage (billions of USD)", x = "Weather Event", fill = "Damage") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Among the ten event types plotted, floods and hurricanes/typhoons cause the most property damage, and droughts cause the most crop damage.
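
Because the data-prep code sorts by crop damage first, the ten events plotted are those with the highest crop damage. If overall economic impact is the quantity of interest, one alternative is to rank by combined damage. The sketch below uses made-up totals; `toy_econ`, its values, and the `Total` column are illustrative assumptions, not results from the NOAA data.

```r
# Toy totals in USD for illustration only; not actual NOAA figures.
toy_econ <- data.frame(
  Event           = c("A", "B", "C", "D"),
  Crop_Damage     = c(5e9, 1e9, 0, 2e9),
  Property_Damage = c(1e9, 9e9, 4e9, 1e9)
)

# Hypothetical extra column: combined damage per event.
toy_econ$Total <- toy_econ$Crop_Damage + toy_econ$Property_Damage

# Keep the top events by combined damage (3 here; the report keeps 10).
top_total <- head(toy_econ[order(-toy_econ$Total), ], 3)
print(top_total$Event)
```

Ranking on the combined column can surface events with large property losses but little crop damage, which a crop-first ordering may leave out of the top ten.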