Synopsis

  1. The source data for this research is the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

  2. The main goal of research is to address following questions:
    2.1. Across the United States, which types of events are most harmful with respect to population health?
    2.2. Across the United States, which types of events have the greatest economic consequences?

  3. Most part of the records in the dataset have zero casualties and economic damage, so we could omit them while processing the data.

  4. Another part of data processing was to align event records with the official event names from NATIONAL WEATHER SERVICE INSTRUCTION 10-1605

  5. After that, we build two resulting datasets to address the main questions of the research.

  6. Thus we have found the most severe weather events. In addition, we have built two maps showing different levels of damage of tornadoes for population health and of floods for economics by U.S. Counties

Data Processing

First we need to read the source data. Due to the bzip2 format, we can read it directly from the compressed file by read.csv() without unzipping.

temp <- tempfile()
download.file("http://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",temp)
data <- read.csv(temp)
unlink(temp)
rm(temp)

Let’s see what is inside the data.

str(data)
## 'data.frame':    902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : Factor w/ 16335 levels "1/1/1966 0:00:00",..: 6523 6523 4242 11116 2224 2224 2260 383 3980 3980 ...
##  $ BGN_TIME  : Factor w/ 3608 levels "00:00:00 AM",..: 272 287 2705 1683 2584 3186 242 1683 3186 3186 ...
##  $ TIME_ZONE : Factor w/ 22 levels "ADT","AKS","AST",..: 7 7 7 7 7 7 7 7 7 7 ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: Factor w/ 29601 levels "","5NM E OF MACKINAC BRIDGE TO PRESQUE ISLE LT MI",..: 13513 1873 4598 10592 4372 10094 1973 23873 24418 4598 ...
##  $ STATE     : Factor w/ 72 levels "AK","AL","AM",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ EVTYPE    : Factor w/ 985 levels "   HIGH SURF ADVISORY",..: 834 834 834 834 834 834 834 834 834 834 ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : Factor w/ 35 levels "","  N"," NW",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_LOCATI: Factor w/ 54429 levels "","- 1 N Albion",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_DATE  : Factor w/ 6663 levels "","1/1/1993 0:00:00",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_TIME  : Factor w/ 3647 levels ""," 0900CST",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : Factor w/ 24 levels "","E","ENE","ESE",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_LOCATI: Factor w/ 34506 levels "","- .5 NNW",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: Factor w/ 19 levels "","-","?","+",..: 17 17 17 17 17 17 17 17 17 17 ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: Factor w/ 9 levels "","?","0","2",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ WFO       : Factor w/ 542 levels ""," CI","$AC",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ STATEOFFIC: Factor w/ 250 levels "","ALABAMA, Central",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ ZONENAMES : Factor w/ 25112 levels "","                                                                                                                               "| __truncated__,..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : Factor w/ 436781 levels "","-2 at Deer Park\n",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...
summary(data)
##     STATE__                  BGN_DATE             BGN_TIME        TIME_ZONE          COUNTY     
##  Min.   : 1.0   5/25/2011 0:00:00:  1202   12:00:00 AM: 10163   CST    :547493   Min.   :  0.0  
##  1st Qu.:19.0   4/27/2011 0:00:00:  1193   06:00:00 PM:  7350   EST    :245558   1st Qu.: 31.0  
##  Median :30.0   6/9/2011 0:00:00 :  1030   04:00:00 PM:  7261   MST    : 68390   Median : 75.0  
##  Mean   :31.2   5/30/2004 0:00:00:  1016   05:00:00 PM:  6891   PST    : 28302   Mean   :100.6  
##  3rd Qu.:45.0   4/4/2011 0:00:00 :  1009   12:00:00 PM:  6703   AST    :  6360   3rd Qu.:131.0  
##  Max.   :95.0   4/2/2006 0:00:00 :   981   03:00:00 PM:  6700   HST    :  2563   Max.   :873.0  
##                 (Other)          :895866   (Other)    :857229   (Other):  3631                  
##       COUNTYNAME         STATE                      EVTYPE         BGN_RANGE       
##  JEFFERSON :  7840   TX     : 83728   HAIL             :288661   Min.   :   0.000  
##  WASHINGTON:  7603   KS     : 53440   TSTM WIND        :219940   1st Qu.:   0.000  
##  JACKSON   :  6660   OK     : 46802   THUNDERSTORM WIND: 82563   Median :   0.000  
##  FRANKLIN  :  6256   MO     : 35648   TORNADO          : 60652   Mean   :   1.484  
##  LINCOLN   :  5937   IA     : 31069   FLASH FLOOD      : 54277   3rd Qu.:   1.000  
##  MADISON   :  5632   NE     : 30271   FLOOD            : 25326   Max.   :3749.000  
##  (Other)   :862369   (Other):621339   (Other)          :170878                     
##     BGN_AZI               BGN_LOCATI                  END_DATE             END_TIME     
##         :547332                :287743                    :243411              :238978  
##  N      : 86752   COUNTYWIDE   : 19680   4/27/2011 0:00:00:  1214   06:00:00 PM:  9802  
##  W      : 38446   Countywide   :   993   5/25/2011 0:00:00:  1196   05:00:00 PM:  8314  
##  S      : 37558   SPRINGFIELD  :   843   6/9/2011 0:00:00 :  1021   04:00:00 PM:  8104  
##  E      : 33178   SOUTH PORTION:   810   4/4/2011 0:00:00 :  1007   12:00:00 PM:  7483  
##  NW     : 24041   NORTH PORTION:   784   5/30/2004 0:00:00:   998   11:59:00 PM:  7184  
##  (Other):134990   (Other)      :591444   (Other)          :653450   (Other)    :622432  
##    COUNTY_END COUNTYENDN       END_RANGE           END_AZI                 END_LOCATI    
##  Min.   :0    Mode:logical   Min.   :  0.0000          :724837                  :499225  
##  1st Qu.:0    NA's:902297    1st Qu.:  0.0000   N      : 28082   COUNTYWIDE     : 19731  
##  Median :0                   Median :  0.0000   S      : 22510   SOUTH PORTION  :   833  
##  Mean   :0                   Mean   :  0.9862   W      : 20119   NORTH PORTION  :   780  
##  3rd Qu.:0                   3rd Qu.:  0.0000   E      : 20047   CENTRAL PORTION:   617  
##  Max.   :0                   Max.   :925.0000   NE     : 14606   SPRINGFIELD    :   575  
##                                                 (Other): 72096   (Other)        :380536  
##      LENGTH              WIDTH                F               MAG            FATALITIES      
##  Min.   :   0.0000   Min.   :   0.000   Min.   :0.0      Min.   :    0.0   Min.   :  0.0000  
##  1st Qu.:   0.0000   1st Qu.:   0.000   1st Qu.:0.0      1st Qu.:    0.0   1st Qu.:  0.0000  
##  Median :   0.0000   Median :   0.000   Median :1.0      Median :   50.0   Median :  0.0000  
##  Mean   :   0.2301   Mean   :   7.503   Mean   :0.9      Mean   :   46.9   Mean   :  0.0168  
##  3rd Qu.:   0.0000   3rd Qu.:   0.000   3rd Qu.:1.0      3rd Qu.:   75.0   3rd Qu.:  0.0000  
##  Max.   :2315.0000   Max.   :4400.000   Max.   :5.0      Max.   :22000.0   Max.   :583.0000  
##                                         NA's   :843563                                       
##     INJURIES            PROPDMG          PROPDMGEXP        CROPDMG          CROPDMGEXP    
##  Min.   :   0.0000   Min.   :   0.00          :465934   Min.   :  0.000          :618413  
##  1st Qu.:   0.0000   1st Qu.:   0.00   K      :424665   1st Qu.:  0.000   K      :281832  
##  Median :   0.0000   Median :   0.00   M      : 11330   Median :  0.000   M      :  1994  
##  Mean   :   0.1557   Mean   :  12.06   0      :   216   Mean   :  1.527   k      :    21  
##  3rd Qu.:   0.0000   3rd Qu.:   0.50   B      :    40   3rd Qu.:  0.000   0      :    19  
##  Max.   :1700.0000   Max.   :5000.00   5      :    28   Max.   :990.000   B      :     9  
##                                        (Other):    84                     (Other):     9  
##       WFO                                       STATEOFFIC    
##         :142069                                      :248769  
##  OUN    : 17393   TEXAS, North                       : 12193  
##  JAN    : 13889   ARKANSAS, Central and North Central: 11738  
##  LWX    : 13174   IOWA, Central                      : 11345  
##  PHI    : 12551   KANSAS, Southwest                  : 11212  
##  TSA    : 12483   GEORGIA, North and Central         : 11120  
##  (Other):690738   (Other)                            :595920  
##                                                                                                                                                                                                     ZONENAMES     
##                                                                                                                                                                                                          :594029  
##                                                                                                                                                                                                          :205988  
##  GREATER RENO / CARSON CITY / M - GREATER RENO / CARSON CITY / M                                                                                                                                         :   639  
##  GREATER LAKE TAHOE AREA - GREATER LAKE TAHOE AREA                                                                                                                                                       :   592  
##  JEFFERSON - JEFFERSON                                                                                                                                                                                   :   303  
##  MADISON - MADISON                                                                                                                                                                                       :   302  
##  (Other)                                                                                                                                                                                                 :100444  
##     LATITUDE      LONGITUDE        LATITUDE_E     LONGITUDE_    
##  Min.   :   0   Min.   :-14451   Min.   :   0   Min.   :-14455  
##  1st Qu.:2802   1st Qu.:  7247   1st Qu.:   0   1st Qu.:     0  
##  Median :3540   Median :  8707   Median :   0   Median :     0  
##  Mean   :2875   Mean   :  6940   Mean   :1452   Mean   :  3509  
##  3rd Qu.:4019   3rd Qu.:  9605   3rd Qu.:3549   3rd Qu.:  8735  
##  Max.   :9706   Max.   : 17124   Max.   :9706   Max.   :106220  
##  NA's   :47                      NA's   :40                     
##                                            REMARKS           REFNUM      
##                                                :287433   Min.   :     1  
##                                                : 24013   1st Qu.:225575  
##  Trees down.\n                                 :  1110   Median :451149  
##  Several trees were blown down.\n              :   568   Mean   :451149  
##  Trees were downed.\n                          :   446   3rd Qu.:676723  
##  Large trees and power lines were blown down.\n:   432   Max.   :902297  
##  (Other)                                       :588295
head(data,10)
##    STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE  EVTYPE BGN_RANGE BGN_AZI
## 1        1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL TORNADO         0        
## 2        1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL TORNADO         0        
## 3        1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL TORNADO         0        
## 4        1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL TORNADO         0        
## 5        1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL TORNADO         0        
## 6        1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL TORNADO         0        
## 7        1 11/16/1951 0:00:00     0100       CST      9     BLOUNT    AL TORNADO         0        
## 8        1  1/22/1952 0:00:00     0900       CST    123 TALLAPOOSA    AL TORNADO         0        
## 9        1  2/13/1952 0:00:00     2000       CST    125 TUSCALOOSA    AL TORNADO         0        
## 10       1  2/13/1952 0:00:00     2000       CST     57    FAYETTE    AL TORNADO         0        
##    BGN_LOCATI END_DATE END_TIME COUNTY_END COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F
## 1                                        0         NA         0                      14.0   100 3
## 2                                        0         NA         0                       2.0   150 2
## 3                                        0         NA         0                       0.1   123 2
## 4                                        0         NA         0                       0.0   100 2
## 5                                        0         NA         0                       0.0   150 2
## 6                                        0         NA         0                       1.5   177 2
## 7                                        0         NA         0                       1.5    33 2
## 8                                        0         NA         0                       0.0    33 1
## 9                                        0         NA         0                       3.3   100 3
## 10                                       0         NA         0                       2.3   100 3
##    MAG FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES LATITUDE
## 1    0          0       15    25.0          K       0                                         3040
## 2    0          0        0     2.5          K       0                                         3042
## 3    0          0        2    25.0          K       0                                         3340
## 4    0          0        2     2.5          K       0                                         3458
## 5    0          0        2     2.5          K       0                                         3412
## 6    0          0        6     2.5          K       0                                         3450
## 7    0          0        1     2.5          K       0                                         3405
## 8    0          0        0     2.5          K       0                                         3255
## 9    0          1       14    25.0          K       0                                         3334
## 10   0          0        0    25.0          K       0                                         3336
##    LONGITUDE LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1       8812       3051       8806              1
## 2       8755          0          0              2
## 3       8742          0          0              3
## 4       8626          0          0              4
## 5       8642          0          0              5
## 6       8748          0          0              6
## 7       8631          0          0              7
## 8       8558          0          0              8
## 9       8740       3336       8738              9
## 10      8738       3337       8737             10

The first question of the task is

Across the United States, which types of events are most harmful with respect to population health?

We see that raw data is huge and hard to process. It would be good to find a way to reduce it. Our first question is about types of events that are most harmful for population health (i.e. INJURIES and FATALITIES variables). Looking at the summary we see that even the 3rd quartile of both variables equals zero. Therefore, we can get rid of a big share of observations while working on this question.

Let’s confirm these thoughts by some exploration.

zero.harm.obs<-nrow(data[data$FATALITIES==0 & data$INJURIES==0,])
zero.harm.obs
## [1] 880368

Now we see that 98% of all observations are not necessary for our first task. Therefore, we make another data.frame:

data2<-data[data$FATALITIES!=0 | data$INJURIES!=0,]
dim(data2)
## [1] 21929    37

Let’s do a little check: subtract the whole number of injuries in the raw data from the number of injuries in the new dataset data2

sum(data$INJURIES)-sum(data2$INJURIES)
## [1] 0

We see that we didn’t lose any valuable information.

Now we reduce our data2 even more, moving along with the first question. In this code chunk, we boil down data to two aggregated variables. After that we sort it by FATALITIES and then by INJURIES variables both descending.

harm_sum<-aggregate(cbind(data2$FATALITIES,data2$INJURIES), by=list(data2$EVTYPE), FUN=sum)
harm_sum<-droplevels(harm_sum)
harm_sum<-harm_sum[order(-harm_sum[,2], -harm_sum[,3]),]
names(harm_sum) <- c("EVTYPE","FATALITIES","INJURIES")
str(harm_sum)
## 'data.frame':    220 obs. of  3 variables:
##  $ EVTYPE    : Factor w/ 220 levels "AVALANCE","AVALANCHE",..: 184 32 42 69 123 191 47 147 93 2 ...
##  $ FATALITIES: num  5633 1903 978 937 816 ...
##  $ INJURIES  : num  91346 6525 1777 2100 5230 ...

Note that we dropped unused levels of EVTYPE factor variable by droplevels() function.
And there is another important note.
We assume that fatality is a much more severe consequence than injury. That’s why we sort data first by FATALITIES variable and then by INJURIES.

head(harm_sum,20)
##                      EVTYPE FATALITIES INJURIES
## 184                 TORNADO       5633    91346
## 32           EXCESSIVE HEAT       1903     6525
## 42              FLASH FLOOD        978     1777
## 69                     HEAT        937     2100
## 123               LIGHTNING        816     5230
## 191               TSTM WIND        504     6957
## 47                    FLOOD        470     6789
## 147             RIP CURRENT        368      232
## 93                HIGH WIND        248     1137
## 2                 AVALANCHE        224      170
## 214            WINTER STORM        206     1321
## 148            RIP CURRENTS        204      297
## 71                HEAT WAVE        172      309
## 37             EXTREME COLD        160      231
## 173       THUNDERSTORM WIND        133     1488
## 77               HEAVY SNOW        127     1021
## 38  EXTREME COLD/WIND CHILL        125       24
## 167             STRONG WIND        103      280
## 4                  BLIZZARD        101      805
## 89                HIGH SURF        101      152

Now we must recall that the raw data was gathered by hands and may contain typos and other uncertainties. At the beginning, we had 985 levels of EVTYPE variable. After our preparation, we have 220 levels. But there are only 48 Storm Data Event types in NATIONAL WEATHER SERVICE INSTRUCTION 10-1605

First of all we spotted that in one of the top records we have the TSTM abbreviation. Obviously, its meaning is THUNDERSTORM. Since this record stands high in the list, it’s very important and we can fix it “by-hand”

harm_sum$EVTYPE <-as.character(harm_sum$EVTYPE)
harm_sum$EVTYPE <- gsub("TSTM","THUNDERSTORM",harm_sum$EVTYPE)

Now we just copy the table with the event types from the Instruction and paste it into a standalone text file event.types.txt and then read it into R dataset. We use TAB as separator to avoid any separation, e.g. we need only one column.

event.types<-read.table("../RepData_PeerAssessment2/event.types.txt",sep="\t")
head(event.types)
##                        V1
## 1 Astronomical Low Tide Z
## 2             Avalanche Z
## 3              Blizzard Z
## 4         Coastal Flood Z
## 5       Cold/Wind Chill Z
## 6           Debris Flow C

We need to bring the event types to upper case to compare them with our data. Also we see there is a single letter (designator) at the end of each row. Let’s get rid of it.

event.types$V1 <- toupper(as.character(event.types$V1))
event.types$V1 <- substr(event.types$V1,1,nchar(event.types$V1)-2)

Our strategy of handling the typos will be as follows. We will use two functions. The first is LenLongStr() which counts the length of Longest Common Substring.

LenLongStr <- function(String1, String2) {
 
  s1 <- unlist(strsplit(String1,split=""))
  s2 <- unlist(strsplit(String2,split=""))
 
  num <- matrix(0,nchar(String1),nchar(String2) )                
  maxlen <- 0
 
  for (i in 1:nchar(String1)) {
 
    for (j in 1:nchar(String2)) {
 
        if (s1[i] == s2[j]) {
            if ((i==1) || (j==1)) { 
                num[i,j] <- 1
            } 
            else {
                num[i,j] <- 1+num[i-1,j-1]
            }
            if (num[i,j] > maxlen) {
                maxlen <- num[i,j]
            }
        }
    }
  }
 
  maxlen            
}
## this code is taken from http://snipplr.com/view/52222/longest-common-substring/

The second is adist() which counts the distance (generalized Levenshtein (edit) distance) between two strings.

As a measure of similarity of two strings we take the ratio of the length of their longest substring to the distance between these strings:

Similarity (String1, String2) = LenLongStr (String1,String2) / adist (String1,String2)

The logic is straight. The longer the common substring, the more similar are the strings. The more distance between the strings, the less similar are the strings.

Now we find the closest Event type from the Instruction for every record in the events column of our data (harm_sum dataset)

for(i in 1:nrow(harm_sum))
{
        long.com.str<-matrix(ncol=2)
        
        long.com.str[1,1] <- LenLongStr(event.types[1,1], harm_sum[i,1])
        long.com.str[1,2] <- as.numeric(adist(event.types[1,1], harm_sum[i,1]))
        
        for(j in 2:nrow(event.types))
        {
             long.com.str <- rbind(long.com.str, c(LenLongStr(event.types[j,1], harm_sum[i,1]),
                             as.numeric(adist(event.types[j,1], harm_sum[i,1]))))               
        }
        harm_sum[i,4] <- event.types[which.max(long.com.str[,1]/long.com.str[,2]),1]

}

Here is the outcome. In the first column is what we had before. In the fourth column are the results of our matching. As we can see, the outcome is pretty good. Almost all of the top records have found their appropriate options from the instruction list.

head(harm_sum,50)
##                                 EVTYPE FATALITIES INJURIES                       V4
## 184                            TORNADO       5633    91346                  TORNADO
## 32                      EXCESSIVE HEAT       1903     6525           EXCESSIVE HEAT
## 42                         FLASH FLOOD        978     1777              FLASH FLOOD
## 69                                HEAT        937     2100                     HEAT
## 123                          LIGHTNING        816     5230                LIGHTNING
## 191                  THUNDERSTORM WIND        504     6957        THUNDERSTORM WIND
## 47                               FLOOD        470     6789                    FLOOD
## 147                        RIP CURRENT        368      232              RIP CURRENT
## 93                           HIGH WIND        248     1137                HIGH WIND
## 2                            AVALANCHE        224      170                AVALANCHE
## 214                       WINTER STORM        206     1321             WINTER STORM
## 148                       RIP CURRENTS        204      297              RIP CURRENT
## 71                           HEAT WAVE        172      309                     HEAT
## 37                        EXTREME COLD        160      231  EXTREME COLD/WIND CHILL
## 173                  THUNDERSTORM WIND        133     1488        THUNDERSTORM WIND
## 77                          HEAVY SNOW        127     1021               HEAVY SNOW
## 38             EXTREME COLD/WIND CHILL        125       24  EXTREME COLD/WIND CHILL
## 167                        STRONG WIND        103      280              STRONG WIND
## 4                             BLIZZARD        101      805                 BLIZZARD
## 89                           HIGH SURF        101      152                HIGH SURF
## 74                          HEAVY RAIN         98      251               HEAVY RAIN
## 39                        EXTREME HEAT         96      155           EXCESSIVE HEAT
## 21                     COLD/WIND CHILL         95       12          COLD/WIND CHILL
## 117                          ICE STORM         89     1975                ICE STORM
## 210                           WILDFIRE         75      911                 WILDFIRE
## 109                  HURRICANE/TYPHOON         64     1275      HURRICANE (TYPHOON)
## 176                 THUNDERSTORM WINDS         64      908        THUNDERSTORM WIND
## 52                                 FOG         62      734                DENSE FOG
## 101                          HURRICANE         61       46      HURRICANE (TYPHOON)
## 189                     TROPICAL STORM         58      340           TROPICAL STORM
## 85                HEAVY SURF/HIGH SURF         42       48                HIGH SURF
## 120                          LANDSLIDE         38       52                AVALANCHE
## 98                          HIGH WINDS         35      302                HIGH WIND
## 16                                COLD         35       48          COLD/WIND CHILL
## 217                     WINTER WEATHER         33      398           WINTER WEATHER
## 196                            TSUNAMI         33      129                  TSUNAMI
## 200          UNSEASONABLY WARM AND DRY         29        0        THUNDERSTORM WIND
## 202               URBAN/SML STREAM FLD         28       79       MARINE STRONG WIND
## 219                 WINTER WEATHER/MIX         28       72           WINTER WEATHER
## 187 TORNADOES, THUNDERSTORM WIND, HAIL         25        0 MARINE THUNDERSTORM WIND
## 211                               WIND         23       86                HIGH WIND
## 31                          DUST STORM         22      440               DUST STORM
## 44                      FLASH FLOODING         19        8              FLASH FLOOD
## 23                           DENSE FOG         18      342                DENSE FOG
## 49                   FLOOD/FLASH FLOOD         17       15              FLASH FLOOD
## 40                   EXTREME WINDCHILL         17        5  EXTREME COLD/WIND CHILL
## 146              RECORD/EXCESSIVE HEAT         17        0           EXCESSIVE HEAT
## 67                                HAIL         15     1361                     HAIL
## 131                 MARINE STRONG WIND         14       22       MARINE STRONG WIND
## 17                       COLD AND SNOW         14        0               HEAVY SNOW
str(harm_sum)
## 'data.frame':    220 obs. of  4 variables:
##  $ EVTYPE    : chr  "TORNADO" "EXCESSIVE HEAT" "FLASH FLOOD" "HEAT" ...
##  $ FATALITIES: num  5633 1903 978 937 816 ...
##  $ INJURIES  : num  91346 6525 1777 2100 5230 ...
##  $ V4        : chr  "TORNADO" "EXCESSIVE HEAT" "FLASH FLOOD" "HEAT" ...

Let’s finish the first question and make the resulting dataset.

harm_result<-aggregate(cbind(harm_sum$FATALITIES,harm_sum$INJURIES), by=list(harm_sum$V4), FUN=sum)
harm_result<-harm_result[order(-harm_result[,2], -harm_result[,3]),]
names(harm_result)<-c("EVTYPE","FATALITIES","INJURIES")

head(harm_result,20)
##                     EVTYPE FATALITIES INJURIES
## 38                 TORNADO       5633    91364
## 12          EXCESSIVE HEAT       2022     6753
## 20                    HEAT       1118     2415
## 14             FLASH FLOOD       1041     1804
## 27               LIGHTNING        819     5233
## 37       THUNDERSTORM WIND        741     9510
## 32             RIP CURRENT        577      529
## 15                   FLOOD        476     6791
## 24               HIGH WIND        327     1675
## 13 EXTREME COLD/WIND CHILL        303      261
## 2                AVALANCHE        271      439
## 44            WINTER STORM        219     1430
## 22              HEAVY SNOW        169     1154
## 23               HIGH SURF        156      215
## 5          COLD/WIND CHILL        137       67
## 25     HURRICANE (TYPHOON)        133     1331
## 36             STRONG WIND        111      302
## 21              HEAVY RAIN        107      292
## 3                 BLIZZARD        101      805
## 26               ICE STORM         93     2015

Looking at this table, it’s clear that tornado is the most harmful event by a wide margin. Let’s see how we could better get ready for it. It would be good to know where to focus medical personnel and lifeguard forces to save people’s lives. For that purpose, we will draw the map of the U.S.Counties with different levels of tornado casualties.

# load ggplot2, maps and color packages
library(ggplot2) 
library(maps)
library(mapproj)
library(RColorBrewer)

# Load base of county fips
data(county.fips)

Now we will perform some data explorations to get necessary data. First, we subset data by the event type ‘tornado’ and aggregate it by counties.

tornado.county<-data2[data2$EVTYPE=="TORNADO",]
tornado.county<-aggregate(cbind(tornado.county$FATALITIES,tornado.county$INJURIES), by=list(tornado.county$COUNTY,tornado.county$COUNTYNAME), FUN=sum)

names(tornado.county)<-c("COUNTY","COUNTYNAME","FATALITIES","INJURIES")

tornado.county$COUNTY<-tornado.county$COUNTY+1000

Now we build quantiles by fatalities variable.

quant<-quantile(tornado.county[tornado.county$FATALITIES!=0, 3], probs=seq(0,1,1/8),names=F)
quant<-c(-1,quant)
quant<-unique(quant)

It turns out that we have 6 quantiles. Now we distribute counties by “buckets” and assign colors.

tornado.county$colorBuckets <- as.numeric(cut(tornado.county$FATALITIES, quant))
table(tornado.county$colorBuckets)
## 
##    1    2    3    4    5    6 
## 1517  201   89  154   96  112
cnty.fips <- county.fips$fips[match(map("county", plot=FALSE)$names, county.fips$polyname)]

colorsmatched <- tornado.county$colorBuckets[match(cnty.fips, tornado.county$COUNTY)]
colorsmatched[is.na(colorsmatched)]<-0

colors <- brewer.pal(length(quant)-1, "OrRd")

And here is the MAP

map("county", col = colors[colorsmatched], fill = TRUE, resolution = 0, lty = 0, projection = "polyconic")

leg.txt <- array()
quant[1]<-0
quant[length(quant)]<-quant[length(quant)]+1
for(i in 1:(length(quant)-1))
        {
                if(quant[i+1]==quant[i]+1)
                        leg.txt[i]<-as.character(quant[i])
                else
                        leg.txt[i]<-paste0(quant[i],"-",quant[i+1]-1)
        }

title("Tornado fatalities by U.S. Counties, 1950-2011")
legend("bottomright", leg.txt, horiz = F, fill = colors, cex = 0.8)

Thus, we see the places where they should focus forces and services that save people’s lives during a tornado. Here is the top list of the places to pay attention.

head(tornado.county[order(-tornado.county$FATALITIES, -tornado.county$INJURIES), -c(1,5)], 10)
##      COUNTYNAME FATALITIES INJURIES
## 941      JASPER        163     1258
## 682     GENESEE        120      911
## 1264   MCLENNAN        114      610
## 953   JEFFERSON        107     1529
## 2142  WORCESTER         92     1253
## 423      COWLEY         77      290
## 576     ELKHART         62      510
## 1944 TUSCALOOSA         59     1057
## 1185    MADISON         59     1011
## 2092      WHITE         59      429

Now let’s turn to the second question of the task.

Across the United States, which types of events have the greatest economic consequences?

There are two variables in data with estimate of economic consequences, PROPDMG and CROPDMG. Similarly to the first part, we’ll try to simplify the data.

zero.cons.obs<-nrow(data[data$PROPDMG==0 & data$CROPDMG==0,])
zero.cons.obs
## [1] 657266

Again, more than a half of the observations can be omitted while working on the second question.

data3<-data[data$PROPDMG!=0 | data$CROPDMG!=0,]
data3$EVTYPE<-droplevels(data3$EVTYPE)
dim(data3)
## [1] 245031     37

Additionally, we have two variables PROPDMGEXP and CROPDMGEXP. Due to NATIONAL WEATHER SERVICE INSTRUCTION 10-1605 we know that “Estimates should be rounded to three significant digits, followed by an alphabetical character signifying the magnitude of the number, i.e., 1.55B for $1,550,000,000. Alphabetical characters used to signify magnitude include “K” for thousands, “M” for millions, and “B” for billions.”

Let’s look closer at this two variables.

table(data3$PROPDMGEXP)
## 
##             -      ?      +      0      1      2      3      4      5      6      7      8      B 
##   4357      1      0      5    209      0      1      1      4     18      3      2      0     40 
##      h      H      K      m      M 
##      1      6 229057      7  11319
table(data3$CROPDMGEXP)
## 
##             ?      0      2      B      k      K      m      M 
## 145037      6     17      0      7     21  97960      1   1982

As we can see, there are some entries that are different from the documented options (“K” for thousands, “M” for millions, and “B” for billions). We can make a guess about them. For example, it can be a degree, e.g. 5 means “..multiply by 10^5”. But since there are only a few such unknown values, we prefer to omit them to avoid spoiling the data.

Now let’s get actual numbers in PROPDMG and CROPDMG variables.

data3$PROPDMGEXP<-toupper(data3$PROPDMGEXP)
data3$CROPDMGEXP<-toupper(data3$CROPDMGEXP)

data3[data3$PROPDMGEXP=="K",]$PROPDMG <- data3[data3$PROPDMGEXP=="K",]$PROPDMG*1000
data3[data3$PROPDMGEXP=="M",]$PROPDMG <- data3[data3$PROPDMGEXP=="M",]$PROPDMG*1000000
data3[data3$PROPDMGEXP=="B",]$PROPDMG <- data3[data3$PROPDMGEXP=="B",]$PROPDMG*1000000000

data3[data3$CROPDMGEXP=="K",]$CROPDMG <- data3[data3$CROPDMGEXP=="K",]$CROPDMG*1000
data3[data3$CROPDMGEXP=="M",]$CROPDMG <- data3[data3$CROPDMGEXP=="M",]$CROPDMG*1000000
data3[data3$CROPDMGEXP=="B",]$CROPDMG <- data3[data3$CROPDMGEXP=="B",]$CROPDMG*1000000000

Further analysis will be very similar to the previous one with one exception. This time, we will estimate the damage by summarizing two variables, PROPDMG and CROPDMG, not only one as we did before.

dmg_sum<-aggregate(cbind(data3$PROPDMG,data3$CROPDMG), by=list(data3$EVTYPE), FUN=sum)
dmg_sum<-dmg_sum[order(-(dmg_sum[,2]+dmg_sum[,3])),]
names(dmg_sum)<-c("EVTYPE","PROPDMG","CROPDMG")
str(dmg_sum)
## 'data.frame':    431 obs. of  3 variables:
##  $ EVTYPE : Factor w/ 431 levels "   HIGH SURF ADVISORY",..: 72 197 354 299 116 59 39 189 262 206 ...
##  $ PROPDMG: num  1.45e+11 6.93e+10 5.69e+10 4.33e+10 1.57e+10 ...
##  $ CROPDMG: num  5.66e+09 2.61e+09 4.15e+08 5.00e+03 3.03e+09 ...
head(dmg_sum,10)
##                EVTYPE      PROPDMG     CROPDMG
## 72              FLOOD 144657709807  5661968450
## 197 HURRICANE/TYPHOON  69305840000  2607872800
## 354           TORNADO  56937160779   414953270
## 299       STORM SURGE  43323536000        5000
## 116              HAIL  15732267048  3025954473
## 59        FLASH FLOOD  16140812067  1421317100
## 39            DROUGHT   1046106000 13972566000
## 189         HURRICANE  11868319010  2741910000
## 262       RIVER FLOOD   5118945500  5029459000
## 206         ICE STORM   3944927860  5022113500

Thus, we have the desired dataset, but again, with non-official event types. We already have a solution for that. So we will fix it in the same way as earlier.

Again, we start from correcting the abbreviation TSTM to THUNDERSTORM. Also let’s trim the spaces on the edges of the strings.

dmg_sum$EVTYPE <-as.character(dmg_sum$EVTYPE)
dmg_sum$EVTYPE <- gsub("TSTM","THUNDERSTORM",dmg_sum$EVTYPE)

trim <- function (x) gsub("^\\s+|\\s+$", "", x)

dmg_sum$EVTYPE <- trim(dmg_sum$EVTYPE)

We find the closest Event type from the Instruction for every record in the events column of our data (this time dmg_sum dataset)

for(i in 1:nrow(dmg_sum))
{
        long.com.str<-matrix(ncol=2)
        
        long.com.str[1,1] <- LenLongStr(event.types[1,1], dmg_sum[i,1])
        long.com.str[1,2] <- as.numeric(adist(event.types[1,1], dmg_sum[i,1]))
        
        for(j in 2:nrow(event.types))
        {
             long.com.str <- rbind(long.com.str, c(LenLongStr(event.types[j,1], dmg_sum[i,1]),
                             as.numeric(adist(event.types[j,1], dmg_sum[i,1]))))               
        }
        dmg_sum[i,4] <- event.types[which.max(long.com.str[,1]/long.com.str[,2]),1]

}

Here is the outcome. In the first column is what we had before. In the fourth column are the results of our matching. And again, the outcome is pretty good. Almost all of the top records have found their appropriate options from the instruction list.

head(dmg_sum,50)
##                                 EVTYPE      PROPDMG     CROPDMG                       V4
## 72                               FLOOD 144657709807  5661968450                    FLOOD
## 197                  HURRICANE/TYPHOON  69305840000  2607872800      HURRICANE (TYPHOON)
## 354                            TORNADO  56937160779   414953270                  TORNADO
## 299                        STORM SURGE  43323536000        5000         STORM SURGE/TIDE
## 116                               HAIL  15732267048  3025954473                     HAIL
## 59                         FLASH FLOOD  16140812067  1421317100              FLASH FLOOD
## 39                             DROUGHT   1046106000 13972566000                  DROUGHT
## 189                          HURRICANE  11868319010  2741910000      HURRICANE (TYPHOON)
## 262                        RIVER FLOOD   5118945500  5029459000              FLASH FLOOD
## 206                          ICE STORM   3944927860  5022113500                ICE STORM
## 363                     TROPICAL STORM   7703890550   678346000           TROPICAL STORM
## 424                       WINTER STORM   6688497251    26944000             WINTER STORM
## 174                          HIGH WIND   5270046295   638571300                HIGH WIND
## 414                           WILDFIRE   4765114000   295472800                 WILDFIRE
## 369                  THUNDERSTORM WIND   4484928495   554007350        THUNDERSTORM WIND
## 300                   STORM SURGE/TIDE   4641188000      850000         STORM SURGE/TIDE
## 313                  THUNDERSTORM WIND   3483121284   414843050        THUNDERSTORM WIND
## 195                     HURRICANE OPAL   3172846000    19000000      HURRICANE (TYPHOON)
## 412                   WILD/FOREST FIRE   3001829500   106796830                 WILDFIRE
## 142          HEAVY RAIN/SEVERE WEATHER   2500000000           0               HEAVY RAIN
## 328                 THUNDERSTORM WINDS   1735959023   190654788        THUNDERSTORM WIND
## 360 TORNADOES, THUNDERSTORM WIND, HAIL   1600000000     2500000 MARINE THUNDERSTORM WIND
## 138                         HEAVY RAIN    694248090   733399800               HEAVY RAIN
## 54                        EXTREME COLD     67737400  1292973000  EXTREME COLD/WIND CHILL
## 269                SEVERE THUNDERSTORM   1205360000      200000 MARINE THUNDERSTORM WIND
## 98                        FROST/FREEZE      9480000  1094086000             FROST/FREEZE
## 148                         HEAVY SNOW    932589142   134653100               HEAVY SNOW
## 225                          LIGHTNING    928659447    12092090                LIGHTNING
## 12                            BLIZZARD    659213950   112060000                 BLIZZARD
## 182                         HIGH WINDS    608323748    40720600                HIGH WIND
## 411                         WILD FIRES    624100000           0                 WILDFIRE
## 388                            TYPHOON    600230000      825000      HURRICANE (TYPHOON)
## 49                      EXCESSIVE HEAT      7753700   492402000           EXCESSIVE HEAT
## 87                              FREEZE       205000   446225000             FROST/FREEZE
## 132                               HEAT      1797000   401461500                     HEAT
## 192                     HURRICANE ERIN    258100000   136010000      HURRICANE (TYPHOON)
## 214                          LANDSLIDE    324596000    20017000                AVALANCHE
## 68                      FLASH FLOODING    307763604    15116050              FLASH FLOOD
## 66                   FLASH FLOOD/FLOOD    272450006      555000              FLASH FLOOD
## 35                     DAMAGING FREEZE      8000000   262100000             FROST/FREEZE
## 76                   FLOOD/FLASH FLOOD    174039009    95034000              FLASH FLOOD
## 130                          HAILSTORM    241000000           0                ICE STORM
## 302                        STRONG WIND    175241450    64953500              STRONG WIND
## 21                       COASTAL FLOOD    237665560           0            COASTAL FLOOD
## 386                            TSUNAMI    144062000       20000                  TSUNAMI
## 51                   EXCESSIVE WETNESS            0   142000000           EXCESSIVE HEAT
## 263                     River Flooding    106155000    28020000                DENSE FOG
## 23                    COASTAL FLOODING    126640500       56000            COASTAL FLOOD
## 186                    HIGH WINDS/COLD    110500000     7000000                HIGH WIND
## 81                            FLOODING    108255006     8855500                    FLOOD

Let’s finish the second question and make the resulting dataset.

dmg_result<-aggregate(cbind(dmg_sum$PROPDMG,dmg_sum$CROPDMG), by=list(dmg_sum$V4), FUN=sum)
dmg_result<-dmg_result[order(-(dmg_result[,2]+dmg_result[,3])),]
names(dmg_result)<-c("EVTYPE","PROPDMG","CROPDMG")

dmg_result$SUM.DMG <- dmg_result$PROPDMG + dmg_result$CROPDMG

# For neat comma-separated numbers..
library(scales)
comma(head(dmg_result,20))
##                      EVTYPE         PROPDMG        CROPDMG         SUM.DMG
## 15                    FLOOD 144,773,014,813  5,794,173,950 150,567,188,763
## 25      HURRICANE (TYPHOON)  85,256,411,010  5,506,117,800  90,762,528,810
## 40                  TORNADO  56,942,036,579    414,963,070  57,356,999,649
## 37         STORM SURGE/TIDE  47,965,074,000        855,000  47,965,929,000
## 14              FLASH FLOOD  22,164,957,807  6,562,951,150  28,727,908,957
## 19                     HAIL  15,738,269,598  3,026,794,473  18,765,064,071
## 9                   DROUGHT   1,053,188,600 13,972,631,000  15,025,819,600
## 39        THUNDERSTORM WIND   9,762,981,078  1,225,463,988  10,988,445,066
## 26                ICE STORM   4,187,799,360  5,022,114,300   9,209,913,660
## 46                 WILDFIRE   8,496,643,500    403,281,630   8,899,925,130
## 42           TROPICAL STORM   7,714,395,550    694,896,000   8,409,291,550
## 47             WINTER STORM   6,750,057,251     32,444,000   6,782,501,251
## 24                HIGH WIND   6,002,941,193    687,357,050   6,690,298,243
## 21               HEAVY RAIN   3,237,229,842    804,265,800   4,041,495,642
## 33 MARINE THUNDERSTORM WIND   2,813,902,400     48,750,000   2,862,652,400
## 17             FROST/FREEZE      19,200,000  1,910,431,000   1,929,631,000
## 13  EXTREME COLD/WIND CHILL      77,190,400  1,310,023,000   1,387,213,400
## 22               HEAVY SNOW     977,125,252    134,683,100   1,111,808,352
## 29                LIGHTNING     936,187,447     12,092,090     948,279,537
## 3                  BLIZZARD     659,413,950    112,060,000     771,473,950

We see that in the economic sense, the most severe event is FLOOD. Let’s draw the map again to understand which places are in the risk zone.

Now we will perform some data explorations to get necessary data. First, we subset data by the event type flood and aggregate it by counties.

flood.county<-data3[grepl(".*FLOOD.*",data3$EVTYPE),]
flood.county<-aggregate(cbind(flood.county$PROPDMG,flood.county$CROPDMG), by=list(flood.county$COUNTY,flood.county$COUNTYNAME), FUN=sum)

names(flood.county)<-c("COUNTY","COUNTYNAME","PROPDMG","CROPDMG")
flood.county$COUNTYNAME <- droplevels(flood.county$COUNTYNAME)

flood.county$SUM.DMG <- flood.county$PROPDMG + flood.county$CROPDMG
flood.county$COUNTY<-flood.county$COUNTY+1000

Now we build quantiles by SUM.DMG variable.

quant<-quantile(flood.county[, 5], probs=seq(0,1,1/7),names=F)
quant[1]<-0

# Already not necessary, but..
quant<-unique(quant)

We have 7 quantiles. Now we distribute counties by “buckets” and assign colors.

flood.county$colorBuckets <- as.numeric(cut(flood.county$SUM.DMG, quant))
table(flood.county$colorBuckets)
## 
##   1   2   3   4   5   6   7 
## 657 631 693 595 644 644 644
cnty.fips <- county.fips$fips[match(map("county", plot=FALSE)$names, county.fips$polyname)]

colorsmatched <- flood.county$colorBuckets[match(cnty.fips, flood.county$COUNTY)]
colorsmatched[is.na(colorsmatched)]<-0

colors <- brewer.pal(length(quant)-1, "GnBu")

And here is the MAP

map("county", col = colors[colorsmatched], fill = TRUE, resolution = 0, lty = 0, projection = "polyconic")

leg.txt <- array()
quant[1]<-0
quant[length(quant)]<-quant[length(quant)]+1
for(i in 1:(length(quant)-1))
        {
                if(quant[i+1]==quant[i]+1)
                        leg.txt[i]<-as.character(comma(quant[i]))
                else
                        leg.txt[i]<-paste(comma(quant[i]),"-",comma(quant[i+1]-1),"USD")
        }

title("Economic consequences of flood by U.S. Counties, 1996-2011")
legend("bottomleft", leg.txt, horiz = F, fill = colors, cex = 0.7, bg="transparent", bty="n")

Thus, we see the places suffering from flood at most. Here is the top list of the places to pay attention in saving the property and crop from flood.

comma(head(flood.county[order(-flood.county$SUM.DMG), -c(1,6)], 10))
##                     COUNTYNAME         PROPDMG       CROPDMG         SUM.DMG
## 2648                      NAPA 115,116,210,000    66,900,000 115,183,110,000
## 9    ADAMS, CALHOUN AND JERSEY   5,000,000,000 5,000,000,000  10,000,000,000
## 2713                    NDZ027   3,004,200,000             0   3,004,200,000
## 3685                    SHELBY   2,081,050,520             0   2,081,050,520
## 773                   DAVIDSON   1,525,390,500         5,000   1,525,395,500
## 1737                 JEFFERSON   1,011,505,000         5,000   1,011,510,000
## 3937                    TUNICA   1,000,331,000             0   1,000,331,000
## 1028              FLZ072 - 074     450,000,000   500,000,000     950,000,000
## 3721                  SOMERSET     853,000,000             0     853,000,000
## 316                     BROOME     813,296,500             0     813,296,500

Results

In this research we studied the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. Our goal was to retrieve helpful information for resistance and better reaction on most severe weather events.

  1. The first result is the top list of the most harmful events with respect to population health.
head(harm_result)
##               EVTYPE FATALITIES INJURIES
## 38           TORNADO       5633    91364
## 12    EXCESSIVE HEAT       2022     6753
## 20              HEAT       1118     2415
## 14       FLASH FLOOD       1041     1804
## 27         LIGHTNING        819     5233
## 37 THUNDERSTORM WIND        741     9510
  1. Thus, we found that tornado is the most dangerous event for people. Here is the list of places with highest level of tornado casualties.
head(tornado.county[order(-tornado.county$FATALITIES, -tornado.county$INJURIES), -c(1,5)])
##      COUNTYNAME FATALITIES INJURIES
## 941      JASPER        163     1258
## 682     GENESEE        120      911
## 1264   MCLENNAN        114      610
## 953   JEFFERSON        107     1529
## 2142  WORCESTER         92     1253
## 423      COWLEY         77      290
  1. In the second part of research, we studied the economic consequences of severe events. This is the top list of the events with respect to total damage (property + crop).
comma(head(dmg_result))
##                 EVTYPE         PROPDMG       CROPDMG         SUM.DMG
## 15               FLOOD 144,773,014,813 5,794,173,950 150,567,188,763
## 25 HURRICANE (TYPHOON)  85,256,411,010 5,506,117,800  90,762,528,810
## 40             TORNADO  56,942,036,579   414,963,070  57,356,999,649
## 37    STORM SURGE/TIDE  47,965,074,000       855,000  47,965,929,000
## 14         FLASH FLOOD  22,164,957,807 6,562,951,150  28,727,908,957
## 19                HAIL  15,738,269,598 3,026,794,473  18,765,064,071
  1. It turned out that flood is the most harmful event for the property and crop. At last, we built the list of counties with the highest level of economic loss from flood.
comma(head(flood.county[order(-flood.county$SUM.DMG), -c(1,6)]))
##                     COUNTYNAME         PROPDMG       CROPDMG         SUM.DMG
## 2648                      NAPA 115,116,210,000    66,900,000 115,183,110,000
## 9    ADAMS, CALHOUN AND JERSEY   5,000,000,000 5,000,000,000  10,000,000,000
## 2713                    NDZ027   3,004,200,000             0   3,004,200,000
## 3685                    SHELBY   2,081,050,520             0   2,081,050,520
## 773                   DAVIDSON   1,525,390,500         5,000   1,525,395,500
## 1737                 JEFFERSON   1,011,505,000         5,000   1,011,510,000

Since there is no need of our recommendations in this analysis, we just wanna wish strong health and great weather for everybody!