In this article, we explore the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, which records characteristics of major weather events in the US. The characteristics include when and where the events occur and estimates of any health damage (i.e., fatalities and injuries) and economic damage (i.e., property and crop damage).
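The research questions we address in this analysis are as follows:
Across the United States, which types of events are most harmful with respect to population health?
Across the United States, which types of events have the greatest economic consequences?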
The structure of this report is as follows: In Section Data Processing we load and pre-process the data. Section Results presents our analysis results.
In this section, we load and process the data. In the first subsection (Data), we load the data. The Variables subsection selects the variables from the data set that are essential for our analysis. The subsequent subsection calculates the real amount of property and crop damages. Finally, in the last subsection, we categorize the damages into two groups (health and economic).
The data for this project comes in the form of a CSV file compressed via the bzip2 algorithm to reduce its size. The following script loads it into a variable named data.
dataURL <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
temp <- tempfile()
download.file(dataURL, temp)
data <- read.csv(temp, stringsAsFactors = FALSE)
unlink(temp)
dim(data)
## [1] 902297 37
As we see, the data has 902297 observations and 37 variables.
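The loading step works on the compressed file directly because R's file connections detect bzip2 compression when reading. As a minimal offline sketch of that behavior, the following writes a toy data frame to a temporary `.bz2` file and reads it back. The data frame `toy` and its values are made up for illustration and are not part of the storm data.

```r
# Toy data standing in for the storm data (hypothetical values).
toy <- data.frame(evtype = c("TORNADO", "HAIL", "FLOOD"),
                  fatalities = c(1, 0, 2),
                  stringsAsFactors = FALSE)

# Write the data frame as a bzip2-compressed CSV file.
path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(path, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() opens the path through a file connection, which detects
# the bzip2 compression on its own; bzfile() makes the same step explicit.
toy_auto <- read.csv(path, stringsAsFactors = FALSE)
toy_explicit <- read.csv(bzfile(path), stringsAsFactors = FALSE)

unlink(path)
```

Both calls should return the same three rows, so no separate decompression step is needed before `read.csv`.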
Now, let us take a look at the variable names. (Note that we first convert them to lower case.)
names(data) <- tolower(names(data))
names(data)
## [1] "state__" "bgn_date" "bgn_time" "time_zone" "county"
## [6] "countyname" "state" "evtype" "bgn_range" "bgn_azi"
## [11] "bgn_locati" "end_date" "end_time" "county_end" "countyendn"
## [16] "end_range" "end_azi" "end_locati" "length" "width"
## [21] "f" "mag" "fatalities" "injuries" "propdmg"
## [26] "propdmgexp" "cropdmg" "cropdmgexp" "wfo" "stateoffic"
## [31] "zonenames" "latitude" "longitude" "latitude_e" "longitude_"
## [36] "remarks" "refnum"
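The variables that we need in this analysis are as follows:
evtype, the type of the weather event
fatalities, the number of fatalities
injuries, the number of injuries
propdmg and propdmgexp, the property damage value and its exponent
cropdmg and cropdmgexp, the crop damage value and its exponent
To make our life easier, we modify the names of some of the above variables as follows: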
names(data)[8] <- "event_type"
names(data)[25] <- "property_damage"
names(data)[26] <- "property_dmg_exp"
names(data)[27] <- "crop_damage"
names(data)[28] <- "crop_dmg_exp"
In the following script, we keep only those variables of the data set that we need for our analysis.
library(dplyr)
data <- data %>%
  select(event_type, fatalities, injuries, property_damage, property_dmg_exp, crop_damage, crop_dmg_exp)
str(data)
## 'data.frame': 902297 obs. of 7 variables:
## $ event_type : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ fatalities : num 0 0 0 0 0 0 0 0 1 0 ...
## $ injuries : num 15 0 2 2 2 6 1 0 14 0 ...
## $ property_damage : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ property_dmg_exp: chr "K" "K" "K" "K" ...
## $ crop_damage : num 0 0 0 0 0 0 0 0 0 0 ...
## $ crop_dmg_exp : chr "" "" "" "" ...
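Renaming columns by position works here, but it would silently rename the wrong columns if the column order ever changed. As a hedged alternative, the following sketch renames and selects by name in base R. The toy data frame and the mapping vector `new_names` are hypothetical and only cover three of the columns.

```r
# Toy data frame with a few of the original (lower-cased) column names.
toy <- data.frame(evtype = "TORNADO", propdmg = 25, propdmgexp = "K",
                  refnum = 1, stringsAsFactors = FALSE)

# Old name -> new name mapping; only the columns listed here are kept.
new_names <- c(evtype = "event_type",
               propdmg = "property_damage",
               propdmgexp = "property_dmg_exp")

toy <- toy[, names(new_names)]   # keep the needed columns, selected by name
names(toy) <- unname(new_names)  # rename them in the same order
names(toy)
```

Because the mapping is keyed on the old names, a missing or misspelled column raises an error instead of renaming a neighbor.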
The next step is to calculate the actual dollar amount of the economic damages, i.e., property and crop damages. Combining property_damage with property_dmg_exp (and crop_damage with crop_dmg_exp) gives the amount of property (and crop) damage in US dollars: property_dmg_exp and crop_dmg_exp encode the multipliers for the values in property_damage and crop_damage, respectively. Let us first take a look at the possible values of these exponents:
unique(c(unique(data$property_dmg_exp), unique(data$crop_dmg_exp)))
## [1] "K" "M" "" "B" "m" "+" "0" "5" "6" "?" "4" "2" "3" "h" "7" "H" "-"
## [18] "1" "8" "k"
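Beyond listing the distinct codes, it can help to see how often each one occurs, since rare codes matter little for the totals. The following is a minimal sketch on a small hypothetical vector `exp_codes`, which stands in for the real exponent columns.

```r
# Hypothetical exponent codes standing in for property_dmg_exp.
exp_codes <- c("K", "K", "M", "", "k", "B", "K", "5", "")

# Count each distinct code, most frequent first.
exp_counts <- sort(table(exp_codes), decreasing = TRUE)
exp_counts

# Upper-casing first merges "k" with "K", "m" with "M", and so on.
exp_counts_upper <- sort(table(toupper(exp_codes)), decreasing = TRUE)
exp_counts_upper
```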
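According to this article, the possible values of property_dmg_exp and crop_dmg_exp have the following meanings:
h or H: hundreds (multiply by 100)
k or K: thousands (multiply by 1,000)
m or M: millions (multiply by 1,000,000)
b or B: billions (multiply by 1,000,000,000)
+: multiply by 1
0 to 8: multiply by 10
empty, -, or ?: multiply by 0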
The following function, apply_exp, helps us to compute the real amount of damages in US dollars. It takes a value (i.e., a value of either property_damage or crop_damage) and converts it to a new value based on a given exponent:
apply_exp <- function(value, exponent) {
  # Convert a single damage value to US dollars based on its exponent code.
  x <- 0
  if ((exponent == "h") || (exponent == "H")) {
    x <- value * 100
  } else if ((exponent == "k") || (exponent == "K")) {
    x <- value * 1000
  } else if ((exponent == "m") || (exponent == "M")) {
    x <- value * 1000000
  } else if ((exponent == "b") || (exponent == "B")) {
    x <- value * 1000000000
  } else if (exponent == "+") {
    x <- value
  } else if (exponent %in% as.character(0:8)) {
    x <- value * 10
  } else {
    x <- 0  # "", "-", "?", and any other code
  }
  x
}
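Since apply_exp handles one value at a time, it has to be called once per row. As a hedged sketch of an alternative, the same rules can be written as a lookup table that converts a whole vector of exponent codes in one step. The helper name `exp_multiplier` and the sample vectors are hypothetical; the multipliers mirror the rules of apply_exp.

```r
# Hypothetical helper: map exponent codes to multipliers in one vectorized step.
exp_multiplier <- function(exponent) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9, "+" = 1)
  lookup[as.character(0:8)] <- 10   # digit codes 0 to 8 all mean "times 10"
  mult <- unname(lookup[toupper(exponent)])
  mult[is.na(mult)] <- 0            # "", "-", "?", and any unknown code
  mult
}

# Made-up damage values and exponent codes.
values <- c(25, 2.5, 3, 7, 4, 9)
codes  <- c("K", "m", "B", "+", "5", "?")
dollars <- values * exp_multiplier(codes)
dollars
```

For example, the pair (25, "K") becomes 25 × 1,000 = 25,000 dollars, while the pair (9, "?") becomes 0.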
The following script adds two new variables to the data set:
prop_dmg, which represents the real amount of property damage caused by events in US dollars
crop_dmg, which represents the real amount of crop damage caused by events in US dollars
Moreover, it removes the old variables (i.e., property_damage, crop_damage, property_dmg_exp, and crop_dmg_exp), as we no longer need them in the data set.
data <- data %>%
  mutate(prop_dmg = mapply(apply_exp, property_damage, property_dmg_exp),
         crop_dmg = mapply(apply_exp, crop_damage, crop_dmg_exp)) %>%
  select(-c(property_damage, crop_damage, property_dmg_exp, crop_dmg_exp))
sample_n(data, 6)
## event_type fatalities injuries prop_dmg crop_dmg
## 302744 TORNADO 0 2 150000 0
## 817754 HAIL 0 0 0 0
## 349854 HAIL 0 0 0 0
## 885722 HAIL 0 0 0 0
## 421664 TSTM WIND 0 0 100000 0
## 615162 HAIL 0 0 500000 0
For a given observation, we can categorize the damages into two major groups: health damage and economic damage. The former is the number of fatalities plus the number of injuries. The latter is the amount of crop damage plus the amount of property damage. In the following script, we add two corresponding variables (health_dmg and economic_dmg, respectively) to the data set.
data <- data %>%
mutate(health_dmg = fatalities + injuries, economic_dmg = prop_dmg + crop_dmg)
data <- data[, c(1:3, 6, 4, 5, 7)]
sample_n(data, 5)
## event_type fatalities injuries health_dmg prop_dmg crop_dmg
## 616274 TSTM WIND 0 0 0 0 0
## 127090 HAIL 0 0 0 0 0
## 227251 LIGHTNING 0 0 0 0 0
## 898274 THUNDERSTORM WIND 0 0 0 0 0
## 576909 TSTM WIND 0 0 0 25000 0
## economic_dmg
## 616274 0
## 127090 0
## 227251 0
## 898274 0
## 576909 25000
Now, let us split our data into two data sets: data_health and data_economic:
data_health <- data[, c(1:4)]
str(data_health)
## 'data.frame': 902297 obs. of 4 variables:
## $ event_type: chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ fatalities: num 0 0 0 0 0 0 0 0 1 0 ...
## $ injuries : num 15 0 2 2 2 6 1 0 14 0 ...
## $ health_dmg: num 15 0 2 2 2 6 1 0 15 0 ...
data_economic <- data[, c(1, 5:7)]
str(data_economic)
## 'data.frame': 902297 obs. of 4 variables:
## $ event_type : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ prop_dmg : num 25000 2500 25000 2500 2500 2500 2500 2500 25000 25000 ...
## $ crop_dmg : num 0 0 0 0 0 0 0 0 0 0 ...
## $ economic_dmg: num 25000 2500 25000 2500 2500 2500 2500 2500 25000 25000 ...
Now, we are ready to address our analysis questions in the next section.
In this section, we address our main questions:
QUESTION 1: “Across the United States, which types of events are most harmful with respect to population health?”
QUESTION 2: “Across the United States, which types of events have the greatest economic consequences?”
To address the above questions, we work with the data sets data_health and data_economic, respectively, in the following two subsections.
In the following script, we get the sum of health damages for each event type. The result is saved into a new data set named data_health_evn.
data_health_evn <- data_health %>%
group_by(event_type) %>%
summarize(fatalities = sum(fatalities, na.rm = TRUE), injuries = sum(injuries, na.rm = TRUE), health_damage = sum(health_dmg, na.rm = TRUE))
data_health_evn <- as.data.frame(data_health_evn)
tail(data_health_evn)
## event_type fatalities injuries health_damage
## 980 WINTER WEATHER/MIX 28 72 100
## 981 WINTERY MIX 0 0 0
## 982 Wintry mix 0 0 0
## 983 Wintry Mix 0 0 0
## 984 WINTRY MIX 1 77 78
## 985 WND 0 0 0
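The tail of this table shows that the same kind of event can appear under several labels that differ only in letter case (e.g., "Wintry mix", "Wintry Mix", and "WINTRY MIX"). This report keeps the labels as recorded. As a hedged sketch of one possible cleanup, the following upper-cases and trims the labels before summing. The toy data frame and the column `event_clean` are hypothetical, and spelling variants such as "WINTERY MIX" would still remain separate.

```r
# Hypothetical event labels showing the kind of variants seen in the data.
toy <- data.frame(
  event_type = c("Wintry mix", "Wintry Mix", "WINTRY MIX", " WINTRY MIX", "WND"),
  health_dmg = c(0, 0, 78, 2, 0),
  stringsAsFactors = FALSE
)

# Upper-case and trim whitespace so that case variants collapse into one label.
toy$event_clean <- toupper(trimws(toy$event_type))

# Sum the health damage per cleaned label.
by_event <- aggregate(health_dmg ~ event_clean, data = toy, FUN = sum)
by_event
```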
Let us take a look at the range of health damages:
range(data_health_evn$health_damage)
## [1] 0 96979
The following script shows us what percentage of health damage values are 0 in this vector.
mean(data_health_evn$health_damage == 0) * 100
## [1] 77.66497
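The percentage above relies on the fact that R treats TRUE as 1 and FALSE as 0, so the mean of a comparison is the share of values that satisfy it. A minimal sketch on a made-up vector:

```r
# Hypothetical health damage totals for eight event types.
health_damage <- c(0, 0, 12, 0, 5, 0, 0, 300)

# The comparison yields TRUE/FALSE; its mean is the fraction of zeros.
share_zero <- mean(health_damage == 0) * 100
share_zero
```

Here five of the eight values are 0, so the result is 62.5 percent.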
Since event types with 0 health damage are not of interest, we filter them out of the data set:
data_health_evn <- data_health_evn %>% filter(health_damage != 0)
dim(data_health_evn)
## [1] 220 4
Let us now take a look at the quantiles of the health damage vector:
bord_health <- quantile(data_health_evn$health_damage, probs = c(0.1, 0.5, 0.9))
bord_health
## 10% 50% 90%
## 1.0 5.0 463.9
In our view, the event types in the top 10% by total health damage should be considered the most harmful with respect to population health. As the 90% quantile above shows, this means any event type whose health damage is greater than or equal to 464 (i.e., 463.9 rounded to the nearest integer). In the rest of this subsection, we analyze these event types in more detail.
We filter our data set according to the above criterion to get a new data frame, data_health_evn_worst. We refer to this data set as the worst cases w.r.t. health damages.
data_health_evn_worst <- data_health_evn %>% filter(health_damage >= round(bord_health[3]))
dim(data_health_evn_worst)
## [1] 22 4
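To make the cutoff logic concrete, here is a minimal sketch of the same quantile-and-filter step on a made-up vector of 20 totals. The vector `totals` is hypothetical; `quantile` is used with its default settings, as in the script above.

```r
# Hypothetical totals for 20 event types.
totals <- c(1:18, 500, 9000)

# 90th percentile: roughly 10% of the values lie at or above it.
cutoff <- quantile(totals, probs = 0.9)
cutoff

# Keep only the values at or above the rounded cutoff.
top <- totals[totals >= round(cutoff)]
top
```

With 20 values, the filter keeps the two largest ones, i.e., the top 10%.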
Before going further (say, to visual analysis), we need to add IDs for the event types. Some event type names are quite long, which could make our plots look awkward. Therefore, we add an index column to our data set to distinguish the event types by their indices.
data_health_evn_worst$event_ID <- seq.int(nrow(data_health_evn_worst))
data_health_evn_worst <- data_health_evn_worst[, c(5, 1:4)]
str(data_health_evn_worst)
## 'data.frame': 22 obs. of 5 variables:
## $ event_ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ event_type : chr "BLIZZARD" "EXCESSIVE HEAT" "FLASH FLOOD" "FLOOD" ...
## $ fatalities : num 101 1903 978 470 62 ...
## $ injuries : num 805 6525 1777 6789 734 ...
## $ health_damage: num 906 8428 2755 7259 796 ...
In the following, we visualize our data set, i.e., the worst cases w.r.t. health damages. The blue and red dotted lines indicate the median and the mean, respectively. The x-axis in each of the following plots denotes the event IDs.
layout(matrix(c(1, 2, 3, 3), nrow=2, byrow=TRUE))
with(data_health_evn_worst, plot(event_ID, fatalities, main = "Fatalities in Worst Cases", xlab = "Event ID", ylab = "Fatalities"))
abline(h = median(data_health_evn_worst$fatalities), col = "blue", lwd = 2, lty = 3)
abline(h = mean(data_health_evn_worst$fatalities), col = "red", lwd = 2, lty = 3)
with(data_health_evn_worst, plot(event_ID, injuries, main = "Injuries in Worst Cases", xlab = "Event ID", ylab = "Injuries"))
abline(h = median(data_health_evn_worst$injuries), col = "blue", lwd = 2, lty = 3)
abline(h = mean(data_health_evn_worst$injuries), col = "red", lwd = 2, lty = 3)
with(data_health_evn_worst, plot(event_ID, health_damage, main = "Health Damages in Worst Cases", xlab = "Event ID", ylab = "fatalities + injuries"))
abline(h = median(data_health_evn_worst$health_damage), col = "blue", lwd = 2, lty = 3)
abline(h = mean(data_health_evn_worst$health_damage), col = "red", lwd = 2, lty = 3)
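Each panel above follows the same pattern: a scatter plot followed by one line for the median and one for the mean. As a hedged sketch, this pattern could be wrapped in a small helper. The function name `plot_with_center_lines` and the toy data frame are hypothetical.

```r
# Hypothetical helper that draws one panel plus its median and mean lines.
plot_with_center_lines <- function(df, column, title, ylab = column) {
  y <- df[[column]]
  plot(df$event_ID, y, main = title, xlab = "Event ID", ylab = ylab)
  abline(h = median(y), col = "blue", lwd = 2, lty = 3)  # median line
  abline(h = mean(y), col = "red", lwd = 2, lty = 3)     # mean line
  invisible(c(median = median(y), mean = mean(y)))
}

# Toy stand-in for data_health_evn_worst (made-up numbers).
toy <- data.frame(event_ID = 1:4,
                  fatalities = c(10, 200, 35, 5000),
                  injuries = c(100, 900, 50, 90000))

pdf(NULL)  # draw to a null device so the sketch also runs without a display
layout(matrix(c(1, 2), nrow = 1))
stats_fat <- plot_with_center_lines(toy, "fatalities", "Fatalities in Worst Cases", "Fatalities")
stats_inj <- plot_with_center_lines(toy, "injuries", "Injuries in Worst Cases", "Injuries")
invisible(dev.off())
```

The helper returns the median and mean invisibly, so the values quoted in the text can be read from the same call that draws the lines.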
As a reminder, the Event IDs in our plots refer to the following Event Types:
data_health_evn_worst %>% select(event_ID, event_type)
## event_ID event_type
## 1 1 BLIZZARD
## 2 2 EXCESSIVE HEAT
## 3 3 FLASH FLOOD
## 4 4 FLOOD
## 5 5 FOG
## 6 6 HAIL
## 7 7 HEAT
## 8 8 HEAT WAVE
## 9 9 HEAVY SNOW
## 10 10 HIGH WIND
## 11 11 HURRICANE/TYPHOON
## 12 12 ICE STORM
## 13 13 LIGHTNING
## 14 14 RIP CURRENT
## 15 15 RIP CURRENTS
## 16 16 THUNDERSTORM WIND
## 17 17 THUNDERSTORM WINDS
## 18 18 TORNADO
## 19 19 TSTM WIND
## 20 20 WILD/FOREST FIRE
## 21 21 WILDFIRE
## 22 22 WINTER STORM
The median (blue line), mean (red line), and max of the fatalities in the worst cases are 188, ~599, and 5633, respectively.
The following script extracts those events which have fatalities above the median value of the selected events:
data_fatalities_median <- data_health_evn_worst %>% filter(fatalities > median(fatalities))
data_fatalities_median$event_type
## [1] "EXCESSIVE HEAT" "FLASH FLOOD" "FLOOD" "HEAT"
## [5] "HIGH WIND" "LIGHTNING" "RIP CURRENT" "RIP CURRENTS"
## [9] "TORNADO" "TSTM WIND" "WINTER STORM"
The following script extracts those selected events whose fatalities are above the mean value:
data_fatalities_mean <- data_health_evn_worst %>% filter(fatalities > mean(fatalities))
data_fatalities_mean$event_type
## [1] "EXCESSIVE HEAT" "FLASH FLOOD" "HEAT" "LIGHTNING"
## [5] "TORNADO"
The following script extracts the most harmful event type with respect to the number of fatalities.
data_health_evn_worst[data_health_evn_worst$fatalities == max(data_health_evn_worst$fatalities), ]$event_type
## [1] "TORNADO"
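The three steps above (event types above the median, above the mean, and at the maximum) follow one pattern. As a hedged sketch, they could be collected in a single helper. The function name `rank_events` and the numbers in the toy data frame are made up for illustration.

```r
# Hypothetical helper: event types above the median, above the mean,
# and at the maximum of a given column.
rank_events <- function(df, column) {
  x <- df[[column]]
  list(
    above_median = df$event_type[x > median(x)],
    above_mean   = df$event_type[x > mean(x)],
    worst        = df$event_type[x == max(x)]
  )
}

# Toy data with made-up fatality counts.
toy <- data.frame(event_type = c("TORNADO", "HEAT", "FLOOD", "FOG", "HAIL"),
                  fatalities = c(100, 30, 20, 4, 1),
                  stringsAsFactors = FALSE)

res <- rank_events(toy, "fatalities")
res
```

In this toy example the median is 20 and the mean is 31, so two event types lie above the median but only one lies above the mean, which illustrates how a single extreme value pulls the mean upward.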
The median, mean, and max of the injuries in the selected event types are ~1298, ~6138, and 91346, respectively.
The following script extracts those events which have injuries above the median value of the selected events:
data_injuries_median <- data_health_evn_worst %>% filter(injuries > median(injuries))
data_injuries_median$event_type
## [1] "EXCESSIVE HEAT" "FLASH FLOOD" "FLOOD"
## [4] "HAIL" "HEAT" "ICE STORM"
## [7] "LIGHTNING" "THUNDERSTORM WIND" "TORNADO"
## [10] "TSTM WIND" "WINTER STORM"
The following script extracts those events which have injuries above the mean value of the selected events:
data_injuries_mean <- data_health_evn_worst %>% filter(injuries > mean(injuries))
data_injuries_mean$event_type
## [1] "EXCESSIVE HEAT" "FLOOD" "TORNADO" "TSTM WIND"
The following script extracts the most harmful event type with respect to the number of injuries.
data_health_evn_worst[data_health_evn_worst$injuries == max(data_health_evn_worst$injuries), ]$event_type
## [1] "TORNADO"
The median, mean, and max of the health damages (fatalities + injuries) in the worst-case event types are 1380, 6737.4545, and 96979, respectively.
The following script extracts those events which have health damages above the median value of the selected events:
data_health_median <- data_health_evn_worst %>% filter(health_damage > median(health_damage))
data_health_median$event_type
## [1] "EXCESSIVE HEAT" "FLASH FLOOD" "FLOOD"
## [4] "HEAT" "HIGH WIND" "ICE STORM"
## [7] "LIGHTNING" "THUNDERSTORM WIND" "TORNADO"
## [10] "TSTM WIND" "WINTER STORM"
The following script extracts those events which have health damages above the mean value of the selected events:
data_health_mean <- data_health_evn_worst %>% filter(health_damage > mean(health_damage))
data_health_mean$event_type
## [1] "EXCESSIVE HEAT" "FLOOD" "TORNADO" "TSTM WIND"
The following script extracts the most harmful event type with respect to the number of injuries plus fatalities.
data_health_evn_worst[data_health_evn_worst$health_damage == max(data_health_evn_worst$health_damage), ]$event_type
## [1] "TORNADO"
In the following script, we get the sum of economic damages for each event type. The result is saved into a new data set named data_economic_evn.
data_economic_evn <- data_economic %>%
group_by(event_type) %>%
summarize(property_damage = sum(prop_dmg, na.rm = TRUE), crop_damage = sum(crop_dmg, na.rm = TRUE), economic_damage = sum(economic_dmg, na.rm = TRUE))
data_economic_evn <- as.data.frame(data_economic_evn)
tail(data_economic_evn)
## event_type property_damage crop_damage economic_damage
## 980 WINTER WEATHER/MIX 6372000 0 6372000
## 981 WINTERY MIX 0 0 0
## 982 Wintry mix 0 0 0
## 983 Wintry Mix 2500 0 2500
## 984 WINTRY MIX 10000 0 10000
## 985 WND 0 0 0
The following script shows the range of economic damages in US dollars.
range(data_economic_evn$economic_damage)
## [1] 0 150319678250
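A twelve-digit dollar amount is hard to read at a glance. As a small sketch, the maximum from the output above can be printed with thousands separators or expressed in billions of dollars. The variable name `max_damage` is introduced here only for illustration.

```r
# Maximum economic damage, copied from the range() output above.
max_damage <- 150319678250

# Print with thousands separators and no decimals.
pretty_damage <- formatC(max_damage, format = "f", digits = 0, big.mark = ",")
pretty_damage

# Express in billions of US dollars, rounded to one decimal place.
damage_billions <- round(max_damage / 1e9, 1)
damage_billions
```

That is, the largest total is roughly 150.3 billion US dollars.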
The following script shows us what percentage of economic damage values are 0 in this vector.
mean(data_economic_evn$economic_damage == 0) * 100
## [1] 56.35
Obviously, we are not interested in event types whose economic damages are 0. Therefore, we filter them out from our data set:
data_economic_evn <- data_economic_evn %>% filter(economic_damage != 0)
dim(data_economic_evn)
## [1] 430 4
Let us now take a look at the quantiles of the economic damage vector:
bord_economic <- quantile(data_economic_evn$economic_damage, probs = c(0.1, 0.5, 0.9) )
bord_economic
## 10% 50% 90%
## 4000 223250 237918499
Again, we consider those event types whose economic damages are in the top 10%. As we see above, any event type whose economic damage is greater than or equal to 237918499 US dollars is included in this list.
We filter our data set according to the above criterion to get a new data frame, called data_economic_evn_worst. We refer to this data set as the worst cases w.r.t. economic damages.
data_economic_evn_worst <- data_economic_evn %>% filter(economic_damage >= round(bord_economic[3]))
dim(data_economic_evn_worst)
## [1] 43 4
As we did in the previous subsection, we add an index column to data_economic_evn_worst to distinguish the event types by their indices.
data_economic_evn_worst$event_ID <- seq.int(nrow(data_economic_evn_worst))
data_economic_evn_worst <- data_economic_evn_worst[, c(5, 1:4)]
str(data_economic_evn_worst)
## 'data.frame': 43 obs. of 5 variables:
## $ event_ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ event_type : chr "BLIZZARD" "DAMAGING FREEZE" "DROUGHT" "EXCESSIVE HEAT" ...
## $ property_damage: num 659213950 8000000 1046106000 7753700 67737400 ...
## $ crop_damage : num 112060000 262100000 13972566000 492402000 1292973000 ...
## $ economic_damage: num 771273950 270100000 15018672000 500155700 1360710400 ...
In the following figures, we see the plots for the worst cases w.r.t. economic damages. Again, the blue and red dotted lines indicate the median and the mean, respectively.
layout(matrix(c(1, 2, 3, 3), nrow=2, byrow=TRUE))
with(data_economic_evn_worst, plot(event_ID, property_damage, main = "Property Damages in Worst Cases", xlab = "Event ID", ylab = "Property Damages ($)"))
abline(h = median(data_economic_evn_worst$property_damage), col = "blue", lwd = 2, lty = 3)
abline(h = mean(data_economic_evn_worst$property_damage), col = "red", lwd = 2, lty = 3)
with(data_economic_evn_worst, plot(event_ID, crop_damage, main = "Crop Damages in Worst Cases", xlab = "Event ID", ylab = "Crop Damages ($)"))
abline(h = median(data_economic_evn_worst$crop_damage), col = "blue", lwd = 2, lty = 3)
abline(h = mean(data_economic_evn_worst$crop_damage), col = "red", lwd = 2, lty = 3)
with(data_economic_evn_worst, plot(event_ID, economic_damage, main = "Economic (Property + Crop) Damages in Worst Cases", xlab = "Event ID", ylab = "Economic Damages ($)"))
abline(h = median(data_economic_evn_worst$economic_damage), col = "blue", lwd = 2, lty = 3)
abline(h = mean(data_economic_evn_worst$economic_damage), col = "red", lwd = 2, lty = 3)
As a reminder, the Event IDs in our plots refer to the following Event Types:
data_economic_evn_worst %>% select(event_ID, event_type)
## event_ID event_type
## 1 1 BLIZZARD
## 2 2 DAMAGING FREEZE
## 3 3 DROUGHT
## 4 4 EXCESSIVE HEAT
## 5 5 EXTREME COLD
## 6 6 FLASH FLOOD
## 7 7 FLASH FLOOD/FLOOD
## 8 8 FLASH FLOODING
## 9 9 FLOOD
## 10 10 FLOOD/FLASH FLOOD
## 11 11 FREEZE
## 12 12 FROST/FREEZE
## 13 13 HAIL
## 14 14 HAILSTORM
## 15 15 HEAT
## 16 16 HEAVY RAIN
## 17 17 HEAVY RAIN/SEVERE WEATHER
## 18 18 HEAVY SNOW
## 19 19 HIGH WIND
## 20 20 HIGH WINDS
## 21 21 HURRICANE
## 22 22 HURRICANE ERIN
## 23 23 HURRICANE OPAL
## 24 24 HURRICANE/TYPHOON
## 25 25 ICE STORM
## 26 26 LANDSLIDE
## 27 27 LIGHTNING
## 28 28 RIVER FLOOD
## 29 29 SEVERE THUNDERSTORM
## 30 30 STORM SURGE
## 31 31 STORM SURGE/TIDE
## 32 32 STRONG WIND
## 33 33 THUNDERSTORM WIND
## 34 34 THUNDERSTORM WINDS
## 35 35 TORNADO
## 36 36 TORNADOES, TSTM WIND, HAIL
## 37 37 TROPICAL STORM
## 38 38 TSTM WIND
## 39 39 TYPHOON
## 40 40 WILD FIRES
## 41 41 WILD/FOREST FIRE
## 42 42 WILDFIRE
## 43 43 WINTER STORM
The median (blue line), mean (red line), and max of the property damages in the selected event types are $1205360000, ~$9888930226, and $144657709800, respectively.
The following script extracts those events whose property damage is above the median value of the selected events:
data_property_median <- data_economic_evn_worst %>% filter(property_damage > median(property_damage))
data_property_median$event_type
## [1] "FLASH FLOOD" "FLOOD"
## [3] "HAIL" "HEAVY RAIN/SEVERE WEATHER"
## [5] "HIGH WIND" "HURRICANE"
## [7] "HURRICANE OPAL" "HURRICANE/TYPHOON"
## [9] "ICE STORM" "RIVER FLOOD"
## [11] "STORM SURGE" "STORM SURGE/TIDE"
## [13] "THUNDERSTORM WIND" "THUNDERSTORM WINDS"
## [15] "TORNADO" "TORNADOES, TSTM WIND, HAIL"
## [17] "TROPICAL STORM" "TSTM WIND"
## [19] "WILD/FOREST FIRE" "WILDFIRE"
## [21] "WINTER STORM"
The following script extracts those events whose property damage is above the mean value of the selected events:
data_property_mean <- data_economic_evn_worst %>% filter(property_damage > mean(property_damage))
data_property_mean$event_type
## [1] "FLASH FLOOD" "FLOOD" "HAIL"
## [4] "HURRICANE" "HURRICANE/TYPHOON" "STORM SURGE"
## [7] "TORNADO"
The following script extracts the most costly event type with respect to property damage.
data_economic_evn_worst[data_economic_evn_worst$property_damage == max(data_economic_evn_worst$property_damage), ]$event_type
## [1] "FLOOD"
The median, mean, and max of the crop damage in the selected event types are ~$190655500, ~$1120488179, and $13972566000, respectively.
The following script extracts those events whose crop damage is above the median value of the selected events:
data_crop_median <- data_economic_evn_worst %>% filter(crop_damage > median(crop_damage))
data_crop_median$event_type
## [1] "DAMAGING FREEZE" "DROUGHT" "EXCESSIVE HEAT"
## [4] "EXTREME COLD" "FLASH FLOOD" "FLOOD"
## [7] "FREEZE" "FROST/FREEZE" "HAIL"
## [10] "HEAT" "HEAVY RAIN" "HIGH WIND"
## [13] "HURRICANE" "HURRICANE/TYPHOON" "ICE STORM"
## [16] "RIVER FLOOD" "THUNDERSTORM WIND" "TORNADO"
## [19] "TROPICAL STORM" "TSTM WIND" "WILDFIRE"
The following script extracts those events whose crop damage is above the mean value of the selected events:
data_crop_mean <- data_economic_evn_worst %>% filter(crop_damage > mean(crop_damage))
data_crop_mean$event_type
## [1] "DROUGHT" "EXTREME COLD" "FLASH FLOOD"
## [4] "FLOOD" "HAIL" "HURRICANE"
## [7] "HURRICANE/TYPHOON" "ICE STORM" "RIVER FLOOD"
The following script extracts the most costly event type with respect to the crop damage.
data_economic_evn_worst[data_economic_evn_worst$crop_damage == max(data_economic_evn_worst$crop_damage), ]$event_type
## [1] "DROUGHT"
The median, mean, and max of the economic damages (property + crop damages) in the selected event types are $1602500000, $11009418405.0698, and $150319678250, respectively.
The following script extracts those events whose economic cost is above the median value of the selected events:
data_eco_median <- data_economic_evn_worst %>% filter(economic_damage > median(economic_damage))
data_eco_median$event_type
## [1] "DROUGHT" "FLASH FLOOD"
## [3] "FLOOD" "HAIL"
## [5] "HEAVY RAIN/SEVERE WEATHER" "HIGH WIND"
## [7] "HURRICANE" "HURRICANE OPAL"
## [9] "HURRICANE/TYPHOON" "ICE STORM"
## [11] "RIVER FLOOD" "STORM SURGE"
## [13] "STORM SURGE/TIDE" "THUNDERSTORM WIND"
## [15] "THUNDERSTORM WINDS" "TORNADO"
## [17] "TROPICAL STORM" "TSTM WIND"
## [19] "WILD/FOREST FIRE" "WILDFIRE"
## [21] "WINTER STORM"
The following script extracts those events whose economic cost is above the mean value of the selected events:
data_eco_mean <- data_economic_evn_worst %>% filter(economic_damage > mean(economic_damage))
data_eco_mean$event_type
## [1] "DROUGHT" "FLASH FLOOD" "FLOOD"
## [4] "HAIL" "HURRICANE" "HURRICANE/TYPHOON"
## [7] "STORM SURGE" "TORNADO"
The following script extracts the most costly event type.
data_economic_evn_worst[data_economic_evn_worst$economic_damage == max(data_economic_evn_worst$economic_damage), ]$event_type
## [1] "FLOOD"
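A sorted table with each event type's share of the total is another way to present this kind of result. The following is a minimal sketch in base R. The toy data frame, its numbers (in millions of US dollars), and the column `share_pct` are made up for illustration and are not results of this analysis.

```r
# Toy data with made-up economic damage totals (in millions of US dollars).
toy <- data.frame(event_type = c("HAIL", "FLOOD", "LANDSLIDE", "DROUGHT"),
                  economic_damage = c(100, 600, 50, 250),
                  stringsAsFactors = FALSE)

# Sort from most to least costly.
ranked <- toy[order(toy$economic_damage, decreasing = TRUE), ]

# Add each event type's share of the total, in percent.
ranked$share_pct <- round(100 * ranked$economic_damage / sum(ranked$economic_damage), 1)
ranked
```

In this toy example, the first row accounts for 60 percent of the total, which makes the dominance of a single event type visible without a plot.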