The goal of this report is to explore the NOAA Storm Database and answer two basic questions about severe weather events: which types of events are most harmful to population health, and which have the greatest economic consequences.
Because of the way the data set was compiled, a considerable amount of reclassification was necessary. We also adjusted damage values for inflation so that economic damages are comparable across the period. The result is a tidy data set grouped by event type, which we use to address both questions. We observe that the consequences of natural event types for population health and the economy can be understood as a compound effect of frequency and intensity, each with its own trend. We review the trend of tornadoes over the period. We analyse the data set in terms of the most frequent events and also explore how event types group by intensity in fatalities, injuries, property damage and crop damage, using the median by event. Finally, we present the 10 most harmful types of natural events for people and for the economy; half of the events appear in both lists.
The raw data analysed in this report is a part of the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. The period of the data is from 1950 to 2011. There is more information available at: Storm Events Database.
Due to the amount of information in the data set, we will work only with the variables of interest.
#including packages
paq<-c("dplyr","ggplot2","readr","knitr" )
# load each package, installing it first if it is missing
lapply(paq, function(paq) {if (!require(paq, character.only=TRUE))
{install.packages(paq);require(paq, character.only=TRUE)}})
## [[1]]
## NULL
##
## [[2]]
## NULL
##
## [[3]]
## NULL
##
## [[4]]
## NULL
#setting environment
Sys.setenv(TZ="Europe/Madrid")
Sys.setlocale("LC_TIME","en_US.UTF-8")
## [1] "en_US.UTF-8"
#folders
cDir<-c("data","figures")
lapply(cDir, function(cDir) {if(!file.exists(cDir)) {
dir.create(cDir)}})
## [[1]]
## NULL
##
## [[2]]
## NULL
# loading storm data
zipF<- paste0("./",cDir[1],"/StormData.csv.bz2")
#fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
#download.file(fileUrl, destfile=zipF)
#dateDownloaded <- date()
dataF<-zipF
stormDatanames<-c("STATE__","BGN_DATE","BGN_TIME","TIME_ZONE","COUNTY",
"COUNTYNAME","STATE","EVTYPE","BGN_RANGE","BGN_AZI",
"BGN_LOCATI","END_DATE","END_TIME","COUNTY_END",
"COUNTYENDN","END_RANGE","END_AZI","END_LOCATI",
"LENGTH","WIDTH","F","MAG","FATALITIES","INJURIES",
"PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP","WFO",
"STATEOFFIC","ZONENAMES","LATITUDE","LONGITUDE",
"LATITUDE_E","LONGITUDE_","REMARKS","REFNUM")
stormData <-
read_csv(dataF,col_names = stormDatanames,
col_types = cols_only(
"BGN_DATE" = "c",
"EVTYPE" = "c",
"FATALITIES" = "n",
"INJURIES" = "n",
"PROPDMG" = "n",
"PROPDMGEXP" = "c",
"CROPDMG" = "n",
"CROPDMGEXP" = "c"),
skip = 1,
progress = FALSE)
rstormData<-nrow(stormData)
cstormData<-ncol(stormData)
#loading inflation data
fileUrl <-"https://github.com/angieFelipe/datasciencesoursera/blob/master/ipcUSA.csv?raw=true"
dateDownloaded2 <- date()
inflation<-read_csv2(fileUrl,col_names=TRUE,
cols(
year = col_double(),
pct_inflation = col_double(),
inflatiion = col_double(),
inf_rate = col_double(),
act_rate = col_double()
))
The storm data set contains 902297 observations of 8 variables. The inflation data set contains the year and an approximation of the annual inflation rate in the United States.
The first step is to reduce the data set to the variables and rows of interest. As we are focused on the types of events that cause personal and economic damage, we discard events with no such effects and variables that are not directly related to this information. We keep the year in which each event began, both in case a time series analysis is of interest and to adjust dollar values for inflation across the period.
#only variables of our interest
stormData$year<-as.POSIXlt( as.Date.character(stormData$BGN_DATE, "%m/%d/%Y %X"))$year+1900
storm_data<-stormData
storm_data[is.na(storm_data)]<-0 #converting na to 0
#only rows of interest
storm_data<-storm_data %>%filter(FATALITIES>0 | INJURIES>0 | PROPDMG>0 | CROPDMG>0)
rstorm_data<-nrow(storm_data)
cstorm_data<-ncol(storm_data)
We now have a data set of 254633 observations and the 9 variables mentioned above.
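The year extraction used above can be checked on a few sample strings. This is a minimal sketch, assuming that BGN_DATE values look like `4/18/1950 0:00:00`; the sample vector is illustrative and not taken from the data.

```r
# Illustrative BGN_DATE strings (assumed format: month/day/year plus a time part)
bgn_date <- c("4/18/1950 0:00:00", "11/15/1995 0:00:00", "4/27/2011 0:00:00")

# as.Date() reads only as far as the format requires; the trailing time is ignored
parsed <- as.Date(bgn_date, format = "%m/%d/%Y")

# Extract the calendar year as a number, ready for a join on `year`
year <- as.numeric(format(parsed, "%Y"))
print(year)
```

Keeping the year numeric matters later, because the inflation table is joined on a numeric year column.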
A quick review of the data set makes it obvious that the event types and the “DMGEXP” variables must be reclassified in order to obtain a tidy data set.
Also, damages are recorded in current dollars from 1950 to 2011, and this report is written in 2018. We therefore need to convert these valuations to 2018 dollars to make the figures comparable (that is to say, the value of grain in 1950 is not directly comparable with its value in 2000 or today).
In the next sections we will proceed to tidy the data.
The Storm Data Event Table in the NOAA guidelines lists 48 event types, while our data set contains more than 400. We therefore reclassify the event type variable as far as possible and assign the remaining events a generic name: “rest”.
#NOAA event type classification
stype<-tolower(c("Astronomical Low Tide","Avalanche","Blizzard","Coastal Flood", "Cold/Wind Chill","Debris Flow","Dense Fog","Dense Smoke",
"Drought","Dust Devil","Dust Storm","Excessive Heat",
"Extreme Cold/Wind Chill","Flash Flood","Flood","Frost/Freeze",
"Funnel Cloud","Freezing Fog","Hail","Heat","Heavy Rain",
"Heavy Snow","High Surf","High Wind","Hurricane (Typhoon)",
"Ice Storm","Lake-Effect Snow","Lakeshore Flood","Lightning",
"Marine Hail","Marine High Wind","Marine Strong Wind",
"Marine Thunderstorm Wind","Rip Current","Seiche","Sleet",
"Storm Surge/Tide","Strong Wind","Thunderstorm Wind","Tornado",
"Tropical Depression","Tropical Storm","Tsunami","Volcanic Ash",
"Waterspout","Wildfire","Winter Storm","Winter Weather" ))
# reclassification
storm_data$CEVTYPE<-tolower(storm_data$EVTYPE)
storm_data$CEVTYPE<-gsub("astronomical high tide|astronomical low tide",
"astronomical low tide",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("avalance|heavy snow/blizzard/avalanche|avalanche",
"avalanche",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("blizzard|blizzard/winter storm|ground blizzard|heavy snow/blizzard|high wind/blizzard",
"blizzard",storm_data$CEVTYPE)
storm_data$CEVTYPE <-gsub("coastal flood|coastal flooding/erosion|coastal erosion|coastal flooding|coastal flooding/erosion|coastal surge|heavy surf coastal flooding|high winds/coastal flood|erosion/cstl flood|tidal flooding",
"coastal flood",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("extreme cold/wind chill|extreme wind chill|extreme windchill|extended cold|extreme cold",
"extreme cold/wind chill",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("dense fog|fog|dense freezing fog",
"dense fog",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("drought|drought/excessive heat|heat wave drought",
"drought",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("dust devil|dust devil waterspout",
"dust devil",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("dust storm|blowing dust|dust storm/high winds",
"dust storm",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("excessive heat|extreme heat|heat wave|heat waves|record heat|record/excessive heat",
"excessive heat",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("flash flood|flash flood - heavy rain|flash flood from ice jams|flash flood landslides|flash flood winds|flash flood/|flash flood/ street|flash flood/flood|flash flood/landslide|flash flooding|flash flooding/flood|flash flooding/thunderstorm wi|flash floods|flood flash|flood/flash|flood/flash flood|flood/flash/flood|flood/flashflood|ice storm/flash flood",
"flash flood",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("lakeshore flood|lake flood",
"lakeshore flood",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("flood|breakup flooding|flood & heavy rain|flood/rain/winds|flood/river flood|flooding|flooding/heavy rain|floods|heavy rain and flood|heavy rains/flooding|heavy snow/high winds & flood|ice jam flood \\(minor|ice jam flooding|major flood|minor flooding|mud slides urban flooding|river and stream flood|river flood|river flooding|rural flood|small stream flood|snowmelt flooding|thunderstorm winds/ flood|thunderstorm winds/flooding|urban and small stream floodin|urban flood|urban flooding|urban floods|urban/small stream flood",
"flood",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("frost/freeze|agricultural freeze|damaging freeze|early frost|freeze|freezing drizzle|freezing rain|freezing rain/sleet|freezing rain/snow|freezing spray|frost|frost\\freeze|hard freeze|heavy snow/freezing rain|light freezing rain|snow freezing rain|snow/freezing rain|snow/sleet/freezing rain",
"frost/freeze",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("funnel cloud|thunderstorm winds/funnel clou",
"funnel cloud",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("freezing fog|fog and cold temperatures|freezing dense fog",
"freezing fog",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("hail|gusty wind/hail|hail|hail 0\\.75|hail 075|hail 100|hail 125|hail 150|hail 175|hail 200|hail 275|hail 450|hail 75|hail damage|hail/wind|hail/winds|hailstorm|small hail|thunderstorm hail|thunderstorm wind/hail|thunderstorm winds hail|thunderstorm winds/hail|thunderstorm windshail|tstm wind/hail|wind/hail",
"hail",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("heavy rain|excessive rainfall|gusty wind/hvy rain|heavy rain/high surf|heavy rain/lightning|heavy rain/severe weather|heavy rain/small stream urban|heavy rain/snow|heavy rains|high winds heavy rains|high winds/heavy rain|hvy rain|lightning and heavy rain|lightning/heavy rain|rain|rain/snow|rain/wind|rainstorm|record rainfall|torrential rainfall|unseasonal rain|heavy precipitation|heavy shower",
"heavy rain",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("heavy snow|blowing snow|cold and snow|excessive snow|falling snow/ice|heavy snow|heavy snow and high winds|heavy snow and strong winds|heavy snow shower|heavy snow squalls|heavy snow-squalls|heavy snow/ice|heavy snow/squalls|heavy snow/wind|heavy snow/winter storm|heavy snowpack|high wind/heavy snow|high winds/snow|ice and snow|late season snow|light snow|light snowfall|record snow|snow|snow accumulation|snow and heavy snow|snow and ice|snow and ice storm|snow squall|snow squalls|snow/ bitter cold|snow/ ice|snow/blowing snow|snow/cold|snow/heavy snow|snow/high winds|snow/ice|snow/ice storm|snow/sleet|thundersnow",
"heavy snow",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("lake-effect snow|lake effect snow|heavy lake snow",
"lake-effect snow",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("high surf|hazardous surf|heavy surf|heavy surf and wind|heavy surf/high surf|high surf advisory|rip currents/heavy surf|rough surf",
"high surf",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("high wind|high winds|high wind (g40)|high wind 48|high wind damage|high winds|high winds/|high winds/cold|hurricane opal/high winds|winter storm high winds",
"high wind",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("marine high wind|high wind and seas|high wind/seas",
"marine high wind",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("strong wind|strong winds|ice/strong winds|non tstm wind|non-tstm wind",
"strong wind",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("lightning thunderstorm winds|severe thunderstorm winds|thunderestorm winds|thunderstorm winds|thunderstorm wind|thunderstorm wind (g40)|thunderstorm wind 60 mph|thunderstorm wind 65 mph|thunderstorm wind 65mph|thunderstorm wind 98 mph|thunderstorm wind g50|thunderstorm wind g52|thunderstorm wind g55|thunderstorm wind g60|thunderstorm wind trees|thunderstorm wind.|thunderstorm wind/ tree|thunderstorm wind/ trees|thunderstorm wind/awning|thunderstorm wind/lightning|thunderstorm winds|thunderstorm winds 13|thunderstorm winds 63 mph|thunderstorm winds and|thunderstorm winds g60|thunderstorm winds lightning|thunderstorm winds.|thunderstorm winds53|thunderstorm windss|thunderstorms wind|thunderstorms winds|thunderstormwinds|thunderstrom wind|thundertorm winds",
"thunderstorm wind",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("hurricane|hurricane edouard|hurricane emily|hurricane erin|hurricane felix|hurricane gordon|hurricane opal|hurricane-generated swells|hurricane/typhoon|typhoon",
"hurricane (typhoon)",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("glaze/ice storm|ice storm|sleet/ice storm",
"ice storm",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("storm surge|storm surge/tide",
"storm surge/tide",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("winter storm|winter storms",
"winter storm",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("coastal storm|coastalstorm|marine tstm wind",
"marine thunderstorm wind",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("tropical storm|tropical storm alberto|tropical storm dean|tropical storm gordon|tropical storm jerry",
"tropical storm",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("lightning and thunderstorm win|severe thunderstorm|severe thunderstorms|storm force winds|thuderstorm winds|thundeerstorm winds|thunderstorm|thunderstorm damage to|thunderstorm wins|thunderstorms|thunderstormw|thunerstorm winds|tunderstorm wind|wind storm",
"thunderstorm wind",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("^thunderstorm high wind|^tstm wind|^thunderstorm wind",
"thunderstorm wind",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("tstm wind|tstm wind (g45)|tstm wind (41)|tstm wind (g35)|tstm wind (g40)|tstm wind (g45)|tstm wind 40|tstm wind 45|tstm wind 55|tstm wind 65)|tstm wind and lightning|tstm wind damage|tstm wind g45|tstm wind g58|tstm winds|tstmw",
"thunderstorm wind",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("lightning|lightning wauseon|lightning fire|lightning injury|lightning.|lighting|ligntning",
"lightning",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("rip current|rip currents",
"rip current",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("cold/winds",
"cold/wind chill",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("wind and wave|marine high high wind",
"marine high wind",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("dry mircoburst winds|gradient wind|gusty wind|gusty wind/rain|gusty winds|microburst winds|non-severe wind damage|whirlwind|wind|wind damage|winds",
"high wind",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("cold air tornado|tornado|tornado f0|tornado f1|tornado f2|tornado f3|tornadoes|tornadoes, tstm wind, hail|waterspout tornado|waterspout-tornado|waterspout/ tornado|waterspout/tornado|torndao",
"tornado",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("waterspout|waterspout-",
"waterspout",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("brush fire|forest fires|grass fires|wild fires|wild/forest fire|wild/forest fires|wildfire|wildfires",
"wildfire",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("winter weather|winter weather mix|winter weather/mix|wintry mix ",
"winter weather",storm_data$CEVTYPE)
storm_data$CEVTYPE<-gsub("cold|cold and wet conditions|cold temperature|cold wave|cold weather|cool and wet ",
"cold/wind chill",storm_data$CEVTYPE)
storm_data[!(storm_data$CEVTYPE %in% stype),"CEVTYPE"]<-"rest"
reclass<-NROW(unique(storm_data$CEVTYPE))
After reclassification we are left with 41 distinct event types.
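Because `gsub()` replaces every matching substring, an unanchored alternative such as `wind` or `flood` can also rewrite labels that an earlier step had already normalised. One possible alternative, shown here only as a sketch, is to classify each label once with an ordered table of anchored patterns in which the first match wins. The names `rules` and `classify_event`, the patterns and the sample labels are all hypothetical and are not part of the analysis above.

```r
# Hypothetical rule table: regular expression -> official event type.
# Rules are tried in order and the first match wins.
rules <- c(
  "^(tstm|thunderstorms?) winds?" = "thunderstorm wind",
  "^flash flood"                  = "flash flood",
  "flood"                         = "flood",
  "^tornado"                      = "tornado"
)

classify_event <- function(evtype, rules, default = "rest") {
  x <- tolower(trimws(evtype))
  out <- rep(default, length(x))
  assigned <- rep(FALSE, length(x))
  for (i in seq_along(rules)) {
    # only labels that no earlier rule has claimed can be assigned here
    hit <- !assigned & grepl(names(rules)[i], x)
    out[hit] <- rules[[i]]
    assigned <- assigned | hit
  }
  out
}

# Illustrative raw labels
sample_types <- c("TSTM WIND", "THUNDERSTORM WINDS", "FLASH FLOODING",
                  "RIVER FLOOD", "TORNADO F1", "HIGH SEAS")
classified <- classify_event(sample_types, rules)
print(data.frame(raw = sample_types, classified))
```

Since each label is assigned exactly once, the result does not depend on how earlier replacements changed the string, only on the order of the rules.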
The data set records only the number of fatalities and the number of injuries, which makes health impact difficult to assess: what ages were affected, and how severe were the injuries? An ideal population health measure would be expressed in terms of lost life expectancy, which depends on age and on the severity of injuries.
Although we analyse fatalities and injuries separately, it is also useful to have a complete view of health damages. In this report, we assume that the average loss of life expectancy from a fatality is 4 times that from an injury, so that the two can be aggregated. The aggregate variable should be read as an average score of lost life expectancy, a measure of the population health loss due to storms and weather events. We therefore create a new variable to sum up population health damages.
storm_data$PHEALTH<-storm_data$FATALITIES*4+storm_data$INJURIES
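As a quick worked check of the weighting, the sketch below applies the same formula to the fatality and injury totals that this report gives for two event types in its population health table.

```r
# Fatalities and injuries for two event types, copied from the health table of this report
fatalities <- c(tornado = 5636, `excessive heat` = 2195)
injuries   <- c(tornado = 91407, `excessive heat` = 7109)

# Assumption of this report: one fatality weighs as much as four injuries
phealth <- fatalities * 4 + injuries
print(phealth)
```

The results reproduce the "Total" column of that table (113,951 for tornado and 15,889 for excessive heat).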
Another possibility would have been to assign an economic value to fatalities and injuries: for instance, valuing injuries at the average cost of treatment and fatalities at the average sum insured in equivalent insurance claims. Inflation would then have to be taken into account, or 2018 values applied directly to all years.
To reconstruct property and crop damages, two variables are required for each: “DMG” and “DMGEXP”. The latter holds the magnitude (a power of ten) by which “DMG” is multiplied.
The “DMGEXP” variables must be reclassified before any arithmetic can be done. We therefore recode “DMGEXP”, then compute the value of each type of damage and their sum.
# reclassifying DMGEXP
storm_data$PROPDMGEXP<-toupper(storm_data$PROPDMGEXP)
storm_data$CROPDMGEXP<-toupper(storm_data$CROPDMGEXP)
storm_data$PROPDMGEXP<-gsub("-|\\+|0","1",storm_data$PROPDMGEXP)
storm_data$PROPDMGEXP<-gsub("2|H","100",storm_data$PROPDMGEXP)
storm_data$PROPDMGEXP<-gsub("3|K","1000",storm_data$PROPDMGEXP)
storm_data$PROPDMGEXP<-gsub("4","10000",storm_data$PROPDMGEXP)
storm_data$PROPDMGEXP<-gsub("5","100000",storm_data$PROPDMGEXP)
storm_data$PROPDMGEXP<-gsub("6|M","1000000",storm_data$PROPDMGEXP)
storm_data$PROPDMGEXP<-gsub("7","10000000",storm_data$PROPDMGEXP)
storm_data$PROPDMGEXP<-gsub("B","1000000000",storm_data$PROPDMGEXP)
storm_data$CROPDMGEXP<-gsub("-|\\?|0","1",storm_data$CROPDMGEXP)
storm_data$CROPDMGEXP<-gsub("3|K","1000",storm_data$CROPDMGEXP)
storm_data$CROPDMGEXP<-gsub("6|M","1000000",storm_data$CROPDMGEXP)
storm_data$CROPDMGEXP<-gsub("B","1000000000",storm_data$CROPDMGEXP)
storm_data$PROPDMGEXP<-as.numeric(storm_data$PROPDMGEXP)
storm_data$CROPDMGEXP<-as.numeric(storm_data$CROPDMGEXP)
storm_data$YPROPDMG<-storm_data$PROPDMGEXP*storm_data$PROPDMG
storm_data$YCROPDMG<-storm_data$CROPDMGEXP*storm_data$CROPDMG
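The same recoding can also be expressed as a lookup table, which avoids chaining replacements on strings. This is only a sketch: `exp_lookup` and `exp_to_multiplier` are hypothetical names, only the letter codes are covered, and treating every other code as a multiplier of 1 is an assumption of the example rather than the rule applied above.

```r
# Hypothetical lookup table: exponent letter -> multiplier
exp_lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

exp_to_multiplier <- function(code, lookup = exp_lookup, default = 1) {
  # unmatched codes (including blanks) index to NA and then fall back to `default`
  mult <- unname(lookup[toupper(code)])
  mult[is.na(mult)] <- default
  mult
}

# Illustrative damage figures and exponent codes
dmg    <- c(25, 2.5, 1.2, 10, 3)
dmgexp <- c("K", "m", "B", "", "h")
value  <- dmg * exp_to_multiplier(dmgexp)
print(value)
```

A lookup keeps each code independent of the others, so adding or changing one mapping cannot affect values that were already converted.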
Next, in order to compare damages across years, we express all values in 2018 dollars by applying the adjustment rate act_rate from the inflation data set.
#dmg value actualization
storm_data<-inner_join(storm_data,select(inflation, year, act_rate), by = "year")
storm_data$APROPDMG<-storm_data$YPROPDMG*storm_data$act_rate
storm_data$ACROPDMG<-storm_data$YCROPDMG*storm_data$act_rate
storm_data$AECODMG<-storm_data$ACROPDMG+storm_data$APROPDMG
#frequency
storm_data$freq<-1
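How act_rate is built is not shown in this report. The sketch below illustrates one plausible construction, assuming that the rate of year y is the price change from y-1 to y, so that the factor for year y is the cumulative price growth over the following years up to the reference year. The inflation figures are invented for the example.

```r
# Invented annual inflation rates (fractions), for illustration only
infl <- data.frame(year = as.numeric(2014:2018),
                   rate = c(0.02, 0.00, 0.01, 0.02, 0.02))

# Assumed definition: factor for year y = product of (1 + rate) over the years after y,
# so the reference year (the last one) gets a factor of exactly 1
growth <- 1 + infl$rate
infl$act_rate <- rev(cumprod(rev(growth))) / growth

# Three illustrative damage records of 1000 dollars each
damages <- data.frame(year = c(2014, 2017, 2018), dmg = c(1000, 1000, 1000))
adjusted <- merge(damages, infl[, c("year", "act_rate")], by = "year")
adjusted$dmg_2018 <- adjusted$dmg * adjusted$act_rate
print(adjusted)
```

With this definition a damage recorded in the reference year is left unchanged, and older damages are scaled up by the price growth since they occurred.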
As we are focused on the consequences of each event type, we create two data sets that group these effects:
1. The first groups by year and event type, since weather effects show seasonal variations and a time series analysis would be convenient.
2. The second, and the more interesting one, groups the damages by event type alone.
# grouping ----------------------------------------------------------------
# named list of summary functions; names become column suffixes (e.g. freq_sum)
fs<-list(sum = sum, median = median, mean = mean, min = min, max = max)
gstorm<-storm_data %>% select(year,CEVTYPE, freq, FATALITIES, INJURIES,
PHEALTH, APROPDMG,ACROPDMG, AECODMG) %>%
group_by(year, CEVTYPE) %>% summarise_all(fs)
ggstorm<-storm_data %>% select(CEVTYPE, freq, FATALITIES, INJURIES,
PHEALTH, APROPDMG,ACROPDMG, AECODMG) %>%
group_by( CEVTYPE) %>% summarise_all(fs)
First we would like to know the importance of the “rest” event type.
# column totals of the "_sum" variables across all event types
totals<-sapply(select(ggstorm, ends_with("_sum")),sum)
rest<-filter(ggstorm, CEVTYPE=="rest")%>%select( ends_with("_sum"))
wrest<-rest/totals*100
colnames(wrest) <-
tolower(gsub("_sum$", "", names(wrest)))
kable(wrest,
caption = "Weight of 'rest' event type in the data set",
align = "r",
digits = 2,
col.names = c("Frequency","Fatalities","Injuries","Total Health Damage","Actual Property Damage","Actual Crop Damage", "Actual Total Economic Damage" ),
format.args = list(big.mark=",")
)
| Frequency | Fatalities | Injuries | Total Health Damage | Actual Property Damage | Actual Crop Damage | Actual Total Economic Damage |
|---|---|---|---|---|---|---|
| 51.39 | 11.65 | 8.71 | 9.6 | 4.31 | 7.11 | 4.59 |
As we can see, although the events included in “rest” are frequent (slightly more than half of all events), their share of the damages is under 10% except for fatalities. We can therefore continue our analysis treating “rest” as just another event type.
We can regard the object of our analysis as a compound effect of frequency and intensity. We therefore look at the frequency of the events, the intensity of the damages for each event, and the compound result.
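The decomposition into frequency and intensity can be illustrated with figures from the population health table of this report. In this sketch, intensity is taken to be the average health score per recorded event, which is one simple way to express it and not the only one.

```r
# Event counts and weighted health totals, copied from the health table of this report
events <- data.frame(
  type      = c("tornado", "excessive heat", "rip current"),
  frequency = c(39970, 754, 641),
  phealth   = c(113951, 15889, 2817)
)

# Intensity = average health score per recorded event
events$intensity <- events$phealth / events$frequency

# The total damage is recovered as frequency times intensity
events$recomposed <- events$frequency * events$intensity
print(events)
```

Tornadoes reach their total mainly through frequency (a score of roughly 3 per event), whereas excessive heat does so through intensity (roughly 21 per event).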
We already know that 51.4% of the events belong to the “rest” type. Leaving it aside, the next graphic shows the remaining event types in distinct groups.
frggstorm<-ggstorm[ggstorm$CEVTYPE!="rest",]
frggstorm<-frggstorm[order(-frggstorm$freq_sum),]
with(frggstorm, {
barplot(freq_sum , names=frggstorm$CEVTYPE, cex.names = 0.7, las=2)
title("Natural Catastrophic Events by frequency")
abline(h=1800, col="red",lwd=3)
})
Tornado, hail, flash flood, lightning and flood are the most frequent events, each with more than 10000 occurrences. Together they account for 44% of the events, so the other 35 event types make up only about 5%, with fewer than 1800 occurrences each in the period 1950 to 2011. We would expect this group of events (tornado, etc.) to be the most harmful, as they are the most frequent.
Tornado is the most frequent event in the analysed data set. Let’s take a look at tornado events over the period.
tornado<-filter(gstorm, CEVTYPE=="tornado")
ggplot(tornado, aes(x = year)) +
geom_line(aes(y = freq_sum, color = "Frequency")) +
geom_line(aes(y = PHEALTH_sum, color = "F & I")) +
scale_x_continuous(breaks = c(seq(1950, 2012, by = 2)),
labels = c(seq(1950, 2012, by = 2))) +
theme(axis.text.x = element_text(
angle = 90,
hjust = 1,
vjust = 0.5
)) +
geom_smooth(
aes(x = year, y = freq_sum),
method = "lm",
se = TRUE,
color = "orange",
lwd = 0.5
) +
geom_smooth(
aes(x = year, y = PHEALTH_sum),
method = "lm",
se = FALSE,
color = "green",
lwd = 0.5
) +
labs(title = "Tornadoes time series analysis")
Tornadoes show cycles of years in which the frequency rises and falls, but the average number of tornadoes per year has increased steadily over the period: the confidence interval of the trend at the end of the period does not overlap with the one at the beginning. On the other hand, the total number of injuries and fatalities has a decreasing trend, with very intense peaks in the upper cycles. The two cycles do not always coincide: for instance, in 2004 the number of tornadoes increased but PHEALTH decreased to one of its lowest figures. In terms of PHEALTH, 2011 was the worst year of the period; the other peak was 1974, which was not an important year in terms of frequency.
The graph above makes it possible to distinguish the effects of frequency and intensity.
Now we look at the intensity of the damage. Because we have count variables such as fatalities and injuries, we use the median by event to measure intensity. The values for health and economic consequences differ so much that they cannot be compared directly: the first is measured in number of persons, the second in dollars. Nevertheless, we use a heatmap to cluster event types across variables with respect to intensity.
hastorm<-as.matrix(ggstorm[,c(grep("_median$",names(ggstorm)))])
rownames(hastorm)<-ggstorm$CEVTYPE
colnames(hastorm) <-
tolower(gsub("_median$", "", colnames(hastorm)))
heatmap(hastorm[,c(2:3,5:6)], cexCol = 0.8, main = "Heatmap comparing event types and damages")
Generally, the variable with the highest intensity, that is to say, with high median damage values for most event types, is the adjusted value of property damage. Apart from this main group of event types, there are 2 more:
With respect to population health, the most harmful types of natural events are the following:
kable(head(ggstorm[order(-ggstorm$PHEALTH_sum),1:5],10),
caption = "Most Harmful Natural Events for Population Health",
align = c("l","r","r","r","r"),
col.names = c("Event Type"," Frequency","Fatalities","Injuries","Total"),
format.args = list(big.mark=",")
)
| Event Type | Frequency | Fatalities | Injuries | Total |
|---|---|---|---|---|
| tornado | 39,970 | 5,636 | 91,407 | 113,951 |
| rest | 130,852 | 1,764 | 12,246 | 19,302 |
| excessive heat | 754 | 2,195 | 7,109 | 15,889 |
| flood | 10,634 | 484 | 6,794 | 8,730 |
| lightning | 13,299 | 817 | 5,231 | 8,499 |
| flash flood | 21,604 | 1,035 | 1,802 | 5,942 |
| heat | 215 | 937 | 2,100 | 5,848 |
| rip current | 641 | 572 | 529 | 2,817 |
| ice storm | 710 | 89 | 1,990 | 2,346 |
| winter storm | 1,509 | 216 | 1,338 | 2,202 |
Tornado is clearly the worst natural event in terms of population health, with 56.66% of the total fatalities and injuries. Next comes the group we have called “rest”, so a closer look at this miscellaneous category should be taken in future work. These 10 events account for 92.25% of total fatalities and injuries.
With respect to economic damages, the types of natural events with the greatest economic consequences are as follows:
prhead<-head(ggstorm[order(-ggstorm$AECODMG_sum ),c(1:2,6:8)],10)
prhead[,3:5]<-round(prhead[,3:5]/1000000,0)
kable(prhead,
caption = "Most Harmful Natural Events for Economic Consequences in million dollars of 2018",
align = c("l","r","r","r","r"),
col.names = c("Event Type"," Frequency","Actual Property Damage","Actual Crop Damage","Total Economic Damage"),
format.args = list(big.mark=",")
)
| Event Type | Frequency | Actual Property Damage | Actual Crop Damage | Total Economic Damage |
|---|---|---|---|---|
| flood | 10,634 | 192,767 | 16,550 | 209,317 |
| tornado | 39,970 | 159,948 | 592 | 160,541 |
| hurricane (typhoon) | 232 | 114,519 | 7,746 | 122,265 |
| storm surge/tide | 224 | 61,639 | 1 | 61,640 |
| rest | 130,852 | 28,185 | 5,135 | 33,320 |
| flash flood | 21,604 | 24,184 | 2,079 | 26,263 |
| hail | 26,670 | 21,485 | 4,406 | 25,891 |
| drought | 277 | 1,451 | 19,860 | 21,311 |
| ice storm | 710 | 5,593 | 8,488 | 14,080 |
| wildfire | 1,258 | 11,755 | 536 | 12,291 |
The main losses are due to floods, with 28.85% of the total damages, while tornadoes account for 22.13% and hurricanes for 16.85%. The ten events in the table account for 94.68% of the total economic consequences in the period. Comparing both tables shows that a group of 5 event types appears in both lists: tornado, rest, flash flood, ice storm and flood.
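The overlap between the two rankings can be verified directly from the event types listed in the two tables, for example with `intersect()`:

```r
# Event types copied from the two top-10 tables of this report
health_top10 <- c("tornado", "rest", "excessive heat", "flood", "lightning",
                  "flash flood", "heat", "rip current", "ice storm", "winter storm")
economy_top10 <- c("flood", "tornado", "hurricane (typhoon)", "storm surge/tide", "rest",
                   "flash flood", "hail", "drought", "ice storm", "wildfire")

# Event types that appear in both rankings, in the order of the health table
both <- intersect(health_top10, economy_top10)
print(both)
```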