In this paper we explore the economic and public health consequences of severe weather conditions. We use the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, which matches the main characteristics of major weather events in the U.S. with the injuries, fatalities and property damage they caused. Although the database continues to be updated, this analysis uses only the data from 1950 through November 2011.
Analyzing the effects of severe weather is important to policymakers because the results can inform prevention mechanisms for high-risk areas and seasons. It is well known that tornados and floods particularly affect the Caribbean coast, causing large economic and human losses such as property damage, injuries and deaths, and damage to public infrastructure.
For these reasons, this project aims to identify the most harmful types of events recorded in this database. The remainder of this article is organized as follows. Section 2 describes the data processing, including loading the data set and some pre-processing steps. Section 3 presents the results and figures. Finally, Section 4 concludes.
To conduct our analysis, we first load the packages needed to process the data, along with the graphics package.
suppressMessages(library(plyr))
suppressMessages(library(dplyr))
suppressMessages(library(data.table))
suppressMessages(library(ggplot2))
We read the .csv file using the read.csv function with its default settings, and select the variables that we will use in this study.
# Reading data
storms2 <- read.csv("repdata_data_StormData.csv",header = TRUE)
# Select main variables for the analysis
storms <- as.data.table(select(storms2,"EVTYPE","COUNTY","COUNTYNAME","STATE__","STATE",
"FATALITIES","INJURIES","PROPDMG","PROPDMGEXP",
"CROPDMG","CROPDMGEXP"))
Next we fix some formatting issues by converting the column names and the character variables to lowercase. Lowercase column names are easier to read and type. Character variables such as the event type (EVTYPE) can record the same kind of event in both lower and upper case, so we convert them to lowercase to treat those records as a single event type. The same applies to the exponents of the property and crop damage, where a consistent case lets us assign the correct multiplier to each value.
# Change names to lowercase
colnames(storms) <- tolower(names(storms))
# Change character variables to lowercase
storms$evtype <- tolower(storms$evtype)
storms$cropdmgexp <- tolower(storms$cropdmgexp)
storms$propdmgexp <- tolower(storms$propdmgexp)
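The effect of the lowercase step can be illustrated on a small toy vector. This is only a sketch: the labels below are made up for illustration and are not taken from the NOAA file.

```r
# Toy event labels (hypothetical, for illustration only)
evtype <- c("TORNADO", "Tornado", "tornado", "FLOOD", "Flood")

# Number of distinct labels as recorded
n_before <- length(unique(evtype))

# Number of distinct labels after converting to lowercase
n_after <- length(unique(tolower(evtype)))

print(c(before = n_before, after = n_after))
```

Records that differ only in case collapse into a single event type, which is what the later grouping by evtype relies on.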
Now, in order to compute the total amount of damage, we need to express every amount in dollars so that the values can be summed. We also compute the total_damage variable, which is the sum of the crop and property damage. This is done with the following code:
# Redefine exponents in order to compute total values
exponents <- c("?","+","","-","0","1","2","3","4","5","6","7","8","h","k","m","b")
meaning.exp <-c(1,1,1,1,10^0,10^1,10^2,10^3,10^4,10^5,10^6,10^7,10^8,10^2,10^3,10^6,10^9)
suppressMessages(storms$propdmgexp <-as.numeric(mapvalues(storms$propdmgexp,exponents,meaning.exp)))
suppressMessages(storms$cropdmgexp <- as.numeric(mapvalues(storms$cropdmgexp,exponents,meaning.exp)))
# Compute total values for property and crop damage
storms <- storms %>% mutate(propdmg=as.numeric(propdmg)*propdmgexp) %>%
mutate(cropdmg=as.numeric(cropdmg)*cropdmgexp)
storms <- as.data.table(storms)
# Create the total damage variable (in dollars; rescaled to billions at the summary step)
storms[,total_damage:=(cropdmg+propdmg)]
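As a worked illustration of the exponent conversion, the sketch below applies the same letter multipliers to three made-up records. The names exp_lookup and toy are hypothetical, and a named vector is used here as a simple stand-in for the mapvalues call above.

```r
# Hypothetical lookup table using the same letter multipliers as meaning.exp
exp_lookup <- c("h" = 1e2, "k" = 1e3, "m" = 1e6, "b" = 1e9)

# Toy records (made up): 2.5 thousand, 1.2 million and 3 billion dollars
toy <- data.frame(
  propdmg    = c(2.5, 1.2, 3),
  propdmgexp = c("k", "m", "b"),
  stringsAsFactors = FALSE
)

# Multiply each amount by the multiplier that matches its exponent code
toy$propdmg_usd <- toy$propdmg * unname(exp_lookup[toy$propdmgexp])

print(toy)
```

Once every amount is in plain dollars, the property and crop values can be added directly, which is the purpose of the total_damage variable.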
Next, we group by event type and compute the totals for the health and economic variables, as shown below.
# Create the final database that summarizes damage and injuries by events
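storms_by_event <- storms %>% group_by(evtype) %>%
  summarise(total_econ_damage = sum(total_damage, na.rm = TRUE)/1e9,
            # Health total is computed before injuries and fatalities are
            # replaced by their per-event sums in the lines below
            total_health_damage = sum(injuries, na.rm = TRUE) + sum(fatalities, na.rm = TRUE),
            propdmg = sum(propdmg, na.rm = TRUE),
            cropdmg = sum(cropdmg, na.rm = TRUE),
            injuries = sum(injuries, na.rm = TRUE),
            fatalities = sum(fatalities, na.rm = TRUE)
  )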
Finally, to separate the two kinds of damage, we create one table for economic effects and another for health effects.
# Choosing the 10 most harmful events for each type of effect
economic_effect <- arrange(storms_by_event,desc(total_econ_damage))[1:10,]
health_effect <- arrange(storms_by_event,desc(total_health_damage))[1:10,]
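The grouping and ranking steps can be cross-checked on a small toy table. The sketch below uses base R (aggregate and order) rather than the dplyr pipeline above, and the records are made up for illustration.

```r
# Toy records (hypothetical values, for illustration only)
toy <- data.frame(
  evtype     = c("tornado", "tornado", "flood", "flood", "heat"),
  injuries   = c(10, 5, 2, 0, 7),
  fatalities = c(1, 0, 3, 1, 2),
  stringsAsFactors = FALSE
)

# Sum injuries and fatalities within each event type
by_event <- aggregate(cbind(injuries, fatalities) ~ evtype, data = toy, FUN = sum)

# Combined health total per event type
by_event$total_health_damage <- by_event$injuries + by_event$fatalities

# Rank event types from most to least harmful
by_event <- by_event[order(-by_event$total_health_damage), ]

print(by_event)
```

Taking the first rows of a table sorted in this way is what the [1:10,] subsetting does for the full data.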
This section produced two datasets (economic_effect and health_effect), each consisting of the 10 most harmful types of events for the corresponding kind of damage. The next section presents the analysis of these data.
The first part of the analysis focuses on the human aspect. Typhoons, storms and floods usually destroy a great deal of property and infrastructure, but it is widely recognized that the greatest concern of policymakers should be public health and human lives. The following plot shows the ten most harmful event types in these terms:
ggplot(health_effect,aes(x = reorder(evtype,-total_health_damage), total_health_damage))+
geom_bar(stat = "identity", aes(fill = total_health_damage))+
labs(x = "Most Harmful Events",
y = "Total Injuries and Fatalities",
title = "Events Most Harmful to Public Health in the USA (1950-2011)",
caption="Source: U.S. National Oceanic and Atmospheric Administration's (NOAA)")+
scale_fill_gradient2(mid = "darkgreen",high = "darkred",midpoint = median(health_effect$total_health_damage))+
theme(axis.text.x = element_text(angle = 90, hjust = 1),legend.title = element_blank(),
plot.caption = element_text(hjust=0),plot.title = element_text(hjust = 0.5))
The plot shows that tornados are by far the most harmful event in terms of public health, accounting for almost 100 thousand injuries and fatalities. They are followed by excessive heat, with less than 10% of the tornado total. After that, the total health damage decreases at a slower rate.
head(health_effect[,c("evtype","total_health_damage")])
## # A tibble: 6 x 2
##   evtype         total_health_damage
##   <chr>                        <dbl>
## 1 tornado                     96979.
## 2 excessive heat               8428.
## 3 tstm wind                    7461.
## 4 flood                        7259.
## 5 lightning                    6046.
## 6 heat                         3037.
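The gap between the first two event types can be checked with a short calculation. The totals are copied from the table above; the variable names are only illustrative.

```r
# Totals copied from the printed table
tornado <- 96979
excessive_heat <- 8428

# Excessive heat as a share of the tornado total
share <- excessive_heat / tornado

print(round(100 * share, 1))  # percentage, about 8.7
```

The result is consistent with the statement that excessive heat accounts for less than 10% of the tornado total.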
Besides harming human health, extreme weather events also damage the economies of the affected regions. The following figure shows that floods are the event type that causes the most damage to property and crops.
ggplot(economic_effect,aes(x = reorder(evtype,-total_econ_damage), total_econ_damage))+
geom_bar(stat = "identity", aes(fill = total_econ_damage))+
labs(x = "Most Harmful Events",
y = "Total Damages in Billion USD",
title = "Events with the Greatest Economic Damage in the USA (1950-2011)",
caption="Source: U.S. National Oceanic and Atmospheric Administration's (NOAA)")+
scale_fill_gradient2(low = "lightblue",high = "darkblue")+
theme(axis.text.x = element_text(angle = 90, hjust = 1),legend.title = element_blank(),
plot.caption = element_text(hjust=0),plot.title = element_text(hjust = 0.5),panel.background = element_blank() )
The second most damaging type of extreme weather event is the hurricane or typhoon. Interestingly, water-based events are the most damaging for property and, as one would expect, for crops. Most of the other events cause smaller but still significant losses.
We have used the U.S. National Oceanic and Atmospheric Administration (NOAA) storm database to characterize the most harmful events in terms of public health and economic damage. In human terms, tornados are the most important events because of the injuries and fatalities they cause. Property and crop losses, on the other hand, are driven mostly by floods and other water-based extreme events. In terms of public policy, prevention ahead of disasters is the best way to reduce these losses.