INTRODUCTION
Hurricanes are no strangers to people living near the Atlantic coast. For years, storms ranging from category 1 to category 5 have come ashore and left lasting damage on cities and the people who live in them. This article analyzes Atlantic hurricanes from 1965 through 2017. The hurricanes are compared by the damage they caused (in billions of dollars), the number of fatalities, their highest wind speed, and their lowest pressure.
ANALYSIS
Looking at the graph below, one can see that Hurricanes Maria and Katrina caused the highest number of US fatalities over the past 50 years, killing approximately 3,000 and 1,800 people respectively, whereas none of the other hurricanes exceeded 200.
library(ggplot2)
library(ggrepel)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(knitr)
Hurricanes=read.csv("C:/Users/Brian/Downloads/Hurricanes.csv")
# Count the missing values in each column of a data frame
data.na.check = function(data){
  na.totals = NULL
  for(i in seq_along(data)){
    na.totals = append(na.totals, sum(is.na(data[i])))
  }
  # First entry is the row count, followed by one NA count per column
  na.totals = rbind(c(nrow(data), na.totals))
  colnames(na.totals) = c("n.data", names(data))
  rownames(na.totals) = "na.values"
  return(na.totals)
}
# Apply the NA check to the Hurricanes data
data.na.check(Hurricanes)
## n.data Name Damage_billions Year Classification Loc1 Loc2 Loc3
## na.values 55 0 0 0 0 0 0 0
## Loc4 Loc5 Loc6 Fatalities US_Fatalities Highest_winds_mph
## na.values 0 0 0 0 0 0
## Lowest_Pressure_mbar
## na.values 0
Hurricanes = na.omit(Hurricanes)
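The column-by-column loop in data.na.check can also be written with vectorized base R calls. The following is a minimal sketch on a small made-up data frame (the `toy` values are hypothetical and not taken from Hurricanes.csv). It shows `colSums(is.na(...))` producing the per-column counts, and what `na.omit()` does when missing values are present.

```r
# Hypothetical toy data standing in for Hurricanes.csv (values are made up)
toy <- data.frame(
  Name = c("A", "B", "C"),
  Damage_billions = c(1.5, NA, 3.0),
  US_Fatalities = c(10, 0, NA)
)

# Count missing values per column without an explicit loop
na_counts <- colSums(is.na(toy))
print(na_counts)

# na.omit() drops every row that contains at least one NA
toy_complete <- na.omit(toy)
print(nrow(toy_complete))
```

Since the check on the real data reports zero missing values in every column, the `na.omit()` call there leaves all 55 rows in place.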
ggplot(Hurricanes, aes(x=Year, y=US_Fatalities,label=Name))+
geom_point(color='red')+geom_text_repel(size=5)+theme_bw()+
theme(text = element_text(size=20))+
ggtitle("US Fatalities")
To further emphasize the large difference in US fatalities, the graph below plots the same data on a log10 y-axis, which spreads out the hurricanes with low numbers of fatalities.
ggplot(Hurricanes, aes(x=Year, y=US_Fatalities, label=Name)) +
geom_point(color = 'red') + scale_y_log10(breaks = c(1,10,100,1000,2000,3000)) + # 0 is not a valid break on a log10 axis
geom_text_repel(size=5) + theme_bw() + theme(text = element_text(size=20)) +
ggtitle("US Fatalities log10")
## Warning: Transformation introduced infinite values in continuous y-axis
## Warning: Transformation introduced infinite values in continuous y-axis
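The warnings above arise because a log10 axis cannot represent a count of zero: log10(0) is -Inf, so any hurricane with no US fatalities drops off the plot. One common workaround, sketched here on made-up values, is to plot the count plus one. This is an assumption about what a reader might prefer, not what the original analysis did, and the `toy` data frame and `Fatalities_plus1` column are hypothetical names.

```r
library(ggplot2)

# Hypothetical toy data (values are made up for illustration)
toy <- data.frame(
  Year = c(1970, 1990, 2005, 2017),
  US_Fatalities = c(0, 12, 150, 2500)
)

# log10(0) is -Inf, which is what triggers the warning on a log10 axis
print(log10(toy$US_Fatalities))

# Workaround (an assumption, not the original method): shift counts by 1
toy$Fatalities_plus1 <- toy$US_Fatalities + 1

p <- ggplot(toy, aes(x = Year, y = Fatalities_plus1)) +
  geom_point(color = "red") +
  scale_y_log10(breaks = c(1, 10, 100, 1000)) +
  theme_bw() +
  ylab("US fatalities + 1 (log10 scale)")
```

The shift keeps zero-fatality storms visible at the bottom of the axis, at the cost of a y-axis that no longer reads as the raw count.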
With such high numbers of fatalities, one would expect those hurricanes to have caused the most damage as well. The graph below, however, shows that this is not the case. Despite having the highest number of fatalities, Hurricane Maria caused only the third-highest damage in billions of USD, whereas Hurricanes Katrina and Harvey hold the top two spots. This shows that many unaccounted-for factors can affect this type of data. Just because a hurricane caused more deaths does not mean that it was physically more destructive than the rest.
ggplot(Hurricanes, aes(x=Year, y=Damage_billions, label=Name)) +
geom_point(color = 'red') +
geom_text_repel(size=5) + theme_bw() + theme(text = element_text(size=20)) +
ggtitle("Damages in Billions of USD")
So how can a storm cause so many fatalities, yet comparatively little damage? Looking at the bar graph below, you can see that the fatalities from both Maria and Katrina clearly surpass the number of fatalities caused by the other 40 hurricanes, denoted by NA, combined. (Side note: I tried to get rid of the NA label, but my code did not eliminate it.)
Hurricanes %>% filter(!(Name %in% c("Maria", "Katrina"))) %>% summarize(sum(US_Fatalities))
## sum(US_Fatalities)
## 1 1773
Hurricanes$Hurricane = Hurricanes$Name
Hurricanes[rank(-Hurricanes$US_Fatalities)>5,]$Hurricane = "All other hurricanes (40)"
## Warning in `[<-.factor`(`*tmp*`, iseq, value = c("All other hurricanes
## (40)", : invalid factor level, NA generated
Hurricanes %>%
ggplot(aes(Hurricane, US_Fatalities)) +
theme_bw() + theme(text = element_text(size=15)) +
geom_col() + ggtitle("US Hurricane Deaths")
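The NA bar in the chart traces back to the warning shown above: when Name is stored as a factor (the `read.csv` default before R 4.0), assigning a label that is not an existing level produces NA. Below is a minimal sketch of one way around it, using made-up data. The `toy` data frame, the `n_keep` cutoff, and the fatality values are hypothetical, and the label count is computed rather than typed in.

```r
# Hypothetical toy data; Name is deliberately a factor, as read.csv
# produced by default before R 4.0 (values are made up)
toy <- data.frame(
  Name = factor(c("A", "B", "C", "D")),
  US_Fatalities = c(3000, 1800, 50, 20)
)

# Convert to character first so a new label is not an invalid factor level
toy$Hurricane <- as.character(toy$Name)

# Keep the top storms by fatalities and pool the rest under one label
n_keep <- 2
others <- rank(-toy$US_Fatalities) > n_keep
toy$Hurricane[others] <- paste0("All other hurricanes (", sum(others), ")")

# Sum fatalities per label, ready to pass to geom_col()
totals <- aggregate(US_Fatalities ~ Hurricane, data = toy, FUN = sum)
print(totals)
```

Building the label from `sum(others)` keeps the number in parentheses in step with the rows that were actually pooled.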
Perhaps high wind speeds or low pressures offer some explanation. In the two graphs below, Hurricanes Maria and Katrina both sit at the high end of wind speed and at the low end of pressure. This suggests there could be some association between higher wind speeds, lower pressure, and a higher number of fatalities. However, the location of each hurricane matters as well. In the graph of pressure against fatalities, Hurricanes Mitch and Fifi stand out with high fatality counts, which occurred outside of the US.
ggplot(Hurricanes, aes(x=Highest_winds_mph, y=US_Fatalities, label=Name)) +
geom_point(color = 'red') +
geom_text_repel(size=5) + theme_bw() + theme(text = element_text(size=20)) +
ggtitle("Highest Winds MPH against US Fatalities")
ggplot(Hurricanes, aes(x=Lowest_Pressure_mbar, y=Fatalities, label=Name)) +
geom_point(color = 'red') +
geom_text_repel(size=5) + theme_bw() + theme(text = element_text(size=20)) +
ggtitle("Lowest Pressure (mbar) against Fatalities")
However, without a full understanding of, and numerical data on, all the factors involved, we cannot tell what truly drives the number of fatalities and the amount of damage up or down. There are simply too many factors that are not accounted for.
FOLLOW-UP
In general, we assumed that if a hurricane causes a high amount of damage, it will also cause a high number of fatalities. From the graph below, one can see there is no clear correlation between fatalities and the amount of damage.
For example, look at the two hurricanes that made the biggest headlines in 2017, Hurricane Harvey and Hurricane Maria. Hurricane Harvey made landfall in Texas and did most of its damage in the Houston area. With damages of around 125 billion dollars, you would expect a very high number of fatalities. Fortunately, only 107 deaths came from the storm, which is astonishing compared to Hurricane Maria, which had over 3,000 fatalities. Even though Harvey did more damage, Hurricane Maria still caused a steep 91.6 billion dollars' worth of damage. Imagine if Hurricane Maria had received the same amount of aid as Hurricane Harvey. Data on the amount of aid given to these areas would help explain the difference in fatalities between these two hurricanes.
ggplot(Hurricanes, aes(x=Damage_billions, y=Fatalities, label=Name)) +
geom_point(color = 'red') +
geom_text_repel(size=5) + theme_bw() + theme(text = element_text(size=20)) +
ggtitle("Damages in Billions Against Fatalities") + xlab("Damage (billions of dollars)")
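A scatter plot can suggest that two variables are unrelated, but a correlation coefficient puts a number on that impression. The sketch below uses made-up values (the `toy` data frame is hypothetical and not taken from Hurricanes.csv). It uses Spearman's rank correlation, which is our choice here rather than something the original analysis did, because a few extreme storms would otherwise dominate a Pearson correlation.

```r
# Hypothetical toy data (values are made up, not from Hurricanes.csv)
toy <- data.frame(
  Damage_billions = c(1, 5, 20, 60, 90, 125),
  Fatalities = c(40, 8, 300, 15, 3000, 100)
)

# Spearman's rank correlation is less sensitive to a few extreme storms
rho <- cor(toy$Damage_billions, toy$Fatalities, method = "spearman")
print(rho)

# cor.test() adds a p-value for the null hypothesis of no association
result <- cor.test(toy$Damage_billions, toy$Fatalities, method = "spearman")
print(result$p.value)
```

Running the same two calls on `Hurricanes$Damage_billions` and `Hurricanes$Fatalities` would show whether the visual impression from the plot holds up.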
Another natural assumption is that the stronger the winds are, the more damage a hurricane will cause, since more debris will be flying around and more trees will fall. However, the graph below shows no apparent correlation between high wind speeds and higher damage. This further emphasizes that there is a lot of noise in the data and that it is very difficult to fully understand what factors cause a hurricane to make a heavier impact.
ggplot(Hurricanes, aes(x=Highest_winds_mph, y=Damage_billions, label=Name)) +
geom_point(color = 'red') +
geom_text_repel(size=5) + theme_bw() + theme(text = element_text(size=20)) +
ggtitle("Highest Winds Against Damages") + xlab("Wind Speeds (mph)") + ylab("Damages (billions of dollars)")
DISCUSSION
When looking at the hurricane data, we realized that parts of the data are sometimes missing and that there are gaps in the conclusions. We originally wanted to add a graph on the correlation between lowest pressure and wind speed, but it would serve no purpose here, because the original analysis discusses the impact of the hurricanes on the areas they hit. It did not discuss the science of hurricanes or whether a lower pressure means a higher wind speed.
As we said earlier, we would like to see more data on all of these hurricanes. Adding variables such as the amount of aid sent to the area and how many people assisted in the aftermath would be great for examining the impact of these hurricanes. It would also be interesting to see damages as a percentage of the affected country’s wealth. Hurricane Maria would have a much higher percentage than any hurricane that made landfall in the United States, simply because of the differences in wealth. Another variable that would be interesting to see, if we were to do this study again, is media impact. How many people listened to the news? How many people evacuated? As said many times before, there are simply too many factors that are not accounted for.
Every data set can always use more variables; it will only paint a better picture for people that want to analyze it.
LINK OF ORIGINAL ANALYSIS
https://www.r-bloggers.com/plotting-the-impact-of-atlantic-hurricanes-on-the-us/