Data description

The “storms” dataset from the “dplyr” package is a subset of the NOAA (National Oceanic and Atmospheric Administration) Atlantic hurricane database best track data. This dataset includes the positions and attributes of 198 tropical storms, measured every six hours during the lifetime of a storm. The data contains 10,010 observations and 13 variables, including information on the date, time, status and category of the storm.
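The counts quoted above depend on the installed version of dplyr, because the dataset has been updated over time. The following is a minimal sketch for checking them locally. It assumes only that the `name` and `year` columns are present, and it treats each (name, year) pair as one storm, which is an approximation: names are reused across years, and a storm that crosses a year boundary would be counted twice.

```r
library(dplyr)

# Rows and columns in the installed copy of the dataset
dim(storms)

# Approximate number of distinct storms: count unique (name, year) pairs,
# since the same name can be given to storms in different years
n_storms <- storms %>%
  distinct(name, year) %>%
  nrow()
n_storms
```

If the numbers differ from those in the text, the installed dplyr most likely ships a newer release of the data.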

Summary table template:

Variable      Description
name          Name of the storm
year          Year of the storm
month         Month of the storm
day           Day of the storm
hour          Hour of the storm
lat           Latitude of the storm
long          Longitude of the storm
status        Storm classification (Tropical Depression, Tropical Storm, or Hurricane)
category      Saffir-Simpson storm category
wind          Storm’s maximum sustained wind speed (in knots)
pressure      Air pressure at the storm’s center (in millibars)
ts_diameter   Diameter (in nautical miles) of the area experiencing tropical storm strength winds (34 knots or above)
hu_diameter   Diameter (in nautical miles) of the area experiencing hurricane strength winds (64 knots or above)

Data visualizations

library(dplyr)
library(ggplot2)
# Average wind speed for each Saffir-Simpson category
storms %>%
  group_by(category) %>%
  summarise(avgWind = mean(wind)) %>%
  ggplot(aes(x = category, y = avgWind)) +
  geom_point() +
  labs(x = "Category",
       y = "Average Wind Speed (in knots)",
       title = "Average Wind by Category of Storm") +
  theme_minimal()

This first plot illustrates the relationship between storm category and the average wind speed within that category, as a way of understanding how the Saffir-Simpson storm category system works. Average wind speed rises steadily with category. This is expected, because the Saffir-Simpson scale assigns categories from sustained wind speed: the higher the category of the storm, the stronger its winds.
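Because the category is derived from wind speed, the averages in the plot hide how little the categories overlap. As a rough sketch, the range of recorded wind speeds in each category can be tabulated alongside the mean. The column names `min_wind`, `avg_wind`, `max_wind` and `n_obs` are chosen here for illustration. Depending on the dplyr version, observations below hurricane strength may appear as categories -1 and 0 or as a single `NA` group.

```r
library(dplyr)

# Minimum, mean and maximum recorded wind speed (knots) in each category,
# plus the number of six-hourly observations behind each row
wind_by_category <- storms %>%
  group_by(category) %>%
  summarise(
    min_wind = min(wind, na.rm = TRUE),
    avg_wind = mean(wind, na.rm = TRUE),
    max_wind = max(wind, na.rm = TRUE),
    n_obs    = n()
  )
wind_by_category
```

Comparing `max_wind` in one row with `min_wind` in the next shows where the category boundaries fall in this data.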

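# geom_bar() counts rows, so each bar is the number of six-hourly
# observations recorded in that month, not the number of distinct storms
storms %>%
  ggplot(aes(x = as.factor(month))) +
  geom_bar(aes(fill = category)) +
  labs(x = "Month",
       y = "Number of Observations",
       title = "Monthly Storm Observations by Category",
       fill = "Category") +
  theme_minimal()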

This plot shows how storm activity is distributed across the year and which months have historically been the most active. It is meant to show the storm “season”: the months in which most activity occurs, and in which the most intense storms occur. Each bar counts six-hourly observations rather than distinct storms, so long-lived storms contribute more to a bar. September has by far the most observations, and the counts form a roughly bell-shaped curve around it: activity rises sharply in the months leading up to September and falls off sharply afterwards. This pattern is useful because it could help authorities plan for the months with the highest storm activity.
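The bar chart counts rows, and each storm contributes one row per six-hour observation, so long-lived storms weigh more heavily in it. One possible way to check that the seasonal pattern holds for storms themselves is to count each storm once per month in which it was recorded. This is a sketch that again treats a (name, year) pair as one storm, and `n_storms` is a column name chosen for illustration.

```r
library(dplyr)

# Keep one row per storm per month, then count storms in each month
storms_per_month <- storms %>%
  distinct(name, year, month) %>%
  count(month, name = "n_storms")
storms_per_month
```

A storm that spans two months is counted in both, which suits a question about which months see storm activity.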