Data description

The “storms” dataset from the “dplyr” package is a subset of the NOAA (National Oceanic and Atmospheric Administration) Atlantic hurricane database best track data. This dataset includes the positions and attributes of 198 tropical storms, measured every six hours during the lifetime of a storm. The data contains 10,010 observations and 13 variables, including information on the date, time, status and category of the storm.

Summary table:

Variable	Description
name	Name of the storm
year	Year of the storm
month	Month of the storm
day	Day of the storm
hour	Hour of the storm
lat	Latitude of the storm
long	Longitude of the storm
status	Storm classification (Tropical Depression, Tropical Storm, or Hurricane)
category	Saffir-Simpson storm category (estimated from wind speed; -1 = Tropical Depression, 0 = Tropical Storm)
wind	Storm’s maximum sustained wind speed (in knots)
pressure	Air pressure at the storm’s center (in millibars)
ts_diameter	Diameter (in nautical miles) of the area experiencing tropical storm strength winds (34 knots or above)
hu_diameter	Diameter (in nautical miles) of the area experiencing hurricane strength winds (64 knots or above)
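
The figures quoted in the description can be checked directly in R. The following is a minimal sketch, assuming the dplyr package is installed; the exact counts depend on the installed dplyr version, since the storms dataset has been updated over time. The names n_obs, n_vars, and n_storms are illustrative and not part of the original analysis.

```r
library(dplyr)

# Number of observations (rows) and variables (columns)
n_obs  <- nrow(storms)
n_vars <- ncol(storms)

# Storm names are reused across years, so a storm is identified
# (approximately) by the combination of name and year
n_storms <- storms %>%
  distinct(name, year) %>%
  nrow()

c(observations = n_obs, variables = n_vars, storms = n_storms)

# Column names and types at a glance
glimpse(storms)
```

Comparing this output with the numbers in the description is a quick way to confirm which version of the dataset is in use.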

Data visualizations

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(ggplot2)

# Average wind speed for each Saffir-Simpson category
storms %>%
  group_by(category) %>%
  summarise(avgWind = mean(wind)) %>%
  ggplot(aes(x = category, y = avgWind)) +
  geom_point() +
  labs(x = "Category",
       y = "Average Wind Speed (in knots)",
       title = "Average Wind by Category of Storm") +
  theme_minimal()

This first plot illustrates the relationship between storm category and average wind speed, to show how the “Saffir-Simpson storm category” system works. Average wind speed rises steadily with category: the higher the category of the storm, the greater the wind speed it carries. This is expected, since the Saffir-Simpson scale assigns categories based on sustained wind speed.
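
To make the link between category and wind speed concrete, the range of wind speeds within each category can be tabulated. This is a sketch, assuming category and wind are coded as described in the summary table; wind_by_category is an illustrative name. In newer dplyr versions, category may be NA for storms below hurricane strength, in which case those rows form their own group.

```r
library(dplyr)

# Minimum, average, and maximum wind speed (in knots) for each category
wind_by_category <- storms %>%
  group_by(category) %>%
  summarise(
    min_wind = min(wind, na.rm = TRUE),
    avg_wind = mean(wind, na.rm = TRUE),
    max_wind = max(wind, na.rm = TRUE),
    n_obs    = n()
  )

wind_by_category
```

If the categories are derived from wind speed, the min_wind to max_wind ranges of adjacent categories should barely overlap, if at all.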

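# Each row is a six-hourly observation, so the bars count observations, not distinct storms
storms %>%
  ggplot(aes(x = as.factor(month))) +
  geom_bar(aes(fill = category)) +
  labs(x = "Month",
       y = "Number of Observations",
       title = "Monthly Number of Storm Observations by Category",
       fill = "Category") +
  theme_minimal()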

This plot shows how storm activity is distributed throughout the year and which months have historically been most active. Because the dataset has one row per six-hourly observation, each bar counts observations rather than distinct storms, so long-lived storms contribute more to the height of a bar. The graph is meant to show the clear storm “season”: when most activity occurs and when the most intense storms occur. September has the most activity by far, with a roughly bell-shaped distribution around it: counts rise sharply in the months leading up to September and fall off sharply afterwards. This graph has value because it could help authorities plan around the months with the highest storm activity.
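
Since the bar chart counts six-hourly observations, a complementary view is the number of distinct storms active in each month. The following is a minimal sketch, assuming a storm can be identified by its name and year; storms_per_month and n_storms are illustrative names, and a storm that spans two months is counted in both.

```r
library(dplyr)
library(ggplot2)

# Keep one row per storm and month, then count the storms in each month
storms_per_month <- storms %>%
  distinct(name, year, month) %>%
  count(month, name = "n_storms")

# Bar chart of distinct storms per month
storms_per_month %>%
  ggplot(aes(x = as.factor(month), y = n_storms)) +
  geom_col() +
  labs(x = "Month",
       y = "Number of Distinct Storms",
       title = "Distinct Storms Active in Each Month") +
  theme_minimal()
```

Counting storms this way removes the effect of storm duration, so it shows whether the September peak reflects more storms or simply longer-lived ones.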

HW6: Summary of the storms dataset from the dplyr package

Gabriel Bahia de Sousa

12/1/2019
