As global temperatures rise due to human-made global warming, conditions become more favorable for extreme weather. This report investigates the claim that storm rates are increasing.
Description
This data is a subset of the NOAA Atlantic hurricane database best track data, https://www.nhc.noaa.gov/data/#hurdat. The data includes the positions and attributes of 198 tropical storms, measured every six hours during the lifetime of a storm.
Read in the dataset and remove unnecessary columns
Begin by reading the data and performing simple data cleaning operations.
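library(dplyr)    # %>%, count(), filter()
library(ggplot2)  # plotting
library(leaflet)  # interactive maps
# Read the data; 'X' is the row-index column added by the CSV export
df <- read.csv('https://vincentarelbundock.github.io/Rdatasets/csv/dplyr/storms.csv')
# Build a date string, then drop columns that are no longer needed
df$date <- paste(df$year, df$month, df$day, sep='-')
df <- df[ , !(names(df) %in% c('month', 'day', 'hour', 'X'))]
# Replace missing values with 0 (so a diameter of 0 may mean 'not recorded'), then sum the two diameters
df <- df %>% replace(is.na(.), 0)
df$total_diameter <- df$ts_diameter + df$hu_diameter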
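The `date` column built above is a plain character string (the summary below reports it as `Class :character`), so it cannot be used directly for date arithmetic. A minimal sketch of parsing it into real date types follows, using a small toy data frame with made-up values rather than the storms data. It assumes the hours are in UTC.

```r
# Toy rows standing in for the storms data (values are illustrative only)
toy <- data.frame(
  year  = c(1975L, 1975L),
  month = c(6L, 10L),
  day   = c(27L, 2L),
  hour  = c(0L, 18L)
)

# Zero-pad month and day, then parse the string into a Date
toy$date <- as.Date(sprintf("%d-%02d-%02d", toy$year, toy$month, toy$day))

# Optionally keep the observation hour as a full timestamp (UTC assumed)
toy$timestamp <- as.POSIXct(
  sprintf("%s %02d:00", format(toy$date), toy$hour),
  tz = "UTC"
)

# Date arithmetic now works: days between the two observations
as.numeric(toy$date[2] - toy$date[1])
```

To apply the same idea to `df`, the parsing would need to happen before the `month`, `day`, and `hour` columns are dropped.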
Global statistics
summary(df)
## name year lat long
## Emily : 207 Min. :1975 Min. : 7.20 Min. :-109.30
## Bonnie : 185 1st Qu.:1990 1st Qu.:17.50 1st Qu.: -80.70
## Claudette: 180 Median :1999 Median :24.40 Median : -64.50
## Felix : 178 Mean :1998 Mean :24.76 Mean : -64.23
## Alberto : 170 3rd Qu.:2006 3rd Qu.:31.30 3rd Qu.: -48.60
## Danielle : 157 Max. :2015 Max. :51.90 Max. : -6.00
## (Other) :8933
## status category wind pressure
## hurricane :3091 Min. :-1.0000 Min. : 10.00 Min. : 882.0
## tropical depression:2545 1st Qu.:-1.0000 1st Qu.: 30.00 1st Qu.: 985.0
## tropical storm :4374 Median : 0.0000 Median : 45.00 Median : 999.0
## Mean : 0.3214 Mean : 53.49 Mean : 992.1
## 3rd Qu.: 1.0000 3rd Qu.: 65.00 3rd Qu.:1006.0
## Max. : 5.0000 Max. :160.00 Max. :1022.0
##
## ts_diameter hu_diameter date total_diameter
## Min. : 0.00 Min. : 0.000 Length:10010 Min. : 0.00
## 1st Qu.: 0.00 1st Qu.: 0.000 Class :character 1st Qu.: 0.00
## Median : 0.00 Median : 0.000 Mode :character Median : 0.00
## Mean : 58.01 Mean : 7.449 Mean : 65.46
## 3rd Qu.: 80.55 3rd Qu.: 0.000 3rd Qu.: 80.55
## Max. :1001.18 Max. :345.234 Max. :1311.89
##
From 1975 to 2015, how many storms occurred each year, and is the rate increasing?
ydf <- as.data.frame(table(df$year))
colnames(ydf) <- c('Year', 'Frequency')
ggplot(ydf, aes(x = Year, y = Frequency)) + geom_col(aes(fill = Frequency)) + geom_text(aes(label = Frequency), angle = 90, nudge_y=30) + scale_x_discrete(guide = guide_axis(angle = 90))
ydf$Year <- as.integer(as.character(ydf$Year))  # Year is a factor: convert its labels, not its level codes
ggplot(ydf, aes(x=Year, y=Frequency)) + geom_point(shape=18, color="blue") + geom_smooth(method=lm, linetype="dashed", color="darkred", fill="blue")
Both the bar graph and the regression line show an increase in the number of storm records over the 40 years.
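One caveat: `table(df$year)` counts rows, and each row is a six-hourly observation, so a long-lived storm contributes more to a year's total than a short one. A possible way to count distinct storms instead, and to check the trend numerically, is sketched below on toy data. The names and values are made up. Storm names are reused across years, so the sketch treats each (name, year) pair as one storm.

```r
# Toy records: one row per six-hourly observation (values are illustrative only)
toy <- data.frame(
  name = c("Amy", "Amy", "Amy", "Bob", "Cara", "Cara", "Dan", "Eve", "Eve"),
  year = c(1975, 1975, 1975, 1975, 1976, 1976, 1977, 1977, 1977)
)

# Observations per year: this is what table(df$year) measures
obs_per_year <- table(toy$year)

# Distinct storms per year: drop repeated (name, year) pairs first
storms <- unique(toy[, c("name", "year")])
storms_per_year <- as.data.frame(table(storms$year), stringsAsFactors = FALSE)
colnames(storms_per_year) <- c("Year", "Storms")
storms_per_year$Year <- as.integer(storms_per_year$Year)

# Fit a linear trend; the Year coefficient is the change in storms per year
fit <- lm(Storms ~ Year, data = storms_per_year)
coef(fit)["Year"]
```

On the real data, `summary(fit)` would also report a standard error and p-value for the slope, which gives a firmer basis for the trend than the plot alone.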
How do the storms vary by status for each year?
tb1 <- table(df$status, df$year)
barplot(tb1, xlab='Year', ylab='Status Counts', col=c('green','blue','red'))
legend(x=1, y=575, legend=rownames(tb1), fill=c('green','blue','red'))  # rownames(tb1) follows the bar stacking order
The rise in storm records has been accompanied by more hurricane records.
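Raw counts can grow simply because there are more records overall. To see whether hurricanes make up a larger share of each year's records, the table can be normalized by year. A minimal sketch on toy data (made-up values) follows.

```r
# Toy records (values are illustrative only)
toy <- data.frame(
  status = c("hurricane", "tropical storm", "tropical storm", "tropical depression",
             "hurricane", "hurricane", "tropical storm", "tropical depression"),
  year   = c(1975, 1975, 1975, 1975, 1976, 1976, 1976, 1976)
)

# Same status-by-year table as above
tb <- table(toy$status, toy$year)

# margin = 2 normalizes each column (year) so its shares sum to 1
shares <- prop.table(tb, margin = 2)

# Share of hurricane records in each year
shares["hurricane", ]
```

Passing `shares` to `barplot()` in place of the count table would draw every year at the same height, which makes changes in the mix of statuses easier to compare.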
What makes a storm change status?
ggplot(df, aes(x = wind, y = pressure, colour = status)) + geom_point() + labs(title = 'Pressure vs Wind', x = 'Wind (knots)', y = 'Pressure (millibars)', colour = 'Status') + theme(plot.title = element_text(hjust = 0.5))
From the plot above, it looks like hurricanes span a wide range of wind speeds. That can be verified with a box plot.
ggplot(df, aes(x=status, y=wind)) + geom_boxplot(color="black", fill="blue", alpha=0.2)
As expected, hurricanes have the most values outside the box.
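The clean separation between statuses along the wind axis reflects how the classes are conventionally defined: by maximum sustained wind, with tropical depressions at 33 knots or less, tropical storms at 34 to 63 knots, and hurricanes at 64 knots or more. The sketch below encodes those thresholds. The helper name `classify_status` is hypothetical, and the `status` column in the dataset was assigned by NOAA, so individual records may not match this rule exactly.

```r
# Hypothetical helper: map sustained wind (knots) to a status label
# using the conventional thresholds (<= 33, 34-63, >= 64 knots)
classify_status <- function(wind_knots) {
  cut(
    wind_knots,
    breaks = c(-Inf, 33, 63, Inf),  # intervals are closed on the right
    labels = c("tropical depression", "tropical storm", "hurricane")
  )
}

# Example winds on either side of each threshold
as.character(classify_status(c(30, 35, 60, 65, 160)))
```

Comparing `classify_status(df$wind)` against `df$status` with `table()` would show how closely the recorded statuses follow the wind thresholds.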
Which storm has covered the most land area?
It looks like Nadine has covered more land than any other storm.
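The code behind this result is not shown. One possible way to rank storms by size, assuming "area covered" is approximated by the largest `total_diameter` a storm reached, is sketched below on toy data with made-up names and values. Wind-field diameter measures storm size rather than land actually affected, so this is only a rough proxy.

```r
# Toy records (values are illustrative only)
toy <- data.frame(
  name           = c("Ana", "Ana", "Bill", "Bill", "Bill"),
  total_diameter = c(100, 250, 0, 400, 300)
)

# Largest total diameter reached by each storm
peak <- aggregate(total_diameter ~ name, data = toy, FUN = max)

# Sort so the largest storm comes first
peak <- peak[order(-peak$total_diameter), ]
as.character(peak$name[1])
```

Because missing diameters were replaced with 0 during cleaning, storms without diameter measurements would rank last under this approach.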
Which storm name was most popular?
cdf <- count(df, name)
cdf <- filter(cdf, n > 70)
wordcloud(words = cdf$name, freq = cdf$n, color = 'blue', size = 1, shape = "rectangle", backgroundColor = "white")
The name Emily appears in the most records (207), so let’s take a look at the path and status changes of storms with that name.
edf <- filter(df, name == 'Emily')
pal <- colorFactor(c('green','blue','red'), domain = c("tropical depression", "tropical storm", "hurricane"))
edf %>%
  leaflet(width = '100%') %>%
  addTiles() %>%
  setView(lng=-60, lat=32, zoom=3.3) %>%
  addCircleMarkers(lat = ~lat, lng = ~long, popup = edf$name, color=~pal(status), weight=2, stroke=FALSE, fillOpacity = 0.2, label=edf$year)
From the first plot, we saw that 1995 had the most storm records, so let’s take a look at the paths of that year’s storms.
From 1975 to 2015, storms have affected many countries along the east coast, including the US and Mexico. Since the rate of storms is increasing, we need to do more to reduce human-made global warming.