Data 110 - Final Project

Author

Andrew George

Global Warming in the Oceans

Source: Climate Reanalyzer - University of Maine

Introduction

2023 was the hottest year on record. Since then, scientists have been puzzled by why temperature anomalies, particularly in the oceans, have increased so dramatically since records started breaking around March 2023. They have been debating whether these increases can be attributed primarily to global warming, or whether they are due to natural variability, namely ENSO (El Nino Southern Oscillation), or to other factors such as changes in air pollution(1). As a result, multiple early predictions have already suggested that the 2024 Atlantic hurricane season could be among the most active on record(2) because the warm waters are favorable for development. In this project, I will examine some of the interconnections between temperature anomalies, ENSO and hurricanes using multiple data sets. I selected this topic because last semester I took a meteorology class and found it really interesting. I believe it is important to communicate how the issue of global warming is worsening.

First, I will use a data set from NOAA that contains observations of ENSO indices. Next, I will compile three NOAA temperature anomaly series (global, land, and ocean) into one data set. Lastly, I will build a data frame that contains data from the past 50 Atlantic hurricane seasons, sourced from Colorado State University. Temperature anomalies, whether global, land, ocean or ENSO, are compiled from satellite and surface observations, including buoys in the ocean. The hurricane data was calculated by CSU’s tropical meteorology branch using data from the National Hurricane Center.

In the ENSO data set, I will use the variable year, the anomaly variable ‘anom’, which refers to the ENSO region ocean anomalies in the tropical Pacific, and the categorical variable ‘seas’, which refers to a three-month period (as in AMJ = April, May, June). Because of the format this data set came in, I will have to do a lot of renaming. Additionally, this data set from the CPC did not include an ENSO phase identifier, so I will add one. In my hurricane data frame I will use the variable year, the variable tropical storms, which refers to the number of named tropical or subtropical storms (tropical cyclones with winds of at least 39 mph), the variable hurricanes (tropical cyclones with winds of at least 74 mph), the variable major hurricanes (tropical cyclones with winds of at least 111 mph) and the variable ace (Accumulated Cyclone Energy). ACE is not only a measure of a storm’s strength; it also factors in a storm’s longevity. The ACE of each storm in a season is added up, and that total is the value assigned to the variable. That is why ACE is, in most cases, a much better indicator of a season’s activity level than the number of storms alone. Hence, the CPC categorizes a season’s activity level based on its ACE index, and I will add that category as a variable in my data frame(3).

Loading everything in

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(highcharter)
Warning: package 'highcharter' was built under R version 4.3.3
Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo 
Highcharts (www.highcharts.com) is a Highsoft software product which is
not free for commercial and Governmental use
library(ggfortify)
library(ggthemes)
setwd("C:/Users/andre/Downloads/Data 110")
temp_anomoly <- read_csv("1850-2023.csv")
Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
  dat <- vroom(...)
  problems(dat)
Rows: 178 Columns: 1
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Global Land and Ocean January - December Average Temperature Anomalies

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
oc_temp_anomoly <- read_csv("1850-2023ocean.csv")
Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
  dat <- vroom(...)
  problems(dat)
Rows: 178 Columns: 1
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Global Ocean January - December Average Temperature Anomalies

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ld_temp_anomoly <- read_csv("1850-2023land.csv")
Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
  dat <- vroom(...)
  problems(dat)
Rows: 178 Columns: 1
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Global Land January - December Average Temperature Anomalies

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
##Source: https://www.ncei.noaa.gov/access/monitoring/global-temperature-anomalies/anomalies
url <- "https://www.cpc.ncep.noaa.gov/data/indices/oni.ascii.txt"
enso_data <- read_table(url)

── Column specification ────────────────────────────────────────────────────────
cols(
  SEAS = col_character(),
  YR = col_double(),
  TOTAL = col_double(),
  ANOM = col_double()
)
head(temp_anomoly)
# A tibble: 6 × 1
  `Global Land and Ocean January - December Average Temperature Anomalies`
  <chr>                                                                   
1 Units: Degrees Celsius                                                  
2 Base Period: 1901-2000                                                  
3 Missing: -999                                                           
4 Year,Anomaly                                                            
5 1850,-0.20                                                              
6 1851,-0.10                                                              
head(enso_data)
# A tibble: 6 × 4
  SEAS     YR TOTAL  ANOM
  <chr> <dbl> <dbl> <dbl>
1 DJF    1950  24.7 -1.53
2 JFM    1950  25.2 -1.34
3 FMA    1950  25.8 -1.16
4 MAM    1950  26.1 -1.18
5 AMJ    1950  26.3 -1.07
6 MJJ    1950  26.3 -0.85
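
The parsing warnings above come from the metadata lines at the top of each NOAA file. One way to avoid them is to skip those lines when reading. This is a minimal sketch, assuming the files have the layout that head() shows (a title line, three metadata lines, then a Year,Anomaly header). The helper name read_anomaly and the temporary demo file are hypothetical and not part of the original project.

```r
library(readr)

# Hypothetical helper: assumes four lines (title + 3 metadata lines)
# come before the "Year,Anomaly" header, as suggested by head() above
read_anomaly <- function(path, value_name) {
  dat <- read_csv(path, skip = 4, na = "-999",
                  col_types = cols(Year = col_double(), Anomaly = col_double()))
  names(dat) <- c("year", value_name)
  dat
}

# Small stand-in file that mimics the layout of the NOAA download
demo_file <- tempfile(fileext = ".csv")
writeLines(c(
  "Global Land and Ocean January - December Average Temperature Anomalies",
  "Units: Degrees Celsius",
  "Base Period: 1901-2000",
  "Missing: -999",
  "Year,Anomaly",
  "1850,-0.20",
  "1851,-0.10"
), demo_file)

demo_anomaly <- read_anomaly(demo_file, "temperature_anomaly")
print(demo_anomaly)
```

Read this way, the columns arrive as numbers, so the row filtering and the later numeric conversion would not be needed. If the real files have a different number of header lines, the skip value would have to change.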

Cleaning anomaly data

temp_anomaly <- temp_anomoly[temp_anomoly$`Global Land and Ocean January - December Average Temperature Anomalies` != "Year,Anomaly",]
temp_anomaly <- temp_anomaly[temp_anomaly$`Global Land and Ocean January - December Average Temperature Anomalies` != "Units: Degrees Celsius",]
temp_anomaly <- temp_anomaly[temp_anomaly$`Global Land and Ocean January - December Average Temperature Anomalies` != "Missing: -999",]
temp_anomaly <- temp_anomaly[temp_anomaly$`Global Land and Ocean January - December Average Temperature Anomalies` != "Base Period: 1901-2000",]
temp_anomaly <- temp_anomaly |>
  separate_wider_delim(`Global Land and Ocean January - December Average Temperature Anomalies`, delim = ",",
                       names = c("year", "temperature_anomaly"))
ld_temp_anomaly <- ld_temp_anomoly[ld_temp_anomoly$`Global Land January - December Average Temperature Anomalies` != "Year,Anomaly",]
ld_temp_anomaly <- ld_temp_anomaly[ld_temp_anomaly$`Global Land January - December Average Temperature Anomalies` != "Units: Degrees Celsius",]
ld_temp_anomaly <- ld_temp_anomaly[ld_temp_anomaly$`Global Land January - December Average Temperature Anomalies` != "Missing: -999",]
ld_temp_anomaly <- ld_temp_anomaly[ld_temp_anomaly$`Global Land January - December Average Temperature Anomalies` != "Base Period: 1901-2000",]
ld_temp_anomaly <- ld_temp_anomaly |>
  separate_wider_delim(`Global Land January - December Average Temperature Anomalies`, delim = ",",
                       names = c("year", "land_temperature_anomaly"))
oc_temp_anomaly <- oc_temp_anomoly[oc_temp_anomoly$`Global Ocean January - December Average Temperature Anomalies` != "Year,Anomaly",]
oc_temp_anomaly <- oc_temp_anomaly[oc_temp_anomaly$`Global Ocean January - December Average Temperature Anomalies` != "Units: Degrees Celsius",]
oc_temp_anomaly <- oc_temp_anomaly[oc_temp_anomaly$`Global Ocean January - December Average Temperature Anomalies` != "Missing: -999",]
oc_temp_anomaly <- oc_temp_anomaly[oc_temp_anomaly$`Global Ocean January - December Average Temperature Anomalies` != "Base Period: 1901-2000",]
oc_temp_anomaly <- oc_temp_anomaly |>
  separate_wider_delim(`Global Ocean January - December Average Temperature Anomalies`, delim = ",",
                       names = c("year", "ocean_temperature_anomaly"))
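
The three cleaning blocks above repeat the same steps, so they could be wrapped in one function. This is a sketch only: clean_anomaly is a hypothetical helper name, and it assumes each raw tibble has a single text column like the ones loaded earlier.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper (not in the original): drops the metadata rows in one
# filter, then splits the single text column into year and anomaly columns
clean_anomaly <- function(raw, value_name) {
  metadata_rows <- c("Year,Anomaly", "Units: Degrees Celsius",
                     "Missing: -999", "Base Period: 1901-2000")
  raw |>
    rename(line = 1) |>                      # give the long column a short name
    filter(!line %in% metadata_rows) |>
    separate_wider_delim(line, delim = ",", names = c("year", value_name))
}

# Stand-in for one of the raw tibbles read above
demo_raw <- tibble(`Global Ocean January - December Average Temperature Anomalies` = c(
  "Units: Degrees Celsius", "Base Period: 1901-2000", "Missing: -999",
  "Year,Anomaly", "1850,-0.06", "1851,0.01"))

demo_clean <- clean_anomaly(demo_raw, "ocean_temperature_anomaly")
print(demo_clean)
```

As in the original code, the resulting columns are still character strings and need converting to numeric afterwards.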

Combining anomalies into one data set

complete_anomaly <- inner_join(ld_temp_anomaly, oc_temp_anomaly) 
Joining with `by = join_by(year)`
complete_anomaly2 <- inner_join(temp_anomaly, complete_anomaly)
Joining with `by = join_by(year)`
head(complete_anomaly2)
# A tibble: 6 × 4
  year  temperature_anomaly land_temperature_anomaly ocean_temperature_anomaly
  <chr> <chr>               <chr>                    <chr>                    
1 1850  -0.20               -0.52                    -0.06                    
2 1851  -0.10               -0.33                    0.01                     
3 1852  -0.06               -0.28                    0.04                     
4 1853  -0.11               -0.40                    0.02                     
5 1854  -0.07               -0.21                    -0.01                    
6 1855  -0.09               -0.31                    0.01                     

Converting anomaly data to numeric

The columns came in as character strings, so I changed them to numeric variables.

complete_anomaly2$year <- as.numeric(complete_anomaly2$year) 
complete_anomaly2$temperature_anomaly <- as.numeric(complete_anomaly2$temperature_anomaly) 
complete_anomaly2$land_temperature_anomaly <-           
as.numeric(complete_anomaly2$land_temperature_anomaly) 
complete_anomaly2$ocean_temperature_anomaly <-  
as.numeric(complete_anomaly2$ocean_temperature_anomaly)
head(complete_anomaly2)
# A tibble: 6 × 4
   year temperature_anomaly land_temperature_anomaly ocean_temperature_anomaly
  <dbl>               <dbl>                    <dbl>                     <dbl>
1  1850               -0.2                     -0.52                     -0.06
2  1851               -0.1                     -0.33                      0.01
3  1852               -0.06                    -0.28                      0.04
4  1853               -0.11                    -0.4                       0.02
5  1854               -0.07                    -0.21                     -0.01
6  1855               -0.09                    -0.31                      0.01
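
The same conversion can be written in one step with across(). This is a minimal sketch on a small stand-in tibble (demo_chr is a made-up name), assuming every column should become numeric.

```r
library(dplyr)

# Stand-in for complete_anomaly2 before conversion (all columns are character)
demo_chr <- tibble(year = c("1850", "1851"),
                   temperature_anomaly = c("-0.20", "-0.10"),
                   ocean_temperature_anomaly = c("-0.06", "0.01"))

# Convert every column to numeric in a single mutate
demo_num <- demo_chr |>
  mutate(across(everything(), as.numeric))
print(demo_num)
```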

Cleaning ENSO data and adding a phase identifier

As stated by the Climate Prediction Center, El Nino is active when temperature anomalies in the eastern equatorial Pacific are on average greater than 0.5 °C, while La Nina is active when temperature anomalies in the eastern equatorial Pacific are on average less than -0.5 °C. ENSO neutral is when neither El Nino nor La Nina is active(4). The data set from the CPC did not include an identifier, so I will add one.

names(enso_data) <- tolower(names(enso_data))
enso_data <- enso_data |>
  select(-total) |>
  rename("three_month_period" = `seas`,
         "year" = `yr`,
         "c_anomaly_avg" = `anom`) |>
  ## adding an identifier
  mutate(enso_phase = case_when(
    c_anomaly_avg > 0.5 ~ "El Nino",
    -0.5 <= c_anomaly_avg & c_anomaly_avg <= 0.5 ~ "Neutral",
    c_anomaly_avg < -0.5 ~ "La Nina")
  )
tail(enso_data)
# A tibble: 6 × 4
  three_month_period  year c_anomaly_avg enso_phase
  <chr>              <dbl>         <dbl> <chr>     
1 SON                 2023          1.78 El Nino   
2 OND                 2023          1.92 El Nino   
3 NDJ                 2023          1.95 El Nino   
4 DJF                 2024          1.79 El Nino   
5 JFM                 2024          1.49 El Nino   
6 FMA                 2024          1.15 El Nino   

Building Atlantic Hurricane (Tropical Cyclone) Data frame

atl_hurricanes <- data.frame(year = c(1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019 , 2020, 2021, 2022, 2023), tropical_storms = c(11, 9, 10, 6, 12, 9, 11, 12, 6, 4, 13, 11, 6, 7, 12, 11, 14, 8, 7, 8, 7, 19, 13, 8, 14, 12, 15, 15, 12, 16, 15, 28, 10, 15, 16, 9, 19, 19, 19, 14, 8, 11, 15, 17, 15, 18, 30, 21, 14, 20), hurricanes = c(4, 6, 6, 5, 5, 6, 9, 7, 2, 3, 5, 7, 4, 3, 5, 7, 8, 4, 4, 4, 3, 11, 9, 3, 10, 8, 8, 9, 4, 7, 9, 15, 5, 6, 8, 3, 12, 7, 10, 2, 6, 4, 7, 10, 8, 6, 14, 7, 8, 7), major_hurricanes = c(2, 3, 2, 1, 2, 2, 2, 3, 1, 1, 1, 3, 0, 1, 3, 2, 1, 2, 1, 1, 0, 5, 6, 1, 3, 5, 3, 4, 2, 3, 6, 7, 2, 2, 5, 2, 5, 4, 2, 0, 2, 2, 4, 6, 2, 3, 7, 4, 2, 3), ace = c(68.4, 76.1, 84.2, 25.3, 63.2, 92.9, 148.9, 100.3, 31.5, 17.4, 84.3, 88, 35.8, 34.4, 103, 135.1, 96.8, 35.5, 76.2, 38.7, 32, 227.1, 166.2, 40.9, 181.8, 176.5, 119.1, 110.1, 67.4, 176.3, 226.9, 245.3, 83.3, 73.9, 145.7, 52.6, 165.5, 126.3, 132.6, 36.1, 66.7, 62.7, 141.3, 224.9, 132.6, 132.2, 179.8, 145.7, 95.1, 145.6))
## Source: 
##https://www.nhc.noaa.gov/data/tcr/index.php?season=1995&basin=atl
##https://tropical.atmos.colostate.edu/Realtime/index.php?arch&loc=northatlantic
tail(atl_hurricanes)
   year tropical_storms hurricanes major_hurricanes   ace
45 2018              15          8                2 132.6
46 2019              18          6                3 132.2
47 2020              30         14                7 179.8
48 2021              21          7                4 145.7
49 2022              14          8                2  95.1
50 2023              20          7                3 145.6

Adding a hurricane season identifier

As stated by the CPC, a below normal season has an ACE index less than 73, a near normal season has an ACE index between 73 and 126.1, an above normal season has an ACE index between 126.1 and 159.6, and an extremely active season has an ACE index above 159.6(3).

atl_hurricanes <- atl_hurricanes |>
 mutate(activity_level = case_when(
    ace < 73 ~ "Below Normal",
    73 <= ace & ace <= 126.1 ~ "Near Normal",
    126.1 < ace & ace <= 159.6 ~ "Above Normal",
    ace > 159.6 ~ "Extremely Active")
  )
tail(atl_hurricanes)
   year tropical_storms hurricanes major_hurricanes   ace   activity_level
45 2018              15          8                2 132.6     Above Normal
46 2019              18          6                3 132.2     Above Normal
47 2020              30         14                7 179.8 Extremely Active
48 2021              21          7                4 145.7     Above Normal
49 2022              14          8                2  95.1      Near Normal
50 2023              20          7                3 145.6     Above Normal

Now that I am done prepping and cleaning, I am ready for the exploratory phase.

Exploring the distribution of ACE

ggplot(atl_hurricanes, aes(x = ace)) +
  geom_density()

This graph shows a right skew in the ACE distribution: the mode is at about 75 ACE, while the mean sits further to the right at roughly 107 ACE.
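
A quick numeric check on the shape of the distribution is to compare the mean with the median. This is a minimal sketch using the ACE values copied from the atl_hurricanes data frame above; the names ace_mean and ace_median are just for illustration.

```r
# ACE values copied from the atl_hurricanes data frame (1974-2023)
ace <- c(68.4, 76.1, 84.2, 25.3, 63.2, 92.9, 148.9, 100.3, 31.5, 17.4,
         84.3, 88, 35.8, 34.4, 103, 135.1, 96.8, 35.5, 76.2, 38.7,
         32, 227.1, 166.2, 40.9, 181.8, 176.5, 119.1, 110.1, 67.4, 176.3,
         226.9, 245.3, 83.3, 73.9, 145.7, 52.6, 165.5, 126.3, 132.6, 36.1,
         66.7, 62.7, 141.3, 224.9, 132.6, 132.2, 179.8, 145.7, 95.1, 145.6)

ace_mean <- mean(ace)
ace_median <- median(ace)

# A mean above the median points to a longer tail on the right side
c(mean = ace_mean, median = ace_median)
```

This only summarizes the skew direction; the density plot is still the better picture of the overall shape.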

What about the distribution of activity levels?

ggplot(atl_hurricanes, aes(x = activity_level)) +
  geom_bar()

Since 1974, below normal hurricane seasons have been the most common, followed by near normal. Interestingly, in the past 50 years there have been 10 above normal seasons and 10 extremely active seasons, when you would expect extremely active seasons to be rarer than above normal ones.

When is ENSO strongest?

Let's find out what time of the year ENSO peaks, whether El Nino or La Nina

enso_data |>
  mutate(abs_value = abs(c_anomaly_avg)) |>
  group_by(three_month_period) |>
  summarize(avg_temp = mean(abs_value))
# A tibble: 12 × 2
   three_month_period avg_temp
   <chr>                 <dbl>
 1 AMJ                   0.477
 2 ASO                   0.702
 3 DJF                   0.865
 4 FMA                   0.570
 5 JAS                   0.596
 6 JFM                   0.726
 7 JJA                   0.53 
 8 MAM                   0.496
 9 MJJ                   0.478
10 NDJ                   0.912
11 OND                   0.891
12 SON                   0.806

This tells us that El Nino and La Nina typically peak around the turn of the year, most often in the November-January (NDJ) period centered on December. It also tells us that ENSO is likely to be weak or neutral, and transitioning between phases, near the middle of the year.
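
The summary table is sorted alphabetically, which hides the seasonal pattern. A sketch of one way to show it in calendar order is below; season_order and peak_by_season are names chosen for this example, and the averages are copied from the table above.

```r
library(dplyr)

# Overlapping three-month seasons in calendar order
season_order <- c("DJF", "JFM", "FMA", "MAM", "AMJ", "MJJ",
                  "JJA", "JAS", "ASO", "SON", "OND", "NDJ")

# Values copied from the summary table above
peak_by_season <- tibble(
  three_month_period = c("AMJ", "ASO", "DJF", "FMA", "JAS", "JFM",
                         "JJA", "MAM", "MJJ", "NDJ", "OND", "SON"),
  avg_temp = c(0.477, 0.702, 0.865, 0.570, 0.596, 0.726,
               0.530, 0.496, 0.478, 0.912, 0.891, 0.806))

# Turning the label into an ordered factor makes arrange() follow the calendar
peak_by_season <- peak_by_season |>
  mutate(three_month_period = factor(three_month_period, levels = season_order)) |>
  arrange(three_month_period)
print(peak_by_season)
```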

Looking at the evolution of ENSO

compare_enso <- enso_data |>
  ## to classify a year's overarching enso phase (especially during hurricane season) I will use the three month period NDJ because that is where enso is likely to peak after steadily strengthening from mid year
  filter(three_month_period == "NDJ")
ggplot(compare_enso, aes(x = year, y = c_anomaly_avg)) +
  geom_line()

This graph shows that the oscillation of ENSO has remained fairly steady since 1950. It also shows that after strong ENSO events, anomalies tend to swing much faster in the other direction, as in the late 1990s, while during weaker ENSO events anomalies do not vary as much.

How many years of each phase of ENSO?

ggplot(compare_enso, aes(x = enso_phase)) +
  geom_bar()

El Nino years seem to be the most common, followed by La Nina and neutral years by a small margin.

Combining ENSO with global anomalies to prep for plot 1

compare_anomaly <- complete_anomaly2 |>
    ## filtering these years because ENSO data has only been available since 1950
    filter(year > 1949 & year < 2024) |>
    mutate("enso" = compare_enso$enso_phase) |>
    mutate("enso_anomaly" = abs(compare_enso$c_anomaly_avg))
tail(compare_anomaly)
# A tibble: 6 × 6
   year temperature_anomaly land_temperature_anom…¹ ocean_temperature_an…² enso 
  <dbl>               <dbl>                   <dbl>                  <dbl> <chr>
1  2018                0.87                    1.34                   0.66 El N…
2  2019                0.98                    1.52                   0.74 El N…
3  2020                1.02                    1.66                   0.73 La N…
4  2021                0.87                    1.39                   0.63 La N…
5  2022                0.9                     1.4                    0.67 La N…
6  2023                1.19                    1.81                   0.91 El N…
# ℹ abbreviated names: ¹​land_temperature_anomaly, ²​ocean_temperature_anomaly
# ℹ 1 more variable: enso_anomaly <dbl>
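
The mutate() calls above attach the ENSO columns by position, which only works while both tables cover exactly the same years in the same order. A join on year is a safer alternative. This is a sketch on small stand-in tables: demo_temps and demo_enso are made-up names, and apart from the 2023 values the ENSO anomalies are illustrative placeholders, not real ONI readings.

```r
library(dplyr)

# Stand-in for complete_anomaly2 (ocean values taken from the output above)
demo_temps <- tibble(year = c(2021, 2022, 2023),
                     ocean_temperature_anomaly = c(0.63, 0.67, 0.91))

# Stand-in for compare_enso; it deliberately covers a different span of years.
# Only the 2023 row matches the data above, the rest are placeholders.
demo_enso <- tibble(year = c(2022, 2023, 2024),
                    c_anomaly_avg = c(-0.9, 1.95, -0.6),
                    enso_phase = c("La Nina", "El Nino", "La Nina"))

# Matching on year keeps rows aligned even if the two tables differ in length
demo_joined <- demo_temps |>
  inner_join(
    demo_enso |>
      transmute(year, enso = enso_phase, enso_anomaly = abs(c_anomaly_avg)),
    by = "year")
print(demo_joined)
```

With this approach the year filter is no longer needed to keep the rows aligned, because years missing from either table are dropped by the join.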

Plot 1

cols <- c("red", "blue", "lightgrey")
highchart() |>
  hc_add_series(data = compare_anomaly,
                   type = "bubble",
                   hcaes(x = year,
                   y = ocean_temperature_anomaly, 
                    group = enso,
                   size = enso_anomaly)) |>
  hc_add_theme(hc_theme_darkunica()) |>
  hc_colors(cols) |>
  hc_title(text="Ocean Temperature Anomalies since 1950") |>
  hc_subtitle(text="Size based on the strength of the ENSO event") |>
  hc_xAxis(title = list(text="Year")) |>
  hc_yAxis(title = list(text="Temp in C")) |>
  hc_caption(text = "Source: NOAA") |>
  hc_tooltip(borderColor = "black",
             pointFormat = "Year: {point.year}<br>Ocean Temperature Anomaly: {point.ocean_temperature_anomaly}<br>Land Temperature Anomaly: {point.land_temperature_anomaly}<br>Average Temperature Anomaly: {point.temperature_anomaly}"
  )

First, this plot shows that temperature anomalies have generally been increasing since 1950, and at a faster pace in recent years. El Nino years are somewhat associated with higher ocean, global, and even land temperature anomalies, while La Nina years are somewhat associated with lower temperature anomalies. Recall that El Nino is the positive anomaly phase of the ENSO region, so if a large portion of the Pacific has above-average temperatures, it would not be surprising for the rest of the world to be hotter as well. This makes sense considering that the oceans store about 91% of the planet’s excess heat(5). The opposite is probably true of La Nina, which would have a cooling effect on global temperatures. Because the oceans store so much of the planet’s heat and ENSO region anomalies can have big consequences, even small changes in temperature anomalies can have significant impacts on global climate. Two more things are worth noting. From 2020 to 2022 we had a very rare triple-dip La Nina, which means El Nino was absent for over 3 years. From the graph, it looks like La Nina was able to level off some global warming, with even a bit of cooling from 2020 to 2021, which supports the idea above. However, after the triple dip finally ended last year, we ended up with a rare super El Nino(6). So perhaps the warming effect of the super El Nino is at least partly responsible for the massive jump in temperature anomalies during 2023 shown in the graph.

Combining for plot 2 prep

For the next plot I will explore how ENSO affects Atlantic hurricanes

compare_enso2 <- compare_enso |>
  ## filtering to match the hurricane data frame
  filter(year > 1973 & year < 2024) 
compare_hurricanes <- atl_hurricanes |>
    mutate("enso" = compare_enso2$enso_phase)
tail(compare_hurricanes)
   year tropical_storms hurricanes major_hurricanes   ace   activity_level
45 2018              15          8                2 132.6     Above Normal
46 2019              18          6                3 132.2     Above Normal
47 2020              30         14                7 179.8 Extremely Active
48 2021              21          7                4 145.7     Above Normal
49 2022              14          8                2  95.1      Near Normal
50 2023              20          7                3 145.6     Above Normal
      enso
45 El Nino
46 El Nino
47 La Nina
48 La Nina
49 La Nina
50 El Nino

Effects of ENSO on Atlantic hurricanes

Let's take a look at the averages based on ENSO

compare_hurricanes |>
  group_by(enso) |>
  summarize(avg_h = mean(hurricanes),
            avg_ace = mean(ace))
# A tibble: 3 × 3
  enso    avg_h avg_ace
  <chr>   <dbl>   <dbl>
1 El Nino  4.89    76.8
2 La Nina  8.11   137. 
3 Neutral  6.77   105. 

ENSO seems to have considerable effects on hurricane activity. For plot 2 I will visualize this information.

Creating a complete data frame

To continue for plot 2, 3 and 4 I will need to combine all the data that overlaps

compare_anomaly2 <- complete_anomaly2 |>
  filter(year > 1973 & year < 2024)    
complete_df <- atl_hurricanes |>
  mutate("ocean_anomaly" = compare_anomaly2$ocean_temperature_anomaly) |>
  mutate("enso" = compare_enso2$enso_phase)
tail(complete_df)
   year tropical_storms hurricanes major_hurricanes   ace   activity_level
45 2018              15          8                2 132.6     Above Normal
46 2019              18          6                3 132.2     Above Normal
47 2020              30         14                7 179.8 Extremely Active
48 2021              21          7                4 145.7     Above Normal
49 2022              14          8                2  95.1      Near Normal
50 2023              20          7                3 145.6     Above Normal
   ocean_anomaly    enso
45          0.66 El Nino
46          0.74 El Nino
47          0.73 La Nina
48          0.63 La Nina
49          0.67 La Nina
50          0.91 El Nino

Lengthening for plot 2

For plot 2 I will need to lengthen the data, but for the following plots I do not. That is why I split completing the data frame into 2 steps.

graph_p2 <- complete_df |>
  pivot_longer(
    cols = ends_with("s"),
    names_to = "type",
    values_to = "number")
tail(graph_p2)
# A tibble: 6 × 7
   year   ace activity_level ocean_anomaly enso    type             number
  <dbl> <dbl> <chr>                  <dbl> <chr>   <chr>             <dbl>
1  2022  95.1 Near Normal             0.67 La Nina tropical_storms      14
2  2022  95.1 Near Normal             0.67 La Nina hurricanes            8
3  2022  95.1 Near Normal             0.67 La Nina major_hurricanes      2
4  2023 146.  Above Normal            0.91 El Nino tropical_storms      20
5  2023 146.  Above Normal            0.91 El Nino hurricanes            7
6  2023 146.  Above Normal            0.91 El Nino major_hurricanes      3
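
Selecting columns with ends_with("s") works here because only the three storm-count columns end in "s", but it would also pick up any future column with that ending. Naming the columns explicitly is a more defensive option. This is a minimal sketch on a small stand-in data frame (demo_df is a made-up name) with values taken from the output above.

```r
library(dplyr)
library(tidyr)

# Stand-in for complete_df with two seasons
demo_df <- tibble(year = c(2022, 2023),
                  tropical_storms = c(14, 20),
                  hurricanes = c(8, 7),
                  major_hurricanes = c(2, 3),
                  ace = c(95.1, 145.6))

# Explicit column names instead of ends_with("s")
demo_long <- demo_df |>
  pivot_longer(cols = c(tropical_storms, hurricanes, major_hurricanes),
               names_to = "type",
               values_to = "number")
print(demo_long)
```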

Converting to factors and then reordering the factors to make plot 2 look more organized

graph_p2$type <- as.factor(graph_p2$type)
graph_p2$enso <- as.factor(graph_p2$enso)
## reordering
graph_p2$type <- factor(graph_p2$type, levels=c('tropical_storms',
                                                'hurricanes',
                                                'major_hurricanes'))
graph_p2$enso <- factor(graph_p2$enso, levels=c('La Nina',
                                                'Neutral',
                                                'El Nino'))
levels(graph_p2$type)
[1] "tropical_storms"  "hurricanes"       "major_hurricanes"
levels(graph_p2$enso)
[1] "La Nina" "Neutral" "El Nino"

Plot 2

ggplot(graph_p2, aes(type, number, fill = type)) +
  geom_boxplot() + 
  scale_fill_manual(values = c("gold", "red", "magenta"), 
                    labels = c("Tropical Storms", "Hurricanes", "Major Hurricanes")) +
  theme_economist() +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank()) +
  facet_wrap(~enso) +
  labs(x = "", y = "Number", 
       title = "Effects of ENSO on Atlantic Tropical Cyclones",
       fill = "Type of Tropical Cyclone",
       caption = "Source: NOAA and NHC")

This graph shows that during La Nina the median number of tropical storms, hurricanes, and major hurricanes is higher than in Neutral and El Nino years. Of the three ENSO phases, La Nina mostly has the greatest amount of variation. The spread in major hurricanes is the smallest of the three storm types, which is not surprising considering it takes a great deal of strengthening to reach major hurricane status. I find it interesting that the interquartile range of hurricanes is bigger than that of tropical storms for neutral years, especially considering that the trend is flipped for the other ENSO phases. Overall, the information that this plot shows matches well with the research conducted by scientists:

The CPC states that El Nino is associated with reduced hurricane activity, while La Nina is associated with increased hurricane activity(4). These differences can be attributed to a meteorological phenomenon known as wind shear.

Before I do plot 3, now that I have a complete data frame, I can do my linear regression

Scatter plot to precede linear regression

Can we predict the number of tropical storms in a hurricane season based on the ocean temperature anomaly?

ggplot(complete_df, aes(x = ocean_anomaly, y = tropical_storms)) +
  geom_point()

This scatter plot shows somewhat of a relationship between ocean anomaly and the number of tropical storms, although on the right side of the graph the points fan out. Next I’ll find the r value.

Linear Regression

cor(complete_df$tropical_storms, complete_df$ocean_anomaly)
[1] 0.5266809
fit1 <- lm(tropical_storms ~ ocean_anomaly, data = complete_df)
summary(fit1)

Call:
lm(formula = tropical_storms ~ ocean_anomaly, data = complete_df)

Residuals:
    Min      1Q  Median      3Q     Max 
-7.8157 -3.6419  0.0786  2.2399 13.8578 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)      7.436      1.452   5.120 5.34e-06 ***
ocean_anomaly   13.686      3.188   4.293 8.53e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.566 on 48 degrees of freedom
Multiple R-squared:  0.2774,    Adjusted R-squared:  0.2623 
F-statistic: 18.43 on 1 and 48 DF,  p-value: 8.528e-05

The correlation is moderate (r of about 0.53). The very low p-value for the slope indicates that the positive relationship between ocean anomaly and the number of tropical storms is unlikely to be due to chance, so a linear model is a reasonable starting point. However, the R-squared value shows that only a little over a quarter (about 28%) of the variation in the number of tropical storms is explained by the regression model.
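
For a regression with a single predictor, R-squared is the square of the correlation coefficient, which gives a quick consistency check between the cor() and summary() outputs above. A minimal sketch using the reported value:

```r
r <- 0.5266809        # correlation reported by cor() above
r_squared <- r^2      # with one predictor, R-squared equals r squared

round(r_squared, 4)   # compare with "Multiple R-squared" in summary(fit1)
```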

Regression Model

tropical storms = 13.686(ocean anomaly) + 7.436

This is the equation of the model. Next I will look into the diagnostic plots.
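
To see what the equation implies, it can be applied to a specific anomaly. This is a sketch that uses the rounded coefficients from summary(fit1); predict_storms is a hypothetical helper, and with the fitted model available, predict(fit1, newdata = ...) would do the same job.

```r
# Coefficients taken from summary(fit1) above (rounded)
intercept <- 7.436
slope <- 13.686

# Hypothetical helper: expected number of named storms for a given
# ocean temperature anomaly in degrees Celsius
predict_storms <- function(ocean_anomaly) {
  intercept + slope * ocean_anomaly
}

# 2023 had an ocean anomaly of 0.91 and 20 named storms
predict_storms(0.91)
```

For 2023 the model's estimate lands just under 20 storms, close to the observed count, although the residual standard error of about 4.6 storms shows that individual seasons can miss by much more.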

Diagnostic Plots

autoplot(fit1, 1:4, nrow=2, ncol=2)

The residuals vs. fitted plot shows a line that does not stray too far from flat, and the points are fairly random and balanced if we ignore observations 32 and 47 (the 2005 and 2020 seasons). This suggests that a linear model could fit the relationship between tropical storms and ocean anomaly. The Q-Q plot also looks pretty good, until observations 32 and 47 once again. Overall, there are more signs pointing towards a linear model being appropriate for the relationship between tropical storms and ocean anomalies. However, it is important to realize that because warming is accelerating, this conclusion may not hold much longer.

Filtering for last 2 plots

First of all, the main ingredient required for tropical development is warm water, so let's find out whether global warming is worsening hurricanes.

In order to visualize this, I am going to compare ACE and ocean temperature anomalies. However, due to the effects of ENSO explored above, I will have to look at La Nina and El Nino years separately to really tell.

complete_df2 <- complete_df |>
  mutate("above_line" = 126.1) |>
  ## I decided to remove two years (1974 and 1975) from the next two plots because 1975 was the last year with a negative anomaly. That one negative value changed how my 3rd graph looked, making it much harder to compare with graph 4
  filter(year > 1975)
complete_nina <- complete_df2 |>
  filter(enso == "La Nina") 
complete_nino <- complete_df2 |>
  filter(enso == "El Nino")

Plot 3

cols2 <- c("#dbd0d2", "blue", "black")
highchart() |>
  hc_yAxis_multiples(
    list(title = list(text = "Accumulated Cyclone Energy")),
    list(title = list(text = "Ocean Anomalies"),
         opposite = TRUE)
  ) |>
  hc_add_series(data = complete_nina$above_line,
                name = "Non-Above Average Seasons",
                type = "area",
                yAxis = 0) |>
  hc_add_series(data = complete_nina$ace,
                name = "Accumulated Cyclone Energy",
                type = "column",
                yAxis = 0) |>
  hc_add_series(data = complete_nina$ocean_anomaly,
                name = "Ocean Anomaly",
                type = "line",
                yAxis = 1) |>
  hc_xAxis(categories = complete_nina$year,
           tickInterval = 1) |>
  hc_colors(cols2) |>
  hc_chart(style = list(fontFamily = "AvantGarde",
                        fontWeight = "bold")) |>
  hc_title(text = "Ocean Temperature Anomalies and North Atlantic Accumulated Cyclone Energy<br>For La Nina years", align = "center") |>
  hc_subtitle(text = "Between 1976 and last year", align = "center") |>
  hc_caption(text = "Source: NOAA and NHC") |>
  hc_legend(verticalAlign = "top",
            layout = "horizontal",
            align = "center") |>
  hc_tooltip(shared = TRUE) |>
  hc_add_theme(hc_theme_ffx())

This plot shows that there is perhaps not as clear-cut a relationship between ocean anomalies and ACE as one would think. Even though ocean anomalies generally increased from 1983 to 2022, ACE still varied quite a bit over those years. The graph also shows that many La Nina years end up with above-average hurricane seasons. A couple of interesting things to note: the 2017 hurricane season was infamous for hurricanes Harvey, Irma and Maria, which were not only powerful but also long-lived. Storms like these tend to inflate a season's ACE. That helps explain why 1995 and 2017 have the second and third highest ACE in this plot, respectively.

complete_df |>
  slice_max(tropical_storms, n = 5)
  year tropical_storms hurricanes major_hurricanes   ace   activity_level
1 2020              30         14                7 179.8 Extremely Active
2 2005              28         15                7 245.3 Extremely Active
3 2021              21          7                4 145.7     Above Normal
4 2023              20          7                3 145.6     Above Normal
5 1995              19         11                5 227.1 Extremely Active
6 2010              19         12                5 165.5 Extremely Active
7 2011              19          7                4 126.3     Above Normal
8 2012              19         10                2 132.6     Above Normal
  ocean_anomaly    enso
1          0.73 La Nina
2          0.49 La Nina
3          0.63 La Nina
4          0.91 El Nino
5          0.34 La Nina
6          0.53 La Nina
7          0.41 La Nina
8          0.48 Neutral

The 2020 hurricane season, although the most active on record in terms of named storms, ranks only 5th in ACE. This shows that 2020 favored quantity over quality in its tropical cyclones. The 2005 hurricane season, also known for numerous deadly and destructive hurricanes including Hurricane Katrina, still has the most ACE of any season in this data set and is the second most active season on record in terms of tropical storms.
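One rough way to quantify the quantity-versus-quality point is ACE per named storm. The sketch below re-enters only the `year`, `tropical_storms`, and `ace` columns from the output printed above and uses base R so that it runs on its own. The data frame name `top_storm_seasons` and the column `ace_per_storm` are names introduced here for illustration, and the ranking covers only these eight seasons, not the full data set.

```r
# Values copied from the slice_max() output above (eight seasons only).
top_storm_seasons <- data.frame(
  year            = c(2020, 2005, 2021, 2023, 1995, 2010, 2011, 2012),
  tropical_storms = c(30, 28, 21, 20, 19, 19, 19, 19),
  ace             = c(179.8, 245.3, 145.7, 145.6, 227.1, 165.5, 126.3, 132.6)
)

# Average ACE contributed per named storm in each season.
top_storm_seasons$ace_per_storm <-
  top_storm_seasons$ace / top_storm_seasons$tropical_storms

# Sort from the highest to the lowest ACE per storm.
ranked <- top_storm_seasons[order(-top_storm_seasons$ace_per_storm), ]
ranked$ace_per_storm <- round(ranked$ace_per_storm, 1)
print(ranked, row.names = FALSE)
```

Among these eight seasons, 2020 has the lowest ACE per named storm and 1995 the highest, which fits the idea that 2020 produced many storms that were individually weaker or shorter-lived.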

Plot 4

cols3 <- c("#f2dade", "red", "black")
hc_red_theme <- hc_theme(
  chart = list(
    backgroundColor = NULL,
    divBackgroundImage = "https://swiftwithmajid.com/public/l1.png"
  ))
highchart() |>
  hc_yAxis_multiples(
    list(title = list(text = "Accumulated Cyclone Energy")),
    list(title = list(text = "Ocean Anomalies"),
         opposite = TRUE)
  ) |>
  hc_add_series(data = complete_nino$above_line,
                name = "Non-Above Average Hurricane Seasons",
                type = "area",
                yAxis = 0) |>
  hc_add_series(data = complete_nino$ace,
                name = "Accumulated Cyclone Energy",
                type = "column",
                yAxis = 0) |>
  hc_add_series(data = complete_nino$ocean_anomaly,
                name = "Ocean Anomaly",
                type = "line",
                yAxis = 1) |>
  hc_xAxis(categories = complete_nino$year,
           tickInterval = 1) |>
  hc_colors(cols3) |>
  hc_chart(style = list(fontFamily = "AvantGarde",
                        fontWeight = "bold")) |>
  hc_title(text = "Ocean Temperature Anomalies and North Atlantic Accumulated Cyclone Energy<br>For El Nino years") |>
  hc_subtitle(text = "Between 1976 and 2023") |>
  hc_caption(text = "Source: NOAA and NHC") |>
  hc_legend(verticalAlign = "top",
            layout = "horizontal") |>
  hc_tooltip(shared = TRUE) |>
  hc_add_theme(hc_red_theme)

This plot shows that, over time, ACE has been increasing alongside temperature anomalies during El Nino years. The only years that do not fit this trend are 2004 and perhaps 1976 and 1979. However, checking 2004 on plot 1 shows that it was a weaker El Nino, which might explain why its ACE appears anomalously high on this graph. Furthermore, the plot shows that only 4 El Nino years have had above-average hurricane activity, particularly the last 3 El Nino years, when ocean anomalies were among their highest. One last interesting thing to note: the 2023 hurricane season ended up being the 4th most active season on record in terms of named storms (as shown above), in spite of the strong El Nino (as shown by plot 1). Scientists believe that the record-breaking ocean anomalies were responsible for neutralizing the unfavorable conditions typically brought by El Nino (7). In the end, though, the strong El Nino still kept the 2023 season's ACE from climbing too high.

Final Essay

Overall, my plots show that ENSO plays a big role in temperature anomalies and in the Atlantic hurricane season. Plots 2, 3, and 4 all show that La Nina tends to enhance hurricane activity while El Nino diminishes it. Thus, there are many more above-average hurricane seasons during La Nina years than during El Nino years. In addition, temperature anomalies can also affect the Atlantic hurricane season. At the same time, however, plots 3 and 4 show that nothing is as simple as it seems: a number of other factors affect global climate in addition to global warming. Meteorological factors such as the North Atlantic Oscillation, the Madden-Julian Oscillation, oceanic heat content, Saharan dust, and air pollution also play significant roles in moderating temperature anomalies as well as the Atlantic hurricane season.

Obviously, it would have been nice if I had been able to figure out how to facet the highcharts for plots 3 and 4. Nevertheless, I like what I have done.
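For reference, one possible way to get a facet-like layout is to build each chart with a shared helper function and place the results side by side with highcharter's `hw_grid()`. This is only a sketch under assumptions: the helper `ace_chart` and the data frames `demo_nina` and `demo_nino` are hypothetical names, and their values are made-up placeholders rather than the project's data. In the project itself, `complete_nina` and `complete_nino` would take their place.

```r
library(highcharter)

# Placeholder data (made-up values, NOT the project's data).
demo_nina <- data.frame(year = c(2001, 2002, 2003),
                        ace = c(100, 150, 200),
                        ocean_anomaly = c(0.1, 0.2, 0.3))
demo_nino <- data.frame(year = c(2001, 2002, 2003),
                        ace = c(50, 60, 70),
                        ocean_anomaly = c(0.2, 0.3, 0.4))

# Hypothetical helper: one dual-axis chart (ACE columns, anomaly line).
ace_chart <- function(df, cols, title) {
  highchart() |>
    hc_yAxis_multiples(
      list(title = list(text = "Accumulated Cyclone Energy")),
      list(title = list(text = "Ocean Anomalies"), opposite = TRUE)
    ) |>
    hc_add_series(data = df$ace, name = "Accumulated Cyclone Energy",
                  type = "column", yAxis = 0) |>
    hc_add_series(data = df$ocean_anomaly, name = "Ocean Anomaly",
                  type = "line", yAxis = 1) |>
    hc_xAxis(categories = df$year) |>
    hc_colors(cols) |>
    hc_title(text = title)
}

chart_nina <- ace_chart(demo_nina, c("blue", "black"), "La Nina years")
chart_nino <- ace_chart(demo_nino, c("red", "black"), "El Nino years")

# Arrange both widgets in a two-column grid; print `grid` in a
# Quarto chunk or the RStudio viewer to render it.
grid <- hw_grid(chart_nina, chart_nino, ncol = 2, rowheight = 400)
```

Because both panels come from the same function, axis titles and series names stay consistent, and only the data, colors, and title differ between them.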

References:

(1) https://www.nytimes.com/2024/04/10/climate/ocean-heat-records.html

(2) https://www.artemis.bm/2024-atlantic-hurricane-season/

(3) https://www.cpc.ncep.noaa.gov/products/outlooks/Background.html

(4) https://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ensofaq.shtml#GLOBALimpacts

(5) https://www.climate.gov/news-features/understanding-climate/climate-change-ocean-heat-content#:~:text=Highlights,the%20surface%20of%20the%20Earth.

(6) https://www.usatoday.com/story/news/weather/2024/02/08/super-el-nino-declared-but-la-nina-is-on-the-way/72513805007/

(7) https://yaleclimateconnections.org/2023/11/the-unusual-2023-atlantic-hurricane-season-ends/