Load packages

library(tidyverse)
library(sqldf)

Overview / Introduction

For this project, I chose two weather related datasets and resubmitted my Homework 5 assignment. The two weather related datasets were from NOAA and NCAA. To collect this data, I created a web scraper in Python which pulled the data from the websites and formatted it into a dataframe. From there, I am able to pull the data into RStudio and begin the process of data cleaning.

The NOAA dataset contained data of historic storms. It is incredibly messy, with inconsistent format within many columns. Multiple columns have more than one piece of data, and there are also many missing values. The data includes date, the name of the storm (in most cases), the level of surge, various impact variables, and more. The NCAA dataset contains data on monthly snowfall in Buffalo, with data spanning back decades. The scraped data has repeated headers and some “missing” data, indicated by a T which represents trace amounts of snowfall that was not measurable.

Historic Storms Dataset: Create a CSV file that includes all of the information

For my first dataset, I used the historical storms data from NOAA. This dataset includes historic named storms from the 2010’s to current day. To create the CSV file, I used BeautifulSoup to scrape the data from the NOAA website directly.

import re
from bs4 import BeautifulSoup
import requests
import pandas as pd
import numpy as np

full_url = "https://vlab.noaa.gov/web/mdl/historic-storms"
print(full_url)
page = requests.get(full_url, headers={"User-Agent": "Mozilla/5.0 (X11; CrOS x86_64 12871.102.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.141 Safari/537.36"}).text
soup = BeautifulSoup(page, "html.parser")

pre_tags = []
index = 0

allData = soup.find_all('pre')

for i in allData:
    pre_tags.insert(len(pre_tags),i.get_text().strip())
    index += 1

num_rows = int(index/7)
num_columns = 7

df = pd.DataFrame(np.array(pre_tags).reshape(num_rows, num_columns))

df.columns = ['Storm','Date','Storm-Tide','Obs','Guidance','Cat,Pres,Dead,$bn','Area']

df.to_csv("stormdata.csv",sep =';',index=False)

Approach

Looking at the data, I see there are many rows with newline characters. There are also various columns that can be split into multiple. My approach for cleaning the data will be to remove newlines, split the columns appropriately (Storm, Storm Tide, Cat Pres Dead $bn), transform date into start date, and then convert each Area to its own row.

Load Data

stormdata <- read.csv("C:\\Users\\Kim\\Documents\\GitHub\\Data607\\Project 2\\stormdata.csv", header = TRUE, sep = ";")
stormdata
##                         Storm             Date
## 1                 2024-Helene        Sep 23-29
## 2               2024-Francine           Sep 12
## 3               2024-Gilma-cp           Aug 30
## 4                2024-Hone-cp           Aug 25
## 5   2024-Post-\nTyphoon Ampil           Aug 22
## 6                  2024-Debby          Aug 5,6
## 7                  2024-Beryl            Jul 8
## 8                2024-Alberto        Jun 19,20
## 9            2023-Otis-ep (R)        Oct 22-25
## 10            2023-Bolaven-wp         Oct 6-14
## 11               2023-Ophelia        Sep 22-24
## 12                   2023-Lee        Sep 13,14
## 13                2023-Idalia        Aug 27-31
## 14            2023-Hillary-ep        Aug 16-22
## 15           2023-Dora-cp (R) Jul 31 -\nAug 21
## 16             2023-Calvin-cp        Jul 11-19
## 17              2023-Mawar-wp  May 19 -\nJun 3
## 18                 2023-Jan28        Jan 28,29
## 19                 2023-Jan14        Jan 14-17
## 20                2022-Nicole         Nov 7-11
## 21               2022-Ian (R)        Sep 26-30
## 22             2022-Fiona (R)        Sep 14-27
## 23 2022-Post-\nTyphoon Merbok        Sep 14-19
## 24                   2021-Oct           Oct 28
## 25              2021-Nicholas        Sep 12-20
## 26               2021-Ida (R)        Aug 27-30
## 27                 2021-Henri        Aug 20-22
## 28                  2021-Fred        Aug 15,16
## 29                  2021-Elsa Jun 30 -\nJul 10
## 30              2020-Iota (R)        Nov 13-18
## 31               2020-Eta (R)          Nov 8,9
## 32                  2020-Zeta        Oct 26-29
## 33                 2020-Delta         Oct 7-10
## 34                  2020-Beta        Sep 17-25
## 35                 2020-Sally        Sep 12-16
## 36             2020-Laura (R)        Aug 24-27
## 37                 2020-Marco        Aug 22-24
## 38                2020-Isaias  Jul 31 -\nAug 4
## 39             '20-Douglas-cp        Jul 20-30
## 40                 2020-Hanna        Jul 24-26
## 41             2020-Cristobal         Jun 1-12
## 42                2019-Imelda        Sep 17-19
## 43            2019-Dorian (R)          Sep 1-6
## 44                 2019-Barry        Jul 10-14
## 45              2019-Wutip-wp  Feb 18 -\nMar 2
## 46           2018-Michael (R)         Oct 7-11
## 47            '18-Mangkhut-wp         Sep 6-17
## 48                 2018-Isaac         Sep 7-15
## 49                2018-Gordon          Sep 3-8
## 50           '18-Florence (R)  Aug 31 -\nSep 8
## 51               2018-Lane-cp        Aug 15-29
## 52              2018-Maria-wp         Jul 3-12
## 53               2018-Alberto  May 25 -\nJun 1
## 54                  2018-Mar1  Feb 28 -\nMar 1
## 55               2018-Gita-sp         Feb 3-22
## 56                 2018-Jan17         Jan 3-17
## 57              2017-Nate (R)          Oct 4-9
## 58             2017-Maria (R)        Sep 16-30
## 59                  2017-Jose         Sep 5-25
## 60              2017-Irma (R) Aug 30 -\nSep 12
## 61            2017-Harvey (R)  Aug 17 -\nSep 1
## 62                   2017-Mar  Mar 13-\nApr 13
## 63              2016-Otto (R)        Nov 20-26
## 64           2016-Matthew (R)  Sep 28 -\nOct 9
## 65               2016-Hermine  Aug 28 -\nSep 8
## 66                 2016-Colin          Jun 5-7
## 67                   2016-Jan        Jan 23-24
## 68     2015-\nPatricia-ep (R)        Oct 20-24
## 69           2015-Joaquin (R)  Sep 28 -\nOct 8
## 70             2015-Erika (R)        Aug 24-28
## 71           2015-Chan-hom-wp Jun 29 -\nJul 13
## 72            2015-Dolphin-wp         May 6-24
## 73                   2015-Jan        Jan 26-28
## 74          2014-Odile-ep (R)        Sep 10-19
## 75                2014-Arthur          Jul 1-9
## 76          2013-Francisco-wp        Oct 15-26
## 77         2013-Manuel-ep (R)        Sep 13-20
## 78            2013-Ingrid (R)        Sep 12-17
## 79             2012-Sandy (R)        Oct 22-29
## 80                 2012-Isaac  Aug 21 -\nSep 3
## 81                   2011-Nov         Nov 4-11
## 82                   2011-Lee          Sep 2-7
## 83             2011-Irene (R)        Aug 21-30
##                                Storm.Tide                  Obs
## 1  Obs w4: >9.3 mhhw\nFcst w6: 15-20 mhhw                (FEV)
## 2                              w2: 5 mhhw                     
## 3                              w1: 3 mhhw                     
## 4                              w1: 3 mhhw                     
## 5                            w3: 6.3 mhhw                     
## 6                            w2: 5.1 mhhw                (FEV)
## 7                              w3: 8 mhhw                     
## 8                              w1: 3 mhhw                     
## 9                                                             
## 10                                                            
## 11                            w2: 3-5 agl            (TCR,FEV)
## 12                            w1: 1-2 agl            (TCR,FEV)
## 13                           w4: 8-12 agl            (TCR,FEV)
## 14                                                            
## 15                                                            
## 16                                                            
## 17                                                            
## 18                                                            
## 19                                                            
## 20                              w2: 5 agl                (TCR)
## 21                          w5: 10-15 agl            (TCR,FEV)
## 22                            w1: 1-3 agl                (TCR)
## 23                           w5: 12.5 agl                (FEV)
## 24                                                       (FEV)
## 25                            w2: 3-6 agl                (TCR)
## 26                           w5: 9-14 agl (TCR,FEV)\n(Service)
## 27                            w2: 2-4 agl            (TCR,FEV)
## 28                            w2: 2-4 agl                (TCR)
## 29                            w1: 1-3 agl                (TCR)
## 30                                                            
## 31                              w1: 3 agl                (TCR)
## 32                           w4: 6-10 agl                (TCR)
## 33                            w3: 6-9 agl                (TCR)
## 34                            w2: 4.2 agl                (TCR)
## 35                            w3: 5-7 agl                (TCR)
## 36                          w6: 12-18 agl                (TCR)
## 37                            w1: 1-2 agl                (TCR)
## 38                            w2: 3-6 agl                (TCR)
## 39                                                            
## 40                            w2: 3-5 agl                (TCR)
## 41                            w3: 6.2 agl                (TCR)
## 42                            w1: 1-2 agl                (TCR)
## 43                            w3: 5-7 agl                (TCR)
## 44                            w2: 4-6 agl                (TCR)
## 45                                                            
## 46                           w5: 9-14 agl (TCR,FEV)\n(Service)
## 47                                                            
## 48                                                       (FEV)
## 49                            w1: 1-3 agl                (TCR)
## 50                           w4: 8-11 agl (TCR,FEV)\n(Service)
## 51                                                       (FEV)
## 52                                                            
## 53                            w1: 1-3 agl                (TCR)
## 54                                                       (FEV)
## 55                                                            
## 56                                                       (FEV)
## 57                            w3: 6-9 agl            (TCR,FEV)
## 58                            w3: 6-9 agl            (TCR,FEV)
## 59                            w1: < 3 agl            (TCR,FEV)
## 60                            w3: 5-8 agl            (TCR,FEV)
## 61                           w4: 6-10 agl   (TCR,FEV)\nService
## 62                                                       (FEV)
## 63                                                            
## 64                            w3: 5-7 agl   (TCR,FEV)\nService
## 65                            w3: 4-7 agl            (TCR,FEV)
## 66                            w1: < 3 agl                (TCR)
## 67                           w2: 3.5 mhhw              Service
## 68                                                            
## 69                            w2: 3-4 agl            (TCR,FEV)
## 70                                                            
## 71                                                            
## 72                                                            
## 73                           w2: 3.4 mhhw                (FEV)
## 74                                                            
## 75                            w2: 3-5 agl                (TCR)
## 76                                                            
## 77                                                            
## 78                                                            
## 79                            w3: 4-9 agl   (TCR,FEV)\nService
## 80                          w6: 10-17 agl                (TCR)
## 81                                                            
## 82                               w2: 4.76            (TCR,FEV)
## 83                               w4: 8-11   (TCR,FEV)\nService
##                   Guidance          Cat.Pres.Dead..bn
## 1            Warn-A6 (1-8)  (Cat4, 938, 221+, $27.5+)
## 2           Warn-A5 (1-15)       (Cat2, 972, 0, $1.5)
## 3             PS-Adv 43,46          (Cat4, 949, -, -)
## 4  PS-Adv 2-7,\n9-11,13,14          (Cat1, 988, -, -)
## 5                   P-ETSS          (Cat4, 947, -, -)
## 6           Warn-A5 (6-20)       (Cat1, 979, 10, >$1)
## 7         Warn-A33 (29-41)    (Cat5, 934, 70, >$6.86)
## 8                                (TS, 993, 5, $0.179)
## 9                               (Cat5, 922, 52+, $16)
## 10                               (Typhoon, 903, 0, -)
## 11                 Warn-A4       (TS, 981, 0, $0.450)
## 12      Watch-A34\n(34-37)     (Cat5, 926, 3, $0.050)
## 13         Warn-A15 (5-20)       (Cat4, 940, 7, $2.5)
## 14                             (Cat4, 939, 2, $0.690)
## 15                                  (Cat4, 939, -, -)
## 16                                  (Cat3, 955, -, -)
## 17                             (Cat5, 897, 6, $0.250)
## 18                  P-ETSS                           
## 19                  P-ETSS                           
## 20                Warn-A12        (Cat1, 980, 11, $1)
## 21        Warn-A18 (12-33)     (Cat5, 937, 161, $113)
## 22                              (Cat4, 931, 29, $3.4)
## 23                  P-ETSS          (Cat1, 965, -, -)
## 24   P-ETSS\nRiver Forcing                           
## 25             Warn-A5 (2)       (Cat1, 988, 2, $1.1)
## 26          Warn-A7 (3-18)    (Cat4, 929, 107, $75.3)
## 27        Warn-A22 (18-27)       (Cat1, 986, 2, $0.7)
## 28        Warn-A28 (24-29)         (TS, 991, 7, $1.3)
## 29                Warn-A28      (Cat1, 991, 13, $1.2)
## 30                              (Cat4, 917, 84, $1.4)
## 31        Warn-A33 (31-35)     (Cat4, 922, 175, $8.3)
## 32         Warn-A11 (9-18)       (Cat3, 970, 9, $4.4)
## 33        Warn-A12 (12-22)       (Cat4, 953, 6, $3.0)
## 34             Warn-A9 (-)       (TS, 993, 1, $0.225)
## 35          Warn-A8 (5-21)       (Cat2, 965, 4, $7.3)
## 36        Warn-A23 (19-30)     (Cat4, 937, 81, $23.3)
## 37        Warn-A13 (10-16)     (Cat1, 991, 0, $0.035)
## 38        Warn-A23 (15-30)     (Cat1, 986, 17, $5.03)
## 39                                  (Cat4, 954, -, -)
## 40         Warn-A11 (8-13)       (Cat1, 973, 9, $1.2)
## 41            Warn-A17 (-)      (TS, 988, 46, $0.865)
## 42                                (TS, 1003, 7, $4.8)
## 43        WarnA-35 (33-53)      (Cat5, 910, 84, $5.1)
## 44          WarnA-5 (1-16)       (Cat1, 993, 3, $0.9)
## 45                               (Typhoon, 895, -, -)
## 46         Warn-A10 (7-17)     (Cat5, 919, 74, $25.5)
## 47                         (Typhoon, 868, 134, $3.77)
## 48                                     (Cat1, 995 mb)
## 49          Warn-A5 (3-11)         (TS, 996, 3, $0.2)
## 50        Warn-A50 (48-65)     (Cat4, 937, 54, $24.2)
## 51                              (Cat5, 926, 1, $0.25)
## 52                          (Typhoon, 910, 2, $0.637)
## 53            Watch-A2 (1)      (TS, 990, 18, $0.125)
## 54                  P-ETSS                           
## 55                             (Cat4, 927, 1, $0.253)
## 56                  P-ETSS                           
## 57          Warn-A9 (7-16)    (Cat1, 981, 48, $0.787)
## 58        Warn-A42 (35-46)  (Cat5, 908, 3,059, $91.6)
## 59                  P-ETSS   (Cat4, 938, 1, $0.00284)
## 60        Warn-A36 (34-51)     (Cat5, 914, 52, $77.2)
## 61        Warn-A25 (23-42)     (Cat4, 937, 107, $125)
## 62                  P-ETSS                           
## 63                            (Cat3, 975, 23, $0.192)
## 64         P-Surge (26-44)    (Cat5, 934, 731, $16.5)
## 65         P-Surge (11-24)      (Cat1, 981, 4, $0.55)
## 66         P-Surge (26-40)      (TS, 1001, 6, $0.001)
## 67                ETSS-2.0        (ET, 983, 55, $0.5)
## 68                             (Cat5, 872, 8, $0.463)
## 69                ETSS-2.0      (Cat4, 931, 34, $0.2)
## 70                             (TS, 1001, 35, $0.511)
## 71                          (Typhoon, 933, 18, $1.59)
## 72                          (Typhoon, 918, 1, $0.014)
## 73                ETSS-2.0                           
## 74                             (Cat4, 918, 18, $1.25)
## 75          P-Surge (6-14)     (Cat2, 973, 0, $0.040)
## 76                         (Typhoon, 920, 0, $0.0002)
## 77                             (Cat1, 983, 169, $4.2)
## 78                              (Cat1, 983, 32, $1.5)
## 79         P-Surge (21-31)    (Cat3, 940, 254, $68.7)
## 80         P-Surge (16-38)      (Cat1, 965, 41, $3.1)
## 81                    ETSS       (ET, 943, 1, $0.024)
## 82                                (TS, 986, 18, $2.8)
## 83         P-Surge (20-34)     (Cat3, 942, 58, $14.2)
##                                Area
## 1  N.W. FL, GA, AL,\nTN, KY, VA, WV
## 2                            LA, MS
## 3                                HI
## 4                                HI
## 5                                AK
## 6               N.W. FL, GA, SC, NC
## 7                            TX, LA
## 8                            TX, LA
## 9                               MEX
## 10                             Guam
## 11               NC, VA, MD, NY, DE
## 12                      New England
## 13                          N.W. FL
## 14                               CA
## 15                               HI
## 16                               HI
## 17                             Guam
## 18                                 
## 19                                 
## 20                           PR, FL
## 21                      S.W. FL, SC
## 22                               PR
## 23                               AK
## 24                           MD, DE
## 25                            N. TX
## 26                      New Orleans
## 27                      New England
## 28                          N.W. FL
## 29                       East Coast
## 30    NIC, HND, COL,\nSLV, GTM, PAN
## 31                   FL (Tampa Bay)
## 32                      New Orleans
## 33                               LA
## 34                       TX, LA, MS
## 35                      New Orleans
## 36                               LA
## 37                      New Orleans
## 38                       East Coast
## 39                               HI
## 40                            S. TX
## 41                               LA
## 42                           TX, LA
## 43                           FL, NC
## 44                               LA
## 45                             Guam
## 46                    US Gulf Coast
## 47                             Guam
## 48                      PR - (Rain)
## 49                  MS, AR, FL Keys
## 50                       Eastern US
## 51                               HI
## 52                             Guam
## 53                               MS
## 54      DE, NJ, NY, CT,\nRI, VT, ME
## 55                         Am Samoa
## 56                               MA
## 57                       AL, LA, MS
## 58                               PR
## 59                       CT, RI, MA
## 60                           FL, PR
## 61                           TX, LA
## 62                               NY
## 63                    PAN, CRI, NIC
## 64                            SE US
## 65                          N.W. FL
## 66                          N.W. FL
## 67                DE, NJ (Cape May)
## 68                          MEX, TX
## 69           VA, BHS, BMU, CUB, HTI
## 70                         DMA, HTI
## 71                             Guam
## 72                             Guam
## 73                           CT, MA
## 74                              MEX
## 75                               NC
## 76                             Guam
## 77                              MEX
## 78                              MEX
## 79             Mid-Atlantic,\nNE US
## 80                               LA
## 81                               AK
## 82                           LA, MS
## 83             Mid-Atlantic,\nNE US

Remove newlines

stormdata <- stormdata %>% mutate(across(everything(), ~ str_replace_all(.,"[\n]"," ")))

Clean up Storm column

I cleaned up the Storm column by first extracting the year and removing it from the column. Next, I extract whether the name of the storm is required by matching (R) in the Storm column, and remove this as well. Lastly, I extract the Oceanic Basin with a regex that identifies a dash followed by any two alphabetical characters, and then remove this from the Storm column. I bumped up year to the first column of the dataset. If year only has two numbers, I added a “20” preceeding those numbers, as this data is only for 2000s data.

stormdata <- stormdata %>% 
  mutate(year = as.numeric(str_extract(Storm, "\\d+(?=-)"))) %>% 
  mutate(Storm = str_replace_all(Storm,"^.*?-","")) %>%
  mutate(retiredName = str_match(Storm, "\\([R]\\)")) %>%
  mutate(Storm = str_replace_all(Storm,"\\s\\([R]\\)","")) %>%
  mutate(OceanicBasin = str_extract(Storm, "(?<=-)[a-z]{2}$")) %>%
  mutate(Storm = str_replace_all(Storm,"-[a-z]{2}$","")) %>%
  select(year,everything()) %>%
  mutate(year = str_replace(year,"^\\d{2}$",paste("20",year,sep = "")))

stormdata
##    year                Storm            Date
## 1  2024               Helene       Sep 23-29
## 2  2024             Francine          Sep 12
## 3  2024                Gilma          Aug 30
## 4  2024                 Hone          Aug 25
## 5  2024  Post- Typhoon Ampil          Aug 22
## 6  2024                Debby         Aug 5,6
## 7  2024                Beryl           Jul 8
## 8  2024              Alberto       Jun 19,20
## 9  2023                 Otis       Oct 22-25
## 10 2023              Bolaven        Oct 6-14
## 11 2023              Ophelia       Sep 22-24
## 12 2023                  Lee       Sep 13,14
## 13 2023               Idalia       Aug 27-31
## 14 2023              Hillary       Aug 16-22
## 15 2023                 Dora Jul 31 - Aug 21
## 16 2023               Calvin       Jul 11-19
## 17 2023                Mawar  May 19 - Jun 3
## 18 2023                Jan28       Jan 28,29
## 19 2023                Jan14       Jan 14-17
## 20 2022               Nicole        Nov 7-11
## 21 2022                  Ian       Sep 26-30
## 22 2022                Fiona       Sep 14-27
## 23 2022 Post- Typhoon Merbok       Sep 14-19
## 24 2021                  Oct          Oct 28
## 25 2021             Nicholas       Sep 12-20
## 26 2021                  Ida       Aug 27-30
## 27 2021                Henri       Aug 20-22
## 28 2021                 Fred       Aug 15,16
## 29 2021                 Elsa Jun 30 - Jul 10
## 30 2020                 Iota       Nov 13-18
## 31 2020                  Eta         Nov 8,9
## 32 2020                 Zeta       Oct 26-29
## 33 2020                Delta        Oct 7-10
## 34 2020                 Beta       Sep 17-25
## 35 2020                Sally       Sep 12-16
## 36 2020                Laura       Aug 24-27
## 37 2020                Marco       Aug 22-24
## 38 2020               Isaias  Jul 31 - Aug 4
## 39 2020              Douglas       Jul 20-30
## 40 2020                Hanna       Jul 24-26
## 41 2020            Cristobal        Jun 1-12
## 42 2019               Imelda       Sep 17-19
## 43 2019               Dorian         Sep 1-6
## 44 2019                Barry       Jul 10-14
## 45 2019                Wutip  Feb 18 - Mar 2
## 46 2018              Michael        Oct 7-11
## 47 2018             Mangkhut        Sep 6-17
## 48 2018                Isaac        Sep 7-15
## 49 2018               Gordon         Sep 3-8
## 50 2018             Florence  Aug 31 - Sep 8
## 51 2018                 Lane       Aug 15-29
## 52 2018                Maria        Jul 3-12
## 53 2018              Alberto  May 25 - Jun 1
## 54 2018                 Mar1  Feb 28 - Mar 1
## 55 2018                 Gita        Feb 3-22
## 56 2018                Jan17        Jan 3-17
## 57 2017                 Nate         Oct 4-9
## 58 2017                Maria       Sep 16-30
## 59 2017                 Jose        Sep 5-25
## 60 2017                 Irma Aug 30 - Sep 12
## 61 2017               Harvey  Aug 17 - Sep 1
## 62 2017                  Mar  Mar 13- Apr 13
## 63 2016                 Otto       Nov 20-26
## 64 2016              Matthew  Sep 28 - Oct 9
## 65 2016              Hermine  Aug 28 - Sep 8
## 66 2016                Colin         Jun 5-7
## 67 2016                  Jan       Jan 23-24
## 68 2015             Patricia       Oct 20-24
## 69 2015              Joaquin  Sep 28 - Oct 8
## 70 2015                Erika       Aug 24-28
## 71 2015             Chan-hom Jun 29 - Jul 13
## 72 2015              Dolphin        May 6-24
## 73 2015                  Jan       Jan 26-28
## 74 2014                Odile       Sep 10-19
## 75 2014               Arthur         Jul 1-9
## 76 2013            Francisco       Oct 15-26
## 77 2013               Manuel       Sep 13-20
## 78 2013               Ingrid       Sep 12-17
## 79 2012                Sandy       Oct 22-29
## 80 2012                Isaac  Aug 21 - Sep 3
## 81 2011                  Nov        Nov 4-11
## 82 2011                  Lee         Sep 2-7
## 83 2011                Irene       Aug 21-30
##                               Storm.Tide                 Obs
## 1  Obs w4: >9.3 mhhw Fcst w6: 15-20 mhhw               (FEV)
## 2                             w2: 5 mhhw                    
## 3                             w1: 3 mhhw                    
## 4                             w1: 3 mhhw                    
## 5                           w3: 6.3 mhhw                    
## 6                           w2: 5.1 mhhw               (FEV)
## 7                             w3: 8 mhhw                    
## 8                             w1: 3 mhhw                    
## 9                                                           
## 10                                                          
## 11                           w2: 3-5 agl           (TCR,FEV)
## 12                           w1: 1-2 agl           (TCR,FEV)
## 13                          w4: 8-12 agl           (TCR,FEV)
## 14                                                          
## 15                                                          
## 16                                                          
## 17                                                          
## 18                                                          
## 19                                                          
## 20                             w2: 5 agl               (TCR)
## 21                         w5: 10-15 agl           (TCR,FEV)
## 22                           w1: 1-3 agl               (TCR)
## 23                          w5: 12.5 agl               (FEV)
## 24                                                     (FEV)
## 25                           w2: 3-6 agl               (TCR)
## 26                          w5: 9-14 agl (TCR,FEV) (Service)
## 27                           w2: 2-4 agl           (TCR,FEV)
## 28                           w2: 2-4 agl               (TCR)
## 29                           w1: 1-3 agl               (TCR)
## 30                                                          
## 31                             w1: 3 agl               (TCR)
## 32                          w4: 6-10 agl               (TCR)
## 33                           w3: 6-9 agl               (TCR)
## 34                           w2: 4.2 agl               (TCR)
## 35                           w3: 5-7 agl               (TCR)
## 36                         w6: 12-18 agl               (TCR)
## 37                           w1: 1-2 agl               (TCR)
## 38                           w2: 3-6 agl               (TCR)
## 39                                                          
## 40                           w2: 3-5 agl               (TCR)
## 41                           w3: 6.2 agl               (TCR)
## 42                           w1: 1-2 agl               (TCR)
## 43                           w3: 5-7 agl               (TCR)
## 44                           w2: 4-6 agl               (TCR)
## 45                                                          
## 46                          w5: 9-14 agl (TCR,FEV) (Service)
## 47                                                          
## 48                                                     (FEV)
## 49                           w1: 1-3 agl               (TCR)
## 50                          w4: 8-11 agl (TCR,FEV) (Service)
## 51                                                     (FEV)
## 52                                                          
## 53                           w1: 1-3 agl               (TCR)
## 54                                                     (FEV)
## 55                                                          
## 56                                                     (FEV)
## 57                           w3: 6-9 agl           (TCR,FEV)
## 58                           w3: 6-9 agl           (TCR,FEV)
## 59                           w1: < 3 agl           (TCR,FEV)
## 60                           w3: 5-8 agl           (TCR,FEV)
## 61                          w4: 6-10 agl   (TCR,FEV) Service
## 62                                                     (FEV)
## 63                                                          
## 64                           w3: 5-7 agl   (TCR,FEV) Service
## 65                           w3: 4-7 agl           (TCR,FEV)
## 66                           w1: < 3 agl               (TCR)
## 67                          w2: 3.5 mhhw             Service
## 68                                                          
## 69                           w2: 3-4 agl           (TCR,FEV)
## 70                                                          
## 71                                                          
## 72                                                          
## 73                          w2: 3.4 mhhw               (FEV)
## 74                                                          
## 75                           w2: 3-5 agl               (TCR)
## 76                                                          
## 77                                                          
## 78                                                          
## 79                           w3: 4-9 agl   (TCR,FEV) Service
## 80                         w6: 10-17 agl               (TCR)
## 81                                                          
## 82                              w2: 4.76           (TCR,FEV)
## 83                              w4: 8-11   (TCR,FEV) Service
##                  Guidance          Cat.Pres.Dead..bn
## 1           Warn-A6 (1-8)  (Cat4, 938, 221+, $27.5+)
## 2          Warn-A5 (1-15)       (Cat2, 972, 0, $1.5)
## 3            PS-Adv 43,46          (Cat4, 949, -, -)
## 4  PS-Adv 2-7, 9-11,13,14          (Cat1, 988, -, -)
## 5                  P-ETSS          (Cat4, 947, -, -)
## 6          Warn-A5 (6-20)       (Cat1, 979, 10, >$1)
## 7        Warn-A33 (29-41)    (Cat5, 934, 70, >$6.86)
## 8                               (TS, 993, 5, $0.179)
## 9                              (Cat5, 922, 52+, $16)
## 10                              (Typhoon, 903, 0, -)
## 11                Warn-A4       (TS, 981, 0, $0.450)
## 12      Watch-A34 (34-37)     (Cat5, 926, 3, $0.050)
## 13        Warn-A15 (5-20)       (Cat4, 940, 7, $2.5)
## 14                            (Cat4, 939, 2, $0.690)
## 15                                 (Cat4, 939, -, -)
## 16                                 (Cat3, 955, -, -)
## 17                            (Cat5, 897, 6, $0.250)
## 18                 P-ETSS                           
## 19                 P-ETSS                           
## 20               Warn-A12        (Cat1, 980, 11, $1)
## 21       Warn-A18 (12-33)     (Cat5, 937, 161, $113)
## 22                             (Cat4, 931, 29, $3.4)
## 23                 P-ETSS          (Cat1, 965, -, -)
## 24   P-ETSS River Forcing                           
## 25            Warn-A5 (2)       (Cat1, 988, 2, $1.1)
## 26         Warn-A7 (3-18)    (Cat4, 929, 107, $75.3)
## 27       Warn-A22 (18-27)       (Cat1, 986, 2, $0.7)
## 28       Warn-A28 (24-29)         (TS, 991, 7, $1.3)
## 29               Warn-A28      (Cat1, 991, 13, $1.2)
## 30                             (Cat4, 917, 84, $1.4)
## 31       Warn-A33 (31-35)     (Cat4, 922, 175, $8.3)
## 32        Warn-A11 (9-18)       (Cat3, 970, 9, $4.4)
## 33       Warn-A12 (12-22)       (Cat4, 953, 6, $3.0)
## 34            Warn-A9 (-)       (TS, 993, 1, $0.225)
## 35         Warn-A8 (5-21)       (Cat2, 965, 4, $7.3)
## 36       Warn-A23 (19-30)     (Cat4, 937, 81, $23.3)
## 37       Warn-A13 (10-16)     (Cat1, 991, 0, $0.035)
## 38       Warn-A23 (15-30)     (Cat1, 986, 17, $5.03)
## 39                                 (Cat4, 954, -, -)
## 40        Warn-A11 (8-13)       (Cat1, 973, 9, $1.2)
## 41           Warn-A17 (-)      (TS, 988, 46, $0.865)
## 42                               (TS, 1003, 7, $4.8)
## 43       WarnA-35 (33-53)      (Cat5, 910, 84, $5.1)
## 44         WarnA-5 (1-16)       (Cat1, 993, 3, $0.9)
## 45                              (Typhoon, 895, -, -)
## 46        Warn-A10 (7-17)     (Cat5, 919, 74, $25.5)
## 47                        (Typhoon, 868, 134, $3.77)
## 48                                    (Cat1, 995 mb)
## 49         Warn-A5 (3-11)         (TS, 996, 3, $0.2)
## 50       Warn-A50 (48-65)     (Cat4, 937, 54, $24.2)
## 51                             (Cat5, 926, 1, $0.25)
## 52                         (Typhoon, 910, 2, $0.637)
## 53           Watch-A2 (1)      (TS, 990, 18, $0.125)
## 54                 P-ETSS                           
## 55                            (Cat4, 927, 1, $0.253)
## 56                 P-ETSS                           
## 57         Warn-A9 (7-16)    (Cat1, 981, 48, $0.787)
## 58       Warn-A42 (35-46)  (Cat5, 908, 3,059, $91.6)
## 59                 P-ETSS   (Cat4, 938, 1, $0.00284)
## 60       Warn-A36 (34-51)     (Cat5, 914, 52, $77.2)
## 61       Warn-A25 (23-42)     (Cat4, 937, 107, $125)
## 62                 P-ETSS                           
## 63                           (Cat3, 975, 23, $0.192)
## 64        P-Surge (26-44)    (Cat5, 934, 731, $16.5)
## 65        P-Surge (11-24)      (Cat1, 981, 4, $0.55)
## 66        P-Surge (26-40)      (TS, 1001, 6, $0.001)
## 67               ETSS-2.0        (ET, 983, 55, $0.5)
## 68                            (Cat5, 872, 8, $0.463)
## 69               ETSS-2.0      (Cat4, 931, 34, $0.2)
## 70                            (TS, 1001, 35, $0.511)
## 71                         (Typhoon, 933, 18, $1.59)
## 72                         (Typhoon, 918, 1, $0.014)
## 73               ETSS-2.0                           
## 74                            (Cat4, 918, 18, $1.25)
## 75         P-Surge (6-14)     (Cat2, 973, 0, $0.040)
## 76                        (Typhoon, 920, 0, $0.0002)
## 77                            (Cat1, 983, 169, $4.2)
## 78                             (Cat1, 983, 32, $1.5)
## 79        P-Surge (21-31)    (Cat3, 940, 254, $68.7)
## 80        P-Surge (16-38)      (Cat1, 965, 41, $3.1)
## 81                   ETSS       (ET, 943, 1, $0.024)
## 82                               (TS, 986, 18, $2.8)
## 83        P-Surge (20-34)     (Cat3, 942, 58, $14.2)
##                               Area retiredName OceanicBasin
## 1  N.W. FL, GA, AL, TN, KY, VA, WV        <NA>         <NA>
## 2                           LA, MS        <NA>         <NA>
## 3                               HI        <NA>           cp
## 4                               HI        <NA>           cp
## 5                               AK        <NA>         <NA>
## 6              N.W. FL, GA, SC, NC        <NA>         <NA>
## 7                           TX, LA        <NA>         <NA>
## 8                           TX, LA        <NA>         <NA>
## 9                              MEX         (R)           ep
## 10                            Guam        <NA>           wp
## 11              NC, VA, MD, NY, DE        <NA>         <NA>
## 12                     New England        <NA>         <NA>
## 13                         N.W. FL        <NA>         <NA>
## 14                              CA        <NA>           ep
## 15                              HI         (R)           cp
## 16                              HI        <NA>           cp
## 17                            Guam        <NA>           wp
## 18                                        <NA>         <NA>
## 19                                        <NA>         <NA>
## 20                          PR, FL        <NA>         <NA>
## 21                     S.W. FL, SC         (R)         <NA>
## 22                              PR         (R)         <NA>
## 23                              AK        <NA>         <NA>
## 24                          MD, DE        <NA>         <NA>
## 25                           N. TX        <NA>         <NA>
## 26                     New Orleans         (R)         <NA>
## 27                     New England        <NA>         <NA>
## 28                         N.W. FL        <NA>         <NA>
## 29                      East Coast        <NA>         <NA>
## 30    NIC, HND, COL, SLV, GTM, PAN         (R)         <NA>
## 31                  FL (Tampa Bay)         (R)         <NA>
## 32                     New Orleans        <NA>         <NA>
## 33                              LA        <NA>         <NA>
## 34                      TX, LA, MS        <NA>         <NA>
## 35                     New Orleans        <NA>         <NA>
## 36                              LA         (R)         <NA>
## 37                     New Orleans        <NA>         <NA>
## 38                      East Coast        <NA>         <NA>
## 39                              HI        <NA>           cp
## 40                           S. TX        <NA>         <NA>
## 41                              LA        <NA>         <NA>
## 42                          TX, LA        <NA>         <NA>
## 43                          FL, NC         (R)         <NA>
## 44                              LA        <NA>         <NA>
## 45                            Guam        <NA>           wp
## 46                   US Gulf Coast         (R)         <NA>
## 47                            Guam        <NA>           wp
## 48                     PR - (Rain)        <NA>         <NA>
## 49                 MS, AR, FL Keys        <NA>         <NA>
## 50                      Eastern US         (R)         <NA>
## 51                              HI        <NA>           cp
## 52                            Guam        <NA>           wp
## 53                              MS        <NA>         <NA>
## 54      DE, NJ, NY, CT, RI, VT, ME        <NA>         <NA>
## 55                        Am Samoa        <NA>           sp
## 56                              MA        <NA>         <NA>
## 57                      AL, LA, MS         (R)         <NA>
## 58                              PR         (R)         <NA>
## 59                      CT, RI, MA        <NA>         <NA>
## 60                          FL, PR         (R)         <NA>
## 61                          TX, LA         (R)         <NA>
## 62                              NY        <NA>         <NA>
## 63                   PAN, CRI, NIC         (R)         <NA>
## 64                           SE US         (R)         <NA>
## 65                         N.W. FL        <NA>         <NA>
## 66                         N.W. FL        <NA>         <NA>
## 67               DE, NJ (Cape May)        <NA>         <NA>
## 68                         MEX, TX         (R)           ep
## 69          VA, BHS, BMU, CUB, HTI         (R)         <NA>
## 70                        DMA, HTI         (R)         <NA>
## 71                            Guam        <NA>           wp
## 72                            Guam        <NA>           wp
## 73                          CT, MA        <NA>         <NA>
## 74                             MEX         (R)           ep
## 75                              NC        <NA>         <NA>
## 76                            Guam        <NA>           wp
## 77                             MEX         (R)           ep
## 78                             MEX         (R)         <NA>
## 79             Mid-Atlantic, NE US         (R)         <NA>
## 80                              LA        <NA>         <NA>
## 81                              AK        <NA>         <NA>
## 82                          LA, MS        <NA>         <NA>
## 83             Mid-Atlantic, NE US         (R)         <NA>

Clean up Date

For the sake of analysis, I will be transforming Date to Start Date. To do this, I will take only the text before a dash or comma, concatenate the results with year, and change the character representation of month to numeric.

stormdata <- stormdata %>% 
  mutate(startDate = str_extract(Date, "^(.*?)(?=-|,)|^(.*)$")) %>% 
  mutate(startDate = str_replace(startDate,".*\\s\\d{1}$",paste(substr(startDate,1,nchar(startDate)-1),0,substr(startDate,nchar(startDate),nchar(startDate)),sep=''))) %>%
  mutate(startDate = str_replace(startDate,"\\s$","")) %>%
  mutate(startDate = paste(year,startDate)) %>%
  mutate(startDate = str_replace_all(startDate,"Jan","01")) %>%
  mutate(startDate = str_replace_all(startDate,"Feb","02")) %>%
  mutate(startDate = str_replace_all(startDate,"Mar","03")) %>%
  mutate(startDate = str_replace_all(startDate,"Apr","04")) %>%
  mutate(startDate = str_replace_all(startDate,"May","05")) %>%
  mutate(startDate = str_replace_all(startDate,"Jun","06")) %>%
  mutate(startDate = str_replace_all(startDate,"Jul","07")) %>%
  mutate(startDate = str_replace_all(startDate,"Aug","08")) %>%
  mutate(startDate = str_replace_all(startDate,"Sep","09")) %>%
  mutate(startDate = str_replace_all(startDate,"Oct","10")) %>%
  mutate(startDate = str_replace_all(startDate,"Nov","11")) %>%
  mutate(startDate = str_replace_all(startDate,"Dec","12")) %>%
  mutate(startDate = as.Date(str_replace_all(startDate,"\\s","-"))) %>%
  select(Storm, startDate, year,everything())
stormdata
##                   Storm  startDate year            Date
## 1                Helene 2024-09-23 2024       Sep 23-29
## 2              Francine 2024-09-12 2024          Sep 12
## 3                 Gilma 2024-08-30 2024          Aug 30
## 4                  Hone 2024-08-25 2024          Aug 25
## 5   Post- Typhoon Ampil 2024-08-22 2024          Aug 22
## 6                 Debby 2024-08-05 2024         Aug 5,6
## 7                 Beryl 2024-07-08 2024           Jul 8
## 8               Alberto 2024-06-19 2024       Jun 19,20
## 9                  Otis 2023-10-22 2023       Oct 22-25
## 10              Bolaven 2023-10-06 2023        Oct 6-14
## 11              Ophelia 2023-09-22 2023       Sep 22-24
## 12                  Lee 2023-09-13 2023       Sep 13,14
## 13               Idalia 2023-08-27 2023       Aug 27-31
## 14              Hillary 2023-08-16 2023       Aug 16-22
## 15                 Dora 2023-07-31 2023 Jul 31 - Aug 21
## 16               Calvin 2023-07-11 2023       Jul 11-19
## 17                Mawar 2023-05-19 2023  May 19 - Jun 3
## 18                Jan28 2023-01-28 2023       Jan 28,29
## 19                Jan14 2023-01-14 2023       Jan 14-17
## 20               Nicole 2022-11-07 2022        Nov 7-11
## 21                  Ian 2022-09-26 2022       Sep 26-30
## 22                Fiona 2022-09-14 2022       Sep 14-27
## 23 Post- Typhoon Merbok 2022-09-14 2022       Sep 14-19
## 24                  Oct 2021-10-28 2021          Oct 28
## 25             Nicholas 2021-09-12 2021       Sep 12-20
## 26                  Ida 2021-08-27 2021       Aug 27-30
## 27                Henri 2021-08-20 2021       Aug 20-22
## 28                 Fred 2021-08-15 2021       Aug 15,16
## 29                 Elsa 2021-06-30 2021 Jun 30 - Jul 10
## 30                 Iota 2020-11-13 2020       Nov 13-18
## 31                  Eta 2020-11-08 2020         Nov 8,9
## 32                 Zeta 2020-10-26 2020       Oct 26-29
## 33                Delta 2020-10-07 2020        Oct 7-10
## 34                 Beta 2020-09-17 2020       Sep 17-25
## 35                Sally 2020-09-12 2020       Sep 12-16
## 36                Laura 2020-08-24 2020       Aug 24-27
## 37                Marco 2020-08-22 2020       Aug 22-24
## 38               Isaias 2020-07-31 2020  Jul 31 - Aug 4
## 39              Douglas 2020-07-20 2020       Jul 20-30
## 40                Hanna 2020-07-24 2020       Jul 24-26
## 41            Cristobal 2020-06-01 2020        Jun 1-12
## 42               Imelda 2019-09-17 2019       Sep 17-19
## 43               Dorian 2019-09-01 2019         Sep 1-6
## 44                Barry 2019-07-10 2019       Jul 10-14
## 45                Wutip 2019-02-18 2019  Feb 18 - Mar 2
## 46              Michael 2018-10-07 2018        Oct 7-11
## 47             Mangkhut 2018-09-06 2018        Sep 6-17
## 48                Isaac 2018-09-07 2018        Sep 7-15
## 49               Gordon 2018-09-03 2018         Sep 3-8
## 50             Florence 2018-08-31 2018  Aug 31 - Sep 8
## 51                 Lane 2018-08-15 2018       Aug 15-29
## 52                Maria 2018-07-03 2018        Jul 3-12
## 53              Alberto 2018-05-25 2018  May 25 - Jun 1
## 54                 Mar1 2018-02-28 2018  Feb 28 - Mar 1
## 55                 Gita 2018-02-03 2018        Feb 3-22
## 56                Jan17 2018-01-03 2018        Jan 3-17
## 57                 Nate 2017-10-04 2017         Oct 4-9
## 58                Maria 2017-09-16 2017       Sep 16-30
## 59                 Jose 2017-09-05 2017        Sep 5-25
## 60                 Irma 2017-08-30 2017 Aug 30 - Sep 12
## 61               Harvey 2017-08-17 2017  Aug 17 - Sep 1
## 62                  Mar 2017-03-13 2017  Mar 13- Apr 13
## 63                 Otto 2016-11-20 2016       Nov 20-26
## 64              Matthew 2016-09-28 2016  Sep 28 - Oct 9
## 65              Hermine 2016-08-28 2016  Aug 28 - Sep 8
## 66                Colin 2016-06-05 2016         Jun 5-7
## 67                  Jan 2016-01-23 2016       Jan 23-24
## 68             Patricia 2015-10-20 2015       Oct 20-24
## 69              Joaquin 2015-09-28 2015  Sep 28 - Oct 8
## 70                Erika 2015-08-24 2015       Aug 24-28
## 71             Chan-hom 2015-06-29 2015 Jun 29 - Jul 13
## 72              Dolphin 2015-05-06 2015        May 6-24
## 73                  Jan 2015-01-26 2015       Jan 26-28
## 74                Odile 2014-09-10 2014       Sep 10-19
## 75               Arthur 2014-07-01 2014         Jul 1-9
## 76            Francisco 2013-10-15 2013       Oct 15-26
## 77               Manuel 2013-09-13 2013       Sep 13-20
## 78               Ingrid 2013-09-12 2013       Sep 12-17
## 79                Sandy 2012-10-22 2012       Oct 22-29
## 80                Isaac 2012-08-21 2012  Aug 21 - Sep 3
## 81                  Nov 2011-11-04 2011        Nov 4-11
## 82                  Lee 2011-09-02 2011         Sep 2-7
## 83                Irene 2011-08-21 2011       Aug 21-30
##                               Storm.Tide                 Obs
## 1  Obs w4: >9.3 mhhw Fcst w6: 15-20 mhhw               (FEV)
## 2                             w2: 5 mhhw                    
## 3                             w1: 3 mhhw                    
## 4                             w1: 3 mhhw                    
## 5                           w3: 6.3 mhhw                    
## 6                           w2: 5.1 mhhw               (FEV)
## 7                             w3: 8 mhhw                    
## 8                             w1: 3 mhhw                    
## 9                                                           
## 10                                                          
## 11                           w2: 3-5 agl           (TCR,FEV)
## 12                           w1: 1-2 agl           (TCR,FEV)
## 13                          w4: 8-12 agl           (TCR,FEV)
## 14                                                          
## 15                                                          
## 16                                                          
## 17                                                          
## 18                                                          
## 19                                                          
## 20                             w2: 5 agl               (TCR)
## 21                         w5: 10-15 agl           (TCR,FEV)
## 22                           w1: 1-3 agl               (TCR)
## 23                          w5: 12.5 agl               (FEV)
## 24                                                     (FEV)
## 25                           w2: 3-6 agl               (TCR)
## 26                          w5: 9-14 agl (TCR,FEV) (Service)
## 27                           w2: 2-4 agl           (TCR,FEV)
## 28                           w2: 2-4 agl               (TCR)
## 29                           w1: 1-3 agl               (TCR)
## 30                                                          
## 31                             w1: 3 agl               (TCR)
## 32                          w4: 6-10 agl               (TCR)
## 33                           w3: 6-9 agl               (TCR)
## 34                           w2: 4.2 agl               (TCR)
## 35                           w3: 5-7 agl               (TCR)
## 36                         w6: 12-18 agl               (TCR)
## 37                           w1: 1-2 agl               (TCR)
## 38                           w2: 3-6 agl               (TCR)
## 39                                                          
## 40                           w2: 3-5 agl               (TCR)
## 41                           w3: 6.2 agl               (TCR)
## 42                           w1: 1-2 agl               (TCR)
## 43                           w3: 5-7 agl               (TCR)
## 44                           w2: 4-6 agl               (TCR)
## 45                                                          
## 46                          w5: 9-14 agl (TCR,FEV) (Service)
## 47                                                          
## 48                                                     (FEV)
## 49                           w1: 1-3 agl               (TCR)
## 50                          w4: 8-11 agl (TCR,FEV) (Service)
## 51                                                     (FEV)
## 52                                                          
## 53                           w1: 1-3 agl               (TCR)
## 54                                                     (FEV)
## 55                                                          
## 56                                                     (FEV)
## 57                           w3: 6-9 agl           (TCR,FEV)
## 58                           w3: 6-9 agl           (TCR,FEV)
## 59                           w1: < 3 agl           (TCR,FEV)
## 60                           w3: 5-8 agl           (TCR,FEV)
## 61                          w4: 6-10 agl   (TCR,FEV) Service
## 62                                                     (FEV)
## 63                                                          
## 64                           w3: 5-7 agl   (TCR,FEV) Service
## 65                           w3: 4-7 agl           (TCR,FEV)
## 66                           w1: < 3 agl               (TCR)
## 67                          w2: 3.5 mhhw             Service
## 68                                                          
## 69                           w2: 3-4 agl           (TCR,FEV)
## 70                                                          
## 71                                                          
## 72                                                          
## 73                          w2: 3.4 mhhw               (FEV)
## 74                                                          
## 75                           w2: 3-5 agl               (TCR)
## 76                                                          
## 77                                                          
## 78                                                          
## 79                           w3: 4-9 agl   (TCR,FEV) Service
## 80                         w6: 10-17 agl               (TCR)
## 81                                                          
## 82                              w2: 4.76           (TCR,FEV)
## 83                              w4: 8-11   (TCR,FEV) Service
##                  Guidance          Cat.Pres.Dead..bn
## 1           Warn-A6 (1-8)  (Cat4, 938, 221+, $27.5+)
## 2          Warn-A5 (1-15)       (Cat2, 972, 0, $1.5)
## 3            PS-Adv 43,46          (Cat4, 949, -, -)
## 4  PS-Adv 2-7, 9-11,13,14          (Cat1, 988, -, -)
## 5                  P-ETSS          (Cat4, 947, -, -)
## 6          Warn-A5 (6-20)       (Cat1, 979, 10, >$1)
## 7        Warn-A33 (29-41)    (Cat5, 934, 70, >$6.86)
## 8                               (TS, 993, 5, $0.179)
## 9                              (Cat5, 922, 52+, $16)
## 10                              (Typhoon, 903, 0, -)
## 11                Warn-A4       (TS, 981, 0, $0.450)
## 12      Watch-A34 (34-37)     (Cat5, 926, 3, $0.050)
## 13        Warn-A15 (5-20)       (Cat4, 940, 7, $2.5)
## 14                            (Cat4, 939, 2, $0.690)
## 15                                 (Cat4, 939, -, -)
## 16                                 (Cat3, 955, -, -)
## 17                            (Cat5, 897, 6, $0.250)
## 18                 P-ETSS                           
## 19                 P-ETSS                           
## 20               Warn-A12        (Cat1, 980, 11, $1)
## 21       Warn-A18 (12-33)     (Cat5, 937, 161, $113)
## 22                             (Cat4, 931, 29, $3.4)
## 23                 P-ETSS          (Cat1, 965, -, -)
## 24   P-ETSS River Forcing                           
## 25            Warn-A5 (2)       (Cat1, 988, 2, $1.1)
## 26         Warn-A7 (3-18)    (Cat4, 929, 107, $75.3)
## 27       Warn-A22 (18-27)       (Cat1, 986, 2, $0.7)
## 28       Warn-A28 (24-29)         (TS, 991, 7, $1.3)
## 29               Warn-A28      (Cat1, 991, 13, $1.2)
## 30                             (Cat4, 917, 84, $1.4)
## 31       Warn-A33 (31-35)     (Cat4, 922, 175, $8.3)
## 32        Warn-A11 (9-18)       (Cat3, 970, 9, $4.4)
## 33       Warn-A12 (12-22)       (Cat4, 953, 6, $3.0)
## 34            Warn-A9 (-)       (TS, 993, 1, $0.225)
## 35         Warn-A8 (5-21)       (Cat2, 965, 4, $7.3)
## 36       Warn-A23 (19-30)     (Cat4, 937, 81, $23.3)
## 37       Warn-A13 (10-16)     (Cat1, 991, 0, $0.035)
## 38       Warn-A23 (15-30)     (Cat1, 986, 17, $5.03)
## 39                                 (Cat4, 954, -, -)
## 40        Warn-A11 (8-13)       (Cat1, 973, 9, $1.2)
## 41           Warn-A17 (-)      (TS, 988, 46, $0.865)
## 42                               (TS, 1003, 7, $4.8)
## 43       WarnA-35 (33-53)      (Cat5, 910, 84, $5.1)
## 44         WarnA-5 (1-16)       (Cat1, 993, 3, $0.9)
## 45                              (Typhoon, 895, -, -)
## 46        Warn-A10 (7-17)     (Cat5, 919, 74, $25.5)
## 47                        (Typhoon, 868, 134, $3.77)
## 48                                    (Cat1, 995 mb)
## 49         Warn-A5 (3-11)         (TS, 996, 3, $0.2)
## 50       Warn-A50 (48-65)     (Cat4, 937, 54, $24.2)
## 51                             (Cat5, 926, 1, $0.25)
## 52                         (Typhoon, 910, 2, $0.637)
## 53           Watch-A2 (1)      (TS, 990, 18, $0.125)
## 54                 P-ETSS                           
## 55                            (Cat4, 927, 1, $0.253)
## 56                 P-ETSS                           
## 57         Warn-A9 (7-16)    (Cat1, 981, 48, $0.787)
## 58       Warn-A42 (35-46)  (Cat5, 908, 3,059, $91.6)
## 59                 P-ETSS   (Cat4, 938, 1, $0.00284)
## 60       Warn-A36 (34-51)     (Cat5, 914, 52, $77.2)
## 61       Warn-A25 (23-42)     (Cat4, 937, 107, $125)
## 62                 P-ETSS                           
## 63                           (Cat3, 975, 23, $0.192)
## 64        P-Surge (26-44)    (Cat5, 934, 731, $16.5)
## 65        P-Surge (11-24)      (Cat1, 981, 4, $0.55)
## 66        P-Surge (26-40)      (TS, 1001, 6, $0.001)
## 67               ETSS-2.0        (ET, 983, 55, $0.5)
## 68                            (Cat5, 872, 8, $0.463)
## 69               ETSS-2.0      (Cat4, 931, 34, $0.2)
## 70                            (TS, 1001, 35, $0.511)
## 71                         (Typhoon, 933, 18, $1.59)
## 72                         (Typhoon, 918, 1, $0.014)
## 73               ETSS-2.0                           
## 74                            (Cat4, 918, 18, $1.25)
## 75         P-Surge (6-14)     (Cat2, 973, 0, $0.040)
## 76                        (Typhoon, 920, 0, $0.0002)
## 77                            (Cat1, 983, 169, $4.2)
## 78                             (Cat1, 983, 32, $1.5)
## 79        P-Surge (21-31)    (Cat3, 940, 254, $68.7)
## 80        P-Surge (16-38)      (Cat1, 965, 41, $3.1)
## 81                   ETSS       (ET, 943, 1, $0.024)
## 82                               (TS, 986, 18, $2.8)
## 83        P-Surge (20-34)     (Cat3, 942, 58, $14.2)
##                               Area retiredName OceanicBasin
## 1  N.W. FL, GA, AL, TN, KY, VA, WV        <NA>         <NA>
## 2                           LA, MS        <NA>         <NA>
## 3                               HI        <NA>           cp
## 4                               HI        <NA>           cp
## 5                               AK        <NA>         <NA>
## 6              N.W. FL, GA, SC, NC        <NA>         <NA>
## 7                           TX, LA        <NA>         <NA>
## 8                           TX, LA        <NA>         <NA>
## 9                              MEX         (R)           ep
## 10                            Guam        <NA>           wp
## 11              NC, VA, MD, NY, DE        <NA>         <NA>
## 12                     New England        <NA>         <NA>
## 13                         N.W. FL        <NA>         <NA>
## 14                              CA        <NA>           ep
## 15                              HI         (R)           cp
## 16                              HI        <NA>           cp
## 17                            Guam        <NA>           wp
## 18                                        <NA>         <NA>
## 19                                        <NA>         <NA>
## 20                          PR, FL        <NA>         <NA>
## 21                     S.W. FL, SC         (R)         <NA>
## 22                              PR         (R)         <NA>
## 23                              AK        <NA>         <NA>
## 24                          MD, DE        <NA>         <NA>
## 25                           N. TX        <NA>         <NA>
## 26                     New Orleans         (R)         <NA>
## 27                     New England        <NA>         <NA>
## 28                         N.W. FL        <NA>         <NA>
## 29                      East Coast        <NA>         <NA>
## 30    NIC, HND, COL, SLV, GTM, PAN         (R)         <NA>
## 31                  FL (Tampa Bay)         (R)         <NA>
## 32                     New Orleans        <NA>         <NA>
## 33                              LA        <NA>         <NA>
## 34                      TX, LA, MS        <NA>         <NA>
## 35                     New Orleans        <NA>         <NA>
## 36                              LA         (R)         <NA>
## 37                     New Orleans        <NA>         <NA>
## 38                      East Coast        <NA>         <NA>
## 39                              HI        <NA>           cp
## 40                           S. TX        <NA>         <NA>
## 41                              LA        <NA>         <NA>
## 42                          TX, LA        <NA>         <NA>
## 43                          FL, NC         (R)         <NA>
## 44                              LA        <NA>         <NA>
## 45                            Guam        <NA>           wp
## 46                   US Gulf Coast         (R)         <NA>
## 47                            Guam        <NA>           wp
## 48                     PR - (Rain)        <NA>         <NA>
## 49                 MS, AR, FL Keys        <NA>         <NA>
## 50                      Eastern US         (R)         <NA>
## 51                              HI        <NA>           cp
## 52                            Guam        <NA>           wp
## 53                              MS        <NA>         <NA>
## 54      DE, NJ, NY, CT, RI, VT, ME        <NA>         <NA>
## 55                        Am Samoa        <NA>           sp
## 56                              MA        <NA>         <NA>
## 57                      AL, LA, MS         (R)         <NA>
## 58                              PR         (R)         <NA>
## 59                      CT, RI, MA        <NA>         <NA>
## 60                          FL, PR         (R)         <NA>
## 61                          TX, LA         (R)         <NA>
## 62                              NY        <NA>         <NA>
## 63                   PAN, CRI, NIC         (R)         <NA>
## 64                           SE US         (R)         <NA>
## 65                         N.W. FL        <NA>         <NA>
## 66                         N.W. FL        <NA>         <NA>
## 67               DE, NJ (Cape May)        <NA>         <NA>
## 68                         MEX, TX         (R)           ep
## 69          VA, BHS, BMU, CUB, HTI         (R)         <NA>
## 70                        DMA, HTI         (R)         <NA>
## 71                            Guam        <NA>           wp
## 72                            Guam        <NA>           wp
## 73                          CT, MA        <NA>         <NA>
## 74                             MEX         (R)           ep
## 75                              NC        <NA>         <NA>
## 76                            Guam        <NA>           wp
## 77                             MEX         (R)           ep
## 78                             MEX         (R)         <NA>
## 79             Mid-Atlantic, NE US         (R)         <NA>
## 80                              LA        <NA>         <NA>
## 81                              AK        <NA>         <NA>
## 82                          LA, MS        <NA>         <NA>
## 83             Mid-Atlantic, NE US         (R)         <NA>

Clean up Storm Tide

Next, I will clean up the Storm Tide column to extract either the MHHW or AGL. The conversion of MHHW to AGL and vice versa is complicated as it involves vertical datums where the benchmark value may vary based on location. For the sake of this exercise, although MHHW was extracted, I will be using only AGL for analysis. For N/A and MHHW datapoints, they will be excluded from the analysis. This decision was made based on AGL being present in most of the data (i.e. the dataset has more AGL values), and because it does not detract from the exercise to remove MHHW (as the exercise is not to delve deeply into the world of vertical tidal datums). Additionally, it appears only the previous year has been recorded in mostly MHHW, so it is essentially just an omission of the latest year data. The lack of MHHW values do not seem significant in any way (e.g. the missing values do not appear to be correlated to storm severity).

stormdata <- stormdata %>% 
  mutate(MHHW = str_extract(Storm.Tide, "(?<=:\\s).*?(?=\\smhhw)")) %>%
  mutate(AGL = str_extract(Storm.Tide, "(?<=:\\s).*?(?=\\sagl)")) %>%
  mutate(maxAGL = str_extract(AGL, "((?<=-)|(?<=\\<\\s)).*"))
stormdata
##                   Storm  startDate year            Date
## 1                Helene 2024-09-23 2024       Sep 23-29
## 2              Francine 2024-09-12 2024          Sep 12
## 3                 Gilma 2024-08-30 2024          Aug 30
## 4                  Hone 2024-08-25 2024          Aug 25
## 5   Post- Typhoon Ampil 2024-08-22 2024          Aug 22
## 6                 Debby 2024-08-05 2024         Aug 5,6
## 7                 Beryl 2024-07-08 2024           Jul 8
## 8               Alberto 2024-06-19 2024       Jun 19,20
## 9                  Otis 2023-10-22 2023       Oct 22-25
## 10              Bolaven 2023-10-06 2023        Oct 6-14
## 11              Ophelia 2023-09-22 2023       Sep 22-24
## 12                  Lee 2023-09-13 2023       Sep 13,14
## 13               Idalia 2023-08-27 2023       Aug 27-31
## 14              Hillary 2023-08-16 2023       Aug 16-22
## 15                 Dora 2023-07-31 2023 Jul 31 - Aug 21
## 16               Calvin 2023-07-11 2023       Jul 11-19
## 17                Mawar 2023-05-19 2023  May 19 - Jun 3
## 18                Jan28 2023-01-28 2023       Jan 28,29
## 19                Jan14 2023-01-14 2023       Jan 14-17
## 20               Nicole 2022-11-07 2022        Nov 7-11
## 21                  Ian 2022-09-26 2022       Sep 26-30
## 22                Fiona 2022-09-14 2022       Sep 14-27
## 23 Post- Typhoon Merbok 2022-09-14 2022       Sep 14-19
## 24                  Oct 2021-10-28 2021          Oct 28
## 25             Nicholas 2021-09-12 2021       Sep 12-20
## 26                  Ida 2021-08-27 2021       Aug 27-30
## 27                Henri 2021-08-20 2021       Aug 20-22
## 28                 Fred 2021-08-15 2021       Aug 15,16
## 29                 Elsa 2021-06-30 2021 Jun 30 - Jul 10
## 30                 Iota 2020-11-13 2020       Nov 13-18
## 31                  Eta 2020-11-08 2020         Nov 8,9
## 32                 Zeta 2020-10-26 2020       Oct 26-29
## 33                Delta 2020-10-07 2020        Oct 7-10
## 34                 Beta 2020-09-17 2020       Sep 17-25
## 35                Sally 2020-09-12 2020       Sep 12-16
## 36                Laura 2020-08-24 2020       Aug 24-27
## 37                Marco 2020-08-22 2020       Aug 22-24
## 38               Isaias 2020-07-31 2020  Jul 31 - Aug 4
## 39              Douglas 2020-07-20 2020       Jul 20-30
## 40                Hanna 2020-07-24 2020       Jul 24-26
## 41            Cristobal 2020-06-01 2020        Jun 1-12
## 42               Imelda 2019-09-17 2019       Sep 17-19
## 43               Dorian 2019-09-01 2019         Sep 1-6
## 44                Barry 2019-07-10 2019       Jul 10-14
## 45                Wutip 2019-02-18 2019  Feb 18 - Mar 2
## 46              Michael 2018-10-07 2018        Oct 7-11
## 47             Mangkhut 2018-09-06 2018        Sep 6-17
## 48                Isaac 2018-09-07 2018        Sep 7-15
## 49               Gordon 2018-09-03 2018         Sep 3-8
## 50             Florence 2018-08-31 2018  Aug 31 - Sep 8
## 51                 Lane 2018-08-15 2018       Aug 15-29
## 52                Maria 2018-07-03 2018        Jul 3-12
## 53              Alberto 2018-05-25 2018  May 25 - Jun 1
## 54                 Mar1 2018-02-28 2018  Feb 28 - Mar 1
## 55                 Gita 2018-02-03 2018        Feb 3-22
## 56                Jan17 2018-01-03 2018        Jan 3-17
## 57                 Nate 2017-10-04 2017         Oct 4-9
## 58                Maria 2017-09-16 2017       Sep 16-30
## 59                 Jose 2017-09-05 2017        Sep 5-25
## 60                 Irma 2017-08-30 2017 Aug 30 - Sep 12
## 61               Harvey 2017-08-17 2017  Aug 17 - Sep 1
## 62                  Mar 2017-03-13 2017  Mar 13- Apr 13
## 63                 Otto 2016-11-20 2016       Nov 20-26
## 64              Matthew 2016-09-28 2016  Sep 28 - Oct 9
## 65              Hermine 2016-08-28 2016  Aug 28 - Sep 8
## 66                Colin 2016-06-05 2016         Jun 5-7
## 67                  Jan 2016-01-23 2016       Jan 23-24
## 68             Patricia 2015-10-20 2015       Oct 20-24
## 69              Joaquin 2015-09-28 2015  Sep 28 - Oct 8
## 70                Erika 2015-08-24 2015       Aug 24-28
## 71             Chan-hom 2015-06-29 2015 Jun 29 - Jul 13
## 72              Dolphin 2015-05-06 2015        May 6-24
## 73                  Jan 2015-01-26 2015       Jan 26-28
## 74                Odile 2014-09-10 2014       Sep 10-19
## 75               Arthur 2014-07-01 2014         Jul 1-9
## 76            Francisco 2013-10-15 2013       Oct 15-26
## 77               Manuel 2013-09-13 2013       Sep 13-20
## 78               Ingrid 2013-09-12 2013       Sep 12-17
## 79                Sandy 2012-10-22 2012       Oct 22-29
## 80                Isaac 2012-08-21 2012  Aug 21 - Sep 3
## 81                  Nov 2011-11-04 2011        Nov 4-11
## 82                  Lee 2011-09-02 2011         Sep 2-7
## 83                Irene 2011-08-21 2011       Aug 21-30
##                               Storm.Tide                 Obs
## 1  Obs w4: >9.3 mhhw Fcst w6: 15-20 mhhw               (FEV)
## 2                             w2: 5 mhhw                    
## 3                             w1: 3 mhhw                    
## 4                             w1: 3 mhhw                    
## 5                           w3: 6.3 mhhw                    
## 6                           w2: 5.1 mhhw               (FEV)
## 7                             w3: 8 mhhw                    
## 8                             w1: 3 mhhw                    
## 9                                                           
## 10                                                          
## 11                           w2: 3-5 agl           (TCR,FEV)
## 12                           w1: 1-2 agl           (TCR,FEV)
## 13                          w4: 8-12 agl           (TCR,FEV)
## 14                                                          
## 15                                                          
## 16                                                          
## 17                                                          
## 18                                                          
## 19                                                          
## 20                             w2: 5 agl               (TCR)
## 21                         w5: 10-15 agl           (TCR,FEV)
## 22                           w1: 1-3 agl               (TCR)
## 23                          w5: 12.5 agl               (FEV)
## 24                                                     (FEV)
## 25                           w2: 3-6 agl               (TCR)
## 26                          w5: 9-14 agl (TCR,FEV) (Service)
## 27                           w2: 2-4 agl           (TCR,FEV)
## 28                           w2: 2-4 agl               (TCR)
## 29                           w1: 1-3 agl               (TCR)
## 30                                                          
## 31                             w1: 3 agl               (TCR)
## 32                          w4: 6-10 agl               (TCR)
## 33                           w3: 6-9 agl               (TCR)
## 34                           w2: 4.2 agl               (TCR)
## 35                           w3: 5-7 agl               (TCR)
## 36                         w6: 12-18 agl               (TCR)
## 37                           w1: 1-2 agl               (TCR)
## 38                           w2: 3-6 agl               (TCR)
## 39                                                          
## 40                           w2: 3-5 agl               (TCR)
## 41                           w3: 6.2 agl               (TCR)
## 42                           w1: 1-2 agl               (TCR)
## 43                           w3: 5-7 agl               (TCR)
## 44                           w2: 4-6 agl               (TCR)
## 45                                                          
## 46                          w5: 9-14 agl (TCR,FEV) (Service)
## 47                                                          
## 48                                                     (FEV)
## 49                           w1: 1-3 agl               (TCR)
## 50                          w4: 8-11 agl (TCR,FEV) (Service)
## 51                                                     (FEV)
## 52                                                          
## 53                           w1: 1-3 agl               (TCR)
## 54                                                     (FEV)
## 55                                                          
## 56                                                     (FEV)
## 57                           w3: 6-9 agl           (TCR,FEV)
## 58                           w3: 6-9 agl           (TCR,FEV)
## 59                           w1: < 3 agl           (TCR,FEV)
## 60                           w3: 5-8 agl           (TCR,FEV)
## 61                          w4: 6-10 agl   (TCR,FEV) Service
## 62                                                     (FEV)
## 63                                                          
## 64                           w3: 5-7 agl   (TCR,FEV) Service
## 65                           w3: 4-7 agl           (TCR,FEV)
## 66                           w1: < 3 agl               (TCR)
## 67                          w2: 3.5 mhhw             Service
## 68                                                          
## 69                           w2: 3-4 agl           (TCR,FEV)
## 70                                                          
## 71                                                          
## 72                                                          
## 73                          w2: 3.4 mhhw               (FEV)
## 74                                                          
## 75                           w2: 3-5 agl               (TCR)
## 76                                                          
## 77                                                          
## 78                                                          
## 79                           w3: 4-9 agl   (TCR,FEV) Service
## 80                         w6: 10-17 agl               (TCR)
## 81                                                          
## 82                              w2: 4.76           (TCR,FEV)
## 83                              w4: 8-11   (TCR,FEV) Service
##                  Guidance          Cat.Pres.Dead..bn
## 1           Warn-A6 (1-8)  (Cat4, 938, 221+, $27.5+)
## 2          Warn-A5 (1-15)       (Cat2, 972, 0, $1.5)
## 3            PS-Adv 43,46          (Cat4, 949, -, -)
## 4  PS-Adv 2-7, 9-11,13,14          (Cat1, 988, -, -)
## 5                  P-ETSS          (Cat4, 947, -, -)
## 6          Warn-A5 (6-20)       (Cat1, 979, 10, >$1)
## 7        Warn-A33 (29-41)    (Cat5, 934, 70, >$6.86)
## 8                               (TS, 993, 5, $0.179)
## 9                              (Cat5, 922, 52+, $16)
## 10                              (Typhoon, 903, 0, -)
## 11                Warn-A4       (TS, 981, 0, $0.450)
## 12      Watch-A34 (34-37)     (Cat5, 926, 3, $0.050)
## 13        Warn-A15 (5-20)       (Cat4, 940, 7, $2.5)
## 14                            (Cat4, 939, 2, $0.690)
## 15                                 (Cat4, 939, -, -)
## 16                                 (Cat3, 955, -, -)
## 17                            (Cat5, 897, 6, $0.250)
## 18                 P-ETSS                           
## 19                 P-ETSS                           
## 20               Warn-A12        (Cat1, 980, 11, $1)
## 21       Warn-A18 (12-33)     (Cat5, 937, 161, $113)
## 22                             (Cat4, 931, 29, $3.4)
## 23                 P-ETSS          (Cat1, 965, -, -)
## 24   P-ETSS River Forcing                           
## 25            Warn-A5 (2)       (Cat1, 988, 2, $1.1)
## 26         Warn-A7 (3-18)    (Cat4, 929, 107, $75.3)
## 27       Warn-A22 (18-27)       (Cat1, 986, 2, $0.7)
## 28       Warn-A28 (24-29)         (TS, 991, 7, $1.3)
## 29               Warn-A28      (Cat1, 991, 13, $1.2)
## 30                             (Cat4, 917, 84, $1.4)
## 31       Warn-A33 (31-35)     (Cat4, 922, 175, $8.3)
## 32        Warn-A11 (9-18)       (Cat3, 970, 9, $4.4)
## 33       Warn-A12 (12-22)       (Cat4, 953, 6, $3.0)
## 34            Warn-A9 (-)       (TS, 993, 1, $0.225)
## 35         Warn-A8 (5-21)       (Cat2, 965, 4, $7.3)
## 36       Warn-A23 (19-30)     (Cat4, 937, 81, $23.3)
## 37       Warn-A13 (10-16)     (Cat1, 991, 0, $0.035)
## 38       Warn-A23 (15-30)     (Cat1, 986, 17, $5.03)
## 39                                 (Cat4, 954, -, -)
## 40        Warn-A11 (8-13)       (Cat1, 973, 9, $1.2)
## 41           Warn-A17 (-)      (TS, 988, 46, $0.865)
## 42                               (TS, 1003, 7, $4.8)
## 43       WarnA-35 (33-53)      (Cat5, 910, 84, $5.1)
## 44         WarnA-5 (1-16)       (Cat1, 993, 3, $0.9)
## 45                              (Typhoon, 895, -, -)
## 46        Warn-A10 (7-17)     (Cat5, 919, 74, $25.5)
## 47                        (Typhoon, 868, 134, $3.77)
## 48                                    (Cat1, 995 mb)
## 49         Warn-A5 (3-11)         (TS, 996, 3, $0.2)
## 50       Warn-A50 (48-65)     (Cat4, 937, 54, $24.2)
## 51                             (Cat5, 926, 1, $0.25)
## 52                         (Typhoon, 910, 2, $0.637)
## 53           Watch-A2 (1)      (TS, 990, 18, $0.125)
## 54                 P-ETSS                           
## 55                            (Cat4, 927, 1, $0.253)
## 56                 P-ETSS                           
## 57         Warn-A9 (7-16)    (Cat1, 981, 48, $0.787)
## 58       Warn-A42 (35-46)  (Cat5, 908, 3,059, $91.6)
## 59                 P-ETSS   (Cat4, 938, 1, $0.00284)
## 60       Warn-A36 (34-51)     (Cat5, 914, 52, $77.2)
## 61       Warn-A25 (23-42)     (Cat4, 937, 107, $125)
## 62                 P-ETSS                           
## 63                           (Cat3, 975, 23, $0.192)
## 64        P-Surge (26-44)    (Cat5, 934, 731, $16.5)
## 65        P-Surge (11-24)      (Cat1, 981, 4, $0.55)
## 66        P-Surge (26-40)      (TS, 1001, 6, $0.001)
## 67               ETSS-2.0        (ET, 983, 55, $0.5)
## 68                            (Cat5, 872, 8, $0.463)
## 69               ETSS-2.0      (Cat4, 931, 34, $0.2)
## 70                            (TS, 1001, 35, $0.511)
## 71                         (Typhoon, 933, 18, $1.59)
## 72                         (Typhoon, 918, 1, $0.014)
## 73               ETSS-2.0                           
## 74                            (Cat4, 918, 18, $1.25)
## 75         P-Surge (6-14)     (Cat2, 973, 0, $0.040)
## 76                        (Typhoon, 920, 0, $0.0002)
## 77                            (Cat1, 983, 169, $4.2)
## 78                             (Cat1, 983, 32, $1.5)
## 79        P-Surge (21-31)    (Cat3, 940, 254, $68.7)
## 80        P-Surge (16-38)      (Cat1, 965, 41, $3.1)
## 81                   ETSS       (ET, 943, 1, $0.024)
## 82                               (TS, 986, 18, $2.8)
## 83        P-Surge (20-34)     (Cat3, 942, 58, $14.2)
##                               Area retiredName OceanicBasin MHHW   AGL maxAGL
## 1  N.W. FL, GA, AL, TN, KY, VA, WV        <NA>         <NA> >9.3  <NA>   <NA>
## 2                           LA, MS        <NA>         <NA>    5  <NA>   <NA>
## 3                               HI        <NA>           cp    3  <NA>   <NA>
## 4                               HI        <NA>           cp    3  <NA>   <NA>
## 5                               AK        <NA>         <NA>  6.3  <NA>   <NA>
## 6              N.W. FL, GA, SC, NC        <NA>         <NA>  5.1  <NA>   <NA>
## 7                           TX, LA        <NA>         <NA>    8  <NA>   <NA>
## 8                           TX, LA        <NA>         <NA>    3  <NA>   <NA>
## 9                              MEX         (R)           ep <NA>  <NA>   <NA>
## 10                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 11              NC, VA, MD, NY, DE        <NA>         <NA> <NA>   3-5      5
## 12                     New England        <NA>         <NA> <NA>   1-2      2
## 13                         N.W. FL        <NA>         <NA> <NA>  8-12     12
## 14                              CA        <NA>           ep <NA>  <NA>   <NA>
## 15                              HI         (R)           cp <NA>  <NA>   <NA>
## 16                              HI        <NA>           cp <NA>  <NA>   <NA>
## 17                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 18                                        <NA>         <NA> <NA>  <NA>   <NA>
## 19                                        <NA>         <NA> <NA>  <NA>   <NA>
## 20                          PR, FL        <NA>         <NA> <NA>     5   <NA>
## 21                     S.W. FL, SC         (R)         <NA> <NA> 10-15     15
## 22                              PR         (R)         <NA> <NA>   1-3      3
## 23                              AK        <NA>         <NA> <NA>  12.5   <NA>
## 24                          MD, DE        <NA>         <NA> <NA>  <NA>   <NA>
## 25                           N. TX        <NA>         <NA> <NA>   3-6      6
## 26                     New Orleans         (R)         <NA> <NA>  9-14     14
## 27                     New England        <NA>         <NA> <NA>   2-4      4
## 28                         N.W. FL        <NA>         <NA> <NA>   2-4      4
## 29                      East Coast        <NA>         <NA> <NA>   1-3      3
## 30    NIC, HND, COL, SLV, GTM, PAN         (R)         <NA> <NA>  <NA>   <NA>
## 31                  FL (Tampa Bay)         (R)         <NA> <NA>     3   <NA>
## 32                     New Orleans        <NA>         <NA> <NA>  6-10     10
## 33                              LA        <NA>         <NA> <NA>   6-9      9
## 34                      TX, LA, MS        <NA>         <NA> <NA>   4.2   <NA>
## 35                     New Orleans        <NA>         <NA> <NA>   5-7      7
## 36                              LA         (R)         <NA> <NA> 12-18     18
## 37                     New Orleans        <NA>         <NA> <NA>   1-2      2
## 38                      East Coast        <NA>         <NA> <NA>   3-6      6
## 39                              HI        <NA>           cp <NA>  <NA>   <NA>
## 40                           S. TX        <NA>         <NA> <NA>   3-5      5
## 41                              LA        <NA>         <NA> <NA>   6.2   <NA>
## 42                          TX, LA        <NA>         <NA> <NA>   1-2      2
## 43                          FL, NC         (R)         <NA> <NA>   5-7      7
## 44                              LA        <NA>         <NA> <NA>   4-6      6
## 45                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 46                   US Gulf Coast         (R)         <NA> <NA>  9-14     14
## 47                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 48                     PR - (Rain)        <NA>         <NA> <NA>  <NA>   <NA>
## 49                 MS, AR, FL Keys        <NA>         <NA> <NA>   1-3      3
## 50                      Eastern US         (R)         <NA> <NA>  8-11     11
## 51                              HI        <NA>           cp <NA>  <NA>   <NA>
## 52                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 53                              MS        <NA>         <NA> <NA>   1-3      3
## 54      DE, NJ, NY, CT, RI, VT, ME        <NA>         <NA> <NA>  <NA>   <NA>
## 55                        Am Samoa        <NA>           sp <NA>  <NA>   <NA>
## 56                              MA        <NA>         <NA> <NA>  <NA>   <NA>
## 57                      AL, LA, MS         (R)         <NA> <NA>   6-9      9
## 58                              PR         (R)         <NA> <NA>   6-9      9
## 59                      CT, RI, MA        <NA>         <NA> <NA>   < 3      3
## 60                          FL, PR         (R)         <NA> <NA>   5-8      8
## 61                          TX, LA         (R)         <NA> <NA>  6-10     10
## 62                              NY        <NA>         <NA> <NA>  <NA>   <NA>
## 63                   PAN, CRI, NIC         (R)         <NA> <NA>  <NA>   <NA>
## 64                           SE US         (R)         <NA> <NA>   5-7      7
## 65                         N.W. FL        <NA>         <NA> <NA>   4-7      7
## 66                         N.W. FL        <NA>         <NA> <NA>   < 3      3
## 67               DE, NJ (Cape May)        <NA>         <NA>  3.5  <NA>   <NA>
## 68                         MEX, TX         (R)           ep <NA>  <NA>   <NA>
## 69          VA, BHS, BMU, CUB, HTI         (R)         <NA> <NA>   3-4      4
## 70                        DMA, HTI         (R)         <NA> <NA>  <NA>   <NA>
## 71                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 72                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 73                          CT, MA        <NA>         <NA>  3.4  <NA>   <NA>
## 74                             MEX         (R)           ep <NA>  <NA>   <NA>
## 75                              NC        <NA>         <NA> <NA>   3-5      5
## 76                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 77                             MEX         (R)           ep <NA>  <NA>   <NA>
## 78                             MEX         (R)         <NA> <NA>  <NA>   <NA>
## 79             Mid-Atlantic, NE US         (R)         <NA> <NA>   4-9      9
## 80                              LA        <NA>         <NA> <NA> 10-17     17
## 81                              AK        <NA>         <NA> <NA>  <NA>   <NA>
## 82                          LA, MS        <NA>         <NA> <NA>  <NA>   <NA>
## 83             Mid-Atlantic, NE US         (R)         <NA> <NA>  <NA>   <NA>

Clean up Cat Pres Dead $bn

stormdata <- stormdata %>% 
  mutate(Cat.Pres.Dead..bn = str_extract(Cat.Pres.Dead..bn, "(?<=\\().*(?=\\))")) %>%
  separate(Cat.Pres.Dead..bn, into = c("Category","Pressure","Dead","Cost"), sep = ",") %>%
  mutate(MinCost = str_extract(Cost,"(?<=\\$).*")) %>%
  mutate(MinCost = str_replace(MinCost,"\\+$","")) %>%
  mutate(MinCost = as.numeric(MinCost) * 1000000000) %>%
  mutate(Pressure = str_replace(Pressure,"mb","")) %>%
  mutate(Pressure = as.numeric(Pressure)) %>%
  mutate(Dead = as.integer(Dead))

stormdata
##                   Storm  startDate year            Date
## 1                Helene 2024-09-23 2024       Sep 23-29
## 2              Francine 2024-09-12 2024          Sep 12
## 3                 Gilma 2024-08-30 2024          Aug 30
## 4                  Hone 2024-08-25 2024          Aug 25
## 5   Post- Typhoon Ampil 2024-08-22 2024          Aug 22
## 6                 Debby 2024-08-05 2024         Aug 5,6
## 7                 Beryl 2024-07-08 2024           Jul 8
## 8               Alberto 2024-06-19 2024       Jun 19,20
## 9                  Otis 2023-10-22 2023       Oct 22-25
## 10              Bolaven 2023-10-06 2023        Oct 6-14
## 11              Ophelia 2023-09-22 2023       Sep 22-24
## 12                  Lee 2023-09-13 2023       Sep 13,14
## 13               Idalia 2023-08-27 2023       Aug 27-31
## 14              Hillary 2023-08-16 2023       Aug 16-22
## 15                 Dora 2023-07-31 2023 Jul 31 - Aug 21
## 16               Calvin 2023-07-11 2023       Jul 11-19
## 17                Mawar 2023-05-19 2023  May 19 - Jun 3
## 18                Jan28 2023-01-28 2023       Jan 28,29
## 19                Jan14 2023-01-14 2023       Jan 14-17
## 20               Nicole 2022-11-07 2022        Nov 7-11
## 21                  Ian 2022-09-26 2022       Sep 26-30
## 22                Fiona 2022-09-14 2022       Sep 14-27
## 23 Post- Typhoon Merbok 2022-09-14 2022       Sep 14-19
## 24                  Oct 2021-10-28 2021          Oct 28
## 25             Nicholas 2021-09-12 2021       Sep 12-20
## 26                  Ida 2021-08-27 2021       Aug 27-30
## 27                Henri 2021-08-20 2021       Aug 20-22
## 28                 Fred 2021-08-15 2021       Aug 15,16
## 29                 Elsa 2021-06-30 2021 Jun 30 - Jul 10
## 30                 Iota 2020-11-13 2020       Nov 13-18
## 31                  Eta 2020-11-08 2020         Nov 8,9
## 32                 Zeta 2020-10-26 2020       Oct 26-29
## 33                Delta 2020-10-07 2020        Oct 7-10
## 34                 Beta 2020-09-17 2020       Sep 17-25
## 35                Sally 2020-09-12 2020       Sep 12-16
## 36                Laura 2020-08-24 2020       Aug 24-27
## 37                Marco 2020-08-22 2020       Aug 22-24
## 38               Isaias 2020-07-31 2020  Jul 31 - Aug 4
## 39              Douglas 2020-07-20 2020       Jul 20-30
## 40                Hanna 2020-07-24 2020       Jul 24-26
## 41            Cristobal 2020-06-01 2020        Jun 1-12
## 42               Imelda 2019-09-17 2019       Sep 17-19
## 43               Dorian 2019-09-01 2019         Sep 1-6
## 44                Barry 2019-07-10 2019       Jul 10-14
## 45                Wutip 2019-02-18 2019  Feb 18 - Mar 2
## 46              Michael 2018-10-07 2018        Oct 7-11
## 47             Mangkhut 2018-09-06 2018        Sep 6-17
## 48                Isaac 2018-09-07 2018        Sep 7-15
## 49               Gordon 2018-09-03 2018         Sep 3-8
## 50             Florence 2018-08-31 2018  Aug 31 - Sep 8
## 51                 Lane 2018-08-15 2018       Aug 15-29
## 52                Maria 2018-07-03 2018        Jul 3-12
## 53              Alberto 2018-05-25 2018  May 25 - Jun 1
## 54                 Mar1 2018-02-28 2018  Feb 28 - Mar 1
## 55                 Gita 2018-02-03 2018        Feb 3-22
## 56                Jan17 2018-01-03 2018        Jan 3-17
## 57                 Nate 2017-10-04 2017         Oct 4-9
## 58                Maria 2017-09-16 2017       Sep 16-30
## 59                 Jose 2017-09-05 2017        Sep 5-25
## 60                 Irma 2017-08-30 2017 Aug 30 - Sep 12
## 61               Harvey 2017-08-17 2017  Aug 17 - Sep 1
## 62                  Mar 2017-03-13 2017  Mar 13- Apr 13
## 63                 Otto 2016-11-20 2016       Nov 20-26
## 64              Matthew 2016-09-28 2016  Sep 28 - Oct 9
## 65              Hermine 2016-08-28 2016  Aug 28 - Sep 8
## 66                Colin 2016-06-05 2016         Jun 5-7
## 67                  Jan 2016-01-23 2016       Jan 23-24
## 68             Patricia 2015-10-20 2015       Oct 20-24
## 69              Joaquin 2015-09-28 2015  Sep 28 - Oct 8
## 70                Erika 2015-08-24 2015       Aug 24-28
## 71             Chan-hom 2015-06-29 2015 Jun 29 - Jul 13
## 72              Dolphin 2015-05-06 2015        May 6-24
## 73                  Jan 2015-01-26 2015       Jan 26-28
## 74                Odile 2014-09-10 2014       Sep 10-19
## 75               Arthur 2014-07-01 2014         Jul 1-9
## 76            Francisco 2013-10-15 2013       Oct 15-26
## 77               Manuel 2013-09-13 2013       Sep 13-20
## 78               Ingrid 2013-09-12 2013       Sep 12-17
## 79                Sandy 2012-10-22 2012       Oct 22-29
## 80                Isaac 2012-08-21 2012  Aug 21 - Sep 3
## 81                  Nov 2011-11-04 2011        Nov 4-11
## 82                  Lee 2011-09-02 2011         Sep 2-7
## 83                Irene 2011-08-21 2011       Aug 21-30
##                               Storm.Tide                 Obs
## 1  Obs w4: >9.3 mhhw Fcst w6: 15-20 mhhw               (FEV)
## 2                             w2: 5 mhhw                    
## 3                             w1: 3 mhhw                    
## 4                             w1: 3 mhhw                    
## 5                           w3: 6.3 mhhw                    
## 6                           w2: 5.1 mhhw               (FEV)
## 7                             w3: 8 mhhw                    
## 8                             w1: 3 mhhw                    
## 9                                                           
## 10                                                          
## 11                           w2: 3-5 agl           (TCR,FEV)
## 12                           w1: 1-2 agl           (TCR,FEV)
## 13                          w4: 8-12 agl           (TCR,FEV)
## 14                                                          
## 15                                                          
## 16                                                          
## 17                                                          
## 18                                                          
## 19                                                          
## 20                             w2: 5 agl               (TCR)
## 21                         w5: 10-15 agl           (TCR,FEV)
## 22                           w1: 1-3 agl               (TCR)
## 23                          w5: 12.5 agl               (FEV)
## 24                                                     (FEV)
## 25                           w2: 3-6 agl               (TCR)
## 26                          w5: 9-14 agl (TCR,FEV) (Service)
## 27                           w2: 2-4 agl           (TCR,FEV)
## 28                           w2: 2-4 agl               (TCR)
## 29                           w1: 1-3 agl               (TCR)
## 30                                                          
## 31                             w1: 3 agl               (TCR)
## 32                          w4: 6-10 agl               (TCR)
## 33                           w3: 6-9 agl               (TCR)
## 34                           w2: 4.2 agl               (TCR)
## 35                           w3: 5-7 agl               (TCR)
## 36                         w6: 12-18 agl               (TCR)
## 37                           w1: 1-2 agl               (TCR)
## 38                           w2: 3-6 agl               (TCR)
## 39                                                          
## 40                           w2: 3-5 agl               (TCR)
## 41                           w3: 6.2 agl               (TCR)
## 42                           w1: 1-2 agl               (TCR)
## 43                           w3: 5-7 agl               (TCR)
## 44                           w2: 4-6 agl               (TCR)
## 45                                                          
## 46                          w5: 9-14 agl (TCR,FEV) (Service)
## 47                                                          
## 48                                                     (FEV)
## 49                           w1: 1-3 agl               (TCR)
## 50                          w4: 8-11 agl (TCR,FEV) (Service)
## 51                                                     (FEV)
## 52                                                          
## 53                           w1: 1-3 agl               (TCR)
## 54                                                     (FEV)
## 55                                                          
## 56                                                     (FEV)
## 57                           w3: 6-9 agl           (TCR,FEV)
## 58                           w3: 6-9 agl           (TCR,FEV)
## 59                           w1: < 3 agl           (TCR,FEV)
## 60                           w3: 5-8 agl           (TCR,FEV)
## 61                          w4: 6-10 agl   (TCR,FEV) Service
## 62                                                     (FEV)
## 63                                                          
## 64                           w3: 5-7 agl   (TCR,FEV) Service
## 65                           w3: 4-7 agl           (TCR,FEV)
## 66                           w1: < 3 agl               (TCR)
## 67                          w2: 3.5 mhhw             Service
## 68                                                          
## 69                           w2: 3-4 agl           (TCR,FEV)
## 70                                                          
## 71                                                          
## 72                                                          
## 73                          w2: 3.4 mhhw               (FEV)
## 74                                                          
## 75                           w2: 3-5 agl               (TCR)
## 76                                                          
## 77                                                          
## 78                                                          
## 79                           w3: 4-9 agl   (TCR,FEV) Service
## 80                         w6: 10-17 agl               (TCR)
## 81                                                          
## 82                              w2: 4.76           (TCR,FEV)
## 83                              w4: 8-11   (TCR,FEV) Service
##                  Guidance Category Pressure Dead      Cost
## 1           Warn-A6 (1-8)     Cat4      938   NA    $27.5+
## 2          Warn-A5 (1-15)     Cat2      972    0      $1.5
## 3            PS-Adv 43,46     Cat4      949   NA         -
## 4  PS-Adv 2-7, 9-11,13,14     Cat1      988   NA         -
## 5                  P-ETSS     Cat4      947   NA         -
## 6          Warn-A5 (6-20)     Cat1      979   10       >$1
## 7        Warn-A33 (29-41)     Cat5      934   70    >$6.86
## 8                               TS      993    5    $0.179
## 9                             Cat5      922   NA       $16
## 10                         Typhoon      903    0         -
## 11                Warn-A4       TS      981    0    $0.450
## 12      Watch-A34 (34-37)     Cat5      926    3    $0.050
## 13        Warn-A15 (5-20)     Cat4      940    7      $2.5
## 14                            Cat4      939    2    $0.690
## 15                            Cat4      939   NA         -
## 16                            Cat3      955   NA         -
## 17                            Cat5      897    6    $0.250
## 18                 P-ETSS     <NA>       NA   NA      <NA>
## 19                 P-ETSS     <NA>       NA   NA      <NA>
## 20               Warn-A12     Cat1      980   11        $1
## 21       Warn-A18 (12-33)     Cat5      937  161      $113
## 22                            Cat4      931   29      $3.4
## 23                 P-ETSS     Cat1      965   NA         -
## 24   P-ETSS River Forcing     <NA>       NA   NA      <NA>
## 25            Warn-A5 (2)     Cat1      988    2      $1.1
## 26         Warn-A7 (3-18)     Cat4      929  107     $75.3
## 27       Warn-A22 (18-27)     Cat1      986    2      $0.7
## 28       Warn-A28 (24-29)       TS      991    7      $1.3
## 29               Warn-A28     Cat1      991   13      $1.2
## 30                            Cat4      917   84      $1.4
## 31       Warn-A33 (31-35)     Cat4      922  175      $8.3
## 32        Warn-A11 (9-18)     Cat3      970    9      $4.4
## 33       Warn-A12 (12-22)     Cat4      953    6      $3.0
## 34            Warn-A9 (-)       TS      993    1    $0.225
## 35         Warn-A8 (5-21)     Cat2      965    4      $7.3
## 36       Warn-A23 (19-30)     Cat4      937   81     $23.3
## 37       Warn-A13 (10-16)     Cat1      991    0    $0.035
## 38       Warn-A23 (15-30)     Cat1      986   17     $5.03
## 39                            Cat4      954   NA         -
## 40        Warn-A11 (8-13)     Cat1      973    9      $1.2
## 41           Warn-A17 (-)       TS      988   46    $0.865
## 42                              TS     1003    7      $4.8
## 43       WarnA-35 (33-53)     Cat5      910   84      $5.1
## 44         WarnA-5 (1-16)     Cat1      993    3      $0.9
## 45                         Typhoon      895   NA         -
## 46        Warn-A10 (7-17)     Cat5      919   74     $25.5
## 47                         Typhoon      868  134     $3.77
## 48                            Cat1      995   NA      <NA>
## 49         Warn-A5 (3-11)       TS      996    3      $0.2
## 50       Warn-A50 (48-65)     Cat4      937   54     $24.2
## 51                            Cat5      926    1     $0.25
## 52                         Typhoon      910    2    $0.637
## 53           Watch-A2 (1)       TS      990   18    $0.125
## 54                 P-ETSS     <NA>       NA   NA      <NA>
## 55                            Cat4      927    1    $0.253
## 56                 P-ETSS     <NA>       NA   NA      <NA>
## 57         Warn-A9 (7-16)     Cat1      981   48    $0.787
## 58       Warn-A42 (35-46)     Cat5      908    3       059
## 59                 P-ETSS     Cat4      938    1  $0.00284
## 60       Warn-A36 (34-51)     Cat5      914   52     $77.2
## 61       Warn-A25 (23-42)     Cat4      937  107      $125
## 62                 P-ETSS     <NA>       NA   NA      <NA>
## 63                            Cat3      975   23    $0.192
## 64        P-Surge (26-44)     Cat5      934  731     $16.5
## 65        P-Surge (11-24)     Cat1      981    4     $0.55
## 66        P-Surge (26-40)       TS     1001    6    $0.001
## 67               ETSS-2.0       ET      983   55      $0.5
## 68                            Cat5      872    8    $0.463
## 69               ETSS-2.0     Cat4      931   34      $0.2
## 70                              TS     1001   35    $0.511
## 71                         Typhoon      933   18     $1.59
## 72                         Typhoon      918    1    $0.014
## 73               ETSS-2.0     <NA>       NA   NA      <NA>
## 74                            Cat4      918   18     $1.25
## 75         P-Surge (6-14)     Cat2      973    0    $0.040
## 76                         Typhoon      920    0   $0.0002
## 77                            Cat1      983  169      $4.2
## 78                            Cat1      983   32      $1.5
## 79        P-Surge (21-31)     Cat3      940  254     $68.7
## 80        P-Surge (16-38)     Cat1      965   41      $3.1
## 81                   ETSS       ET      943    1    $0.024
## 82                              TS      986   18      $2.8
## 83        P-Surge (20-34)     Cat3      942   58     $14.2
##                               Area retiredName OceanicBasin MHHW   AGL maxAGL
## 1  N.W. FL, GA, AL, TN, KY, VA, WV        <NA>         <NA> >9.3  <NA>   <NA>
## 2                           LA, MS        <NA>         <NA>    5  <NA>   <NA>
## 3                               HI        <NA>           cp    3  <NA>   <NA>
## 4                               HI        <NA>           cp    3  <NA>   <NA>
## 5                               AK        <NA>         <NA>  6.3  <NA>   <NA>
## 6              N.W. FL, GA, SC, NC        <NA>         <NA>  5.1  <NA>   <NA>
## 7                           TX, LA        <NA>         <NA>    8  <NA>   <NA>
## 8                           TX, LA        <NA>         <NA>    3  <NA>   <NA>
## 9                              MEX         (R)           ep <NA>  <NA>   <NA>
## 10                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 11              NC, VA, MD, NY, DE        <NA>         <NA> <NA>   3-5      5
## 12                     New England        <NA>         <NA> <NA>   1-2      2
## 13                         N.W. FL        <NA>         <NA> <NA>  8-12     12
## 14                              CA        <NA>           ep <NA>  <NA>   <NA>
## 15                              HI         (R)           cp <NA>  <NA>   <NA>
## 16                              HI        <NA>           cp <NA>  <NA>   <NA>
## 17                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 18                                        <NA>         <NA> <NA>  <NA>   <NA>
## 19                                        <NA>         <NA> <NA>  <NA>   <NA>
## 20                          PR, FL        <NA>         <NA> <NA>     5   <NA>
## 21                     S.W. FL, SC         (R)         <NA> <NA> 10-15     15
## 22                              PR         (R)         <NA> <NA>   1-3      3
## 23                              AK        <NA>         <NA> <NA>  12.5   <NA>
## 24                          MD, DE        <NA>         <NA> <NA>  <NA>   <NA>
## 25                           N. TX        <NA>         <NA> <NA>   3-6      6
## 26                     New Orleans         (R)         <NA> <NA>  9-14     14
## 27                     New England        <NA>         <NA> <NA>   2-4      4
## 28                         N.W. FL        <NA>         <NA> <NA>   2-4      4
## 29                      East Coast        <NA>         <NA> <NA>   1-3      3
## 30    NIC, HND, COL, SLV, GTM, PAN         (R)         <NA> <NA>  <NA>   <NA>
## 31                  FL (Tampa Bay)         (R)         <NA> <NA>     3   <NA>
## 32                     New Orleans        <NA>         <NA> <NA>  6-10     10
## 33                              LA        <NA>         <NA> <NA>   6-9      9
## 34                      TX, LA, MS        <NA>         <NA> <NA>   4.2   <NA>
## 35                     New Orleans        <NA>         <NA> <NA>   5-7      7
## 36                              LA         (R)         <NA> <NA> 12-18     18
## 37                     New Orleans        <NA>         <NA> <NA>   1-2      2
## 38                      East Coast        <NA>         <NA> <NA>   3-6      6
## 39                              HI        <NA>           cp <NA>  <NA>   <NA>
## 40                           S. TX        <NA>         <NA> <NA>   3-5      5
## 41                              LA        <NA>         <NA> <NA>   6.2   <NA>
## 42                          TX, LA        <NA>         <NA> <NA>   1-2      2
## 43                          FL, NC         (R)         <NA> <NA>   5-7      7
## 44                              LA        <NA>         <NA> <NA>   4-6      6
## 45                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 46                   US Gulf Coast         (R)         <NA> <NA>  9-14     14
## 47                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 48                     PR - (Rain)        <NA>         <NA> <NA>  <NA>   <NA>
## 49                 MS, AR, FL Keys        <NA>         <NA> <NA>   1-3      3
## 50                      Eastern US         (R)         <NA> <NA>  8-11     11
## 51                              HI        <NA>           cp <NA>  <NA>   <NA>
## 52                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 53                              MS        <NA>         <NA> <NA>   1-3      3
## 54      DE, NJ, NY, CT, RI, VT, ME        <NA>         <NA> <NA>  <NA>   <NA>
## 55                        Am Samoa        <NA>           sp <NA>  <NA>   <NA>
## 56                              MA        <NA>         <NA> <NA>  <NA>   <NA>
## 57                      AL, LA, MS         (R)         <NA> <NA>   6-9      9
## 58                              PR         (R)         <NA> <NA>   6-9      9
## 59                      CT, RI, MA        <NA>         <NA> <NA>   < 3      3
## 60                          FL, PR         (R)         <NA> <NA>   5-8      8
## 61                          TX, LA         (R)         <NA> <NA>  6-10     10
## 62                              NY        <NA>         <NA> <NA>  <NA>   <NA>
## 63                   PAN, CRI, NIC         (R)         <NA> <NA>  <NA>   <NA>
## 64                           SE US         (R)         <NA> <NA>   5-7      7
## 65                         N.W. FL        <NA>         <NA> <NA>   4-7      7
## 66                         N.W. FL        <NA>         <NA> <NA>   < 3      3
## 67               DE, NJ (Cape May)        <NA>         <NA>  3.5  <NA>   <NA>
## 68                         MEX, TX         (R)           ep <NA>  <NA>   <NA>
## 69          VA, BHS, BMU, CUB, HTI         (R)         <NA> <NA>   3-4      4
## 70                        DMA, HTI         (R)         <NA> <NA>  <NA>   <NA>
## 71                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 72                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 73                          CT, MA        <NA>         <NA>  3.4  <NA>   <NA>
## 74                             MEX         (R)           ep <NA>  <NA>   <NA>
## 75                              NC        <NA>         <NA> <NA>   3-5      5
## 76                            Guam        <NA>           wp <NA>  <NA>   <NA>
## 77                             MEX         (R)           ep <NA>  <NA>   <NA>
## 78                             MEX         (R)         <NA> <NA>  <NA>   <NA>
## 79             Mid-Atlantic, NE US         (R)         <NA> <NA>   4-9      9
## 80                              LA        <NA>         <NA> <NA> 10-17     17
## 81                              AK        <NA>         <NA> <NA>  <NA>   <NA>
## 82                          LA, MS        <NA>         <NA> <NA>  <NA>   <NA>
## 83             Mid-Atlantic, NE US         (R)         <NA> <NA>  <NA>   <NA>
##     MinCost
## 1  2.75e+10
## 2  1.50e+09
## 3        NA
## 4        NA
## 5        NA
## 6  1.00e+09
## 7  6.86e+09
## 8  1.79e+08
## 9  1.60e+10
## 10       NA
## 11 4.50e+08
## 12 5.00e+07
## 13 2.50e+09
## 14 6.90e+08
## 15       NA
## 16       NA
## 17 2.50e+08
## 18       NA
## 19       NA
## 20 1.00e+09
## 21 1.13e+11
## 22 3.40e+09
## 23       NA
## 24       NA
## 25 1.10e+09
## 26 7.53e+10
## 27 7.00e+08
## 28 1.30e+09
## 29 1.20e+09
## 30 1.40e+09
## 31 8.30e+09
## 32 4.40e+09
## 33 3.00e+09
## 34 2.25e+08
## 35 7.30e+09
## 36 2.33e+10
## 37 3.50e+07
## 38 5.03e+09
## 39       NA
## 40 1.20e+09
## 41 8.65e+08
## 42 4.80e+09
## 43 5.10e+09
## 44 9.00e+08
## 45       NA
## 46 2.55e+10
## 47 3.77e+09
## 48       NA
## 49 2.00e+08
## 50 2.42e+10
## 51 2.50e+08
## 52 6.37e+08
## 53 1.25e+08
## 54       NA
## 55 2.53e+08
## 56       NA
## 57 7.87e+08
## 58       NA
## 59 2.84e+06
## 60 7.72e+10
## 61 1.25e+11
## 62       NA
## 63 1.92e+08
## 64 1.65e+10
## 65 5.50e+08
## 66 1.00e+06
## 67 5.00e+08
## 68 4.63e+08
## 69 2.00e+08
## 70 5.11e+08
## 71 1.59e+09
## 72 1.40e+07
## 73       NA
## 74 1.25e+09
## 75 4.00e+07
## 76 2.00e+05
## 77 4.20e+09
## 78 1.50e+09
## 79 6.87e+10
## 80 3.10e+09
## 81 2.40e+07
## 82 2.80e+09
## 83 1.42e+10

Separate out Areas

stormdata <- stormdata %>%
  separate_rows(Area,sep = ", ") 

Remove columns not used in analysis and rows with NA values in important categories

Analyze_storm_data <- stormdata %>%
  select(Storm, Area, startDate, year, Category, Pressure, Dead, maxAGL, MinCost) %>%
  drop_na(maxAGL)

print(Analyze_storm_data,n=100)
## # A tibble: 56 × 9
##    Storm    Area        startDate  year  Category Pressure  Dead maxAGL  MinCost
##    <chr>    <chr>       <date>     <chr> <chr>       <dbl> <int> <chr>     <dbl>
##  1 Ophelia  NC          2023-09-22 2023  TS            981     0 5       4.50e 8
##  2 Ophelia  VA          2023-09-22 2023  TS            981     0 5       4.50e 8
##  3 Ophelia  MD          2023-09-22 2023  TS            981     0 5       4.50e 8
##  4 Ophelia  NY          2023-09-22 2023  TS            981     0 5       4.50e 8
##  5 Ophelia  DE          2023-09-22 2023  TS            981     0 5       4.50e 8
##  6 Lee      New England 2023-09-13 2023  Cat5          926     3 2       5   e 7
##  7 Idalia   N.W. FL     2023-08-27 2023  Cat4          940     7 12      2.5 e 9
##  8 Ian      S.W. FL     2022-09-26 2022  Cat5          937   161 15      1.13e11
##  9 Ian      SC          2022-09-26 2022  Cat5          937   161 15      1.13e11
## 10 Fiona    PR          2022-09-14 2022  Cat4          931    29 3       3.4 e 9
## 11 Nicholas N. TX       2021-09-12 2021  Cat1          988     2 6       1.10e 9
## 12 Ida      New Orleans 2021-08-27 2021  Cat4          929   107 14      7.53e10
## 13 Henri    New England 2021-08-20 2021  Cat1          986     2 4       7   e 8
## 14 Fred     N.W. FL     2021-08-15 2021  TS            991     7 4       1.3 e 9
## 15 Elsa     East Coast  2021-06-30 2021  Cat1          991    13 3       1.20e 9
## 16 Zeta     New Orleans 2020-10-26 2020  Cat3          970     9 10      4.40e 9
## 17 Delta    LA          2020-10-07 2020  Cat4          953     6 9       3   e 9
## 18 Sally    New Orleans 2020-09-12 2020  Cat2          965     4 7       7.30e 9
## 19 Laura    LA          2020-08-24 2020  Cat4          937    81 18      2.33e10
## 20 Marco    New Orleans 2020-08-22 2020  Cat1          991     0 2       3.5 e 7
## 21 Isaias   East Coast  2020-07-31 2020  Cat1          986    17 6       5.03e 9
## 22 Hanna    S. TX       2020-07-24 2020  Cat1          973     9 5       1.20e 9
## 23 Imelda   TX          2019-09-17 2019  TS           1003     7 2       4.80e 9
## 24 Imelda   LA          2019-09-17 2019  TS           1003     7 2       4.80e 9
## 25 Dorian   FL          2019-09-01 2019  Cat5          910    84 7       5.10e 9
## 26 Dorian   NC          2019-09-01 2019  Cat5          910    84 7       5.10e 9
## 27 Barry    LA          2019-07-10 2019  Cat1          993     3 6       9   e 8
## 28 Michael  US Gulf Co… 2018-10-07 2018  Cat5          919    74 14      2.55e10
## 29 Gordon   MS          2018-09-03 2018  TS            996     3 3       2   e 8
## 30 Gordon   AR          2018-09-03 2018  TS            996     3 3       2   e 8
## 31 Gordon   FL Keys     2018-09-03 2018  TS            996     3 3       2   e 8
## 32 Florence Eastern US  2018-08-31 2018  Cat4          937    54 11      2.42e10
## 33 Alberto  MS          2018-05-25 2018  TS            990    18 3       1.25e 8
## 34 Nate     AL          2017-10-04 2017  Cat1          981    48 9       7.87e 8
## 35 Nate     LA          2017-10-04 2017  Cat1          981    48 9       7.87e 8
## 36 Nate     MS          2017-10-04 2017  Cat1          981    48 9       7.87e 8
## 37 Maria    PR          2017-09-16 2017  Cat5          908     3 9      NA      
## 38 Jose     CT          2017-09-05 2017  Cat4          938     1 3       2.84e 6
## 39 Jose     RI          2017-09-05 2017  Cat4          938     1 3       2.84e 6
## 40 Jose     MA          2017-09-05 2017  Cat4          938     1 3       2.84e 6
## 41 Irma     FL          2017-08-30 2017  Cat5          914    52 8       7.72e10
## 42 Irma     PR          2017-08-30 2017  Cat5          914    52 8       7.72e10
## 43 Harvey   TX          2017-08-17 2017  Cat4          937   107 10      1.25e11
## 44 Harvey   LA          2017-08-17 2017  Cat4          937   107 10      1.25e11
## 45 Matthew  SE US       2016-09-28 2016  Cat5          934   731 7       1.65e10
## 46 Hermine  N.W. FL     2016-08-28 2016  Cat1          981     4 7       5.50e 8
## 47 Colin    N.W. FL     2016-06-05 2016  TS           1001     6 3       1   e 6
## 48 Joaquin  VA          2015-09-28 2015  Cat4          931    34 4       2   e 8
## 49 Joaquin  BHS         2015-09-28 2015  Cat4          931    34 4       2   e 8
## 50 Joaquin  BMU         2015-09-28 2015  Cat4          931    34 4       2   e 8
## 51 Joaquin  CUB         2015-09-28 2015  Cat4          931    34 4       2   e 8
## 52 Joaquin  HTI         2015-09-28 2015  Cat4          931    34 4       2   e 8
## 53 Arthur   NC          2014-07-01 2014  Cat2          973     0 5       4   e 7
## 54 Sandy    Mid-Atlant… 2012-10-22 2012  Cat3          940   254 9       6.87e10
## 55 Sandy    NE US       2012-10-22 2012  Cat3          940   254 9       6.87e10
## 56 Isaac    LA          2012-08-21 2012  Cat1          965    41 17      3.10e 9

Perform Relational Analysis

After graphing various relationships between time, maxAGL, Pressure, Dead, and MinCost, most data appears not to exhibit any obvious trends. The exception is the number Dead graphed against Minimum Cost, where it seems to have a positively correlated relationship (costs increase as number dead increases).

Analyze_storm_data %>% ggplot(aes(x = maxAGL,y = MinCost)) + geom_point()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).

Analyze_storm_data %>% ggplot(aes(x = maxAGL,y = Dead)) + geom_point()

Analyze_storm_data %>% ggplot(aes(x = Dead,y = MinCost)) + geom_point()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).

Analyze_storm_data %>% ggplot(aes(x = maxAGL,y = Pressure)) + geom_point()

Analyze_storm_data %>% ggplot(aes(x = startDate,y = maxAGL)) + geom_point()

Analyze_storm_data %>% ggplot(aes(x = startDate,y = Pressure)) + geom_point()

Analyze_storm_data %>% ggplot(aes(x = startDate,y = MinCost)) + geom_point()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).

Perform Categoric Analysis

We can look at number of named hurricanes per geographic area with the following bar chart. However, next steps for data cleaning could be performing a better geographic categorization of the areas, as the format is inconsistent (e.g. format is shown in region of a state, state, region of the US, and even other countries outside of the US). Instead, only a standardized format should be used, like State. This would mean that other non-US countries should be excluded from this dataset, and an understanding of the larger categories (e.g. New England) would have to be developed.

count_area <- Analyze_storm_data %>%
  group_by(Area) %>%
  summarise(count = n())

count_area %>% ggplot(aes(x = count, y = reorder(Area,count))) +  
  geom_col() + 
  labs(title = "Number of Named Storms with Measureable Storm Surge since 2011", 
       y = "Area")

Buffalo Snow Dataset: Create a CSV file that includes all of the information

For my second dataset, I used the Buffalo Monthly Snowfall data from NCAA. This dataset monthly snowfall from 1940 to current day. To create the CSV file, I used BeautifulSoup to scrape the data from the NCAA website directly.

import re
from bs4 import BeautifulSoup
import requests
import pandas as pd
import numpy as np

full_url = "https://www.weather.gov/buf/BuffaloSnow"
print(full_url)
page = requests.get(full_url, headers={"User-Agent": "Mozilla/5.0 (X11; CrOS x86_64 12871.102.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.141 Safari/537.36"}).text
soup = BeautifulSoup(page, "html.parser")

tags = []
index = 0

def returnTrue(tag):
    return tag.font;

allData = soup.find_all("tr")

for i in allData:
    tags.insert(len(tags),i.get_text().strip())

df = pd.DataFrame(columns = ['SEASON','JUL','AUG','SEP','OCT','NOV','DEC','JAN','FEB','MAR','APR','MAY','JUN','ANNUAL'])

for element in tags:
    row = element.strip().split('\n')
    #print(row)
    if len(row) == 14:
        df.loc[len(df)] = row
 

df.to_csv("buffalosnow.csv",sep =';',index=False)

Approach

This dataset appears to be cleaner than the storm dataset. Some potentially problematic areas would be the use of T to indicate “Trace” and the repeated headers located in the data.

Load Data

buffalosnow <- read.csv("C:\\Users\\Kim\\Documents\\GitHub\\Data607\\Project 2\\buffalosnow.csv", header = TRUE, sep = ";")
buffalosnow
##     SEASON JUL AUG SEP  OCT  NOV  DEC  JAN  FEB  MAR  APR MAY JUN ANNUAL
## 1   SEASON JUL AUG SEP  OCT  NOV  DEC  JAN  FEB  MAR  APR MAY JUN ANNUAL
## 2  1940-41   0   0   0    T 17.5 12.1 17.3 23.1  9.3    T   0   0   79.3
## 3  1941-42   0   0   0    T  5.0  7.8 31.0 28.0 13.7  4.1   0   0   89.6
## 4  1942-43   0   0   0    T  8.7 26.7 16.9 17.7 10.4  5.1   T   0   85.5
## 5  1943-44   0   0   0  1.5 13.6  1.7  3.4 24.6 10.5  2.7   0   0   58.0
## 6  1944-45   0   0   0    0  3.9 35.1 50.6 23.3  5.8    T 2.0   0  120.7
## 7  1945-46   0   0   0    T 25.2 51.1 10.7 23.5    T    T   0   0  110.5
## 8  1946-47   0   0   0    0    T 11.9 13.0 22.2 13.5  4.0 0.8   0   65.4
## 9  1947-48   0   0   0    0  9.9  4.3 16.7  7.0  4.2    T   T   0   42.1
## 10 1948-49   0   0   0    T  1.3  7.0 11.8  5.2 14.3  0.5   0   0   40.1
## 11 1949-50   0   0   0    0 28.6  9.5 14.8 19.3 13.7  2.8   0   0   88.7
## 12  SEASON JUL AUG SEP  OCT  NOV  DEC  JAN  FEB  MAR  APR MAY JUN ANNUAL
## 13 1950-51   0   0   T    0 15.2 22.3 13.7 12.3  7.9    T   0   0   71.4
## 14 1951-52   0   0   0    T 17.7 31.6 16.8  8.9  7.4  0.6   0   0   83.0
## 15 1952-53   0   0   0  0.4 11.2 16.1  8.0 11.9  5.5  2.8   0   0   55.9
## 16 1953-54   0   0   0    0 22.7 10.9 16.2  2.6 26.3  1.2   T   0   89.9
## 17 1954-55   0   0   0  1.0  0.7 31.2 28.5 12.0 11.1  0.3   0   0   84.8
## 18 1955-56   0   0   0    T 26.0 20.3 17.7 14.3 23.1  3.8   T   0  105.2
## 19 1956-57   0   0   T    0 16.2 35.6 36.6  3.1 10.2 12.0   0   0  113.7
## 20 1957-58   0   0   0    T 19.2  4.8 31.1 54.2 11.2  4.2   0   0  124.7
## 21 1958-59   0   0   0    0 13.5 19.0 38.4 14.0 29.2  0.4   T   0  114.5
## 22 1959-60   0   0   0    T 12.2 14.2 18.3 49.5 19.5  1.9   T   0  115.6
## 23  SEASON JUL AUG SEP  OCT  NOV  DEC  JAN  FEB  MAR  APR MAY JUN ANNUAL
## 24 1960-61   0   0   0  1.0 15.9 31.9 23.5 11.9  5.1 13.1   T   0  102.4
## 25 1961-62   0   0   0    T  5.6 30.2 26.2 28.2  6.7  4.5   0   0  101.4
## 26 1962-63   0   0   0  2.0  2.5 30.2 31.5 15.5  7.7  0.3 0.1   0   89.8
## 27 1963-64   0   0   0    0  3.1 24.0 13.7 14.6 12.8  3.3   0   0   71.5
## 28 1964-65   0   0   0    T  5.4 15.2 19.2  9.4 17.5  4.2   0   0   70.9
## 29 1965-66   0   0   0  1.2 12.2  7.0 48.0 15.2 11.4  3.2 0.1   0   98.3
## 30 1966-67   0   0   0    0 10.0 12.1 11.6 19.8 10.8  0.6 1.2   0   66.1
## 31 1967-68   0   0   0    T 19.7 10.4 19.1 11.7 10.6  0.1   0   0   71.6
## 32 1968-69   0   0   0    T 11.6 11.7 31.2 12.8  8.0  3.1   0   0   78.4
## 33 1969-70   0   0   0  1.0 22.1 23.4 38.0 21.9 12.6  1.5   T   0  120.5
## 34  SEASON JUL AUG SEP  OCT  NOV  DEC  JAN  FEB  MAR  APR MAY JUN ANNUAL
## 35 1970-71   0   0   0    0  2.6 32.3 17.2 19.4 22.6  2.9   0   0   97.0
## 36 1971-72   0   0   0    0 18.7 12.9 27.6 31.4 14.1  5.2   0   0  109.9
## 37 1972-73   0   0   0  3.1 18.9 19.8  9.9 16.1  8.5  2.4 0.1   0   78.8
## 38 1973-74   0   0   0    0  3.0 23.1 19.7 22.8 12.9  7.1 0.1   0   88.7
## 39 1974-75   0   0   0    T 22.1 23.6 11.0 16.3  7.6 15.0   0   0   95.6
## 40 1975-76   0   0   0    T  5.5 27.3 21.6  8.3 17.3 2,.5   T   0   82.5
## 41 1976-77   0   0   0  0.2 31.3 60.7 68.3 22.7 13.5  2.2 0.5   0  199.4
## 42 1977-78   0   0   0    T 15.0 53.4 56.5 21.7  5.8  1.8 0.1   0  154.3
## 43 1978-79   0   0   0    T  3.0 10.1 42.6 28.3  4.6  8.7   0   0   97.3
## 44 1979-80   0   0   0    T 12.6 19.7 10.2 11.7 13.9  0.3   T   0   68.4
## 45  SEASON JUL AUG SEP  OCT  NOV  DEC  JAN  FEB  MAR  APR MAY JUN ANNUAL
## 46 1980-81   0   0   0    T  6.7 21.6 14.4  5.0 13.2    T   0   0   60.9
## 47 1981-82   0   0   0    T  1.8 24.8 53.2 12.7  9.0 10.9   0   0  112.4
## 48 1982-83   0   0   0    0 15.8 12.9  9.0  5.5  6.9  2.3   T   0   52.4
## 49 1983-84   0   0   0    T 17.7 52.0 13.4 32.5 16.0  0.9   T   0  132.5
## 50 1984-85   0   0   0    0  1.4 11.2 65.9 20.9  6.3  1.5   0   0  107.2
## 51 1985-86   0   0   0    0  5.2 68.4 17.3 17.3  4.8  1.7   T   0  114.7
## 52 1986-87   0   0   0    0 13.7  4.8 28.5  7.7 10.8  2.0   0   0   67.5
## 53 1987-88   0   0   0    T  0.9  9.8  6.9 31.9  6.1  0.8   0   0   56.4
## 54 1988-89   0   0   0  0.5  0.6 10.8  5.4 29.6 10.1  2.5 7.9   0   67.4
## 55 1989-90   0   0   0    T  7.8 34.8 11.8 28.0  1.4  9.9   T   0   93.7
## 56  SEASON JUL AUG SEP  OCT  NOV  DEC  JAN  FEB  MAR  APR MAY JUN ANNUAL
## 57 1990-91   0   0   0    T  0.7 15.4 16.6 16.1  8.5  0.2   T   0   57.5
## 58 1991-92   0   0   0  0.2 18.0 21.4 18.4  7.0 22.8  5.0   0   0   92.8
## 59 1992-93   0   0   0  0.6 13.7 16.5 13.1 19.5 29.3  0.5   0   0   93.2
## 60 1993-94   0   0   0  2.9  4.8 27.9 35.4 21.6 13.2  6.9   0   0  112.7
## 61 1994-95   0   0   0    0  0.9  7.8 23.1 34.6  4.3  3.9   0   0   74.6
## 62 1995-96   0   0   0    0 15.7 61.2 25.3 11.9 24.1  3.2   T   0  141.4
## 63 1996-97   0   0   0    0 11.5 18.9 42.4  9.3 13.4  2.1   0   0   97.6
## 64 1997-98   0   0   0  0.2 16.5 18.2 13.6  1.8 25.3    T   0   0   75.6
## 65 1998-99   0   0   0    0  0.2 11.6 65.1  6.9 15.8  0.9   0   0  100.5
## 66 1999-00   0   0   0    0  0.9 12.7 19.4 16.2 10.7  3.7   0   0   63.6
## 67  SEASON JUL AUG SEP  OCT  NOV  DEC  JAN  FEB  MAR  APR MAY JUN ANNUAL
## 68 2000-01   0   0   0    T 45.6 50.3 19.6  9.8 32.8  0.6   0   0  158.7
## 69 2001-02   0   0   0  0.4    0 82.7 13.7 17.2 15.9  2.5   T   0  132.4
## 70 2002-03   0   0   0    T  8.9 35.8 37.4 19.5  6.6  3.1   0   0  111.3
## 71 2003-04   0   0   0    T  4.2 21.6 45.2  5.9 20.7  3.3   0   0  100.9
## 72 2004-05   0   0   0    T  0.2 22.8 37.0 22.3 17.5  9.3   0   0  109.1
## 73 2005-06   0   0   0    0 17.9 20.3  7.1 26.3  6.5  0.1   0   0   78.2
## 74 2006-07   0   0   0 22.6  2.1  7.5 15.5 33.5  5.4  2.3   0 0.0   88.9
## 75 2007-08 0.0 0.0 0.0  0.0  3.4 31.3 17.5 22.5 29.1    T 0.0 0.0  103.8
## 76 2008-09 0.0 0.0 0.0  0.1  6.2 49.2 30.6 11.6  0.5  2.0 0.0 0.0  100.2
## 77 2009-10 0.0 0.0 0.0    T    T 25.1 35.6 13.4    T    T   T 0.0   74.1
## 78  SEASON JUL AUG SEP  OCT  NOV  DEC  JAN  FEB  MAR  APR MAY JUN ANNUAL
## 79 2010-11 0.0 0.0 0.0    T  1.6 32.0 29.8 34.1 13.3  1.0 0.0 0.0  111.8
## 80 2011-12 0.0 0.0 0.0    T    T  3.8 20.8  9.6  1.5  1.0 0.0 0.0   36.7
## 81 2012-13 0.0 0.0 0.0  0.0  2.4 19.6  6.2 20.5  9.8  0.3   T 0.0   58.8
## 82 2013-14 0.0 0.0 0.0    T 10.0 28.5 42.4 23.3 24.5  1.2 0.0 0.0  129.9
## 83 2014-15 0.0 0.0 0.0    T 20.5 11.3 25.0 46.2  9.6  0.3 0.0 0.0  112.9
## 84 2015-16 0.0 0.0 0.0    T    T  1.0 25.6 20.5  0.9  7.1   T 0.0   55.1
## 85 2016-17 0.0 0.0 0.0    T  0.9 30.1 11.0  5.0 26.8  2.3 0.0 0.0   76.1
## 86 2017-18 0.0 0.0 0.0  0.1  0.1 33.7 31.2 18.1 22.2  6.9 0.0 0.0  112.3
## 87 2018-19 0.0 0.0 0.0  0.1 13.4 14.1 63.4 16.8 10.5  0.5 0.0 0.0  118.8
## 88 2019-20 0.0 0.0 0.0  0.0 12.6 16.1 13.2 20.6  2.1  4.3 0.3 0.0   69.2
## 89  SEASON JUL AUG SEP  OCT  NOV  DEC  JAN  FEB  MAR  APR MAY JUN ANNUAL
## 90 2020-21 0.0 0.0 0.0    T  0.8 33.0 10.2 27.5  0.6  5.1   T 0.0   77.2
## 91 2021-22 0.0 0.0 0.0  0.0  5.0  5.7 52.1 23.1  9.3  2.2 0.0 0.0   97.4
## 92 2022-23 0.0 0.0 0.0    T 36.9 64.7 14.7  7.3 10.0    T 0.0 0.0  133.6
## 93 2023-24 0.0 0.0 0.0  0.4  1.9  5.7 45.6  6.0 11.2  0.5 0.0       71.3

Remove Repeated Headers

buffalosnow <- buffalosnow %>% filter(SEASON != "SEASON")

Replacements

Since T indicates Trace, this is effectively an unmeasureable amount of snowfall. As such, T will be replaced by 0 in the data. There is also one value listed as 2,.5 due to a data entry error (April of the 75-76 season) - this was replaced by 2.5 based on the annual snowfall for that year minus the sum of snowfall for the other months. NA values were dropped as the scraped table included future months.

buffalosnow <- buffalosnow %>%
  mutate(across(everything(), ~ str_replace_all(.,"[T]","0"))) %>%
  mutate(across(everything(), ~ str_replace_all(.,"\\s","0"))) %>%
  drop_na()


buffalosnow$APR[36] <- 2.5


buffalosnow <- buffalosnow %>%
pivot_longer(c(JUL,AUG,SEP,OCT,NOV,DEC,JAN,FEB,MAR,APR,MAY,JUN,ANNUAL), names_to = "TimeCategory", values_to = "Snowfall") %>%
  mutate(Snowfall = as.numeric(Snowfall))

buffalosnow
## # A tibble: 1,092 × 3
##    SEASON  TimeCategory Snowfall
##    <chr>   <chr>           <dbl>
##  1 1940-41 JUL               0  
##  2 1940-41 AUG               0  
##  3 1940-41 SEP               0  
##  4 1940-41 OCT               0  
##  5 1940-41 NOV              17.5
##  6 1940-41 DEC              12.1
##  7 1940-41 JAN              17.3
##  8 1940-41 FEB              23.1
##  9 1940-41 MAR               9.3
## 10 1940-41 APR               0  
## # ℹ 1,082 more rows

Perform Categoric Analysis

buffalosnow %>% filter(TimeCategory != "ANNUAL") %>% 
  group_by(TimeCategory) %>% 
  summarise(averageSnowfall = mean(Snowfall)) %>% 
  ggplot(aes(x = reorder(TimeCategory, -averageSnowfall), y = averageSnowfall)) + 
  geom_bar(stat = "identity") + 
  labs(x = "Month", y = "Average Snowfall in inches", 
       title = "Months by Average Snowfall in Buffalo")

Conclusion

This project provided a lot of practice using regular expressions. Most of the data cleaning involved regex to extract data from columns that contained more than one piece of information. I also had to remove and replace many data values, such as newline characters and extraneous symbols. The NOAA historic storm dataset in particular was difficult to clean given the extent of untidiness. There were also some parts of the data that were difficult to clean due to the lack of context provided in the dataset and documentation. One example is the conversion of MHHW to AGL, which did not seem like an easy task based on the information I found on Vertical Datums. Moreover, the inconsistently formatted data had proved to be a large obstacle in data cleaning, as many regex mutations on the data had to be performed due to these inconsistencies. In comparison, the NCAA Buffalo snowfall dataset was much more consistent. For next steps in data cleaning, I would propose to better characterize the geographic data in the historic storms data, as discussed in the section of categorical analysis in the NOAA section. I would also propose an additional step in the Buffalo Snowfall data, which would be to add the “correct” year to each row, as Season appears to be listed in a way that ignores calendar year. Each month could be attributed to the correct year, as the July through December months would correspond to the earlier year listed in the season while January through June would correspond to the later year listed in the season. This could be done by a conditional. “Annual” rows could also be dropped from the dataset and recalculated as necessary, but its existence in the dataset does not provide much value as the annual snowfall can be summed up easily if required; there is no reason it needs to be its own row.