library(tidyverse)
library(sqldf)
For this project, I chose two weather related datasets and resubmitted my Homework 5 assignment. The two weather related datasets were from NOAA and NCAA. To collect this data, I created a web scraper in Python which pulled the data from the websites and formatted it into a dataframe. From there, I am able to pull the data into RStudio and begin the process of data cleaning.
The NOAA dataset contained data of historic storms. It is incredibly messy, with inconsistent format within many columns. Multiple columns have more than one piece of data, and there are also many missing values. The data includes date, the name of the storm (in most cases), the level of surge, various impact variables, and more. The NCAA dataset contains data on monthly snowfall in Buffalo, with data spanning back decades. The scraped data has repeated headers and some “missing” data, indicated by a T which represents trace amounts of snowfall that was not measurable.
For my first dataset, I used the historical storms data from NOAA. This dataset includes historic named storms from the 2010’s to current day. To create the CSV file, I used BeautifulSoup to scrape the data from the NOAA website directly.
import re
from bs4 import BeautifulSoup
import requests
import pandas as pd
import numpy as np
full_url = "https://vlab.noaa.gov/web/mdl/historic-storms"
print(full_url)
page = requests.get(full_url, headers={"User-Agent": "Mozilla/5.0 (X11; CrOS x86_64 12871.102.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.141 Safari/537.36"}).text
soup = BeautifulSoup(page, "html.parser")
pre_tags = []
index = 0
allData = soup.find_all('pre')
for i in allData:
pre_tags.insert(len(pre_tags),i.get_text().strip())
index += 1
num_rows = int(index/7)
num_columns = 7
df = pd.DataFrame(np.array(pre_tags).reshape(num_rows, num_columns))
df.columns = ['Storm','Date','Storm-Tide','Obs','Guidance','Cat,Pres,Dead,$bn','Area']
df.to_csv("stormdata.csv",sep =';',index=False)
Looking at the data, I see there are many rows with newline characters. There are also various columns that can be split into multiple. My approach for cleaning the data will be to remove newlines, split the columns appropriately (Storm, Storm Tide, Cat Pres Dead $bn), transform date into start date, and then convert each Area to its own row.
stormdata <- read.csv("C:\\Users\\Kim\\Documents\\GitHub\\Data607\\Project 2\\stormdata.csv", header = TRUE, sep = ";")
stormdata
## Storm Date
## 1 2024-Helene Sep 23-29
## 2 2024-Francine Sep 12
## 3 2024-Gilma-cp Aug 30
## 4 2024-Hone-cp Aug 25
## 5 2024-Post-\nTyphoon Ampil Aug 22
## 6 2024-Debby Aug 5,6
## 7 2024-Beryl Jul 8
## 8 2024-Alberto Jun 19,20
## 9 2023-Otis-ep (R) Oct 22-25
## 10 2023-Bolaven-wp Oct 6-14
## 11 2023-Ophelia Sep 22-24
## 12 2023-Lee Sep 13,14
## 13 2023-Idalia Aug 27-31
## 14 2023-Hillary-ep Aug 16-22
## 15 2023-Dora-cp (R) Jul 31 -\nAug 21
## 16 2023-Calvin-cp Jul 11-19
## 17 2023-Mawar-wp May 19 -\nJun 3
## 18 2023-Jan28 Jan 28,29
## 19 2023-Jan14 Jan 14-17
## 20 2022-Nicole Nov 7-11
## 21 2022-Ian (R) Sep 26-30
## 22 2022-Fiona (R) Sep 14-27
## 23 2022-Post-\nTyphoon Merbok Sep 14-19
## 24 2021-Oct Oct 28
## 25 2021-Nicholas Sep 12-20
## 26 2021-Ida (R) Aug 27-30
## 27 2021-Henri Aug 20-22
## 28 2021-Fred Aug 15,16
## 29 2021-Elsa Jun 30 -\nJul 10
## 30 2020-Iota (R) Nov 13-18
## 31 2020-Eta (R) Nov 8,9
## 32 2020-Zeta Oct 26-29
## 33 2020-Delta Oct 7-10
## 34 2020-Beta Sep 17-25
## 35 2020-Sally Sep 12-16
## 36 2020-Laura (R) Aug 24-27
## 37 2020-Marco Aug 22-24
## 38 2020-Isaias Jul 31 -\nAug 4
## 39 '20-Douglas-cp Jul 20-30
## 40 2020-Hanna Jul 24-26
## 41 2020-Cristobal Jun 1-12
## 42 2019-Imelda Sep 17-19
## 43 2019-Dorian (R) Sep 1-6
## 44 2019-Barry Jul 10-14
## 45 2019-Wutip-wp Feb 18 -\nMar 2
## 46 2018-Michael (R) Oct 7-11
## 47 '18-Mangkhut-wp Sep 6-17
## 48 2018-Isaac Sep 7-15
## 49 2018-Gordon Sep 3-8
## 50 '18-Florence (R) Aug 31 -\nSep 8
## 51 2018-Lane-cp Aug 15-29
## 52 2018-Maria-wp Jul 3-12
## 53 2018-Alberto May 25 -\nJun 1
## 54 2018-Mar1 Feb 28 -\nMar 1
## 55 2018-Gita-sp Feb 3-22
## 56 2018-Jan17 Jan 3-17
## 57 2017-Nate (R) Oct 4-9
## 58 2017-Maria (R) Sep 16-30
## 59 2017-Jose Sep 5-25
## 60 2017-Irma (R) Aug 30 -\nSep 12
## 61 2017-Harvey (R) Aug 17 -\nSep 1
## 62 2017-Mar Mar 13-\nApr 13
## 63 2016-Otto (R) Nov 20-26
## 64 2016-Matthew (R) Sep 28 -\nOct 9
## 65 2016-Hermine Aug 28 -\nSep 8
## 66 2016-Colin Jun 5-7
## 67 2016-Jan Jan 23-24
## 68 2015-\nPatricia-ep (R) Oct 20-24
## 69 2015-Joaquin (R) Sep 28 -\nOct 8
## 70 2015-Erika (R) Aug 24-28
## 71 2015-Chan-hom-wp Jun 29 -\nJul 13
## 72 2015-Dolphin-wp May 6-24
## 73 2015-Jan Jan 26-28
## 74 2014-Odile-ep (R) Sep 10-19
## 75 2014-Arthur Jul 1-9
## 76 2013-Francisco-wp Oct 15-26
## 77 2013-Manuel-ep (R) Sep 13-20
## 78 2013-Ingrid (R) Sep 12-17
## 79 2012-Sandy (R) Oct 22-29
## 80 2012-Isaac Aug 21 -\nSep 3
## 81 2011-Nov Nov 4-11
## 82 2011-Lee Sep 2-7
## 83 2011-Irene (R) Aug 21-30
## Storm.Tide Obs
## 1 Obs w4: >9.3 mhhw\nFcst w6: 15-20 mhhw (FEV)
## 2 w2: 5 mhhw
## 3 w1: 3 mhhw
## 4 w1: 3 mhhw
## 5 w3: 6.3 mhhw
## 6 w2: 5.1 mhhw (FEV)
## 7 w3: 8 mhhw
## 8 w1: 3 mhhw
## 9
## 10
## 11 w2: 3-5 agl (TCR,FEV)
## 12 w1: 1-2 agl (TCR,FEV)
## 13 w4: 8-12 agl (TCR,FEV)
## 14
## 15
## 16
## 17
## 18
## 19
## 20 w2: 5 agl (TCR)
## 21 w5: 10-15 agl (TCR,FEV)
## 22 w1: 1-3 agl (TCR)
## 23 w5: 12.5 agl (FEV)
## 24 (FEV)
## 25 w2: 3-6 agl (TCR)
## 26 w5: 9-14 agl (TCR,FEV)\n(Service)
## 27 w2: 2-4 agl (TCR,FEV)
## 28 w2: 2-4 agl (TCR)
## 29 w1: 1-3 agl (TCR)
## 30
## 31 w1: 3 agl (TCR)
## 32 w4: 6-10 agl (TCR)
## 33 w3: 6-9 agl (TCR)
## 34 w2: 4.2 agl (TCR)
## 35 w3: 5-7 agl (TCR)
## 36 w6: 12-18 agl (TCR)
## 37 w1: 1-2 agl (TCR)
## 38 w2: 3-6 agl (TCR)
## 39
## 40 w2: 3-5 agl (TCR)
## 41 w3: 6.2 agl (TCR)
## 42 w1: 1-2 agl (TCR)
## 43 w3: 5-7 agl (TCR)
## 44 w2: 4-6 agl (TCR)
## 45
## 46 w5: 9-14 agl (TCR,FEV)\n(Service)
## 47
## 48 (FEV)
## 49 w1: 1-3 agl (TCR)
## 50 w4: 8-11 agl (TCR,FEV)\n(Service)
## 51 (FEV)
## 52
## 53 w1: 1-3 agl (TCR)
## 54 (FEV)
## 55
## 56 (FEV)
## 57 w3: 6-9 agl (TCR,FEV)
## 58 w3: 6-9 agl (TCR,FEV)
## 59 w1: < 3 agl (TCR,FEV)
## 60 w3: 5-8 agl (TCR,FEV)
## 61 w4: 6-10 agl (TCR,FEV)\nService
## 62 (FEV)
## 63
## 64 w3: 5-7 agl (TCR,FEV)\nService
## 65 w3: 4-7 agl (TCR,FEV)
## 66 w1: < 3 agl (TCR)
## 67 w2: 3.5 mhhw Service
## 68
## 69 w2: 3-4 agl (TCR,FEV)
## 70
## 71
## 72
## 73 w2: 3.4 mhhw (FEV)
## 74
## 75 w2: 3-5 agl (TCR)
## 76
## 77
## 78
## 79 w3: 4-9 agl (TCR,FEV)\nService
## 80 w6: 10-17 agl (TCR)
## 81
## 82 w2: 4.76 (TCR,FEV)
## 83 w4: 8-11 (TCR,FEV)\nService
## Guidance Cat.Pres.Dead..bn
## 1 Warn-A6 (1-8) (Cat4, 938, 221+, $27.5+)
## 2 Warn-A5 (1-15) (Cat2, 972, 0, $1.5)
## 3 PS-Adv 43,46 (Cat4, 949, -, -)
## 4 PS-Adv 2-7,\n9-11,13,14 (Cat1, 988, -, -)
## 5 P-ETSS (Cat4, 947, -, -)
## 6 Warn-A5 (6-20) (Cat1, 979, 10, >$1)
## 7 Warn-A33 (29-41) (Cat5, 934, 70, >$6.86)
## 8 (TS, 993, 5, $0.179)
## 9 (Cat5, 922, 52+, $16)
## 10 (Typhoon, 903, 0, -)
## 11 Warn-A4 (TS, 981, 0, $0.450)
## 12 Watch-A34\n(34-37) (Cat5, 926, 3, $0.050)
## 13 Warn-A15 (5-20) (Cat4, 940, 7, $2.5)
## 14 (Cat4, 939, 2, $0.690)
## 15 (Cat4, 939, -, -)
## 16 (Cat3, 955, -, -)
## 17 (Cat5, 897, 6, $0.250)
## 18 P-ETSS
## 19 P-ETSS
## 20 Warn-A12 (Cat1, 980, 11, $1)
## 21 Warn-A18 (12-33) (Cat5, 937, 161, $113)
## 22 (Cat4, 931, 29, $3.4)
## 23 P-ETSS (Cat1, 965, -, -)
## 24 P-ETSS\nRiver Forcing
## 25 Warn-A5 (2) (Cat1, 988, 2, $1.1)
## 26 Warn-A7 (3-18) (Cat4, 929, 107, $75.3)
## 27 Warn-A22 (18-27) (Cat1, 986, 2, $0.7)
## 28 Warn-A28 (24-29) (TS, 991, 7, $1.3)
## 29 Warn-A28 (Cat1, 991, 13, $1.2)
## 30 (Cat4, 917, 84, $1.4)
## 31 Warn-A33 (31-35) (Cat4, 922, 175, $8.3)
## 32 Warn-A11 (9-18) (Cat3, 970, 9, $4.4)
## 33 Warn-A12 (12-22) (Cat4, 953, 6, $3.0)
## 34 Warn-A9 (-) (TS, 993, 1, $0.225)
## 35 Warn-A8 (5-21) (Cat2, 965, 4, $7.3)
## 36 Warn-A23 (19-30) (Cat4, 937, 81, $23.3)
## 37 Warn-A13 (10-16) (Cat1, 991, 0, $0.035)
## 38 Warn-A23 (15-30) (Cat1, 986, 17, $5.03)
## 39 (Cat4, 954, -, -)
## 40 Warn-A11 (8-13) (Cat1, 973, 9, $1.2)
## 41 Warn-A17 (-) (TS, 988, 46, $0.865)
## 42 (TS, 1003, 7, $4.8)
## 43 WarnA-35 (33-53) (Cat5, 910, 84, $5.1)
## 44 WarnA-5 (1-16) (Cat1, 993, 3, $0.9)
## 45 (Typhoon, 895, -, -)
## 46 Warn-A10 (7-17) (Cat5, 919, 74, $25.5)
## 47 (Typhoon, 868, 134, $3.77)
## 48 (Cat1, 995 mb)
## 49 Warn-A5 (3-11) (TS, 996, 3, $0.2)
## 50 Warn-A50 (48-65) (Cat4, 937, 54, $24.2)
## 51 (Cat5, 926, 1, $0.25)
## 52 (Typhoon, 910, 2, $0.637)
## 53 Watch-A2 (1) (TS, 990, 18, $0.125)
## 54 P-ETSS
## 55 (Cat4, 927, 1, $0.253)
## 56 P-ETSS
## 57 Warn-A9 (7-16) (Cat1, 981, 48, $0.787)
## 58 Warn-A42 (35-46) (Cat5, 908, 3,059, $91.6)
## 59 P-ETSS (Cat4, 938, 1, $0.00284)
## 60 Warn-A36 (34-51) (Cat5, 914, 52, $77.2)
## 61 Warn-A25 (23-42) (Cat4, 937, 107, $125)
## 62 P-ETSS
## 63 (Cat3, 975, 23, $0.192)
## 64 P-Surge (26-44) (Cat5, 934, 731, $16.5)
## 65 P-Surge (11-24) (Cat1, 981, 4, $0.55)
## 66 P-Surge (26-40) (TS, 1001, 6, $0.001)
## 67 ETSS-2.0 (ET, 983, 55, $0.5)
## 68 (Cat5, 872, 8, $0.463)
## 69 ETSS-2.0 (Cat4, 931, 34, $0.2)
## 70 (TS, 1001, 35, $0.511)
## 71 (Typhoon, 933, 18, $1.59)
## 72 (Typhoon, 918, 1, $0.014)
## 73 ETSS-2.0
## 74 (Cat4, 918, 18, $1.25)
## 75 P-Surge (6-14) (Cat2, 973, 0, $0.040)
## 76 (Typhoon, 920, 0, $0.0002)
## 77 (Cat1, 983, 169, $4.2)
## 78 (Cat1, 983, 32, $1.5)
## 79 P-Surge (21-31) (Cat3, 940, 254, $68.7)
## 80 P-Surge (16-38) (Cat1, 965, 41, $3.1)
## 81 ETSS (ET, 943, 1, $0.024)
## 82 (TS, 986, 18, $2.8)
## 83 P-Surge (20-34) (Cat3, 942, 58, $14.2)
## Area
## 1 N.W. FL, GA, AL,\nTN, KY, VA, WV
## 2 LA, MS
## 3 HI
## 4 HI
## 5 AK
## 6 N.W. FL, GA, SC, NC
## 7 TX, LA
## 8 TX, LA
## 9 MEX
## 10 Guam
## 11 NC, VA, MD, NY, DE
## 12 New England
## 13 N.W. FL
## 14 CA
## 15 HI
## 16 HI
## 17 Guam
## 18
## 19
## 20 PR, FL
## 21 S.W. FL, SC
## 22 PR
## 23 AK
## 24 MD, DE
## 25 N. TX
## 26 New Orleans
## 27 New England
## 28 N.W. FL
## 29 East Coast
## 30 NIC, HND, COL,\nSLV, GTM, PAN
## 31 FL (Tampa Bay)
## 32 New Orleans
## 33 LA
## 34 TX, LA, MS
## 35 New Orleans
## 36 LA
## 37 New Orleans
## 38 East Coast
## 39 HI
## 40 S. TX
## 41 LA
## 42 TX, LA
## 43 FL, NC
## 44 LA
## 45 Guam
## 46 US Gulf Coast
## 47 Guam
## 48 PR - (Rain)
## 49 MS, AR, FL Keys
## 50 Eastern US
## 51 HI
## 52 Guam
## 53 MS
## 54 DE, NJ, NY, CT,\nRI, VT, ME
## 55 Am Samoa
## 56 MA
## 57 AL, LA, MS
## 58 PR
## 59 CT, RI, MA
## 60 FL, PR
## 61 TX, LA
## 62 NY
## 63 PAN, CRI, NIC
## 64 SE US
## 65 N.W. FL
## 66 N.W. FL
## 67 DE, NJ (Cape May)
## 68 MEX, TX
## 69 VA, BHS, BMU, CUB, HTI
## 70 DMA, HTI
## 71 Guam
## 72 Guam
## 73 CT, MA
## 74 MEX
## 75 NC
## 76 Guam
## 77 MEX
## 78 MEX
## 79 Mid-Atlantic,\nNE US
## 80 LA
## 81 AK
## 82 LA, MS
## 83 Mid-Atlantic,\nNE US
stormdata <- stormdata %>% mutate(across(everything(), ~ str_replace_all(.,"[\n]"," ")))
I cleaned up the Storm column by first extracting the year and removing it from the column. Next, I extract whether the name of the storm is required by matching (R) in the Storm column, and remove this as well. Lastly, I extract the Oceanic Basin with a regex that identifies a dash followed by any two alphabetical characters, and then remove this from the Storm column. I bumped up year to the first column of the dataset. If year only has two numbers, I added a “20” preceeding those numbers, as this data is only for 2000s data.
stormdata <- stormdata %>%
mutate(year = as.numeric(str_extract(Storm, "\\d+(?=-)"))) %>%
mutate(Storm = str_replace_all(Storm,"^.*?-","")) %>%
mutate(retiredName = str_match(Storm, "\\([R]\\)")) %>%
mutate(Storm = str_replace_all(Storm,"\\s\\([R]\\)","")) %>%
mutate(OceanicBasin = str_extract(Storm, "(?<=-)[a-z]{2}$")) %>%
mutate(Storm = str_replace_all(Storm,"-[a-z]{2}$","")) %>%
select(year,everything()) %>%
mutate(year = str_replace(year,"^\\d{2}$",paste("20",year,sep = "")))
stormdata
## year Storm Date
## 1 2024 Helene Sep 23-29
## 2 2024 Francine Sep 12
## 3 2024 Gilma Aug 30
## 4 2024 Hone Aug 25
## 5 2024 Post- Typhoon Ampil Aug 22
## 6 2024 Debby Aug 5,6
## 7 2024 Beryl Jul 8
## 8 2024 Alberto Jun 19,20
## 9 2023 Otis Oct 22-25
## 10 2023 Bolaven Oct 6-14
## 11 2023 Ophelia Sep 22-24
## 12 2023 Lee Sep 13,14
## 13 2023 Idalia Aug 27-31
## 14 2023 Hillary Aug 16-22
## 15 2023 Dora Jul 31 - Aug 21
## 16 2023 Calvin Jul 11-19
## 17 2023 Mawar May 19 - Jun 3
## 18 2023 Jan28 Jan 28,29
## 19 2023 Jan14 Jan 14-17
## 20 2022 Nicole Nov 7-11
## 21 2022 Ian Sep 26-30
## 22 2022 Fiona Sep 14-27
## 23 2022 Post- Typhoon Merbok Sep 14-19
## 24 2021 Oct Oct 28
## 25 2021 Nicholas Sep 12-20
## 26 2021 Ida Aug 27-30
## 27 2021 Henri Aug 20-22
## 28 2021 Fred Aug 15,16
## 29 2021 Elsa Jun 30 - Jul 10
## 30 2020 Iota Nov 13-18
## 31 2020 Eta Nov 8,9
## 32 2020 Zeta Oct 26-29
## 33 2020 Delta Oct 7-10
## 34 2020 Beta Sep 17-25
## 35 2020 Sally Sep 12-16
## 36 2020 Laura Aug 24-27
## 37 2020 Marco Aug 22-24
## 38 2020 Isaias Jul 31 - Aug 4
## 39 2020 Douglas Jul 20-30
## 40 2020 Hanna Jul 24-26
## 41 2020 Cristobal Jun 1-12
## 42 2019 Imelda Sep 17-19
## 43 2019 Dorian Sep 1-6
## 44 2019 Barry Jul 10-14
## 45 2019 Wutip Feb 18 - Mar 2
## 46 2018 Michael Oct 7-11
## 47 2018 Mangkhut Sep 6-17
## 48 2018 Isaac Sep 7-15
## 49 2018 Gordon Sep 3-8
## 50 2018 Florence Aug 31 - Sep 8
## 51 2018 Lane Aug 15-29
## 52 2018 Maria Jul 3-12
## 53 2018 Alberto May 25 - Jun 1
## 54 2018 Mar1 Feb 28 - Mar 1
## 55 2018 Gita Feb 3-22
## 56 2018 Jan17 Jan 3-17
## 57 2017 Nate Oct 4-9
## 58 2017 Maria Sep 16-30
## 59 2017 Jose Sep 5-25
## 60 2017 Irma Aug 30 - Sep 12
## 61 2017 Harvey Aug 17 - Sep 1
## 62 2017 Mar Mar 13- Apr 13
## 63 2016 Otto Nov 20-26
## 64 2016 Matthew Sep 28 - Oct 9
## 65 2016 Hermine Aug 28 - Sep 8
## 66 2016 Colin Jun 5-7
## 67 2016 Jan Jan 23-24
## 68 2015 Patricia Oct 20-24
## 69 2015 Joaquin Sep 28 - Oct 8
## 70 2015 Erika Aug 24-28
## 71 2015 Chan-hom Jun 29 - Jul 13
## 72 2015 Dolphin May 6-24
## 73 2015 Jan Jan 26-28
## 74 2014 Odile Sep 10-19
## 75 2014 Arthur Jul 1-9
## 76 2013 Francisco Oct 15-26
## 77 2013 Manuel Sep 13-20
## 78 2013 Ingrid Sep 12-17
## 79 2012 Sandy Oct 22-29
## 80 2012 Isaac Aug 21 - Sep 3
## 81 2011 Nov Nov 4-11
## 82 2011 Lee Sep 2-7
## 83 2011 Irene Aug 21-30
## Storm.Tide Obs
## 1 Obs w4: >9.3 mhhw Fcst w6: 15-20 mhhw (FEV)
## 2 w2: 5 mhhw
## 3 w1: 3 mhhw
## 4 w1: 3 mhhw
## 5 w3: 6.3 mhhw
## 6 w2: 5.1 mhhw (FEV)
## 7 w3: 8 mhhw
## 8 w1: 3 mhhw
## 9
## 10
## 11 w2: 3-5 agl (TCR,FEV)
## 12 w1: 1-2 agl (TCR,FEV)
## 13 w4: 8-12 agl (TCR,FEV)
## 14
## 15
## 16
## 17
## 18
## 19
## 20 w2: 5 agl (TCR)
## 21 w5: 10-15 agl (TCR,FEV)
## 22 w1: 1-3 agl (TCR)
## 23 w5: 12.5 agl (FEV)
## 24 (FEV)
## 25 w2: 3-6 agl (TCR)
## 26 w5: 9-14 agl (TCR,FEV) (Service)
## 27 w2: 2-4 agl (TCR,FEV)
## 28 w2: 2-4 agl (TCR)
## 29 w1: 1-3 agl (TCR)
## 30
## 31 w1: 3 agl (TCR)
## 32 w4: 6-10 agl (TCR)
## 33 w3: 6-9 agl (TCR)
## 34 w2: 4.2 agl (TCR)
## 35 w3: 5-7 agl (TCR)
## 36 w6: 12-18 agl (TCR)
## 37 w1: 1-2 agl (TCR)
## 38 w2: 3-6 agl (TCR)
## 39
## 40 w2: 3-5 agl (TCR)
## 41 w3: 6.2 agl (TCR)
## 42 w1: 1-2 agl (TCR)
## 43 w3: 5-7 agl (TCR)
## 44 w2: 4-6 agl (TCR)
## 45
## 46 w5: 9-14 agl (TCR,FEV) (Service)
## 47
## 48 (FEV)
## 49 w1: 1-3 agl (TCR)
## 50 w4: 8-11 agl (TCR,FEV) (Service)
## 51 (FEV)
## 52
## 53 w1: 1-3 agl (TCR)
## 54 (FEV)
## 55
## 56 (FEV)
## 57 w3: 6-9 agl (TCR,FEV)
## 58 w3: 6-9 agl (TCR,FEV)
## 59 w1: < 3 agl (TCR,FEV)
## 60 w3: 5-8 agl (TCR,FEV)
## 61 w4: 6-10 agl (TCR,FEV) Service
## 62 (FEV)
## 63
## 64 w3: 5-7 agl (TCR,FEV) Service
## 65 w3: 4-7 agl (TCR,FEV)
## 66 w1: < 3 agl (TCR)
## 67 w2: 3.5 mhhw Service
## 68
## 69 w2: 3-4 agl (TCR,FEV)
## 70
## 71
## 72
## 73 w2: 3.4 mhhw (FEV)
## 74
## 75 w2: 3-5 agl (TCR)
## 76
## 77
## 78
## 79 w3: 4-9 agl (TCR,FEV) Service
## 80 w6: 10-17 agl (TCR)
## 81
## 82 w2: 4.76 (TCR,FEV)
## 83 w4: 8-11 (TCR,FEV) Service
## Guidance Cat.Pres.Dead..bn
## 1 Warn-A6 (1-8) (Cat4, 938, 221+, $27.5+)
## 2 Warn-A5 (1-15) (Cat2, 972, 0, $1.5)
## 3 PS-Adv 43,46 (Cat4, 949, -, -)
## 4 PS-Adv 2-7, 9-11,13,14 (Cat1, 988, -, -)
## 5 P-ETSS (Cat4, 947, -, -)
## 6 Warn-A5 (6-20) (Cat1, 979, 10, >$1)
## 7 Warn-A33 (29-41) (Cat5, 934, 70, >$6.86)
## 8 (TS, 993, 5, $0.179)
## 9 (Cat5, 922, 52+, $16)
## 10 (Typhoon, 903, 0, -)
## 11 Warn-A4 (TS, 981, 0, $0.450)
## 12 Watch-A34 (34-37) (Cat5, 926, 3, $0.050)
## 13 Warn-A15 (5-20) (Cat4, 940, 7, $2.5)
## 14 (Cat4, 939, 2, $0.690)
## 15 (Cat4, 939, -, -)
## 16 (Cat3, 955, -, -)
## 17 (Cat5, 897, 6, $0.250)
## 18 P-ETSS
## 19 P-ETSS
## 20 Warn-A12 (Cat1, 980, 11, $1)
## 21 Warn-A18 (12-33) (Cat5, 937, 161, $113)
## 22 (Cat4, 931, 29, $3.4)
## 23 P-ETSS (Cat1, 965, -, -)
## 24 P-ETSS River Forcing
## 25 Warn-A5 (2) (Cat1, 988, 2, $1.1)
## 26 Warn-A7 (3-18) (Cat4, 929, 107, $75.3)
## 27 Warn-A22 (18-27) (Cat1, 986, 2, $0.7)
## 28 Warn-A28 (24-29) (TS, 991, 7, $1.3)
## 29 Warn-A28 (Cat1, 991, 13, $1.2)
## 30 (Cat4, 917, 84, $1.4)
## 31 Warn-A33 (31-35) (Cat4, 922, 175, $8.3)
## 32 Warn-A11 (9-18) (Cat3, 970, 9, $4.4)
## 33 Warn-A12 (12-22) (Cat4, 953, 6, $3.0)
## 34 Warn-A9 (-) (TS, 993, 1, $0.225)
## 35 Warn-A8 (5-21) (Cat2, 965, 4, $7.3)
## 36 Warn-A23 (19-30) (Cat4, 937, 81, $23.3)
## 37 Warn-A13 (10-16) (Cat1, 991, 0, $0.035)
## 38 Warn-A23 (15-30) (Cat1, 986, 17, $5.03)
## 39 (Cat4, 954, -, -)
## 40 Warn-A11 (8-13) (Cat1, 973, 9, $1.2)
## 41 Warn-A17 (-) (TS, 988, 46, $0.865)
## 42 (TS, 1003, 7, $4.8)
## 43 WarnA-35 (33-53) (Cat5, 910, 84, $5.1)
## 44 WarnA-5 (1-16) (Cat1, 993, 3, $0.9)
## 45 (Typhoon, 895, -, -)
## 46 Warn-A10 (7-17) (Cat5, 919, 74, $25.5)
## 47 (Typhoon, 868, 134, $3.77)
## 48 (Cat1, 995 mb)
## 49 Warn-A5 (3-11) (TS, 996, 3, $0.2)
## 50 Warn-A50 (48-65) (Cat4, 937, 54, $24.2)
## 51 (Cat5, 926, 1, $0.25)
## 52 (Typhoon, 910, 2, $0.637)
## 53 Watch-A2 (1) (TS, 990, 18, $0.125)
## 54 P-ETSS
## 55 (Cat4, 927, 1, $0.253)
## 56 P-ETSS
## 57 Warn-A9 (7-16) (Cat1, 981, 48, $0.787)
## 58 Warn-A42 (35-46) (Cat5, 908, 3,059, $91.6)
## 59 P-ETSS (Cat4, 938, 1, $0.00284)
## 60 Warn-A36 (34-51) (Cat5, 914, 52, $77.2)
## 61 Warn-A25 (23-42) (Cat4, 937, 107, $125)
## 62 P-ETSS
## 63 (Cat3, 975, 23, $0.192)
## 64 P-Surge (26-44) (Cat5, 934, 731, $16.5)
## 65 P-Surge (11-24) (Cat1, 981, 4, $0.55)
## 66 P-Surge (26-40) (TS, 1001, 6, $0.001)
## 67 ETSS-2.0 (ET, 983, 55, $0.5)
## 68 (Cat5, 872, 8, $0.463)
## 69 ETSS-2.0 (Cat4, 931, 34, $0.2)
## 70 (TS, 1001, 35, $0.511)
## 71 (Typhoon, 933, 18, $1.59)
## 72 (Typhoon, 918, 1, $0.014)
## 73 ETSS-2.0
## 74 (Cat4, 918, 18, $1.25)
## 75 P-Surge (6-14) (Cat2, 973, 0, $0.040)
## 76 (Typhoon, 920, 0, $0.0002)
## 77 (Cat1, 983, 169, $4.2)
## 78 (Cat1, 983, 32, $1.5)
## 79 P-Surge (21-31) (Cat3, 940, 254, $68.7)
## 80 P-Surge (16-38) (Cat1, 965, 41, $3.1)
## 81 ETSS (ET, 943, 1, $0.024)
## 82 (TS, 986, 18, $2.8)
## 83 P-Surge (20-34) (Cat3, 942, 58, $14.2)
## Area retiredName OceanicBasin
## 1 N.W. FL, GA, AL, TN, KY, VA, WV <NA> <NA>
## 2 LA, MS <NA> <NA>
## 3 HI <NA> cp
## 4 HI <NA> cp
## 5 AK <NA> <NA>
## 6 N.W. FL, GA, SC, NC <NA> <NA>
## 7 TX, LA <NA> <NA>
## 8 TX, LA <NA> <NA>
## 9 MEX (R) ep
## 10 Guam <NA> wp
## 11 NC, VA, MD, NY, DE <NA> <NA>
## 12 New England <NA> <NA>
## 13 N.W. FL <NA> <NA>
## 14 CA <NA> ep
## 15 HI (R) cp
## 16 HI <NA> cp
## 17 Guam <NA> wp
## 18 <NA> <NA>
## 19 <NA> <NA>
## 20 PR, FL <NA> <NA>
## 21 S.W. FL, SC (R) <NA>
## 22 PR (R) <NA>
## 23 AK <NA> <NA>
## 24 MD, DE <NA> <NA>
## 25 N. TX <NA> <NA>
## 26 New Orleans (R) <NA>
## 27 New England <NA> <NA>
## 28 N.W. FL <NA> <NA>
## 29 East Coast <NA> <NA>
## 30 NIC, HND, COL, SLV, GTM, PAN (R) <NA>
## 31 FL (Tampa Bay) (R) <NA>
## 32 New Orleans <NA> <NA>
## 33 LA <NA> <NA>
## 34 TX, LA, MS <NA> <NA>
## 35 New Orleans <NA> <NA>
## 36 LA (R) <NA>
## 37 New Orleans <NA> <NA>
## 38 East Coast <NA> <NA>
## 39 HI <NA> cp
## 40 S. TX <NA> <NA>
## 41 LA <NA> <NA>
## 42 TX, LA <NA> <NA>
## 43 FL, NC (R) <NA>
## 44 LA <NA> <NA>
## 45 Guam <NA> wp
## 46 US Gulf Coast (R) <NA>
## 47 Guam <NA> wp
## 48 PR - (Rain) <NA> <NA>
## 49 MS, AR, FL Keys <NA> <NA>
## 50 Eastern US (R) <NA>
## 51 HI <NA> cp
## 52 Guam <NA> wp
## 53 MS <NA> <NA>
## 54 DE, NJ, NY, CT, RI, VT, ME <NA> <NA>
## 55 Am Samoa <NA> sp
## 56 MA <NA> <NA>
## 57 AL, LA, MS (R) <NA>
## 58 PR (R) <NA>
## 59 CT, RI, MA <NA> <NA>
## 60 FL, PR (R) <NA>
## 61 TX, LA (R) <NA>
## 62 NY <NA> <NA>
## 63 PAN, CRI, NIC (R) <NA>
## 64 SE US (R) <NA>
## 65 N.W. FL <NA> <NA>
## 66 N.W. FL <NA> <NA>
## 67 DE, NJ (Cape May) <NA> <NA>
## 68 MEX, TX (R) ep
## 69 VA, BHS, BMU, CUB, HTI (R) <NA>
## 70 DMA, HTI (R) <NA>
## 71 Guam <NA> wp
## 72 Guam <NA> wp
## 73 CT, MA <NA> <NA>
## 74 MEX (R) ep
## 75 NC <NA> <NA>
## 76 Guam <NA> wp
## 77 MEX (R) ep
## 78 MEX (R) <NA>
## 79 Mid-Atlantic, NE US (R) <NA>
## 80 LA <NA> <NA>
## 81 AK <NA> <NA>
## 82 LA, MS <NA> <NA>
## 83 Mid-Atlantic, NE US (R) <NA>
For the sake of analysis, I will be transforming Date to Start Date. To do this, I will take only the text before a dash or comma, concatenate the results with year, and change the character representation of month to numeric.
stormdata <- stormdata %>%
mutate(startDate = str_extract(Date, "^(.*?)(?=-|,)|^(.*)$")) %>%
mutate(startDate = str_replace(startDate,".*\\s\\d{1}$",paste(substr(startDate,1,nchar(startDate)-1),0,substr(startDate,nchar(startDate),nchar(startDate)),sep=''))) %>%
mutate(startDate = str_replace(startDate,"\\s$","")) %>%
mutate(startDate = paste(year,startDate)) %>%
mutate(startDate = str_replace_all(startDate,"Jan","01")) %>%
mutate(startDate = str_replace_all(startDate,"Feb","02")) %>%
mutate(startDate = str_replace_all(startDate,"Mar","03")) %>%
mutate(startDate = str_replace_all(startDate,"Apr","04")) %>%
mutate(startDate = str_replace_all(startDate,"May","05")) %>%
mutate(startDate = str_replace_all(startDate,"Jun","06")) %>%
mutate(startDate = str_replace_all(startDate,"Jul","07")) %>%
mutate(startDate = str_replace_all(startDate,"Aug","08")) %>%
mutate(startDate = str_replace_all(startDate,"Sep","09")) %>%
mutate(startDate = str_replace_all(startDate,"Oct","10")) %>%
mutate(startDate = str_replace_all(startDate,"Nov","11")) %>%
mutate(startDate = str_replace_all(startDate,"Dec","12")) %>%
mutate(startDate = as.Date(str_replace_all(startDate,"\\s","-"))) %>%
select(Storm, startDate, year,everything())
stormdata
## Storm startDate year Date
## 1 Helene 2024-09-23 2024 Sep 23-29
## 2 Francine 2024-09-12 2024 Sep 12
## 3 Gilma 2024-08-30 2024 Aug 30
## 4 Hone 2024-08-25 2024 Aug 25
## 5 Post- Typhoon Ampil 2024-08-22 2024 Aug 22
## 6 Debby 2024-08-05 2024 Aug 5,6
## 7 Beryl 2024-07-08 2024 Jul 8
## 8 Alberto 2024-06-19 2024 Jun 19,20
## 9 Otis 2023-10-22 2023 Oct 22-25
## 10 Bolaven 2023-10-06 2023 Oct 6-14
## 11 Ophelia 2023-09-22 2023 Sep 22-24
## 12 Lee 2023-09-13 2023 Sep 13,14
## 13 Idalia 2023-08-27 2023 Aug 27-31
## 14 Hillary 2023-08-16 2023 Aug 16-22
## 15 Dora 2023-07-31 2023 Jul 31 - Aug 21
## 16 Calvin 2023-07-11 2023 Jul 11-19
## 17 Mawar 2023-05-19 2023 May 19 - Jun 3
## 18 Jan28 2023-01-28 2023 Jan 28,29
## 19 Jan14 2023-01-14 2023 Jan 14-17
## 20 Nicole 2022-11-07 2022 Nov 7-11
## 21 Ian 2022-09-26 2022 Sep 26-30
## 22 Fiona 2022-09-14 2022 Sep 14-27
## 23 Post- Typhoon Merbok 2022-09-14 2022 Sep 14-19
## 24 Oct 2021-10-28 2021 Oct 28
## 25 Nicholas 2021-09-12 2021 Sep 12-20
## 26 Ida 2021-08-27 2021 Aug 27-30
## 27 Henri 2021-08-20 2021 Aug 20-22
## 28 Fred 2021-08-15 2021 Aug 15,16
## 29 Elsa 2021-06-30 2021 Jun 30 - Jul 10
## 30 Iota 2020-11-13 2020 Nov 13-18
## 31 Eta 2020-11-08 2020 Nov 8,9
## 32 Zeta 2020-10-26 2020 Oct 26-29
## 33 Delta 2020-10-07 2020 Oct 7-10
## 34 Beta 2020-09-17 2020 Sep 17-25
## 35 Sally 2020-09-12 2020 Sep 12-16
## 36 Laura 2020-08-24 2020 Aug 24-27
## 37 Marco 2020-08-22 2020 Aug 22-24
## 38 Isaias 2020-07-31 2020 Jul 31 - Aug 4
## 39 Douglas 2020-07-20 2020 Jul 20-30
## 40 Hanna 2020-07-24 2020 Jul 24-26
## 41 Cristobal 2020-06-01 2020 Jun 1-12
## 42 Imelda 2019-09-17 2019 Sep 17-19
## 43 Dorian 2019-09-01 2019 Sep 1-6
## 44 Barry 2019-07-10 2019 Jul 10-14
## 45 Wutip 2019-02-18 2019 Feb 18 - Mar 2
## 46 Michael 2018-10-07 2018 Oct 7-11
## 47 Mangkhut 2018-09-06 2018 Sep 6-17
## 48 Isaac 2018-09-07 2018 Sep 7-15
## 49 Gordon 2018-09-03 2018 Sep 3-8
## 50 Florence 2018-08-31 2018 Aug 31 - Sep 8
## 51 Lane 2018-08-15 2018 Aug 15-29
## 52 Maria 2018-07-03 2018 Jul 3-12
## 53 Alberto 2018-05-25 2018 May 25 - Jun 1
## 54 Mar1 2018-02-28 2018 Feb 28 - Mar 1
## 55 Gita 2018-02-03 2018 Feb 3-22
## 56 Jan17 2018-01-03 2018 Jan 3-17
## 57 Nate 2017-10-04 2017 Oct 4-9
## 58 Maria 2017-09-16 2017 Sep 16-30
## 59 Jose 2017-09-05 2017 Sep 5-25
## 60 Irma 2017-08-30 2017 Aug 30 - Sep 12
## 61 Harvey 2017-08-17 2017 Aug 17 - Sep 1
## 62 Mar 2017-03-13 2017 Mar 13- Apr 13
## 63 Otto 2016-11-20 2016 Nov 20-26
## 64 Matthew 2016-09-28 2016 Sep 28 - Oct 9
## 65 Hermine 2016-08-28 2016 Aug 28 - Sep 8
## 66 Colin 2016-06-05 2016 Jun 5-7
## 67 Jan 2016-01-23 2016 Jan 23-24
## 68 Patricia 2015-10-20 2015 Oct 20-24
## 69 Joaquin 2015-09-28 2015 Sep 28 - Oct 8
## 70 Erika 2015-08-24 2015 Aug 24-28
## 71 Chan-hom 2015-06-29 2015 Jun 29 - Jul 13
## 72 Dolphin 2015-05-06 2015 May 6-24
## 73 Jan 2015-01-26 2015 Jan 26-28
## 74 Odile 2014-09-10 2014 Sep 10-19
## 75 Arthur 2014-07-01 2014 Jul 1-9
## 76 Francisco 2013-10-15 2013 Oct 15-26
## 77 Manuel 2013-09-13 2013 Sep 13-20
## 78 Ingrid 2013-09-12 2013 Sep 12-17
## 79 Sandy 2012-10-22 2012 Oct 22-29
## 80 Isaac 2012-08-21 2012 Aug 21 - Sep 3
## 81 Nov 2011-11-04 2011 Nov 4-11
## 82 Lee 2011-09-02 2011 Sep 2-7
## 83 Irene 2011-08-21 2011 Aug 21-30
## Storm.Tide Obs
## 1 Obs w4: >9.3 mhhw Fcst w6: 15-20 mhhw (FEV)
## 2 w2: 5 mhhw
## 3 w1: 3 mhhw
## 4 w1: 3 mhhw
## 5 w3: 6.3 mhhw
## 6 w2: 5.1 mhhw (FEV)
## 7 w3: 8 mhhw
## 8 w1: 3 mhhw
## 9
## 10
## 11 w2: 3-5 agl (TCR,FEV)
## 12 w1: 1-2 agl (TCR,FEV)
## 13 w4: 8-12 agl (TCR,FEV)
## 14
## 15
## 16
## 17
## 18
## 19
## 20 w2: 5 agl (TCR)
## 21 w5: 10-15 agl (TCR,FEV)
## 22 w1: 1-3 agl (TCR)
## 23 w5: 12.5 agl (FEV)
## 24 (FEV)
## 25 w2: 3-6 agl (TCR)
## 26 w5: 9-14 agl (TCR,FEV) (Service)
## 27 w2: 2-4 agl (TCR,FEV)
## 28 w2: 2-4 agl (TCR)
## 29 w1: 1-3 agl (TCR)
## 30
## 31 w1: 3 agl (TCR)
## 32 w4: 6-10 agl (TCR)
## 33 w3: 6-9 agl (TCR)
## 34 w2: 4.2 agl (TCR)
## 35 w3: 5-7 agl (TCR)
## 36 w6: 12-18 agl (TCR)
## 37 w1: 1-2 agl (TCR)
## 38 w2: 3-6 agl (TCR)
## 39
## 40 w2: 3-5 agl (TCR)
## 41 w3: 6.2 agl (TCR)
## 42 w1: 1-2 agl (TCR)
## 43 w3: 5-7 agl (TCR)
## 44 w2: 4-6 agl (TCR)
## 45
## 46 w5: 9-14 agl (TCR,FEV) (Service)
## 47
## 48 (FEV)
## 49 w1: 1-3 agl (TCR)
## 50 w4: 8-11 agl (TCR,FEV) (Service)
## 51 (FEV)
## 52
## 53 w1: 1-3 agl (TCR)
## 54 (FEV)
## 55
## 56 (FEV)
## 57 w3: 6-9 agl (TCR,FEV)
## 58 w3: 6-9 agl (TCR,FEV)
## 59 w1: < 3 agl (TCR,FEV)
## 60 w3: 5-8 agl (TCR,FEV)
## 61 w4: 6-10 agl (TCR,FEV) Service
## 62 (FEV)
## 63
## 64 w3: 5-7 agl (TCR,FEV) Service
## 65 w3: 4-7 agl (TCR,FEV)
## 66 w1: < 3 agl (TCR)
## 67 w2: 3.5 mhhw Service
## 68
## 69 w2: 3-4 agl (TCR,FEV)
## 70
## 71
## 72
## 73 w2: 3.4 mhhw (FEV)
## 74
## 75 w2: 3-5 agl (TCR)
## 76
## 77
## 78
## 79 w3: 4-9 agl (TCR,FEV) Service
## 80 w6: 10-17 agl (TCR)
## 81
## 82 w2: 4.76 (TCR,FEV)
## 83 w4: 8-11 (TCR,FEV) Service
## Guidance Cat.Pres.Dead..bn
## 1 Warn-A6 (1-8) (Cat4, 938, 221+, $27.5+)
## 2 Warn-A5 (1-15) (Cat2, 972, 0, $1.5)
## 3 PS-Adv 43,46 (Cat4, 949, -, -)
## 4 PS-Adv 2-7, 9-11,13,14 (Cat1, 988, -, -)
## 5 P-ETSS (Cat4, 947, -, -)
## 6 Warn-A5 (6-20) (Cat1, 979, 10, >$1)
## 7 Warn-A33 (29-41) (Cat5, 934, 70, >$6.86)
## 8 (TS, 993, 5, $0.179)
## 9 (Cat5, 922, 52+, $16)
## 10 (Typhoon, 903, 0, -)
## 11 Warn-A4 (TS, 981, 0, $0.450)
## 12 Watch-A34 (34-37) (Cat5, 926, 3, $0.050)
## 13 Warn-A15 (5-20) (Cat4, 940, 7, $2.5)
## 14 (Cat4, 939, 2, $0.690)
## 15 (Cat4, 939, -, -)
## 16 (Cat3, 955, -, -)
## 17 (Cat5, 897, 6, $0.250)
## 18 P-ETSS
## 19 P-ETSS
## 20 Warn-A12 (Cat1, 980, 11, $1)
## 21 Warn-A18 (12-33) (Cat5, 937, 161, $113)
## 22 (Cat4, 931, 29, $3.4)
## 23 P-ETSS (Cat1, 965, -, -)
## 24 P-ETSS River Forcing
## 25 Warn-A5 (2) (Cat1, 988, 2, $1.1)
## 26 Warn-A7 (3-18) (Cat4, 929, 107, $75.3)
## 27 Warn-A22 (18-27) (Cat1, 986, 2, $0.7)
## 28 Warn-A28 (24-29) (TS, 991, 7, $1.3)
## 29 Warn-A28 (Cat1, 991, 13, $1.2)
## 30 (Cat4, 917, 84, $1.4)
## 31 Warn-A33 (31-35) (Cat4, 922, 175, $8.3)
## 32 Warn-A11 (9-18) (Cat3, 970, 9, $4.4)
## 33 Warn-A12 (12-22) (Cat4, 953, 6, $3.0)
## 34 Warn-A9 (-) (TS, 993, 1, $0.225)
## 35 Warn-A8 (5-21) (Cat2, 965, 4, $7.3)
## 36 Warn-A23 (19-30) (Cat4, 937, 81, $23.3)
## 37 Warn-A13 (10-16) (Cat1, 991, 0, $0.035)
## 38 Warn-A23 (15-30) (Cat1, 986, 17, $5.03)
## 39 (Cat4, 954, -, -)
## 40 Warn-A11 (8-13) (Cat1, 973, 9, $1.2)
## 41 Warn-A17 (-) (TS, 988, 46, $0.865)
## 42 (TS, 1003, 7, $4.8)
## 43 WarnA-35 (33-53) (Cat5, 910, 84, $5.1)
## 44 WarnA-5 (1-16) (Cat1, 993, 3, $0.9)
## 45 (Typhoon, 895, -, -)
## 46 Warn-A10 (7-17) (Cat5, 919, 74, $25.5)
## 47 (Typhoon, 868, 134, $3.77)
## 48 (Cat1, 995 mb)
## 49 Warn-A5 (3-11) (TS, 996, 3, $0.2)
## 50 Warn-A50 (48-65) (Cat4, 937, 54, $24.2)
## 51 (Cat5, 926, 1, $0.25)
## 52 (Typhoon, 910, 2, $0.637)
## 53 Watch-A2 (1) (TS, 990, 18, $0.125)
## 54 P-ETSS
## 55 (Cat4, 927, 1, $0.253)
## 56 P-ETSS
## 57 Warn-A9 (7-16) (Cat1, 981, 48, $0.787)
## 58 Warn-A42 (35-46) (Cat5, 908, 3,059, $91.6)
## 59 P-ETSS (Cat4, 938, 1, $0.00284)
## 60 Warn-A36 (34-51) (Cat5, 914, 52, $77.2)
## 61 Warn-A25 (23-42) (Cat4, 937, 107, $125)
## 62 P-ETSS
## 63 (Cat3, 975, 23, $0.192)
## 64 P-Surge (26-44) (Cat5, 934, 731, $16.5)
## 65 P-Surge (11-24) (Cat1, 981, 4, $0.55)
## 66 P-Surge (26-40) (TS, 1001, 6, $0.001)
## 67 ETSS-2.0 (ET, 983, 55, $0.5)
## 68 (Cat5, 872, 8, $0.463)
## 69 ETSS-2.0 (Cat4, 931, 34, $0.2)
## 70 (TS, 1001, 35, $0.511)
## 71 (Typhoon, 933, 18, $1.59)
## 72 (Typhoon, 918, 1, $0.014)
## 73 ETSS-2.0
## 74 (Cat4, 918, 18, $1.25)
## 75 P-Surge (6-14) (Cat2, 973, 0, $0.040)
## 76 (Typhoon, 920, 0, $0.0002)
## 77 (Cat1, 983, 169, $4.2)
## 78 (Cat1, 983, 32, $1.5)
## 79 P-Surge (21-31) (Cat3, 940, 254, $68.7)
## 80 P-Surge (16-38) (Cat1, 965, 41, $3.1)
## 81 ETSS (ET, 943, 1, $0.024)
## 82 (TS, 986, 18, $2.8)
## 83 P-Surge (20-34) (Cat3, 942, 58, $14.2)
## Area retiredName OceanicBasin
## 1 N.W. FL, GA, AL, TN, KY, VA, WV <NA> <NA>
## 2 LA, MS <NA> <NA>
## 3 HI <NA> cp
## 4 HI <NA> cp
## 5 AK <NA> <NA>
## 6 N.W. FL, GA, SC, NC <NA> <NA>
## 7 TX, LA <NA> <NA>
## 8 TX, LA <NA> <NA>
## 9 MEX (R) ep
## 10 Guam <NA> wp
## 11 NC, VA, MD, NY, DE <NA> <NA>
## 12 New England <NA> <NA>
## 13 N.W. FL <NA> <NA>
## 14 CA <NA> ep
## 15 HI (R) cp
## 16 HI <NA> cp
## 17 Guam <NA> wp
## 18 <NA> <NA>
## 19 <NA> <NA>
## 20 PR, FL <NA> <NA>
## 21 S.W. FL, SC (R) <NA>
## 22 PR (R) <NA>
## 23 AK <NA> <NA>
## 24 MD, DE <NA> <NA>
## 25 N. TX <NA> <NA>
## 26 New Orleans (R) <NA>
## 27 New England <NA> <NA>
## 28 N.W. FL <NA> <NA>
## 29 East Coast <NA> <NA>
## 30 NIC, HND, COL, SLV, GTM, PAN (R) <NA>
## 31 FL (Tampa Bay) (R) <NA>
## 32 New Orleans <NA> <NA>
## 33 LA <NA> <NA>
## 34 TX, LA, MS <NA> <NA>
## 35 New Orleans <NA> <NA>
## 36 LA (R) <NA>
## 37 New Orleans <NA> <NA>
## 38 East Coast <NA> <NA>
## 39 HI <NA> cp
## 40 S. TX <NA> <NA>
## 41 LA <NA> <NA>
## 42 TX, LA <NA> <NA>
## 43 FL, NC (R) <NA>
## 44 LA <NA> <NA>
## 45 Guam <NA> wp
## 46 US Gulf Coast (R) <NA>
## 47 Guam <NA> wp
## 48 PR - (Rain) <NA> <NA>
## 49 MS, AR, FL Keys <NA> <NA>
## 50 Eastern US (R) <NA>
## 51 HI <NA> cp
## 52 Guam <NA> wp
## 53 MS <NA> <NA>
## 54 DE, NJ, NY, CT, RI, VT, ME <NA> <NA>
## 55 Am Samoa <NA> sp
## 56 MA <NA> <NA>
## 57 AL, LA, MS (R) <NA>
## 58 PR (R) <NA>
## 59 CT, RI, MA <NA> <NA>
## 60 FL, PR (R) <NA>
## 61 TX, LA (R) <NA>
## 62 NY <NA> <NA>
## 63 PAN, CRI, NIC (R) <NA>
## 64 SE US (R) <NA>
## 65 N.W. FL <NA> <NA>
## 66 N.W. FL <NA> <NA>
## 67 DE, NJ (Cape May) <NA> <NA>
## 68 MEX, TX (R) ep
## 69 VA, BHS, BMU, CUB, HTI (R) <NA>
## 70 DMA, HTI (R) <NA>
## 71 Guam <NA> wp
## 72 Guam <NA> wp
## 73 CT, MA <NA> <NA>
## 74 MEX (R) ep
## 75 NC <NA> <NA>
## 76 Guam <NA> wp
## 77 MEX (R) ep
## 78 MEX (R) <NA>
## 79 Mid-Atlantic, NE US (R) <NA>
## 80 LA <NA> <NA>
## 81 AK <NA> <NA>
## 82 LA, MS <NA> <NA>
## 83 Mid-Atlantic, NE US (R) <NA>
Next, I will clean up the Storm Tide column to extract either the MHHW or AGL. The conversion of MHHW to AGL and vice versa is complicated as it involves vertical datums where the benchmark value may vary based on location. For the sake of this exercise, although MHHW was extracted, I will be using only AGL for analysis. For N/A and MHHW datapoints, they will be excluded from the analysis. This decision was made based on AGL being present in most of the data (i.e. the dataset has more AGL values), and because it does not detract from the exercise to remove MHHW (as the exercise is not to delve deeply into the world of vertical tidal datums). Additionally, it appears only the previous year has been recorded in mostly MHHW, so it is essentially just an omission of the latest year data. The lack of MHHW values do not seem significant in any way (e.g. the missing values do not appear to be correlated to storm severity).
stormdata <- stormdata %>%
mutate(MHHW = str_extract(Storm.Tide, "(?<=:\\s).*?(?=\\smhhw)")) %>%
mutate(AGL = str_extract(Storm.Tide, "(?<=:\\s).*?(?=\\sagl)")) %>%
mutate(maxAGL = str_extract(AGL, "((?<=-)|(?<=\\<\\s)).*"))
stormdata
## Storm startDate year Date
## 1 Helene 2024-09-23 2024 Sep 23-29
## 2 Francine 2024-09-12 2024 Sep 12
## 3 Gilma 2024-08-30 2024 Aug 30
## 4 Hone 2024-08-25 2024 Aug 25
## 5 Post- Typhoon Ampil 2024-08-22 2024 Aug 22
## 6 Debby 2024-08-05 2024 Aug 5,6
## 7 Beryl 2024-07-08 2024 Jul 8
## 8 Alberto 2024-06-19 2024 Jun 19,20
## 9 Otis 2023-10-22 2023 Oct 22-25
## 10 Bolaven 2023-10-06 2023 Oct 6-14
## 11 Ophelia 2023-09-22 2023 Sep 22-24
## 12 Lee 2023-09-13 2023 Sep 13,14
## 13 Idalia 2023-08-27 2023 Aug 27-31
## 14 Hillary 2023-08-16 2023 Aug 16-22
## 15 Dora 2023-07-31 2023 Jul 31 - Aug 21
## 16 Calvin 2023-07-11 2023 Jul 11-19
## 17 Mawar 2023-05-19 2023 May 19 - Jun 3
## 18 Jan28 2023-01-28 2023 Jan 28,29
## 19 Jan14 2023-01-14 2023 Jan 14-17
## 20 Nicole 2022-11-07 2022 Nov 7-11
## 21 Ian 2022-09-26 2022 Sep 26-30
## 22 Fiona 2022-09-14 2022 Sep 14-27
## 23 Post- Typhoon Merbok 2022-09-14 2022 Sep 14-19
## 24 Oct 2021-10-28 2021 Oct 28
## 25 Nicholas 2021-09-12 2021 Sep 12-20
## 26 Ida 2021-08-27 2021 Aug 27-30
## 27 Henri 2021-08-20 2021 Aug 20-22
## 28 Fred 2021-08-15 2021 Aug 15,16
## 29 Elsa 2021-06-30 2021 Jun 30 - Jul 10
## 30 Iota 2020-11-13 2020 Nov 13-18
## 31 Eta 2020-11-08 2020 Nov 8,9
## 32 Zeta 2020-10-26 2020 Oct 26-29
## 33 Delta 2020-10-07 2020 Oct 7-10
## 34 Beta 2020-09-17 2020 Sep 17-25
## 35 Sally 2020-09-12 2020 Sep 12-16
## 36 Laura 2020-08-24 2020 Aug 24-27
## 37 Marco 2020-08-22 2020 Aug 22-24
## 38 Isaias 2020-07-31 2020 Jul 31 - Aug 4
## 39 Douglas 2020-07-20 2020 Jul 20-30
## 40 Hanna 2020-07-24 2020 Jul 24-26
## 41 Cristobal 2020-06-01 2020 Jun 1-12
## 42 Imelda 2019-09-17 2019 Sep 17-19
## 43 Dorian 2019-09-01 2019 Sep 1-6
## 44 Barry 2019-07-10 2019 Jul 10-14
## 45 Wutip 2019-02-18 2019 Feb 18 - Mar 2
## 46 Michael 2018-10-07 2018 Oct 7-11
## 47 Mangkhut 2018-09-06 2018 Sep 6-17
## 48 Isaac 2018-09-07 2018 Sep 7-15
## 49 Gordon 2018-09-03 2018 Sep 3-8
## 50 Florence 2018-08-31 2018 Aug 31 - Sep 8
## 51 Lane 2018-08-15 2018 Aug 15-29
## 52 Maria 2018-07-03 2018 Jul 3-12
## 53 Alberto 2018-05-25 2018 May 25 - Jun 1
## 54 Mar1 2018-02-28 2018 Feb 28 - Mar 1
## 55 Gita 2018-02-03 2018 Feb 3-22
## 56 Jan17 2018-01-03 2018 Jan 3-17
## 57 Nate 2017-10-04 2017 Oct 4-9
## 58 Maria 2017-09-16 2017 Sep 16-30
## 59 Jose 2017-09-05 2017 Sep 5-25
## 60 Irma 2017-08-30 2017 Aug 30 - Sep 12
## 61 Harvey 2017-08-17 2017 Aug 17 - Sep 1
## 62 Mar 2017-03-13 2017 Mar 13- Apr 13
## 63 Otto 2016-11-20 2016 Nov 20-26
## 64 Matthew 2016-09-28 2016 Sep 28 - Oct 9
## 65 Hermine 2016-08-28 2016 Aug 28 - Sep 8
## 66 Colin 2016-06-05 2016 Jun 5-7
## 67 Jan 2016-01-23 2016 Jan 23-24
## 68 Patricia 2015-10-20 2015 Oct 20-24
## 69 Joaquin 2015-09-28 2015 Sep 28 - Oct 8
## 70 Erika 2015-08-24 2015 Aug 24-28
## 71 Chan-hom 2015-06-29 2015 Jun 29 - Jul 13
## 72 Dolphin 2015-05-06 2015 May 6-24
## 73 Jan 2015-01-26 2015 Jan 26-28
## 74 Odile 2014-09-10 2014 Sep 10-19
## 75 Arthur 2014-07-01 2014 Jul 1-9
## 76 Francisco 2013-10-15 2013 Oct 15-26
## 77 Manuel 2013-09-13 2013 Sep 13-20
## 78 Ingrid 2013-09-12 2013 Sep 12-17
## 79 Sandy 2012-10-22 2012 Oct 22-29
## 80 Isaac 2012-08-21 2012 Aug 21 - Sep 3
## 81 Nov 2011-11-04 2011 Nov 4-11
## 82 Lee 2011-09-02 2011 Sep 2-7
## 83 Irene 2011-08-21 2011 Aug 21-30
## Storm.Tide Obs
## 1 Obs w4: >9.3 mhhw Fcst w6: 15-20 mhhw (FEV)
## 2 w2: 5 mhhw
## 3 w1: 3 mhhw
## 4 w1: 3 mhhw
## 5 w3: 6.3 mhhw
## 6 w2: 5.1 mhhw (FEV)
## 7 w3: 8 mhhw
## 8 w1: 3 mhhw
## 9
## 10
## 11 w2: 3-5 agl (TCR,FEV)
## 12 w1: 1-2 agl (TCR,FEV)
## 13 w4: 8-12 agl (TCR,FEV)
## 14
## 15
## 16
## 17
## 18
## 19
## 20 w2: 5 agl (TCR)
## 21 w5: 10-15 agl (TCR,FEV)
## 22 w1: 1-3 agl (TCR)
## 23 w5: 12.5 agl (FEV)
## 24 (FEV)
## 25 w2: 3-6 agl (TCR)
## 26 w5: 9-14 agl (TCR,FEV) (Service)
## 27 w2: 2-4 agl (TCR,FEV)
## 28 w2: 2-4 agl (TCR)
## 29 w1: 1-3 agl (TCR)
## 30
## 31 w1: 3 agl (TCR)
## 32 w4: 6-10 agl (TCR)
## 33 w3: 6-9 agl (TCR)
## 34 w2: 4.2 agl (TCR)
## 35 w3: 5-7 agl (TCR)
## 36 w6: 12-18 agl (TCR)
## 37 w1: 1-2 agl (TCR)
## 38 w2: 3-6 agl (TCR)
## 39
## 40 w2: 3-5 agl (TCR)
## 41 w3: 6.2 agl (TCR)
## 42 w1: 1-2 agl (TCR)
## 43 w3: 5-7 agl (TCR)
## 44 w2: 4-6 agl (TCR)
## 45
## 46 w5: 9-14 agl (TCR,FEV) (Service)
## 47
## 48 (FEV)
## 49 w1: 1-3 agl (TCR)
## 50 w4: 8-11 agl (TCR,FEV) (Service)
## 51 (FEV)
## 52
## 53 w1: 1-3 agl (TCR)
## 54 (FEV)
## 55
## 56 (FEV)
## 57 w3: 6-9 agl (TCR,FEV)
## 58 w3: 6-9 agl (TCR,FEV)
## 59 w1: < 3 agl (TCR,FEV)
## 60 w3: 5-8 agl (TCR,FEV)
## 61 w4: 6-10 agl (TCR,FEV) Service
## 62 (FEV)
## 63
## 64 w3: 5-7 agl (TCR,FEV) Service
## 65 w3: 4-7 agl (TCR,FEV)
## 66 w1: < 3 agl (TCR)
## 67 w2: 3.5 mhhw Service
## 68
## 69 w2: 3-4 agl (TCR,FEV)
## 70
## 71
## 72
## 73 w2: 3.4 mhhw (FEV)
## 74
## 75 w2: 3-5 agl (TCR)
## 76
## 77
## 78
## 79 w3: 4-9 agl (TCR,FEV) Service
## 80 w6: 10-17 agl (TCR)
## 81
## 82 w2: 4.76 (TCR,FEV)
## 83 w4: 8-11 (TCR,FEV) Service
## Guidance Cat.Pres.Dead..bn
## 1 Warn-A6 (1-8) (Cat4, 938, 221+, $27.5+)
## 2 Warn-A5 (1-15) (Cat2, 972, 0, $1.5)
## 3 PS-Adv 43,46 (Cat4, 949, -, -)
## 4 PS-Adv 2-7, 9-11,13,14 (Cat1, 988, -, -)
## 5 P-ETSS (Cat4, 947, -, -)
## 6 Warn-A5 (6-20) (Cat1, 979, 10, >$1)
## 7 Warn-A33 (29-41) (Cat5, 934, 70, >$6.86)
## 8 (TS, 993, 5, $0.179)
## 9 (Cat5, 922, 52+, $16)
## 10 (Typhoon, 903, 0, -)
## 11 Warn-A4 (TS, 981, 0, $0.450)
## 12 Watch-A34 (34-37) (Cat5, 926, 3, $0.050)
## 13 Warn-A15 (5-20) (Cat4, 940, 7, $2.5)
## 14 (Cat4, 939, 2, $0.690)
## 15 (Cat4, 939, -, -)
## 16 (Cat3, 955, -, -)
## 17 (Cat5, 897, 6, $0.250)
## 18 P-ETSS
## 19 P-ETSS
## 20 Warn-A12 (Cat1, 980, 11, $1)
## 21 Warn-A18 (12-33) (Cat5, 937, 161, $113)
## 22 (Cat4, 931, 29, $3.4)
## 23 P-ETSS (Cat1, 965, -, -)
## 24 P-ETSS River Forcing
## 25 Warn-A5 (2) (Cat1, 988, 2, $1.1)
## 26 Warn-A7 (3-18) (Cat4, 929, 107, $75.3)
## 27 Warn-A22 (18-27) (Cat1, 986, 2, $0.7)
## 28 Warn-A28 (24-29) (TS, 991, 7, $1.3)
## 29 Warn-A28 (Cat1, 991, 13, $1.2)
## 30 (Cat4, 917, 84, $1.4)
## 31 Warn-A33 (31-35) (Cat4, 922, 175, $8.3)
## 32 Warn-A11 (9-18) (Cat3, 970, 9, $4.4)
## 33 Warn-A12 (12-22) (Cat4, 953, 6, $3.0)
## 34 Warn-A9 (-) (TS, 993, 1, $0.225)
## 35 Warn-A8 (5-21) (Cat2, 965, 4, $7.3)
## 36 Warn-A23 (19-30) (Cat4, 937, 81, $23.3)
## 37 Warn-A13 (10-16) (Cat1, 991, 0, $0.035)
## 38 Warn-A23 (15-30) (Cat1, 986, 17, $5.03)
## 39 (Cat4, 954, -, -)
## 40 Warn-A11 (8-13) (Cat1, 973, 9, $1.2)
## 41 Warn-A17 (-) (TS, 988, 46, $0.865)
## 42 (TS, 1003, 7, $4.8)
## 43 WarnA-35 (33-53) (Cat5, 910, 84, $5.1)
## 44 WarnA-5 (1-16) (Cat1, 993, 3, $0.9)
## 45 (Typhoon, 895, -, -)
## 46 Warn-A10 (7-17) (Cat5, 919, 74, $25.5)
## 47 (Typhoon, 868, 134, $3.77)
## 48 (Cat1, 995 mb)
## 49 Warn-A5 (3-11) (TS, 996, 3, $0.2)
## 50 Warn-A50 (48-65) (Cat4, 937, 54, $24.2)
## 51 (Cat5, 926, 1, $0.25)
## 52 (Typhoon, 910, 2, $0.637)
## 53 Watch-A2 (1) (TS, 990, 18, $0.125)
## 54 P-ETSS
## 55 (Cat4, 927, 1, $0.253)
## 56 P-ETSS
## 57 Warn-A9 (7-16) (Cat1, 981, 48, $0.787)
## 58 Warn-A42 (35-46) (Cat5, 908, 3,059, $91.6)
## 59 P-ETSS (Cat4, 938, 1, $0.00284)
## 60 Warn-A36 (34-51) (Cat5, 914, 52, $77.2)
## 61 Warn-A25 (23-42) (Cat4, 937, 107, $125)
## 62 P-ETSS
## 63 (Cat3, 975, 23, $0.192)
## 64 P-Surge (26-44) (Cat5, 934, 731, $16.5)
## 65 P-Surge (11-24) (Cat1, 981, 4, $0.55)
## 66 P-Surge (26-40) (TS, 1001, 6, $0.001)
## 67 ETSS-2.0 (ET, 983, 55, $0.5)
## 68 (Cat5, 872, 8, $0.463)
## 69 ETSS-2.0 (Cat4, 931, 34, $0.2)
## 70 (TS, 1001, 35, $0.511)
## 71 (Typhoon, 933, 18, $1.59)
## 72 (Typhoon, 918, 1, $0.014)
## 73 ETSS-2.0
## 74 (Cat4, 918, 18, $1.25)
## 75 P-Surge (6-14) (Cat2, 973, 0, $0.040)
## 76 (Typhoon, 920, 0, $0.0002)
## 77 (Cat1, 983, 169, $4.2)
## 78 (Cat1, 983, 32, $1.5)
## 79 P-Surge (21-31) (Cat3, 940, 254, $68.7)
## 80 P-Surge (16-38) (Cat1, 965, 41, $3.1)
## 81 ETSS (ET, 943, 1, $0.024)
## 82 (TS, 986, 18, $2.8)
## 83 P-Surge (20-34) (Cat3, 942, 58, $14.2)
## Area retiredName OceanicBasin MHHW AGL maxAGL
## 1 N.W. FL, GA, AL, TN, KY, VA, WV <NA> <NA> >9.3 <NA> <NA>
## 2 LA, MS <NA> <NA> 5 <NA> <NA>
## 3 HI <NA> cp 3 <NA> <NA>
## 4 HI <NA> cp 3 <NA> <NA>
## 5 AK <NA> <NA> 6.3 <NA> <NA>
## 6 N.W. FL, GA, SC, NC <NA> <NA> 5.1 <NA> <NA>
## 7 TX, LA <NA> <NA> 8 <NA> <NA>
## 8 TX, LA <NA> <NA> 3 <NA> <NA>
## 9 MEX (R) ep <NA> <NA> <NA>
## 10 Guam <NA> wp <NA> <NA> <NA>
## 11 NC, VA, MD, NY, DE <NA> <NA> <NA> 3-5 5
## 12 New England <NA> <NA> <NA> 1-2 2
## 13 N.W. FL <NA> <NA> <NA> 8-12 12
## 14 CA <NA> ep <NA> <NA> <NA>
## 15 HI (R) cp <NA> <NA> <NA>
## 16 HI <NA> cp <NA> <NA> <NA>
## 17 Guam <NA> wp <NA> <NA> <NA>
## 18 <NA> <NA> <NA> <NA> <NA>
## 19 <NA> <NA> <NA> <NA> <NA>
## 20 PR, FL <NA> <NA> <NA> 5 <NA>
## 21 S.W. FL, SC (R) <NA> <NA> 10-15 15
## 22 PR (R) <NA> <NA> 1-3 3
## 23 AK <NA> <NA> <NA> 12.5 <NA>
## 24 MD, DE <NA> <NA> <NA> <NA> <NA>
## 25 N. TX <NA> <NA> <NA> 3-6 6
## 26 New Orleans (R) <NA> <NA> 9-14 14
## 27 New England <NA> <NA> <NA> 2-4 4
## 28 N.W. FL <NA> <NA> <NA> 2-4 4
## 29 East Coast <NA> <NA> <NA> 1-3 3
## 30 NIC, HND, COL, SLV, GTM, PAN (R) <NA> <NA> <NA> <NA>
## 31 FL (Tampa Bay) (R) <NA> <NA> 3 <NA>
## 32 New Orleans <NA> <NA> <NA> 6-10 10
## 33 LA <NA> <NA> <NA> 6-9 9
## 34 TX, LA, MS <NA> <NA> <NA> 4.2 <NA>
## 35 New Orleans <NA> <NA> <NA> 5-7 7
## 36 LA (R) <NA> <NA> 12-18 18
## 37 New Orleans <NA> <NA> <NA> 1-2 2
## 38 East Coast <NA> <NA> <NA> 3-6 6
## 39 HI <NA> cp <NA> <NA> <NA>
## 40 S. TX <NA> <NA> <NA> 3-5 5
## 41 LA <NA> <NA> <NA> 6.2 <NA>
## 42 TX, LA <NA> <NA> <NA> 1-2 2
## 43 FL, NC (R) <NA> <NA> 5-7 7
## 44 LA <NA> <NA> <NA> 4-6 6
## 45 Guam <NA> wp <NA> <NA> <NA>
## 46 US Gulf Coast (R) <NA> <NA> 9-14 14
## 47 Guam <NA> wp <NA> <NA> <NA>
## 48 PR - (Rain) <NA> <NA> <NA> <NA> <NA>
## 49 MS, AR, FL Keys <NA> <NA> <NA> 1-3 3
## 50 Eastern US (R) <NA> <NA> 8-11 11
## 51 HI <NA> cp <NA> <NA> <NA>
## 52 Guam <NA> wp <NA> <NA> <NA>
## 53 MS <NA> <NA> <NA> 1-3 3
## 54 DE, NJ, NY, CT, RI, VT, ME <NA> <NA> <NA> <NA> <NA>
## 55 Am Samoa <NA> sp <NA> <NA> <NA>
## 56 MA <NA> <NA> <NA> <NA> <NA>
## 57 AL, LA, MS (R) <NA> <NA> 6-9 9
## 58 PR (R) <NA> <NA> 6-9 9
## 59 CT, RI, MA <NA> <NA> <NA> < 3 3
## 60 FL, PR (R) <NA> <NA> 5-8 8
## 61 TX, LA (R) <NA> <NA> 6-10 10
## 62 NY <NA> <NA> <NA> <NA> <NA>
## 63 PAN, CRI, NIC (R) <NA> <NA> <NA> <NA>
## 64 SE US (R) <NA> <NA> 5-7 7
## 65 N.W. FL <NA> <NA> <NA> 4-7 7
## 66 N.W. FL <NA> <NA> <NA> < 3 3
## 67 DE, NJ (Cape May) <NA> <NA> 3.5 <NA> <NA>
## 68 MEX, TX (R) ep <NA> <NA> <NA>
## 69 VA, BHS, BMU, CUB, HTI (R) <NA> <NA> 3-4 4
## 70 DMA, HTI (R) <NA> <NA> <NA> <NA>
## 71 Guam <NA> wp <NA> <NA> <NA>
## 72 Guam <NA> wp <NA> <NA> <NA>
## 73 CT, MA <NA> <NA> 3.4 <NA> <NA>
## 74 MEX (R) ep <NA> <NA> <NA>
## 75 NC <NA> <NA> <NA> 3-5 5
## 76 Guam <NA> wp <NA> <NA> <NA>
## 77 MEX (R) ep <NA> <NA> <NA>
## 78 MEX (R) <NA> <NA> <NA> <NA>
## 79 Mid-Atlantic, NE US (R) <NA> <NA> 4-9 9
## 80 LA <NA> <NA> <NA> 10-17 17
## 81 AK <NA> <NA> <NA> <NA> <NA>
## 82 LA, MS <NA> <NA> <NA> <NA> <NA>
## 83 Mid-Atlantic, NE US (R) <NA> <NA> <NA> <NA>
stormdata <- stormdata %>%
mutate(Cat.Pres.Dead..bn = str_extract(Cat.Pres.Dead..bn, "(?<=\\().*(?=\\))")) %>%
separate(Cat.Pres.Dead..bn, into = c("Category","Pressure","Dead","Cost"), sep = ",") %>%
mutate(MinCost = str_extract(Cost,"(?<=\\$).*")) %>%
mutate(MinCost = str_replace(MinCost,"\\+$","")) %>%
mutate(MinCost = as.numeric(MinCost) * 1000000000) %>%
mutate(Pressure = str_replace(Pressure,"mb","")) %>%
mutate(Pressure = as.numeric(Pressure)) %>%
mutate(Dead = as.integer(Dead))
stormdata
## Storm startDate year Date
## 1 Helene 2024-09-23 2024 Sep 23-29
## 2 Francine 2024-09-12 2024 Sep 12
## 3 Gilma 2024-08-30 2024 Aug 30
## 4 Hone 2024-08-25 2024 Aug 25
## 5 Post- Typhoon Ampil 2024-08-22 2024 Aug 22
## 6 Debby 2024-08-05 2024 Aug 5,6
## 7 Beryl 2024-07-08 2024 Jul 8
## 8 Alberto 2024-06-19 2024 Jun 19,20
## 9 Otis 2023-10-22 2023 Oct 22-25
## 10 Bolaven 2023-10-06 2023 Oct 6-14
## 11 Ophelia 2023-09-22 2023 Sep 22-24
## 12 Lee 2023-09-13 2023 Sep 13,14
## 13 Idalia 2023-08-27 2023 Aug 27-31
## 14 Hillary 2023-08-16 2023 Aug 16-22
## 15 Dora 2023-07-31 2023 Jul 31 - Aug 21
## 16 Calvin 2023-07-11 2023 Jul 11-19
## 17 Mawar 2023-05-19 2023 May 19 - Jun 3
## 18 Jan28 2023-01-28 2023 Jan 28,29
## 19 Jan14 2023-01-14 2023 Jan 14-17
## 20 Nicole 2022-11-07 2022 Nov 7-11
## 21 Ian 2022-09-26 2022 Sep 26-30
## 22 Fiona 2022-09-14 2022 Sep 14-27
## 23 Post- Typhoon Merbok 2022-09-14 2022 Sep 14-19
## 24 Oct 2021-10-28 2021 Oct 28
## 25 Nicholas 2021-09-12 2021 Sep 12-20
## 26 Ida 2021-08-27 2021 Aug 27-30
## 27 Henri 2021-08-20 2021 Aug 20-22
## 28 Fred 2021-08-15 2021 Aug 15,16
## 29 Elsa 2021-06-30 2021 Jun 30 - Jul 10
## 30 Iota 2020-11-13 2020 Nov 13-18
## 31 Eta 2020-11-08 2020 Nov 8,9
## 32 Zeta 2020-10-26 2020 Oct 26-29
## 33 Delta 2020-10-07 2020 Oct 7-10
## 34 Beta 2020-09-17 2020 Sep 17-25
## 35 Sally 2020-09-12 2020 Sep 12-16
## 36 Laura 2020-08-24 2020 Aug 24-27
## 37 Marco 2020-08-22 2020 Aug 22-24
## 38 Isaias 2020-07-31 2020 Jul 31 - Aug 4
## 39 Douglas 2020-07-20 2020 Jul 20-30
## 40 Hanna 2020-07-24 2020 Jul 24-26
## 41 Cristobal 2020-06-01 2020 Jun 1-12
## 42 Imelda 2019-09-17 2019 Sep 17-19
## 43 Dorian 2019-09-01 2019 Sep 1-6
## 44 Barry 2019-07-10 2019 Jul 10-14
## 45 Wutip 2019-02-18 2019 Feb 18 - Mar 2
## 46 Michael 2018-10-07 2018 Oct 7-11
## 47 Mangkhut 2018-09-06 2018 Sep 6-17
## 48 Isaac 2018-09-07 2018 Sep 7-15
## 49 Gordon 2018-09-03 2018 Sep 3-8
## 50 Florence 2018-08-31 2018 Aug 31 - Sep 8
## 51 Lane 2018-08-15 2018 Aug 15-29
## 52 Maria 2018-07-03 2018 Jul 3-12
## 53 Alberto 2018-05-25 2018 May 25 - Jun 1
## 54 Mar1 2018-02-28 2018 Feb 28 - Mar 1
## 55 Gita 2018-02-03 2018 Feb 3-22
## 56 Jan17 2018-01-03 2018 Jan 3-17
## 57 Nate 2017-10-04 2017 Oct 4-9
## 58 Maria 2017-09-16 2017 Sep 16-30
## 59 Jose 2017-09-05 2017 Sep 5-25
## 60 Irma 2017-08-30 2017 Aug 30 - Sep 12
## 61 Harvey 2017-08-17 2017 Aug 17 - Sep 1
## 62 Mar 2017-03-13 2017 Mar 13- Apr 13
## 63 Otto 2016-11-20 2016 Nov 20-26
## 64 Matthew 2016-09-28 2016 Sep 28 - Oct 9
## 65 Hermine 2016-08-28 2016 Aug 28 - Sep 8
## 66 Colin 2016-06-05 2016 Jun 5-7
## 67 Jan 2016-01-23 2016 Jan 23-24
## 68 Patricia 2015-10-20 2015 Oct 20-24
## 69 Joaquin 2015-09-28 2015 Sep 28 - Oct 8
## 70 Erika 2015-08-24 2015 Aug 24-28
## 71 Chan-hom 2015-06-29 2015 Jun 29 - Jul 13
## 72 Dolphin 2015-05-06 2015 May 6-24
## 73 Jan 2015-01-26 2015 Jan 26-28
## 74 Odile 2014-09-10 2014 Sep 10-19
## 75 Arthur 2014-07-01 2014 Jul 1-9
## 76 Francisco 2013-10-15 2013 Oct 15-26
## 77 Manuel 2013-09-13 2013 Sep 13-20
## 78 Ingrid 2013-09-12 2013 Sep 12-17
## 79 Sandy 2012-10-22 2012 Oct 22-29
## 80 Isaac 2012-08-21 2012 Aug 21 - Sep 3
## 81 Nov 2011-11-04 2011 Nov 4-11
## 82 Lee 2011-09-02 2011 Sep 2-7
## 83 Irene 2011-08-21 2011 Aug 21-30
## Storm.Tide Obs
## 1 Obs w4: >9.3 mhhw Fcst w6: 15-20 mhhw (FEV)
## 2 w2: 5 mhhw
## 3 w1: 3 mhhw
## 4 w1: 3 mhhw
## 5 w3: 6.3 mhhw
## 6 w2: 5.1 mhhw (FEV)
## 7 w3: 8 mhhw
## 8 w1: 3 mhhw
## 9
## 10
## 11 w2: 3-5 agl (TCR,FEV)
## 12 w1: 1-2 agl (TCR,FEV)
## 13 w4: 8-12 agl (TCR,FEV)
## 14
## 15
## 16
## 17
## 18
## 19
## 20 w2: 5 agl (TCR)
## 21 w5: 10-15 agl (TCR,FEV)
## 22 w1: 1-3 agl (TCR)
## 23 w5: 12.5 agl (FEV)
## 24 (FEV)
## 25 w2: 3-6 agl (TCR)
## 26 w5: 9-14 agl (TCR,FEV) (Service)
## 27 w2: 2-4 agl (TCR,FEV)
## 28 w2: 2-4 agl (TCR)
## 29 w1: 1-3 agl (TCR)
## 30
## 31 w1: 3 agl (TCR)
## 32 w4: 6-10 agl (TCR)
## 33 w3: 6-9 agl (TCR)
## 34 w2: 4.2 agl (TCR)
## 35 w3: 5-7 agl (TCR)
## 36 w6: 12-18 agl (TCR)
## 37 w1: 1-2 agl (TCR)
## 38 w2: 3-6 agl (TCR)
## 39
## 40 w2: 3-5 agl (TCR)
## 41 w3: 6.2 agl (TCR)
## 42 w1: 1-2 agl (TCR)
## 43 w3: 5-7 agl (TCR)
## 44 w2: 4-6 agl (TCR)
## 45
## 46 w5: 9-14 agl (TCR,FEV) (Service)
## 47
## 48 (FEV)
## 49 w1: 1-3 agl (TCR)
## 50 w4: 8-11 agl (TCR,FEV) (Service)
## 51 (FEV)
## 52
## 53 w1: 1-3 agl (TCR)
## 54 (FEV)
## 55
## 56 (FEV)
## 57 w3: 6-9 agl (TCR,FEV)
## 58 w3: 6-9 agl (TCR,FEV)
## 59 w1: < 3 agl (TCR,FEV)
## 60 w3: 5-8 agl (TCR,FEV)
## 61 w4: 6-10 agl (TCR,FEV) Service
## 62 (FEV)
## 63
## 64 w3: 5-7 agl (TCR,FEV) Service
## 65 w3: 4-7 agl (TCR,FEV)
## 66 w1: < 3 agl (TCR)
## 67 w2: 3.5 mhhw Service
## 68
## 69 w2: 3-4 agl (TCR,FEV)
## 70
## 71
## 72
## 73 w2: 3.4 mhhw (FEV)
## 74
## 75 w2: 3-5 agl (TCR)
## 76
## 77
## 78
## 79 w3: 4-9 agl (TCR,FEV) Service
## 80 w6: 10-17 agl (TCR)
## 81
## 82 w2: 4.76 (TCR,FEV)
## 83 w4: 8-11 (TCR,FEV) Service
## Guidance Category Pressure Dead Cost
## 1 Warn-A6 (1-8) Cat4 938 NA $27.5+
## 2 Warn-A5 (1-15) Cat2 972 0 $1.5
## 3 PS-Adv 43,46 Cat4 949 NA -
## 4 PS-Adv 2-7, 9-11,13,14 Cat1 988 NA -
## 5 P-ETSS Cat4 947 NA -
## 6 Warn-A5 (6-20) Cat1 979 10 >$1
## 7 Warn-A33 (29-41) Cat5 934 70 >$6.86
## 8 TS 993 5 $0.179
## 9 Cat5 922 NA $16
## 10 Typhoon 903 0 -
## 11 Warn-A4 TS 981 0 $0.450
## 12 Watch-A34 (34-37) Cat5 926 3 $0.050
## 13 Warn-A15 (5-20) Cat4 940 7 $2.5
## 14 Cat4 939 2 $0.690
## 15 Cat4 939 NA -
## 16 Cat3 955 NA -
## 17 Cat5 897 6 $0.250
## 18 P-ETSS <NA> NA NA <NA>
## 19 P-ETSS <NA> NA NA <NA>
## 20 Warn-A12 Cat1 980 11 $1
## 21 Warn-A18 (12-33) Cat5 937 161 $113
## 22 Cat4 931 29 $3.4
## 23 P-ETSS Cat1 965 NA -
## 24 P-ETSS River Forcing <NA> NA NA <NA>
## 25 Warn-A5 (2) Cat1 988 2 $1.1
## 26 Warn-A7 (3-18) Cat4 929 107 $75.3
## 27 Warn-A22 (18-27) Cat1 986 2 $0.7
## 28 Warn-A28 (24-29) TS 991 7 $1.3
## 29 Warn-A28 Cat1 991 13 $1.2
## 30 Cat4 917 84 $1.4
## 31 Warn-A33 (31-35) Cat4 922 175 $8.3
## 32 Warn-A11 (9-18) Cat3 970 9 $4.4
## 33 Warn-A12 (12-22) Cat4 953 6 $3.0
## 34 Warn-A9 (-) TS 993 1 $0.225
## 35 Warn-A8 (5-21) Cat2 965 4 $7.3
## 36 Warn-A23 (19-30) Cat4 937 81 $23.3
## 37 Warn-A13 (10-16) Cat1 991 0 $0.035
## 38 Warn-A23 (15-30) Cat1 986 17 $5.03
## 39 Cat4 954 NA -
## 40 Warn-A11 (8-13) Cat1 973 9 $1.2
## 41 Warn-A17 (-) TS 988 46 $0.865
## 42 TS 1003 7 $4.8
## 43 WarnA-35 (33-53) Cat5 910 84 $5.1
## 44 WarnA-5 (1-16) Cat1 993 3 $0.9
## 45 Typhoon 895 NA -
## 46 Warn-A10 (7-17) Cat5 919 74 $25.5
## 47 Typhoon 868 134 $3.77
## 48 Cat1 995 NA <NA>
## 49 Warn-A5 (3-11) TS 996 3 $0.2
## 50 Warn-A50 (48-65) Cat4 937 54 $24.2
## 51 Cat5 926 1 $0.25
## 52 Typhoon 910 2 $0.637
## 53 Watch-A2 (1) TS 990 18 $0.125
## 54 P-ETSS <NA> NA NA <NA>
## 55 Cat4 927 1 $0.253
## 56 P-ETSS <NA> NA NA <NA>
## 57 Warn-A9 (7-16) Cat1 981 48 $0.787
## 58 Warn-A42 (35-46) Cat5 908 3 059
## 59 P-ETSS Cat4 938 1 $0.00284
## 60 Warn-A36 (34-51) Cat5 914 52 $77.2
## 61 Warn-A25 (23-42) Cat4 937 107 $125
## 62 P-ETSS <NA> NA NA <NA>
## 63 Cat3 975 23 $0.192
## 64 P-Surge (26-44) Cat5 934 731 $16.5
## 65 P-Surge (11-24) Cat1 981 4 $0.55
## 66 P-Surge (26-40) TS 1001 6 $0.001
## 67 ETSS-2.0 ET 983 55 $0.5
## 68 Cat5 872 8 $0.463
## 69 ETSS-2.0 Cat4 931 34 $0.2
## 70 TS 1001 35 $0.511
## 71 Typhoon 933 18 $1.59
## 72 Typhoon 918 1 $0.014
## 73 ETSS-2.0 <NA> NA NA <NA>
## 74 Cat4 918 18 $1.25
## 75 P-Surge (6-14) Cat2 973 0 $0.040
## 76 Typhoon 920 0 $0.0002
## 77 Cat1 983 169 $4.2
## 78 Cat1 983 32 $1.5
## 79 P-Surge (21-31) Cat3 940 254 $68.7
## 80 P-Surge (16-38) Cat1 965 41 $3.1
## 81 ETSS ET 943 1 $0.024
## 82 TS 986 18 $2.8
## 83 P-Surge (20-34) Cat3 942 58 $14.2
## Area retiredName OceanicBasin MHHW AGL maxAGL
## 1 N.W. FL, GA, AL, TN, KY, VA, WV <NA> <NA> >9.3 <NA> <NA>
## 2 LA, MS <NA> <NA> 5 <NA> <NA>
## 3 HI <NA> cp 3 <NA> <NA>
## 4 HI <NA> cp 3 <NA> <NA>
## 5 AK <NA> <NA> 6.3 <NA> <NA>
## 6 N.W. FL, GA, SC, NC <NA> <NA> 5.1 <NA> <NA>
## 7 TX, LA <NA> <NA> 8 <NA> <NA>
## 8 TX, LA <NA> <NA> 3 <NA> <NA>
## 9 MEX (R) ep <NA> <NA> <NA>
## 10 Guam <NA> wp <NA> <NA> <NA>
## 11 NC, VA, MD, NY, DE <NA> <NA> <NA> 3-5 5
## 12 New England <NA> <NA> <NA> 1-2 2
## 13 N.W. FL <NA> <NA> <NA> 8-12 12
## 14 CA <NA> ep <NA> <NA> <NA>
## 15 HI (R) cp <NA> <NA> <NA>
## 16 HI <NA> cp <NA> <NA> <NA>
## 17 Guam <NA> wp <NA> <NA> <NA>
## 18 <NA> <NA> <NA> <NA> <NA>
## 19 <NA> <NA> <NA> <NA> <NA>
## 20 PR, FL <NA> <NA> <NA> 5 <NA>
## 21 S.W. FL, SC (R) <NA> <NA> 10-15 15
## 22 PR (R) <NA> <NA> 1-3 3
## 23 AK <NA> <NA> <NA> 12.5 <NA>
## 24 MD, DE <NA> <NA> <NA> <NA> <NA>
## 25 N. TX <NA> <NA> <NA> 3-6 6
## 26 New Orleans (R) <NA> <NA> 9-14 14
## 27 New England <NA> <NA> <NA> 2-4 4
## 28 N.W. FL <NA> <NA> <NA> 2-4 4
## 29 East Coast <NA> <NA> <NA> 1-3 3
## 30 NIC, HND, COL, SLV, GTM, PAN (R) <NA> <NA> <NA> <NA>
## 31 FL (Tampa Bay) (R) <NA> <NA> 3 <NA>
## 32 New Orleans <NA> <NA> <NA> 6-10 10
## 33 LA <NA> <NA> <NA> 6-9 9
## 34 TX, LA, MS <NA> <NA> <NA> 4.2 <NA>
## 35 New Orleans <NA> <NA> <NA> 5-7 7
## 36 LA (R) <NA> <NA> 12-18 18
## 37 New Orleans <NA> <NA> <NA> 1-2 2
## 38 East Coast <NA> <NA> <NA> 3-6 6
## 39 HI <NA> cp <NA> <NA> <NA>
## 40 S. TX <NA> <NA> <NA> 3-5 5
## 41 LA <NA> <NA> <NA> 6.2 <NA>
## 42 TX, LA <NA> <NA> <NA> 1-2 2
## 43 FL, NC (R) <NA> <NA> 5-7 7
## 44 LA <NA> <NA> <NA> 4-6 6
## 45 Guam <NA> wp <NA> <NA> <NA>
## 46 US Gulf Coast (R) <NA> <NA> 9-14 14
## 47 Guam <NA> wp <NA> <NA> <NA>
## 48 PR - (Rain) <NA> <NA> <NA> <NA> <NA>
## 49 MS, AR, FL Keys <NA> <NA> <NA> 1-3 3
## 50 Eastern US (R) <NA> <NA> 8-11 11
## 51 HI <NA> cp <NA> <NA> <NA>
## 52 Guam <NA> wp <NA> <NA> <NA>
## 53 MS <NA> <NA> <NA> 1-3 3
## 54 DE, NJ, NY, CT, RI, VT, ME <NA> <NA> <NA> <NA> <NA>
## 55 Am Samoa <NA> sp <NA> <NA> <NA>
## 56 MA <NA> <NA> <NA> <NA> <NA>
## 57 AL, LA, MS (R) <NA> <NA> 6-9 9
## 58 PR (R) <NA> <NA> 6-9 9
## 59 CT, RI, MA <NA> <NA> <NA> < 3 3
## 60 FL, PR (R) <NA> <NA> 5-8 8
## 61 TX, LA (R) <NA> <NA> 6-10 10
## 62 NY <NA> <NA> <NA> <NA> <NA>
## 63 PAN, CRI, NIC (R) <NA> <NA> <NA> <NA>
## 64 SE US (R) <NA> <NA> 5-7 7
## 65 N.W. FL <NA> <NA> <NA> 4-7 7
## 66 N.W. FL <NA> <NA> <NA> < 3 3
## 67 DE, NJ (Cape May) <NA> <NA> 3.5 <NA> <NA>
## 68 MEX, TX (R) ep <NA> <NA> <NA>
## 69 VA, BHS, BMU, CUB, HTI (R) <NA> <NA> 3-4 4
## 70 DMA, HTI (R) <NA> <NA> <NA> <NA>
## 71 Guam <NA> wp <NA> <NA> <NA>
## 72 Guam <NA> wp <NA> <NA> <NA>
## 73 CT, MA <NA> <NA> 3.4 <NA> <NA>
## 74 MEX (R) ep <NA> <NA> <NA>
## 75 NC <NA> <NA> <NA> 3-5 5
## 76 Guam <NA> wp <NA> <NA> <NA>
## 77 MEX (R) ep <NA> <NA> <NA>
## 78 MEX (R) <NA> <NA> <NA> <NA>
## 79 Mid-Atlantic, NE US (R) <NA> <NA> 4-9 9
## 80 LA <NA> <NA> <NA> 10-17 17
## 81 AK <NA> <NA> <NA> <NA> <NA>
## 82 LA, MS <NA> <NA> <NA> <NA> <NA>
## 83 Mid-Atlantic, NE US (R) <NA> <NA> <NA> <NA>
## MinCost
## 1 2.75e+10
## 2 1.50e+09
## 3 NA
## 4 NA
## 5 NA
## 6 1.00e+09
## 7 6.86e+09
## 8 1.79e+08
## 9 1.60e+10
## 10 NA
## 11 4.50e+08
## 12 5.00e+07
## 13 2.50e+09
## 14 6.90e+08
## 15 NA
## 16 NA
## 17 2.50e+08
## 18 NA
## 19 NA
## 20 1.00e+09
## 21 1.13e+11
## 22 3.40e+09
## 23 NA
## 24 NA
## 25 1.10e+09
## 26 7.53e+10
## 27 7.00e+08
## 28 1.30e+09
## 29 1.20e+09
## 30 1.40e+09
## 31 8.30e+09
## 32 4.40e+09
## 33 3.00e+09
## 34 2.25e+08
## 35 7.30e+09
## 36 2.33e+10
## 37 3.50e+07
## 38 5.03e+09
## 39 NA
## 40 1.20e+09
## 41 8.65e+08
## 42 4.80e+09
## 43 5.10e+09
## 44 9.00e+08
## 45 NA
## 46 2.55e+10
## 47 3.77e+09
## 48 NA
## 49 2.00e+08
## 50 2.42e+10
## 51 2.50e+08
## 52 6.37e+08
## 53 1.25e+08
## 54 NA
## 55 2.53e+08
## 56 NA
## 57 7.87e+08
## 58 NA
## 59 2.84e+06
## 60 7.72e+10
## 61 1.25e+11
## 62 NA
## 63 1.92e+08
## 64 1.65e+10
## 65 5.50e+08
## 66 1.00e+06
## 67 5.00e+08
## 68 4.63e+08
## 69 2.00e+08
## 70 5.11e+08
## 71 1.59e+09
## 72 1.40e+07
## 73 NA
## 74 1.25e+09
## 75 4.00e+07
## 76 2.00e+05
## 77 4.20e+09
## 78 1.50e+09
## 79 6.87e+10
## 80 3.10e+09
## 81 2.40e+07
## 82 2.80e+09
## 83 1.42e+10
stormdata <- stormdata %>%
separate_rows(Area,sep = ", ")
Analyze_storm_data <- stormdata %>%
select(Storm, Area, startDate, year, Category, Pressure, Dead, maxAGL, MinCost) %>%
drop_na(maxAGL)
print(Analyze_storm_data,n=100)
## # A tibble: 56 × 9
## Storm Area startDate year Category Pressure Dead maxAGL MinCost
## <chr> <chr> <date> <chr> <chr> <dbl> <int> <chr> <dbl>
## 1 Ophelia NC 2023-09-22 2023 TS 981 0 5 4.50e 8
## 2 Ophelia VA 2023-09-22 2023 TS 981 0 5 4.50e 8
## 3 Ophelia MD 2023-09-22 2023 TS 981 0 5 4.50e 8
## 4 Ophelia NY 2023-09-22 2023 TS 981 0 5 4.50e 8
## 5 Ophelia DE 2023-09-22 2023 TS 981 0 5 4.50e 8
## 6 Lee New England 2023-09-13 2023 Cat5 926 3 2 5 e 7
## 7 Idalia N.W. FL 2023-08-27 2023 Cat4 940 7 12 2.5 e 9
## 8 Ian S.W. FL 2022-09-26 2022 Cat5 937 161 15 1.13e11
## 9 Ian SC 2022-09-26 2022 Cat5 937 161 15 1.13e11
## 10 Fiona PR 2022-09-14 2022 Cat4 931 29 3 3.4 e 9
## 11 Nicholas N. TX 2021-09-12 2021 Cat1 988 2 6 1.10e 9
## 12 Ida New Orleans 2021-08-27 2021 Cat4 929 107 14 7.53e10
## 13 Henri New England 2021-08-20 2021 Cat1 986 2 4 7 e 8
## 14 Fred N.W. FL 2021-08-15 2021 TS 991 7 4 1.3 e 9
## 15 Elsa East Coast 2021-06-30 2021 Cat1 991 13 3 1.20e 9
## 16 Zeta New Orleans 2020-10-26 2020 Cat3 970 9 10 4.40e 9
## 17 Delta LA 2020-10-07 2020 Cat4 953 6 9 3 e 9
## 18 Sally New Orleans 2020-09-12 2020 Cat2 965 4 7 7.30e 9
## 19 Laura LA 2020-08-24 2020 Cat4 937 81 18 2.33e10
## 20 Marco New Orleans 2020-08-22 2020 Cat1 991 0 2 3.5 e 7
## 21 Isaias East Coast 2020-07-31 2020 Cat1 986 17 6 5.03e 9
## 22 Hanna S. TX 2020-07-24 2020 Cat1 973 9 5 1.20e 9
## 23 Imelda TX 2019-09-17 2019 TS 1003 7 2 4.80e 9
## 24 Imelda LA 2019-09-17 2019 TS 1003 7 2 4.80e 9
## 25 Dorian FL 2019-09-01 2019 Cat5 910 84 7 5.10e 9
## 26 Dorian NC 2019-09-01 2019 Cat5 910 84 7 5.10e 9
## 27 Barry LA 2019-07-10 2019 Cat1 993 3 6 9 e 8
## 28 Michael US Gulf Co… 2018-10-07 2018 Cat5 919 74 14 2.55e10
## 29 Gordon MS 2018-09-03 2018 TS 996 3 3 2 e 8
## 30 Gordon AR 2018-09-03 2018 TS 996 3 3 2 e 8
## 31 Gordon FL Keys 2018-09-03 2018 TS 996 3 3 2 e 8
## 32 Florence Eastern US 2018-08-31 2018 Cat4 937 54 11 2.42e10
## 33 Alberto MS 2018-05-25 2018 TS 990 18 3 1.25e 8
## 34 Nate AL 2017-10-04 2017 Cat1 981 48 9 7.87e 8
## 35 Nate LA 2017-10-04 2017 Cat1 981 48 9 7.87e 8
## 36 Nate MS 2017-10-04 2017 Cat1 981 48 9 7.87e 8
## 37 Maria PR 2017-09-16 2017 Cat5 908 3 9 NA
## 38 Jose CT 2017-09-05 2017 Cat4 938 1 3 2.84e 6
## 39 Jose RI 2017-09-05 2017 Cat4 938 1 3 2.84e 6
## 40 Jose MA 2017-09-05 2017 Cat4 938 1 3 2.84e 6
## 41 Irma FL 2017-08-30 2017 Cat5 914 52 8 7.72e10
## 42 Irma PR 2017-08-30 2017 Cat5 914 52 8 7.72e10
## 43 Harvey TX 2017-08-17 2017 Cat4 937 107 10 1.25e11
## 44 Harvey LA 2017-08-17 2017 Cat4 937 107 10 1.25e11
## 45 Matthew SE US 2016-09-28 2016 Cat5 934 731 7 1.65e10
## 46 Hermine N.W. FL 2016-08-28 2016 Cat1 981 4 7 5.50e 8
## 47 Colin N.W. FL 2016-06-05 2016 TS 1001 6 3 1 e 6
## 48 Joaquin VA 2015-09-28 2015 Cat4 931 34 4 2 e 8
## 49 Joaquin BHS 2015-09-28 2015 Cat4 931 34 4 2 e 8
## 50 Joaquin BMU 2015-09-28 2015 Cat4 931 34 4 2 e 8
## 51 Joaquin CUB 2015-09-28 2015 Cat4 931 34 4 2 e 8
## 52 Joaquin HTI 2015-09-28 2015 Cat4 931 34 4 2 e 8
## 53 Arthur NC 2014-07-01 2014 Cat2 973 0 5 4 e 7
## 54 Sandy Mid-Atlant… 2012-10-22 2012 Cat3 940 254 9 6.87e10
## 55 Sandy NE US 2012-10-22 2012 Cat3 940 254 9 6.87e10
## 56 Isaac LA 2012-08-21 2012 Cat1 965 41 17 3.10e 9
After graphing various relationships between time, maxAGL, Pressure, Dead, and MinCost, most data appears not to exhibit any obvious trends. The exception is the number Dead graphed against Minimum Cost, where it seems to have a positively correlated relationship (costs increase as number dead increases).
Analyze_storm_data %>% ggplot(aes(x = maxAGL,y = MinCost)) + geom_point()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
Analyze_storm_data %>% ggplot(aes(x = maxAGL,y = Dead)) + geom_point()
Analyze_storm_data %>% ggplot(aes(x = Dead,y = MinCost)) + geom_point()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
Analyze_storm_data %>% ggplot(aes(x = maxAGL,y = Pressure)) + geom_point()
Analyze_storm_data %>% ggplot(aes(x = startDate,y = maxAGL)) + geom_point()
Analyze_storm_data %>% ggplot(aes(x = startDate,y = Pressure)) + geom_point()
Analyze_storm_data %>% ggplot(aes(x = startDate,y = MinCost)) + geom_point()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
We can look at number of named hurricanes per geographic area with the following bar chart. However, next steps for data cleaning could be performing a better geographic categorization of the areas, as the format is inconsistent (e.g. format is shown in region of a state, state, region of the US, and even other countries outside of the US). Instead, only a standardized format should be used, like State. This would mean that other non-US countries should be excluded from this dataset, and an understanding of the larger categories (e.g. New England) would have to be developed.
count_area <- Analyze_storm_data %>%
group_by(Area) %>%
summarise(count = n())
count_area %>% ggplot(aes(x = count, y = reorder(Area,count))) +
geom_col() +
labs(title = "Number of Named Storms with Measureable Storm Surge since 2011",
y = "Area")
For my second dataset, I used the Buffalo Monthly Snowfall data from NCAA. This dataset monthly snowfall from 1940 to current day. To create the CSV file, I used BeautifulSoup to scrape the data from the NCAA website directly.
import re
from bs4 import BeautifulSoup
import requests
import pandas as pd
import numpy as np
full_url = "https://www.weather.gov/buf/BuffaloSnow"
print(full_url)
page = requests.get(full_url, headers={"User-Agent": "Mozilla/5.0 (X11; CrOS x86_64 12871.102.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.141 Safari/537.36"}).text
soup = BeautifulSoup(page, "html.parser")
tags = []
index = 0
def returnTrue(tag):
return tag.font;
allData = soup.find_all("tr")
for i in allData:
tags.insert(len(tags),i.get_text().strip())
df = pd.DataFrame(columns = ['SEASON','JUL','AUG','SEP','OCT','NOV','DEC','JAN','FEB','MAR','APR','MAY','JUN','ANNUAL'])
for element in tags:
row = element.strip().split('\n')
#print(row)
if len(row) == 14:
df.loc[len(df)] = row
df.to_csv("buffalosnow.csv",sep =';',index=False)
This dataset appears to be cleaner than the storm dataset. Some potentially problematic areas would be the use of T to indicate “Trace” and the repeated headers located in the data.
buffalosnow <- read.csv("C:\\Users\\Kim\\Documents\\GitHub\\Data607\\Project 2\\buffalosnow.csv", header = TRUE, sep = ";")
buffalosnow
## SEASON JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN ANNUAL
## 1 SEASON JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN ANNUAL
## 2 1940-41 0 0 0 T 17.5 12.1 17.3 23.1 9.3 T 0 0 79.3
## 3 1941-42 0 0 0 T 5.0 7.8 31.0 28.0 13.7 4.1 0 0 89.6
## 4 1942-43 0 0 0 T 8.7 26.7 16.9 17.7 10.4 5.1 T 0 85.5
## 5 1943-44 0 0 0 1.5 13.6 1.7 3.4 24.6 10.5 2.7 0 0 58.0
## 6 1944-45 0 0 0 0 3.9 35.1 50.6 23.3 5.8 T 2.0 0 120.7
## 7 1945-46 0 0 0 T 25.2 51.1 10.7 23.5 T T 0 0 110.5
## 8 1946-47 0 0 0 0 T 11.9 13.0 22.2 13.5 4.0 0.8 0 65.4
## 9 1947-48 0 0 0 0 9.9 4.3 16.7 7.0 4.2 T T 0 42.1
## 10 1948-49 0 0 0 T 1.3 7.0 11.8 5.2 14.3 0.5 0 0 40.1
## 11 1949-50 0 0 0 0 28.6 9.5 14.8 19.3 13.7 2.8 0 0 88.7
## 12 SEASON JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN ANNUAL
## 13 1950-51 0 0 T 0 15.2 22.3 13.7 12.3 7.9 T 0 0 71.4
## 14 1951-52 0 0 0 T 17.7 31.6 16.8 8.9 7.4 0.6 0 0 83.0
## 15 1952-53 0 0 0 0.4 11.2 16.1 8.0 11.9 5.5 2.8 0 0 55.9
## 16 1953-54 0 0 0 0 22.7 10.9 16.2 2.6 26.3 1.2 T 0 89.9
## 17 1954-55 0 0 0 1.0 0.7 31.2 28.5 12.0 11.1 0.3 0 0 84.8
## 18 1955-56 0 0 0 T 26.0 20.3 17.7 14.3 23.1 3.8 T 0 105.2
## 19 1956-57 0 0 T 0 16.2 35.6 36.6 3.1 10.2 12.0 0 0 113.7
## 20 1957-58 0 0 0 T 19.2 4.8 31.1 54.2 11.2 4.2 0 0 124.7
## 21 1958-59 0 0 0 0 13.5 19.0 38.4 14.0 29.2 0.4 T 0 114.5
## 22 1959-60 0 0 0 T 12.2 14.2 18.3 49.5 19.5 1.9 T 0 115.6
## 23 SEASON JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN ANNUAL
## 24 1960-61 0 0 0 1.0 15.9 31.9 23.5 11.9 5.1 13.1 T 0 102.4
## 25 1961-62 0 0 0 T 5.6 30.2 26.2 28.2 6.7 4.5 0 0 101.4
## 26 1962-63 0 0 0 2.0 2.5 30.2 31.5 15.5 7.7 0.3 0.1 0 89.8
## 27 1963-64 0 0 0 0 3.1 24.0 13.7 14.6 12.8 3.3 0 0 71.5
## 28 1964-65 0 0 0 T 5.4 15.2 19.2 9.4 17.5 4.2 0 0 70.9
## 29 1965-66 0 0 0 1.2 12.2 7.0 48.0 15.2 11.4 3.2 0.1 0 98.3
## 30 1966-67 0 0 0 0 10.0 12.1 11.6 19.8 10.8 0.6 1.2 0 66.1
## 31 1967-68 0 0 0 T 19.7 10.4 19.1 11.7 10.6 0.1 0 0 71.6
## 32 1968-69 0 0 0 T 11.6 11.7 31.2 12.8 8.0 3.1 0 0 78.4
## 33 1969-70 0 0 0 1.0 22.1 23.4 38.0 21.9 12.6 1.5 T 0 120.5
## 34 SEASON JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN ANNUAL
## 35 1970-71 0 0 0 0 2.6 32.3 17.2 19.4 22.6 2.9 0 0 97.0
## 36 1971-72 0 0 0 0 18.7 12.9 27.6 31.4 14.1 5.2 0 0 109.9
## 37 1972-73 0 0 0 3.1 18.9 19.8 9.9 16.1 8.5 2.4 0.1 0 78.8
## 38 1973-74 0 0 0 0 3.0 23.1 19.7 22.8 12.9 7.1 0.1 0 88.7
## 39 1974-75 0 0 0 T 22.1 23.6 11.0 16.3 7.6 15.0 0 0 95.6
## 40 1975-76 0 0 0 T 5.5 27.3 21.6 8.3 17.3 2,.5 T 0 82.5
## 41 1976-77 0 0 0 0.2 31.3 60.7 68.3 22.7 13.5 2.2 0.5 0 199.4
## 42 1977-78 0 0 0 T 15.0 53.4 56.5 21.7 5.8 1.8 0.1 0 154.3
## 43 1978-79 0 0 0 T 3.0 10.1 42.6 28.3 4.6 8.7 0 0 97.3
## 44 1979-80 0 0 0 T 12.6 19.7 10.2 11.7 13.9 0.3 T 0 68.4
## 45 SEASON JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN ANNUAL
## 46 1980-81 0 0 0 T 6.7 21.6 14.4 5.0 13.2 T 0 0 60.9
## 47 1981-82 0 0 0 T 1.8 24.8 53.2 12.7 9.0 10.9 0 0 112.4
## 48 1982-83 0 0 0 0 15.8 12.9 9.0 5.5 6.9 2.3 T 0 52.4
## 49 1983-84 0 0 0 T 17.7 52.0 13.4 32.5 16.0 0.9 T 0 132.5
## 50 1984-85 0 0 0 0 1.4 11.2 65.9 20.9 6.3 1.5 0 0 107.2
## 51 1985-86 0 0 0 0 5.2 68.4 17.3 17.3 4.8 1.7 T 0 114.7
## 52 1986-87 0 0 0 0 13.7 4.8 28.5 7.7 10.8 2.0 0 0 67.5
## 53 1987-88 0 0 0 T 0.9 9.8 6.9 31.9 6.1 0.8 0 0 56.4
## 54 1988-89 0 0 0 0.5 0.6 10.8 5.4 29.6 10.1 2.5 7.9 0 67.4
## 55 1989-90 0 0 0 T 7.8 34.8 11.8 28.0 1.4 9.9 T 0 93.7
## 56 SEASON JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN ANNUAL
## 57 1990-91 0 0 0 T 0.7 15.4 16.6 16.1 8.5 0.2 T 0 57.5
## 58 1991-92 0 0 0 0.2 18.0 21.4 18.4 7.0 22.8 5.0 0 0 92.8
## 59 1992-93 0 0 0 0.6 13.7 16.5 13.1 19.5 29.3 0.5 0 0 93.2
## 60 1993-94 0 0 0 2.9 4.8 27.9 35.4 21.6 13.2 6.9 0 0 112.7
## 61 1994-95 0 0 0 0 0.9 7.8 23.1 34.6 4.3 3.9 0 0 74.6
## 62 1995-96 0 0 0 0 15.7 61.2 25.3 11.9 24.1 3.2 T 0 141.4
## 63 1996-97 0 0 0 0 11.5 18.9 42.4 9.3 13.4 2.1 0 0 97.6
## 64 1997-98 0 0 0 0.2 16.5 18.2 13.6 1.8 25.3 T 0 0 75.6
## 65 1998-99 0 0 0 0 0.2 11.6 65.1 6.9 15.8 0.9 0 0 100.5
## 66 1999-00 0 0 0 0 0.9 12.7 19.4 16.2 10.7 3.7 0 0 63.6
## 67 SEASON JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN ANNUAL
## 68 2000-01 0 0 0 T 45.6 50.3 19.6 9.8 32.8 0.6 0 0 158.7
## 69 2001-02 0 0 0 0.4 0 82.7 13.7 17.2 15.9 2.5 T 0 132.4
## 70 2002-03 0 0 0 T 8.9 35.8 37.4 19.5 6.6 3.1 0 0 111.3
## 71 2003-04 0 0 0 T 4.2 21.6 45.2 5.9 20.7 3.3 0 0 100.9
## 72 2004-05 0 0 0 T 0.2 22.8 37.0 22.3 17.5 9.3 0 0 109.1
## 73 2005-06 0 0 0 0 17.9 20.3 7.1 26.3 6.5 0.1 0 0 78.2
## 74 2006-07 0 0 0 22.6 2.1 7.5 15.5 33.5 5.4 2.3 0 0.0 88.9
## 75 2007-08 0.0 0.0 0.0 0.0 3.4 31.3 17.5 22.5 29.1 T 0.0 0.0 103.8
## 76 2008-09 0.0 0.0 0.0 0.1 6.2 49.2 30.6 11.6 0.5 2.0 0.0 0.0 100.2
## 77 2009-10 0.0 0.0 0.0 T T 25.1 35.6 13.4 T T T 0.0 74.1
## 78 SEASON JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN ANNUAL
## 79 2010-11 0.0 0.0 0.0 T 1.6 32.0 29.8 34.1 13.3 1.0 0.0 0.0 111.8
## 80 2011-12 0.0 0.0 0.0 T T 3.8 20.8 9.6 1.5 1.0 0.0 0.0 36.7
## 81 2012-13 0.0 0.0 0.0 0.0 2.4 19.6 6.2 20.5 9.8 0.3 T 0.0 58.8
## 82 2013-14 0.0 0.0 0.0 T 10.0 28.5 42.4 23.3 24.5 1.2 0.0 0.0 129.9
## 83 2014-15 0.0 0.0 0.0 T 20.5 11.3 25.0 46.2 9.6 0.3 0.0 0.0 112.9
## 84 2015-16 0.0 0.0 0.0 T T 1.0 25.6 20.5 0.9 7.1 T 0.0 55.1
## 85 2016-17 0.0 0.0 0.0 T 0.9 30.1 11.0 5.0 26.8 2.3 0.0 0.0 76.1
## 86 2017-18 0.0 0.0 0.0 0.1 0.1 33.7 31.2 18.1 22.2 6.9 0.0 0.0 112.3
## 87 2018-19 0.0 0.0 0.0 0.1 13.4 14.1 63.4 16.8 10.5 0.5 0.0 0.0 118.8
## 88 2019-20 0.0 0.0 0.0 0.0 12.6 16.1 13.2 20.6 2.1 4.3 0.3 0.0 69.2
## 89 SEASON JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN ANNUAL
## 90 2020-21 0.0 0.0 0.0 T 0.8 33.0 10.2 27.5 0.6 5.1 T 0.0 77.2
## 91 2021-22 0.0 0.0 0.0 0.0 5.0 5.7 52.1 23.1 9.3 2.2 0.0 0.0 97.4
## 92 2022-23 0.0 0.0 0.0 T 36.9 64.7 14.7 7.3 10.0 T 0.0 0.0 133.6
## 93 2023-24 0.0 0.0 0.0 0.4 1.9 5.7 45.6 6.0 11.2 0.5 0.0 71.3
buffalosnow <- buffalosnow %>% filter(SEASON != "SEASON")
Since T indicates Trace, this is effectively an unmeasureable amount of snowfall. As such, T will be replaced by 0 in the data. There is also one value listed as 2,.5 due to a data entry error (April of the 75-76 season) - this was replaced by 2.5 based on the annual snowfall for that year minus the sum of snowfall for the other months. NA values were dropped as the scraped table included future months.
buffalosnow <- buffalosnow %>%
mutate(across(everything(), ~ str_replace_all(.,"[T]","0"))) %>%
mutate(across(everything(), ~ str_replace_all(.,"\\s","0"))) %>%
drop_na()
buffalosnow$APR[36] <- 2.5
buffalosnow <- buffalosnow %>%
pivot_longer(c(JUL,AUG,SEP,OCT,NOV,DEC,JAN,FEB,MAR,APR,MAY,JUN,ANNUAL), names_to = "TimeCategory", values_to = "Snowfall") %>%
mutate(Snowfall = as.numeric(Snowfall))
buffalosnow
## # A tibble: 1,092 × 3
## SEASON TimeCategory Snowfall
## <chr> <chr> <dbl>
## 1 1940-41 JUL 0
## 2 1940-41 AUG 0
## 3 1940-41 SEP 0
## 4 1940-41 OCT 0
## 5 1940-41 NOV 17.5
## 6 1940-41 DEC 12.1
## 7 1940-41 JAN 17.3
## 8 1940-41 FEB 23.1
## 9 1940-41 MAR 9.3
## 10 1940-41 APR 0
## # ℹ 1,082 more rows
buffalosnow %>% filter(TimeCategory != "ANNUAL") %>%
group_by(TimeCategory) %>%
summarise(averageSnowfall = mean(Snowfall)) %>%
ggplot(aes(x = reorder(TimeCategory, -averageSnowfall), y = averageSnowfall)) +
geom_bar(stat = "identity") +
labs(x = "Month", y = "Average Snowfall in inches",
title = "Months by Average Snowfall in Buffalo")
This project provided a lot of practice using regular expressions. Most of the data cleaning involved regex to extract data from columns that contained more than one piece of information. I also had to remove and replace many data values, such as newline characters and extraneous symbols. The NOAA historic storm dataset in particular was difficult to clean given the extent of untidiness. There were also some parts of the data that were difficult to clean due to the lack of context provided in the dataset and documentation. One example is the conversion of MHHW to AGL, which did not seem like an easy task based on the information I found on Vertical Datums. Moreover, the inconsistently formatted data had proved to be a large obstacle in data cleaning, as many regex mutations on the data had to be performed due to these inconsistencies. In comparison, the NCAA Buffalo snowfall dataset was much more consistent. For next steps in data cleaning, I would propose to better characterize the geographic data in the historic storms data, as discussed in the section of categorical analysis in the NOAA section. I would also propose an additional step in the Buffalo Snowfall data, which would be to add the “correct” year to each row, as Season appears to be listed in a way that ignores calendar year. Each month could be attributed to the correct year, as the July through December months would correspond to the earlier year listed in the season while January through June would correspond to the later year listed in the season. This could be done by a conditional. “Annual” rows could also be dropped from the dataset and recalculated as necessary, but its existence in the dataset does not provide much value as the annual snowfall can be summed up easily if required; there is no reason it needs to be its own row.