Introduction

There was some discussion on the Poll Bludger forum as to whether the Lismore floods were indicative of a trend towards increased flooding. The chart used as a source didn’t include flood data past 2017, so it was difficult to tell how the recent flooding related to the historical data.

Obtaining data

Data for historical flooding events can be found on the Australian Severe Weather website. The format of the data is quite untidy, so I’ve had to scrape the table and clean the content.

The first step is to read the webpage and then extract the table into a dataframe using the rvest package.

library(rvest)

# `url` must hold the address of the Lismore flood history page (not shown here)
wilsons_at_lismore <- read_html(url) %>%
  html_element(xpath = "/html/body/table[4]") %>%
  html_table(header = TRUE)
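
To see what the XPath selection and html_table() do without touching the live site, here is a minimal sketch on a made-up page. The HTML, the column names and the values are all invented for illustration; only the rvest calls mirror the step above.

```r
library(rvest)

# Hypothetical page with two tables; only the second one is wanted
page <- minimal_html('
  <table><tr><td>navigation</td></tr></table>
  <table>
    <tr><th>Date</th><th>Year</th><th>Minor</th></tr>
    <tr><td>1st March</td><td>1900</td><td>5.00m</td></tr>
  </table>
')

# table[2] picks the second <table> that is a direct child of <body>
demo_table <- page %>%
  html_element(xpath = "/html/body/table[2]") %>%
  html_table(header = TRUE)

print(demo_table)
```

An absolute XPath like this is brittle: if the site adds or removes a table above the one of interest, the index has to change.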

Tidying the data

The table has flood heights spread across three columns: Minor, Moderate and Major. These are pivoted into two columns: type, containing the classification of the flooding, and height, containing the flood level. The height is extracted using a regular expression and converted from text to a numeric value. Observations without flood height information are dropped.
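
The reshaping and height extraction can be tried on a small made-up table first. This is only a sketch: the rows and the cell formats ("5.20m", "10.75 metres") are assumptions for illustration, not values from the real data.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Made-up rows in the same three-column shape as the scraped table
demo <- tibble(
  Year = c(1900, 1901),
  Minor = c("5.20m", ""),
  Moderate = c("", ""),
  Major = c("", "10.75 metres")
)

demo_long <- demo %>%
  # One row per Year/type pair instead of one column per flood class
  pivot_longer(cols = Minor:Major, names_to = "type", values_to = "height") %>%
  # Keep the first "digits.two-digits" run; cells without one become NA
  mutate(height = as.numeric(str_extract(height, "[0-9]{1,2}\\.[0-9]{2}"))) %>%
  # Drop the class columns that had no height recorded
  drop_na(height)

print(demo_long)
```

Each flood ends up as a single row carrying its classification and its height as a number.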

The day and month field can contain a start and end date for the flood. To simplify handling I’ve extracted only the day immediately preceding the month name at the end of the field, which for a range of days is the last day listed. The day-month is then combined with the year and converted to a Date format.

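library(tidyverse)
library(lubridate)

wilsons_clean <- wilsons_at_lismore %>%
  # The second column holds the year
  rename(Year = 2) %>%
  # Merge the three flood class columns into type/height pairs
  pivot_longer(
    cols = Minor:Major,
    names_to = "type",
    values_to = "height"
  ) %>%
  mutate(
    # Pull the height out of the cell text and convert it to a number
    height = as.numeric(str_extract(height, "[0-9]{1,2}\\.[0-9]{2}")),
    type = as.factor(type)
  ) %>%
  drop_na(height) %>%
  mutate(
    # Day with ordinal suffix plus month name at the end of the field
    date = dmy(paste(
      str_extract(`Early Years`, "[0-9]+?[dhnrst]{2} [a-zA-Z]*$"),
      Year
    ))
  )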
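
The date pattern is easier to follow in isolation. The sketch below uses made-up day/month strings (the real column may be formatted differently) and strips the ordinal suffix before parsing. That stripping step is a precaution of this sketch; the pipeline above passes the suffix straight to dmy(). Month names are assumed to be in English, matching the session locale.

```r
library(stringr)
library(lubridate)

# Made-up day/month strings, including one range of days
day_month <- c("3rd March", "10th - 12th June", "21st February")
year <- c(1900, 1901, 1902)

# Day with ordinal suffix plus month name, anchored at the end of the string
last_day <- str_extract(day_month, "[0-9]+?[dhnrst]{2} [a-zA-Z]*$")

# Drop the "st", "nd", "rd" or "th" suffix so only "day month" remains
last_day <- str_replace(last_day, "([0-9]+)[dhnrst]{2}", "\\1")

# Combine with the year and parse as day-month-year
dates <- dmy(paste(last_day, year))

print(dates)
```

Because the pattern is anchored at the end of the string, the range "10th - 12th June" keeps only "12th June", so a multi-day flood is dated by its last listed day.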

Plotting the flood data

wilsons_clean %>%
  ggplot(aes(date, height)) +
  geom_point(aes(color = type)) +
  scale_x_date(
    date_breaks = "20 years",
    date_minor_breaks = "5 years",
    date_labels = "%Y"
  ) +
  ylim(0, 15) +
  labs(
    title = "Lismore Flood Heights",
    subtitle = "Wilsons River at Lismore",
    x = "Year",
    y = "Flood Height (m)",
    caption = "Data: http://australiasevereweather.com"
  )