Australian Frogs

1 A Tidy Tuesday exploration of Australian Frogs

The primary purpose of this exercise is to showcase my R and Quarto skills. Tasks demonstrated here include:

  • cleaning data
  • plotting with ggplot
  • mapping with leaflet (R and JavaScript)
  • using JavaScript and basic HTML

1.1 The sixth annual release of FrogID data was in 2023.

FrogID is a citizen science project that helps scientists better understand how different frog species are surviving in a changing environment.

Questions put forward by the Tidy Tuesday team are:

  • Are there species that are endemic to certain regions?

  • Do different frog species have distinct calling seasons?

  • Which species has the widest geographic range? Which is the rarest?

Primary citation for FrogID data: Rowley JJL, & Callaghan CT (2020) The FrogID dataset: expert-validated occurrence records of Australia’s frogs collected by citizen scientists. ZooKeys 912: 139-151

Official frog name data: Australian Society of Herpetologists Official List of Australian Species. 2025. http://www.australiansocietyofherpetologists.org/ash-official-list-of-australian-species.

Firstly, we load some libraries.

Code
list.of.packages <- c("tidyverse", "ggplot2", "leaflet", "knitr", "kableExtra", "quarto", "htmlwidgets", "htmltools", "mapview", "webshot", "kimisc", "timeplyr")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
lapply(list.of.packages, library, character.only = TRUE)

Now, let’s load the raw data and have a look at it.

Code
if (!exists("frogID_data")) {
  frogID_data <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2025/2025-09-02/frogID_data.csv')
}
if (!exists("frog_names")) {
  frog_names <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2025/2025-09-02/frog_names.csv')
}
cat("Frog names\n")
glimpse(frog_names)
cat("\n\nfrogID raw data\n")
glimpse(frogID_data)

Frog names
Rows: 294
Columns: 5
$ subfamily             <chr> "Hylid", "Hylid", "Hylid", "Hylid", "Hylid", "Hy…
$ tribe                 <chr> "Pelodryadidae", "Pelodryadidae", "Pelodryadidae…
$ scientificName        <chr> "Cyclorana", "Cyclorana alboguttata", "Cyclorana…
$ commonName            <chr> "—", "Striped Burrowing Frog", "Northern Snappin…
$ secondary_commonNames <chr> "—", "Green-striped Frog", "Giant Frog", "Short-…


frogID raw data
Rows: 136,621
Columns: 11
$ occurrenceID                  <dbl> 12832, 12833, 12834, 12835, 12836, 12837…
$ eventID                       <dbl> 525618, 526341, 526673, 526673, 526673, …
$ decimalLatitude               <dbl> -28.5, -33.7, -28.7, -28.7, -28.7, -30.4…
$ decimalLongitude              <dbl> 153.1, 151.2, 152.7, 152.7, 152.7, 152.8…
$ scientificName                <chr> "Philoria loveridgei", "Heleioporus aust…
$ eventDate                     <date> 2023-01-01, 2023-01-02, 2023-01-02, 202…
$ eventTime                     <time> 11:18:32, 20:39:30, 21:30:07, 21:30:07,…
$ timezone                      <chr> "GMT+1100", "GMT+1100", "GMT+1100", "GMT…
$ coordinateUncertaintyInMeters <dbl> 10000, 10000, 10000, 10000, 10000, 10000…
$ recordedBy                    <dbl> 41480, 834983, 804177, 804177, 804177, 1…
$ stateProvince                 <chr> "New South Wales", "New South Wales", "N…

We can see that scientificName joins the two tables. Between them, the tables contain latitude/longitude, date/time (with time zones), state, tribe and subfamily. The field classes appear to have been detected correctly, so the data probably needs little or no cleaning, although some commonName and secondary_commonNames entries are missing.

In this case local time matters more than universal time, so no time zone conversion is needed.
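Because eventTime is already local clock time, the hour of day can be read straight from it. The following is a minimal sketch in base R, using three time values copied from the glimpse above. The helper name to_seconds is my own and is not part of any package.

```r
# Three local clock times taken from the eventTime column shown in the glimpse above
event_time <- c("11:18:32", "20:39:30", "21:30:07")

# Hypothetical helper: convert "HH:MM:SS" strings to seconds since local midnight
to_seconds <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

secs <- to_seconds(event_time)

# Whole hours since midnight, with no time zone arithmetic involved
local_hour <- secs %/% 3600
print(local_hour)
```

As far as I know, readr stores a time column as seconds since midnight, so calling as.numeric() on eventTime should give numbers of the same kind as secs.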

Other relevant questions the data might answer are:

  • Can a link be found to photos of each species?

  • Can a database listing species discoveries be found? Who identified the most? When have the most discoveries happened?

  • Where is the most biodiversity in frogs?

  • Are there any biome maps that can be compared to species distribution? Are any species generalists?

  • Do the species distributions more closely match biomes or watersheds?


Some data cleaning is needed.

129 records with “Other Territories” in stateProvince were found to lie around Jervis Bay and were relabelled New South Wales.
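A check of this kind could be sketched as follows. The coordinates are invented for illustration and are not FrogID records, and the bounding box around Jervis Bay (roughly 35.1°S, 150.7°E) is my own rough assumption.

```r
library(dplyr)

# Made-up records for illustration only (NOT real FrogID data)
toy <- tibble(
  stateProvince    = c("Other Territories", "Other Territories", "New South Wales"),
  decimalLatitude  = c(-35.1, -35.2, -33.7),
  decimalLongitude = c(150.7, 150.7, 151.2)
)

# Assumed rough bounding box around Jervis Bay (hypothetical helper)
near_jervis_bay <- function(lat, lng) {
  lat > -35.3 & lat < -35.0 & lng > 150.5 & lng < 150.9
}

# Do all "Other Territories" records fall inside the box?
other <- toy %>% filter(stateProvince == "Other Territories")
all_near <- all(near_jervis_bay(other$decimalLatitude, other$decimalLongitude))

# Relabel only if every such record falls inside the box
if (all_near) {
  toy <- toy %>%
    mutate(stateProvince = if_else(stateProvince == "Other Territories",
                                   "New South Wales", stateProvince))
}
print(count(toy, stateProvince))
```

Checking the coordinates before relabelling guards against moving records that belong to some other territory.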

Code
# Species on the official list with no observations, and observed species missing from the list
list_of_species_with_no_id_records <- setdiff(frog_names$scientificName, frogID_data$scientificName)
list_of_species_ided_without_a_name_match <- setdiff(frogID_data$scientificName, frog_names$scientificName)
cat("Some notes about cleaning the data\n")
cat(length(list_of_species_with_no_id_records),"/",length(unique(frog_names$scientificName)), " species from the list of frog names do not appear in the observation data set.\n")
cat("A look over these species shows some are extinct (Rheobatrachus silus), some are newly discovered (Philoria knowlesi), some are considered rare and may not have been found during the survey time.\n")

observations <- frogID_data %>% filter(scientificName %in% list_of_species_ided_without_a_name_match)
cat(length(list_of_species_ided_without_a_name_match), " species in the observation data do not appear in the frog_names data set, accounting for ", nrow(observations), "/", nrow(frogID_data), " observations.\n")

# Share of the unmatched observations contributed by each species
quicksum <- observations %>% count(scientificName, name = "count") %>% mutate(pc = count / sum(count) * 100)

cat("For our purposes these will be matched by common names.\n")

# Map the names used in the observation data to the names on the official list
name_fixes <- c(
  "Limnodynastes dumerilii" = "Limnodynastes dumerilii dumerilii",
  "Litoria verreauxii" = "Litoria verreauxii verreauxii",
  "Cyclorana platycephala" = "Cyclorana platycephalus",
  "Lechriodus fletcheri" = "Platyplectrum fletcheri",
  "Philoria sphagnicola" = "Philoria sphagnicolus",
  "Heleioporus australiacus" = "Heleioporus australiacus australiacus"
)
frogID_data <- frogID_data %>%
  mutate(
    scientificName = coalesce(unname(name_fixes[scientificName]), scientificName),
    # "Other Territories" records lie around Jervis Bay, so relabel them
    stateProvince = ifelse(stateProvince == "Other Territories", "New South Wales", stateProvince)
  )

# Drop rows with missing values and the Uperoleia mjobergii row whose secondary common name is only a placeholder
frog_names <- frog_names %>% na.omit() %>% filter(scientificName != "Uperoleia mjobergii" | secondary_commonNames != "—")

list_of_species_ided_without_a_name_match <- setdiff(frogID_data$scientificName, frog_names$scientificName)

# Join names to observations; na.omit() drops listed species that have no observations
df <- left_join(frog_names, frogID_data, by = c("scientificName")) %>% na.omit()
#df <- df %>% mutate(across(where(is.character), as.factor))

Some notes about cleaning the data
113 / 293  species from the list of frog names do not appear in the observation data set.
A look over these species shows some are extinct (Rheobatrachus silus), some are newly discovered (Philoria knowlesi), some are considered rare and may not have been found during the survey time.
6  species in the observation data do not appear in the frog_names data set, accounting for  9158 / 136621  observations.
For our purposes these will be matched by common names.

1.2 Observation times

Observations are not spread evenly through the day. Two factors affect this: the hours when people make observations and the hours when frogs are active. Latitude also seems to have an effect; low latitudes might lead to low levels of activity in both frogs and people. The large spikes in observations at 10am and 8pm may say more about when people actively look for frogs than about frog activity levels. A dip in observations at around 6pm may correspond with the time many people have dinner.

Code
lat_bin <- cut_number(df$decimalLatitude, n = 10)
time_bin <- time_cut(as.numeric(df$eventTime), n = 288)

time_of_day_plot <- ggplot(mapping = aes(x = time_bin, colour = lat_bin, fill = lat_bin)) + geom_bar(position = "stack")
time_of_day_plot <- time_of_day_plot + labs(title = "Histogram of frog observations by time of day", subtitle = "Colour scale shows latitude of the observation", fill = "Latitude\nSouth is Red\nNorth is Magenta", alt = "This is a chart showing increases in observations around 10am and 8pm. There is a small dip at 6:30pm.") + xlab("Time of day") + ylab("Number of observations")
time_of_day_plot <- time_of_day_plot + scale_x_time(labels = function(x) format(as_datetime(x, tz = "UTC"), "%H:%M:%S")) + guides(colour = "none")
time_of_day_plot

Code
time_of_day_plot <- ggplot(mapping = aes(x = time_bin, colour = lat_bin, fill = lat_bin)) + geom_bar(position = "fill")
time_of_day_plot <- time_of_day_plot + labs(title = "Proportion of frog observations at given latitudes, by time of day", subtitle = "Colour scale shows latitude of the observation", fill = "Latitude\nSouth is Red\nNorth is Magenta") + xlab("Time of day") + ylab("Proportion of observations")
time_of_day_plot <- time_of_day_plot + scale_x_time(labels = function(x) format(as_datetime(x, tz = "UTC"), "%H:%M:%S")) + guides(colour = "none")
time_of_day_plot

Code
day_of_year_plot <- df %>% ggplot(mapping = aes(x = eventDate)) + geom_histogram(bins = 100)

Let’s compare scientificName with stateProvince for a coarse examination of whether species are highly endemic.

Code
# One row per species/state pair, then count the states for each species
species_count_states <- df %>% distinct(scientificName, stateProvince, tribe, subfamily) %>% count(scientificName, tribe, subfamily, name = "Number_of_states")

p <- species_count_states %>% ggplot(mapping = aes(x = Number_of_states, fill = subfamily)) + geom_bar()
p <- p + labs(title = "Number of States in which each species was observed") + xlab("Number of States") + ylab("Number of species")
p

Code
most_distrib <- species_count_states %>% filter(Number_of_states == 6)
most_distrib <- frog_names %>% filter(scientificName %in% most_distrib$scientificName)


kable(most_distrib, caption = "Species observed in 6 states")
Species observed in 6 states

| subfamily | tribe | scientificName | commonName | secondary_commonNames |
|---|---|---|---|---|
| Myobatrachid | Limnodynastidae | Limnodynastes dumerilii dumerilii | Eastern Banjo Frog | Grey-bellied Pobblebonk |
| Myobatrachid | Limnodynastidae | Limnodynastes peronii | Striped Marsh Frog | |
| Myobatrachid | Limnodynastidae | Limnodynastes tasmaniensis | Spotted Marsh Frog | Spotted Grass Frog |
| Myobatrachid | Myobatrachidae | Crinia signifera | Common Eastern Froglet | Clicking Froglet |

1.3 Are frogs endemic?

We can see that the majority of species were observed in only one or two states, and that a sizable subfamily (Microhylidae) was observed only in Queensland.

Even a coarse examination of the data suggests that frog species are generally endemic to particular regions.
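State counts are a blunt measure, because states differ greatly in size. One alternative is to compare the latitude and longitude span of each species' observations. The sketch below uses invented coordinates and hypothetical species labels, so it only illustrates the idea.

```r
library(dplyr)

# Invented observations for two hypothetical species (NOT FrogID data)
toy <- tibble(
  scientificName   = c("Species A", "Species A", "Species A", "Species B", "Species B"),
  decimalLatitude  = c(-16.9, -17.2, -16.5, -37.8, -27.5),
  decimalLongitude = c(145.7, 145.9, 145.4, 145.0, 153.0)
)

# Span of each species' observations in degrees, plus the number of records
extent <- toy %>%
  group_by(scientificName) %>%
  summarise(
    n_obs    = n(),
    lat_span = max(decimalLatitude) - min(decimalLatitude),
    lng_span = max(decimalLongitude) - min(decimalLongitude),
    .groups  = "drop"
  ) %>%
  arrange(desc(lat_span))  # widest latitude span first

print(extent)
```

A degree of longitude covers less ground further from the equator, so spans in degrees give only a rough comparison. The record count in n_obs also gives a first hint at which species are rarely observed.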

Let’s look closer at their distributions.

We need to do a little prep first. To make a responsive map of observation locations, the page fetches the data for whichever species the user chooses. We will build this leaflet map in JavaScript for extra flexibility.

Code
species_list <- df %>% select(commonName, scientificName) %>% unique() %>% arrange(commonName) %>% mutate(r = row_number()) %>% select(r, commonName, scientificName) %>% rename(cn = commonName, sn = scientificName)
json_list <- jsonlite::toJSON(species_list)

prep_split <- df %>% select(scientificName, eventDate, decimalLatitude, decimalLongitude) %>% mutate(m = month(eventDate)) %>% rename(sn = scientificName, lat = decimalLatitude, lng = decimalLongitude)
prep_split <- left_join(species_list, prep_split, by = join_by(sn)) %>% select(r, m, lat, lng)
splitup <- split(prep_split, prep_split$r, drop = TRUE)
# One CSV per species, named by the species' row number in the dropdown list
dir.create("databyfrog", showWarnings = FALSE)
writefrogs <- function(f) {
  fn <- file.path("databyfrog", paste0("frog", f$r[1], ".csv"))
  write.csv(f, fn, row.names = FALSE)  # write.csv always writes the header row
}
invisible(lapply(splitup, writefrogs))





first_map <- div(
  tags$select(id = "myDropdown"),
  tags$script(paste0("var ddlist = ", json_list, ";")),
  div(
    class = "containerleg",
    div(id = "mapgt"),
    div(
      id = "legend",
      p("Map Legend"),
      div(class = "legend-title")
    )
  )
)
first_map

Map Legend

Code
//const Papa = require('papaparse');
const redToGreenColors = [
  {c:'#ff0000', m: "January"},// Pure red
  {c:'#ff6600', m: "February"}, // Orange
  {c:'#ffcc00', m: "March"}, // Yellow
  {c:'#ccff00', m: "April"}, // Yellow-green
  {c:'#66ff00', m: "May"}, // Light green
  {c:'#00ff00', m: "June"},  // Pure green
  {c:'#00ff00', m: "July"},  // Pure green
  {c:'#66ff00', m: "August"}, // Light green
  {c:'#ccff00', m: "September"}, // Yellow-green
  {c:'#ffcc00', m: "October"}, // Yellow
  {c:'#ff6600', m: "November"}, // Orange
  {c:'#ff0000', m: "December"} // Pure red
];
// Centre the initial view on Australia
const map = L.map('mapgt').setView([-25.8, 133.2], 4);
var CartoDB_Positron = L.tileLayer('https://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}{r}.png', {
    attribution: '&copy; <a href="https://www.openstreetmap.org/copyright">OpenStreetMap</a> contributors &copy; <a href="https://carto.com/attributions">CARTO</a>',
    subdomains: 'abcd',
    maxZoom: 20
}).addTo(map);

// Feature group holding the markers for the currently selected species
var markergroup = new L.FeatureGroup();
markergroup.addTo(map);

const dropdown = document.getElementById("myDropdown");
    ddlist.forEach(item => {
      const option = document.createElement("option");
      option.value = item.r; // Set the value attribute of the option
      option.textContent = item.cn + " - " + item.sn; // Set the visible text of the option
      dropdown.appendChild(option);
    });
dropdown.addEventListener('change', newchoice)
async function newchoice(){
  var fn = "databyfrog/frog" + dropdown.value + ".csv"
  fetch(fn)
  .then(response => {
    // Check if the request was successful
    if (!response.ok) {
      throw new Error(`HTTP error! status: ${response.status}`);
    }
    // Return the response as text
    return response.text();
  })
  .then(csvContent => {
    // skipEmptyLines drops the blank row left by the file's trailing newline
    const parsed = Papa.parse(csvContent, { header: true, skipEmptyLines: true });
    markergroup.clearLayers();
    parsed.data.forEach(ele => {
      // m is a 1-based month number written by R; the colour array is 0-based
      const colour = redToGreenColors[ele.m - 1].c;
      L.circle(ele, { color: colour, fillColor: colour, radius: 30 }).addTo(markergroup);
    });
    // Reset to the whole-of-Australia view, then fly to the selected species
    map.flyTo([-25.8, 133.2], 4, { duration: 0.2 });
    const bounds = markergroup.getBounds();
    map.flyToBounds(bounds, {
      padding: [20, 20], // Add padding around bounds
      duration: 2,       // Animation duration in seconds
      easeLinearity: 0.1 // Animation easing
    });
  })
  .catch(error => {
    console.error("Error fetching or processing CSV:", error);
  });
}

dropdown.value = 57;
newchoice();

// Fill the given container with one colour swatch and label per month
function createLegend(data, containerId) {
  const container = document.getElementById(containerId);
  container.innerHTML = data.map(item =>
    `<div class="legend-item"><div class="color-box" style="background-color: ${item.c}"></div><span>${item.m}</span></div>`
  ).join('');
}

createLegend(redToGreenColors, 'legend');

The interactive map above is useful for exploring the data and answering some of the questions raised by the Tidy Tuesday team.

1.3.1 Are species endemic to certain regions?

Some species are endemic to a region and some are not. Below are some examples.

Code
month_colour <- data.frame(
  c = c('#ff0000', '#ff6600', '#ffcc00', '#ccff00', '#66ff00', '#00ff00',
        '#00ff00', '#66ff00', '#ccff00', '#ffcc00', '#ff6600', '#ff0000'),
  m = c("January", "February", "March", "April", "May", "June",
        "July", "August", "September", "October", "November", "December"),
  stringsAsFactors = FALSE
)
wltf <- df %>% filter(scientificName == "Litoria infrafrenata") %>% mutate(m = month(eventDate)) %>% rename(lat = decimalLatitude, lng = decimalLongitude)

satf <- df %>% filter(scientificName == "Litoria calliscelis") %>% mutate(m = month(eventDate)) %>% rename(lat = decimalLatitude, lng = decimalLongitude)

smf <- df %>% filter(scientificName == "Limnodynastes peronii") %>% mutate(m = month(eventDate)) %>% rename(lat = decimalLatitude, lng = decimalLongitude)

# Build a leaflet map for one species, zoomed to the extent of its observations
# and optionally padded by `pad` degrees on every side.
species_map <- function(data, pad = 0) {
  leaflet(data) %>%
    fitBounds(
      min(data$lng) - pad, min(data$lat) - pad,
      max(data$lng) + pad, max(data$lat) + pad
    ) %>%
    addProviderTiles(providers$CartoDB.Positron) %>%  # Positron tiles
    addCircleMarkers(
      lng = ~lng,
      lat = ~lat,
      radius = 3,                          # fixed marker radius in pixels
      color = month_colour$c[data$m],      # colour each point by month of observation
      fillColor = month_colour$c[data$m],
      fillOpacity = 1,
      stroke = TRUE,
      weight = 2
    )
}

leaflet_map4 <- species_map(wltf)
leaflet_map5 <- species_map(satf, pad = 1)  # pad the bounds by one degree
leaflet_map6 <- species_map(smf)
  
maps4 <- mapview::mapshot(
    leaflet_map4,
    file = "map4.png",
    remove_controls = c("zoomControl", "layersControl", "homeButton"),
    vwidth = 250,
    vheight = 250
  )
maps5 <- mapview::mapshot(
    leaflet_map5,
    file = "map5.png",
    remove_controls = c("zoomControl", "layersControl", "homeButton"),
    vwidth = 250,
    vheight = 250
  )
maps6 <- mapview::mapshot(
    leaflet_map6,
    file = "map6.png",
    remove_controls = c("zoomControl", "layersControl", "homeButton"),
    vwidth = 250,
    vheight = 250
  )

 
 side_by_side_maps <- div(
  style = "display: flex; flex-wrap: wrap;",
  div(
    style = "flex: 1; min-width: 250px; padding: 10px;",
    h3("White Lipped Treefrog", style = "text-align: center; margin: 10px 0;"),
   img(src="map4.png")
  ),
  div(
    style = "flex: 1; min-width: 250px; padding: 10px;",
    h3("South Australian Treefrog", style = "text-align: center; margin: 10px 0;"),
   img(src="map5.png")
  ),
    div(
    style = "flex: 1; min-width: 250px; padding: 10px;",
    h3("Striped Marsh Frog", style = "text-align: center; margin: 10px 0;"), br(),
   img(src="map6.png")
  ),  div(style = "padding: 10px;", class = "legend-title", p("Legend"),
            p(id = "legend2", style = "min-width: 250px; display: flex; flex-wrap: wrap; justify-content: flex-start; gap: 5px;",
                tags$script("createLegend(redToGreenColors, 'legend2');")
          )
  )

)
 
side_by_side_maps

[Figure: three static maps titled White Lipped Treefrog, South Australian Treefrog and Striped Marsh Frog, followed by a month colour legend]

1.3.2 Do different frog species have distinct calling seasons?

Below are three species that are found in similar areas, but the timing of their observations makes it clear that they call at different times of the year.

Code
moan <- df %>% filter(scientificName == "Heleioporus eyrei") %>% mutate(m = month(eventDate)) %>% rename(lat = decimalLatitude, lng = decimalLongitude)

motorbike <- df %>% filter(scientificName == "Litoria moorei") %>% mutate(m = month(eventDate)) %>% rename(lat = decimalLatitude, lng = decimalLongitude)

quack <- df %>% filter(scientificName == "Crinia georgiana") %>% mutate(m = month(eventDate)) %>% rename(lat = decimalLatitude, lng = decimalLongitude)

# Build a leaflet map for one species using the same fixed view (centred on Perth)
# so that the three maps can be compared directly.
fixed_view_map <- function(data) {
  leaflet(data) %>%
    setView(lat = -31.9514, lng = 115.8617, zoom = 5) %>%
    addProviderTiles(providers$CartoDB.Positron) %>%  # Positron tiles
    addCircleMarkers(
      lng = ~lng,
      lat = ~lat,
      radius = 3,                          # fixed marker radius in pixels
      color = month_colour$c[data$m],      # colour each point by month of observation
      fillColor = month_colour$c[data$m],
      fillOpacity = 1,
      stroke = TRUE,
      weight = 2
    )
}

leaflet_map1 <- fixed_view_map(moan)
leaflet_map2 <- fixed_view_map(motorbike)
leaflet_map3 <- fixed_view_map(quack)
  
maps1 <- mapview::mapshot(
    leaflet_map1,
    file = "map1.png",
    remove_controls = c("zoomControl", "layersControl", "homeButton"),
    vwidth = 250,
    vheight = 250
  )
maps2 <- mapview::mapshot(
    leaflet_map2,
    file = "map2.png",
    remove_controls = c("zoomControl", "layersControl", "homeButton"),
    vwidth = 250,
    vheight = 250
  )
maps3 <- mapview::mapshot(
    leaflet_map3,
    file = "map3.png",
    remove_controls = c("zoomControl", "layersControl", "homeButton"),
    vwidth = 250,
    vheight = 250
  )

 
 side_by_side_maps <- div(
  style = "display: flex; flex-wrap: wrap;",
  div(
    style = "flex: 1; min-width: 250px; padding: 10px;",
    h3("Moaning Frog", style = "text-align: center; margin: 10px 0;"),
   img(src="map1.png")
  ),
  div(
    style = "flex: 1; min-width: 250px; padding: 10px;",
    h3("Motorbike Frog", style = "text-align: center; margin: 10px 0;"),
   img(src="map2.png")
  ),
    div(
    style = "flex: 1; min-width: 250px; padding: 10px;",
    h3("Quacking Frog", style = "text-align: center; margin: 10px 0;"),
   img(src="map3.png")
  ),  div(style = "padding: 10px;", class = "legend-title", p("Legend"),
            p(id = "legend3", style = "min-width: 250px; display: flex; flex-wrap: wrap; justify-content: flex-start; gap: 5px;",
                tags$script("createLegend(redToGreenColors, 'legend3');")
          )
  )

)
 
side_by_side_maps

[Figure: three static maps titled Moaning Frog, Motorbike Frog and Quacking Frog, followed by a month colour legend]
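
The seasonal differences visible in the maps can also be tabulated. Here is a minimal sketch, using invented dates and hypothetical species labels rather than the FrogID records.

```r
library(dplyr)

# Invented observation dates for two hypothetical species (NOT FrogID data)
toy <- tibble(
  scientificName = c("Species A", "Species A", "Species A", "Species B", "Species B"),
  eventDate = as.Date(c("2023-04-10", "2023-05-02", "2023-05-20",
                        "2023-10-15", "2023-11-01"))
)

# Number of observations per species and month (month number 1-12 from the date)
monthly <- toy %>%
  mutate(m = as.integer(format(eventDate, "%m"))) %>%
  count(scientificName, m, name = "n_obs")

# The busiest month for each species (ties resolved arbitrarily)
peak <- monthly %>%
  group_by(scientificName) %>%
  slice_max(n_obs, n = 1, with_ties = FALSE) %>%
  ungroup()

print(monthly)
print(peak)
```

As with the time-of-day plots, monthly counts reflect when people record as well as when frogs call, so a peak is suggestive rather than conclusive.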