Final report

Project: Using collaborative open science tools to improve engagement with the ecology of the Guana River Estuary
Project team: Geraldine Klarenberg, Kristie Perez, Nikki Dix, Nia Morales, and Shirley Baker

The University of Florida and Guana Tolomato Matanzas National Estuarine Research Reserve (GTMNERR) are partnering with the local community and broader science community to develop a web-based, public-facing, interactive dashboard to provide access to Guana Estuary datasets. The aim of this work is to support open science and to increase diverse engagement with the Guana Estuary within the GTMNERR by making the data available interactively, using visualization tools.

To this end, the project team sought feedback from people who have been involved with the Guana Estuary, to better understand their needs. This document summarizes the results of an online survey that was made available via email, social media, and QR code. It was sent to the Technical Advisory Group and project-specific stakeholders (identified by the GTMNERR research coordinator). Responses were collected through the software Qualtrics from April 2023 to June 2024. The programming language R (version 4.4.0) was used to deidentify, clean, analyze and publish the results.

This survey was approved by the University of Florida’s Institutional Review Board (IRB), IRB#202202509.

1. Response rate

There were 56 individuals who started the survey, but five did not answer any questions. Of the remaining 51, ten did not finish the survey. This report also takes these unfinished surveys into account. Most results are shown as percentages of respondents selecting certain options or answers. Because the unfinished surveys make the number of responses vary between questions, all results show the total number of people (N) who answered that question.

In terms of origination, 47 respondents started or filled in the survey based on a link received via email, three via social media, zero via the QR code available at the GTMNERR Welcome Center, and one via the QR code available at the kiosk at the dam.


2. Introductory questions

The survey started by asking respondents about their connection to the Guana Estuary, how often they engage with the Guana Estuary, what data they would be interested in, and whether or not they had ever accessed data associated with the Guana Estuary.

For the purposes of this project, and this survey, “Guana Estuary” refers to the Guana Lake and Guana River: the area north and south of the Guana Dam, from Micklers Road to the Tolomato River / intracoastal.

2.1 Guana Estuary connection

We asked respondents about their connection with the Guana Estuary. In total, 111 connections were chosen. The figure below summarizes the responses as percentages of the total number of respondents. Note that people could select more than one option, which is why the percentages add up to more than 100%.

For example, a little over 60% of respondents do recreational activities at the Guana Estuary, and almost 50% collect data or use data for scientific purposes - and these choices are not mutually exclusive! Someone could collect data and also enjoy the Guana Estuary recreationally. Or volunteer and also use the Guana Estuary for educational purposes.
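The arithmetic behind percentages that add up to more than 100% can be sketched as follows. This is a minimal illustration in Python, not the R code used for this report, and the answers in `responses` are hypothetical. The sum of all percentages equals the total number of selections divided by the number of respondents: if all 51 respondents answered this question (an assumption), the 111 connections chosen would give a sum of about 218%.

```python
from collections import Counter

# Hypothetical multi-select answers: one set of chosen options per respondent.
# These values are illustrative only; they are not the survey data.
responses = [
    {"Recreation", "Science"},
    {"Recreation"},
    {"Science", "Volunteering", "Education"},
    {"Recreation", "Education"},
]


def selection_percentages(responses):
    """Return {option: percentage of respondents who selected that option}."""
    n_respondents = len(responses)
    counts = Counter(option for chosen in responses for option in chosen)
    return {option: 100 * count / n_respondents for option, count in counts.items()}


percentages = selection_percentages(responses)

# Each respondent can contribute to several options, so the percentages sum to
# (total selections / number of respondents) * 100 rather than to 100.
total_selections = sum(len(chosen) for chosen in responses)
expected_sum = 100 * total_selections / len(responses)

for option, pct in sorted(percentages.items(), key=lambda item: -item[1]):
    print(f"{option}: {pct:.1f}%")
print(f"Sum of percentages: {sum(percentages.values()):.1f}%")
```

The denominator is the number of respondents, not the number of selections, which is what allows one person to count towards several bars in the figure.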

Under “Other”, respondents answered:

  • Lake management - FWC
  • Health
  • GTM-NERR MAG member
  • Sometimes bring international visitors for professional conversations (am retired now).

2.2 Level of engagement

The next table indicates how often respondents say they interact(ed) with the Guana Estuary. Only one answer was possible here.


2.3 Data of interest (in general)

We asked respondents what Guana Estuary data they would be interested in, regardless of whether or not they currently have access to these data. Respondents were also asked to rank their interest in these datasets, with “1” being the data they are most interested in. They could select as many or few as they wanted.

The figure below shows the percentage of respondents that selected a particular dataset as being of interest to them. For example, over 80% of respondents selected water quality data. The colors inside the bar indicate how they ranked it: for instance, almost 40% of all respondents ranked water quality data as their number “1” dataset of interest.

Under “Other”, the four types of datasets mentioned were: historical maps, water fowl, dam operations, natural resource management practices/techniques/results.


2.4 Previous experience with accessing data associated with Guana Estuary

The survey asked respondents whether they had accessed data before, and by “data”, we clarified that we meant “information, especially facts or numbers, collected to be examined and considered and used to help decision-making; or information in an electronic form that can be stored and used by a computer, for instance spreadsheets, databases, graphs, and maps.”

Based on their answer here, respondents were directed to different survey sections. Part 3 of this document reports on questions for those who answered “No”, Parts 4 and 5 on questions for those who answered “Yes” to this question.

Take home messages

  • The majority of respondents use the Guana Estuary recreationally (62.7%) and almost half (49%) collect or use its data for scientific purposes. In third and fourth place are other work-related and educational connections (33.3% and 29.4%, respectively).
  • Engagement with the Guana Estuary is high: mostly once a month (21.6% of respondents), with a tie (19.6%) for once a week and once every 6 months. Daily engagement was indicated by 17.6% of respondents.
  • The top four datasets of interest, chosen by more than 60% of the respondents, are: water quality information, information on shellfish, fish and other aquatic organisms, water level information, and information on vegetation.
  • The majority of respondents (83.7%) have accessed data before.

Depending on whether they had accessed data before, respondents answered different sets of questions. The results are summarized in the next two sections.


3. Feedback from respondents that had not accessed data before

3.1 Datasets of interest

For respondents that had not (yet) accessed data (N = 8), the figure below summarizes their answers from section 2.3 (datasets of interest). In this figure, the datasets are ordered according to their average ranking (the number on the right of the bars), once again “1” being the dataset of most interest.

One person responded they were not interested in any data (“None”), hence this item ranks first, as the average of 1 is 1. The rest of the responses paint an interesting picture: for instance, water quality data were selected by most respondents, but in terms of average ranking they come in third place among the datasets of interest. The three datasets that have average rankings between “2” and “3” are water level information, reserve or trail closures, and water quality information. Information on vegetation, and information on fish, shellfish and other aquatic organisms, were also selected by more than 60% (five respondents), but were ranked lower on average.
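The difference between how often a dataset is selected and how highly it is ranked can be illustrated with a small sketch. The Python code below is not the analysis code behind this report, and the rankings in it are hypothetical. It assumes that the average ranking of a dataset is computed only over the respondents who selected it, which is why an option chosen by a single person with rank 1 ends up first.

```python
from statistics import mean

# Hypothetical rankings: for each respondent, the datasets they selected and the
# rank they gave each one (1 = most interested). Not the survey data.
rankings = [
    {"None": 1},
    {"Water level": 1, "Closures": 2, "Water quality": 3},
    {"Water level": 1, "Water quality": 2},
    {"Closures": 1, "Water level": 2, "Water quality": 3},
]


def summarize(rankings):
    """Return per-dataset selection percentage and average rank, best rank first."""
    n_respondents = len(rankings)
    ranks_by_dataset = {}
    for respondent in rankings:
        for dataset, rank in respondent.items():
            ranks_by_dataset.setdefault(dataset, []).append(rank)
    summary = {
        dataset: {
            # Share of all respondents who selected this dataset.
            "percent_selected": 100 * len(ranks) / n_respondents,
            # Average over the respondents who selected it (assumption).
            "average_rank": mean(ranks),
        }
        for dataset, ranks in ranks_by_dataset.items()
    }
    return dict(sorted(summary.items(), key=lambda item: item[1]["average_rank"]))


summary = summarize(rankings)
for dataset, stats in summary.items():
    print(f"{dataset}: selected by {stats['percent_selected']:.0f}%, "
          f"average rank {stats['average_rank']:.2f}")
```

In this made-up example, “Water quality” is selected by three of the four respondents yet has the worst average rank, while “None” is selected once and ranks first. This mirrors the pattern described above.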

3.2 Frequency and usage of data, if accessed in the future

The survey asked these respondents broad questions on how often they would access these data, and what they would use them for.

The figure below shows that 25% (two respondents) were not interested in accessing data, and that about half of the respondents would access data either once a month or once a year (25% each).


In terms of what people would use data for, the most common answers were (non-research / non-educational) work-related purposes and decision making, as per the figure below. Here too, respondents could select more than one answer, so the sum of all percentages is more than 100%.

Under “Other”, respondents listed environmental impacts and resilience planning.

Take home messages

  • For the (eight) respondents that had not accessed data before, their main interest would be data on water level information, reserve or trail closures, and water quality information.
  • If they could access data, 25% of respondents would access data once a month and 25% once a year.
  • 25% of respondents (two people) are not interested in accessing data.
  • The most common uses of data would be (non-research / non-educational) work-related purposes and decision making: each of those options was selected by three people.

4. Feedback from respondents that had accessed data before

4.1 Datasets of interest

For respondents that had accessed data before (N = 40), the figure below summarizes their answers from section 2.3 (datasets of interest - regardless of whether respondents can or have accessed these data). In this figure, the datasets are ordered according to their average ranking, once again 1 being the dataset of most interest.

The top three data of interest are water quality information, information on shellfish, fish and other aquatic organisms, and water level information: both in terms of average ranking and the percentage of respondents selecting these. Interestingly, more than half the respondents chose weather information and vegetation information as data of interest, but their rankings are relatively low.

Under “Other”, the four types of datasets mentioned were: historical maps, water fowl, dam operations, natural resource management practices/techniques/results.


4.2 Data used or accessed

In the section for respondents that had accessed data before, the survey asked which datasets they had accessed, and a number of detailed questions about their experiences related to how they accessed these data, the advantages and disadvantages of this access, the frequency of access, the usage of the data, and respondents’ satisfaction with these data (for their needs).

There was an option “Other”, to which there was one response: LiDAR data.


When comparing the datasets respondents would like to access (regardless of whether they can or have - section 2.3) with the datasets they had actually accessed, there is a clear discrepancy (see the figure below).

While not explicitly mapped 1:1 for each respondent, from the figure it appears that for almost all data (aside from water quality information, and “Other”) 22.5 - 30% of respondents had not accessed or are not accessing data they would like to access.


4.3 Details on dataset access and satisfaction

The following table summarizes the detailed questions per dataset. The numbers represent the percentage of respondents that chose that answer. Questions that allowed multiple answers are indicated in the table with an asterisk; for these questions the percentages can add up to more than 100%.

The column colors correspond to the colors in the table above. The darker the color, the higher the percentage in the cell.

Question / answer | Water quality information (including nutrients and algae) | Information on fish, shellfish or other aquatic organisms | Water level information (tides, Guana lake, river) | Information on vegetation (salt marsh or uplands) | Weather information | Reserve or trail closures | Information on terrestrial animals | Other: LiDAR

How do you most frequently obtain or access these data?
Request from a GTMNERR staff member by email | 54.8 | 55 | 40 | 42.9 | 0.0 | 22.2 | 28.6 | 0
Download from website (If so, what website?) | 25.8 | 35 | 30 | 35.7 | 75.0 | 55.6 | 57.1 | 100
Other (please specify) | 19.4 | 10 | 30 | 21.4 | 25.0 | 11.1 | 14.3 | 0
Pick-up paper copy in person | 0.0 | 0 | 0 | 0.0 | 0.0 | 11.1 | 0.0 | 0

How often do/did you access or obtain these data?
Daily | 27.6 | 35 | 30 | 7.7 | 16.7 | 57.1 | 28.6 | 0
At least once a week | 24.1 | 5 | 0 | 23.1 | 8.3 | 14.3 | 0.0 | 0
2-3 times a month | 17.2 | 15 | 20 | 15.4 | 0.0 | 0.0 | 0.0 | 100
Once a month | 17.2 | 20 | 20 | 23.1 | 50.0 | 14.3 | 42.9 | 0
Once every 6 months | 10.3 | 20 | 5 | 23.1 | 8.3 | 0.0 | 14.3 | 0
Once every year | 3.4 | 0 | 15 | 0.0 | 8.3 | 0.0 | 0.0 | 0
Less than once a year | 0.0 | 5 | 10 | 7.7 | 8.3 | 14.3 | 14.3 | 0

What do you typically use these data for?*
Research | 58.6 | 65 | 55 | 46.2 | 50.0 | 28.6 | 28.6 | 100
Monitoring | 27.6 | 25 | 15 | 23.1 | 8.3 | 14.3 | 42.9 | 0
Work-related purposes (not research or education) | 27.6 | 25 | 30 | 30.8 | 50.0 | 14.3 | 28.6 | 0
Educational purposes | 24.1 | 40 | 25 | 38.5 | 33.3 | 42.9 | 57.1 | 0
Decision making (for recreational/educational/scientific visits) | 20.7 | 15 | 30 | 23.1 | 33.3 | 42.9 | 42.9 | 0
Other (please specify) | 6.9 | 0 | 10 | 15.4 | 16.7 | 14.3 | 14.3 | 0

How well do these data generally satisfy your need(s)?
Slightly well | 41.4 | 45 | 55 | 46.2 | 25.0 | 28.6 | 42.9 | 0
Moderately well | 41.4 | 35 | 25 | 53.8 | 41.7 | 57.1 | 14.3 | 100
Very well | 13.8 | 15 | 15 | 0.0 | 25.0 | 0.0 | 28.6 | 0
Extremely well | 3.4 | 5 | 5 | 0.0 | 8.3 | 14.3 | 14.3 | 0

In terms of usage of data, respondents added the following under “Other”:

Water quality information (including nutrients and algae):
  • Personal interest
  • Vegetation management on lake

Water level information (tides, Guana lake, river):
  • Guana Dam management
  • I would access it more if I knew how to get to the data

Information on vegetation (salt marsh or uplands):
  • Monitoring of invasive plant species sites for re-ocurrence / growth
  • Personal use, plus is helpful during some of the volunteer programs

Weather information:
  • Recreation
  • Prescribed fire weather forecasts

Reserve or trail closures:
  • Leisure

Information on terrestrial animals:
  • Personal interest; useful in some of my volunteer activities

In summary, the three usages mentioned here most are personal use, volunteer activities, and management purposes (vegetation / invasive plants, prescribed fire).


Take home messages

  • The majority of people that had accessed data before did so for water quality information: 85%.
  • Water level information and information on fish, shellfish or other aquatic organisms were accessed by 55% of the respondents, and 45% had accessed information on vegetation. All the other datasets were accessed by fewer than 40% of respondents.
  • Comparing this to the information on the data people would like to access (regardless of whether they had in the past or can do so), the number of people that would like to access data on fish, shellfish, or other aquatic organisms, water levels, weather, vegetation, and terrestrial animals is larger than the number who had actually accessed those data. Only for water quality data is the discrepancy between the number of people wanting to access those data and the number who had small.

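  • For the four most accessed types of datasets (water quality information; information on fish, shellfish or other aquatic organisms; water level information; information on vegetation), the most common way respondents obtain these data is by requesting them from GTMNERR staff (by email): 40 to 55% of respondents, depending on the dataset.
  • About 70 to 85% of people access these data at least once a month, with most accessing them more frequently than that.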
  • The majority of respondents use these data for research purposes, but a considerable number of people also use them for other purposes such as education or other work-related purposes (not research/education).
  • For all these four data types, 80 to 100% of respondents indicate that they satisfy their needs slightly to moderately well.

  • In contrast, the other four types of information (weather, reserve or trail closures, terrestrial animals and LiDAR) are mostly accessed through websites.
  • In terms of frequency of access, most people either access these data once a month (weather, and terrestrial animals) or daily (reserve and trail closures, and terrestrial animals).
  • The usage of these data is fairly evenly distributed across the options that were provided, as is how well the data satisfied people’s needs.

5. Advantages and disadvantages - from respondents that had accessed data before

We asked respondents about the advantages and disadvantages of the current manner of accessing datasets. They could choose more than one answer (without ranking), and we offered the opportunity to provide free text answers for the options “Download from website” and “Other”. For the next few figures we added those specific levels of information to the visualizations (“Other: unspecified” is the category where respondents chose “Other” but did not provide more written detail).

For both the advantages and disadvantages, results are shown from two perspectives:

  • First, from the perspective of the data access method (for example, download from website, request via email, etc.), where we show the percentages of (dis)advantages for each. Note that not all data access methods were selected by the same number of respondents, so some have more nuanced (dis)advantages listed if there were more respondents.
  • Second, from the perspective of the (dis)advantages, where we have grouped these together, and show which dataset types they were mentioned with.

5.1 Details on advantages of dataset access

The figure below summarizes the advantages that respondents listed per data access method.

The abbreviations refer to the following:

  • “CDMO” is the National Estuarine Research Reserve (NERR) System Centralized Data Management Office: https://cdmo.baruch.sc.edu/;
  • “SWMP” is the NERR System-Wide Monitoring Program: https://coast.noaa.gov/digitalcoast/data/nerr.html;
  • “SEACAR” is the Statewide Ecosystem Assessment of Coastal and Aquatic Resources by the Florida Department of Environmental Protection: https://data.florida-seacar.org/;
  • “FL DEP” is the Florida Department of Environmental Protection;
  • “NOAA” refers to the National Oceanic and Atmospheric Administration, and “COPEPOD” is their global plankton database: https://www.st.nmfs.noaa.gov/copepod/;
  • “UF IFAS” is the University of Florida Institute of Food and Agricultural Sciences; and
  • “FWC” is the Florida Fish and Wildlife Conservation Commission.

The next figure, also known as a treemap, shows the information for the same questions but summarized differently: based on the advantages. The size of the different colored rectangles for each advantage indicates the percentage of respondents that selected this advantage compared to the total advantage answers (outer box). Clicking on each rectangle then shows the data access methods for which this advantage was selected; also by percentage, but now the selected advantage represents 100%. Example: if 32 out of 204 advantage answers were “Requesting the data is quick”, this box (colored red) represents about 16% of the total area. Clicking on this, the detail shows that out of the 32 times this answer was chosen, almost half (15, or 47%) were chosen in relation to the method “Request from a GTMNERR staff member by email”.

A hover label shows the percentages as text, as well as the advantages or access methods if the text is too long to be readily readable.
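The two levels of percentages in the treemap can be reproduced with a few lines of arithmetic. The Python sketch below is not the code that generated the figure. It uses the numbers from the example above (32 of 204 answers, of which 15 via email) and lumps all remaining answers into placeholder categories, which are assumptions made only so that the totals match.

```python
# Counts of advantage answers per advantage and data access method.
# Only 204, 32 and 15 come from the example in the text; the split of the
# remaining answers into the two placeholder entries is hypothetical.
counts = {
    "Requesting the data is quick": {
        "Request from a GTMNERR staff member by email": 15,
        "All other access methods (placeholder)": 17,
    },
    "All other advantages (placeholder)": {
        "All access methods (placeholder)": 172,
    },
}


def treemap_shares(counts):
    """Return (outer, inner) percentage shares for a two-level treemap.

    outer: share of each advantage in all advantage answers (the outer box).
    inner: share of each access method within one advantage (that box = 100%).
    """
    grand_total = sum(sum(methods.values()) for methods in counts.values())
    outer = {
        advantage: 100 * sum(methods.values()) / grand_total
        for advantage, methods in counts.items()
    }
    inner = {
        advantage: {
            method: 100 * count / sum(methods.values())
            for method, count in methods.items()
        }
        for advantage, methods in counts.items()
    }
    return outer, inner


outer, inner = treemap_shares(counts)
quick = "Requesting the data is quick"
email = "Request from a GTMNERR staff member by email"
print(f"{quick}: {outer[quick]:.0f}% of all advantage answers")
print(f"  of which via email: {inner[quick][email]:.0f}%")
```

The outer shares always sum to 100% across advantages, and the inner shares sum to 100% within each advantage, which is why the percentages change after clicking on a rectangle.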


5.2 Details on disadvantages of dataset access

The following two figures summarize the data on disadvantages in the same way as the advantages were summarized and visualized, starting with the disadvantages per data access method.

Similar to section 5.1, the next figure, the treemap, shows the information for the same questions but summarized differently: based on the disadvantages. The size of the different colored rectangles for each disadvantage indicates the percentage of respondents that selected this disadvantage compared to the total disadvantage answers (outer box). Clicking on each rectangle then shows the data access methods for which this disadvantage was selected; also by percentage, but now the selected disadvantage represents 100%.

A hover label shows the percentages as text, as well as the disadvantages or access methods if the text is too long to be readily readable.


Take home messages

  • In terms of advantages, there were only two access methods for which respondents said there were no advantages: picking up paper copies, and downloading data from the NOAA website.
  • For many methods, access convenience and the usefulness of the data format were highlighted as advantages.
  • The speed with which data are received is listed as an advantage for some methods, but less often than the above two advantages. This could be because data are not received quickly, but it could also be that this advantage is not important to respondents.

  • For the disadvantages the results are more heterogeneous, as many respondents filled in their own disadvantages under the “Other:” option.
  • However, a number of methods received only “There are no disadvantages” responses: downloading data from iNaturalist, from SEACAR, and requesting data from FWC. Various other download methods also received a reasonable number of responses for this answer.
  • A few notable disadvantages are: the difficulty and non-user-friendliness of getting data from CDMO / SWMP (or SEACAR), the slow receipt of data from the NOAA website(s), the complications of picking up paper copies, and the lack of up-to-date data and the various formats when downloading data from UF IFAS and / or FL DEP websites.

Linking these results to section 4, some additional information:

  • The majority of respondents find the way they access data easy and convenient for the four most accessed datasets (water quality information; water level information; information on fish, shellfish or other aquatic organisms; information on vegetation). In addition, 35-55% find the format in which they get these data useful. While 40-45% of respondents say there are no disadvantages in accessing these datasets, these results show that GTMNERR staff probably spend considerable time providing these data (and do an excellent job!).

  • Also for the remaining datasets (weather; reserve or trail closures; terrestrial animals; LiDAR) respondents largely feel data access is easy / convenient, and the format is useful. Respondents feel that there are not many disadvantages, though they do highlight difficulty in accessing weather data (25%) and the time it takes to request information on terrestrial animals (42.9%).


6. Dashboard preferences

The survey asked respondents about their preferences regarding dashboard features (type and format of information, data delivery mode) and how they would access the dashboard.

By “dashboard” we meant a user interface on a computer display that presents (up-to-date) information with visualization tools such as graphs, charts, and tables - in a dynamic and interactive way.

6.1 How to access data


6.2 What type of information

The response in the category “Other” is projections.


6.3 What form of information

When asked about the form of information and the format of data delivery, respondents were also asked to rank their choices. They did not have to rank all options: only those they were interested in.


6.4 Format of data delivery

Take home messages

  • The overwhelming majority of respondents prefer to access a data dashboard on their computer. Slightly less than 50% would (also) want to access it on a cellphone.
  • The type of information respondents would like to access is firstly recent data (from the past year), chosen by almost 95% of respondents. In a shared second place (almost 90% each) came historical data and current data (from the past month). Data representing current conditions was also chosen by just over 70% of the respondents.
  • In terms of providing aggregated data through the dashboard, equal numbers of respondents are interested in data aggregated by location and by time period. On average, aggregation by location is ranked slightly higher (1.7 vs 1.8 for time periods; again, lower values indicate higher preference/higher ranking).
  • A relatively large share of respondents is also interested in non-aggregated (“raw”) data: almost 75%.
  • All data delivery formats suggested in the survey were popular with the respondents. While data downloads did not get the most votes in absolute numbers, almost half of those who selected this format ranked it their number “1” format: it has the best (lowest) average ranking. Maps and figures were about equally popular, both in absolute numbers as well as rankings. And while online tables came in “last” place, still more than 75% of people would like to see data in this format.

7. Characteristics of respondents

Finally, the survey requested demographic information from respondents. This helps the project team get a better understanding of the dashboard’s target audience.

7.1 Age and gender


7.2 Distance from the Guana Estuary

Take home messages

  • Slightly more respondents were female (55%) than male (45%).
  • The largest single age group was 65 or over (32.5%), but the two age groups 35-44 and 45-54 combined made up 35% of the respondents. We assume this reflects a balance between respondents who are retired and those who are mid- to senior-level working professionals.
  • The most common answer was a 31-60 minute driving distance from the Guana Estuary, which (together with the 11-30 minute driving distance option) probably corresponds to the nearest towns (Ponte Vedra, St Augustine). Those two categories made up 67.5% of the responses, meaning that the majority of the respondents have strong local connections to the area. Another 30% of respondents live further away (more than 60 minutes driving distance); these are most likely (visiting) scientists.

Next steps

The project team is considering dashboard designs based on this report, and previous workshops. Work is underway on a draft (prototype) dashboard, and the project team will be in touch soon about further steps on this, and to inform you of upcoming participation and discussion opportunities.

To access the code that created this document, the survey result data, or jpg versions of the figures, go to https://github.com/GTMNERR-Science-Transfer/Survey-results.

Suggestions and comments on this draft report are very welcome; please email Dr. Geraldine Klarenberg at , or leave an “Issue” on the above linked GitHub repository.