Data is currently preliminary and likely to undergo additional QAQC steps.

Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof.

1. Summary

Two data sets are included: (1) sensor and calculated data and (2) modeled stream metabolism estimates.

Sampling was conducted at sites in East Fork Poplar Creek by members of Oak Ridge National Laboratory’s WaDE Science Focus Area. Data from Theme 3 (9 sites) and Theme 2 (2 sites) are provided.

Location

East Fork Poplar Creek in Oak Ridge, TN, USA, Roane and Anderson Counties. 12-digit hydrologic unit code: 060102070302.

Site.Name.Normalized Lat Long Altitude.Estimated.m
EFK5.4 35.96648 -84.3581 230
EFK7.9 35.96937 -84.3464 234
GumHollow 35.97553 -84.3356 237
EFK10.0 35.97628 -84.3355 237
EFK11.3 35.98249 -84.3274 240
EFK13.8 35.99250 -84.3153 245
MillBranch 35.99629 -84.3032 249
EFK16.2 35.99888 -84.3001 250
EFK18.2 36.00642 -84.2810 254
EFK20.0 36.00951 -84.2685 256
Downtown 36.00974 -84.2677 256

Establishment and History

The Theme 3 sites were established in 2023 by Oak Ridge National Laboratory as part of the WaDE Science Focus Area. Theme 2 sites were established prior to 2023; however, data from these sites are provided only for the period that overlaps with when Theme 3 sites were active.

Acknowledgements

This work was funded by the U.S. Department of Energy, Office of Science, Biological and Environmental Research, Environmental Systems Science (ESS) Program, supporting the Watershed Dynamics and Evolution Science Focus Area (WaDE SFA).

Data Citation

Forthcoming. Please contact Marie Kurz prior to using any of this data in a publication.

2. Sensor and Calculated Data

Variables measured by sensors at Theme 3 sites were collected with two co-deployed underwater sensors: a miniDOT (water temperature and dissolved oxygen) and a CTD (water temperature, water pressure, and specific conductance). Air temperature and air pressure are obtained from a nearby NOAA ASOS weather station.

Theme 2 site data are collected by sondes (water temperature, dissolved oxygen, specific conductance, and depth). Additional metadata is available elsewhere for these sites.

A visualization of Theme 3 data (hourly means, to speed processing) is available here: preliminarywadedata.shinyapps.io/Shiny_AllData. Contact Jonny Behrens for a password.

Parameter Dictionary

Each variable is reported in a comma-separated file (.csv). Missing values are reported as NA.

  • Air temperature (Air.Temp.C): OakRidgeASOS; temperature; air; degrees Celsius
  • Dissolved oxygen (DO_mgL): miniDOT, Sonde; dissolved oxygen; water; unfiltered; milligrams per liter
  • Dissolved oxygen (DO_sat): miniDOT, Sonde; dissolved oxygen; water; unfiltered; percent of saturation
  • pH (pH): Sonde; pH; water; standard units
  • Stormflow flag (Predicted_Stormflow_Conditions): Calculated; NA
  • Water pressure (Pressure_Water_cmH2O): CTD; pressure above sensor; water; cmH2O
  • Q error term (Q_errorterm): miniDOT; error term associated with the optical DO sensor; unitless
  • Air pressure (Site_Specific_AirPressure_cmH2O): Calculated; pressure estimated at site; air; cmH2O
  • Specific conductance (SpC_uScm): CTD, Sonde; specific conductance; water; unfiltered; field; microsiemens per centimeter at 25 degrees Celsius
  • Stage height (Stage_Height): Calculated; depth below water surface, relative to sensor deployment; water; uncorrected; centimeters
  • Water temperature (Temp_C_CTD): CTD; temperature; water; degrees Celsius
  • Water temperature (Temp_C_miniDOT): miniDOT; temperature; water; degrees Celsius
  • Water temperature (Temp_C_Sonde): Sonde; temperature; water; degrees Celsius
  • Turbidity (Turb.NTU): Sonde; turbidity; water; nephelometric turbidity units
  • Water level (WaterLevel): Calculated; depth below water surface, relative to sensor deployment, normalized to a reference point; water; centimeters

Manual Data Flags

Flag Flag description
REMOVE Data is not real and replaced with “NA”. A known issue precludes its use (e.g., sensor out of water or not deployed at correct site)
Bad Data Data is unreliable and should not be used in analyses. Issue with data is strongly inferred (e.g., sensor malfunction, cleaning, or calibration)
Questionable The quality of the data cannot be verified. Use with caution.
Interesting Data has unexpected values but is likely quality data.
No Flag No quality concerns have been identified.

Dataset Specific Characteristics

Theme 3 Data

Temporal Coverage

This dataset covers water years 2024 and 2025. Each water year begins on 1 October and ends on 30 September. Data collection is ongoing for water year 2026, and data are provided as they become available. Data are recorded at 15-minute intervals.
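
As a minimal sketch, a timestamp's date can be assigned to a water year as shown below. It assumes the common convention that a water year is named for the calendar year in which it ends (so water year 2024 runs from 1 October 2023 to 30 September 2024); the function name `water_year` is hypothetical and not part of the dataset.

```python
from datetime import date


def water_year(d: date) -> int:
    """Return the water year for a date.

    Assumption: a water year is labeled by the calendar year in which it
    ends, so October through December belong to the following year's label.
    """
    return d.year + 1 if d.month >= 10 else d.year


print(water_year(date(2023, 10, 1)))  # first day of water year 2024
print(water_year(date(2024, 9, 30)))  # last day of water year 2024
```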

Equipment

A PME miniDOT logger measures water temperature and dissolved oxygen. An error term (Q) is also reported by the instrument.

A Van Essen CTD-Diver measures water temperature, water depth, and specific conductance. Both sensors are installed in cinder blocks that are fully submerged on the streambed at each site.

miniDOT Q Error Term

The Q term reported by the miniDOT provides a quality check for the optical sensor used to measure dissolved oxygen.

“Q is the ratio of the [DO] as determined from emission intensity to the [DO] as determined by emission lifetime. [DO] should be the same no matter which measurement technique is used. So [DOlifetime] = [DOintensity] and therefore [DOlifetime]/[DOintensity] = Q is ideally 1.0. In practice miniDOTs report Q to very close to 1.0.”

Source: PME: q-measurement-found-in-minidot.

Sensor Sensitivity

The sensitivity reported by the manufacturer is provided below.

Parameter Sensor Accuracy Precision
DO_mgL miniDOT ± 5% of measure or ± 0.3 mg/L, whichever is larger 0.001 mg/L
DO_sat miniDOT ± 5% of measure or ± 0.3 mg/L, whichever is larger 0.001 mg/L
Pressure_Water_cmH2O CTD ± 0.5 cmH2O 0.2 cmH2O
SpC_uScm CTD ± 1% of measure 0.1% of measure
Temp_C_CTD CTD ± 0.1 degrees C 0.01 degrees C
Temp_C_miniDOT miniDOT ± 0.1 degrees C ± 0.1 degrees C

QAQC Protocols

Raw sensor data underwent automated and manual QAQC. When possible, data is preserved and marked with flags so that users can decide what to include in their analyses.

We automatically flag data as “questionable” if any of the following criteria are met.

  • Water depth
    • Depth is less than 1 cm
  • Specific conductance
    • SpC < 10 uS/cm
    • SpC > 100,000 uS/cm (note: in our watershed salting is infrequent, and this threshold was manually checked to ensure it remained below SpC measured on days when road salting was known to happen)
  • Dissolved oxygen
    • DO (mg/L) > 20 mg/L (also marked, as a comment, if greater than 40 mg/L)
    • DO (percent sat) > 200%
    • miniDOT Q error term varied within the day (max - min) by more than 0.01
  • Water temperature
    • Temperature on CTD or miniDOT greater than 40 degrees C (in this watershed a value above 40 degrees C is not observed)
    • Temperature on CTD or miniDOT less than -1 degrees C (in this watershed it’s unlikely for the stream to fully freeze at the face of the sensors)
  • Outlier for day
    • Value differs from the median value for that day by more than 10% (note: this threshold is still to be double-checked)
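
The automated criteria above can be sketched as follows. This is an illustration only, not the project's QAQC code: the function names and the `depth_cm` key are hypothetical, the other keys follow the Parameter Dictionary, and the outlier rule is interpreted here as a difference of more than 10% of the day's median.

```python
from statistics import median

# (parameter key, test for a questionable value, reason)
THRESHOLDS = [
    ("depth_cm", lambda v: v < 1, "depth < 1 cm"),  # hypothetical key
    ("SpC_uScm", lambda v: v < 10, "SpC < 10 uS/cm"),
    ("SpC_uScm", lambda v: v > 100_000, "SpC > 100,000 uS/cm"),
    ("DO_mgL", lambda v: v > 20, "DO > 20 mg/L"),
    ("DO_sat", lambda v: v > 200, "DO > 200% saturation"),
    ("Temp_C_CTD", lambda v: v > 40 or v < -1, "CTD temperature outside -1 to 40 C"),
    ("Temp_C_miniDOT", lambda v: v > 40 or v < -1, "miniDOT temperature outside -1 to 40 C"),
]


def point_reasons(record):
    """Reasons one 15-minute record would be flagged; missing (None) values are skipped."""
    reasons = []
    for key, is_questionable, reason in THRESHOLDS:
        value = record.get(key)
        if value is not None and is_questionable(value):
            reasons.append(reason)
    return reasons


def q_range_exceeded(q_values, limit=0.01):
    """True if the Q error term's daily range (max - min) is greater than the limit."""
    present = [v for v in q_values if v is not None]
    return bool(present) and (max(present) - min(present)) > limit


def daily_outliers(values, fraction=0.10):
    """Indices of one day's values that differ from the day's median by more than `fraction` of it."""
    present = [v for v in values if v is not None]
    if not present:
        return []
    med = median(present)
    return [i for i, v in enumerate(values)
            if v is not None and abs(v - med) > fraction * abs(med)]


record = {"depth_cm": 0.5, "SpC_uScm": 350, "DO_mgL": 9.1,
          "DO_sat": 98, "Temp_C_CTD": 41.2, "Temp_C_miniDOT": 18.0}
print(point_reasons(record))
print(q_range_exceeded([1.00, 1.005, 1.02]))
print(daily_outliers([10.0, 10.5, 9.8, 12.0]))
```

Keeping the thresholds in one table makes it easier to audit them against the list above if a criterion changes.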

During the period of sensor maintenance, time points are removed (~ 1 hour) and the gap is linearly interpolated.

We also manually flag data based on field observations, for example when the sensor block is known or expected to be out of water, when biofouling of the miniDOT is expected, and during maintenance. When possible, we flag only the impacted sensor (e.g., if biofouling occurred, only DO data is flagged).

Theme 2 Data

Detailed information is available alongside previously published datasets. Details forthcoming.

Calculated Measures

There are four calculated measures: site-specific air pressure, dissolved oxygen in percent saturation, water depth at the location of sensor deployment, and stage height.

Site-Specific Air Pressure

Air pressure and air temperature data were downloaded from a nearby NOAA weather station in Oak Ridge, WBAN:53868. Data is sourced from the NOAA Local Climatological Data, Version 2. The station is located at lat: 36.023 and long: -84.2337.

The air pressure at the ASOS weather station differs slightly (~2-5 cmH2O) from the air pressure at the sites because of their different altitudes. This matters because using the uncorrected station pressure could introduce ~10-20% error in our estimate of water depth at some sites.

Thus, the air pressure at each site, approximately above the water surface, is estimated by correcting the air pressure measured by the ASOS weather station in Oak Ridge, TN for each site’s altitude. Altitude for each site was estimated based on the lowest elevation reported for the watershed draining to the site, using the NHDPlus V2 NEDSnapshot DEM that was accessed through the Stroud Center’s Model My Watershed tool.

  1. Sub-hourly NOAA data is converted to a 1-minute time series and then filtered to the 15-minute timestep to be consistent with the timestep of the CTDs.

  2. Gaps in station pressure, when station altimeter is reported, are filled per NOAA methods.

  3. The air pressure for each site is calculated by correcting the ASOS Oak Ridge NOAA station data to the altitude of the site of interest, per Equations 1 and 2. These equations are a simplified calculation that is used by streamMetabolizer, with site specific parameters. They are adapted from the second equation provided on this page.

    • Equation 1: \(Altitude_{delta} = Altitude_{Site} - Altitude_{ASOS}\)
    • Equation 2: \(AirPressure_{site} = AirPressure_{ASOS} \times e^{\frac{-9.80655 \times 0.0289644 \times Altitude_{delta}}{8.31447 \times (AirTemp_{ASOS})}}\)
      • Air Pressure is in cmH2O
      • Air Temperature is in Kelvin
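
Equations 1 and 2 can be sketched in a few lines, using the constants exactly as printed in Equation 2. The function name, the example pressure and temperature, and the 270 m station altitude are illustrative placeholders (the station's altitude is not given here); the air temperature is assumed to arrive in degrees Celsius, as in Air.Temp.C, and is converted to Kelvin.

```python
import math

G = 9.80655    # gravitational acceleration as printed in Equation 2, m/s^2
M = 0.0289644  # molar mass of dry air, kg/mol
R = 8.31447    # universal gas constant, J/(mol K)


def site_air_pressure(pressure_asos_cmh2o, air_temp_asos_c,
                      altitude_site_m, altitude_asos_m):
    """Correct station air pressure (cmH2O) to a site's altitude."""
    altitude_delta = altitude_site_m - altitude_asos_m  # Equation 1
    air_temp_k = air_temp_asos_c + 273.15               # Celsius to Kelvin
    # Equation 2: exponential pressure change with altitude difference
    return pressure_asos_cmh2o * math.exp(
        -G * M * altitude_delta / (R * air_temp_k)
    )


# Illustrative inputs: EFK5.4 (230 m) and a placeholder station altitude of 270 m
p_site = site_air_pressure(1000.0, 15.0, 230.0, 270.0)
print(round(p_site - 1000.0, 2), "cmH2O higher at the lower-altitude site")
```

With these placeholder inputs the correction is a few cmH2O, the same order of magnitude as the ~2-5 cmH2O difference described above.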

A few assumptions were made for these analyses:

  • Altitude of stream surface is assumed to be similar across time at a given site, but different between the sites.
  • Shifts in air pressure at the ASOS station are representative of changes across the watershed, on a 15-minute timestep.

Dissolved Oxygen in Percent Saturation

  1. For each timestep, at each site, we estimate the theoretical concentration of dissolved oxygen (mg/L) if the water were 100% saturated. This is the equilibrium concentration of DO at a given water temperature, site-specific air pressure, and altitude, assuming complete mixing of the water with the atmosphere.

    • The function LakeMetabolizer::o2.at.sat.base() is used to calculate this value independently at each timestep at each site
    • Site specific altitude (constant throughout timeseries), water temperature from the miniDOT (changes each timestep), and the site-specific air pressure (changes each timestep) are the input data to this function
  2. We divide the actual measured DO (mg/L) by the equilibrium concentration and multiply by 100 (to get a percent). This gives the dissolved oxygen in percent saturation.

    • Conceptually, if the actual measured DO is the same as the calculated equilibrium concentration, then the value would be 100%.
    • If it is super-saturated (observed concentration is greater than the equilibrium concentration) then the value is >100%.
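
Step 2 can be sketched as below. The equilibrium concentration is taken as an input (in the workflow above it comes from LakeMetabolizer::o2.at.sat.base()); the function name and the example values are hypothetical.

```python
def do_percent_saturation(do_observed_mgl, do_equilibrium_mgl):
    """Percent saturation = measured DO / equilibrium DO * 100.

    Returns None when either input is missing or the equilibrium
    concentration is not positive.
    """
    if do_observed_mgl is None or do_equilibrium_mgl is None:
        return None
    if do_equilibrium_mgl <= 0:
        return None
    return do_observed_mgl / do_equilibrium_mgl * 100.0


print(do_percent_saturation(9.0, 9.0))   # measured equals equilibrium
print(do_percent_saturation(11.0, 10.0) > 100)  # super-saturated case
```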

Water Depth and Stage Height

Water depth was calculated by subtracting the calculated site-specific air pressure from the water pressure measured by the CTD sensor.
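
A minimal sketch of this subtraction follows, treating 1 cmH2O of pressure as approximately 1 cm of water depth. The function name and the example pressures are hypothetical; the second calculation illustrates why the site-specific air pressure matters in shallow water.

```python
def water_depth_cm(pressure_water_cmh2o, air_pressure_site_cmh2o):
    """Depth of water above the sensor: CTD pressure minus site air pressure."""
    if pressure_water_cmh2o is None or air_pressure_site_cmh2o is None:
        return None
    return pressure_water_cmh2o - air_pressure_site_cmh2o


# Illustrative values only (cmH2O)
ctd_pressure = 1035.0        # total pressure recorded by the CTD
air_site = 1005.0            # air pressure corrected to the site
air_station = 1000.0         # uncorrected weather-station air pressure

depth = water_depth_cm(ctd_pressure, air_site)
depth_uncorrected = water_depth_cm(ctd_pressure, air_station)
relative_error = (depth_uncorrected - depth) / depth
print(depth, depth_uncorrected, round(relative_error * 100, 1))
```

In this made-up example a 5 cmH2O air pressure difference shifts a 30 cm depth by roughly 17%, which falls within the ~10-20% range noted for uncorrected air pressure.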

A stage height was approximated from the raw water depth. The sensor location moved at sites throughout the timeseries due to sensor cleaning, sensor re-positioning, or disturbance during disruptive flow events. Therefore, an approximate stage height is calculated, relative to where the sensor was first deployed in the streambed. The stage height is manually estimated based on known and identified periods when the sensor block shifted.

3. Modeled Stream Metabolism Estimates

Data Dictionary

Each variable is reported in a comma-separated file (.csv).

  • date: Date to which the fitted values apply, for the period from 4am on that date to 3:59am on the following date.
  • x_daily_mean: Mean of the post-warmup MCMC distribution of the x parameter (GPP, ER, or K600) for this date.
  • x_daily_se_mean: Standard error of the mean of the post-warmup MCMC distribution of the x parameter for this date.
  • x_daily_sd: Standard deviation of the post-warmup MCMC distribution of the x parameter for this date.
  • x_daily_2.5pct: The 2.5th percentile of the post-warmup MCMC distribution of the x parameter for this date.
  • x_daily_25pct: The 25th percentile of the post-warmup MCMC distribution of the x parameter for this date.
  • x_daily_50pct: The 50th percentile of the post-warmup MCMC distribution of the x parameter for this date.
  • x_daily_75pct: The 75th percentile of the post-warmup MCMC distribution of the x parameter for this date.
  • x_daily_97.5pct: The 97.5th percentile of the post-warmup MCMC distribution of the x parameter for this date.
  • x_daily_n_eff: Estimated effective sample size of the MCMC sampling for the x parameter for this date.
  • x_daily_Rhat: Gelman-Rubin convergence statistic (R-hat statistic) of the MCMC sampling for the x parameter for this date. Values near or below 1.05 indicate convergence of the MCMC chains.
  • valid_day: TRUE if the input data for this date were considered valid and included in the model, FALSE otherwise
  • warnings: Date-specific warnings about input data
  • errors: Date-specific problems with input data that prevented model fitting on that date and drove the setting of valid_day to FALSE
  • discharge_exceed_1sd: Indicates that, for a given day at a site, the range in discharge (max - min) is greater than the standard deviation of discharge across all time points in that site's time series
  • Flag_InputData: Data flags, from QAQC, associated with any data used as an input to the streamMetabolizer model
  • Flag_OutputData: Data flags, from QAQC, associated with outputs from the streamMetabolizer model
  • Flag_Comment: Comments associated with flags

Input Data Note

Data inputs into streamMetabolizer include 2 parameters discussed in the “sensor and calculated data” section: dissolved oxygen (percent saturation) and water temperature. Light and depth are calculated with the streamPULSE::prep_metabolism R function, by using site-specific lat/long and discharge, respectively.

We estimated discharge at each site by …

QAQC Protocols

Modeled estimates of daily metabolism underwent automated and manual QAQC. When possible, data is preserved and marked with flags so that users can decide what to include in their analyses.

Input QAQC

Data used for streamMetabolizer was processed per the previously outlined protocols. All time steps marked with “bad” or “remove” flags were excluded from analysis and replaced with NA. Missing data was linearly interpolated for gaps up to 3 hours in length using the prep_metabolism function in the streamPULSE R package. The same package was used to estimate PAR (above the canopy) for each site and areal depth. NOTE: the PAR estimated for streamMetabolizer should not be used for other analyses, as it is not an estimate of light reaching the stream surface. Contact Jonny Behrens for modeled estimates of PAR at the stream surface (with the R package streamlight).
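
The gap-filling rule can be illustrated with a small stand-alone sketch. It is not the streamPULSE implementation: the function name is hypothetical, missing values are represented as None, and a 3-hour gap is assumed to mean at most 12 consecutive missing 15-minute values with an observation on each side.

```python
def fill_short_gaps(values, max_gap_steps=12):
    """Linearly interpolate runs of None no longer than max_gap_steps.

    Runs at the start or end of the series, and runs longer than the
    limit, are left as None.
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i  # first missing index in this run
        while i < n and filled[i] is None:
            i += 1
        end = i  # first index after the run
        bounded = start > 0 and end < n
        if bounded and (end - start) <= max_gap_steps:
            left, right = filled[start - 1], filled[end]
            span = end - (start - 1)
            for k in range(start, end):
                filled[k] = left + (right - left) * (k - (start - 1)) / span
    return filled


print(fill_short_gaps([1.0, None, None, 4.0]))
```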

We chose to include data flagged as “questionable” or “interesting” to allow the metabolism model to run, even with imperfect data. As described further below in the output QAQC, results for days with this flagged data should be interpreted with caution. For example, if there is clear biofouling (DO will be flagged), metabolism estimates may not be trustworthy.

For detailed information on how the streamMetabolizer Bayesian model was set up, contact Jonny Behrens. In brief, K600 priors were binned into 6 bins, and 1000 burn-in steps and 1000 saved steps were used.

Output QAQC

Daily estimates were automatically flagged based on the following criteria. We group these into two categories: Reject and Questionable. We encourage users to exclude days marked with “reject” from all analyses. Days marked with “questionable” could be used in some analyses, depending on tolerance of error.

Days are marked as “rejected” for one of the following reasons:

  • GPP or ER is far outside the biologically realistic range (GPP < -0.5 or ER > 0.5)
  • Rhat for ER, GPP, or K600 > 1.2
  • Discharge changed dramatically throughout the day (difference between a day’s max and min discharge is greater than 1 standard deviation of the discharge for the entire time series for the given site). This violates the model’s assumption that K600 is relatively constant for a given day

Days are marked as “questionable” for one of the following reasons:

  • GPP or ER is outside the biologically realistic range, but close to 0 (GPP between -0.5 and 0 or ER between 0 and 0.5)
  • Rhat for ER, GPP, or K600 > 1.05 but less than 1.2
  • Any input data to the model for that day (e.g., discharge and DO) had a QAQC flag
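
The reject and questionable criteria can be combined into a single classification, sketched below. The function name, argument names, and the "ok" label are hypothetical; how values exactly at a threshold (e.g., Rhat of 1.2) are treated is an assumption of this sketch, and the discharge check is taken from the discharge_exceed_1sd column rather than recomputed.

```python
def classify_day(gpp, er, rhats, discharge_exceed_1sd, input_flagged):
    """Return "reject", "questionable", or "ok" for one modeled day.

    gpp, er: daily estimates; rhats: Rhat values for GPP, ER, and K600;
    discharge_exceed_1sd, input_flagged: booleans for that day.
    Returns None when the day was not modeled (estimates are missing).
    """
    if gpp is None or er is None:
        return None

    # Reject criteria take precedence over questionable criteria
    if gpp < -0.5 or er > 0.5:
        return "reject"
    if any(r > 1.2 for r in rhats):
        return "reject"
    if discharge_exceed_1sd:
        return "reject"

    if gpp < 0 or er > 0:
        return "questionable"
    if any(r > 1.05 for r in rhats):
        return "questionable"
    if input_flagged:
        return "questionable"

    return "ok"


print(classify_day(3.2, -5.1, [1.01, 1.02, 1.00], False, False))
print(classify_day(-0.2, -5.1, [1.01, 1.02, 1.00], False, False))
print(classify_day(3.2, -5.1, [1.30, 1.02, 1.00], False, False))
```

Checking the reject criteria first means a day that meets both a reject and a questionable criterion is reported as rejected.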

When the model does not have sufficient input data to model a day, it will not attempt to model GPP, ER, K600, and DO. For these days, “NA” is provided instead of any numeric value.

4. Planned Updates to Dataset

Plans as of 2026-02-02:

  • All stream metabolism estimates will be re-run for Theme 3 sites with slightly updated stream discharge estimates. This is because stage height methods were updated (they now use the Oak Ridge ASOS air pressure data), which in turn will slightly impact the scaled sub-daily estimates of discharge that are used by streamMetabolizer. The re-run will give estimates for all days in water year 2024 through mid-December 2025 (some were missing in the last iteration). We might alter some of the input parameters to improve model performance.
  • We will rerun metabolism for EFK 5.4 and 16.2 when data becomes available from Theme 2 for water year 2025. Before doing so, we will also do a quick QAQC check consistent with Theme 3 methods.
  • When sensors are next downloaded (likely in Feb or March 2026), we will add that data to the time series.
  • We will likely add the Bear Creek site, when data becomes available from Theme 2 for water years 2024 and 2025.
  • Current data folder for sharing (internal to ORNL): Cotton-Strips/2026-02-09_Combined.