| locale | state | country | lat | lon | elev | n |
|---|---|---|---|---|---|---|
| Alberta | AB | CAN | 50.85 | -110 | 710 | 2 |
| Bouse | AZ | USA | 33.84 | -114 | 360 | 8 |
| Chimineas Ranch | CA | USA | 35.17 | -119.9 | 744 | 23 |
| Eatonton | GA | USA | 33.23 | -83.52 | 202 | 3 |
| Indio Mountains Research Station | TX | USA | 30.75 | -105 | 1250 | 22 |
| Kamloops-Thompson | BC | CAN | 50.75 | -120.5 | 697 | 8 |
| Lordsburg | NM | USA | 32.24 | -108.9 | 1274 | 15 |
| McIntyre Landing | MI | USA | 44.65 | -84.86 | NA | 8 |
| Montana de Oro | CA | USA | 35.26 | -120.9 | 24 | 15 |
| Norden | NE | USA | 42.8 | -100.1 | 714 | 1 |
| Owl Head Buttes | AZ | USA | 32.6 | -111.1 | 840 | 5 |
| Powerlines | AZ | USA | 32.62 | -111.3 | 635 | 3 |
| Quartzite | AZ | USA | 33.78 | -114.1 | 360 | 5 |
| Rattlesnake Butte | CO | USA | 40.57 | -107 | 2200 | 18 |
| Rodeo | NM | USA | 31.89 | -109 | 1257 | 17 |
| Sedgwick Reserve | CA | USA | 34.69 | -120 | 342 | 28 |
| Vandenberg Airforce Base | CA | USA | 34.74 | -120.6 | 429 | 19 |
| Youngblood Place | TX | USA | 30.33 | -103.1 | 1330 | 10 |
Rattlesnake thermoregulation
Analysis of internal body temperatures
Comments look like this.
To-do items look like this.
1 Datasets
Study data are organized across three primary sources:
data/site-metadata.csv— geographic and collaboration information for each of the 19 field sites.data/snake-metadata.csv— physical and biological attributes for each of the 210 individual snakes, including a path to the corresponding body temperature file.data/_raw/— one CSV per snake with timestamped internal body temperature recordings from iButton sensors.
An integrated, quality-controlled workspace is provided as data/processed/tbdata.RData. It contains tb (body temperature time series), meta_snake, and meta_site.
tbdata.RData is produced in two sequential steps. analysis/processing/data-processing.r reads the raw CSV files, joins metadata, and detects basic integrity issues (duplicate timestamps, corrupted observations). analysis/processing/outliers.r then applies biological bounds filtering, removes short discontinuous segments, and flags rolling-window outliers as NA. See the Outlier detection section for details.
1.1 Internal body temperatures
Body temperature data from 210 snakes were collected at 19 sites across western North America, spanning 2013–2025.
Sampled individuals belonged to 11 species. Most were represented at a single site.
| spp | n_indiv | n_sites |
|---|---|---|
| crotalus oreganus | 85 | 4 |
| crotalus viridis | 43 | 5 |
| crotalus atrox | 25 | 5 |
| crotalus scutulatus | 16 | 1 |
| crotalus oreganus oreganus | 8 | 1 |
| sistrurus catenatus | 8 | 1 |
| crotaus pyrrhus | 7 | 2 |
| crotalus lepidus | 6 | 1 |
| crotalus ornatus | 5 | 1 |
| crotalus scutulatus x viridis | 4 | 2 |
| crotalus horridus | 3 | 1 |
| locale | state | start | end | years |
|---|---|---|---|---|
| Chimineas Ranch | CA | Apr 2010 | Oct 2017 | 2 |
| Vandenberg Airforce Base | CA | Apr 2012 | Nov 2017 | 3 |
| Montana de Oro | CA | Apr 2014 | Feb 2018 | 3 |
| Eatonton | GA | Apr 2021 | Apr 2021 | 1 |
| Owl Head Buttes | AZ | Apr 2022 | Aug 2022 | 1 |
| Quartzite | AZ | Apr 2023 | Aug 2023 | 1 |
| Kamloops-Thompson | BC | Jul 2005 | Apr 2006 | 2 |
| Alberta | AB | Jul 2021 | Jul 2022 | 2 |
| Rattlesnake Butte | CO | Jun 2017 | May 2021 | 4 |
| Lordsburg | NM | Jun 2020 | Sep 2021 | 2 |
| Rodeo | NM | Jun 2021 | Sep 2021 | 1 |
| Youngblood Place | TX | Jun 2021 | Aug 2022 | 2 |
| McIntyre Landing | MI | May 2014 | Aug 2014 | 1 |
| Sedgwick Reserve | CA | May 2015 | Dec 2017 | 3 |
| Indio Mountains Research Station | TX | May 2019 | Oct 2023 | 5 |
| Norden | NE | May 2022 | Jun 2022 | 1 |
| Powerlines | AZ | May 2022 | Aug 2023 | 2 |
| Bouse | AZ | May 2023 | Aug 2023 | 1 |
1.2 DAYMET weather data
Daily weather covariates (minimum and maximum air temperature) are sourced from the ORNL Daymet 1-km gridded product, fetched via API by site coordinate and covering 1980–2024.
| lat | lon | date | year | yday | dayl | prcp | tmax | tmin | vp |
|---|---|---|---|---|---|---|---|---|---|
| 30.33 | -103.1 | 1980-01-01 | 1980 | 1 | 36293 | 0 | 18.54 | -3.63 | 423.8 |
| 30.33 | -103.1 | 1980-01-02 | 1980 | 2 | 36321 | 0 | 17.27 | 1.12 | 641.6 |
| 30.33 | -103.1 | 1980-01-03 | 1980 | 3 | 36351 | 0 | 12.22 | -1.7 | 539.1 |
| 30.33 | -103.1 | 1980-01-04 | 1980 | 4 | 36383 | 0 | 17.16 | -2.94 | 456.1 |
| 30.33 | -103.1 | 1980-01-05 | 1980 | 5 | 36418 | 0 | 19.23 | -3.21 | 431.4 |
| 30.33 | -103.1 | 1980-01-06 | 1980 | 6 | 36454 | 0 | 19.27 | -1.04 | 513.5 |
2 Exploratory Data Analysis
2.1 Outlier detection
Quality control is implemented in analysis/processing/outliers.r and proceeds in three stages.
1. Biological bounds. Observations with \[T_b < -5°C\] or \[T_b > 40°C\] are removed as physiologically implausible.
2. Discontinuity removal. Time gaps larger than 24 hours define segment boundaries within each snake’s series. Segments shorter than 7 days are dropped, as they are too short to contribute reliable information and may reflect instrument artifacts or re-deployment gaps.
3. Rolling-window outlier detection. For each observation a local mean (0.5-day window) and local SD (4-day window) are estimated using centered rolling statistics, with fallback to forward- or backward-looking windows near series endpoints. An observation is flagged if it deviates from the rolling mean by more than 6 SD and by more than 1°C. Flagged values are set to NA; rows are preserved.
The code block below shows the core logic of each stage.
Code
n_days_mean <- 0.5 # rolling-mean window
n_days_sd <- 4 # rolling-SD window
n_sd <- 6 # outlier threshold
# Stages 1-2: biological bounds + discontinuity removal
tb_qc <- tb |>
filter(tb > -5 & tb < 40) |>
group_by(file) |>
arrange(datetime) |>
mutate(
lagged = datetime - lag(datetime, default = first(datetime)),
is_not_contiguous = cumsum(duration(24, "hours") < lagged)
) |>
group_by(file, is_not_contiguous) |>
mutate(chunk_length = max(last(datetime) - first(datetime))) |>
filter(chunk_length > duration(7, "days")) |>
ungroup()
# Stage 3: per-file rolling mean and SD, then flag deviations
# NB: outliers.r additionally uses forward/backward windows near series
# endpoints; this shows the centered-window core for brevity.
tb_aug <- tb_qc |>
group_by(file) |>
group_modify(function(tbdata, ...) {
samp_freq <- median(diff(as.numeric(tbdata$datetime)), na.rm = TRUE)
w_mean <- ceiling(as.numeric(duration(n_days_mean, "days")) / samp_freq)
w_sd <- ceiling(as.numeric(duration(n_days_sd, "days")) / samp_freq)
tbdata |>
mutate(
rmean = data.table::frollmean(tb, w_mean, align = "center", fill = NA),
rsd = data.table::frollapply(tb, w_sd, FUN = sd, align = "center", fill = NA)
)
}) |>
ungroup()
# flagged values set to NA; the full pipeline writes the result back to tbdata.RData
tb_adj <- tb_aug |>
mutate(
tb = if_else((abs(tb - rmean) > n_sd * rsd) & (abs(tb - rmean) > 1), NA_real_, tb)
) |>
select(file, datetime, tb)Diagnostic plots showing a ±7-day window around each flagged observation are written to analysis/processing/_diagnostics/_outliers/ for manual review.
2.2 Active season identification
Active season dates are inferred from internal body temperatures using changepoint detection on daily diurnal variance. The intuition is that thermoregulating snakes show greater daily temperature variance when active; variance is low and flat during inactivity.
Season classification is a standalone analysis. The output data/processed/seasons.rds is produced by analysis/processing/season-identification.R and is not currently consumed by the main modeling pipeline.
2.2.1 Diurnal variation
Daily standard deviations in body temperature are computed per snake per day of year. Snakes with fewer than 30 days of data are excluded, and series are smoothed with a 7-day rolling mean.
2.2.2 Location averages
Smoothed series are aggregated to locale-level pointwise medians to give the representative seasonal profile for each site.
2.2.3 Changepoint detection
Mean changepoints are detected via binary segmentation (up to \(Q = 3\) segments). A threshold of 1.5°C on segment-level mean distinguishes active from inactive periods.
Comments:
In Lordsburg, a changepoint falls in the middle of the end-of-season transition. The 1.5°C threshold excludes the tapering period; reducing it to 1.25 would include it.
In Montana de Oro and Sedgwick Reserve, inferred start dates are unreliable due to sparse data (see below).
In Montana de Oro, data span 2017–2018 (mid-April to late February). A single snake was active for roughly one week in early February 2018; the inferred changepoint simply marks the start of data collection. In Sedgwick, data span 2015–2017 but with only one snake recorded in 2016; the changepoint reflects that individual’s activity onset.
| locale | yr | first.obs | last.obs |
|---|---|---|---|
| Montana de Oro | 2014 | 2014-04-16 | 2014-10-07 |
| Montana de Oro | 2017 | 2017-04-15 | 2017-12-31 |
| Montana de Oro | 2018 | 2018-01-01 | 2018-02-21 |
| Sedgwick Reserve | 2015 | 2015-05-16 | 2015-12-31 |
| Sedgwick Reserve | 2016 | 2016-01-01 | 2016-03-24 |
| Sedgwick Reserve | 2017 | 2017-04-15 | 2017-12-11 |
2.2.4 Inferred active seasons
Active season classification uses the changepoint-inferred segment means thresholded at 1.5°C (daily SD). Locales for which all data fell within the active season (Rodeo, McIntyre, Chimineas) are excluded from the table.
| location | start | end |
|---|---|---|
| Vandenberg Airforce Base | May 5 | Oct 16 |
| Montana de Oro | Jan 30 | Oct 26 |
| Lordsburg | Mar 28 | Nov 12 |
| Alberta | May 1 | Sep 8 |
| Youngblood Place | Mar 25 | Nov 10 |
| Rattlesnake Butte | May 3 | Sep 22 |
| Indio Mountains Research Station | Mar 16 | Nov 10 |
| Kamloops-Thompson | Mar 23 | Sep 27 |
| Sedgwick Reserve | Feb 12 | Oct 24 |
3 Statistical Modeling
The analysis models how continuous body temperature profiles (\(T_b\) as a function of time of day) vary with site-level environmental covariates across months, using a pooled cyclic functional regression fit across all months simultaneously.
3.1 Notation and covariates
| Symbol | Meaning |
|---|---|
| \[T_b(t)\] | Body temperature at hour \[t \in [0, 24)\] |
| \[x\] | Covariate (tmin, tmax, elev, or lat), z-scored globally |
| \[m\] | Month (\[1, \ldots, 12\]), treated as a continuous cyclic variable |
| \[\ell\] | Locale (field site) |
| \[i\] | Curve index — one snake-day |
| \[\phi^{(1)}_k(t)\] | Level-1 (between-locale) eigenfunction, PC \[k\] |
| \[\phi^{(2)}_k(t)\] | Level-2 (within-locale) eigenfunction, PC \[k\] |
Four covariates are supported. Each is z-scored globally (across all locale-dates) so that a 1-SD shift is interpretable as a population-level contrast regardless of site.
| Variable | Units | Contrast | Notes |
|---|---|---|---|
tmin |
°C | 5°C | — |
tmax |
°C | 5°C | — |
elev |
m | 100 m | Rattlesnake Butte excluded (extreme outlier) |
lat |
degrees | 5° | — |
3.2 Data preparation
The package function prep_tf_data() loads the processed data files, joins snake and site metadata with Daymet weather, constructs functional data objects (one curve per snake-day on the 0–24h domain), and z-scores the four covariates globally. Sparse curves (fewer than 10 hourly observations) are retained at this stage and filtered downstream.