Main Research Question:
Did the mean daily coastal ocean temperature at 2 m depth at Birchy Head, Nova Scotia, change from 2019 to 2025?
ocean_temperature <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-03-31/ocean_temperature.csv')
## Rows: 19165 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (4): sensor_depth_at_low_tide_m, mean_temperature_degree_c, sd_temperat...
## date (1): date
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ocean_temperature_deployments <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-03-31/ocean_temperature_deployments.csv')
## Rows: 14 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): deployment_id
## dbl (2): latitude, longitude
## date (2): start_date, end_date
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
This project examines daily coastal ocean temperatures at Birchy Head, Nova Scotia, from 2019 to 2025. The dataset reports the daily mean water temperature at several sensor depths. I focused on the 2 m sensor depth so that measurements are comparable across years. Ocean temperature influences marine environments, seasonal variability, and habitat for coastal biota. The dependent variable, mean_temperature_degree_c, is the daily mean temperature in degrees Celsius. The independent variable, year, represents the passage of time across the study period.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.6
## ✔ forcats 1.0.1 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.1 ✔ tibble 3.3.1
## ✔ lubridate 1.9.5 ✔ tidyr 1.3.2
## ✔ purrr 1.2.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(lubridate)
# Dataset structure.
glimpse(ocean_temperature)
## Rows: 19,165
## Columns: 5
## $ date <date> 2018-02-20, 2018-02-20, 2018-02-20, 2018-0…
## $ sensor_depth_at_low_tide_m <dbl> 2, 5, 10, 15, 20, 30, 40, 2, 5, 10, 15, 20,…
## $ mean_temperature_degree_c <dbl> 1.584839, 1.495161, 1.486429, 1.627000, 1.8…
## $ sd_temperature_degree_c <dbl> 0.04130141, 0.04288519, 0.03067176, 0.04842…
## $ n_obs <dbl> 31, 31, 21, 21, 32, 21, 32, 142, 143, 96, 9…
glimpse(ocean_temperature_deployments)
## Rows: 14
## Columns: 5
## $ deployment_id <chr> "depl_01", "depl_02", "depl_03", "depl_04", "depl_05", "…
## $ start_date <date> 2018-02-20, 2018-04-25, 2019-05-02, 2019-11-22, 2020-11…
## $ end_date <date> 2018-04-25, 2019-05-02, 2019-11-22, 2020-11-08, 2021-05…
## $ latitude <dbl> 44.57130, 44.56983, 44.56975, 44.56985, 44.56992, 44.569…
## $ longitude <dbl> -64.03512, -64.03411, -64.03448, -64.03443, -64.03440, -…
ocean_clean <- ocean_temperature %>%
  mutate(
    date = as.Date(date),
    year = year(date),
    month = month(date, label = TRUE)
  ) %>%
  filter(
    year >= 2019,
    year <= 2025,
    sensor_depth_at_low_tide_m == 2
  )
# Check sample size by year.
ocean_clean
## # A tibble: 2,184 × 7
## date sensor_depth_at_low_tide_m mean_temperature_degree_c
## <date> <dbl> <dbl>
## 1 2019-01-01 2 3.01
## 2 2019-01-02 2 3.05
## 3 2019-01-03 2 2.53
## 4 2019-01-04 2 2.20
## 5 2019-01-05 2 2.77
## 6 2019-01-06 2 2.58
## 7 2019-01-07 2 2.39
## 8 2019-01-08 2 2.06
## 9 2019-01-09 2 2.67
## 10 2019-01-10 2 2.78
## # ℹ 2,174 more rows
## # ℹ 4 more variables: sd_temperature_degree_c <dbl>, n_obs <dbl>, year <dbl>,
## # month <ord>
ocean_clean %>% count(year)
## # A tibble: 7 × 2
## year n
## <dbl> <int>
## 1 2019 219
## 2 2020 366
## 3 2021 365
## 4 2022 365
## 5 2023 365
## 6 2024 164
## 7 2025 340
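The yearly counts above are uneven (219 days in 2019 and 164 in 2024, against 365 or 366 in the complete years), so it may be worth checking which months each year actually covers before comparing annual means. A minimal sketch in base R, using a small made-up data frame (`toy` is hypothetical, not the Birchy Head records):

```r
# Toy data (hypothetical): one complete year and one year that only has
# January-March readings.
toy <- data.frame(
  date = c(
    seq(as.Date("2021-01-01"), as.Date("2021-12-31"), by = "day"),
    seq(as.Date("2022-01-01"), as.Date("2022-03-31"), by = "day")
  )
)
toy$year  <- as.integer(format(toy$date, "%Y"))
toy$month <- as.integer(format(toy$date, "%m"))

# Days recorded in each year-by-month cell; zeros reveal missing months.
coverage <- table(toy$year, toy$month)
coverage

# Number of calendar months with at least one reading, per year.
months_covered <- rowSums(coverage > 0)
months_covered
```

Applied to `ocean_clean`, the same table could be built from its `year` and `month` columns.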
temperature_summary <- ocean_clean %>%
  group_by(year) %>%
  summarise(
    mean_temp = mean(mean_temperature_degree_c, na.rm = TRUE),
    median_temp = median(mean_temperature_degree_c, na.rm = TRUE),
    sd_temp = sd(mean_temperature_degree_c, na.rm = TRUE),
    sample_size = n(),
    se_temp = sd_temp / sqrt(sample_size)
  )
temperature_summary
## # A tibble: 7 × 6
## year mean_temp median_temp sd_temp sample_size se_temp
## <dbl> <dbl> <dbl> <dbl> <int> <dbl>
## 1 2019 5.09 4.59 4.03 219 0.272
## 2 2020 8.93 8.33 5.65 366 0.296
## 3 2021 9.82 8.99 5.49 365 0.288
## 4 2022 9.34 8.70 5.56 365 0.291
## 5 2023 9.70 7.70 5.62 365 0.294
## 6 2024 3.89 3.50 1.97 164 0.154
## 7 2025 9.59 9.13 6.31 340 0.342
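As a quick check on the `se_temp` column, the standard error is the standard deviation divided by the square root of the sample size. The sketch below reproduces the 2019 value by hand from the rounded figures printed in the table. Note that this formula treats the daily readings as independent, which is an assumption; consecutive daily temperatures are usually correlated, so the true uncertainty is likely larger.

```r
# Values copied from the 2019 row of the summary table above.
sd_2019 <- 4.03
n_2019  <- 219

# Standard error of the mean: sd / sqrt(n).
se_2019 <- sd_2019 / sqrt(n_2019)
round(se_2019, 3)
```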
# Visualizations
# Boxplot by year
ggplot(ocean_clean, aes(x = factor(year), y = mean_temperature_degree_c)) +
  geom_boxplot(fill = "purple", color = "black") +
  theme_minimal() +
  labs(
    title = "Daily Coastal Ocean Temperature by Year",
    subtitle = "Birchy Head, Nova Scotia; 2 m sensor depth",
    x = "Year",
    y = "Mean daily temperature (degrees C)"
  )
# Annual mean trend with standard error bars
ggplot(temperature_summary, aes(x = year, y = mean_temp)) +
  # `linewidth` replaces `size` for lines (deprecated since ggplot2 3.4.0)
  geom_line(color = "darkblue", linewidth = 1) +
  geom_point(size = 3, color = "green") +
  geom_errorbar(
    aes(ymin = mean_temp - se_temp, ymax = mean_temp + se_temp),
    width = 0.15,
    color = "gray30"
  ) +
  theme_minimal() +
  labs(
    title = "Mean Daily Coastal Ocean Temperature from 2019 to 2025",
    subtitle = "Points are annual means; error bars are +/- 1 standard error",
    x = "Year",
    y = "Mean temperature (degrees C)"
  )
# Histogram of daily temperatures
ggplot(ocean_clean, aes(x = mean_temperature_degree_c)) +
  geom_histogram(binwidth = 1, fill = "brown", color = "white") +
  theme_minimal() +
  labs(
    title = "Distribution of Daily Coastal Ocean Temperatures",
    subtitle = "Birchy Head, Nova Scotia; 2 m sensor depth, 2019-2025",
    x = "Mean daily temperature (degrees C)",
    y = "Number of days"
  )
### Descriptive Statistics
temperature_summary <- ocean_clean %>%
  group_by(year) %>%
  summarise(
    mean_temp = mean(mean_temperature_degree_c, na.rm = TRUE),
    median_temp = median(mean_temperature_degree_c, na.rm = TRUE),
    sd_temp = sd(mean_temperature_degree_c, na.rm = TRUE),
    sample_size = n(),
    se_temp = sd_temp / sqrt(sample_size)
  )
temperature_summary
## # A tibble: 7 × 6
## year mean_temp median_temp sd_temp sample_size se_temp
## <dbl> <dbl> <dbl> <dbl> <int> <dbl>
## 1 2019 5.09 4.59 4.03 219 0.272
## 2 2020 8.93 8.33 5.65 366 0.296
## 3 2021 9.82 8.99 5.49 365 0.288
## 4 2022 9.34 8.70 5.56 365 0.291
## 5 2023 9.70 7.70 5.62 365 0.294
## 6 2024 3.89 3.50 1.97 164 0.154
## 7 2025 9.59 9.13 6.31 340 0.342
# Boxplot by year
ggplot(ocean_clean, aes(x = factor(year), y = mean_temperature_degree_c)) +
  geom_boxplot(fill = "orange", color = "black") +
  theme_minimal() +
  labs(
    title = "Daily Coastal Ocean Temperature by Year",
    subtitle = "Birchy Head, Nova Scotia; 2 m sensor depth",
    x = "Year",
    y = "Mean daily temperature (degrees C)"
  )
# Annual mean trend with standard error bars
ggplot(temperature_summary, aes(x = year, y = mean_temp)) +
  # `linewidth` replaces `size` for lines (deprecated since ggplot2 3.4.0)
  geom_line(color = "pink2", linewidth = 1) +
  geom_point(size = 3, color = "lightgray") +
  geom_errorbar(
    aes(ymin = mean_temp - se_temp, ymax = mean_temp + se_temp),
    width = 0.15,
    color = "gray30"
  ) +
  theme_minimal() +
  labs(
    title = "Mean Daily Coastal Ocean Temperature from 2019 to 2025",
    subtitle = "Points are annual means; error bars are +/- 1 standard error",
    x = "Year",
    y = "Mean temperature (degrees C)"
  )
# Histogram of daily temperatures
ggplot(ocean_clean, aes(x = mean_temperature_degree_c)) +
  geom_histogram(binwidth = 1, fill = "plum3", color = "white") +
  theme_minimal() +
  labs(
    title = "Distribution of Daily Coastal Ocean Temperatures",
    subtitle = "Birchy Head, Nova Scotia; 2 m sensor depth, 2019-2025",
    x = "Mean daily temperature (degrees C)",
    # each histogram count is a number of daily records, not years
    y = "Number of days"
  )
temp_model <- lm(mean_temperature_degree_c ~ year, data = ocean_clean)
summary(temp_model)
##
## Call:
## lm(formula = mean_temperature_degree_c ~ year, data = ocean_clean)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.315 -5.165 -1.420 5.347 12.705
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -373.6723 129.6134 -2.883 0.00398 **
## year 0.1891 0.0641 2.949 0.00322 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.69 on 2182 degrees of freedom
## Multiple R-squared: 0.003971, Adjusted R-squared: 0.003514
## F-statistic: 8.699 on 1 and 2182 DF, p-value: 0.003217
par(mfrow = c(2, 2))
plot(temp_model)
par(mfrow = c(1, 1))
I used a simple linear regression because the research question asks whether temperature changed over time, and the slope of a regression of temperature on year estimates the average change per year. The dependent variable, mean_temperature_degree_c, is the daily mean temperature. The independent variable, year, ranges from 2019 to 2025. The null hypothesis is that the slope is zero (no linear relationship between year and temperature); the alternative is that the slope is not zero. P-values below 0.05 are treated as statistically significant. The estimated slope was approximately 0.189 °C per year, with a p-value of 0.0032. Because the p-value was below 0.05, the temperature change was considered statistically significant. The positive slope reflects a small increase, whereas the R-squared value of 0.004 indicates that year explained almost none of the variation in daily temperature.
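The three quantities quoted above (slope, p-value, and R-squared) can each be read directly off a fitted `lm` object. The sketch below shows one way to do that on a small made-up data set (`toy` is hypothetical, with a built-in trend of 0.2 degrees per year on top of a repeating within-year pattern); it is an illustration of the extraction steps, not a re-run of the Birchy Head model.

```r
# Toy data (hypothetical): four readings per year with the same seasonal
# pattern every year, plus a trend of 0.2 degrees per year.
toy <- data.frame(
  year = rep(2019:2025, each = 4),
  temp = rep(c(2, 6, 14, 8), times = 7) + 0.2 * rep(0:6, each = 4)
)

fit <- lm(temp ~ year, data = toy)
fit_summary <- summary(fit)

# Slope: estimated change in temperature per year.
slope <- coef(fit)[["year"]]

# P-value for the test that the slope is zero.
p_value <- fit_summary$coefficients["year", "Pr(>|t|)"]

# Share of the variation in temperature explained by year.
r_squared <- fit_summary$r.squared

c(slope = slope, p_value = p_value, r_squared = r_squared)
```

In this toy example the slope recovers the built-in trend while R-squared stays small, because most of the variation comes from the within-year pattern rather than from year. The slope and R-squared answer different questions: how fast the mean changes, and how much of the scatter that change accounts for.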
The results indicate that the average daily coastal ocean temperature at the 2 m sensor depth changed from 2019 to 2025. The positive regression slope, approximately 0.189 °C per year, indicates a small annual increase. The p-value was 0.0032, below the 0.05 significance level, so the change was statistically significant. However, the R-squared value was approximately 0.004, so year explained almost none of the variation in daily temperature.
This result is biologically interesting because even slight increases in temperature can influence marine life. Seawater temperature regulates seasonality, habitat quality, and environmental conditions for coastal species. The analysis has several limitations, however. The data were gathered at one location, Birchy Head, Nova Scotia, so the results cannot represent the ocean as a whole. The analysis uses only the 2 m sensor depth, so it does not describe temperature changes at other depths. Daily ocean temperatures also vary substantially with season, which helps explain why year alone accounted for so little of the variation. Finally, the analysis shows an association between year and temperature; it does not explain why temperature changed.
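The point about seasonality can be illustrated by comparing a year-only model with one that also includes month as a factor. This is a sketch of a possible extension on made-up data (`toy`, its seasonal curve, and its 0.2 degrees-per-year trend are all hypothetical); it is not part of the analysis reported here.

```r
# Toy data (hypothetical): one value per month for 2019-2025, built from a
# seasonal cosine curve, a small yearly trend, and a small deterministic wiggle.
toy <- expand.grid(month = 1:12, year = 2019:2025)
toy$temp <- 9 - 7 * cos(2 * pi * (toy$month - 2) / 12) +
  0.2 * (toy$year - 2019) +
  0.3 * sin(seq_len(nrow(toy)))

# Year-only model, as in the report, versus year plus a month factor.
fit_year <- lm(temp ~ year, data = toy)
fit_both <- lm(temp ~ year + factor(month), data = toy)

r2_year <- summary(fit_year)$r.squared
r2_both <- summary(fit_both)$r.squared

c(year_only = r2_year, year_plus_month = r2_both)
```

On data with a strong seasonal cycle, the month term absorbs most of the variation, which is consistent with year alone leaving a very low R-squared.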
Based on the linear regression, the average daily coastal ocean temperature at the 2 m sensor depth at Birchy Head, Nova Scotia, changed between 2019 and 2025. Temperature increased slightly over the period, with a slope of about 0.189 °C per year. The increase is statistically significant because the p-value (0.0032) is below 0.05. However, the R-squared value was very small, so year alone explained very little of the variation in daily temperature. Overall, the findings indicate an increase in coastal ocean temperature over time, with seasonal patterns and other environmental influences probably also contributing.
Centre for Marine Applied Research. (n.d.). Coastal Monitoring Program. https://cmar.ca/coastal-monitoring-program/
Nova Scotia Open Data. (n.d.). Understanding Complex Data: Coastal Monitoring. https://data.novascotia.ca/stories/s/Understanding-Complex-Data-Coastal-Monitoring/a25g-piws/
TidyTuesday. (2026). Coastal Ocean Temperature by Depth. GitHub. https://github.com/rfordatascience/tidytuesday/blob/main/data/2026/2026-03-31/readme.md