GPS Elevation

Note: everything is in metric because we are doing science!

suppressPackageStartupMessages({
library(sf)
library(elevatr)
library(tidyverse)
library(tigris)
library(terra)
library(knitr)
library(broom)
})

path <- read_sf('~/Downloads/2025-12-29-093245-Hiking-Craig’s Series 10 Watch.gpx', layer = "track_points")
elev_rast <- get_elev_raster(locations = path, clip = 'locations', expand = .001, z = 14, neg_to_na = TRUE)
path <- path %>%
  mutate(DEM = extract(elev_rast, .))

We’ll look at an activity recorded on an Apple watch, and use two sources for elevation:

The elevations recorded by the watch into the GPX file (presumably the altimeter readings).
The elevations at each point from the Mapzen Digital Elevation Model, which returns the best resolution grid fro the area of interest (usually 10m).

Note that while the DEM is quite accurate, it will necessarily flatten features smaller than 10m, and it depends on the accuracy of the horizontal (X, Y) coordinates.

elev_on_path <- path %>% 
  mutate(step_dist = coalesce(as.numeric(st_distance(geometry, lag(geometry), by_element = TRUE)), 0)) %>% 
  st_drop_geometry() %>% 
  select(time, step_dist, DEM, raw = ele) %>%
  pivot_longer(-c(time, step_dist), names_to = 'src', values_to = 'elev') %>%
  group_by(src) %>%
  arrange(time) %>%
  mutate(
    cum_dist = cumsum(step_dist)/1000,
    gain = coalesce(pmax(elev - lag(elev), 0), 0),
    cum_elev = cumsum(gain)
  )

Elevation vs Distance

The altimeter data reported a consistently higher value, but traces quite well.

ggplot(elev_on_path) + 
  geom_line(aes(x = cum_dist, y = elev, color = src)) +
  theme_minimal()

Let’s zoom in a bit to a half kilometer traveled so we can see the different structure of the elevation data sources. You can see the “steps” in the DEM data as you cross from one tile to another, while the altimeter data is recorded frequently enough that it appears smooth (this is directly from the GPX and has not been smoothed beyond what happened on-device).

ggplot(elev_on_path) + 
  geom_line(aes(x = cum_dist, y = elev, color = src)) +
  theme_minimal() + 
  scale_x_continuous(limits = c(5, 5.5)) + 
  scale_y_continuous(limits = c(1175, 1250))

Did the relationship between the two change over time? We can plot the “error” ( in quotes, because neither of these values are the truth)
of predicting the DEM from the altimeter over time, and while we can see that the relationship between the two clearly wobbles over short time periods, it’s stable overall: the DEM averages about 1.01 times the altimeter reading, minus 24 meters.

mod <- lm(data = path, DEM ~ ele)

path %>% 
  st_drop_geometry() %>%
  mutate(resid = residuals(mod)) %>%
  ggplot() +
  geom_point(aes(x = time, y = resid)) + 
  geom_smooth(aes(x = time, y = resid), method = 'lm') +
  theme_minimal()

First half

path %>% 
  mutate(resid = residuals(mod)) %>%
  slice_head(prop = .5) %>%
  ggplot() +
  geom_sf(aes(color = resid)) + 
  theme_minimal() +
  xlab("Longitude") + 
  scale_color_distiller(palette = 'RdBu')

Second half (return)

path %>% 
  mutate(resid = residuals(mod)) %>%
  slice_tail(prop = .5) %>%
  ggplot() +
  geom_sf(aes(color = resid)) + 
  theme_minimal() +
  xlab("Longitude") + 
  scale_color_distiller(palette = 'RdBu')

Cumulative Elevation

If we simply add up each increase in elevation over the track, we get the following:

elev_on_path %>%
  group_by(Source = src) %>% 
  summarize(
    `Cumulative Elevation Gain` = round(sum(gain))
  ) %>% kable()

Source	Cumulative Elevation Gain
DEM	1452
raw	856

Clearly, neither of these are what we expect. We are counting every little jitter in the elevation data as a gain, which results in even the DEM-sourced gain being inflated. Is that “correct”? Up to you, but unless you want your jaunty bouncy walk to count for twice as much elevation as your friend’s steady careful walk (or realistically, your noisy altimeter to count for twice as much elevation as their precise one), we need to dull out the small, randomish gains in elevation.

We can smooth the z-axis (elevation) of the path using a spline (credit to https://github.com/trackerproject/trackeR).

elev_on_path_smooth <- elev_on_path %>%
  group_by(src) %>%
  nest() %>%
  mutate(
      fit = map(data, ~smooth.spline(as.numeric(.x$time), .x$elev)),
      augmented = map(fit, augment)
  ) %>%
  unnest(c(data, augmented)) %>%
  select(-fit) %>%
  rename(elev_smooth = .fitted) %>%
  mutate(
    gain_smooth = coalesce(pmax(elev_smooth - lag(elev_smooth), 0), 0),
  )


elev_on_path_smooth %>%
  group_by(Source = src) %>% 
  summarize(
    `Cumulative Elevation Gain Smoothed` = round(sum(gain_smooth))
  ) %>% kable()

Source	Cumulative Elevation Gain Smoothed
DEM	762
raw	757

Now the numbers agree a little better with each other! But we still have two numbers.

We could go a step further and fit the altimeter values to the DEM values (as we did before to see if there was a pattern in the “error”).

That ties the altimeter readings to the “ground truth” of the DEM readings, but preserves the small scale variation that the altimeter sees while traversing a single grid cell in the DEM (which will look flat or stepped). Then we can smooth that fitted value to get an elevation gain that incorporates both sets of data while ignoring noise in the ups and downs.

elev_on_path_adjusted <-
  path %>% 
  st_drop_geometry() %>% 
  mutate(
    elev_adjusted = augment(mod)$.fitted
  ) 
elev_on_path_adjusted_smoothed <- elev_on_path_adjusted %>%
  mutate(
    elev_adjusted_smoothed = augment(smooth.spline(as.numeric(elev_on_path_adjusted$time), elev_on_path_adjusted$elev_adjusted))$.fitted
  ) %>%
  mutate(
    gain_adjusted_smooth = coalesce(pmax(elev_adjusted_smoothed - lag(elev_adjusted_smoothed), 0), 0),
  )


elev_on_path_adjusted_smoothed %>%
  summarize(
    `Cumulative Elevation Gain Smoothed & Adjusted` = round(sum(gain_adjusted_smooth))
  ) %>% kable()

Cumulative Elevation Gain Smoothed & Adjusted
767