Modeling Levan and Hovermap Travel Rates

Author

Michael Campbell

Introduction

In 2016, we ran our first travel rate experiment. Thirty-one study subjects walked 22 transects, each 100m in length, through central Wasatch foothills outside of Levan, UT featuring juniper-sagebrush vegetation. Study subjects were timed with stopwatches, the results of which were used to derive travel rates. Travel rates were compared to lidar-derived estimates of slope, vegetation density, and ground surface roughness to develop models capable of predicting travel rates through wildland environments and mapping least-cost paths. The study was successful, but broad applicability was limited due to the limited ranges of terrain and vegetation conditions tested.

In 2023, we ran a second set of travel rate experiments, hoping to build upon the foundation laid by “the Levan study”, with some key improvements, including:

Selection of transects that captured a wider range of landscape conditions
Precise study subject spatiotemporal tracking
Mapping of ground-level landscape characteristics using an Emesent Hovermap mobile laser scanner (MLS)

Although the ultimate goal of this work is to gain highly-precise insights into travel behaviors through the analysis of the MLS data, the shared characteristics between the Levan study and this one (namely, a bunch of people walking 100 m transects back and forth through the wild) enable a comprehensive analysis that leverages data from both studies simultaneously.

There are some key differences between the studies, however, that may limit the direct comparative abilities:

Study subjects
- Levan study had 31; Hovermap study had 8-9
- No overlapping participants
Landscape characteristics
- Levan study captured more transects (22) in narrower set of landscape conditions
- Hovermap study captured fewer transects (11) in wider set of landscape conditions
Study design
- Levan study had people walking freely, only carrying their data logging clipboard
- Hovermap study had people walking with a backpack-mounted MLS, which adds both weight and caution

Accordingly, the extent to which a singular predictive model that combines data from both studies and possesses a valuable degree of predictive capacity is unknown. If such a model could be built, it would provide strong evidence towards a broadly-applicable set of landscape impedance factors that could be applied to a whole host of scenarios, including the simulation of wildland firefigher evacuation.

The objective of this markdown document is to examine the performance of a singular travel rate predictive model, built with data from our two travel rate experiments.

A Brief Aside about HVM Travel Rate Data

Behind the scenes, in a series of much less interesting scripts, I derived Levan-equivalent lidar and travel rate data for the Hovermap data. There are some things to discuss about this process that I’ll note here for further discussion at some later points while they’re still fresh in my head:

Approach to distinguishing between start-to-end and end-to-start portions of the continuous Hovermap trajectory
Issue with the placement of georeferencing gates on the end of transect 8 for group 1
Issue with Phil’s transect 7 data

Data Exploration

I think it’s worthwhile to take a look at how well the two studies capture landscape variability. So, I’ll first read in the combined data and print the full table for those interested. Note that travel rates have already been collapsed to per-transect/per-direction means, so there are no individual-level data contained within.

library(rmarkdown)

# read in the data
df <- read.csv("S:/ursa2/campbell/hvm_travel_rate/data/travel_rates/transect_structure_vs_travel_rates_lev_and_hvm.csv")
paged_table(df)

Now, I’ll make some plots that capture the landscape variable space between the two studies. The Levan transects are shown in red and the Hovermap transects are shown in blue.

library(RColorBrewer)

# plot the data out
col.lev <- brewer.pal(3, "Set1")[1]
col.hvm <- brewer.pal(3, "Set1")[2]
cols <- rep(col.lev, nrow(df))
cols[df$study == "hovermap"] <- col.hvm
pairs(df[,c("slope", "density", "roughness")],
      pch = 16, col = cols)

A few noteworthy results here:

Very intentionally, we captured much steeper slopes in the Hovermap study than the Levan study
We also captured the two highest vegetation density transects in the Hovermap study
Something I didn’t expect is the increase in ground surface roughness found in this study… I think this is likely due to the fact that the Hovermap transects featured much higher ground-level vegetation densities than the Levan study. In Levan, it was basically either juniper, sagebrush, or soil. In the Hovermap study, we were in much denser forests with heavy grass/forb/shrub understories. So, what is being classified as “ground” in the ALS data is likely inclusive of the lumpy near-ground vegetation, making it appear as if the ground surface is more rough. That said, we were also generally in more rugged terrain (i.e., steeper slopes, boulder fields, drainages, etc.), so ground surface may have also… just been rougher!

In any case, adding the Hovermap data to the Levan data clearly captures a much wider range of conditions.

Now let’s take a look at the travel rates.

library(colorspace)

# get histograms
h.lev <- hist(df$travel.rate[df$study == "levan"], breaks = seq(0,1.6,0.2), plot = F)
h.hvm <- hist(df$travel.rate[df$study == "hovermap"], breaks = seq(0,1.6,0.2), plot = F)

# plot histograms
par(mar = c(5,5,1,1), las = 1)
col.lev <- brewer.pal(3, "Set1")[1] |> adjust_transparency(0.5)
col.hvm <- brewer.pal(3, "Set1")[2] |> adjust_transparency(0.5)
plot(h.lev, col = col.lev, border = NA, main = NA, xlab = "Travel Rate (m/s)")
plot(h.hvm, col = col.hvm, border = NA, main = NA, xlab = NA, ylab = NA,
     axes = F, add = T)
legend("topleft", legend = c("levan", "hovermap"), fill = c(col.lev, col.hvm),
       bty = "n", border = NA)

Quite clearly, Levan travel rates were generally faster than Hovermap travel rates. Not entirely surprising, given the challenging landscape conditions faced in the Hovermap transects. But, perhaps points to some challenges of merging the two datasets together?

Let’s see how travel rates compare to the three landscape metrics of interest:

# set up plot
par(mfrow = c(1,3), las = 1, mar = c(5,0,0,0), oma = c(0,5,1,1))
col.lev <- brewer.pal(3, "Set1")[1]
col.hvm <- brewer.pal(3, "Set1")[2]
cols <- rep(col.lev, nrow(df))
cols[df$study == "hovermap"] <- col.hvm

# plot travel rate vs. slope
plot(travel.rate ~ slope, data = df, pch = 16, col = cols, xlab = "Slope (deg)")
plot(travel.rate ~ density, data = df, pch = 16, col = cols, yaxt = "n", xlab = "Density (%)")
plot(travel.rate ~ roughness, data = df, pch = 16, col = cols, yaxt = "n", xlab = "Roughness (m)")
mtext("Travel Rate (m/s)", 2, line = 3, outer = T, cex = 2/3, las = 0)

*VERY interesting**… Slope looks about how we would expect it. Density and roughness have clear negative trends, with some outliers. What is really surprising is the seemingly really strong relationship between travel rates and ground surface roughness – something that wasn’t all that present in the Levan-only data.

Modeling

OK, now for the fun part! I’ll spare a lot of the gory details of the modeling approach, and instead refer the exceedingly curious among you to a previous RPubs markdown document. But, briefly, the approach is as follows…

I will be using a non-linear least squares (NLS) regression modeling approach.
The slope-travel rate relationship will be modeled using a Lorentz function form, which several of our studies have demonstrated to effectively capture the nature of that relationshop.
Given that these transects are relatively short, study subjects tend to walk relatively fast, so I will be providing the NLS regression with starting terms that represent the 97.5th percentile of travel rates in our AllTrails study.
I will treat vegetation density and ground surface roughness as multiplicative factors (or, really, divisive factors) – that is, their effects will augment the slope-travel rate relationship in a multiplicative manner, which is both intuitive (e.g., walking through dense vegetation slows me down by a factor of 2x) and prevents the prediction of negative travel rates (which can result from modeling veg density and roughness as additive/subtractive terms).
To assess model performance, I will be using a leave-one-transect-out cross-validation approach.

library(stringr)

# read in the slope-travel rate terms from alltrails study
slope.rate.terms <- read.csv("S:/ursa/campbell/alltrails/modeling/slope_vs_travel_rate_model_percentile_coefficients_med.csv")

# define starting coefficients
a.start <- slope.rate.terms$a[slope.rate.terms$percentile == 0.975]
b.start <- slope.rate.terms$b[slope.rate.terms$percentile == 0.975]
c.start <- slope.rate.terms$c[slope.rate.terms$percentile == 0.975]
d.start <- slope.rate.terms$d[slope.rate.terms$percentile == 0.975]
e.start <- slope.rate.terms$e[slope.rate.terms$percentile == 0.975]
f.start <- 5
g.start <- 0

# build model
mod.full <- nls(travel.rate ~ (c*(1/(pi*b*(1+((slope-a)/b)^2)))+d+e*slope) / (f * density + g * roughness + 1),
                    data = df,
                    start = list(a = a.start, 
                                 b = b.start, 
                                 c = c.start, 
                                 d = d.start, 
                                 e = e.start, 
                                 f = f.start,
                                 g = g.start),
                    control = nls.control(maxiter = 1000))

# get starting coefficients
a.start <- coef(mod.full)[1][[1]]
b.start <- coef(mod.full)[2][[1]]
c.start <- coef(mod.full)[3][[1]]
d.start <- coef(mod.full)[4][[1]]
e.start <- coef(mod.full)[5][[1]]
f.start <- coef(mod.full)[6][[1]]
g.start <- coef(mod.full)[7][[1]]

# leave one transect out cross-validation
tids <- unique(df$transect)
for (tid in tids){
  
  # split into training and test
  df.train <- df[df$transect != tid,]
  df.test <- df[df$transect == tid,]
  
  # build model
  mod.fold <- nls(travel.rate ~ (c*(1/(pi*b*(1+((slope-a)/b)^2)))+d+e*slope) / (f * density + g * roughness + 1),
                    data = df.train,
                    start = list(a = a.start, 
                                 b = b.start, 
                                 c = c.start, 
                                 d = d.start, 
                                 e = e.start, 
                                 f = f.start, 
                                 g = g.start),
                    control = nls.control(maxiter = 1000,
                                          minFactor = 1e-10,
                                          warnOnly = T))
  
  # apply model
  pred <- predict(mod.fold, df.test)
  obs <- df.test$travel.rate
  
  # compile results
  results.df.temp <- data.frame(pred = pred,
                             obs = obs,
                             a = coef(mod.fold)[1][[1]],
                             b = coef(mod.fold)[2][[1]],
                             c = coef(mod.fold)[3][[1]],
                             d = coef(mod.fold)[4][[1]],
                             e = coef(mod.fold)[5][[1]],
                             f = coef(mod.fold)[6][[1]],
                             g = coef(mod.fold)[7][[1]])
  if (tid == tids[1]){
    results.df <- results.df.temp
  } else {
    results.df <- rbind(results.df, results.df.temp)
  }
}

# set up plot
par(las = 1, mar = c(5,5,1,1))

# define point size and color
pt.sizes <- 1 + df$density * 4
col.1 <- brewer.pal(3, "Set1")[1]
col.2 <- "white"
col.3 <- brewer.pal(3, "Set1")[2]
pt.cols <- colorRampPalette(c(col.1, col.2, col.3))(100)
pt.cols <- pt.cols[as.numeric(cut(df$slope, breaks = 100))]

# define x and y
x <- results.df$obs
y <- results.df$pred

# get axis limits
ax.min <- min(c(x, y))
ax.max <- max(c(x, y))

# plot the data
plot(y ~ x, type = "n", xlab = "Observed Travel Rate (m/s)", ylab = "Predicted Travel Rate (m/s)", xlim = c(ax.min, ax.max), ylim = c(ax.min, ax.max))
grid()
lines(c(-100,100),c(-100,100), col = "darkgray", lty = 2)
points(y ~ x, pch = 21, bg = pt.cols, cex = pt.sizes)

# get model performance metrics
mod <- lm(y ~ x)
abline(mod, lwd = 2)
r2 <- round(summary(mod)$adj.r.squared,2)
rrmse <- round(100*(sqrt(mean((y-x) ^ 2)))/mean(x),2)
r2.txt <- bquote(R^2==.(r2))
rrmse.txt <- bquote(rRMSE==.(rrmse)*"%")
bias <- round(mean(y-x),2)
bias.txt <- bquote(bias==.(bias)*"m/s")
a <- round(coef(mod)[2], 2)
b <- round(coef(mod)[1], 2)
if (b < 0){
  eq.txt <- bquote(y==.(a)~x-.(abs(b)))
} else {
  eq.txt <- bquote(y==.(a)~x+.(abs(b)))
}
legend("topleft",
       bty = "n",
       legend = c(eq.txt, r2.txt, rrmse.txt, bias.txt),
       x.intersp = 0)

# add legend for slope
pt.cols <- colorRampPalette(c(col.1, col.2, col.3))(100)
leg.img <- as.raster(matrix(rev(pt.cols), ncol = 1))
rasterImage(image = leg.img,
            xleft = par("usr")[1] + 0.9 * (par("usr")[2] - par("usr")[1]),
            ybottom = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
            xright = par("usr")[1] + 0.95 * (par("usr")[2] - par("usr")[1]),
            ytop = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]))
rect(xleft = par("usr")[1] + 0.9 * (par("usr")[2] - par("usr")[1]),
     ybottom = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
     xright = par("usr")[1] + 0.95 * (par("usr")[2] - par("usr")[1]),
     ytop = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]))
text(x = par("usr")[1] + 0.95 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.3 * (par("usr")[4] - par("usr")[3]),
     labels = "Slope (deg)",
     adj = c(1,0))
text(x = par("usr")[1] + 0.88 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]),
     labels = round(max(df$slope)),
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.88 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.15 * (par("usr")[4] - par("usr")[3]),
     labels = "0",
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.88 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
     labels = round(min(df$slope)),
     adj = c(1,0.5))

# add legend for density
text(x = par("usr")[1] + 0.75 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.3 * (par("usr")[4] - par("usr")[3]),
     labels = "Density",
     adj = c(1,0))
text(x = par("usr")[1] + 0.68 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]),
     labels = str_pad(round(max(df$density),2), 4, "right", "0"),
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.68 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.15 * (par("usr")[4] - par("usr")[3]),
     labels = str_pad(round(mean(range(df$density)),2), 4, "right", "0"),
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.68 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
     labels = str_pad(round(min(df$density),2), 4, "right", "0"),
     adj = c(1,0.5))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]),
       cex = max(pt.sizes))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.2 * (par("usr")[4] - par("usr")[3]),
       cex = mean(c(max(pt.sizes), mean(range(pt.sizes)))))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.15 * (par("usr")[4] - par("usr")[3]),
       cex = mean(range(pt.sizes)))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.1 * (par("usr")[4] - par("usr")[3]),
       cex = mean(c(min(pt.sizes), mean(range(pt.sizes)))))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
       cex = min(pt.sizes))

AMAZING!! I’m genuinely shocked by how good this model is. Despite all differences between studies, a singular model is able to account for 84% of variance in travel rates. Truly impressive. Let’s see how the model coefficients varied throughout the leave-one-transect-out iterations.

# define colors
cols.fill <- adjust_transparency(brewer.pal(7,"Set1"), 0.5)
cols.line <- brewer.pal(7,"Set1")

# plot density distributions
par(las = 1, mfrow = c(3,3), mar = c(5,5,1,1))
for (i in seq(3,ncol(results.df))){
  x <- results.df[,i]
  d <- density(x)
  plot(d$y ~ d$x, type = "n", xlab = colnames(results.df)[i], ylab = "Density")
  polygon(d$y ~ d$x, col = cols.fill[i-2], border = NA)
  lines(d$y ~ d$x, col = cols.line[i-2], lwd = 2)
  abline(v = median(x), lty = 2, lwd = 2)
  if (median(x) < mean(par("usr")[1:2])){
    legend("topright", legend = paste0("Med = ", round(median(x),3)),
           x.intersp = 0, bty = "n")
  } else {
    legend("topleft", legend = paste0("Med = ", round(median(x), 3)),
           x.intersp = 0, bty = "n")
  }
}

Pretty consistent, with a few outliers. Suggests the model is pretty robust.

Let’s take a look at what travel rate predictions will be across a range of slopes and vegetation densities, while holding ground surface roughness constant at zero.

library(viridis)

Loading required package: viridisLite

# define new travel rate function that incorporates slope, vegetation density, and surface roughness
travelrate.fun <- function(slope,density,roughness,a,b,c,d,e,f,g){
  travelrate <- (c * (1 / (pi * b * (1 + ((slope - a) / b) ^ 2))) + d + e * slope) / (f * density + g * roughness + 1)
  return(travelrate)
}

# get coefficients (medians from LOOCV)
a <- median(results.df$a)
b <- median(results.df$b)
c <- median(results.df$c)
d <- median(results.df$d)
e <- median(results.df$e)
f <- median(results.df$f)
g <- median(results.df$g)

# get range of travel rate predictions
min.rate <- 100
max.rate <- -100
for (i in seq(0,1,0.1)){
  slopes <- seq(-45,45,0.1)
  densities <- rep(i, length(slopes))
  roughnesses <- rep(0, length(slopes))
  rates <- travelrate.fun(slopes, densities, roughnesses, a, b, c, d, e, f, g)
  if (min(rates) < min.rate) min.rate <- min(rates)
  if (max(rates) > max.rate) max.rate <- max(rates)
}

# plot out the results
cols <- viridis(11)
counter <- 0
par(las = 1, mar = c(5,5,1,8))
plot(x = c(-45,45), y = c(min.rate, max.rate), type = "n",
     xlab = "Slope (deg)", ylab = "Travel Rate (m/s)")
grid()
for (i in seq(0,1,0.1)){
  counter <- counter + 1
  slopes <- seq(-45,45,0.1)
  densities <- rep(i, length(slopes))
  roughnesses <- rep(0, length(slopes))
  rates <- travelrate.fun(slopes, densities, roughnesses, a, b, c, d, e, f, g)
  lines(rates ~ slopes, lwd = 2, col = cols[counter])
}
par(xpd = NA)
legend(x = par("usr")[2], y = mean(par("usr")[3:4]),
       legend = seq(0,1,0.1), title = "Veg Dens", lwd = 2, col = cols,
       bty = "n", yjust = 0.5)

Very interesting… In effect, what emerges is essentially a negative exponential curve – a function form that we know does not very accurately represent the slope-travel rate relationship. Further, the predicted travel rates with zero vegetation density and nearly flat slopes are VERY high. Unrealistically high. It translates to about a 10 minute mile – a jog. Of course, ground surface roughness is never zero, so maybe this is not a huge problem. But, it does make me curious about the large effect of ground surface roughness on this modeling process.

Let’s see what the curves look like with the ground surface roughness set to the median found among all of the transects:

# get range of travel rate predictions
min.rate <- 100
max.rate <- -100
for (i in seq(0,1,0.1)){
  slopes <- seq(-45,45,0.1)
  densities <- rep(i, length(slopes))
  roughnesses <- rep(median(df$roughness), length(slopes))
  rates <- travelrate.fun(slopes, densities, roughnesses, a, b, c, d, e, f, g)
  if (min(rates) < min.rate) min.rate <- min(rates)
  if (max(rates) > max.rate) max.rate <- max(rates)
}

# plot out the results
cols <- viridis(11)
counter <- 0
par(las = 1, mar = c(5,5,1,8))
plot(x = c(-45,45), y = c(min.rate, max.rate), type = "n",
     xlab = "Slope (deg)", ylab = "Travel Rate (m/s)")
grid()
for (i in seq(0,1,0.1)){
  counter <- counter + 1
  slopes <- seq(-45,45,0.1)
  densities <- rep(i, length(slopes))
  roughnesses <- rep(median(df$roughness), length(slopes))
  rates <- travelrate.fun(slopes, densities, roughnesses, a, b, c, d, e, f, g)
  lines(rates ~ slopes, lwd = 2, col = cols[counter])
}
par(xpd = NA)
legend(x = par("usr")[2], y = mean(par("usr")[3:4]),
       legend = seq(0,1,0.1), title = "Veg Dens", lwd = 2, col = cols,
       bty = "n", yjust = 0.5)

…Much more reasonable rates, but we still have the problem of the function form not matching what we know to be broadly representative of the slope-travel rate relationship. I’m curious what happens if we remove ground surface roughness as a predictor altogether and focus purely on the effects of slope and vegetation density.

# define starting coefficients
a.start <- slope.rate.terms$a[slope.rate.terms$percentile == 0.975]
b.start <- slope.rate.terms$b[slope.rate.terms$percentile == 0.975]
c.start <- slope.rate.terms$c[slope.rate.terms$percentile == 0.975]
d.start <- slope.rate.terms$d[slope.rate.terms$percentile == 0.975]
e.start <- slope.rate.terms$e[slope.rate.terms$percentile == 0.975]
f.start <- 5

# build model
mod.full <- nls(travel.rate ~ (c*(1/(pi*b*(1+((slope-a)/b)^2)))+d+e*slope) / (f * density + 1),
                    data = df,
                    start = list(a = a.start, 
                                 b = b.start, 
                                 c = c.start, 
                                 d = d.start, 
                                 e = e.start, 
                                 f = f.start),
                    control = nls.control(maxiter = 1000))

# get starting coefficients
a.start <- coef(mod.full)[1][[1]]
b.start <- coef(mod.full)[2][[1]]
c.start <- coef(mod.full)[3][[1]]
d.start <- coef(mod.full)[4][[1]]
e.start <- coef(mod.full)[5][[1]]
f.start <- coef(mod.full)[6][[1]]

# leave one transect out cross-validation
tids <- unique(df$transect)
for (tid in tids){
  
  # split into training and test
  df.train <- df[df$transect != tid,]
  df.test <- df[df$transect == tid,]
  
  # build model
  mod.fold <- nls(travel.rate ~ (c*(1/(pi*b*(1+((slope-a)/b)^2)))+d+e*slope) / (f * density + 1),
                    data = df.train,
                    start = list(a = a.start, 
                                 b = b.start, 
                                 c = c.start, 
                                 d = d.start, 
                                 e = e.start, 
                                 f = f.start),
                    control = nls.control(maxiter = 1000,
                                          minFactor = 1e-10,
                                          warnOnly = T))
  
  # apply model
  pred <- predict(mod.fold, df.test)
  obs <- df.test$travel.rate
  
  # compile results
  results.df.temp <- data.frame(pred = pred,
                             obs = obs,
                             a = coef(mod.fold)[1][[1]],
                             b = coef(mod.fold)[2][[1]],
                             c = coef(mod.fold)[3][[1]],
                             d = coef(mod.fold)[4][[1]],
                             e = coef(mod.fold)[5][[1]],
                             f = coef(mod.fold)[6][[1]])
  if (tid == tids[1]){
    results.df <- results.df.temp
  } else {
    results.df <- rbind(results.df, results.df.temp)
  }
}

# set up plot
par(las = 1, mar = c(5,5,1,1))

# define point size and color
pt.sizes <- 1 + df$density * 4
col.1 <- brewer.pal(3, "Set1")[1]
col.2 <- "white"
col.3 <- brewer.pal(3, "Set1")[2]
pt.cols <- colorRampPalette(c(col.1, col.2, col.3))(100)
pt.cols <- pt.cols[as.numeric(cut(df$slope, breaks = 100))]

# define x and y
x <- results.df$obs
y <- results.df$pred

# get axis limits
ax.min <- min(c(x, y))
ax.max <- max(c(x, y))

# plot the data
plot(y ~ x, type = "n", xlab = "Observed Travel Rate (m/s)", ylab = "Predicted Travel Rate (m/s)", xlim = c(ax.min, ax.max), ylim = c(ax.min, ax.max))
grid()
lines(c(-100,100),c(-100,100), col = "darkgray", lty = 2)
points(y ~ x, pch = 21, bg = pt.cols, cex = pt.sizes)

# get model performance metrics
mod <- lm(y ~ x)
abline(mod, lwd = 2)
r2 <- round(summary(mod)$adj.r.squared,2)
rrmse <- round(100*(sqrt(mean((y-x) ^ 2)))/mean(x),2)
r2.txt <- bquote(R^2==.(r2))
rrmse.txt <- bquote(rRMSE==.(rrmse)*"%")
bias <- round(mean(y-x),2)
bias.txt <- bquote(bias==.(bias)*"m/s")
a <- round(coef(mod)[2], 2)
b <- round(coef(mod)[1], 2)
if (b < 0){
  eq.txt <- bquote(y==.(a)~x-.(abs(b)))
} else {
  eq.txt <- bquote(y==.(a)~x+.(abs(b)))
}
legend("topleft",
       bty = "n",
       legend = c(eq.txt, r2.txt, rrmse.txt, bias.txt),
       x.intersp = 0)

# add legend for slope
pt.cols <- colorRampPalette(c(col.1, col.2, col.3))(100)
leg.img <- as.raster(matrix(rev(pt.cols), ncol = 1))
rasterImage(image = leg.img,
            xleft = par("usr")[1] + 0.9 * (par("usr")[2] - par("usr")[1]),
            ybottom = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
            xright = par("usr")[1] + 0.95 * (par("usr")[2] - par("usr")[1]),
            ytop = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]))
rect(xleft = par("usr")[1] + 0.9 * (par("usr")[2] - par("usr")[1]),
     ybottom = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
     xright = par("usr")[1] + 0.95 * (par("usr")[2] - par("usr")[1]),
     ytop = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]))
text(x = par("usr")[1] + 0.95 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.3 * (par("usr")[4] - par("usr")[3]),
     labels = "Slope (deg)",
     adj = c(1,0))
text(x = par("usr")[1] + 0.88 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]),
     labels = round(max(df$slope)),
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.88 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.15 * (par("usr")[4] - par("usr")[3]),
     labels = "0",
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.88 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
     labels = round(min(df$slope)),
     adj = c(1,0.5))

# add legend for density
text(x = par("usr")[1] + 0.75 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.3 * (par("usr")[4] - par("usr")[3]),
     labels = "Density",
     adj = c(1,0))
text(x = par("usr")[1] + 0.68 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]),
     labels = str_pad(round(max(df$density),2), 4, "right", "0"),
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.68 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.15 * (par("usr")[4] - par("usr")[3]),
     labels = str_pad(round(mean(range(df$density)),2), 4, "right", "0"),
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.68 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
     labels = str_pad(round(min(df$density),2), 4, "right", "0"),
     adj = c(1,0.5))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]),
       cex = max(pt.sizes))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.2 * (par("usr")[4] - par("usr")[3]),
       cex = mean(c(max(pt.sizes), mean(range(pt.sizes)))))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.15 * (par("usr")[4] - par("usr")[3]),
       cex = mean(range(pt.sizes)))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.1 * (par("usr")[4] - par("usr")[3]),
       cex = mean(c(min(pt.sizes), mean(range(pt.sizes)))))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
       cex = min(pt.sizes))

Fascinating! So, the ground surface roughness in the new data is really playing a major role in explaining variance in travel rates. It shouldn’t be removed from consideration!

Let’s try one more thing… Let’s try fixing the effects of slope on travel rate according to coefficients found in the AllTrails data, effectively just deriving new coefficients for vegetation density and ground surface roughness to see how well we can fit the experimental data.

# define starting coefficients
a <- slope.rate.terms$a[slope.rate.terms$percentile == 0.975]
b <- slope.rate.terms$b[slope.rate.terms$percentile == 0.975]
c <- slope.rate.terms$c[slope.rate.terms$percentile == 0.975]
d <- slope.rate.terms$d[slope.rate.terms$percentile == 0.975]
e <- slope.rate.terms$e[slope.rate.terms$percentile == 0.975]
f.start <- 1.6
g.start <- 43

# build model
mod.full <- nls(travel.rate ~ (c*(1/(pi*b*(1+((slope-a)/b)^2)))+d+e*slope) / (f * density + g * roughness + 1),
                    data = df,
                    start = list(f = f.start,
                                 g = g.start),
                    control = nls.control(maxiter = 1000))

# get starting coefficients
f.start <- coef(mod.full)[1][[1]]
g.start <- coef(mod.full)[2][[1]]

# leave one transect out cross-validation
tids <- unique(df$transect)
for (tid in tids){
  
  # split into training and test
  df.train <- df[df$transect != tid,]
  df.test <- df[df$transect == tid,]
  
  # build model
  mod.fold <- nls(travel.rate ~ (c*(1/(pi*b*(1+((slope-a)/b)^2)))+d+e*slope) / (f * density + g * roughness + 1),
                    data = df.train,
                    start = list(f = f.start,
                                 g = g.start),
                    control = nls.control(maxiter = 1000,
                                          minFactor = 1e-10,
                                          warnOnly = T))
  
  # apply model
  pred <- predict(mod.fold, df.test)
  obs <- df.test$travel.rate
  
  # compile results
  results.df.temp <- data.frame(pred = pred,
                             obs = obs,
                             f = coef(mod.fold)[1][[1]],
                             g = coef(mod.fold)[2][[1]])
  if (tid == tids[1]){
    results.df <- results.df.temp
  } else {
    results.df <- rbind(results.df, results.df.temp)
  }
}

# set up plot
par(las = 1, mar = c(5,5,1,1))

# define point size and color
pt.sizes <- 1 + df$density * 4
col.1 <- brewer.pal(3, "Set1")[1]
col.2 <- "white"
col.3 <- brewer.pal(3, "Set1")[2]
pt.cols <- colorRampPalette(c(col.1, col.2, col.3))(100)
pt.cols <- pt.cols[as.numeric(cut(df$slope, breaks = 100))]

# define x and y
x <- results.df$obs
y <- results.df$pred

# get axis limits
ax.min <- min(c(x, y))
ax.max <- max(c(x, y))

# plot the data
plot(y ~ x, type = "n", xlab = "Observed Travel Rate (m/s)", ylab = "Predicted Travel Rate (m/s)", xlim = c(ax.min, ax.max), ylim = c(ax.min, ax.max))
grid()
lines(c(-100,100),c(-100,100), col = "darkgray", lty = 2)
points(y ~ x, pch = 21, bg = pt.cols, cex = pt.sizes)

# get model performance metrics
mod <- lm(y ~ x)
abline(mod, lwd = 2)
r2 <- round(summary(mod)$adj.r.squared,2)
rrmse <- round(100*(sqrt(mean((y-x) ^ 2)))/mean(x),2)
r2.txt <- bquote(R^2==.(r2))
rrmse.txt <- bquote(rRMSE==.(rrmse)*"%")
bias <- round(mean(y-x),2)
bias.txt <- bquote(bias==.(bias)*"m/s")
a <- round(coef(mod)[2], 2)
b <- round(coef(mod)[1], 2)
if (b < 0){
  eq.txt <- bquote(y==.(a)~x-.(abs(b)))
} else {
  eq.txt <- bquote(y==.(a)~x+.(abs(b)))
}
legend("topleft",
       bty = "n",
       legend = c(eq.txt, r2.txt, rrmse.txt, bias.txt),
       x.intersp = 0)

# add legend for slope
pt.cols <- colorRampPalette(c(col.1, col.2, col.3))(100)
leg.img <- as.raster(matrix(rev(pt.cols), ncol = 1))
rasterImage(image = leg.img,
            xleft = par("usr")[1] + 0.9 * (par("usr")[2] - par("usr")[1]),
            ybottom = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
            xright = par("usr")[1] + 0.95 * (par("usr")[2] - par("usr")[1]),
            ytop = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]))
rect(xleft = par("usr")[1] + 0.9 * (par("usr")[2] - par("usr")[1]),
     ybottom = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
     xright = par("usr")[1] + 0.95 * (par("usr")[2] - par("usr")[1]),
     ytop = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]))
text(x = par("usr")[1] + 0.95 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.3 * (par("usr")[4] - par("usr")[3]),
     labels = "Slope (deg)",
     adj = c(1,0))
text(x = par("usr")[1] + 0.88 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]),
     labels = round(max(df$slope)),
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.88 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.15 * (par("usr")[4] - par("usr")[3]),
     labels = "0",
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.88 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
     labels = round(min(df$slope)),
     adj = c(1,0.5))

# add legend for density
text(x = par("usr")[1] + 0.75 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.3 * (par("usr")[4] - par("usr")[3]),
     labels = "Density",
     adj = c(1,0))
text(x = par("usr")[1] + 0.68 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]),
     labels = str_pad(round(max(df$density),2), 4, "right", "0"),
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.68 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.15 * (par("usr")[4] - par("usr")[3]),
     labels = str_pad(round(mean(range(df$density)),2), 4, "right", "0"),
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.68 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
     labels = str_pad(round(min(df$density),2), 4, "right", "0"),
     adj = c(1,0.5))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]),
       cex = max(pt.sizes))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.2 * (par("usr")[4] - par("usr")[3]),
       cex = mean(c(max(pt.sizes), mean(range(pt.sizes)))))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.15 * (par("usr")[4] - par("usr")[3]),
       cex = mean(range(pt.sizes)))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.1 * (par("usr")[4] - par("usr")[3]),
       cex = mean(c(min(pt.sizes), mean(range(pt.sizes)))))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
       cex = min(pt.sizes))

Cool, so a very high R² indicates that the prevailing trend in travel rates is being captured, and a bias of essentially zero suggests that the 97.5th percentile is a good representation of the study subjects’ speeds. But the relatively higher rRMSE and low regression line slope demonstrates that low travel rates are being overestimated and high travel rates are being underestimated. I interpret this as the model doing a good job at identifying the role that vegetation density and ground surface roughness are playing in moderating travel rates, but the AllTrails-derived slope-travel rate relationship simply being too moderate in comparison to the experimental data. That is, slope had a more profound effect in the experiments than in the AllTrails data.

I know this goes against what I discussed above – that the multiplicative approach of accounting for vegetation and roughness effects is superior to the additive approach. But, given these findings, that the inherently moderate nature of the AllTrails curves is limiting the ability of the model to capture extreme high and low travel rates well, I want to run a quick test to see if the additive function form can yield improved predictive performance.

# define starting coefficients
a <- slope.rate.terms$a[slope.rate.terms$percentile == 0.975]
b <- slope.rate.terms$b[slope.rate.terms$percentile == 0.975]
c <- slope.rate.terms$c[slope.rate.terms$percentile == 0.975]
d <- slope.rate.terms$d[slope.rate.terms$percentile == 0.975]
e <- slope.rate.terms$e[slope.rate.terms$percentile == 0.975]
f.start <- -1.6
g.start <- -43

# build model
mod.full <- nls(travel.rate ~ c*(1/(pi*b*(1+((slope-a)/b)^2)))+d+e*slope+f*density+g* roughness,
                    data = df,
                    start = list(f = f.start,
                                 g = g.start),
                    control = nls.control(maxiter = 1000))

# get starting coefficients
f.start <- coef(mod.full)[1][[1]]
g.start <- coef(mod.full)[2][[1]]

# leave one transect out cross-validation
tids <- unique(df$transect)
for (tid in tids){
  
  # split into training and test
  df.train <- df[df$transect != tid,]
  df.test <- df[df$transect == tid,]
  
  # build model
  mod.fold <- nls(travel.rate ~ c*(1/(pi*b*(1+((slope-a)/b)^2)))+d+e*slope+f*density+g* roughness,
                    data = df.train,
                    start = list(f = f.start,
                                 g = g.start),
                    control = nls.control(maxiter = 1000,
                                          minFactor = 1e-10,
                                          warnOnly = T))
  
  # apply model
  pred <- predict(mod.fold, df.test)
  obs <- df.test$travel.rate
  
  # compile results
  results.df.temp <- data.frame(pred = pred,
                             obs = obs,
                             f = coef(mod.fold)[1][[1]],
                             g = coef(mod.fold)[2][[1]])
  if (tid == tids[1]){
    results.df <- results.df.temp
  } else {
    results.df <- rbind(results.df, results.df.temp)
  }
}

# set up plot
par(las = 1, mar = c(5,5,1,1))

# define point size and color
pt.sizes <- 1 + df$density * 4
col.1 <- brewer.pal(3, "Set1")[1]
col.2 <- "white"
col.3 <- brewer.pal(3, "Set1")[2]
pt.cols <- colorRampPalette(c(col.1, col.2, col.3))(100)
pt.cols <- pt.cols[as.numeric(cut(df$slope, breaks = 100))]

# define x and y
x <- results.df$obs
y <- results.df$pred

# get axis limits
ax.min <- min(c(x, y))
ax.max <- max(c(x, y))

# plot the data
plot(y ~ x, type = "n", xlab = "Observed Travel Rate (m/s)", ylab = "Predicted Travel Rate (m/s)", xlim = c(ax.min, ax.max), ylim = c(ax.min, ax.max))
grid()
lines(c(-100,100),c(-100,100), col = "darkgray", lty = 2)
points(y ~ x, pch = 21, bg = pt.cols, cex = pt.sizes)

# get model performance metrics
mod <- lm(y ~ x)
abline(mod, lwd = 2)
r2 <- round(summary(mod)$adj.r.squared,2)
rrmse <- round(100*(sqrt(mean((y-x) ^ 2)))/mean(x),2)
r2.txt <- bquote(R^2==.(r2))
rrmse.txt <- bquote(rRMSE==.(rrmse)*"%")
bias <- round(mean(y-x),2)
bias.txt <- bquote(bias==.(bias)*"m/s")
a <- round(coef(mod)[2], 2)
b <- round(coef(mod)[1], 2)
if (b < 0){
  eq.txt <- bquote(y==.(a)~x-.(abs(b)))
} else {
  eq.txt <- bquote(y==.(a)~x+.(abs(b)))
}
legend("topleft",
       bty = "n",
       legend = c(eq.txt, r2.txt, rrmse.txt, bias.txt),
       x.intersp = 0)

# add legend for slope
pt.cols <- colorRampPalette(c(col.1, col.2, col.3))(100)
leg.img <- as.raster(matrix(rev(pt.cols), ncol = 1))
rasterImage(image = leg.img,
            xleft = par("usr")[1] + 0.9 * (par("usr")[2] - par("usr")[1]),
            ybottom = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
            xright = par("usr")[1] + 0.95 * (par("usr")[2] - par("usr")[1]),
            ytop = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]))
rect(xleft = par("usr")[1] + 0.9 * (par("usr")[2] - par("usr")[1]),
     ybottom = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
     xright = par("usr")[1] + 0.95 * (par("usr")[2] - par("usr")[1]),
     ytop = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]))
text(x = par("usr")[1] + 0.95 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.3 * (par("usr")[4] - par("usr")[3]),
     labels = "Slope (deg)",
     adj = c(1,0))
text(x = par("usr")[1] + 0.88 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]),
     labels = round(max(df$slope)),
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.88 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.15 * (par("usr")[4] - par("usr")[3]),
     labels = "0",
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.88 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
     labels = round(min(df$slope)),
     adj = c(1,0.5))

# add legend for density
text(x = par("usr")[1] + 0.75 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.3 * (par("usr")[4] - par("usr")[3]),
     labels = "Density",
     adj = c(1,0))
text(x = par("usr")[1] + 0.68 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]),
     labels = str_pad(round(max(df$density),2), 4, "right", "0"),
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.68 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.15 * (par("usr")[4] - par("usr")[3]),
     labels = str_pad(round(mean(range(df$density)),2), 4, "right", "0"),
     adj = c(1,0.5))
text(x = par("usr")[1] + 0.68 * (par("usr")[2] - par("usr")[1]),
     y = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
     labels = str_pad(round(min(df$density),2), 4, "right", "0"),
     adj = c(1,0.5))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.25 * (par("usr")[4] - par("usr")[3]),
       cex = max(pt.sizes))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.2 * (par("usr")[4] - par("usr")[3]),
       cex = mean(c(max(pt.sizes), mean(range(pt.sizes)))))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.15 * (par("usr")[4] - par("usr")[3]),
       cex = mean(range(pt.sizes)))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.1 * (par("usr")[4] - par("usr")[3]),
       cex = mean(c(min(pt.sizes), mean(range(pt.sizes)))))
points(x = par("usr")[1] + 0.715 * (par("usr")[2] - par("usr")[1]),
       y = par("usr")[3] + 0.05 * (par("usr")[4] - par("usr")[3]),
       cex = min(pt.sizes))

Well, would ya look at that. The additive function form does really well, even with the fixed 97.5th percentile slope-travel rate terms. The problem, of course, is that application of these coefficients to the prediction of travel rates to the average (50th percentile) slope-travel rate model will yield many negative travel rates, particularly when the ground surface roughness is fixed at the maximum among all of the transects:

# define new travel rate function that incorporates slope, vegetation density, and surface roughness
travelrate.fun <- function(slope,density,roughness,a,b,c,d,e,f,g){
  travelrate <- c * (1 / (pi * b * (1 + ((slope - a) / b) ^ 2))) + d + e * slope + f * density + g * roughness
  return(travelrate)
}

# get coefficients (medians from LOOCV)
a <- slope.rate.terms$a[slope.rate.terms$percentile == 0.975]
b <- slope.rate.terms$b[slope.rate.terms$percentile == 0.975]
c <- slope.rate.terms$c[slope.rate.terms$percentile == 0.975]
d <- slope.rate.terms$d[slope.rate.terms$percentile == 0.975]
e <- slope.rate.terms$e[slope.rate.terms$percentile == 0.975]
f <- median(results.df$f)
g <- median(results.df$g)

# get range of travel rate predictions
min.rate <- 100
max.rate <- -100
for (i in seq(0,1,0.1)){
  slopes <- seq(-45,45,0.1)
  densities <- rep(i, length(slopes))
  roughnesses <- rep(max(df$roughness), length(slopes))
  rates <- travelrate.fun(slopes, densities, roughnesses, a, b, c, d, e, f, g)
  if (min(rates) < min.rate) min.rate <- min(rates)
  if (max(rates) > max.rate) max.rate <- max(rates)
}

# plot out the results
cols <- viridis(11)
counter <- 0
par(las = 1, mar = c(5,5,1,8))
plot(x = c(-45,45), y = c(min.rate, max.rate), type = "n",
     xlab = "Slope (deg)", ylab = "Travel Rate (m/s)")
grid()
for (i in seq(0,1,0.1)){
  counter <- counter + 1
  slopes <- seq(-45,45,0.1)
  densities <- rep(i, length(slopes))
  roughnesses <- rep(max(df$roughness), length(slopes))
  rates <- travelrate.fun(slopes, densities, roughnesses, a, b, c, d, e, f, g)
  lines(rates ~ slopes, lwd = 2, col = cols[counter])
}
par(xpd = NA)
legend(x = par("usr")[2], y = mean(par("usr")[3:4]),
       legend = seq(0,1,0.1), title = "Veg Dens", lwd = 2, col = cols,
       bty = "n", yjust = 0.5)

Discussion

So, what can we take away from all of this? To me, the big takeaways are these:

On 100m transects, study subjects walk faster than the average hiker. This is why the 97.5th percentile from the AllTrails data turns out to be a decent fit for the data. So, we shouldn’t really trust the absolute travel rates contained within our experiments.
The effects of slope are more pronounced within the experimental data in comparison to crowdsoured AllTrails data. I think this has to do with the fact that the AllTrails data were extremely heavily biased towards lower slopes, with steep slopes (e.g., >30 degrees) being very rare in the data. Therefore, it could be argued that the relative effects of slope vs. travel rate are more reliable in the experimental data than the crowdsourced data. But, we need to be cautious about that conclusion, since this is only based on ~40 people on 33 different trails.
To my mind, the best solution to move forward with is the multiplicative hybrid between AllTrails and our experimental data. The fact that the fixed 97.5th percentile slope-travel rate model with the newly-modeled vegetation density and roughness terms performed so well (at least in terms of variance explained and bias) suggests that the model is capturing the relative effects of vegetation density and roughness well, but suffering on the prediction of slope effects. Our AllTrails study has already demonstrated an impressive capacity to accurately predict travel times as a function of slope. So, taking the slope-travel rate relationships from AllTrails, and mixing in newly-derived coefficients for vegetation density and ground surface roughness should, on average provide both good travel rate estimates and provide a robust accounting of the relative effects of landscape conditions on travel rates. The latter of the two points means it will still be very effective for least cost path mapping.

Conclusions

If we agree with point #3 above, then it’s worth generating a final set of coefficients that we can move forward with for our various projects that depend upon these numbers. **Taking the vegetation density and ground surface roughness coefficients from the fixed 97.5th slope-travel rate model above, but replacing the slope-travel rate coefficients with those from the 50th percentile model to represent the “average” hiker, here is the final model:

\(v = \frac{c\left(\frac{1}{\pi b\left(1+(\frac{slope-a}{b})^{2}\right)}\right)+d+e*slope}{f*density+g*roughness+1}\)

where:

a = -1.4578596
b = 22.0787183
c = 76.3270853
d = 0.0524843
e = -3.2001753^{-4}
f = 0.5816999
g = 7.44597

Here are some predictions with those numbers: