Splines in mrtool

In both meta-regression and regular regression models, incorporating a spline term is a way to estimate a non-linear relationship between a continuous covariate and the dependent variable. Here we focus on splines as implemented in the mrtool meta-regression package.

We might use a spline when a linear effect, which can be estimated with a single parameter describing the slope, fails to capture important information about the shape of the curve. A key decision in setting spline specifications is choosing how much flexibility to allow in the shape of the curve. Greater flexibility requires more data. As a general principle, we recommend starting simple and adding flexibility only when there is a specific reason. This approach helps to avoid overfitting and encourages modelers to track, document, and justify their modeling choices.

We recommend starting with the following specifications as a default: - Three internal knots (spline_knots = array(seq(0, 1, length.out = 5))); - Spacing knots according to data density (spline_knots_type = 'frequency'); - Quadratic (i.e. spline_degree = 2L) rather than cubic splines; - Constraining the left and right tails to be linear (spline_l_linear = TRUE and spline_r_linear = TRUE, respectively).

In the rest of the tutorial, we explain what these terms mean and demonstrate additional spline functionality available in the mrtool package. Additional functionality includes: - Monotonicity: e.g. forcing portions of the curve to be always decreasing (or more specifically, non-increasing); - Convexity and concavity; - Using Uniform and Gaussian priors; and - Ensemble splines.

Load the package

mrtool is a Python package that we access with the R package reticulate. To learn about reticulate, see Chapter 2. For instructions on how to install the mrtool Python package, see Chapter 1.

library(reticulate)
use_condaenv("mrtool-0.0.1")
mr <- import("mrtool")

Create simulated data

In the simulated data, the relationship between covariate x1 and dependent variable y2 is J-shaped. This shape is common in the field of epidemiology where the optimal level of exposure to something with respect to health risk might be a moderate amount. For example, x1 might be systolic blood pressure (SBP) and y2 might be the log-scale relative risk of heart disease (ignoring the scales).

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
set.seed(1)
k_studies <- 10
n_per_study <- 5
tau_2 <- 0.6
sigma_2 <- 0.2

df_sim_study <- data.frame(study_id = as.factor(1:k_studies)) %>%
  mutate(study_effect2 = rnorm(n = k_studies, mean = 0, sd = tau_2) )

df_sim_ch4 <- do.call("rbind", lapply(1:nrow(df_sim_study), function(i) {
  df_sim_study[rep(i, n_per_study), ] })) %>%
  mutate(
    x1 = runif(n = nrow(.), min = 0, max = 10),
    y2_true = 2 * sin(0.43*x1-2.9),
    y2 = y2_true - min(y2_true) + study_effect2 + rnorm(nrow(.), mean = 0, sd = sigma_2),
    y2_se = runif(n = nrow(.), min = sigma_2 - (0.2*sigma_2), max = sigma_2 + (0.2*sigma_2)),
    is_outlier = FALSE) %>%
  arrange(x1)

We plot the data with size proportional to the inverse of the standard error (SE). Observations with lower SE have less uncertainty and are therefore more heavily weighted in the model, so we want them to be bigger on the plot.

make_plot_ch4 <- function() {
  with(df_sim_ch4, plot(
    x = x1, 
    y = y2,
    cex = 1/y2_se * 0.2
  ))
  grid()
}

make_plot_ch4()

Format the data for mrtool

Here we tell mrtool that the dataset has one covariate (x1) and the variable identifying an observation’s group membership is study_id. Group membership is used for estimating random effects; see Chapter 3 for a description of random effects as estimated in mixed effects models like mrtool.

dat4 <- mr$MRData()
dat4$load_df(
  data = df_sim_ch4, col_obs = "y2", col_obs_se = "y2_se",
  col_covs = list("x1"), col_study_id = "study_id"
)

Understanding knot placement

The quantity and placement of knots affect the shape of the estimated curve. Choosing a greater number of knots gives the curve more flexibility in general, and clustering knots closely together provides relatively greater flexibility in that portion of the domain. As a starting point, we recommend placing three internal knots such that the segments between the knots have a roughly equal amount of data.

Here is how we specify this in mrtool; the relevant code is spline_knots_type = 'frequency'.

mod_4a <- mr$MRBRT(
  data = dat4,
  cov_models = list(
    mr$LinearCovModel("intercept", use_re = TRUE),
    mr$LinearCovModel(
      alt_cov = "x1",
      use_spline = TRUE,
      spline_degree = 2L, # 2L is quadratic
      spline_knots = array(seq(0, 1, length.out = 5)), # length.out=5 means 3 internal knots
      spline_knots_type = 'frequency', # 'frequency' means spaced according to data density
      spline_r_linear = TRUE,
      spline_l_linear = TRUE
    )
  )
)
mod_4a$fit_model(inner_print_level = 4L, inner_max_iter = 1000L)

This code is just defining a function for plotting spline predictions, which we will use throughout the tutorial.

plot_spline_results <- function(mod_tmp, show_knots = TRUE) {
  df_pred_tmp <- data.frame(x1 = seq(0, 10, by = 0.1))
  dat_pred_tmp <- mr$MRData()
  dat_pred_tmp$load_df(
    data = df_pred_tmp, 
    col_covs=list('x1')
  )
  df_pred_tmp$pred_tmp <- mod_tmp$predict(data = dat_pred_tmp)
  with(df_pred_tmp, lines(x1, pred_tmp))
  if (show_knots) {
    for (k in mod_tmp$cov_models[[2]]$spline_knots) abline(v = k, col = "gray")
  }
}

Plotting the result, we notice that the knots (vertical lines) are not spaced evenly across x1. This happens because, by random chance, some portions of the x1 domain have greater data density than others. Points are more concentrated around 4 than 2, for example. We prefer to place the knots according to data density so that no individual segment is subject to data sparsity.

make_plot_ch4()
plot_spline_results(mod_tmp = mod_4a)

If spacing the knots equally over the domain is desired, change spline_knots_type = 'frequency' to spline_knots_type = 'domain' as below:

mod_4b <- mr$MRBRT(
  data = dat4,
  cov_models = list(
    mr$LinearCovModel("intercept", use_re = TRUE),
    mr$LinearCovModel(
      alt_cov = "x1",
      use_spline = TRUE,
      spline_degree = 2L, # 2L is quadratic
      spline_knots = array(seq(0, 1, length.out = 5)), # length.out=5 means 3 internal knots
      spline_knots_type = 'domain',
      spline_r_linear = TRUE,
      spline_l_linear = TRUE
    )
  )
)
mod_4b$fit_model(inner_print_level = 4L, inner_max_iter = 1000L)

The difference between choosing "frequency" and choosing "domain" is small in this example, but in the applied setting it can make a big difference. The key point is that if you are using "domain", you will want to make sure that all segments have sufficient data to inform the shape of the curve. In the extreme, with 1 data point or no data in a segment, this can lead the model not to converge. With less extreme forms of data sparsity, fitting a flexible curve based on insufficient data can still lead to overfitting and instability in the predictions. For that reason, we typically only recommend using "domain" for a specific reason identified by the modeler.

In the section below regarding ensemble splines, we describe another approach to knot selection. Briefly, the approach is to randomly generate knot sets according to user-defined rules, fit a submodel with each individual knot set, score each knot set according to a performance metric, and make predictions as a performence-weighted average of submodel-specific predictions. Users might prefer this approach in general or simply as a data-driven way to identify knot locations.

make_plot_ch4()
plot_spline_results(mod_tmp = mod_4b)

In the rest of the tutorial, spline_knots_type will be set to “frequency”.

So far, we have been discussing “internal” knots but not “external” knots. External knots are by definition placed at the minimum and maximum of the observed covariate values. This detail is ignorable in most use cases but can be important for models without intercepts. That is, in general for no-intercept models we expect the prediction to pass through the origin when all covariates are set to zero (i.e. \(E[y|x=0]=0\)). In this case, however, please note that the prediction would only be zero by definition when all non-spline covariates are set to zero and spline covariates are set to their minimum observed values.

Increasing the flexibility of the curve

The main methods for increasing the flexibility of a spline curve in mrtool are: 1) increase the number of knots; 2) increase the degree of the spline, converting it from quadratic to cubic; and 3) remove linearity in the tails, if applicable.

Increasing the number of knots

The following example increases the number of internal knots from 3 to 10, which is intentionally excessive. A typical number of internal knots is 3 to 5, but more may be desired for especially complex curve shapes. The relevant code is spline_knots = array(seq(0, 1, length.out = 12)). Note that specifying length.out = 12 gives 10 internal knots because two of them are the external knots.

mod_4c1 <- mr$MRBRT(
  data = dat4,
  cov_models = list(
    mr$LinearCovModel("intercept", use_re = TRUE),
    mr$LinearCovModel(
      alt_cov = "x1",
      use_spline = TRUE,
      spline_degree = 2L, # 2L is quadratic
      spline_knots = array(seq(0, 1, length.out = 12)), # length.out=12 means 10 internal knots
      spline_knots_type = 'frequency',
      spline_r_linear = TRUE,
      spline_l_linear = TRUE
    )
  )
)
mod_4c1$fit_model(inner_print_level = 4L, inner_max_iter = 1000L)
make_plot_ch4()
plot_spline_results(mod_tmp = mod_4c1)

Increasing the degree (2=quadratic, 3=cubic)

This section has the usual number of knots but increases the degree from 2L to 3L. In mathematical language, this changes it from a quadratic spline to a cubic spline. The use of higher degrees is technically possible but not recommended.

mod_4c2 <- mr$MRBRT(
  data = dat4,
  cov_models = list(
    mr$LinearCovModel("intercept", use_re = TRUE),
    mr$LinearCovModel(
      alt_cov = "x1",
      use_spline = TRUE,
      spline_degree = 3L, # 3L is cubic
      spline_knots = array(seq(0, 1, length.out = 5)), 
      spline_knots_type = 'frequency',
      spline_r_linear = TRUE,
      spline_l_linear = TRUE
    )
  )
)
mod_4c2$fit_model(inner_print_level = 4L, inner_max_iter = 1000L)
make_plot_ch4()
plot_spline_results(mod_tmp = mod_4c2)

Removing linearity in the tail segments

The following example removes linearity in the tail segments. This is typically not recommended except where there is strong evidence that the non-linear relationship has important variation at the lowest and/or highest end of the spline covariate. One example is that age-specific mortality risk is indeed usually high for infants compared to young children, but then mortality risk starts increasing consistently with increasing age.

mod_4c3 <- mr$MRBRT(
  data = dat4,
  cov_models = list(
    mr$LinearCovModel("intercept", use_re = TRUE),
    mr$LinearCovModel(
      alt_cov = "x1",
      use_spline = TRUE,
      spline_degree = 3L, # 3L is cubic
      spline_knots = array(seq(0, 1, length.out = 5)), 
      spline_knots_type = 'frequency',
      spline_r_linear = FALSE,
      spline_l_linear = FALSE
    )
  )
)
mod_4c3$fit_model(inner_print_level = 4L, inner_max_iter = 1000L)

We observe that without linearity in the tails (and using a cubic spline just for illustration), the model is overfitting the data at the high end of x1. This happens for the same reason that flexible models in general need a lot of data; data sparsity is leading to overfitting. It tracks too closely to the few observations that exist in that part of the domain. We expect data sparsity at the extremes by definition, and for that reason removing flexibility by using linear tails is the default.

make_plot_ch4()
plot_spline_results(mod_tmp = mod_4c3)

Monotonicity

Monotonicity constraints are for requiring that the effect of a spline covariate be increasing or decreasing with respect to the covariate. Technically, a spline with a decreasing monotonicity constraint is actually non-increasing, but we use the term “decreasing” for convenience.

The key argument for monotonicity is prior_spline_monotonicity. The allowed values are “increasing” and “decreasing”.

Here is an example with “increasing”.

mod_4d <- mr$MRBRT(
  data = dat4,
  cov_models = list(
    mr$LinearCovModel("intercept", use_re = TRUE),
    mr$LinearCovModel(
      alt_cov = "x1",
      use_spline = TRUE,
      spline_degree = 2L, # 2L is quadratic
      spline_knots = array(seq(0, 1, length.out = 5)), # length.out=5 means 3 internal knots
      spline_knots_type = 'frequency',
      spline_r_linear = TRUE,
      spline_l_linear = TRUE,
      prior_spline_monotonicity = "increasing"
    )
  )
)
mod_4d$fit_model(inner_print_level = 4L, inner_max_iter = 1000L)

A modeler might choose an increasing monotonicity constraint like this if, for example, a non-increasing effect is biologically implausible or otherwise not defensible from a scientific standpoint.

make_plot_ch4()
plot_spline_results(mod_tmp = mod_4d)

Convexity and concavity

A curve is convex if the second derivative (rate of change in the slope) is always \(>=0\); this looks like the letter “U”. A curve is concave is the second derivative is always \(<=0\); this looks like a dome. These specifications are not commonly used in health research, but can make the model more efficient if the particular shape can be assumed. In general, as we provide more information to the model a priori in the form of priors and constraints, the information content of the data can be used to more effectively learn about other aspects of the curve.

Here is an example of specifying a convexity constraint. The key argument is prior_spline_convexity. The allowed values are “convex” and “concave”.

mod_4e <- mr$MRBRT(
  data = dat4,
  cov_models = list(
    mr$LinearCovModel("intercept", use_re = TRUE),
    mr$LinearCovModel(
      alt_cov = "x1",
      use_spline = TRUE,
      spline_degree = 2L, # 2L is quadratic
      spline_knots = array(seq(0, 1, length.out = 5)), # length.out=5 means 3 internal knots
      spline_knots_type = 'frequency',
      spline_r_linear = TRUE,
      spline_l_linear = TRUE,
      prior_spline_convexity = "convex"
    )
  )
)
mod_4e$fit_model(inner_print_level = 4L, inner_max_iter = 1000L)

The J shape that underlies in the simulated data is already somewhat convex, so the addition of the convexity constraint does not make much of a difference.

make_plot_ch4()
plot_spline_results(mod_tmp = mod_4e)

Just for fun, let’s see what a concavity constraint would look like (code not shown; it includes prior_spline_convexity = "concave").

Because the data are clearly not dome-shaped, the best the model can do with a concavity constraint is a straight line.

Uniform and Gaussian priors

We can also alter the shape of the curve by applying Bayesian priors directly to the coefficients of the spline. Setting priors directly on the spline coefficients requires a fairly in-depth understanding of how B-splines are constructed and what the coefficients represent. Because this is outside the scope of the tutorial, we refer readers to the IHME blog post on splines which provides a more mathematical treatment of the topic. We give two examples here. For a full list of the prior options available, see py_help(mr$LinearCovModel); this assumes that you have run mr <- import("mrtool") as above.

As a side note, this section lays the groundwork for cascading splines, the topic of Chapter 5. Briefly, the method involves using spline coefficients estimated from a full-data model as priors on the spline coefficients for models fit to subsets of the data. In this way, information about the shape of the curve is passed hierarchically to different portions of the feature space.

Specifying a prior on the maximum order derivative

In both examples, we set priors on the maximum order derivative of a spline segment. We know how many spline segments there are based on how many internal knots there are. For example, using three internal knots splits the domain into four chunks, so the object we give to the model needs four priors (or one prior that for convenience will be used for all segments). Note that Uniform priors and Gaussian priors both have two parameters that define the prior. Uniform priors have a lower bound and an upper bound; Gaussian priors have a mean and standard deviation. Correspondingly, the structure of the object we pass to the model is a 2-by-n matrix, where n is the number of segments. The first row is either the lower bound (Uniform prior) or a mean (Gaussian prior), and the second row is either an upper bound (Uniform prior) or a standard deviation (Gaussian prior).

Without getting into too much math, there are two concepts that together give a good intuition for what the maximum order derivative represents. First, when the maximum order derivative for a cubic spline segment is set to exactly zero, it becomes a quadratic spline segment. Recall that quadratic splines have less flexibility than cubic splines. So essentially the strength of a prior with mean 0 acts as a tuning knob for reducing the flexibility of a spline. We call this a dampening prior, and it is the topic of the second example in this section. Second, for linear segments, as in piecewise linear splines (i.e. spline_degree = 1L) or linear tails, the maximum order derivative is nothing other than the slope of the line. So setting a prior on the maximum order derivative for linear segments is giving information to the model about what the slope should be. This is a topic of the first example.

Uniform prior on a linear tail

For the first example, we set a strict uniform prior on the slope of a linear tail segment. Specifically, we force the right linear tail to have a slope of \(-1\), which is contrary to the prevailing trend in the data. We do this with the prior_spline_maxder_uniform argument to the mr$LinearCovModel() function. We use 4 internal knots for this example, yielding 5 spline segments. Correspondingly, we pass a 2-by-5 matrix to the model with information about priors. Note that setting Uniform priors with a lower bound of -Inf and an upper bound of Inf effectively provide no prior information to the model.

mod_4g <- mr$MRBRT(
  data = dat4,
  cov_models = list(
    mr$LinearCovModel("intercept", use_re = TRUE),
    mr$LinearCovModel(
      alt_cov = "x1",
      use_spline = TRUE,
      spline_degree = 2L, # 2L is quadratic
      spline_knots = array(seq(0, 1, length.out = 6)), # length.out=5 means 3 internal knots
      spline_knots_type = 'frequency', # 'frequency' means spaced according to data density
      spline_r_linear = TRUE,
      spline_l_linear = TRUE,
      prior_spline_maxder_uniform = rbind(c(-Inf,-Inf,-Inf,-Inf,-1), c(Inf,Inf,Inf,Inf,-1))
    )
  )
)
mod_4g$fit_model(inner_print_level = 4L, inner_max_iter = 1000L)

In the plot below, we see the effects of setting the slope of the right linear tail to exactly \(-1\). Not only is this segment going against the trend in the data; it also has indirect effects on the fit in the adjacent segment. This happens due to the requirement in spline models generally that, exactly at a given knot location, the slope for adjacent spline segments must be equal to each other.

make_plot_ch4()
plot_spline_results(mod_tmp = mod_4g)

For the second example, we reduce the flexibility of a quadratic spline by using dampening priors. By using Gaussian priors with mean zero on the highest order derivative, we effectively create a situation where the spline is somewhere between quadratic (i.e. spline_degree = 2L) and piecewise linear (i.e. spline_degree = 1L). We remove the linear tail constraint in this example because otherwise, as we saw in the previous example, the mean-zero prior would bias the slope toward flatness (i.e. toward first derivative equals zero).

Recall the instance from above where we used too many knots, leading to too much wiggliness.

make_plot_ch4()
plot_spline_results(mod_tmp = mod_4c1)

Ten internal knots means that there are 11 spline segments, which is a lot. Dampening priors reduce the wiggles. Note that the choice of 0.1 for the prior’s standard deviation is arbitrary; lower values make the spline more like piecewise linear, and higher values make the spline more like quadratic.

mod_4c1_dampened <- mr$MRBRT(
  data = dat4,
  cov_models = list(
    mr$LinearCovModel("intercept", use_re = TRUE),
    mr$LinearCovModel(
      alt_cov = "x1",
      use_spline = TRUE,
      spline_degree = 2L, # 2L is quadratic
      spline_knots = array(seq(0, 1, length.out = 12)), # length.out=12 means 10 internal knots
      spline_knots_type = 'frequency',
      spline_r_linear = FALSE,
      spline_l_linear = FALSE,
      prior_spline_maxder_gaussian = array(c(0, 0.1))
    )
  )
)
mod_4c1_dampened$fit_model(inner_print_level = 4L, inner_max_iter = 1000L)

Hm. This worked even better than I expected. We should probably being using this approach more… We get the dual benefit of reasonable extrapolation (like piecewise linear splines) and differentiability at the knots (like quadratic splines).

make_plot_ch4()
plot_spline_results(mod_tmp = mod_4c1_dampened)

Ensemble splines

An ensemble spline is a performance-weighted average of predictions from several submodels, each of which incorporates a spline with different knot locations. Knot locations are generated randomly for each submodel, with randomness subject to a set of contraints defined by the user. Constraints include the minimum allowable distance between adjacent knots and the minimum width of the tail segment (expressed as a proportion of the full domain of the covariate).

Specifying random knots

The function for generating random knot sets is mr$utils$sample_knots(), which takes arguments num_intervals, knot_bounds, interval_sizes, and num_samples. Please see py_help(mr$utils$sample_knots) for the Python docstring describing each input (or search for sample_knots in the GitHub repo at https://github.com/ihmeuw-msca/mrtool, which also has the docstring information). For simplicity, this tutorial focuses only on how to generate a knot set given a minimum distance between the knots.

Here is a wrapper function that generates random knots given a user-defined minimum distance between the knots:

get_knots_mindist <- function(n_internal_knots, min_distance_between_knots, how_many = 1) {
  if (0) {
    n_internal_knots <- 3
    min_distance_between_knots <- 0.05
    how_many <- 2
  }
  
  n_intervals_tmp <- n_internal_knots + 1
  if (min_distance_between_knots > (1 / n_intervals_tmp)) {
    stop("The minimum distance between knots cannot be greater than 1 divided by the number of intervals")
  }
  
  nknots <- n_internal_knots
  intervals_tmp <- matrix(NA, nrow = n_intervals_tmp, ncol = 2)
  intervals_tmp[,1] <- min_distance_between_knots
  intervals_tmp[,2] <- 1
  
  mr$utils$sample_knots(
    num_intervals = as.integer(n_internal_knots) + 1L, 
    knot_bounds = do.call("rbind", lapply(1:nknots, function(x) c(0, 1))), 
    interval_sizes = intervals_tmp, 
    num_samples = as.integer(how_many)
  )
  
}

The function allows for knots to be placed anywhere along the domain, from 0 (lowest observed covariate value) to 1 (highest observed covariate value). Note that, by definition, the minimum distance between knots cannot be greater than 1 divided by the number of spline segments (a.k.a. intervals). Within that constraint, a lower minimum distance leads to greater variation in knot sets. Let’s see some examples.

First, to make the selection of knots reproducible, we need to set the random seed. Because the underlying code is in Python, we need to use reticulate to access numpy’s random seed function.

np <- import("numpy")
np$random$seed(1L)

If we have 3 internal knots, that implies 4 segments, so the highest allowed minimum distance between knots is \(1/4 = 0.25\). If we set the minimum distance to just below 0.25, there is little variation in the knot sets.

knots1 <- get_knots_mindist(
    n_internal_knots = 3,
    min_distance_between_knots = 0.24,
    how_many = 10
)
knots1
##       [,1]      [,2]      [,3]      [,4] [,5]
##  [1,]    0 0.2566763 0.5078633 0.7599954    1
##  [2,]    0 0.2421767 0.5100834 0.7563065    1
##  [3,]    0 0.2463720 0.5105013 0.7525496    1
##  [4,]    0 0.2447849 0.4973761 0.7432322    1
##  [5,]    0 0.2470826 0.4919579 0.7589045    1
##  [6,]    0 0.2456554 0.4988367 0.7433078    1
##  [7,]    0 0.2423086 0.4902788 0.7543845    1
##  [8,]    0 0.2551559 0.4964255 0.7474630    1
##  [9,]    0 0.2716538 0.5158695 0.7565982    1
## [10,]    0 0.2452310 0.4901053 0.7584378    1

And if we set the minimum distance to a lower number, there is greater variation.

knots2 <- get_knots_mindist(
    n_internal_knots = 3,
    min_distance_between_knots = 0.12,
    how_many = 10
)
knots2
##       [,1]      [,2]      [,3]      [,4] [,5]
##  [1,]    0 0.2878356 0.4297331 0.8288596    1
##  [2,]    0 0.2331778 0.5134017 0.7159319    1
##  [3,]    0 0.4674706 0.6734652 0.8704901    1
##  [4,]    0 0.1210289 0.2468211 0.4909539    1
##  [5,]    0 0.2121534 0.4417281 0.8263225    1
##  [6,]    0 0.2002253 0.3677556 0.7273206    1
##  [7,]    0 0.1775440 0.6679009 0.8699292    1
##  [8,]    0 0.1480376 0.4350432 0.7699534    1
##  [9,]    0 0.3478695 0.6893284 0.8522515    1
## [10,]    0 0.3501400 0.6262657 0.8037011    1

To use ensemble splines, we call the function MRBeRT() rather than MRBRT(). The MRBeRT() function takes two extra arguments and is otherwise the same as MRBRT(). The first extra argument ensemble_cov_model takes the LinearCovModel() function call that specifies the spline. Correspondingly, this particular LinearCovModel function call is no longer given to cov_models as in the examples up to this point. The second extra argument is ensemble_knots, which takes as its input the knot sets given by mr$utils$sample_knots() (or the wrapper function get_knots_mindist(), whichever one you use).

Here is an example of specifying an ensemble spline model:

mod_4h <- mr$MRBeRT(
  data = dat4, 
  ensemble_cov_model = mr$LinearCovModel(
    alt_cov = "x1",
    use_spline = TRUE,
    spline_degree = 2L, 
    spline_knots = array(seq(0, 1, length.out = 5)), 
    spline_knots_type = 'frequency', 
    spline_r_linear = TRUE,
    spline_l_linear = TRUE ),
  ensemble_knots = knots2,
  cov_models = list(
    mr$LinearCovModel("intercept", use_re = TRUE)
  )
)
mod_4h$fit_model(inner_print_level = 0L, inner_max_iter = 1000L)

Please note that spline_knots_type is still relevant here as it was before. The knots will be placed differently if "domain" is chosen instead of "frequency".

The syntax for obtaining predictions is the same, even though the underlying structure of the model has changed. This is a nice benefit of object-oriented programming in Python.

make_plot_ch4()
plot_spline_results(mod_tmp = mod_4h, show_knots = FALSE)

# note that the `plot_spline_results()` function was defined above 
# and is only used for purposes of the tutorial

From a fitted model object, we can obtain the inputted knot locations with mod_4h$ensemble_knots. And we can obtain the ensemble weight corresponding to each knot set with:

mod_4h$weights
##  [1] 6.309879e-02 1.032125e-01 0.000000e+00 1.425371e-01 1.204004e-01
##  [6] 1.409692e-01 7.326226e-02 3.563428e-01 4.466001e-05 1.321533e-04

If the ensemble spline method was only used as a way to optimize knot locations for a regular MRBRT() model, the single highest-performing knot set can be obtained with, for example:

best_knots1 <- mod_4h$ensemble_knots[which.max(mod_4h$weights), ]
best_knots1
## [1] 0.0000000 0.1480376 0.4350432 0.7699534 1.0000000