Assessing inter-rater and intra-rater agreement

My scripts:

rendered html versions: Rpubs/thomas-weissensteiner
.rmd files with executable code chunks: www.github.com/thomas-weissensteiner/portfolio/tree/main/

About myself: www.linkedin.com/in/ThomasWs-Mopfair

1. Background and motivation

1.1. The challenge

“Wonderful Wednesdays” are a monthly webinar organised by the Visualisation Special Interest Group of Statisticians in the Pharmaceutical Industry (PSI). Participants are invited to address a visualisation challenge by submitting images and code which are discussed at the following meeting.

Here, the task involved comparing two methods of measurement, based on data from a paper by Eliasziw et al., 1994 ¹:

one therapist uses two different instruments (electro and manual goniometers) to measure knee joint angles
twenty-nine subjects (n=29) were measured three consecutive times on each goniometer at full passive joint extension

The visualisation should answer

do the two goniometers agree?
do the goniometers perform consistently across the three repeated measurements?

A video and blog post of the webinar are available from the PSI website.

1.2. Reliability and agreement

In the terminology that will be used here, the level of inter-rater agreement means how frequently two or more raters or methods can be expected to give exactly the same result. Inter-rater reliability refers to the consistency with between raters, or the degree by which their results are proportional when expressed as deviation from the mean. Therefore, it can be expressed as a correlation coefficient or analysis of variance. Reliability can be considered as a necessary but not sufficient condition to demonstrate agreement. Agreement and reliability are also sometimes referred to as absolute and relative reliability, respectively ².

Different methods for assessing the agreement and/or reliability of measurements exists. The most common indices in the medical literature are

hypothesis testing for bias, e.g comparing means (paired t-test)
correlation coefficients, e.g. Pearson’s r (reliability)
standard error of measurement
repeatability coefficient
Bland and Altman limits of agreement (mainly for agreement)
These methods do not have clear definitions of a null hypothesis. Rather, they are used to test for levels of agreement or reliability that should be defined as clinically acceptable prior to the analysis ^2,3.
x In the following sections, I will focus on visualisations of means and standard errors, Bland-Altman limits with exact confidence intervals ⁴, Probability of Agreement ⁵, and a new method that introduces statistical tests for accuracy, precision, and agreement ⁶.

2. R packages and data set

Data, first 7 lines

# - libraries_and_data                                                       - #

# Data manipulation
library(readxl)       # version 1.4.3  # read_excel
library(dplyr)        # version 1.1.4  # general grammar
library(purrr)        # version 1.0.2  # list_rbind, set_names
library(tidyr)        # version 1.3.1  # pivot_longer, pivot_wider

# Graphics
library(gridExtra)    # version 2.3    # tableGrob
library(ggplot2)      # version 3.4.0 
library(ggside)       # version 0.3.1  # geom_ysidedensity (density side plots)
library(ggiraph)      # version 0.8.10 # girafe, geom_..._interactive (interactive plots)
library(patchwork)    # version 1.3.0  # plot_spacer, plot_layout, plot_annotation (assembly of interactive plots)

# Visual and statistical assessments of accuracy, precision and agreement
library(eirasagree)    # version 5.1 

library(kableExtra)   # rendering tables in HTML format


# Working directory

setwd("C:/Users/ThomasWeissensteiner/OneDrive/Documents/portfolio/WWchallenges/WWchallenge_57")


# Data, were downloaded manually into the working directory from https://github.com/VIS-SIG/Wonderful-Wednesdays/tree/master/data/2024/2024-11-13

data <- read_excel("goniometer.xlsx")

data <- 
  data %>% 
  rename(
    Id = id, 
    Replicate = time, 
    Method = rater, 
    Value = y
    ) %>% 
  mutate(
    Replicate =  case_when(
      Replicate == "1" ~"1st",
      Replicate == "2" ~"2nd",
      Replicate == "3" ~"3rd"
      ) %>% factor,
    Method =  case_when(
      Method == "manual" ~"Manual",
      Method == "electro" ~"Electro"
      ) %>% 
      factor(
        levels = c("Manual", "Electro") 
        ),
      Id = factor(Id)
    )

# Data with new variable names

data %>% .[1:7,]

## # A tibble: 7 × 4
##   Id    Replicate Method  Value
##   <fct> <fct>     <fct>   <dbl>
## 1 1     1st       Manual     -2
## 2 1     1st       Electro     2
## 3 1     2nd       Manual      0
## 4 1     2nd       Electro     1
## 5 1     3rd       Manual      1
## 6 1     3rd       Electro     1
## 7 2     1st       Manual     16

3. Distribution of measured values

3.1. Raw data: consecutive replicate measurements with two different methods

# - Chunk_1                                                           - #
# - Requires:  * R-libraries and object data (libraries_and_data)     - #


## Plot of measurements in individual subjects versus replicate number

p1 <- 
  data %>%  
  ggplot(
    aes(
      x = as.numeric(Replicate), 
      y = Value, 
      color = Id,
      data_id = Id,
      tooltip = Id
      )
    ) +
  geom_line(color = "lightgrey") +
  geom_point_interactive() +
  facet_wrap(.~Method) +
  scale_x_continuous(
    breaks = 1:3,
    labels = levels(data$Replicate)    
    ) +
  xlab("Replicate") +
  ylab("Measured value") +
  labs(
    title = "Consecutive replicate measurements in individual patients",
    caption = "Hover over a point to highlight individual patient ids"
    ) +
  theme_light() +
  theme(
    plot.title = element_text(hjust = 0.5),
    strip.background = element_blank(),
    strip.text.x = element_text(
      color = "black",
      size = 10
      ),
    legend.position = "none",
    plot.caption = element_text(hjust=0.5)
    ) 

  
girafe(
  ggobj = p1,
    options = list(
    opts_hover(css = ''),                       # CSS code of selected line
    opts_hover_inv(css = "opacity:0.05;"),      # CSS code of all other lines
    opts_sizing(rescale = FALSE)
    )
  )

Manual measurements were spread over a greater range and tended to be somewhat higher than the values obtained with the electro method.
Raw data show no obvious trends across repeated measurements for either method.
Note on coding: There is presently no reliable function in ggiraph for interactive line plots, i.e. lines that cross each other tend to respond as a group. As a substitute, I used geom_point_interactive although coloured lines would have been visually more appealing.

3.2. Comparing means, variances, and repeatabilities of the original measurements

Plotting the distribution of the data can help checking whether two statistical assumptions of the Bland-Altman method are met: Variance for both methods should a) be of similar magnitude, and b) not depend on the true latent trait ^3,7. Mean and standard error give a first indication of method agreement.

# - Chunk_2                                                             - #
# - Requires:  * R-libraries and object data (libraries_and_data)       - #


# Standard error function

std <- function(x) sd(x)/sqrt(length(x))

# Calculate intra-subject replicate means

replicates <- data %>% .$Replicate %>% levels

data_replMean <- 
  data %>%
  pivot_wider(
    names_from = "Replicate",
    values_from = "Value") %>% 
  rowwise() %>% 
  mutate(
      replicate_Mean = mean(
        c_across(
          all_of(replicates)
          )
        ),
      replicate_SD = sd(
        c_across(
          all_of(replicates) 
          )
        ),
      replicate_SE = std(
        c_across(
          all_of(replicates) 
          )
        )
      ) %>% 
  pivot_longer(
    cols = all_of(replicates),
    names_to = "Replicate",
    values_to = "Value"
  ) %>% 
  mutate(
    Replicate = factor(Replicate)
  )

## Table with summary statistics of the original sample data

# Generate summary table

data_replMeanMean <- 
  data_replMean %>%
  { rbind(
      group_by (., Method, Replicate) %>%     # for each method and replicate, calculate mean and SE across subjects
      summarise(
        Mean = mean(Value),
        SD = sd(Value),
        SE = std(Value)
        ),
      group_by (., Method) %>%           # for each method, calculate means across subjects for mean and SE across replicates 
      summarise(
        Mean = mean(replicate_Mean),
        SD = sd(replicate_Mean),
        SE = std(replicate_Mean)
        ) %>% 
        mutate(Replicate = "replicate mean")
      )
  } %>% 
  mutate(                 # vertical offset for error bars and their means in the plots
    errorbar_pos = rep(- 0.06, n()),
    Replicate = factor(Replicate)
    )                                   


## Plots 

palette_1 <- c("blue", "red")

# Distribution, mean and SE for individual replicates and their means 

p2A <- 
data_replMean %>% 
  bind_rows(
    data_replMean %>%
    filter(Replicate == "1st") %>%          # select a unique row for each subject's replicate mean
    mutate(
    .keep = "unused",
    Replicate = "replicate mean",
    Value = replicate_Mean)
  ) %>%
  select(Id, Replicate, Method, Value) %>%            
  ggplot(
    ) +
  geom_density(                             # show smoothed distribution of measurement values
    aes(
      Value * 0.7, 
      fill = Method
      ),
    alpha = 0.3,
    color = "grey", 
    position = position_nudge(y = 0.02)
    ) +
  geom_point_interactive(                   # show data points of all subjects
    aes(
      x = Value,
      y = -0.02,                            # vertical offset for data point rows
      color = Method,
      data_id = Id,
      tooltip = Id
      ),
    size = 3,
    alpha = 0.5
    ) +
  geom_errorbarh (                          # show SE bar
    data = data_replMeanMean,
    #inherit.aes = FALSE,
    aes(
      xmin = Mean - SE, 
      xmax = Mean + SE,
      y = errorbar_pos - 0.025,
      color = Method
      ),
      height = 0.025                       # height of bar caps
    ) +
  geom_errorbarh (                         # show SD bar
    data = data_replMeanMean,
    #inherit.aes = FALSE,
    aes(
      xmin = Mean - SD, 
      xmax = Mean + SD,
      y = errorbar_pos,
      color = Method
      ),
      height = 0.04                        # height of bar caps
    ) +
  geom_point(                              # add means to SE bars
    data = data_replMeanMean,
     aes(
      x = Mean,             
      y = errorbar_pos,
      color = Method
      ),
      shape = 18,
      size = 3
    ) +  
  facet_wrap(
    .~ Replicate, 
    dir = "v", 
    strip.position = "left",
    nrow = 4
    ) +
  labs(
    title = "Inter-method comparison of mean, SD and SE",
    y = "Replicate",
    x = "Measurement",
    tag = "A"
    ) +
  scale_fill_manual( values = palette_1) +
  scale_color_manual(values = palette_1) +
  guides(
    fill = guide_legend(title="Method"),
    color = guide_legend(title="Method")
    ) +
  theme_light() +
  theme(
    plot.title = element_text(
      size = 12,
      hjust = 0.5),
    axis.text.y = element_blank(), 
    strip.background = element_rect(
      fill = "white", 
      color = "darkgrey"
      ), 
    axis.ticks.y = element_blank(),
    strip.text.y = element_text(
      color = "black",
      size = 12
      )
    ) 


# Consistency of results over repeated measurements

p2B <- 
data_replMeanMean %>% 
  filter (!Replicate == "replicate mean") %>% 
  mutate(
    SEup = Mean + SE,
    SElow = Mean - SE
    ) %>% 
  select(!c(errorbar_pos, SD, SE) ) %>% 
  ggplot(
    aes(
      x = as.numeric(Replicate), 
      y = Mean, 
      #group = Method, 
      color = Method, 
      fill = Method
      )
    ) + 
  geom_ribbon(
    aes(ymin =SElow, ymax = SEup), 
    linetype = 0, 
    alpha = 0.2
    ) +
  geom_line() + 
  scale_color_manual(values = palette_1) +
  scale_fill_manual(values = palette_1) +
  scale_x_continuous(
    breaks = 1:3,
    labels = levels(data$Replicate)    
    ) +
  labs(
    title = "Repeatability over consecutive 
    replicate measurements",
    x = "Replicate",
    y = "Mean +/- SE",
    tag = "B"
    ) +
  theme_light() +
  theme(
    plot.title = element_text(
      size = 12,
      hjust = 0.5
      ),
    legend.position = "none"
    )


# Variance versus value of replicate means

p2C <- 
data_replMean %>%
  filter(Replicate == "1st") %>%             # select a unique row for each subject's replicate mean (avoid generating one row for each replicate)
  ggplot(
    aes(
      x = replicate_Mean,
      y = replicate_SD,
      color = Method,
      fill = Method
      )
    ) +
  geom_point_interactive(             # show data points of for replicate means of all subjects
    aes(
      data_id = Id,
      tooltip = Id
      ),
    size = 3,
    alpha = 0.5
    ) +
  geom_smooth(
    lwd = 0.1, 
    alpha = 0.1
    ) +
  scale_color_manual(values = palette_1) +
  scale_fill_manual(values = palette_1) +
  labs(
    title = "Replicate variance versus mean",
    caption =  "Hover over points to highlight individual patients",
    x = "Replicate mean",
    y = "SD",
    tag = "C"
    ) +
  theme_light() +
  theme(
    plot.title = element_text(
      size = 12,
      hjust = 0.5),
    plot.caption = element_text(
      size = 9,
      hjust = 0.5),
    legend.position = "none"
    ) 


# Deviation of individual measurements from replicate means
# (alternative to the SD versus value plot, p2C) 

p2D <- 
  data_replMean %>%
  mutate(Residuals = Value - replicate_Mean) %>% 
  ggplot(
    aes(
      x = Value,
      y = Residuals,
      color = Method,
      fill = Method
      ) 
  ) +
  geom_point_interactive(
    aes(
      data_id = Id,
      tooltip = Id
      ),
    size = 2,
    alpha = 0.5
    ) +
  geom_hline(
    aes(yintercept = 0),
    color = "darkgrey"
    ) +
  geom_ysidedensity(
    aes(y = Residuals),
    alpha = 0.1
    ) +     
  scale_color_manual(values = palette_1) +
  scale_fill_manual(values = palette_1) +
  labs(
    title = "Deviation from replicate mean",
    x = "Measurement",
    y = "Measurement - Replicate mean",
    tag = "D"
    ) +
  theme_light() +
  theme(
    plot.title = element_text(
      size = 12,
      hjust = 0.5
      ),
    axis.text.y = element_text(
      size = 9
      ),
    legend.position = "none"
    ) +
  theme_ggside_void()

         
# Combined plot layout

design <- "AAAAA#BBBB
           AAAAA#BBBB
           AAAAA#BBBB
           AAAAA#####
           AAAAA#####
           AAAAA#E###
           AAAAA#####
           AAAAA#####
           CCCCC#DDDD
           CCCCC#DDDD
           CCCCC#DDDD"

girafe(
  ggobj = 
    p2A + 
    p2B + 
    p2C +
    p2D +
    guide_area() +
    plot_annotation(
      title = "Distributions and means of original measurements\n",
      theme = theme(
        plot.title = element_text(
          size = 14, 
          hjust = 0.5
          )
        )
      ) +
    plot_layout(
      design = design,
      guides = "collect"
      ),
    height_svg = 11,
    width_svg = 9,
    options = list(
      opts_hover(css = "stroke:black; stroke-width:2px;"),  # CSS code of selected line
      opts_hover_inv(css = "opacity:0.1;"),                 # CSS code of all other lines
      opts_sizing(rescale = FALSE)                          # allows to zoom in and out of saved html image
      )
  )

# Display summary table

data_replMeanMean %>% 
  select(!errorbar_pos) %>% 
  kable(., "html") %>% 
  kable_styling( 
    bootstrap_options = c(
      "striped", "hover", "responsive", full_width = FALSE
      ) 
    )

Method	Replicate	Mean	SD	SE
Manual	1st	1.5862069	7.218613	1.3404628
Manual	2nd	1.4827586	7.471951	1.3875064
Manual	3rd	1.2413793	7.452972	1.3839822
Electro	1st	0.0000000	7.086204	1.3158750
Electro	2nd	0.0344828	7.480846	1.3891582
Electro	3rd	0.1034483	7.148151	1.3273783
Manual	replicate mean	1.4367816	7.263108	0.7786869
Electro	replicate mean	0.0459770	7.109729	0.7622430

A) Means difference The distribution of measurement values appears to be similar and approximately normal for both methods at all time points. Differences between means and SD/SE seem to decrease with repeated measurements (replicate 1 to 3). Means of manual measurements are higher at all time points.
B) Intra-method repeatability over consecutive replicate measurements C) Intra-subject replicate standard deviation versus mean Repeatability plot ³ showing the standard deviation (SD) of the measurements for each subject and method against replicate averages. Larger SD values indicate poorer replicate agreement. SD does not appear to be related to the magnitude of replicate means. However, the maximum of variance in replicated electro measurements appears to be shifted to higher means relative to the maximum in the manual data. Hovering over individual data points shows that subjects 1, 9, 12, 18 and 20 the had the highest SD values, and that the first time point contributed most to their variance.
D) Deviation from replicate mean versus replicate value This plot represents an alternative visualization of the variance across measurement values. As in (C), no obvious trend exists.

Note that standard deviations and errors for the replicate means show in the fourth row of (A) are not entirely correct: Overall variance should account for both within-sample and between-sample variation, but shown here is only the latter which is an underestimate. Because replicates within a sample are also correlated, more appropriate methods for their calculations would be:

Mixed-effects model or variance components analysis (e.g. R package lme4)
Nested ANOVA (Analysis of Variance)

3.3. Differences between measurements for each replicate and the replicate mean: do they have normal distribution?

The Bland-Altman method for assessing method equivalence also requires that differences between the methods are normally distributed ^3,7.
Histograms and probability (or quantile-quantile / qq) plots are common ways for visualising how well distributions conform with normality: On the left, black bars represent binned actual values while red lines show their theoretical normal distributions. For data with a perfect normal distribution, the histogram bars will fill the area under the line.
On the right, quantiles of actual values are plotted against their theoretical quantiles. In case of a perfectly normal distribution, all points will lie on the dashed red line.

# - Chunk_3                                                                      - #
# - Requires:  * R-libraries and object data (libraries_and_data),               - #
#              * data_replMean (chunk 2)                                         - #
#              * samples to be numbered 1:n, in same order as theoretical values - #


## Inter-replicate means and SDs for differences and means (chunk 3) of measurements for each sample

# Differences and means of measurements with the two methods for each sample and replicate

data_BlAltm <- 
  data_replMean %>% 
    select(Id, Replicate, Method, Value) %>%
    pivot_wider(                
        names_from = "Method", 
        values_from = "Value"
    ) %>%   
  mutate(
    Method_mean = (Manual + Electro)/2,
    Method_diff = Manual - Electro
    )

# Differences and means of measurements with the two methods: replicate means for each sample

# Question: Does considering only the variance of the measurement differences, instead of intra-replicate variance of each rater/method of measurement lead to an underestimation of SDs? Would mixed-effects models or nested ANOVA, also suggested for the nested SDs below, be more appropriate?   

data_BlAltm_replMeans <- 
  data_BlAltm %>% 
  select(Id, Replicate, Method_mean, Method_diff) %>% 
  split(.$Id) %>% 
  lapply(
    function(x) summarise(x,  
      Mean_methodMean = mean(Method_mean), 
      SD_methodMean = sd(Method_mean), 
      Mean_methodDiff = mean(Method_diff), 
      SD_methodDiff = sd(Method_diff)
      )
    ) %>% 
  list_rbind(names_to = "Id") 

# Combine the 2 dataframes above

data_BlAltm_combined <-
 data_BlAltm %>% 
  select(Id, Replicate, Method_diff, Method_mean) %>% 
  bind_rows(
    data_BlAltm_replMeans %>%
    mutate(
      Id = Id,
      Replicate = "replicate mean",
      Method_diff = Mean_methodDiff,
      Method_mean = Mean_methodMean,
      .keep = "none"
      )
    )


## Limits of agreement (LoA) for all inter-replicate means and overall mean of method differences

# Function for calculating nested SDs

SD_nested <- 
  function(x) {
    sqrt(sum(x^2)/(length(x)-1))
    }
    
data_BlAltm_replLoA <-
  data_BlAltm %>% 
  group_by(Replicate) %>% 
  summarise(
    Mean = mean(Method_diff),
    SD = sd(Method_diff),
    LoA_up = Mean + 1.96* SD,
    LoA_low = Mean - 1.96* SD
    )

data_BlAltm_LoA_replMeansMeans <-
  data_BlAltm_replMeans %>% 
  summarise(
    Mean = mean(Mean_methodDiff), 
    SD = SD_nested(SD_methodDiff)
    ) %>%
  mutate(
    LoA_up = Mean + 1.96*SD, 
    LoA_low = Mean - 1.96*SD
    )  

# Combine the 2 dataframes above

data_BlAltm_LoA_combined <-
 data_BlAltm_replLoA %>% 
  select(Replicate, Mean, SD) %>% 
    bind_rows(
    data_BlAltm_LoA_replMeansMeans %>%
      mutate(Replicate = "replicate mean") %>% 
      select(Replicate, Mean,  SD) 
      )

# Add theoretical values with normal distribution

data_BlAltm_combined <-
  full_join(
    data_BlAltm_combined,
    data_BlAltm_LoA_combined %>%           # adding column with theoretical values
      apply(., 1, 
        function(row) {dnorm(
          -14:14,                          # centers distribution 
          mean = as.numeric(row['Mean']), 
          sd = as.numeric(row['SD'])
          )
          }
        ) %>% 
      {.*29} %>%                          # scale distribution to total number of samples
      as.data.frame() %>% 
      mutate(Id = factor(1:29) ) %>% 
      pivot_longer(
        cols = !"Id", 
        names_to = "Replicate",
        values_to = "Theor_method_diff"
        ) %>% 
      mutate(
        Replicate = rep(
          c("1st", "2nd", "3rd", "replicate mean"), 29
          )
        )
      )

## Plots

# Histogram plot of measurement differences

binwidth = 1.5           # results in 9 bins

p3A <-
 data_BlAltm_combined %>%
  mutate(
    Id = as.numeric(Id),
    x= Id - mean(Id)
  ) %>% 
  ggplot() + 
  geom_histogram(
    aes(Method_diff),
    binwidth = binwidth,
    position = "stack",
    color = "darkgrey"
    ) +
  geom_line(        # stat_function (mean, sd) would be result im smother line, but does not work with faceting
    aes(
      x = x,                  
      y = Theor_method_diff * binwidth),
      color = "darkred"
    ) +
  facet_wrap(
    .~ Replicate, 
    dir = "v", 
    strip.position = "left",
    nrow = 4
    ) +
  labs(
    title = "Histogram (counts)",
    x = "\nManual - Electro",
    y = "Replicate"
    ) +
  scale_x_continuous(
    limits = c(-4.5, 7.5),                    # set manually to match histogram bins (ignore warnings)
    breaks = seq(-4, 7, 1)
    ) +
  theme_minimal() +
  theme(
    plot.title = element_text(
      size = 12,
      hjust = 0.5),
    axis.text.y = element_blank(), 
    strip.placement = "outside", 
    strip.background = element_blank(),
    strip.text.y = element_text(
      color = "black",
      size = 12
      ),
    axis.ticks = element_blank(),
    axis.text.x = element_text(vjust = -2)
    )

# qq-plot of measurement differences (unscaled)

p3B <- 
 data_BlAltm_combined %>%
  ggplot(
    aes(sample = Method_diff)
    ) + 
  geom_qq(
    dparams = list(
      mean = mean(data_BlAltm_combined$Method_diff), 
      sd = sd(data_BlAltm_combined$Method_diff)
      ),
    size = 3,
    color = "black",
    alpha = 0.3,
    ) +
   geom_abline(
    linetype = "dashed",
    slope = 1,
    intercept = 0,
    color = "darkred"
    ) +
    facet_wrap(
    .~ Replicate, 
    dir = "v", 
    strip.position = "left",
    nrow = 4
    ) +
  labs(
    title = "Probability plot",
    x = "\nManual - Electro (theoretical data)",
    y = "Manual - Electro"
    ) +
  scale_x_continuous(
    limits = c(-4.5, 7.5),                    # set manually to match histogram bins
    breaks = seq(-4, 7, 1)
    ) +
  theme_minimal() +
  theme(
    plot.title = element_text(
      size = 12,
      hjust = 0.5),
    strip.background = element_blank(),
    strip.text = element_blank(),
    axis.ticks = element_blank(),
    axis.text.y = element_blank(), 
    axis.text.x = element_text(vjust = 6), 
    axis.title.x = element_text(vjust = 13)
     )

p3A + p3B  +
  plot_layout(
    ncol = 2,
    heights = 6
    ) +
  plot_annotation(
    title = "Distribution of inter-method differences \nfor consecutive replicate measurements and their means",
    theme = theme(
      plot.title = element_text(
        size = 14, 
        hjust = 0.5
        )
      )
    )

The plots show that method differences for individual replicates, as well as their means, have approximately normal distribution. Statistical tests for normality such as Shapiro-Wilk, D’Agostino-Pearson, Kolmogorov-Smirnov can be used to confirm normality. If rejected, a transformation of the original data may be tried. In certain cases, estimating the limits with a non-parametric method might be preferable ³.

4. Method agreement

4.1. Bland-Altman plots for replicate means and individual time points

A Bland-Altman plot displays the difference between measurements by two raters against the averages of the same measurement pairs. Horizontal lines denote the mean difference and upper and lower “limits of agreement” (LoAs). I.e., 95% LoAs will enclose 95% of the differences, under the assumptions that differences are distributed normally and sample sizes were very large. However, LoAs particularly for small sample sizes, may vary considerably from the population LoAs. Moreover, for samples smaller than infinity, LoAs will tend to be on average slightly closer to the mean of differences than the population LoAs would be. Therefore, reporting LoAs with confidence intervals is recommended as standard practice. Here, I adopted Carkeet’s exact two-sided tolerance approach for calculating confidence intervals of the LoAs, which is more accurate than the original Bland-Altman method ⁴.

# - Chunk_4a                                                          - #

## Function for calculating the coefficients of confidence intervals for limits of agreement
# Based on MATLAB code published by Charkeet, 2015 (ref. 4), results validated against table 2 in the same paper
# Input parameters:
# - `N`: Number of pairs in the sample (required)
# - `gamma`: Area of left tail in confidence interval (required)
# - `p`: Probability value, defaulting to 0.975 if not specified

calculate_ba_coefficient <- function(N, gamma, p = 0.95) {
  # Input validation
  if (!is.numeric(N) || N <= 1 || N != round(N)) {
    stop("N must be a positive integer greater than 1")
  }
  if (!is.numeric(gamma) || gamma <= 0 || gamma >= 1) {
    stop("gamma must be between 0 and 1")
  }
  if (!is.numeric(p) || p <= 0 || p >= 1) {
    stop("p must be between 0 and 1")
  }
  
  # Initialize variables
  Degf <- N - 1
  gammaest <- 0
  Kest <- 0
  Kstep <- 4
  DirectK <- 1
  
  # Main iteration loop
  while(abs(gammaest - gamma) > 1e-8) {
    Kest <- Kest + Kstep
    K <- Kest
    stepper <- 0.05/N
    toprange <- 8/(N^0.5) + stepper
    xdist <- seq(0, toprange, by=stepper)
    boxes <- length(xdist)
    boxes <- round(boxes/2 + 0.1) * 2 - 1
    
    Prchi <- numeric(boxes)
    Combpdf <- numeric(boxes)
    halfgauss <- exp(-(N/2) * xdist^2)
    shrinkfactor <- 2 * (N/(2*pi))^0.5
    
    # Calculate probabilities for each box
    for(s in 1:boxes) {
      xtest <- xdist[s]
      startp <- 0.5 + p/2
      resti <- qnorm(startp) + xtest - 0.1
      restiprior <- resti
      
      phigh <- pnorm(xtest + resti)
      plow <- pnorm(xtest - resti)
      pesti <- phigh - plow
      pestiprior <- pesti
      perror <- pesti - p
      
      resti <- resti + 0.11
      phigh <- pnorm(xtest + resti)
      plow <- pnorm(xtest - resti)
      pesti <- phigh - plow
      perror <- pesti - p
      
      # Newton-Raphson iteration
      while(abs(perror) > 2e-15) {
        deltap <- pesti - pestiprior
        deltaresti <- resti - restiprior
        newresti <- resti - perror/deltap * deltaresti
        restiprior <- resti
        resti <- newresti
        pestiprior <- pesti
        
        phigh <- pnorm(xtest + resti)
        plow <- pnorm(xtest - resti)
        pesti <- phigh - plow
        perror <- pesti - p
      }
      
      chiprob <- 1 - pchisq((Degf * resti^2)/(K^2), Degf)
      Prchi[s] <- chiprob
      Combpdf[s] <- chiprob * halfgauss[s]
    }
    
    # Simpson's rule integration
    Integ <- 0
    for(s in seq(1, boxes-2, by=2)) {
      M <- Combpdf[s+1] * stepper * 2
      T <- (Combpdf[s] + Combpdf[s+2]) * stepper
      Integ <- Integ + (M*2 + T)/3 * shrinkfactor
    }
    
    gammaest <- Integ
    if(gammaest * DirectK > gamma * DirectK) {
      DirectK <- DirectK * -1
      Kstep <- -Kstep/2
    }
  }
  
  return(Kest)
}

# - Chunk_4b                                                          - #
# - Requires:  * R-libraries and object data (libraries_and_data),    - #
#              * data_BlAltm, data_BlAltm_replLoA (chunk 2)           - #
#              * calculate_ba_coefficient (chunk 4a)                  - #


## Bland-Altman plots with confidence limits

# Inter-replicate mean and SD for each sample

data_BlAltm_replMeans <- 
  data_BlAltm %>% 
  select(Id, Replicate, Method_mean, Method_diff) %>% 
  split(.$Id) %>% 
  lapply(
    function(x) summarise(x,  
      Mean_methodMean = mean(Method_mean), 
      SD_methodMean = sd(Method_mean), 
      Mean_methodDiff = mean(Method_diff), 
      SD_methodDiff = sd(Method_diff)
      )
    ) %>% 
  list_rbind(names_to = "Id") 


#  Summary stats of inter-replicate means and LoAs
     
data_BlAltm_LoA_replMeansMeans <- 
  data_BlAltm_replMeans %>% 
  summarise(
    Mean = mean(Mean_methodDiff), 
    SD = SD_nested(SD_methodDiff)
    ) %>%
  mutate(
    LoA_up = Mean + 1.96*SD, 
    LoA_low = Mean - 1.96*SD
    ) 


# Table of means and LoAs

N <- data %>% .$Id %>% as.numeric() %>% max()
gamma <- c(inner = 0.025, outer = 0.975)

LoA_CIcoeffs <- 
  sapply(
    gamma,
    function (x)
  calculate_ba_coefficient(N, x)
  )
    
LoA_table <- 
  data_BlAltm_LoA_replMeansMeans %>% 
  mutate(
    .before = Mean, 
    Replicate = "replicate mean"
    ) %>% 
  bind_rows(
    data_BlAltm_replLoA,
    .
    ) %>% 
  mutate( 
    LoA_outer_high = Mean + SD * LoA_CIcoeffs[2],
    LoA_outer_low = Mean - SD * LoA_CIcoeffs[2],
    LoA_inner_high = Mean + SD *LoA_CIcoeffs[1],
    LoA_inner_low = Mean - SD * LoA_CIcoeffs[1]
    ) %>% 
  select(!SD)


# Color palette

palette_2 <- 
  c("1st"= "#F8766D", "2nd" = "#00BA38", "3rd" = "#619CFF")


# BA plot for all samples and replicates

p4A <- 
  data_BlAltm %>% 
  ggplot(
    aes(
      x= Method_mean, 
      color = Replicate
      )
    ) +
  geom_point_interactive(
    aes(
      y= Method_diff,
      group = Id,
      data_id = Id,
      tooltip = Id
      ),
    size = 3,
    alpha = 0.5,
    show.legend = FALSE
    ) +
  
  geom_ysidedensity(                  # Density sideplot
    aes(
      y= Method_diff, 
      fill = Replicate
      ),
    alpha = 0.3
    ) + 
  
  geom_hline_interactive(             # Lines for mean difference and LoAs
    data = LoA_table[1:3,],
    aes(
      data_id = Replicate,
      tooltip = Replicate,
      yintercept = Mean,
      color = Replicate
      ),
    alpha = 0.7
    ) +
  geom_hline_interactive(
    data = LoA_table[1:3,],
    aes(
      data_id = Replicate,
      tooltip = Replicate,
      yintercept = LoA_up,
      color = Replicate
      ),
    alpha = 0.7
    ) +
  geom_hline_interactive(
    data = LoA_table[1:3,],
    aes(
      data_id = Replicate,
      tooltip = Replicate,
      yintercept = LoA_low,
      color = Replicate
      ),
    alpha = 0.7
    ) +

  geom_hline_interactive(             # Lines for CIs of LoAs
    data = LoA_table[1:3,],
    aes(
      data_id = Replicate,
      tooltip = Replicate,
      yintercept = LoA_outer_high,
      color = Replicate
      ),
    linetype = "dashed",
    linewidth = 0.7,
    ) +
  geom_hline_interactive(
    data = LoA_table[1:3,],
    aes(
      data_id = Replicate,
      tooltip = Replicate,
      yintercept = LoA_outer_low,
      linetype = "dashed",
      color = Replicate
      ),
    linetype = "dashed",
    linewidth = 0.7,
    ) +
  geom_hline_interactive( 
    data = LoA_table[1:3,],
    aes(
      data_id = Replicate,
      tooltip = Replicate,
      yintercept = LoA_inner_high,
      color = Replicate
      ),
    linetype = "dotted",
    linewidth = 1,
    ) +
  geom_hline_interactive(
    data = LoA_table[1:3,],
    aes(
      data_id = Replicate,
      tooltip = Replicate,
      yintercept = LoA_inner_low,
      linetype = "dashed",
      color = Replicate
      ),
    linetype = "dotted",
    linewidth = 1,
    ) +
  
  scale_color_manual_interactive(
    values = palette_2)  +
  scale_y_continuous(
    limits = c(
      LoA_table %>% 
      select(!Replicate) %>% 
      range() + c(-0.5, .5)
      ), 
    breaks = seq(-10, 10, 1)
    ) + 
  labs(
    tag = "A",
    title = "Individual replicate data sets",
    x = "(Manual + Electro) / 2",
    y = "Manual - Electro"    ) +
  theme_light() +
  theme(
    plot.title = element_text(
      size = 11,
      hjust = 0.3
      ),
    legend.position = "none"
    ) +
  theme_ggside_void()



# BA plot for intra-replicate means

p4B <- 
  data_BlAltm_replMeans %>% 
  ggplot(
    aes(
      x= Mean_methodMean, 
      y= Mean_methodDiff,
      data_id = Id,
      tooltip = Id
      )
    ) +
  geom_point_interactive(
    color = "black",
    alpha = 0.3,
    size = 3
    ) +
  geom_ysidedensity(
    inherit.aes = FALSE,
    aes(y = Mean_methodDiff)
    ) +     
  geom_hline(
    data = LoA_table[4,],
    aes(yintercept = LoA_table[4,]$Mean)
    ) +
  geom_hline(
    data = LoA_table[4,],
    aes(yintercept = LoA_table[4,]$LoA_up),
    ) +
  geom_hline(
    data = LoA_table[4,],
    aes(yintercept = LoA_table[4,]$LoA_low)
    ) +
  
  geom_hline(                             # Lines for CIs of LoAs
    data = LoA_table[4,],
    aes(
      yintercept = LoA_outer_high
      ),
    linetype = "dashed",
    linewidth = 0.7,
    ) +
  geom_hline(
    data = LoA_table[4,],
    aes(
      yintercept = LoA_outer_low
      ),
    linetype = "dashed",
    linewidth = 0.7,
    ) +
  geom_hline( 
    data = LoA_table[4,],
    aes(
      yintercept = LoA_inner_high
      ),
    linetype = "dotted",
    linewidth = 1,
    ) +
  geom_hline(
    data = LoA_table[4,],
    aes(
      yintercept = LoA_inner_low
      ),
    linetype = "dotted",
    linewidth = 1,
    ) +
  
  scale_y_continuous(
    limits = c(
      LoA_table %>% 
      select(!Replicate) %>% 
      range() + c(-0.5, .5)
      ), 
    breaks = seq(-10, 10, 1)
    ) + 
  labs(
    tag = "B",
    title = "Averaged replicate data",
    x = "(Manual + Electro) / 2",
    y = "Manual - Electro"
   ) +
  theme_light() +
  theme(
    plot.title = element_text(
      size = 11,
      hjust = 0.5
      ),
    axis.title.y = element_blank(),
    legend.position = "none"
    ) +
  theme_ggside_void()


# Interactive legend for plots

p4_legend <- 
  LoA_table %>%
  select(!c(
    LoA_inner_high, 
    LoA_inner_low)
    ) %>% 
  mutate(
     across(
      !Replicate, 
      function (x) sprintf(
        "%0.2f", round(x, digits = 2) 
        )
      )
    ) %>% 
  rename(
    mean = Mean, 
    high = LoA_up, 
    low = LoA_low,
    `high, \nupper CI`= LoA_outer_high,
    `low, \nlower CI` = LoA_outer_low
    ) %>% 
  mutate(
    .before = everything(), "Limits of Agreement\n\n" = "___", 
    across(everything(), as.character)
    ) %>% 
  rbind(., 
    c(" ", "Replicate", " ", " ", " ", " ", " ")
    ) %>% 
  pivot_longer(
    cols = !c("Replicate"), 
    names_to = "x"
    ) %>% 
  mutate(
    Replicate = factor(
      Replicate, 
      levels = c(
        "replicate mean", "3rd", "2nd", "1st", "Replicate"
      )
    ),
    x = factor(
      x,
      levels = c(unique(x)
      )
    )
  ) %>% 
  ggplot(
    aes(
      x = x, 
      y = Replicate, 
      color = Replicate
      )
    ) + 
  geom_text_interactive(
    aes(
      data_id = Replicate,
      label = paste(value)),
    size = 3.5
    ) +
  scale_color_manual_interactive(
    values = palette_2) +
  scale_x_discrete(position = "top") +
  scale_y_discrete(position = "right") +
  theme_classic()+
  theme(
    aspect.ratio = 0.25,
    axis.line = element_blank(),
    axis.ticks = element_blank(),
    axis.title = element_blank(),
    axis.text = element_text(
      size = 10
      ),
    legend.position = "none"
    )


# Combined plot layout

design <-  "AAAAABBBB
            AAAAABBBB
            AAAAABBBB
            AAAAABBBB
            ##CCCCC##
            ##CCCCC##"

girafe(
  ggobj = 
    p4A + 
    p4B +
    p4_legend +
    plot_layout(
      design = design
      ) +
    plot_annotation(
    title = 
      "Bland-Altman plots with replicate measurements\n",
    caption = "Hover over points to highlight individual patients,
    hover over legend to display mean and limit values for individual replicates",
    theme = theme(
      plot.title = element_text(
        face = "bold",
        size = 12, 
        hjust = 0.5
        ),
      plot.caption = element_text(
        size = 9,
        hjust = 0.5
        )
      )
    ),
  height_svg = 7, width_svg = 9,              # svg settings for saving to individual image to html, omit for knitting
  options = list(
    opts_hover(css = ''),                     # CSS code of selected interactive geoms
    opts_hover_inv(css = "opacity:0.12;"),    # CSS code of non-selected interactive geoms
    opts_sizing(rescale = FALSE)
    )
  )

The offset of the mean difference line from the 0 demonstrates the bias of the electro versus the manual method.
Broken lines indicate outer, dotted lines inner 95% confidence intervals of the LoAs.
A) Bland-Altman plot showing the individual replicate data sets, distinguished by colour.
B) Bland-Altman plot based on replicate average. LoAs are wider for the first compared to second and third replicate. As expected, the average replicate data have the narrowest LoAs. However, their calculation did not consider the variability of the individual replicates, and therefore might not be entirely accurate.

Note on coding: Following a suggestion at the PSI webinar I tried to make the plot legend interactive so that it can be used to highlight the mean and limit lines of different replicates in (A). Implementing it in ggiraph turned out to be far from straightforward, including a possible solution posted on Stackoverflow (https://stackoverflow.com/questions/73721101/ggiraph-r-how-to-link-legend-and-plot-with-same-data-id-attribute). In the end, I decided to just generate a separate ggplot object in the shape of an interactive table, linking it to the other plots through a shared definition of “data_id”. In a future update, it would have been nice if hovering over the legend could highlight lines and dots of the selected replicate together - without sacrificing the option to highlight replicates in the same patient when hovering over the dots in the plot.

4.2. Probability of Agreement plot

Probability of Agreement is the proportion of measurements by two independent raters whose differences fall within a chosen range ⁵. Its clinical interpretation is more intuitive, but computation more complicated, and replicate measurements are required as input.

# - Chunk_5                                                           - #
# - Requires:  * R-libraries and object data (libraries_and_data)     - #


## Probability agreement (NT Stevens, 2019)
# Script customised for analysis with own data set


# Choose acceptable difference for which the probability of agreement should be evalualated

delta <- 3

# Reformat the input dataframe (check variable names and formats)

rr <-
  data %>% 
  rename(
    Subject = Id,
    Repeat = Replicate,
    MS = Method,
    Measurements = Value
    ) %>% 
  mutate(
    MS = 
      case_when(
        MS == "Manual" ~ 1,
        MS == "Electro" ~ 2
      ),
    across(
      .cols = everything(), 
      as.double
      )
    ) 

# Calculate number of subjects
n_subj <- rr$Subject %>% max



############################################################################
## Relevant functions required for the Probability of Agreement analysis  ##
############################################################################

## Author: Nathaniel T. Stevens
##         Assistant Professor, University of Waterloo
##   Date: March 2019 

## Analysis script for the probabilioty of agreement methodology
## https://github.com/vandainacio/Comparison-of-Six-Agreement-Methods-for-Continuous-Repeated-Measures-Data/blob/master/PAgreement_Analysis_Script.R

## Note that the dataframe must have four columns with headers 'Subject', 'MS', 'Repeat' and 'Measurements', 
## and N = sum{m_ij} rows. The 'Subject' column contains subject codes and the entries of 'MS' must be 1's and 
## 2's indicating which measurement system a given row corresponds to. The entries of 'Repeat' indicate which 
## repeated measurement a given row corrsponds to, and the entries of the 'Measurements' column correspond to 
## the repeated measurements by the given system on each of the n subjects. Note that the observations must be 
## ordered by subject with repeated measurements appearing consecutively with MS1 measurements appearing first 
## and MS2 second.

#############################
## Log-Likelihood Function ##
#############################

log.likelihood <- function(param, n, r1vec, r2vec, data){
  # Calculate the log-likelihood contribution for each subject and then sum them all
  mu <- param[1]
  alpha <- param[2]
  beta <- param[3]
  sp <- param[4]
  s1 <- param[5]
  s2 <- param[6]
  l <- rep(0, n)
  for(i in 1:n){
    r1 <- r1vec[i]
    r2 <- r2vec[i]
    Yi1 <- subset(data, subset = (data$Subject==i & data$MS==1))$Measurements
    Yi2 <- subset(data, subset = (data$Subject==i & data$MS==2))$Measurements
    A <- sum((Yi1-rep(mu, r1))^2)
    B <- sum(Yi1-rep(mu, r1))^2
    C <- sum((Yi2-rep(alpha + beta*mu, r2))^2)
    D <- sum(Yi2-rep(alpha + beta*mu, r2))^2
    E <- sum(Yi1-rep(mu, r1)) * sum(Yi2-rep(alpha + beta*mu, r2))
    # Log-likelihood contribution for subject i
    l[i] <- -(r1 / 2 + r2 / 2) * log(2 * pi) - r1 * log(s1) - r2 * log(s2) - log1p(sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) / 2 - A / s1 ^ 2 / 2 + sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * B / s1 ^ 4 / 2 - C / s2 ^ 2 / 2 + sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * D / s2 ^ 4 / 2 + sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * E / s1 ^ 2 / s2 ^ 2
  }

  # The log-likelihood function:
  l <- sum(l)
  return(-l) # Return negative because optim() is a minimizer
}


##############################
## Log-Likelihood Gradients ##
##############################

gradient <- function(param, n, r1vec, r2vec, data){
  # Calculate the gradients for each parameter / subject combination and sum over all subjects
  # for the gradients for each parameter.
  mu <- param[1]
  alpha <- param[2]
  beta <- param[3]
  sp <- param[4]
  s1 <- param[5]
  s2 <- param[6]
  g <- matrix(0, nrow = 6, ncol = n)
  for(i in 1:n){
    r1 <- r1vec[i]
    r2 <- r2vec[i]
    Yi1 <- subset(data, subset = (data$Subject==i & data$MS==1))$Measurements
    Yi2 <- subset(data, subset = (data$Subject==i & data$MS==2))$Measurements
    A <- sum((Yi1-rep(mu, r1))^2)
    C <- sum((Yi2-rep(alpha + beta*mu, r2))^2)
    F <- sum(Yi1-rep(mu, r1))
    G <- sum(Yi2-rep(alpha + beta*mu, r2))
    B <- F^2
    D <- G^2
    E <- F*G
    
    # Derivative with respect to mu:
    g[1,i] <- F / s1 ^ 2 - r1 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * F / s1 ^ 4 + beta * G / s2 ^ 2 - r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 3 * G / s2 ^ 4 - r1 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * G / s1 ^ 2 / s2 ^ 2 - r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * F / s1 ^ 2 / s2 ^ 2
    # Derivative with respect to alpha:
    g[2,i] <- G / s2 ^ 2 - r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * G / s2 ^ 4 - r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * F / s1 ^ 2 / s2 ^ 2
    # Derivative with respect to beta:
    g[3,i] <- -beta * r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) / s2 ^ 2 - beta * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 * B / s1 ^ 4 + mu * G / s2 ^ 2 + (-2 * G * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * mu * r2 - 2 * D * beta ^ 3 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 + 2 * D * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta) / s2 ^ 4 / 2 + (-F * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * mu * r2 - 2 * E * beta ^ 2 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 + E * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2))) / s1 ^ 2 / s2 ^ 2
    # Derivative with respect to sp:
    g[4,i] <- -sp * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) + sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * B / s1 ^ 4 - sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * B / s1 ^ 4 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) + sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * D / s2 ^ 4 - sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 2 * D / s2 ^ 4 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) + 2 * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * E / s1 ^ 2 / s2 ^ 2 - 2 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * E / s1 ^ 2 / s2 ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)
    # Derivative with respect to s1:
    g[5,i] <- -r1 / s1 + sp ^ 2 * r1 / s1 ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) + A / s1 ^ 3 + sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * B / s1 ^ 7 * r1 - 2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * B / s1 ^ 5 + sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 2 * D / s2 ^ 4 * r1 / s1 ^ 3 + 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * E / s1 ^ 5 / s2 ^ 2 * r1 - 2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * E / s1 ^ 3 / s2 ^ 2
    # Derivative with respect to s2:
    g[6,i] <- -r2 / s2 + sp ^ 2 * r2 * beta ^ 2 / s2 ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) + sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * B / s1 ^ 4 * r2 * beta ^ 2 / s2 ^ 3 + C / s2 ^ 3 + sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 4 * D / s2 ^ 7 * r2 - 2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * D / s2 ^ 5 + 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 3 * E / s1 ^ 2 / s2 ^ 5 * r2 - 2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * E / s1 ^ 2 / s2 ^ 3
  }
  
  # The gradients:
  g <- apply(X = g, MARGIN = 1, FUN = sum)
  return(-g) # Return negative because optim() is a minimizer
}


###############################
## Fisher Information Matrix ##
###############################

fisher.info <- function(param, n, r1vec, r2vec){
  # Calculate the Fisher Information Matrix associated with the 6 parameters
  # Expected values of the various sums of squares
  mu <- param[1]
  alpha <- param[2]
  beta <- param[3]
  sp <- param[4]
  s1 <- param[5]
  s2 <- param[6]
  dldmu2 <- rep(0, n); dldmualpha <- rep(0, n); dldmubeta <- rep(0, n); dldmusp <- rep(0, n); dldmus1 <- rep(0, n); dldmus2 <- rep(0, n); dldalpha2 <- rep(0, n); dldalphabeta <- rep(0, n); dldalphasp <- rep(0, n); dldalphas1 <- rep(0, n); dldalphas2 <- rep(0, n); dldbeta2 <- rep(0, n); dldbetasp <- rep(0, n); dldbetas1 <- rep(0, n); dldbetas2 <- rep(0, n); dldsp2 <- rep(0, n); dldsps1 <- rep(0, n); dldsps2 <- rep(0, n); dlds12 <- rep(0, n); dlds1s2 <- rep(0, n); dlds22 <- rep(0, n);
  for(i in 1:n){
    r1 <- r1vec[i]
    r2 <- r2vec[i]
    A <- r1*(sp^2+s1^2)
    B <- r1*(r1*sp^2+s1^2)
    C <- r2*((beta^2)*(sp^2)+s2^2)
    D <- r2*(r2*(beta^2)*(sp^2)+s2^2)
    E <- (r1*r2)*beta*sp^2
    F <- 0
    G <- 0
    # Second partial derivatives for each subject
    dldmu2[i] <- -r1 / s1 ^ 2 + r1 ^ 2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) / s1 ^ 4 - r2 * beta ^ 2 / s2 ^ 2 + r2 ^ 2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 4 / s2 ^ 4 + 2 * r1 * r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 / s1 ^ 2 / s2 ^ 2
    dldmualpha[i] <- -r2 * beta / s2 ^ 2 + r2 ^ 2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 3 / s2 ^ 4 + r1 * r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta / s1 ^ 2 / s2 ^ 2
    dldmubeta[i] <- 2 * r1 * beta * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 * F / s1 ^ 4 + (-beta * mu * r2 + G) / s2 ^ 2 - (-sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 3 * mu * r2 ^ 2 - 2 * G * beta ^ 4 * r2 ^ 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 + 3 * G * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * r2) / s2 ^ 4 - (-sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * mu * r1 * r2 - 2 * G * beta ^ 2 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 * r1 + G * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * r1) / s1 ^ 2 / s2 ^ 2 - (-2 * F * beta ^ 3 * r2 ^ 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 + 2 * F * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * r2) / s1 ^ 2 / s2 ^ 2
    dldmusp[i] <- -2 * r1 * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * F / s1 ^ 4 + 2 * r1 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * F / s1 ^ 4 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) - 2 * r2 * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 3 * G / s2 ^ 4 + 2 * r2 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 3 * G / s2 ^ 4 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) - 2 * r1 * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * G / s1 ^ 2 / s2 ^ 2 + 2 * r1 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * G / s1 ^ 2 / s2 ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) - 2 * r2 * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * F / s1 ^ 2 / s2 ^ 2 + 2 * r2 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 2 * F / s1 ^ 2 / s2 ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)
    dldmus1[i] <- -2 * F / s1 ^ 3 - 2 * r1 ^ 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * F / s1 ^ 7 + 4 * r1 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * F / s1 ^ 5 - 2 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 3 * G / s2 ^ 4 * r1 / s1 ^ 3 - 2 * r1 ^ 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * G / s1 ^ 5 / s2 ^ 2 + 2 * r1 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * G / s1 ^ 3 / s2 ^ 2 - 2 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 2 * F / s1 ^ 5 / s2 ^ 2 * r1 + 2 * r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * F / s1 ^ 3 / s2 ^ 2
    dldmus2[i] <- -2 * r1 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * F / s1 ^ 4 * r2 * beta ^ 2 / s2 ^ 3 - 2 * beta * G / s2 ^ 3 - 2 * r2 ^ 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 5 * G / s2 ^ 7 + 4 * r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 3 * G / s2 ^ 5 - 2 * r1 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 3 * G / s1 ^ 2 / s2 ^ 5 * r2 + 2 * r1 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * G / s1 ^ 2 / s2 ^ 3 - 2 * r2 ^ 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 4 * F / s1 ^ 2 / s2 ^ 5 + 2 * r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * F / s1 ^ 2 / s2 ^ 3
    dldalpha2[i] <- -r2 / s2 ^ 2 + r2 ^ 2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 / s2 ^ 4
    dldalphabeta[i] <- -r2 * mu / s2 ^ 2 - (-sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * mu * r2 ^ 2 - 2 * G * beta ^ 3 * r2 ^ 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 + 2 * G * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * r2) / s2 ^ 4 - (-2 * F * beta ^ 2 * r2 ^ 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 + F * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * r2) / s1 ^ 2 / s2 ^ 2
    dldalphasp[i] <- -2 * r2 * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * G / s2 ^ 4 + 2 * r2 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 2 * G / s2 ^ 4 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) - 2 * r2 * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * F / s1 ^ 2 / s2 ^ 2 + 2 * r2 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * F / s1 ^ 2 / s2 ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)
    dldalphas1[i] <- -2 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 2 * G / s2 ^ 4 * r1 / s1 ^ 3 - 2 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * F / s1 ^ 5 / s2 ^ 2 * r1 + 2 * r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * F / s1 ^ 3 / s2 ^ 2
    dldalphas2[i] <- -2 * G / s2 ^ 3 - 2 * r2 ^ 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 4 * G / s2 ^ 7 + 4 * r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * G / s2 ^ 5 - 2 * r2 ^ 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 3 * F / s1 ^ 2 / s2 ^ 5 + 2 * r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * F / s1 ^ 2 / s2 ^ 3
    dldbeta2[i] <- -(-2 * beta ^ 2 * r2 ^ 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 + sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * r2) / s2 ^ 2 - r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * (-4 * beta ^ 2 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 + sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2))) / s2 ^ 2 * B / s1 ^ 4 - r2 * mu ^ 2 / s2 ^ 2 + (8 * G * beta ^ 3 * r2 ^ 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 * mu - 2 * D * beta ^ 2 * r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * (-4 * beta ^ 2 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 + sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2))) / s2 ^ 2 - 8 * D * beta ^ 2 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 - 8 * G * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * mu * r2 + 2 * D * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) + 2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * mu ^ 2 * r2 ^ 2) / s2 ^ 4 / 2 + (4 * F * beta ^ 2 * r2 ^ 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 * mu - 2 * E * beta * r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * (-4 * beta ^ 2 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 + sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2))) / s2 ^ 2 - 4 * E * beta * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 - 2 * r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * mu * F) / s1 ^ 2 / s2 ^ 2
    dldbetasp[i] <- -2 * beta * r2 * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) / s2 ^ 2 + 2 * beta * r2 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) - 4 * beta * r2 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 * B / s1 ^ 4 + 4 * beta * r2 * sp ^ 5 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 / s2 ^ 2 * B / s1 ^ 4 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) + (-4 * G * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * mu * r2 + 4 * G * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 2 * mu * r2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) - 8 * D * beta ^ 3 * r2 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 + 8 * D * beta ^ 3 * r2 * sp ^ 5 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 / s2 ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) + 4 * D * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta - 4 * D * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) / s2 ^ 4 / 2 + (-2 * F * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * mu * r2 + 2 * F * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * mu * r2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) - 8 * E * beta ^ 2 * r2 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 + 8 * E * beta ^ 2 * r2 * sp ^ 5 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 / s2 ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) + 2 * E * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) - 2 * E * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) / s1 ^ 2 / s2 ^ 2
    dldbetas1[i] <- -2 * beta * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 * r1 / s1 ^ 3 - 4 * beta * r2 * sp ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 / s2 ^ 2 * B / s1 ^ 7 * r1 + 4 * beta * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 * B / s1 ^ 5 + (-4 * G * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 2 * mu * r2 * r1 / s1 ^ 3 - 8 * D * beta ^ 3 * r2 * sp ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 / s2 ^ 2 * r1 / s1 ^ 3 + 4 * D * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * r1 / s1 ^ 3) / s2 ^ 4 / 2 + (-2 * F * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * mu * r2 * r1 / s1 ^ 3 - 8 * E * beta ^ 2 * r2 * sp ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 / s2 ^ 2 * r1 / s1 ^ 3 + 2 * E * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * r1 / s1 ^ 3) / s1 ^ 2 / s2 ^ 2 - 2 * (-F * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * mu * r2 - 2 * E * beta ^ 2 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 + E * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2))) / s1 ^ 3 / s2 ^ 2
    dldbetas2[i] <- -2 * beta ^ 3 * r2 ^ 2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 5 + 2 * beta * r2 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) / s2 ^ 3 - 4 * beta ^ 3 * r2 ^ 2 * sp ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 / s2 ^ 5 * B / s1 ^ 4 + 2 * beta * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 3 * B / s1 ^ 4 - 2 * mu * G / s2 ^ 3 + (-4 * G * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 4 * mu * r2 ^ 2 / s2 ^ 3 - 8 * D * beta ^ 5 * r2 ^ 2 * sp ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 / s2 ^ 5 + 8 * D * beta ^ 3 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 3) / s2 ^ 4 / 2 - 2 * (-2 * G * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * mu * r2 - 2 * D * beta ^ 3 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 + 2 * D * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta) / s2 ^ 5 + (-2 * F * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 3 * mu * r2 ^ 2 / s2 ^ 3 - 8 * E * beta ^ 4 * r2 ^ 2 * sp ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 / s2 ^ 5 + 6 * E * beta ^ 2 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 3) / s1 ^ 2 / s2 ^ 2 - 2 * (-F * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * mu * r2 - 2 * E * beta ^ 2 * r2 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 / s2 ^ 2 + E * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2))) / s1 ^ 2 / s2 ^ 3
    dldsp2[i] <- -(r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) + 2 * sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 + 1 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * B / s1 ^ 4 - 5 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * B / s1 ^ 4 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) + 4 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * B / s1 ^ 4 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) ^ 2 + 1 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * D / s2 ^ 4 - 5 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 2 * D / s2 ^ 4 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) + 4 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * beta ^ 2 * D / s2 ^ 4 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) ^ 2 + 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * E / s1 ^ 2 / s2 ^ 2 - 10 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * E / s1 ^ 2 / s2 ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) + 8 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * beta * E / s1 ^ 2 / s2 ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) ^ 2
    dldsps1[i] <- 2 * sp * r1 / s1 ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) - 2 * sp ^ 3 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * r1 / s1 ^ 3 + 4 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * B / s1 ^ 7 * r1 - 4 * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * B / s1 ^ 5 - 4 * sp ^ 5 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * B / s1 ^ 7 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) * r1 + 4 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * B / s1 ^ 5 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) + 4 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 2 * D / s2 ^ 4 * r1 / s1 ^ 3 - 4 * sp ^ 5 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * beta ^ 2 * D / s2 ^ 4 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) * r1 / s1 ^ 3 + 8 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * E / s1 ^ 5 / s2 ^ 2 * r1 - 4 * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * E / s1 ^ 3 / s2 ^ 2 - 8 * sp ^ 5 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * beta * E / s1 ^ 5 / s2 ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) * r1 + 4 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * E / s1 ^ 3 / s2 ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)
    dldsps2[i] <- 2 * sp * r2 * beta ^ 2 / s2 ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) - 2 * sp ^ 3 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * r2 * beta ^ 2 / s2 ^ 3 + 4 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * B / s1 ^ 4 * r2 * beta ^ 2 / s2 ^ 3 - 4 * sp ^ 5 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * B / s1 ^ 4 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) * r2 * beta ^ 2 / s2 ^ 3 + 4 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 4 * D / s2 ^ 7 * r2 - 4 * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * D / s2 ^ 5 - 4 * sp ^ 5 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * beta ^ 4 * D / s2 ^ 7 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) * r2 + 4 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 2 * D / s2 ^ 5 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) + 8 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 3 * E / s1 ^ 2 / s2 ^ 5 * r2 - 4 * sp / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * E / s1 ^ 2 / s2 ^ 3 - 8 * sp ^ 5 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * beta ^ 3 * E / s1 ^ 2 / s2 ^ 5 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2) * r2 + 4 * sp ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * E / s1 ^ 2 / s2 ^ 3 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)
    dlds12[i] <- r1 / s1 ^ 2 - 3 * sp ^ 2 * r1 / s1 ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) + 2 * sp ^ 4 * r1 ^ 2 / s1 ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 - 3 * A / s1 ^ 4 + 4 * sp ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * B / s1 ^ 10 * r1 ^ 2 - 11 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * B / s1 ^ 8 * r1 + 10 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * B / s1 ^ 6 + 4 * sp ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * beta ^ 2 * D / s2 ^ 4 * r1 ^ 2 / s1 ^ 6 - 3 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 2 * D / s2 ^ 4 * r1 / s1 ^ 4 + 8 * sp ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * beta * E / s1 ^ 8 / s2 ^ 2 * r1 ^ 2 - 14 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * E / s1 ^ 6 / s2 ^ 2 * r1 + 6 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * E / s1 ^ 4 / s2 ^ 2
    dlds1s2[i] <- 2 * sp ^ 4 * r1 / s1 ^ 3 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * r2 * beta ^ 2 / s2 ^ 3 + 4 * sp ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * B / s1 ^ 7 * r1 * r2 * beta ^ 2 / s2 ^ 3 - 4 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * B / s1 ^ 5 * r2 * beta ^ 2 / s2 ^ 3 + 4 * sp ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * beta ^ 4 * D / s2 ^ 7 * r1 / s1 ^ 3 * r2 - 4 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 2 * D / s2 ^ 5 * r1 / s1 ^ 3 + 8 * sp ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * beta ^ 3 * E / s1 ^ 5 / s2 ^ 5 * r1 * r2 - 4 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta * E / s1 ^ 5 / s2 ^ 3 * r1 - 4 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 3 * E / s1 ^ 3 / s2 ^ 5 * r2 + 4 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * E / s1 ^ 3 / s2 ^ 3
    dlds22[i] <- r2 / s2 ^ 2 - 3 * sp ^ 2 * r2 * beta ^ 2 / s2 ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) + 2 * sp ^ 4 * r2 ^ 2 * beta ^ 4 / s2 ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 + 4 * sp ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * B / s1 ^ 4 * r2 ^ 2 * beta ^ 4 / s2 ^ 6 - 3 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * B / s1 ^ 4 * r2 * beta ^ 2 / s2 ^ 4 - 3 * C / s2 ^ 4 + 4 * sp ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * beta ^ 6 * D / s2 ^ 10 * r2 ^ 2 - 11 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 4 * D / s2 ^ 8 * r2 + 10 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta ^ 2 * D / s2 ^ 6 + 8 * sp ^ 6 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 3 * beta ^ 5 * E / s1 ^ 2 / s2 ^ 8 * r2 ^ 2 - 14 * sp ^ 4 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) ^ 2 * beta ^ 3 * E / s1 ^ 2 / s2 ^ 6 * r2 + 6 * sp ^ 2 / (1 + sp ^ 2 * (r1 / s1 ^ 2 + r2 * beta ^ 2 / s2 ^ 2)) * beta * E / s1 ^ 2 / s2 ^ 4
  }

  # Second partial derivatives
  dldmu2 <- sum(dldmu2)
  dldmualpha <- sum(dldmualpha)
  dldmubeta <- sum(dldmubeta)
  dldmusp <- sum(dldmusp) 
  dldmus1 <- sum(dldmus1)
  dldmus2 <- sum(dldmus2)
  dldalpha2 <- sum(dldalpha2)
  dldalphabeta <- sum(dldalphabeta)
  dldalphasp <- sum(dldalphasp)
  dldalphas1 <- sum(dldalphas1)
  dldalphas2 <- sum(dldalphas2)
  dldbeta2 <- sum(dldbeta2)
  dldbetasp <- sum(dldbetasp)
  dldbetas1 <- sum(dldbetas1)
  dldbetas2 <- sum(dldbetas2)
  dldsp2 <- sum(dldsp2)
  dldsps1 <- sum(dldsps1)
  dldsps2 <- sum(dldsps2)
  dlds12 <- sum(dlds12)
  dlds1s2 <- sum(dlds1s2)
  dlds22 <-sum(dlds22)
  
  # Create the matrix
  J <- matrix(0, nrow = 6, ncol = 6)
  J[1,] <- c(dldmu2, dldmualpha, dldmubeta, dldmusp, dldmus1, dldmus2)
  J[2,] <- c(dldmualpha, dldalpha2, dldalphabeta, dldalphasp, dldalphas1, dldalphas2)
  J[3,] <- c(dldmubeta, dldalphabeta, dldbeta2, dldbetasp, dldbetas1, dldbetas2)
  J[4,] <- c(dldmusp, dldalphasp, dldbetasp, dldsp2, dldsps1, dldsps2)
  J[5,] <- c(dldmus1, dldalphas1, dldbetas1, dldsps1, dlds12, dlds1s2)
  J[6,] <- c(dldmus2, dldalphas2, dldbetas2, dldsps2, dlds1s2, dlds22)
  
  return(-J) #change the sign since we want the negative of the expected value instead of the expected value
}


############################
## P(agreement) Gradients ##
############################

delta.method <- 
  function(delta, p, alpha, beta, s1, s2){
  # Function that calculates the gradient vector for theta(p)
  D <- matrix(0, nrow = 6)
  # Derivative with respect to mu
  D[1,] <- 0
  # Derivative with respect to alpha
  D[2,] <- -pi ^ (-0.1e1 / 0.2e1) * exp(-(p * beta + alpha - delta - p) ^ 2 / (s1 ^ 2 + s2 ^ 2) / 2) * sqrt(2) * (s1 ^ 2 + s2 ^ 2) ^ (-0.1e1 / 0.2e1) / 2 + pi ^ (-0.1e1 / 0.2e1) * exp(-(p * beta + alpha + delta - p) ^ 2 / (s1 ^ 2 + s2 ^ 2) / 2) * sqrt(2) * (s1 ^ 2 + s2 ^ 2) ^ (-0.1e1 / 0.2e1) / 2
  # Derivative with respect to beta
  D[3,] <- -pi ^ (-0.1e1 / 0.2e1) * exp(-(p * beta + alpha - delta - p) ^ 2 / (s1 ^ 2 + s2 ^ 2) / 2) * sqrt(2) * p * (s1 ^ 2 + s2 ^ 2) ^ (-0.1e1 / 0.2e1) / 2 + pi ^ (-0.1e1 / 0.2e1) * exp(-(p * beta + alpha + delta - p) ^ 2 / (s1 ^ 2 + s2 ^ 2) / 2) * sqrt(2) * p * (s1 ^ 2 + s2 ^ 2) ^ (-0.1e1 / 0.2e1) / 2
  # Derivative with respect to sp
  D[4,] <- 0
  # Derivative with respect to s1
  D[5,] <- pi ^ (-0.1e1 / 0.2e1) * exp(-(p * beta + alpha - delta - p) ^ 2 / (s1 ^ 2 + s2 ^ 2) / 2) * sqrt(2) * (p * beta + alpha - delta - p) * (s1 ^ 2 + s2 ^ 2) ^ (-0.3e1 / 0.2e1) * s1 / 2 - pi ^ (-0.1e1 / 0.2e1) * exp(-(p * beta + alpha + delta - p) ^ 2 / (s1 ^ 2 + s2 ^ 2) / 2) * sqrt(2) * (p * beta + alpha + delta - p) * (s1 ^ 2 + s2 ^ 2) ^ (-0.3e1 / 0.2e1) * s1 / 2
  # Derivative with respect to s2
  D[6,] <- pi ^ (-0.1e1 / 0.2e1) * exp(-(p * beta + alpha - delta - p) ^ 2 / (s1 ^ 2 + s2 ^ 2) / 2) * sqrt(2) * (p * beta + alpha - delta - p) * (s1 ^ 2 + s2 ^ 2) ^ (-0.3e1 / 0.2e1) * s2 / 2 - pi ^ (-0.1e1 / 0.2e1) * exp(-(p * beta + alpha + delta - p) ^ 2 / (s1 ^ 2 + s2 ^ 2) / 2) * sqrt(2) * (p * beta + alpha + delta - p) * (s1 ^ 2 + s2 ^ 2) ^ (-0.3e1 / 0.2e1) * s2 / 2
  return(D)
}


######################
## Modified QQ-Plot ##
######################

mod.qqplot <- function(data, n){ 
  ## This function creates a modified QQ-plot as described in Stevens et al. (2015) but for unbalanced repeated measurements
  ## Inputs:
  #  data = a dataframe with four columns (Subject, MS, Repeat, Measurements) and N = sum{r_ij} rows. The entries of Measurements
  #         correspond to the repeated measurements by the given system on each of the n subjects. Note that the observations 
  #         are ordered by subject with repeated measurements appearing consecutively with MS1 measurements appearing first and MS2 second
  #     n = the number of subjects used in the study
  
  ## Sample call: mod.qqplot(data = rr, n = 21)
  
  mm1 <- rep(0, n) # vector of subject-means (MS1)
  mm2 <- rep(0, n) # vector of subject-means (MS2)
  for(i in 1:n){
    mm1[i] <- mean(subset(data, (subset = data$Subject==i & data$MS==1))$Measurements)
    mm2[i] <- mean(subset(data, (subset = data$Subject==i & data$MS==2))$Measurements) 
  }
  mu1 <- mean(mm1)
  mu2 <- mean(mm2)
  sd1 <- sd(mm1)
  sd2 <- sd(mm2)
  rand.norm1 <- matrix(0, nrow = n, ncol = 50)
  rand.norm2 <- matrix(0, nrow = n, ncol = 50)
  for (i in 1:50){
    rand.norm1[,i] <- rnorm(n, mu1, sd1)
    rand.norm2[,i] <- rnorm(n, mu2, sd2)
  }
  par(mfrow=c(1,2))
  # MS1
  blackpts1 <- qqnorm(mm1, ylim = range(mm1, rand.norm1), main = "QQ-Plot of MS1 Part Averages versus Standard Normal", pch = 19, cex = 0.75)
  for (i in 1:50){
    graypts1 <- qqnorm(rand.norm1[,i], plot.it = FALSE)
    points(graypts1, col = "grey87", cex = 0.75)
  }
  points(blackpts1, col = "black", pch = 19, cex = 0.75) 
  qqline(mm1, col = "red")
  # MS2
  blackpts2 <- qqnorm(mm2, ylim = range(mm2, rand.norm2), main = "QQ-Plot of MS2 Part Averages versus Standard Normal", pch = 19, cex = 0.75)
  for (i in 1:50){
    graypts2 <- qqnorm(rand.norm2[,i], plot.it = FALSE)
    points(graypts2, col = "grey87", cex = 0.75)
  }
  points(blackpts2, col = "black", pch = 19, cex = 0.75) 
  qqline(mm2, col = "red")
}


########################
## Repeatabiltiy Plot ##
########################

repeat.plot <- function(data, n){
  ## This function creates a repeatability plot as described in Stevens et al. (2015) but for unbalanced repeated measurements
  ## Inputs:
  #  data = a dataframe with four columns (Subject, MS, Repeat, Measurements) and N = sum{r_ij} rows. The entries of Measurements
  #         correspond to the repeated measurements by the given system on each of the n subjects. Note that the observations 
  #         are ordered by subject with repeated measurements appearing consecutively with MS1 measurements appearing first and MS2 second
  #     n = the number of subjects used in the study
  
  ## Sample call: repeat.plot(data = rr, n = 21)
  
  mm1 <- rep(0, n) # vector of subject-means (MS1)
  mm2 <- rep(0, n) # vector of subject-means (MS2)
  r1vec <- rep(0, n) # vector of repeated measurement numbers (MS1)
  r2vec <- rep(0, n) # vector of repeated measurement numbers (MS2)
  for(i in 1:n){
    mm1[i] <- mean(subset(data, subset = (data$Subject==i & data$MS==1))$Measurements)
    mm2[i] <- mean(subset(data, subset = (data$Subject==i & data$MS==2))$Measurements)
    r1vec[i] <- max(subset(data, subset = (data$Subject==i & data$MS==1))$Repeat)
    r2vec[i] <- max(subset(data, subset = (data$Subject==i & data$MS==2))$Repeat)
  }
  mm1 <- rep(mm1, r1vec) # vector of MS1 subject-means repeated r_i1 times
  res1 <- data[data$MS==1,]$Measurements - mm1
  mm2 <- rep(mm2, r2vec) # vector of MS1 subject-means repeated r_i2 times
  res2 <- data[data$MS==2,]$Measurements - mm2
  par(mfrow=c(1,2))
  plot(mm1, res1, pch = 19, cex = 0.5, main = "MS1 Residual Measurements versus Subject-Average", ylab = "Observed Residuals", xlab = "Subject Averages", xlim = c(min(mm1,mm2),max(mm1,mm2)), ylim = c(min(res1,res2),max(res1,res2)))
  abline(h = 0, col = "red")
  plot(mm2, res2, pch = 19, cex = 0.5, main = "MS2 Residual Measurements versus Subject-Average", ylab = "Observed Residuals", xlab = "Subject Averages", xlim = c(min(mm1,mm2),max(mm1,mm2)), ylim = c(min(res1,res2),max(res1,res2)))
  abline(h = 0, col = "red")
}


####################################
## Repeated Measures Scatter Plot ##
####################################

rep.scatter.plot <- function(data, n){
  Y1 <- subset(data, subset = data$MS==1)$Measurements
  Y2 <- subset(data, subset = data$MS==2)$Measurements
  sub_means1 <- rep(0, n)
  sub_means2 <- rep(0, n)
  for(i in 1:n){
    sub_i1 <- subset(data, subset = (data$Subject==i & data$MS==1))
    sub_means1[i] <- mean(sub_i1$Measurements)
    sub_i2 <- subset(data, subset = (data$Subject==i & data$MS==2))
    sub_means2[i] <- mean(sub_i2$Measurements)
  }
  par(mfrow = c(1,1))
  plot(sub_means1, sub_means2, ylim = c(min(sub_means1, sub_means2), max(sub_means1, sub_means2)), xlim = c(min(sub_means1, sub_means2), max(sub_means1, sub_means2)), xlab = "MS1 Subject Averages", ylab = "MS2 Subject Averages", main = "Repeated Measures Scatterplot", pch = 16)
  abline(a = 0, b = 1, col = "red", lty = 2)
}


###########################
## Ordinary Scatter Plot ##     (this only works when number of measurements per subject is the same across devices)
###########################

scatter.plot <- function(data, n){
  Y1 <- subset(data, subset = data$MS==1)$Measurements
  Y2 <- subset(data, subset = data$MS==2)$Measurements
  par(mfrow = c(1,1))
  plot(Y1, Y2, ylim = c(min(Y1, Y2), max(Y1, Y2)), xlim = c(min(Y1, Y2), max(Y1, Y2)), xlab = "MS1 Readings", ylab = "MS2 Readings", main = "Scatterplot", pch = 16)
  abline(a = 0, b = 1, col = "red", lty = 2)
}


###########################
## P(agreement) Function ##
###########################

prob.agree <- function(n, delta, data){
  ## This function conducts the Probability of Agreement analysis for the 
  ## comparison of two homoscedastic measurement systems with possibly
  ## unbalanced measurements.
  
  ## Sample call: prob.agree(n = 21, delta = 5, data = rr)
  
  ## Inputs:
  # * n = the number of parts/subjects being measured by each system
  # * delta = the the maximum allowable difference between measurement by two systems
  # * data = a dataframe with four columns (Subject, MS, Repeat, Measurements) and N = sum{r_ij} rows. The entries of Measurements
  #   correspond to the repeated measurements by the given system on each of the n subjects. Note that the observations 
  #   are ordered by subject with repeated measurements appearing consecutively with MS1 measurements appearing first and MS2 second
  
  
  # Make sure the correct inputs are inputted
  if(is.null(n) == TRUE){
    print("You must enter the number of subjects, n.")
  }else if(is.null(delta) == TRUE){
    print("You must define the clinically acceptable difference by inputing delta.")
  }else if(is.null(data) == TRUE){
    print("You must enter data for both measurement systems.")
  }else{ # Function inputs are correct, therefore we can proceed
    
    # Get initial guesses for parameters (needed for likelihood maximization)
    Y1 <- subset(data, subset = data$MS==1)$Measurements
    Y2 <- subset(data, subset = data$MS==2)$Measurements
    sub_means1 <- rep(0, n)
    sub_means2 <- rep(0, n)
    sub_vars1 <- rep(0, n)
    sub_vars2 <- rep(0, n)
    r1vec <- rep(0, n)
    r2vec <- rep(0, n)
    for(i in 1:n){
      sub_i1 <- subset(data, subset = (data$Subject==i & data$MS==1))
      sub_means1[i] <- mean(sub_i1$Measurements)
      sub_vars1[i] <- var(sub_i1$Measurements)
      r1vec[i] <- max(sub_i1$Repeat)
      sub_i2 <- subset(data, subset = (data$Subject==i & data$MS==2))
      sub_means2[i] <- mean(sub_i2$Measurements)
      sub_vars2[i] <- var(sub_i2$Measurements)
      r2vec[i] <- max(sub_i2$Repeat)
    }
    muinit <- mean(c(Y1,Y2)) # overall mean for both MS observations
    alphainit <- mean(Y2) - mean(Y1)
    betainit <- 1
    s1init <- sqrt(mean(sub_vars1)) # repeatability for MS1 (pooled across parts)
    s2init <- sqrt(mean(sub_vars2)) # repeatability for MS2 (pooled across parts)
    spinit <- mean(c(sqrt(var(Y1) - s1init^2), sqrt(var(Y2) - s2init^2)))
    
    # Saving initial parameters in matrix
    init.param <- t(as.matrix(c(muinit,alphainit,betainit,spinit,s1init,s2init)))
    
    # Maximize the Likelihood
    options(warn = -1)
    maximize <- optim(init.param, fn = log.likelihood, gr = gradient, n, r1vec, r2vec, data, method = "BFGS", hessian = T)
    
    # Get MLEs and standard errors 
    param.hat <- as.matrix(t(maximize$par)) # save parameters estimates in matrix
    muhat <- param.hat[1]
    alphahat <- param.hat[2]
    betahat <- param.hat[3]
    sphat <- param.hat[4]
    s1hat <- param.hat[5]
    s2hat <- param.hat[6]
    
    #Get Inverse of Fisher Information to get Assymptotic Variances
    I <- fisher.info(param.hat, n, r1vec, r2vec)
    inverseI <- solve(I) 
    
    # Square root of diagonal of Inverse Fisher information matrix: find Standard Errors
    standard.errors <- as.matrix(sqrt(diag(inverseI)))   
    # standard.errors <- as.matrix(sqrt(diag(solve(maximize$hessian))))   
    
    # Construct estimate of P(agreement) and get its standard error for CIs
    p <- seq(muhat-3*sphat, muhat+3*sphat, 0.1)
    thetahat <- matrix(0, nrow = length(p))
    SEtheta <- matrix(0, nrow = length(p))
    for(i in 1:length(p)){
      k1hat <- (-delta-alphahat-p[i]*(betahat-1))/(sqrt((s1hat^2)+(s2hat^2)))
      k2hat <- (delta-alphahat-p[i]*(betahat-1))/(sqrt((s1hat^2)+(s2hat^2)))
      thetahat[i] <- pnorm(k2hat)-pnorm(k1hat) # estimated theta (evaluated at MLEs)
      D <- delta.method(delta = delta,p = p[i], alpha = alphahat, beta = betahat, s1 = s1hat,s2 = s2hat)
      SEtheta[i] <- sqrt(t(D) %*% inverseI %*% D) # Standard error of the probability of agreement
    }
    LCL <- thetahat-1.96*SEtheta
    LCL[LCL < 0] = 0
    UCL <- thetahat+1.96*SEtheta
    LCL[LCL > 1] = 1
    
    
    # Probability of Agreement plot
    
    par(mfrow=c(1,1))
    
    plot(
      p, 
      thetahat, 
      main = bquote(
        plain("Probability of Agreement Plot with ") ~ delta == .(delta)), 
        ylim = c(0,1), 
        col = "red", 
        type = "l", 
        ylab = "θ(s)", # Probability of Agreement (thetahat)
        xlab = "s"     # MLE of true value
        )
    points(
      p, 
      thetahat - 1.96 * SEtheta, 
      type = "l", 
      col = "blue", 
      lty = 2
      )
    points(
      p, 
      thetahat + 1.96 * SEtheta, 
      type = "l", 
      col = "blue", 
      lty = 2
      )
    labels <- 
      c("θ(s)","95% CI")
    legend(
      "bottomright", 
      inset = 0.02, 
      labels, 
      col = c("red", "blue"), 
      lty = c(7,2)
      )
  }
  
  # Order of return: (mu, alpha, beta, sp, s1, s2)
  return(
    list(
      Estimates = param.hat, 
      St.Errors = standard.errors, 
      PAgree = data.frame(
        True.Val = p, 
        PA = thetahat,
        SE = SEtheta)
        )
    )
  }


## Construct relevant plots:

## Scatter plot (ignoring repeated measurements)
scatter.plot(data = rr, n = n_subj )

## Scatter plot (accounting for repeated measurements)
rep.scatter.plot(data = rr, n = n_subj)

## Repeatability plot
repeat.plot(data = rr, n = n_subj )

## Modified QQ-plot
mod.qqplot(data = rr, n = n_subj )

## Probability of Agreement analysis:
p <- prob.agree(n = n_subj , delta = delta, data = rr)

## Inspect parameter estimates and standard errors. (Order of return: mu, alpha, beta, sigma_s, sigma_1, sigma_2).
p$Estimates
p$St.Errors


## Inspect Probability of Agreement estimates.
p$PAgree

# - Chunk_6                                                                  - #
# - Requires:  * R-libraries and object data (libraries_and_data),           - #
# -            * p$PAgree (chunk_5)                                          - #


## Probability agreement (NT Stevens, 2019)
#  customized agreement plot

p$PAgree %>% 
  ggplot(
    aes(
      x = True.Val,
      y = PA
      ) 
    ) +
  geom_ribbon(
    aes(
    ymin = PA - 1.96 * SE,
    ymax = PA + 1.96 * SE
    ),
    color = "lightgrey",
    alpha = 0.2
    ) +
  geom_line() + 
  scale_y_continuous(
    limits = c(0, 1.05)                  # adjust upper limit so that all of 95%CI band is included
  ) +
  xlab("True value (ML estimate)") +
  ylab("Probability of Agreement") +
  labs(
    title = paste ("Probability of Agreement \nbetween manual and electro measurements"),
    subtitle = paste ("Maximum allowed inter-rater difference:", paste(delta)),
    caption = "Code: Parker RA et al., BMC Med Res Methodol 2020; 20:154"
    ) +
  theme_light() +
  theme(
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5)
    )

4.3. Decomposition of accuracy, precision, and agreement, with statistical tests for equivalence

Electro and manual method appear to have good equivalence according to Bland-Altman and Probability of Agreement plots. These analyses facilitate decision based on clinical relevance but lack a clear null hypothesis on method equivalence. Recently, a novel strategy was proposed enabling statistical decision-making. Causes of non-equivalence are identified in three steps, decomposing accuracy (constant bias), precision (proportional bias), and agreement ⁶.

# - Chunk_7                                                           - #
# - Requires:  * R-libraries and object data (libraries_and_data),    - #
# -            * data_replMean (chunk_4b)                             - # 

## Assumption: manual is the new/putative method, electro is the reference

# Optional adjustment of new method for bias, using parameters obtained from Deming regression with the original data (see "regression" plot obtained with original data)

# C <- 1.3982
# m <- 0.9908

# default
C <- 0
m <- 1

## Run eirasagree

data_replMean %>% 
  filter(Replicate == "1st") %>% 
    select(
      Id, 
      Method, 
      replicate_Mean         # use "y" instead of "replicate_Mean" if raw replicated measurements should be used 
      ) %>%
    pivot_wider(                
        names_from = "Method", 
        values_from = "replicate_Mean"
    ) %>%
  mutate(electro = (C + Electro * m) ) %>%
  AllStructuralTests(
    reference.cols = {grep("Electro", colnames(.) )},
    newmethod.cols = {grep("Manual", colnames(.) )},
    alpha = 0.05,
    out.format = "txt",
    B = 5e3
    )

Accuracy In method comparison, accuracy assesses whether one measurement method consistently over- or under-estimates the true value compared to a reference method. For this test, differences between measurements obtained from the same subjects (yi − xi) are charted against the centered values of the reference measurement (xi − mean(x)). A simple linear regression is applied. Under the null hypothesis of method equivalence the y-intercept, representing the mean difference between measurements, is 0. The null hypothesis is rejected if bootstrapping shows 0 outside the 95% confidence interval for the y-intercept.

The bias in accuracy can be corrected by translating lines according to the amount of bias computed, thereby positioning them inside the confidence band obtained by bootstrapping.

Graphically, the regression intercept is the mean of y − x and located where the line crosses the y axis. The null hypothesis, no difference between the techniques, is satisfied when the confidence interval includes the origin of the graph (0, 0). If bootstrapping shows (0, 0) outside the 95% confidence interval, the null hypothesis is rejected.

Precision The slope of the Bland-Altman plot is used to detect unequal precision, as regression of 𝑦 −𝑥 against 𝑥 +𝑦 will not be 0 when the true ratio of variability of measurement errors λ ≠ 1. If a horizontal line cannot be fitted into the 95% confidence band defined by the functional regression, the null hypothesis of equal variability of measurement errors is rejected.

The graph for this test resembles the Bland-Altman plot. For a precision test, the null hypothesis is not rejected when a horizontal line shifted by the bias can fit into the 95% confidence band.

Agreement This test applies the Deming regression to verify if two measurement techniques measure the same values in the same subjects […] # – depends on λ, estimated in the previous step, to compute the true X and Y values before the computation of the regression estimates. When the value of lambda is not assumed to be 1, it affects the band width. Analytically, the null hypothesis is rejected if α ≠ 0 and β ≠ 1. Since these two parameters are estimated together, the Bonferroni correction is applied to control the probability of a type I error and preserve test power (effective significance level is 2.5%). Graphically, two alternative statistical approaches were implemented for the bisector line agreement: the assessments of the 95% prediction ellipse and of the 95% confidence band of regression, both done by bootstrapping. In the first one, the null hypothesis is to be rejected if (β, α) is not inside the ellipse, in the second one, if the bisector line cannot fit inside the band. These methods test intercept and slope together and provide stronger statistical power than an independent assessment of these two factors.

In the bisector agreement test, non-rejection of the null hypothesis occurs when the lines that are parallel to the bisector line, translated by the bias range, can fit into the 95% bootstrapping confidence regression band.

Analytically, the null hypothesis is rejected if the Deming regression line intercept α ≠ 0 and slope β ≠ 1. Graphically, two alternative statistical approaches were implemented for the bisector line agreement: the assessments of the 95% prediction ellipse and of the 95% confidence band of regression, both done by bootstrapping. In the first one, the null hypothesis is to be rejected if (β, α) is not inside the ellipse, in the second one, if the bisector line cannot fit inside the 95% confidence interval of the Deming regression.

Slope and intercept of the regression can also be used to adjust methods methods for, respectively, proportional and constant bias relative to each other. Thus, in some cases method equivalence may be achieved by a simple data transformation.

5. Adjusting electro measurements for bias improves method agreement

5.1. Comparing the means, variances and repeatabilities of adjusted replicate measurements

# - Chunk_8                                                                  - #
# - Requires:  * R-libraries and data (libraries_and_data)                   - #
# -            * replicates, std (chunk_2)                                   - # 
# - parameters for bias of new versus reference method (chunk_7)             - #                  
# Adjustment parameters: manual = C + m * electro                            - #

C <- 1.3951  # constant bias  
m <- 0.9904  # proportional bias

# default
# C <- 0
# m <- 1


## Adjust data for bias of the new method and alculate intra-subject replicate means

data_replMean_adj <- 
  data %>%  
  mutate(
    Value = case_when(
      Method == "Electro" ~ Value * m + C, 
      Method == "Manual" ~ Value)
    ) %>%  
  pivot_wider(
    names_from = "Replicate",
    values_from = "Value") %>% 
  rowwise() %>% 
  mutate(
      replicate_Mean = mean(
        c_across(
          all_of(replicates)
          )
        ),
      replicate_SD = sd(
        c_across(
          all_of(replicates) 
          )
        ),
      replicate_SE = std(
        c_across(
          all_of(replicates) 
          )
        )
      ) %>% 
  pivot_longer(
    cols = all_of(replicates),
    names_to = "Replicate",
    values_to = "Value"
  ) %>% 
  mutate(
    Replicate = factor(Replicate)
    )

# Table with summary statistics of the original sample data

data_replMeanMean_adj <- 
  data_replMean_adj %>%  
  { rbind(
      group_by(., Method, Replicate) %>%     # for each method and replicate, calculate mean and SE across subjects
      summarise(
        Mean = mean(Value),
        SD = sd(Value),
        SE = std(Value)
        ),
      group_by (., Method) %>%                # for each method, calculate means across subjects for mean and SE across replicates 
      summarise(
        Mean = mean(replicate_Mean),
        SD = sd(replicate_Mean),
        SE = std(replicate_Mean)
        ) %>% 
        mutate(Replicate = "replicate mean")
      )
  } %>% 
  mutate(                                     # vertical offset for error bars and their means in the plots
    errorbar_pos = rep(- 0.06, n()),
    Replicate = factor(Replicate)
    ) 


## Plots 

# Distribution, mean and SE for individual replicates and their means 

p5A <- 
data_replMean_adj %>% 
  bind_rows(
    data_replMean_adj %>%
    filter(Replicate == "1st") %>%                 # select a unique row for each subject's replicate mean
    mutate(
    .keep = "unused",
    Replicate = "replicate mean",
    Value = replicate_Mean)
  ) %>%
  select(Id, Replicate, Method, Value) %>%            
  ggplot() +
  geom_density(                             # show smoothed distribution of measurement values
    aes(
      Value*0.7, 
      fill = Method
      ),
    alpha = 0.3,
    color = "grey", 
    position = position_nudge(y = 0.02)
    ) +
  geom_point_interactive(                   # show data points of all subjects
    aes(
      x = Value,
      y = -0.02,                            # defines vertical position of the data point rows in the facets
      color = Method,
      data_id = Id,
      tooltip = Id
      ),
    size = 3,
    alpha = 0.5
    ) +
  geom_errorbarh (                          # show SE bar
    data = data_replMeanMean_adj,
    aes(
      xmin = Mean - SE, 
      xmax = Mean + SE,
      y = errorbar_pos - 0.025,
      color = Method
      ),
      height = 0.025                       # height of bar caps
    ) +
  geom_errorbarh (                         # show SD bar
    data = data_replMeanMean_adj,
     aes(
      xmin = Mean - SD, 
      xmax = Mean + SD,
      y = errorbar_pos,
      color = Method
      ),
      height = 0.04                        # height of bar caps
    ) +
  geom_point(                              # add means to SE bars
    data = data_replMeanMean_adj,
     aes(
      x = Mean,             
      y = errorbar_pos,
      color = Method
      ),
      shape = 18,
      size = 3
    ) +  
  facet_wrap(
    .~ Replicate, 
    dir = "v", 
    strip.position = "left",
    nrow = 4
    ) +
  labs(
    title = "Inter-method comparison of mean, SD and SE",
    y = "Replicate",
    x = "Measurement",
    tag = "A"
    ) +
  scale_fill_manual(values = palette_1) +
  scale_color_manual(values = palette_1) +
  guides(
    fill = guide_legend(title="Method"),
    color = guide_legend(title="Method")
    ) +
  theme_light() +
  theme(
    plot.title = element_text(
      size = 12,
      hjust = 0.5),
    axis.text.y = element_blank(), 
    strip.background = element_rect(
      fill = "white", 
      color = "darkgrey"
      ), 
    axis.ticks.y = element_blank(),
    strip.text.y = element_text(
      color = "black",
      size = 12
      )
    ) 


# Consistency of results over repeated measurements

p5B <- 
data_replMeanMean_adj %>% 
  filter (!Replicate == "replicate mean") %>% 
  mutate(
    Replicate = as.numeric(Replicate), 
    SEup = Mean + SE,
    SElow = Mean - SE
    ) %>% 
  select(!c(errorbar_pos, SD, SE) ) %>% 
  ggplot(
    aes(
      x = as.numeric(Replicate), 
      y = Mean, 
      color = Method, 
      fill = Method
      )
    ) + 
  geom_ribbon(
    aes(ymin =SElow, ymax = SEup), 
    linetype = 0, 
    alpha = 0.2
    ) +
  geom_line() + 
  scale_fill_manual(values = palette_1) +
  scale_color_manual(values = palette_1) +
  scale_x_continuous(
    breaks = 1:3,
    labels = levels(data$Replicate)
    ) +
  labs(
    title = "Repeatability over consecutive 
    replicate measurements",
    x = "Replicate",
    y = "Mean +/- SE",
    tag = "B"
    ) +
  theme_light() +
  theme(
    plot.title = element_text(
      size = 12,
      hjust = 0.5
      ),
    legend.position = "none"
    )

         

# Variance versus value of replicate means

p5C <- 
data_replMean_adj %>%
  filter(Replicate == "1st") %>%             # select a unique row for each subject's replicate mean (avoid generating one row for each replicate)
  ggplot(
    aes(
      x = replicate_Mean,
      y = replicate_SD,
      color = Method,
      fill = Method
      )
    ) +
  geom_point_interactive(             # show data points of for replicate means of all subjects
    aes(
      data_id = Id,
      tooltip = Id
      ),
    size = 3,
    alpha = 0.5
    ) +
  geom_smooth(
    lwd = 0.1, 
    alpha = 0.1
    ) +
  scale_color_manual(values = palette_1) +
  scale_fill_manual(values = palette_1) +
  labs(
    title = "Replicate variance versus mean",
    caption =  "Hover over points to highlight individual patients",
    x = "Replicate mean",
    y = "SD",
    tag = "C"
    ) +
  theme_light() +
  theme(
    plot.title = element_text(
      size = 12,
      hjust = 0.5),
    plot.caption = element_text(
      size = 9,
      hjust = 0.5),
    legend.position = "none"
    ) 



# Deviation of individual measurements from replicate means
# (alternative to variance versus value plot, p9C) 

p5D <- 
  data_replMean_adj %>%
  mutate(Residuals = Value - replicate_Mean) %>% 
  ggplot(
    aes(
      x = Value,
      y = Residuals,
      color = Method,
      fill = Method
      ) 
  ) +
  geom_point_interactive(
    aes(
      data_id = Id,
      tooltip = Id      
      ),
    size = 2,
    alpha = 0.5
    ) +
  geom_hline(
    aes(yintercept = 0),
    color = "darkgrey"
    ) +
  geom_ysidedensity(
    aes(y = Residuals),
    alpha = 0.1
    ) +     
  scale_color_manual(values = palette_1) +
  scale_fill_manual(values = palette_1) +
  labs(
    title = "Deviation from replicate mean",
    x = "Measurement",
    y = "Measurement - Replicate mean",
    tag = "D"
    ) +
  theme_light() +
  theme(
    plot.title = element_text(
      size = 12,
      hjust = 0.5
      ),
    axis.text.y = element_text(
      size = 9),
    legend.position = "none"
    ) +
  theme_ggside_void()


# Combined plot layout

design <- "AAAAA#BBBB
           AAAAA#BBBB
           AAAAA#BBBB
           AAAAA#####
           AAAAA#####
           AAAAA##E##
           AAAAA#####
           AAAAA#####
           CCCCC#DDDD
           CCCCC#DDDD
           CCCCC#DDDD"

girafe(
  ggobj = 
    p5A + 
    p5B + 
    p5C +
    p5D +
    guide_area() +
    plot_annotation(
      title = "Distributions and means of measurements,
      electro method adjusted for bias",
      theme = theme(
        plot.title = element_text(
          size = 14, 
          hjust = 0.5
          )
        )
      ) +
    plot_layout(
      design = design,
      guides = "collect"
      ),
  height_svg = 11,
  width_svg = 9,
  options = list(
    opts_hover(css = "stroke:black; stroke-width:2px;"),  # CSS code of selected line
    opts_hover_inv(css = "opacity:0.1;"),                 # CSS code of all other lines
    opts_sizing(rescale = FALSE)                              # allows to zoom in and out of saved html image
    )
  )

# Display summary table

data_replMeanMean_adj %>% 
  select(!errorbar_pos) %>% 
  kable(., "html") %>% 
  kable_styling( 
    bootstrap_options = c(
      "striped", "hover", "responsive", full_width = FALSE
      ) 
    )

Method	Replicate	Mean	SD	SE
Manual	1st	1.586207	7.218613	1.3404628
Manual	2nd	1.482759	7.471951	1.3875064
Manual	3rd	1.241379	7.452972	1.3839822
Electro	1st	1.395100	7.018176	1.3032426
Electro	2nd	1.429252	7.409030	1.3758223
Electro	3rd	1.497555	7.079529	1.3146354
Manual	replicate mean	1.436782	7.263108	0.7786869
Electro	replicate mean	1.440636	7.041476	0.7549254

Sample and sample/replicate means become almost identical when measurements with the electro method are adjusted for bias.

5.2. Bland-Altman plot with adjusted measurements

# - Chunk_9                                                                  - #
# - Requires:  * R-libraries and object data (libraries_and_data)            - #
# -            * calculate_ba_coefficient (chunk_4a)                         - # 
# -            * LoA_CIcoeffs, SD_nested, palette_1 (chunk_4b)                 - #
# -            * data_replMean_adj (chunk_8)                                 - # 


# Differences and means of measurements with the two methods for each sample and replicate

data_BlAltm_adj <- 
  data_replMean_adj %>% 
    select(Id, Replicate, Method, Value) %>%
    pivot_wider(                
        names_from = "Method", 
        values_from = "Value"
    ) %>%   
  mutate(
    Method_mean = (Manual + Electro)/2,
    Method_diff = Manual - Electro
    )


# Inter-replicate means and SDs for differences and means of measurements for each sample

data_BlAltm_replMeans_adj <- 
  data_BlAltm_adj %>% 
  select(Id, Replicate, Method_mean, Method_diff) %>% 
  split(.$Id) %>% 
  lapply(
    function(x) summarise(x,  
      Mean_methodMean = mean(Method_mean), 
      SD_methodMean = sd(Method_mean), 
      Mean_methodDiff = mean(Method_diff), 
      SD_methodDiff = sd(Method_diff)
      )
    ) %>% 
  list_rbind(names_to = "Id") 


data_BlAltm_replLoA_adj <-
  data_BlAltm_adj %>% 
  group_by(Replicate) %>% 
  summarise(
    Mean = mean(Method_diff),
    SD = sd(Method_diff),
    LoA_up = Mean + 1.96* SD,
    LoA_low = Mean - 1.96* SD
    )

#  Summary stats of inter-replicate means and LoAs

data_BlAltm_LoA_replMeansMeans_adj <- 
  data_BlAltm_replMeans_adj %>% 
  summarise(
    Mean = mean(Mean_methodDiff), 
    SD = SD_nested(SD_methodDiff)
    ) %>%
  mutate(
    LoA_up = Mean + 1.96*SD, 
    LoA_low = Mean - 1.96*SD
    ) 

# Table of means and LoAs

LoA_table_adj <- 
  data_BlAltm_LoA_replMeansMeans_adj %>% 
  mutate(
    .before = Mean, 
    Replicate = "replicate mean"
    ) %>% 
  bind_rows(
    data_BlAltm_replLoA_adj,
    .
    ) %>% 
  mutate( 
    LoA_outer_high = Mean + SD * LoA_CIcoeffs[2],
    LoA_outer_low = Mean - SD * LoA_CIcoeffs[2],
    LoA_inner_high = Mean + SD *LoA_CIcoeffs[1],
    LoA_inner_low = Mean - SD * LoA_CIcoeffs[1]
    ) %>% 
  select(!SD)


# BA plot for all samples and replicates

p6A <- 
  data_BlAltm_adj %>% 
  ggplot(
    aes(
      x= Method_mean, 
      color = Replicate
      )
    ) +
  geom_point_interactive(
    aes(
      y= Method_diff,
      group = Id,
      data_id = Id,
      tooltip = Id
      ),
    size = 3,
    alpha = 0.5,
    show.legend = FALSE
    ) +
  
  geom_ysidedensity(                  # Density sideplot
    aes(
      y= Method_diff, 
      fill = Replicate
      ),
    alpha = 0.3
    ) + 
  
  geom_hline_interactive(             # Lines for mean difference and LoAs
    data = LoA_table_adj[1:3,],
    aes(
      data_id = Replicate,
      tooltip = Replicate,
      yintercept = Mean,
      color = Replicate
      ),
    alpha = 0.7
    ) +
  geom_hline_interactive(
    data = LoA_table_adj[1:3,],
    aes(
      data_id = Replicate,
      tooltip = Replicate,
      yintercept = LoA_up,
      color = Replicate
      ),
    alpha = 0.7
    ) +
  geom_hline_interactive(
    data = LoA_table_adj[1:3,],
    aes(
      data_id = Replicate,
      tooltip = Replicate,
      yintercept = LoA_low,
      color = Replicate
      ),
    alpha = 0.7
    ) +

  geom_hline_interactive(             # Lines for CIs of LoAs
    data = LoA_table_adj[1:3,],
    aes(
      data_id = Replicate,
      tooltip = Replicate,
      yintercept = LoA_outer_high,
      color = Replicate
      ),
    linetype = "dashed",
    linewidth = 0.7,
    ) +
  geom_hline_interactive(
    data = LoA_table_adj[1:3,],
    aes(
      data_id = Replicate,
      tooltip = Replicate,
      yintercept = LoA_outer_low,
      linetype = "dashed",
      color = Replicate
      ),
    linetype = "dashed",
    linewidth = 0.7,
    ) +
  geom_hline_interactive( 
    data = LoA_table_adj[1:3,],
    aes(
      data_id = Replicate,
      tooltip = Replicate,
      yintercept = LoA_inner_high,
      color = Replicate
      ),
    linetype = "dotted",
    linewidth = 1,
    ) +
  geom_hline_interactive(
    data = LoA_table_adj[1:3,],
    aes(
      data_id = Replicate,
      tooltip = Replicate,
      yintercept = LoA_inner_low,
      linetype = "dashed",
      color = Replicate
      ),
    linetype = "dotted",
    linewidth = 1,
    ) +
  
  scale_color_manual_interactive(
    values = palette_2)  +
  scale_y_continuous(
    limits = c(
      LoA_table_adj %>% 
      select(!Replicate) %>% 
      range()+ c(-0.5, .5)
      ), 
    breaks = seq(-10, 10, 1)
    ) + 
  labs(
    tag = "A",
    title = "Individual replicate data sets",
    x = "(Manual + Electro) / 2",
    y = "Manual - Electro"    ) +
  theme_light() +
  theme(
    plot.title = element_text(
      size = 11,
      hjust = 0.3
      ),
    legend.position = "none"
    ) +
  theme_ggside_void()



# BA plot for intra-replicate means

p6B <- 
  data_BlAltm_replMeans_adj %>% 
  ggplot(
    aes(
      x= Mean_methodMean, 
      y= Mean_methodDiff,
      data_id = Id,
      tooltip = Id
      )
    ) +
  geom_point_interactive(
    color = "black",
    alpha = 0.3,
    size = 3
    ) +
  geom_ysidedensity(
    inherit.aes = FALSE,
    aes(y = Mean_methodDiff)
    ) +     
  geom_hline(
    data = LoA_table_adj[4,],
    aes(yintercept = LoA_table_adj[4,]$Mean)
    ) +
  geom_hline(
    data = LoA_table_adj[4,],
    aes(yintercept = LoA_table_adj[4,]$LoA_up),
    ) +
  geom_hline(
    data = LoA_table_adj[4,],
    aes(yintercept = LoA_table_adj[4,]$LoA_low)
    ) +
  
  geom_hline(                             # Lines for CIs of LoAs
    data = LoA_table_adj[4,],
    aes(
      yintercept = LoA_outer_high
      ),
    linetype = "dashed",
    linewidth = 0.7,
    ) +
  geom_hline(
    data = LoA_table_adj[4,],
    aes(
      yintercept = LoA_outer_low
      ),
    linetype = "dashed",
    linewidth = 0.7,
    ) +
  geom_hline( 
    data = LoA_table_adj[4,],
    aes(
      yintercept = LoA_inner_high
      ),
    linetype = "dotted",
    linewidth = 1,
    ) +
  geom_hline(
    data = LoA_table_adj[4,],
    aes(
      yintercept = LoA_inner_low
      ),
    linetype = "dotted",
    linewidth = 1,
    ) +
  
  scale_y_continuous(
    limits = c(
      LoA_table_adj %>% 
      select(!Replicate) %>% 
      range() + c(-0.5, .5)
      ), 
    breaks = seq(-10, 10, 1)
    ) + 
  labs(
    tag = "B",
    title = "Averaged replicate data",
    x = "(Manual + Electro) / 2",
    y = "Manual - Electro"
   ) +
  theme_light() +
  theme(
    plot.title = element_text(
      size = 11,
      hjust = 0.5
      ),
    axis.title.y = element_blank(),
    legend.position = "none"
    ) +
  theme_ggside_void()


# Interactive legend for plots

p6_legend <- 
  LoA_table_adj %>%
  select(!c(
    LoA_inner_high, 
    LoA_inner_low)
    ) %>% 
  mutate(
     across(
      !Replicate, 
      function (x) sprintf(
        "%0.2f", round(x, digits = 2) 
        )
      )
    ) %>% 
  rename(
    mean = Mean, 
    high = LoA_up, 
    low = LoA_low,
    `high, \nupper CI`= LoA_outer_high,
    `low, \nlower CI` = LoA_outer_low
    ) %>% 
  mutate(
    .before = everything(), "Limits of Agreement\n\n" = "___", 
    across(everything(), as.character)
    ) %>% 
  rbind(., 
    c(" ", "Replicate", " ", " ", " ", " ", " ")
    ) %>% 
  pivot_longer(
    cols = !c("Replicate"), 
    names_to = "x"
    ) %>% 
  mutate(
    Replicate = factor(
      Replicate, 
      levels = c("replicate mean", "3rd", "2nd", "1st", "Replicate"
      )
    ),
    x = factor(
      x,
      levels = c(unique(x)
      )
    )
  ) %>% 
  ggplot(
    aes(
      x = x, 
      y = Replicate, 
      color = Replicate
      )
    ) + 
  geom_text_interactive(
    aes(
      data_id = Replicate,
      label = paste(value)),
    size = 3.5
    ) +
  scale_color_manual_interactive(
    values = palette_2) +
  scale_x_discrete(position = "top") +
  scale_y_discrete(position = "right") +
  theme_classic()+
  theme(
    aspect.ratio = 0.25,
    axis.line = element_blank(),
    axis.ticks = element_blank(),
    axis.title = element_blank(),
    axis.text = element_text(
      size = 10
      ),
    legend.position = "none"
    )


# Combined plot layout

design <-  "AAAAABBBB
            AAAAABBBB
            AAAAABBBB
            AAAAABBBB
            ##CCCCC##
            ##CCCCC##"

girafe(
  ggobj = 
    p6A + 
    p6B +
    p6_legend +
    plot_layout(
      design = design
      ) +
    plot_annotation(
    title = 
      "Bland-Altman plots with replicate measurements, \nelectro method adjusted for bias",
    caption = "Hover over points to highlight individual patients,
    hover over legend to display mean and limit values for individual replicates",
    theme = theme(
      plot.title = element_text(
        face = "bold",
        size = 12, 
        hjust = 0.5
        ),
      plot.caption = element_text(
        size = 9,
        hjust = 0.5
        )
      )
    ),
  height_svg = 7, width_svg = 9,              # svg settings for saving to individual image to html, omit for knitting
  options = list(
    opts_hover(css = ''),                     # CSS code of selected interactive geoms
    opts_hover_inv(css = "opacity:0.12;"),    # CSS code of non-selected interactive geoms
    opts_sizing(rescale = FALSE)
    )
  )

5.3. Probability of Agreement plots with adjusted measurements

6. Other methods for assessing agreement

Taffe extended the Probability of Agreement test, relaxing a number of assumptions regarding the true distribution of the measured characteristic ⁸. No code or package were mentioned. However, functions for decomposing bias, precision, and agreement were recenty added to the R package MethodCompare ⁹. It would be interesting to compare the results with those of eirasagree ⁶. SimplyAgree is another recently published R package that seems worth a look ¹⁰.

7. Citation

Please quote as: Weissensteiner, Thomas. Computing T-Cell Receptor Recognition Motifs and Their Matches in the Human Proteome. RPubs, 09 Jan. 2025. https://rpubs.com/thomas-weissensteiner/1261902

8. References

Statistical methodology for the concurrent assessment of interrater and intrarater reliability: using goniometric measurements as an example. Eliasziw M, Young SL, Woodbury MG, Fryday-Field K. Phys Ther. 1994;74(8):777-88. https://doi.org/10.1093/ptj/74.8.777
Reliability: What is it, and how is it measured? Bruton A, Conway J, Holgate S. Physiotherapy 2000;86:94–9. https://doi.org/10.1016/S0031-9406(05)61211-4
Bland-Altman method comparison tutorial. Analyse-it Software, Ltd. Version 6.15, published 18-Apr-2023. https://analyse-it.com/docs/tutorials/bland-altman/overview
Exact Parametric Confidence Intervals for Bland-Altman Limits of Agreement. Carkeet, A. Opt Vis Sci 2015: 92(3):e71-e80. https://doi.org/10.1097/OPX.0000000000000513
Assessing agreement between two measurement systems: An alternative to the limits of agreement approach. Stevens NT, Steiner SH, MacKay RJ. Stat Methods Med Res. 2017;26(6):2487-2504. https://doi.org/10.1177/0962280215601133
Is the Bland-Altman plot method useful without inferences for accuracy, precision, and agreement? Silveira PSP, Vieira JE, Siqueira JO. Rev Saude Publica. 2024;58:01. https://doi.org/10.11606/s1518-8787.2024058005430
The Bland-Altman method should not be used when one of the two measurement methods has negligible measurement errors. Taffé P, Zuppinger C, Burger GM, Nusslé SG. PLoS One. 2022 Dec 12;17(12):e0278915. https://doi.org/10.1371/journal.pone.0278915
Use of clinical tolerance limits for assessing agreement. Taffé P. Stat Methods Med Res. 2023 Jan;32(1):195-206. https://doi.org/10.1177/09622802221137743
Extended biasplot command to assess bias, precision, and agreement in method comparison studies. Taffé, P, Peng, M, Stagg, V, Williamson, T. The Stata Journal 2023, 23(1), 97-118. https://doi.org/10.1177/1536867X231161978
SimplyAgree: An R package and jamovi Module for Simplifying Agreement and Reliability Analyses. Caldwell, AR. Journal of Open Source Software 2022, 7(71), 4148. https://doi.org/10.21105/joss.04148