library(bookdown)
library(minpack.lm)
library(readxl)
library(dplyr)
library(purrr)
library(tidyr)
library(ggplot2)
library(ggpubr)
library(tibble)
library(ggsci)
library(latex2exp)
library(AICcmodavg)
library(vtreat)
library(kableExtra)
library(broom)
# automatically create a bib database for R packages
knitr::write_bib(c(
  .packages(), 'bookdown', 'knitr', 'rmarkdown'
), 'packages.bib')

Figure 0.1: Statistical analysis roadmap.

1 Objective


An inevitable difficulty in the comparative analysis of variably parameterized binding models lies in the differences in degrees of freedom that accompany model complexity. For this reason, binding models were evaluated based on the Akaike Information Criterion (\(AIC\)), which accounts for goodness-of-fit as quantified by the \(RMSE\) of the residuals while neutralizing complexity differences between binding models as determined by the number of parameters. The \(AIC\) approximates the information-theoretic Kullback-Leibler (KL) distance, which captures the information that is lost when a model is used to approximate the true data-generating process. Hence, the KL divergence between a model and the data-generating process can be used as a measure of performance for that particular model. In general, this approach can be thought of as a penalized regression in which goodness-of-fit is traded off against a complexity penalty. The idea is to evaluate the set of binding models by computing \(AIC\) scores and aligning the members on an ordinal scale, with respect to the best-describing top-ranked model, based on \(AIC\) differences.
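For reference, equation (2.1) below is the least-squares form of the general likelihood-based definition \(AIC = -2\,\ln\hat{L} + 2k\). Under the standard assumption of independent Gaussian errors (a textbook identity, not specific to this analysis), the maximized log-likelihood satisfies

\[-2\,\ln\hat{L} = n\,\ln\left(\frac{RSS}{n}\right) + n\left(1+\ln 2\pi\right),\]

so that, up to an additive constant that cancels in \(AIC\) differences, \(AIC = n\,\ln\left(RSS/n\right) + 2k\); adding the small-sample correction term \(\frac{2k(k+1)}{n-k-1}\) yields the \(AIC_c\) of equation (2.1).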


2 Methods

2.1 Model evaluation based on the Akaike Information Criterion

  • Performance of each model function in describing the ThiITm•tRNA\(\mathrm{^{Phe}_{Bs}}\) titration profile was evaluated based on the Akaike Information Criterion (\(AIC\))
  • In this work, the \(AIC\) corrected for small sample sizes (\(AIC_c\)) was used
  • Binding models were ranked based on the \(AIC_c\) according to the following equation:

\[\begin{equation} AIC_c = n\,\ln\left(\frac{\sum\limits_{i=1}^n \hat{\epsilon}^2_i}{n}\right)+2k+\frac{2k(k+1)}{n-k-1} \tag{2.1} \end{equation}\]

  • Where \(n\) = number of data points, \(k\) = number of parameters, and \(\hat{\epsilon}_i\) = the estimated residuals obtained from the nls regression analysis
  • Binding models were further ranked by \(AIC_c\) differences (\(\Delta\)) specified in equation (2.2):

\[\begin{equation} \Delta_i=AIC_{c,i}-AIC_{c,min} \tag{2.2} \end{equation}\]

  • Where \(AIC_{c,min}\) is the \(AIC_c\) of the top-ranked binding model that received the minimum \(AIC_c\) score according to equation (2.1)
  • The level of support for each binding model can be estimated from the \(\Delta_i\) values based on the plausibility key described by Anderson and Burnham (2004), which is depicted in figure 2.1:
    • \(\Delta_i\) values ranging between 0-2 translate to binding models having substantial support
    • \(\Delta_i\) values between 4-7 refer to binding models with considerably less support
    • Binding models receiving \(\Delta_i\) values > 10 have essentially no support
  • Based on the \(AIC_c\) differences \(\Delta_i\), Akaike weights (weights of evidence, \(\omega_i\)) for each binding model with respect to the whole model set (\(\Delta_r\), \(r = 1, \dots, R\)) were calculated according to equation (2.3):

\[\begin{equation} \omega_i=\frac{\exp(-\frac{1}{2}\Delta_i)}{\sum\limits_{r=1}^R \exp(-\frac{1}{2}\Delta_r)} \tag{2.3} \end{equation}\]

  • Based on the individual Akaike weights, evidence ratios between any two binding models were calculated according to equation (2.4); a worked R sketch of equations (2.1)–(2.4) is given directly after the equation:

\[\begin{equation} \frac{\omega_i}{\omega_j} \tag{2.4} \end{equation}\]
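To make the ranking procedure concrete, the following is a minimal R sketch of equations (2.1)–(2.4), computed directly from the residuals of a fitted nls object. The names aicc_from_nls, rank_models, and evidence_ratio are illustrative and not part of the analysis code in the Results section, where AICcmodavg::AICc() performs the computation of equation (2.1).

####......AICc ranking sketch (illustrative)......####
# AICc from the residuals of an nls fit, following equation (2.1);
# k counts the fitted coefficients plus one for the residual variance
aicc_from_nls <- function(fit) {
  res <- residuals(fit)
  n <- length(res)
  k <- length(coef(fit)) + 1
  n * log(sum(res^2) / n) + 2 * k + (2 * k * (k + 1)) / (n - k - 1)
}

# rank a named vector of AICc scores by differences and Akaike weights
rank_models <- function(aicc) {
  tibble(model = names(aicc), aicc = aicc) %>%
    mutate(delta = aicc - min(aicc),                         # equation (2.2)
           wt = exp(-delta / 2) / sum(exp(-delta / 2))) %>%  # equation (2.3)
    arrange(delta)
}

# evidence ratio between two models, equation (2.4)
evidence_ratio <- function(w_i, w_j) w_i / w_j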


Figure 2.1: Level of empirical support of binding model \(i\). Shown is a plausibility key expressed in terms of \(AIC_c\) differences (\(\Delta_i\)) as described by Anderson and Burnham (2004) based on empirical results. \(0 \leq \Delta_i \leq 2\): Substantial support for binding model \(i\). \(4 \leq \Delta_i \leq 7\): Considerably less support for binding model \(i\). \(\Delta_i > 10\): Essentially no support for binding model \(i\).

3 Major Results

  • First, the \(AIC_c\) scores for every binding model were calculated based on the respective residual sum of squares obtained from the nls fits (Rpubs1.1) according to equation (2.1)
  • Scores are shown in table 3.1 and figure 3.1A
  • Scores represent approximated KL divergences on a negative ordinal scale between the binding models and the true but unknown data-generating process (“reality”, red diamond)
  • The unknown data-generating process can be treated as a constant across all models
  • The lower the \(AIC_c\) score of a model (radial grid lines), the smaller the KL divergence and the better its performance in describing the data
  • Calculated \(AIC_c\) scores point towards the bm1to2.deg.add model (blue diamond) as the KL best model in the set, since it received the lowest \(AIC_c\) score of -52.4
  • Competing binding models were further ranked by calculation of \(AIC_c\) differences (\(\Delta_i\)) for each binding model with respect to the top-ranked bm1to2.deg.add model (equation (2.2))
  • In this way the unknown “reality” is canceled out and \(\Delta_i\) values now reflect increments of KL divergences between the bm1to2.deg.add model (\(\Delta_i\) = 0) and all other competing models (figure 3.1B)
  • Those \(\Delta_i\) values were used as a basis to quantify the plausibility of each model as being the KL best model in the set instead of the actual top-ranked bm1to2.deg.add model
  • This was achieved by normalizing the \(\Delta_i\) values into a set of Akaike weights \(\omega_i\) (evidence weights) summing to 1, as described by equation (2.3)
  • Additionally, evidence ratios \(\frac{\omega_i}{\omega_j}\) (equation (2.4)) between the bm1to2.deg.add model and the remaining models in the set were computed (a snippet deriving these ratios follows the analysis code below)
  • The top-ranked bm1to2.deg.add model received 70.8% of the total weight of evidence of being the actual KL best model considering all models in the set
  • The bm1to2.deg.add model was the only one found to have substantial support of being the KL best model, since no other model received a \(\Delta_i\) value of less than 2 (see plausibility key, figure 2.1)
  • The bm1to2.coop.add model is ranked second best in the set, having an \(AIC_c\) score of -49.0 corresponding to a \(\Delta_i\) value of 3.35, which accounts for a weight of evidence of 13.3%
  • The remaining models, all having \(\Delta_i\) values greater than 2 but less than 7, would be considered as having considerably less support of being the KL best model in the set (\(4 \leq \Delta_i \leq 7\), see figure 2.1)
  • Also considering evidence ratios, the bm1to2.deg.add model is 5.3 times more likely to be the KL best model instead of the bm1to2.coop.add model
  • The model ranked lowest is the one-site binding model bm1to1, which received the highest \(AIC_c\) score (-46.2) and a weight of evidence of 3.19% of being the KL best model in the set
  • It is 22.2 times more likely that the ThiITm•tRNA\(\mathrm{^{Phe}_{Bs}}\) binding profile was generated by the bm1to2.deg.add model than by the bm1to1 model.
####.......Loading.....#####
bm1to1_nls <- readRDS("Output/model1.rds")
bm1to2.deg.add_nls <- readRDS("Output/model2.rds")
bm1to2.deg_nls <- readRDS("Output/model3.rds")
bm1to2.coop.add_nls <- readRDS("Output/model4.rds")
bm1to2.coop_nls <- readRDS("Output/model5.rds")
fi <- readRDS("Output/fi.rds")
rmse <- readRDS("Output/rmse.rds")
####..........model performance analysis by aicc..............####
model.list <- list(bm1to1 = bm1to1_nls,
                   bm1to2.deg.add = bm1to2.deg.add_nls,
                   bm1to2.deg = bm1to2.deg_nls,
                   bm1to2.coop.add = bm1to2.coop.add_nls,
                   bm1to2.coop = bm1to2.coop_nls)

# extract the nls fit (first element of each stored object) and compute AICc scores
aicc_score <- model.list %>%
  map(`[[`, 1) %>% 
  map_dbl(~ AICc(.x)) 

# number of fitted coefficients per model
k <- model.list %>% 
  map(`[[`, 1) %>% 
  map_dbl(~ length(coef(.x))) 


model.aicc <- tibble(model = names(model.list),
                     k = k + 1,  # + 1 for the residual variance parameter
                     RMSE = rmse,
                     aicc = aicc_score)

model.aicc <- model.aicc %>% 
  mutate(del_aicc = aicc - min(aicc),                                  # equation (2.2)
         aicc_wt = (exp(-del_aicc/2)/sum(exp(-del_aicc/2)))*100) %>%   # equation (2.3), in %
  arrange(del_aicc) %>% 
  mutate(cum_wt = cumsum(aicc_wt))                                     # cumulative weight (CUSUM)

model.aicc.best <- model.aicc[1,]

# append "reality" as an arbitrary plot origin (AICc fixed at -60 for visualization)
model.aicc.reality <- model.aicc %>% 
  select(model, aicc) %>% 
  add_row(model = "reality", aicc = -60)

only.reality <- model.aicc.reality %>% 
  filter(model == "reality")
only.best.model <- model.aicc.reality %>% 
  filter(model == "bm1to2.deg.add")
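The analysis code above stops at the Akaike weights; the evidence ratios of equation (2.4) reported in the Results can be derived from them directly. A minimal sketch using the model.aicc tibble defined above (the ev_ratio column name is illustrative):

# evidence ratio of the top-ranked model over every other model,
# e.g. ~5.3 vs bm1to2.coop.add and ~22.2 vs bm1to1
evidence.ratios <- model.aicc %>% 
  mutate(ev_ratio = max(aicc_wt) / aicc_wt)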

Figure 3.1: Ranking of binding models according to their AICc scores. A: AICc scores evaluating model performances in describing the ThiITm•\(tRNA^{Phe}_{Bs}\) titration profile are highlighted as radial grid lines in the radar plot. They represent approximated KL distances on an ordinal scale with respect to the true data-generating process (“reality”, red diamond). “Reality” represents the origin of the plot and is arbitrarily set to -60 for visualization purposes. Models were ranked based on the AICc scores, with the bm1to2.deg.add model (blue diamond) being the top-ranked model that received the smallest AICc score (-52.4) and therefore best describes the binding curve. B: Binding models were further ranked based on AICc differences (\(\Delta_i\)) with respect to the top-ranked bm1to2.deg.add model (\(\Delta_i = 0\)), so that “reality”, treated as a constant across all models, is canceled out. Calculated \(\Delta_i\) values point towards the bm1to2.coop.add model as the most competitive model (\(\Delta_i = 3.3\)) in the set with respect to the top-ranked bm1to2.deg.add model.

Table 3.1: Results of the KL divergence analysis. Performance of the binding models in the nls regression of the ThiITm•tRNA\(\mathrm{^{Phe}_{Bs}}\) binding isotherm was evaluated based on \(AIC_c\) scores, according to which the models are ranked in this table. \(RMSE\) values are those from the nls regression analysis. \(\Delta_i\) = \(AIC_c\) differences, \(\omega_i\) = evidence weights, CUSUM = cumulative sum of the evidence weights, k = number of parameters including an additional parameter for estimating the variance.
| \(Model\) | \(k\) | \(RMSE\) | \(AIC_c\) | \(\Delta_i\) | \(\omega_i\) | \(CUSUM\) |
|:----------|------:|---------:|----------:|-------------:|-------------:|----------:|
| bm1to2.deg.add | 3 | 0.0205627 | -52.35595 | 0.000000 | 70.847168 | 70.84717 |
| bm1to2.coop.add | 4 | 0.0204736 | -49.01018 | 3.345772 | 13.298334 | 84.14550 |
| bm1to2.deg | 4 | 0.0216753 | -47.64134 | 4.714610 | 6.707457 | 90.85296 |
| bm1to2.coop | 5 | 0.0178691 | -47.40347 | 4.952481 | 5.955316 | 96.80828 |
| bm1to1 | 3 | 0.0266239 | -46.15603 | 6.199927 | 3.191725 | 100.00000 |
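A table like 3.1 can be rendered from the model.aicc tibble with the already-loaded kableExtra package; a minimal sketch (caption and styling choices are illustrative, not taken from the original source):

model.aicc %>% 
  kbl(digits = 2,
      caption = "Results of the KL divergence analysis.") %>% 
  kable_styling(full_width = FALSE)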

4 Conclusions

Model selection based on the \(AIC_c\) efficiently and reliably trades off goodness-of-fit against model complexity. The global bm1to2.coop model, for instance, which is known to be over-parameterized and was ranked in first place according to the \(RMSE\) of the residuals, falls to fourth place when the \(AIC_c\) score is the basis. Conversely, the bm1to2.deg.add model, formerly ranked in third place according to the \(RMSE\), is top-ranked based on the \(AIC_c\) score. Based on the re-expression of \(AIC_c\) differences in terms of evidence weights as well as evidence ratios, the relative strength of support for each of the binding models in the set could be assessed. The bm1to2.deg.add model is the actual KL best model that most accurately captures the ThiITm•tRNA\(\mathrm{^{Phe}_{Bs}}\) binding profile (\(\omega_i = 70.8 \%\)). It is \(\frac{\omega_{bm1to2.deg.add}}{\omega_{bm1to2.coop}} = 11.9\) times more likely that the data were generated by the bm1to2.deg.add model than by the global bm1to2.coop model. The bm1to2.coop.add model is the most competitive model in the set (\(\Delta_i\) = 3.3). It remains elusive, however, whether the KL divergence between this model and the top-ranked bm1to2.deg.add model translates into a strength of evidence large enough to convincingly support a single best data-describing model in the set when the \(AIC_c\) alone is used as the basis for evaluation. With respect to the binding process, this means that no distinction can yet be made as to whether tRNA\(\mathrm{^{Phe}_{Bs}}\) binding to ThiITm is best described as solely statistical or as positively cooperative.

However, valuable information about the underlying binding stoichiometry of the ThiITm•tRNA\(\mathrm{^{Phe}_{Bs}}\) complex could be gained from the cumulative weights of evidence (\(CUSUM\), see the sketch below). The binding models ranked in places one to four, namely the bm1to2.deg.add, bm1to2.coop.add, bm1to2.deg, and bm1to2.coop models, unite a weight of evidence of 96.8% that the KL best model is among the two-site binding models, as opposed to the one-site bm1to1 model, whose weight of evidence lags relatively far behind (3.2%).
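As a numerical check of the stoichiometry argument, the combined weight of evidence of the two-site models can be computed from the model.aicc tibble of the Results section; a minimal sketch (the sites grouping variable is introduced here for illustration):

# combined evidence for two-site vs. one-site stoichiometry (~96.8% vs. ~3.2%)
model.aicc %>% 
  mutate(sites = if_else(model == "bm1to1", "one-site", "two-site")) %>% 
  group_by(sites) %>% 
  summarise(total_wt = sum(aicc_wt))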