Method Comparison: Alternative Graphical Methods

In addition to the Bland-Altman plot and variations thereof, several other graphical approaches have been proposed for method comparison studies, including the survival-agreement Plot and the mountain plot. None of the proposed graphical approaches has enjoyed the same popularity as the Bland-Altman plot. Two graphical approaches are presented here.

Strictly speaking, the Identity plot is advised by Bland and Altman as a prior analysis of the Bland-Altman plot and therefore is neither a variant nor an alternative approach. However, it is worth mentioning, as it is a simple, powerful, and elegant technique that is often overlooked in method comparison studies. The identity plot is a simple scatterplot approach of measurements for both methods on either axis, with the line of equality (the \(X=Y\) line, i.e.  the 45-degree line through the origin). This plot can give the analyst a cursory examination of how well the measurement methods agree. In the case of good agreement, the covariates of the plot accord closely with the line of equality.

Survival-Agreement Plot

A graphical technique for method comparison studies, that is entirely different to the Bland-Altman plot, was proposed by . This approach, known as the survival-agreement plot, is used to determine the degree of agreement using the Kaplan-Meier method, a well known graphical technique in the area of Survival Analysis. Furthermore, propose that commonly used survival analysis techniques should complement this method, *{providing a new analytical insight for agreement**.

Two survival–agreement plots are used to detect the bias between two measurements of the same variable. The presence of inter-method bias is tested with the log-rank test and its magnitude with Cox regression.

The degree of agreement (or disagreement) of a measure is expressed as a function of several limits of tolerance, using the Kaplan-Meier method, where the failures occur exactly at absolute values of the differences between the two methods of measurement.

According to , the survival-agreement plot is a step function of a typical survival analysis without censored data, where the Y-axis represents the proportion of discordant cases. This is equivalent to a step function where the X-axis represents the absolute observed differences and the Y-axis is the proportion of the cases with at least the observed difference (\(x_i\)).

Figure~\(\ref{PlasmaVol_KMplot}\) depicts the survival-agreement plot of the Nadler vs Hurley plasma volume measurement comparison, which is referenced in and described in detail later. The corresponding summary statistics for method comparison is contained in Table~\(\ref{Plasmadata}\). contend that many clinical statisticians would be familiar with this type of plot, due to its similar construction to the Kaplan-Meier plot. However, they offer no support to the idea that the plot would be accessible or intuitive to the broader scientific community.

% latex table generated in R 3.6.1 by xtable 1.8-4 package % Mon Mar 09 11:26:17 2020 % % PREVALENCE % % Implementation

% MCS Mountain Plot Notebook

have proposed a folded empirical cumulative distribution plot, otherwise known as a mountain plot. They propose it as a non-parametric method that can be used as a complement with the Bland-Altman plot and suitable for detecting large, infrequent errors. Mountain plots are created by computing a percentile for each ranked difference between methods.

Folded plots are so-called because all percentiles above 50 are transformed to the difference of that percentile and 100 (e.g. the 81st percentile is transformed to the 19th percentile). These percentiles are then plotted against the differences between the two methods.

argue that the mountain plot offers some following advantages. It is easier to find the central \(95\%\) of the data, even when the data are not normally distributed. Also, comparison on different distributions can be performed with ease. Unlike the Bland-Altman plot, the mountain plot does not feature case-wise differences.

An example of method comparison mountain plot is depicted in Figure~\(\ref{MountainPlot_NH_1}\), using the plasma volume data set. reproduces measurements of plasma volume expressed as a percentage of a predetermined value in 99 subjects, which was originally published in . Two alternative sets of normal values are used, named Nadler and Hurley respectively. The following plots were created using the package . Inter-method bias can be assessed by visual inspection of the mountain plot as proposed by , indicated by the distance between peaks.

Here the proposed approach is extended by the addition of a mountain plot of the mean-centred data, i.e.  \(x-\bar{x}\) and \(y-\bar{y}\). This additional plot will allow the user to directly compare the precision and distribution of measurements from both methods.

The ease of interpretability of these plots collectively can mitigate the adoption of this graphical approach among the broader scientific community. When combined with the identity plot and the Bland-Altman plot, these plots comprise a simple and effective graphical approach from assessing agreement.

\end{centering}

An advantage that the mountain plot has over the Bland-Altman plot is that the mountain plot can assess multiple methods simultaneously, while the Bland-Altman plot is devised for comparing two methods only. Figure~\(\ref{SBPMountain1}\) assesses the agreement of three measurement methods of systolic blood pressure, described in several method comparison papers such as and . A cursory inspection indicates that two methods are in close agreement, while a third method clearly diverges. The configuration as a folded empirical cumulative distribution function (ECDF) plot retains the centrality of the data and allows the user to assess skewness. Additionally, a folded plot is more economical in terms of space.

As with the Bland-Altman plot, the mountain plot is not devised for replicate measurements. When used for method comparison, mountain plots are constrained by the need for a suitably large number of items for there to be useful interpretation from the graph. When applied to Grubbs’s artillery data, which comprises 12 items measured by three methods, the mountain plot is completely ineffective at conveying any insights. Regarding Figure~\(\ref{GrubbsMountain1}\), it is hard to distinguish which method is which, such that the plot is not usable. A small number of items does not critically undermine the Bland-Altman approach.

Other graphical methods, such as kernel density plots, are equally applicable, as they could sufficiently fulfil the same function of the mountain plot, but the mountain plot is retained here as it has featured previously in scientific literature.

An interesting question arises as to why the Bland-Altman plot went on to enjoy widespread use, while the mountain plot and survival agreement plot are still almost unknown. Consideration of how the capabilities of data visualization software have evolved since the publication of would provide insights. Before the advent of usable data visualization software, the Bland-Altman plot required only basic technical graphical capabilities and mathematical transformations, with the reasonable expectation that its inclusion in a publication would be required for method comparison studies. Conversely, plots such as the mountain plot and survival-agreement plot require additional efforts to create, with the risk that the plot would be refused due to lack of visibility in the literature.

The package was created by Hadley Wickham and provides an intuitive plotting system to rapidly generate publication-quality graphics. builds on the concept of the “Grammar of Graphics” which describes a consistent syntax for the construction of a wide range of complex graphics by a concise description of their components. The overhead associated with creating complex graphics has diminished greatly, and there is a reasonable expectation that a practitioner can quickly develop the ability to create such plots.

%-https://cran.r-project.org/web/packages/mountainplot/vignettes/mountainplot_examples.html

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

The efficacy of the enhanced approach, i.e.  jointly using the scatterplot, the Bland-Altman plot and the mean-centred mountain plot, can be appraised using examples that have previously featured in method comparison research. Three data sets were selected to represent multiple likely outcomes. The plasma volume method comparison used previously serves as an example of an analysis of two methods in agreement. These three following examples were chosen to examine other scenarios.

The data tabulated in Appendix \(\ref{VCFData}\) comprises containing measurements for 100 observations of mean velocity of circumferential fibre (vcf) shortening, made by long axis and short axis echocardiography. Method comparison statistics are tabulated in Table \(\ref{VCFdata}\). These measurements originally featured in and were used in an example in who noted the increasing dispersion of covariates in the Bland-Altman plot and proposed log-transformation to attempt to improve the analysis.

Figure \(\ref{VCF_MCS}\) contains the scatter-plot and Bland-Altman plot of the untransformed measurements. While noting the increasing dispersal of the covariates in the Bland Altman plot, there is no strong indication of lack of agreement. The mean-centred mountain plot, shown in Figure \(\ref{VCF_Mountain}\), provided additional insights about method agreement. The plots are approximately similar in magnitude, with the curves tracking each other closely. Admittedly the curves are clearly discernible from each other. An assessment of agreement in this case would be subjective, and dependent on how well these distributions can be considered alike. Even if the methods are not sufficiently in agreement, it is reasonable to surmise that measurements taken by both devices are close to each other,

% latex table generated in R 3.6.1 by xtable 1.8-4 package % Mon Mar 09 11:46:23 2020 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

The data set, available through the R package contains the measurement of the fat content of human milk by two different methods. The fat content of human milk determined by the measurement of glycerol released by enzymic hydrolysis of triglycerides (Trig) and measurement by the Standard Gerber method (Gerber). Units are (g/100 ml). The dataset features in from data presented in , reproduced in Table~\(\ref{MilkData}\). The relevant summary statistics are detailed in Table~\(\ref{MilkSummary}\). Figure~\(\ref{Milk_Mountain}\) depicts the mountain plot for this comparison, while Figure~\(\ref{Milk_MCS}\) contains the corresponding identity and Bland-Altman plots. By visually inspecting all three plots, there is a clear indication that these two methods of measurement are not in agreement.

% latex table generated in R 3.6.1 by xtable 1.8-4 package % Mon Mar 09 11:30:29 2020

introduces a data set that comprises measurements of the inferior pelvic infundibular angle (IPIA) on the kidneys of 52 patients relevant to the treatment and diagnosis of renal lithiasis. The kidneys were measurement by mean of computerized urography (Uro) and tomography (Tom). The data are tabulated in Appendix \(\ref{KidneyData}\). Method comparison statistics are tabulated in Table \(\ref{Kidney_Statistics}\).

Figure \(\ref{Kidney_MCS}\) contains the scatterplot and Bland-Altman plot of these measurements. The scatter plot indicates the presence of proportional bias, which would dissuade the analyst from a conclusion of agreement. However, the strength of evidence is insufficient from this plot alone. Conversely, there is no strong indication of disagreement from visual inspection of the Bland-Altman plot without further assessment of the limits of agreement.

The mountain plot demonstrates a lack of agreement between methods. There are distinct plots for each method, which clearly indicates a lack of agreement between the methods. A clinical researcher would be dissuaded from the presumption of agreement immediately. This example demonstrates why the mountain plot is a useful complement to the Bland-Altman plot, which was unable to detect this disagreement.

Summary

The inclination to simplicity has been pivotal in the success of the Bland-Altman approach. This is perhaps the most important lesson to be drawn from the history of the Bland-Altman approach. The required level of statistical knowledge to carry out the technique can be contained in any conventional undergraduate statistics course.

However, their approach is insufficient for the case of replicate measurements and must therefore be complemented by suitable techniques.

A complementary method must be able to properly address the core research questions for assessing agreement between methods where replicate measurements are present but do so in a way that is not unduly complex. New methods should be formulated to be accessible and intuitive to practitioners from across the scientific community.