Unfortunately the questions posed for this study are unable to be answered based on the current data. There is considerable disagreement between the two raters so any comparison between MRI and operative findings is limited to a single rater (i.e., you end up throwing away half of your data). Furthermore, the questions asked are so granular (5 labral regions, 9 cartilage regions, 1a vs 1b delamination, etc.) that the number of individual statistical tests performed is much greater than number of data points, which leads to considerable problems with the accuracy of the results. I could do more in depth modeling that takes into account the ordinal nature of the labral and cartilage grading systems but it would require additional power (at the very least, inclusion of a third rater).
2 Packages
Code
library(tidyverse) # for everythinglibrary(tidymodels) # for modelinglibrary(rms) # for more modelinglibrary(Hmisc) # for statslibrary(magrittr) # for numberslibrary(here) # for path managementlibrary(readr)library(openxlsx)library(readr) # for importing datalibrary(broom.mixed) # for converting Bayesian models to tidy tibbleslibrary(dotwhisker) # for visualizing regression resultslibrary(skimr) # for variable summarieslibrary(gt) # for tableslibrary(gtsummary) # for prettier tableslibrary(ggdist) # for raincloud plotslibrary(patchwork) # for multi-planel figureslibrary(glue) # for working with stringslibrary(RColorBrewer) # for color paletteslibrary(lubridate) # for dateslibrary(broom)library(DTComPair)# Load custom functionssource("C:/Users/kinge/Dropbox/Research/Resources/mtk-custom-functions.R") # Windows# source("/Users/mtk/Library/CloudStorage/Dropbox/Research/Resources/mtk-custom-functions.R") # Mac
# Fix variable typesdf.raw <- df.raw %>%mutate(across(c( case, rater, labrum_tear_type, labrum_tear_loc, lig_teres_tear, tear_fluid, tear_cyst ), factor ))# Encode qualitative columns as factorsdf.raw <- df.raw %>%mutate_if(is.character, as.factor)# Labral tear location binary varsdf.raw <- df.raw %>%mutate(labral_tear_11 =str_detect(labrum_tear_loc, "\\b11\\b"))df.raw <- df.raw %>%mutate(labral_tear_12 =str_detect(labrum_tear_loc, "\\b12\\b"))df.raw <- df.raw %>%mutate(labral_tear_1 =str_detect(labrum_tear_loc, "\\b1\\b"))df.raw <- df.raw %>%mutate(labral_tear_2 =str_detect(labrum_tear_loc, "\\b2\\b"))df.raw <- df.raw %>%mutate(labral_tear_3 =str_detect(labrum_tear_loc, "\\b3\\b"))# Separate df for expanded cartilage delaminationdf_delam <- df.raw# Select only the highest cartilage injury grade for each location# Define a custom function to get the highest gradeget_highest_grade <-function(grades) {if (is.na(grades)) {return(grades) } grades <-str_split(grades, ",\\s*")[[1]] grade_levels <-c("0", "1", "1a", "1b", "2", "3") highest_grade <-max(factor(grades, levels = grade_levels, ordered =TRUE), na.rm =TRUE)return(as.character(highest_grade))}# Apply the function to all columns starting with "cartilage_"df.raw <- df.raw %>%mutate(across(starts_with("cartilage_"), ~sapply(.x, get_highest_grade)))# Consolidate 1a and 1b; keep expanded classificationsdf.raw <- df.raw %>%mutate(expanded_cartilage_tab_anterior = cartilage_tab_anterior)df.raw <- df.raw %>%mutate(expanded_cartilage_tab_superolateral = cartilage_tab_superolateral)df.raw <- df.raw %>%mutate(cartilage_tab_anterior =case_when( cartilage_tab_anterior =='1'~'1', cartilage_tab_anterior =='1a'~'1', cartilage_tab_anterior =='1b'~'1',TRUE~ cartilage_tab_anterior ))df.raw <- df.raw %>%mutate(cartilage_tab_superolateral =case_when( cartilage_tab_superolateral =='1'~'1', cartilage_tab_superolateral =='1a'~'1', cartilage_tab_superolateral =='1b'~'1',TRUE~ cartilage_tab_superolateral ))# Make ordered factorsgrade_levels <-c("0", "1", "2", "3")df.raw <- df.raw %>%mutate(across(starts_with("cartilage_"), ~factor(.x, levels = grade_levels, ordered =TRUE)))# Make expanded classifications ordered factorsgrade_levels_expanded <-c("0", "1", "1a", "1b", "2", "3")df.raw <- df.raw %>%mutate(across(starts_with("expanded_"), ~factor(.x, levels = grade_levels_expanded, ordered =TRUE)))# Make ordinal varsdf.raw$labrum_tear_type <-recode_factor(df.raw$labrum_tear_type,.ordered =TRUE,'0'='0','1'='1','2'='2','3'='3','4'='4','5'='5')df.raw$lig_teres_tear <-recode_factor(df.raw$lig_teres_tear,.ordered =TRUE,'0'='0','1'='1','2'='2','3'='3')# Final dfdf <- df.rawrm(df.raw)
5 Inter-observer agreement
Code
library(irr)df_icc <- df %>%filter(rater !='op')# Select columnscolumns_to_process <-c("labrum_tear_type", "cartilage_tab_anterior", "cartilage_tab_posterior", "cartilage_tab_superolateral", "cartilage_tab_superomedial","cartilage_head_anterior","cartilage_head_posterior","cartilage_head_superolateral","cartilage_head_superomedial","cartilage_head_inferomedial","lig_teres_tear","tear_fluid","tear_cyst","labral_tear_11","labral_tear_12","labral_tear_1","labral_tear_2","labral_tear_3")# Function to compute Kappa value and p valuecompute_kappa <-function(column_name, mri_type) { result <- df_icc %>%filter(mri == mri_type) %>%select(case, rater, !!sym(column_name)) %>%pivot_wider(names_from = rater,names_prefix ='rater_',values_from =!!sym(column_name)) %>%select(-case) %>%kappa2(weight ="equal")# Extract Kappa value and p value kappa_value <- result$value p_value <- result$p.valuereturn(tibble(parameter = column_name,mri_type = mri_type,kappa_weighted = kappa_value,p_value = p_value ) )}# Specify the MRI type to use, apply functionmri_type_to_use <-'conventional'kappa_results_conventional <-map_dfr(columns_to_process, ~compute_kappa(.x, mri_type_to_use))print(kappa_results_conventional)mri_type_to_use <-'radial'kappa_results_radial <-map_dfr(columns_to_process, ~compute_kappa(.x, mri_type_to_use))print(kappa_results_radial)# Display the resulting tablekappa_results <- kappa_results_conventional %>%left_join(kappa_results_radial,by ='parameter')# # Subset data to test# df_icc %>% # filter(mri == 'radial') %>% # select(case, rater, labral_tear_12) %>% # pivot_wider(names_from = rater,# names_prefix = 'rater_',# values_from = labral_tear_12) %>% # select(!case) %>% # kappa2(.,# weight = "equal")
Code
# Create a mapping of original column names to readable labelscolumn_labels <-c("labrum_tear_type"="Labral Tear Grade",'labral_tear_11'="Tear at 11 o'clock",'labral_tear_12'="Tear at 12 o'clock",'labral_tear_1'="Tear at 1 o'clock",'labral_tear_2'="Tear at 2 o'clock",'labral_tear_3'="Tear at 3 o'clock","cartilage_tab_anterior"="Acetabulum - Anterior","cartilage_tab_posterior"="Acetabulum - Posterior","cartilage_tab_superolateral"="Acetabulum - Superolateral","cartilage_tab_superomedial"="Acetabulum - Superomedial",'cartilage_head_anterior'='Head - Anterior','cartilage_head_posterior'='Head - Posterior','cartilage_head_superolateral'='Head - Superolateral','cartilage_head_superomedial'='Head - Superomedial','cartilage_head_inferomedial'='Head - Inferomedial','lig_teres_tear'='Ligamentum Teres Tear','tear_fluid'='Tear Fluid Signal','tear_cyst'='Tear Cyst')# Replace the column names with readable labelskappa_results <- kappa_results %>%mutate(parameter =recode(parameter, !!!column_labels))# Create a gt table for publicationtbl_irr <- kappa_results %>%select(!c(mri_type.x, mri_type.y)) %>% gt::gt() %>%tab_header(title ="Inter-Rater Reliability for Conventional and Radial MRI" ) %>%tab_spanner(label ="Conventional MRI",columns =c( kappa_weighted.x, p_value.x ) ) %>%tab_spanner(label ="Radial MRI",columns =c( kappa_weighted.y, p_value.y ) ) %>%fmt_number(columns =c(kappa_weighted.x, kappa_weighted.y, p_value.x, p_value.y),decimals =3 ) %>%sub_missing(columns =everything(),rows =everything(),missing_text ="---" ) %>%cols_label(parameter ="Parameter",kappa_weighted.x ="Weighted Kappa",p_value.x ="P value",kappa_weighted.y ="Weighted Kappa",p_value.y ="P value" ) %>%# Row group 1tab_row_group(label =md("**Labral Tears**"),rows = parameter %in%c("Labral Tear Grade","Tear at 11 o'clock","Tear at 12 o'clock","Tear at 1 o'clock","Tear at 2 o'clock","Tear at 3 o'clock") ) %>%# Row group 2tab_row_group(label =md('**Cartilage Grading Scale**'),rows = parameter %in%c('Acetabulum - Anterior','Acetabulum - Posterior','Acetabulum - Superolateral','Acetabulum - Superomedial','Head - Anterior','Head - Posterior','Head - Superolateral','Head - Superomedial','Head - Inferomedial') ) %>%# Row group 3tab_row_group(label =md('**Other**'),rows = parameter %in%c('Ligamentum Teres Tear','Tear Fluid Signal','Tear Cyst') ) %>%# Row group orderrow_group_order(groups =c("**Labral Tears**",'**Cartilage Grading Scale**','**Other**')) %>%# Format tabletab_options(table.font.size =px(12),data_row.padding = gt::px(3),row_group.padding =px(8) ) %>%tab_options(row_group.padding =px(8) ) %>%cols_align(align ='center',columns =c(kappa_weighted.x, p_value.x, kappa_weighted.y, p_value.y) ) %>%tab_options(table.border.top.width =2,table.border.top.color ="black",table_body.border.bottom.width =2,table_body.border.bottom.color ="black" )tbl_irr
Table 1: Inter-Rater Reliability for Conventional and Radial MRI
Inter-Rater Reliability for Conventional and Radial MRI
Parameter
Conventional MRI
Radial MRI
Weighted Kappa
P value
Weighted Kappa
P value
Labral Tears
Labral Tear Grade
0.275
0.000
0.128
0.012
Tear at 11 o'clock
0.000
—
0.000
1.000
Tear at 12 o'clock
0.411
0.001
0.136
0.203
Tear at 1 o'clock
0.018
0.777
−0.034
0.715
Tear at 2 o'clock
0.204
0.015
0.032
0.355
Tear at 3 o'clock
0.261
0.005
0.000
—
Cartilage Grading Scale
Acetabulum - Anterior
0.420
0.000
0.225
0.031
Acetabulum - Posterior
0.000
—
0.000
1.000
Acetabulum - Superolateral
0.039
0.597
0.119
0.221
Acetabulum - Superomedial
0.000
—
0.000
—
Head - Anterior
0.655
0.000
0.257
0.029
Head - Posterior
0.000
1.000
0.000
1.000
Head - Superolateral
−0.033
0.742
0.000
1.000
Head - Superomedial
0.000
—
0.000
—
Head - Inferomedial
0.000
1.000
0.000
1.000
Other
Ligamentum Teres Tear
0.208
0.018
−0.083
0.344
Tear Fluid Signal
0.363
0.005
0.465
0.000
Tear Cyst
0.729
0.000
0.461
0.001
The weighted kappa statistic for each parameter was estimated for ordinal variables. Overall, the inter-rater reliability is very poor for both conventional MRI and radial MRI. There seems to be consistent differences between the two raters, particularly with respect to labral tears. For example, rater 1 seems to have a much lower threshold for calling a labral tear than rater 2. For labral tears at the 11 o’clock position (I assume this is what is meant by the numbers under the labral tear location column), rater 1 had 12 tears on conventional MRI and 15 tears on radial MRI, while rater 2 had 0 tears on both conventional and radial MRI. Similarly, for labral tears at the 12 o’clock position, rater 1 had 20 tears on conventional MRI and 27 tears on radial MRI, while rater 2 had 9 tears on conventional MRI and 10 tears on radial MRI. There doesn’t seem to be any consistency in the differences between raters for cartilage lesion grading.
Given the significant amount of disagreement between raters, plus the fact that these are either binary or ordinal scales and you cannot average non-continuous scales, the results of both raters cannot be pooled. This problem could be solved by having a third rater evaluate the MRIs and use the results to inform the cases where there is disagreement between raters 1 and 2. With the data I have now, the best I can do is compare the diagnostic accuracy of conventional and radial MRI for rater 1, and again separately for rater 2. However, as you see in the above table, the diagnostic statistics for most of the outcomes of interest cannot be calculated because there are zeroes in 1 or more cells of the 2x2 contingency table (consisting of TP, TN, FP, FN).
Code
# Create a mapping of original column names to readable labelscolumn_labels_acc_1 <-c('labral_tear_11'="Tear at 11 o'clock",'labral_tear_12'="Tear at 12 o'clock",'labral_tear_1'="Tear at 1 o'clock",'labral_tear_2'="Tear at 2 o'clock",'labral_tear_3'="Tear at 3 o'clock")# Replace the column names with readable labelssummary_table_1 <- summary_table_1 %>%mutate(Column =recode(Column, !!!column_labels_acc_1))# Create a gt table for publicationtbl_acc_1 <- summary_table_1 %>% gt::gt() %>%tab_header(title ="Diagnostic Statistics for Labral Tears (Rater 1)" ) %>%tab_spanner(label ="Conventional MRI",columns =c( Conv_Sensitivity, Conv_Specificity, Conv_PPV, Conv_NPV ) ) %>%tab_spanner(label ="Radial MRI",columns =c( Rad_Sensitivity, Rad_Specificity, Rad_PPV, Rad_NPV ) ) %>%fmt_number(columns =everything(),decimals =3 ) %>%sub_missing(columns =everything(),rows =everything(),missing_text ="---" ) %>%cols_label(Column ="Parameter",Conv_Sensitivity ="Sensitivity",Conv_Specificity ="Specificity",Conv_PPV ="PPV",Conv_NPV ="NPV",Rad_Sensitivity ="Sensitivity",Rad_Specificity ="Specificity",Rad_PPV ="PPV",Rad_NPV ="NPV" ) %>%# Format tabletab_options(table.font.size =px(12),data_row.padding = gt::px(3),row_group.padding =px(8) ) %>%tab_options(row_group.padding =px(8) ) %>%cols_align(align ='center',columns =c( Conv_Sensitivity, Conv_Specificity, Conv_PPV, Conv_NPV, Rad_Sensitivity, Rad_Specificity, Rad_PPV, Rad_NPV ) ) %>%tab_options(table.border.top.width =2,table.border.top.color ="black",table_body.border.bottom.width =2,table_body.border.bottom.color ="black" )tbl_acc_1
Table 2: Diagnostic Statistics for Labral Tears (Rater 1)
# Create a mapping of original column names to readable labelscolumn_labels_acc_1 <-c('labral_tear_11'="Tear at 11 o'clock",'labral_tear_12'="Tear at 12 o'clock",'labral_tear_1'="Tear at 1 o'clock",'labral_tear_2'="Tear at 2 o'clock",'labral_tear_3'="Tear at 3 o'clock")# Replace the column names with readable labelssummary_table_2 <- summary_table_2 %>%mutate(Column =recode(Column, !!!column_labels_acc_1))# Create a gt table for publicationtbl_acc_2 <- summary_table_2 %>% gt::gt() %>%tab_header(title ="Diagnostic Statistics for Labral Tears (Rater 2)" ) %>%tab_spanner(label ="Conventional MRI",columns =c( Conv_Sensitivity, Conv_Specificity, Conv_PPV, Conv_NPV ) ) %>%tab_spanner(label ="Radial MRI",columns =c( Rad_Sensitivity, Rad_Specificity, Rad_PPV, Rad_NPV ) ) %>%fmt_number(columns =everything(),decimals =3 ) %>%sub_missing(columns =everything(),rows =everything(),missing_text ="---" ) %>%cols_label(Column ="Parameter",Conv_Sensitivity ="Sensitivity",Conv_Specificity ="Specificity",Conv_PPV ="PPV",Conv_NPV ="NPV",Rad_Sensitivity ="Sensitivity",Rad_Specificity ="Specificity",Rad_PPV ="PPV",Rad_NPV ="NPV" ) %>%# Format tabletab_options(table.font.size =px(12),data_row.padding = gt::px(3),row_group.padding =px(8) ) %>%tab_options(row_group.padding =px(8) ) %>%cols_align(align ='center',columns =c( Conv_Sensitivity, Conv_Specificity, Conv_PPV, Conv_NPV, Rad_Sensitivity, Rad_Specificity, Rad_PPV, Rad_NPV ) ) %>%tab_options(table.border.top.width =2,table.border.top.color ="black",table_body.border.bottom.width =2,table_body.border.bottom.color ="black" )tbl_acc_2
Table 3: Diagnostic Statistics for Labral Tears (Rater 2)
Diagnostic Statistics for Labral Tears (Rater 2)
Parameter
Conventional MRI
Radial MRI
Sensitivity
Specificity
PPV
NPV
Sensitivity
Specificity
PPV
NPV
Tear at 11 o'clock
—
—
—
—
—
—
—
—
Tear at 12 o'clock
0.000
0.780
0.000
0.744
0.182
0.805
0.200
0.786
Tear at 1 o'clock
0.960
0.000
0.960
0.000
0.980
0.000
0.961
0.000
Tear at 2 o'clock
0.750
0.475
0.300
0.864
0.250
0.800
0.273
0.780
Tear at 3 o'clock
—
—
—
—
—
—
—
—
7 Cartilage lesions
7.1 Rater 1
Code
# Convert to logical varsdf_bin <- df %>%mutate(across(starts_with("cartilage_"),~case_when( . ==0~FALSE, . %in%c(1, 2, 3) ~TRUE,TRUE~as.logical(.) # This preserves any other values as is ) ))# Filter data for rater 1 and operative findingsmri_conventional <- df_bin %>%filter(rater =='1', mri =="conventional")mri_radial <- df_bin %>%filter(rater =='1', mri =="radial")op_data <- df_bin %>%filter(rater =="op")# Get cartilage lesion columnscartilage_columns <- df_bin %>%select(starts_with('cartilage_')) %>%colnames()# Create contingency tables and calculate metrics for conventional MRIresults_conventional <-map(cartilage_columns, function(col) { table <-create_contingency_table(mri_conventional, op_data, col) metrics <-calculate_metrics(table, testname = col)list(table = table, metrics = metrics)})# Create contingency tables and calculate metrics for radial MRIresults_radial <-map(cartilage_columns, function(col) { table <-create_contingency_table(mri_radial, op_data, col) metrics <-calculate_metrics(table, testname = col)list(table = table, metrics = metrics)})# Print formatted resultsprint_formatted_results(results_conventional, "Conventional")print_formatted_results(results_radial, "Radial")# Create summary tablessummary_conventional <-bind_rows(map(results_conventional, ~.x$metrics), .id ="Column") %>%mutate(Column = cartilage_columns[as.numeric(Column)]) %>%rename_with(~paste0("Conv_", .), -Column)summary_radial <-bind_rows(map(results_radial, ~.x$metrics), .id ="Column") %>%mutate(Column = cartilage_columns[as.numeric(Column)]) %>%rename_with(~paste0("Rad_", .), -Column)# Combine summariessummary_table_3 <-full_join(summary_conventional, summary_radial, by ="Column") %>%select(Column, starts_with("Conv_"), starts_with("Rad_"))
Again, nearly all of the diagnostic statistics cannot be calculated accurately (or at all) because of zeroes in one or more cells of the contingency table. There are ways to model these values when a table contains a random (i.e., non-structural) zero, but that would only be appropriate in the case where there is a high level of confidence that the zero cell is accurate and not due to issues with sampling, as in this case.
Table 5: Diagnostic Statistics for Cartilage Lesions (Rater 2)
Diagnostic Statistics for Cartilage Lesions (Rater 2)
Parameter
Conventional MRI
Radial MRI
Sensitivity
Specificity
PPV
NPV
Sensitivity
Specificity
PPV
NPV
Acetabulum - Anterior
0.490
1.000
1.000
0.037
0.608
1.000
1.000
0.048
Acetabulum - Posterior
—
—
—
—
—
—
—
—
Acetabulum - Superolateral
0.211
1.000
1.000
0.688
0.368
0.909
0.700
0.714
Acetabulum - Superomedial
—
—
—
—
—
—
—
—
Head - Anterior
1.000
0.980
0.500
1.000
1.000
0.961
0.333
1.000
Head - Posterior
—
—
—
—
—
—
—
—
Head - Superolateral
1.000
1.000
1.000
1.000
—
—
—
—
Head - Superomedial
—
—
—
—
—
—
—
—
Head - Inferomedial
—
—
—
—
—
—
—
—
8 Cartilage delamination
Comparison of MRI grade 1a vs 1b was performed only for the anterior acetabulum region as none of the other regions contained cartilage lesions graded as 1a or 1b (the superolateral acetabulum region was grade 1 for 3/52 patients on operative evaluation, but this is insufficient to make a comparison).