2025-Jefferson_test-retest

Author

S Uribe

CREATED

June 2, 2025

UPDATED

July 17, 2025

Packages


Docs

Abstract ADEE

Manuscript

Data

Filter respondents who completed both test and retest


 1  2 
14 14 
# A tibble: 0 × 2
# ℹ 2 variables: Respondent <dbl>, n <int>

Prepare the dataset

Check the correlation


    Pearson's product-moment correlation

data:  test_retest_wide$Time1 and test_retest_wide$Time2
t = 21.474, df = 278, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.7412695 0.8302116
sample estimates:
     cor 
0.789858 
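As a cross-check on the output above, the t statistic and the confidence interval can be recomputed from the reported correlation and sample size alone. The sketch below is a Python re-computation, not the code used in this report (which calls R's `cor.test`). It assumes the standard formulas: t = r * sqrt(df / (1 - r^2)) and a Fisher z interval with standard error 1 / sqrt(n - 3). The function name `pearson_t_and_ci` is made up for illustration.

```python
import math
from statistics import NormalDist


def pearson_t_and_ci(r, n, level=0.95):
    """Return (t, lower, upper) for a Pearson correlation r based on n pairs."""
    df = n - 2
    # t statistic for H0: true correlation equals 0
    t = r * math.sqrt(df / (1.0 - r * r))
    # Fisher z transform, normal-theory interval, then back-transform
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return t, math.tanh(z - crit * se), math.tanh(z + crit * se)


# r and n taken from the output above (df = 278, so n = 280 pairs)
t_stat, ci_low, ci_high = pearson_t_and_ci(r=0.789858, n=280)
print(f"t = {t_stat:.3f}, 95% CI = [{ci_low:.4f}, {ci_high:.4f}]")
```

The printed values should agree with the reported t = 21.474 and interval 0.741 to 0.830 up to rounding.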

Now a plot

Now intraclass correlation

Call: psych::ICC(x = select(test_retest_wide, Time1, Time2), alpha = 0.05, 
    lmer = TRUE, check.keys = FALSE)

Intraclass correlation coefficients 
                         type  ICC   F df1 df2       p lower bound upper bound
Single_raters_absolute   ICC1 0.79 8.5 279 280 1.9e-61        0.74        0.83
Single_random_raters     ICC2 0.79 8.5 279 279 1.8e-61        0.74        0.83
Single_fixed_raters      ICC3 0.79 8.5 279 279 1.8e-61        0.74        0.83
Average_raters_absolute ICC1k 0.88 8.5 279 280 1.9e-61        0.85        0.91
Average_random_raters   ICC2k 0.88 8.5 279 279 1.8e-61        0.85        0.91
Average_fixed_raters    ICC3k 0.88 8.5 279 279 1.8e-61        0.85        0.91

 Number of subjects = 280     Number of Judges =  2
See the help file for a discussion of the other 4 McGraw and Wong estimates,

ICC2
0.7976927

According to the Koo and Li (2016) thresholds (0.75 to 0.90 = good), this is a good intraclass correlation.

Test-retest reliability by item was good, ICC = 0.80 (95% CI 0.75 to 0.84).
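The Koo and Li (2016) labels can be written as a small lookup. This is a minimal sketch, not part of the original analysis: the function name `koo_li_label` is hypothetical, and the handling of values that fall exactly on a cut-off (0.50, 0.75, 0.90) is an assumption, since the guideline describes ranges rather than strict boundaries. Applying the label to both confidence limits, and not only to the point estimate, shows how wide the plausible range is.

```python
def koo_li_label(icc):
    """Map an ICC value to the Koo and Li (2016) reliability label.

    Boundary handling at exactly 0.50, 0.75 and 0.90 is an assumption.
    """
    if icc < 0.50:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"


# Point estimate and 95% CI limits as reported in the sentence above
estimate, lower, upper = 0.80, 0.75, 0.84
print(koo_li_label(estimate), koo_li_label(lower), koo_li_label(upper))
```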

Plot


Correcting for multiple measurements

Calculate Total Scores and ICC

Total scores for each respondent: test (Time1) and retest (Time2)

Respondent  Time1  Time2
         2   96.0   96.0
         3   83.0   79.0
         4  101.0  101.0
         5   70.0   62.0
         7   92.0   93.0
         8   68.0   72.0
        10   79.0   81.0
        13   91.0   79.0
        15   98.0   80.0
        16   94.0  100.0
        17  110.0  111.0
        18  101.0  106.0
        21  102.0  101.0
        23  103.0   96.0

Call: psych::ICC(x = select(test_retest_totals, Time1, Time2), alpha = 0.05, 
    lmer = TRUE, check.keys = FALSE)

Intraclass correlation coefficients 
                         type  ICC  F df1 df2       p lower bound upper bound
Single_raters_absolute   ICC1 0.87 14  13  14 7.9e-06        0.65        0.95
Single_random_raters     ICC2 0.87 14  13  13 1.2e-05        0.65        0.95
Single_fixed_raters      ICC3 0.87 14  13  13 1.2e-05        0.64        0.96
Average_raters_absolute ICC1k 0.93 14  13  14 7.9e-06        0.78        0.98
Average_random_raters   ICC2k 0.93 14  13  13 1.2e-05        0.79        0.98
Average_fixed_raters    ICC3k 0.93 14  13  13 1.2e-05        0.78        0.98

 Number of subjects = 14     Number of Judges =  2
See the help file for a discussion of the other 4 McGraw and Wong estimates,
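The total-score ICC2 above can be checked by hand from the two-way ANOVA mean squares. The sketch below is an independent Python re-computation, not the report's code: the report calls `psych::ICC` with `lmer = TRUE`, which estimates variance components with a mixed model, so small differences from the ANOVA formulas are possible. The function name `icc_two_way` is hypothetical. The data are the 14 total-score pairs listed earlier in this section.

```python
def icc_two_way(scores):
    """Return (ICC(2,1), ICC(2,k)): two-way random effects, absolute agreement.

    `scores` is a list of rows, one per subject, with one column per occasion.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for subjects (rows), occasions (columns) and residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    single = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    average = (msr - mse) / (msr + (msc - mse) / n)
    return single, average


# (Time1, Time2) total scores for the 14 respondents
totals = [
    (96, 96), (83, 79), (101, 101), (70, 62), (92, 93), (68, 72), (79, 81),
    (91, 79), (98, 80), (94, 100), (110, 111), (101, 106), (102, 101), (103, 96),
]
icc_single, icc_average = icc_two_way(totals)
print(f"ICC(2,1) = {icc_single:.3f}, ICC(2,k) = {icc_average:.3f}")
```

With these data the single-measure value should land close to the reported 0.87 and the average-measure value close to 0.93.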

Item-Level ICC (Optional)

Item-level ICC (based on 280 individual item responses): 0.789 (95% CI: 0.741 to 0.83)
Total score ICC (based on 14 respondents): 0.867 (95% CI: 0.646 to 0.955)

Summary

  • Test-retest reliability by item (based on 280 individual item responses) was good, ICC = 0.789 (95% CI: 0.741 to 0.83).
  • Test-retest reliability by respondent (n = 14) was good, ICC = 0.867 (95% CI: 0.646 to 0.955).

For the Methods section:

Test-retest reliability was assessed in a subsample of 14 respondents who completed the Jefferson Scale of Empathy twice. Intraclass correlation coefficients (ICC) were calculated using a two-way mixed-effects model (absolute agreement, single measures) for both item-level and total scores.

Cronbach’s Alpha for All 28 Students

Some items ( Q_Norm_6 Q_Norm_7 Q_Norm_17 Q_Norm_18 ) were negatively correlated with the first principal component and 
probably should be reversed.  
To do this, run the function again with the 'check.keys=TRUE' option

Reliability analysis   
Call: psych::alpha(x = cronbach_data)

  raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
      0.74      0.74    0.96      0.12 2.8 0.069  4.5 0.66     0.13

    95% confidence boundaries 
         lower alpha upper
Feldt     0.57  0.74  0.86
Duhachek  0.60  0.74  0.87

 Reliability if an item is dropped:
          raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
Q_Norm_1       0.69      0.70    0.95      0.11 2.3    0.082 0.073  0.12
Q_Norm_2       0.74      0.74    0.96      0.13 2.9    0.069 0.073  0.14
Q_Norm_3       0.73      0.73    0.96      0.12 2.7    0.070 0.073  0.13
Q_Norm_4       0.73      0.73    0.95      0.12 2.7    0.071 0.074  0.13
Q_Norm_5       0.73      0.73    0.96      0.12 2.7    0.072 0.071  0.13
Q_Norm_6       0.78      0.78    0.96      0.15 3.4    0.059 0.058  0.15
Q_Norm_7       0.74      0.74    0.96      0.13 2.8    0.068 0.071  0.14
Q_Norm_8       0.74      0.74    0.96      0.13 2.8    0.068 0.069  0.14
Q_Norm_9       0.74      0.74    0.96      0.13 2.9    0.068 0.069  0.14
Q_Norm_10      0.72      0.71    0.96      0.12 2.5    0.073 0.071  0.13
Q_Norm_11      0.71      0.72    0.95      0.12 2.5    0.076 0.069  0.13
Q_Norm_12      0.68      0.70    0.94      0.11 2.3    0.084 0.068  0.12
Q_Norm_13      0.73      0.73    0.95      0.13 2.7    0.070 0.069  0.13
Q_Norm_14      0.71      0.72    0.96      0.12 2.5    0.075 0.070  0.13
Q_Norm_15      0.70      0.70    0.95      0.11 2.4    0.078 0.067  0.13
Q_Norm_16      0.72      0.72    0.96      0.12 2.6    0.074 0.067  0.13
Q_Norm_17      0.75      0.75    0.96      0.13 3.0    0.065 0.071  0.15
Q_Norm_18      0.75      0.75    0.96      0.14 3.0    0.067 0.071  0.14
Q_Norm_19      0.72      0.71    0.95      0.12 2.5    0.074 0.069  0.13
Q_Norm_20      0.71      0.70    0.95      0.11 2.3    0.077 0.065  0.13

 Item statistics 
           n raw.r std.r  r.cor r.drop mean   sd
Q_Norm_1  28  0.74  0.70  0.709  0.664  3.9 1.96
Q_Norm_2  28  0.17  0.26  0.229  0.126  6.8 0.61
Q_Norm_3  28  0.40  0.40  0.383  0.289  3.8 1.62
Q_Norm_4  28  0.35  0.40  0.390  0.273  5.5 1.17
Q_Norm_5  28  0.46  0.41  0.399  0.316  4.0 2.20
Q_Norm_6  28 -0.21 -0.22 -0.231 -0.330  3.4 1.73
Q_Norm_7  28  0.27  0.30  0.289  0.161  5.4 1.55
Q_Norm_8  28  0.22  0.27  0.259  0.117  5.2 1.42
Q_Norm_9  28  0.27  0.21  0.201  0.145  3.2 1.78
Q_Norm_10 28  0.50  0.55  0.544  0.416  5.9 1.40
Q_Norm_11 28  0.58  0.53  0.535  0.492  3.6 1.69
Q_Norm_12 28  0.78  0.74  0.755  0.711  3.6 1.97
Q_Norm_13 28  0.35  0.35  0.345  0.224  4.1 1.76
Q_Norm_14 28  0.54  0.53  0.524  0.445  4.6 1.62
Q_Norm_15 28  0.65  0.68  0.683  0.563  5.5 1.69
Q_Norm_16 28  0.49  0.50  0.490  0.399  5.1 1.55
Q_Norm_17 28  0.21  0.17  0.147  0.069  3.6 1.91
Q_Norm_18 28  0.16  0.12  0.099  0.045  1.9 1.51
Q_Norm_19 28  0.51  0.56  0.564  0.425  5.9 1.36
Q_Norm_20 28  0.65  0.71  0.712  0.583  5.9 1.35

Non missing response frequency for each item
             1    2    3    4    5    6    7 miss
Q_Norm_1  0.07 0.21 0.21 0.18 0.00 0.18 0.14    0
Q_Norm_2  0.00 0.00 0.00 0.04 0.00 0.07 0.89    0
Q_Norm_3  0.11 0.14 0.18 0.21 0.18 0.18 0.00    0
Q_Norm_4  0.00 0.00 0.00 0.29 0.21 0.25 0.25    0
Q_Norm_5  0.18 0.14 0.07 0.18 0.18 0.00 0.25    0
Q_Norm_6  0.14 0.32 0.00 0.25 0.14 0.14 0.00    0
Q_Norm_7  0.04 0.00 0.07 0.18 0.11 0.32 0.29    0
Q_Norm_8  0.00 0.04 0.14 0.07 0.29 0.29 0.18    0
Q_Norm_9  0.18 0.25 0.18 0.11 0.11 0.18 0.00    0
Q_Norm_10 0.00 0.04 0.00 0.18 0.11 0.18 0.50    0
Q_Norm_11 0.04 0.25 0.36 0.07 0.14 0.04 0.11    0
Q_Norm_12 0.14 0.18 0.21 0.18 0.07 0.07 0.14    0
Q_Norm_13 0.04 0.21 0.11 0.29 0.14 0.07 0.14    0
Q_Norm_14 0.00 0.11 0.18 0.21 0.21 0.11 0.18    0
Q_Norm_15 0.00 0.04 0.11 0.21 0.14 0.00 0.50    0
Q_Norm_16 0.00 0.11 0.07 0.07 0.29 0.29 0.18    0
Q_Norm_17 0.14 0.18 0.14 0.29 0.07 0.04 0.14    0
Q_Norm_18 0.64 0.14 0.00 0.07 0.14 0.00 0.00    0
Q_Norm_19 0.00 0.00 0.04 0.21 0.07 0.14 0.54    0
Q_Norm_20 0.00 0.04 0.04 0.04 0.29 0.14 0.46    0

Internal consistency (Cronbach’s alpha) for the 20-item Jefferson Scale in the full sample (n = 28):

Cronbach's alpha = 0.737 (standardized alpha = 0.738)
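For reference, raw Cronbach's alpha is k / (k - 1) * (1 - sum of item variances / variance of the total score). The sketch below illustrates that formula, together with the reverse-keying that the `psych::alpha` warning suggests for negatively correlated items. It is a Python illustration on made-up toy data, not the survey responses and not the report's code. The names `reverse_key` and `cronbach_alpha` are hypothetical, and the 1 to 7 response range is taken from the response frequency table.

```python
from statistics import variance


def reverse_key(value, low=1, high=7):
    """Reverse-score a response on a low..high scale (1 <-> 7, 2 <-> 6, ...)."""
    return low + high - value


def cronbach_alpha(rows):
    """Raw Cronbach's alpha; `rows` holds one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (NOT the survey data): the third item is negatively keyed
toy = [[1, 2, 7], [2, 3, 6], [3, 4, 5], [4, 5, 4]]
keyed = [[a, b, reverse_key(c)] for a, b, c in toy]

alpha_raw = cronbach_alpha(toy)      # negative, because item 3 runs the other way
alpha_keyed = cronbach_alpha(keyed)  # items now move together
print(f"before keying: {alpha_raw:.2f}, after keying: {alpha_keyed:.2f}")
```

In this toy example alpha moves from a negative value to 1 once the third item is reversed, which shows how strongly an unreversed item can drag the coefficient down. Whether Q_Norm_6, Q_Norm_7, Q_Norm_17 and Q_Norm_18 should be reversed here depends on the scale's scoring key, which this report does not state.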