Load Packages

Import Data

##  [1] "patientname"             "lastfour"               
##  [3] "patientsid"              "gender"                 
##  [5] "CysCLabDate"             "correctedage...6"       
##  [7] "LabChemResultValue...7"  "CysCeGFR"               
##  [9] "ScrLabDate"              "correctedage...10"      
## [11] "LabChemResultValue...11" "ScreGFR"                
## [13] "eGFRDifference"          "Smoking"                
## [15] "BMI"

Create eGFR Difference Groups

eGFRDifference = ScreGFR - CysCeGFR
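
A minimal sketch of the grouping step, written in Python for illustration only. The function names `egfr_difference` and `egfr_group` are hypothetical, and the handling of boundary values is an assumption: the group labels leave differences of exactly 10, between 10 and 11, and exactly 30 unspecified, so this sketch sends values below 10 to Group 1, values from 10 through 30 to Group 2, and values above 30 to Group 3.

```python
def egfr_difference(scr_egfr: float, cysc_egfr: float) -> float:
    """eGFRDifference = ScreGFR - CysCeGFR, as defined in the report."""
    return scr_egfr - cysc_egfr


def egfr_group(difference: float) -> str:
    """Assign a difference to one of the three groups.

    Assumed cut points (not confirmed by the report): <10, 10 to 30, >30.
    """
    if difference < 10:
        return "Group 1: <10"
    if difference <= 30:
        return "Group 2: 11-30"
    return "Group 3: >30"


# Example inputs: the median ScreGFR and CysCeGFR from the
# Descriptive Statistics section (not a real patient record).
example_difference = egfr_difference(57.287, 37.938)
print(round(example_difference, 3), egfr_group(example_difference))
```

If the original analysis used different cut points, only the two comparisons in `egfr_group` would need to change.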

Descriptive Statistics

##  correctedage...6      BMI           CysCeGFR          ScreGFR       
##  Min.   : 20.00   Min.   :14.14   Min.   :  7.796   Min.   :  4.627  
##  1st Qu.: 62.00   1st Qu.:27.54   1st Qu.: 23.818   1st Qu.: 38.629  
##  Median : 73.00   Median :33.51   Median : 37.938   Median : 57.287  
##  Mean   : 69.13   Mean   :34.34   Mean   : 42.954   Mean   : 62.468  
##  3rd Qu.: 78.00   3rd Qu.:40.83   3rd Qu.: 57.010   3rd Qu.: 85.361  
##  Max.   :101.00   Max.   :97.60   Max.   :121.296   Max.   :154.002  
##  eGFRDifference  
##  Min.   :-36.12  
##  1st Qu.: 10.45  
##  Median : 17.28  
##  Mean   : 19.51  
##  3rd Qu.: 28.26  
##  Max.   :101.82

Frequency of eGFR Difference Groups

## 
##   Group 1: <10 Group 2: 11-30   Group 3: >30 
##            391            866            364
## 
##   Group 1: <10 Group 2: 11-30   Group 3: >30 
##       24.12091       53.42381       22.45527
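
The second table above is the first table expressed as percentages of the total. As a quick check, the short Python sketch below recomputes the percentages from the printed counts; the variable names are illustrative.

```python
# Group counts copied from the frequency table above.
counts = {
    "Group 1: <10": 391,
    "Group 2: 11-30": 866,
    "Group 3: >30": 364,
}

total = sum(counts.values())

# Percentage of patients in each group.
percentages = {group: 100 * n / total for group, n in counts.items()}

for group, pct in percentages.items():
    print(f"{group}: {pct:.5f}%")
```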

Summary Table by Group

Body Mass Index (BMI) is categorized as: Underweight (< 18.5), Normal weight (18.5–24.9), Overweight (25.0–29.9), and Obese (30.0 or higher)
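
The categorization can be sketched as a small Python function. The name `bmi_group` mirrors the row label in the summary table, but the implementation is an assumption: it uses half-open intervals so that values falling between the stated ranges (for example 24.95) are still assigned to a category.

```python
def bmi_group(bmi: float) -> str:
    """Map a BMI value to the four categories listed above.

    Assumption: each category runs up to, but does not include,
    the lower bound of the next one.
    """
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25.0:
        return "Normal"
    if bmi < 30.0:
        return "Overweight"
    return "Obese"


# Example inputs: BMI quartiles from the Descriptive Statistics section.
for value in (14.14, 27.54, 33.51, 40.83):
    print(value, bmi_group(value))
```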

Characteristic     Group 1: <10    Group 2: 11-30    Group 3: >30    p-value²
                   N = 172¹        N = 381¹          N = 165¹
gender                                                               0.003
    F              10 (5.8%)       33 (8.7%)         27 (16%)
    M              162 (94%)       348 (91%)         138 (84%)
Smoking                                                              0.3
    CURRENT        14 (8.1%)       56 (15%)          23 (14%)
    FORMER         89 (52%)        185 (49%)         79 (48%)
    NEVER          69 (40%)        140 (37%)         63 (38%)
bmi_group
    Normal         27 (16%)        42 (11%)          14 (8.5%)
    Obese          97 (56%)        239 (63%)         130 (79%)
    Overweight     46 (27%)        99 (26%)          20 (12%)
    Underweight    2 (1.2%)        1 (0.3%)          1 (0.6%)
¹ n (%)
² Pearson’s Chi-squared test; NA

Paired t-test: ScreGFR vs CysCeGFR

Results: The mean difference between ScreGFR and CysCeGFR is statistically significant (mean difference 19.62, 95% CI 18.90 to 20.34, p < 2.2e-16).

## 
##  Paired t-test
## 
## data:  paired_data$ScreGFR and paired_data$CysCeGFR
## t = 53.609, df = 1669, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  18.90153 20.33714
## sample estimates:
## mean difference 
##        19.61933
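
The printed quantities are linked: the standard error is the mean difference divided by t, and the confidence interval is the mean difference plus or minus a critical value times that standard error. The Python sketch below retraces this from the printed numbers. It uses the normal quantile in place of the t quantile, which is an approximation that is reasonable only because df is large, so the result matches the printed interval to about three decimal places rather than exactly.

```python
from math import sqrt
from statistics import NormalDist

# Values copied from the paired t-test output above.
mean_diff = 19.61933
t_stat = 53.609
df = 1669

n_pairs = df + 1                      # paired t-test: df = n - 1
std_error = mean_diff / t_stat        # t = mean difference / standard error
sd_diff = std_error * sqrt(n_pairs)   # implied SD of the paired differences

# Normal approximation to the t critical value (assumption; fine for large df).
z_crit = NormalDist().inv_cdf(0.975)
ci_low = mean_diff - z_crit * std_error
ci_high = mean_diff + z_crit * std_error

print(f"n = {n_pairs}, SE = {std_error:.4f}, SD of differences = {sd_diff:.2f}")
print(f"approximate 95% CI: {ci_low:.3f} to {ci_high:.3f}")
```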

ANOVA: eGFR Difference by Group

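Results: At least one group mean differs (F = 2203, p < 2e-16). In Tukey's multiple comparisons of means, all three pairwise differences are significant. This is expected, because the groups are defined by binning eGFRDifference itself.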

##               Df Sum Sq Mean Sq F value Pr(>F)    
## eGFRGroup      2 269938  134969    2203 <2e-16 ***
## Residuals   1618  99150      61                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = eGFRDifference ~ eGFRGroup, data = anova_data)
## 
## $eGFRGroup
##                                 diff      lwr      upr p adj
## Group 2: 11-30-Group 1: <10 16.68523 15.56636 17.80410     0
## Group 3: >30-Group 1: <10   37.77532 36.43782 39.11282     0
## Group 3: >30-Group 2: 11-30 21.09009 19.94299 22.23719     0
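
The F value in an ANOVA table is the ratio of the two mean squares, each of which is a sum of squares divided by its degrees of freedom. A small Python sketch of that arithmetic follows, using the printed values; the helper name `f_statistic` is illustrative, and because the printed sums of squares are rounded the result agrees with the table only approximately.

```python
def f_statistic(ss_between: float, df_between: int,
                ss_within: float, df_within: int) -> float:
    """F = (SS_between / df_between) / (SS_within / df_within)."""
    mean_sq_between = ss_between / df_between
    mean_sq_within = ss_within / df_within
    return mean_sq_between / mean_sq_within


# Values copied from the eGFRDifference ~ eGFRGroup table above.
f_value = f_statistic(269938, 2, 99150, 1618)
print(f"F = {f_value:.1f}")
```

The same function applies to the BMI and age ANOVA tables in the following sections.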

Compare BMI by eGFR Group

This tests whether mean BMI differs across the three eGFR groups. Results: At least one group's mean BMI differs (F = 20.81, p = 1.65e-09).

##              Df Sum Sq Mean Sq F value   Pr(>F)    
## eGFRGroup     2   2910  1454.8   20.81 1.65e-09 ***
## Residuals   715  49988    69.9                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Compare Age by eGFR Group

This tests whether mean age differs across the three eGFR groups. Results: At least one group's mean age differs (F = 15.49, p = 2.17e-07).

##               Df Sum Sq Mean Sq F value   Pr(>F)    
## eGFRGroup      2   6781    3390   15.49 2.17e-07 ***
## Residuals   1618 354104     219                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Chi-square Test: Gender by eGFR Group

This tests whether gender and eGFR group are associated. Results: Gender and eGFR group are significantly associated (X-squared = 25.158, df = 2, p = 3.444e-06).

##    
##     Group 1: <10 Group 2: 11-30 Group 3: >30
##   F           24            100           65
##   M          367            766          299
## 
##  Pearson's Chi-squared test
## 
## data:  gender_table
## X-squared = 25.158, df = 2, p-value = 3.444e-06

Chi-square Test: Smoking by eGFR Group

This tests whether smoking status and eGFR group are associated. Results: At the 5% significance level, smoking status and eGFR group are not significantly associated (X-squared = 8.4538, df = 4, p = 0.0763).

##          
##           Group 1: <10 Group 2: 11-30 Group 3: >30
##   CURRENT           38            127           53
##   FORMER           181            411          175
##   NEVER            172            328          136
## 
##  Pearson's Chi-squared test
## 
## data:  smoking_table
## X-squared = 8.4538, df = 4, p-value = 0.0763
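
Both chi-square results can be retraced from the printed contingency tables. The Python sketch below computes Pearson's statistic from observed and expected counts, then the p-value from a closed-form upper-tail formula that holds only when the degrees of freedom are even (true here: df = 2 and df = 4). The function names are illustrative and this is not the code that produced the report.

```python
from math import exp, factorial


def chi_square_statistic(table):
    """Pearson's chi-square statistic and df for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi_square_p_value_even_df(stat, df):
    """Upper-tail probability; this closed form is valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires positive, even df")
    half = stat / 2
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


# Counts copied from the gender and smoking tables above.
gender_table = [[24, 100, 65], [367, 766, 299]]
smoking_table = [[38, 127, 53], [181, 411, 175], [172, 328, 136]]

gender_stat, gender_df = chi_square_statistic(gender_table)
smoking_stat, smoking_df = chi_square_statistic(smoking_table)
gender_p = chi_square_p_value_even_df(gender_stat, gender_df)
smoking_p = chi_square_p_value_even_df(smoking_stat, smoking_df)

print(f"gender:  X-squared = {gender_stat:.3f}, df = {gender_df}, p = {gender_p:.3e}")
print(f"smoking: X-squared = {smoking_stat:.4f}, df = {smoking_df}, p = {smoking_p:.4f}")
```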

Linear Regression Model

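We fit a linear model with eGFRDifference as the outcome and age, gender, BMI, and smoking status as predictors. Results: All predictors are significant at the 5% level, with BMI showing the strongest association. However, the model explains only about 8.6% of the variance in eGFRDifference (multiple R-squared = 0.086), so we cannot rely on it.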

## 
## Call:
## lm(formula = eGFRDifference ~ correctedage...6 + gender + BMI + 
##     Smoking, data = reg_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -53.783  -8.359  -1.788   7.710  87.023 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       4.35051    4.42195   0.984 0.325513    
## correctedage...6  0.11357    0.04272   2.658 0.008025 ** 
## genderM          -6.24007    1.86541  -3.345 0.000864 ***
## BMI               0.49240    0.06717   7.331    6e-13 ***
## SmokingFORMER    -4.05322    1.64302  -2.467 0.013853 *  
## SmokingNEVER     -5.19722    1.68242  -3.089 0.002082 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14.13 on 741 degrees of freedom
## Multiple R-squared:  0.08589,    Adjusted R-squared:  0.07972 
## F-statistic: 13.92 on 5 and 741 DF,  p-value: 5.024e-13
term              estimate  std.error  statistic  p.value  conf.low  conf.high
(Intercept)          4.351      4.422      0.984    0.326    -4.331     13.032
correctedage...6     0.114      0.043      2.658    0.008     0.030      0.197
genderM             -6.240      1.865     -3.345    0.001    -9.902     -2.578
BMI                  0.492      0.067      7.331    0.000     0.361      0.624
SmokingFORMER       -4.053      1.643     -2.467    0.014    -7.279     -0.828
SmokingNEVER        -5.197      1.682     -3.089    0.002    -8.500     -1.894
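
To show how the coefficients combine, and not as a recommendation to use this poorly fitting model for prediction, here is a small Python sketch that evaluates the fitted equation for one hypothetical patient. The function name and the patient values are made up for illustration. The reference levels (gender F, smoking CURRENT) are inferred from the coefficient names genderM, SmokingFORMER and SmokingNEVER.

```python
# Coefficient estimates copied from the lm() summary above.
coefficients = {
    "(Intercept)": 4.35051,
    "correctedage...6": 0.11357,
    "genderM": -6.24007,
    "BMI": 0.49240,
    "SmokingFORMER": -4.05322,
    "SmokingNEVER": -5.19722,
}


def fitted_difference(age: float, bmi: float, gender: str, smoking: str) -> float:
    """Fitted eGFRDifference; reference levels are F and CURRENT."""
    value = coefficients["(Intercept)"]
    value += coefficients["correctedage...6"] * age
    value += coefficients["BMI"] * bmi
    if gender == "M":
        value += coefficients["genderM"]
    if smoking == "FORMER":
        value += coefficients["SmokingFORMER"]
    elif smoking == "NEVER":
        value += coefficients["SmokingNEVER"]
    return value


# Hypothetical patient: 70-year-old male former smoker with BMI 30.
example_fit = fitted_difference(age=70, bmi=30, gender="M", smoking="FORMER")
print(f"fitted eGFRDifference: {example_fit:.2f}")
```

With a residual standard error of 14.13, any single fitted value carries wide uncertainty.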

Multinomial Logistic Regression

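In this model, the outcome variable is eGFRGroup: Group 1: <10, Group 2: 11-30, Group 3: >30.
Unlike the linear regression, this model treats the outcome as categorical. The reference category is Group 1. Results: all predictors are significant at the 5% level except gender in the Group 2 vs Group 1 comparison (p = 0.085).
We can examine the odds ratios, which are the exponentiated coefficients reported in the estimate column of the second table. For instance, for every 1-unit increase in BMI, the odds of being in Group 2 rather than Group 1 are estimated to increase by a factor of exp(0.0606) = 1.062 (about 6%), holding all other variables fixed.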

## # weights:  21 (12 variable)
## initial  value 788.803623 
## iter  10 value 696.704635
## final  value 694.569959 
## converged
## Call:
## multinom(formula = eGFRGroup ~ correctedage...6 + gender + BMI + 
##     Smoking, data = multi_data)
## 
## Coefficients:
##                (Intercept) correctedage...6    genderM        BMI SmokingFORMER
## Group 2: 11-30   -1.844526       0.03030899 -0.6820662 0.06058242    -0.9212411
## Group 3: >30     -2.875243       0.01877315 -1.1906104 0.09981461    -0.8150301
##                SmokingNEVER
## Group 2: 11-30   -0.9882738
## Group 3: >30     -1.0759661
## 
## Std. Errors:
##                (Intercept) correctedage...6   genderM        BMI SmokingFORMER
## Group 2: 11-30   0.8719064      0.007877506 0.3959867 0.01400878     0.3384408
## Group 3: >30     0.9946748      0.009155148 0.4207673 0.01612790     0.3930728
##                SmokingNEVER
## Group 2: 11-30    0.3446783
## Group 3: >30      0.4012563
## 
## Residual Deviance: 1389.14 
## AIC: 1413.14
y.level         term              estimate  std.error  statistic  p.value  conf.low  conf.high
Group 2: 11-30  (Intercept)          0.158      0.872     -2.116    0.034     0.029      0.873
Group 2: 11-30  correctedage...6     1.031      0.008      3.848    0.000     1.015      1.047
Group 2: 11-30  genderM              0.506      0.396     -1.722    0.085     0.233      1.099
Group 2: 11-30  BMI                  1.062      0.014      4.325    0.000     1.034      1.092
Group 2: 11-30  SmokingFORMER        0.398      0.338     -2.722    0.006     0.205      0.773
Group 2: 11-30  SmokingNEVER         0.372      0.345     -2.867    0.004     0.189      0.731
Group 3: >30    (Intercept)          0.056      0.995     -2.891    0.004     0.008      0.396
Group 3: >30    correctedage...6     1.019      0.009      2.051    0.040     1.001      1.037
Group 3: >30    genderM              0.304      0.421     -2.830    0.005     0.133      0.694
Group 3: >30    BMI                  1.105      0.016      6.189    0.000     1.071      1.140
Group 3: >30    SmokingFORMER        0.443      0.393     -2.073    0.038     0.205      0.956
Group 3: >30    SmokingNEVER         0.341      0.401     -2.681    0.007     0.155      0.749
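
The estimate column in the table above holds odds ratios, that is, the exponentiated coefficients from the multinom output, while std.error stays on the log-odds scale. The Python sketch below retraces one row under the assumption that the statistic, p-value and interval are Wald quantities (coefficient divided by standard error, normal reference distribution). The printed numbers are consistent with that assumption, but the report does not state it. The function name `wald_summary` is illustrative.

```python
from math import exp
from statistics import NormalDist


def wald_summary(coef: float, std_error: float, level: float = 0.95) -> dict:
    """Odds ratio, Wald z, two-sided p-value and CI from a log-odds coefficient."""
    normal = NormalDist()
    z = coef / std_error
    p_value = 2 * (1 - normal.cdf(abs(z)))
    z_crit = normal.inv_cdf(1 - (1 - level) / 2)
    return {
        "odds_ratio": exp(coef),
        "statistic": z,
        "p_value": p_value,
        "conf_low": exp(coef - z_crit * std_error),
        "conf_high": exp(coef + z_crit * std_error),
    }


# Group 2 vs Group 1 coefficients and standard errors from the multinom output.
bmi_group2 = wald_summary(0.06058242, 0.01400878)
gender_group2 = wald_summary(-0.6820662, 0.3959867)

for name, row in (("BMI", bmi_group2), ("genderM", gender_group2)):
    print(name, {key: round(val, 3) for key, val in row.items()})
```

The BMI odds ratio of about 1.062 means roughly 6% higher odds of Group 2 versus Group 1 per BMI unit. The genderM interval spans 1, which matches its p-value of 0.085.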

Scatterplot: ScreGFR vs CysCeGFR

Boxplot: BMI by eGFR Group

Boxplot: Age by eGFR Group

Histogram of eGFR Difference

Summary

This analysis classifies patients into three groups based on the difference between creatinine-based eGFR and cystatin C-based eGFR (eGFRDifference = ScreGFR - CysCeGFR): Group 1 (<10), Group 2 (11-30), and Group 3 (>30).

Missing values are removed separately for each analysis so that each test uses complete available data for the variables involved.
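
This per-analysis handling of missing values is consistent with the different sample sizes seen across the outputs (for example, df = 1669 for the paired t-test but 741 residual degrees of freedom for the linear regression). A minimal Python sketch of the idea follows. The records are entirely hypothetical and the helper name `complete_cases` is illustrative.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"ScreGFR": 60.0, "CysCeGFR": 40.0, "BMI": 31.0, "Smoking": "FORMER"},
    {"ScreGFR": 85.0, "CysCeGFR": 57.0, "BMI": None, "Smoking": "NEVER"},
    {"ScreGFR": 38.0, "CysCeGFR": None, "BMI": 27.5, "Smoking": None},
]


def complete_cases(rows, variables):
    """Keep only rows with no missing value in the listed variables."""
    return [row for row in rows
            if all(row.get(var) is not None for var in variables)]


# Each analysis filters on its own variables only.
paired_rows = complete_cases(records, ["ScreGFR", "CysCeGFR"])
regression_rows = complete_cases(records, ["ScreGFR", "CysCeGFR", "BMI", "Smoking"])

print(len(paired_rows), "rows for the paired comparison")
print(len(regression_rows), "rows for the regression")
```

The trade-off is that different analyses describe somewhat different subsets of patients, so their results are not directly comparable row for row.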