## [1] "patientname" "lastfour"
## [3] "patientsid" "gender"
## [5] "CysCLabDate" "correctedage...6"
## [7] "LabChemResultValue...7" "CysCeGFR"
## [9] "ScrLabDate" "correctedage...10"
## [11] "LabChemResultValue...11" "ScreGFR"
## [13] "eGFRDifference" "Smoking"
## [15] "BMI"
eGFRDifference = ScreGFR - CysCeGFR
## correctedage...6 BMI CysCeGFR ScreGFR
## Min. : 20.00 Min. :14.14 Min. : 7.796 Min. : 4.627
## 1st Qu.: 62.00 1st Qu.:27.54 1st Qu.: 23.818 1st Qu.: 38.629
## Median : 73.00 Median :33.51 Median : 37.938 Median : 57.287
## Mean : 69.13 Mean :34.34 Mean : 42.954 Mean : 62.468
## 3rd Qu.: 78.00 3rd Qu.:40.83 3rd Qu.: 57.010 3rd Qu.: 85.361
## Max. :101.00 Max. :97.60 Max. :121.296 Max. :154.002
## eGFRDifference
## Min. :-36.12
## 1st Qu.: 10.45
## Median : 17.28
## Mean : 19.51
## 3rd Qu.: 28.26
## Max. :101.82
##
## Group 1: <10 Group 2: 11-30 Group 3: >30
## 391 866 364
##
## Group 1: <10 Group 2: 11-30 Group 3: >30
## 24.12091 53.42381 22.45527
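The second table is the first one expressed as percentages of the 1621 classified patients. As a minimal sketch (the variable names are illustrative, not taken from the original script), the conversion is:

```python
# Group counts copied from the frequency table above.
counts = {"Group 1: <10": 391, "Group 2: 11-30": 866, "Group 3: >30": 364}

total = sum(counts.values())

# Share of patients in each group, as a percentage of all classified patients.
percentages = {group: 100 * n / total for group, n in counts.items()}

for group, pct in percentages.items():
    print(f"{group}: {pct:.5f}%")
```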
Body Mass Index (BMI) is categorized as: Underweight (< 18.5), Normal weight (18.5–24.9), Overweight (25.0–29.9), and Obese (30.0 or higher).
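The categorization can be sketched as a small function. The name `bmi_category` is hypothetical, and the handling of values that fall between the stated ranges (for example 24.95) is an assumption: half-open intervals are used here so that every value receives exactly one label.

```python
def bmi_category(bmi: float) -> str:
    """Map a BMI value to one of the four categories described above.

    Assumption: boundaries are treated as half-open intervals
    [18.5, 25.0), [25.0, 30.0), so values such as 24.95 count as Normal.
    """
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25.0:
        return "Normal"
    if bmi < 30.0:
        return "Overweight"
    return "Obese"


# Example: the minimum and median BMI from the summary output.
for value in (14.14, 33.51):
    print(value, bmi_category(value))
```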
| Characteristic | Group 1: <10, N = 172¹ | Group 2: 11-30, N = 381¹ | Group 3: >30, N = 165¹ | p-value² |
|---|---|---|---|---|
| gender | | | | 0.003 |
| F | 10 (5.8%) | 33 (8.7%) | 27 (16%) | |
| M | 162 (94%) | 348 (91%) | 138 (84%) | |
| Smoking | | | | 0.3 |
| CURRENT | 14 (8.1%) | 56 (15%) | 23 (14%) | |
| FORMER | 89 (52%) | 185 (49%) | 79 (48%) | |
| NEVER | 69 (40%) | 140 (37%) | 63 (38%) | |
| bmi_group | | | | |
| Normal | 27 (16%) | 42 (11%) | 14 (8.5%) | |
| Obese | 97 (56%) | 239 (63%) | 130 (79%) | |
| Overweight | 46 (27%) | 99 (26%) | 20 (12%) | |
| Underweight | 2 (1.2%) | 1 (0.3%) | 1 (0.6%) | |

¹ n (%)
² Pearson’s Chi-squared test; NA
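Results: The mean difference between ScreGFR and CysCeGFR is statistically significant (paired t-test, t = 53.609, df = 1669, p < 2.2e-16). On average, ScreGFR exceeds CysCeGFR by 19.62 (95% CI 18.90 to 20.34).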
##
## Paired t-test
##
## data: paired_data$ScreGFR and paired_data$CysCeGFR
## t = 53.609, df = 1669, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## 18.90153 20.33714
## sample estimates:
## mean difference
## 19.61933
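The reported confidence interval can be approximately recovered from the mean difference and the t statistic alone. This is a rough check rather than the original computation: it assumes a normal quantile in place of the t quantile, which is very close at 1669 degrees of freedom.

```python
from statistics import NormalDist

mean_diff = 19.61933  # reported mean difference (ScreGFR - CysCeGFR)
t_stat = 53.609       # reported t statistic
df = 1669             # reported degrees of freedom
n_pairs = df + 1      # a paired t-test has n - 1 degrees of freedom

# The t statistic is mean / standard error, so the standard error follows.
std_error = mean_diff / t_stat

# Assumption: normal approximation to the t quantile (adequate for large df).
z_crit = NormalDist().inv_cdf(0.975)

ci_low = mean_diff - z_crit * std_error
ci_high = mean_diff + z_crit * std_error

print(f"n = {n_pairs}, SE = {std_error:.4f}, 95% CI ~ ({ci_low:.3f}, {ci_high:.3f})")
```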
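This tests whether the mean eGFRDifference differs across the three eGFR groups. Results: At least one group mean is different (F = 2203, p < 2e-16). In the Tukey multiple comparisons of means, all pairs of group means are significantly different.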
## Df Sum Sq Mean Sq F value Pr(>F)
## eGFRGroup 2 269938 134969 2203 <2e-16 ***
## Residuals 1618 99150 61
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = eGFRDifference ~ eGFRGroup, data = anova_data)
##
## $eGFRGroup
## diff lwr upr p adj
## Group 2: 11-30-Group 1: <10 16.68523 15.56636 17.80410 0
## Group 3: >30-Group 1: <10 37.77532 36.43782 39.11282 0
## Group 3: >30-Group 2: 11-30 21.09009 19.94299 22.23719 0
This tests whether the mean BMI differs across the three eGFR groups. Results: At least one group mean BMI is different (F = 20.81, p = 1.65e-09).
## Df Sum Sq Mean Sq F value Pr(>F)
## eGFRGroup 2 2910 1454.8 20.81 1.65e-09 ***
## Residuals 715 49988 69.9
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This tests whether the mean age differs across the three eGFR groups. Results: At least one group mean age is different (F = 15.49, p = 2.17e-07).
## Df Sum Sq Mean Sq F value Pr(>F)
## eGFRGroup 2 6781 3390 15.49 2.17e-07 ***
## Residuals 1618 354104 219
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
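In each of the three one-way ANOVA tables above, the F value is the ratio of the between-group mean square to the residual mean square. A short sketch of that arithmetic, using the rounded sums of squares from the printed tables (so the results agree only to the displayed precision); the helper name `anova_f` is illustrative:

```python
def anova_f(ss_group: float, df_group: int, ss_resid: float, df_resid: int) -> float:
    """F statistic for a one-way ANOVA from its sums of squares."""
    ms_group = ss_group / df_group  # between-group mean square
    ms_resid = ss_resid / df_resid  # residual (within-group) mean square
    return ms_group / ms_resid


# Inputs copied from the three ANOVA tables (values are rounded in the output).
f_difference = anova_f(269938, 2, 99150, 1618)
f_bmi = anova_f(2910, 2, 49988, 715)
f_age = anova_f(6781, 2, 354104, 1618)

print(f"eGFRDifference: F ~ {f_difference:.0f}")
print(f"BMI:            F ~ {f_bmi:.2f}")
print(f"Age:            F ~ {f_age:.2f}")
```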
This tests whether gender and eGFR group are associated. Results: Gender and eGFR group are significantly associated (X-squared = 25.158, df = 2, p = 3.444e-06).
##
## Group 1: <10 Group 2: 11-30 Group 3: >30
## F 24 100 65
## M 367 766 299
##
## Pearson's Chi-squared test
##
## data: gender_table
## X-squared = 25.158, df = 2, p-value = 3.444e-06
This tests whether smoking status and eGFR group are associated. Results: At the 5% significance level, smoking status and eGFR group are not significantly associated (p = 0.0763).
##
## Group 1: <10 Group 2: 11-30 Group 3: >30
## CURRENT 38 127 53
## FORMER 181 411 175
## NEVER 172 328 136
##
## Pearson's Chi-squared test
##
## data: smoking_table
## X-squared = 8.4538, df = 4, p-value = 0.0763
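Both Pearson chi-squared statistics above can be recomputed directly from the printed contingency tables. The sketch below is a plain-Python illustration, not the original R call. The function names are hypothetical, and the p-value helper uses a closed form that is valid only for even degrees of freedom, which covers both tables here (df = 2 and df = 4).

```python
import math


def chi_square_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi_square_upper_tail_even_df(x, df):
    """Upper-tail probability of a chi-squared variable; even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


gender_table = [[24, 100, 65], [367, 766, 299]]                       # rows: F, M
smoking_table = [[38, 127, 53], [181, 411, 175], [172, 328, 136]]     # rows: CURRENT, FORMER, NEVER

gender_stat, gender_df = chi_square_independence(gender_table)
smoking_stat, smoking_df = chi_square_independence(smoking_table)
gender_p = chi_square_upper_tail_even_df(gender_stat, gender_df)
smoking_p = chi_square_upper_tail_even_df(smoking_stat, smoking_df)

print(f"gender:  X-squared = {gender_stat:.3f}, df = {gender_df}, p = {gender_p:.3e}")
print(f"smoking: X-squared = {smoking_stat:.4f}, df = {smoking_df}, p = {smoking_p:.4f}")
```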
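We fit a linear regression model with outcome variable eGFRDifference and predictors age, gender, BMI, and smoking. Results: All predictors are significant at the 5% level, with BMI the strongest (p = 6e-13). However, the model fit is poor: it explains only about 8.6% of the variance in eGFRDifference (multiple R-squared = 0.086, adjusted R-squared = 0.080), so we cannot rely on this model.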
##
## Call:
## lm(formula = eGFRDifference ~ correctedage...6 + gender + BMI +
## Smoking, data = reg_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -53.783 -8.359 -1.788 7.710 87.023
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.35051 4.42195 0.984 0.325513
## correctedage...6 0.11357 0.04272 2.658 0.008025 **
## genderM -6.24007 1.86541 -3.345 0.000864 ***
## BMI 0.49240 0.06717 7.331 6e-13 ***
## SmokingFORMER -4.05322 1.64302 -2.467 0.013853 *
## SmokingNEVER -5.19722 1.68242 -3.089 0.002082 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 14.13 on 741 degrees of freedom
## Multiple R-squared: 0.08589, Adjusted R-squared: 0.07972
## F-statistic: 13.92 on 5 and 741 DF, p-value: 5.024e-13
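The adjusted R-squared and the overall F-statistic both follow from the multiple R-squared and the degrees of freedom. A brief check, using the rounded values printed above and assuming the usual formulas for a linear model with an intercept:

```python
r_squared = 0.08589  # reported multiple R-squared
n_predictors = 5     # age, genderM, BMI, SmokingFORMER, SmokingNEVER
df_resid = 741       # reported residual degrees of freedom

# Residual df = n - (number of predictors) - 1 (intercept), so n follows.
n_obs = df_resid + n_predictors + 1

# Adjusted R-squared penalizes the fit for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_resid

# Overall F: explained variance per predictor over residual variance per df.
f_stat = (r_squared / n_predictors) / ((1 - r_squared) / df_resid)

print(f"n = {n_obs}, adjusted R-squared ~ {adj_r_squared:.5f}, F ~ {f_stat:.2f}")
```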
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | 4.351 | 4.422 | 0.984 | 0.326 | -4.331 | 13.032 |
| correctedage...6 | 0.114 | 0.043 | 2.658 | 0.008 | 0.030 | 0.197 |
| genderM | -6.240 | 1.865 | -3.345 | 0.001 | -9.902 | -2.578 |
| BMI | 0.492 | 0.067 | 7.331 | 0.000 | 0.361 | 0.624 |
| SmokingFORMER | -4.053 | 1.643 | -2.467 | 0.014 | -7.279 | -0.828 |
| SmokingNEVER | -5.197 | 1.682 | -3.089 | 0.002 | -8.500 | -1.894 |
In this model, the outcome variable is eGFRGroup: Group 1: <10, Group 2: 11-30, Group 3: >30.
Unlike the linear regression above, this is a multinomial logistic regression, with Group 1 as the reference category. Results: all predictors are significant at the 5% level except gender in the Group 2 versus Group 1 comparison (p = 0.085).
We can examine the odds ratios. For instance, for every 1-unit increase in BMI, the odds of being in Group 2 rather than Group 1 are estimated to increase by a factor of exp(0.0606) = 1.062, or about 6%, holding all other variables fixed.
## # weights: 21 (12 variable)
## initial value 788.803623
## iter 10 value 696.704635
## final value 694.569959
## converged
## Call:
## multinom(formula = eGFRGroup ~ correctedage...6 + gender + BMI +
## Smoking, data = multi_data)
##
## Coefficients:
## (Intercept) correctedage...6 genderM BMI SmokingFORMER
## Group 2: 11-30 -1.844526 0.03030899 -0.6820662 0.06058242 -0.9212411
## Group 3: >30 -2.875243 0.01877315 -1.1906104 0.09981461 -0.8150301
## SmokingNEVER
## Group 2: 11-30 -0.9882738
## Group 3: >30 -1.0759661
##
## Std. Errors:
## (Intercept) correctedage...6 genderM BMI SmokingFORMER
## Group 2: 11-30 0.8719064 0.007877506 0.3959867 0.01400878 0.3384408
## Group 3: >30 0.9946748 0.009155148 0.4207673 0.01612790 0.3930728
## SmokingNEVER
## Group 2: 11-30 0.3446783
## Group 3: >30 0.4012563
##
## Residual Deviance: 1389.14
## AIC: 1413.14
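The coefficients above are on the log-odds scale, so odds ratios are obtained by exponentiating them. As a sketch for the BMI term in the Group 2 versus Group 1 comparison, assuming a Wald (normal-approximation) confidence interval and test:

```python
import math
from statistics import NormalDist

coef = 0.06058242     # BMI coefficient, Group 2: 11-30 vs Group 1 (log-odds)
std_err = 0.01400878  # its standard error

# Odds ratio per 1-unit increase in BMI.
odds_ratio = math.exp(coef)

# Wald 95% interval, computed on the log-odds scale and then exponentiated.
z_crit = NormalDist().inv_cdf(0.975)
ci_low = math.exp(coef - z_crit * std_err)
ci_high = math.exp(coef + z_crit * std_err)

# Wald z statistic and two-sided p-value.
z_stat = coef / std_err
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

# Odds ratios multiply, so a 5-unit BMI increase corresponds to exp(5 * coef).
odds_ratio_5_units = math.exp(5 * coef)

print(f"OR = {odds_ratio:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f}), z = {z_stat:.3f}")
```

Because the model is multiplicative on the odds scale, small per-unit odds ratios can still imply a sizeable change across the observed BMI range.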
| y.level | term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|---|
| Group 2: 11-30 | (Intercept) | 0.158 | 0.872 | -2.116 | 0.034 | 0.029 | 0.873 |
| Group 2: 11-30 | correctedage...6 | 1.031 | 0.008 | 3.848 | 0.000 | 1.015 | 1.047 |
| Group 2: 11-30 | genderM | 0.506 | 0.396 | -1.722 | 0.085 | 0.233 | 1.099 |
| Group 2: 11-30 | BMI | 1.062 | 0.014 | 4.325 | 0.000 | 1.034 | 1.092 |
| Group 2: 11-30 | SmokingFORMER | 0.398 | 0.338 | -2.722 | 0.006 | 0.205 | 0.773 |
| Group 2: 11-30 | SmokingNEVER | 0.372 | 0.345 | -2.867 | 0.004 | 0.189 | 0.731 |
| Group 3: >30 | (Intercept) | 0.056 | 0.995 | -2.891 | 0.004 | 0.008 | 0.396 |
| Group 3: >30 | correctedage...6 | 1.019 | 0.009 | 2.051 | 0.040 | 1.001 | 1.037 |
| Group 3: >30 | genderM | 0.304 | 0.421 | -2.830 | 0.005 | 0.133 | 0.694 |
| Group 3: >30 | BMI | 1.105 | 0.016 | 6.189 | 0.000 | 1.071 | 1.140 |
| Group 3: >30 | SmokingFORMER | 0.443 | 0.393 | -2.073 | 0.038 | 0.205 | 0.956 |
| Group 3: >30 | SmokingNEVER | 0.341 | 0.401 | -2.681 | 0.007 | 0.155 | 0.749 |
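A multinomial logistic model of this form turns the fitted log-odds coefficients into group probabilities by fixing the reference group's score at 0 and normalizing the exponentiated scores. The sketch below illustrates this with the coefficients from the `multinom` output. The patient profile (70-year-old male never-smoker with BMI 30) is hypothetical, and the function and key names are made up for illustration.

```python
import math

# Log-odds coefficients copied from the multinom output (reference: Group 1).
COEFS = {
    "Group 2: 11-30": {"intercept": -1.844526, "age": 0.03030899, "male": -0.6820662,
                       "bmi": 0.06058242, "former": -0.9212411, "never": -0.9882738},
    "Group 3: >30": {"intercept": -2.875243, "age": 0.01877315, "male": -1.1906104,
                     "bmi": 0.09981461, "former": -0.8150301, "never": -1.0759661},
}


def predict_probabilities(age, male, bmi, smoking):
    """Predicted probability of each eGFR group for one covariate profile."""
    x = {
        "intercept": 1.0,
        "age": age,
        "male": 1.0 if male else 0.0,
        "bmi": bmi,
        "former": 1.0 if smoking == "FORMER" else 0.0,
        "never": 1.0 if smoking == "NEVER" else 0.0,
    }
    # The reference group has a linear score of 0 by construction.
    scores = {"Group 1: <10": 0.0}
    for group, beta in COEFS.items():
        scores[group] = sum(beta[name] * x[name] for name in beta)
    denom = sum(math.exp(score) for score in scores.values())
    return {group: math.exp(score) / denom for group, score in scores.items()}


# Hypothetical profile, chosen only to illustrate the calculation.
probs = predict_probabilities(age=70, male=True, bmi=30.0, smoking="NEVER")
for group, p in probs.items():
    print(f"{group}: {p:.3f}")
```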
This analysis classifies patients into three groups based on the difference between creatinine-based eGFR and cystatin C-based eGFR (eGFRDifference = ScreGFR - CysCeGFR): Group 1 (<10), Group 2 (11-30), and Group 3 (>30).
Missing values are removed separately for each analysis so that each test uses complete available data for the variables involved.
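This per-analysis handling is why the sample sizes differ between outputs: 1670 pairs in the t-test (df = 1669), 718 rows in the BMI ANOVA (2 + 715 degrees of freedom plus 1), and 747 rows in the linear regression (741 residual degrees of freedom plus 6 coefficients). A minimal sketch of the idea, using made-up records and a hypothetical helper `complete_cases`:

```python
# Made-up example rows; None marks a missing value.
records = [
    {"ScreGFR": 60.0, "CysCeGFR": 45.0, "BMI": 31.2, "Smoking": "NEVER"},
    {"ScreGFR": 80.0, "CysCeGFR": 50.0, "BMI": None, "Smoking": "FORMER"},
    {"ScreGFR": 55.0, "CysCeGFR": None, "BMI": 27.0, "Smoking": None},
]


def complete_cases(rows, columns):
    """Keep only rows with no missing value in the listed columns."""
    return [row for row in rows if all(row.get(col) is not None for col in columns)]


# Each analysis filters only on the variables it actually uses.
paired_rows = complete_cases(records, ["ScreGFR", "CysCeGFR"])
bmi_rows = complete_cases(records, ["ScreGFR", "CysCeGFR", "BMI"])

print(len(paired_rows), "rows for the paired comparison")
print(len(bmi_rows), "rows for the BMI analysis")
```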