This report was prepared by Group 6 as part of the final semester project for the ANACOVA Computation class.
WIDYA SAPUTRI AGUSTIN (22611001)
DIVIA PUTRI KHASANAH (22611174)
ADIRA FASYA NABILA PANE (22611111)
ARDILIA PUTRI MUHAFIDAH KUSUMAWATI (22611141)
End-stage liver disease is a major cause of morbidity and mortality (Hendrawan and Rumawas, 2017). According to the Global Burden of Disease Study, chronic liver diseases, including cirrhosis and liver cancer, are the fifteenth leading cause of death in Indonesia, with annual mortality exceeding 35,000 cases (GBD 2019 Indonesia Collaborators, 2022). In 2019, Indonesia recorded over 34,000 deaths due to liver cirrhosis, making it one of the top twenty causes of death nationally, up from approximately 30,000 deaths in 2010 (IHME, 2020).
The 2018 Basic Health Research (Riskesdas) reported that the prevalence of chronic liver disease (including cirrhosis and chronic hepatitis) in Indonesia was 1.4 percent of the adult population, with the highest proportion found in the 45 to 65 year age group (Ministry of Health of the Republic of Indonesia, 2018). A retrospective study of 184 patients at Dr. Wahidin Sudirohusodo Central General Hospital in Makassar identified hepatitis B (49.4 percent) and hepatitis C (28.3 percent) as the leading causes of liver cirrhosis. Approximately 65.2 percent of patients presented with decompensated cirrhosis, indicating delays in clinical management (Natsir et al., 2020).
Liver transplantation has become the definitive therapy for patients with end-stage liver failure, as other treatment methods only delay disease progression and complications (Hendrawan and Rumawas, 2017). The procedure has advanced significantly, including the development of living-donor liver transplantation, to reduce mortality among patients waiting for a transplant (Panahatan, Lalisang, and Kusumadewi, 2020). However, liver transplantation still faces major challenges, such as the scarcity of liver donors and the high cost of care (Hendrawan and Rumawas, 2017). These conditions underscore the need for further research into the factors that influence post-transplant survival time, including biological variables of the recipients.
One important biological factor is the ABO blood group. The ABO blood group system classifies individuals into four main groups (A, B, AB, and O) based on the presence of A and B antigens on the surface of red blood cells. In organ transplantation, donor-recipient blood group compatibility is critical, as incompatibility can trigger immunological reactions and affect graft survival (Romanos-Sirakis and Desai, 2025). A study by Lee et al. (2017) demonstrated that the recipient’s blood group may influence post-transplant survival outcomes. These findings emphasize the importance of examining the effect of blood group on transplant results, particularly in relation to patient survival duration.
Demographic variables, such as the recipient’s age, also play a crucial role in liver transplant outcomes. A large population-based cohort study found that recipient age remains a significant risk factor for mortality following liver transplantation, with older recipients having lower survival probabilities than younger patients (Gil et al., 2018). Wang et al. (2024) likewise indicated that recipient age significantly affects survival outcomes, with an increased risk of mortality observed in the elderly group (aged 70 years and above), even after accounting for other clinical factors. Therefore, an analysis of the effect of blood group on survival time should include age as a covariate.
Although several studies have examined the effect of individual factors such as blood group or age on liver transplant outcomes, research that analyzes both simultaneously within a single statistical framework remains very limited, particularly in developing countries such as Indonesia. A study by Barone et al. (2008) indicated that blood group may influence waiting time and recipient survival, but the analysis did not explicitly control for age as a confounding factor. Meanwhile, Gil et al. (2018) emphasized that recipient age is a strong predictor of postoperative mortality, yet did not include biological variables such as blood group in their multivariate analysis. There is therefore a clear research gap in understanding the effect of blood group on survival time while controlling for age as a covariate.
This study aims to analyze the effect of blood group on the survival time of liver transplant patients by controlling for age as a covariate using the ANCOVA method. This approach allows for a more objective evaluation of the differences in mean survival time across blood groups by neutralizing the influence of age as a covariate. The findings of this study are expected to contribute to more accurate and personalized clinical decision making, particularly in determining organ allocation priorities based on the biological and demographic characteristics of patients in Indonesia.
The “transplant” dataset comes from the survival package in R, which is frequently used for survival analysis; here it is used to study the influence of certain factors on survival time after transplantation. The data were collected by the United Network for Organ Sharing (UNOS) in the United States. The dataset can be accessed and downloaded through the following link: https://github.com/therneau/survival/blob/master/data/transplant.rda.
This study used three variables:
abo → This variable indicates the blood type of the transplant recipient. The data type is categorical, with four levels (A, AB, B, and O).
age → This variable contains the patient’s age at the time of transplantation. The data type is numeric (continuous), with units in years. Patient age is a risk factor for survival after transplantation, as older patients tend to have a higher risk of complications.
futime → This variable records the patient’s survival time after transplantation until an event occurs (e.g., death) or until the end of observation (censoring). The data type is numeric (continuous), with units in days. The higher the futime value, the longer the patient survived after transplantation.
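As a rough sketch of how these three columns could be pulled straight from the package rather than from a local CSV file, the snippet below assumes that the survival package is installed and that its bundled `transplant` data frame exposes the `abo`, `age`, and `futime` columns. The object name `tx` is introduced here for illustration only, and the packaged data may not match the preprocessed file analyzed in this report.

```r
library(survival)

# `tx` is a hypothetical name; keep only the three variables used in this study
tx <- transplant[, c("abo", "age", "futime")]

# Inspect the column types and the first few rows
str(tx)
head(tx)
```

Loading from the package avoids a machine-specific file path, at the cost of having to repeat any preprocessing explicitly.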
Analysis of Covariance (ANCOVA) is an analytical technique used to increase precision in experimental studies by adjusting for the influence of other uncontrolled independent variables (Zunita et al., 2018). In this method, independent variables are classified into two types based on their measurement scale: quantitative variables are referred to as covariates, while qualitative or categorical variables are treated as treatments or factors (Basiriyah et al., 2018). ANCOVA incorporates the covariates directly into the model to account for their potential confounding effects (Zunita et al., 2018). It can be used to compare the means of a dependent variable across two or more groups while controlling for one or more continuous covariates. One-way ANCOVA adjusts the group means for the covariate, thereby providing a more precise and valid test of differences among groups (Schwarz, 2025). The ANCOVA model with one covariate can be expressed as Equation (1) below (Basiriyah et al., 2018):
\[ y_{ij} = \mu + \tau_i + \beta x_{ij} + \epsilon_{ij} \tag{1} \]
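where:
\(y_{ij}\) is the response for observation \(j\) in treatment \(i\);
\(\mu\) is the overall mean;
\(\tau_i\) is the effect of treatment \(i\);
\(\beta\) is the regression coefficient of the covariate, assumed common to all treatments;
\(x_{ij}\) is the covariate value for observation \(j\) in treatment \(i\);
\(\epsilon_{ij}\) is the random error term.
Indices: \(i = 1, 2, \dots, a\) represents the treatment groups, and \(j = 1, 2, \dots, n_i\) represents the observations within treatment \(i\).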
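To make Equation (1) concrete, the following minimal sketch fits the same additive structure with `lm()` on simulated data. Everything in it (the data frame `toy`, the group labels, and the true values 10, 20, 30, and 0.5) is a hypothetical choice for illustration and is unrelated to the transplant data.

```r
# Hypothetical toy data: three groups and one continuous covariate
set.seed(1)
toy <- data.frame(
  group = factor(rep(c("g1", "g2", "g3"), each = 20)),
  x     = runif(60, min = 20, max = 70)
)

# Assumed true structure: group-specific level plus a common slope of 0.5
group_level <- c(g1 = 10, g2 = 20, g3 = 30)
toy$y <- unname(group_level[as.character(toy$group)]) + 0.5 * toy$x + rnorm(60)

# Additive ANCOVA model: one factor and one covariate, no interaction
fit <- lm(y ~ group + x, data = toy)

# With R's default treatment contrasts:
#   (Intercept) estimates mu + tau_1 at x = 0
#   groupg2 and groupg3 estimate tau_2 - tau_1 and tau_3 - tau_1
#   x estimates the common slope beta
coef(fit)
```

Under these assumptions the estimated slope should be close to 0.5 and the two group contrasts close to 10 and 20, which shows how the terms of Equation (1) map onto the fitted coefficients.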
This case study uses one-way ANCOVA, which involves three variables: one factor with two or more groups, a response variable, and a covariate. Here, the factor (categorical) variable is “abo”, the covariate is “age”, and the response variable is “futime”.
library(tidyverse)
library(rstatix)
library(car)
library(gridExtra)
library(survival)
Before further analysis, the data underwent a preprocessing procedure to ensure that the assumptions of the statistical model were met and to remove extreme outliers. This process aimed to produce results that are more valid and representative of the target population.
df <- read.csv("C://Users//LENOVO//Downloads//transplant.csv")
head(df)
##   abo age futime
## 1   A  61    139
## 2   B  50    250
## 3   O  44    364
## 4   A  37    122
## 5   B  64    234
## 6   A  61    157
Based on the output above, it can be seen that the transplant dataset consists of three columns, namely abo, age, and futime.
df %>% group_by(abo) %>% get_summary_stats(futime, type = "common")
## # A tibble: 4 × 11
##   abo   variable     n   min   max median   iqr  mean    sd    se    ci
##   <chr> <fct>    <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 A     futime      62   110   190   146   26.8 146.   21.1  2.69  5.37
## 2 AB    futime      12    16    60    27   26    33.4  15.7  4.53  9.97
## 3 B     futime      11   212   250   231   22.5 231.   14.3  4.32  9.64
## 4 O     futime      22   315   370   342.  33.5 342.   18.4  3.92  8.15
The summary above shows how survival time (futime) after transplantation differs by the patient’s blood type (abo). Patients with blood type O had the longest survival time, with a mean of 342 days and a range of 315 to 370 days. Blood type B also showed long survival, with a mean of 231 days, while blood type A had a mean of 146 days. Patients with blood type AB had the shortest survival time, with a mean of 33.4 days and a range of only 16 to 60 days. Overall, the pattern suggests that post-transplant survival time is associated with blood type, with blood type O showing the longest survival and AB the shortest.
df %>% group_by(abo) %>% get_summary_stats(age, type = "common")
## # A tibble: 4 × 11
##   abo   variable     n   min   max median   iqr  mean    sd    se    ci
##   <chr> <fct>    <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 A     age         62    26    70   47.5  16.8  48.5 10.5   1.33  2.66
## 2 AB    age         12    37    72   48    19.5  51.8 11.8   3.41  7.50
## 3 B     age         11    40    65   53     8.5  53.8  7.72  2.33  5.18
## 4 O     age         22    27    67   54    13.2  50.5 10.9   2.33  4.84
The age variable records the patient’s age at the time of transplantation. The output above shows that the group mean ages range from 48.5 to 53.8 years. Group B has the highest mean age at 53.8 years, followed by group AB (51.8 years), O (50.5 years), and A (48.5 years).
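# Scatter plot of survival time against age, coloured by blood type
p1 <- ggplot(df, aes(age, futime, colour = abo)) +
  geom_point(size = 3) +
  theme(legend.position = "top")
# Box plot of survival time by blood type, with jittered raw points
p2 <- ggplot(df, aes(x = abo, y = futime, colour = abo)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.2) +
  theme(legend.position = "top")
# Box plot of age by blood type, with jittered raw points
p3 <- ggplot(df, aes(x = abo, y = age, fill = abo)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.2) +
  theme(legend.position = "top")
# Arrange the three panels side by side
grid.arrange(p1, p2, p3, ncol = 3)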
In the scatter plot above, the x-axis represents age, the y-axis represents futime, and the color represents abo. The groups are clearly separated: blood type O has the highest futime values, above 300 days; blood type AB is concentrated below 100 days, indicating shorter survival; and blood types A and B fall roughly between 100 and 250 days. This separation suggests an effect of the factor on the response, which is tested formally in the following sections.
In the box plot of futime by blood type, blood type O has the highest median, around 340 days, with a narrow spread. Blood type B has a median of around 230 days and blood type A around 150 days. Blood type AB has the lowest median, below 50 days, also with a small range. The differences between groups are striking.
In the box plot of age by blood type, blood groups O and B have the highest median ages, around 53 to 54 years, while groups A and AB have medians around 48 years, in line with the summary statistics above. The age distribution is fairly wide, especially in group A, where ages range from 26 to 70 years. Because age differs somewhat between groups, it needs to be controlled for in the analysis.
Hypothesis:
H0: There is no difference in survival time (futime) between blood types (abo) after controlling for age (age)
H1: There is a difference in survival time between blood types after controlling for age
ancova_result <- anova_test(data = df, formula = futime ~ age + abo, type = 3, detailed = TRUE)
ancova_result
## ANOVA Table (type III tests)
##
##        Effect        SSn      SSd DFn DFd       F        p p<.05   ges
## 1 (Intercept) 131732.022 38857.96   1 102 345.789 1.53e-34     * 0.772
## 2         age    280.622 38857.96   1 102   0.737 3.93e-01       0.007
## 3         abo 933515.623 38857.96   3 102 816.809 3.84e-71     * 0.960
At a significance level of 0.05, blood type (abo) had a significant effect on patient survival time after transplantation: the p-value for abo was 3.84e-71, far below 0.05. The very large F-value (816.809) and the generalized eta squared (ges) of 0.960 indicate a very strong association between blood type and survival time in these data.
Age does not have a significant effect because the p-value is 0.393, which is greater than 0.05. This means that in this data, the patient’s age at the time of transplantation is not significantly associated with survival time after transplantation.
The intercept is significant, but this only indicates that the overall average survival time is different from zero, so it is not the primary focus in interpreting the effect of the variable.
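As a quick arithmetic check, the F-values and effect sizes in the ANCOVA table can be recomputed from its sums of squares and degrees of freedom. The sketch below copies the numbers from the output above; the formula used for `ges`, SSn / (SSn + SSd), is an assumption that happens to reproduce the reported values for this design, not a statement of how the package computes it in general.

```r
# Values copied from the ANCOVA table above
ss_abo <- 933515.623; df_abo <- 3
ss_age <- 280.622;    df_age <- 1
ss_err <- 38857.96;   df_err <- 102

# Mean square error shared by both tests
ms_err <- ss_err / df_err

# F = (effect sum of squares / effect df) / mean square error
f_abo <- (ss_abo / df_abo) / ms_err
f_age <- (ss_age / df_age) / ms_err

# Upper-tail p-value of the F distribution for the age effect
p_age <- pf(f_age, df_age, df_err, lower.tail = FALSE)

# Effect size under the assumed formula SSn / (SSn + SSd)
ges_abo <- ss_abo / (ss_abo + ss_err)
ges_age <- ss_age / (ss_age + ss_err)

round(c(F_abo = f_abo, F_age = f_age, p_age = p_age,
        ges_abo = ges_abo, ges_age = ges_age), 3)
```

The recomputed values should agree with the table up to rounding, which confirms that the reported F-statistics follow directly from the sums of squares.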
adj_means <- emmeans_test(data = df, formula = futime ~ abo, covariate = age)
get_emmeans(adj_means)
## # A tibble: 4 × 8
##     age abo   emmean    se    df conf.low conf.high method
##   <dbl> <fct>  <dbl> <dbl> <dbl>    <dbl>     <dbl> <chr>
## 1  49.8 A      146.   2.49   102    141.      151.  Emmeans test
## 2  49.8 AB      33.1  5.65   102     21.9      44.3 Emmeans test
## 3  49.8 B      230.   5.93   102    218.      242.  Emmeans test
## 4  49.8 O      342.   4.16   102    334.      350.  Emmeans test
The output above gives the estimated marginal means (EMMs), also known as least-squares means. EMMs are the group means of survival time for each blood type, adjusted for the patient’s age at the time of transplantation. Evaluated at the overall mean age of 49.8 years, blood type O had the longest adjusted survival time at 342 days, followed by blood type B at 230 days, blood type A at 146 days, and blood type AB at 33.1 days. The adjusted means are almost identical to the unadjusted means, which is consistent with the ANCOVA result that age did not have a significant effect on survival time. The ordering of the blood types therefore remains unchanged after controlling for age, with blood type O still showing the longest survival time.
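The idea behind an adjusted mean can be sketched in base R: for an additive ANCOVA model, the adjusted mean of a group is the model prediction for that group at the overall mean of the covariate, which equals the raw group mean corrected by the common slope. The snippet uses simulated data; `toy`, `grid`, and the true values are hypothetical and unrelated to the transplant data, and this is an illustration of the concept rather than a description of what `emmeans_test()` does internally.

```r
# Hypothetical toy data: three groups, one covariate
set.seed(2)
toy <- data.frame(
  group = factor(rep(c("g1", "g2", "g3"), each = 15)),
  x     = rnorm(45, mean = 50, sd = 10)
)
toy$y <- 5 * as.integer(toy$group) + 0.3 * toy$x + rnorm(45)

fit <- lm(y ~ group + x, data = toy)

# Route 1: predict each group at the overall mean of the covariate
grid <- data.frame(
  group = factor(levels(toy$group), levels = levels(toy$group)),
  x     = mean(toy$x)
)
emm_pred <- predict(fit, newdata = grid)

# Route 2: raw group mean minus slope times (group covariate mean - overall mean)
b <- coef(fit)[["x"]]
emm_formula <- tapply(toy$y, toy$group, mean) -
  b * (tapply(toy$x, toy$group, mean) - mean(toy$x))

# The two routes agree up to numerical precision
all.equal(unname(emm_pred), as.vector(emm_formula))
```

When the estimated slope is close to zero, as it is for age in this study, the correction term is small and the adjusted means stay close to the raw group means.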
Hypothesis:
H0: There is no difference in survival time after transplantation between each blood type pair after controlling for age.
H1: There is a difference in survival time after transplantation between each blood type pair after controlling for age.
emmeans_test(data = df, formula = futime ~ abo, covariate = age, p.adjust.method = "fdr")
## # A tibble: 6 × 9
##   term    .y.    group1 group2    df statistic        p    p.adj p.adj.signif
## * <chr>   <chr>  <chr>  <chr>  <dbl>     <dbl>    <dbl>    <dbl> <chr>
## 1 age*abo futime A      AB       102      18.3 6.22e-34 9.33e-34 ****
## 2 age*abo futime A      B        102     -13.0 2.05e-23 2.05e-23 ****
## 3 age*abo futime A      O        102     -40.3 1.66e-64 4.99e-64 ****
## 4 age*abo futime AB     B        102     -24.2 5.20e-44 1.04e-43 ****
## 5 age*abo futime AB     O        102     -44.1 3.23e-68 1.94e-67 ****
## 6 age*abo futime B      O        102     -15.5 1.77e-28 2.13e-28 ****
Post hoc pairwise comparisons of the estimated marginal means, with age as a covariate and p-values adjusted by the FDR method, showed that all six blood type pairs differed significantly in survival time after transplantation. At the 5% significance level, every adjusted p-value (p.adj) was far smaller than 0.05.
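The FDR adjustment can be reproduced from the raw p-values in the table with base R's `p.adjust()`. The sketch below copies the six rounded p-values from the output above; the vector names are labels added here for readability. Because the inputs are rounded, the results can differ from the reported p.adj column in the last digit.

```r
# Raw p-values copied from the pairwise comparison table above
p_raw <- c(A_AB = 6.22e-34, A_B = 2.05e-23, A_O = 1.66e-64,
           AB_B = 5.20e-44, AB_O = 3.23e-68, B_O = 1.77e-28)

# Benjamini-Hochberg adjustment ("fdr" is an alias for "BH" in p.adjust):
# each p-value is multiplied by n / rank, where rank 1 is the smallest p-value,
# and the results are then forced to be monotone
p_fdr <- p.adjust(p_raw, method = "fdr")

signif(p_fdr, 3)
```

The smallest p-value (AB versus O) is multiplied by 6, while the largest (A versus B) is left unchanged, which matches the pattern seen in the p.adj column.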
Significant differences were observed between blood group A and AB, B, and O. Blood group AB also differed significantly from B and O. Additionally, blood group B differed significantly from O.
A negative test statistic indicates that the second blood type in the pair (group2) has the longer survival time. For example, the comparison between A and O yielded a negative statistic, indicating that patients with blood type O have a longer survival time than those with blood type A. The largest difference was observed between blood types AB and O, meaning that patients with blood type O survive significantly longer than those with blood type AB.
Thus, these post hoc results reinforce previous findings that blood type significantly influences patient survival time after transplantation. Blood type O consistently has the longest survival time compared to other blood types.
Testing the residual normality assumption is important because the ANCOVA model assumes that the residuals, that is, the differences between the observed and predicted values, are normally distributed. If the residuals are not normally distributed, the results of significance tests, such as the p-value, can be inaccurate, because the F-statistic in ANCOVA relies on this normality assumption.
Hypothesis:
H0: The residuals are normally distributed.
H1: The residuals are not normally distributed.
shapiro.test(resid(aov(futime ~ abo + age, data = df)))
##
## Shapiro-Wilk normality test
##
## data: resid(aov(futime ~ abo + age, data = df))
## W = 0.97702, p-value = 0.06037
Based on the results of the residual normality test using the Shapiro-Wilk test, the W value was 0.97702 with a p-value of 0.06037. Because the p-value is greater than the 0.05 significance level, there is insufficient evidence to reject the null hypothesis that the residuals are normally distributed.
In other words, the residuals from the model that includes blood type and age as predictors of survival time after transplantation can be considered normally distributed. Therefore, the normality assumption of the ANCOVA is met, allowing for a more valid interpretation of the model.
The homogeneity of variance assumption requires that the variance of the response be equal across the groups defined by the factor. If the group variances are not homogeneous, the standard errors may be unreliable and the F-test in the ANCOVA may be invalid. Homogeneity of variance ensures that comparisons between groups are fair because the spread of the data is similar in each group.
Hypothesis:
H0: The variance of survival time (futime) is homogeneous or the same across all blood types (abo).
H1: There is a difference in the variance of survival time between blood types, so the variance is not homogeneous.
bartlett.test(futime ~ abo, data = df)
##
## Bartlett test of homogeneity of variances
##
## data: futime by abo
## Bartlett's K-squared = 3.3499, df = 3, p-value = 0.3408
Based on the results of the homogeneity of variance test using Bartlett’s test, the Bartlett’s K-squared value was 3.3499 with a p-value of 0.3408. Because the p-value is greater than the 0.05 significance level, there is insufficient evidence to reject the null hypothesis that the variance of survival times between blood types is homogeneous.
Therefore, it can be concluded that the variance of survival times for each blood type is relatively similar, thus meeting the assumption of homogeneity of variance in the ANCOVA. Meeting this assumption is crucial for the validity of the ANCOVA results and the reliability of the model interpretation.
This assumption is essential in ANCOVA: there should be no interaction between the categorical factor and the covariate. It can be evaluated by adding an interaction term between abo and age to the model. If this assumption is violated, the treatment effect differs across levels of the covariate. In such cases, it is advisable to consider alternative approaches to ANCOVA, such as the Johnson-Neyman technique.
Hypothesis:
H0: There is no interaction between blood type and age, meaning the effect of age on survival time is the same for all blood types (homogeneous regression slope).
H1: There is an interaction between blood type and age, meaning the effect of age on survival time differs across blood types (non-homogeneous regression slope).
Anova(aov(futime ~ abo * age, data = df), type = 3)
## Anova Table (Type III tests)
##
## Response: futime
##             Sum Sq Df  F value    Pr(>F)
## (Intercept)  49326  1 126.8330 < 2.2e-16 ***
## abo          36026  3  30.8785 3.557e-14 ***
## age            354  1   0.9100    0.3424
## abo:age        357  3   0.3057    0.8212
## Residuals    38501 99
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Based on the results of the test for homogeneity of regression slopes, the p-value for the interaction between blood type and age (abo:age) was 0.8212. Because this p-value is well above the 0.05 significance level, there is insufficient evidence to reject the null hypothesis that there is no interaction between blood type and age on survival time after transplantation.
Therefore, it can be concluded that the effect of age on survival time is the same for all blood types, meaning the regression slope is homogeneous. This assumption has been met, so the ANCOVA model remains appropriate, and the analysis results can be interpreted with greater validity.
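The interaction test can also be checked by hand from the table above. The sketch below copies the rounded sums of squares and degrees of freedom for the abo:age term and the residuals, so the recomputed F-value and p-value are expected to agree with the output only up to rounding.

```r
# Values copied from the Type III table for the interaction model
ss_int <- 357;   df_int <- 3    # abo:age
ss_res <- 38501; df_res <- 99   # Residuals

# F-statistic for the interaction and its upper-tail p-value
f_int <- (ss_int / df_int) / (ss_res / df_res)
p_int <- pf(f_int, df_int, df_res, lower.tail = FALSE)

# Dropping the interaction pools its sum of squares and df into the error term;
# compare with SSd = 38857.96 and DFd = 102 in the ANCOVA table
ss_pooled <- ss_int + ss_res
df_pooled <- df_int + df_res

round(c(F_int = f_int, p_int = p_int), 3)
c(SS_pooled = ss_pooled, DF_pooled = df_pooled)
```

The pooled error term matching the error term of the additive ANCOVA model shows that the two analyses are nested, so the interaction test is in effect a comparison between the model with and without separate slopes.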
The relationship between the covariate and the dependent variable in each group of independent variables must be linear. This assumption can be verified by examining the scatterplot between the covariate and the dependent variable in each group of independent variables. If the data points form a pattern resembling a straight line, then the linearity assumption has been met.
ggplot(df, aes(age, futime, colour = abo)) +
  geom_point(size = 3) +
  geom_smooth(method = "lm", aes(fill = abo), alpha = 0.1) +
  theme(legend.position = "top")
## `geom_smooth()` using formula = 'y ~ x'
Within each blood type, the data points follow an approximately straight-line pattern. Furthermore, the fitted lines for the groups are approximately parallel and do not appear to intersect. Therefore, it can be concluded that the linearity assumption is met. This agrees with the test for homogeneity of regression slopes, which found no interaction between the covariate (age) and the factor (abo).
The ANCOVA analysis revealed a significant difference in the mean survival time (futime) across blood type groups after controlling for the effect of age as a covariate. Blood type O was associated with the longest average survival time following liver transplantation, while blood type AB had the shortest. These findings are consistent with the systematic review and meta-analysis by Lee et al. (2017), which indicated a link between blood type compatibility and the success of liver transplantation. This supports the view that blood type is not merely an administrative factor in donor and recipient matching but may also play a role in the biological processes that occur after transplantation.
This finding is supported by the results of the post hoc tests, which showed that all blood type pairs differed significantly. This strengthens the conclusion that blood type is an important determinant of long-term transplant success.
Age is often seen as an important factor in medical prognosis. However, this study shows that age does not have a significant effect on survival time based on the data used.
All basic assumptions in the ANCOVA analysis, including residual normality, homogeneity of variances, homogeneity of regression slopes, and linearity, were satisfied. Therefore, the results can be considered valid and reliable.
Based on these results, it is recommended that healthcare practitioners consider blood type as an indicator in predicting patient prognosis after transplantation or similar medical procedures. Further research is also recommended to further explore other biological factors that may interact with blood type in influencing patient survival, as well as to test the generalizability of these findings to a broader population.
Badan Penelitian dan Pengembangan Kesehatan. (2018). Laporan Nasional Riskesdas 2018. Kementerian Kesehatan RI.
Barone, M., Avolio, A. W., Di Leo, A., Burra, P., & Francavilla, A. (2008). ABO blood group-related waiting list disparities in liver transplant candidates: Effect of the MELD adoption. Transplantation, 85(6), 844–849.
Basiriyah, S., Listiowarni, I., & Hapantenda, A. K. W. (2020). Analisis penerapan game-based student response system pada flipped classroom biologi SMAN 5 Pamekasan. Konvergensi, 16(2), 62–70.
GBD 2019 Indonesia Collaborators. (2022). The state of health in Indonesia’s provinces, 1990–2019: A systematic analysis for the Global Burden of Disease Study 2019. The Lancet Global Health, 10(10), e1373–e1387.
Gil, E., Kim, J. M., Jeon, K., Park, H., Kang, D., Cho, J., & Park, J. (2018). Recipient age and mortality after liver transplantation: A population-based cohort study. Transplantation, 102(12), 2025–2032.
Hendrawan, S., & Rumawas, M. (2017). Transplantasi hepatosit: Terapi potensial yang menjanjikan untuk sirosis hepatik. Ebers Papyrus, 16(2), 115–124.
Institute for Health Metrics and Evaluation. (2020). GBD Compare: Indonesia Liver Cirrhosis Mortality.
Lee, E. C., Kim, S. H., & Park, S.-J. (2017). Outcomes after liver transplantation in accordance with ABO compatibility: A systematic review and meta-analysis. World Journal of Gastroenterology, 6516–6533.
Natsir, R., Mubin, H. T., & Yusuf, I. (2020). Gambaran klinis dan laboratorium penderita sirosis hati. Makassar Journal of Medical Sciences, 9(2), 132–138.
Panahatan, L. T., Lalisang, T. J. M., & Kusumadewi, I. (2020). Kualitas hidup donor transplantasi hati pada resipien hati nonsintas. eJurnal Kedokteran Indonesia, 8(2), 105–112.
Renesh, B. (2022). ANCOVA (Analysis of Covariance). Retrieved from https://www.reneshbedre.com/blog/ancova.html
Romanos-Sirakis, E. C., & Desai, D. (2025). ABO blood group system. StatPearls Publishing.
Schwarz, W. (2025). The ANCOVA model for comparing two groups: A tutorial emphasizing statistical distribution theory. Frontiers in Psychology, 16.
Wang, M., Ge, J., Ha, N., Shui, A. M., Huang, C.-Y., Cullaro, G., & Lai, J. C. (2024). Clinical characteristics associated with post-transplant survival among adults ≥70 years old undergoing liver transplantation. Journal of Clinical Gastroenterology, 516–521.
Zunita, P. O., Dewi, H., & S. G. (2018). Efektifitas model discovery learning dan guided discovery ditinjau dari keterampilan pemecahan masalah matematika terhadap hasil belajar. Journal for Lesson and Learning Studies.