1 Introduction

Renal failure is the kidneys’ inability to filter and dispose of waste products from the blood. As a result, waste fluid keeps circulating in the body, which leads to further complications and ultimately death. Symptoms of renal failure include loss of appetite, loss of concentration, nausea and difficulty breathing. It can be caused by high blood pressure, diabetes, heart disease, kidney stones, dehydration and certain medications.

It is estimated that 1 in 3 American adults is at high risk of renal failure and 1 in 9 has renal disease but is unaware of it (Chronic Kidney Disease, 2024). There is no cure for renal failure; the only treatment options are regular dialysis or a kidney transplant. One of the main complications of transplantation is the risk that the recipient’s body will reject the donor’s kidney. This happens because the recipient’s immune system recognises the kidney as foreign and starts attacking it. If rejection is not treated in a timely manner, the transplanted kidney will be damaged. It is therefore crucial to determine whether the compatibility of donated kidneys can be assessed. This report uses data from the Australia and New Zealand Dialysis and Transplant Registry to determine whether donor age and gender have an effect on the outcome of the transplant.

2 Results

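# Load the data (read.csv is base R, so no extra package is needed)
anz <- read.csv("/Users/alyssayamada/Desktop/AMED3002/ANZ DATA/42264_AnzdataTransplants.csv")

# Keep only donors recorded as female (F) or male (M), dropping unknown genders
anz_filtered <- subset(anz, donorgendercode == "F" | donorgendercode == "M")
# Drop incomplete rows from the full dataset used in the later analyses
# (anz_filtered was created above and is not affected by this step)
anz <- anz[complete.cases(anz), ]
# Recode the outcome: K (lost to follow-up) becomes NA and is excluded by table(),
# while P (failed) and Z (died) are combined into a single "PZ" level
anz_filtered$endtransplantcode <- ifelse(anz_filtered$endtransplantcode == "K", NA,
                                         ifelse(anz_filtered$endtransplantcode %in% c("P", "Z"), "PZ", anz_filtered$endtransplantcode))
# Contingency table of recipient outcome (rows) by donor gender (columns)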
table <- table(anz_filtered$endtransplantcode, anz_filtered$donorgendercode)

#Chi squared test
chisq.test(table)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  table
## X-squared = 2.0087, df = 1, p-value = 0.1564
#Checking assumptions 
test = chisq.test(table)
test$expected >= 5
##     
##         F    M
##   PZ TRUE TRUE
##   S  TRUE TRUE
#Odds ratio
OR <- (table[1, 1] * table[2, 2])/(table[2, 1] * table[1, 2])
OR
## [1] 1.251201

A chi-squared test of independence was conducted. The null hypothesis is that there is no association between donor gender and recipient outcome; the alternative hypothesis is that there is an association. All expected cell counts were at least 5, so the assumption of the test was met. The test found no significant association between donor gender and recipient outcome (X-squared = 2.0087, df = 1, p = 0.1564), so there is no evidence that outcomes differ between patients who receive kidneys from female donors and those who receive kidneys from male donors.
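As a rough illustration of what the test computes, the sketch below rebuilds the expected counts and the Yates-corrected statistic by hand for a 2x2 table and compares them with chisq.test(). The counts in tab are made up for the example and are not the registry values; the names tab, expected, stat and p_value are assumptions introduced here.

```r
# Hypothetical counts for illustration only (NOT the registry data)
tab <- matrix(c(30, 70, 25, 75), nrow = 2,
              dimnames = list(outcome = c("PZ", "S"), donor = c("F", "M")))

# Expected count under independence: row total * column total / grand total
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Yates' continuity correction shrinks each |observed - expected| by up to 0.5
yates <- min(0.5, abs(tab - expected))
stat <- sum((abs(tab - expected) - yates)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1) = 1 for a 2x2 table
p_value <- pchisq(stat, df = 1, lower.tail = FALSE)

# Compare the hand calculation with the built-in test
builtin <- chisq.test(tab)
print(c(by_hand = stat, builtin = unname(builtin$statistic)))
```

The same expected matrix is what the assumption check (all expected counts at least 5) is applied to.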

Although the chi-squared test found no significant association between donor gender and recipient outcome, the odds ratio of 1.251 indicates that, in this sample, kidneys from female donors had 1.251 times the odds of a failed/death outcome compared with kidneys from male donors. Because the chi-squared test was not significant, this difference could be due to random chance, and further research is needed for more conclusive results.
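One way to judge how much an odds ratio could vary by chance is to attach a confidence interval to it. The sketch below uses the standard Wald interval on the log scale. The counts are hypothetical and are not the registry values, and the names tab, or, se_log_or and ci are assumptions introduced here.

```r
# Hypothetical counts for illustration only (NOT the registry data)
tab <- matrix(c(30, 70, 25, 75), nrow = 2,
              dimnames = list(outcome = c("PZ", "S"), donor = c("F", "M")))

# Odds of a PZ outcome for female donors divided by the odds for male donors
or <- (tab["PZ", "F"] * tab["S", "M"]) / (tab["S", "F"] * tab["PZ", "M"])

# Wald standard error of log(OR): square root of the summed reciprocal counts
se_log_or <- sqrt(sum(1 / tab))

# Approximate 95% interval, built on the log scale and back-transformed
ci <- exp(log(or) + c(-1, 1) * qnorm(0.975) * se_log_or)

print(round(c(OR = or, lower = ci[1], upper = ci[2]), 3))
```

An interval that contains 1 is consistent with a non-significant chi-squared test, since an odds ratio of 1 means equal odds in both groups.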

library(MASS)
library(tidyverse)
library(naniar)
library(dendextend)
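# Extract the columns needed for clustering
new_anz <- anz[, c("endtransplantcode", "transplantperiod", "donorage")]
# Standardise the two numeric variables so that both contribute equally
DataScaled <- new_anz %>%
  dplyr::select(-endtransplantcode) %>%
  scale()

# k-means with 2 clusters; the seed makes the random start reproducible
set.seed(51773)
kM <- kmeans(DataScaled, 2)
# Principal components are used as the plotting axes
pca <- prcomp(DataScaled)
df <- data.frame(pca$x, cluster = paste("cluster", kM$cluster, sep = "_"), new_anz)
# Colour shows the transplant outcome, shape shows the k-means cluster
ggplot(df, aes(x = PC1, y = PC2, colour = endtransplantcode, shape = cluster)) + geom_point()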
Figure 2: K-means clusters (point shape) plotted on the first two principal components and coloured by transplant outcome, where K is lost to follow-up, S is survived, P is failed and Z is death.

In terms of point shape, there is a clear boundary between the two k-means clusters (triangles and circles), with little overlap. However, there appears to be no clustering by transplant status, so the clusters do not correspond to transplant outcome.
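The visual impression can be backed up with a cross-tabulation of cluster membership against outcome. The sketch below shows the idea on simulated stand-in data, not the registry data; the data frame toy and the names scaled, km and xtab are assumptions introduced here, with column names chosen to mirror the report.

```r
set.seed(1)
# Simulated stand-in data (NOT the registry data); column names mirror the report
toy <- data.frame(
  endtransplantcode = sample(c("S", "P", "Z"), 60, replace = TRUE),
  transplantperiod = rexp(60, rate = 1 / 2000),
  donorage = round(runif(60, 18, 70))
)

# Same steps as in the report: scale the numeric columns, then run k-means
scaled <- scale(toy[, c("transplantperiod", "donorage")])
km <- kmeans(scaled, centers = 2, nstart = 10)

# Rows are clusters, columns are outcomes
xtab <- table(cluster = km$cluster, outcome = toy$endtransplantcode)

# Share of each outcome within each cluster (each row sums to 1)
print(round(prop.table(xtab, margin = 1), 2))
```

If the rows of the proportion table look alike, the clusters carry little information about the outcome, which is what the scatterplot suggests.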

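# 2-way ANOVA with interaction: transplant period by donor age and donor gender
# (fitted to anz, which has not been restricted to F and M donors)
anova1 <- aov(anz$transplantperiod ~ anz$donorage * anz$donorgendercode, anz)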
summary(anova1)
##                                   Df    Sum Sq  Mean Sq F value  Pr(>F)   
## anz$donorage                       1  25569021 25569021   6.898 0.00931 **
## anz$donorgendercode                2   6677320  3338660   0.901 0.40793   
## anz$donorage:anz$donorgendercode   2  11121833  5560917   1.500 0.22560   
## Residuals                        197 730192722  3706562                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
boxplot(transplantperiod ~ donorgendercode, anz)
boxplot(donorage ~ donorgendercode, anz)
Figure 3: Boxplots of transplant period (first plot) and donor age (second plot) by donor gender, used to visually check the equal-variance and normality assumptions of the 2-way ANOVA.

A 2-way ANOVA was conducted to test whether the donor’s age, the donor’s gender and their interaction are associated with the transplant period. The donor’s age was significantly associated with the longevity of the kidney transplant (p = 0.00931), whereas the donor’s gender was not (p = 0.40793). The interaction between the donor’s age and gender was also not significant (p = 0.22560). Note that the gender term has 2 degrees of freedom because the model was fitted to anz, which still includes the unknown-gender category as a third level.

The assumptions of an ANOVA are normality of the residuals, equal variance across groups and independence of observations. Boxplots were made to visualise normality and variance. The boxplots have a similar shape across the gender groups, so normality and equal variance were assumed.
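Boxplots give only a visual check, so formal tests can be used alongside them. The sketch below runs a Shapiro-Wilk test on the model residuals and a Bartlett test of equal variances between gender groups. It uses simulated stand-in data, not the registry data, and the names toy, fit, sw and bt are assumptions introduced here.

```r
set.seed(1)
# Simulated stand-in data (NOT the registry data); column names mirror the report
toy <- data.frame(
  transplantperiod = rexp(120, rate = 1 / 2000),
  donorage = round(runif(120, 18, 70)),
  donorgendercode = sample(c("F", "M"), 120, replace = TRUE)
)

fit <- aov(transplantperiod ~ donorage * donorgendercode, data = toy)

# Shapiro-Wilk: a small p-value suggests the residuals are not normal
sw <- shapiro.test(residuals(fit))

# Bartlett: a small p-value suggests unequal variance between gender groups
bt <- bartlett.test(transplantperiod ~ donorgendercode, data = toy)

print(c(shapiro_p = sw$p.value, bartlett_p = bt$p.value))
```

Bartlett's test is itself sensitive to non-normality, so its result is best read together with the normality check.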

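# Simple linear regression of transplant period on donor age
regression <- lm(transplantperiod ~ donorage, data = anz)
# Produces the four standard diagnostic plots
plot(regression)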
Figure 4: Regression diagnostic plots: residuals vs fitted (linearity and homoscedasticity), normal Q-Q (normality of residuals), scale-location (homoscedasticity) and residuals vs leverage (influential outliers).


The residuals vs fitted plot tests for linearity and homoscedasticity. The points appear to be randomly scattered around 0 with no discernible pattern, so the assumptions of linearity and equal variance are met. The Q-Q plot has a slight S-shaped curve, which indicates that the residuals do not follow a normal distribution. Most of the deviation is in the tails, which suggests that there are more data points than expected at the extremes and that these could skew the results. The line in the scale-location plot is not horizontal, which hints at an unequal spread; however, there is no discernible pattern in the points, which suggests roughly equal variance in the residuals. The residuals vs leverage plot highlights outliers in the dataset: the points at around 0.035 on the x-axis are clearly separated from the rest and are likely to influence the fitted model.
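The quantities behind the residuals vs leverage plot can also be extracted as numbers. The sketch below fits a simple regression to simulated stand-in data, not the registry data, and pulls out leverage and Cook's distance. The names toy, fit, lev, cook and flagged, and the 2p/n cut-off, are assumptions introduced here; the cut-off is a common rule of thumb rather than a fixed rule.

```r
set.seed(1)
# Simulated stand-in data (NOT the registry data); column names mirror the report
toy <- data.frame(donorage = round(runif(100, 18, 70)))
toy$transplantperiod <- 3000 - 20 * toy$donorage + rnorm(100, sd = 500)

fit <- lm(transplantperiod ~ donorage, data = toy)

# Leverage: how unusual each observation's predictor value is
lev <- hatvalues(fit)

# Cook's distance: how much the fitted model changes if the point is removed
cook <- cooks.distance(fit)

# Rule of thumb (an assumption): flag leverage above 2 * p / n
p <- length(coef(fit))
flagged <- which(lev > 2 * p / nrow(toy))

# Show the three most influential observations
print(head(sort(cook, decreasing = TRUE), 3))
```

Refitting the model without the flagged points and comparing the coefficients is one simple way to see how much they matter.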

3 Conclusion

No significant association was found between the donor’s gender and either the status of the recipient or the transplant period. However, the donor’s age was significantly associated with the transplant period. The results of this report should be interpreted with caution, as the regression diagnostic plots show that the residuals do not follow a normal distribution and could have unequal variance.

4 References

Chronic Kidney Disease. (2024, February 16). Johns Hopkins Medicine. https://www.hopkinsmedicine.org/health/conditions-and-diseases/chronic-kidney-disease

Data used: ANZ Data from the Australia and New Zealand Dialysis and Transplant Registry.