library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.4     v dplyr   1.0.7
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   2.0.1     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
setwd("C:/Users/Latoya/Downloads")
blood<-read_csv("Blood Storage.csv")
## Rows: 316 Columns: 20
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## dbl (20): RBC Age Group, Median RBC Age, Age, AA, FamHx, PVol, TVol, T Stage...
## 
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(blood)
## # A tibble: 6 x 20
##   `RBC Age Group` `Median RBC Age`   Age    AA FamHx  PVol  TVol `T Stage`   bGS
##             <dbl>            <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>     <dbl> <dbl>
## 1               3               25  72.1     0     0  54       3         1     3
## 2               3               25  73.6     0     0  43.2     3         2     2
## 3               3               25  67.5     0     0 103.      1         1     3
## 4               2               15  65.8     0     0  46       1         1     1
## 5               2               15  63.2     0     0  60       2         1     2
## 6               3               25  65.4     0     0  45.9     2         1     1
## # ... with 11 more variables: BN+ <dbl>, OrganConfined <dbl>, PreopPSA <dbl>,
## #   PreopTherapy <dbl>, Units <dbl>, sGS <dbl>, AnyAdjTherapy <dbl>,
## #   AdjRadTherapy <dbl>, Recurrence <dbl>, Censor <dbl>, TimeToRecurrence <dbl>
boxplot(blood$PreopPSA, main = "Preop Prostate Specific Antigen (PSA)", ylab = "PSA (ng/mL)")

summary(blood$PreopPSA)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   1.300   4.980   6.200   8.185   9.000  40.100       3

BoxPlot PSA

The median preoperative PSA was 6.2 ng/mL. The data range from 1.3 to 40.1 ng/mL. Multiple points are outliers that fall above the upper fence (the third quartile plus 1.5 times the interquartile range, roughly 15 ng/mL based on the summary above).
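The number of high outliers can be checked numerically rather than read off the plot. Below is a minimal sketch using base R's `boxplot.stats()`, which applies the same 1.5 × IQR rule as `boxplot()`. The helper name `count_high_outliers()` and the vector `psa_demo` are made up for illustration and are not the study data.

```r
# Hypothetical helper: count points above the upper whisker of a boxplot.
count_high_outliers <- function(x) {
  bs <- boxplot.stats(x)          # missing values are dropped internally
  upper_whisker <- bs$stats[5]    # largest value still inside the upper fence
  sum(bs$out > upper_whisker)     # bs$out holds every point flagged as an outlier
}

# Toy values for illustration only (not the study data)
psa_demo <- c(1.3, 4.9, 5.0, 6.2, 9.0, 9.1, 40.1)
n_high <- count_high_outliers(psa_demo)
print(n_high)
```

On the study data the call would be `count_high_outliers(blood$PreopPSA)`.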

plot(blood$`Median RBC Age`, blood$TimeToRecurrence, 
     main = "Median RBC Age and Time to Recurrence", xlab = "Median RBC age of all transfused units (days)", 
     ylab = "Time to Recurrence of Prostate Cancer (months)")

plot(blood$PreopPSA, blood$TimeToRecurrence, 
     main = "PreOp PSA and Time to Recurrence of Prostate Cancer", xlab = "Preop PSA (ng/mL)", 
     ylab = "Time to Recurrence of Prostate Cancer (months)")

The scatterplot does not show a clear relationship between preoperative PSA and the number of months until recurrence of prostate cancer.
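The visual impression could be backed up with a correlation coefficient. The sketch below uses Spearman's rank correlation, which I am assuming is a reasonable choice here because both variables look skewed. `rank_cor()` is a hypothetical helper and the two vectors are toy values, not the study data.

```r
# Hypothetical helper: rank correlation between two numeric vectors,
# using only the pairs where both values are present.
rank_cor <- function(x, y) {
  cor(x, y, use = "complete.obs", method = "spearman")
}

# Toy values for illustration only (not the study data)
psa_demo  <- c(4.2, 5.1, 6.3, NA, 8.8, 12.0)
time_demo <- c(30, 12, 45, 20, 8, 25)
rho_demo <- rank_cor(psa_demo, time_demo)
print(rho_demo)
```

On the study data the call would be `rank_cor(blood$PreopPSA, blood$TimeToRecurrence)`. A value near 0 would agree with the scatterplot.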

hist(blood$TimeToRecurrence, main = "Time to Recurrence of Prostate Cancer",xlab = "Time to Recurrence (months)")

The distribution of time to recurrence is not normal; it is right-skewed. Most cases of prostate cancer recurrence happen within the first 10 months following radical prostatectomy.
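The "within the first 10 months" statement can be checked directly instead of estimated from the bars. This is a minimal sketch that assumes `Recurrence` is coded 1 for patients whose cancer recurred. `share_within()` is a hypothetical helper and the vectors are toy values.

```r
# Hypothetical helper: among patients with a recorded recurrence,
# the share whose time to recurrence is at most `cutoff` months.
share_within <- function(time, recurred, cutoff = 10) {
  mean(time[recurred == 1] <= cutoff, na.rm = TRUE)
}

# Toy values for illustration only (not the study data)
time_demo     <- c(3, 8, 15, 40, 6, 22)
recurred_demo <- c(1, 1, 1, 0, 1, 0)
share_demo <- share_within(time_demo, recurred_demo)
print(share_demo)
```

On the study data the call would be `share_within(blood$TimeToRecurrence, blood$Recurrence)`.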

hist(blood$PreopPSA, main = "PreOperative PSA", xlab = "PreopPSA (ng/mL)")

The normal PSA level of men not in this study varies by age; however, research shows that it tends to fall within 0 to 5 ng/mL. I would have expected the histogram to show most of the patients in this study with a higher PSA than shown above. It would be interesting to know how many patients underwent chemotherapy or received other therapeutic agents to shrink the tumor prior to surgery.
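The data set has a `PreopTherapy` column that may answer this question. Its coding is not shown in this report, so the sketch below only tabulates whatever codes are present and keeps missing values visible. `tabulate_coded()` is a hypothetical helper and `therapy_demo` is a toy vector.

```r
# Hypothetical helper: counts and percentages for a coded variable,
# keeping missing values as their own row.
tabulate_coded <- function(x) {
  counts <- table(x, useNA = "ifany")
  data.frame(
    value   = names(counts),
    n       = as.vector(counts),
    percent = round(100 * as.vector(counts) / sum(counts), 1)
  )
}

# Toy values for illustration only (not the study data)
therapy_demo <- c(0, 0, 1, 0, NA, 1, 0, 0)
therapy_tab <- tabulate_coded(therapy_demo)
print(therapy_tab)
```

On the study data the call would be `tabulate_coded(blood$PreopTherapy)`.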

RBCage<-blood$`RBC Age Group`
Recurrence<-blood$Recurrence
table(RBCage, Recurrence)
##       Recurrence
## RBCage  0  1
##      1 87 19
##      2 86 17
##      3 89 18
chisq.test(RBCage,Recurrence)
## 
##  Pearson's Chi-squared test
## 
## data:  RBCage and Recurrence
## X-squared = 0.082401, df = 2, p-value = 0.9596

RBC Age vs Recurrence

Null hypothesis: There is no difference in the recurrence of prostate cancer between RBC age groups.

Alternative hypothesis: There is a difference in the recurrence of prostate cancer between RBC age groups.

Since the p-value is high (0.9596), we fail to reject the null hypothesis: there is no compelling evidence of a difference in the recurrence of prostate cancer given the age of the RBCs used in perioperative blood transfusions.

For RBC Age Group: 1 = younger blood, 2 = middle, 3 = older.

For Recurrence: 0 = no recurrence, 1 = recurrence.
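The test can be reproduced from the printed counts alone, which also makes it easy to look at the recurrence rate in each group and at the expected counts behind the chi-squared approximation. This is a minimal sketch; the object names are my own, and the counts are copied from the `table(RBCage, Recurrence)` output above.

```r
# Counts copied from the table(RBCage, Recurrence) output
rbc_tab <- matrix(
  c(87, 86, 89,   # Recurrence = 0 for RBC age groups 1, 2, 3
    19, 17, 18),  # Recurrence = 1 for RBC age groups 1, 2, 3
  nrow = 3,
  dimnames = list(RBCage = c("1", "2", "3"), Recurrence = c("0", "1"))
)

# Proportion of patients with recurrence within each RBC age group
recur_rate <- prop.table(rbc_tab, margin = 1)[, "1"]
print(round(recur_rate, 3))

# Same Pearson chi-squared test, run on the table of counts
rbc_test <- chisq.test(rbc_tab)
print(rbc_test$expected)    # expected counts under the null hypothesis
print(rbc_test$statistic)
print(rbc_test$p.value)
```

The recurrence rates in the three groups are all close to one another, which is consistent with the large p-value.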

AfAm<-blood$AA
FamilyHx<-blood$FamHx
table(AfAm, FamilyHx)
##     FamilyHx
## AfAm   0   1
##    0 200  61
##    1  48   7
chisq.test(AfAm, FamilyHx)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  AfAm and FamilyHx
## X-squared = 2.45, df = 1, p-value = 0.1175

AfAm: 0 = non-African American, 1 = African American.

Family History: 0 = No, 1 = Yes.

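The table runs counter to national statistics, since African American men tend to have a higher rate of prostate cancer. I would have expected a larger share of the African American patients to report a family history of prostate cancer, but only 7 of 55 (about 13%) did, compared with 61 of 261 (about 23%) of the non-African American patients.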

Given the p-value (0.1175), there is no compelling evidence that ethnicity is associated with a family history of prostate cancer.
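As with the RBC age comparison, this result can be reproduced from the printed 2 x 2 counts. The sketch below also runs Fisher's exact test as a cross-check. That test is my addition and was not part of the original analysis, and the object names are my own.

```r
# Counts copied from the table(AfAm, FamilyHx) output
aa_tab <- matrix(
  c(200, 48,   # FamilyHx = 0 for AfAm = 0, 1
     61,  7),  # FamilyHx = 1 for AfAm = 0, 1
  nrow = 2,
  dimnames = list(AfAm = c("0", "1"), FamilyHx = c("0", "1"))
)

# Proportion with a family history within each group
famhx_rate <- prop.table(aa_tab, margin = 1)[, "1"]
print(round(famhx_rate, 3))

# chisq.test() applies Yates' continuity correction to 2 x 2 tables by default
aa_chisq <- chisq.test(aa_tab)
print(aa_chisq$statistic)
print(aa_chisq$p.value)

# Cross-check with an exact test (an addition, not in the original analysis)
aa_fisher <- fisher.test(aa_tab)
print(aa_fisher$p.value)
```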