library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.4     v dplyr   1.0.7
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   2.0.1     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
setwd("C:/Users/Latoya/Downloads")
blood <- read_csv("Blood Storage.csv")
## Rows: 316 Columns: 20
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## dbl (20): RBC Age Group, Median RBC Age, Age, AA, FamHx, PVol, TVol, T Stage...
## 
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(blood)
## # A tibble: 6 x 20
##   `RBC Age Group` `Median RBC Age`   Age    AA FamHx  PVol  TVol `T Stage`   bGS
##             <dbl>            <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>     <dbl> <dbl>
## 1               3               25  72.1     0     0  54       3         1     3
## 2               3               25  73.6     0     0  43.2     3         2     2
## 3               3               25  67.5     0     0 103.      1         1     3
## 4               2               15  65.8     0     0  46       1         1     1
## 5               2               15  63.2     0     0  60       2         1     2
## 6               3               25  65.4     0     0  45.9     2         1     1
## # ... with 11 more variables: BN+ <dbl>, OrganConfined <dbl>, PreopPSA <dbl>,
## #   PreopTherapy <dbl>, Units <dbl>, sGS <dbl>, AnyAdjTherapy <dbl>,
## #   AdjRadTherapy <dbl>, Recurrence <dbl>, Censor <dbl>, TimeToRecurrence <dbl>
boxplot(blood$PreopPSA, main = "Preop Prostate Specific Antigen (PSA)", ylab = "PSA (ng/mL)")

summary(blood$PreopPSA)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   1.300   4.980   6.200   8.185   9.000  40.100       3

BoxPlot PSA

The median preoperative PSA was 6.2 ng/mL. The data range from 1.3 to 40.1 ng/mL. Multiple points are outliers that fall above the upper fence of the boxplot.
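As a rough check on the outlier statement, the upper fence can be worked out from the quartiles printed by `summary()` above. This is only a sketch: `boxplot()` draws its whiskers from hinges (see `fivenum()`), which can differ slightly from the quartiles used here, and the variable names are my own.

```r
# Quartiles copied from the summary(blood$PreopPSA) output above
q1 <- 4.98
q3 <- 9.00

# Conventional 1.5 * IQR rule for the upper boxplot fence
iqr <- q3 - q1
upper_fence <- q3 + 1.5 * iqr
upper_fence

# If the blood tibble is loaded, count the points above the fence
if (exists("blood")) {
  print(sum(blood$PreopPSA > upper_fence, na.rm = TRUE))
}
```

With these quartiles the fence comes to about 15 ng/mL, so the maximum of 40.1 ng/mL lies well above it.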

plot(blood$`Median RBC Age`, blood$TimeToRecurrence, 
     main = "Median RBC Age and Time to Recurrence", xlab = "Median RBC age of all transfused units (days)", 
     ylab = "Time to Recurrence of Prostate Cancer (months)")

plot(blood$PreopPSA, blood$TimeToRecurrence, 
     main = "PreOp PSA and Time to Recurrence of Prostate Cancer", xlab = "Preop PSA (ng/mL)", 
     ylab = "Time to Recurrence of Prostate Cancer (months)")

There does not seem to be a relationship between preoperative PSA and the number of months until recurrence of prostate cancer.
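One way to put a number on this impression is a rank correlation, sketched below. The helper `rank_cor()` is hypothetical (my own name), and Spearman's method is my choice because both variables look skewed. The `Censor` column suggests some recurrence times are censored, so this is only a crude check and not a survival analysis.

```r
# Hypothetical helper: Spearman rank correlation using only complete pairs
rank_cor <- function(x, y) {
  cor(x, y, method = "spearman", use = "complete.obs")
}

# If the blood tibble is loaded, apply it to the two plotted variables
if (exists("blood")) {
  print(rank_cor(blood$PreopPSA, blood$TimeToRecurrence))
}
```

A value near 0 would be consistent with the scatterplot; values near -1 or 1 would indicate a strong monotone relationship.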

hist(blood$TimeToRecurrence)

The distribution of time to recurrence is not normal; it is right-skewed. Most cases of prostate cancer recurrence happen within the first 10 months following radical prostatectomy.
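Both statements can be checked numerically. The sketch below uses two hypothetical helpers of my own, `skewness()` (a simple moment-based estimate, positive for right skew) and `share_within()`. It treats every value of `TimeToRecurrence` the same way, which is an assumption given that the data set also has a `Censor` column.

```r
# Hypothetical helper: moment-based sample skewness (positive = right skew)
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Hypothetical helper: share of non-missing values at or below a cutoff
share_within <- function(x, cutoff) {
  mean(x <= cutoff, na.rm = TRUE)
}

# If the blood tibble is loaded, apply both to time to recurrence
if (exists("blood")) {
  print(skewness(blood$TimeToRecurrence))
  print(share_within(blood$TimeToRecurrence, 10))
}
```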

hist(blood$PreopPSA)

The normal PSA level of patients not in this study varies by age; however, research shows that it tends to fall within 0 to 5 ng/mL. I would have expected the histogram to show most of the patients in this study having a higher PSA than shown above. It would be interesting to know how many patients underwent chemotherapy or received other therapeutic agents to shrink the tumor prior to surgery.
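Two quick tabulations might help with these questions. The sketch below assumes that the `PreopTherapy` column listed in the `head()` output records preoperative treatment; its coding and exactly which therapies it covers are not stated here, so that part is an assumption. `share_above()` is a hypothetical helper of my own.

```r
# Hypothetical helper: share of non-missing values above a threshold
share_above <- function(x, threshold) {
  mean(x > threshold, na.rm = TRUE)
}

# If the blood tibble is loaded, summarise PSA above 5 ng/mL and preop therapy
if (exists("blood")) {
  print(share_above(blood$PreopPSA, 5))
  # Assumption: PreopTherapy flags treatment received before surgery
  print(table(blood$PreopTherapy, useNA = "ifany"))
}
```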