Peter Olejua
Oct-25-2015
Context: Patients diagnosed with prostate cancer must undergo surgery to determine whether the cancer has spread to the surrounding lymph nodes. Nodal involvement determines the treatment strategy.
Goal: Get an accurate assessment of nodal involvement without surgery.
Data Analyst's Solution: Develop a prediction algorithm based on the available variables, using data from patients who have already had surgery.
Nodal_Involvement Shiny App: Helps the analyst exclude variables that show no significant relationship with the response variable.
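The exclusion rule can be sketched as a simple p-value screen. This is a minimal illustration, not the app's code: the `screen` function and the 0.05 threshold are assumptions (the source states no cutoff), and the only inputs are the two p-values reported in the test output later in this document.

```python
ALPHA = 0.05  # hypothetical significance threshold; the source does not state one


def screen(p_values, alpha=ALPHA):
    """Split predictors into (kept, excluded) by comparing each p-value to alpha."""
    kept = sorted(name for name, p in p_values.items() if p < alpha)
    excluded = sorted(name for name, p in p_values.items() if p >= alpha)
    return kept, excluded


# p-values reported in this document for the tests of r against xray and aged
reported = {"xray": 0.002191, "aged": 0.2417}
kept, excluded = screen(reported)
print("keep:", kept, "exclude:", excluded)
```

A fixed cutoff is only a first-pass filter. A variable that looks unrelated on its own can still matter in combination with others, so the analyst may want to treat exclusions as provisional.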
The data set has 53 rows and 6 variables:
r: An indicator of nodal involvement.
aged: The patient's age, dichotomized into under 60 (0) and 60 or over (1).
stage: A measurement of the size and position of the tumour, observed by palpation with the fingers via the rectum. A value of 1 indicates a more serious case of the cancer.
grade: Another indicator of the seriousness of the cancer, determined by a pathology reading of a biopsy taken by needle before surgery. A value of 1 indicates a more serious case of the cancer.
xray: A third measure of the seriousness of the cancer taken from an X-ray reading. A value of 1 indicates a more serious case of the cancer.
acid: The level of acid phosphatase in the blood serum.
Counts per level for each variable:
level   r    aged   stage   grade   xray   acid
0       33   29     26      32      37     23
1       20   24     27      21      16     30
Using the app, the analyst found the predictors most and least strongly related to the response (xray and aged, respectively):
Pearson's Chi-squared test
data: r and xray
X-squared = 9.3826, df = 1, p-value = 0.002191
Pearson's Chi-squared test
data: r and aged
X-squared = 1.3708, df = 1, p-value = 0.2417
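The statistics above can be checked by hand from the level counts alone. The sketch below is an illustration under two assumptions: the tests were run without a continuity correction, and the 2x2 cell counts (not shown in this document) can be recovered by searching over tables that match the reported margins. All function names are hypothetical helpers.

```python
import math


def pearson_chi2(a, b, c, d):
    """Uncorrected Pearson chi-squared statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom


def p_value_df1(x2):
    """Upper-tail p-value of a chi-squared variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x2 / 2))


def tables_matching(r_margin, x_margin, reported, tol=5e-4):
    """Find 2x2 tables with the given margins whose statistic matches `reported`.

    Rows are r = 0, 1; columns are predictor = 0, 1.
    """
    r0, r1 = r_margin
    x0, x1 = x_margin
    matches = []
    for d in range(min(r1, x1) + 1):  # d = count with r = 1 and predictor = 1
        c = r1 - d
        b = x1 - d
        a = r0 - b
        if min(a, b, c) < 0:
            continue
        x2 = pearson_chi2(a, b, c, d)
        if abs(x2 - reported) < tol:
            matches.append(((a, b, c, d), x2, p_value_df1(x2)))
    return matches


# Margins come from the level counts listed earlier: r is 33/20, xray 37/16, aged 29/24.
xray_matches = tables_matching((33, 20), (37, 16), reported=9.3826)
aged_matches = tables_matching((33, 20), (29, 24), reported=1.3708)
for name, found in (("xray", xray_matches), ("aged", aged_matches)):
    for table, x2, p in found:
        print(name, table, round(x2, 4), round(p, 6))
```

Under these assumptions the search finds exactly one table per pair: (28, 5, 9, 11) for xray and (16, 17, 13, 7) for aged, read as (r=0 & x=0, r=0 & x=1, r=1 & x=0, r=1 & x=1). The recomputed p-values agree with the reported 0.002191 and 0.2417.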