load("out.rData")
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.1.3
One subjectID for each patient
Calcium: part of the routine metabolic panel, used to assess kidney, bone, or nerve disease. The normal range is 2.2-2.5 mmol/L.
Phosphorus: associated with kidney function, nutritional status, and a variety of chronic illnesses. The normal range is 1-1.5 mmol/L.
summary(out$Calcium)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 1.747 2.270 2.344 2.343 2.411 4.160 297
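# Keep only rows with a Calcium value; na.omit(out) would drop every row that has an NA in any column.
ggplot(out[!is.na(out$Calcium), ], aes(x = Calcium)) +
  geom_histogram(aes(y = ..density..), binwidth = 0.01, fill = "lightblue") +
  geom_density() +
  geom_vline(xintercept = c(2.2, 2.5)) +  # limits of the normal range
  ylab("") +
  xlab("Calcium level (mmol/L)")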
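# Keep only rows with a Phosphorus value; na.omit(out) would drop every row that has an NA in any column.
ggplot(out[!is.na(out$Phosphorus), ], aes(x = Phosphorus)) +
  geom_histogram(aes(y = ..density..), binwidth = 0.01, fill = "lightblue") +
  geom_density() +
  geom_vline(xintercept = c(1, 1.5)) +  # limits of the normal range
  ylab("") +
  xlab("Phosphorus level (mmol/L)")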
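The vertical lines in the two histograms mark the normal ranges. As a rough numeric companion, the fraction of measurements outside a reference interval can be computed directly. This is a minimal sketch: `frac_outside` is a hypothetical helper that is not part of the original analysis, and the values are invented for illustration.

```r
# Hypothetical helper: fraction of non-missing values that fall
# outside the reference interval [low, high].
frac_outside <- function(x, low, high) {
  x <- x[!is.na(x)]
  mean(x < low | x > high)
}

# Toy calcium values in mmol/L, invented for illustration only.
calcium_demo <- c(2.1, 2.3, 2.4, NA, 2.6, 2.35)
frac_outside(calcium_demo, 2.2, 2.5)
```

With the real data the same call would be `frac_outside(out$Calcium, 2.2, 2.5)` or `frac_outside(out$Phosphorus, 1, 1.5)`.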
Red Blood Cells (RBC): a count of the number of red blood cells per volume of blood. Both increases and decreases are associated with abnormal conditions. The normal range is 4.2-6.9 \(\times 10^{12}\)/L.
[127] “RBC Distribution Width”
[128] “RBC Morphology”
[129] “RBC Morphology: Anisocytosis”
[130] “RBC Morphology: Macrocytosis”
[131] “RBC Morphology: Microcytosis”
[132] “RBC Morphology: Spherocytes”
[133] “Red Blood Cells (RBC)”
boxplot(out[,133], horizontal=TRUE, pch=5, xlab="Red Blood Cells (L)")
The distribution is strongly positively skewed, so several transformations are compared below to select one for further analysis.
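rbc <- out[,133]
# Candidate transformations for reducing positive skew
trans1 <- sqrt(rbc)   # square root
trans2 <- log10(rbc)  # base-10 logarithm
trans3 <- 1/rbc       # reciprocal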
par(mfrow=c(1,3))
qqnorm(trans1, main="Square root", pch=15)
qqnorm(trans2, main="Logarithmic", pch=15)
qqnorm(trans3, main="Reciprocal", pch=15)
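The Q-Q plots compare the transformations visually. A numeric summary can support the choice: the sketch below computes a moment-based sample skewness for each candidate. The `skew` function is a hypothetical helper, one common definition among several, and the data are simulated right-skewed values standing in for the RBC column, not values from `out`.

```r
# Hypothetical helper: moment-based sample skewness, m3 / m2^1.5,
# computed after dropping missing values.
skew <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Simulated right-skewed positive values (assumption: stand-in for RBC).
set.seed(1)
rbc_demo <- rlnorm(500, meanlog = 1.5, sdlog = 0.4)

# Skewness of the raw values and of each candidate transformation.
sk <- sapply(
  list(raw = rbc_demo, sqrt = sqrt(rbc_demo),
       log10 = log10(rbc_demo), reciprocal = 1 / rbc_demo),
  skew
)
sk
```

A value close to zero suggests a roughly symmetric distribution, so the transformation with the smallest absolute skewness is a reasonable candidate.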
The ALSFRS scale is a set of 10 assessments of motor function used to gauge symptom severity. Each item is scored from 0 to 4, where 4 is normal function and 0 is no function. The scores for the individual questions are summed to give the ALSFRS score.
In ALSFRS-R, the modified version of ALSFRS, assessment 10 (respiratory function) is further divided into three questions to better reflect the importance (weighting) of respiratory changes within the scale.
[14] “ALSFRS_R _Total”
[15] “ALSFRS_Total”
[111] “Q1_Speech”
[112] “Q10_Respiratory”
[113] “Q2_Salivation”
[114] “Q3_Swallowing”
[115] “Q4_Handwriting”
[116] “Q5_Cutting”
[117] “Q5a_Cutting without Gastrostomy”
[118] “Q5b_Cutting with Gastrostomy”
[119] “Q6_Dressing and Hygiene”
[120] “Q7_Turning in Bed”
[121] “Q8_Walking”
[122] “Q9_ Climbing_Stairs”
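The total score described above is a plain sum of the item scores. The sketch below illustrates this on toy data. The column names `q1` to `q10` are placeholders rather than the names in `out`, and the scores are randomly generated.

```r
# Toy data: 3 hypothetical patients, 10 items each scored 0 to 4.
# Column names q1..q10 are placeholders, not the names used in `out`.
set.seed(42)
items <- as.data.frame(matrix(sample(0:4, 30, replace = TRUE), nrow = 3))
names(items) <- paste0("q", 1:10)

# ALSFRS total: sum of the 10 item scores, so it ranges from 0 to 40.
items$total <- rowSums(items[, paste0("q", 1:10)])
items$total
```

For ALSFRS-R, where the respiratory item is split into three questions, the same sum runs over 12 items and ranges from 0 to 48.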
par(mfrow=c(1,1))
boxplot(out[,14], horizontal=TRUE, pch=5, xlab="Summed scores in ALSFRS-R assessments")
The ALSFRS-R total score (column 14) is slightly negatively skewed, so several power transformations are compared below to select one for further analysis.
frs <- out[,14]
# Candidate power transformations for reducing negative skew
trans1 <- frs^2  # squared
trans2 <- frs^3  # cubed
trans3 <- frs^4  # fourth power
par(mfrow=c(1,3))
qqnorm(trans1, main="Squared", pch=15)
qqnorm(trans2, main="Cubed", pch=15)
qqnorm(trans3, main="To the fourth", pch=15)
Forced vital capacity (FVC): the volume of air that can be forcibly blown out after full inspiration, measured in liters.
FVC normal: the expected value for a non-ALS patient (control) matched by gender, age, and height.
FVC percent: FVC in liters divided by FVC normal, expressed as a percentage.
[53] “fvc”
[54] “fvc_normal”
[55] “fvc_percent”
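The relationship between the three FVC columns can be written out as a short calculation. This is a sketch with invented values in liters. It assumes the percentage is 100 times the ratio, which is consistent with the scale of `fvc_percent` in the data.

```r
# Toy values in liters, invented for illustration.
fvc        <- c(3.1, 2.4, 4.0)
fvc_normal <- c(4.0, 3.2, 4.0)

# Percent of predicted: measured volume relative to the matched normal value.
fvc_percent <- 100 * fvc / fvc_normal
round(fvc_percent, 1)
```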
Slow vital capacity (SVC): the maximum volume of air that can be exhaled slowly after a slow maximum inhalation, measured in liters.
[142] “svc”
[143] “svc_normal”
[144] “svc_percent”
summary(out$fvc_percent)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 22.72 72.10 82.81 83.00 93.80 149.80 163
summary(out$svc_percent)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 53.87 75.22 83.98 85.08 93.46 122.40 2025
Some patients received placebo treatments, while others received experimental treatments (medication).
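# Convert to character first so "Unknown" can be assigned even if the column is a factor.
out$treatment_group <- as.character(out$treatment_group)
out$treatment_group[is.na(out$treatment_group)] <- "Unknown"
lv <- levels(factor(out$treatment_group))
# Refer to columns by name inside aes() rather than through out$...
ggplot(out, aes(x = fvc_percent, fill = treatment_group)) +
  geom_density(alpha = .6)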
## Warning: Removed 163 rows containing non-finite values (stat_density).
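A per-group summary can complement the density plot. The sketch below recodes missing group labels and computes the median FVC percent per group on toy data. The group labels "Active" and "Placebo" and all values are assumptions made for illustration, not taken from `out`.

```r
# Toy data; labels and values are invented for illustration.
demo <- data.frame(
  fvc_percent     = c(80, 90, NA, 70, 85, 95),
  treatment_group = c("Active", "Active", "Placebo", "Placebo", NA, "Placebo"),
  stringsAsFactors = FALSE
)

# Recode missing group labels, as done for the real data.
demo$treatment_group[is.na(demo$treatment_group)] <- "Unknown"

# Median FVC percent per group, ignoring missing measurements.
group_medians <- tapply(demo$fvc_percent, demo$treatment_group,
                        median, na.rm = TRUE)
group_medians
```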