There is some confusion about the coding of the STA variable in the icu dataset, which is part of the aplore3 package and is also described on the official website of “Applied Logistic Regression” (Hosmer, Lemeshow, Sturdivant, 3rd edition). It all boils down to whether 160 of the 200 patients died or lived.
This is what aplore3 and the book’s website say:
library(aplore3)
data(icu)
str(icu)
## 'data.frame': 200 obs. of 21 variables:
## $ id : int 4 8 12 14 27 28 32 38 40 41 ...
## $ sta : Factor w/ 2 levels "Died","Lived": 2 1 1 1 2 1 1 1 1 1 ...
## $ age : int 87 27 59 77 76 54 87 69 63 30 ...
## $ gender: Factor w/ 2 levels "Male","Female": 2 2 1 1 2 1 2 1 1 2 ...
## $ race : Factor w/ 3 levels "White","Black",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ ser : Factor w/ 2 levels "Medical","Surgical": 2 1 1 2 2 1 2 1 2 1 ...
## $ can : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
## $ crn : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
## $ inf : Factor w/ 2 levels "No","Yes": 2 2 1 1 2 2 2 2 1 1 ...
## $ cpr : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
## $ sys : int 80 142 112 100 128 142 110 110 104 144 ...
## $ hra : int 96 88 80 70 90 103 154 132 66 110 ...
## $ pre : Factor w/ 2 levels "No","Yes": 1 1 2 1 2 1 2 1 1 1 ...
## $ type : Factor w/ 2 levels "Elective","Emergency": 2 2 2 1 2 2 2 2 1 2 ...
## $ fra : Factor w/ 2 levels "No","Yes": 2 1 1 1 1 2 1 1 1 1 ...
## $ po2 : Factor w/ 2 levels "> 60","<= 60": 2 1 1 1 1 1 1 2 1 1 ...
## $ ph : Factor w/ 2 levels ">= 7.25","< 7.25": 2 1 1 1 1 1 1 1 1 1 ...
## $ pco : Factor w/ 2 levels "<= 45","> 45": 2 1 1 1 1 1 1 1 1 1 ...
## $ bic : Factor w/ 2 levels ">= 18","< 18": 1 1 1 1 1 1 1 2 1 1 ...
## $ cre : Factor w/ 2 levels "<= 2.0","> 2.0": 1 1 1 1 1 1 1 1 1 1 ...
## $ loc : Factor w/ 3 levels "Nothing","Stupor",..: 1 1 1 1 1 1 1 1 1 1 ...
summary(icu$sta)
## Died Lived
## 160 40
From the aplore3 package:
icu$sta = Vital Status at hospital discharge (1: Died, 2: Lived)
icu$loc = Level of consciousness at ICU admission (1: No coma or deep stupor, 2: Deep stupor, 3: Coma)
summary(icu$loc)
## Nothing Stupor Coma
## 185 5 10
Btw, level “Nothing” stands for no coma or deep stupor. This is the page that briefly describes some altered states of consciousness.
However, after making some graphs and attending the course on Canvas, I’d agree that there are in fact 160 patients who lived and 40 who died. This is the graph:
fit <- glm(sta ~ age + loc, data = icu, family = binomial(link = logit))
summary(fit)$coef
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.48445767 8.134410e-01 -4.28360232 1.838915e-05
## age 0.02820048 1.219358e-02 2.31273116 2.073742e-02
## locStupor 18.34967346 1.065825e+03 0.01721641 9.862640e-01
## locComa 3.05930062 8.254756e-01 3.70610664 2.104697e-04
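library(ggplot2)
# glm() treats the first factor level ("Died") as failure, so the fitted values
# are probabilities of the level labelled "Lived"; adding 1 aligns them with
# the factor's integer codes (1, 2) on the discrete y axis.
ggplot(data = icu, aes(x = age, y = sta)) +
  geom_point(color = "transparent") +
  geom_point(data = icu, aes(x = age, y = fitted(fit) + 1, color = loc))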
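To make the direction of the effect concrete, here is a minimal sketch that turns the coefficients printed above into predicted probabilities. The age of 60 is a hypothetical value picked only for illustration; the coefficients are copied from the summary(fit)$coef output, so the snippet runs without the package.

```r
# Coefficients copied from the summary(fit)$coef output above.
b0     <- -3.48445767   # intercept (loc = "Nothing", age = 0)
b_age  <-  0.02820048   # slope for age
b_coma <-  3.05930062   # shift for loc = "Coma"

age <- 60  # hypothetical patient age, chosen for illustration only

# With a factor response, binomial glm() models the probability of NOT being
# in the first level, i.e. the probability of the level labelled "Lived".
p_nothing <- plogis(b0 + b_age * age)
p_coma    <- plogis(b0 + b_age * age + b_coma)

round(c(Nothing = p_nothing, Coma = p_coma), 3)
```

With the labels as shipped, the second number is the probability of "Lived" for a patient in coma, and it comes out far higher than for a patient with neither coma nor stupor.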
It is not logical that having neither coma nor stupor is associated with death more strongly than coma or stupor is. Hence, we can say that 160 patients were alive at hospital discharge and 40 were dead. The logistic regression is a bit off: the very large slope and SE for loc = "Stupor" are most likely due to the small number of such cases (only 5), all of which have the same outcome, which is a case of quasi-complete separation. But this isn’t important, because I just wanted to show the general trend.
icu[icu$loc == "Stupor", c(2, 21)]
## sta loc
## 50 Lived Stupor
## 74 Lived Stupor
## 100 Lived Stupor
## 186 Lived Stupor
## 196 Lived Stupor
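The five Stupor rows above all share one outcome, which is what inflates the estimate and its standard error. A minimal sketch of that effect, assuming only the sta by loc counts printed in this post (age is left out, so the numbers will not match the fit above exactly); `counts` and `rows` are names introduced here for illustration:

```r
# Counts of sta by loc, copied from the outputs in this post
# (labels as shipped in aplore3).
counts <- data.frame(
  loc = factor(c("Nothing", "Nothing", "Stupor", "Stupor", "Coma", "Coma"),
               levels = c("Nothing", "Stupor", "Coma")),
  sta = factor(rep(c("Died", "Lived"), times = 3), levels = c("Died", "Lived")),
  n   = c(158, 27, 0, 5, 2, 8)
)

# Expand to one row per patient (200 rows in total).
rows <- counts[rep(seq_len(nrow(counts)), counts$n), c("loc", "sta")]

# glm() is expected to warn that fitted probabilities of 0 or 1 occurred:
# the Stupor level is perfectly predicted, so its estimate keeps growing.
fit_loc <- glm(sta ~ loc, data = rows, family = binomial)

se <- summary(fit_loc)$coefficients[, "Std. Error"]
se_stupor <- se[["locStupor"]]
se_coma   <- se[["locComa"]]
c(Stupor = se_stupor, Coma = se_coma)
```

The standard error for Coma stays moderate because that level contains both outcomes, while the one for Stupor blows up.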
icu[icu$loc == "Coma", c(2, 21)]
## sta loc
## 22 Died Coma
## 46 Lived Coma
## 57 Lived Coma
## 79 Lived Coma
## 103 Lived Coma
## 119 Lived Coma
## 147 Lived Coma
## 154 Lived Coma
## 163 Lived Coma
## 190 Died Coma
summary(icu[icu$loc == "Nothing", c(2, 21)])
## sta loc
## Died :158 Nothing:185
## Lived: 27 Stupor : 0
## Coma : 0
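If you agree with the reading above, one way to work with the data is to swap the two labels. The following is only a sketch under that assumption, again built from the counts printed in this post so it runs without the package; `sta_swapped` is a hypothetical column name, and on the real data the same `levels<-` step would be applied to a copy of icu$sta.

```r
# Counts of sta by loc, copied from the outputs in this post
# (labels as shipped in aplore3).
counts <- data.frame(
  loc = factor(c("Nothing", "Nothing", "Stupor", "Stupor", "Coma", "Coma"),
               levels = c("Nothing", "Stupor", "Coma")),
  sta = factor(rep(c("Died", "Lived"), times = 3), levels = c("Died", "Lived")),
  n   = c(158, 27, 0, 5, 2, 8)
)

# Swap the two labels; the underlying integer codes stay untouched.
counts$sta_swapped <- counts$sta
levels(counts$sta_swapped) <- c("Lived", "Died")

# Cross-tabulate consciousness level against the relabelled outcome.
tab <- xtabs(n ~ loc + sta_swapped, data = counts)
tab

# Share of patients who died within each level, under the swapped labels.
death_rate <- prop.table(tab, margin = 1)[, "Died"]
round(death_rate, 2)
```

Under the swapped labels, 27 of the 185 patients with neither coma nor stupor died (about 15%), against 8 of 10 in coma and all 5 in deep stupor, which is the clinically plausible ordering.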