Contingency tables
- Consider an example of two treatment groups, e.g. Control group and Treatment group
- The patients are also classified according to the grade of disease, with grades I, II, and III
- The counts for each combination of treatment group and disease grade can be entered row by row using rbind(), as in the code below
# Row form using rbind()
rbind(c(22, 29), c(28, 23), c(27, 24))
##      [,1] [,2]
## [1,]   22   29
## [2,]   28   23
## [3,]   27   24
- In this example there are 77 patients in the control arm (sum of the first column) and 76 in the treatment arm (sum of the second column)
- The rows represent the three disease grades: 51 patients with grade I disease, 51 with grade II, and 51 with grade III (sums of the rows)
- The contingency table expresses the division of the patients into the six possible groups (two treatment arms by three grades)
- The same can be achieved with cbind(), where each numeric vector has its values expressed as a column
# Column form using cbind()
cbind(c(22, 28, 27), c(29, 23, 24))
##      [,1] [,2]
## [1,]   22   29
## [2,]   28   23
## [3,]   27   24
- Yet another way to create the same data is through the use of the matrix() command
- The number of rows is specified with the nrow argument
- By default, the table is filled column by column
matrix(c(22, 28, 27, 29, 23, 24),
       nrow = 3)
##      [,1] [,2]
## [1,]   22   29
## [2,]   28   23
## [3,]   27   24
- The table can also be filled row by row by setting byrow = TRUE
- Note the change in the order of the values needed to achieve this
matrix(c(22, 29, 28, 23, 27, 24),
       byrow = TRUE,
       nrow = 3)
##      [,1] [,2]
## [1,]   22   29
## [2,]   28   23
## [3,]   27   24
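- As a quick check, the four constructions shown so far can be compared with identical(); the variable names below are illustrative and do not appear elsewhere in these notes

```r
# Four ways of entering the same 3 x 2 table of counts
byRbind  <- rbind(c(22, 29), c(28, 23), c(27, 24))
byCbind  <- cbind(c(22, 28, 27), c(29, 23, 24))
byColumn <- matrix(c(22, 28, 27, 29, 23, 24), nrow = 3)
byRow    <- matrix(c(22, 29, 28, 23, 27, 24), byrow = TRUE, nrow = 3)

# Compare each alternative against the rbind() version
identical(byRbind, byCbind)
identical(byRbind, byColumn)
identical(byRbind, byRow)
```

- If any comparison is not TRUE, the most likely cause is the order of the values passed to matrix()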
- Storing the table in a variable allows row and column names to be added
contingencyTable <- matrix(c(22, 29, 28, 23, 27, 24),
                           byrow = TRUE,
                           nrow = 3)
rownames(contingencyTable) <- c("Grade I",
                                "Grade II",
                                "Grade III")
colnames(contingencyTable) <- c("Control",
                                "Treatment")
contingencyTable
##           Control Treatment
## Grade I        22        29
## Grade II       28        23
## Grade III      27        24
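- The arm and grade totals quoted earlier (77 and 76 patients per arm, 51 per grade) can be checked on the named table; this is a minimal sketch that rebuilds the table in one call using dimnames, and withTotals is an illustrative name

```r
# Rebuild the named table in a single call
contingencyTable <- matrix(c(22, 29, 28, 23, 27, 24),
                           byrow = TRUE,
                           nrow = 3,
                           dimnames = list(c("Grade I", "Grade II", "Grade III"),
                                           c("Control", "Treatment")))

colSums(contingencyTable)   # patients per treatment arm
rowSums(contingencyTable)   # patients per disease grade
sum(contingencyTable)       # total number of patients

# Append the row and column sums to the table itself
withTotals <- addmargins(as.table(contingencyTable))
withTotals
```

- addmargins() expects a table or array, hence the conversion with as.table()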
- Data in a data.frame can also be expressed as a contingency table
set.seed(123)
df <- data.frame(Grade = sample(c("Grade I",
                                  "Grade II",
                                  "Grade III"),
                                size = 153,
                                replace = TRUE),
                 Group = sample(c("Control",
                                  "Treatment"),
                                size = 153,
                                replace = TRUE))
head(df)
##       Grade     Group
## 1   Grade I   Control
## 2 Grade III   Control
## 3  Grade II   Control
## 4 Grade III Treatment
## 5 Grade III   Control
## 6   Grade I   Control
- The table() command cross-tabulates the two columns, counting the patients in each combination of grade and group
table(df$Grade,
      df$Group)
##
##             Control Treatment
##   Grade I        22        29
##   Grade II       28        23
##   Grade III      27        24
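- A data frame with one row per patient can also be built directly from known counts, which makes the round trip through table() easy to check; the sketch below assumes the counts used throughout these notes, and the names counts, long, patients and tab are illustrative

```r
# Known cell counts, with named dimensions
counts <- matrix(c(22, 29, 28, 23, 27, 24),
                 byrow = TRUE,
                 nrow = 3,
                 dimnames = list(Grade = c("Grade I", "Grade II", "Grade III"),
                                 Group = c("Control", "Treatment")))

# Long format: one row per cell, with the count in the Freq column
long <- as.data.frame(as.table(counts))

# Repeat each cell's row Freq times to get one row per patient
patients <- long[rep(seq_len(nrow(long)), times = long$Freq),
                 c("Grade", "Group")]

# Cross-tabulating the patient-level data recovers the original counts
tab <- table(patients$Grade,
             patients$Group)
tab

# The formula interface gives the same cross-tabulation with labelled dimensions
xtabs(~ Grade + Group, data = patients)
```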
Stacked bar chart
- A stacked bar chart is a simple way of visualizing the data
barplot(table(df$Grade,
              df$Group),
        legend.text = TRUE,
        main = "Number of patients with grade of disease (by treatment group)",
        xlab = "Treatment group",
        ylab = "Grade count",
        col = c("deepskyblue",
                "orange",
                "gray"),
        border = NA,
        las = 1)

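- By swapping the order of the two arguments to table(), the transpose of the data can be visualized, with one bar per disease grade and the treatment groups stacked within each bar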
barplot(table(df$Group,
              df$Grade),
        legend.text = TRUE,
        main = "Number of patients in each group (by grade of disease)",
        xlab = "Grade of disease",
        ylab = "Treatment group count",
        col = c("deepskyblue",
                "orange"),
        border = NA,
        las = 1)
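
- When the counts are already stored as a matrix or table, the same transposed chart can be drawn with t() instead of rebuilding the table; a minimal sketch, assuming the named counts used earlier, with transposed as an illustrative name

```r
# Named table of counts: grades in rows, treatment groups in columns
contingencyTable <- matrix(c(22, 29, 28, 23, 27, 24),
                           byrow = TRUE,
                           nrow = 3,
                           dimnames = list(c("Grade I", "Grade II", "Grade III"),
                                           c("Control", "Treatment")))

# t() swaps rows and columns: groups in rows, grades in columns
transposed <- t(contingencyTable)
transposed

# One bar per grade, with the treatment groups stacked within each bar
barplot(transposed,
        legend.text = TRUE,
        main = "Number of patients in each group (by grade of disease)",
        xlab = "Grade of disease",
        ylab = "Treatment group count",
        col = c("deepskyblue",
                "orange"),
        border = NA,
        las = 1)
```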
