# pacman's p_load() installs any missing packages and loads them; dplyr provides the %>% pipe
library(pacman); p_load(dplyr)
Because the assumption of equal intervals rarely applies to ordinal-scaled data, it is often inappropriate to use change scores with them. Ferreira, Almeida & Luiz (2013) presented a simple, interpretable measure of ordinal change. Because methods are rarely adopted unless code is available, I present code here to compute their measure and to bootstrap an arbitrary confidence interval around it.
The first quantity Ferreira, Almeida & Luiz (henceforth, FAL) introduced was \(I_pC\), the Indicator of Positive Change. For a table with \(k\) ordered categories, this quantity is given by the formula
\[I_pC = \frac{\sum_{i=1}^{k-1} \sum_{j=i+1}^{k} C_{ij}(j-i)}{\sum_{j=2}^{k}N_j(j-1)}\times 100\]
where \(C_{ij}\) is the count in cell \((i, j)\) and \(N_j\) is the \(j\)-th column total. The numerator is the observed amount of positive change - the cells above the diagonal, weighted by the number of categories crossed - and the denominator is the maximum amount of positive change the column totals allow. (The code below reports the ratio itself, without the \(\times 100\).)
The second quantity was \(I_nC\), the Indicator of Negative Change, which is simply \(I_pC\) computed on the transposed table. The following function will take your contingency table - formatted as a table, a matrix, or a data frame - and give you both values.
IpnC <- function(x, r = 2){
  # Observed positive change: cells above the diagonal, weighted by
  # the number of categories crossed (j - i)
  v <- which(upper.tri(x), arr.ind = T)
  PNum <- sum(x[v] * (v[, 2] - v[, 1]))
  # Maximum possible positive change: column totals weighted by (j - 1)
  PDen <- sum(colSums(x) * (1:ncol(x) - 1))
  CRat <- PNum/PDen
  # Negative change is the same computation on the transposed table
  xT <- t(x)
  vT <- which(upper.tri(xT), arr.ind = T)
  NNum <- sum(xT[vT] * (vT[, 2] - vT[, 1]))
  NDen <- sum(colSums(xT) * (1:ncol(xT) - 1))
  NRat <- NNum/NDen
  cat(paste("Numerator (IpC) = ", PNum, "Denominator (IpC) = ", PDen, "IpC = ", round(CRat, r)), "\n\n")
  cat(paste("Numerator (InC) = ", NNum, "Denominator (InC) = ", NDen, "InC = ", round(NRat, r)), "\n\n")}
To show that IpnC works, consider their Table 2 and the values for \(I_pC\) and \(I_nC\) they arrived at: 0.58 and 0.06, respectively.
m = read.table(text = "
1 2 3 4
562 674 213 3
29 489 412 7
6 31 76 6
0 0 0 0", header = T)
IpnC(m)
## Numerator (IpC) = 1541 Denominator (IpC) = 2644 IpC = 0.58
##
## Numerator (InC) = 72 Denominator (InC) = 1175 InC = 0.06
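As a quick arithmetic check, \(1541/2644 \approx 0.583\) and \(72/1175 \approx 0.061\), which round to the 0.58 and 0.06 FAL reported.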
Finally, we can use the nonparametric bootstrap to construct a confidence interval, in the same sort of fashion one might get the p-value and confidence interval for the overlap of two sets of genes (e.g., https://www.biostars.org/p/458853/, where the bootstrap result is compared to an exact hypergeometric test). To do this, we first need a function for resampling a contingency table.
ContingencyResample <- function(x){
  m <- as.matrix(x)
  cols <- ncol(m)
  rows <- nrow(m)
  # Expand the table into one (row, column) index pair per subject,
  # following the column-major order of as.vector(m)
  col.m <- rep(1:cols %x% rep.int(1, rows), as.vector(m))
  row.m <- rep(rep.int(1, cols) %x% 1:rows, as.vector(m))
  # Draw n subjects with replacement
  bs <- sample(length(col.m), replace = T)
  # Cross-tabulate the resampled subjects via indicator matrices
  rem <- t(diag(rows)[row.m[bs], ]) %*% diag(cols)[col.m[bs], ]
  colnames(rem) <- colnames(m)
  rownames(rem) <- rownames(m)
  return(as.table(rem))}
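As a quick sanity check (the seed here is arbitrary), every resampled table should have the same total count as the original, since each replicate redraws all n subjects with replacement:
set.seed(1)
b <- ContingencyResample(m)
sum(b) == sum(m)  # TRUE: the bootstrap holds n fixed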
Now that we can resample our contingency table, we can move on to the actual bootstrapping. The functions below combine the resampling with the \(I_pC\) and \(I_nC\) computations and return quantiles of the bootstrap distribution.
IpCBS <- function(x, p = 0.05, tail = 2, reps = sum(x)){
  # Resample the table reps times, compute IpC for each replicate,
  # and return quantiles of the bootstrap distribution
  replicate(reps, ContingencyResample(x), simplify = F) %>%
    sapply(function(x){
      v <- which(upper.tri(x), arr.ind = T)
      numP <- sum(x[v] * (v[, 2] - v[, 1]))
      denP <- sum(colSums(x) * (1:ncol(x) - 1))
      numP/denP}) %>%
    quantile(c(p/tail, p, 0.10, 0.25, 0.50, 0.75, 0.90, 1 - p, 1 - p/tail))}
InCBS <- function(x, p = 0.05, tail = 2, reps = sum(x)){
  # Same as IpCBS, but on the transposed table, which turns
  # negative change into positive change
  tx <- t(x)
  replicate(reps, ContingencyResample(tx), simplify = F) %>%
    sapply(function(tx){
      v <- which(upper.tri(tx), arr.ind = T)
      numN <- sum(tx[v] * (v[, 2] - v[, 1]))
      denN <- sum(colSums(tx) * (1:ncol(tx) - 1))
      numN/denN}) %>%
    quantile(c(p/tail, p, 0.10, 0.25, 0.50, 0.75, 0.90, 1 - p, 1 - p/tail))}
We can use these with FAL's data. They computed a 95% CI for \(I_pC\) of 0.56 to 0.60 by resampling 100 times, so in the interest of reproducing their results, I use 100 replicates here as well, though the function's default is the n represented in the contingency table. They did not provide bootstrap values for \(I_nC\), but we can guess they would probably have run from 0.05 to about 0.08. The seed was chosen for no reason except that I wanted to repeat "100" another time.
set.seed(100)
IpCBS(m, reps = 100); InCBS(m, reps = 100)
## 2.5% 5% 10% 25% 50% 75% 90% 95%
## 0.5629996 0.5683587 0.5721302 0.5787766 0.5832848 0.5885735 0.5959418 0.5982235
## 97.5%
## 0.6000077
## 2.5% 5% 10% 25% 50% 75% 90%
## 0.04994870 0.05192576 0.05360353 0.05664432 0.06167984 0.06726186 0.07232989
## 95% 97.5%
## 0.07381760 0.07469734
The \(I_pC\) results are basically theirs, give or take some minuscule software- and seed-related variance.
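If you only need the endpoints of the interval rather than the full spread of quantiles, you can index the returned vector by name; a minimal sketch:
# Pull just the two-sided 95% limits for IpC from the named quantile vector
set.seed(100)
IpCBS(m, reps = 100)[c("2.5%", "97.5%")]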
If you need to measure change in an ordinal variable assessed on two occasions, this method will suit you better than change scores, and it'll probably be simpler than working with RIDITs. As a final note, if you make a method, please make an effort to provide code. It'll help people out and may even get you more citations.
Ferreira, M. L. P., Almeida, R. M. V. R., & Luiz, R. R. (2013). A new indicator for the measurement of change with ordinal scores. Quality of Life Research, 22(8), 1999–2003.