Suppose we have the numbers of people infected with parasite of type 1, with parasite of type 2, and coinfected with both parasites. The following function returns the minus log-likelihood of a model with no interaction:
mLL_no_interaction <- function(p, data) {
  p <- c(p, 1 - p, p * (1 - p))
  -dmultinom(data, prob = p, log = TRUE)
}
Here, data is a vector of length 3 that contains the number of cases infected with parasite of type 1, with parasite of type 2, and co-infected with the 2 parasites. The argument p (0 < p < 1) is a measure of the force of infection of parasite of type 1 relative to the force of infection of parasite of type 2. For example, p = 0.25 means that parasite of type 1 is 3 times less infectious than parasite of type 2, and p = 0.50 means that both parasites are equally infectious. Of note, the dmultinom() function automatically rescales the prob argument so that it sums to 1.
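To illustrate that last point (the data and probability values below are chosen purely for illustration), the two calls below return exactly the same value because prob is rescaled internally:
dmultinom(c(23, 54, 3), prob = c(0.25, 0.75, 0.1875), log = TRUE)
dmultinom(c(23, 54, 3), prob = c(0.25, 0.75, 0.1875) / 1.1875, log = TRUE) # same value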
The following function returns the minus log-likelihood of a model with positive interaction:
mLL_positive_interaction <- function(p, data) {
  p <- exp(p) # to force the parameters to be positive
  p <- c(p[1], 1 - p[1], p[1] * (1 - p[1]) + p[2])
  -dmultinom(data, prob = p, log = TRUE)
}
Here, data is the same as before, but p is now a vector of length 2. The first value of this vector corresponds to the parameter p of the function mLL_no_interaction(). The second value is an additive correction term on the probability of mixed infection. If this parameter is forced to be 0, then the function mLL_positive_interaction() is equivalent to mLL_no_interaction().
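A minimal sanity check of that equivalence (data values chosen only for illustration; since the parameters are passed on the log scale, a correction term of 0 corresponds to -Inf):
mLL_no_interaction(0.25, c(23, 54, 3))
mLL_positive_interaction(c(log(0.25), -Inf), c(23, 54, 3)) # same value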
The following function returns the minus log-likelihood of a model with negative interaction:
mLL_negative_interaction <- function(p, data) {
  p <- exp(p)
  if (p[1] * (1 - p[1]) < p[2]) return(NA) # to keep the coinfection probability non-negative
  p <- c(p[1], 1 - p[1], p[1] * (1 - p[1]) - p[2])
  -dmultinom(data, prob = p, log = TRUE)
}
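The role of the NA trick can be seen directly (values chosen only for this example): if the correction term exceeds p1 * (1 - p1), the probability of coinfection would be negative, so the function returns NA instead of a likelihood:
mLL_negative_interaction(log(c(0.25, 0.5)), c(23, 54, 3)) # NA: 0.5 > 0.25 * 0.75
mLL_negative_interaction(log(c(0.25, 0.1)), c(23, 54, 3)) # valid: 0.1 < 0.25 * 0.75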
Customizing the optim() function a bit so that the parameters are optimized on the log scale (and thus forced to be positive) while being passed and returned on the natural scale:
optim2 <- function(par, ...) {
  out <- optim(log(par), ...)
  out$par <- exp(out$par)
  out
}
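A quick illustration of how optim2() is meant to be called (data and starting values chosen only for the example): the starting values are given on the natural scale, logged before optimization, and the estimates in $par are exponentiated back to the natural scale:
cases <- c(23, 54, 13)
fit <- optim2(c(cases[1] / sum(cases), 1e-3), mLL_positive_interaction, data = cases)
fit$par   # estimates back on the natural scale
fit$value # minus log-likelihood at the optimum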
The following function uses all the code above to test for an interaction with a likelihood ratio test (LRT):
test_interaction <- function(data) {
  # starting values:
  zero <- 1e-3
  p1 <- data[1] / sum(data)
  # fitting the 3 models:
  no_interaction <- optimize(mLL_no_interaction, 0:1, data = data)
  ps_interaction <- optim2(c(p1, zero), mLL_positive_interaction, data = data)
  ng_interaction <- optim2(c(p1, zero), mLL_negative_interaction, data = data)
  # comparing the models with a LRT and returning the output:
  no_interaction_LL <- no_interaction$objective
  interaction_LL <- min(ps_interaction$value, ng_interaction$value)
  out <- c(expected = no_interaction$minimum * sum(data),
           observed = data[3],
           "p-value" = 1 - pchisq(2 * (no_interaction_LL - interaction_LL), 1))
  round(out, 4)
}
Trying the test_interaction() function on 3 data sets:
test_interaction(c(23, 54, 3))
## expected observed p-value
## 18.8359 3.0000 0.0003
test_interaction(c(23, 54, 13))
## expected observed p-value
## 25.7511 13.0000 0.4879
test_interaction(c(23, 54, 33))
## expected observed p-value
## 38.3340 33.0000 0.0022
bbmle package
The same analysis can also be done with the bbmle package. In the likelihood function below, the interaction is parameterized multiplicatively: the parameter p_i multiplies the product p1 * (1 - p1), so that p_i = 1 corresponds to no interaction, p_i > 1 to a positive interaction and p_i < 1 to a negative interaction.
library(bbmle)
## Loading required package: stats4
mLL <- function(p1, p_i, data) {
  zero <- 1e-16
  p1 <- exp(p1)
  p_i <- exp(p_i)
  p <- c(p1, 1 - p1, p1 * (1 - p1) * p_i)
  if (any(p < zero)) return(NA)
  -dmultinom(data, prob = p, log = TRUE)
}
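As a quick check of this parameterization (data values chosen only for illustration), setting p_i = 1 recovers the probabilities of the no-interaction model; note that mle2() will receive the parameters on the log scale, hence log(1) = 0 for p_i:
mLL(log(0.25), log(1), c(23, 54, 3))
mLL_no_interaction(0.25, c(23, 54, 3)) # same value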
starting_values <- lapply(list(p1 = .2, p_i = 1), log)
model1 <- mle2(mLL, starting_values, data = list(data = c(23, 54, 13)))
exp(coef(model1))
## p1 p_i
## 0.2987058 0.8059566
exp(confint(model1))
## 2.5 % 97.5 %
## p1 0.2042032 0.4063247
## p_i 0.4174455 1.4621331
model0 <- mle2(mLL, starting_values, fixed = lapply(list(p_i = 1), log),
               data = list(data = c(23, 54, 13)))
anova(model0, model1)
## Likelihood Ratio Tests
## Model 1: model0, [mLL]: p1
## Model 2: model1, [mLL]: p1+p_i
## Tot Df Deviance Chisq Df Pr(>Chisq)
## 1 1 9.3679
## 2 2 8.8867 0.4812 1 0.4879
Putting everything inside a function:
test_interaction2 <- function(data) {
  require(bbmle) # for mle2
  zero <- 1e-16 # slightly more efficient to define this outside the mLL() function
  mLL <- function(p1, p_i, data) {
    p1 <- exp(p1)
    p_i <- exp(p_i)
    p <- c(p1, 1 - p1, p1 * (1 - p1) * p_i)
    if (any(p < zero)) return(NA)
    -dmultinom(data, prob = p, log = TRUE)
  }
  starting_values <- lapply(list(p1 = .2, p_i = 1), log)
  model0 <- mle2(mLL, starting_values, fixed = lapply(list(p_i = 1), log),
                 data = list(data = data))
  model1 <- mle2(mLL, starting_values, data = list(data = data))
  anova(model0, model1)
}
Note that, compared to the test_interaction() function, the test_interaction2() function does not depend on anything defined outside of it: the likelihood function and the starting values are both defined within it.
test_interaction2(c(23, 54, 3))
## Likelihood Ratio Tests
## Model 1: model0, [mLL]: p1
## Model 2: model1, [mLL]: p1+p_i
## Tot Df Deviance Chisq Df Pr(>Chisq)
## 1 1 20.3856
## 2 2 7.5804 12.805 1 0.0003457 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
test_interaction2(c(23, 54, 13))
## Likelihood Ratio Tests
## Model 1: model0, [mLL]: p1
## Model 2: model1, [mLL]: p1+p_i
## Tot Df Deviance Chisq Df Pr(>Chisq)
## 1 1 9.3679
## 2 2 8.8867 0.4812 1 0.4879
test_interaction2(c(23, 54, 33))
## Likelihood Ratio Tests
## Model 1: model0, [mLL]: p1
## Model 2: model1, [mLL]: p1+p_i
## Tot Df Deviance Chisq Df Pr(>Chisq)
## 1 1 18.9477
## 2 2 9.6101 9.3375 1 0.002245 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This gives exactly the same results!