This is a 2-dimensional Monte Carlo simulation using the mc2d package.
We will use mcmodel and evalmcmod from the mc2d package. This is a simple way to implement the 2-D simulation because we just need to define the model with mcmodel()
and evaluate it with evalmcmod()
.
We did this earlier in the document, but will add it again here as a reminder of what packages are needed for this example.
# Load packages.
library(mc2d)
## Loading required package: mvtnorm
##
## Attaching package: 'mc2d'
## The following objects are masked from 'package:base':
##
## pmax, pmin
Set the number of simulations for the variability and uncertainty dimensions.
ndvar(401) # Variability
## [1] 401
ndunc(59) # Uncertainty
## [1] 59
The mc2d functions will set nsv
to ndvar()
and nsu
to ndunc()
by default. Alternatively, we could supply these values to the mc2d model functions when we call them.
Define the variables we will use to set the seed
for random sampling and the number of digits
for print()
statements.
seed <- 1
digits <- 5
We will use this variable to explicitly set the seed with the various mc2d functions. Another approach would be to only set it through the model evaluation functions, or not at all. Since we want to do our best to provide the most reproducible results, we set the seed explicitly.
Within the mcmodel function, use mcstoc to define “mc nodes” for each component of the model.
For each stochastic variable, use the mcstoc()
function to create a “mcnode”, supply a probability function, the node type as “V” for variability or “U” for uncertainty, and any additional parameters to be passed to the probability function.
For deterministic variables, create nodes with the mcdata()
function using the “0” type.
We will model the deterministic factors as uniform probablity distributions.
The last statement makes an “mc” object using the mc function.
Here is where you first create all your variables. Look for what the variables mean, and then for the section after “Estimate the exposure using the 0, V and U nodes to create a VU node,” because that is the equation that connects them.
The program will then run through a bunch of loops for that equation, taking into account uncertainty for whichver variables you called “U” or “UV”.
If we wanted to, I think we could throw in a dose-response equation at the end here and generate different plots for the exposure and for the likelihood of infection. TBD.
# Define an exposure model for evaluation by evalmcmod().
expo.mod1 <- mcmodel({
# Values from Example 6.18 from Quantitative Microbial Risk Assessment,
# 2nd Edition by Charles N. Haas, Joan B. Rose, and Charles P. Gerba.
# (Wiley, 2014), pp. 215-216. Other fictitious values are noted below.
# Cistern water viral loading (viruses/L):
cw.vl <- cw.vl <- mcstoc(rnorm, type = "VU", seed = seed, mean = 8.39, sd = 21.8, linf = 0)
# Water consumption (L/day):
dw.ing <- mcdata(1.466, type = "U")
# Hosing per event ingestion rate (L/event): worst case estimate = 10mL or .01L
hose.perevent <- mcdata(.01, type = "0")
# Hosing frequency (hosing/day), fictitious discrete distribution (this basically says that there is a 10% chance someone never hoses, a 70% chance they use the hose every other day, and a 10% chance they use the hose every day):
hosing.frequency <- mcstoc(rempiricalD, type = "V", seed = seed,
values = c(0, 0.5, 1),
prob = c(0.1, 0.7, 0.2))
# Estimate the exposure using the 0, V and U nodes to create a VU node.
expo.mc1 <- (cw.vl * dw.ing) +
(cw.vl * (hose.perevent * hosing.frequency))
# Build a mc model from all of the mcnode objects.
mc(cw.vl, dw.ing, hose.perevent, hosing.frequency, expo.mc1)
})
Evaluate the model with 401 iterations in the variability dimension and 59 iterations in the uncertainty dimension, as set previously.
expo.ev1 <- evalmcmod(expo.mod1, seed = seed)
print(expo.ev1, digits = digits)
## node mode nsv nsu nva variate min mean median
## 1 cw.vl numeric 401 59 1 1 -85.411 8.36581 8.2169
## 2 dw.ing numeric 1 59 1 1 1.466 1.46600 1.4660
## 3 hose.perevent numeric 1 1 1 1 0.010 0.01000 0.0100
## 4 hosing.frequency numeric 401 1 1 1 0.000 0.54613 0.5000
## 5 expo.mc1 numeric 401 59 1 1 -125.639 12.30993 12.0834
## max Nas type outm
## 1 91.454 0 VU each
## 2 1.466 0 U each
## 3 0.010 0 0 each
## 4 1.000 0 V each
## 5 134.986 0 VU each
Print a summary and a plot of the evaluation results (expo.ev1
).
# Summarize the results.
summary(expo.ev1)
## cw.vl :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## median 8.34 21.8 -56.7 -33.3 -6.34 8.22 23.3 50.5 71.4 401 0
## mean 8.37 21.8 -56.7 -33.9 -6.16 8.24 22.9 51.0 72.7 401 0
## 2.5% 6.52 20.4 -71.8 -39.3 -8.83 5.93 19.8 43.1 60.2 401 0
## 97.5% 10.85 23.6 -43.1 -29.5 -3.55 11.00 25.7 57.5 88.9 401 0
##
## dw.ing :
## NoVar
## median 1.47
## mean 1.47
## 2.5% 1.47
## 97.5% 1.47
##
## hose.perevent :
## NoUnc
## NoVar 0.01
##
## hosing.frequency :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## NoUnc 0.546 0.262 0 0 0.5 0.5 0.5 1 1 401 0
##
## expo.mc1 :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## median 12.3 32.1 -83.4 -49.0 -9.35 12.09 34.2 74.5 105.0 401 0
## mean 12.3 32.1 -83.4 -49.9 -9.06 12.13 33.8 75.0 106.9 401 0
## 2.5% 9.6 30.0 -105.6 -57.9 -12.99 8.73 29.1 63.5 88.7 401 0
## 97.5% 16.0 34.7 -63.3 -43.5 -5.22 16.18 37.8 84.6 130.8 401 0
# Plot the results.
plot(expo.ev1)
Report the mean and median of the means with a 95% confidence interval (CI95).
mean.risk1 <- sapply(1:ndunc(), function(j) mean(expo.ev1$expo.mc1[, j, ]))
mean(mean.risk1)
## [1] 12.30993
quantile(mean.risk1, probs = seq(0, 1, 0.025))[c("2.5%", "50%", "97.5%")]
## 2.5% 50% 97.5%
## 9.603475 12.284070 15.956326
Plot the empirical cumulative distribution function (ecdf) of the exposure model (expo.mc1
) estimates.
I added that title and those X and Y labels based on what I think it probably means! Tell me if you think it’s wrong.
# Generate an "ecdf" plot. This actually calls plot.mcnode().
plot(expo.ev1$expo.mc1, main = "Exposure Assessment", xlab = "Daily Ingested Crypto Oocysts",ylab = "Proportion of Population")
To Dos:
-decide which variables have variation, which are uncertain, and which are both.
do this for pathogens other than Crypto.
figure out a way to integrate data on the same pathogen from multiple sources. Not at all sure how to do this so far.
add dose-response component
also, figure out how to tell the model that NO ONE GETS AN EXPOSURE LESS THAN 0. Actually, WHY it is generating exposures less than 0 in the first place. THAT SHOULD NOT BE POSSIBLE.
-Update: it has something to do with the variability and uncertainty in the cistern water pathogen estimates, because if you just set that to be a value of 8.39, you get a nice narrow range of dose estimates around 12, with just a little variation based on how often you use a hose. So how to fix that????????