For our first alternative, we will use mcmodel and evalmcmod from the mc2d package. This is a simple way to implement the 2-D simulation because we just need to define the model with mcmodel() and evaluate it with evalmcmod().

Notes: http://stackoverflow.com/questions/15303283/how-to-do-vlookup-and-fill-down-like-in-excel-in-r

Load packages

We did this earlier in the document, but will add it again here as a reminder of what packages are needed for this example.

# Load packages.
library(mc2d)
## Loading required package: mvtnorm
## 
## Attaching package: 'mc2d'
## The following objects are masked from 'package:base':
## 
##     pmax, pmin
library(data.table)

Define variables

Set the number of simulations for the variability and uncertainty dimensions.

nsv <- 401
nsu <- 59
ndvar(nsv)  # Variability
## [1] 401
ndunc(nsu)   # Uncertainty
## [1] 59

The mc2d functions will set nsv to ndvar() and nsu to ndunc() by default. Alternatively, we could supply these values to the mc2d model functions when we call them.

Define the variables we will use to set the seed for random sampling and the number of digits for print() statements.

seed <- 1
digits <- 5

Introducing the bootstrapped dose-response data, fread command and sampling from Brian:

alpha <- fread("~/u_drive/envh543/output/alphalist.csv", sep = ',')
n50 <- fread("~/u_drive/envh543/output/N50list.csv", sep = ',')
cdr <- merge(alpha,n50, by = "V1")
scdr <- cdr[sample(nrow(cdr), nsu), list(alpha, N50)] #sampling rows in the dose-response bootstrapped table equal to the number of simulations in the uncertainty dimension, aka NSU

We will use this variable to explicitly set the seed with the various mc2d functions. Another approach would be to only set it through the model evaluation functions, or not at all. Since we want to do our best to provide the most reproducible results, we set the seed explicitly.

Define exposure model

Within the mcmodel function, use mcstoc to define “mc nodes” for each component of the model.

For each stochastic variable, use the mcstoc() function to create a “mcnode”, supply a probability function, the node type as “V” for variability or “U” for uncertainty, and any additional parameters to be passed to the probability function.

For deterministic variables, create nodes with the mcdata() function using the “0” type.

We will model the deterministic factors as uniform probablity distributions.

The last statement makes an “mc” object using the mc function.

# Define an exposure model for evaluation by evalmcmod().
expo.mod1 <- mcmodel({
  # Values from Example 6.18 from Quantitative Microbial Risk Assessment, 
  # 2nd Edition by Charles N. Haas, Joan B. Rose, and Charles P. Gerba. 
  # (Wiley, 2014), pp. 215-216. Other fictitious values are noted below.
  
  # Cistern water viral loading (viruses/L):
  cw.vl <- cw.vl <- mcstoc(rnorm, type = "VU", seed = seed, mean = 8.39, sd = 21.8, rtrunc = TRUE, linf = 0)
  
  # Water consumption (L/day):
  dw.ing <- mcdata(1.466, type = "U")

  # Hosing per event ingestion rate (L/event): worst case estimate = 10mL or .01L - will be IGNORED for the moment
  hose.perevent <- mcdata(.01, type = "0")

  # Hosing frequency (hosing/day), fictitious discrete distribution (this basically says that there is a 10% chance someone never hoses, a 70% chance they use the hose every other day, and a 10% chance they use the hose every day) - will be IGNORED for the moment:
  hosing.frequency <- mcstoc(rempiricalD, type = "V", seed = seed, 
                             values = c(0, 0.5, 1), 
                             prob = c(0.1, 0.7, 0.2))
  
  # Estimate the exposure based ONLY on drinking water ingestion using the 0, V and U nodes to create a VU node.
  expo.mc1 <- (cw.vl * dw.ing) 
  
  #Estimate the risk using a dose-response model pulling from a bootstrapped model of the dose-response
  dose <- mcstoc(rpois, type = "VU", lambda = expo.mc1)
  
  # Create mcnodes from previously sampled dose-response bootstrap estimates. - Brian
  
  a <- mcdata(scdr$alpha, type = "U")
  n50 <- mcdata(scdr$N50, type = "U")
  
  risk <- 1 - (1 + dose*((2^(1/a)-1)/n50))^(-a) #this equation is my version of what came off QMRA Wiki <http://qmrawiki.canr.msu.edu/index.php?title=Dose_response_assessment> and someone should check it -A
  
  # Build a mc model from all of the mcnode objects.
  mc(cw.vl, dw.ing, expo.mc1, risk) 
})

Evaluate the model

Evaluate the model with 5000 iterations in the variability dimension and 250 iterations in the uncertainty dimension, as set previously.

expo.ev1 <- evalmcmod(expo.mod1, seed = seed)
print(expo.ev1, digits = digits)
##       node    mode nsv nsu nva variate        min    mean   median
## 1    cw.vl numeric 401  59   1       1 0.00032242 20.7998 18.15685
## 2   dw.ing numeric   1  59   1       1 1.46600000  1.4660  1.46600
## 3 expo.mc1 numeric 401  59   1       1 0.00047267 30.4925 26.61794
## 4     risk numeric 401  59   1       1 0.00000000  0.1575  0.13028
##        max Nas type outm
## 1  93.7486   0   VU each
## 2   1.4660   0    U each
## 3 137.4355   0   VU each
## 4   0.7396   0   VU each

Summarize results

Print a summary and a plot of the evaluation results (expo.ev1).

# Summarize the results.
summary(expo.ev1)
## cw.vl :
##        mean   sd     Min  2.5%   25%  50%  75% 97.5%  Max nsv Na's
## median 20.9 14.6 0.05243 0.976  9.01 18.2 30.2  54.1 73.0 401    0
## mean   20.8 14.7 0.08499 1.002  9.14 18.1 30.0  54.7 74.0 401    0
## 2.5%   19.7 13.7 0.00431 0.621  7.78 16.3 27.9  51.1 62.8 401    0
## 97.5%  22.2 15.9 0.33585 1.747 10.40 20.0 31.9  60.2 90.0 401    0
## 
## dw.ing :
##        NoVar
## median  1.47
## mean    1.47
## 2.5%    1.47
## 97.5%   1.47
## 
## expo.mc1 :
##        mean   sd     Min  2.5%  25%  50%  75% 97.5% Max nsv Na's
## median 30.7 21.4 0.07686 1.432 13.2 26.7 44.3  79.3 107 401    0
## mean   30.5 21.6 0.12460 1.469 13.4 26.5 43.9  80.1 108 401    0
## 2.5%   28.9 20.1 0.00633 0.911 11.4 23.8 41.0  74.9  92 401    0
## 97.5%  32.5 23.4 0.49236 2.561 15.3 29.3 46.7  88.2 132 401    0
## 
## risk :
##          mean     sd Min    2.5%    25%   50%   75% 97.5%   Max nsv Na's
## median 0.1423 0.0915   0 0.00530 0.0643 0.125 0.203 0.348 0.456 401    0
## mean   0.1575 0.0962   0 0.00663 0.0810 0.150 0.226 0.351 0.436 401    0
## 2.5%   0.0707 0.0498   0 0.00000 0.0303 0.062 0.103 0.179 0.229 401    0
## 97.5%  0.3098 0.1576   0 0.02002 0.1963 0.318 0.437 0.562 0.652 401    0
# Plot the results.
plot(expo.ev1)

Report the mean and median of the exposure means with a 95% confidence interval (CI95).

mean.expo <- sapply(1:ndunc(), function(j) mean(expo.ev1$expo.mc1[, j, ]))
mean(mean.expo)
## [1] 30.49253
quantile(mean.expo, probs = seq(0, 1, 0.025))[c("2.5%", "50%", "97.5%")]
##     2.5%      50%    97.5% 
## 28.89867 30.68882 32.49722

Plot the empirical cumulative distribution function (ecdf) of the exposure model (expo.mc1) estimates.

# Generate an "ecdf" plot. This actually calls plot.mcnode().
plot(expo.ev1$expo.mc1)

Plot the empirical cumulative distribution function (ecdf) of the risk model (risk) estimates.

# Generate an "ecdf" plot. This actually calls plot.mcnode().
plot(expo.ev1$risk)

So if we did this right, could the plot above can be correctly labeled as follows?

# Generate an "ecdf" plot. This actually calls plot.mcnode().
plot(expo.ev1$risk, main = "Daily Risk of Infection with Cryptosporidium by Drinking Water in the USVI", ylab = "Proportion of Population", xlab = "Daily Probability of Infection")

Ongoing questions:

-Can we give this model multiple dose-response bootstraps and ask it to combine them based on how reliable we think each one is? Reread chapter section on this.

-would like to clarify what is happening in the creation of “dose”, in line: >dose <- mcstoc(rpois, type = “VU”, lambda = expo.mc1)

-Best way to combine this with risks for other pathogens, such as Giardia and Pseudomonas

-Confirm 100% that it is okay that we are really only randomly sampling a fraction of the total bootstrapped samples for alpha and N50 that were originally created

-Is that an overly high risk estimate for just ONE pathogen? At least it seems to recognize that no one should have greater than about a 50% risk of infection on any given day.