1 Notes on Arrival module

There is a part of the EpiModel simulation in which we attempting to achieve a network that has the correct

Joint age / race population distribution
Marginal age and race-specific prevalences.
Joint age / race / network specific mean degree distributions

Importantly, we seek to hit all of these targets in a specific year, which is currently set to be 2019, but which I will call the target year.

The arrival module (as well as some other modules) behaves differently before and after the target year. Before the target year, many aspects of the modules are pushing the network to conform to our targets. However, after the target year the modules will no longer be pushing the network in any specific direction, with the hope of observing the “natural” consequences of the network’s characteristics. We hope this approach allows us to hit our targets using a more hands-on approach, but also allows us to observe realistic outcomes after the target year.

1.1 Number of Nodes Added

1.1.1 Pre Target Year

A back-calculation is done based on the assumed age and race-specific mortality rates to determine the number of individuals in each age and race that should enter the population each year.

exp_num <- dat$param$demog_match_arrival_df %>%
 filter(entry_year == cur_year)
nNew <- rpois(n = 1, lambda = sum(exp_num$num_ent_week))

1.1.2 Post Target Year

After the target year, the total population size grows according to the OFM projections for overall population growth in Washington State.

Because the OFM uses estimates of the entire state rather than our 15,000 node network, we calculate the OFM projected population growth as a percentage of the population size in 2019.
A linear interpolation is used to calculate the expected population size during each week. This prevents large jumps in the network size during the beginning of each year in the simulation.

c_year_ex <- dat$control$year_start + (at - dat$control$start) / 52
cur_year <- floor(c_year_ex)
post_19_grwth <- dat$param$pop_growth_post_targ
if (cur_year %in% post_19_grwth$year){
  yr_rem <- 1 - c_year_ex %% 1
  grwth_vals <- post_19_grwth$s_perc[match(c(cur_year, cur_year + 1),
                                           post_19_grwth$year)]
  multiplier <- sum(grwth_vals * c(yr_rem, 1 - yr_rem))
  orig_num <- orig_num * multiplier
}
nNew <- max(0, round(orig_num - current_num +
                       rnorm(n = 1, mean = 0, sd = 5)))
nNew <- min(nNew, 10) # This is to account for what happens when
                      # We change targets midway through the simulation.
                      # This makes sure we don't shift too fast.

The following is a plot of the OFM population growth projections:

suppressPackageStartupMessages(library(tidyverse))
options(dplyr.summarise.inform = FALSE)
pop_grw_p19 <- WApopdata::wa_age_race_sex_hisp_proj %>%
  filter(hisp != "all") %>% 
  filter(age.grp != "Total") %>%
  group_by(year) %>% summarise(tot = sum(value))
pop_grw_p19$s_perc <- pop_grw_p19$tot / 
  pop_grw_p19$tot[pop_grw_p19$year == "2019"]
pop_grw_p19 %>% ggplot(aes(x = year, y = 100 * s_perc)) +
  geom_line(color = "blue", size = 2) + ylab("Percentage") + xlab("Year") +
  geom_hline(yintercept = 100) + geom_vline(xintercept = 2019)

1.2 Attributes

Initially, all attributes are set to an NA value:

dat$attr <- lapply(dat$attr, {
  function(x)
    c(x, rep(NA, nNew))
})

1.2.1 uid

Each node is assigned a Unique id. This uid is used in order to keep track of nodes even after they leave the edge list.

dat$attr$uid[newIds] <- dat$temp$max.uid + (1:nNew)
dat$temp$max.uid <- dat$temp$max.uid + nNew

1.2.2 Race

1.2.2.1 Pre Target Year

Before the target year, the race of each individual is based on the back-calculated number of individuals of each race that will enter the population during each year:

race_dist <- dat$param$demog_match_arrival_df %>%
  filter(entry_year == cur_year) %>%
  select(race, num_ent_week, prop)
race <- sample(race_dist$race, length(newIds),
               prob = race_dist$prop, replace = T)
dat$attr$race[newIds] <- race

1.2.2.2 Post Target Year

After the target year, the race of each individual is based on the racial distribution of the Census’s estimate of the 15-24-year-old population in Washington state.

race_dist <- dat$param$post_race_distr
race <- sample(race_dist$race, length(newIds),
               prob = race_dist$prop, replace = T)
dat$attr$race[newIds] <- race

1.2.3 Region

The regional distribution is taken at all points from the regional distribution target (stored in param$epistats$attrdist).

region_dist <- dat$param$epistats$attrdist$region
region <- sample(region_dist$region, length(newIds),
               prob = region_dist$prop, replace = T)
dat$attr$region[newIds] <- region

1.2.4 Age

All individuals entering the network come in at the age of 15

arrival_ages <- dat$param$arrival.age
dat$attr$age[newIds] <- arrival_ages

1.2.5 Disease status and related

Disease status for everyone entering the population is set to be zero.

dat$attr$status[newIds] <- rep(0, nNew)
dat$attr$diag.status[newIds] <- rep(0, nNew)
dat$attr$rGC[newIds] <- dat$attr$GC.timesInf[newIds] <- 0
dat$attr$uGC[newIds] <- dat$attr$GC.timesInf[newIds] <- 0
dat$attr$rCT[newIds] <- dat$attr$CT.timesInf[newIds] <- 0
dat$attr$uCT[newIds] <- dat$attr$CT.timesInf[newIds] <- 0
dat$attr$count.trans[newIds] <- 0

1.2.6 Testing behavior

Testing behavior is assigned randomly, conditional on race.

rates <- dat$param$hiv.test.late.prob[race_num]
dat$attr$late.tester[newIds] <- rbinom(length(rates), 1, rates)

1.2.7 Treatment Trajectory

Treatment trajectory is assigned randomly, with probabilities given conditionally on race and region. There are three different trajectories (prop.1 - prop.3), but I am not sure what these do.

tt.traj.dt <- dat$param$tt.traj.dt
tmp_mat <- as.matrix(
  tt.traj.dt[.(races, regions), .(prop.1, prop.2, prop.3)]
)
tt.traj <- apply(tmp_mat, 1, function(x) {
  sample(c(1:3), 1, x, replace = T)
})
dat$attr$tt.traj[newIds] <- tt.traj

1.2.8 Circumcision

This attribute is once again assigned randomly with probabilities given conditionally on race

circ <- rep(NA, nNew)
for (i in 1:3) {
  ids.race <- which(dat$attr$race[newIds] == c("B", "H", "O")[i])
  circ[ids.race] <- rbinom(length(ids.race), 1, dat$param$circ.prob[i])
}
dat$attr$circ[newIds] <- circ

1.2.9 Role Type

This attribute is assigned with probabilities of each role being the same for all individuals:

ns <- dat$param$netstats$attr
role.type_dist <- dat$param$epistats$attrdist$role.type
role.type <- sample(role.type_dist$role.type, size = length(newIds),
                    replace = TRUE, prob = role.type_dist$prop)
dat$attr$role.type[newIds] <- role.type
ins.quot <- rep(NA, nNew)
ins.quot[dat$attr$role.type[newIds] == 0]  <- 1
ins.quot[dat$attr$role.type[newIds] == 1]  <- 0
ins.quot[dat$attr$role.type[newIds] == 2]  <-
                                runif(sum(dat$attr$role.type[newIds] == 2))
dat$attr$ins.quot[newIds] <- ins.quot

1.2.10 Risk Group for One-Time Partners

This attribute is randomly assigned without consideration of any other attributes.

dat$attr$risk.grp[newIds] <- sample(1:5, nNew, TRUE)

1.2.11 Prep and Testing

All entering nodes are assigned to not being using PrEP, or have ever been tested:

dat$attr$prepStat[newIds] <- 0
dat$attr$num.neg.tests[newIds] <- 0

1.3 Other attributes (assigned in init_whamp_attribute_msm)

1.3.1 Awareness and Interest in PrEP

On entry, PrEP awareness and interest is assigned in the same way it is for other individuals in the population. This assignment is governed by random chance, various parameters, and the assign_prep_awareness_interest function. Here is the code chunk used to assign these two attributes:

prep_aware <- dat$param$prep_aware
prep_interest <- dat$param$prep_interest
prep_current <- dat$param$prep_current
prep.adhr.dist <- dat$param$prep.adhr.dist

# pull demographic attributes 
race <- dat$attr$race
region <- dat$attr$region

# assign awareness 
ids_aware <- assign_prep_awareness_interest(
  ids, prep_aware, list(race, region), prep_cascade = "prep_aware"
) 

dat$attr$prepAware[ids_aware] <- 1

if (length(ids_aware) > 0) {
  # assign interest (among the awared)
  ids_interest <- assign_prep_awareness_interest(
    ids_aware, prep_interest, list(region), 
    prep_cascade = "prep_interest", process = "initial"
  ) 
  dat$attr$prepInterest[ids_interest] <- 1
}

For more information, see mod.pdap.R and the assign_prep_awareness_interest function inside this script.

Arrival Module Summary

Adam Elder

2021-01-09