– Aditya, Nick, Jonathan, and Nikki
As in previous times, we will see results from the Chicago model, for a 25,000 day simulation (approx. 68 years).
New additions since last time:
Testing module
PrEP module
The model includes:
Currently, we are modeling a testing-and-diagnosis piece. The implementation for the starting network is complete. Temporal simulation of testing and diagnosis is in progress. More details are here.
Following this step, we will include PrEP.
Other points for discussion: Lag time between diagnosis and ART initiation; ART/PrEP cessation; mixing parameters; matching simulated and empirical age structure.
Below, is an “interrogation” of the model, to demonstrate its various features.
These datasets include the sexual network at the last time step, biomarker data with detailed trajectories, infection, death and partnership events, and counts of various quantities at each time step.
We first compute the annual growth rate.
The final population size is 1.0914 × 104, corresponding to a growth rate of 2.89%. (The rate of entry of new individuals is a free parameter, and will be adjusted to reflect data form the three cities).
The age distribution at the end of the simulation is below.
age <- net%v%"age"
plot(
qplot(age, geom="histogram", binwidth=5,
col=I("red"),
alpha=I(.2))
)
The overall prevalence at the end of the simulation was about 38.5%. HIV prevalence as a function of time is shown below.
The annual incidence rates over the course of the simulation are shown below.
The mean annual incidence, averaged over the length of the simulation, is 2.48%.
The target main mean degree is 0.46 and a plot of the simulated mean degree, over the course of the simulation, is below.
The simulated and target momentary degree distributions are below.
## Loading required package: data.table
##
## Attaching package: 'data.table'
## The following objects are masked from 'package:dplyr':
##
## between, last
## The following objects are masked from 'package:reshape2':
##
## dcast, melt
The target mean duration is 911. In the simulated data above, the interquartile range is 227, 1086.
The target casual mean degree is 0.59 and a plot of the simulated mean degree, over the course of the simulation, is below.
The simulated and target momentary degree distributions are below.
## degree0 degree1 degree2 degree3 degree4 degree5 degree6
## 6044 3731 877 214 38 9 1
pt_data_casual <- pt_data[network_type == 1]
# group by p1, p2
pt_data_casual[,.(counts = .N),by=.(p1,p2),][,unique(counts)] # 1 2 4 3
## [1] 1 2 4 3
pt_data_casual[,.(counts = .N),by=.(p1,p2),][,.N,by=counts]
## counts N
## 1: 1 9723
## 2: 2 157075
## 3: 4 158
## 4: 3 10
# so focus on just 2 is justified
## (old DONOTUSE: res <- pt_data_casual[,.(tick, counts = .N),by=.(p1,p2),][counts == 2,abs(diff(tick))])
res <- pt_data_casual[,.(tick, counts = .N),by=.(p1,p2),][counts == 2,.(dur = abs(diff(tick))), by=.(p1,p2)][,dur]
summary(res)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.0 92.0 220.0 317.9 442.0 4436.0
res_dt <- as.data.frame(res)
qplot(res, geom="histogram", binwidth=100,
col=I("red"),
alpha=I(.2))
sum.res <- summary(res)
The target mean duration is 335. In the simulated data above, the interquartile range is 92, 442.
p_inf_status <- person_data$infection.status
p_time_of_infection <- person_data$time.of.infection
p_art_status <- person_data$art.status
p_time_of_art_initiation <- person_data$time.of.art.initiation
table(p_inf_status)
## p_inf_status
## 0 1
## 10522 9291
table(p_art_status)
## p_art_status
## 0 1
## 12961 6852
xtabs(~factor(p_inf_status, exclude=NULL) +
factor(p_art_status, exclude=NULL))
## factor(p_art_status, exclude = NULL)
## factor(p_inf_status, exclude = NULL) 0 1
## 0 10522 0
## 1 2439 6852
## p_time_of_art_initiation == -1 implies that the person is not on ART (i.e. ART status is 0).
The distribution of the number of tests for all people (tests are recorded until a positive diagnosis, or death, or completion of the simulation) is below,
ggplot(person_data, aes(number.of.tests))+
geom_histogram(binwidth=1, col=I("red"))+
xlab("number of times a person tests")+
ylab("number of persons")+
scale_x_continuous(breaks=c(seq(0, max(person_data$number.of.tests), 3)))
ggplot(person_data, aes(number.of.tests))+
geom_bar(aes(col=I("red"), y=(..count..)/sum(..count..)))+
xlab("number of times a person tests")+
ylab("proportion of persons")+
scale_x_continuous(breaks=c(seq(0, max(person_data$number.of.tests), 3)))
The summary statistics for the number of tests is below:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 1.000 5.000 7.316 12.000 39.000
The distribution of the number of tests for individuals who received a positive diagnosis (testing stops when the person is diagnosed with HIV):
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 3.000 7.000 8.332 12.000 37.000
## [1] 5213 3
## [1] "p_id" "counts"
## p_id counts
## Min. : 7694 Min. :1.000
## 1st Qu.:12860 1st Qu.:1.000
## Median :15666 Median :1.000
## Mean :15148 Mean :1.521
## 3rd Qu.:17828 3rd Qu.:2.000
## Max. :19793 Max. :6.000
In the last 2 years, there were a total of 5213 tests.
The mean (median) number of times a negative/undiagnosed individual tested was 1.52 (1), relative to empirical estimates of 3.2 (3) tests, in the last 2 years.
The ART initiation portion was simplified so that all infected's initate ART 1 year after seroconversion. At presetnt, the proportion of the infected on ART was
on.art <- which(net%v%"art.status" == 1)
length(on.art)
[1] 3576
length(on.art)/length(infected)
[1] 0.8508208
0.85. The total proportion of all people in the population who were on ART is 32.77%.
## Selected upto n.to.follow infecteds at random
n_to_follow <- 10
infectees <- inf_event_data$infectee
uniq_biom_pid <- unique(biom_data$p_id)
infectee_uniq_biom_pid <- uniq_biom_pid[which(uniq_biom_pid %in% infectees)]
new <- biom_data[which(biom_data$p_id == infectee_uniq_biom_pid[1]),]
if (length(infectee_uniq_biom_pid > n_to_follow)){
infectee_uniq_biom_pid <- infectee_uniq_biom_pid[1:n_to_follow]#first 10
#infectee_uniq_biom_pid <- sample(infectee_uniq_biom_pid,
#n_to_follow, replace=FALSE)# random 10
}
for (i in 2:length(infectee_uniq_biom_pid)){
new_entry <- biom_data[which(biom_data$p_id == infectee_uniq_biom_pid[i]),]
new <- rbind(new, new_entry)
}
new$p_id <- as.character(new$p_id)
par(mfrow=c(2,1))
ggplot(new, aes(x=tick/365, y=viral_load, color=p_id))+
geom_line()+xlab("year")
ggplot(new, aes(x=tick/365, y=cd4_count, color=p_id))+
geom_line()+xlab("year")
Example CD4 and viral load trajectories for 10 randomly selected, HIV-infected individuals are shown above. Individuals on ART are identifiable as those whose CD4 counts start to recover after decreasing initially; individuals who do not go on ART do experience a monotonic decrease.
The features we have in the model appear to work well so far. We are in the process of adding more (as mentioned above), and continuing to monitor emergent trends in the behavior of key parameters and outcomes.
Suggestions welcome!