Effects of alternate approaches to coding partnership active status

Intro

The purpose of this analysis is to evaluate the impact of coding decisions on the active status indicator for partnerships. This has implications for the for the definition of partnership type and duration, and estimates of respondent degree.

The problem

There are two key issues:

Definition of casl (ptype 2) vs. inst (ptype 3) partners
The survey asks respondents if they have only had sex with this partner one time. In a cross-sectional survey, a “yes” answer can mean two things: this is the first of what will be more in the future, so this is a casl partner, or this is the one and only act, so this is an inst partner. The respondent’s answer to the active status question may allow us to distinguish between these two things. Whether we decide to use or ignore active status, this determines the distribution of partnership types, and that, in turn, affects the degree distribution and the duration distribution of active partnerships, since inst partners are excluded from momentary degree counts and durations.
- Current ARTnetData scripts code all one-time partners as “inst” partners, even if the respondent explicitly says they do expect to have sex with this person again. (script here)
Treatment of missing (“DK”) responses to the active status question
The survey asks respondents whether they expect to have sex with this partner again, and answer options include DK. The DK fraction is typically small for heterosexual partnerships, but large for MSM. The assignment of these responses will affect the estimates of degree, the question is by how much.
- Current ARTnet scripts code DK as “not active” (script here).

The distributions of partner type, degree and duration are fundamental features of the sexual networks, the ERGM specifications, and the reachable path. So it is important to understand the implications that our coding decisions can have on these outcomes.

Both current ARTnet coding decisions result in the exclusion of first-time casl partners from the casl partner pool. This will lower the estimated mean degree and concurrency fraction, and raise the estimated mean age for active casl partnerships.

The coding alternatives

These two features, active status, and partner type, are interdependent, so there is a decision tree of options. The sequence is:

Do we use active=yes to allow one-time partners to be classified as casl instead of inst?
Do we impute the “DK” answers, using some rule? This will also reclassify some one-time partners as casl instead of inst.

Raw survey responses

This section presents descriptive stats on the raw survey responses to the “expect to have sex again” (active) and “had sex only once” (once) questions.

Active

The raw responses to the ongoing question, ONGOING.orig are shown below, broken down by the current ARTnet ptype variable (any non-main partnership reported as ONCE is coded as ptype=3).

Plot

Almost 20% of ptype 3 partners are reported as ongoing/active.

# have to construct the overall prev for each yr

activedf <- l %>%
  group_by(ptype, ONGOING.orig) %>%
  summarize(num = n()) %>%
  mutate(pct = num/sum(num))

ggplot(activedf, 
       aes(x=factor(ONGOING.orig), 
           y=pct,
           group=factor(ptype), 
           fill=factor(ptype))) +
  geom_bar(stat="identity", position="dodge", alpha=0.7) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
  labs(title = "Raw responses to active status by ARTnet ptype",
       x = "ONGOING.orig", y = "Percent",
       fill = "ptype")

Table

sjPlot::tab_xtab(var.row = l$ONGOING.orig, 
                 var.col = l$ptype, 
                 title = "Raw responses to ONGOING.orig x ptype",
                 show.row.prc = T,
                 show.summary = F)

Raw responses to ONGOING.orig x ptype
ONGOING.orig	ptype			Total
ONGOING.orig	1	2	3	Total
0	402 6.8 %	1084 18.4 %	4406 74.8 %	5892 100 %
1	2110 29.4 %	3659 50.9 %	1414 19.7 %	7183 100 %
88	103 3.4 %	1220 39.7 %	1749 56.9 %	3072 100 %
99	1 4.3 %	4 17.4 %	18 78.3 %	23 100 %
Total	2616 16.2 %	5967 36.9 %	7587 46.9 %	16170 100 %

One-time

The raw responses to the one-time question, ONCE are shown below, broken down by the raw response to the ONGOING.orig question.

Plot

About 20% of those who say the partnership is ongoing/active report having sex only once.

otdf <- l %>%
  mutate(ONCE = ifelse(ONCE==1, "yes", "no")) %>%
  group_by(ONGOING.orig, ONCE) %>%
  summarize(num = n()) %>%
  mutate(pct = num/sum(num))

ggplot(otdf,
       aes(x=ONCE, 
           y=pct, 
           group=factor(ONGOING.orig), 
           fill=factor(ONGOING.orig))) +
  geom_bar(stat="identity", position="dodge", alpha=0.7) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
  labs(title = "Raw responses to had sex only once by active status",
       x = "Once", y = "Percent",
       fill = "ONGOING.orig")

Table

sjPlot::tab_xtab(var.row = l$ONGOING.orig, 
                 var.col = l$ONCE, 
                 title = "Raw responses to ONCE x ONGOING.orig",
                 show.row.prc = T,
                 show.col.prc = T,
                 show.summary = F)

Raw responses to ONCE x ONGOING.orig
ONGOING.orig	ONCE		Total
ONGOING.orig	1	2	Total
0	4406 74.8 % 58.1 %	1486 25.2 % 17.3 %	5892 100 % 36.4 %
1	1414 19.7 % 18.6 %	5769 80.3 % 67.2 %	7183 100 % 44.4 %
88	1749 56.9 % 23.1 %	1323 43.1 % 15.4 %	3072 100 % 19 %
99	18 78.3 % 0.2 %	5 21.7 % 0.1 %	23 100 % 0.1 %
Total	7587 46.9 % 100 %	8583 53.1 % 100 %	16170 100 % 100 %

Alternate definitions: comparison

Active

We compare 2 alternatives:

active.strict – the current ARTnet coding, where DK is coded 0.
active.imp – imputes the DK responses by ptype: p=0.5 for ptype=2, p=0.1 for ptype=3. Note that ARTnet already recodes all “DK” to active for main partners, and we don’t change that.

Overall, the imputation results in about 10% of casl and inst partnerships having their status changed to active (see table).

Plot

The effect is larger for ARTnet defined casl partners: 25% of the non-active partners have their status changed to active.

activedf <- l %>%
  group_by(ptype, active.strict, active.imp) %>%
  summarize(num = n()) %>%
  mutate(pct = num/sum(num))

ggplot(activedf,
       aes(x=factor(active.strict), y=pct, 
           group=factor(active.imp), fill=factor(active.imp))) +
  geom_bar(stat="identity", position="dodge", alpha=0.7) +
  facet_wrap(~ptype) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
  labs(title = "active.strict vs. active.imp by ptype",
       x = "active.strict", y = "Percent",
       fill = "active.imp")

Table

sjPlot::tab_xtab(var.row = l$active.strict, 
                 var.col = l$active.imp, 
                 title = "Strict vs. imputed active status indicator",
                 show.row.prc = T,
                 show.summary = F)

Strict vs. imputed active status indicator
active.strict	active.imp		Total
active.strict	0	1	Total
0	8106 91 %	803 9 %	8909 100 %
1	0 0 %	7289 100 %	7289 100 %
Total	8106 50 %	8092 50 %	16198 100 %

Partner type

Here there are 3 possible definitions, depending on the use of the active status indicator for ptype classification, and which active status indicator is used.

pype – the original ARTnet coding, where all non-main partners are classified exclusively by the ONCE variable.
ptype.alt – non-main partners are classified by both ONCE and active.strict. One-time partners are classified as casl partners if the respondent said they expect to have sex with them again.

ptype==3 if ONCE=1 & active.strict=0
ptype==2 if ONCE=1 & active.strict=1.

ptype.imp – uses active.imp in place of active.strict. Only affects the active=DK partners. Changes some of these casl partners to active, and reclassifies a small number of inst partners as active (and therefore) casl partners.

About 19% of inst partners are reclassified to casl if we use the active status indicator (ptype.alt). The additional impact of imputing active status when unknown is minimal.

Plot

ptype v. ptype.alt

partnerdf <- l %>%
  group_by(ptype, ptype.alt) %>%
  summarize(num = n()) %>%
  mutate(pct = num/sum(num))

ggplot(partnerdf,
       aes(x=factor(ptype), y=pct, 
           group=factor(ptype.alt), fill=factor(ptype.alt))) +
  geom_bar(stat="identity", position="dodge", alpha=0.7) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
  labs(title = "ptype vs. ptype.alt",
       x = "ptype", y = "Percent",
       fill = "ptype.alt")

ptype v. ptype.imp

partnerdf <- l %>%
  group_by(ptype, ptype.imp) %>%
  summarize(num = n()) %>%
  mutate(pct = num/sum(num))

ggplot(partnerdf,
       aes(x=factor(ptype), y=pct, 
           group=factor(ptype.imp), fill=factor(ptype.imp))) +
  geom_bar(stat="identity", position="dodge", alpha=0.7) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
  labs(title = "ptype vs. ptype.imp",
       x = "ptype", y = "Percent",
       fill = "ptype.imp")

Tables

ptype v. ptype.alt

sjPlot::tab_xtab(var.row = l$ptype, 
                 var.col = l$ptype.alt, 
                 title = "Ptype: Ignoring vs. using active.strict",
                 show.row.prc = T,
                 show.summary = F)

Ptype: Ignoring vs. using active.strict
ptype	ptype.alt			Total
ptype	1	2	3	Total
1	2618 100 %	0 0 %	0 0 %	2618 100 %
2	0 0 %	5978 100 %	0 0 %	5978 100 %
3	0 0 %	1414 18.6 %	6188 81.4 %	7602 100 %
Total	2618 16.2 %	7392 45.6 %	6188 38.2 %	16198 100 %

ptype v. ptype.imp

sjPlot::tab_xtab(var.row = l$ptype, 
                 var.col = l$ptype.imp, 
                 title = "Ptype: Ignoring vs. using active.imp",
                 show.row.prc = T,
                 show.summary = F)

Ptype: Ignoring vs. using active.imp
ptype	ptype.imp			Total
ptype	1	2	3	Total
1	2618 100 %	0 0 %	0 0 %	2618 100 %
2	0 0 %	5978 100 %	0 0 %	5978 100 %
3	0 0 %	1597 21 %	6005 79 %	7602 100 %
Total	2618 16.2 %	7575 46.8 %	6005 37.1 %	16198 100 %

Impact on key measures: Comparison

Degree distributions

The active and once coding alternatives have effects on the deg.casl and deg.tot degree distributions. There is no effect on deg.main.

There are lots of plots here, and the impacts shown in the 1st plot on each tab are summarized in the Mean Degree and Concurrency sections. The key finding from the 2nd plot on each tab in this section is that:

Degree shifts are not always from deg to deg+1. We also see shifts from deg0 to deg2, deg2 to deg5, etc.
Again, the big impact is from using the active status indicator (rather than ONCE), not the imputation.

Plots

deg.casl v. deg.casl.alt

The difference here is using the strict active status indicator to classify one-time partners as casl. Respondent has answered “yes, i expect to have sex with this partner again”.

# by degree value
orig <- d %>%
  group_by(deg.casl) %>%
  summarize(num = n()) %>%
  mutate(pct = num/sum(num),
         measure = "orig") %>%
  rename(deg = deg.casl) %>%
  select(-num)

alt <- d %>%
  group_by(deg.casl.alt) %>%
  summarize(num = n()) %>%
  mutate(pct = num/sum(num),
         measure = "alt") %>%
  rename(deg = deg.casl.alt) %>%
  select(-num)

bind_rows (orig, alt) %>%
  mutate(measure = forcats::fct_relevel(measure, "orig")) %>%
  ggplot(aes(x=factor(deg), y=pct, 
           group=measure, fill=measure)) +
  geom_bar(stat="identity", position="dodge", alpha=0.7) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
  labs(title = "deg.casl vs. deg.casl.alt",
       x = "degree", y = "Percent",
       fill = "measure")

# by original deg.casl
degdf <- d %>%
  group_by(deg.casl, deg.casl.alt) %>%
  summarize(num = n()) %>%
  mutate(pct = num/sum(num))

ggplot(degdf,
       aes(x=factor(deg.casl), y=pct, 
           group=factor(deg.casl.alt), fill=factor(deg.casl.alt))) +
  geom_bar(stat="identity", position="dodge", alpha=0.7) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
  labs(title = "deg.casl vs. deg.casl.alt",
       x = "deg.casl", y = "Percent",
       fill = "deg.casl.alt")

deg.casl v. deg.casl.imp

The difference here is using the strict + imputed active status indicator to classify one-time partners as casl. Respondent has answered either “yes, i expect to have sex with this partner again”, or “DK”, and the “DK” have been imputed with p=0.5 or 0.1 as being active, for repeat and one-time partners respectively.

# by degree value

imp <- d %>%
  group_by(deg.casl.imp) %>%
  summarize(num = n()) %>%
  mutate(pct = num/sum(num),
         measure = "imp") %>%
  rename(deg = deg.casl.imp) %>%
  select(-num)

bind_rows (orig, imp) %>%
  mutate(measure = forcats::fct_relevel(measure, "orig")) %>%
  ggplot(aes(x=factor(deg), y=pct, 
           group=measure, fill=measure)) +
  geom_bar(stat="identity", position="dodge", alpha=0.7) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
  labs(title = "deg.casl vs. deg.casl.imp",
       x = "degree", y = "Percent",
       fill = "measure")

# by original deg.casl
degdf <- d %>%
  group_by(deg.casl, deg.casl.imp) %>%
  summarize(num = n()) %>%
  mutate(pct = num/sum(num))

ggplot(degdf,
       aes(x=factor(deg.casl), y=pct, 
           group=factor(deg.casl.imp), fill=factor(deg.casl.imp))) +
  geom_bar(stat="identity", position="dodge", alpha=0.7) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
  labs(title = "deg.casl vs. deg.casl.imp",
       x = "deg.casl", y = "Percent",
       fill = "deg.casl.imp")

deg.tot v. deg.tot.alt

The difference here is using the strict active status indicator to classify one-time partners as casl. Respondent has answered “yes, i expect to have sex with this partner again”.

# by degree value
orig <- d %>%
  group_by(deg.tot) %>%
  summarize(num = n()) %>%
  mutate(pct = num/sum(num),
         measure = "orig") %>%
  rename(deg = deg.tot) %>%
  select(-num)

alt <- d %>%
  group_by(deg.tot.alt) %>%
  summarize(num = n()) %>%
  mutate(pct = num/sum(num),
         measure = "alt") %>%
  rename(deg = deg.tot.alt) %>%
  select(-num)

bind_rows (orig, alt) %>%
  mutate(measure = forcats::fct_relevel(measure, "orig")) %>%
  ggplot(aes(x=factor(deg), y=pct, 
           group=measure, fill=measure)) +
  geom_bar(stat="identity", position="dodge", alpha=0.7) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
  labs(title = "deg.tot vs. deg.tot.alt",
       x = "degree", y = "Percent",
       fill = "measure")

# by original deg.tot
degdf <- d %>%
  group_by(deg.tot, deg.tot.alt) %>%
  summarize(num = n()) %>%
  mutate(pct = num/sum(num))

ggplot(degdf,
       aes(x=factor(deg.tot), y=pct, 
           group=factor(deg.tot.alt), fill=factor(deg.tot.alt))) +
  geom_bar(stat="identity", position="dodge", alpha=0.7) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
  labs(title = "deg.tot vs. deg.tot.alt",
       x = "deg.tot", y = "Percent",
       fill = "deg.tot.alt")

deg.tot v. deg.tot.imp

# by degree value

imp <- d %>%
  group_by(deg.tot.imp) %>%
  summarize(num = n()) %>%
  mutate(pct = num/sum(num),
         measure = "imp") %>%
  rename(deg = deg.tot.imp) %>%
  select(-num)

bind_rows (orig, imp) %>%
  mutate(measure = forcats::fct_relevel(measure, "orig")) %>%
  ggplot(aes(x=factor(deg), y=pct, 
           group=measure, fill=measure)) +
  geom_bar(stat="identity", position="dodge", alpha=0.7) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
  labs(title = "deg.tot vs. deg.tot.imp",
       x = "degree", y = "Percent",
       fill = "measure")

# by original deg.tot
degdf <- d %>%
  group_by(deg.tot, deg.tot.imp) %>%
  summarize(num = n()) %>%
  mutate(pct = num/sum(num))

ggplot(degdf,
       aes(x=factor(deg.tot), y=pct, 
           group=factor(deg.tot.imp), fill=factor(deg.tot.imp))) +
  geom_bar(stat="identity", position="dodge", alpha=0.7) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
  labs(title = "deg.tot vs. deg.tot.imp",
       x = "deg.tot", y = "Percent",
       fill = "deg.tot.imp")

Mean degree

Again, only the mean degree for casual and total partnerships is affected.

orig <- d %>% 
  select(deg.casl, deg.tot) %>% 
  sjmisc::descr() 
alt <- d %>% 
  select(deg.casl.alt, deg.tot.alt) %>% 
  sjmisc::descr()
imp <- d %>% 
  select(deg.casl.imp, deg.tot.imp) %>% 
  sjmisc::descr()


degs <- data.frame((bind_rows(orig[1,], alt[1,], imp[1,],
                              orig[2,], alt[2,], imp[2,]))) %>% 
  select(-c("label", "iqr", "skew")) %>%
  mutate(var = c(rep("deg.casl",3), rep("deg.tot",3)),
         type = rep(c("orig", "alt", "imp"),2)
         ) %>%
  rename(status = var,
         measure = type,
         NA.pct = NA.prc,
         median = md,
         tr.mean = trimmed)

This just summarizes the changes in the degree distributions. Bottom line:

casl mean degree is 31% higher using the alt measure and 51% higher using the imp measure
total mean degree is 18% higher using the alt measure and 30% higher using the imp measure

degs %>%
  mutate(status = rep("", 6)) %>%
  kable(caption= "Mean degree estimate comparisons", 
        digits = c(0,0,2,2,2,2,2,1,2)) %>%
  pack_rows("deg.casl", 1,3) %>%
  pack_rows("deg.tot", 4,6) %>%
  kable_styling(full_width=F, position="center",
                bootstrap_options = c("striped"))

Mean degree estimate comparisons
status	measure	n	NA.pct	mean	sd	se	median	tr.mean	range
deg.casl
	orig	4904	0	0.59	1.01	0.01	0	0.36	5 (0-5)
	alt	4904	0	0.77	1.17	0.02	0	0.52	5 (0-5)
	imp	4904	0	0.89	1.23	0.02	0	0.65	5 (0-5)
deg.tot
	orig	4904	0	1.00	1.08	0.02	1	0.81	5 (0-5)
	alt	4904	0	1.19	1.21	0.02	1	0.99	5 (0-5)
	imp	4904	0	1.30	1.27	0.02	1	1.11	5 (0-5)

Concurrency

Again, only influences concurrency fractions for casual and total partnerships.

orig <- d %>% 
  select(conc.casl, conc.tot) %>% 
  sjmisc::descr() 
alt <- d %>% 
  select(conc.casl.alt, conc.tot.alt) %>% 
  sjmisc::descr()
imp <- d %>% 
  select(conc.casl.imp, conc.tot.imp) %>% 
  sjmisc::descr()


conc <- data.frame((bind_rows(orig[1,], alt[1,], imp[1,],
                              orig[2,], alt[2,], imp[2,]))) %>% 
  select(-c("label", "iqr", "skew")) %>%
  mutate(var = c(rep("conc.casl",3), rep("conc.tot",3)),
         type = rep(c("orig", "alt", "imp"),2)
         ) %>%
  rename(status = var,
         measure = type,
         NA.pct = NA.prc,
         median = md,
         tr.mean = trimmed)

This also just summarizes the changes in the degree distributions. Bottom line:

casl concurrency is 41% higher using the alt measure and 64% higher using the imp measure
total concurrency is 28% higher using the alt measure and 47% higher using the imp measure

conc %>%
  mutate(status = rep("", 6)) %>%
  kable(caption= "Concurrency estimate comparisons", 
        digits = c(0,0,2,2,2,2,2,1,2)) %>%
  pack_rows("conc.casl", 1,3) %>%
  pack_rows("conc.tot", 4,6) %>%
  kable_styling(full_width=F, position="center",
                bootstrap_options = c("striped"))

Concurrency estimate comparisons
status	measure	n	NA.pct	mean	sd	se	median	tr.mean	range
conc.casl
	orig	4904	0	0.15	0.36	0.01	0	0.06	1 (0-1)
	alt	4904	0	0.21	0.41	0.01	0	0.14	1 (0-1)
	imp	4904	0	0.24	0.43	0.01	0	0.18	1 (0-1)
conc.tot
	orig	4904	0	0.21	0.41	0.01	0	0.13	1 (0-1)
	alt	4904	0	0.27	0.44	0.01	0	0.21	1 (0-1)
	imp	4904	0	0.30	0.46	0.01	0	0.26	1 (0-1)

Duration distn for casl partnerships

For this, we need to contstruct the duration variable for the one-time partners classified as active (either strict, or imputed), because the ARTnet scripts set duration to missing for all one-time partners. The raw variable needed is OFENDMM, as all these cases start and end in the survey year. Units for duration is weeks. This is now done in the make_artnet_data script.

orig <- l %>% 
  filter(ptype==2) %>% select(duration, active.strict) %>% 
  group_by(active.strict) %>%
  sjmisc::descr() 
alt <- l %>% 
  filter(ptype.alt==2) %>% select(duration.alt, active.imp) %>% 
  group_by(active.imp) %>%
  sjmisc::descr()
imp <- l %>% 
  filter(ptype.imp==2) %>% select(duration.imp, active.imp) %>% 
  group_by(active.imp) %>%
  sjmisc::descr()

# NB: alt and imp are the same for ended partnerships
durs <- data.frame((bind_rows(orig[2], alt[2], imp[2],
                              orig[1], alt[1]))) %>% 
  select(-c("label", "iqr", "skew")) %>%
  mutate(var = c(rep("active",3), rep("ended",2)),
         type = c("orig", "alt", "imp", "orig", "alt")
         ) %>%
  rename(status = var,
         measure = type)

Table

The effect of most interest is the impact on the estimated mean age active partnerships:

mean age of active partnerships is 28% lower using the alt measure and 30% lower using the imp measure

There is less effect on the estimated duration of ended partnerships.

durs %>%
  mutate(status = rep("", nrow(durs))) %>%
  kable(caption= "Duration estimate comparisons, by active status", 
        digits = c(0,0,2,1,1,1,1,1,0)) %>% 
  pack_rows(durs$status[1], 1,3) %>%
  pack_rows(durs$status[4], 4,nrow(durs)) %>%
  kable_styling(full_width=F, position="center",
                bootstrap_options = c("striped")) %>%
  add_footnote(label="NA.prc range is 0-100%", notation="symbol")

Duration estimate comparisons, by active status
status	measure	n	NA.prc	mean	sd	se	md	trimmed	range
active
	orig	3642	0.5	132.8	196.1	3.2	70.6	89	1946 (0.14-1946.14)
	alt	5669	0.4	96.0	171.3	2.3	31.9	57	1946.02 (0.13-1946.14)
	imp	5852	0.4	93.4	169.2	2.2	30.3	54	1946.02 (0.13-1946.14)
ended
	orig	2303	0.7	70.7	139.4	2.9	21.9	41	2143.69 (0.02-2143.71)
	alt	1688	0.6	68.4	143.0	3.5	21.6	38	2143.69 (0.02-2143.71)
^* NA.prc range is 0-100%

Plots

All

colors <- c("orig"="cadetblue", "alt"="goldenrod", "imp" = "purple")

ggplot() +
  geom_density(data=l %>% filter(ptype.imp==2),
               aes(x=duration.imp, fill="imp")) +
  geom_density(data=l %>% filter(ptype.alt==2),
               aes(x=duration.alt, fill="alt"), show.legend = FALSE) +
  geom_density(data=l %>% filter(ptype==2),
               aes(x=duration, fill="orig"), show.legend = FALSE) +
  xlim(0,200) +
  labs(title = "Duration distribution comparisons: All partnerships",
       caption = "ARTnet Data",
       x = "Weeks",
       fill = "Legend") +
  scale_fill_manual(breaks=c("orig","alt","imp"),
                    values = adjustcolor(colors, alpha.f = 0.3)) +
  guides(fill = guide_legend(override.aes = list(alpha = 0.3)))

Active

ggplot() +
  geom_density(data=l %>% filter(ptype.imp==2 & active.imp==1),
               aes(x=duration.imp, fill="imp")) +
  geom_density(data=l %>% filter(ptype.alt==2 & active.imp==1),
               aes(x=duration.alt, fill="alt"), show.legend = FALSE) +
  geom_density(data=l %>% filter(ptype==2 & active==1),
               aes(x=duration, fill="orig"), show.legend = FALSE) +
  xlim(0,200) +
  labs(title = "Duration distribution comparisons: Active partnerships",
       caption = "ARTnet Data",
       x = "Weeks",
       fill = "Legend") +
  scale_fill_manual(breaks=c("orig","alt","imp"),
                    values = adjustcolor(colors, alpha.f = 0.3)) +
  guides(fill = guide_legend(override.aes = list(alpha = 0.3)))

Ended

ggplot() +
  geom_density(data=l %>% filter(ptype.imp==2 & active.imp==0),
               aes(x=duration.imp, fill="imp")) +
  geom_density(data=l %>% filter(ptype.alt==2 & active.imp==0),
               aes(x=duration.alt, fill="alt"), show.legend = FALSE) +
  geom_density(data=l %>% filter(ptype==2 & active==0),
               aes(x=duration, fill="orig"), show.legend = FALSE) +
  xlim(0,200) +
  labs(title = "Duration distribution comparisons: Ended partnerships",
       caption = "ARTnet Data",
       x = "Weeks",
       fill = "Legend") +
  scale_fill_manual(breaks=c("orig","alt","imp"),
                    values = adjustcolor(colors, alpha.f = 0.3)) +
  guides(fill = guide_legend(override.aes = list(alpha = 0.3)))

Effects of alternate approaches to coding partnership active status

source code: makeDSdata repository

M.Morris

2021-09-01

Intro

The problem

The coding alternatives

Raw survey responses

Active

Plot

Table

One-time

Plot

Table

Alternate definitions: comparison

Active

Plot

Table

Partner type

Plot

ptype v. ptype.alt

ptype v. ptype.imp

Tables

ptype v. ptype.alt

ptype v. ptype.imp

Impact on key measures: Comparison

Degree distributions

Plots

deg.casl v. deg.casl.alt

deg.casl v. deg.casl.imp

deg.tot v. deg.tot.alt

deg.tot v. deg.tot.imp

Mean degree

Concurrency

Duration distn for casl partnerships

Table

Plots

All

Active

Ended