We ran a small A/B test on the consent text for pilot version 5. Version A uses the regular flow: consent information is delivered sentence by sentence, with check-in points in the middle so that participants interact along the way. Version B delivers all consent information in two-sentence blocks and asks for consent only at the very end.
The outcome of interest is the dropoff rate; lower is better.
## # A tibble: 3 × 5
## chatbot_abtest_version n consent dropoff dropoff_rate
## <chr> <int> <dbl> <dbl> <dbl>
## 1 A 1147 865 282 0.246
## 2 B 1130 953 177 0.157
## 3 <NA> 173 3 170 0.983
##
## Call:
## lm(formula = consent_binary ~ chatbot_abtest_version, data = ab_consent)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.98295 0.01705 0.06108 0.06108 0.06108
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.982955 0.006618 148.53 < 2e-16 ***
## chatbot_abtest_versionB -0.044038 0.009043 -4.87 1.21e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1963 on 1893 degrees of freedom
## (555 observations deleted due to missingness)
## Multiple R-squared: 0.01237, Adjusted R-squared: 0.01185
## F-statistic: 23.72 on 1 and 1893 DF, p-value: 1.208e-06
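As a cross-check on the consent-stage dropoff rates, the A/B difference can also be tested directly from the counts in the first table with a two-proportion z-test. The following is a minimal sketch in Python (standard library only; the report itself was produced in R). The function name `two_proportion_ztest` is ours and is not part of the original analysis.

```python
from math import sqrt, erfc

def two_proportion_ztest(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (p1, p2, z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value under the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p_value = erfc(abs(z) / sqrt(2))
    return p1, p2, z, p_value

# Dropoff counts and group sizes taken from the consent table (versions A and B)
rate_a, rate_b, z, p_value = two_proportion_ztest(282, 1147, 177, 1130)
print(f"dropoff A = {rate_a:.3f}, dropoff B = {rate_b:.3f}, z = {z:.2f}, p = {p_value:.2g}")
```

The two rates reproduce the `dropoff_rate` column for versions A and B (0.246 and 0.157); rows with a missing version are left out of the comparison.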
## # A tibble: 3 × 5
## chatbot_abtest_version n full_complete dropoff dropoff_rate
## <chr> <int> <dbl> <dbl> <dbl>
## 1 A 1147 702 445 0.388
## 2 B 1130 746 384 0.34
## 3 <NA> 173 3 170 0.983
##
## Call:
## lm(formula = full_complete_binary ~ chatbot_abtest_version, data = ab_consent)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.170e-16 -3.170e-16 -3.170e-16 0.000e+00 2.363e-13
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.000e+00 2.347e-16 4.261e+15 <2e-16 ***
## chatbot_abtest_versionB 3.171e-16 3.269e-16 9.700e-01 0.332
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.217e-15 on 1446 degrees of freedom
## (1002 observations deleted due to missingness)
## Multiple R-squared: 0.4998, Adjusted R-squared: 0.4995
## F-statistic: 1445 on 1 and 1446 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = enjoyable_numeric ~ chatbot_abtest_version, data = df_B)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.6505 -0.6393 0.3495 0.3607 0.3607
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.65050 0.02064 128.404 <2e-16 ***
## chatbot_abtest_versionB -0.01120 0.02878 -0.389 0.697
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5465 on 1442 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.000105, Adjusted R-squared: -0.0005884
## F-statistic: 0.1515 on 1 and 1442 DF, p-value: 0.6972
##
## Call:
## lm(formula = comfortable_numeric ~ chatbot_abtest_version, data = df_B)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.7660 0.2339 0.2339 0.2446 0.2446
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.76605 0.01895 145.993 <2e-16 ***
## chatbot_abtest_versionB -0.01067 0.02640 -0.404 0.686
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5016 on 1443 degrees of freedom
## Multiple R-squared: 0.0001132, Adjusted R-squared: -0.0005797
## F-statistic: 0.1634 on 1 and 1443 DF, p-value: 0.6861
Main metrics we care about: cost per full survey complete, cost per full survey complete (unvax), and cost per full survey complete (unvax, open to treatment).
## Metric Version_A Version_B
## 1 Full Survey Complete $0.459 $0.426
## 2 Full Survey Complete (Unvax) $1.075 $0.996
## 3 Full Survey Complete (Unvax, Open to Treatment) $1.581 $1.394
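To put the cost differences on a common scale, the relative saving of Version B over Version A can be computed as (A − B) / A for each metric. This is a small Python sketch using the dollar figures from the table above; the helper name `relative_saving` is ours.

```python
# Cost per outcome in USD, copied from the table above: (Version A, Version B)
costs = {
    "Full Survey Complete": (0.459, 0.426),
    "Full Survey Complete (Unvax)": (1.075, 0.996),
    "Full Survey Complete (Unvax, Open to Treatment)": (1.581, 1.394),
}

def relative_saving(cost_a, cost_b):
    """Fraction by which Version B is cheaper than Version A."""
    return (cost_a - cost_b) / cost_a

savings = {metric: relative_saving(a, b) for metric, (a, b) in costs.items()}
for metric, s in savings.items():
    print(f"{metric}: Version B is {s:.1%} cheaper")
```

On these figures Version B is roughly 7% cheaper on the first two metrics and roughly 12% cheaper on the third.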
Takeaway
Goals for this section:
List of analyses in this section:
Goals for this section:
We start with a plot showing the distribution of vaccinated and unvaccinated participants in this pilot:
We asked all participants, vaccinated and unvaccinated, whether they have the motivation to get the COVID-19 vaccine and whether they have the ability to get it. We then sorted them into 8 segments based on vaccination status, motivation to get the vaccine, and ability to get the vaccine. The resulting distribution is shown below:
| Vaccination status | Able to get vaccine | Have motivation to get vaccine | Count | Percentage of total participants |
|---|---|---|---|---|
| unvax | yes | no | 193 | 49% |
| unvax | yes | yes | 27 | 7% |
| unvax | no | no | 15 | 4% |
| unvax | no | yes | 11 | 3% |
| vax | yes | yes | 80 | 20% |
| vax | yes | no | 63 | 16% |
| vax | no | yes | 6 | 2% |
| vax | no | no | 1 | 0% |
| Vaccination status | Able to get vaccine | Have motivation to get vaccine | Count | Percentage of total participants |
|---|---|---|---|---|
| unvax | yes | no | 403 | 28% |
| unvax | yes | yes | 126 | 9% |
| unvax | no | no | 62 | 4% |
| unvax | no | yes | 26 | 2% |
| vax | yes | yes | 500 | 35% |
| vax | yes | no | 291 | 20% |
| vax | no | no | 25 | 2% |
| vax | no | yes | 11 | 1% |
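The percentage column in these segment tables is each segment's count divided by the total across all 8 segments, rounded to a whole percent. The Python sketch below checks that rule against the first table, where percentages are reported, and then applies it to the counts of the second table. The helper name `segment_shares` is ours.

```python
def segment_shares(counts):
    """Return each count as a whole-number percentage of the total."""
    total = sum(counts)
    return [round(100 * c / total) for c in counts]

# First table: counts and the percentages reported alongside them
first_counts = [193, 27, 15, 11, 80, 63, 6, 1]
first_reported = [49, 7, 4, 3, 20, 16, 2, 0]
assert segment_shares(first_counts) == first_reported

# Second table: counts in row order (percentages computed with the same rule)
second_counts = [403, 126, 62, 26, 500, 291, 25, 11]
second_total = sum(second_counts)
second_shares = segment_shares(second_counts)
print(second_total, second_shares)
```

Because each share is rounded independently, the percentages in a table need not sum to exactly 100.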
Takeaways
Let’s investigate each impediment (motivation and ability) in detail and look at the distribution of reasons participants gave for it.
Takeaways
We asked: What’s the main reason you don’t want to be vaccinated?
Provided options:
The distribution of the answers is shown below:
We asked: is there a main reason why you think there isn’t a benefit?
Provided options:
The distribution of the answers is shown below:
We asked: is there a main reason why you think there is risk?
Provided options:
The distribution of the answers is shown below:
We asked: is there a main reason why against your belief?
Provided options:
The distribution of the answers is shown below:
We asked: What’s the main difficulty of getting vaccinated?
Provided options:
The distribution of the answers is shown below:
We asked: is there a main reason why there isn’t availability?
Provided options:
We asked: is there a main reason why there isn’t time?
Provided options:
We asked: is there a main reason why there isn’t money?
Provided options:
We mapped binary and ordinal demographic variables to continuous variables (with value 0, 1, 2,…).
How we did the mapping:
- ability: 1 if the participant has the ability to get the vaccine, 0 if not
- female: 1 if female, 0 if male
- country: 1 if the participant lives in South Africa, 0 if not
- income: 0 if the participant is unemployed, 1 if income < R5,000, 2 if income is R5,000 – R9,999, …, 6 if income > R100,000
- education: 1 if the participant’s education is less than high school, 2 if education is high school, …, 6 if education is a graduate degree
- religiosity: 1 if the participant is not very religious, 2 if somewhat religious, 3 if very religious
- politics: 1 if the participant is conservative, 2 if moderate, 3 if liberal
- location: 1 if the participant lives in a rural area, 2 if suburban, 3 if urban
- white: 1 if the participant is white or Caucasian, 0 if not

Note that here we aggregate free-text responses from all respondents. Therefore, N + Missing should equal number of respondents × number of free-text response questions. A missing value means that the respondent did not encounter that free-text question (either they chose an option that did not require a free-text response, or they followed a path that did not include it).
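The numeric coding of the demographic variables listed above can be sketched as a set of lookup tables. This Python sketch is illustrative only: the answer strings (for example "somewhat religious") are assumed spellings, not the survey's actual option labels, and `encode` is a hypothetical helper. Income and education are left out because the report elides their middle levels.

```python
# Ordinal variables: answer label -> numeric code (labels are assumed spellings)
ORDINAL_CODES = {
    "religiosity": {"not very religious": 1, "somewhat religious": 2, "very religious": 3},
    "politics": {"conservative": 1, "moderate": 2, "liberal": 3},
    "location": {"rural": 1, "suburban": 2, "urban": 3},
}

# Binary variables: the answer coded as 1; any other non-missing answer is coded 0
BINARY_POSITIVE = {
    "ability": "yes",
    "female": "female",
    "country": "south africa",
    "white": "white or caucasian",
}

def encode(variable, answer):
    """Map a raw answer to its numeric code; None for missing or unrecognised answers."""
    if answer is None:
        return None
    key = answer.strip().lower()
    if variable in BINARY_POSITIVE:
        return 1 if key == BINARY_POSITIVE[variable] else 0
    return ORDINAL_CODES[variable].get(key)

print(encode("politics", "Moderate"), encode("female", "Male"), encode("location", None))
```

Keeping the codes in dictionaries makes the assumed ordering of each scale explicit and easy to audit against the survey instrument.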
## version vax_status N Missing Mean SD Min Q1 Median Q3 Max
## 1 Pilot 4B Vaccinated 910 3890 41.73 44.81 2 12 28 55 333
## 2 Pilot 4B Unvaccinated 1566 6306 35.96 47.81 1 9 20 43 430
## 3 Pilot 5 Vaccinated 3978 17550 43.11 44.87 1 13 30 59 458
## 4 Pilot 5 Unvaccinated 2663 13379 40.96 60.37 1 4 22 54 784
##
## Call:
## lm(formula = nchar ~ version, data = free_text_combined)
##
## Residuals:
## Min 1Q Median 3Q Max
## -41.25 -31.08 -16.08 13.92 741.75
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 38.080 1.013 37.609 < 2e-16 ***
## versionPilot 5 4.167 1.186 3.512 0.000446 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 50.38 on 9115 degrees of freedom
## (41125 observations deleted due to missingness)
## Multiple R-squared: 0.001352, Adjusted R-squared: 0.001242
## F-statistic: 12.34 on 1 and 9115 DF, p-value: 0.0004464
##
## Call:
## lm(formula = nchar ~ version + vax_status + version * vax_status,
## data = free_text_combined)
##
## Residuals:
## Min 1Q Median 3Q Max
## -42.11 -31.11 -15.96 13.89 743.04
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 35.961 1.273 28.259 < 2e-16 ***
## versionPilot 5 4.996 1.604 3.115 0.00184 **
## vax_statusVaccinated 5.764 2.099 2.746 0.00604 **
## versionPilot 5:vax_statusVaccinated -3.612 2.449 -1.475 0.14022
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 50.36 on 9113 degrees of freedom
## (41125 observations deleted due to missingness)
## Multiple R-squared: 0.002496, Adjusted R-squared: 0.002168
## F-statistic: 7.601 on 3 and 9113 DF, p-value: 4.499e-05
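One way to read the interaction model is that its four coefficients reproduce the four group means in the summary table: the intercept is the reference cell (Pilot 4B, Unvaccinated), and the other cells are obtained by adding the relevant terms. The Python sketch below checks this with the numbers printed above; the dictionary keys and the helper `predicted_mean` are our own names.

```python
# Coefficients copied from the interaction model output above
coef = {
    "intercept": 35.961,            # Pilot 4B, Unvaccinated (reference cell)
    "pilot5": 4.996,                # versionPilot 5
    "vaccinated": 5.764,            # vax_statusVaccinated
    "pilot5_x_vaccinated": -3.612,  # interaction term
}

def predicted_mean(pilot5, vaccinated):
    """Fitted mean character count for a cell; arguments are 0/1 indicators."""
    return (coef["intercept"]
            + coef["pilot5"] * pilot5
            + coef["vaccinated"] * vaccinated
            + coef["pilot5_x_vaccinated"] * pilot5 * vaccinated)

# Group means from the summary table, keyed by (pilot5, vaccinated)
table_means = {(0, 0): 35.96, (1, 0): 40.96, (0, 1): 41.73, (1, 1): 43.11}

for (pilot5, vaccinated), observed in table_means.items():
    fitted = predicted_mean(pilot5, vaccinated)
    print(f"pilot5={pilot5} vaccinated={vaccinated}: fitted {fitted:.2f}, table {observed:.2f}")
```

The fitted and tabulated means agree to within rounding. The interaction coefficient is therefore the difference between the Pilot 5 effect among vaccinated respondents and the Pilot 5 effect among unvaccinated respondents.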
Note that here we aggregate free-text responses from all respondents. Therefore, N + Missing should equal number of respondents × number of free-text response questions. A missing value means that the respondent did not encounter that free-text question (either they chose an option that did not require a free-text response, or they followed a path that did not include it).
## version vax_status N Missing Mean SD Min Q1 Median Q3 Max
## 1 Pilot 4B Vaccinated 117 33 42.90 41.26 2 12 26 67 186
## 2 Pilot 4B Unvaccinated 148 98 28.51 37.55 1 9 17 35 264
## 3 Pilot 5 Vaccinated 635 193 47.21 45.16 1 16 34 63 314
## 4 Pilot 5 Unvaccinated 418 199 50.30 58.41 1 14 34 62 512
##
## Call:
## lm(formula = nchar ~ version, data = free_text_combined)
##
## Residuals:
## Min 1Q Median 3Q Max
## -47.44 -29.44 -14.44 13.99 463.56
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 34.864 2.999 11.626 < 2e-16 ***
## versionPilot 5 13.572 3.355 4.045 5.53e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 48.82 on 1316 degrees of freedom
## (523 observations deleted due to missingness)
## Multiple R-squared: 0.01228, Adjusted R-squared: 0.01153
## F-statistic: 16.36 on 1 and 1316 DF, p-value: 5.532e-05
##
## Call:
## lm(formula = nchar ~ version + vax_status + version * vax_status,
## data = free_text_combined)
##
## Residuals:
## Min 1Q Median 3Q Max
## -49.30 -30.30 -14.30 13.79 461.70
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 28.514 4.006 7.118 1.79e-12 ***
## versionPilot 5 21.783 4.661 4.673 3.27e-06 ***
## vax_statusVaccinated 14.384 6.028 2.386 0.01717 *
## versionPilot 5:vax_statusVaccinated -17.470 6.765 -2.582 0.00992 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 48.73 on 1314 degrees of freedom
## (523 observations deleted due to missingness)
## Multiple R-squared: 0.0173, Adjusted R-squared: 0.01505
## F-statistic: 7.709 on 3 and 1314 DF, p-value: 4.17e-05
Note that here we aggregate free-text responses from all respondents. Therefore, N + Missing should equal number of respondents × number of free-text response questions. A missing value means that the respondent did not encounter that free-text question (either they chose an option that did not require a free-text response, or they followed a path that did not include it).
## version vax_status N Missing Mean SD Min Q1 Median Q3 Max
## 1 Pilot 4B Vaccinated 142 1508 50.27 44.02 2 23 37.5 65 264
## 2 Pilot 4B Unvaccinated 530 2176 58.83 57.58 1 21 40.5 76 430
## 3 Pilot 5 Vaccinated 395 6229 59.44 46.53 2 26 50.0 81 297
## 4 Pilot 5 Unvaccinated 697 4239 66.78 67.87 1 25 50.0 85 713
##
## Call:
## lm(formula = nchar ~ version, data = free_text_combined)
##
## Residuals:
## Min 1Q Median 3Q Max
## -63.12 -38.02 -15.12 18.20 648.88
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 57.022 2.271 25.11 <2e-16 ***
## versionPilot 5 7.102 2.887 2.46 0.014 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 58.87 on 1762 degrees of freedom
## (14152 observations deleted due to missingness)
## Multiple R-squared: 0.003424, Adjusted R-squared: 0.002858
## F-statistic: 6.054 on 1 and 1762 DF, p-value: 0.01397
##
## Call:
## lm(formula = nchar ~ version + vax_status + version * vax_status,
## data = free_text_combined)
##
## Residuals:
## Min 1Q Median 3Q Max
## -65.78 -37.53 -14.83 18.22 646.22
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 58.832 2.554 23.033 <2e-16 ***
## versionPilot 5 7.947 3.389 2.345 0.0191 *
## vax_statusVaccinated -8.564 5.556 -1.541 0.1234
## versionPilot 5:vax_statusVaccinated 1.226 6.677 0.184 0.8544
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 58.8 on 1760 degrees of freedom
## (14152 observations deleted due to missingness)
## Multiple R-squared: 0.00698, Adjusted R-squared: 0.005287
## F-statistic: 4.124 on 3 and 1760 DF, p-value: 0.00633
Goals for this section:
Metrics explanation:
the total number of times our ad has been viewed
the total number of times our ad has been viewed
number of clicks / number of impressions
number of conversations / number of clicks
number of consents / number of conversations
number of forking section completed / number of consents
number of treatment section completed / number of forking section completed
number of demog section completed / number of treatment section completed
number of full chat completed / number of demog section completed
average number of characters in the best treatment explanation per full chat completed
average number of characters in impediment explanations per full chat completed
amount spent / number of impressions (in USD)
amount spent / number of clicks (in USD)
amount spent / number of full chat completed (in USD)
amount spent / number of full chat completed with unvaccinated participants (in USD)
amount spent / number of full chat completed with unvaccinated and open to treatment participants (in USD)
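The ratio and cost definitions above can be sketched as a single function over raw funnel counts. This Python example is illustrative only: the input numbers are made up for the demonstration and are not pilot data, and the function name and dictionary keys are our own labels rather than the metric names used in the report.

```python
def funnel_metrics(spend_usd, impressions, clicks, conversations, consents, full_completes):
    """Compute stage-to-stage rates and per-unit costs from raw funnel counts."""
    return {
        "clicks_per_impression": clicks / impressions,
        "conversations_per_click": conversations / clicks,
        "consents_per_conversation": consents / conversations,
        "cost_per_impression": spend_usd / impressions,
        "cost_per_click": spend_usd / clicks,
        "cost_per_full_complete": spend_usd / full_completes,
    }

# Hypothetical inputs, chosen only to make the arithmetic easy to follow
metrics = funnel_metrics(
    spend_usd=100.0,
    impressions=50_000,
    clicks=1_000,
    conversations=500,
    consents=400,
    full_completes=250,
)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Each rate uses the previous funnel stage as its denominator, so multiplying consecutive rates gives the overall conversion between any two stages.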
This table compares nine images (provided below the table) in terms of the metrics described above.
This table compares three ad impediment sources (vaccine is unnecessary vs. vaccine is risky vs. vaccine is inaccessible) in terms of the metrics described above.
This table compares three ad body text approaches in terms of the metrics described above: control (share your opinion) vs. airtime (take a short survey and earn airtime) vs. survey (take this short survey).