At the peak of the pandemic, our very good friend and colleague Nikolai Wenzel mentioned, in a nice conversation, that he was organizing a conference and facing the tough decision of which requirements to put in place. The options were: 1) requiring proof of vaccination, 2) requiring proof of a negative Covid test, and 3) not holding the conference at all. What a dilemma! He asked for my expert opinion as a statistician. We wanted to have the conference, of course (i.e., eliminate option 3), but options 1 and 2 didn’t provide enough “safety” for us to feel fully confident about organizing it.
I told him on the spot that we could design a protocol based on repeated testing and that, theoretically, we could have as much safety as desired. It’s a matter of testing enough and doing the math! He seemed to like the idea (maybe testing twice sounded OK), but he was also skeptical, and he did not follow up with me about the abstract math. However, I did the math for my Mathematical Statistics class in Spring 2022 (of course!) and then for my REU (Research Experience with Undergraduates) students during the summer. After all, this is a classical problem of conditional probability and Bayes’ theorem that I have used in my exams for years at Ursinus College, Instituto de Estudios Superiores de Administracion (IESA), and Universidad Simon Bolivar (USB). By the way, if you have been my student in the past and this problem was in your homework or exam, I would like to know.
The following is a question, verbatim, from my previous exams and homework assignments.
“(5 points) A medical condition occurs in 20% of the population. There is an inexpensive market test which reports 30% false negatives and 10% false positives. Different test applications can be considered independent. a) (2 points) Calculate the probability that a person who shows positive in the test actually suffers the condition? b) (2 points) A hypochondriac takes the test twice. What is the likelihood that he really suffers the condition if one of the two test results is positive, and the other is negative? c) (1 points) If two different people take the test and both come out positive, what is the probability that both suffer from the condition? Extra credit: (d) (1 point EC) The hypochondriac takes the test 100 times. What is the likelihood that s/he/they really suffers the condition if sixty times the test is positive and forty times the test is negative? (e) (1 point EC) Suppose the hypochondriac actually has the condition, if s/he/they takes the test 100 times, what is the expected number of positive results? (And if s/he/they does not have the condition?). Use that to explain your answer in (d). Leave all your answers indicated if necessary.”
For a mathematical solution, come to our classes, IDSIRI workshops, office hours, or the Math and CS Coffee, Tea, and QED. We’d be glad to show you. Hint: Recall the Bayes Theorem!
We can visualize the solution to this problem using R, either by running simulations or by plotting the mathematical solution with R tools. Together with my REU students Ed Coleman, Jhavon Innocent, and Sarah Kircher, we created a Shiny App that shows the solution as a function of the variables involved in the problem: 1) number of tests taken, 2) prevalence, 3) sensitivity, and 4) specificity. Link to the Shiny Web App here. Our recommendation is to open the link in a different tab or window.
Alternatively, we can try to embed the Web App within this website:
Using this app, can you come up with an inclusive protocol based on repeated testing?
First, to get the idea, notice that with repeated testing we can decide, with as much confidence as we want, whether or not a person has the condition (COVID-19 in this case). How confident we can be depends on the number of tests, the quality of the tests (the sensitivity and the specificity), and the prevalence of the condition in the population. The most important parameters, however, are the number of tests taken and the quality of the tests. In the app above, the more separated the green and red curves are, the easier it is to tell healthy people (green distribution) apart from those with the condition (red distribution), and the line graph shows the probability of having the condition given the results of the tests.
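The dependence on the number of tests can be sketched in a few lines of code. The snippet below is only an illustration, written in Python rather than R, and it is not the code behind the Shiny App. The function name `posterior_infected` and the parameter values in the loop (5% prevalence, 85% sensitivity, 98% specificity) are assumptions chosen for the example. It applies Bayes’ theorem with a binomial likelihood, assuming the test applications are independent.

```python
from math import comb


def posterior_infected(k, n, prevalence, sensitivity, specificity):
    """P(condition | k positives out of n independent tests), by Bayes' theorem."""
    # If the person has the condition, positives ~ Binomial(n, sensitivity).
    like_sick = comb(n, k) * sensitivity**k * (1 - sensitivity) ** (n - k)
    # If the person is healthy, every positive is a false positive.
    fpr = 1 - specificity
    like_healthy = comb(n, k) * fpr**k * (1 - fpr) ** (n - k)
    numerator = prevalence * like_sick
    return numerator / (numerator + (1 - prevalence) * like_healthy)


# Illustrative (assumed) parameters: the probability of having the
# condition after n negative results in a row.
for n in (1, 2, 3, 5):
    print(n, posterior_infected(0, n, 0.05, 0.85, 0.98))
```

With these assumed values, each additional negative result shrinks the probability of having the condition, which is the sense in which we can get as much safety as desired by testing enough.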
Next week, we will replicate the graph above using plot_ly so that we can hover over with the mouse and see the actual calculated probabilities in each case. Then, we can change the parameters (especially the quality of the tests) to model more realistic COVID-19 scenarios.
Information regarding the quality of the tests (sensitivity and specificity) can be obtained from the manufacturer, from the CDC, or from the FDA. We can also find this information compiled by the New York Times, for example: https://www.nytimes.com/wirecutter/reviews/at-home-covid-test-kits/
For convenience, here is a list of the NYT’s top four recommended tests, with their sensitivity, specificity, and documentation:
Abbott BinaxNow COVID-19 Antigen Self Test. Sensitivity: 84.6% (PDF) within seven days of symptom onset. Specificity: 98.5% (PDF) within seven days of symptom onset. Tests included: two. App needed: no. Omicron variant detected: yes. Cost: about $20 at the time of publication. Availability: Amazon, CVS, Walgreens.
Access Bio CareStart COVID-19 Antigen Home Test. Sensitivity: 87% (PDF) within seven days of symptom onset. Specificity: 98% (PDF) within seven days of symptom onset. Tests included: two. App needed: no. Omicron variant detected: yes. Cost: about $20 at the time of publication. Availability: Target.
BD Veritor At-Home COVID-19 Test (app required). Sensitivity: 84.6% (PDF), Specificity: 99.8% (PDF). Tests included: two. App needed: yes (iOS and Android). Omicron variant detected: yes. Cost: about $30 at the time of publication. Availability: Amazon.
Acon Flowflex COVID-19 Antigen Home Test. Sensitivity: 93% (PDF), Specificity: 100% (PDF). Tests included: one. App needed: no. Omicron variant detected: yes. Cost: about $10 at the time of publication. Availability: CVS, Target, Walgreens, WellBefore.
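As a rough sketch of how these numbers could feed a protocol, the Python snippet below computes the probability that a person is infected despite testing negative n times in a row with the same test. The sensitivity and specificity values are the ones listed above (some of them reported only for the first seven days after symptom onset). The 5% prevalence, the names `TESTS` and `PREVALENCE`, and the function `prob_infected_after_negatives` are assumptions made for the example, and repeated tests are treated as independent.

```python
# (sensitivity, specificity) as listed above for each test.
TESTS = {
    "Abbott BinaxNow": (0.846, 0.985),
    "Access Bio CareStart": (0.87, 0.98),
    "BD Veritor": (0.846, 0.998),
    "Acon Flowflex": (0.93, 1.00),
}

PREVALENCE = 0.05  # assumed for illustration, not taken from the post


def prob_infected_after_negatives(n, prevalence, sensitivity, specificity):
    """P(infected | n negative results in a row), assuming independent tests."""
    # Infected and missed n times: each miss has probability 1 - sensitivity.
    missed = prevalence * (1 - sensitivity) ** n
    # Healthy and correctly cleared n times: each has probability specificity.
    cleared = (1 - prevalence) * specificity**n
    return missed / (missed + cleared)


for name, (se, sp) in TESTS.items():
    row = [prob_infected_after_negatives(n, PREVALENCE, se, sp) for n in (1, 2, 3)]
    print(f"{name:22s}", "  ".join(f"{p:.6f}" for p in row))
```

Changing `PREVALENCE` or the range of n lets us explore how many negative results a protocol would need to reach a chosen level of safety.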
Let’s use plot_ly to visualize the probabilities for each of the settings above, at each of our TV computer stations! A direct link to the plot_ly tutorial is here: https://rpubs.com/hugomoises/RepeatedTesting2
Note: For now, we are assuming that a person repeats testing using the same test. We can, however, improve our protocol by mixing up tests. Why is this better? Let’s have an in-person conversation about that. With different tests, the total number of positive results is a sum of binomials with different probabilities of success, which is not binomial anymore. We can still model it easily and use Bayes’ theorem in the same way we did above; we just need to know the results obtained with each test.
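A minimal sketch of the mixed-test calculation, again in Python and under stated assumptions: the function name `posterior_mixed`, the 5% prevalence, and the example results are hypothetical, and all test applications are assumed independent given the person’s true status. Because we know the results with each test, the likelihood is simply a product over the tests, and the binomial coefficients cancel in Bayes’ theorem.

```python
def posterior_mixed(prevalence, results):
    """P(condition | results from several different tests).

    results: list of (sensitivity, specificity, positives, total) tuples,
    one tuple per kind of test that was used.
    """
    like_sick = 1.0
    like_healthy = 1.0
    for se, sp, k, n in results:
        # Probability of these k positives and n - k negatives if infected.
        like_sick *= se**k * (1 - se) ** (n - k)
        # Probability of the same results if healthy (false positives only).
        like_healthy *= (1 - sp) ** k * sp ** (n - k)
    numerator = prevalence * like_sick
    return numerator / (numerator + (1 - prevalence) * like_healthy)


# Hypothetical example: one negative BinaxNow result and one negative
# CareStart result, with an assumed prevalence of 5%.
print(posterior_mixed(0.05, [(0.846, 0.985, 0, 1), (0.87, 0.98, 0, 1)]))
```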