PEER staff
10/27/17
Today I want to achieve two things:
Discuss the standards for evidence-based practice used in Mississippi.
Show examples of those standards applied to adult criminal justice programs in use – or available for use – in this state.
“Evidence-based practice” is kind of like “the right outcome” of a trial.
Nobody's against it – but people have very different opinions of what it is!
And they're alike in another way: not all of those opinions are created equal.
Here's what I hear a lot, and I'm sure you do too:
“Our program definitely works. Just look at Timmy! He went through it, and turned his whole life around!”
With enough participants and normal conditions, you're pretty much guaranteed to have a few Timmies even in a bad program!
That's why stories about Timmy and his friends shouldn't convince you. In other words:
Anecdotes aren't good evidence!
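To see why, here's a minimal simulation (in Python – the 30% base rate and the 200 participants are invented for illustration): a program that does literally nothing still produces dozens of success stories.

import random

random.seed(1)

# Hypothetical: a program with NO effect at all. Suppose 30% of
# participants would turn things around on their own (a made-up rate).
BASE_RATE = 0.30
N_PARTICIPANTS = 200

# Count how many "success stories" a do-nothing program collects.
successes = sum(random.random() < BASE_RATE for _ in range(N_PARTICIPANTS))
print(f"Timmies produced by a do-nothing program: {successes} of {N_PARTICIPANTS}")

Run it and you get roughly 60 Timmies – plenty of heartwarming anecdotes, zero evidence that the program caused any of them.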
MISS. CODE ANN. §27-103-159 gives some relevant definitions, including:
“Evidence-based program” shall mean a program or practice that has had multiple site random controlled trials across heterogeneous populations demonstrating that the program or practice is effective for the population.
An evidence-based program has had:
Multiple trials
At multiple sites
With random assignment and control groups
Across heterogeneous populations
Demonstrating that it is effective for the population
All of these requirements ensure that some common-sense questions are answered.
The importance of effectiveness and generalizability should be pretty clear.
Why do statistical trials? Why not just compare the numbers?
Imagine two programs doing the same thing, each reporting its own success numbers.
Simple numeric comparisons – my number is bigger than your number – are meaningless without context!
Rigorous statistical trials provide that context.
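To make this concrete, here's a hedged sketch (in Python; all the numbers are invented): Program A reports 12 successes out of 20, Program B reports 10 out of 20. A's number is bigger – but a permutation test asks how often shuffling the labels at random would produce a gap at least that large.

import random

random.seed(2)

a = [1] * 12 + [0] * 8    # Program A: 12 of 20 succeed (60%)
b = [1] * 10 + [0] * 10   # Program B: 10 of 20 succeed (50%)
observed = sum(a) / len(a) - sum(b) / len(b)

# Shuffle away the program labels and see how often chance alone
# hands one "program" a gap as big as the one we observed.
pool = a + b
TRIALS = 10_000
hits = 0
for _ in range(TRIALS):
    random.shuffle(pool)
    diff = sum(pool[:20]) / 20 - sum(pool[20:]) / 20
    if diff >= observed:
        hits += 1

print(f"Observed gap: {observed:.0%}")
print(f"Share of shuffles with a gap that big: {hits / TRIALS:.0%}")

With samples this small, pure luck matches the observed gap more than a third of the time. The bigger number, by itself, told us almost nothing.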
But one more point: Why randomized, controlled trials?
The short answer: RCTs are our best method of establishing that A causes B.
Imagine you’re a researcher for a shoe company; you’re testing a running shoe that is supposed to shave time off of your sprint.
So you set up a test: Runners in your shoes versus runners in some different shoe.
After statistical analysis, it turns out the group with your shoe crossed the finish line significantly before the other group!
So we've now satisfied the “trial” requirement. Congratulations!
But wait: It turns out that you had your group running 100m, while the comparison group ran 200m!
Obviously, this comparison wasn’t fair.
Even if the results look good, the shoe doesn’t seem to be the reason.
When you hear people talk about “controlling for confounding variables,” this is all they mean!
(Statistical) control = making sure everybody has the same starting line before comparing them.
It’s basic fairness!
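Here's a minimal sketch of that rigged test (in Python; the speeds, distances, and group sizes are all made up). The shoe does nothing – every runner moves at about the same speed – yet the confounded comparison makes it look like a miracle product.

import random
from statistics import mean

random.seed(3)

# Every runner covers roughly 8 m/s, no matter which shoe they wear.
def finish_time(distance_m):
    speed = random.gauss(8.0, 0.5)   # m/s – identical for both shoes
    return distance_m / speed

our_shoe = [finish_time(100) for _ in range(30)]     # our group ran 100m
other_shoe = [finish_time(200) for _ in range(30)]   # theirs ran 200m!

print(f"Our shoe:   {mean(our_shoe):.1f} s average")
print(f"Other shoe: {mean(other_shoe):.1f} s average")

Our shoe “wins” by a dozen seconds – entirely because of the confounding variable (distance), not the shoe.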
There are several ways to control for confounding variables. For instance:
Matching
Stratification
Regression
Propensity scores
These methods of control can be very sophisticated. But there's a problem:
“… the golden rule of causal analysis: No causal claim can be established by a purely statistical method, be it propensity scores, regression, stratification, or any other distribution-based design.”
-Judea Pearl, “Causality,” p. 350
Well-conducted random assignment ensures that all possible confounding variables – measured or not – are distributed among conditions by chance alone, which is to say, there’s no systematic correlation between any trait and group membership!
Which means the groups, overall, start and finish on the same lines…
Which lets us assume that if they finish at different times, it's because of the program!
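Here's the same idea as a sketch (in Python; the “fitness” scores and group sizes are invented): flip a fair coin for each of 1,000 runners, then check whether a hidden trait ended up correlated with group membership.

import random
from statistics import mean

random.seed(4)

# A hidden trait nobody measured: baseline fitness.
runners = [random.gauss(50, 10) for _ in range(1000)]

# Random assignment: a coin flip that knows nothing about fitness.
group_a, group_b = [], []
for fitness in runners:
    (group_a if random.random() < 0.5 else group_b).append(fitness)

print(f"Group A: mean fitness {mean(group_a):.1f} (n={len(group_a)})")
print(f"Group B: mean fitness {mean(group_b):.1f} (n={len(group_b)})")

The group means land within a fraction of a point of each other – and the same holds for any trait, measured or not, because the coin flip ignored all of them.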
The MS standard for evidence-based practice is the gold standard. Research quality drops off dramatically as these requirements are relaxed.
Gold is rare, though. What if we don't have any and still need to act?
MISS. CODE ANN. §27-103-159 also provides definitions of less rigorous alternatives.
But these definitions are very loose!
We've adopted an existing scale to rate research below the MS standard of evidence:
The Maryland Scientific Methods scale!
Described by Farrington et al. (2002) in Evidence-Based Crime Prevention.
It's a five-point ordinal scale – 1 is the weakest, 5 is the strongest!
It rates our general ability to draw causal conclusions from a study.
It's not safe to make causal inferences from any study below level 3 – the lowest level that requires a comparable comparison group!
So that's where we've drawn our line for “High-quality research”…
(Although you should always want the gold standard if possible!)
Coalition for Evidence-Based Policy (2013). Randomized Controlled Trials Commissioned by the Institute of Education Sciences Since 2002: How Many Found Positive Versus Weak or No Effects. Retrieved from http://coalition4evidence.org/wp-content/uploads/2013/06/IES-Commissioned-RCTs-positive-vs-weak-or-null-findings-7-2013.pdf
Farrington, D.P., Gottfredson, D.C., Sherman, L.W., & Welsh, B.C. (2002). The Maryland Scientific Methods Scale. In Farrington, D.P., MacKenzie, D.L., Sherman, L.W., & Welsh, B.C. (Eds.), Evidence-Based Crime Prevention (pp. 13-21). London: Routledge.
Ioannidis, J.P.A. (2005). Contradicted and Initially Stronger Effects in Highly Cited Clinical Research. Journal of the American Medical Association, 294(2), 218-228.
Manzi, J. (2012). Uncontrolled: The Surprising Payoff of Trial-and-Error for Business, Politics, and Society. New York: Perseus Books Group.
Pearl, J. (2009). Causality (2nd ed.). Cambridge: Cambridge University Press.
Zia, M. I., Siu, L. L., Pond, G. R., & Chen, E. X. (2005). Comparison of Outcomes of Phase II Studies and Subsequent Randomized Control Studies Using Identical Chemotherapeutic Regimens. Journal of Clinical Oncology, 23(28), 6982-6991.