In this activity you will explore the above research question. You will use simulation to model the process of randomly assigning subjects to groups, assuming dolphins have no effect on depression. You will repeatedly randomly assign subjects to groups and record a statistic after each randomization. By plotting these statistics as a distribution, we can see the range of results produced by many different random assignments. We will then assess statistical significance by locating the result actually observed in the study within this null distribution, to see whether it is common or rare.
Antonioli and Reveley (2005) investigated whether swimming with dolphins was therapeutic for patients suffering from clinical depression. The researchers recruited 30 subjects aged 18-65 with a clinical diagnosis of mild to moderate depression through announcements on the internet, radio, newspapers, and hospitals in the U.S. and Honduras. Subjects were required to discontinue use of any antidepressant drugs or psychotherapy four weeks prior to the experiment, and throughout the experiment. These 30 subjects went to an island off the coast of Honduras, where they were randomly assigned to one of two treatment groups. Both groups engaged in one hour of swimming and snorkeling each day, but one group (Dolphin Therapy) did so in the presence of bottlenose dolphins and the other group (Control) did not. At the end of two weeks, the level of depression for each subject was evaluated, as it had been at the beginning of the study, and each subject was categorized as experiencing improvement in their depression symptoms or not. (Afterwards, the control group had a one-day session with dolphins.)
Is this an observational study or an experiment? Discuss with your neighbors.
What are the explanatory and response variables in this study?
The contingency table as it appears in Chance & Rossman (2011) summarizes the study results.
| | Dolphin Therapy | Control Group | Total |
|---|---|---|---|
| Showed improvement | 10 | 3 | 13 |
| Did not show improvement | 5 | 12 | 17 |
| Total | 15 | 15 | 30 |
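
The counts in the table can be turned into the proportion of improvers within each group, which is what a stacked bar chart of these data displays. The following is a minimal Python sketch of that arithmetic; the variable names are illustrative and not part of the original activity.

```python
# Counts copied from the contingency table of study results.
improved = {"Dolphin Therapy": 10, "Control Group": 3}
group_size = {"Dolphin Therapy": 15, "Control Group": 15}

# Conditional proportion of improvers within each treatment group.
proportion = {g: improved[g] / group_size[g] for g in improved}

# Difference in proportions (dolphin minus control).
difference = proportion["Dolphin Therapy"] - proportion["Control Group"]

for g, p in proportion.items():
    print(f"{g}: {improved[g]}/{group_size[g]} = {p:.3f}")
print(f"Difference in proportions: {difference:.3f}")
```
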
What is the number of subjects who showed improvement in the Dolphin Group?
Is this number big enough to convince us that dolphin therapy is effective? Think carefully here. How often would we get 10 improvers by random chance alone (just by luck)? We need to have something to compare to know if 10 is unusual or not.
Could we expect to see 10 in the dolphin therapy group with substantial improvement, even if the dolphin therapy was not effective? Discuss with your neighbors.
We will use a randomization simulation to model how these results are expected to vary, if the dolphin therapy had no effect. This hypothetical null distribution will give us something to compare to!
Key Assumption: 13 of the 30 people in the study would see a substantial improvement, regardless of whether they swam with dolphins or not. This is consistent with the idea that whether people improved was not related to the group they were put in (our null hypothesis).
Key Question: How unlikely is it for the random assignment process alone to produce 10 or more people with substantial improvement in the dolphin therapy group?
If the observed results would rarely occur in a world where dolphin therapy had no effect, then we have evidence that the dolphin treatment was effective.
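
One hypothetical reshuffle under the Key Assumption can be sketched in Python. This is only an illustration of the card-shuffling idea, assuming 13 "improved" cards and 17 "not improved" cards are shuffled and the first 15 are dealt to the dolphin group; the function name `reshuffle_once` and the seed are made up for this example.

```python
import random

# One card per subject: 13 improvers and 17 non-improvers, fixed in advance
# (the Key Assumption: improvement does not depend on the group).
cards = ["improved"] * 13 + ["not improved"] * 17

def reshuffle_once(rng):
    """Shuffle the 30 cards, deal the first 15 to the dolphin group,
    and return the number of improvers dealt to that group."""
    deck = cards[:]          # copy so the original card list is untouched
    rng.shuffle(deck)
    dolphin_group = deck[:15]
    return dolphin_group.count("improved")

rng = random.Random(2005)    # arbitrary seed so the run is repeatable
dolphin_improvers = reshuffle_once(rng)
control_improvers = 13 - dolphin_improvers  # the remaining improvers

print("Improvers in dolphin group:", dolphin_improvers)
print("Improvers in control group:", control_improvers)
```

The two printed counts fill the "Showed improvement" row of one reshuffled table; the "Did not show improvement" row follows because each group has 15 subjects.
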
Reshuffle 1:
| | Dolphin Therapy | Control Group | Total |
|---|---|---|---|
| Showed improvement | | | |
| Did not show improvement | | | |
| Total | 15 | 15 | 30 |
With your partner(s), repeat this hypothetical random assignment, reshuffling four times and recording four more contingency tables for groupings that “could have been”. Record your results in a notebook or on a sheet of paper (you can just copy down the empty table above).
For each shuffle, write down the number of subjects in the dolphin group who showed improvement.
| Reshuffle | Number of improvers in the dolphin group |
|---|---|
| 1. | |
| 2. | |
| 3. | |
| 4. | |
| 5. | |
Pool your results with the class by adding them to the number line on the whiteboard. This randomization distribution is what we expect to observe if the dolphin therapy had no effect. This is also called the null distribution. Now we have something to compare our observed results to!
Does it seem like the observed result (10 or more with substantial improvement in the dolphin group) would be surprising to see from the randomization procedure? How often did the randomization procedure produce 10 or more?
We really need to do this simulated process hundreds or thousands of times. We can’t feasibly do this with cards, so let’s use technology.
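
As a rough picture of what the technology does, the reshuffling can be repeated many times in a few lines of Python. This is a sketch under the same no-effect assumption (13 improvers, 17 non-improvers, 15 subjects per group), not the applet's actual implementation; `simulate_null`, the seed, and the text dotplot are choices made for this example.

```python
import random
from collections import Counter

def simulate_null(reps, seed=1):
    """Return the number of improvers in the dolphin group for each of
    `reps` random assignments, assuming therapy has no effect."""
    rng = random.Random(seed)
    deck = [1] * 13 + [0] * 17      # 1 = improved, 0 = did not improve
    counts = []
    for _ in range(reps):
        rng.shuffle(deck)
        counts.append(sum(deck[:15]))  # first 15 cards form the dolphin group
    return counts

null_counts = simulate_null(1000)
tally = Counter(null_counts)

# Crude text dotplot: one star for every 5 shuffles at that count.
for k in sorted(tally):
    print(f"{k:2d} | {'*' * (tally[k] // 5)}")
```
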
The left panel shows the original data from the study, along with a stacked bar chart. Hopefully you notice that there is a larger proportion of successes in the dolphin group. Change the Statistic on the left side to Cell 1 Count so we keep track of the number who improved in the dolphin therapy group. Then check the Show Shuffle Options box and click Shuffle. Note that the applet repeats what you just did with the cards. It also adds a square dot to the dotplot on the right for the statistic from that shuffle.
Click Show Table in the left panel and we see the original table on the left and the last of the reshuffled tables on the right.
Now enter 995 for the Number of shuffles for a total of 1,000 simulations. Next to Count samples greater than or equal to in the right panel, type in the result we observed in our original study (10).
In what proportion of your 1,000 simulated random assignments were the results as extreme as or more extreme than the result in the original study? That is, what is the count in the tail of interest divided by 1,000? This is the p-value based on our randomization simulation!
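
The tail proportion described above can also be computed directly. The Python sketch below estimates the p-value as the share of simulated assignments with 10 or more improvers in the dolphin group. As a cross-check that is not part of the original activity, it also computes the exact tail probability from the hypergeometric distribution, which describes this random assignment when 13 improvers are fixed in advance. Function names and the seed are illustrative, and your simulated value will differ somewhat from run to run.

```python
import random
from math import comb

def simulated_p_value(observed=10, reps=1000, seed=7):
    """Share of random assignments with at least `observed` improvers
    in the dolphin group, assuming therapy has no effect."""
    rng = random.Random(seed)
    deck = [1] * 13 + [0] * 17           # 1 = improved, 0 = did not improve
    hits = 0
    for _ in range(reps):
        dolphin_group = rng.sample(deck, 15)   # 15 subjects drawn at random
        if sum(dolphin_group) >= observed:
            hits += 1
    return hits / reps

def exact_p_value(observed=10):
    """Exact P(X >= observed) when 15 of 30 subjects are chosen at random
    and 13 of the 30 are improvers (hypergeometric tail)."""
    tail = sum(comb(13, k) * comb(17, 15 - k) for k in range(observed, 14))
    return tail / comb(30, 15)

p_sim = simulated_p_value()
p_exact = exact_p_value()
print(f"Simulated p-value: {p_sim:.3f}")
print(f"Exact p-value:     {p_exact:.4f}")
```

If the simulation is behaving as intended, the two values should be close, and the gap should tend to shrink as the number of shuffles grows.
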
Is this p-value small enough to convince you that the original observed data provide evidence that the dolphin therapy is indeed effective or not? Briefly discuss with your neighbors and explain.
Can we say that dolphin therapy caused the reduced depression based on this study design? Why or why not?
To what population are you willing to generalize, if any?