Designing studies, experiments, surveys (Ch 14)
“The goal of experimental design is to eliminate BIAS and reduce SAMPLING ERROR when ESTIMATING [calculating the mean] and TESTING [hypothesis testing w/ a p value] the effects of one variable on another [looking for causal connections]” (pg 47)
Today: focus on true experiments
(but principles apply to all studies)
Wed: Focus on observational studies
BIAS and SAMPLING ERROR
Bias = wrong answer
= inaccurate
= “systematic discrepancy” (pg 6) (overshoot or undershoot)
= caused by properties of instrument (not calibrated),
experimental design, or stat procedure
Sampling error = creates noise around answer
= makes estimates imprecise
= due to random variation in sampling unit
= (like roll of dice, flip of coin)
Classic Accuracy vs. Precision Illustration:
Bullseye Diagram
See chapter 1 in book (pg 6)

Lake Erie Stakeholders:
PA, OH, NY, MI, Ontario, USFWS, Canada FWS, Trout Unlimited
Thought experiment:
Everyone:
-wants to estimate abundance of steelhead
-uses the exact same method (method “A”) using randomized sampling
-BUT draw diff. random numbers to locate sample points

If the REAL number of steelhead in the lake is 100 million, what range of numbers might you expect from these 8 stakeholders if “method A” is…
1)accurate and precise?
2)accurate but not precise?
3)precise but not accurate?
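The three cases can be sketched with a small simulation. This is only an illustration in Python (not from the book): the bias and noise values are made up, and each stakeholder's estimate is assumed to be the true value plus a shared bias plus random sampling noise.

```python
import random
import statistics

TRUE_ABUNDANCE = 100_000_000  # the "real" number of steelhead from the thought experiment

def simulate_estimates(bias, noise_sd, n_stakeholders=8, seed=1):
    """Return one abundance estimate per stakeholder.

    bias     -- systematic over/undershoot shared by everyone using the method
    noise_sd -- spread caused by each stakeholder drawing different sample points
    """
    rng = random.Random(seed)
    return [rng.gauss(TRUE_ABUNDANCE + bias, noise_sd) for _ in range(n_stakeholders)]

# Hypothetical bias / noise values chosen only to make the three cases visible
scenarios = {
    "accurate and precise":     simulate_estimates(bias=0,          noise_sd=1_000_000),
    "accurate but not precise": simulate_estimates(bias=0,          noise_sd=30_000_000),
    "precise but not accurate": simulate_estimates(bias=40_000_000, noise_sd=1_000_000),
}

for name, estimates in scenarios.items():
    print(f"{name:26s} mean = {statistics.mean(estimates):>13,.0f}  "
          f"SD = {statistics.stdev(estimates):>12,.0f}")
```

Bias moves the mean of the 8 estimates away from 100 million; sampling error widens the spread among them.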
“Estimation” vs “Hypothesis testing”
Concepts are inter-related
Estimation: estimating the value of an unknown parameter / quantity
eg, number of steelhead in Lake Erie, population growth rate of Allegheny County, incidence of HIV in West Africa
Goal of estimation: calculate mean value from sample data that is accurate (the right answer)
and precise (un-ambiguous)
Testing: are 2 estimated values different from each other?
eg, are steelhead more abundant in PA or NY streams, is HIV declining over time
Involves a statistical model, p-values, etc
-Increasing sample size is easiest way to increase precision
-Random sampling best way to reduce bias
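A rough sketch of why sample size drives precision, assuming a made-up population of fish masses (mean 500 g, SD 100 g). The standard error of the mean is the sample SD divided by the square root of n, so it shrinks as n grows.

```python
import random
import statistics

rng = random.Random(42)

def standard_error(sample):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(sample) / len(sample) ** 0.5

# Hypothetical population: fish masses with mean 500 g and SD 100 g
for n in (5, 20, 80, 320):
    sample = [rng.gauss(500, 100) for _ in range(n)]
    print(f"n = {n:3d}  mean = {statistics.mean(sample):6.1f}  SE = {standard_error(sample):5.1f}")
```

Quadrupling the sample size roughly halves the standard error.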

Which value is most accurate (closest to the real value)? Which is most precise?
Why do randomized experiments?
Deals with “confounding” variables
“randomization minimizes the influence of confounding variables, allowing the experimenter to [conclusively] isolate the effects of the treatment variable” (pg 424) and be confident about causation.
Randomized sampling in observational studies & randomized allocation of treatments in experiments can be said to “break up” the effects of “confounding variables” (p 435)
Confounding variables
Definition:
“A confounding variable is a variable that masks or distorts the causal relationship between measured variables in a study.”
Consequences of confounding:
-Biased estimation of means
-Incorrect conclusions about causation
-can reverse the apparent direction of causation
Book Ex. of Confounding: Breastfeeding Studies
(Kramer et al 2002)
-Initial observational study: breast-fed babies weighed less @ 6 mo
-Later experimental study w/ randomization: breast-feds weighed more
-Confounding variable: misc, including socio-economic status
“w/ an experiment, random assignment of treatments to participants allows researchers to tease apart the effects of the explanatory variable. With random assignment, no confounding variables will be associated w/ treatment except by chance” (pg 435)
Example of confounding
Made up example: parasites & fish
Research Question: Does parasite infection reduce fish health & mass?
Say we notice a lot of sickly fish in a lake
Dissection indicates that they have intestinal parasites
We sample a bunch of fish and see that mass ~ parasite load

Aside on Causality: Proximate vs. Ultimate causes
Does temperature variation drive variation in parasite loads?
-Parasites are the proximate cause
-Temperature change is the real driver
-If lake temp drops, parasite abundance goes down, and fish health improves

Alternative hypothesis:
Temperature stress impacts fish health & immune system
Fish with compromised immune systems more likely to acquire parasites.
Parasites not causing poor health; poor health is resulting in parasite infection

KEY: An observational study – even if it uses random sampling – would have great difficulty in determining the right answer
only some kind of experiment could figure this out
OR a long-term study following individual fish over time
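The parasite example can be sketched as a simulation of the alternative hypothesis. Everything here is made up for illustration: fish mass depends only on temperature stress, parasites have no effect on mass at all, and stressed fish are simply more likely to be infected.

```python
import random
import statistics

rng = random.Random(7)

def fish_mass(temp_stress):
    """Mass (g) depends ONLY on temperature stress in this made-up world."""
    return 500 - 40 * temp_stress + rng.gauss(0, 10)

# Observational survey: stressed fish also pick up more parasites
obs_infected, obs_clean = [], []
for _ in range(500):
    stress = rng.uniform(0, 5)
    infected = rng.random() < stress / 5      # infection probability rises with stress
    (obs_infected if infected else obs_clean).append(fish_mass(stress))

# Randomized experiment: a coin flip, not stress, decides who is infected
exp_infected, exp_clean = [], []
for _ in range(500):
    stress = rng.uniform(0, 5)
    infected = rng.random() < 0.5
    (exp_infected if infected else exp_clean).append(fish_mass(stress))

obs_diff = statistics.mean(obs_infected) - statistics.mean(obs_clean)
exp_diff = statistics.mean(exp_infected) - statistics.mean(exp_clean)
print(f"Observational difference (infected - clean): {obs_diff:6.1f} g")
print(f"Experimental difference  (infected - clean): {exp_diff:6.1f} g")
```

The survey shows infected fish weighing much less even though parasites do nothing here; with random assignment the difference is close to zero.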

Problem w/ Experiments: Experimental Artifacts
Definition: “An experimental artifact is a bias in a measurement produced by unintended consequences of experimental procedures”(pg 425)
What experimental artifacts could occur with exclusion experiments?
-Turkey exclusion (Chips et al 2014)
-????

Chips et al 2014. Quantifying deer and turkey leaf litter disturbances in the eastern deciduous forest: have nontrophic effects of consumers been overlooked? Can. J. For. Res.
Book example of experimental artifact:
Seabirds submerged experimentally experience a drop in heart rate
interpreted as an adaptation
better experiments: no change in heart rate
old result: stress response due to being water-boarded
Solution to experimental artifacts:
-make experiments as natural as possible
-observational studies: do research in representative locations
But…
-Realism results in more random variation / noise (because nature is random and noisy)
-Random noise reduces precision of estimates (larger error bars)
You can’t conceive of every realistic thing to factor in (like turkeys…)
Lessons from Clinical Trials (14.2)
Reducing BIAS:
-Getting the right answer
Reducing SAMPLING ERROR:
-Getting an answer you can be confident in
(small error bars)
Key components of clinical trials to reduce BIAS:
1)Simultaneous CONTROLS
-Both treatments experience similar environmental conditions
-why is simultaneous important?
2)RANDOM treatment assignment
3)“BLINDING”
-Neither subject NOR researcher know treatment assignment
Key components to Reduce SAMPLING ERROR:
1)REPLICATION
-Multiple INDEPENDENT study units
2)“BALANCE”
-equal number of study units in each treatment
3)“BLOCKING”
-grouping by relevant factors
-very common consideration in experimental design
How to reduce bias (14.3)
Reducing bias: Simultaneous control groups
Definition: “a control group is a group of subjects who do not receive the treatment of interest but otherwise experience similar conditions as the treated subjects”
Types of controls
1)No treatment
-Drug vs. no drug
2)“Procedural control” / sham treatment
-Drug vs. saline treatment
-Insecticide vs. just solvent/surfactant
3)Positive control
-New drug w/ unknown effect vs. known powerful drug
Positive controls rare in ecology, not always possible, but probably useful
My experience: invasive species
Reducing bias: Randomization
-Applies to experiments and (ideally) field sampling, surveys, etc
-Randomization “Breaks the association between possible confounding variables & the explanatory variable to be assessed. Randomization doesn’t eliminate the variation contributed by confounding variables, only their correlation with the treatment. It ensures variation from confounding variables is spread more evenly between diff. treatment groups” (429)
-If it’s not done with a random number generator or flip of a coin, it’s not random!
-ANY OTHER METHOD of assigning treatment / selecting study unit could result in bias!
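A minimal sketch of letting a random number generator do the assignment, in Python. The function name `randomize_treatments` and the shrub labels are hypothetical; shuffling the units and dealing out treatments in turn is one simple way to do it, not the only one.

```python
import random

def randomize_treatments(unit_ids, treatments=("control", "treatment"), seed=None):
    """Randomly assign treatments to study units in (near-)equal numbers."""
    rng = random.Random(seed)
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)                      # the random number generator decides the order
    # Deal treatments out in turn over the shuffled units
    return {unit: treatments[i % len(treatments)] for i, unit in enumerate(shuffled)}

# Hypothetical study units: 10 shrubs labelled 1..10
assignment = randomize_treatments(range(1, 11), seed=2024)
for shrub in sorted(assignment):
    print(shrub, assignment[shrub])
```

Recording the seed makes the assignment reproducible without letting the researcher choose which unit gets which treatment.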
Confounding example:
Confounding due to an environmental gradient
-We want to know whether birds improve plant health by eating insects (aka provide an “ecosystem service”)
- Experiment: exclude birds from shrubs,
- When birds excluded, insect herbivores should be free to munch
-monitor plant reproductive output.
-Bird exclusion should reduce plant reproduction by increasing insect herbivory

-We use cages around shrubs with fence posts and netting
- Fence posts heavy: build all exclosures near road
- What if there is an un-observed soil nutrient gradient
- Nutrient levels highest near road
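A sketch of how the road gradient could bias the result. All numbers are invented: excluding birds is assumed to truly cost 10 seeds per shrub, and soil nutrients (and so seed output) are assumed to decline with distance from the road.

```python
import random
import statistics

rng = random.Random(3)
TRUE_EXCLUSION_EFFECT = -10   # hypothetical: excluding birds costs 10 seeds per shrub

def seed_output(distance_m, excluded):
    """Seeds per shrub: nutrients (and so seeds) decline with distance from the road."""
    nutrients_bonus = 50 - 0.4 * distance_m
    effect = TRUE_EXCLUSION_EFFECT if excluded else 0
    return 100 + nutrients_bonus + effect + rng.gauss(0, 5)

distances = [rng.uniform(0, 100) for _ in range(200)]   # shrub distances from road (m)

def estimated_effect(excluded_flags):
    """Mean seed output of caged shrubs minus mean of open shrubs."""
    caged = [seed_output(d, True) for d, e in zip(distances, excluded_flags) if e]
    open_ = [seed_output(d, False) for d, e in zip(distances, excluded_flags) if not e]
    return statistics.mean(caged) - statistics.mean(open_)

near_road = [d < 50 for d in distances]                  # convenient: cages near the road
randomized = [rng.random() < 0.5 for _ in distances]     # coin flip per shrub

print(f"True effect:          {TRUE_EXCLUSION_EFFECT}")
print(f"Cages near road:      {estimated_effect(near_road):6.1f}")
print(f"Randomized placement: {estimated_effect(randomized):6.1f}")
```

With all cages near the road, the nutrient gradient outweighs the true effect and the estimate even has the wrong sign; randomized placement recovers something close to -10.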

Reducing bias: Blinding
-Double blinding
Definition: “blinding is the process of concealing info. from participants (sometimes including researchers) about which individuals receive which treatments”
Rarely done – but probably should be - in ecology!

Aside: what are “recognition studies”?
“recognition studies typically perform intra- and inter colony aggression assays, with the a priori expectation that there should be little or no aggression among nestmates. Aggressive interactions between ants can include subtle behaviours such as mandible flaring and recoil, which can be hard to quantify, making these types of assays prone to confirmation bias.”
Source: “Confirmation Bias in Studies of Nestmate Recognition: A Cautionary Note for Research into the Behaviour of Animals” PLOS ONE
How to reduce the influence of sampling error (14.4)
-Sampling error = noise
-more noise = larger standard deviation / stand. error
-harder to detect ..
-differences between groups,
-trends over time / space
Statistical “Power”
-“Power” is the ability to detect a difference between groups or a trend when it is actually there
-A large sample size increases power
-A large difference between groups / steep trend essentially increases power
(but you don’t have control over this)
-Study design / stat. procedures can increase power
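Power can be estimated by simulation: generate many fake studies in which a difference really exists and count how often it is detected. A rough sketch, assuming normal data with SD = 10 and using |t| > 2 as a stand-in for p < 0.05 (an approximation, not an exact test, especially at small n).

```python
import random
import statistics

rng = random.Random(11)

def detects_difference(n, true_diff, sd=10.0):
    """Simulate one two-group study; 'detect' if |t| > 2 (a rough stand-in for p < 0.05)."""
    a = [rng.gauss(0, sd) for _ in range(n)]
    b = [rng.gauss(true_diff, sd) for _ in range(n)]
    se = (statistics.variance(a) / n + statistics.variance(b) / n) ** 0.5
    return abs(statistics.mean(b) - statistics.mean(a)) / se > 2

def estimated_power(n, true_diff, n_sims=2000):
    """Fraction of simulated studies that detect a difference that really exists."""
    return sum(detects_difference(n, true_diff) for _ in range(n_sims)) / n_sims

for n in (5, 10, 20, 40):
    print(f"n per group = {n:2d}  power ~ {estimated_power(n, true_diff=10):.2f}")
```

Raising `n` or `true_diff` raises the estimated power, matching the two bullets above.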
Reducing sampling error: Replication
Definition:
“Replication is the application of every treatment to multiple, INDEPENDENT experimental units.”
Reducing sampling error: Balance
Definition: “balanced experiments have equal sample sizes” (pg 434)
-lack of balance reduces power
-or, “unbalanced experiments have lower power”
-observational studies usually unbalanced (such is life)
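Why balance matters can be seen from the standard error of a difference between two means, which (assuming both groups share the same SD) is SD × sqrt(1/n1 + 1/n2). A small sketch comparing ways of splitting the same 20 study units; the splits are illustrative only.

```python
def se_of_difference(n1, n2, sd=1.0):
    """Standard error of the difference between two group means (equal SD assumed)."""
    return sd * (1 / n1 + 1 / n2) ** 0.5

# Same total of 20 study units, split in different ways
for n1, n2 in [(10, 10), (15, 5), (18, 2)]:
    print(f"n1 = {n1:2d}, n2 = {n2:2d}  SE of difference = {se_of_difference(n1, n2):.3f}")
```

For a fixed total, the equal split gives the smallest standard error, so the balanced design has the most power.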
A random and balanced experiment

AN UNbalanced experiment

Test-ish question: “What is the effect of balance on an experiment?”
Reducing sampling error: Blocking
1)Organizing study by major components of physical locations / natural groupings
2)AND including this information in statistical model!
3)within each block, study units receive each experimental treatment
4)Similar to “stratifying”
Blocking of Trillium Trail Deer Exclosures

What is advantageous about blocking for forest ecology experiments?
Why would a completely random plot location have had drawbacks?
Blocking’s easy forms: pairing & repeated measures (RM)
Example: A natural experiment using pairing for its controls




- What does this design accomplish?
- Would this be necessary if you were studying corn in a plowed field?
- How might this be converted to a true experiment?
Repeated measures (RM)
-similar to blocking / pairing
-measurements conducted before vs. after treatment delivered
-individuals serve as own controls
-common in human studies
-ex: heart rate b/f vs. after drinking Red Bull
Often/Ideally will still have un-treated controls in case any change occurs over time due to environmental conditions
Example: repeated measures
-unlike blocking, stats for simple pairing / RM are usually easy
-but, need to have enough replicates for sufficient “power”
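A sketch of the "easy stats" for pairing / RM: reduce each individual to one before-vs-after difference and analyze the differences. The heart-rate numbers are invented for illustration.

```python
import statistics

# Hypothetical heart rates (beats/min) for 6 people before and after the drink
before = [62, 70, 68, 75, 80, 66]
after  = [66, 73, 69, 80, 86, 67]

# Each individual serves as their own control: analyze the within-person differences
differences = [a - b for a, b in zip(after, before)]
mean_diff = statistics.mean(differences)
se_diff = statistics.stdev(differences) / len(differences) ** 0.5
t_stat = mean_diff / se_diff            # paired t statistic, df = n - 1

print("differences:", differences)
print(f"mean difference = {mean_diff:.2f}, SE = {se_diff:.2f}, "
      f"t = {t_stat:.2f}, df = {len(differences) - 1}")
```

Because differences among people cancel out within each pair, the noise that remains is only the person-to-person variation in the response to the treatment.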
Blocking - Book’s definition:
“blocking is the grouping of experimental units that have similar properties. Within each block, treatments are randomly assigned to experimental units.”
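The definition can be sketched as code: randomize separately inside every block. The function name, block names and plot labels are hypothetical, and this version assumes each block holds exactly one unit per treatment.

```python
import random

def randomized_block_design(blocks, treatments, seed=None):
    """Within each block, randomly assign every treatment to one experimental unit.

    blocks -- dict mapping block name -> list of unit ids (one unit per treatment)
    """
    rng = random.Random(seed)
    design = {}
    for block, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError(f"block {block!r} needs {len(treatments)} units")
        order = list(treatments)
        rng.shuffle(order)                    # fresh randomization inside every block
        for unit, treatment in zip(units, order):
            design[unit] = (block, treatment)
    return design

# Hypothetical forest blocks, each holding two plots
blocks = {"ridge": ["R1", "R2"], "slope": ["S1", "S2"], "valley": ["V1", "V2"]}
design = randomized_block_design(blocks, ["fenced", "control"], seed=5)
for unit, (block, treatment) in design.items():
    print(block, unit, treatment)
```

Every block ends up with each treatment once, so block-to-block differences cannot be confused with the treatment effect.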
Problem with blocking:
-Contamination of nearby treatments
-Edge effects
eg: In my fenced deer exclusion plots, roots grow out of the plot into the control and compete with plants still being eaten
Pseudo-replication: when replication isn’t really replication
See Interleaf 2
Involves “sub-sampling”
Each sample is NOT a true sample
What is the source of the problem?

This 2nd design avoids pseudo-replication using “blocks”
Easiest way to analyze this experiment:
-take the mean of each sampling unit within each block
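A sketch of the averaging step, using invented measurements. Sub-samples from the same plot are collapsed to one mean per experimental unit (block × treatment), and those means become the data.

```python
import statistics
from collections import defaultdict

# Hypothetical records: (block, treatment, sub-sample measurement)
records = [
    ("block1", "fenced", 12.0), ("block1", "fenced", 14.0), ("block1", "fenced", 13.0),
    ("block1", "control", 8.0), ("block1", "control", 9.0), ("block1", "control", 10.0),
    ("block2", "fenced", 15.0), ("block2", "fenced", 17.0), ("block2", "fenced", 16.0),
    ("block2", "control", 11.0), ("block2", "control", 12.0), ("block2", "control", 13.0),
]

# Group the sub-samples by their true experimental unit (block x treatment)
by_unit = defaultdict(list)
for block, treatment, value in records:
    by_unit[(block, treatment)].append(value)

# One mean per experimental unit: these means, not the raw sub-samples, are the data
unit_means = {unit: statistics.mean(values) for unit, values in by_unit.items()}
for unit, mean in sorted(unit_means.items()):
    print(unit, mean)
print("n for analysis:", len(unit_means), "not", len(records))
```

Treating all 12 sub-samples as independent replicates would overstate the sample size; the honest n here is the number of experimental units.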

Experiments with more than one factor (14.5)
Definition: “A factor is a single treatment variable whose effects are of interest” (438)
Levels of a factor
-control vs. treatment, no fertilizer vs. fertilizer
Different factors
-Nitrogen vs. Phosphorus (N vs P)
R uses the factor() command; we’ll get to know this well
“Factorial designs”
Definition: a “factorial design investigates all treatment combinations of 2 or more variables. A factorial design can measure interactions between treatment variables”
Factorial fertilization experiment treatments:
1)Control (no fertilizer)
2)Nitrogen
3)Phosphorus
4)Nitrogen + Phosphorus
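"All treatment combinations" can be enumerated mechanically. A small sketch in Python; the level labels ("no N", "N", etc.) are just illustrative names.

```python
from itertools import product

# Two factors, each with two levels (absent / present)
factors = {
    "nitrogen":   ["no N", "N"],
    "phosphorus": ["no P", "P"],
}

# A factorial design uses every combination of levels
treatments = list(product(*factors.values()))
for i, combo in enumerate(treatments, start=1):
    print(i, " + ".join(combo))
```

Two factors with two levels each give 2 × 2 = 4 treatment combinations, including the control (neither nutrient) and the combined N + P treatment.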
Interactions
Definition: “An interaction between 2 explanatory variables means that the effect of 1 variable on the response depends on the state of a 2nd variable” (440)
When both factors occur, their effect can be “greater than the sum of their parts” (or smaller than the sum)
-often denoted with a multiplication sign
-eg, “Nitrogen x Phosphorus interaction”
-or, “Nitrogen by Phosphorus interaction”
Interactions can occur between
1) 2 categorical variables (easy to understand)
-N treatment (Y/N) x P treatment (Y/N)
2) A categorical and continuous variable (not too bad)
-N treatment (Y/N) x rainfall (in cm)
3) Continuous x Continuous (mind bending!)
-Amount of N (grams) x rain fall (in cm)
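For the categorical x categorical case, the interaction can be read off the four cell means: ask whether the effect of N is the same with and without P. A sketch with invented biomass values.

```python
# Hypothetical mean plant biomass (g) in each cell of a 2 x 2 factorial experiment
cell_means = {
    ("no N", "no P"): 10.0,
    ("N",    "no P"): 14.0,
    ("no N", "P"):    12.0,
    ("N",    "P"):    25.0,
}

effect_of_n_without_p = cell_means[("N", "no P")] - cell_means[("no N", "no P")]
effect_of_n_with_p    = cell_means[("N", "P")]    - cell_means[("no N", "P")]

# If there were no interaction, the effect of N would be the same with or without P
interaction = effect_of_n_with_p - effect_of_n_without_p

print("Effect of N without P:", effect_of_n_without_p)
print("Effect of N with P:   ", effect_of_n_with_p)
print("Interaction contrast: ", interaction)
```

A contrast of zero would mean the two effects simply add; here adding N helps far more when P is also present.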
Example interaction: categorical x categorical
Example interaction: categorical x continuous
What if you [ ] do experiments (14.6)
-can’t, it’s not relevant, unethical
Case-control
related to pairing
Using covariates / control variables
Considerations when Choosing a sample size (14.7)
Precision
Power
Data loss
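The three considerations can be turned into quick planning arithmetic. This sketch uses common rules of thumb for comparing two group means, which are an assumption here rather than something stated in these notes: about 8 × (SD / margin)² units per group for a 95% CI of roughly ± margin, and about 16 × (SD / D)² per group for roughly 80% power to detect a difference D at α = 0.05. The planning values are made up.

```python
import math

def n_for_precision(sd, margin):
    """Units per group so the 95% CI for a difference in means is roughly +/- margin.

    Uses half-width ~ 2 * sd * sqrt(2 / n), i.e. n ~ 8 * (sd / margin)**2.
    """
    return math.ceil(8 * (sd / margin) ** 2)

def n_for_power(sd, difference):
    """Units per group for ~80% power at alpha = 0.05 (rule of thumb: 16 * (sd / D)**2)."""
    return math.ceil(16 * (sd / difference) ** 2)

def inflate_for_loss(n, expected_loss):
    """Start with extra units so about n remain after losing a fraction expected_loss."""
    return math.ceil(n / (1 - expected_loss))

# Hypothetical planning values: SD = 10, want to detect or bound a difference of 5
print("precision:", n_for_precision(sd=10, margin=5))
print("power:    ", n_for_power(sd=10, difference=5))
print("with 20% data loss:", inflate_for_loss(n_for_power(sd=10, difference=5), 0.20))
```

These are rough starting points; a proper power analysis for the actual design and test would replace them.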