Designing studies, experiments, surveys (Ch 14)

“The goal of experimental design is to eliminate BIAS and reduce SAMPLING ERROR when ESTIMATING [calculating the mean] and TESTING [hypothesis testing w/ a p value] the effects of one variable on another [looking for causal connections]” (pg 47)


Today: focus on true experiments

(but principles apply to all studies)


Wed: Focus on observational studies





























BIAS and SAMPLING ERROR

Bias = wrong answer

= inaccurate

= “systematic discrepancy” (pg 6) (overshoot or undershoot)

= caused by properties of instrument (not calibrated),

experimental design, or stat procedure


Sampling error = creates noise around answer

= makes estimates imprecise

= due to random variation in sampling unit

= (like roll of dice, flip of coin)





























Classic Accuracy vs. Precision Illustration:

Bullseye Diagram

See chapter 1 in book (pg 6)





























Lake Erie Stakeholders:

PA, OH, NY, MI, Ontario, USFWS, Canada FWS, Trout Unlimited


Thought experiment:

Everyone:

-wants to estimate abundance of steelhead

-uses the exact same method (method “A”) using randomized sampling

-BUT draws different random numbers to locate sample points






























If the REAL number of steelhead in the lake is 100 million, what range of numbers might you expect from these 8 stakeholders if “method A” is…


1)accurate and precise?

2)accurate but not precise?

3)precise but not accurate?
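A minimal simulation sketch of the thought experiment (Python, for illustration). Only the true value of 100 million comes from the question above; the bias and noise values are made-up assumptions chosen to make the three cases easy to tell apart.

```python
import random
import statistics

TRUE_ABUNDANCE = 100.0  # millions of steelhead, from the thought experiment

def stakeholder_estimates(bias, noise_sd, n_stakeholders=8, seed=1):
    """Simulate one estimate per stakeholder, in millions of fish.

    bias     : systematic over/undershoot shared by everyone using the method
    noise_sd : spread caused by each stakeholder drawing different sample points
    """
    rng = random.Random(seed)
    return [rng.gauss(TRUE_ABUNDANCE + bias, noise_sd) for _ in range(n_stakeholders)]

# Hypothetical bias and noise values, not taken from any real survey
scenarios = {
    "accurate and precise": stakeholder_estimates(bias=0, noise_sd=2),
    "accurate, not precise": stakeholder_estimates(bias=0, noise_sd=30),
    "precise, not accurate": stakeholder_estimates(bias=40, noise_sd=2),
}

for name, estimates in scenarios.items():
    print(f"{name:22s} mean={statistics.mean(estimates):6.1f} "
          f"sd={statistics.stdev(estimates):5.1f} "
          f"range={min(estimates):.0f} to {max(estimates):.0f}")
```

Bias shifts all 8 estimates in the same direction; sampling error spreads them out around wherever the method is centered.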





























“Estimation” vs “Hypothesis testing”


Estimation: estimating the value of an unknown parameter / quantity

eg, number of steelhead in Lake Erie, population growth rate of Allegheny county, incidence of HIV in West Africa

Goal of estimation: calculate mean value from sample data that is accurate (the right answer)

and precise (un-ambiguous)


Testing: are 2 estimated values different from each other?

eg, are steelhead more abundant in PA or NY streams, is HIV declining over time

Involves a statistical model, p-values, etc





























-Increasing sample size is easiest way to increase precision

-Random sampling best way to reduce bias
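A rough sketch of the sample size point (Python, for illustration): repeat the same study many times and see how much the sample mean bounces around. The population mean of 50 and SD of 10 are arbitrary assumptions.

```python
import math
import random
import statistics

def sd_of_sample_means(n, population_sd=10.0, n_repeats=1000, seed=42):
    """Repeat the same study n_repeats times and return the SD of the sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_repeats):
        sample = [rng.gauss(50.0, population_sd) for _ in range(n)]
        means.append(sum(sample) / n)
    return statistics.stdev(means)

results = {}
for n in (25, 100, 400):
    results[n] = sd_of_sample_means(n)
    theory = 10.0 / math.sqrt(n)  # standard error of the mean = sd / sqrt(n)
    print(f"n={n:3d}  simulated={results[n]:.2f}  sd/sqrt(n)={theory:.2f}")
```

Quadrupling the sample size only halves the noise in the estimate, so precision gets expensive quickly.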

Which value is most precise relative to the real value?





























Why do randomized experiments?

Deals with “confounding” variables

“randomization minimizes the influence of confounding variables, allowing the experimenter to [conclusively] isolate the effects of the treatment variable” (pg 424) and be confident about causation.


Randomized sampling in observational studies & randomized allocation of treatments in experiments can be said to “break up” the effects of “confounding variables” (p 435)





























Confounding variables


Definition:

“A confounding variable is a variable that masks or distorts the causal relationship between measured variables in a study.”

Consequences of confounding:

-Biased estimation of means

-Incorrect conclusions about causation

-can reverse the apparent direction of causation















Book Ex. of Confounding: Breastfeeding Studies

(Kramer et al 2002)

-Initial observational study: breast-fed babies weighed less @ 6 mo

-Later experimental study w/ randomization: breast-feds weighed more

-Confounding variable: misc, including socio-economic status

“w/ an experiment, random assignment of treatments to participants allows researchers to tease apart the effects of the explanatory variable. With random assignment, no confounding variables will be associated w/ treatment except by chance” (pg 435)





























Example of confounding

Made up example: parasites & fish

Research Question: Does parasite infection reduce fish health & mass?

Say we notice a lot of sickly fish in a lake

Dissection indicates that they have intestinal parasites

We sample a bunch of fish and see that mass ~ parasite load





























Aside on Causality: Proximate vs. Ultimate causes

Does temperature variation drive variation in parasite loads?

-Parasites are the proximate cause

-Temperature change is the ultimate cause (the real driver)

-If lake temp drops, parasite abundance goes down, and fish health improves





























Alternative hypothesis:

Temperature stress impacts fish health & immune system

Fish with compromised immune systems more likely to acquire parasites.

Parasites not causing poor health; poor health is resulting in parasite infection
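A made-up simulation of this alternative hypothesis (Python, for illustration). All numbers are assumptions. In the code, parasites have NO effect on mass, yet mass and parasite load still come out strongly correlated because both depend on health.

```python
import random

def pearson_r(x, y):
    """Pearson correlation coefficient, written out to avoid extra dependencies."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)
mass, parasite_load = [], []
for _ in range(500):
    stress = rng.random()                          # temperature stress, 0 to 1
    health = 1.0 - stress + rng.gauss(0, 0.1)      # stress lowers health
    # Mass depends on health only; parasites never appear in this line
    mass.append(300 + 200 * health + rng.gauss(0, 20))
    # Unhealthy fish pick up more parasites (health -> parasites, not the reverse)
    parasite_load.append(max(0.0, 30 * (1 - health) + rng.gauss(0, 3)))

r = pearson_r(mass, parasite_load)
print(f"correlation between mass and parasite load: {r:.2f}")
```

A survey of this lake would show heavily parasitized fish weighing less, which looks exactly like what you would see if parasites were the cause.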





























KEY: An observational study, even if it uses random sampling, would have great difficulty in determining the right answer

only some kind of experiment could figure this out

OR a long-term study following individual fish over time





























Problem w/ Experiments: Experimental Artifacts


Definition: “An experimental artifact is a bias in a measurement produced by unintended consequences of experimental procedures” (pg 425)


What experimental artifacts could occur with exclusion experiments?

-Turkey exclusion (Chips et al 2014)

-????

Chips et al 2014. Quantifying deer and turkey leaf litter disturbances in the east. decid. forest: have nontrophic effects of consumers been overlooked? Can. Jrn of For. Res.





























Book example of experimental artifact:

Seabird submersion


Seabirds submerged experimentally experience a drop in heart rate

interpreted as an adaptation

better experiments: no change in heart rate

old result: stress response due to being water-boarded





























Solution to experimental artifacts:

-make experiments as natural as possible

-observational studies: do research in representative locations

-ask for input when designing experiments


But…

-Realism results in more random variation / noise (because nature is random and noisy)

-Random noise reduces precision of estimates (larger error bars)

-You can’t conceive of every realistic thing to factor in (like turkeys…)





























Lessons from Clinical Trials (14.2)


Reducing BIAS:


-Getting the right answer

Reducing SAMPLING ERROR:


-Getting an answer you can be confident in (small error bars)





























Lessons from Clinical Trials (14.2)

Key components of clinical trials to reduce BIAS:

1)Simultaneous CONTROLS

-Both treatment and control groups experience similar environmental conditions

-why is simultaneous important?


2)RANDOM treatment assignment


3)“BLINDING”

-Neither subject NOR researcher knows treatment assignment





























Lessons from Clinical Trials (14.2)

Key components to Reduce SAMPLING ERROR:

1)REPLICATION

-Multiple INDEPENDENT study units


2)“BALANCE”

-equal number of study units in each treatment


3)“BLOCKING”

-grouping by relevant factors

-very common consideration in experimental design





























How to reduce bias (14.3)


Reducing bias: Simultaneous control groups

Definition: “a control group is a group of subjects who do not receive the treatment of interest but otherwise experience similar conditions as the treated subjects”


Types of controls

1)No treatment

-Drug vs. no drug

2)“Procedural control” / sham treatment

-Drug vs. saline treatment

-Insecticide vs. just solvent/surfactant

3)Positive control

-New drug w/ unknown effect vs. known powerful drug


Positive controls rare in ecology, not always possible, but probably useful

My experience: invasive species





























Reducing bias: Randomization

-Applies to experiments and (ideally) field sampling, surveys, etc

-Randomization “Breaks the association between possible confounding variables & the explanatory variable to be assessed. Randomization doesn’t eliminate the variation contributed by confounding variables, only their correlation with the treatment. It ensures variation from confounding variables is spread more evenly between diff. treatment groups” (429)


-If it’s not done with a random number generator or flip of a coin, it’s not random!

-ANY OTHER METHOD of assigning treatment / selecting study unit could result in bias!
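A minimal sketch of letting a random number generator do the assigning (Python, for illustration). The function name and the shrub labels are hypothetical. Fixing the seed just makes the plan reproducible so it can be written into the field notes.

```python
import random

def assign_treatments(units, treatments, seed=None):
    """Randomly assign treatments to study units in equal numbers."""
    if len(units) % len(treatments) != 0:
        raise ValueError("units cannot be split evenly among treatments")
    rng = random.Random(seed)
    # One label per unit, equal numbers of each treatment, then shuffle
    labels = list(treatments) * (len(units) // len(treatments))
    rng.shuffle(labels)
    return dict(zip(units, labels))

shrubs = [f"shrub_{i:02d}" for i in range(1, 13)]   # hypothetical study units
plan = assign_treatments(shrubs, ["exclosure", "control"], seed=2024)
for shrub, treatment in plan.items():
    print(shrub, treatment)
```

The researcher never chooses which unit gets which treatment, so convenience, habit, or wishful thinking cannot leak into the design.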





























Confounding example:

Confounding due to an environmental gradient

-We want to know whether birds improve plant health by eating insects (aka provide an “ecosystem service”)

- Experiment: exclude birds from shrubs,

- When birds excluded, insect herbivores should be free to munch

-monitor plant reproductive output.

-Bird exclusion should reduce plant reproduction by increasing insect herbivory





























Confounding due to an environmental gradient

-We use cages around shrubs with fence posts and netting

- Fence posts heavy: build all exclosures near road

- What if there is an un-observed soil nutrient gradient?

- Nutrient levels highest near road
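A made-up simulation of the road / nutrient problem (Python, for illustration). Every number is an assumption, including a true exclusion effect of -5 seeds per shrub. It compares putting all cages near the road against randomizing cage locations.

```python
import random

TRUE_EFFECT = -5.0   # assumed true change in seed output caused by excluding birds

def seed_output(distance_from_road, excluded, rng):
    nutrients = 10.0 - distance_from_road          # unobserved gradient, highest near road
    return 20.0 + 2.0 * nutrients + TRUE_EFFECT * excluded + rng.gauss(0, 2)

def estimated_effect(exclosure_plots, distances, rng):
    """Mean seed output of caged shrubs minus mean of uncaged shrubs."""
    excl = [seed_output(d, 1, rng) for i, d in enumerate(distances) if i in exclosure_plots]
    ctrl = [seed_output(d, 0, rng) for i, d in enumerate(distances) if i not in exclosure_plots]
    return sum(excl) / len(excl) - sum(ctrl) / len(ctrl)

rng = random.Random(3)
distances = [0.5 * i for i in range(20)]           # 20 shrubs, 0 to 9.5 units from the road

# Convenient design: the 10 shrubs nearest the road get the cages
near_road_estimate = estimated_effect(set(range(10)), distances, rng)

# Randomized design, repeated many times to see its average behavior
random_estimates = [
    estimated_effect(set(rng.sample(range(20), 10)), distances, rng)
    for _ in range(500)
]
mean_random_estimate = sum(random_estimates) / len(random_estimates)

print(f"true effect:            {TRUE_EFFECT:.1f}")
print(f"cages near road:        {near_road_estimate:.1f}")
print(f"randomized (average):   {mean_random_estimate:.1f}")
```

With cages near the road the nutrient gradient swamps the treatment and the estimate even has the wrong sign. Any single randomized layout is still noisy, but the randomized estimates are centered on the true effect.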





























Reducing bias: Blinding


-Double blinding

-Single blinding


Definition: “blinding is the process of concealing info. from participants (sometimes including researchers) about which individuals receive which treatments”


Rarely done – but probably should be - in ecology!





























Blinding





























Aside: what are “recognition studies”?

“recognition studies typically perform intra- and inter colony aggression assays, with the a priori expectation that there should be little or no aggression among nestmates. Aggressive interactions between ants can include subtle behaviours such as mandible flaring and recoil, which can be hard to quantify, making these types of assays prone to confirmation bias.”

Source: “Confirmation Bias in Studies of Nestmate Recognition: A Cautionary Note for Research into the Behaviour of Animals”, PLOS ONE





























How to reduce the influence of sampling error (14.4)

-Sampling error = noise

-more noise = larger standard deviation / stand. error

-harder to detect differences between groups or trends

Statistical “Power”

-“Power” is the ability to detect a difference between groups or a trend when it is actually there

-A large sample size increases power

-A large difference between groups / steep trend also increases power

(but you don’t have control over this)

-Study design / stat. procedures can increase power
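A rough simulation of power (Python, for illustration). To keep it short it uses a two-sample z-test with the SD assumed known, which is a simplification of what you would run on real data. The effect sizes and sample sizes are arbitrary.

```python
import math
import random
from statistics import NormalDist

def estimated_power(n_per_group, true_difference, sd=1.0, alpha=0.05,
                    n_sims=1000, seed=7):
    """Fraction of simulated experiments that detect a real difference."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # about 1.96 for alpha = 0.05
    se = sd * math.sqrt(2 / n_per_group)             # SE of the difference in means
    detections = 0
    for _ in range(n_sims):
        a = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        b = [rng.gauss(true_difference, sd) for _ in range(n_per_group)]
        z = (sum(b) / n_per_group - sum(a) / n_per_group) / se
        if abs(z) > z_crit:
            detections += 1
    return detections / n_sims

power_small_n = estimated_power(n_per_group=10, true_difference=0.5)
power_large_n = estimated_power(n_per_group=50, true_difference=0.5)
power_big_effect = estimated_power(n_per_group=50, true_difference=1.0)

print(f"n=10, difference=0.5: power ~ {power_small_n:.2f}")
print(f"n=50, difference=0.5: power ~ {power_large_n:.2f}")
print(f"n=50, difference=1.0: power ~ {power_big_effect:.2f}")
```

The difference is real in every simulated experiment; low power means most of those experiments would still fail to detect it.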





























Reducing sampling error: Replication

Definition:

“Replication is the application of every treatment to multiple, INDEPENDENT experimental units.”





























Reducing sampling error: Balance

Definition: “balanced experiments have equal sample sizes” (pg 434)

-lack of balance reduces power

-or, “unbalanced experiments have lower power”

-observational studies usually unbalanced (such is life)

-doing extra work to increase balance can be beneficial
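A quick calculation of why balance helps (Python, for illustration), assuming both groups have the same SD. With 20 study units in total, the standard error of the difference between group means is smallest when they are split 10 / 10.

```python
import math

def se_of_difference(n1, n2, sd=1.0):
    """Standard error of the difference between two group means (equal SD assumed)."""
    return sd * math.sqrt(1 / n1 + 1 / n2)

for n1, n2 in [(10, 10), (15, 5), (18, 2)]:
    print(f"{n1:2d} vs {n2:2d}: SE of difference = {se_of_difference(n1, n2):.3f} x sd")
```

Same total effort, but the lopsided designs give noisier comparisons and therefore lower power.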

A random and balanced experiment





























AN UNbalanced experiment

Test-ish question: “What is the effect of balance on an experiment?”

















Reducing sampling error: Blocking

1)Organizing study by major components of physical locations / natural groupings

2)AND including this information in the statistical model!

3)within each block, study units receive each experimental treatment

4)Similar to “stratifying”

5)Easiest forms:

a) pairing

b) repeated measures (RM)





























Blocking: common forms

-Humans: class, school, school district, hospital

eg, study effect of drug: use 4 hospitals as blocks; within each hospital, randomly assign drug treatment and placebo

what would be wrong with using placebo at 2 hospitals and expt. drug at 2 others?
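A minimal sketch of randomizing within blocks for the hospital example (Python, for illustration). The function name, patient labels, and 6 patients per hospital are hypothetical.

```python
import random

def randomize_within_blocks(blocks, treatments, seed=None):
    """blocks: dict of block name -> list of unit ids. Returns unit id -> treatment."""
    rng = random.Random(seed)
    assignment = {}
    for block, units in blocks.items():
        if len(units) % len(treatments) != 0:
            raise ValueError(f"{block}: units cannot be split evenly among treatments")
        # Every treatment appears equally often inside every block
        labels = list(treatments) * (len(units) // len(treatments))
        rng.shuffle(labels)
        for unit, label in zip(units, labels):
            assignment[unit] = label
    return assignment

hospitals = {
    f"hospital_{h}": [f"h{h}_patient_{i}" for i in range(6)]
    for h in range(1, 5)
}
plan = randomize_within_blocks(hospitals, ["drug", "placebo"], seed=2024)

for hospital, patients in hospitals.items():
    n_drug = sum(plan[p] == "drug" for p in patients)
    print(hospital, "drug:", n_drug, "placebo:", len(patients) - n_drug)
```

Because drug and placebo both occur inside every hospital, hospital-to-hospital differences cannot be mistaken for a drug effect. Giving placebo at 2 hospitals and the drug at 2 others would mix the two up.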





























-Ecology: eggs in nest, wildlife management unit, lake, stream

-often multiple treatments w/in blocks

-Fertilizer experiment: control, N, P, K treatments w/in the same block


Issue w/blocking:

-makes stats tougher

-Usually good to have 4+ blocks





























Blocking of Trillium Trail Deer Exclosures

What is advantageous about blocking for forest ecology experiments?

Why would a completely random plot location have had drawbacks?





























Blocking’s easy forms: pairing & repeated measures (RM)

Pairing: apply treatment to one study unit & control to a nearby/related unit

-use paired t-test for stats

ex: -Experiments on twins,

-paired control plots / fenced exclosures
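A made-up illustration of why pairing helps (Python, for illustration). Sites differ a lot from each other (SD of 15) but the assumed treatment effect is small (+3). Only the t statistics are computed; the standard library has no t distribution for p-values.

```python
import math
import random
import statistics

rng = random.Random(11)
control, treated = [], []
for _ in range(12):
    site_quality = rng.gauss(50, 15)                     # large differences among pairs
    control.append(site_quality + rng.gauss(0, 2))
    treated.append(site_quality + 3 + rng.gauss(0, 2))   # assumed treatment effect of +3

n = len(control)
diffs = [t - c for t, c in zip(treated, control)]

# Paired analysis: work with the within-pair differences
paired_t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

# Unpaired analysis of the same data: ignores which plots belong together
unpaired_t = (statistics.mean(treated) - statistics.mean(control)) / math.sqrt(
    statistics.variance(treated) / n + statistics.variance(control) / n
)

print(f"paired t   = {paired_t:.2f}")
print(f"unpaired t = {unpaired_t:.2f}")
```

Both analyses see the same mean difference. Pairing subtracts out the site-to-site variation, so the same difference stands out against much less noise.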





























Example: A natural experiment using pairing for its controls


















































































































Repeated measures (RM)

-similar to blocking / pairing

-measurements conducted before vs. after treatment delivered

-individuals serve as own controls

-common in human studies

-ex: heart rate b/f vs. after drinking Red Bull

-WBC b/f vs after chemo


Often/Ideally will still have un-treated controls in case any change occurs over time due to environmental conditions






























Example: repeated measures





























-unlike blocking, stats for simple pairing / RM are usually easy

-but, need to have enough replicates for sufficient “power”





























Blocking - Book’s definition:

“blocking is the grouping of experimental units that have similar properties. Within each block, treatments are randomly assigned to experimental units.”





























Problem with blocking:

-Contamination of nearby treatments

-Edge effects

eg: In my fenced deer exclusion plots, roots grow out of the plot into the control and compete with plants still being eaten





























Pseudo-replication: when replication isn’t really replication

See Interleaf 2

Involves “sub-sampling”

Each sub-sample is NOT a true (independent) replicate

What is the source of the problem?





























Pseudo-replication: when replication isn’t really replication

This 2nd design avoids pseudo-replication using “blocks”

Easiest way to analyze this experiment:

-take the mean of each sampling unit within each block
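A small sketch of collapsing sub-samples to one value per sampling unit before analysis (Python, for illustration). The block names and measurements are hypothetical.

```python
import statistics

# Hypothetical data: 3 sub-samples measured inside each sampling unit
subsamples = {
    ("block_1", "control"): [4.1, 3.9, 4.3],
    ("block_1", "treatment"): [5.2, 5.0, 5.5],
    ("block_2", "control"): [3.2, 3.6, 3.1],
    ("block_2", "treatment"): [4.4, 4.9, 4.6],
    ("block_3", "control"): [4.8, 4.5, 4.7],
    ("block_3", "treatment"): [5.9, 6.1, 5.7],
}

# One mean per sampling unit; these means are the data that get analyzed
unit_means = {unit: statistics.mean(values) for unit, values in subsamples.items()}

n_subsamples = sum(len(values) for values in subsamples.values())
n_units = len(unit_means)

print(f"measurements taken: {n_subsamples}")
print(f"independent values for analysis: {n_units}")
for (block, treatment), value in unit_means.items():
    print(block, treatment, round(value, 2))
```

Treating all 18 measurements as replicates would overstate the sample size; after averaging there are 6 values, 3 per treatment.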





























Experiments with more than one factor (14.5)

Definition: “A factor is a single treatment variable whose effects are of interest” (438)

Levels of a factor

-control vs. treatment, no fertilizer vs. fertilizer

Different factors

-Nitrogen vs. Phosphorus (N vs P)


R uses the factor() command; we’ll get to know this well





























“Factorial designs”

Definition: a “factorial design investigates all treatment combinations of 2 or more variables. A factorial design can measure interactions between treatment variables”

Factorial fertilization experiment treatments:

1)Control (no fertilizer)

2)Nitrogen

3)Phosphorus

4)N + P
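The four treatments above are just every combination of the levels of the two factors. A tiny sketch (Python, for illustration; the level labels are my own):

```python
from itertools import product

factors = {
    "nitrogen": ["no N", "N"],
    "phosphorus": ["no P", "P"],
}

# Every combination of one level from each factor
treatment_combinations = list(product(*factors.values()))

for combo in treatment_combinations:
    print(dict(zip(factors, combo)))
```

Adding a third factor with 2 levels (eg, K) would double the list to 8 treatments, which is why factorial experiments grow quickly.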





























Interactions

Definition: “An interaction between 2 explanatory variables means that the effect of 1 variable on the response depends on the state of a 2nd variable” (440)

When both factors occur, their combined effect differs from the sum of their separate effects (eg, “greater than the sum of their parts”)

-often denoted with a multiplication sign

-eg, “Nitrogen x Phosphorus interaction”

-or, “Nitrogen by Phosphorus interaction”
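One way to put a number on an interaction in a 2 x 2 factorial (Python, for illustration): compare the effect of N when P is present to the effect of N when P is absent. The group means are invented.

```python
def interaction_contrast(control, n_only, p_only, n_and_p):
    """Effect of N when P is present minus effect of N when P is absent."""
    return (n_and_p - p_only) - (n_only - control)

# Hypothetical mean plant biomass in each treatment group
additive = interaction_contrast(control=10, n_only=14, p_only=13, n_and_p=17)
synergy = interaction_contrast(control=10, n_only=14, p_only=13, n_and_p=25)

print("no interaction:", additive)   # N adds 4 with or without P
print("interaction:   ", synergy)    # N adds 12 when P is present, 4 when absent
```

A contrast of zero means the two effects simply add; anything else means the effect of one factor depends on the level of the other.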





























Interactions can occur between

1) 2 categorical variables (easy to understand)

-N treatment (Y/N) x P treatment (Y/N)

2) A categorical and a continuous variable (not too bad)

-N treatment (Y/N) x rain fall (in cm)

3) Continuous x Continuous (mind bending!)

-Amount of N (grams) x rain fall (in cm)





























Example interaction: categorical x categorical

Example interaction: categorical x continuous





























What if you can’t do experiments? (14.6)

-experiments may be impossible, not relevant, or unethical


Match & adjust


Case-control

related to pairing


Using covariates / control variables





























Considerations when Choosing a sample size (14.7)


Precision

Power

Data loss
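A back-of-the-envelope sketch for the precision and data loss considerations (Python, for illustration). It assumes you have a guess at the SD, uses the normal approximation for a 95% confidence interval, and inflates the answer for an expected fraction of lost study units. The SD, margin, and loss rate are made-up inputs.

```python
import math
from statistics import NormalDist

def n_for_precision(sd, margin, confidence=0.95):
    """Sample size so the confidence interval is roughly mean +/- margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    return math.ceil((z * sd / margin) ** 2)

def n_to_start_with(n_needed, expected_loss):
    """Inflate the sample size to allow for dead plants, lost tags, dropouts, etc."""
    return math.ceil(n_needed / (1 - expected_loss))

n_needed = n_for_precision(sd=10, margin=2)
n_start = n_to_start_with(n_needed, expected_loss=0.2)

print(f"needed for analysis: {n_needed}")
print(f"start with (20% loss expected): {n_start}")
```

Planning for power works the same way in spirit, but needs a guess at the size of the difference you hope to detect as well as the SD.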