January 14, 2016
Introduction
- This course is partially based on the book
Introduction
- Statistical data analysis is fundamental in science
  - t tests
  - p values
  - propensity scores
  - logistic regressions
  - least squares fits
  - confidence intervals
  - others…
Introduction
- Statisticians provide scientists with
  - Powerful tools to find order and meaning in complex datasets
- Scientists in different fields use these tools
- But there is a lack of statistics education
Introduction
- The truth is that much of the peer-reviewed scientific literature contains statistical errors
  - Misinterpretation of p values -> causes false positives
  - Flexible data analysis -> finds correlations where none exist
  - Inappropriate model choices -> bias results
- Most errors are not detected by peer reviewers and editors
  - Not all journals employ statisticians to review submissions
  - Papers don't provide sufficient statistical detail (to be appropriately evaluated)
- It's not fraud but poor statistical education
Introduction
- The methodological complexity of modern research means:
  - Scientists without extensive statistical training might not understand most published research in their fields
  - In medicine, a doctor who took an introductory statistics course
    - Might only understand one fifth of the research articles published in the New England Journal of Medicine
    - N.J. Horton and S.S. Switzer. “Statistical Methods in the Journal.” New England Journal of Medicine 353, no. 18 (2005): 1977–1979. DOI: 10.1056/NEJM200511033531823.
  - This also happens in other disciplines
Introduction
- Errors occur when scientists don't understand p values
  - Medical clinical trials
    - direct our health care
    - determine the safety of new prescription drugs
  - Criminologists
    - evaluate strategies to mitigate crime
  - Epidemiologists
    - try to control new diseases
  - Marketers & business managers
    - look for the best way to sell products
Introduction
- In many fields
  - Initial results tend to be contradicted by later results
  - Pressure to publish exciting results early
    - outweighs the responsibility to publish carefully checked results supported by strong evidence
Introduction
- Lack of education is not the only problem
- Statistical errors are also due to lack of funding or resources
  - Lack of data, and no money to collect more data and perform more studies
  - Lack of time to collect the data
Introduction
- Pharmaceutical industry
  - Tempted to bias the evidence by neglecting to publish studies showing its drugs don't work
  - "Missing data and publication bias plague science, skewing our perceptions of important issues."
Introduction
- Even doing good statistics could lead to errors
  - There are many statistical techniques
  - This variety gives researchers too much freedom in how they analyze data
- Scientists could torture the data until it confesses
  - Try different analyses until one provides the desired result!
  - Pretend this was the analysis we planned to perform
- How could we know if a result was obtained through data torture?
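A rough way to see why trying many analyses is dangerous: the sketch below assumes each analysis is an independent test with a 5% chance of a false positive when no real effect exists. Both the independence and the 0.05 cutoff are simplifying assumptions made for illustration, and the function name is hypothetical.

```python
def chance_of_spurious_hit(num_analyses, alpha=0.05):
    """Chance that at least one of `num_analyses` independent analyses
    reaches p < alpha even though no real effect exists (assumed model)."""
    return 1 - (1 - alpha) ** num_analyses


for k in (1, 5, 20):
    print(f"{k:2d} analyses -> {chance_of_spurious_hit(k):.2f}")
```

Under these assumptions, 20 different analyses of pure noise give roughly a 64% chance that at least one of them looks "significant".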
Statistical Significance
- Experimental science
  - Common to measure differences
    - Does one medicine work better than another?
    - Do cells with one version of a gene synthesize more of an enzyme than cells with another version?
    - Does one kind of signal processing algorithm detect pulsars better than another?
    - Is learning algorithm A better than learning algorithm B?
  - Use statistics to make judgments about differences
Statistical Significance
- So, judgments about differences…
  - There are always differences due to luck, chance, random variation
  - A difference is statistically significant when it is larger than could easily be produced by luck
The power of p values
- The Cold Drug Test
  - New drug for colds
  - Claimed to end colds 1 day earlier than existing drugs
  - Study 20 patients to show the new drug works
    - 10 get the new drug, 10 get a placebo
  - Track length of colds: average cold length with and without the medicine
The power of p values
- Problem
  - Not all colds have the same length
    - Some last a week, others a few days, others two weeks or more
  - What if all 10 patients given the medicine had very short colds?
    - Did the medicine work, or was it just luck?
  - How can we tell?
    - Statistical hypothesis testing
The power of p values
- Knowing the distribution of typical cold lengths (short, long, average)
  - We can tell how likely it is that a random sample of patients will have longer or shorter colds than average
- We need a hypothesis test, also known as a significance test
  - Even if the medication completely fails, what are the chances that the experiment would produce the observed results?
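The question on this slide can be explored with a small simulation. The sketch below assumes cold lengths follow a normal distribution with a mean of 7 days and a standard deviation of 2 days. These numbers are invented for illustration and do not come from the course. The drug is modeled as doing nothing at all, so both groups are drawn from the same distribution.

```python
import random


def chance_of_lucky_result(group_size=10, min_gap_days=1.0,
                           num_trials=20_000, seed=0):
    """Fraction of simulated studies in which the drug group's average cold
    is at least `min_gap_days` shorter, although the drug does nothing."""
    rng = random.Random(seed)

    def cold_length():
        # Assumed distribution: mean 7 days, sd 2 days, never below 1 day.
        return max(1.0, rng.gauss(7.0, 2.0))

    lucky = 0
    for _ in range(num_trials):
        drug = [cold_length() for _ in range(group_size)]
        placebo = [cold_length() for _ in range(group_size)]
        gap = sum(placebo) / group_size - sum(drug) / group_size
        if gap >= min_gap_days:
            lucky += 1
    return lucky / num_trials


print(chance_of_lucky_result())
```

With these assumed numbers, a little over 10% of the simulated studies show the "1 day shorter" result purely by luck, which is why 10 patients per group is not convincing on its own.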
The power of p values
- Test with 1 patient
  - Few colds are exactly average length, so one slightly shorter cold is not surprising
  - DON'T KNOW IF MEDICATION WORKS
- Test with 10 million patients
  - Unlikely that all patients just happen to get shorter colds
  - MORE LIKELY THE MEDICATION WORKS
- Scientists quantify this intuition with the concept of the p value
p Value
- p value is the probability, under the assumption that there is no true effect or no true difference, of collecting data that shows a difference equal to or more extreme than what you actually observed
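The definition above can be written compactly. The symbols are notation introduced here for illustration, not taken from the slides: $H_0$ is the assumption of no true effect, $D$ is the difference measured in a hypothetical repeat of the experiment, and $d_{\mathrm{obs}}$ is the difference actually observed.

```latex
p = \Pr\left( D \ge d_{\mathrm{obs}} \mid H_0 \right)
```

This is the one-sided form. When differences in either direction count as "more extreme", $D$ and $d_{\mathrm{obs}}$ are replaced by their absolute values.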
p Value
- Test medication with 100 patients
- Find colds were 1 day shorter, on average, with medication
- The p value of this result is
  - the chance that, if the medication didn't actually do anything, the average cold would still be 1 day (or more) shorter than that of the control group by luck alone
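One way to estimate such a p value without any formula is a permutation test: if the medication does nothing, the group labels are arbitrary, so we can shuffle them and count how often luck alone produces a gap at least as large as the observed one. This is a minimal sketch, not the method used in the course. The cold lengths are made-up numbers, and it uses 10 patients per group rather than 100 to keep the example short.

```python
import random


def permutation_p_value(drug, placebo, num_shuffles=20_000, seed=0):
    """One-sided p value: chance of the drug group being at least as much
    shorter than the placebo group as observed, if labels were arbitrary."""
    rng = random.Random(seed)
    observed = sum(placebo) / len(placebo) - sum(drug) / len(drug)
    pooled = list(drug) + list(placebo)
    n_drug = len(drug)
    at_least_as_extreme = 0
    for _ in range(num_shuffles):
        rng.shuffle(pooled)  # reassign patients to groups at random
        fake_drug, fake_placebo = pooled[:n_drug], pooled[n_drug:]
        gap = (sum(fake_placebo) / len(fake_placebo)
               - sum(fake_drug) / len(fake_drug))
        if gap >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / num_shuffles


# Made-up cold lengths in days, for illustration only.
drug_group = [5, 6, 4, 7, 6, 5, 8, 6, 5, 6]
placebo_group = [7, 6, 8, 7, 9, 6, 7, 8, 5, 7]
print(permutation_p_value(drug_group, placebo_group))
```

The printed value is an estimate of the p value for the invented data. It says how often chance alone matches the observed gap, not how likely it is that the drug works.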
p Value
- p value depends on the size of the effect
  - By luck alone, colds shorter by 4 days are less common than colds shorter by 1 day
- It also depends on the number of patients tested
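Both dependencies can be seen in a simple calculation. The sketch below assumes cold lengths are normally distributed with a known standard deviation of 2 days (an invented figure) and uses a one-sided z test. The function name and parameters are hypothetical.

```python
from math import erfc, sqrt


def one_sided_p(observed_gap_days, patients_per_group, sd_days=2.0):
    """p value for an observed gap between group averages, assuming cold
    lengths are normal with a known standard deviation `sd_days`."""
    # Standard error of the difference between two group averages.
    standard_error = sd_days * sqrt(2 / patients_per_group)
    z = observed_gap_days / standard_error
    # Upper tail of the standard normal, computed stably via erfc.
    return 0.5 * erfc(z / sqrt(2))


for gap in (1, 4):
    for n in (10, 50):
        print(f"gap={gap} days, n={n} per group: p={one_sided_p(gap, n):.2g}")
```

Under these assumptions the p value shrinks both when the observed gap grows and when more patients are tested.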
p Value
- p value IS NOT a measure of:
  - how right we are about a result
  - how important a difference is
- p value IS a measure of surprise:
  - Suppose the new medicine is ineffective
  - Then there's no reason other than luck for the two groups to differ
  - The smaller the p value, the more surprising and lucky the results are (or the assumption is wrong and the medicine works)
p Value
- Interpreting the p value
  - Is there really a difference between these groups?
  - Rule of thumb: any difference where p < 0.05 is statistically significant
  - The 0.05 threshold is used mostly by scientific convention
- Notice
  - p value works by ASSUMING THERE IS NO DIFFERENCE BETWEEN THE EXPERIMENTAL GROUPS
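A consequence of the rule of thumb can be checked by simulation: when there truly is no difference between the groups, about 5% of studies should still come out with p < 0.05. The sketch below assumes normally distributed cold lengths (mean 7 days, standard deviation 2 days, both invented) and a two-sided z test with that standard deviation treated as known.

```python
import random
from math import erfc, sqrt


def false_positive_rate(num_studies=5_000, group_size=10,
                        sd_days=2.0, alpha=0.05, seed=1):
    """Fraction of studies of a useless drug that still give p < alpha."""
    rng = random.Random(seed)
    standard_error = sd_days * sqrt(2 / group_size)
    significant = 0
    for _ in range(num_studies):
        # Both groups come from the same distribution: no true difference.
        drug = [rng.gauss(7.0, sd_days) for _ in range(group_size)]
        placebo = [rng.gauss(7.0, sd_days) for _ in range(group_size)]
        gap = sum(placebo) / group_size - sum(drug) / group_size
        # Two-sided p value from a z test with known standard deviation.
        p = erfc(abs(gap) / standard_error / sqrt(2))
        if p < alpha:
            significant += 1
    return significant / num_studies


print(false_positive_rate())
```

The result should land close to 0.05, which is what the threshold means: it caps how often pure luck is declared significant, and says nothing about how often a significant result is real.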
p Value
- Interpreting the p value
  - If we want to show that the new drug actually works
    - We do it by showing the data is inconsistent with the drug not working
  - p values extend to any situation where we can mathematically express a hypothesis that WE WANT TO REJECT
p Value Limitations
- p is a measure of surprise
  - a smaller value suggests we should be more surprised
- It's not a measure of the size of the effect
  - We can get a tiny p value by measuring a huge effect
    - e.g. "This medicine makes people live 4 times longer"
  - or by measuring a tiny effect with great certainty
- A medicine usually has some real effect
  - So we can always get a statistically significant result by collecting enough data
  - We then detect extremely tiny but relatively unimportant differences
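To make "collecting so much data" concrete, the sketch below estimates how many patients per group are needed before an observed gap of a given size reaches p = 0.05. It assumes a two-sided z test with a known standard deviation of 2 days. The standard deviation, the function name, and the example effect sizes are all illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def patients_needed(effect_days, sd_days=2.0, alpha=0.05):
    """Patients per group at which an observed gap of `effect_days` gives a
    two-sided p value of alpha, assuming a known standard deviation."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Solve effect / (sd * sqrt(2 / n)) = z_crit for n.
    return ceil(2 * (z_crit * sd_days / effect_days) ** 2)


print(patients_needed(1.0))    # a one-day effect
print(patients_needed(0.01))   # an effect of about 15 minutes
```

Under these assumptions a one-day gap becomes significant with a few dozen patients per group, while a gap of about 15 minutes needs roughly 300,000 per group. The second result would be statistically significant and practically irrelevant.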
p Value Limitations
- Statistical significance doesn't mean a result has any practical significance
- Statistical insignificance doesn't tell us much either
- A statistically insignificant difference:
  - could be due to noise
  - could represent a real effect that can only be pinned down with more data
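The second possibility can be illustrated by simulating a drug that really does shorten colds by 1 day and counting how often a study notices. As before, the normal distribution with mean 7 days and standard deviation 2 days is an invented assumption, and the test is a two-sided z test with that standard deviation treated as known.

```python
import random
from math import erfc, sqrt


def detection_rate(true_effect_days, group_size, num_studies=2_000,
                   sd_days=2.0, alpha=0.05, seed=2):
    """Fraction of simulated studies reaching p < alpha when the drug
    really shortens colds by `true_effect_days`."""
    rng = random.Random(seed)
    standard_error = sd_days * sqrt(2 / group_size)
    detected = 0
    for _ in range(num_studies):
        drug = [rng.gauss(7.0 - true_effect_days, sd_days)
                for _ in range(group_size)]
        placebo = [rng.gauss(7.0, sd_days) for _ in range(group_size)]
        gap = sum(placebo) / group_size - sum(drug) / group_size
        p = erfc(abs(gap) / standard_error / sqrt(2))  # two-sided z test
        if p < alpha:
            detected += 1
    return detected / num_studies


print(detection_rate(1.0, group_size=10))
print(detection_rate(1.0, group_size=100))
```

With 10 patients per group, most simulated studies of this working drug come out statistically insignificant. With 100 per group, the large majority detect the effect. An insignificant result from a small study is therefore weak evidence that nothing is there.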
p Value Limitations
- "There's no mathematical tool to tell you whether your hypothesis is true or false; you can see only whether it's consistent with the data"
- "If the data is sparse or unclear, your conclusions will be uncertain"
Statistical Power