January 14, 2016

Introduction

  • Statistical data analysis is fundamental to science
    • t tests
    • p values
    • propensity scores
    • logistic regressions
    • least squares fits
    • confidence intervals
    • others…

Introduction

  • Statisticians provide scientists with
    • Powerful tools to find order and meaning in complex datasets
  • Scientists in many different fields use these tools
  • Yet there is a lack of statistics education

Introduction

  • The truth is that much of the peer-reviewed scientific literature contains statistical errors
    • Misinterpretation of p values → false positives
    • Flexible data analysis → finding correlations where none exist
    • Inappropriate model choices → biased results
  • Most errors are not detected by peer reviewers and editors
    • Not all journals employ statisticians to review submissions
    • Papers often don't provide enough statistical detail to be evaluated properly
  • It's not fraud but poor statistical education

Introduction

  • The methodological complexity of modern research means:
    • Scientists without extensive statistical training may not understand most published research in their fields
    • In medicine, a doctor who took an introductory statistics course
      • Might understand only about one fifth of the research articles published in the New England Journal of Medicine
      • N.J. Horton and S.S. Switzer. “Statistical Methods in the Journal.” New England Journal of Medicine 353, no. 18 (2005): 1977–1979. DOI: 10.1056/NEJM200511033531823.
    • This also happens in other disciplines

Introduction

  • Errors arise when scientists don't understand p values
    • Medical clinical trials
      • directing our health care
      • determining the safety of new prescription drugs
    • Criminologists
      • evaluating strategies to mitigate crime
    • Epidemiologists
      • trying to control new diseases
    • Marketers & business managers
      • finding the best way to sell products

Introduction

  • In many fields
    • Initial results tend to be contradicted by later results
    • The pressure to publish exciting results early
      • outweighs the responsibility to publish carefully checked results supported by strong evidence

Introduction

  • Other problems exist besides lack of education
  • Statistical errors also arise from lack of funding or resources
    • Lack of data, with no money to collect more data or run further studies
    • No time to collect the data

Introduction

  • Pharmaceutical industry
    • Tempted to bias the evidence, neglecting to publish studies showing their drugs don't work
  • "Missing data and publication bias plague science, skewing our perceptions of important issues."

Introduction

  • Even doing good statistics can lead to errors
    • There are many statistical techniques to choose from
    • Analyses allow researchers too much freedom in how they treat the data
    • Scientists can torture the data until it confesses
      • Try different analyses until one provides the desired result!
      • Pretend this was the analysis we planned to perform all along
    • How could we know whether a result was obtained through data torture?

Statistical Significance

  • Experimental Science
    • Common to measure differences
    • Does a medicine work better than another?
    • Do cells with one version of a gene synthesize more of an enzyme than cells with another version?
    • Does one kind of signal-processing algorithm detect pulsars better than another?
    • Is learning algorithm A better than learning algorithm B?
    • Use statistics to make judgments about differences

Statistical Significance

  • So, judgments about differences…
    • There are always differences due to luck, chance, and random variation
  • A difference is statistically significant when it is larger than what luck could easily produce

The power of p values

  • The Cold Drug Test
  • A new drug for the common cold
    • Claims to end colds 1 day earlier than existing drugs
  • Study 20 patients to prove the new drug works
    • 10 get the new drug, 10 get a placebo
    • Track the length of each cold: average cold length with and without the medicine

The power of p values

  • Problem
    • Not all colds are the same length
    • Some last a week, others a few days, others two weeks or more
    • What if all 10 patients given the medicine just happened to have very short colds?
    • Did the medicine work, or was it just luck?
  • How can we tell?
    • Statistical hypothesis testing

The power of p values

  • Knowing the distribution of typical cold cases (short, long, and average cold lengths)
    • We can work out how likely it is that a random sample of patients has longer or shorter colds than average
    • We need a hypothesis test, also known as a significance test
    • Even if the medication completely fails: what are the chances the experiment would have produced the observed results? (see the sketch below)
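
A minimal sketch of such a test in Python (the cold lengths below are invented for illustration, and the permutation test is one common way to run a significance test, not necessarily the one the slides assume): it shuffles the group labels many times and counts how often luck alone produces a difference at least as large as the one observed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical cold lengths in days (invented data): 10 drug, 10 placebo
drug    = np.array([4, 5, 4, 6, 3, 5, 4, 5, 6, 4], dtype=float)
placebo = np.array([6, 5, 7, 5, 6, 8, 5, 6, 7, 6], dtype=float)

observed = placebo.mean() - drug.mean()   # observed improvement in days

# If the drug does nothing, the "drug"/"placebo" labels are arbitrary:
# shuffle them and see how often chance matches the observed difference.
pooled = np.concatenate([drug, placebo])
n_drug, n_perm, hits = len(drug), 10_000, 0
for _ in range(n_perm):
    rng.shuffle(pooled)
    diff = pooled[n_drug:].mean() - pooled[:n_drug].mean()
    if diff >= observed:
        hits += 1

p_value = hits / n_perm
print(f"observed difference: {observed:.2f} days, p = {p_value:.4f}")
```

The printed p value is exactly the quantity the next slides define: the probability of results at least this extreme if the medication does nothing.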

The power of p values

  • Test with 1 patient
    • That single cold could easily be unusually short or long just by chance
    • WE DON'T KNOW IF THE MEDICATION WORKS
  • Test with 10 million patients
    • Very unlikely that all patients get short colds by chance alone
    • MORE LIKELY THE MEDICATION WORKS
  • Scientists quantify this intuition with the concept of the p value

p Value

  • p value is the probability, under the assumption that there is no true effect or no true difference, of collecting data that shows a difference equal to or more extreme than what you actually observed
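
In symbols (a standard textbook formulation, not taken from the slides): writing $H_0$ for the null hypothesis of no true effect, $T$ for the test statistic, and $t_{\mathrm{obs}}$ for its observed value,

```latex
p = \Pr\left( T \geq t_{\mathrm{obs}} \mid H_0 \right)
```

i.e., the probability, computed as if $H_0$ were true, of a statistic at least as extreme as the one actually observed.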

p Value

  • Test the medication on 100 patients
    • Find colds were 1 day shorter, on average, with the medication
  • The p value of this result is
    • the chance that, if the medication actually did nothing,
    • the average cold in the medicated group would still be 1 day shorter (or more) than in the control group purely by luck

p Value

  • The p value depends on the size of the effect
    • Colds shorter by 4 days are harder to produce by luck than colds shorter by 1 day
    • It also depends on the number of patients tested (see the sketch below)
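
A small simulation of both dependencies (a sketch with arbitrary, invented numbers: control colds of about 7 days with a standard deviation of 2): the same t test yields very different p values as the effect size and the number of patients grow.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

for effect in (1.0, 4.0):        # colds shorter by 1 day vs. 4 days
    for n in (10, 100):          # patients per group
        control = rng.normal(7.0, 2.0, size=n)
        treated = rng.normal(7.0 - effect, 2.0, size=n)
        _, p = stats.ttest_ind(control, treated)
        print(f"effect = {effect} days, n = {n}: p = {p:.4g}")
```

Bigger effects and bigger samples both drive p down; the p value alone doesn't tell you which one did it.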

p Value

  • The p value IS NOT a measure of:
    • how right we are about a result
    • how important a difference is
  • The p value IS:
    • a measure of surprise
  • Suppose the new medicine is ineffective
    • Then there is no reason other than luck for the two groups to differ
    • The smaller the p value, the more surprising and lucky the results are

p Value

  • Interpreting the p value
    • Is there really a difference between these groups?
    • Rule of thumb: any difference with p < 0.05 is statistically significant
    • We use 0.05 mostly by scientific convention
  • Notice
    • p value works by ASSUMING THERE IS NO DIFFERENCE BETWEEN THE EXPERIMENTAL GROUPS

p Value

  • Interpreting the p value
    • If we want to show that the new drug actually works
    • We do it by showing the data is inconsistent with the drug not working
  • p values extend to any situation where we can mathematically express a hypothesis that WE WANT TO REJECT

Limitations of p Values

  • p is a measure of surprise
  • A smaller value suggests we should be more surprised
  • It is not a measure of the size of an effect
  • We can get a tiny p value by measuring a huge effect, or by measuring a tiny effect with great certainty
    • e.g., “this medicine makes people live 4 times longer”
  • Since a medicine usually has some real effect
    • We can get a statistically significant result just by collecting enough data
    • We then detect extremely tiny but practically unimportant differences (see the sketch below)
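
A minimal sketch of that trap (invented numbers again, using the same simulated cold-length setup as before): a shortening of 0.02 day, about half an hour, is practically meaningless, yet with enough patients it becomes "statistically significant".

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

tiny_effect = 0.02               # ~half an hour shorter: practically irrelevant
for n in (100, 1_000_000):       # patients per group
    control = rng.normal(7.0, 2.0, size=n)
    treated = rng.normal(7.0 - tiny_effect, 2.0, size=n)
    _, p = stats.ttest_ind(control, treated)
    print(f"n = {n:>9,}: p = {p:.3g}")
```

With 100 patients per group the tiny effect is invisible; with a million per group it will almost certainly cross p < 0.05, without becoming any more important.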

Limitations of p Values

  • Statistical significance doesn't mean a result has any practical significance
  • Statistical insignificance doesn't tell us much either
    • A statistically insignificant difference:
      • could be pure noise
      • could represent a real effect that more data would reveal

Limitations of p Values

  • "There's no mathematical tool to tell you whether your hypothesis is true or false; you can see only whether it's consistent with the data"
  • "If the data is sparse or unclear, your conclusions will be uncertain"

Statistical Power