M. Drew LaMar

April 29, 2016

“Increased quantification of scientific research and a proliferation of large, complex datasets in recent years have expanded the scope of applications of statistical methods.” From Wasserstein, Ronald L., and Nicole A. Lazar. “The ASA's statement on p-values: context, process, and purpose.” The American Statistician (2016).

Introduction to Biostatistics, Spring 2016

- Reading assignment for today is posted (you have till midnight)
- Final Exam: Thursday, May 5, 9 am-12 noon
  - Designed for 2 hours
  - Cumulative
  - 8-10 questions
  - 3 questions from Chaps. 16-18
  - Know the same formulas as you needed for Exam #3

- Projects
- Solutions to HW will be up tomorrow

“Appropriately chosen techniques, properly conducted analyses and correct interpretation of statistical results also play a key role in ensuring that conclusions are sound and that uncertainty surrounding them is represented properly.”

“The issues touched on here affect not only research, but research funding, journal practices, career advancement, scientific education, public policy, journalism, and law.”

“Informally, a p-value is the probability under a specified statistical model that a statistical summary of the data … would be equal to or more extreme than its observed value.”

Definition: A \( P \)-value is the probability of obtaining the observed statistical summary of the data, or one more extreme, given that the null hypothesis is true.
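To make the definition concrete, here is a minimal sketch of an exact two-sided binomial test. The function name `binomial_two_sided_p` and the counts (14 successes in 18 trials, null proportion 0.5) are illustrative assumptions, not part of the ASA statement. "More extreme" is taken here to mean "no more probable under the null than the observed count", which is one common convention.

```python
from math import comb

def binomial_two_sided_p(k, n, p0=0.5):
    """Exact two-sided P-value for k successes in n trials under H0: p = p0.

    Sums the null probabilities of every outcome that is no more
    probable than the observed count k.
    """
    pmf = [comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so floating-point ties are counted as ties.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)

# Illustrative numbers (an assumption): 14 successes out of 18 trials.
p = binomial_two_sided_p(14, 18)
print(round(p, 4))  # 0.0309
```

Note that the whole calculation is carried out under the null model: the P-value says how surprising the data are if the null hypothesis holds, nothing more.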

Principle #1: P-values can indicate how incompatible the data are with a specified statistical model.

- The most common context is a model, constructed under a set of assumptions, together with a so-called “null hypothesis.”
- The null hypothesis postulates **the absence of an effect.**
- This incompatibility can be interpreted as casting doubt on or providing evidence against the null hypothesis **or the underlying assumptions.**

Principle #2: P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.

- The P-value is a statement about data in relation to a specified hypothetical explanation, and **is not a statement about the explanation itself.**
- Remember, a P-value is the probability of getting the observed statistical summary of the data, or one more extreme, **assuming the null hypothesis is true.**
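A short calculation shows why a P-value is not the probability that the null hypothesis is true. The sketch below applies Bayes' rule to a significance test. The inputs (90% of tested hypotheses are true nulls, a 0.05 threshold, 80% power) are assumptions chosen for illustration, and the function name is hypothetical. It conditions on "P fell below the threshold" rather than on an exact P-value, so it is a simplification.

```python
def prob_null_given_significant(prior_null, alpha, power):
    """P(H0 true | result is significant), by Bayes' rule.

    prior_null: assumed fraction of tested hypotheses where H0 is true
    alpha:      significance threshold, P(significant | H0 true)
    power:      P(significant | H0 false)
    """
    false_positives = prior_null * alpha
    true_positives = (1 - prior_null) * power
    return false_positives / (false_positives + true_positives)

# Assumed inputs, for illustration only.
risk = prob_null_given_significant(prior_null=0.9, alpha=0.05, power=0.8)
print(round(risk, 2))  # 0.36
```

Under these assumptions, about a third of "significant" results come from true nulls, even though every one of them has P < 0.05. The answer depends on the prior and the power, neither of which the P-value knows about.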

Principle #3: Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.

- A conclusion does not immediately become “true” on one side of the divide and “false” on the other.
- Researchers should bring many contextual factors into play to derive scientific inferences:
  - the design of a study,
  - the quality of the measurements,
  - the external evidence for the phenomenon under study,
  - and the validity of assumptions that underlie the data analysis.

Principle #4: Proper inference requires full reporting and transparency.

- P-values and related analyses should not be reported selectively.
- Researchers should disclose:
  - the number of hypotheses explored during the study,
  - all data collection decisions,
  - all statistical analyses conducted,
  - and all p-values computed.
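One reason selective reporting is misleading can be sketched in a few lines: if many hypotheses are tested and only the "significant" ones are reported, a small P-value is easy to find by chance. The calculation below assumes every null hypothesis is true and the tests are independent. Both are simplifying assumptions, and the function name is hypothetical.

```python
def prob_any_false_positive(m, alpha=0.05):
    """Chance that at least one of m independent tests gives P < alpha
    when every null hypothesis is true."""
    return 1 - (1 - alpha) ** m

for m in (1, 5, 20):
    print(m, round(prob_any_false_positive(m), 3))
# 1 0.05
# 5 0.226
# 20 0.642
```

With 20 tests, a reader shown only the smallest P-value cannot tell a real effect from the expected false positive, which is why the number of hypotheses explored needs to be disclosed.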

Principle #5: A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
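As a rough illustration, the sketch below holds the effect size fixed and changes only the sample size. It uses a two-sided z-test for a mean with known standard deviation, which is a simplifying assumption. The numbers (a mean difference of 0.1 with a standard deviation of 1) and the function name `z_test_p` are made up for the example.

```python
from math import erfc, sqrt

def z_test_p(mean_diff, sd, n):
    """Two-sided P-value for H0: true mean difference = 0,
    assuming a normal model with known standard deviation sd."""
    z = mean_diff / (sd / sqrt(n))
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))

# Same small effect (0.1 standard deviations), three sample sizes.
p_small, p_medium, p_large = (z_test_p(0.1, 1.0, n) for n in (25, 400, 10_000))
print(p_small, p_medium, p_large)
```

The effect is identical in all three cases, yet the P-value goes from clearly non-significant (n = 25) to just under 0.05 (n = 400) to vanishingly small (n = 10,000). A tiny P-value can therefore accompany a tiny effect, and a large effect can yield an unimpressive P-value in a small study.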

Principle #6: By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.

- A p-value near 0.05 taken by itself offers only weak evidence against the null hypothesis.
- A relatively large p-value does not imply evidence in favor of the null hypothesis; many other hypotheses may be equally or more consistent with the observed data.
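The second point can be illustrated by reporting a confidence interval next to the P-value. The sketch below again assumes a normal model with known standard deviation. The data summary (mean difference 0.3, standard deviation 1, n = 10) and the function name `z_summary` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_summary(mean_diff, sd, n, level=0.95):
    """Two-sided P-value (H0: difference = 0) and confidence interval,
    assuming a normal model with known standard deviation sd."""
    se = sd / sqrt(n)
    z = mean_diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return p, (mean_diff - z_crit * se, mean_diff + z_crit * se)

# Hypothetical small study.
p, (low, high) = z_summary(mean_diff=0.3, sd=1.0, n=10)
print(f"P = {p:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Here the P-value is roughly 0.34, but the interval runs from about -0.3 to about 0.9. The data are compatible with no effect and also with a large one, so the large P-value is not evidence that the null hypothesis is true.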

Alternatives:

- Estimation (e.g. confidence intervals)
- Bayesian methods
- Alternative measures of evidence, such as likelihood ratios or Bayes factors
- Decision-theoretic modeling
- False discovery rates
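As one example from this list, here is a minimal sketch of controlling the false discovery rate with the Benjamini-Hochberg step-up procedure. The statement names false discovery rates only in general, so the choice of this particular procedure is an assumption, and the ten P-values are invented for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at level q.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q:
            cutoff = rank
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected

# Invented P-values from ten hypothetical tests.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bh = benjamini_hochberg(p_vals, q=0.05)
naive = [p < 0.05 for p in p_vals]
print(sum(naive), sum(bh))  # 5 2
```

A plain 0.05 cutoff declares five results significant, while the procedure keeps only the two smallest P-values once the number of tests is taken into account.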

Remember to use statistics for good, not evil!