Benjamin et al. use Bayesian arguments to recommend lowering the \(P\)-value threshold for claiming an effect, so that results are more reproducible: claiming an effect at \(\alpha = 0.05\) makes for too many false positives. Today we discuss elements of the argument and review progress on exercises from last time.
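The false-positive argument can be sketched with a short calculation. The function below is a minimal illustration, not code from the paper or the course: `false_positive_rate`, the prior odds of 1:10, and the power of 0.8 are all assumptions chosen for the example, and power is held fixed across thresholds only to keep the comparison simple.

```r
# Share of "significant" results that are false positives, given an
# assumed significance threshold, power, and prior odds of a true effect.
# All numeric values below are illustrative assumptions.
false_positive_rate <- function(alpha, power, prior_odds) {
  p_h1 <- prior_odds / (1 + prior_odds)  # prior Pr(effect is real)
  p_h0 <- 1 - p_h1                       # prior Pr(no effect)
  alpha * p_h0 / (alpha * p_h0 + power * p_h1)
}

fpr_05  <- false_positive_rate(alpha = 0.05,  power = 0.8, prior_odds = 1 / 10)
fpr_005 <- false_positive_rate(alpha = 0.005, power = 0.8, prior_odds = 1 / 10)

print(round(c(alpha_0.05 = fpr_05, alpha_0.005 = fpr_005), 3))
```

Under these assumed inputs, a stricter threshold gives a much smaller share of false positives among the claimed effects; different prior odds or power would change the numbers.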

Logistics

Resources

Web notes from last time
Software includes:
- Class code on Sakai: clarkFunctions2021.r
- Getting started with R
Discussion reading:
- Redefining statistical significance, classical and Bayesian statisticians compromise: if \(P\) values are to be used, then a threshold of \(P = 0.05\) is far too high, Nature.
Optional background:
- Why Big Data Could Be a Big Fail, Jordan on potential and limitations of Big Data (misleading title).
- Why environmental scientists are becoming Bayesians, Clark on proliferation of Bayes in environmental science, Ecol Letters.
- Bayesian methods for hierarchical models: Are ecologists making a Faustian bargain?, Lele and Dennis offer a contrarian view, Ecol Appl.

For next time

review basic rules in this document
review next unit here
Exercises from Unit 1 due 28 January using exerciseTemplate.Rmd

Today’s plan

check in on exerciseTemplate.Rmd for assignments
Breakout discussion of Redefining statistical significance: consensus/remaining issues with discussion questions
group summaries
Jim: P values, Bayes’ theorem for normal distribution and regression, graphs, factors, R

Objectives:

Where does a P value come from?
Derive posterior mean for the “normal-normal”
Use R for basic operations

A few rules to recall:

logs and exponentiation:

these are equivalent: \(\exp(x) = e^x\)
\(\log( e^x ) = e^{ \log x } = x\)
\(\exp(y_1) \exp(y_2) = \exp(y_1 + y_2)\)
\(\prod_{i=1}^n \exp( y_i ) = \exp \left( \sum_{i=1}^n y_i \right)\)
\(\log( a/b ) = \log(a) - \log(b)\)
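These identities can be checked numerically in R. The sketch below uses arbitrary illustrative values; `x` is kept positive so that \(\log x\) is defined.

```r
# Numerical check of the log and exponent rules with arbitrary values
x <- c(0.5, 1, 2.5)    # positive, so log(x) exists
y <- c(-1.2, 0.3, 2.0)
a <- 3
b <- 7

checks <- c(
  log_of_exp  = isTRUE(all.equal(log(exp(x)), x)),
  exp_of_log  = isTRUE(all.equal(exp(log(x)), x)),
  product     = isTRUE(all.equal(exp(y[1]) * exp(y[2]), exp(y[1] + y[2]))),
  prod_to_sum = isTRUE(all.equal(prod(exp(y)), exp(sum(y)))),
  log_ratio   = isTRUE(all.equal(log(a / b), log(a) - log(b)))
)
print(checks)
```

`all.equal` compares with a small numerical tolerance, which is the appropriate test for floating-point results.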

Simple derivatives

\(\frac{d}{dx} \left( x^a \right) = a x^{a-1}\)
\(\frac{d}{dx} \left( e^x \right) = e^x\)
\(\frac{d}{dx} \left( e^{f(x)} \right) = \frac{df}{dx} e^{f(x)}\)
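A quick way to build confidence in these rules is to compare them with a finite-difference approximation. In this sketch, `num_deriv` is a hypothetical helper written for the example, the evaluation point and exponent are arbitrary, and \(f(x) = x^2\) is an assumed choice for the chain rule.

```r
# Central finite-difference approximation to a derivative (helper for
# illustration only)
num_deriv <- function(f, x, h = 1e-6) (f(x + h) - f(x - h)) / (2 * h)

a  <- 3     # exponent for the power rule
x0 <- 1.7   # point at which derivatives are evaluated

# Difference between the numerical derivative and each analytical rule
err_power <- num_deriv(function(x) x^a, x0) - a * x0^(a - 1)
err_exp   <- num_deriv(exp, x0) - exp(x0)
err_chain <- num_deriv(function(x) exp(x^2), x0) - 2 * x0 * exp(x0^2)

print(c(power = err_power, exp = err_exp, chain = err_chain))

# Base R can also differentiate simple expressions symbolically
print(D(expression(exp(x^2)), "x"))
```

The three differences should be close to zero, limited only by the step size `h` and floating-point rounding.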

Some sample statistics

The notation \(E(y)\) refers to the expectation of \(y\). For a sample of data it is estimated by the sample mean. We do not simply refer to it as a mean, because it is more general: a probability distribution has an expectation, even if there are no data.

mean: \(E(y) = \bar{y} = \frac{1}{n} \sum_{i=1}^n y_i\)
variance: \(Var(y) = E(y^2) - E^2(y) =\frac{1}{n} \sum_{i=1}^n (y_i - \bar{y})^2 = \frac{1}{n} \sum_{i=1}^n y^2_i - \bar{y}^2\)
covariance: \(Cov(x,y) = E(xy) - E(x)E(y) = \frac{1}{n} \sum_{i=1}^n x_i y_i - \bar{x} \bar{y}\)
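The formulas above divide by \(n\), whereas R's built-in `var()` and `cov()` divide by \(n - 1\). The sketch below, using simulated data with arbitrary settings, computes each statistic from the formulas and rescales the built-in values to show the correspondence.

```r
set.seed(1)
n <- 50
x <- rnorm(n)
y <- 2 * x + rnorm(n)   # simulated data, illustrative only

mean_y    <- sum(y) / n
var_y     <- mean(y^2) - mean(y)^2           # E(y^2) - E^2(y)
var_y_dev <- mean((y - mean(y))^2)           # mean squared deviation
cov_xy    <- mean(x * y) - mean(x) * mean(y) # E(xy) - E(x)E(y)

# Built-in var() and cov() use n - 1, so rescale by (n - 1) / n
print(c(var_formula = var_y, var_builtin = var(y) * (n - 1) / n))
print(c(cov_formula = cov_xy, cov_builtin = cov(x, y) * (n - 1) / n))
```

The two forms of the variance agree, and both match the rescaled built-in value.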

Models we will encounter

Univariate response models include the following:

Name	response	additional attributes
linear (LM)	normal	linear in parameters
linear mixed model (LMM)	normal	LM with random effects
generalized linear model (GLM)	discrete	linear in parameters on link scale
logistic regression	binomial	GLM includes logit link
probit regression	binomial	GLM includes probit link
Poisson regression	Poisson	GLM typically with log link
mixed GLM (GLMM)	binomial, Poisson, …	GLM with random effects
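Several rows of this table map directly onto base R functions. The following is a minimal sketch on simulated data; the coefficients and sample size are arbitrary assumptions, and the mixed models (LMM, GLMM) are omitted because they need an add-on package.

```r
set.seed(2)
n <- 200
x <- rnorm(n)

# Simulated responses of three types (parameter values are illustrative)
y_norm <- 1 + 0.5 * x + rnorm(n, sd = 0.3)
y_bin  <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))
y_pois <- rpois(n, lambda = exp(0.2 + 0.7 * x))

fit_lm     <- lm(y_norm ~ x)                                   # LM
fit_logit  <- glm(y_bin ~ x, family = binomial(link = "logit"))  # logistic
fit_probit <- glm(y_bin ~ x, family = binomial(link = "probit")) # probit
fit_pois   <- glm(y_pois ~ x, family = poisson(link = "log"))    # Poisson

# Intercept and slope for each model, on its own link scale
print(round(rbind(LM      = coef(fit_lm),
                  logit   = coef(fit_logit),
                  probit  = coef(fit_probit),
                  Poisson = coef(fit_pois)), 2))
```

Each model is linear in its parameters; the GLMs differ from the LM in the response distribution and in the link that connects the linear predictor to the mean.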

Multivariate response models have a vector of responses:

Name	response	additional attributes
linear (LM)	MVN	linear in parameters
categorical	multinomial	multiple classes, one outcome, MV logit or probit link
multinomial	multinomial	multiple classes, multiple outcomes
generalized joint attribute model (GJAM)	all types	linear in parameters

Time series have dependence in time. They can be state-space models, normal or not, for continuous states. The Kalman filter is a simple (normal-normal) state-space model. The term hidden Markov model is most often applied to discrete states. Autoregressive models are normal and depend on the \(p\) previous times, e.g., AR(\(p\)).
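To make the dependence in time concrete, the sketch below simulates an AR(1) series, the simplest case of AR(\(p\)), and recovers its coefficient. The coefficient `phi`, the error standard deviation `sigma`, and the series length are arbitrary assumptions for illustration.

```r
set.seed(3)
n_time <- 500
phi    <- 0.7   # AR(1) coefficient; |phi| < 1 gives a stationary series
sigma  <- 1     # standard deviation of the normal errors

y <- numeric(n_time)
y[1] <- rnorm(1, mean = 0, sd = sigma / sqrt(1 - phi^2))  # stationary start
for (i in 2:n_time) {
  # each value depends on the previous one plus normal noise
  y[i] <- phi * y[i - 1] + rnorm(1, mean = 0, sd = sigma)
}

# Estimate the coefficient with a fixed order of 1
fit <- ar(y, aic = FALSE, order.max = 1, method = "yule-walker")
print(fit$ar)
```

The estimate should be close to the value used in the simulation, with sampling error that shrinks as the series gets longer.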

Spatial models have dependence in space. They can be LM or GLM with \(n = 1\) and an \(m \times m\) covariance matrix \(\Sigma\), where \(m\) is the number of locations. Kriging is a traditional spatial model for continuous space. Spatial autoregressive models are used where space is viewed as discrete blocks (e.g., census tracts, counties, …).
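The idea of a single observation (\(n = 1\)) with an \(m \times m\) covariance matrix can be sketched by building \(\Sigma\) from distances between locations. The exponential covariance function, the variance `sigma2`, and the range parameter `range_par` are assumptions made for this example; other covariance functions are common.

```r
set.seed(4)
m <- 30                                # number of locations
coords <- cbind(runif(m), runif(m))    # random locations in the unit square
d <- as.matrix(dist(coords))           # m x m matrix of pairwise distances

sigma2    <- 1.5   # variance (assumed value)
range_par <- 0.3   # distance scale of the correlation (assumed value)
Sigma <- sigma2 * exp(-d / range_par)  # exponential covariance, m x m

# One multivariate normal draw (n = 1) with covariance Sigma:
# chol() returns the upper triangle, so transpose to get L with L L' = Sigma
L <- t(chol(Sigma))
y <- as.vector(L %*% rnorm(m))

print(round(Sigma[1:3, 1:3], 2))
```

Nearby locations have covariance close to `sigma2`, and covariance decays toward zero as distance grows, which is the dependence in space that the model describes.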

2. P values, Bayes, model types
Discussion and R
Duke University

env/bio 665 Bayesian inference for environmental models

Jim Clark

2021-01-28
