Week 1 Skill Builders

The assignment

In McElreath:

Warm up with chapter 2’s Martian-Earthling problem 2M3. Then try the alternate priors in 2M2.
Continue to chapter 3: problems 3M1-M3, M5 (M is for medium difficulty).

These will reinforce the basics of the inferential techniques we will use.

Chapter 2 – Small worlds and large worlds

2M3: My Favorite Martian

Here’s the data from the problem:

\[ \begin{align} Pr(land \mid Earth) &= 1 − 0.7 \\ Pr(land \mid Mars) &= 1 \end{align} \]

Are there not equal prior expectations of each globe?

\[ \begin{align} Pr(Earth) &= 0.5 \\ Pr(Mars) &= 0.5 \end{align} \]

Now the piece de resistance, Bayes,

\[ Pr(Earth \mid land) = \frac{Pr(land \mid Earth)Pr(Earth)}{Pr(land)} \]

We code this in R as

pr_earth_land <- 0.42 # It is not 42! Let R do the arithmetic

The answer is 0.42.

2M2: We know a little something

We start with this model and the improper prior (does not add to one like a good probability distribution should).

p_grid <- seq( from=0 , to=1 , length.out=100 )
likelihood <- dbinom( 3 , size=3 , prob=p_grid )
prior <- 1
posterior <- likelihood * prior
posterior <- posterior / sum(posterior) # standardize
plot( posterior ~ p_grid , type="l" )

In this code chunk we insert a new prior prior <- ifelse( p_grid < 0.5 , 0 , 1 ) # new prior in place of the improper 1 and rerun.

Next we try adding more land than water in this chunk with this data:

\[ L, W, W, L, W, W, W \]

# moreland

The upshot?

Point one
Point two
Kai toi loipa, usw, etc.

Chapter 3 – Sampling the imaginary

Start up with 3M1-2

Here we input the new number of tosses and results. We just change the data in this existing model.

p_grid <- seq( from=0 , to=1 , length.out=1000 ) # hypptheses
prior <- rep( 1 , 1000 )
tosses <- 9 # data
water <- 6 # data
likelihood <- dbinom( water , size=tosses , prob=p_grid )
posterior <- likelihood * prior
posterior <- posterior / sum(posterior)
plot( posterior ~ p_grid , type="l" )

Next 3M2-3 and a predictive check

Draw samples first, then check the results with a density plot and a 90% HPDI (Highest Posterior Density Interval) from the rethinking package.

library(rethinking)
samples <- sample( p_grid , prob=posterior , size=1e4 , replace=TRUE )
dens( samples )

HPDI( samples , prob=0.9 ) #90% HPDI

##      |0.9      0.9| 
## 0.4154154 0.8688689

Interpretation:

The HPDI indicates…
The density function shows that…

We now run a predictive check on the posterior distribution. This routine …

Sample …
Simulate …
Compare simulation with the data that w==8. Calculate the proportions.

samples <- sample( p_grid , prob=posterior , size=1e4 , replace=TRUE )
w <- rbinom( 1e4 , size=tosses , prob=samples )
# Look this sum() / 1e4 ratio in the chapter and calculate here. Use the simplehist() of w to plot the results

What does this mean?

A good model?
Compatibility?

Now to change up the priors 3M5

Start with the alternate priors from 2M2.

# alternate priors

Run the HPDI() and simulate w <- rbinom() and show simplehist().

# check again

Some questions to ponder.

We don’t trust the data? Is that what the new prior is saying?
What about the observed value off 8. What happened to the predictive distribution?
What does this mean for building models? Communicating with those who would consume your analysis?