Bayes Week 4 HW

knitr::opts_chunk$set(message = F)

Question 1

Use the !Kung data from chapter 4 that we worked with in class (18+ only, Howell1 dataset in the rethinking package).

library(rethinking)

data(Howell1)
d <- Howell1 # Data
d2 <- d[d$age >= 18, ] # Only 18+

First, try to construct what you consider a “better” set of priors than what we used in class to look at the weight/height relationship. There isn’t a right answer here, but I’d like you to come up with a principled choice and explain why you did it (you don’t need to select new priors for all of the parameters unless you want to).

\(h_i \sim Normal(\mu,\sigma)\) likelihood (height)

\(\mu_i = \alpha + \beta(x_i - \bar{x})\) Linear model for mu

\(\alpha \sim Normal(152,20)\) alpha prior

\(\beta \sim Log-Normal(0,0.95)\) beta prior

\(\sigma \sim Uniform(0,50)\) sigma prior

set.seed(124)

N <- 1e2 # hard to make out spaghetti plot below at higher values here
a <- rnorm( N , 152 , 20 ) # a ~ Normal(152, 20)
b <- rlnorm( 1e4 , 0 , .95 ) # B ~ log-normal(0, .95)

1a:

I looked up the average height of Kalahari peoples and found it was around 5’, ca. 1964 (based on a Google Scholar abstract…). I did not touch the SD, because ~7" seems like a reasonable value to me!

I also slightly reduced the beta prior because the range of weight-height betas that we generated with our in-class/Ch. 4 prior seemed maybe a little too inclusive, though basically reasonable. It’s implausible to expect the !Kung people to include both the tallest and smallest people ever recorded, at least not to the point where we need our model to account for those possibilities. (And the example from Ch. 4 included a few lines that extended from below 0 height to well above the tallest person ever, over the range of weight in the Howell1 dataset.)

Next, run a prior predictive simulation and briefly describe what you conclude from it.

plot( NULL , xlim=range(d2$weight) , ylim=c(-100,400) ,
      xlab="weight" , ylab="height" )
abline( h=0 , lty=2 ) # embryo: preserving these reference points b/c they're useful
abline( h=272 , lty=1 , lwd=0.5 ) # world's tallest person
mtext( "b ~ rlnorm(0, 0.95)" )
xbar <- mean(d2$weight)
for ( i in 1:N ) curve( # simulations w/in plot using forloop
  a[i] + b[i]*(x - xbar) ,
  from=min(d2$weight) , 
  to=max(d2$weight) , 
  add=TRUE ,
  col=col.alpha("black",0.2) )

1b:

This looks reasonable to me! Even the most extreme simulated betas are within the “impossible” constraints of predictions beyond 0cm & 272cm. The majority of them look at least somewhat plausible.

use quap() to run your model. Summarize your model how you see fit and note whether your conclusions are any different from the conclusions from the slides.

# fit model
m1c <- quap(
  alist(
    height ~ dnorm( mu , sigma ) ,
    mu <- a + b*( weight - xbar ) ,
    a ~ dnorm( 152 , 20 ) ,
    b ~ dlnorm( 0 , .95 ) ,
    sigma ~ dunif( 0 , 50 )
  ) ,
data = d2)

## Summaries
precis( m1c , prob = .95 )

##              mean         sd       2.5%       97.5%
## a     154.5966182 0.27030752 154.066825 155.1264112
## b       0.9033022 0.04191826   0.821144   0.9854605
## sigma   5.0718817 0.19115482   4.697225   5.4465383

diag(round( vcov( m1c ) , 3 )) # very little covariance here

##     a     b sigma 
## 0.073 0.002 0.037

post <- extract.samples( m1c , n=1e4 )
head(post)

##          a         b    sigma
## 1 154.7049 0.8089684 5.255436
## 2 154.3279 0.9138603 4.927109
## 3 154.6908 0.9027068 4.964012
## 4 154.8760 0.9144905 5.184900
## 5 154.7801 0.9189501 4.944660
## 6 154.5780 0.8976717 4.907925

precis( post , prob = .95 )

##              mean         sd        2.5%       97.5%    histogram
## a     154.5947998 0.27051588 154.0741769 155.1315697 ▁▁▁▂▃▇▇▃▂▁▁▁
## b       0.9033676 0.04194655   0.8198444   0.9846263      ▁▂▇▇▂▁▁
## sigma   5.0720140 0.19069058   4.6933827   5.4477606    ▁▁▁▅▇▃▁▁▁

plot( height ~ weight , data = d2 , col = rangi2 )
post <- extract.samples( m1c )
{
  a_map <- mean(post$a)
  b_map <- mean(post$b)
    }
curve( a_map + b_map*(x - xbar) , add=T)

# visualize vac/cov
pairs( m1c )

# extracting samples
(extract.samples( m1c )[1:10,])

##           a         b    sigma
## 1  154.1553 0.9331170 5.144758
## 2  155.3214 0.9312054 4.970331
## 3  154.5469 0.9772932 5.321042
## 4  154.2190 0.8396732 4.803230
## 5  154.2760 0.9233432 5.321848
## 6  154.5837 0.8494858 5.111305
## 7  154.5399 0.9000540 5.103513
## 8  154.5935 0.9346021 5.340856
## 9  154.7916 0.8501258 5.625256
## 10 154.6580 0.9602327 5.200707

N <- 10
dN <- d2[ 1:N , ]
mN <- quap(
  alist(
    height ~ dnorm( mu , sigma ) ,
    mu <- a + b*( weight - mean(weight) ) ,
    a ~ dnorm( 152 , 20 ) ,
    b ~ dlnorm( 0 , .95 ) ,
    sigma ~ dunif( 0 , 50 )
      ) , data=dN )


## Link!

# define sequence of weights to compute predictions for
# these values will be on the horizontal axis
weight.seq <- seq( from=25 , to=70 , by=1 )
# use link to compute mu
# for each sample from posterior
# and for each weight in weight.seq
mu <- link( m1c , data=data.frame(weight=weight.seq) )
str(mu)

##  num [1:1000, 1:46] 135 137 136 136 137 ...

# use type="n" to hide raw data
plot( height ~ weight , d2 , type="n" )
# loop over samples and plot each mu value
for ( i in 1:100 )
points( weight.seq , mu[i,] , pch=16 , col=col.alpha(rangi2,0.1) )

## Summaries!

# for mu
mu.mean <- apply( mu , 2 , mean )
mu.PI <- apply( mu , 2 , PI , prob=0.95 )
mu.HPDI <- apply( mu , 2 , HPDI , prob=0.95 )
# plot raw data 4.57
# fading out points to make line and interval more visible
plot( height ~ weight , data=d2 , col=col.alpha(rangi2,0.5))
# plot the MAP line, aka the mean mu for each weight
lines( weight.seq , mu.mean )
# plot a shaded region for 95% PI
shade( mu.PI , weight.seq )

## Plot!
sim.height <- sim( m1c , data=list(weight=weight.seq) , n=1e4 )

height.PI <- apply( sim.height , 2 , PI , prob=0.95 )

# plot raw data 
plot( height ~ weight , d2 , col=col.alpha(rangi2,0.5) , main = "95% prediction interval for height of
      !Kung adults as a function of weight" )
# draw MAP line
lines( weight.seq , mu.mean )
# draw HPDI region for line
shade( mu.HPDI , weight.seq )
# draw PI region for simulated heights
shade( height.PI , weight.seq )

1c:

Using quadratic approximation of the posterior, we fit a linear regression to our !Kung height data for adults with weight as a predictor. We found that our posterior means were 154.6cm (with 95% of the posterior probability between 154.1 and 155.13) for the intercept, 0.90 (with 95% posterior probability between 0.82 and 0.99) for the weight-height relationship, and 5.1 (with 95% of posterior probability between 4.70 and 5.45) for sigma. These parameters show very little covariance with one another (rounding to 0).

The plot above titled “95% prediction interval for height of !Kung adults as a function of weight” is a visual representation of our posterior. The rangi-colored (?) dots are our actual !Kung height by weight datapoints. The black line is our maximum a posteriori line (MAP line; or, the linear model generated by plotting our mean predicted-mu at each value of weight). The darker shaded region around the MAP line is the 95% prediction interval of that line, demonstrating the most plausible values for mu. The lighter shaded region captures the area where the model predicts 95% of the actual heights in the population to fall, at each value of weight.

Question 2

Using your model from question 2, give the expected height (mean of posterior) and 95% PI for individuals of the following weights:

(I’m also graphing the posterior distributions of mu at each weight, just to see.)

# post defined above, using extract.samples function!
mu_at_45 <- post$a + post$b * ( 45 - xbar )
round(mean(mu_at_45), 2)

## [1] 154.61

(round(PI( mu_at_45 , prob=0.95 ), 2))

##     3%    98% 
## 154.08 155.14

dens( mu_at_45 , col=rangi2 , lwd=2 , xlab="mu|weight=45")

mu_at_63 <- post$a + post$b * ( 63 - xbar )
round(mean(mu_at_63), 2)

## [1] 170.86

(round(PI( mu_at_63 , prob=0.95 ), 2))

##     3%    98% 
## 169.28 172.46

dens( mu_at_63 , col=rangi2 , lwd=2 , xlab="mu|weight=63")

mu_at_31 <- post$a + post$b * ( 31 - xbar )
round(mean(mu_at_31), 2)

## [1] 141.97

(round(PI( mu_at_31 , prob=0.95 ), 2))

##     3%    98% 
## 140.70 143.22

dens( mu_at_31 , col=rangi2 , lwd=2 , xlab="mu|weight=31")

Question 3

Now select only the children (age < 13) in the dataset.

d3 <- d[d$age < 13,]

Choose your priors, briefly explain why you chose them, and show their reasonableness with a prior predictive simulation.

# setting priors
N <- 1e2 
a <- rnorm( N , 99 , 25 ) # a ~ Normal(99, 25)
b <- rlnorm( 1e4 , .5 , .95 ) # B ~ log-normal(0, .95)

# prior predictive sim
plot( NULL , xlim=range(d3$weight) , ylim=c(-100,400) ,
      xlab="weight" , ylab="height" )
abline( h=0 , lty=2 ) # embryo: preserving these reference points b/c they're useful
abline( h=272 , lty=1 , lwd=0.5 ) # world's tallest person
mtext( "b ~ rlnorm(0,0.95)" )
xbar <- mean(d3$weight)
for ( i in 1:N ) curve( # simulations w/in plot using forloop
  a[i] + b[i]*(x - xbar) ,
  from=min(d3$weight) , 
  to=max(d3$weight) , 
  add=TRUE ,
  col=col.alpha("black",0.2) )

3a:

I chose to alter my priors by reducing the mean of the alpha prior to 65% of the first alpha (because children are shorter than adults), and I increased the sd of the alpha prior (because 0 - 13 years old is a wide range of possible heights/weights!). I also increased the mean of the beta prior because I expect the relationship between height and weight to be more steep for children, but left the sd where it was for Q1.

Analyze the weight/height relationship. Provide a way of summarizing your findings and show the uncertainty in the model. (This is intentionally open-ended: I’m not asking you to do every possible thing we did in class for an analysis, but I want you to think about how you can efficiently but effectively summarize your results.)

# fit model
m3b <- quap(
  alist(
    height ~ dnorm( mu , sigma ) ,
    mu <- a + b*( weight - xbar ) ,
    a ~ dnorm( 99 , 25 ) ,
    b ~ dlnorm( 0 , .95 ) ,
    sigma ~ dunif( 0 , 50 )
  ) ,
data = d3)

# summaries of posterior parameters
precis( m3b )

##            mean         sd      5.5%     94.5%
## a     98.810798 0.46108076 98.073902 99.547694
## b      3.621396 0.08088922  3.492120  3.750673
## sigma  5.572208 0.32609525  5.051045  6.093371

round(vcov ( m3b ), 3)

##           a     b sigma
## a     0.213 0.000 0.000
## b     0.000 0.007 0.000
## sigma 0.000 0.000 0.106

post.3b <- extract.samples( m3b , n=1e4 )
precis( post.3b , prob = .95 )

##            mean         sd      2.5%     97.5%       histogram
## a     98.804407 0.45880059 97.911086 99.711249       ▁▁▁▃▇▅▁▁▁
## b      3.621816 0.08164068  3.465732  3.779706 ▁▁▁▂▃▇▇▇▃▁▁▁▁▁▁
## sigma  5.576871 0.32873621  4.938558  6.222934   ▁▁▁▁▂▅▇▇▅▂▁▁▁

pairs( m3b ) # definitely very little covariance!!

## Link!
weight.seq.3b <- seq( from=0 , to=35 , by=1 )
mu.3b <- link( m3b , data=data.frame(weight=weight.seq.3b) )
# use type="n" to hide raw data
plot( height ~ weight , d3 , type="n" )
# loop over samples and plot each mu value
for ( i in 1:100 )
points( weight.seq.3b , mu.3b[i,] , pch=16 , col=col.alpha(rangi2,0.1) )

## Summaries!

# for mu
mu.mean.3b <- apply( mu.3b , 2 , mean )
mu.PI.3b <- apply( mu.3b , 2 , PI , prob=0.95 )
mu.HPDI.3b <- apply( mu.3b , 2 , HPDI , prob=0.95 )
# plot raw data 4.57
# fading out points to make line and interval more visible
plot( height ~ weight , data=d3 , col=col.alpha(rangi2,0.5))
# plot the MAP line, aka the mean mu for each weight
lines( weight.seq.3b , mu.mean.3b )
# plot a shaded region for 95% PI
shade( mu.PI.3b , weight.seq.3b )

#####
# Hmm! This doesn't look linear to me.

## Plot!
sim.height.3b <- sim( m3b , data=list(weight=weight.seq.3b) , n=1e4 )

height.PI.3b <- apply( sim.height.3b , 2 , PI , prob=0.95 )

# actual plot
plot( height ~ weight , d3 , col=col.alpha(rangi2,0.5) , main = "95% prediction interval for height of
      !Kung children as a function of weight" )
lines( weight.seq.3b , mu.mean.3b )
shade( mu.HPDI.3b , weight.seq.3b )
shade( height.PI.3b , weight.seq.3b )

3b:

Using quadratic approximation of the posterior, we fit a linear regression to our !Kung children height data with weight as a predictor. We found that our posterior means were 98.8cm (with 95% of the posterior probability between 98.07 and 99.55) for the intercept, 3.62 (with 95% posterior probability between 3.49 and 3.75) for the weight-height relationship, and 5.57 (with 95% of posterior probability between 5.05 and 6.09) for sigma. These parameters again show very little covariance with one another (rounding to 0).

The plot above titled “95% prediction interval for height as a function of weight” is a visual representation of our posterior. The rangi-colored dots are our actual !Kung height by weight datapoints. The black line is our maximum a posteriori line (MAP line; or, the linear model generated by plotting our mean predicted-mu at each value of weight). The darker shaded region around the MAP line is the 95% prediction interval of that line, demonstrating the most plausible values for mu. The lighter shaded region captures the area where the model predicts 95% of the actual heights in the population to fall, at each value of weight.

However, in this the linear model doesn’t seem to fit the data well. The data fall consistently below the MAP line at high and low values of weight, and consistently above the MAP line at middling values (between about 8 and 23 kg), indicating that we may want to explore non-linear models (e.g., a polynomial regression of some kind).

Bayes Week 4 HW

Alexandra Flores

07 February, 2022

Question 1

Question 2

Question 3