Chapter 4 - Geocentric Models

This chapter introduced the simple linear regression model, a framework for estimating the association between a predictor variable and an outcome variable. The Gaussian distribution comprises the likelihood in such models, because it counts up the relative numbers of ways different combinations of means and standard deviations can produce an observation. To fit these models to data, the chapter introduced quadratic approximation of the posterior distribution and the tool quap. It also introduced new procedures for visualizing prior and posterior distributions.

Place each answer inside the code chunk (grey box). The code chunks should contain a text response or a code that completes/answers the question or activity requested. Problems are labeled Easy (E), Medium (M), and Hard(H).

Finally, upon completion, name your final output .html file as: YourName_ANLY505-Year-Semester.html and publish the assignment to your R Pubs account and submit the link to Canvas. Each question is worth 5 points.

Questions

4E1. In the model definition below, which line is the likelihood? \[\begin{align} \ yi∼Normal(μ,σ) \\ \ μ∼Normal(0,10) \\ \ σ∼Exponential(1) \\ \end{align}\]

#y_i ∼ Normal(μ, σ) is the likelihood.

4E2. In the model definition just above, how many parameters are in the posterior distribution?

#Two: μ and σ

4E3. Using the model definition above, write down the appropriate form of Bayes’ theorem that includes the proper likelihood and priors.

#Pr(μ,σ|y)=∏iNormal(yi|μ,σ)Normal(μ|0,10)Uniform(σ|0,10)∫∫∏iNormal(hi|μ,σ)Normal(μ|0,10)Uniform(σ|0,10)dμdσ

4E4. In the model definition below, which line is the linear model? \[\begin{align} \ yi∼Normal(μ,σ) \\ \ μi=α+βxi \\ \ α∼Normal(0,10) \\ \ β∼Normal(0,1) \\ \ σ∼Exponential(2) \\ \end{align}\]

#μ_i = α + βx_i 

4E5. In the model definition just above, how many parameters are in the posterior distribution?

#Three: α, β, σ

4M1. For the model definition below, simulate observed y values from the prior (not the posterior). \[\begin{align} \ yi∼Normal(μ,σ) \\ \ μ∼Normal(0,10) \\ \ σ∼Exponential(1)\\ \end{align}\]

sample_mu <- rnorm(1e4, 0, 10)
sample_sigma <- runif(1e4, 0, 10)
prior_y <- rnorm(1e4, sample_mu, sample_sigma)
dens(prior_y)

4M2. Translate the model just above into a quap formula.

formula <- alist(y ~ dnorm(mu, sigma),
  mu ~ dnorm(0, 10),
  sigma ~ dunif(0, 10)
)
4M3. Translate the quap model formula below into a mathematical model definition:

y ~ dnorm( mu , sigma ),
mu <- a + b*x,
a ~ dnorm( 0 , 10 ),
b ~ dunif( 0 , 1 ),
sigma ~ dexp( 1 )

# y_i ∼ Normal(μ, σ) 
# μ_i = α + βx_i 
# α ∼ Normal(0, 10) 
# β ∼ Uniform(0, 1) 
# σ ∼ Exponential(1)

4M4. A sample of students is measured for height each year for 3 years. After the third year, you want to fit a linear regression predicting height using year as a predictor. Write down the mathematical model definition for this regression, using any variable names and priors you choose. Be prepared to defend your choice of priors.

# y_i ~ Normal(μ, σ)
# μ_i = α + βx_i
# α ~ Normal(165,25)
# β ~ Normal(4,3)
# σ ~ Uniform(0,50)
#This is a linear model. Assume the average height of student is 165 and standard deviation 25.

4M5. Now suppose I remind you that every student got taller each year. Does this information lead you to change your choice of priors? How?

#Yes. I will adjust the range in my prior.
#The assumption is already in the prior.
# y_i ~ Normal(μ, σ)
# μ_i = α + βx_i
# α ~ Normal(120,25)
# β ~ Normal(4,3)
# σ ~ Uniform(0,50)

4M6. Now suppose I tell you that the variance among heights for students of the same age is never more than 64cm. How does this lead you to revise your priors?

# y_i ~ Normal(μ, σ)
# μ_i = α + βx_i
# α ~ Normal(120,25)
# β ~ Normal(4,3)
# σ ~ Uniform(0,8)
#σ is the square root of variance. If the variance is never more than 64, then max σ is 8.

4M7. Refit model m4.3 from the chapter, but omit the mean weight xbar this time. Compare the new model’s posterior to that of the original model. In particular, look at the covariance among the parameters. What is different? Then compare the posterior predictions of both models.

d <- Howell1
d2 <- d[d$age >= 18,]
xbar <- mean(d2$weight)
m4.3<-quap(alist(
    height ~ dnorm(mu,sigma),
    mu <- a + b*(weight - xbar),
    a ~ dnorm(178,20),
    b ~ dlnorm(0,1),
    sigma ~ dunif(0,50)
  ),
  data=d2)
precis(m4.3)
##              mean         sd        5.5%       94.5%
## a     154.6013654 0.27030779 154.1693614 155.0333695
## b       0.9032804 0.04192365   0.8362784   0.9702825
## sigma   5.0718833 0.19115501   4.7663807   5.3773859
post <- extract.samples(m4.3)
mu.link <-function(weight)post$a+post$b*(weight-xbar)
weight.seq <-seq(from=25,to=70,by=1)
mu <-sapply(weight.seq,mu.link)
mu.mean <-apply(mu,2,mean)
mu.CI <-apply(mu,2,PI,prob=0.89)
mu.HPDI <- apply(mu, 2, HPDI , prob=0.89)
sim.height <-sim(m4.3,data=list(weight=weight.seq))
str(sim.height)
##  num [1:1000, 1:46] 138 137 133 133 139 ...
height.PI <-apply(sim.height,2,PI,prob=0.89)
# plot raw data
plot( height~weight,d2,col=col.alpha(rangi2,0.5))
# draw MAP line
lines( weight.seq,mu.mean)
# draw HPDI region for line
shade( mu.HPDI,weight.seq)
# draw PI region for simulated heights
shade( height.PI,weight.seq)

m4.3_new<-quap(alist(
    height ~ dnorm(mu,sigma),
    mu <- a + b*(weight),
    a ~ dnorm(178,20),
    b ~ dlnorm(0,1),
    sigma ~ dunif(0,50)
  ),
  data=d2)
precis(m4.3_new)
##              mean         sd        5.5%       94.5%
## a     114.5343101 1.89774660 111.5013445 117.5672757
## b       0.8907301 0.04175797   0.8239928   0.9574675
## sigma   5.0727171 0.19124877   4.7670646   5.3783695
post <- extract.samples(m4.3_new)
mu.link <- function(weight) post$a + post$b*weight 
weight.seq <- seq( from=25 , to=70 , by=1 )
mu <- sapply( weight.seq , mu.link )
mu.mean <- apply( mu , 2 , mean )
mu.CI <- apply( mu , 2 , PI , prob=0.89)
mu.HPDI <- apply( mu , 2 , HPDI , prob=0.89)
sim.height <- sim( m4.3_new , data=list(weight=weight.seq) )
height.PI <- apply( sim.height , 2 , PI , prob=0.89 )

plot( height ~ weight , d2 , col=col.alpha(rangi2,0.5) )
lines( weight.seq , mu.mean )
shade( mu.HPDI , weight.seq )
shade( height.PI , weight.seq )

4M8. In the chapter, we used 15 knots with the cherry blossom spline. Increase the number of knots and observe what happens to the resulting spline. Then adjust also the width of the prior on the weights—change the standard deviation of the prior and watch what happens. What do you think the combination of knot number and the prior on the weights controls?

data(cherry_blossoms)
d3 <- cherry_blossoms
precis(d3)
##                   mean          sd      5.5%      94.5%
## year       1408.000000 350.8845964 867.77000 1948.23000
## doy         104.540508   6.4070362  94.43000  115.00000
## temp          6.141886   0.6636479   5.15000    7.29470
## temp_upper    7.185151   0.9929206   5.89765    8.90235
## temp_lower    5.098941   0.8503496   3.78765    6.37000
##                                                                                                                           histogram
## year                       <U+2587><U+2587><U+2587><U+2587><U+2587><U+2587><U+2587><U+2587><U+2587><U+2587><U+2587><U+2587><U+2581>
## doy                                                                <U+2581><U+2582><U+2585><U+2587><U+2587><U+2583><U+2581><U+2581>
## temp                                                               <U+2581><U+2583><U+2585><U+2587><U+2583><U+2582><U+2581><U+2581>
## temp_upper <U+2581><U+2582><U+2585><U+2587><U+2587><U+2585><U+2582><U+2582><U+2581><U+2581><U+2581><U+2581><U+2581><U+2581><U+2581>
## temp_lower <U+2581><U+2581><U+2581><U+2581><U+2581><U+2581><U+2581><U+2583><U+2585><U+2587><U+2583><U+2582><U+2581><U+2581><U+2581>
d4<-d3[complete.cases(cherry_blossoms$temp),]
numknots15<-15
knot15<-quantile(d4$year,probs=seq(0,1,length.out=numknots15))
knot15
##        0% 7.142857% 14.28571% 21.42857% 28.57143% 35.71429% 42.85714%       50% 
##  839.0000  937.2143 1017.4286 1097.6429 1177.8571 1258.0714 1338.2857 1418.5000 
## 57.14286% 64.28571% 71.42857% 78.57143% 85.71429% 92.85714%      100% 
## 1498.7143 1578.9286 1659.1429 1739.3571 1819.5714 1899.7857 1980.0000
numknots30 <- 30
knot30 <- quantile(d4$year, probs = seq(0, 1, length.out = numknots30))
knot30
##        0% 3.448276% 6.896552% 10.34483%  13.7931% 17.24138% 20.68966% 24.13793% 
##  839.0000  895.7241  934.4483  973.1724 1011.8966 1050.6207 1089.3448 1128.0690 
## 27.58621% 31.03448% 34.48276% 37.93103% 41.37931% 44.82759% 48.27586% 51.72414% 
## 1166.7931 1205.5172 1244.2414 1282.9655 1321.6897 1360.4138 1399.1379 1437.8621 
## 55.17241% 58.62069% 62.06897% 65.51724% 68.96552% 72.41379% 75.86207% 79.31034% 
## 1476.5862 1515.3103 1554.0345 1592.7586 1631.4828 1670.2069 1708.9310 1747.6552 
## 82.75862%  86.2069% 89.65517% 93.10345% 96.55172%      100% 
## 1786.3793 1825.1034 1863.8276 1902.5517 1941.2759 1980.0000
knot_list <- quantile(d4$year , probs=seq(0,1,length.out=numknots30))
B <- bs(d4$year, knots=knot_list[-c(1,numknots30)] , degree=3 , intercept=TRUE )
plot( NULL , xlim=range(d4$year) , ylim=c(0,1) , xlab="year" , ylab="basis value" )
for ( i in 1:ncol(B) ) lines( d4$year , B[,i])

m4.7_new <- quap(
  alist(D ~ dnorm(mu, sigma) ,
        mu <- a + B %*% w ,
        a ~ dnorm(6,10),
        w ~ dnorm(0,1),
        sigma ~ dexp(1)),
  data=list( D=d4$temp, B=B ) ,
start=list( w=rep( 0 , ncol(B) ) ) )

post <- extract.samples(m4.7_new)
w <- apply( post$w , 2 , mean )
plot( NULL , xlim=range(d4$year) , ylim=c(-2,2) , xlab="year" , ylab="basis * weight" )
for ( i in 1:ncol(B) ) lines( d4$year , w[i]*B[,i] )

mu <- link(m4.7_new)
mu_PI <- apply(mu,2,PI,0.97)
plot( d4$year , d4$temp , col=col.alpha(rangi2,0.3) , pch=16 ) 
shade(mu_PI, d4$year, col=col.alpha("black",0.5))

4H2. Select out all the rows in the Howell1 data with ages below 18 years of age. If you do it right, you should end up with a new data frame with 192 rows in it.

  1. Fit a linear regression to these data, using quap. Present and interpret the estimates. For every 10 units of increase in weight, how much taller does the model predict a child gets?

  2. Plot the raw data, with height on the vertical axis and weight on the horizontal axis. Superimpose the MAP regression line and 89% interval for the mean. Also superimpose the 89% interval for predicted heights.

  3. What aspects of the model fit concern you? Describe the kinds of assumptions you would change, if any, to improve the model. You don’t have to write any new code. Just explain what the model appears to be doing a bad job of, and what you hypothesize would be a better model.

d2 <- Howell1[Howell1$age < 18, ]
nrow(d2)
## [1] 192
quap4H2 <- alist(
  height ~ dnorm(mu, sigma),
  mu <- a + b * weight,
  a ~ dnorm(110, 30),
  b ~ dnorm(0, 10),
  sigma ~ dunif(0, 60)
)

quaplist <- alist(
  height ~ dnorm(mu, sigma),
  mu <- a + b * weight,
  a ~ dnorm(110, 30),
  b ~ dnorm(0, 10),
  sigma ~ dunif(0, 60))

m <- map(quaplist, data = d2)
qp <- quap(quap4H2, data = d2)
precis(qp)
##            mean         sd      5.5%     94.5%
## a     58.344958 1.39573938 56.114297 60.575619
## b      2.715035 0.06823404  2.605984  2.824086
## sigma  8.437247 0.43057622  7.749103  9.125391
?link
## starting httpd help server ... done
d <- Howell1
d2 <- d[d$age < 18,]
plot(height ~ weight, data = d2, col = col.alpha(rangi2, 0.3))

weight.seq <- seq(from = min(d2$weight), to = max(d2$weight), by = 1)
mu <- link(m, data = data.frame(weight = weight.seq))
mu.mean <- apply(mu, 2, mean)
mu.HPDI <- apply(mu, 2, HPDI, prob = 0.89)
lines(weight.seq, mu.mean)
shade(mu.HPDI, weight.seq)
sim.height <- sim(m, data = list(weight = weight.seq))
height.HPDI <- apply(sim.height, 2, HPDI, prob = 0.89)
shade(height.HPDI, weight.seq)

#The model is not very accurate when the weights are under 10 or over 30. Probably should try a polynomial regression.