4E1. In the model definition below, which line is the likelihood? \[\begin{align} y_i &\sim \text{Normal}(\mu, \sigma) \\ \mu &\sim \text{Normal}(0, 10) \\ \sigma &\sim \text{Exponential}(1) \end{align}\]
#y_i ∼ Normal(μ, σ)
4E2. In the model definition just above, how many parameters are in the posterior distribution?
#Two: μ and σ
4E3. Using the model definition above, write down the appropriate form of Bayes’ theorem that includes the proper likelihood and priors.
#Pr(μ,σ|y) = ∏_i Normal(y_i|μ,σ) Normal(μ|0,10) Exponential(σ|1) / ∫∫ ∏_i Normal(y_i|μ,σ) Normal(μ|0,10) Exponential(σ|1) dμ dσ
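# To make the formula concrete, here is a small grid-approximation sketch (my own addition; the y values are made up purely for illustration):
set.seed(4)
y <- rnorm(20, 1, 2)  # hypothetical observations
grid <- expand.grid(mu = seq(-5, 5, length.out = 200),
                    sigma = seq(0.5, 5, length.out = 200))
log_lik <- sapply(1:nrow(grid),
                  function(i) sum(dnorm(y, grid$mu[i], grid$sigma[i], log = TRUE)))
log_post <- log_lik + dnorm(grid$mu, 0, 10, log = TRUE) + dexp(grid$sigma, 1, log = TRUE)
post <- exp(log_post - max(log_post))
post <- post / sum(post)  # normalizing over the grid plays the role of the double integral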
4E4. In the model definition below, which line is the linear model? \[\begin{align} y_i &\sim \text{Normal}(\mu_i, \sigma) \\ \mu_i &= \alpha + \beta x_i \\ \alpha &\sim \text{Normal}(0, 10) \\ \beta &\sim \text{Normal}(0, 1) \\ \sigma &\sim \text{Exponential}(2) \end{align}\]
#μ_i = α + βx_i
4E5. In the model definition just above, how many parameters are in the posterior distribution?
#Three: α, β, and σ
4M1. For the model definition below, simulate observed y values from the prior (not the posterior). \[\begin{align} y_i &\sim \text{Normal}(\mu, \sigma) \\ \mu &\sim \text{Normal}(0, 10) \\ \sigma &\sim \text{Exponential}(1) \end{align}\]
library(rethinking)
# draw parameter values from their priors, then simulate one y from each draw
sample_mu <- rnorm(1e4, 0, 10)
sample_sigma <- rexp(1e4, 1)
prior_y <- rnorm(1e4, sample_mu, sample_sigma)
dens(prior_y)
4M2. Translate the model just above into a quap formula.
flist <- alist(
  yi ~ dnorm(mu, sigma),
  mu ~ dnorm(0, 10),
  sigma ~ dexp(1)
)
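# Quick sanity check (my own addition, not required by the exercise): fit the formula to fake data with known mu = 3 and sigma = 2 and confirm quap recovers them.
set.seed(11)
m4M2 <- quap(flist, data = list(yi = rnorm(50, 3, 2)))
precis(m4M2)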
4M3. Translate the quap model formula below into a mathematical model definition:
y ~ dnorm( mu , sigma ),
mu <- a + b*x,
a ~ dnorm( 0 , 10 ),
b ~ dunif( 0 , 1 ),
sigma ~ dexp( 1 )
#y_i ~ Normal(μ_i, σ)
#μ_i = a + b*x_i
#a ~ Normal(0, 10)
#b ~ Uniform(0, 1)
#σ ~ Exponential(1)
4M4. A sample of students is measured for height each year for 3 years. After the third year, you want to fit a linear regression predicting height using year as a predictor. Write down the mathematical model definition for this regression, using any variable names and priors you choose. Be prepared to defend your choice of priors.
#h_i ~ Normal(μ_i, σ)
#μ_i = a + b*x_i   (x_i = year of measurement)
#a ~ Normal(160, 20)
#b ~ Normal(4, 2)
#σ ~ Uniform(0, 50)
#I used a linear model. The prior on a covers likely average heights for students from primary school through college (mean 160 cm, SD 20 cm). The prior on b reflects plausible yearly growth rates (mean 4 cm/year, SD 2 cm/year). σ gets a flat uniform prior on the SD of heights.
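# Prior predictive check (my own addition) to help defend these priors: simulate growth lines implied by the priors over the three years of measurement.
N <- 100
a_prior <- rnorm(N, 160, 20)
b_prior <- rnorm(N, 4, 2)
plot(NULL, xlim = c(1, 3), ylim = c(80, 240), xlab = "year", ylab = "height (cm)")
for (i in 1:N) abline(a = a_prior[i], b = b_prior[i], col = col.alpha("black", 0.2))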
4M5. Now suppose I remind you that every student got taller each year. Does this information lead you to change your choice of priors? How?
#h_i ~ Normal(μ_i, σ)
#μ_i = a + b*x_i
#a ~ Normal(120, 20)
#b ~ Normal(4, 2)
#σ ~ Uniform(0, 50)
#Changed the mean of the prior on a from 160 to 120.
4M6. Now suppose I tell you that the variance among heights for students of the same age is never more than 64cm. How does this lead you to revise your priors?
#h_i ~ Normal(μ_i, σ)
#μ_i = a + b*x_i
#a ~ Normal(120, 20)
#b ~ Normal(4, 2)
#σ ~ Uniform(0, 8)
#A variance of at most 64 cm² implies a standard deviation of at most 8 cm, so I narrowed the prior on σ from Uniform(0, 50) to Uniform(0, 8).
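# Quick check (my own addition): the revised prior keeps the implied variance of heights below the stated bound of 64 cm².
sigma_prior <- runif(1e4, 0, 8)
dens(sigma_prior^2, xlab = "prior variance of heights (cm^2)")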
4M7. Refit model m4.3 from the chapter, but omit the mean weight xbar this time. Compare the new model’s posterior to that of the original model. In particular, look at the covariance among the parameters. What is different? Then compare the posterior predictions of both models.
data(Howell1)
d <- Howell1
d2 <- d[d$age >= 18, ]
# original m4.3
xbar <- mean(d2$weight)
m4.3 <- quap(
alist(
height ~ dnorm( mu , sigma ) ,
mu <- a + b*( weight - xbar ) ,
a ~ dnorm( 178 , 20 ) ,
b ~ dnorm( 0 , 1 ) ,
sigma ~ dunif( 0 , 50 )
) ,
data=d2
)
precis(m4.3)
## mean sd 5.5% 94.5%
## a 154.6013679 0.27030749 154.1693643 155.0333714
## b 0.9034409 0.04189136 0.8364904 0.9703914
## sigma 5.0718777 0.19115444 4.7663760 5.3773794
cov2cor(vcov(m4.3))
## a b sigma
## a 1.000000e+00 -3.400447e-06 0.001191984
## b -3.400447e-06 1.000000e+00 -0.002852762
## sigma 1.191984e-03 -2.852762e-03 1.000000000
# omit the mean weight xbar
m4.3_omit <- quap(
alist(
height ~ dnorm( mu , sigma ) ,
mu <- a + b*weight ,
a ~ dnorm( 156 , 10 ) ,
b ~ dnorm( 0 , 10 ) ,
sigma ~ dunif( 0 , 50 )
) ,
data=d2
)
precis(m4.3_omit)
## mean sd 5.5% 94.5%
## a 115.3519209 1.87657777 112.3527872 118.3510546
## b 0.8729128 0.04130261 0.8069032 0.9389223
## sigma 5.0763084 0.19164200 4.7700274 5.3825893
cov2cor(vcov(m4.3_omit))
## a b sigma
## a 1.00000000 -0.98955830 0.05716683
## b -0.98955830 1.00000000 -0.05665232
## sigma 0.05716683 -0.05665232 1.00000000
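# Without centering, a and b become almost perfectly negatively correlated (about -0.99), whereas in the centered model they are essentially uncorrelated; a now refers to expected height at weight = 0 rather than at the mean weight. To compare posterior predictions, a sketch (my own addition) overlaying both mean lines; they should be nearly identical over the observed weights:
weight.seq <- seq(from = min(d2$weight), to = max(d2$weight), length.out = 50)
mu_orig <- apply(link(m4.3, data = list(weight = weight.seq)), 2, mean)
mu_omit <- apply(link(m4.3_omit, data = list(weight = weight.seq)), 2, mean)
plot(height ~ weight, data = d2, col = col.alpha(rangi2, 0.3))
lines(weight.seq, mu_orig)
lines(weight.seq, mu_omit, lty = 2)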
4M8. In the chapter, we used 15 knots with the cherry blossom spline. Increase the number of knots and observe what happens to the resulting spline. Then adjust also the width of the prior on the weights—change the standard deviation of the prior and watch what happens. What do you think the combination of knot number and the prior on the weights controls?
data("cherry_blossoms")
d<- cherry_blossoms
precis(d)
## mean sd 5.5% 94.5%
## year 1408.000000 350.8845964 867.77000 1948.23000
## doy 104.540508 6.4070362 94.43000 115.00000
## temp 6.141886 0.6636479 5.15000 7.29470
## temp_upper 7.185151 0.9929206 5.89765 8.90235
## temp_lower 5.098941 0.8503496 3.78765 6.37000
##            histogram
## year       ▇▇▇▇▇▇▇▇▇▇▇▇▁
## doy        ▁▂▅▇▇▃▁▁
## temp       ▁▃▅▇▃▂▁▁
## temp_upper ▁▂▅▇▇▅▂▂▁▁▁▁▁▁▁
## temp_lower ▁▁▁▁▁▁▁▃▅▇▃▂▁▁▁
d2 <- d[ complete.cases(d$temp) , ]
# 15 knots
num_knots <- 15
knot_list <- quantile( d2$year , probs=seq(0,1,length.out=num_knots) )
library(splines)
B <- bs(d2$year, knots=knot_list[-c(1,num_knots)] , degree=3 , intercept=TRUE )
m4.7 <- quap(
alist(
T ~ dnorm( mu , sigma ) ,
mu <- a + B %*% w ,
a ~ dnorm(6,10),
w ~ dnorm(0,1),
sigma ~ dexp(1) ),
data=list( T=d2$temp , B=B ) ,
start=list( w=rep( 0 , ncol(B) ) ) )
post <- extract.samples(m4.7)
w <- apply( post$w , 2 , mean )
plot( NULL , xlim=range(d2$year) , ylim=c(-2,2) , xlab="year" , ylab="basis * weight" )
for ( i in 1:ncol(B) )
lines( d2$year , w[i]*B[,i] )
mu <- link( m4.7 )
mu_PI <- apply(mu,2,PI,0.97)
plot( d2$year , d2$temp , col=col.alpha(rangi2,0.3) , pch=16 )
shade( mu_PI , d2$year , col=col.alpha("black",0.5) )
# 25 knots
num_knots <- 25
knot_list <- quantile( d2$year , probs=seq(0,1,length.out=num_knots) )
library(splines)
B <- bs(d2$year, knots=knot_list[-c(1,num_knots)] , degree=3 , intercept=TRUE )
m4.7 <- quap(
alist(
T ~ dnorm( mu , sigma ) ,
mu <- a + B %*% w ,
a ~ dnorm(6,10),
w ~ dnorm(0,1),
sigma ~ dexp(1) ),
data=list( T=d2$temp , B=B ) ,
start=list( w=rep( 0 , ncol(B) ) ) )
post <- extract.samples(m4.7)
w <- apply( post$w , 2 , mean )
plot( NULL , xlim=range(d2$year) , ylim=c(-2,2) , xlab="year" , ylab="basis * weight" )
for ( i in 1:ncol(B) )
lines( d2$year , w[i]*B[,i] )
mu <- link( m4.7 )
mu_PI <- apply(mu,2,PI,0.97)
plot( d2$year , d2$temp , col=col.alpha(rangi2,0.3) , pch=16 )
shade( mu_PI , d2$year , col=col.alpha("black",0.5) )
# 25 knots, prior SD of the weights w changed to 5
num_knots <- 25
knot_list <- quantile( d2$year , probs=seq(0,1,length.out=num_knots) )
library(splines)
B <- bs(d2$year, knots=knot_list[-c(1,num_knots)] , degree=3 , intercept=TRUE )
m4.7 <- quap(
alist(
T ~ dnorm( mu , sigma ) ,
mu <- a + B %*% w ,
a ~ dnorm(6,10),
w ~ dnorm(0,5),
sigma ~ dexp(1) ),
data=list( T=d2$temp , B=B ) ,
start=list( w=rep( 0 , ncol(B) ) ) )
post <- extract.samples(m4.7)
w <- apply( post$w , 2 , mean )
plot( NULL , xlim=range(d2$year) , ylim=c(-2,2) , xlab="year" , ylab="basis * weight" )
for ( i in 1:ncol(B) )
lines( d2$year , w[i]*B[,i] )
mu <- link( m4.7 )
mu_PI <- apply(mu,2,PI,0.97)
plot( d2$year , d2$temp , col=col.alpha(rangi2,0.3) , pch=16 )
shade( mu_PI , d2$year , col=col.alpha("black",0.5) )
# The number of knots controls how flexible the spline is: more knots let it follow finer local variation. The width of the prior on the weights controls how far each weight can move from zero, and so how much each basis function can bend the curve. Together they determine how wiggly the fitted spline can be versus how smooth it stays.
4H2. Select out all the rows in the Howell1 data with ages below 18 years of age. If you do it right, you should end up with a new data frame with 192 rows in it.
(a) Fit a linear regression to these data, using quap. Present and interpret the estimates. For every 10 units of increase in weight, how much taller does the model predict a child gets?
(b) Plot the raw data, with height on the vertical axis and weight on the horizontal axis. Superimpose the MAP regression line and 89% interval for the mean. Also superimpose the 89% interval for predicted heights.
(c) What aspects of the model fit concern you? Describe the kinds of assumptions you would change, if any, to improve the model. You don’t have to write any new code. Just explain what the model appears to be doing a bad job of, and what you hypothesize would be a better model.
d3 <- Howell1[Howell1$age < 18, ]
#a
m_linear <- quap(
alist(
height ~ dnorm( mu , sigma ) ,
mu <- a + b* weight ,
a ~ dnorm( 156 , 10) ,
b ~ dnorm( 0 , 10 ) ,
sigma ~ dunif( 0 , 50 )
) ,
data=d3
)
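# Part (a): show the estimates. b is the expected change in height per unit of weight, so the predicted gain for a 10-unit increase in weight is 10 * b (use the posterior mean of b from the precis output).
precis(m_linear)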
#b
post <- extract.samples(m_linear)
mu.link <- function(weight) post$a + post$b*weight
weight.seq <- seq( from = min(d3$weight), to = max(d3$weight), by = 1)
mu <- sapply( weight.seq , mu.link )
mu.mean <- apply( mu , 2 , mean )
mu.CI <- apply( mu , 2 , PI , prob=0.89 )
mu.HPDI <- apply(mu, 2, HPDI, prob = 0.89)
sim.height <- sim( m_linear , data=list(weight=weight.seq) )
str(sim.height)
## num [1:1000, 1:41] 68.4 73.8 69.2 75.5 71.6 ...
height.PI <- apply( sim.height , 2 , PI , prob=0.89 )
# plot raw data
plot( height ~ weight , d3 , col=col.alpha(rangi2,0.5) )
# draw MAP line
lines( weight.seq , mu.mean )
# draw HPDI region for line
shade( mu.HPDI , weight.seq )
# draw PI region for simulated heights
shade( height.PI , weight.seq )
#c
#The model does a poor job mostly because it overestimates height when weight is above about 35 kg or below about 10 kg, and underestimates height when weight is between roughly 10 and 35 kg.
#I would change the assumption that the relationship between height and weight is linear, since the raw data show a clear curve.
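# One possible improvement (a sketch only; the priors below are deliberately weak guesses, not tuned): model height as linear in log(weight), which can follow the curvature visible in the raw data.
m_log <- quap(
  alist(
    height ~ dnorm(mu, sigma),
    mu <- a + b*log(weight),
    a ~ dnorm(100, 100),
    b ~ dnorm(0, 100),
    sigma ~ dunif(0, 50)
  ),
  data = d3
)
precis(m_log)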