Assignment #3

Questions

4E1. In the model definition below, which line is the likelihood? \[\begin{align} \ y_i ∼ Normal(μ, σ) \\ \ μ ∼ Normal(0, 10) \\ \ σ ∼ Exponential(1) \\ \end{align}\]

print("The first line is the likelihood.")

## [1] "The first line is the likelihood."

4E2. In the model definition just above, how many parameters are in the posterior distribution?

print("Two parameters: μ and σ.")

## [1] "Two parameters: μ and σ."

4E3. Using the model definition above, write down the appropriate form of Bayes’ theorem that includes the proper likelihood and priors.

 # Pr(μ, σ | y) = ( ∏_i Normal(y_i | μ, σ) Normal(μ | 0, 10) Exponential(σ | 1) ) / ( ∫∫ ∏_i Normal(y_i | μ, σ) Normal(μ | 0, 10) Exponential(σ | 1) dμ dσ )

4E4. In the model definition below, which line is the linear model? \[\begin{align} \ y_i ∼ Normal(μ, σ) \\ \ μ_i = α + βx_i \\ \ α ∼ Normal(0, 10) \\ \ β ∼ Normal(0, 1) \\ \ σ ∼ Exponential(2) \\ \end{align}\]

print("The second line is the linear model.")

## [1] "The second line is the linear model."

4E5. In the model definition just above, how many parameters are in the posterior distribution?

print("There are 3 parameters in the posterior: α, β, and σ.")

## [1] "There are 3 parameters in the posterior: α, β, and σ."

4M1. For the model definition below, simulate observed y values from the prior (not the posterior). Make sure to plot the simulation. \[\begin{align} \ y_i ∼ Normal(μ, σ) \\ \ μ ∼ Normal(0, 10) \\ \ σ ∼ Exponential(1) \\ \end{align}\]

library(rethinking)  # provides dens()

mu_prior <- rnorm( 1e4 , 0 , 10 )
sigma_prior <- rexp( 1e4 , 1 )  # σ ∼ Exponential(1), matching the model above
h_sim <- rnorm( 1e4 , mu_prior , sigma_prior )
dens( h_sim )

4M2. Translate the model just above into a quap formula.

f <- alist(y ~ dnorm( mu , sigma ), 
           mu ~ dnorm( 0 , 10 ), 
           sigma ~ dexp( 1 ))

4M3. Translate the quap model formula below into a mathematical model definition:

y ~ dnorm( mu , sigma ),
mu <- a + b*x,
a ~ dnorm( 0 , 10 ),
b ~ dunif( 0 , 1 ),
sigma ~ dexp( 1 )

# y_i ∼ Normal(μ_i, σ)
# μ_i = α + βx_i
# α ∼ Normal(0, 10)
# β ∼ Uniform(0, 1)
# σ ∼ Exponential(1)

4M4. A sample of students is measured for height each year for 3 years. After the third year, you want to fit a linear regression predicting height using year as a predictor. Write down the mathematical model definition for this regression, using any variable names and priors you choose. Be prepared to defend your choice of priors. Simulate from the priors that you chose to see what the model expects before it sees the data. Do this by sampling from the priors. Then consider 50 students, each simulated with a different prior draw. For each student simulate 3 years. Plot the 50 linear relationships with height(cm) on the y-axis and year on the x-axis. What can we do to make these priors more likely?

# The simplest model could be:

# h_i ∼ Normal(μ_i, σ)
# μ_i = α + βy_i
# α ∼ Normal(0, 100)
# β ∼ Normal(0, 10)
# σ ∼ Uniform(0, 50)

# Here h is height (cm) and y is year. The Normal(0, 100) prior on the intercept α is nearly flat over plausible heights and even allows negative ones, so it carries little information. The Normal(0, 10) prior on β is only weakly informative and is centered on zero, which corresponds to no effect of year. Because students are unlikely to shrink, a prior restricted to positive slopes, such as a uniform prior above zero, is more plausible:

# h_i ∼ Normal(μ_i, σ)
# μ_i = α + βy_i
# α ∼ Normal(0, 100)
# β ∼ Uniform(0, 10)
# σ ∼ Uniform(0, 50)
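The question also asks for a simulation from the priors. A minimal base-R sketch, assuming the second set of priors above and assuming years are coded 1, 2, 3 (the coding and the names `n_students`, `years`, and `mu` are illustrative choices, not part of the original answer). It draws one (α, β) pair per student and plots each student's expected height line:

```r
# Prior predictive sketch for 4M4 (assumption: years are coded 1, 2, 3).
set.seed(4)
n_students <- 50
years <- 1:3

# One prior draw per student, using the second set of priors above.
a <- rnorm(n_students, 0, 100)   # alpha ~ Normal(0, 100)
b <- runif(n_students, 0, 10)    # beta  ~ Uniform(0, 10)

# Expected height for each student (rows) in each year (columns).
mu <- sapply(years, function(y) a + b * y)

plot(NULL, xlim = range(years), ylim = range(mu),
     xlab = "year", ylab = "height (cm)")
for (i in 1:n_students) lines(years, mu[i, ])
abline(h = 0, lty = 2)  # heights below zero are impossible
```

Because α is centered on zero, roughly half of the simulated lines should sit below 0 cm. Centering α on a plausible student height and narrowing its spread would make these priors more realistic.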

4M5. Now suppose I remind you that every student got taller each year. Does this information lead you to change your choice of priors? How? Again, simulate from the priors and plot.

# Yes. Because every student got taller, the slope β should be positive. I recenter the β prior on 5 cm/year and shrink its SD to 1 cm/year, which puts essentially all of its mass above zero. I also center the intercept α on 140 cm with an SD of 10 cm, and lower the upper bound of the σ prior to 20 cm, because a very large spread in student heights is implausible.

# h_i ∼ Normal(μ_i, σ)
# μ_i = α + βy_i
# α ∼ Normal(140, 10)
# β ∼ Normal(5, 1)
# σ ∼ Uniform(0, 20)
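The question again asks for a simulation and plot. A minimal base-R sketch under the revised priors above, again assuming years are coded 1, 2, 3 (variable names are illustrative):

```r
# Prior predictive sketch for 4M5 (assumption: years are coded 1, 2, 3).
set.seed(5)
n_students <- 50
years <- 1:3

# One prior draw per student from the revised priors.
a <- rnorm(n_students, 140, 10)   # alpha ~ Normal(140, 10)
b <- rnorm(n_students, 5, 1)      # beta  ~ Normal(5, 1)

# Expected height for each student (rows) in each year (columns).
mu <- sapply(years, function(y) a + b * y)

plot(NULL, xlim = range(years), ylim = range(mu),
     xlab = "year", ylab = "height (cm)")
for (i in 1:n_students) lines(years, mu[i, ])

# Share of simulated students with a positive growth rate.
mean(b > 0)
```

A Normal(5, 1) prior still assigns a very small probability to negative slopes. If growth must be strictly positive, a log-normal prior on β would be one way to enforce that.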

4M6. Now suppose I tell you that the variance among heights for students of the same age is never more than 64cm. How does this lead you to revise your priors?

# h_i ∼ Normal(μ_i, σ)
# μ_i = α + βy_i
# α ∼ Normal(140, 10)
# β ∼ Normal(5, 1)
# σ ∼ Uniform(0, 8)
# The variance is at most 64 cm^2, so σ is at most the square root of 64, i.e. 8 cm.

4M7. Refit model m4.3 from the chapter, but omit the mean weight xbar this time. Compare the new model’s posterior to that of the original model. In particular, look at the covariance among the parameters. Show the pairs() plot. What is different? Then compare the posterior predictions of both models.

data(Howell1)
d <- Howell1
d4 <- d[ d$age >= 18 , ]

#the average weight, x-bar
xbar <- mean(d4$weight)

#fit model
m4.31 <- quap(
    alist(
        height ~ dnorm( mu , sigma ) ,
        mu <- a + b*( weight - xbar ) ,
        a ~ dnorm( 178 , 20 ) ,
        b ~ dlnorm( 0 , 1 ) ,
        sigma ~ dunif( 0 , 50 )
    ) ,
    data=d4 )

m4.31

## 
## Quadratic approximate posterior distribution
## 
## Formula:
## height ~ dnorm(mu, sigma)
## mu <- a + b * (weight - xbar)
## a ~ dnorm(178, 20)
## b ~ dlnorm(0, 1)
## sigma ~ dunif(0, 50)
## 
## Posterior means:
##           a           b       sigma 
## 154.6017639   0.9032725   5.0709716 
## 
## Log-likelihood: -1071.01

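# Refit without centering weight on xbar. Compared with m4.31 above, the posterior mean of a changes (154.6 vs. 114.5, see the output below), while b and sigma stay nearly the same.
# Note: this refit also uses a Normal(0, 1) prior on b rather than the Log-Normal(0, 1) prior used in m4.31.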
m4.32 <- quap(
    alist(
        height ~ dnorm( mu , sigma ) ,
        mu <- a + b*weight ,
        a ~ dnorm(178, 20) ,
        b ~ dnorm( 0 , 1 ) ,
        sigma ~ dunif( 0 , 50 )
), data=d4 )
m4.32

## 
## Quadratic approximate posterior distribution
## 
## Formula:
## height ~ dnorm(mu, sigma)
## mu <- a + b * weight
## a ~ dnorm(178, 20)
## b ~ dnorm(0, 1)
## sigma ~ dunif(0, 50)
## 
## Posterior means:
##           a           b       sigma 
## 114.5264037   0.8909059   5.0726935 
## 
## Log-likelihood: -1071.07
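The question also asks about the covariance among the parameters. With the fitted models, the chapter's approach is to inspect `round(vcov(m4.32), 3)` and `pairs(m4.32)` and compare them with the same calls on `m4.31`. The self-contained sketch below illustrates what to expect, using simulated data (hypothetical values, not Howell1) and `lm()` as a stand-in for `quap()`:

```r
# Sketch with simulated data (hypothetical values, not Howell1), with lm()
# standing in for quap(): how centering the predictor changes the
# correlation between the intercept and slope estimates.
set.seed(7)
n <- 300
weight <- rnorm(n, 45, 6.5)
height <- 114 + 0.9 * weight + rnorm(n, 0, 5)
xbar <- mean(weight)

fit_centered <- lm(height ~ I(weight - xbar))
fit_raw <- lm(height ~ weight)

# Correlation between the intercept and slope estimates in each fit.
cor_centered <- cov2cor(vcov(fit_centered))[1, 2]
cor_raw <- cov2cor(vcov(fit_raw))[1, 2]
round(c(centered = cor_centered, raw = cor_raw), 3)
```

With centering, the intercept and slope are essentially uncorrelated. Without it, they are strongly negatively correlated, because a steeper slope has to be offset by a lower intercept at weight zero. This is consistent with a moving from about 154.6 (mean height at the mean weight) to about 114.5 (predicted height at weight zero) while b barely changes. Both parameterizations describe the same line, so the posterior predictions should be nearly identical, apart from any effect of the different prior on b.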

4M8. In the chapter, we used 15 knots with the cherry blossom spline. Increase the number of knots and observe what happens to the resulting spline. Then adjust also the width of the prior on the weights—change the standard deviation of the prior and watch what happens. What do you think the combination of knot number and the prior on the weights controls?

data(cherry_blossoms)
d<-cherry_blossoms
precis(d)

##                   mean          sd      5.5%      94.5%       histogram
## year       1408.000000 350.8845964 867.77000 1948.23000   ▇▇▇▇▇▇▇▇▇▇▇▇▁
## doy         104.540508   6.4070362  94.43000  115.00000        ▁▂▅▇▇▃▁▁
## temp          6.141886   0.6636479   5.15000    7.29470        ▁▃▅▇▃▂▁▁
## temp_upper    7.185151   0.9929206   5.89765    8.90235 ▁▂▅▇▇▅▂▂▁▁▁▁▁▁▁
## temp_lower    5.098941   0.8503496   3.78765    6.37000 ▁▁▁▁▁▁▁▃▅▇▃▂▁▁▁

d4<-d[complete.cases(d$doy),]
numknots<-15
knot_list<-quantile(d4$year, probs=seq(0,1,length.out=numknots))
knot_list

##        0% 7.142857% 14.28571% 21.42857% 28.57143% 35.71429% 42.85714%       50% 
##       812      1036      1174      1269      1377      1454      1518      1583 
## 57.14286% 64.28571% 71.42857% 78.57143% 85.71429% 92.85714%      100% 
##      1650      1714      1774      1833      1893      1956      2015

library(splines)  # provides bs()
B<- bs(d4$year,knots=knot_list[-c(1,numknots)],degree=3,intercept = TRUE)

plot(NULL, xlim=range(d4$year),ylim=c(0,1),  ylab="basis")
for (i in 1:ncol(B)) lines(d4$year,B[,i])

numknots2<-30
knot_list2<-quantile(d4$year, probs=seq(0,1,length.out=numknots2))

B2<- bs(d4$year,knots=knot_list2[-c(1,numknots2)],degree=3,intercept = TRUE)

plot(NULL, xlim=range(d4$year),ylim=c(0,1),  ylab="basis")
for (i in 1:ncol(B2)) lines(d4$year,B2[,i])
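The plots above show only the basis functions; the question also asks what happens to the fitted spline as the knot number and the prior width change. The sketch below is one way to explore that without the full `quap` fit. It uses simulated data (hypothetical values, not `cherry_blossoms`) and relies on the fact that, with a Gaussian likelihood and Normal(0, `w_sd`) priors on the weights, the posterior mode of the weights is a ridge estimate with penalty `sigma^2 / w_sd^2`. Fixing `sigma` and setting the intercept to `mean(y)` are simplifying assumptions, and `fit_spline` is a hypothetical helper:

```r
library(splines)

# Simulated stand-in for the cherry blossom data (hypothetical values).
set.seed(8)
year <- sort(runif(400, 800, 2000))
doy <- 105 + 3 * sin(year / 150) + rnorm(400, 0, 6)

# Penalized least-squares fit of a cubic B-spline.
# Assumptions: sigma is fixed and the intercept is set to mean(y).
fit_spline <- function(x, y, num_knots, w_sd, sigma = 6) {
  knots <- quantile(x, probs = seq(0, 1, length.out = num_knots))
  B <- unclass(bs(x, knots = knots[-c(1, num_knots)], degree = 3, intercept = TRUE))
  lambda <- sigma^2 / w_sd^2  # ridge penalty implied by the Normal(0, w_sd) prior
  w <- solve(crossprod(B) + lambda * diag(ncol(B)), crossprod(B, y - mean(y)))
  mu <- mean(y) + as.vector(B %*% w)
  list(n_basis = ncol(B), mu = mu, rss = sum((y - mu)^2))
}

fits <- list(
  k15_sd10 = fit_spline(year, doy, num_knots = 15, w_sd = 10),
  k30_sd10 = fit_spline(year, doy, num_knots = 30, w_sd = 10),
  k30_sd1  = fit_spline(year, doy, num_knots = 30, w_sd = 1)
)

plot(year, doy, col = "grey", xlab = "year", ylab = "day of year")
for (i in seq_along(fits)) lines(year, fits[[i]]$mu, col = i + 1, lwd = 2)
legend("topright", legend = names(fits), col = seq_along(fits) + 1, lwd = 2)
```

More knots give the spline more places to bend, while a narrower prior on the weights pulls them toward zero and flattens the curve. Together, the knot number and the prior width appear to control how flexible ("wiggly") the spline is allowed to be.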

4H2. Select out all the rows in the Howell1 data with ages below 18 years of age. If you do it right, you should end up with a new data frame with 192 rows in it.

Fit a linear regression to these data, using quap. Present and interpret the estimates. For every 10 units of increase in weight, how much taller does the model predict a child gets?
Plot the raw data, with height on the vertical axis and weight on the horizontal axis. Superimpose the MAP regression line and 89% interval for the mean. Also superimpose the 89% interval for predicted heights.
What aspects of the model fit concern you? Describe the kinds of assumptions you would change, if any, to improve the model. You don’t have to write any new code. Just explain what the model appears to be doing a bad job of, and what you hypothesize would be a better model.

data(Howell1)
d <- Howell1
d3 <- d[ d$age < 18 , ]
str(d3)

## 'data.frame':    192 obs. of  4 variables:
##  $ height: num  121.9 105.4 86.4 129.5 109.2 ...
##  $ weight: num  19.6 13.9 10.5 23.6 16 ...
##  $ age   : num  12 8 6.5 13 7 17 16 11 17 8 ...
##  $ male  : int  1 0 0 1 0 1 0 1 0 1 ...

#(a)
m <- quap(
    alist(
        height ~ dnorm( mu , sigma ),
        mu <- a + b*weight ,
        a ~ dnorm( 140 , 10 ),
        b ~ dnorm( 0 , 10 ) , 
        sigma ~ dunif( 0 , 50 ) ),
    data=d3)
precis(m)

##            mean         sd      5.5%     94.5%
## a     59.808667 1.39731305 57.575491 62.041844
## b      2.650642 0.06833942  2.541422  2.759862
## sigma  8.465082 0.43480520  7.770179  9.159984
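To interpret these estimates: a is the predicted height at a weight of zero, and b is the expected change in height (cm) per unit of weight (kg in Howell1). The question about a 10-unit increase in weight is simple arithmetic on b; a short sketch using the values copied from the `precis(m)` output above (the variable names are illustrative):

```r
# Posterior summary for b, copied from the precis(m) output above.
b_mean <- 2.650642
b_lower <- 2.541422   # 5.5% bound
b_upper <- 2.759862   # 94.5% bound

# Predicted change in mean height (cm) for a 10-unit increase in weight.
delta_weight <- 10
pred <- round(delta_weight * c(mean = b_mean, lower = b_lower, upper = b_upper), 1)
pred
```

So the model predicts that a child who is 10 units heavier is about 26.5 cm taller on average, with an 89% interval of roughly 25.4 to 27.6 cm.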

#(b)
plot(height ~ weight, data = d3, col = col.alpha(rangi2, 0.3))

weight.seq <- seq(from = min(d3$weight), to = max(d3$weight), by = 1)

mu <- link(m, data = data.frame(weight = weight.seq))

mu.mean <- apply(mu, 2, mean)
mu.HPDI <- apply(mu, 2, HPDI, prob = 0.89)
lines(weight.seq, mu.mean)
shade(mu.HPDI, weight.seq)

sim.height <- sim(m, data = list(weight = weight.seq))

height.HPDI <- apply(sim.height, 2, HPDI, prob = 0.89)
shade(height.HPDI, weight.seq)

#(c)
print("The main problem is that the relationship between weight and height is curved rather than linear. At low weights (below about 10 kg) the predicted mean lies above most of the observed heights, at middle weights (about 10 to 30 kg) it lies below most of them, and at high weights (above about 30 kg) it lies above them again. A model that allows curvature, such as a parabolic (second-order polynomial) model, would likely fit these data much better.")

## [1] "The main problem is that the relationship between weight and height is curved rather than linear. At low weights (below about 10 kg) the predicted mean lies above most of the observed heights, at middle weights (about 10 to 30 kg) it lies below most of them, and at high weights (above about 30 kg) it lies above them again. A model that allows curvature, such as a parabolic (second-order polynomial) model, would likely fit these data much better."

Shivam Awasthi

2021-04-12

Chapter 4 - Geocentric Models