If we have \(k\) different predictor variables \(x_1,x_2,...,x_k\), what is the primary advantage of fitting a joint linear model (multiple regression \(E(y)=\beta_0+\beta_1 x_1+...+\beta_k x_k\)) over fitting \(k\) simple linear regressions (\(E(y)=\beta_0+\beta_j x_j\)), one for each predictor?
Answer: Each coefficient in the multiple regression model accounts for the presence of the other predictors, whereas the coefficients in the simple linear regressions do not. In particular, \(\beta_1\) in the joint model measures the change in \(E(y)\) per unit change in \(x_1\) while all of the other predictors \(x_j\), \(j\ne1\), are held constant; a simple regression slope cannot make that adjustment.
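As a small illustration of this point (a simulated sketch with hypothetical data, separate from the real example below), note how the simple-regression slope for \(x_1\) absorbs part of the effect of the omitted, correlated predictor \(x_2\), while the joint model recovers both coefficients:
set.seed(1)
n = 200
x1 = rnorm(n)
x2 = 0.8*x1 + rnorm(n, sd=0.6) # x2 is correlated with x1
y = 1.0 + 2.0*x1 - 3.0*x2 + rnorm(n)
coef(lm(y ~ x1)) # slope estimate far from 2 (biased by omitting x2)
coef(lm(y ~ x1 + x2)) # both slopes close to their true values, 2 and -3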
As an example of linear regression, we’ll look at the Leinhardt data
from the car package in R. The Leinhardt data frame has 105
rows and 4 columns. The observations are nations of the world around
1970.
library("car")
## Loading required package: carData
data("Leinhardt")
head(Leinhardt)
## income infant region oil
## Australia 3426 26.7 Asia no
## Austria 3350 23.7 Europe no
## Belgium 3346 17.0 Europe no
## Canada 4751 16.8 Americas no
## Denmark 5029 13.5 Europe no
## Finland 3312 10.1 Europe no
str(Leinhardt)
## 'data.frame': 105 obs. of 4 variables:
## $ income: int 3426 3350 3346 4751 5029 3312 3403 5040 2009 2298 ...
## $ infant: num 26.7 23.7 17 16.8 13.5 10.1 12.9 20.4 17.8 25.7 ...
## $ region: Factor w/ 4 levels "Africa","Americas",..: 3 4 4 2 4 4 4 4 4 4 ...
## $ oil : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
pairs(Leinhardt)
We’ll start with a simple linear regression model that relates infant mortality to per capita income.
plot(infant ~ income, data=Leinhardt)
hist(Leinhardt$infant)
hist(Leinhardt$income)
Leinhardt$loginfant = log(Leinhardt$infant)
Leinhardt$logincome = log(Leinhardt$income)
plot(loginfant ~ logincome, data=Leinhardt)
Since infant mortality and per capita income are positive and right-skewed quantities, we consider modeling them on the logarithmic scale. A linear model appears much more appropriate on this scale.
The reference Bayesian analysis (with a noninformative prior) is
available directly in R.
lmod = lm(loginfant ~ logincome, data=Leinhardt)
summary(lmod)
##
## Call:
## lm(formula = loginfant ~ logincome, data = Leinhardt)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.66694 -0.42779 -0.02649 0.30441 3.08415
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.14582 0.31654 22.575 <2e-16 ***
## logincome -0.51179 0.05122 -9.992 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6867 on 99 degrees of freedom
## (4 observations deleted due to missingness)
## Multiple R-squared: 0.5021, Adjusted R-squared: 0.4971
## F-statistic: 99.84 on 1 and 99 DF, p-value: < 2.2e-16
Now we’ll fit this model in JAGS. A few countries have missing values, and for simplicity, we will omit those.
library("rjags")
## Loading required package: coda
## Linked to JAGS 4.3.2
## Loaded modules: basemod,bugs
dat = na.omit(Leinhardt)
mod1_string = " model {
for (i in 1:n) {
y[i] ~ dnorm(mu[i], prec)
mu[i] = b[1] + b[2]*log_income[i]
}
for (i in 1:2) {
b[i] ~ dnorm(0.0, 1.0/1.0e6)
}
prec ~ dgamma(5/2.0, 5*10.0/2.0)
sig2 = 1.0 / prec
sig = sqrt(sig2)
} "
set.seed(72)
data1_jags = list(y=dat$loginfant, n=nrow(dat),
log_income=dat$logincome)
params1 = c("b", "sig")
inits1 = function() {
inits = list("b"=rnorm(2,0.0,100.0), "prec"=rgamma(1,1.0,1.0))
}
mod1 = jags.model(textConnection(mod1_string), data=data1_jags, inits=inits1, n.chains=3)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 101
## Unobserved stochastic nodes: 3
## Total graph size: 404
##
## Initializing model
update(mod1, 1000) # burn-in
mod1_sim = coda.samples(model=mod1,
variable.names=params1,
n.iter=5000)
mod1_csim = do.call(rbind, mod1_sim) # combine multiple chains
Before we check the inferences from the model, we should perform convergence diagnostics for our Markov chains.
plot(mod1_sim)
gelman.diag(mod1_sim)
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## b[1] 1.01 1.01
## b[2] 1.01 1.01
## sig 1.00 1.00
##
## Multivariate psrf
##
## 1
autocorr.diag(mod1_sim)
## b[1] b[2] sig
## Lag 0 1.00000000 1.00000000 1.0000000000
## Lag 1 0.95378238 0.95392639 0.0345558066
## Lag 5 0.78935313 0.78809598 -0.0006694157
## Lag 10 0.62483819 0.62201800 0.0138352803
## Lag 50 0.06282677 0.06550381 0.0061967222
autocorr.plot(mod1_sim)
effectiveSize(mod1_sim)
## b[1] b[2] sig
## 354.8369 353.6982 12945.2556
We can get a posterior summary of the parameters in our model.
summary(mod1_sim)
##
## Iterations = 1001:6000
## Thinning interval = 1
## Number of chains = 3
## Sample size per chain = 5000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## b[1] 7.1764 0.45204 0.0036909 0.0242032
## b[2] -0.5167 0.07320 0.0005977 0.0039241
## sig 0.9710 0.06784 0.0005539 0.0005994
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## b[1] 6.2628 6.8801 7.1695 7.4712 8.0734
## b[2] -0.6635 -0.5644 -0.5162 -0.4688 -0.3694
## sig 0.8500 0.9231 0.9670 1.0145 1.1144
Don’t forget that these results are for a regression model relating the logarithm of infant mortality to the logarithm of income.
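Since both variables are on the log scale, the slope has an elasticity interpretation: a 1% increase in income is associated with roughly a 0.5% decrease in infant mortality. To get a prediction back on the original scale, we can exponentiate the fitted value on the log scale (a quick sketch using approximate posterior means from the summary above; the income value of 1000 is an arbitrary illustration, and exponentiating the fitted log value gives a predicted median rather than a mean):
b_post = c(7.18, -0.52) # approximate posterior means of b[1] and b[2], from the summary above
exp(b_post[1] + b_post[2]*log(1000)) # predicted infant mortality at income 1000; roughly 36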
Checking residuals (the differences between the response and the model’s predictions) is important with linear models, since residuals can reveal violations of the assumptions we made when specifying the model. In particular, we are looking for any sign that the relationship is not linear, that the errors are not normally distributed, or that the observations are not independent (conditional on the covariates).
First, let’s look at what would have happened if we fit the reference linear model to the un-transformed variables.
lmod0 = lm(infant ~ income, data=Leinhardt)
plot(resid(lmod0)) # to check independence (looks okay)
plot(predict(lmod0), resid(lmod0)) # to check for linearity, constant variance (looks bad)
qqnorm(resid(lmod0)) # to check Normality assumption (we want this to be a straight line)
Now let’s return to our model fit to the log-transformed variables. In a Bayesian model, we have distributions for residuals, but we’ll simplify and look only at the residuals evaluated at the posterior mean of the parameters.
X = cbind(rep(1.0, data1_jags$n), data1_jags$log_income)
head(X)
## [,1] [,2]
## [1,] 1 8.139149
## [2,] 1 8.116716
## [3,] 1 8.115521
## [4,] 1 8.466110
## [5,] 1 8.522976
## [6,] 1 8.105308
(pm_params1 = colMeans(mod1_csim)) # posterior means
## b[1] b[2] sig
## 7.1763560 -0.5166638 0.9710010
yhat1 = drop(X %*% pm_params1[1:2])
resid1 = data1_jags$y - yhat1
plot(resid1) # residuals against data index
plot(yhat1, resid1) # residuals against predicted values
qqnorm(resid1) # checking normality of residuals
plot(predict(lmod), resid(lmod)) # to compare with reference linear model
rownames(dat)[order(resid1, decreasing=TRUE)[1:5]] # which countries have the largest positive residuals?
## [1] "Saudi.Arabia" "Libya" "Zambia" "Brazil" "Afganistan"
The residuals look pretty good here (no obvious patterns or shapes) except for two strong outliers, Saudi Arabia and Libya. When outliers appear, it is a good idea to double-check that they are not simply errors in data entry. If the values are correct, you may reconsider whether these data points really are representative of the data you are trying to model. If you conclude that they are not (for example, they were recorded in different years), you may be able to justify dropping them from the data set.
If you conclude that the outliers are part of data and should not be removed, we have several modeling options to accommodate them. We will address these in the next segment.
In the previous segment, we saw two outliers in the model relating the logarithm of infant mortality to the logarithm of income. Here we will discuss options for when we conclude that these outliers belong in the data set.
The first approach is to look for additional covariates that may be able to explain the outliers. For example, there could be a number of variables that provide information about infant mortality above and beyond what income provides.
Looking back at our data, there are two variables we haven’t used yet: region and oil. The oil variable indicates oil-exporting countries. Both Saudi Arabia and Libya are oil exporters, so perhaps this explains part of the anomaly.
library("rjags")
mod2_string = " model {
for (i in 1:length(y)) {
y[i] ~ dnorm(mu[i], prec)
mu[i] = b[1] + b[2]*log_income[i] + b[3]*is_oil[i]
}
for (i in 1:3) {
b[i] ~ dnorm(0.0, 1.0/1.0e6)
}
prec ~ dgamma(5/2.0, 5*10.0/2.0)
sig = sqrt( 1.0 / prec )
} "
set.seed(73)
data2_jags = list(y=dat$loginfant, log_income=dat$logincome,
is_oil=as.numeric(dat$oil=="yes"))
data2_jags$is_oil
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0
## [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [75] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
params2 = c("b", "sig")
inits2 = function() {
inits = list("b"=rnorm(3,0.0,100.0), "prec"=rgamma(1,1.0,1.0))
}
mod2 = jags.model(textConnection(mod2_string), data=data2_jags, inits=inits2, n.chains=3)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 101
## Unobserved stochastic nodes: 4
## Total graph size: 507
##
## Initializing model
update(mod2, 1e3) # burn-in
mod2_sim = coda.samples(model=mod2,
variable.names=params2,
n.iter=5e3)
mod2_csim = as.mcmc(do.call(rbind, mod2_sim)) # combine multiple chains
As usual, check the convergence diagnostics.
plot(mod2_sim)
gelman.diag(mod2_sim)
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## b[1] 1.01 1.03
## b[2] 1.01 1.03
## b[3] 1.00 1.00
## sig 1.00 1.00
##
## Multivariate psrf
##
## 1.01
autocorr.diag(mod2_sim)
## b[1] b[2] b[3] sig
## Lag 0 1.00000000 1.00000000 1.000000000 1.000000000
## Lag 1 0.95239764 0.95328566 0.074087346 0.022168459
## Lag 5 0.78512036 0.78650774 0.014945880 0.021408494
## Lag 10 0.61762067 0.61954906 0.001532449 0.009553919
## Lag 50 0.07855918 0.07817641 -0.004843572 0.007344331
autocorr.plot(mod2_sim)
effectiveSize(mod2_sim)
## b[1] b[2] b[3] sig
## 360.4470 358.6838 12392.7380 13225.3798
We can get a posterior summary of the parameters in our model.
summary(mod2_sim)
##
## Iterations = 1001:6000
## Thinning interval = 1
## Number of chains = 3
## Sample size per chain = 5000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## b[1] 7.1633 0.44332 0.0036197 0.0232489
## b[2] -0.5250 0.07174 0.0005858 0.0037700
## b[3] 0.7939 0.35418 0.0028919 0.0031834
## sig 0.9524 0.06691 0.0005463 0.0005841
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## b[1] 6.30387 6.8563 7.1652 7.4604 8.0323
## b[2] -0.66618 -0.5730 -0.5256 -0.4757 -0.3865
## b[3] 0.09896 0.5563 0.7939 1.0269 1.4943
## sig 0.83228 0.9057 0.9484 0.9958 1.0938
It looks like there is a positive relationship between oil production and infant mortality on the log scale. Because these data are merely observational, we cannot say that oil production causes an increase in infant mortality (indeed, that most certainly isn’t the case), but we can say that the two are positively correlated.
Now let’s check the residuals.
X2 = cbind(rep(1.0, length(data2_jags$y)), data2_jags$log_income, data2_jags$is_oil)
head(X2)
## [,1] [,2] [,3]
## [1,] 1 8.139149 0
## [2,] 1 8.116716 0
## [3,] 1 8.115521 0
## [4,] 1 8.466110 0
## [5,] 1 8.522976 0
## [6,] 1 8.105308 0
(pm_params2 = colMeans(mod2_csim)) # posterior mean
## b[1] b[2] b[3] sig
## 7.1633185 -0.5250452 0.7938880 0.9524315
yhat2 = drop(X2 %*% pm_params2[1:3])
resid2 = data2_jags$y - yhat2
plot(resid2) # residuals against data index
plot(yhat2, resid2) # residuals against predicted values
plot(yhat1, resid1) # revisit residuals from the first model
sd(resid2) # standard deviation of residuals
## [1] 0.6488911
These look much better, although the residuals for Saudi Arabia and Libya are still more than three standard deviations from the mean of the residuals. We might consider adding the remaining covariate, region, but instead let’s look at another option for when we are faced with strong outliers.
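To verify that claim numerically, here is a quick sketch using the objects we just computed:
sort(resid2 / sd(resid2), decreasing=TRUE)[1:2] # the two largest standardized residuals exceed 3
rownames(dat)[order(resid2, decreasing=TRUE)[1:2]] # confirm these are Saudi Arabia and Libya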
Let’s consider changing the likelihood. The normal likelihood has thin tails (almost all of the probability is concentrated within the first few standard deviations of the mean), which does not accommodate outliers well. Consequently, models with a normal likelihood can be overly influenced by outliers. Recall that the \(t\) distribution is similar to the normal distribution but has thicker tails, which can accommodate outliers.
The \(t\) linear model might look something like this. Notice that the \(t\) distribution has three parameters, including a positive “degrees of freedom” parameter. The smaller the degrees of freedom, the heavier the tails of the distribution. We might fix the degrees of freedom to some number, or we can assign it a prior distribution.
mod3_string = " model {
for (i in 1:length(y)) {
y[i] ~ dt( mu[i], tau, df )
mu[i] = b[1] + b[2]*log_income[i] + b[3]*is_oil[i]
}
for (i in 1:3) {
b[i] ~ dnorm(0.0, 1.0/1.0e6)
}
df = nu + 2.0 # we want degrees of freedom > 2 to guarantee existence of mean and variance
nu ~ dexp(1.0)
tau ~ dgamma(5/2.0, 5*10.0/2.0) # tau is close to, but not equal to the precision
sig = sqrt( 1.0 / tau * df / (df - 2.0) ) # standard deviation of errors
} "
We will leave it up to you to fit this model.
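If you want to try it, here is a minimal sketch of how the fit might go, following the same pattern as the previous models (the initial values for nu and tau and the monitored parameters are our choices, not prescribed above):
set.seed(74)
data3_jags = data2_jags # same data as the second model

params3 = c("b", "df", "sig")

inits3 = function() {
    inits = list("b"=rnorm(3,0.0,100.0), "nu"=rexp(1,1.0), "tau"=rgamma(1,1.0,1.0))
}

mod3 = jags.model(textConnection(mod3_string), data=data3_jags, inits=inits3, n.chains=3)
update(mod3, 1e3) # burn-in
mod3_sim = coda.samples(model=mod3, variable.names=params3, n.iter=5e3)

# convergence checks and model comparison, as before
gelman.diag(mod3_sim)
dic.samples(mod3, n.iter=1e3)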
We have now proposed three different models. How do we compare their performance on our data? In the previous course, we discussed estimating parameters using the method of maximum likelihood; we can choose between competing models using a similar, likelihood-based idea.
We will use a quantity known as the deviance information criterion (DIC). It essentially calculates the posterior mean of the deviance (\(-2\) times the log-likelihood) and adds a penalty for model complexity.
Let’s calculate the DIC for our first two models:
dic.samples(mod1, n.iter=1e3)
## Mean deviance: 231.5
## penalty 3.05
## Penalized deviance: 234.5
dic.samples(mod2, n.iter=1e3)
## Mean deviance: 225.1
## penalty 3.957
## Penalized deviance: 229.1
The first number is the Monte Carlo estimated posterior mean deviance, which equals \(-2\) times the log-likelihood (plus a constant that is irrelevant for comparing models). Because of the \(-2\) factor, a smaller deviance means a higher likelihood.
Next, we are given a penalty for the complexity of our model. This penalty is necessary because we could always increase the likelihood by making the model complex enough to fit the data exactly, and such over-fit models generalize poorly. The penalty is roughly equal to the effective number of parameters in the model, which you can see in the output above: the first model has a variance parameter and two coefficients, for a total of three parameters (penalty about 3.05), while the second model adds one more coefficient for the oil effect (penalty about 3.96).
We add these two quantities to get the DIC (the last number). The
better-fitting model has a lower DIC value. In this case, the gains we
receive in deviance by adding the is_oil covariate outweigh
the penalty for adding an extra parameter. The final DIC for the second
model is lower than for the first, so we would prefer using the second
model.
We encourage you to explore different model specifications and compare their fit to the data using DIC. Wikipedia provides a good introduction to DIC, and you can find more details about the JAGS implementation in the rjags package documentation by entering ?dic.samples in the R console.
library("car") # load the 'car' package
data("Anscombe") # load the data set
?Anscombe # read a description of the data
head(Anscombe) # look at the first few lines of the data
## education income young urban
## ME 189 2824 350.7 508
## NH 169 3259 345.9 564
## VT 230 3072 348.5 322
## MA 168 3835 335.3 846
## RI 180 3549 327.1 871
## CT 193 4256 341.0 774
pairs(Anscombe) # scatter plots for each pair of variables
education_model = lm(education ~ income + young + urban, data=Anscombe)
summary(education_model)
##
## Call:
## lm(formula = education ~ income + young + urban, data = Anscombe)
##
## Residuals:
## Min 1Q Median 3Q Max
## -60.240 -15.738 -1.156 15.883 51.380
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.868e+02 6.492e+01 -4.418 5.82e-05 ***
## income 8.065e-02 9.299e-03 8.674 2.56e-11 ***
## young 8.173e-01 1.598e-01 5.115 5.69e-06 ***
## urban -1.058e-01 3.428e-02 -3.086 0.00339 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 26.69 on 47 degrees of freedom
## Multiple R-squared: 0.6896, Adjusted R-squared: 0.6698
## F-statistic: 34.81 on 3 and 47 DF, p-value: 5.337e-12
library("rjags")
mod_string = " model {
for (i in 1:length(education)) {
education[i] ~ dnorm(mu[i], prec)
mu[i] = b0 + b[1]*income[i] + b[2]*young[i] + b[3]*urban[i]
}
b0 ~ dnorm(0.0, 1.0/1.0e6)
for (i in 1:3) {
b[i] ~ dnorm(0.0, 1.0/1.0e6)
}
prec ~ dgamma(1.0/2.0, 1.0*1500.0/2.0)
## Initial guess of variance based on overall
## variance of education variable. Uses low prior
## effective sample size. Technically, this is not
## a true 'prior', but it is not very informative.
sig2 = 1.0 / prec
sig = sqrt(sig2)
} "
data_jags = as.list(Anscombe)
params1 = c("b0","b", "sig")
inits1 = function() {
inits = list("b"=rnorm(3,0.0,100.0), "prec"=rgamma(1,1.0,1.0))
}
mod1 = jags.model(textConnection(mod_string), data=data_jags, inits=inits1, n.chains=3)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 51
## Unobserved stochastic nodes: 5
## Total graph size: 422
##
## Initializing model
update(mod1, 1000) # burn-in
mod1_sim = coda.samples(model=mod1,
variable.names=params1,
n.iter=5000)
mod1_csim = do.call(rbind, mod1_sim) # combine multiple chains
plot(mod1_sim)
gelman.diag(mod1_sim)
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## b[1] 1.03 1.12
## b[2] 1.06 1.15
## b[3] 1.05 1.17
## b0 1.05 1.13
## sig 1.00 1.01
##
## Multivariate psrf
##
## 1.07
autocorr.diag(mod1_sim)
## b[1] b[2] b[3] b0 sig
## Lag 0 1.0000000 1.0000000 1.0000000 1.0000000 1.00000000
## Lag 1 0.9831520 0.9936467 0.9720950 0.9948872 0.08072946
## Lag 5 0.9206416 0.9699671 0.8742759 0.9749790 0.05177640
## Lag 10 0.8506233 0.9433945 0.7765704 0.9517076 0.02808404
## Lag 50 0.5452206 0.7756046 0.3883893 0.7960094 0.01645752
autocorr.plot(mod1_sim)
effectiveSize(mod1_sim)
## b[1] b[2] b[3] b0 sig
## 118.78532 45.70286 198.69623 38.43828 5231.71333
plot(education_model) # residual diagnostics for the reference lm fit
dic.samples(mod1, n.iter=1e5)
## Mean deviance: 480.9
## penalty 5.226
## Penalized deviance: 486.2
library("rjags")
mod_string_question_4a = " model {
for (i in 1:length(education)) {
education[i] ~ dnorm(mu[i], prec)
mu[i] = b0 + b[1]*income[i] + b[2]*young[i]
}
b0 ~ dnorm(0.0, 1.0/1.0e6)
for (i in 1:2) {
b[i] ~ dnorm(0.0, 1.0/1.0e6)
}
prec ~ dgamma(1.0/2.0, 1.0*1500.0/2.0)
## Initial guess of variance based on overall
## variance of education variable. Uses low prior
## effective sample size. Technically, this is not
## a true 'prior', but it is not very informative.
sig2 = 1.0 / prec
sig = sqrt(sig2)
} "
data_jags = as.list(Anscombe)
params1 = c("b0","b", "sig")
inits1 = function() {
inits = list("b"=rnorm(2,0.0,100.0), "prec"=rgamma(1,1.0,1.0))
}
mod_question_4a = jags.model(textConnection(mod_string_question_4a), data=data_jags, inits=inits1, n.chains=3)
## Warning in jags.model(textConnection(mod_string_question_4a), data = data_jags,
## : Unused variable "urban" in data
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 51
## Unobserved stochastic nodes: 4
## Total graph size: 320
##
## Initializing model
update(mod_question_4a, 1000) # burn-in
mod_question_4a_sim = coda.samples(model=mod_question_4a,
variable.names=params1,
n.iter=5000)
mod_question_4a_csim = do.call(rbind, mod_question_4a_sim) # combine multiple chains
plot(mod_question_4a_sim)
gelman.diag(mod_question_4a_sim)
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## b[1] 1.01 1.04
## b[2] 1.11 1.32
## b0 1.11 1.33
## sig 1.00 1.01
##
## Multivariate psrf
##
## 1.08
autocorr.diag(mod_question_4a_sim)
## b[1] b[2] b0 sig
## Lag 0 1.0000000 1.0000000 1.0000000 1.00000000
## Lag 1 0.9681978 0.9936505 0.9950037 0.05120861
## Lag 5 0.8491241 0.9691617 0.9747177 0.02312660
## Lag 10 0.7239999 0.9401533 0.9497977 0.02793077
## Lag 50 0.2575059 0.7439478 0.7533292 0.01041833
autocorr.plot(mod_question_4a_sim)
effectiveSize(mod_question_4a_sim)
## b[1] b[2] b0 sig
## 242.88985 47.76755 38.91240 7510.24763
dic.samples(mod_question_4a, n.iter=1e5)
## Mean deviance: 489.2
## penalty 4.159
## Penalized deviance: 493.4
library("rjags")
mod_string_question_4b = " model {
for (i in 1:length(education)) {
education[i] ~ dnorm(mu[i], prec)
mu[i] = b0 + b[1]*income[i] + b[2]*young[i]+ b[3]*young[i]*income[i]
}
b0 ~ dnorm(0.0, 1.0/1.0e6)
for (i in 1:3) {
b[i] ~ dnorm(0.0, 1.0/1.0e6)
}
prec ~ dgamma(1.0/2.0, 1.0*1500.0/2.0)
## Initial guess of variance based on overall
## variance of education variable. Uses low prior
## effective sample size. Technically, this is not
## a true 'prior', but it is not very informative.
sig2 = 1.0 / prec
sig = sqrt(sig2)
} "
data_jags = as.list(Anscombe)
params1 = c("b0","b", "sig")
inits1 = function() {
inits = list("b"=rnorm(3,0.0,100.0), "prec"=rgamma(1,1.0,1.0))
}
mod_question_4b = jags.model(textConnection(mod_string_question_4b), data=data_jags, inits=inits1, n.chains=3)
## Warning in jags.model(textConnection(mod_string_question_4b), data = data_jags,
## : Unused variable "urban" in data
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 51
## Unobserved stochastic nodes: 5
## Total graph size: 372
##
## Initializing model
update(mod_question_4b, 1000) # burn-in
mod_question_4b_sim = coda.samples(model=mod_question_4b,
variable.names=params1,
n.iter=5000)
mod_question_4b_csim = do.call(rbind, mod_question_4b_sim) # combine multiple chains
plot(mod_question_4b_sim)
gelman.diag(mod_question_4b_sim)
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## b[1] 15.91 41.9
## b[2] 16.31 44.0
## b[3] 15.49 40.7
## b0 16.63 45.8
## sig 1.65 2.6
##
## Multivariate psrf
##
## 12.7
autocorr.diag(mod_question_4b_sim)
## b[1] b[2] b[3] b0 sig
## Lag 0 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
## Lag 1 0.9953131 0.9965044 0.9956522 0.9962596 0.4630612
## Lag 5 0.9772698 0.9837085 0.9791110 0.9825277 0.4400860
## Lag 10 0.9571882 0.9696384 0.9588185 0.9675369 0.4223150
## Lag 50 0.8206555 0.8864046 0.8130327 0.8717996 0.2965933
autocorr.plot(mod_question_4b_sim)
effectiveSize(mod_question_4b_sim)
## b[1] b[2] b[3] b0 sig
## 35.23262 23.44485 32.34437 25.48249 392.31325
dic.samples(mod_question_4b, n.iter=1e5)
## Mean deviance: 487.1
## penalty 5.186
## Penalized deviance: 492.3
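One caution before trusting the DIC comparison for this last model: the Gelman diagnostics above (potential scale reduction factors around 16) and the very small effective sample sizes indicate the chains for the interaction model have not converged, so its DIC estimate is unreliable. A standard remedy (our suggestion, not part of the original analysis) is to center the predictors before fitting, which reduces the posterior correlation between the intercept and the slopes:
# sketch: center the covariates to improve mixing for the interaction model
data_jags_centered = as.list(Anscombe)
data_jags_centered$income = Anscombe$income - mean(Anscombe$income)
data_jags_centered$young = Anscombe$young - mean(Anscombe$young)

mod_question_4b_c = jags.model(textConnection(mod_string_question_4b),
                               data=data_jags_centered, inits=inits1, n.chains=3)
update(mod_question_4b_c, 1000) # burn-in
# re-run coda.samples, gelman.diag, and dic.samples as before to verify improved convergence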