  • Code chunk example
  • Why we need to standardize
  • Proof of calculation

When writing model strings in JAGS, we became familiar with standardizing the data and then transforming the results back to the original scale (the same can be done in Stan, as noted later). This post covers:

  • why we should do that;
  • a proof of the transformation used to scale the results back in the final step.

Code chunk example

Let's look at two code chunks:

  • Standardize the data
  • Transform to original scale

Both sit between the ---------- markers in the model string below, taken from a simple linear model fitted with a robust (Student-t) likelihood.

 modelString = "

# Standardize the data---------------------------------------------------------
data {
    Ntotal <- length(y)
    xm <- mean(x)
    ym <- mean(y)
    xsd <- sd(x)
    ysd <- sd(y)
    for ( i in 1:length(y) ) {
      zx[i] <- (x[i] - xm) / xsd
      zy[i] <- (y[i] - ym) / ysd
    }
}#-----------------------------------------------------------------------------

# Specify the model for standardized data:
model {
    for ( i in 1:Ntotal ) {
      zy[i] ~ dt( zbeta0 + zbeta1 * zx[i] , 1/zsigma^2 , nu )
    }

    # Priors vague on standardized scale:
    zbeta0 ~ dnorm(0, 1/(10)^2 )  
    zbeta1 ~ dnorm(0, 1/(10)^2 )
    zsigma ~ dunif(1.0E-3, 1.0E+3 )
    nu ~ dexp(1/30.0)

    # Transform to original scale----------------------------------------------
    beta1 <- zbeta1 * ysd / xsd
    beta0 <- zbeta0 * ysd + ym - zbeta1 * xm * ysd / xsd
    sigma <- zsigma * ysd
    #--------------------------------------------------------------------------

}
"

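The two marked chunks can be mirrored outside JAGS to see that they are consistent with each other. The following is only a sketch in Python, using the standard library: the data values, the function names `standardize` and `to_original_scale`, and the numbers standing in for a single MCMC draw are all made up for illustration. It checks that a prediction made on the standardized scale and mapped back to the original units equals the prediction made directly from the back-transformed coefficients.

```python
import statistics


def standardize(values):
    """Return z-scores plus the mean and sample SD used (as in the data block)."""
    m = statistics.mean(values)
    s = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return [(v - m) / s for v in values], m, s


def to_original_scale(zbeta0, zbeta1, zsigma, xm, xsd, ym, ysd):
    """Same algebra as the 'Transform to original scale' chunk."""
    beta1 = zbeta1 * ysd / xsd
    beta0 = zbeta0 * ysd + ym - zbeta1 * xm * ysd / xsd
    sigma = zsigma * ysd
    return beta0, beta1, sigma


# Made-up data, for illustration only.
x = [150.0, 158.0, 163.0, 170.0, 176.0, 181.0, 190.0]
y = [52.0, 57.5, 61.0, 68.0, 74.5, 79.0, 90.0]

zx, xm, xsd = standardize(x)
zy, ym, ysd = standardize(y)

# Hypothetical values standing in for one posterior draw on the z scale.
zbeta0, zbeta1, zsigma = 0.1, 0.8, 0.5
beta0, beta1, sigma = to_original_scale(zbeta0, zbeta1, zsigma, xm, xsd, ym, ysd)

# Largest difference between the two ways of predicting y in original units.
max_gap = max(
    abs((ym + ysd * (zbeta0 + zbeta1 * zxi)) - (beta0 + beta1 * xi))
    for xi, zxi in zip(x, zx)
)
print("largest prediction gap:", max_gap)
```

The gap is zero up to floating-point rounding for any values of `zbeta0` and `zbeta1`, which is why the transformation can be applied draw by draw inside the model block.
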
Why we need to standardize

So why do we need to standardize the response and all predictors, and then scale back later?

The purpose of using z-scores in JAGS is to overcome the problem of correlated parameters (see the simulation of the correlation between β0 and β1 in another Bayesian workshop).

Strong correlation produces a long, thin shape in the scatter plot of the parameters, which makes Gibbs sampling very slow and inefficient.

But remember to scale back to the original units. The same approach can be applied in Stan in every situation, although the HMC sampler implemented in Stan does not have this problem.
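To see where the correlation between β0 and β1 comes from, here is a rough sketch in Python. It assumes an ordinary least-squares straight-line fit rather than the robust model above, and it does not run JAGS. Under that assumption the correlation between the intercept and slope estimates follows from the inverse of X'X and equals -sum(x) / sqrt(n * sum(x^2)), so it depends only on how far the predictor sits from zero. The data values and the function name `intercept_slope_correlation` are made up for illustration.

```python
import math
import statistics


def intercept_slope_correlation(x):
    """Correlation of intercept and slope implied by (X'X)^-1 for a straight-line fit."""
    n = len(x)
    sum_x = sum(x)
    sum_x2 = sum(v * v for v in x)
    return -sum_x / math.sqrt(n * sum_x2)


# Made-up predictor values far from zero.
x = [150.0, 158.0, 163.0, 170.0, 176.0, 181.0, 190.0]

# Standardize exactly as in the JAGS data block.
xm = statistics.mean(x)
xsd = statistics.stdev(x)
zx = [(v - xm) / xsd for v in x]

corr_raw = intercept_slope_correlation(x)
corr_std = intercept_slope_correlation(zx)

print(f"raw x:          corr(beta0, beta1) = {corr_raw:.4f}")
print(f"standardized x: corr(beta0, beta1) = {corr_std:.4f}")
```

With the raw predictor the correlation is close to -1, which is the long, thin shape described above. After standardizing, the predictor has mean zero and the correlation vanishes up to rounding.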

Proof of calculation

For the regression model using the standardized variables, we assume the following form for the regression line (in the present case, both the response and the predictors have been transformed):

$$E[Y_{\text{scaled}}] = \beta_0 + \sum_{j=1}^{k} \beta_j z_j$$

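where $z_j$ is the j-th (standardized) regressor, defined as:

$$z_j = \frac{x_j - \bar{x}_j}{S_{x_j}}, \quad \text{and} \quad \hat{Y}_{\text{scaled}} = \frac{\hat{Y}_{\text{unscaled}} - \bar{Y}}{S_Y}$$

Here $S_{x_j}$ is the sample standard deviation of the j-th predictor and $S_Y$ is the sample standard deviation of the response.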

Carrying out the regression with the standardized regressors, we obtain the fitted regression line:

$$\hat{Y}_{\text{scaled}} = \hat{\beta}_0 + \sum_{j=1}^{k} \hat{\beta}_j z_j$$

We now wish to find the regression coefficients for the raw (non-standardized) predictors, from:

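$$
\begin{aligned}
\hat{Y}_{\text{scaled}} &= \hat{\beta}_0 + \sum_{j=1}^{k} \hat{\beta}_j \left( \frac{x_j - \bar{x}_j}{S_{x_j}} \right) \\
\frac{\hat{Y}_{\text{unscaled}} - \bar{Y}}{S_Y} &= \hat{\beta}_0 + \sum_{j=1}^{k} \frac{1}{S_{x_j}} (x_j - \bar{x}_j) \hat{\beta}_j \\
\hat{Y}_{\text{unscaled}} &= \hat{\beta}_0 S_Y + \bar{Y} + \sum_{j=1}^{k} \left( \frac{S_Y}{S_{x_j}} \right) (x_j - \bar{x}_j) \hat{\beta}_j \\
\hat{Y}_{\text{unscaled}} &= \hat{\beta}_0 S_Y + \bar{Y} - \sum_{j=1}^{k} \left( \frac{S_Y}{S_{x_j}} \right) \bar{x}_j \hat{\beta}_j + \sum_{j=1}^{k} \left( \frac{S_Y}{S_{x_j}} \right) x_j \hat{\beta}_j
\end{aligned}
$$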

As we can see,

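  • the intercept for the regression on the untransformed variables is $\hat{\beta}_0 S_Y + \bar{Y} - \sum_{j=1}^{k} \left( \frac{S_Y}{S_{x_j}} \right) \bar{x}_j \hat{\beta}_j$;
  • the regression coefficient of the j-th predictor is $\left( \frac{S_Y}{S_{x_j}} \right) \hat{\beta}_j$ (a single term, with no sum over j).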

This completes the proof.
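As a check against the JAGS code, the result can be specialized to a single predictor (k = 1). In the block below, $S_x$ is the standard deviation of that one predictor, and the superscript "raw" is a label introduced here for the original-scale quantities; it does not appear in the derivation above.

```latex
\begin{aligned}
\hat{\beta}_1^{\text{raw}} &= \frac{S_Y}{S_x}\,\hat{\beta}_1 \\
\hat{\beta}_0^{\text{raw}} &= \hat{\beta}_0 S_Y + \bar{Y} - \frac{S_Y}{S_x}\,\bar{x}\,\hat{\beta}_1 \\
Y_{\text{unscaled}} - \hat{Y}_{\text{unscaled}} &= S_Y \left( Y_{\text{scaled}} - \hat{Y}_{\text{scaled}} \right)
\;\;\Longrightarrow\;\;
\sigma^{\text{raw}} = S_Y\,\sigma_{\text{scaled}}
\end{aligned}
```

With zbeta0, zbeta1, zsigma, xm, xsd, ym and ysd in place of $\hat{\beta}_0$, $\hat{\beta}_1$, $\sigma_{\text{scaled}}$, $\bar{x}$, $S_x$, $\bar{Y}$ and $S_Y$, the first two lines are the beta1 and beta0 assignments in the "Transform to original scale" chunk. The third line, which follows from residuals being rescaled by $S_Y$, gives sigma <- zsigma * ysd.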