Converting Standardized Beta Coefficient Estimates to Raw Data Scale

Code chunk example

Why we need to standardize

Proof of calculation

When writing the model string in JAGS, we have become familiar with standardizing the data and then scaling the estimates back to the original scale (the same can also be done in Stan). In this topic, we cover:

the reason why we should do this.
a proof of the data transformation and of the scaling back in the final step.

Code chunk example

Let us look at two code chunks:

Standardize the data
Transform to original scale

Both are marked between the ---------- lines in the following simple linear model, fitted with a robust (Student-t) likelihood.

 modelString = "

# Standardize the data---------------------------------------------------------
data {
    Ntotal <- length(y)
    xm <- mean(x)
    ym <- mean(y)
    xsd <- sd(x)
    ysd <- sd(y)
    for ( i in 1:length(y) ) {
      zx[i] <- (x[i] - xm) / xsd
      zy[i] <- (y[i] - ym) / ysd
    }
}#-----------------------------------------------------------------------------

# Specify the model for standardized data:
model {
    for ( i in 1:Ntotal ) {
      zy[i] ~ dt( zbeta0 + zbeta1 * zx[i] , 1/zsigma^2 , nu )
    }

    # Priors vague on standardized scale:
    zbeta0 ~ dnorm(0, 1/(10)^2 )  
    zbeta1 ~ dnorm(0, 1/(10)^2 )
    zsigma ~ dunif(1.0E-3, 1.0E+3 )
    nu ~ dexp(1/30.0)

  # Transform to original scale------------------------------------------------
  beta1 <- zbeta1 * ysd / xsd  
  beta0 <- zbeta0 * ysd  + ym - zbeta1 * xm * ysd / xsd 
  sigma <- zsigma * ysd
  #----------------------------------------------------------------------------

}
"
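The back-transformation does not have to live inside the JAGS model. As a minimal sketch, assuming only the standardized parameters were monitored, the same three formulas can be applied to posterior draws in R afterwards. The helper name `to_original_scale` and the toy data and "draws" below are hypothetical and serve only as an illustration:

```r
# Hypothetical helper (not part of the original model): maps draws of the
# standardized parameters back to the raw data scale, mirroring the
# "Transform to original scale" lines of the JAGS model string.
to_original_scale <- function(zbeta0, zbeta1, zsigma, x, y) {
  xm  <- mean(x)
  ym  <- mean(y)
  xsd <- sd(x)
  ysd <- sd(y)
  data.frame(
    beta0 = zbeta0 * ysd + ym - zbeta1 * xm * ysd / xsd,
    beta1 = zbeta1 * ysd / xsd,
    sigma = zsigma * ysd
  )
}

# Toy data and made-up draws, for illustration only
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)
draws <- to_original_scale(zbeta0 = c(0, 0.01),
                           zbeta1 = c(0.99, 1.00),
                           zsigma = c(0.10, 0.12),
                           x = x, y = y)
print(draws)
```

Because the formulas are vectorized, each row of `draws` corresponds to one draw on the original scale.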

Why we need to standardize

So why do we need to standardize the response and all predictors, and then scale back later?

The purpose of using z-scores in JAGS is to overcome the problem of correlated parameters (see the simulation of the correlation between $\beta_0$ and $\beta_1$ in another Bayesian workshop).

Strong correlation produces a long, thin shape in the scatter plot of the parameter values, which makes Gibbs sampling very slow and inefficient.

But remember to scale the estimates back to the original measurement scale. The same procedure can be applied in Stan in all situations, although HMC as implemented in Stan does not have this problem.
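The correlation between intercept and slope can be illustrated without running a sampler. The sketch below is an assumption-laden stand-in: it uses the sampling covariance of ordinary least-squares estimates from `lm()` as a proxy for the posterior correlation under vague priors, on simulated data whose predictor mean is far from zero:

```r
set.seed(1)
n <- 100
x <- rnorm(n, mean = 50, sd = 5)        # predictor centered far from zero
y <- 10 + 2 * x + rnorm(n, sd = 3)      # simulated response

# Fit on the raw scale and on the standardized scale
fit_raw <- lm(y ~ x)
zx <- (x - mean(x)) / sd(x)
zy <- (y - mean(y)) / sd(y)
fit_std <- lm(zy ~ zx)

# Correlation between the intercept and slope estimates in each fit
cor_raw <- cov2cor(vcov(fit_raw))[1, 2]
cor_std <- cov2cor(vcov(fit_std))[1, 2]

print(c(raw = cor_raw, standardized = cor_std))
```

On the raw scale the two estimates are strongly negatively correlated, while on the standardized scale the correlation is essentially zero, which is the geometry that a Gibbs sampler handles more easily.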

Proof of calculation

For the regression model on the standardized variables, we assume the following form for the regression line (in the present case, both the response and the predictors are transformed):

$E[Y_{scaled}] = \beta_0 + \sum_{j=1}^k \beta_j z_j$

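where $z_j$ is the $j$-th (standardized) regressor, defined as follows:

$z_j = \frac{x_j - \bar{x}_j}{S_x},$ and $\hat{Y}_{scaled} = \frac{\hat{Y}_{unscaled} - \bar{Y}}{S_Y}$

Here $S_x$ denotes the sample standard deviation of the corresponding predictor $x_j$ (one value per predictor), and $S_Y$ the sample standard deviation of the response.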

Carrying out the regression with the standardized regressors, we obtain the fitted regression line:

$\hat{Y}_{scaled} = \hat{\beta_0} + \sum_{j=1}^k \hat{\beta}_j z_j$

We now wish to find the regression coefficients for the raw (non-standardized) predictors, from:

$\begin{aligned} \hat{Y}_{scaled} &= \hat{\beta_0} + \sum_{j=1}^k \hat{\beta}_j \left( \frac{x_j - \bar{x}_j}{S_x} \right) \\ \frac{\hat{Y}_{unscaled} - \bar{Y}}{S_Y} &= \hat{\beta_0} + \sum_{j=1}^k \frac{1}{S_x} (x_j - \bar{x}_j) \hat{\beta}_j \\ \hat{Y}_{unscaled} &= \hat{\beta_0} S_Y + \bar{Y} + \sum_{j=1}^k \left( \frac{S_Y}{S_x} \right) (x_j - \bar{x}_j) \hat{\beta}_j \\ \hat{Y}_{unscaled} &= \hat{\beta_0} S_Y + \bar{Y} - \sum_{j=1}^k \left( \frac{S_Y}{S_x} \right) \bar{x}_j \hat{\beta}_j + \sum_{j=1}^k \left( \frac{S_Y}{S_x} \right) x_j \hat{\beta}_j \end{aligned}$

As we can see,

the intercept for the regression on the untransformed variables is $\hat{\beta_0} S_Y + \bar{Y} - \sum_{j=1}^k \left( \frac{S_Y}{S_x} \right) \bar{x}_j \hat{\beta}_j$.
the regression coefficient of the $j^{th}$ predictor is $\left( \frac{S_Y}{S_x} \right) \hat{\beta}_j$ (a single term, not a sum), with $S_x$ the sample standard deviation of that predictor.
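As a hedged cross-check against the JAGS code, the special case $k = 1$ can be written with the names used in the model string (`zbeta0`, `zbeta1`, `zsigma`, `xm`, `ym`, `xsd`, `ysd`). The line for `sigma` is not covered by the derivation above; the last row assumes it follows from the fact that residuals on the raw scale are $S_Y$ times the residuals on the standardized scale:

$\begin{aligned} \texttt{beta1} &= \texttt{zbeta1} \cdot \frac{\texttt{ysd}}{\texttt{xsd}} \\ \texttt{beta0} &= \texttt{zbeta0} \cdot \texttt{ysd} + \texttt{ym} - \texttt{zbeta1} \cdot \texttt{xm} \cdot \frac{\texttt{ysd}}{\texttt{xsd}} \\ Y_{unscaled} - \hat{Y}_{unscaled} &= S_Y \left( Y_{scaled} - \hat{Y}_{scaled} \right) \;\Rightarrow\; \texttt{sigma} = \texttt{zsigma} \cdot \texttt{ysd} \end{aligned}$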

This completes the proof.
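The result can also be checked numerically. The sketch below is only an illustration on simulated data with two predictors, and it uses ordinary least squares via `lm()` rather than JAGS: coefficients fitted on the standardized variables are converted with the formulas above and compared with a fit on the raw variables. All variable names and simulated values are assumptions made for the example:

```r
set.seed(42)
n  <- 200
x1 <- rnorm(n, mean = 10, sd = 2)
x2 <- rnorm(n, mean = -3, sd = 5)
y  <- 1.5 + 0.8 * x1 - 0.4 * x2 + rnorm(n)

# Standardize the response and both predictors
z1 <- (x1 - mean(x1)) / sd(x1)
z2 <- (x2 - mean(x2)) / sd(x2)
zy <- (y - mean(y)) / sd(y)

b_std <- coef(lm(zy ~ z1 + z2))   # coefficients on the standardized scale
b_raw <- coef(lm(y ~ x1 + x2))    # coefficients on the raw scale

# Slopes: (S_Y / S_x) * beta_j, one term per predictor
slopes <- b_std[-1] * sd(y) / c(sd(x1), sd(x2))
# Intercept: beta_0 * S_Y + mean(Y) - sum of slope_j * mean(x_j)
intercept <- b_std[1] * sd(y) + mean(y) - sum(slopes * c(mean(x1), mean(x2)))

converted <- c(intercept, slopes)
print(all.equal(unname(converted), unname(b_raw)))
```

The converted coefficients are expected to agree with the raw-scale fit up to floating-point error, since least-squares estimates transform exactly under this rescaling.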

Hai Nguyen

Last compiled on 24 March, 2021
