AIC: Works when the priors are flat (or overwhelmed by the likelihood) and the posterior is approximately multivariate Gaussian. The Akaike information criterion (AIC) is an estimator of out-of-sample prediction error and thereby of the relative quality of statistical models for a given data set: given a collection of models, AIC estimates the quality of each model relative to the others, and thus provides a means for model selection.
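A minimal sketch of how AIC is computed, assuming a simple normal model fit by maximum likelihood to hypothetical data (the data and parameter count are illustrative, not from the original text):

```python
import numpy as np
from scipy.stats import norm

# Hypothetical data: estimate a normal mean and sd by maximum likelihood,
# then compute AIC = -2 * log-likelihood + 2 * k (k = number of parameters).
rng = np.random.default_rng(0)
y = rng.normal(2.0, 1.5, size=100)

mu_hat, sd_hat = y.mean(), y.std()            # MLE for the normal model
log_lik = norm.logpdf(y, mu_hat, sd_hat).sum()
k = 2                                          # mu and sigma
aic = -2 * log_lik + 2 * k
print(aic)
```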
WAIC: The Widely Applicable Information Criterion, \(\text{WAIC} = -2(\text{lppd} - p_{\text{WAIC}})\), is the log pointwise predictive density (lppd) with a penalty \(p_{\text{WAIC}}\) proportional to the variance of the pointwise log-likelihood across posterior draws. WAIC is a generalization of the Akaike Information Criterion (AIC): the penalty estimates the effective number of parameters to adjust for overfitting. It is the more general criterion because it makes no assumption about the shape of the posterior.
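A sketch of the WAIC computation from a matrix of pointwise log-likelihoods, assuming you already have one log-likelihood value per posterior draw and per observation (how that matrix is produced depends on your sampler and is not shown here):

```python
import numpy as np
from scipy.special import logsumexp

def waic(log_lik):
    """WAIC from a pointwise log-likelihood matrix of shape (n_draws, n_obs)."""
    n_draws = log_lik.shape[0]
    # lppd: log of the average (over draws) likelihood, summed over observations.
    lppd = (logsumexp(log_lik, axis=0) - np.log(n_draws)).sum()
    # pWAIC: variance of the log-likelihood across draws, summed over observations.
    p_waic = log_lik.var(axis=0, ddof=1).sum()
    return -2 * (lppd - p_waic)
```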
Model selection means choosing the model with the lowest criterion value and discarding the others. This procedure loses the information about relative model accuracy contained in the differences among the CV/PSIS/WAIC values. Model comparison is the better, more general approach: it uses several simple models to understand how different variables influence predictions and, in combination with a causal model and its implied conditional independencies among variables, helps us infer causal relationships.
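A sketch of comparing rather than selecting: keep the WAIC differences and turn them into relative weights instead of discarding all but the best model. The WAIC values below are hypothetical placeholders:

```python
import numpy as np

# Hypothetical WAIC values for three candidate models (lower is better).
waic_values = np.array([312.4, 315.9, 340.2])

delta = waic_values - waic_values.min()   # differences from the best model
weights = np.exp(-0.5 * delta)
weights /= weights.sum()                  # Akaike-style weights: relative support
for d, w in zip(delta, weights):
    print(f"dWAIC = {d:5.1f}  weight = {w:.3f}")
```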
The models must be fit to the same observations because information criteria are based on deviance, and deviance is a sum over observations, not a mean. Adding observations mechanically increases the deviance regardless of how accurate the model is, so a model fit to more observations will almost always show a higher deviance. Deviance-based information criteria are therefore not appropriate for comparing models fit to different numbers of observations.
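A small sketch of this scaling, using the true model on simulated data so no estimation error is involved (purely illustrative):

```python
import numpy as np
from scipy.stats import norm

# Deviance is a sum of pointwise terms, so scoring more observations gives a
# larger deviance even when every point is predicted equally well.
rng = np.random.default_rng(2)

def deviance(n):
    y = rng.normal(0.0, 1.0, size=n)
    return -2 * norm.logpdf(y, 0.0, 1.0).sum()   # score under the true model

print(deviance(100))    # roughly n * (log(2*pi) + 1) ~ 284
print(deviance(1000))   # roughly ten times larger     ~ 2840
```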
With more concentrated priors, the effective number of parameters \(p_{\text{WAIC}}\) decreases. Tighter priors regularize the estimates more strongly, shrinking the posterior variance that \(p_{\text{WAIC}}\) measures.
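A sketch of that effect under a deliberately simple, hypothetical setup: a normal model with known \(\sigma = 1\) and a conjugate Normal prior on the mean, so the posterior is available in closed form. A tighter prior shrinks the posterior, which shrinks the spread of the pointwise log-likelihoods across draws, and therefore \(p_{\text{WAIC}}\):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
y = rng.normal(0.5, 1.0, size=50)

def p_waic(tau, n_draws=4000):
    # Conjugate posterior for the mean under a Normal(0, tau^2) prior, sigma = 1.
    post_var = 1.0 / (len(y) + 1.0 / tau**2)
    post_mean = post_var * y.sum()
    mu_draws = rng.normal(post_mean, np.sqrt(post_var), size=n_draws)
    # Pointwise log-likelihood matrix: (draws, observations).
    log_lik = norm.logpdf(y[None, :], loc=mu_draws[:, None], scale=1.0)
    return log_lik.var(axis=0, ddof=1).sum()

print(p_waic(tau=10.0))   # weakly informative prior: pWAIC close to 1
print(p_waic(tau=0.1))    # concentrated prior: noticeably smaller pWAIC
```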
Overfitting occurs when a statistical model fits the training data too closely and therefore predicts poorly on unseen data. An appropriately informative (regularizing) prior guides the fit by preventing the parameter estimates from being driven by extreme values in the sample.
If the priors are too restrictive, the model cannot learn the regular features of the data, resulting in underfitting.