Bayesian linear regression is presented in Box and Tiao (1973); Gelman et al. (2004); Lee (2004); Zellner (1971). Consider a standard linear regression problem through the origin, in which the model for our data is
\[ \text{Actual}_{i} = \beta\, \text{Survey}_{i} + \epsilon_{i}, \qquad \epsilon_{i} \sim N(0, \sigma^2) \]
The parameters of greatest interest in Bayesian linear regression are usually the regression coefficients and \(\sigma^2\); in the through-origin model the only coefficient is the slope \(\beta\). We first consider the standard noninformative prior, which yields Bayesian inference analogous to the frequentist results. As in models with a normal likelihood and both mean and variance unknown, the standard noninformative prior in Bayesian regression is the product of independent improper priors on the mean-related parameters (the regression coefficients) and the variance parameter. Multiplying a flat prior (proportional to a constant over the whole real line) on the coefficient by an inverse gamma prior on \(\sigma^2\) with both parameters going to 0 yields
\[ p(\beta,\sigma^{2}) \propto \frac{1}{\sigma^{2}} \]
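Writing \(y_i\) for Actual and \(x_i\) for Survey, this prior gives a closed-form posterior; a standard result (see, e.g., Gelman et al. 2004) for the through-origin model with \(n\) observations is
\[ \beta \mid \sigma^{2}, y \;\sim\; N\!\left(\hat{\beta},\; \frac{\sigma^{2}}{\sum_{i} x_{i}^{2}}\right), \qquad \hat{\beta} = \frac{\sum_{i} x_{i} y_{i}}{\sum_{i} x_{i}^{2}}, \]
with \(\sigma^{2}\) following a scaled inverse-\(\chi^{2}\) posterior on \(n-1\) degrees of freedom. The posterior mean of \(\beta\) equals the least squares estimate, which is the sense in which this prior reproduces the frequentist results.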
This frequentist approach assumes that there are enough measurements to say something meaningful about \(\boldsymbol\beta\). In the Bayesian approach, the data are supplemented with additional information in the form of a prior probability distribution. The prior belief about the parameters is combined with the data's likelihood function according to Bayes' theorem to yield the posterior belief about the parameters \(\boldsymbol\beta\) and \(\sigma\). The prior can take different functional forms depending on the domain and the information that is available a priori.
In Bayesian statistical inference, a prior probability distribution, often called simply the prior, of an uncertain quantity is the probability distribution that would express one's beliefs about this quantity before some evidence is taken into account. For example, the prior could be the probability distribution representing the relative proportions of voters who will vote for a particular politician in a future election. The unknown quantity may be a parameter of the model or a latent variable rather than an observable variable. Bayes' theorem calculates the renormalized pointwise product of the prior and the likelihood function to produce the posterior probability distribution, which is the conditional distribution of the uncertain quantity given the data.

A prior can be the purely subjective assessment of an experienced expert. It can also be chosen according to some principle, such as the Jeffreys prior or Bernardo's reference prior. When a family of conjugate priors exists, choosing a prior from that family simplifies calculation of the posterior distribution.

An uninformative prior expresses vague or general information about a variable. The term "uninformative prior" is somewhat of a misnomer; often, such a prior might be called a not very informative prior, or an objective prior, i.e. one that is not subjectively elicited. Uninformative priors can express "objective" information such as "the variable is positive" or "the variable is less than some limit". The simplest and oldest rule for determining a non-informative prior is the principle of indifference, which assigns equal probabilities to all possibilities. In parameter estimation problems, the use of an uninformative prior typically yields results which are not too different from conventional statistical analysis, as the likelihood function often yields more information than the uninformative prior.
Before conducting the Bayesian regression, we perform some exploratory data analysis on the 12 Actual Survey vs. Sample Survey observations from the Consolidated HIS Site Inventory file. The scatter plot in Figure 1 below shows the relationship between the Survey and Actual data.
The complete raw data can be seen in detail in Table 1.
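A minimal R sketch of this exploratory step follows; the `survey` and `actual` vectors below are hypothetical placeholders, since the 12 observations from Table 1 are not reproduced here.

```r
## Placeholder values standing in for the 12 Survey/Actual pairs in Table 1;
## substitute the real numbers from the Consolidated HIS Site Inventory file.
survey <- c(10, 12, 15, 18, 20, 22, 25, 28, 30, 33, 35, 60)
actual <- c(11, 13, 14, 19, 21, 24, 26, 27, 32, 34, 36, 20)

## Figure 1: scatter plot of Actual against Survey.
plot(survey, actual, xlab = "Survey", ylab = "Actual",
     main = "Actual vs. Survey volume")
```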
We use OpenBUGS and the R2OpenBUGS package in R to perform a simple linear regression. The data to be fitted concern the Actual volume (response variable) and the Survey volume (explanatory variable) of twelve samples.
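The original BUGS code is not reproduced in this section, so the following is only a sketch of how a through-origin model with normal errors could be fitted with R2OpenBUGS; the vague priors, chain settings, and the file name `model_normal.txt` are illustrative assumptions, and the data vectors are those introduced above.

```r
library(R2OpenBUGS)

## Through-origin regression with normal errors (illustrative priors).
writeLines("
model {
  for (i in 1:n) {
    actual[i] ~ dnorm(mu[i], tau)
    mu[i] <- beta * survey[i]      # no intercept: regression through the origin
  }
  beta ~ dnorm(0, 1.0E-6)          # vague prior on the slope
  tau  ~ dgamma(0.001, 0.001)      # vague prior on the error precision
  sigma <- 1 / sqrt(tau)
}", "model_normal.txt")

data  <- list(survey = survey, actual = actual, n = 12)
inits <- function() list(beta = 1, tau = 1)

fit <- bugs(data, inits, parameters.to.save = c("beta", "sigma"),
            model.file = "model_normal.txt",
            n.chains = 3, n.iter = 10000, n.burnin = 2000)
print(fit)
```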
We can see from the figure above what appear to be multiple influential points that may be outliers. In ordinary regression we assume normally distributed errors; if we suspect outliers, it is better to assume an error distribution with longer tails than the normal, and the t distribution is a natural choice. For this reason we conduct the Bayesian regression analysis with t-distributed errors. The results of our Bayesian analysis are shown below.
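Under the same assumptions as the sketch above, switching to t errors only changes the likelihood line of the BUGS model; the 4 degrees of freedom used here are an illustrative choice (the degrees of freedom could also be given a prior).

```r
## Same model as before, but with heavy-tailed t errors.
writeLines("
model {
  for (i in 1:n) {
    actual[i] ~ dt(mu[i], tau, 4)  # t likelihood: robust to outliers
    mu[i] <- beta * survey[i]
  }
  beta ~ dnorm(0, 1.0E-6)
  tau  ~ dgamma(0.001, 0.001)
  sigma <- 1 / sqrt(tau)
}", "model_t.txt")

fit.t <- bugs(data, inits, parameters.to.save = c("beta", "sigma"),
              model.file = "model_t.txt",
              n.chains = 3, n.iter = 10000, n.burnin = 2000)
print(fit.t)
```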
From the results above we conclude that, with the t distribution, we get a very different result. The regression is now what we call resistant regression, meaning that it is not unduly influenced by outliers. It works because the likelihood now recognizes the possibility of finding an odd point in the tails of the distribution.
We evaluate the Bayesian model to assess the quality of the parameter estimates. Several methods are used, including convergence diagnostics (e.g., trace plots, density plots, and autocorrelation plots).
A trace plot plots the iteration number against the value of the parameter drawn at that iteration. It shows whether the chain gets stuck in certain areas of the parameter space, which indicates poor mixing, or moves freely around the parameter space.
A density plot visualizes the estimated posterior probability distribution of a parameter.
An autocorrelation plot provides another way to assess convergence: it shows the autocorrelation between successive draws of the Markov chain. High autocorrelation indicates that the chain explores the posterior slowly.
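All three diagnostics can be produced with the coda package; the sketch below assumes the `fit.t` object from the earlier R2OpenBUGS run and assumes that an `as.mcmc.list` conversion is available for `bugs` objects.

```r
library(coda)

## Convert the R2OpenBUGS output to a coda object
## (as.mcmc.list for 'bugs' objects is assumed available).
chains <- as.mcmc.list(fit.t)

traceplot(chains[, "beta"])      # trace plot: mixing over iterations
densplot(chains[, "beta"])       # density plot: posterior of beta
autocorr.plot(chains[, "beta"])  # autocorrelation of successive draws
```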
An alternative model to investigate the relationship between Actual and Survey (HIS) is OLS regression through the origin; the results below show a comparison between OLS regression and Bayesian regression.
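For reference, the classical through-origin fit takes one line in base R (again using the placeholder `survey` and `actual` vectors introduced above):

```r
## Classical OLS regression through the origin.
ols <- lm(actual ~ 0 + survey)  # '0 +' drops the intercept
summary(ols)$coefficients       # slope estimate and its standard error
```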
The analysis shows that the through-origin OLS regression coefficient has a larger standard error than that obtained from the Bayesian approach to the same model.
Box, G. E., & Tiao, G. C. (1973). Bayesian inference in statistical analysis. Reading, MA: Addison-Wesley.
Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2004). Bayesian data analysis, 2nd edn. London: Chapman & Hall.
Lee, P. M. (2004). Bayesian statistics: an introduction, 3rd edn. London: Hodder Arnold.
Zellner, A. (1971). An introduction to Bayesian inference in econometrics. New York: Wiley.