The purpose of this assignment is to gain practical experience using JAGS on common statistical methods we have come across: fitting linear regression models with and without interactions, Bayes factors, and paired t-tests.
This dataset aims to estimate the biomass of paddocks using satellite measurements of the reflected wavelengths of green, blue, red, and near-infrared (henceforth NIR) light.
| DryMatter | drymatter | NDVI | Date | Paddock | blue | green | red | nir |
|---|---|---|---|---|---|---|---|---|
| 1875 | 1536.238 | 0.6785089 | 2017-10-13 | A12 | 325.0920 | 532.5517 | 520.3084 | 2716.536 |
| 2250 | 1748.723 | 0.7144538 | 2017-10-18 | A12 | 335.7911 | 551.1741 | 528.0619 | 3170.547 |
| 2550 | 2012.143 | 0.7427501 | 2018-02-15 | A12 | 435.8410 | 592.2337 | 509.5287 | 3451.824 |
| 1950 | 1815.039 | 0.6794601 | 2018-05-03 | A12 | 554.5660 | 672.9560 | 601.7782 | 3153.000 |
| 1950 | 1879.764 | 0.7208708 | 2018-05-03 | A12 | 388.9216 | 649.5143 | 614.5010 | 3788.486 |
| 2700 | 2393.333 | 0.8187152 | 2018-05-28 | A12 | 218.9326 | 498.8568 | 442.2653 | 4436.966 |
Other variables include DryMatter (ground biomass measurement), drymatter (the current algorithm's estimate of biomass), Date (date of the estimate), Paddock (paddock on the farm), and blue, green, red, nir (the respective reflected wavelengths).
As shown in the correlation matrix below, there is a strong positive correlation between DryMatter and the biomass estimate from the current algorithm, as expected. NIR by itself also has a strong positive correlation with biomass, while the RGB bands all have weaker negative correlations. There also appears to be multicollinearity between the predictors, especially among the RGB reflected wavelengths, which is possibly an indication that not all variables will be needed in the model.
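A minimal sketch of how such a correlation matrix can be produced, assuming the cleaned data are in a data frame called All_Clean with the columns shown in the table above:

# Correlation matrix of the biomass measures and the reflectance bands
round(cor(All_Clean[, c("DryMatter", "drymatter", "nir", "red", "green", "blue")]), 2)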
The graph below shows that NIR reflects the most light but also has the most variance relative to the RGB bands. Its strong linear relationship shows that as NIR increases, biomass increases. The RGB bands, on the other hand, have smaller variances and weak negative relationships, so as they increase, biomass decreases (very slightly).
Here we separate out the variables from the data that we will use later. We also mean-centre the data at this point, as it helps with thinning later on.
Y <- All_Clean$DryMatter - mean(All_Clean$DryMatter)
b1_nir <- All_Clean$nir - mean(All_Clean$nir)
b2_red <- All_Clean$red - mean(All_Clean$red)
b3_green <- All_Clean$green - mean(All_Clean$green)
b4_blue <- All_Clean$blue - mean(All_Clean$blue)
n <- length(Y)
all_clean_centered = data.frame(cbind(Y,b1_nir,b2_red,b3_green,b4_blue))
colnames(all_clean_centered) = c("DryMatter","nir","red","green","blue")

In setting up our model we must first know the form it should follow. Being a linear regression with four predictor variables, the model takes the form:
\[ y = \beta_{0} + x_{1}\beta_{1} + x_{2}\beta_{2} + x_{3}\beta_{3} + x_{4}\beta_{4} + \epsilon, \;\;\;\;\; \epsilon \sim N(0, \sigma^2) \]

Now we should specify the priors that will make up the model. For the \(\beta\) coefficients we have no knowledge of what values they should take, so I set the prior to be a normal distribution centred at zero. The normal distribution allows all possible values, and its spread can be set depending on how large a distance I expect the MCMC will need to move. Initially I set the precision to 0.001 but found my values didn't converge anywhere close to the lm() fit, so I later changed it to 0.00001 to allow more movement.
Note: I made this value smaller because in JAGS the second parameter of dnorm is the precision (inverse variance); outside JAGS this corresponds to making the standard deviation larger.
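As a quick sanity check of what these precisions imply (a small illustrative calculation, not part of the original analysis):

# In JAGS, dnorm(mean, precision) uses precision = 1 / sd^2, so sd = 1 / sqrt(precision)
1 / sqrt(0.001)    # ~ 31.6  (the first, tighter prior)
1 / sqrt(0.00001)  # ~ 316.2 (the looser prior actually used)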
The prior for the inverse variance (and hence for sigma, via \(\sigma = 1/\sqrt{\text{inv.var}}\)) has been taken from a \(Gamma(0.001, 0.001)\) distribution, as it allows all possible positive values but tends to favour smaller ones. Its density is shown in the image below:
Gamma(0.001, 0.001)
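A minimal sketch of how such a density curve could be reproduced in base R (the plotting range is an arbitrary choice for illustration):

# Density of the Gamma(0.001, 0.001) prior on the inverse variance;
# most of the mass piles up near zero
curve(dgamma(x, shape = 0.001, rate = 0.001), from = 0.001, to = 2,
      xlab = "inv.var", ylab = "density", main = "Gamma(0.001, 0.001)")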
model_string <- "model{
# Likelihood
for(i in 1:n){
Y[i] ~ dnorm(mu[i],inv.var)
mu[i] <- beta_intercept + beta_nir*b1_nir[i] + beta_red*b2_red[i] + beta_green*b3_green[i] + beta_blue*b4_blue[i]
}
# Prior for beta
beta_intercept ~ dnorm(0,0.00001)
beta_nir ~ dnorm(0,0.00001)
beta_red ~ dnorm(0,0.00001)
beta_green ~ dnorm(0,0.00001)
beta_blue ~ dnorm(0,0.00001)
# Prior for the inverse variance
inv.var ~ dgamma(0.001, 0.001)
sigma <- 1/sqrt(inv.var)
}"model <- jags.model(textConnection(model_string),
data = list(Y=Y,n=n,b1_nir=b1_nir,b2_red=b2_red,b3_green=b3_green,b4_blue=b4_blue))## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 897
## Unobserved stochastic nodes: 6
## Total graph size: 8983
##
## Initializing model
We initialise with a small burn-in, no thinning, and 100,000 steps. Before mean-centring, I found the required thinning was extremely large, around 4,000, so to get roughly 5,000 samples I would have needed about 20 million steps. After mean-centring the data the required thinning became a lot more reasonable. As my laptop cannot handle 20 million steps, I took 100,000 steps, and the results were still very accurate.
burn_in = 1
thin = 1
variable_names = c('beta_intercept','beta_nir','beta_red','beta_green','beta_blue', 'sigma')
steps = 100000
update(model, burn_in)
draw = jags.samples(model, steps, thin=thin, variable.names = variable_names)
# Convert to a list
make_list <- function(draw)
{
results = list()
for(name in names(draw))
{
# Extract "chain 1"
results[[name]] = as.array(draw[[name]][,,1])
# Transpose 2D arrays
if(length(dim(results[[name]])) == 2)
results[[name]] = t(results[[name]])
}
return(results)
}
results = make_list(draw)

| beta_blue | beta_green | beta_intercept | beta_nir | beta_red | sigma | |
|---|---|---|---|---|---|---|
| Min. :0.7434 | Min. :-7.67126 | Min. :-39.13509 | Min. :0.3983 | Min. :-2.0949 | Min. :241.2 | |
| 1st Qu.:3.0356 | 1st Qu.:-5.79094 | 1st Qu.: -6.04993 | 1st Qu.:0.4973 | 1st Qu.: 0.2805 | 1st Qu.:265.0 | |
| Median :3.2192 | Median :-5.39336 | Median : 0.00535 | Median :0.5099 | Median : 0.5033 | Median :269.3 | |
| Mean :3.2193 | Mean :-5.38676 | Mean : -0.00290 | Mean :0.5099 | Mean : 0.5001 | Mean :269.4 | |
| 3rd Qu.:3.4042 | 3rd Qu.:-4.98457 | 3rd Qu.: 6.05209 | 3rd Qu.:0.5226 | 3rd Qu.: 0.7221 | 3rd Qu.:273.6 | |
| Max. :4.3583 | Max. : 0.03094 | Max. : 36.22369 | Max. :0.5913 | Max. : 1.7239 | Max. :300.2 |
After inspecting the chains I adjusted the burn-in to 1,000 (a little large, but it does no harm), and the thinning should be around 150. However, to get a sample of around 5,000 from my 100,000 steps I can only thin by 20, which leaves more autocorrelation in the final samples. In practice I would run more steps so the retained samples have as little autocorrelation as possible.
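A minimal sketch of how a suitable thinning interval can be judged from the autocorrelation of a single chain, assuming the results list produced above (the choice of beta_green is arbitrary):

# The lag at which the autocorrelation dies away to roughly zero
# suggests a reasonable thinning interval
acf(results$beta_green, lag.max = 500, main = "ACF of beta_green chain")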
| beta_blue | beta_green | beta_intercept | beta_nir | beta_red | sigma | |
|---|---|---|---|---|---|---|
| Min. :2.174 | Min. :-7.606 | Min. :-32.25904 | Min. :0.4432 | Min. :-0.5665 | Min. :241.2 | |
| 1st Qu.:3.040 | 1st Qu.:-5.801 | 1st Qu.: -5.95349 | 1st Qu.:0.4977 | 1st Qu.: 0.2828 | 1st Qu.:265.0 | |
| Median :3.219 | Median :-5.390 | Median : 0.21901 | Median :0.5099 | Median : 0.4983 | Median :269.3 | |
| Mean :3.222 | Mean :-5.392 | Mean : 0.07197 | Mean :0.5102 | Mean : 0.5018 | Mean :269.4 | |
| 3rd Qu.:3.409 | 3rd Qu.:-4.982 | 3rd Qu.: 6.08858 | 3rd Qu.:0.5228 | 3rd Qu.: 0.7278 | 3rd Qu.:273.4 | |
| Max. :4.229 | Max. :-3.369 | Max. : 32.39724 | Max. :0.5776 | Max. : 1.6195 | Max. :296.8 |
The posterior summaries come out almost exactly the same as the lm() fit below; the main exception is the intercept, where the lm() estimate is essentially zero while the posterior mean sits slightly away from it. All in all, a very good match.
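A minimal sketch of how the two sets of estimates can be placed side by side, assuming the centred data frame all_clean_centered and the results list from above:

# Classical least-squares fit on the centred data for comparison
lm_fit <- lm(DryMatter ~ nir + red + green + blue, data = all_clean_centered)

# Posterior means of the corresponding JAGS coefficients
jags_means <- c(intercept = mean(results$beta_intercept),
                nir       = mean(results$beta_nir),
                red       = mean(results$beta_red),
                green     = mean(results$beta_green),
                blue      = mean(results$beta_blue))

round(cbind(lm = coef(lm_fit), jags = jags_means), 4)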
##
## Call:
## lm(formula = DryMatter ~ nir + red + green + blue, data = all_clean_centered)
##
## Residuals:
## Min 1Q Median 3Q Max
## -981.8 -193.0 -9.7 166.2 1041.8
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.102e-13 8.986e+00 0.000 1.000
## nir 5.096e-01 1.868e-02 27.281 <2e-16 ***
## red 4.915e-01 3.264e-01 1.506 0.133
## green -5.370e+00 5.936e-01 -9.046 <2e-16 ***
## blue 3.212e+00 2.716e-01 11.827 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 269.1 on 892 degrees of freedom
## Multiple R-squared: 0.6034, Adjusted R-squared: 0.6017
## F-statistic: 339.3 on 4 and 892 DF, p-value: < 2.2e-16
This model is set up in much the same way as the one above, with the same priors and the same reasoning. As you can see, I have now added two new beta coefficients for the interactions between NIR and red and between blue and green, with priors similar to the rest of the \(\beta\) coefficients.
model_string1 <- "model{
# Likelihood
for(i in 1:n){
Y[i] ~ dnorm(mu[i],inv.var)
mu[i] <- beta_intercept + beta_nir*b1_nir[i] + beta_red*b2_red[i] + beta_green*b3_green[i] + beta_blue*b4_blue[i] + beta_nir_red*b1_nir[i]*b2_red[i] + beta_blue_green*b4_blue[i]*b3_green[i]
}
# Prior for beta
beta_intercept ~ dnorm(0,0.00001)
beta_nir ~ dnorm(0,0.00001)
beta_red ~ dnorm(0,0.00001)
beta_green ~ dnorm(0,0.00001)
beta_blue ~ dnorm(0,0.00001)
beta_nir_red ~ dnorm(0,0.00001)
beta_blue_green ~ dnorm(0,0.00001)
# Prior for the inverse variance
inv.var ~ dgamma(0.001, 0.001)
sigma <- 1/sqrt(inv.var)
}"model <- jags.model(textConnection(model_string1),
data = list(Y=Y,n=n,b1_nir=b1_nir,b2_red=b2_red,b3_green=b3_green,b4_blue=b4_blue))## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 897
## Unobserved stochastic nodes: 8
## Total graph size: 10779
##
## Initializing model
The sampling set-up is similar again:
burn_in = 1
thin = 1
variable_names = c('beta_intercept','beta_nir','beta_red','beta_green','beta_blue','beta_nir_red','beta_blue_green', 'sigma')
steps = 100000
update(model, burn_in)
draw = jags.samples(model, steps, thin=thin, variable.names = variable_names)
# Convert to a list
make_list <- function(draw)
{
results2 = list()
for(name in names(draw))
{
# Extract "chain 1"
results2[[name]] = as.array(draw[[name]][,,1])
# Transpose 2D arrays
if(length(dim(results2[[name]])) == 2)
results2[[name]] = t(results2[[name]])
}
return(results2)
}
results2 = make_list(draw)

| beta_blue | beta_blue_green | beta_green | beta_intercept | beta_nir | beta_nir_red | beta_red | sigma | |
|---|---|---|---|---|---|---|---|---|
| Min. :0.4784 | Min. :-0.0035710 | Min. :-7.1653 | Min. :-38.644 | Min. :0.3902 | Min. :-4.367e-05 | Min. :-2.39708 | Min. :242.6 | |
| 1st Qu.:2.7987 | 1st Qu.:-0.0011480 | 1st Qu.:-5.3436 | 1st Qu.: 3.747 | 1st Qu.:0.4868 | 1st Qu.: 4.349e-04 | 1st Qu.: 0.05318 | 1st Qu.:262.8 | |
| Median :2.9852 | Median :-0.0006934 | Median :-4.9451 | Median : 11.658 | Median :0.4995 | Median : 5.209e-04 | Median : 0.28233 | Median :267.1 | |
| Mean :2.9838 | Mean :-0.0006914 | Mean :-4.9388 | Mean : 11.617 | Mean :0.4995 | Mean : 5.209e-04 | Mean : 0.27983 | Mean :267.2 | |
| 3rd Qu.:3.1693 | 3rd Qu.:-0.0002370 | 3rd Qu.:-4.5312 | 3rd Qu.: 19.559 | 3rd Qu.:0.5123 | 3rd Qu.: 6.070e-04 | 3rd Qu.: 0.50768 | 3rd Qu.:271.4 | |
| Max. :4.2369 | Max. : 0.0023117 | Max. : 0.2695 | Max. : 61.439 | Max. :0.5841 | Max. : 1.167e-03 | Max. : 1.56821 | Max. :295.3 |
Very similar to the simple linear model: the burn-in will remain the same as before, and the thinning should again be around 150. To get a sample of around 5,000 from my 100,000 steps I again thin by 20, which leaves more autocorrelation in the final samples. In practice I would run more steps so the retained samples have as little autocorrelation as possible.
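As an alternative to reading autocorrelation plots, the effective sample size of each raw chain can be estimated with the coda package (an illustrative check, assuming the results2 list from above):

library(coda)

# Effective number of independent draws in each raw chain of 100,000 steps;
# a small value relative to the chain length indicates heavy thinning is needed
sapply(results2[c("beta_nir", "beta_red", "beta_green", "beta_blue")],
       function(chain) effectiveSize(mcmc(chain)))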
| beta_blue | beta_blue_green | beta_green | beta_intercept | beta_nir | beta_nir_red | beta_red | sigma | |
|---|---|---|---|---|---|---|---|---|
| Min. :2.077 | Min. :-0.0027952 | Min. :-6.889 | Min. :-32.569 | Min. :0.4301 | Min. :0.0000370 | Min. :-0.86943 | Min. :246.7 | |
| 1st Qu.:2.803 | 1st Qu.:-0.0011440 | 1st Qu.:-5.340 | 1st Qu.: 3.291 | 1st Qu.:0.4863 | 1st Qu.:0.0004331 | 1st Qu.: 0.05632 | 1st Qu.:262.7 | |
| Median :2.983 | Median :-0.0006795 | Median :-4.948 | Median : 11.227 | Median :0.4993 | Median :0.0005216 | Median : 0.28182 | Median :267.0 | |
| Mean :2.987 | Mean :-0.0006807 | Mean :-4.945 | Mean : 11.299 | Mean :0.4994 | Mean :0.0005223 | Mean : 0.27996 | Mean :267.1 | |
| 3rd Qu.:3.165 | 3rd Qu.:-0.0002384 | 3rd Qu.:-4.537 | 3rd Qu.: 19.326 | 3rd Qu.:0.5120 | 3rd Qu.:0.0006088 | 3rd Qu.: 0.50476 | 3rd Qu.:271.5 | |
| Max. :3.909 | Max. : 0.0017977 | Max. :-2.680 | Max. : 55.482 | Max. :0.5758 | Max. :0.0009891 | Max. : 1.30904 | Max. :288.6 |
Again all the estimates are very close, with only slight deviations between the two models.
##
## Call:
## lm(formula = DryMatter ~ nir * red + green * blue, data = all_clean_centered)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1038.95 -192.58 -6.58 175.33 1000.53
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11.6508802 11.7829044 0.989 0.323
## nir 0.5003528 0.0188874 26.491 < 2e-16 ***
## red 0.3020214 0.3330745 0.907 0.365
## green -4.9792709 0.5977240 -8.330 3.03e-16 ***
## blue 2.9995726 0.2745099 10.927 < 2e-16 ***
## nir:red 0.0005193 0.0001276 4.070 5.12e-05 ***
## green:blue -0.0006955 0.0006761 -1.029 0.304
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 267 on 890 degrees of freedom
## Multiple R-squared: 0.6107, Adjusted R-squared: 0.6081
## F-statistic: 232.7 on 6 and 890 DF, p-value: < 2.2e-16
We obtain our Bayes factor from the ratio of the integrated likelihoods, where the integrated likelihood of model \(m_{i}\) is:
\[P(y \mid m_{i}) = \int f(y \mid m_{i},\theta_{i})\,\pi(\theta_{i} \mid m_{i})\,d\theta_{i}\]

Here \(m_{i}\) is one of our two models: the Bayesian regression model and the Bayesian regression model with interactions. The Bayes factor then simplifies to:
\[B_{12} = \frac{P(m_{1} \mid y)\,/\,P(m_{2} \mid y)}{P(m_{1})\,/\,P(m_{2})} = \frac{P(y \mid m_{1})}{P(y \mid m_{2})}\]
We can estimate the integrated likelihood by Monte Carlo: draw samples from the prior and average the likelihood over them,

\[\theta_{i}^{(j)} \sim \pi(\theta_{i} \mid m_{i}), \quad j = 1,\dots,N\]

\[P(y \mid m_{i}) \approx \frac{1}{N}\sum^{N}_{j=1} f(y \mid m_{i},\theta_{i}^{(j)})\]

We can now compare our result to the Bayes factor table given by Jeffreys (1961). Our value, falling between -1.1 and 1.1, is "not worth more than a bare mention", meaning the better model is the simpler one, since there is no evidence that the two are meaningfully different. This is evidence in favour of the plain linear model over the model with interactions.
# Linear Model
y_given_m1 = c()
number_rows_m1 = length(mcmc_samples[,1])
for (i in 1:length(mcmc_samples[1,])){
y_given_m1[i] = sum(mcmc_samples[,i][-number_rows_m1])
sigma_given_m1 = mcmc_samples[,i][number_rows_m1]
}
# Interaction model
y_given_m2 = c()
number_rows_m2 = length(mcmc_samples1[,1])
for (i in 1:length(mcmc_samples1[1,])){
y_given_m2[i] = sum(mcmc_samples1[,i][-number_rows_m2])
sigma_given_m2 = mcmc_samples1[,i][number_rows_m2]
}
# log likelihood
loglik_m1 = dnorm(All_Clean$DryMatter, y_given_m1,sigma_given_m1,log=TRUE)
loglik_m2 = dnorm(All_Clean$DryMatter, y_given_m2,sigma_given_m2,log=TRUE)
#
m1 = sum(loglik_m1)/length(All_Clean$DryMatter)
m2 = sum(loglik_m2)/length(All_Clean$DryMatter)
bf = m1/m2
bf

## [1] 0.9820489
Jeffreys (1961)
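For reference, a minimal sketch of the prior-sampling Monte Carlo estimator described by the equations above, written for the no-interaction model and assuming the centred vectors Y, b1_nir, b2_red, b3_green and b4_blue from earlier. This is illustrative only and is not the calculation that produced the value reported above; with priors this vague the raw estimator is extremely noisy, so in practice far more draws (or a different estimator) would be needed.

set.seed(1)
N <- 10000
log_f <- numeric(N)
for (j in 1:N) {
  # Draw one parameter vector from the priors used in the JAGS model
  beta <- rnorm(5, 0, 1 / sqrt(0.00001))  # intercept, nir, red, green, blue
  sig  <- 1 / sqrt(rgamma(1, 0.001, 0.001))
  mu   <- beta[1] + beta[2] * b1_nir + beta[3] * b2_red +
          beta[4] * b3_green + beta[5] * b4_blue
  log_f[j] <- sum(dnorm(Y, mu, sig, log = TRUE))
}
# Log of the Monte Carlo average, computed stably on the log scale
log_py_m1 <- max(log_f) + log(mean(exp(log_f - max(log_f))))
log_py_m1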
In class we went over how we could look at the distribution of the difference and check whether zero falls within the 95% credible interval. As we can see, this confirms the result we found using the Bayes factor: there is evidence to believe the models are not significantly different.
First we plot our data to get a visual understanding of what we are performing the paired t-test on; visually we can see a slight difference between the two sets of estimates. Before we test whether this difference is significant we must make sure our assumptions are met. The data are paired, as the two dry-matter values were derived from the same observations, and our \(n\) is sufficiently large (n > 30). An additional Shapiro-Wilk test gives a p-value < 0.05, which formally rejects exact normality of the differences, but with such a large \(n\) the paired t-test remains reasonable.
The p-value of the paired t-test is less than 2.2e-16, which is below the significance level alpha = 0.05. We therefore reject the null hypothesis and conclude that the ground measurements are significantly different from the satellite estimates.
d = All_Clean$DryMatter - All_Clean$drymatter
shapiro.test(d)

##
## Shapiro-Wilk normality test
##
## data: d
## W = 0.99499, p-value = 0.004746
t.test(All_Clean$DryMatter, All_Clean$drymatter, paired = TRUE, alternative = "two.sided")

##
## Paired t-test
##
## data: All_Clean$DryMatter and All_Clean$drymatter
## t = 9.8319, df = 896, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 72.85716 109.19874
## sample estimates:
## mean of the differences
## 91.02795
In this model we specify two normal distributions (as we assume normality), with \(\mu_{1}\) for the ground DryMatter and \(\mu_{2}\) for the satellite drymatter. The prior for sigma is set on the log scale, \(\log\sigma \sim dunif(-5,5)\), so that sigma is always positive once exponentiated. After running the model I found this prior didn't allow it to converge, as the true sigma evidently lay outside the implied range, so I widened it to \(dunif(-8,8)\). Since the null hypothesis is our default position and is the hypothesis of no difference, we give \(\mu_{1}\) and \(\mu_{2}\) the same prior, with common mean \(m\) and standard deviation \(s\), where \(\log s\) gets a similar but slightly narrower uniform prior than \(\log\sigma\); keeping \(s\) modest lets the two means stay close to one another, which represents the null hypothesis better.
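A small illustrative calculation (not part of the original analysis) of the range of sigma each log-uniform prior allows, which makes the convergence problem plain given that the posterior for sigma ends up near 400:

# Range of sigma implied by log(sigma) ~ Uniform(a, b) is (exp(a), exp(b))
exp(c(-5, 5))  # ~ (0.0067, 148.4)  -- too narrow: excludes sigma near 400
exp(c(-8, 8))  # ~ (0.0003, 2981.0) -- wide enough to contain it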
model = "model
{
for (i in 1:length(x1)) {
x1[i]~dnorm(mu1,(1/(sigma^2)))
x2[i]~dnorm(mu2,(1/(sigma^2)))
}
m ~ dnorm(0, (1/(100^2))) ## N(m,s^2) prior for mu1, mu2
log_s ~ dunif(-3,3) ## m and s^2 are hyper-parameters
s <- exp(log_s)
mu1 ~ dnorm(m, (1/(s^2))) #prior for mu1 and mu2
mu2 ~ dnorm(m, (1/(s^2)))
log_sig ~ dunif(-8,8) #prior for common sigma
sigma<-exp(log_sig)
}
"
data = list(x1=All_Clean$DryMatter, x2=All_Clean$drymatter)
# Variables to monitor
variable_names = c('mu1','mu2','sigma')

Starting off with a small burn-in, then checking it in the following step.
# How many burn-in steps?
burn_in = 1
# How many proper steps?
steps = 100000
# Thinning?
thin = 1
# Write model out to file
fileConn=file("model.temp")
writeLines(model, fileConn)
close(fileConn)
if(all(is.na(data)))
{
m = jags.model(file="model.temp")
} else
{
m = jags.model(file="model.temp", data=data)
}

## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 1794
## Unobserved stochastic nodes: 5
## Total graph size: 1815
##
## Initializing model
update(m, burn_in)
draw = jags.samples(m, steps, thin=thin, variable.names = variable_names)
# Convert to a list
make_list <- function(draw)
{
results = list()
for(name in names(draw))
{
# Extract "chain 1"
results[[name]] = as.array(draw[[name]][,,1])
# Transpose 2D arrays
if(length(dim(results[[name]])) == 2)
results[[name]] = t(results[[name]])
}
return(results)
}
results3 = make_list(draw)

We can see our burn-in at 100 is satisfactory and a thinning of 20 looks appropriate.
| mu1 | mu2 | sigma | |
|---|---|---|---|
| Min. :2092 | Min. :2030 | Min. :370.7 | |
| 1st Qu.:2125 | 1st Qu.:2065 | 1st Qu.:388.7 | |
| Median :2133 | Median :2073 | Median :393.1 | |
| Mean :2133 | Mean :2073 | Mean :393.1 | |
| 3rd Qu.:2141 | 3rd Qu.:2081 | 3rd Qu.:397.5 | |
| Max. :2181 | Max. :2122 | Max. :419.7 |
We can see that zero does not fall within our 95% credible interval, indicating that there is evidence to reject the null hypothesis that the means are the same. This confirms what we established using the paired t-test earlier.
| mu_diff | |
|---|---|
| Min. : 5.567 | |
| 1st Qu.: 8.037 | |
| Median : 8.604 | |
| Mean : 8.614 | |
| 3rd Qu.: 9.198 | |
| Max. :11.755 |
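One way such a check can be carried out with the monitored chains is sketched below, assuming the results3 list from above (the mu_diff summarised in the table may have been constructed slightly differently):

# Central 95% credible interval for the difference in means;
# the null value of zero is compared against this interval
quantile(results3$mu1 - results3$mu2, probs = c(0.025, 0.975))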