model_string2 <- "model{
# Likelihood of the data: Bernouilli distribution
for (i in 1:length(y)) {
y[i] ~ dbern(p[i])
logit(p[i]) = int + b[1]*gravity[i] + b[2]*cond[i] + b[3]*calc[i]
}
# Noninformative Normal distribution prior for the intercept coefficient
int ~ dnorm(0.0, 1.0/25)
# Prior for the individual beta terms:
# switch from the double exponential (Laplace) to a fairly noninformative Normal distribution for the coefficients
for (j in 1:3) {
b[j] ~ dnorm(0.0, sqrt(2))
}
}"
set.seed(92)
mod2 <- jags.model(textConnection(model_string2), data = data_jags,
n.chains = 3)Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 77
Unobserved stochastic nodes: 4
Total graph size: 637
Initializing model
Iterations = 1001:6000
Thinning interval = 1
Number of chains = 3
Sample size per chain = 5000
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
b[1] 0.9888 0.3833 0.003130 0.005586
b[2] -0.8223 0.3454 0.002820 0.005271
b[3] 1.3377 0.3807 0.003108 0.004614
int -0.2252 0.2917 0.002382 0.003155
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
b[1] 0.2735 0.7234 0.9821 1.24396 1.7751
b[2] -1.5106 -1.0525 -0.8146 -0.58993 -0.1628
b[3] 0.6264 1.0734 1.3237 1.58713 2.1119
int -0.7987 -0.4223 -0.2253 -0.02749 0.3409
0 1
FALSE 39 12
TRUE 5 21
[1] 0.7792208
Now suppose that it is considered really bad to predict no formation of calcium oxalate crystals when there in fact is one (i.e. false negatives), then you might choose to lower the threshold for classifying data points as 1s (say, change it to 0.3). That is, if the model says the probability is greater than 0.3, it will classify it as having a calcium oxalate crystal. In that case, the model classifies with an accuracy of about 75-76%, so classification accuracy decreased a bit, but it did indeed increase your chances of detecting a true positive. Anyway, this was an in-sample accuracy value; to get a less biased assessment of how well your model performs, it would be better to use the model with training data to make predictions on a testing dataset and evaluate accuracy once again.
0 1
FALSE 30 4
TRUE 14 29
[1] 0.7662338