1 Baye’s Theorem

\[\overbrace{p(\theta/D)}^{Posterior}=\frac{\overbrace{p(D/\theta)}^{Likelihood}.\overbrace{p(\theta)}^{Prior}}{\underbrace{p(D)}_{Evidence}}\]

2 Data

SolarRadPrediction <- read_csv("Data/SolarRadPrediction.csv", 
    col_types = cols(Data = col_datetime(format = "%m/%d/%Y %H:%M:%S %p"), 
        Time = col_time(format = "%H:%M:%S"), 
        TimeSunRise = col_time(format = "%H:%M:%S"), 
        TimeSunSet = col_time(format = "%H:%M:%S")))

df <- SolarRadPrediction[, -1]
colnames(df)[1] <- "Date"
colnames(df)[7] <- "WindDirection"
colnames(df)[8] <- "WindSpeed"

## Select numerical data for only model fitting part
df1 <- df %>%
  select(Radiation, Temperature, Pressure, Humidity, WindDirection, WindSpeed)
summary(df1)

##    Radiation        Temperature      Pressure        Humidity     
##  Min.   :   1.11   Min.   :34.0   Min.   :30.19   Min.   :  8.00  
##  1st Qu.:   1.23   1st Qu.:46.0   1st Qu.:30.40   1st Qu.: 56.00  
##  Median :   2.66   Median :50.0   Median :30.43   Median : 85.00  
##  Mean   : 207.12   Mean   :51.1   Mean   :30.42   Mean   : 75.02  
##  3rd Qu.: 354.24   3rd Qu.:55.0   3rd Qu.:30.46   3rd Qu.: 97.00  
##  Max.   :1601.26   Max.   :71.0   Max.   :30.56   Max.   :103.00  
##  WindDirection      WindSpeed     
##  Min.   :  0.09   Min.   : 0.000  
##  1st Qu.: 82.23   1st Qu.: 3.370  
##  Median :147.70   Median : 5.620  
##  Mean   :143.49   Mean   : 6.244  
##  3rd Qu.:179.31   3rd Qu.: 7.870  
##  Max.   :359.95   Max.   :40.500

str(df1)

## tibble [32,686 x 6] (S3: tbl_df/tbl/data.frame)
##  $ Radiation    : num [1:32686] 1.21 1.21 1.23 1.21 1.17 1.21 1.2 1.24 1.23 1.21 ...
##  $ Temperature  : num [1:32686] 48 48 48 48 48 48 49 49 49 49 ...
##  $ Pressure     : num [1:32686] 30.5 30.5 30.5 30.5 30.5 ...
##  $ Humidity     : num [1:32686] 59 58 57 60 62 64 72 71 80 85 ...
##  $ WindDirection: num [1:32686] 177 177 159 138 105 ...
##  $ WindSpeed    : num [1:32686] 5.62 3.37 3.37 3.37 5.62 5.62 6.75 5.62 4.5 4.5 ...

Looking at from the summary data we can see that there is no missing value problem for the data set.

3 Model Fitting

From our data we are going to fit a model to predict Radiation. So we take Radiation as out dependent variable and others as independent variables.

3.1 Linear Regression Model

#pairs(df1)
model_freq <- lm(Radiation ~ ., data = df1)
summary(model_freq)

## 
## Call:
## lm(formula = Radiation ~ ., data = df1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -752.37 -132.65  -20.19  102.98 1121.15 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    2.098e+04  6.959e+02  30.153  < 2e-16 ***
## Temperature    3.837e+01  2.064e-01 185.930  < 2e-16 ***
## Pressure      -7.471e+02  2.290e+01 -32.622  < 2e-16 ***
## Humidity      -2.691e-01  4.844e-02  -5.555 2.79e-08 ***
## WindDirection -2.694e-01  1.462e-02 -18.432  < 2e-16 ***
## WindSpeed      7.875e+00  3.416e-01  23.053  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 208.2 on 32680 degrees of freedom
## Multiple R-squared:  0.5659, Adjusted R-squared:  0.5659 
## F-statistic:  8522 on 5 and 32680 DF,  p-value: < 2.2e-16

tidy(model_freq)

term	estimate	std.error	statistic
(Intercept)	20983.8434973	695.9213762	30.152607
Temperature	38.3721364	0.2063796	185.929918
Pressure	-747.0686858	22.9010571	-32.621581
Humidity	-0.2691005	0.0484402	-5.555311
WindDirection	-0.2694445	0.0146181	-18.432218
WindSpeed	7.8748844	0.3416022	23.052793

From the model summary we can see all the regressors are significant (by p-value).

3.1.1 Model Selection

step(lm(Radiation ~ 1, data = df1), 
     direction = "forward", 
     scope = ~ Temperature+Pressure+Humidity+WindDirection+WindSpeed)

## Start:  AIC=376248.1
## Radiation ~ 1
## 
##                 Df  Sum of Sq        RSS    AIC
## + Temperature    1 1762032872 1500033533 350857
## + WindDirection  1  173049157 3089017249 374468
## + Humidity       1  166865010 3095201395 374534
## + Pressure       1   46206279 3215860126 375784
## + WindSpeed      1   17683387 3244383019 376072
## <none>                        3262066405 376248
## 
## Step:  AIC=350857.2
## Radiation ~ Temperature
## 
##                 Df Sum of Sq        RSS    AIC
## + Pressure       1  43450981 1456582552 349898
## + WindSpeed      1  30563243 1469470290 350186
## + WindDirection  1   5501379 1494532154 350739
## + Humidity       1    986446 1499047087 350838
## <none>                       1500033533 350857
## 
## Step:  AIC=349898.4
## Radiation ~ Temperature + Pressure
## 
##                 Df Sum of Sq        RSS    AIC
## + WindSpeed      1  25312425 1431270128 349327
## + WindDirection  1  11943121 1444639431 349631
## + Humidity       1   3976752 1452605800 349811
## <none>                       1456582552 349898
## 
## Step:  AIC=349327.4
## Radiation ~ Temperature + Pressure + WindSpeed
## 
##                 Df Sum of Sq        RSS    AIC
## + WindDirection  1  13998169 1417271959 349008
## + Humidity       1    615021 1430655107 349315
## <none>                       1431270128 349327
## 
## Step:  AIC=349008.2
## Radiation ~ Temperature + Pressure + WindSpeed + WindDirection
## 
##            Df Sum of Sq        RSS    AIC
## + Humidity  1   1337143 1415934816 348979
## <none>                  1417271959 349008
## 
## Step:  AIC=348979.3
## Radiation ~ Temperature + Pressure + WindSpeed + WindDirection + 
##     Humidity

## 
## Call:
## lm(formula = Radiation ~ Temperature + Pressure + WindSpeed + 
##     WindDirection + Humidity, data = df1)
## 
## Coefficients:
##   (Intercept)    Temperature       Pressure      WindSpeed  WindDirection  
##    20983.8435        38.3721      -747.0687         7.8749        -0.2694  
##      Humidity  
##       -0.2691

From the model selection choose the lowest AIC value model. So the full model is the best model.

3.1.2 Linear Regression Assumption check ?

par(mfrow = c(2, 2))
plot(model_freq)

From above fitted plots we can convence that the full model has satisfied the assumptions.

3.2 Bayesian Regression

Here we use the function stan_glm from the rstanarm package.
In this function,
- family: default use gaussian distribution.
- prior: default normal prior is used. If we need a flat uniform prior we put it to be NULL.
- prior_intercept: can be normal, student_t, or cauchy. Here also if we need a flat uniform prior we put it to be NULL.

# Fitting bayesian model & remember 1234
bayesian_model <- stan_glm(Radiation ~ ., data = df1, seed = 1234)

## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.121 seconds (Warm-up)
## Chain 1:                3.007 seconds (Sampling)
## Chain 1:                3.128 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 0 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.168 seconds (Warm-up)
## Chain 2:                3 seconds (Sampling)
## Chain 2:                3.168 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 0 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.142 seconds (Warm-up)
## Chain 3:                2.998 seconds (Sampling)
## Chain 3:                3.14 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 0 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.135 seconds (Warm-up)
## Chain 4:                2.985 seconds (Sampling)
## Chain 4:                3.12 seconds (Total)
## Chain 4:

#summary(bayesian_model)
print(bayesian_model, digits = 3)

## stan_glm
##  family:       gaussian [identity]
##  formula:      Radiation ~ .
##  observations: 32686
##  predictors:   6
## ------
##               Median    MAD_SD   
## (Intercept)   20998.757   704.571
## Temperature      38.373     0.211
## Pressure       -747.457    23.291
## Humidity         -0.269     0.050
## WindDirection    -0.270     0.014
## WindSpeed         7.863     0.338
## 
## Auxiliary parameter(s):
##       Median  MAD_SD 
## sigma 208.152   0.788
## 
## ------
## * For help interpreting the printed output see ?print.stanreg
## * For info on the priors used see ?prior_summary.stanreg

From this output, Median estimator is the median computed from the MCMC simulation.
Also MAD_SD is median absolute deviation computed from the same simulation.
To know more we plot bayesian plots.

# Temperature
p1 <- mcmc_dens(x = bayesian_model, pars = c("Temperature")) +
  vline_at(v = 38.373, col = "red")
# Pressure
p2 <- mcmc_dens(x = bayesian_model, pars = c("Pressure")) +
  vline_at(v = -747.457, col = "red")
# Humidity
p3 <- mcmc_dens(x = bayesian_model, pars = c("Humidity")) +
  vline_at(v = -0.269, col = "red")
# WindDirection
p4 <- mcmc_dens(x = bayesian_model, pars = c("WindDirection")) +
  vline_at(v = -0.270, col = "red")
# WindSpeed
p5 <- mcmc_dens(x = bayesian_model, pars = c("WindSpeed")) +
  vline_at(v = 7.863, col = "red")

ggarrange(p1,p2,p3,p4,p5)

Point estimates of variables falls on the median of this distribution.

3.2.1 Evaluate Model Parameter

describe_posterior(bayesian_model)

	Parameter	Median	CI	CI_low	CI_high	pd	ROPE_CI	ROPE_low	ROPE_high	ROPE_Percentage	Rhat	ESS
1	(Intercept)	20998.7566265	89	19939.2044911	22192.5496212	1	89	-31.59164	31.59164	0	0.9995457	4587.185
4	Temperature	38.3730415	89	38.0401060	38.7029959	1	89	-31.59164	31.59164	0	0.9992385	4383.438
3	Pressure	-747.4572219	89	-786.9067070	-712.7380389	1	89	-31.59164	31.59164	0	0.9995353	4597.143
2	Humidity	-0.2692544	89	-0.3420957	-0.1841775	1	89	-31.59164	31.59164	1	0.9992849	4391.914
5	WindDirection	-0.2698797	89	-0.2924052	-0.2459000	1	89	-31.59164	31.59164	1	1.0001522	4880.826
6	WindSpeed	7.8633510	89	7.3266028	8.4053666	1	89	-31.59164	31.59164	1	0.9995283	5318.336

In above description,
- Median: Median estimator is the median computed from the MCMC simulation.
- 89% CI: Creadible Interval, used to quantify the uncertainty about the regression coefficients. with 89% probability (given the data) that a coefficient lies above the CI_low value and under CI_high value.
- pd: Probability of Direction, which is the probability that the effect goes to the positive or to the negative direction, and it is considered as the best equivalent for the p-value.
- 89% ROPE: Region of Practice Equavilance.
- Rhat: scale reduction factor \(\hat R\).
- ESS: effective sample size.

poster <- get_parameters(bayesian_model)
print(purrr::map_dbl(poster,median),digits = 3)

##   (Intercept)   Temperature      Pressure      Humidity WindDirection 
##     20998.757        38.373      -747.457        -0.269        -0.270 
##     WindSpeed 
##         7.863

print(purrr::map_dbl(poster, map_estimate),digits = 3)

##   (Intercept)   Temperature      Pressure      Humidity WindDirection 
##     20973.002        38.374      -746.499        -0.272        -0.271 
##     WindSpeed 
##         7.841

print(purrr::map_dbl(poster, mean),digits = 3)

##   (Intercept)   Temperature      Pressure      Humidity WindDirection 
##     20996.666        38.373      -747.488        -0.269        -0.270 
##     WindSpeed 
##         7.869

As we see the values are closer to each other due to the like normality of the distribution of the posteriors where all the central statistics (mean, median, mode) are closer to each other.

p1 <- mcmc_dens(bayesian_model, pars=c("Temperature"))+
  vline_at(median(poster$Temperature), col="red")+
  vline_at(mean(poster$Temperature), col="yellow")+
  vline_at(map_estimate(poster$Temperature), col="green")

p2 <- mcmc_dens(bayesian_model, pars=c("Pressure"))+
  vline_at(median(poster$Pressure), col="red")+
  vline_at(mean(poster$Pressure), col="yellow")+
  vline_at(map_estimate(poster$Pressure), col="green")

p3 <- mcmc_dens(bayesian_model, pars=c("Humidity"))+
  vline_at(median(poster$Humidity), col="red")+
  vline_at(mean(poster$Humidity), col="yellow")+
  vline_at(map_estimate(poster$Humidity), col="green")

p4 <- mcmc_dens(bayesian_model, pars=c("WindDirection"))+
  vline_at(median(poster$WindDirection), col="red")+
  vline_at(mean(poster$WindDirection), col="yellow")+
  vline_at(map_estimate(poster$WindDirection), col="green")

p5 <- mcmc_dens(bayesian_model, pars=c("WindSpeed"))+
  vline_at(median(poster$WindSpeed), col="red")+
  vline_at(mean(poster$WindSpeed), col="yellow")+
  vline_at(map_estimate(poster$WindSpeed), col="green")

ggarrange(p1,p2,p3,p4,p5)

As expected they are approximately on top of each other.

3.3 Bayesian Inferences

We need to check the significance of Bayesian Regression coefficients.
That is done by checking whether the corresponding credible interval contains zero or not, if no then this coefficient is significant.
Now let’s see significance of out model coefficients.

hdi(bayesian_model)

	Parameter	CI	CI_low	CI_high	Effects	Component
1	(Intercept)	89	19939.2044911	22192.5496212	fixed	conditional
4	Temperature	89	38.0401060	38.7029959	fixed	conditional
3	Pressure	89	-786.9067070	-712.7380389	fixed	conditional
2	Humidity	89	-0.3420957	-0.1841775	fixed	conditional
5	WindDirection	89	-0.2924052	-0.2459000	fixed	conditional
6	WindSpeed	89	7.3266028	8.4053666	fixed	conditional

eti(bayesian_model)

	Parameter	CI	CI_low	CI_high	Effects	Component
1	(Intercept)	89	19863.9719033	22132.6123101	fixed	conditional
4	Temperature	89	38.0404871	38.7037444	fixed	conditional
3	Pressure	89	-784.6436591	-710.2065951	fixed	conditional
2	Humidity	89	-0.3492317	-0.1898372	fixed	conditional
5	WindDirection	89	-0.2929182	-0.2463227	fixed	conditional
6	WindSpeed	89	7.3341335	8.4143058	fixed	conditional

From the both result we can see all the coefficients are significant.
NOTE: We got the satisfied results duo to the normal prior assumption. But in real world it is less often to be sure about the normality assumption.
Another way of testing significance is it is less often to be sure about the normality assumption.

rope(poster$Temperature)

CI	ROPE_low	ROPE_high	ROPE_Percentage
89	-0.1	0.1	0

rope(poster$Pressure)

CI	ROPE_low	ROPE_high	ROPE_Percentage
89	-0.1	0.1	0

rope(poster$Humidity)

CI	ROPE_low	ROPE_high	ROPE_Percentage
89	-0.1	0.1	0

rope(poster$WindDirection)

CI	ROPE_low	ROPE_high	ROPE_Percentage
89	-0.1	0.1	0

rope(poster$WindSpeed)

CI	ROPE_low	ROPE_high	ROPE_Percentage
89	-0.1	0.1	0

rope(poster$`(Intercept)`)

CI	ROPE_low	ROPE_high	ROPE_Percentage
89	-0.1	0.1	0

rope_range(bayesian_model)

## [1] -31.59164  31.59164

For all the variables including intercept, almost all the credible interval (HDI) is outside the ROPE range, which means that coefficient is highly significant.

3.4 PD and P-value

Interest in checking the direction of the coefficient.
This done using pd statistic.
If the pd is high value means that the associated effect is concentrated on the same side as the median.
In our model almost all the posteriors are on same side.

df2 <- select(tidy(model_freq), c(term,p.value))
df2$p.value <- round(df2$p.value, digits = 3)
df3 <- 1 - purrr::map_dbl(poster, p_direction)
df4 <- cbind(df2,df3)
df4

	term	p.value	df3
(Intercept)	(Intercept)	0	0
Temperature	Temperature	0	0
Pressure	Pressure	0	0
Humidity	Humidity	0	0
WindDirection	WindDirection	0	0
WindSpeed	WindSpeed	0	0

4 Questions I got ???

When we fitting the bayesian model, how we put prior, family detail.
- i.e. How we know about the family, prior details.

ST406 - Bayesian LRM

P.I.N.Kehelbedda

2021-01-14

1 Baye’s Theorem

2 Data

3 Model Fitting

3.1 Linear Regression Model

3.1.1 Model Selection

3.1.2 Linear Regression Assumption check ?

3.2 Bayesian Regression

3.2.1 Evaluate Model Parameter

3.3 Bayesian Inferences

3.4 PD and P-value

4 Questions I got ???