1 Baye’s Theorem
\[\overbrace{p(\theta/D)}^{Posterior}=\frac{\overbrace{p(D/\theta)}^{Likelihood}.\overbrace{p(\theta)}^{Prior}}{\underbrace{p(D)}_{Evidence}}\]
2 Data
SolarRadPrediction <- read_csv("Data/SolarRadPrediction.csv",
col_types = cols(Data = col_datetime(format = "%m/%d/%Y %H:%M:%S %p"),
Time = col_time(format = "%H:%M:%S"),
TimeSunRise = col_time(format = "%H:%M:%S"),
TimeSunSet = col_time(format = "%H:%M:%S")))
df <- SolarRadPrediction[, -1]
colnames(df)[1] <- "Date"
colnames(df)[7] <- "WindDirection"
colnames(df)[8] <- "WindSpeed"
## Select numerical data for only model fitting part
df1 <- df %>%
select(Radiation, Temperature, Pressure, Humidity, WindDirection, WindSpeed)
summary(df1)## Radiation Temperature Pressure Humidity
## Min. : 1.11 Min. :34.0 Min. :30.19 Min. : 8.00
## 1st Qu.: 1.23 1st Qu.:46.0 1st Qu.:30.40 1st Qu.: 56.00
## Median : 2.66 Median :50.0 Median :30.43 Median : 85.00
## Mean : 207.12 Mean :51.1 Mean :30.42 Mean : 75.02
## 3rd Qu.: 354.24 3rd Qu.:55.0 3rd Qu.:30.46 3rd Qu.: 97.00
## Max. :1601.26 Max. :71.0 Max. :30.56 Max. :103.00
## WindDirection WindSpeed
## Min. : 0.09 Min. : 0.000
## 1st Qu.: 82.23 1st Qu.: 3.370
## Median :147.70 Median : 5.620
## Mean :143.49 Mean : 6.244
## 3rd Qu.:179.31 3rd Qu.: 7.870
## Max. :359.95 Max. :40.500
## tibble [32,686 x 6] (S3: tbl_df/tbl/data.frame)
## $ Radiation : num [1:32686] 1.21 1.21 1.23 1.21 1.17 1.21 1.2 1.24 1.23 1.21 ...
## $ Temperature : num [1:32686] 48 48 48 48 48 48 49 49 49 49 ...
## $ Pressure : num [1:32686] 30.5 30.5 30.5 30.5 30.5 ...
## $ Humidity : num [1:32686] 59 58 57 60 62 64 72 71 80 85 ...
## $ WindDirection: num [1:32686] 177 177 159 138 105 ...
## $ WindSpeed : num [1:32686] 5.62 3.37 3.37 3.37 5.62 5.62 6.75 5.62 4.5 4.5 ...
- Looking at from the summary data we can see that there is no missing value problem for the data set.
3 Model Fitting
- From our data we are going to fit a model to predict Radiation. So we take Radiation as out dependent variable and others as independent variables.
3.1 Linear Regression Model
##
## Call:
## lm(formula = Radiation ~ ., data = df1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -752.37 -132.65 -20.19 102.98 1121.15
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.098e+04 6.959e+02 30.153 < 2e-16 ***
## Temperature 3.837e+01 2.064e-01 185.930 < 2e-16 ***
## Pressure -7.471e+02 2.290e+01 -32.622 < 2e-16 ***
## Humidity -2.691e-01 4.844e-02 -5.555 2.79e-08 ***
## WindDirection -2.694e-01 1.462e-02 -18.432 < 2e-16 ***
## WindSpeed 7.875e+00 3.416e-01 23.053 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 208.2 on 32680 degrees of freedom
## Multiple R-squared: 0.5659, Adjusted R-squared: 0.5659
## F-statistic: 8522 on 5 and 32680 DF, p-value: < 2.2e-16
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 20983.8434973 | 695.9213762 | 30.152607 | 0 |
| Temperature | 38.3721364 | 0.2063796 | 185.929918 | 0 |
| Pressure | -747.0686858 | 22.9010571 | -32.621581 | 0 |
| Humidity | -0.2691005 | 0.0484402 | -5.555311 | 0 |
| WindDirection | -0.2694445 | 0.0146181 | -18.432218 | 0 |
| WindSpeed | 7.8748844 | 0.3416022 | 23.052793 | 0 |
- From the model summary we can see all the regressors are significant (by p-value).
3.1.1 Model Selection
step(lm(Radiation ~ 1, data = df1),
direction = "forward",
scope = ~ Temperature+Pressure+Humidity+WindDirection+WindSpeed)## Start: AIC=376248.1
## Radiation ~ 1
##
## Df Sum of Sq RSS AIC
## + Temperature 1 1762032872 1500033533 350857
## + WindDirection 1 173049157 3089017249 374468
## + Humidity 1 166865010 3095201395 374534
## + Pressure 1 46206279 3215860126 375784
## + WindSpeed 1 17683387 3244383019 376072
## <none> 3262066405 376248
##
## Step: AIC=350857.2
## Radiation ~ Temperature
##
## Df Sum of Sq RSS AIC
## + Pressure 1 43450981 1456582552 349898
## + WindSpeed 1 30563243 1469470290 350186
## + WindDirection 1 5501379 1494532154 350739
## + Humidity 1 986446 1499047087 350838
## <none> 1500033533 350857
##
## Step: AIC=349898.4
## Radiation ~ Temperature + Pressure
##
## Df Sum of Sq RSS AIC
## + WindSpeed 1 25312425 1431270128 349327
## + WindDirection 1 11943121 1444639431 349631
## + Humidity 1 3976752 1452605800 349811
## <none> 1456582552 349898
##
## Step: AIC=349327.4
## Radiation ~ Temperature + Pressure + WindSpeed
##
## Df Sum of Sq RSS AIC
## + WindDirection 1 13998169 1417271959 349008
## + Humidity 1 615021 1430655107 349315
## <none> 1431270128 349327
##
## Step: AIC=349008.2
## Radiation ~ Temperature + Pressure + WindSpeed + WindDirection
##
## Df Sum of Sq RSS AIC
## + Humidity 1 1337143 1415934816 348979
## <none> 1417271959 349008
##
## Step: AIC=348979.3
## Radiation ~ Temperature + Pressure + WindSpeed + WindDirection +
## Humidity
##
## Call:
## lm(formula = Radiation ~ Temperature + Pressure + WindSpeed +
## WindDirection + Humidity, data = df1)
##
## Coefficients:
## (Intercept) Temperature Pressure WindSpeed WindDirection
## 20983.8435 38.3721 -747.0687 7.8749 -0.2694
## Humidity
## -0.2691
- From the model selection choose the lowest AIC value model. So the full model is the best model.
3.2 Bayesian Regression
- Here we use the function stan_glm from the rstanarm package.
- In this function,
- family: default use gaussian distribution.
- prior: default normal prior is used. If we need a flat uniform prior we put it to be NULL.
- prior_intercept: can be normal, student_t, or cauchy. Here also if we need a flat uniform prior we put it to be NULL.
# Fitting bayesian model & remember 1234
bayesian_model <- stan_glm(Radiation ~ ., data = df1, seed = 1234) ##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.121 seconds (Warm-up)
## Chain 1: 3.007 seconds (Sampling)
## Chain 1: 3.128 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 0 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.168 seconds (Warm-up)
## Chain 2: 3 seconds (Sampling)
## Chain 2: 3.168 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 0 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 3: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 3: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 3: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 3: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 0.142 seconds (Warm-up)
## Chain 3: 2.998 seconds (Sampling)
## Chain 3: 3.14 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 0 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 4: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 4: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 4: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 4: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 0.135 seconds (Warm-up)
## Chain 4: 2.985 seconds (Sampling)
## Chain 4: 3.12 seconds (Total)
## Chain 4:
## stan_glm
## family: gaussian [identity]
## formula: Radiation ~ .
## observations: 32686
## predictors: 6
## ------
## Median MAD_SD
## (Intercept) 20998.757 704.571
## Temperature 38.373 0.211
## Pressure -747.457 23.291
## Humidity -0.269 0.050
## WindDirection -0.270 0.014
## WindSpeed 7.863 0.338
##
## Auxiliary parameter(s):
## Median MAD_SD
## sigma 208.152 0.788
##
## ------
## * For help interpreting the printed output see ?print.stanreg
## * For info on the priors used see ?prior_summary.stanreg
- From this output, Median estimator is the median computed from the MCMC simulation.
- Also MAD_SD is median absolute deviation computed from the same simulation.
- To know more we plot bayesian plots.
# Temperature
p1 <- mcmc_dens(x = bayesian_model, pars = c("Temperature")) +
vline_at(v = 38.373, col = "red")
# Pressure
p2 <- mcmc_dens(x = bayesian_model, pars = c("Pressure")) +
vline_at(v = -747.457, col = "red")
# Humidity
p3 <- mcmc_dens(x = bayesian_model, pars = c("Humidity")) +
vline_at(v = -0.269, col = "red")
# WindDirection
p4 <- mcmc_dens(x = bayesian_model, pars = c("WindDirection")) +
vline_at(v = -0.270, col = "red")
# WindSpeed
p5 <- mcmc_dens(x = bayesian_model, pars = c("WindSpeed")) +
vline_at(v = 7.863, col = "red")
ggarrange(p1,p2,p3,p4,p5)- Point estimates of variables falls on the median of this distribution.
3.2.1 Evaluate Model Parameter
| Parameter | Median | CI | CI_low | CI_high | pd | ROPE_CI | ROPE_low | ROPE_high | ROPE_Percentage | Rhat | ESS | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | (Intercept) | 20998.7566265 | 89 | 19939.2044911 | 22192.5496212 | 1 | 89 | -31.59164 | 31.59164 | 0 | 0.9995457 | 4587.185 |
| 4 | Temperature | 38.3730415 | 89 | 38.0401060 | 38.7029959 | 1 | 89 | -31.59164 | 31.59164 | 0 | 0.9992385 | 4383.438 |
| 3 | Pressure | -747.4572219 | 89 | -786.9067070 | -712.7380389 | 1 | 89 | -31.59164 | 31.59164 | 0 | 0.9995353 | 4597.143 |
| 2 | Humidity | -0.2692544 | 89 | -0.3420957 | -0.1841775 | 1 | 89 | -31.59164 | 31.59164 | 1 | 0.9992849 | 4391.914 |
| 5 | WindDirection | -0.2698797 | 89 | -0.2924052 | -0.2459000 | 1 | 89 | -31.59164 | 31.59164 | 1 | 1.0001522 | 4880.826 |
| 6 | WindSpeed | 7.8633510 | 89 | 7.3266028 | 8.4053666 | 1 | 89 | -31.59164 | 31.59164 | 1 | 0.9995283 | 5318.336 |
- In above description,
- Median: Median estimator is the median computed from the MCMC simulation.
- 89% CI: Creadible Interval, used to quantify the uncertainty about the regression coefficients. with 89% probability (given the data) that a coefficient lies above the CI_low value and under CI_high value.
- pd: Probability of Direction, which is the probability that the effect goes to the positive or to the negative direction, and it is considered as the best equivalent for the p-value.
- 89% ROPE: Region of Practice Equavilance.
- Rhat: scale reduction factor \(\hat R\).
- ESS: effective sample size.
## (Intercept) Temperature Pressure Humidity WindDirection
## 20998.757 38.373 -747.457 -0.269 -0.270
## WindSpeed
## 7.863
## (Intercept) Temperature Pressure Humidity WindDirection
## 20973.002 38.374 -746.499 -0.272 -0.271
## WindSpeed
## 7.841
## (Intercept) Temperature Pressure Humidity WindDirection
## 20996.666 38.373 -747.488 -0.269 -0.270
## WindSpeed
## 7.869
- As we see the values are closer to each other due to the like normality of the distribution of the posteriors where all the central statistics (mean, median, mode) are closer to each other.
p1 <- mcmc_dens(bayesian_model, pars=c("Temperature"))+
vline_at(median(poster$Temperature), col="red")+
vline_at(mean(poster$Temperature), col="yellow")+
vline_at(map_estimate(poster$Temperature), col="green")
p2 <- mcmc_dens(bayesian_model, pars=c("Pressure"))+
vline_at(median(poster$Pressure), col="red")+
vline_at(mean(poster$Pressure), col="yellow")+
vline_at(map_estimate(poster$Pressure), col="green")
p3 <- mcmc_dens(bayesian_model, pars=c("Humidity"))+
vline_at(median(poster$Humidity), col="red")+
vline_at(mean(poster$Humidity), col="yellow")+
vline_at(map_estimate(poster$Humidity), col="green")
p4 <- mcmc_dens(bayesian_model, pars=c("WindDirection"))+
vline_at(median(poster$WindDirection), col="red")+
vline_at(mean(poster$WindDirection), col="yellow")+
vline_at(map_estimate(poster$WindDirection), col="green")
p5 <- mcmc_dens(bayesian_model, pars=c("WindSpeed"))+
vline_at(median(poster$WindSpeed), col="red")+
vline_at(mean(poster$WindSpeed), col="yellow")+
vline_at(map_estimate(poster$WindSpeed), col="green")
ggarrange(p1,p2,p3,p4,p5)- As expected they are approximately on top of each other.
3.3 Bayesian Inferences
- We need to check the significance of Bayesian Regression coefficients.
- That is done by checking whether the corresponding credible interval contains zero or not, if no then this coefficient is significant.
- Now let’s see significance of out model coefficients.
| Parameter | CI | CI_low | CI_high | Effects | Component | |
|---|---|---|---|---|---|---|
| 1 | (Intercept) | 89 | 19939.2044911 | 22192.5496212 | fixed | conditional |
| 4 | Temperature | 89 | 38.0401060 | 38.7029959 | fixed | conditional |
| 3 | Pressure | 89 | -786.9067070 | -712.7380389 | fixed | conditional |
| 2 | Humidity | 89 | -0.3420957 | -0.1841775 | fixed | conditional |
| 5 | WindDirection | 89 | -0.2924052 | -0.2459000 | fixed | conditional |
| 6 | WindSpeed | 89 | 7.3266028 | 8.4053666 | fixed | conditional |
| Parameter | CI | CI_low | CI_high | Effects | Component | |
|---|---|---|---|---|---|---|
| 1 | (Intercept) | 89 | 19863.9719033 | 22132.6123101 | fixed | conditional |
| 4 | Temperature | 89 | 38.0404871 | 38.7037444 | fixed | conditional |
| 3 | Pressure | 89 | -784.6436591 | -710.2065951 | fixed | conditional |
| 2 | Humidity | 89 | -0.3492317 | -0.1898372 | fixed | conditional |
| 5 | WindDirection | 89 | -0.2929182 | -0.2463227 | fixed | conditional |
| 6 | WindSpeed | 89 | 7.3341335 | 8.4143058 | fixed | conditional |
From the both result we can see all the coefficients are significant.
NOTE: We got the satisfied results duo to the normal prior assumption. But in real world it is less often to be sure about the normality assumption.
Another way of testing significance is it is less often to be sure about the normality assumption.
| CI | ROPE_low | ROPE_high | ROPE_Percentage |
|---|---|---|---|
| 89 | -0.1 | 0.1 | 0 |
| CI | ROPE_low | ROPE_high | ROPE_Percentage |
|---|---|---|---|
| 89 | -0.1 | 0.1 | 0 |
| CI | ROPE_low | ROPE_high | ROPE_Percentage |
|---|---|---|---|
| 89 | -0.1 | 0.1 | 0 |
| CI | ROPE_low | ROPE_high | ROPE_Percentage |
|---|---|---|---|
| 89 | -0.1 | 0.1 | 0 |
| CI | ROPE_low | ROPE_high | ROPE_Percentage |
|---|---|---|---|
| 89 | -0.1 | 0.1 | 0 |
| CI | ROPE_low | ROPE_high | ROPE_Percentage |
|---|---|---|---|
| 89 | -0.1 | 0.1 | 0 |
## [1] -31.59164 31.59164
- For all the variables including intercept, almost all the credible interval (HDI) is outside the ROPE range, which means that coefficient is highly significant.
3.4 PD and P-value
- Interest in checking the direction of the coefficient.
- This done using pd statistic.
- If the pd is high value means that the associated effect is concentrated on the same side as the median.
- In our model almost all the posteriors are on same side.
df2 <- select(tidy(model_freq), c(term,p.value))
df2$p.value <- round(df2$p.value, digits = 3)
df3 <- 1 - purrr::map_dbl(poster, p_direction)
df4 <- cbind(df2,df3)
df4| term | p.value | df3 | |
|---|---|---|---|
| (Intercept) | (Intercept) | 0 | 0 |
| Temperature | Temperature | 0 | 0 |
| Pressure | Pressure | 0 | 0 |
| Humidity | Humidity | 0 | 0 |
| WindDirection | WindDirection | 0 | 0 |
| WindSpeed | WindSpeed | 0 | 0 |
4 Questions I got ???
- When we fitting the bayesian model, how we put prior, family detail.
- i.e. How we know about the family, prior details.