[1] "Computing LOOs..."
|
| | 0%
|
|= | 1%
|
|= | 2%
|
|== | 2%
|
|== | 3%
|
|== | 4%
|
|=== | 4%
|
|==== | 5%
|
|==== | 6%
|
|===== | 6%
|
|===== | 7%
|
|===== | 8%
|
|====== | 8%
|
|====== | 9%
|
|======= | 10%
|
|======== | 11%
|
|======== | 12%
|
|========= | 12%
|
|========= | 13%
|
|========= | 14%
|
|========== | 14%
|
|========== | 15%
|
|=========== | 16%
|
|============ | 16%
|
|============ | 17%
|
|============ | 18%
|
|============= | 18%
|
|============= | 19%
|
|============== | 20%
|
|=============== | 21%
|
|=============== | 22%
|
|================ | 22%
|
|================ | 23%
|
|================ | 24%
|
|================= | 24%
|
|================== | 25%
|
|================== | 26%
|
|=================== | 26%
|
|=================== | 27%
|
|=================== | 28%
|
|==================== | 28%
|
|==================== | 29%
|
|===================== | 30%
|
|====================== | 31%
|
|====================== | 32%
|
|======================= | 32%
|
|======================= | 33%
|
|======================= | 34%
|
|======================== | 34%
|
|======================== | 35%
|
|========================= | 36%
|
|========================== | 36%
|
|========================== | 37%
|
|========================== | 38%
|
|=========================== | 38%
|
|=========================== | 39%
|
|============================ | 40%
|
|============================= | 41%
|
|============================= | 42%
|
|============================== | 42%
|
|============================== | 43%
|
|============================== | 44%
|
|=============================== | 44%
|
|================================ | 45%
|
|================================ | 46%
|
|================================= | 46%
|
|================================= | 47%
|
|================================= | 48%
|
|================================== | 48%
|
|================================== | 49%
|
|=================================== | 50%
|
|==================================== | 51%
|
|==================================== | 52%
|
|===================================== | 52%
|
|===================================== | 53%
|
|===================================== | 54%
|
|====================================== | 54%
|
|====================================== | 55%
|
|======================================= | 56%
|
|======================================== | 56%
|
|======================================== | 57%
|
|======================================== | 58%
|
|========================================= | 58%
|
|========================================= | 59%
|
|========================================== | 60%
|
|=========================================== | 61%
|
|=========================================== | 62%
|
|============================================ | 62%
|
|============================================ | 63%
|
|============================================ | 64%
|
|============================================= | 64%
|
|============================================== | 65%
|
|============================================== | 66%
|
|=============================================== | 66%
|
|=============================================== | 67%
|
|=============================================== | 68%
|
|================================================ | 68%
|
|================================================ | 69%
|
|================================================= | 70%
|
|================================================== | 71%
|
|================================================== | 72%
|
|=================================================== | 72%
|
|=================================================== | 73%
|
|=================================================== | 74%
|
|==================================================== | 74%
|
|==================================================== | 75%
|
|===================================================== | 76%
|
|====================================================== | 76%
|
|====================================================== | 77%
|
|====================================================== | 78%
|
|======================================================= | 78%
|
|======================================================= | 79%
|
|======================================================== | 80%
|
|========================================================= | 81%
|
|========================================================= | 82%
|
|========================================================== | 82%
|
|========================================================== | 83%
|
|========================================================== | 84%
|
|=========================================================== | 84%
|
|============================================================ | 85%
|
|============================================================ | 86%
|
|============================================================= | 86%
|
|============================================================= | 87%
|
|============================================================= | 88%
|
|============================================================== | 88%
|
|============================================================== | 89%
|
|=============================================================== | 90%
|
|================================================================ | 91%
|
|================================================================ | 92%
|
|================================================================= | 92%
|
|================================================================= | 93%
|
|================================================================= | 94%
|
|================================================================== | 94%
|
|================================================================== | 95%
|
|=================================================================== | 96%
|
|==================================================================== | 96%
|
|==================================================================== | 97%
|
|==================================================================== | 98%
|
|===================================================================== | 98%
|
|===================================================================== | 99%
|
|======================================================================| 100%
[1] "Performing the selection using all the data.."
[1] "20% of terms selected."
[1] "50% of terms selected."
[1] "70% of terms selected."
[1] "100% of terms selected."
[1] "Done."
---
title: "Chart Stack (Scrolling)"
author: "Hyunsu Ju"
output:
  flexdashboard::flex_dashboard:
    theme: sandstone
    vertical_layout: scroll
    social: ["facebook", "twitter", "menu"]
    source_code: embed
    navbar:
      - {title: "About", href: "https://stat.columbia.edu/~gelman/research/unpublished/loo_stan.pdf", align: left}
---
```{r setup, include=FALSE}
library(flexdashboard)
library(rstanarm)
options(mc.cores = parallel::detectCores())
library(loo)
library(tidyverse)
library(GGally)
library(bayesplot)
theme_set(bayesplot::theme_default())
library(projpred)
SEED=87
```
Page 1
===========
### Chart A
```{r, fig.width=15, fig.height=7}
set.seed(SEED)
df <- tibble(
  pos.tot = runif(200, min = 0.8, max = 1.0),
  urban.tot = pmin(runif(200, min = 0.0, max = 0.02), 1.0 - pos.tot),
  neg.tot = (1.0 - pmin(pos.tot + urban.tot, 1)),
  x1 = pmax(pos.tot - runif(200, min = 0.05, max = 0.30), 0),
  x3 = pmax(neg.tot - runif(200, min = 0.0, max = 0.10), 0),
  x2 = pmax(pos.tot - x1 - x3/2, 0),
  x4 = pmax(1 - x1 - x2 - x3 - urban.tot, 0))
# true model and 200 Poisson observations
mean.y <- exp(-5.8 + 6.3*df$x1 + 15.2*df$x2)
df$y <- rpois(200,mean.y)
ggpairs(df,diag=list(continuous="barDiag"))
```
### Bayesian inference
From this point onwards we switch to a Bayesian approach. The rstanarm package provides the stan_glm function, which accepts the same arguments as glm but performs full Bayesian inference using Stan (mc-stan.org). By default, a weakly informative Gaussian prior is placed on the regression weights.
```{r}
fitg <- stan_glm(y ~ x1 + x2 + x3 + x4, data = df, na.action = na.fail, family=poisson(), QR=TRUE, seed=SEED)
```
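To see exactly which weakly informative priors rstanarm used by default for this fit, `prior_summary()` can be called on the fitted object (a quick check added for illustration):

```{r}
# Inspect the default priors actually used for fitg
prior_summary(fitg)
```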
### Summary
```{r}
summary(fitg)
```
### Data Table
```{r}
DT::datatable(df,options=list(bPaginate=FALSE))
```
### Collinearity problem as with MLEs

Because the simulated shares x1, ..., x4 sum to approximately one, the covariates are highly collinear, so the marginal posteriors of the coefficients are very wide, just as the corresponding MLEs would have large standard errors.
```{r}
mcmc_areas(as.matrix(fitg),prob_outer = .99)
```
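A quick numerical check of this collinearity (a small sketch added here, not part of the original analysis) is the posterior correlation matrix of the coefficients:

```{r}
# Posterior correlations of the regression coefficients;
# large off-diagonal values reflect the collinearity of x1..x4
round(cor(as.matrix(fitg)[, c("x1", "x2", "x3", "x4")]), 2)
```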
### Improvement on predictions
With even more variables, some relevant and some irrelevant, it becomes difficult to analyse the joint posterior to see which variables are jointly relevant. We can, however, easily test whether any of the covariates carry useful information by using cross-validation to compare the full model to a null model.
```{r}
fitg0 <- stan_glm(y ~ 1, data = df, na.action = na.fail, family=poisson(), seed=SEED)
(loog <- loo(fitg))
(loog0 <- loo(fitg0))
loo_compare(loog0, loog)
```
Based on cross-validation, the covariates together contain significant information that improves predictions.
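A rough way to quantify this (a sketch using the normal approximation to the elpd difference, with the `loog0` and `loog` objects computed above):

```{r}
# elpd difference of the null model relative to the full model,
# expressed in units of its standard error (a large negative value
# indicates the covariates clearly improve predictions)
cmp <- loo_compare(loog0, loog)
cmp[2, "elpd_diff"] / cmp[2, "se_diff"]
```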
### Variable selection
Piironen and Vehtari (2017) also show that a projection predictive approach can be used for model reduction, that is, choosing a smaller model with some coefficients set to 0. The projection predictive approach solves the problem of how to do inference after selection: the full-model posterior is projected onto the restricted subspace. See Piironen et al. (2018) for details.
We perform the projection predictive variable selection using the previous full model. A fast leave-one-out cross-validation approach (Vehtari, Gelman and Gabry, 2017) is used to choose the model size.
```{r}
library(projpred)
fitg_cv <- cv_varsel(fitg, method='forward', cv_method='LOO',cores=16)
```
### Recommendation for the model size based on LOO
```{r}
plot(fitg_cv)
```
### Selection model results
```{r}
(nv <- suggest_size(fitg_cv, alpha=0.1))
projg <- project(fitg_cv, nv = nv, ns = 4000)
round(colMeans(as.matrix(projg)),1)
mcmc_areas(as.matrix(projg) ,prob_outer = .99)
```
### Crude map of the world with capital cities:
```{r}
library(ggplot2)
library(plotly)
library(maps)
library(rbokeh)
data(world.cities)
caps <- subset(world.cities, capital == 1)
caps$population <- prettyNum(caps$pop, big.mark = ",")
figure(width = 800, height = 450, padding_factor = 0) %>%
  ly_map("world", col = "gray") %>%
  ly_points(long, lat, data = caps, size = 5,
            hover = c(name, country.etc, population))
```
### HighChart
A dynamic `highcharter` (Highcharts) chart of the Palmer penguins data.
```{r}
library(highcharter)
data(penguins, package = "palmerpenguins")
hchart(penguins, "scatter", hcaes(x=flipper_length_mm, y=bill_length_mm, group=species))
```
Page 2
===========
Row1
-----------------
### Dose-response analysis (LD50)
```{r}
exp.dat = matrix(c(49.1, 53, 56.9, 60.8, 64.8, 68.7, 72.6, 76.5,
                   59, 60, 62, 56, 63, 59, 62, 60,
                   6, 13, 18, 28, 52, 53, 61, 60,
                   .102, .217, .29, .5, .825, .898, .984, 1), ncol = 4)
colnames(exp.dat) = c('dose', 'nexp', 'ndied', 'prop')
exp.dat = as.data.frame(exp.dat)
# compute how many lived
exp.dat$nalive = exp.dat$nexp - exp.dat$ndied
#View(exp.dat)
exp.glm <- glm(cbind(ndied, nalive) ~ dose,
               family = binomial,
               data = exp.dat)
summary(exp.glm)
# Invert the logit link: the dose at which the predicted probability equals `prob`
dose4prob = function(b0, b1, prob){
  d = (-b0 + log(-prob/(prob - 1)))/b1   # equivalent to (qlogis(prob) - b0)/b1
  return(d)
}
dose4prob(coef(exp.glm)[[1]], coef(exp.glm)[[2]], .5)  # estimated LD50
range(exp.dat$dose)
drange = seq(30,90,length=100)
exp.pred = predict(exp.glm, newdata=data.frame(dose=drange))
plot(drange,
     exp(exp.pred)/(1 + exp(exp.pred)),
     type = 'l',
     xlab = 'dose',
     ylab = 'probability')
# add in the observed points
points(prop ~ dose, data = exp.dat, col='red', pch=19)
# compute our LD values
ld50 = dose4prob(coef(exp.glm)[[1]],coef(exp.glm)[[2]],.50)
ld25 = dose4prob(coef(exp.glm)[[1]],coef(exp.glm)[[2]],.25)
# let's add these to our graph for visualization purposes
abline(h=.5,lty='dashed',col='red')
abline(h=.25,lty='dotted',col='blue')
legend('topleft',
       c(expression(LD[50]),
         expression(LD[25])),
       col = c('red','blue'), lty = c('dashed','dotted'),
       inset = .01)
text(ld50,.52,sprintf("%.3f",ld50),pos=c(2),col='red')
text(ld25,.27,sprintf("%.3f",ld25),pos=c(2),col="blue")
```
```{r}
DT::datatable(exp.dat,options=list(bPaginate=FALSE))
```
```{r}
set.seed(SEED)
fitg <- stan_glm(cbind(ndied, nalive) ~ dose,
                 data = exp.dat, na.action = na.fail, family = binomial(), seed = SEED)
summary(glm(cbind(ndied, nalive) ~ dose,
            family = binomial,
            data = exp.dat))
round(coef(fitg), 2)
round(posterior_interval(fitg, prob = 0.9), 2)
fitg0 <- stan_glm(cbind(ndied,nalive) ~ 1, data = exp.dat, na.action = na.fail, family=binomial(), seed=SEED)
(loog <- loo(fitg))
(loog0 <- loo(fitg0))
loo_compare(loog0, loog)
```
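As an aside (a sketch added for illustration, not part of the original analysis), the posterior of the LD50 can be read off directly from the draws of the Bayesian logit fit, reusing the `dose4prob()` helper defined above:

```{r}
# Posterior draws of the LD50 (dose with 50% predicted mortality) under fitg
draws <- as.matrix(fitg)   # columns "(Intercept)" and "dose"
ld50_post <- dose4prob(draws[, "(Intercept)"], draws[, "dose"], 0.5)
quantile(ld50_post, c(0.05, 0.5, 0.95))
```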
```{r}
#Compare the probit
fitg1 <- stan_glm(cbind(ndied,nalive) ~ dose, data = exp.dat, na.action = na.fail, family=binomial(link=probit), seed=SEED)
loog1<-loo(fitg1)
loo_compare(loog1, loog)
round(posterior_interval(fitg1, prob = 0.9), 2)
summary(fitg1)
posterior <- as.array(fitg1)
np <- nuts_params(fitg1)
mcmc_pairs(posterior, pars = c("dose", "(Intercept)"), np = np)
#launch_shinystan(fitg1)
```
Page 3
===========
```{r}
# launch_shinystan(fitg1)
# launch_shinystan(fitg)
url<-"https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv"
Titanic <- read.csv(url, header = TRUE)
#glimpse(Titanic,width = 50)
Titanic$class <- str_extract(Titanic$Pclass, "[0-9]")
TitanicLinear <- stan_glm(Survived ~ Age +
                            Sex + as.factor(class),
                          data = Titanic, family = gaussian)
summary(TitanicLinear)
plot(TitanicLinear)
Titanic_posterior <- TitanicLinear %>% as_tibble() %>%
  rename(sec.class = "as.factor(class)2",
         third.class = "as.factor(class)3")
ggplot(Titanic_posterior, aes(x = third.class)) +
  geom_histogram()
posterior_vs_prior(TitanicLinear)
pp_check(TitanicLinear)
waic(TitanicLinear)
loo(TitanicLinear)
Titanic_probit <- stan_glm(Survived ~ Age +
                             Sex + as.factor(class),
                           data = Titanic, family = binomial(link = probit))
Loo_probit <- loo(Titanic_probit)
Titanic_logit <- stan_glm(Survived ~ Age +
                            Sex + as.factor(class),
                          data = Titanic, family = binomial(link = logit))
Loo_logit <- loo(Titanic_logit)
# loo_compare sorts models best first; a negative elpd_diff means less
# support than the top model
loo_compare(Loo_probit, Loo_logit)
```
Page 4
=============
Row1
------------------
```{r}
library(tidyverse)
library(data.table)
library(dplyr)
library(ggplot2)
library(lubridate)
library(stringr)
library(plotly)
## Reading in files
library(igraph)
library(threejs)
library(ggraph)
library(tidygraph)
start.watch.date <- as.Date('2020-03-01')
watched.countries.count <- 12
covid <- data.table(read.csv('covid_19_data.csv'))
krcovid <- data.table(read.csv('PatientInfo.csv'))
krcovid[,confirmed_date:=as.Date(gsub('2002','2020',as.character(confirmed_date)))]
krcovid[,symptom_onset_date:=as.Date(as.character(symptom_onset_date))]
krcovid[,released_date:=as.Date(as.character(released_date))]
krcovid[,deceased_date:=as.Date(as.character(deceased_date),format='%Y-%m-%d')]
#krcovid[,table(symptom_onset_date==confirmed_date)]
#summary(krcovid)
#head(covid)
krcovid[,closed_date:=coalesce(released_date,deceased_date)]
krcovid[,closed.days:=as.integer(difftime(closed_date,confirmed_date,unit='days'))]
ggplot(krcovid[,.(n=.N,median_age=2020-median(birth_year,na.rm=T)),.(closed.days,confirmed_date,state)][,state:=ifelse(state=='deceased','death','recovered')])+
  geom_point(aes(y=closed.days,x=confirmed_date,size=n,col=state))+
  #geom_text(aes(y=closed.days,x=confirmed_date,label=median_age,col=state,hjust='left'))+
  scale_x_date(date_labels = '%m/%d',date_breaks = '3 days')+
  theme(axis.text.x=element_text(angle = 60, vjust = 0.5))+
  scale_color_brewer(palette = 'Set1')+
  labs(title='Days to close the cases in Korea')
covid[,Last.Update:=as.Date(Last.Update,format='%m/%d/%Y')]
covid[str_length(ObservationDate)==8,ObservationDate1:=as.Date(ObservationDate,format='%m/%d/%y')]
covid[str_length(ObservationDate)==10,ObservationDate1:=as.Date(ObservationDate,format='%m/%d/%Y')]
covid[,ObservationDate:=ObservationDate1]
covid$ObservationDate1=NULL
covid[grep('Korea',Country.Region),Country.Region:='Republic of Korea']
covid[grep('Iran',Country.Region),Country.Region:='Iran']
covid[grep('Taiwan',Province.State),Country.Region:='Taiwan']
covid[ grepl('Diamon',Province.State),Country.Region:='Diamond Princess']
str(covid)
# The JHU data changed the country name for Taiwan, which is unfortunate both for the analysis and for human rights
## quick summary the cases by country
library(conflicted)
conflict_prefer("lag", "dplyr")
conflict_prefer("setdiff", "dplyr")
covid.by.country <- covid[,.(Confirmed=sum(Confirmed),Recovered=sum(Recovered),Deaths=sum(Deaths)),.(ObservationDate,Country.Region)]
setorder(covid.by.country,Country.Region,ObservationDate)
covid.by.country[,Daily.Confirmed:=Confirmed-lag(Confirmed,1),.(Country.Region)]
covid.by.country[,Daily.Recovered:=Recovered-lag(Recovered,1),.(Country.Region)]
covid.by.country[,Daily.Deaths:=Deaths-lag(Deaths,1),.(Country.Region)]
lookup.countries <- covid.by.country[order(-ObservationDate,Country.Region)][,head(.SD,1),.(Country.Region)][order(-Deaths),Country.Region]
lookup.countries <- setdiff(lookup.countries,grep('Diamon',lookup.countries,value=T))
```
Row2
------------------
```{r}
# Cap extreme Daily.Confirmed values (>5000 shown as 2000) to keep the panels readable
ggplot(covid.by.country[grep('Italy|Korea',Country.Region)][,Daily.Confirmed:=ifelse(Daily.Confirmed>5000,2000,Daily.Confirmed)])+
  geom_line(aes(x=ObservationDate,y=Daily.Recovered,col='Recovered'))+
  geom_line(aes(x=ObservationDate,y=Daily.Deaths,col='Deaths'))+
  geom_line(aes(x=ObservationDate,y=Daily.Confirmed,col='Confirmed'))+
  facet_wrap(~Country.Region,scales = 'free',ncol=1)+
  scale_x_date(date_labels = '%m/%d',date_breaks = '7 days')
```
Row3
------------------
```{r}
ggplot(covid.by.country[grep('Italy|Korea',Country.Region)][ObservationDate>'2020-03-01'])+
  geom_bar(aes(x=ObservationDate,y=Daily.Deaths,fill='Deaths'),stat = 'identity')+
  geom_line(aes(x=ObservationDate,y=Daily.Confirmed),col='blue')+
  facet_wrap(~Country.Region,scales = 'free',ncol=1)+
  scale_x_date(date_labels = '%m/%d',date_breaks = '2 days')
```
Row4
------------------
```{r}
ggplot(covid.by.country[Country.Region %in% lookup.countries[1:watched.countries.count]],aes(group=1))+
  geom_line(aes(x=ObservationDate,y=Daily.Recovered,col='Recovered'))+
  geom_line(aes(x=ObservationDate,y=Daily.Deaths,col='Deaths'))+
  geom_line(aes(x=ObservationDate,y=Daily.Confirmed,col='Confirmed'))+
  facet_wrap(~Country.Region,ncol=3,scales='free_y')+
  scale_color_brewer(palette = 'Set1',name='Value Type')+
  scale_x_date(date_breaks = '7 days',date_labels = '%m/%d')+
  theme(axis.text.x=element_text(angle = 90, vjust = 0.5))+
  labs(title='COVID-19 daily cases')
```