Intro to STAN Homework Set #1

After our Intro to Stan lecture I think it would be valuable to have you go through a similar exercise. Let’s test a second research question.

Research question: Is sea ice extent declining in the Southern Hemisphere over time? Is the same pattern happening in the Antarctic as in the Arctic? Fit a Stan model to find out!

Make sure you follow the steps we used in class.

What do your Stan model results indicate so far?

library(rstan)

## Warning: package 'rstan' was built under R version 3.5.3

## Loading required package: StanHeaders

## Warning: package 'StanHeaders' was built under R version 3.5.3

## Loading required package: ggplot2

## Warning: package 'ggplot2' was built under R version 3.5.3

## rstan (Version 2.19.2, GitRev: 2e1f913d3ca3)

## For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores()).
## To avoid recompilation of unchanged Stan programs, we recommend calling
## rstan_options(auto_write = TRUE)

## For improved execution time, we recommend calling
## Sys.setenv(LOCAL_CPPFLAGS = '-march=native')
## although this causes Stan to throw an error on a few processors.

library(gdata)

## Warning: package 'gdata' was built under R version 3.5.2

## gdata: Unable to locate valid perl interpreter
## gdata: 
## gdata: read.xls() will be unable to read Excel XLS and XLSX files
## gdata: unless the 'perl=' argument is used to specify the location
## gdata: of a valid perl intrpreter.
## gdata: 
## gdata: (To avoid display of this message in the future, please
## gdata: ensure perl is installed and available on the executable
## gdata: search path.)

## gdata: Unable to load perl libaries needed by read.xls()
## gdata: to support 'XLX' (Excel 97-2004) files.

##

## gdata: Unable to load perl libaries needed by read.xls()
## gdata: to support 'XLSX' (Excel 2007+) files.

##

## gdata: Run the function 'installXLSXsupport()'
## gdata: to automatically download and install the perl
## gdata: libaries needed to support Excel XLS and XLSX formats.

## 
## Attaching package: 'gdata'

## The following object is masked from 'package:stats':
## 
##     nobs

## The following object is masked from 'package:utils':
## 
##     object.size

## The following object is masked from 'package:base':
## 
##     startsWith

library(bayesplot)

## Warning: package 'bayesplot' was built under R version 3.5.3

## This is bayesplot version 1.7.0

## - Online documentation and vignettes at mc-stan.org/bayesplot

## - bayesplot theme set to bayesplot::theme_default()

##    * Does _not_ affect other ggplot2 plots

##    * See ?bayesplot_theme_set for details on theme setting

library(data.table)

## Warning: package 'data.table' was built under R version 3.5.3

## 
## Attaching package: 'data.table'

## The following objects are masked from 'package:gdata':
## 
##     first, last

Load and Inspect Data

standata <- fread("C:/Users/bhave/Desktop/ANLYHW/seaice.csv")
head(standata)

##    year extent_north extent_south
## 1: 1979       12.328       11.700
## 2: 1980       12.337       11.230
## 3: 1981       12.127       11.435
## 4: 1982       12.447       11.640
## 5: 1983       12.332       11.389
## 6: 1984       11.910       11.454

Plot the data

plot( extent_north ~ year, pch = 20, data =  standata)

Run a geneal linear model using lm()

lm_linear <- lm(extent_north ~ year, data =   standata)
lm_linear

## 
## Call:
## lm(formula = extent_north ~ year, data = standata)
## 
## Coefficients:
## (Intercept)         year  
##   120.50304     -0.05457

Prepare the data, re-run the lm() and extract summary statistics

summary(lm_linear)

## 
## Call:
## lm(formula = extent_north ~ year, data = standata)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.49925 -0.17713  0.04898  0.16923  0.32829 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 120.503036   6.267203   19.23   <2e-16 ***
## year         -0.054574   0.003137  -17.40   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2205 on 37 degrees of freedom
## Multiple R-squared:  0.8911, Adjusted R-squared:  0.8881 
## F-statistic: 302.7 on 1 and 37 DF,  p-value: < 2.2e-16

Displaying the summary above.

Next running plots to visualize the data

plot(extent_north ~ year, pch = 20, data =  standata)
abline(lm_linear , col = 2, lty = 2, lw = 3)

Based on the above graph need to figure out if sea ice is changing from start to end of dataset. Lets index the year to 0. Let’s setup year data from 1 to 30 years.

x_data <- I(standata$year - 1978)
y_data <- standata$extent_north
N_data <- length(standata$year)

Lets rerun the model with linear data

lm_linear2 <- lm(y_data ~ x_data)
summary(lm_linear2)

## 
## Call:
## lm(formula = y_data ~ x_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.49925 -0.17713  0.04898  0.16923  0.32829 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 12.555888   0.071985   174.4   <2e-16 ***
## x_data      -0.054574   0.003137   -17.4   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2205 on 37 degrees of freedom
## Multiple R-squared:  0.8911, Adjusted R-squared:  0.8881 
## F-statistic: 302.7 on 1 and 37 DF,  p-value: < 2.2e-16

Let’s extract statistics from model. This will be helpful to compare stan models

lm_intersect <- summary(lm_linear2)$coeff[1]
lm_slope <- summary(lm_linear2)$coeff[2]
lm_rerror <- sigma(lm_linear2)

Converting into dataframe to be used in stan model.

datastan2 <- list(N = N_data, x = x_data, y = y_data)

Write the Stan model

Let’s build data, block and parameter block.

write("
data {
 int < lower = 1 > N; 
 vector[N] x;
 vector[N] y;
}

parameters {
 real alpha; 
 real beta; 
 real < lower = 0 > sigma; 
}

model {
 y ~ normal(alpha + x * beta , sigma);
}

generated quantities {
}",

"modeldatastan2.stan")

Now checking for stan model is file

stanc("modeldatastan2.stan")

## $status
## [1] TRUE
## 
## $model_cppname
## [1] "model142c43103db_modeldatastan2"
## 
## $cppcode
## [1] "// Code generated by Stan version 2.19.1\n\n#include <stan/model/model_header.hpp>\n\nnamespace model142c43103db_modeldatastan2_namespace {\n\nusing std::istream;\nusing std::string;\nusing std::stringstream;\nusing std::vector;\nusing stan::io::dump;\nusing stan::math::lgamma;\nusing stan::model::prob_grad;\nusing namespace stan::math;\n\nstatic int current_statement_begin__;\n\nstan::io::program_reader prog_reader__() {\n    stan::io::program_reader reader;\n    reader.add_event(0, 0, \"start\", \"model142c43103db_modeldatastan2\");\n    reader.add_event(21, 19, \"end\", \"model142c43103db_modeldatastan2\");\n    return reader;\n}\n\nclass model142c43103db_modeldatastan2 : public prob_grad {\nprivate:\n        int N;\n        vector_d x;\n        vector_d y;\npublic:\n    model142c43103db_modeldatastan2(stan::io::var_context& context__,\n        std::ostream* pstream__ = 0)\n        : prob_grad(0) {\n        ctor_body(context__, 0, pstream__);\n    }\n\n    model142c43103db_modeldatastan2(stan::io::var_context& context__,\n        unsigned int random_seed__,\n        std::ostream* pstream__ = 0)\n        : prob_grad(0) {\n        ctor_body(context__, random_seed__, pstream__);\n    }\n\n    void ctor_body(stan::io::var_context& context__,\n                   unsigned int random_seed__,\n                   std::ostream* pstream__) {\n        typedef double local_scalar_t__;\n\n        boost::ecuyer1988 base_rng__ =\n          stan::services::util::create_rng(random_seed__, 0);\n        (void) base_rng__;  // suppress unused var warning\n\n        current_statement_begin__ = -1;\n\n        static const char* function__ = \"model142c43103db_modeldatastan2_namespace::model142c43103db_modeldatastan2\";\n        (void) function__;  // dummy to suppress unused var warning\n        size_t pos__;\n        (void) pos__;  // dummy to suppress unused var warning\n        std::vector<int> vals_i__;\n        std::vector<double> vals_r__;\n        local_scalar_t__ DUMMY_VAR__(std::numeric_limits<double>::quiet_NaN());\n        (void) DUMMY_VAR__;  // suppress unused var warning\n\n        try {\n            // initialize data block variables from context__\n            current_statement_begin__ = 3;\n            context__.validate_dims(\"data initialization\", \"N\", \"int\", context__.to_vec());\n            N = int(0);\n            vals_i__ = context__.vals_i(\"N\");\n            pos__ = 0;\n            N = vals_i__[pos__++];\n            check_greater_or_equal(function__, \"N\", N, 1);\n\n            current_statement_begin__ = 4;\n            validate_non_negative_index(\"x\", \"N\", N);\n            context__.validate_dims(\"data initialization\", \"x\", \"vector_d\", context__.to_vec(N));\n            x = Eigen::Matrix<double, Eigen::Dynamic, 1>(N);\n            vals_r__ = context__.vals_r(\"x\");\n            pos__ = 0;\n            size_t x_j_1_max__ = N;\n            for (size_t j_1__ = 0; j_1__ < x_j_1_max__; ++j_1__) {\n                x(j_1__) = vals_r__[pos__++];\n            }\n\n            current_statement_begin__ = 5;\n            validate_non_negative_index(\"y\", \"N\", N);\n            context__.validate_dims(\"data initialization\", \"y\", \"vector_d\", context__.to_vec(N));\n            y = Eigen::Matrix<double, Eigen::Dynamic, 1>(N);\n            vals_r__ = context__.vals_r(\"y\");\n            pos__ = 0;\n            size_t y_j_1_max__ = N;\n            for (size_t j_1__ = 0; j_1__ < y_j_1_max__; ++j_1__) {\n                y(j_1__) = vals_r__[pos__++];\n            }\n\n\n            // initialize transformed data variables\n            // execute transformed data statements\n\n            // validate transformed data\n\n            // validate, set parameter ranges\n            num_params_r__ = 0U;\n            param_ranges_i__.clear();\n            current_statement_begin__ = 9;\n            num_params_r__ += 1;\n            current_statement_begin__ = 10;\n            num_params_r__ += 1;\n            current_statement_begin__ = 11;\n            num_params_r__ += 1;\n        } catch (const std::exception& e) {\n            stan::lang::rethrow_located(e, current_statement_begin__, prog_reader__());\n            // Next line prevents compiler griping about no return\n            throw std::runtime_error(\"*** IF YOU SEE THIS, PLEASE REPORT A BUG ***\");\n        }\n    }\n\n    ~model142c43103db_modeldatastan2() { }\n\n\n    void transform_inits(const stan::io::var_context& context__,\n                         std::vector<int>& params_i__,\n                         std::vector<double>& params_r__,\n                         std::ostream* pstream__) const {\n        typedef double local_scalar_t__;\n        stan::io::writer<double> writer__(params_r__, params_i__);\n        size_t pos__;\n        (void) pos__; // dummy call to supress warning\n        std::vector<double> vals_r__;\n        std::vector<int> vals_i__;\n\n        current_statement_begin__ = 9;\n        if (!(context__.contains_r(\"alpha\")))\n            stan::lang::rethrow_located(std::runtime_error(std::string(\"Variable alpha missing\")), current_statement_begin__, prog_reader__());\n        vals_r__ = context__.vals_r(\"alpha\");\n        pos__ = 0U;\n        context__.validate_dims(\"parameter initialization\", \"alpha\", \"double\", context__.to_vec());\n        double alpha(0);\n        alpha = vals_r__[pos__++];\n        try {\n            writer__.scalar_unconstrain(alpha);\n        } catch (const std::exception& e) {\n            stan::lang::rethrow_located(std::runtime_error(std::string(\"Error transforming variable alpha: \") + e.what()), current_statement_begin__, prog_reader__());\n        }\n\n        current_statement_begin__ = 10;\n        if (!(context__.contains_r(\"beta\")))\n            stan::lang::rethrow_located(std::runtime_error(std::string(\"Variable beta missing\")), current_statement_begin__, prog_reader__());\n        vals_r__ = context__.vals_r(\"beta\");\n        pos__ = 0U;\n        context__.validate_dims(\"parameter initialization\", \"beta\", \"double\", context__.to_vec());\n        double beta(0);\n        beta = vals_r__[pos__++];\n        try {\n            writer__.scalar_unconstrain(beta);\n        } catch (const std::exception& e) {\n            stan::lang::rethrow_located(std::runtime_error(std::string(\"Error transforming variable beta: \") + e.what()), current_statement_begin__, prog_reader__());\n        }\n\n        current_statement_begin__ = 11;\n        if (!(context__.contains_r(\"sigma\")))\n            stan::lang::rethrow_located(std::runtime_error(std::string(\"Variable sigma missing\")), current_statement_begin__, prog_reader__());\n        vals_r__ = context__.vals_r(\"sigma\");\n        pos__ = 0U;\n        context__.validate_dims(\"parameter initialization\", \"sigma\", \"double\", context__.to_vec());\n        double sigma(0);\n        sigma = vals_r__[pos__++];\n        try {\n            writer__.scalar_lb_unconstrain(0, sigma);\n        } catch (const std::exception& e) {\n            stan::lang::rethrow_located(std::runtime_error(std::string(\"Error transforming variable sigma: \") + e.what()), current_statement_begin__, prog_reader__());\n        }\n\n        params_r__ = writer__.data_r();\n        params_i__ = writer__.data_i();\n    }\n\n    void transform_inits(const stan::io::var_context& context,\n                         Eigen::Matrix<double, Eigen::Dynamic, 1>& params_r,\n                         std::ostream* pstream__) const {\n      std::vector<double> params_r_vec;\n      std::vector<int> params_i_vec;\n      transform_inits(context, params_i_vec, params_r_vec, pstream__);\n      params_r.resize(params_r_vec.size());\n      for (int i = 0; i < params_r.size(); ++i)\n        params_r(i) = params_r_vec[i];\n    }\n\n\n    template <bool propto__, bool jacobian__, typename T__>\n    T__ log_prob(std::vector<T__>& params_r__,\n                 std::vector<int>& params_i__,\n                 std::ostream* pstream__ = 0) const {\n\n        typedef T__ local_scalar_t__;\n\n        local_scalar_t__ DUMMY_VAR__(std::numeric_limits<double>::quiet_NaN());\n        (void) DUMMY_VAR__;  // dummy to suppress unused var warning\n\n        T__ lp__(0.0);\n        stan::math::accumulator<T__> lp_accum__;\n        try {\n            stan::io::reader<local_scalar_t__> in__(params_r__, params_i__);\n\n            // model parameters\n            current_statement_begin__ = 9;\n            local_scalar_t__ alpha;\n            (void) alpha;  // dummy to suppress unused var warning\n            if (jacobian__)\n                alpha = in__.scalar_constrain(lp__);\n            else\n                alpha = in__.scalar_constrain();\n\n            current_statement_begin__ = 10;\n            local_scalar_t__ beta;\n            (void) beta;  // dummy to suppress unused var warning\n            if (jacobian__)\n                beta = in__.scalar_constrain(lp__);\n            else\n                beta = in__.scalar_constrain();\n\n            current_statement_begin__ = 11;\n            local_scalar_t__ sigma;\n            (void) sigma;  // dummy to suppress unused var warning\n            if (jacobian__)\n                sigma = in__.scalar_lb_constrain(0, lp__);\n            else\n                sigma = in__.scalar_lb_constrain(0);\n\n            // model body\n\n            current_statement_begin__ = 15;\n            lp_accum__.add(normal_log<propto__>(y, add(alpha, multiply(x, beta)), sigma));\n\n        } catch (const std::exception& e) {\n            stan::lang::rethrow_located(e, current_statement_begin__, prog_reader__());\n            // Next line prevents compiler griping about no return\n            throw std::runtime_error(\"*** IF YOU SEE THIS, PLEASE REPORT A BUG ***\");\n        }\n\n        lp_accum__.add(lp__);\n        return lp_accum__.sum();\n\n    } // log_prob()\n\n    template <bool propto, bool jacobian, typename T_>\n    T_ log_prob(Eigen::Matrix<T_,Eigen::Dynamic,1>& params_r,\n               std::ostream* pstream = 0) const {\n      std::vector<T_> vec_params_r;\n      vec_params_r.reserve(params_r.size());\n      for (int i = 0; i < params_r.size(); ++i)\n        vec_params_r.push_back(params_r(i));\n      std::vector<int> vec_params_i;\n      return log_prob<propto,jacobian,T_>(vec_params_r, vec_params_i, pstream);\n    }\n\n\n    void get_param_names(std::vector<std::string>& names__) const {\n        names__.resize(0);\n        names__.push_back(\"alpha\");\n        names__.push_back(\"beta\");\n        names__.push_back(\"sigma\");\n    }\n\n\n    void get_dims(std::vector<std::vector<size_t> >& dimss__) const {\n        dimss__.resize(0);\n        std::vector<size_t> dims__;\n        dims__.resize(0);\n        dimss__.push_back(dims__);\n        dims__.resize(0);\n        dimss__.push_back(dims__);\n        dims__.resize(0);\n        dimss__.push_back(dims__);\n    }\n\n    template <typename RNG>\n    void write_array(RNG& base_rng__,\n                     std::vector<double>& params_r__,\n                     std::vector<int>& params_i__,\n                     std::vector<double>& vars__,\n                     bool include_tparams__ = true,\n                     bool include_gqs__ = true,\n                     std::ostream* pstream__ = 0) const {\n        typedef double local_scalar_t__;\n\n        vars__.resize(0);\n        stan::io::reader<local_scalar_t__> in__(params_r__, params_i__);\n        static const char* function__ = \"model142c43103db_modeldatastan2_namespace::write_array\";\n        (void) function__;  // dummy to suppress unused var warning\n\n        // read-transform, write parameters\n        double alpha = in__.scalar_constrain();\n        vars__.push_back(alpha);\n\n        double beta = in__.scalar_constrain();\n        vars__.push_back(beta);\n\n        double sigma = in__.scalar_lb_constrain(0);\n        vars__.push_back(sigma);\n\n        double lp__ = 0.0;\n        (void) lp__;  // dummy to suppress unused var warning\n        stan::math::accumulator<double> lp_accum__;\n\n        local_scalar_t__ DUMMY_VAR__(std::numeric_limits<double>::quiet_NaN());\n        (void) DUMMY_VAR__;  // suppress unused var warning\n\n        if (!include_tparams__ && !include_gqs__) return;\n\n        try {\n            if (!include_gqs__ && !include_tparams__) return;\n            if (!include_gqs__) return;\n        } catch (const std::exception& e) {\n            stan::lang::rethrow_located(e, current_statement_begin__, prog_reader__());\n            // Next line prevents compiler griping about no return\n            throw std::runtime_error(\"*** IF YOU SEE THIS, PLEASE REPORT A BUG ***\");\n        }\n    }\n\n    template <typename RNG>\n    void write_array(RNG& base_rng,\n                     Eigen::Matrix<double,Eigen::Dynamic,1>& params_r,\n                     Eigen::Matrix<double,Eigen::Dynamic,1>& vars,\n                     bool include_tparams = true,\n                     bool include_gqs = true,\n                     std::ostream* pstream = 0) const {\n      std::vector<double> params_r_vec(params_r.size());\n      for (int i = 0; i < params_r.size(); ++i)\n        params_r_vec[i] = params_r(i);\n      std::vector<double> vars_vec;\n      std::vector<int> params_i_vec;\n      write_array(base_rng, params_r_vec, params_i_vec, vars_vec, include_tparams, include_gqs, pstream);\n      vars.resize(vars_vec.size());\n      for (int i = 0; i < vars.size(); ++i)\n        vars(i) = vars_vec[i];\n    }\n\n    static std::string model_name() {\n        return \"model142c43103db_modeldatastan2\";\n    }\n\n\n    void constrained_param_names(std::vector<std::string>& param_names__,\n                                 bool include_tparams__ = true,\n                                 bool include_gqs__ = true) const {\n        std::stringstream param_name_stream__;\n        param_name_stream__.str(std::string());\n        param_name_stream__ << \"alpha\";\n        param_names__.push_back(param_name_stream__.str());\n        param_name_stream__.str(std::string());\n        param_name_stream__ << \"beta\";\n        param_names__.push_back(param_name_stream__.str());\n        param_name_stream__.str(std::string());\n        param_name_stream__ << \"sigma\";\n        param_names__.push_back(param_name_stream__.str());\n\n        if (!include_gqs__ && !include_tparams__) return;\n\n        if (include_tparams__) {\n        }\n\n        if (!include_gqs__) return;\n    }\n\n\n    void unconstrained_param_names(std::vector<std::string>& param_names__,\n                                   bool include_tparams__ = true,\n                                   bool include_gqs__ = true) const {\n        std::stringstream param_name_stream__;\n        param_name_stream__.str(std::string());\n        param_name_stream__ << \"alpha\";\n        param_names__.push_back(param_name_stream__.str());\n        param_name_stream__.str(std::string());\n        param_name_stream__ << \"beta\";\n        param_names__.push_back(param_name_stream__.str());\n        param_name_stream__.str(std::string());\n        param_name_stream__ << \"sigma\";\n        param_names__.push_back(param_name_stream__.str());\n\n        if (!include_gqs__ && !include_tparams__) return;\n\n        if (include_tparams__) {\n        }\n\n        if (!include_gqs__) return;\n    }\n\n}; // model\n\n}  // namespace\n\ntypedef model142c43103db_modeldatastan2_namespace::model142c43103db_modeldatastan2 stan_model;\n\n"
## 
## $model_name
## [1] "modeldatastan2"
## 
## $model_code
## [1] "\ndata {\n int < lower = 1 > N; \n vector[N] x;\n vector[N] y;\n}\n\nparameters {\n real alpha; \n real beta; \n real < lower = 0 > sigma; \n}\n\nmodel {\n y ~ normal(alpha + x * beta , sigma);\n}\n\ngenerated quantities {\n}"
## attr(,"model_name2")
## [1] "modeldatastan2"

Lets save the file path

modeldatastan2 <- "modeldatastan2.stan"

Run the Stan model

modelfit <- stan(file = modeldatastan2, data = datastan2, warmup = 500, iter = 1000, chains = 4, cores = 2, thin = 1)

## Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
## Running the chains for more iterations may help. See
## http://mc-stan.org/misc/warnings.html#bulk-ess

modelfit

## Inference for Stan model: modeldatastan2.
## 4 chains, each with iter=1000; warmup=500; thin=1; 
## post-warmup draws per chain=500, total post-warmup draws=2000.
## 
##        mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
## alpha 12.56    0.00 0.08 12.40 12.50 12.56 12.61 12.71   826 1.00
## beta  -0.05    0.00 0.00 -0.06 -0.06 -0.05 -0.05 -0.05   942 1.00
## sigma  0.23    0.00 0.03  0.18  0.21  0.23  0.25  0.29   642 1.00
## lp__  37.32    0.06 1.32 34.13 36.73 37.64 38.30 38.85   498 1.01
## 
## Samples were drawn using NUTS(diag_e) at Mon Oct 14 23:27:49 2019.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

Rhat values parameter when they are or near 1, it means chains converged. Lets look at posterior analysis.

posterior_analysis <- extract(modelfit)
str(posterior_analysis)

## List of 4
##  $ alpha: num [1:2000(1d)] 12.5 12.5 12.6 12.6 12.6 ...
##   ..- attr(*, "dimnames")=List of 1
##   .. ..$ iterations: NULL
##  $ beta : num [1:2000(1d)] -0.0568 -0.0507 -0.0575 -0.0564 -0.0554 ...
##   ..- attr(*, "dimnames")=List of 1
##   .. ..$ iterations: NULL
##  $ sigma: num [1:2000(1d)] 0.255 0.27 0.228 0.187 0.175 ...
##   ..- attr(*, "dimnames")=List of 1
##   .. ..$ iterations: NULL
##  $ lp__ : num [1:2000(1d)] 36.5 36.7 38.5 37.7 36.8 ...
##   ..- attr(*, "dimnames")=List of 1
##   .. ..$ iterations: NULL

Compare your results to our results to “lm”

plot(y_data ~ x_data, pch = 20)

abline(lm_linear, col = 2, lty = 2, lw = 3)
abline( mean(posterior_analysis$alpha), mean(posterior_analysis$beta), col = 6, lw = 2)

Lets visualisze the variability in the estimation & plot multiple estimates from posterior.

plot(y_data ~ x_data, pch = 20)

for (i in 1:500) {
 abline(posterior_analysis$alpha[i], posterior_analysis$beta[i], col = "gray", lty = 1)
}

abline(mean(posterior_analysis$alpha), mean(posterior_analysis$beta), col = 6, lw = 2)

write("
data {
 int < lower = 1 > N; 
 vector[N] x;
 vector[N] y; 
}

parameters {
 real alpha; 
 real beta; 
 real < lower = 0 > sigma; 
}

model {
 alpha ~ normal(10, 0.1);
 beta ~ normal(1, 0.1);
 y ~ normal(alpha + x * beta , sigma);
}

generated quantities {}",

"modeldatastan3.stan")

modeldatastan3 <- "modeldatastan3.stan"

plot(y_data ~ x_data, pch = 20)
for (i in 1:500) {
  abline(posterior_analysis$alpha[i], posterior_analysis$beta[i], col="gray", lty=1)
}

modelfit2 <- stan(modeldatastan3, data = datastan2, warmup = 500, iter = 1000, chains = 4, cores = 2, thin = 1)

posterior2 <- extract(modelfit2)


plot(y_data ~ x_data, pch = 20)
abline(lm_intersect, lm_slope, col = 4, lty = 2, lw = 2)
abline(mean(posterior2$alpha), mean(posterior2$beta), col = 3, lw = 2)
abline(mean(posterior_analysis$alpha), mean(posterior_analysis$beta), col = 36, lw = 3)

ANLY 505 Rstan Problem Set #1

Bhavesh Vaswani

October 13, 2019