Kata Pengantar
Workshop Bank Indonesia – Sektor Eksternal & Nilai Tukar
Bayesian time series modeling tumbuh menjadi salah satu pendekatan paling kuat untuk membaca dinamika ekonomi modern—terutama di bidang yang penuh ketidakpastian seperti sektor eksternal, nilai tukar, arus modal, serta pasar keuangan global. Di tengah lanskap global yang bergerak cepat, analis tidak hanya dituntut “mengikuti data”, tetapi juga mengintegrasikan pengetahuan sebelumnya, memperbarui keyakinan secara konsisten, dan merumuskan prediksi yang memuat ketidakpastian secara eksplisit.
Di sinilah pendekatan Bayesian menjadi sangat relevan bagi Bank Indonesia.
Dalam dua dekade terakhir, model Bayesian telah menjadi tulang punggung berbagai analisis makro-keuangan tingkat lanjut: dari nowcasting PDB, analisis risiko nilai tukar, pemodelan volatilitas pasar global, sampai Bayesian VAR (BVAR) yang kini menjadi standar banyak bank sentral dunia. Workshop ini memanfaatkan konteks tersebut—khususnya pada sesi "Analisis teknis makroekonomi dan pasar keuangan global: pendekatan Bayesian" yang dijadwalkan pada 24–25 November 2025.
Pendekatan Bayesian menggabungkan informasi sebelum melihat data (prior) dengan likelihood dari data untuk menghasilkan posterior, yaitu keyakinan tentang parameter setelah melihat data. Inti dari pendekatan ini adalah
Teorema Bayes:
\[ p(\theta \mid y) = \frac{p(y \mid \theta)\,p(\theta)}{p(y)}, \]
di mana \(p(y\mid\theta)\) adalah likelihood, \(p(\theta)\) adalah prior, dan \(p(\theta\mid y)\) adalah posterior.
Besaran \(p(y)\) pada penyebut disebut peluang marginal atau marginal likelihood / evidence, yang diperoleh dengan “merata-ratakan” likelihood terhadap prior:
\[ p(y) = \int p(y \mid \theta)\,p(\theta)\,d\theta, \]
(atau penjumlahan jika \(\theta\) diskrit). Nilai \(p(y)\) berfungsi sebagai konstanta normalisasi agar \(p(\theta\mid y)\) menjadi distribusi peluang yang sah (terintegrasi sama dengan 1).
Di bagian berikutnya, notasi ini sering ditulis lebih ringkas sebagai
\[ p(\theta\mid y) \propto p(y\mid\theta) p(\theta), \]
dengan memahami bahwa \(p(y)\) “disembunyikan” sebagai konstanta yang tidak bergantung pada \(\theta\).
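Sebagai ilustrasi numerik kecil (sketsa dengan data hipotetis: 7 sukses dari 10 percobaan Bernoulli dan prior Beta(2,2)), kita dapat menghitung posterior pada grid dan melihat peran \(p(y)\) sebagai konstanta normalisasi:
# Sketsa: posterior = prior x likelihood, lalu dinormalisasi (aproksimasi p(y) pada grid)
theta <- seq(0.001, 0.999, length.out = 999)
d_theta <- theta[2] - theta[1]
prior <- dbeta(theta, 2, 2)        # prior Beta(2,2) (hipotetis)
lik <- theta^7 * (1 - theta)^3     # likelihood: 7 sukses dari 10 percobaan (hipotetis)
unnorm <- lik * prior              # p(y|theta) p(theta), belum ternormalisasi
p_y <- sum(unnorm * d_theta)       # aproksimasi marginal likelihood p(y)
post <- unnorm / p_y               # posterior yang sah
sum(post * d_theta)                # cek: terintegrasi ~ 1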
Keunggulan Bayesian secara umum:
Bagian ini menjelaskan keuntungan menggunakan pendekatan Bayesian dalam analisis ekonomi, yang disusun selaras dengan konteks TOR "ToR & Agenda Pengembangan Kompetensi Permodelan Ekonomi Sektor Eksternal dan Nilai Tukar"—khususnya sesi "Analisis teknis makroekonomi dan pasar keuangan global: pendekatan Bayesian" yang menjadi fokus pelatihan Bank Indonesia.
Pendekatan Bayesian semakin banyak digunakan dalam analisis makroekonomi, ekonomi sektor eksternal, serta pasar keuangan global karena menawarkan kerangka inferensi yang lebih fleksibel, adaptif, dan informatif dibandingkan pendekatan statistik klasik. Hal ini selaras dengan tujuan pelatihan “Analisis teknis makroekonomi dan pasar keuangan global: pendekatan Bayesian” dalam TOR kegiatan pengembangan kompetensi ekonomi dan nilai tukar yang diselenggarakan oleh Bank Indonesia.
Beberapa keunggulan utamanya:
1. Mampu Mengintegrasikan Informasi Sebelum dan Sesudah Observasi
Ekonomi makro dan sektor eksternal sarat dengan prior knowledge. Dalam kerangka Bayesian, semua informasi ini dapat dimasukkan secara eksplisit melalui prior, sehingga hasil estimasi menjadi lebih stabil dan lebih rasional ketika data terbatas atau sangat volatile—seperti pada nilai tukar, arus modal, dan harga komoditas.
2. Lebih Tahan Terhadap Ketidakpastian Tinggi (High Uncertainty Environment)
Variabel-variabel seperti nilai tukar, arus modal, dan harga komoditas sering menunjukkan volatilitas tinggi dan perubahan struktural. Pendekatan Bayesian menghasilkan posterior distribution yang secara alami menangkap ketidakpastian, sehingga kesimpulan dan prediksi memuat ketidakpastian itu secara eksplisit.
Ini sangat relevan untuk analisis sektor eksternal sesuai agenda BI dalam TOR.
3. Lebih Andal untuk Sampel Pendek dan Data Tidak Stabil
Data makro Indonesia kadang hanya tersedia dalam horizon pendek (misalnya kuartalan), atau mengalami revisi. Dengan memanfaatkan prior, pendekatan Bayesian tetap memberikan estimasi yang stabil pada sampel pendek dan mudah diperbarui ketika data direvisi.
4. Mendukung Model Kompleks: Nonlinear, Dynamic, dan Hierarchical
Pendekatan Bayesian dapat dengan mudah menangani model nonlinear, dinamis (misalnya state-space/DLM), dan hierarkis.
Ini sesuai fokus pelatihan pada “Analisis teknis makroekonomi dan pasar keuangan global: pendekatan ekonometri lanjutan” dalam TOR.
5. Lebih Tepat untuk Forecasting dan Analisis Risiko
Bayesian menyediakan full predictive distribution, bukan sekadar titik perkiraan.
Untuk forecasting nilai tukar, yield, inflasi global, atau pertumbuhan ekonomi, hal ini jauh lebih informatif—karena pembuat kebijakan memerlukan skenario risiko, bukan sekadar titik perkiraan.
6. Cocok untuk Integrasi Dengan Model Struktural
Banyak analisis Bank Indonesia menggunakan model struktural makroekonomi.
Pendekatan Bayesian sangat cocok digabungkan dengan model struktural tersebut karena prior dapat merepresentasikan teori ekonomi, sedangkan data menyediakan evidence empiris yang diperbarui secara dinamis.
7. Komputasi Modern: MCMC dan INLA
Metode komputasi modern seperti MCMC (Markov chain Monte Carlo) dan INLA (Integrated Nested Laplace Approximation)
memungkinkan analisis Bayesian berjalan efisien dan bisa digunakan untuk model besar seperti VAR 10 variabel atau DLM dengan banyak komponen. Hal ini sejalan dengan praktik market-based analysis dan ekonometri lanjutan yang ditekankan dalam TOR pengembangan kompetensi permodelan ekonomi sektor eksternal dan nilai tukar.
Seperti telah dijelaskan sebelumnya, pendekatan Bayesian bertumpu pada tiga komponen utama: likelihood, prior, dan posterior. Ketiga elemen ini saling melengkapi dalam proses inferensi, dan penjelasannya disajikan berikut ini.
Likelihood \(L(\theta; y) = p(y\mid\theta)\) mengukur seberapa mungkin data yang kita amati muncul di bawah parameter \(\theta\). Contoh:
Gaussian: \(y_i\sim\mathrm{N}(\mu,\sigma^2)\) \(\Rightarrow\) \(L(\mu,\sigma^2;y)=\prod_i \mathrm{N}(y_i\mid\mu,\sigma^2)\).
Poisson: \(y_i\sim\text{Poisson}(\lambda)\) \(\Rightarrow\) \(L(\lambda;y)=\prod_i \text{Poisson}(y_i\mid\lambda)\).
Dalam inferensi Bayesian, kita tidak memaksimalkan likelihood, tetapi menggabungkan dengan prior untuk memperoleh posterior.
Prior mencerminkan keyakinan awal tentang parameter sebelum melihat data. Beberapa jenis prior yang akan dijelaskan secara detail di bawah ini: informative (kuat), weakly-informative (lemah namun membatasi ekor), vague (sangat difus), dan non-informative (minim informasi; bisa proper/improper).
Hati-hati: prior improper tidak terintegrasi ke 1; pastikan posterior tetap proper.
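Sebagai gambaran visual (sketsa; nilai hyperparameter mengikuti skema prior Gamma yang dipakai pada contoh Poisson–Gamma di bawah), tiga tingkat informasi prior dapat dibandingkan sebagai berikut:
# Sketsa: membandingkan prior Gamma informative, weakly-informative, dan vague
lam <- seq(0.01, 10, length.out = 400)
plot(lam, dgamma(lam, shape = 15, rate = 5), type = "l",
     xlab = expression(lambda), ylab = "densitas prior")   # informative (mean 3)
lines(lam, dgamma(lam, shape = 3, rate = 1), lty = 2)      # weakly-informative (mean 3)
lines(lam, dgamma(lam, shape = 0.1, rate = 0.1), lty = 3)  # vague (sangat difus, tetap proper)
legend("topright", c("informative", "weakly", "vague"), lty = 1:3, bty = "n")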
Posterior menggabungkan prior dan likelihood:
\[ p(\theta\mid y) \propto p(y\mid \theta)\,p(\theta). \]
Ketika prior konjugat terhadap likelihood, bentuk posterior tetap dalam keluarga distribusi yang sama dan memiliki bentuk tertutup (closed-form). Keuntungannya: perhitungan ringkas (mean, varians, kredibel interval) dan mudah melakukan posterior predictive.
Notasi Proportional To (\(\propto\))
Bayesian selalu bermain dengan tiga aktor: prior, likelihood, dan posterior. Rumus umumnya:
\[ p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta). \]
Simbol \(\propto\) (proportional to) dipakai karena kita tidak menuliskan konstanta normalisasi. Dalam Bayes, definisi lengkap posterior sebenarnya:
\[ p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)}. \]
Masalahnya: bagian bawah, \(p(y)\)—yang disebut marginal likelihood atau evidence—sering berupa integral yang sulit atau bahkan mustahil dihitung secara eksplisit:
\[ p(y) = \int p(y \mid \theta)\, p(\theta)\, d\theta. \]
Untuk keperluan menentukan bentuk distribusi posterior, kita tidak perlu nilai pasti dari \(p(y)\). Konstanta itu tidak tergantung pada parameter \(\theta\). Jadi kita cukup menulis:
\[ p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta), \]
yang artinya: posterior sebanding dengan prior × likelihood, sampai dikalikan oleh suatu konstanta supaya total luasnya = 1.
Ibaratnya begini: kalau kita hanya butuh bentuk kurva, bukan nilai absolutnya, konstanta bisa ditunda belakangan.
Jika prior konjugat, konstanta normalisasi bisa dihitung dengan mudah, sehingga posterior punya bentuk yang sama dengan prior. Inilah kenapa konjugasi itu seperti cheat-code matematis: kita tetap mendapatkan posterior yang rapi, siap dihitung mean, variance, sampai credible interval-nya tanpa drama.
Misal \(y_1,\dots,y_n \stackrel{iid}{\sim}\text{Poisson}(\lambda)\) dan prior \(\lambda \sim \text{Gamma}(\alpha_0, \beta_0)\) (bentuk shape–rate, yaitu densitas \(\propto \lambda^{\alpha-1} e^{-\beta\lambda}\)).
Maka:
\[ p(\lambda\mid y) \sim \text{Gamma}\!\Big(\alpha_n,\ \beta_n\Big),\quad \alpha_n = \alpha_0 + \sum_{i=1}^n y_i,\quad \beta_n = \beta_0 + n. \] Beberapa ringkasan:
\[ \mathbb{E}[\lambda\mid y] = \frac{\alpha_n}{\beta_n}, \qquad \operatorname{Var}[\lambda\mid y] = \frac{\alpha_n}{\beta_n^{2}}. \]
Posterior predictive untuk pengamatan baru \(\tilde{y}\mid y\) adalah Negative Binomial (parameterisasi size–prob):
\[ \tilde{y}\mid y \sim \text{NegBin}\big(\text{size}=\alpha_n,\ \text{prob}=\tfrac{\beta_n}{\beta_n+1}\big). \]
Detail Model Poisson–Gamma
Dokumen ini menjabarkan mengapa prior Gamma untuk laju \(\lambda\) pada model Poisson menghasilkan posterior yang juga Gamma (konjugat). Kita menggunakan parameterisasi rate untuk Gamma: \(\text{Gamma}(\alpha, \beta)\) dengan pdf
\[ p(\lambda)=\frac{\beta^{\alpha}}{\Gamma(\alpha)}\,\lambda^{\alpha-1} e^{-\beta \lambda},\qquad \lambda>0. \]
Model dan Prior
Misalkan \(y_1,\dots,y_n \mid \lambda \stackrel{i.i.d.}{\sim} \text{Poisson}(\lambda)\) dengan pmf
\[ p(y_i\mid \lambda)=\frac{e^{-\lambda}\lambda^{y_i}}{y_i!},\qquad y_i\in\{0,1,2,\dots\}. \]
Ambil prior \(\lambda \sim \text{Gamma}(\alpha_0,\beta_0)\):
\[ p(\lambda)=\frac{\beta_0^{\alpha_0}}{\Gamma(\alpha_0)}\,\lambda^{\alpha_0-1} e^{-\beta_0\lambda},\quad \lambda>0. \]
Likelihood Gabungan
Karena i.i.d., \[ L(\lambda; \mathrm y)=\prod_{i=1}^n p(y_i\mid \lambda) =\left(\prod_{i=1}^n \frac{1}{y_i!}\right) e^{-n\lambda} \lambda^{\sum_{i=1}^n y_i}. \]
Abaikan konstanta yang tidak bergantung pada \(\lambda\), definisikan \(S=\sum_{i=1}^n y_i\):
\[ L(\lambda; \mathrm y)\ \propto\ e^{-n\lambda}\,\lambda^{S}. \]
Posterior \(p(\lambda\mid \mathrm y)\) ∝ prior × likelihood
\[ p(\lambda\mid \mathrm y)\ \propto\ p(\lambda)\,L(\lambda;\mathrm y) \ \propto\ \Big[\lambda^{\alpha_0-1} e^{-\beta_0\lambda}\Big]\Big[e^{-n\lambda}\lambda^{S}\Big] = \lambda^{(\alpha_0-1)+S} e^{-(\beta_0+n)\lambda}. \]
Tetapkan
\[ \alpha_n := \alpha_0 + S,\qquad \beta_n := \beta_0 + n. \] Maka kernel posterior
\[ p(\lambda\mid \mathrm y)\ \propto\ \lambda^{\alpha_n-1} e^{-\beta_n\lambda}, \]
yang identik dengan Gamma(\(\alpha_n,\beta_n\)) (parameterisasi rate). Dengan konstanta normalisasi, pdf lengkapnya:
\[ p(\lambda\mid \mathrm y) = \frac{\beta_n^{\alpha_n}}{\Gamma(\alpha_n)}\,\lambda^{\alpha_n-1} e^{-\beta_n\lambda},\quad \alpha_n=\alpha_0+\textstyle\sum_i y_i,\ \beta_n=\beta_0+n. \]
Konstanta Normalisasi
Gunakan identitas \(\int_0^{\infty} \lambda^{a-1} e^{-b\lambda}\,d\lambda = \Gamma(a)/b^a\) untuk \(a>0,\ b>0\).
Sehingga konstanta normalisasi adalah \(\beta_n^{\alpha_n}/\Gamma(\alpha_n)\).
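Identitas ini mudah dicek secara numerik di R (sketsa; nilai \(a\) dan \(b\) di bawah hanyalah contoh hipotetis):
# Cek numerik: integral lambda^(a-1) e^(-b lambda) d lambda = Gamma(a)/b^a
a <- 3.5; b <- 2
integrate(function(l) l^(a - 1) * exp(-b * l), lower = 0, upper = Inf)$value
gamma(a) / b^a  # keduanya harus (hampir) sama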
Ringkasan Momen Posterior
Jika \(\lambda\mid\mathrm y\sim\text{Gamma}(\alpha_n,\beta_n)\) (rate): mean \(\alpha_n/\beta_n\), varians \(\alpha_n/\beta_n^{2}\), dan kuantil posterior dihitung dengan qgamma(q, shape=alpha_n, rate=beta_n).
Prediktif Posterior ⇒ Negative Binomial
Untuk hitungan baru \(y_{new}\):
\[ p(y_{new}\mid \mathrm y) = \int p(y_{new}\mid \lambda)\,p(\lambda\mid \mathrm y)\,d\lambda = \operatorname{NegBin}\!\left(r=\alpha_n,\ p=\tfrac{\beta_n}{\beta_n+1}\right). \]
Ini berasal dari campuran Poisson–Gamma.
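Hubungan campuran ini dapat diverifikasi lewat simulasi (sketsa; \(\alpha_n=5,\ \beta_n=2\) hanyalah nilai contoh hipotetis):
# Sketsa: pmf NegBin vs campuran Poisson-Gamma via simulasi
set.seed(42)
alpha_n <- 5; beta_n <- 2                                  # nilai contoh (hipotetis)
lam_draws <- rgamma(1e5, shape = alpha_n, rate = beta_n)   # lambda ~ posterior Gamma
y_draws <- rpois(1e5, lam_draws)                           # y_new | lambda ~ Poisson
mean(y_draws == 3)                                         # proporsi empiris P(y_new = 3)
dnbinom(3, size = alpha_n, prob = beta_n / (beta_n + 1))   # pmf NegBin teoretis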
Implementasi di R: Simulasi dan Inferensi
Data Mainan dan Ringkasan Posterior
Kita buat contoh kecil untuk memeriksa \(\alpha_n,\ \beta_n\) dan ringkasan posterior.
set.seed(1)
# Data mainan: 30 hari, laju ~ 3
n <- 30
lambda_true <- 3
y <- rpois(n, lambda_true)
S <- sum(y)
# Prior Gamma(rate)
alpha0 <- 2; beta0 <- 1 # proper, tidak terlalu informatif
alpha_n <- alpha0 + S
beta_n <- beta0 + n
c(alpha_n = alpha_n, beta_n = beta_n,
mean_post = alpha_n / beta_n,
var_post = alpha_n / beta_n^2)
## alpha_n beta_n mean_post var_post
## 95.00000000 31.00000000 3.06451613 0.09885536
Kuantil Posterior (Contoh 95% Credible Interval)
ql <- qgamma(0.025, shape = alpha_n, rate = beta_n)
qu <- qgamma(0.975, shape = alpha_n, rate = beta_n)
c(q025 = ql, q975 = qu)
## q025 q975
## 2.479377 3.710716
Catatan Parameterisasi
Jika menggunakan parameterisasi scale \(\theta = 1/\beta\), maka posterior tetap Gamma dengan \(\theta_n = 1/(\beta_0+n)\) dan pdf menyesuaikan. Pastikan konsisten antara rate vs scale saat memakai fungsi dgamma/qgamma/rgamma.
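Perbedaan rate vs scale mudah dicek (sketsa; memakai \(\alpha_n=95,\ \beta_n=31\) dari contoh di atas):
# Rate vs scale: densitas yang sama bila scale = 1/rate
alpha_n <- 95; beta_n <- 31
dgamma(3, shape = alpha_n, rate = beta_n)
dgamma(3, shape = alpha_n, scale = 1 / beta_n)  # identik dengan baris di atas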
Perbandingan ML vs Bayesian pada Model Poisson
Di bagian ini, kita bandingkan estimasi maximum likelihood (MLE) dengan estimasi Bayesian di bawah beberapa skema prior (informative, weakly-informative, vague).
Simulasi Data dan Estimasi ML
set.seed(123)
# Data: hitung kejadian per hari (misal 30 hari)
n <- 30
lambda_true <- 3.0
y <- rpois(n, lambda_true)
sum_y <- sum(y)
# MLE untuk λ pada model Poisson i.i.d.
lambda_mle <- mean(y)
lambda_mle
## [1] 3.4
Varians aproksimasi untuk \(\hat\lambda_{\mathrm{ML}}\) dan interval kepercayaan (normal/Wald dan eksak):
alpha <- 0.05
S <- sum_y
lambda_mle <- mean(y)
se_mle <- sqrt(lambda_mle / n)
# CI Wald (asymptotik normal)
z <- qnorm(1 - alpha/2)
ci_wald <- c(lambda_mle - z * se_mle, lambda_mle + z * se_mle)
ci_wald[ci_wald < 0] <- 0 # potong di 0 bila perlu
# CI eksak (berbasis Chi-square pada total μ = nλ)
if (S == 0) {
mu_low <- 0
mu_upp <- 0.5 * qchisq(1 - alpha/2, df = 2*(S + 1))
} else {
mu_low <- 0.5 * qchisq(alpha/2, df = 2*S)
mu_upp <- 0.5 * qchisq(1 - alpha/2, df = 2*(S + 1))
}
ci_exact <- c(mu_low / n, mu_upp / n)
# Log-likelihood pada λ_hat (abaikan konstanta lfactorial)
loglik <- sum(dpois(y, lambda_mle, log = TRUE))
k <- 1L
AIC_mle <- 2*k - 2*loglik
mle_tab <- data.frame(
method = "MLE (Poisson)",
lambda_mle = lambda_mle,
se = se_mle,
ci_wald_lo = ci_wald[1], ci_wald_hi = ci_wald[2],
ci_exact_lo = ci_exact[1], ci_exact_hi = ci_exact[2],
logLik = loglik,
AIC = AIC_mle
)
mle_tab
## method lambda_mle se ci_wald_lo ci_wald_hi ci_exact_lo
## 1 MLE (Poisson) 3.4 0.3366502 2.740178 4.059822 2.772294
## ci_exact_hi logLik AIC
## 1 4.127368 -60.35731 122.7146
Beberapa Skema Prior dan Posteriornya
# PRIOR: tiga skenario
priors <- list(
informative = list(alpha0 = 15, beta0 = 5), # mean 3, sd ~0.77
weakly = list(alpha0 = 3, beta0 = 1), # mean 3, sd ~1.73
vague = list(alpha0 = 0.1, beta0 = 0.1) # sangat difus (tetap proper)
)
post_tab <- data.frame(
prior = character(),
alpha0 = numeric(), beta0 = numeric(),
alpha_n = numeric(), beta_n = numeric(),
mean_post = numeric(), sd_post = numeric(),
q025 = numeric(), q975 = numeric(),
stringsAsFactors = FALSE
)
for (nm in names(priors)) {
a0 <- priors[[nm]]$alpha0
b0 <- priors[[nm]]$beta0
an <- a0 + sum_y
bn <- b0 + n
mn <- an / bn
sd <- sqrt(an) / bn
ql <- qgamma(0.025, shape = an, rate = bn)
qu <- qgamma(0.975, shape = an, rate = bn)
post_tab <- rbind(post_tab, data.frame(
prior = nm, alpha0 = a0, beta0 = b0,
alpha_n = an, beta_n = bn,
mean_post = mn, sd_post = sd,
q025 = ql, q975 = qu
))
}
post_tab
## prior alpha0 beta0 alpha_n beta_n mean_post sd_post q025 q975
## 1 informative 15.0 5.0 117.0 35.0 3.342857 0.3090473 2.764634 3.975170
## 2 weakly 3.0 1.0 105.0 31.0 3.387097 0.3305468 2.770310 4.064949
## 3 vague 0.1 0.1 102.1 30.1 3.392027 0.3356962 2.766083 4.080859
Gabungkan ringkasan Bayesian (mean posterior + credible interval) dan MLE (dengan CI eksak):
compare_tab <- rbind(
setNames(
data.frame(
prior = post_tab$prior,
estimator = "Posterior (mean)",
point = post_tab$mean_post,
lo = post_tab$q025,
hi = post_tab$q975,
stringsAsFactors = FALSE
),
c("prior","estimator","point","lo","hi")
),
data.frame(
prior = "—",
estimator = "MLE",
point = mle_tab$lambda_mle,
lo = mle_tab$ci_exact_lo, # pakai CI eksak agar konservatif
hi = mle_tab$ci_exact_hi,
stringsAsFactors = FALSE
)
)
compare_tab
## prior estimator point lo hi
## 1 informative Posterior (mean) 3.342857 2.764634 3.975170
## 2 weakly Posterior (mean) 3.387097 2.770310 4.064949
## 3 vague Posterior (mean) 3.392027 2.766083 4.080859
## 4 — MLE 3.400000 2.772294 4.127368
Visualisasi: Prior vs Posterior
Prior dan Posterior dalam Satu Gambar
Pilih satu skenario prior untuk divisualisasikan (misalnya weakly-informative).
nm <- "weakly"
a0 <- priors[[nm]]$alpha0
b0 <- priors[[nm]]$beta0
an <- a0 + sum_y
bn <- b0 + n
lam <- seq(0, max(8, lambda_true*3), length.out = 400)
library(tibble); library(tidyr); library(ggplot2)  # paket yang dibutuhkan blok ini
tibble(
lambda = lam,
Prior = dgamma(lam, a0, b0),
Posterior = dgamma(lam, an, bn)
) |>
pivot_longer(cols = c(Prior, Posterior),
names_to = "Distribusi",
values_to = "densitas") |>
ggplot(aes(x = lambda, y = densitas, color = Distribusi)) +
geom_line(linewidth = 1) +
geom_vline(xintercept = lambda_true, linetype = "dashed", color = "grey40") +
annotate("text", x = lambda_true, y = max(dgamma(lam, an, bn))*0.9,
label = "lambda true", angle = 90, vjust = -0.5, size = 3) +
labs(
title = paste("Prior vs Posterior (", nm, " prior )", sep = ""),
x = expression(lambda),
y = "Kerapatan"
)
Posterior Predictive: Distribusi Negative Binomial
Untuk satu hari ke depan, posterior predictive bagi hitungan baru \(\tilde y\) adalah Negative Binomial:
\[ \tilde y \mid y \sim \text{NegBin}\big(\text{size}=\alpha_n,\ \text{prob}=\tfrac{\beta_n}{\beta_n+1}\big). \]
Berikut ilustrasi distribusinya untuk skenario weakly-informative prior di atas:
size <- an
prob <- bn/(bn+1)
k <- 0:15
pmf <- dnbinom(k, size = size, prob = prob)
tibble(y_new = k, pmf = pmf) |>
ggplot(aes(x = y_new, y = pmf)) +
geom_col(width = 0.6) +
geom_point(size = 2) +
labs(
title = "Posterior Predictive: 1 Hari ke Depan (Negative Binomial)",
x = expression(tilde(y)),
y = "Probabilitas"
)
Ringkasan
Catatan
Markov Chain Monte Carlo (MCMC) membangkitkan sampel dari posterior ketika bentuk tertutup sulit atau tak tersedia. Dua skema umum: Gibbs sampling dan Metropolis–Hastings.
Bagian ini menjelaskan secara sistematis:
Konsep Monte Carlo (MC)
Konsep Markov Chain Monte Carlo (MCMC)
Formulasi matematis dasar MCMC
Dua algoritma penting: Metropolis–Hastings dan Gibbs sampling.
Contoh implementasi sederhana dalam R (simulasi dan plot).
Fokusnya adalah intuisi statistik dan formulasi matematis yang bersih, sehingga mudah diintegrasikan ke dalam bab pengantar Bayesian inference atau Bayesian time series.
Inti dari metode Monte Carlo:
Menggunakan bilangan acak untuk menghitung nilai rata-rata, peluang, atau integral yang sulit dihitung secara analitik.
Secara formal, misalkan \(X\) adalah peubah acak dengan distribusi \(p(x)\), dan kita ingin menghitung ekspektasi:
\[ \theta = \mathbb{E}[g(X)] = \int g(x)\, p(x)\, dx. \]
Sering kali integral ini tidak memiliki bentuk tertutup. Metode Monte Carlo mendekati \(\theta\) dengan langkah:
Bangkitkan sampel acak i.i.d.
\[
X_1, X_2, \dots, X_N \sim p(x).
\]
Gunakan rata-rata sampel sebagai estimator:
\[ \hat{\theta}_N = \frac{1}{N} \sum_{i=1}^{N} g(X_i). \]
Berdasarkan Hukum Bilangan Besar (LLN):
\[ \hat{\theta}_N \xrightarrow[N\to\infty]{\text{a.s.}} \theta. \]
Selain itu, Teorema Limit Pusat (CLT) memberikan:
\[ \sqrt{N}\,\big(\hat{\theta}_N - \theta\big) \;\xrightarrow{d}\; \mathrm{N}\big(0,\; \sigma^{2}\big), \qquad \sigma^{2} = \operatorname{Var}\!\big(g(X)\big). \]
Karena itu, galat Monte Carlo berorde:
\[ O(N^{-1/2}). \]
Contoh: Aproksimasi Integral dengan Monte Carlo
Misalkan kita ingin menghitung integral:
\[ I = \int_0^1 e^{-x^2}\,dx, \]
yang tidak memiliki bentuk tertutup dalam fungsi elementer.
Kita dapat menuliskan integral ini sebagai ekspektasi dengan memilih
\(X \sim \mathrm{Unif}(0,1)\).
Untuk sebaran seragam di \([0,1]\),
fungsi kerapatannya adalah \(p(x) = 1\)
untuk \(0 \le x \le 1\).
Maka:
\[ I = \int_0^1 e^{-x^2}\,dx = \int_0^1 e^{-x^2} \cdot 1\,dx = \mathbb{E}[g(X)], \]
dengan
\[ g(X) = e^{-X^2}, \quad X \sim \mathrm{Unif}(0,1). \]
Estimator Monte Carlo untuk \(I\) adalah:
\[ \hat{I}_N = \frac{1}{N} \sum_{i=1}^N e^{-X_i^2}, \quad X_i \sim \mathrm{Unif}(0,1) \text{ i.i.d.} \]
Semakin besar \(N\), nilai \(\hat{I}_N\) akan semakin mendekati nilai integral \(I\), dengan galat berorde \(\mathrm{O}(N^{-1/2})\).
Contoh Kode R
set.seed(123)
N <- 10000
x <- runif(N, min = 0, max = 1)
g <- exp(-x^2)
I_hat <- mean(g)
I_hat
## [1] 0.7488739
Nilai I_hat adalah aproksimasi Monte Carlo untuk integral \(\int_0^1 e^{-x^2}\,dx\).
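Laju galat \(O(N^{-1/2})\) juga dapat diilustrasikan dengan menaikkan \(N\) (sketsa; nilai acuan diambil dari integrate()):
# Sketsa: galat Monte Carlo mengecil kira-kira sebanding 1/sqrt(N)
set.seed(1)
I_ref <- integrate(function(t) exp(-t^2), 0, 1)$value
for (N in c(1e2, 1e4, 1e6)) {
  est <- mean(exp(-runif(N)^2))
  cat(sprintf("N = %7.0f  est = %.5f  |galat| = %.5f\n", N, est, abs(est - I_ref)))
}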
Rantai Markov
Sebuah rantai Markov \(\{X_t\}\) pada ruang keadaan \(\mathbb{X}\) memiliki sifat Markov:
\[ \mathbb{P}(X_{t+1}\in A \mid X_t=x_t, X_{t-1},\dots,X_0) = \mathbb{P}(X_{t+1}\in A \mid X_t=x_t) \]
untuk setiap himpunan \(A\) dan semua \(t\).
Rantai ini didefinisikan melalui kernel transisi \(K(x, x')\). Jika ada distribusi \(\pi(x)\) sehingga:
\[ \pi(x') = \int \pi(x) K(x,x')\, dx, \]
maka \(\pi\) disebut distribusi stasioner (invariant) dari rantai Markov.
Sifat yang sering digunakan adalah detailed balance:
\[ \pi(x) K(x,x') = \pi(x') K(x',x), \]
yang menjamin bahwa \(\pi\) adalah distribusi stasioner.
Jika rantai Markov tersebut ergodik (irreducible, aperiodic, positif rekuren), maka untuk fungsi yang wajar \(g\):
\[ \frac{1}{N}\sum_{t=1}^N g(X_t) \xrightarrow[N\to\infty]{} \mathbb{E}_{\pi}[g(X)], \]
di mana \(X \sim \pi\).
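Untuk rantai berhingga, sifat stasioner dan detailed balance mudah dicek secara numerik (sketsa dengan kernel 2-keadaan hipotetis):
# Sketsa: rantai Markov 2-keadaan dengan distribusi stasioner pi = (2/3, 1/3)
K <- matrix(c(0.9, 0.1,
              0.2, 0.8), nrow = 2, byrow = TRUE)  # kernel transisi (hipotetis)
pi_stat <- c(2/3, 1/3)
pi_stat %*% K                                    # stasioner: hasilnya kembali (2/3, 1/3)
c(pi_stat[1] * K[1, 2], pi_stat[2] * K[2, 1])    # detailed balance: kedua nilai sama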
Konsep MCMC
Dalam Bayesian, kita ingin mengambil sampel dari distribusi target \(\pi(\theta)\), misalnya posterior:
\[ \pi(\theta) = p(\theta\mid y) = \frac{p(y\mid\theta)p(\theta)}{p(y)}. \]
Ketika \(\pi(\theta)\) tidak dapat disampel langsung (tidak ada bentuk tertutup, dimensi tinggi, dsb.), kita:
Membangun suatu rantai Markov yang distribusi stasionernya adalah \(\pi(\theta)\), kemudian menggunakan sampelnya untuk menghitung ekspektasi/integral secara Monte Carlo.
Secara umum, dengan membuang \(B\) iterasi awal sebagai burn-in:
\[ \hat{\mathbb{E}}_{\pi}[g(\theta)] = \frac{1}{N-B} \sum_{t=B+1}^N g(\theta^{(t)}). \]
Ide Dasar
Metropolis–Hastings adalah algoritma MCMC generik untuk menghasilkan rantai Markov dengan distribusi target \(\pi(\theta)\).
Diberikan proposal \(q(\theta' \mid \theta)\), setiap iterasi terdiri dari tiga langkah:
1. Bangkitkan kandidat
\[ \theta^* \sim q(\theta^* \mid \theta^{(t)}). \]
2. Hitung rasio
\[ r(\theta^{(t)},\theta^*) = \frac{\pi(\theta^*) q(\theta^{(t)}\mid \theta^*)} {\pi(\theta^{(t)}) q(\theta^* \mid \theta^{(t)})}. \]
3. Terima kandidat dengan peluang
\[ \alpha(\theta^{(t)},\theta^*) = \min\{1, r(\theta^{(t)},\theta^*)\}; \]
jika ditolak, tetapkan \(\theta^{(t+1)} = \theta^{(t)}\).
Aturan penerimaan ini menjamin \(\pi\) sebagai distribusi stasioner rantai (memenuhi detailed balance).
Kasus Khusus: Random-Walk Metropolis
Jika proposal simetris:
\[ q(\theta^* \mid \theta^{(t)}) = q(\theta^{(t)} \mid \theta^*), \]
misalnya:
\[ \theta^* = \theta^{(t)} + \varepsilon, \qquad \varepsilon \sim \mathrm{N}(0,\sigma^2), \]
maka:
\[ r(\theta^{(t)},\theta^*) = \frac{\pi(\theta^*)}{\pi(\theta^{(t)})}. \]
Contoh: Target Normal 1-Dimensi
Misal target:
\[ \pi(\theta) = \mathrm{N}(0,1). \]
Kode R:
log_pi <- function(theta) dnorm(theta, 0, 1, log = TRUE)
mh_rw <- function(n_iter = 5000, theta_init = 0, proposal_sd = 0.5) {
theta <- numeric(n_iter)
theta[1] <- theta_init
accept <- 0
for (t in 2:n_iter) {
current <- theta[t-1]
proposal <- rnorm(1, current, proposal_sd)
log_r <- log_pi(proposal) - log_pi(current)
if (log(runif(1)) < log_r) {
theta[t] <- proposal
accept <- accept + 1
} else theta[t] <- current
}
list(theta = theta, acc_rate = accept/(n_iter-1))
}
set.seed(123)
mh_out <- mh_rw(5000, 0, 0.7)
mh_out$acc_rate
## [1] 0.7881576
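Sebagai contoh penggunaan (sketsa), ringkasan sampel setelah burn-in dapat dibandingkan dengan target \(\mathrm N(0,1)\); nama burn_mh di bawah hanyalah pilihan ilustratif:
# Buang burn-in, lalu bandingkan mean/sd sampel dengan target N(0,1)
burn_mh <- 500
th <- mh_out$theta[-(1:burn_mh)]
c(mean = mean(th), sd = sd(th))  # target: mean ~ 0, sd ~ 1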
Jika target \(\pi(\theta_1,\dots,\theta_d)\) memiliki full conditionals yang dapat disampel langsung:
\[ \theta_1 \mid \theta_2,\dots,\theta_d, y, \]
\[ \theta_2 \mid \theta_1,\theta_3,\dots,\theta_d, y, \]
dan seterusnya hingga \(\theta_d \mid \theta_1,\dots,\theta_{d-1}, y\), maka Gibbs sampling bekerja dengan menyampel setiap komponen dari full conditional-nya secara bergantian.
Contoh: Bivariat Normal
Untuk bivariat normal standar (margin \(\mathrm N(0,1)\)) dengan korelasi \(\rho\), dikenal bahwa:
\[ X \mid Y=y \sim \mathrm{N}(\rho y, 1-\rho^2), \qquad Y \mid X=x \sim \mathrm{N}(\rho x, 1-\rho^2). \]
Kode:
library(tibble)  # untuk tibble() pada hasil
gibbs_bvn <- function(n_iter = 5000, rho = 0.8) {
x <- y <- numeric(n_iter)
sd_cond <- sqrt(1-rho^2)
for (t in 2:n_iter) {
x[t] <- rnorm(1, rho*y[t-1], sd_cond)
y[t] <- rnorm(1, rho*x[t], sd_cond)
}
tibble(iter = 1:n_iter, x = x, y = y)
}
set.seed(123)
gibbs_out <- gibbs_bvn(5000, 0.8)
burn <- 1000
gibbs_post <- gibbs_out |> dplyr::filter(iter > burn)
Plot:
gibbs_post |>
ggplot2::ggplot(ggplot2::aes(x = x, y = y)) +
ggplot2::geom_point(alpha = 0.4) +
ggplot2::coord_equal() +
ggplot2::labs(title = "Sampel Gibbs dari Bivariat Normal (rho=0.8)")
Konsep Singkat MCMC
Markov Chain Monte Carlo (MCMC) adalah keluarga metode simulasi yang membangkitkan sampel dari distribusi target (misalnya distribusi posterior Bayesian) ketika bentuk tertutupnya sulit atau tidak tersedia. Gagasan utamanya adalah membangun suatu rantai Markov yang memiliki distribusi stasioner sama dengan distribusi target tersebut.
Dua skema yang paling sering digunakan adalah:
- Gibbs sampling, yang mengambil sampel secara bergantian dari sebaran bersyarat lengkap (full conditional distributions);
- Metropolis–Hastings (MH), yang membangkitkan kandidat dari sebaran proposal lalu menerima atau menolaknya berdasarkan rasio antara nilai distribusi target pada kandidat dan pada keadaan sebelumnya.
Ringkasan MCMC
Struktur ini dapat dijadikan bab pengantar untuk bagian Bayesian computation dalam e-book, dan mudah diperluas ke topik lanjutan seperti Hamiltonian Monte Carlo, adaptive MCMC, dan MCMC untuk model hierarkis berdimensi tinggi.
Bayesian inference (penarikan kesimpulan dalam kerangka probabilistik) semakin banyak digunakan dalam statistik terapan, terutama pada model hierarkis atau model dengan efek laten (latent effects). Namun ada tantangan klasik, yaitu komputasi: banyak model Bayesian yang kompleks tidak punya solusi analitik dan memerlukan teknik sampling seperti MCMC (Markov chain Monte Carlo), yang kadang memakan banyak waktu dan menuntut perhatian khusus terhadap konvergensi.
Metode INLA muncul sebagai salah satu “jalan pintas” yang deterministik dan cepat untuk jenis model tertentu — yakni Latent Gaussian Models (LGMs). Dengan metode ini kita memperoleh aproksimasi yang cukup akurat terhadap distribusi marginal posterior parameter (dan efek laten), tanpa menjalankan sampel MCMC panjang-lebar.
Menurut literatur utama oleh Håvard Rue dan kolega, INLA mampu mengatasi banyak model dengan efisiensi komputasi yang jauh lebih tinggi daripada MCMC tradisional.
Dokumen ini membahas pendekatan Laplace untuk menghampiri integral berjenis
\[ I = \int_{\mathbb{R}^d} \exp\{\ell(\theta)\} \, d\theta, \]
dan membandingkannya dengan integrasi numerik (fungsi integrate() di R) serta metode Monte Carlo (importance sampling).
Fokus kita adalah contoh integral satu dimensi yang cukup sederhana, sehingga semua pendekatan dapat dihitung dan dibandingkan secara eksplisit.
Catatan: Transformasi Awal dari \(f(\theta)\) ke \(\exp(\ell(\theta))\)
Kita mulai dari integral umum bentuk dasar:
\[ I = \int_{\mathbb{R}^d} f(\theta) \, d\theta, \]
dengan syarat \(f(\theta) > 0\). Kita definisikan
\[ \ell(\theta) = \log f(\theta), \]
sehingga berlaku
\[ f(\theta) = \exp(\ell(\theta)) = \exp(\log f(\theta)). \]
Dengan memasukkan kembali ke integral awal, kita peroleh:
\[ \begin{aligned} I &= \int_{\mathbb{R}^d} f(\theta)\, d\theta \\\\ &= \int_{\mathbb{R}^d} \exp(\log f(\theta))\, d\theta \\\\ &= \int_{\mathbb{R}^d} \exp\{\ell(\theta)\}\, d\theta. \end{aligned} \]
Bentuk ini menjadi dasar pendekatan Laplace approximation, karena kita akan mengaproksimasi \(\ell(\theta)\) dengan ekspansi Taylor orde dua di sekitar mode-nya.
Misalkan kita ingin menghitung
\[ I = \int_{-\infty}^{\infty} \exp\{\ell(x)\} \, dx, \]
dengan \(\ell(x)\) halus (dapat diturunkan beberapa kali) dan memiliki mode unik di \(x = \hat x\). Artinya, \(\ell'(\hat x) = 0\) dan \(\ell''(\hat x) < 0\).
Di sekitar \(\hat x\), kita dapat melakukan ekspansi Taylor orde dua:
\[ \ell(x) \approx \ell(\hat x) + \frac{1}{2} \ell''(\hat x) (x - \hat x)^2. \]
Karena \(\ell''(\hat x) < 0\), tulis
\[ \ell''(\hat x) = -a, \quad a > 0. \]
Sehingga
\[ \ell(x) \approx \ell(\hat x) - \frac{a}{2}(x - \hat x)^2. \]
Akibatnya,
\[ \exp\{\ell(x)\} \approx \exp\{\ell(\hat x)\} \exp\left\{-\frac{a}{2}(x - \hat x)^2\right\}, \]
yang merupakan kernel distribusi Normal dengan varian \(1/a\).
Integralnya menjadi
\[ I \approx \exp\{\ell(\hat x)\} \int_{-\infty}^{\infty} \exp\left\{-\frac{a}{2}(x - \hat x)^2\right\} dx = \exp\{\ell(\hat x)\} \sqrt{\frac{2\pi}{a}}. \]
Karena \(a = -\ell''(\hat x)\), rumus Laplace satu dimensi adalah
\[ \boxed{ I_{\text{Lap}} \approx \exp\{\ell(\hat x)\} \sqrt{\frac{2\pi}{-\ell''(\hat x)}} }. \]
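Rumus ini dapat dibungkus sebagai fungsi R kecil (sketsa; laplace_1d adalah nama hipotetis, turunan kedua dihampiri dengan beda hingga terpusat):
# Sketsa Laplace 1-D: modus via optimize(), turunan kedua via beda hingga
laplace_1d <- function(ell, lower, upper, h = 1e-4) {
  xhat <- optimize(ell, c(lower, upper), maximum = TRUE)$maximum
  d2 <- (ell(xhat + h) - 2 * ell(xhat) + ell(xhat - h)) / h^2  # aproksimasi ell''(xhat)
  exp(ell(xhat)) * sqrt(2 * pi / (-d2))
}
# uji pada contoh Gaussian: ell(x) = -(x-1)^2/2, hasil teoretis sqrt(2*pi) ~ 2.5066
laplace_1d(function(x) -(x - 1)^2 / 2, -10, 10)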
Jika \(\theta \in \mathbb{R}^d\) dan \(\ell(\theta)\) mempunyai mode \(\hat\theta\), dengan Hessian \(H(\hat\theta)\) (matriks turunan kedua), maka di sekitar \(\hat\theta\):
\[ \ell(\theta) \approx \ell(\hat\theta) - \frac{1}{2} (\theta - \hat\theta)^T A (\theta - \hat\theta), \]
dengan \(A = -H(\hat\theta)\) positif-definit. Integral
\[ I = \int_{\mathbb{R}^d} \exp\{\ell(\theta)\} \, d\theta \]
dihampiri oleh
\[ \boxed{ I_{\text{Lap}} \approx \exp\{\ell(\hat\theta)\} (2\pi)^{d/2} |A|^{-1/2} = \exp\{\ell(\hat\theta)\} (2\pi)^{d/2} \big| -H(\hat\theta) \big|^{-1/2} }. \]
Dalam konteks Bayesian, \(\ell(\theta)\) sering berupa log-posterior (log-likelihood + log-prior) dan \(I\) adalah konstanta normalisasi atau marginal likelihood.
Pertama kita ambil contoh di mana pendekatan Laplace menghasilkan nilai eksak karena bentuk integrandanya memang Gaussian.
Pertimbangkan integral
\[ I_1 = \int_{-\infty}^{\infty} \exp\left\{ -\frac{(x-1)^2}{2} \right\} dx. \]
Kita definisikan
\[ \ell(x) = -\frac{(x-1)^2}{2}. \]
Dari rumus satu dimensi:
\[ I_{1,\text{Lap}} = \exp\{\ell(1)\} \sqrt{\frac{2\pi}{-\ell''(1)}} = \exp\{0\} \sqrt{\frac{2\pi}{1}} = \sqrt{2\pi}. \]
Secara umum, untuk \(\mu \in \mathbb{R}\), \(\sigma > 0\),
\[ \int_{-\infty}^{\infty} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\} dx = \sqrt{2\pi}\,\sigma. \]
Di sini \(\mu = 1\) dan \(\sigma = 1\), sehingga
\[ I_1 = \sqrt{2\pi}. \]
Artinya: untuk integran yang memang Gaussian, pendekatan Laplace memberikan nilai yang eksak, \(I_{1,\text{Lap}} = I_1 = \sqrt{2\pi}\).
Sekarang kita ambil integrand yang sedikit lebih kompleks:
\[ I_2 = \int_{-\infty}^{\infty} \exp\left\{ -\frac{x^2}{2} - \frac{x^4}{8} \right\} dx. \]
Ini bisa dilihat sebagai “Gaussian” dengan tambahan penalti \(x^4\) yang membuat ekor turun lebih cepat dan puncak menjadi sedikit berbeda dari Gaussian murni.
Definisikan
\[ \ell(x) = -\frac{x^2}{2} - \frac{x^4}{8}. \]
Jelas fungsi ini simetris dan bernilai maksimum di \(x = 0\), karena \(\ell'(x) = -x - \tfrac{x^3}{2}\) sehingga \(\ell'(0) = 0\). Di mode \(x = 0\):
\[ \ell''(0) = -1. \]
Dengan \(\hat x = 0\), \(\ell(0) = 0\), dan \(\ell''(0) = -1\), diperoleh:
\[ I_{2,\text{Lap}} = \exp\{0\} \sqrt{\frac{2\pi}{1}} = \sqrt{2\pi} \approx 2.5066. \]
Integral \(I_2\) tidak punya bentuk elementer yang sederhana, namun bisa dihitung dengan fungsi integrate() di R yang menggunakan metode numerik adaptif.
f2 <- function(x) {
exp(-x^2 / 2 - x^4 / 8)
}
res_integrate <- integrate(f2, lower = -Inf, upper = Inf, rel.tol = 1e-10)
res_integrate
## 2.101961 with absolute error < 1.3e-11
I2_true <- res_integrate$value
I2_true
## [1] 2.101961
Nilai I2_true ini dapat dianggap sebagai nilai yang mendekati eksak (hasil numerik presisi tinggi).
Secara simbolik kita sudah hitung bahwa \(I_{2,\text{Lap}} = \sqrt{2\pi}\).
Untuk keperluan perbandingan numerik, kita hitung juga di R:
I2_laplace <- sqrt(2 * pi)
I2_laplace
## [1] 2.506628
Sekarang kita dekati \(I_2\) menggunakan importance sampling. Ide dasarnya:
\[ I_2 = \int_{-\infty}^{\infty} \exp\left\{ -\frac{x^2}{2} - \frac{x^4}{8} \right\} dx. \]
Pilih proposal \(q(x)\) yang mudah disampel, misalnya Normal standar:
\[ q(x) = \phi(x) = \frac{1}{\sqrt{2\pi}} \exp\left\{ -\frac{x^2}{2} \right\}. \]
Tuliskan
\[ I_2 = \int \frac{\exp\left\{ -\frac{x^2}{2} - \frac{x^4}{8} \right\}}{q(x)}\, q(x) \, dx = \mathbb{E}_q[w(X)], \]
dengan bobot
\[ w(x) = \frac{\exp\left\{ -\frac{x^2}{2} - \frac{x^4}{8} \right\}} {\phi(x)}. \]
Secara praktis:
\[ \hat I_{2,\text{MC}} = \frac{1}{N} \sum_{s=1}^N w^{(s)}. \]
Implementasi di R:
set.seed(123)
N <- 100000
x <- rnorm(N) # proposal q(x) = N(0,1)
# density Normal standar
phi <- function(x) {
dnorm(x, mean = 0, sd = 1)
}
w <- exp(-x^2 / 2 - x^4 / 8) / phi(x)
I2_mc <- mean(w)
I2_mc
## [1] 2.101911
Kita juga dapat menghitung simpangan baku Monte Carlo:
se_mc <- sd(w) / sqrt(N)
se_mc
## [1] 0.002080523
Artinya, secara kira-kira \(I_{2,\text{MC}} \pm 2 \times \text{se}\) memberikan interval kepercayaan 95% (asumsi CLT) untuk nilai integral yang sesungguhnya.
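Sebagai contoh penggunaan, interval tersebut dapat dihitung langsung dari I2_mc dan se_mc:
# Interval kepercayaan ~95% (aturan 2 x standard error, asumsi CLT)
c(lower = I2_mc - 2 * se_mc, upper = I2_mc + 2 * se_mc)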
Kita rangkum ketiga pendekatan untuk integral \(I_2\):
data.frame(
Metode = c("Integrate (analitik numerik)",
"Laplace approximation",
"Monte Carlo (importance sampling)"),
Nilai = c(I2_true, I2_laplace, I2_mc)
)
## Metode Nilai
## 1 Integrate (analitik numerik) 2.101961
## 2 Laplace approximation 2.506628
## 3 Monte Carlo (importance sampling) 2.101911
Kita juga dapat menghitung error relatif Laplace dan Monte Carlo terhadap nilai numerik presisi tinggi:
err_laplace <- (I2_laplace - I2_true) / I2_true
err_mc <- (I2_mc - I2_true) / I2_true
c(
error_rel_Laplace = err_laplace,
error_rel_MC = err_mc
)
## error_rel_Laplace error_rel_MC
## 1.925190e-01 -2.391327e-05
Interpretasi umum: integrasi numerik (fungsi integrate() di R) memberikan aproksimasi yang sangat presisi untuk integral satu dimensi ini, sehingga dapat dianggap sebagai "ground truth". Aproksimasi Laplace meleset sekitar 19% karena integran bukan Gaussian murni, sedangkan importance sampling dengan \(N\) besar praktis menyamai nilai numerik.
Dalam kerangka Bayesian, bentuk integral seperti
\[ I = \int \exp\{\ell(\theta)\} \, d\theta \]
sering muncul sebagai konstanta normalisasi posterior atau marginal likelihood (evidence).
Pendekatan Laplace banyak digunakan untuk menghampiri integral semacam ini secara cepat dan deterministik, terutama bila integran mendekati Gaussian di sekitar modusnya.
Di sisi lain, MCMC tidak menghitung konstanta normalisasi secara eksplisit, tetapi langsung membangkitkan sampel dari posterior yang hanya diketahui sampai suatu konstanta pengali.
Untuk menghampiri integral seperti \(I_2\) di atas, teknik Monte Carlo (importance sampling, MCMC, dsb.) memandang integral sebagai ekspektasi di bawah suatu distribusi, lalu menggunakan rata-rata sampel untuk menghitungnya.
Secara garis besar, galat pendekatan Monte Carlo menyusut dengan laju \(O(N^{-1/2})\) seiring bertambahnya sampel.
Penutup
Dokumen ini menunjukkan bagaimana pendekatan Laplace, integrasi numerik (integrate()), dan Monte Carlo (importance sampling) dapat dipakai untuk menghampiri integral yang sama dan dibandingkan secara eksplisit.
Contoh sederhana ini dapat dengan mudah dikembangkan menjadi latihan praktikum, misalnya dengan mengganti bentuk integran atau memvariasikan banyaknya sampel \(N\).
Sebelum membahas INLA, mari kita lihat ide dasar Laplace Approximation — ini merupakan salah satu “blok bangunan” utama.
Ide dasar
Dalam Bayesian, sering kali kita menemui integral atau distribusi yang kompleks seperti:
\[ p(\theta \mid y) \propto p(y \mid \theta)\,p(\theta). \]
Jika distribusi itu sulit dihitung langsung, Laplace Approximation mencoba menggantinya dengan aproksimasi Gaussian yang “terpusat” pada modus distribusi (mode), dengan penyebaran yang dikaitkan dengan curvature (Hessian) dari log-densitas di modus.
Secara garis besar: cari modus distribusi, lakukan ekspansi Taylor orde dua pada log-densitas di sekitar modus, lalu gunakan distribusi Gaussian yang dihasilkan sebagai pengganti distribusi aslinya.
Kapan dan kenapa berguna
Konsep INLA (Integrated Nested Laplace Approximation)
Setelah memahami Laplace secara umum, mari masuk ke INLA — yaitu bagaimana metode ini memanfaatkan ide-ide Laplace untuk inference Bayesian pada model laten Gaussian.
Model yang disasar
INLA ditujukan pada model yang disebut Latent Gaussian Models (LGMs):
\[ \mathrm y \mid \mathrm x, \boldsymbol\theta \sim \prod_i p(y_i \mid x_i, \boldsymbol\theta), \]
\[ \mathrm x \mid \boldsymbol\theta \sim \text{Gaussian Markov Random Field (GMRF)}(\boldsymbol\theta), \]
\[ \boldsymbol\theta \sim p(\boldsymbol\theta). \]
Kelas LGMs ini sangat luas dan mencakup banyak model populer seperti generalized additive mixed models, model spasial, dan model spatio-temporal.
Mekanisme “integrated” dan “nested”
Tujuan utama INLA adalah memperoleh marginals seperti:
\[ \pi(x_i \mid \mathrm y), \qquad \pi(\theta_j \mid \mathrm y). \]
Daripada melakukan sampling penuh dari \(\pi(\mathrm x, \boldsymbol\theta \mid \mathrm y)\), INLA melakukan dua tahap aproksimasi secara nested:
Untuk setiap nilai \(\boldsymbol\theta_k\), aproksimasi \[ \pi(\mathrm x \mid \boldsymbol\theta_k, \mathrm y) \] menggunakan Laplace (atau varian yang lebih sederhana / lebih kasar).
Integrasi atas \(\boldsymbol\theta\) dilakukan secara numerik (grid, desain komposit, dsb.) untuk mendapatkan marginals: \[ \pi(x_i \mid \mathrm y), \qquad \pi(\theta_j \mid \mathrm y). \]
Dari sini muncul istilah Integrated Nested Laplace Approximation:
- Integrated → melakukan integrasi numerik atas hyperparameter \(\boldsymbol\theta\);
- Nested → aproksimasi Laplace di-nest untuk efek laten \(\mathrm x\) di dalam tiap nilai \(\boldsymbol\theta\);
- Laplace → menggunakan Laplace Approximation pada tahap kunci.
Kelebihan dan kondisi
Kelebihan: komputasi cepat dan deterministik (tanpa rantai sampling beserta diagnostik konvergensinya), dengan akurasi yang baik untuk kelas LGM.
Keterbatasan / kondisi: model harus dapat dinyatakan sebagai Latent Gaussian Model, dan banyaknya hyperparameter \(\boldsymbol\theta\) sebaiknya kecil agar integrasi numerik atasnya tetap murah.
Contoh Sketsa Laplace dalam Bayesian
Untuk memperjelas ide Laplace tanpa langsung masuk ke INLA, berikut contoh sketsa masalah Bayesian sederhana.
Masalah umum
Misalkan kita punya parameter \(\theta\) dan data \(y\), dengan posterior:
\[ p(\theta \mid y) \propto p(y \mid \theta)\,p(\theta). \]
Bila bentuk ini tidak memiliki solusi tertutup, kita dapat melakukan aproksimasi Laplace.
Langkah Laplace
Secara ringkas: (1) cari modus \(\hat\theta\) yang memaksimumkan log-posterior \(\ell(\theta)\); (2) hitung kurvatur (turunan kedua) \(\ell''(\hat\theta)\); (3) aproksimasi posterior dengan \(\mathrm{N}\big(\hat\theta,\ [-\ell''(\hat\theta)]^{-1}\big)\).
Contoh Model Regresi Sederhana
Model dan Data Sederhana
Kita mulai dari model regresi linear tanpa intercept:
\[ y_i = \beta x_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2). \]
Kita gunakan data kecil (supaya efek prior lebih terasa):
\[ x = (1,2,3,4,5), \qquad y \approx (1.1, 2.4, 2.9, 3.8, 5.2). \]
x <- 1:5
y <- c(1.1, 2.4, 2.9, 3.8, 5.2)
dat <- data.frame(x = x, y = y)
dat
## x y
## 1 1 1.1
## 2 2 2.4
## 3 3 2.9
## 4 4 3.8
## 5 5 5.2
Plot data:
plot(x, y, pch = 16,
xlab = "x",
ylab = "y",
main = "Data Regresi Linear Sederhana")
Sebagai pembanding, kita hitung dulu estimator OLS (tanpa intercept):
\[ \hat\beta_{\text{OLS}} = \frac{\sum_i x_i y_i}{\sum_i x_i^2}. \]
beta_ols <- sum(x * y) / sum(x^2)
beta_ols
## [1] 1.014545
Prior dan Posterior (Pendekatan Manual dengan Laplace)
Kita gunakan prior Normal sederhana untuk slope:
\[ \beta \sim N(0, \tau^2), \qquad \tau^2 = 10^2. \]
Untuk kemudahan, varians error \(\sigma^2\) kita dekati dengan varians residual OLS (plug-in):
# residu dari OLS tanpa intercept
resid_ols <- y - beta_ols * x
sigma2_hat <- mean(resid_ols^2)
tau2 <- 100 # prior variance
sigma2_hat
## [1] 0.04967273
Posterior (dengan \(\sigma^2\) dianggap diketahui) menjadi:
\[ p(\beta \mid y) \propto \exp\left\{ -\frac{1}{2\sigma^2} \sum_i (y_i - \beta x_i)^2 -\frac{1}{2\tau^2} \beta^2 \right\}. \]
Ini adalah posterior Gaussian, sehingga Laplace Approximation menjadi eksak.
Mean (modus) dan varians posterior:
\[ v_n = \left( \frac{1}{\tau^2} + \frac{\sum_i x_i^2}{\sigma^2} \right)^{-1}, \]
\[ m_n = v_n \left( \frac{\sum_i x_i y_i}{\sigma^2} + \frac{0}{\tau^2} \right) = v_n \left( \frac{\sum_i x_i y_i}{\sigma^2} \right). \]
S_x2 <- sum(x^2)
S_xy <- sum(x * y)
tau2 <- 100 # prior variance
sigma2 <- sigma2_hat # plug-in
v_n <- 1 / ( (1 / tau2) + S_x2 / sigma2 )
m_n <- v_n * (S_xy / sigma2)
beta_post_mean_manual <- m_n
beta_post_sd_manual <- sqrt(v_n)
c(beta_post_mean_manual, beta_post_sd_manual)
## [1] 1.01453629 0.03005216
Jadi, secara manual dengan pendekatan Gaussian (Laplace), kita peroleh posterior mean \(m_n \approx 1.0145\) dan posterior sd \(\sqrt{v_n} \approx 0.0301\).
Analisis yang Sama dengan INLA
Sekarang kita gunakan INLA dengan spesifikasi model yang sama.
Dalam INLA, prior Normal ditentukan melalui mean dan precision (kebalikan varians):
\[ \text{precision} = \frac{1}{\tau^2} = \frac{1}{100}. \]
library(INLA)
res_inla <- inla(
y ~ 0 + x, # 0+ x: tidak ada intercept, hanya slope x
data = dat,
family = "gaussian",
control.fixed = list(
mean = 0, # prior mean untuk koefisien x
prec = 1 / 100 # prior precision (var = 100)
),
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE, cpo = TRUE)
)
res_inla$summary.fixed
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## x 1.014535 0.03202779 0.9497755 1.014537 1.079282 1.014536 494.3283
Ekstrak mean dan sd posterior untuk \(\beta\) dari INLA:
beta_post_mean_inla <- res_inla$summary.fixed["x", "mean"]
beta_post_sd_inla <- res_inla$summary.fixed["x", "sd"]
c(beta_post_mean_inla, beta_post_sd_inla)
## [1] 1.01453500 0.03202779
Perbandingan Manual vs INLA
Kita bandingkan posterior mean dan sd dari kedua pendekatan:
comparison <- data.frame(
Method = c("Manual (Laplace)", "INLA"),
Mean = c(beta_post_mean_manual, beta_post_mean_inla),
SD = c(beta_post_sd_manual, beta_post_sd_inla)
)
comparison
## Method Mean SD
## 1 Manual (Laplace) 1.014536 0.03005216
## 2 INLA 1.014535 0.03202779
Dalam kasus sederhana ini, posterior mean dari kedua pendekatan hampir identik; perbedaan kecil pada sd muncul karena INLA turut memperhitungkan ketidakpastian presisi observasi, sedangkan perhitungan manual memakai \(\sigma^2\) plug-in.
Visualisasi Garis Regresi
Kita visualisasikan data dan garis regresi dari OLS (sebagai aproksimasi "true"), posterior mean manual (Laplace), dan posterior mean INLA:
beta_true <- beta_ols # gunakan OLS sebagai "approx true"
x_grid <- seq(0, max(x) * 1.1, length.out = 100)
y_true <- beta_true * x_grid
y_man <- beta_post_mean_manual * x_grid
y_inla <- beta_post_mean_inla * x_grid
plot(x, y,
pch = 16,
xlab = "x",
ylab = "y",
main = "Regresi Linear: Manual vs INLA",
xlim = c(0, max(x_grid)),
ylim = range(c(y, y_true, y_man, y_inla))
)
lines(x_grid, y_true, lty = 2, lwd = 2, col = "darkgray")
lines(x_grid, y_man, col = "blue", lwd = 2)
lines(x_grid, y_inla, col = "red", lwd = 2)
legend("topleft",
legend = c("Data",
"Garis (approx true, OLS)",
"Posterior mean (Manual Laplace)",
"Posterior mean (INLA)"),
col = c("black", "darkgray", "blue", "red"),
pch = c(16, NA, NA, NA),
lty = c(NA, 2, 1, 1),
lwd = c(NA, 2, 2, 2),
bty = "n")
Intuisi dan Kaitan dengan INLA yang Lebih Umum
Pada contoh ini, posterior untuk \(\beta\) berbentuk Gaussian, sehingga aproksimasi Laplace bersifat eksak dan hasil manual hampir identik dengan INLA.
INLA melakukan hal yang serupa, namun dalam konteks yang jauh lebih umum: banyak efek laten sekaligus, hyperparameter yang diintegrasikan secara numerik, dan likelihood yang tidak harus Gaussian.
Contoh kecil ini memberi gambaran bahwa inti INLA adalah aproksimasi Gaussian (Laplace) yang dibangun di sekitar modus.
Dengan kata lain, apa yang kita lakukan di sini "secara manual" untuk satu parameter \(\beta\), INLA lakukan secara otomatis dan elegan untuk banyak parameter sekaligus.
Regresi Bayesian adalah pendekatan inferensi yang menggabungkan prior information dengan data observasi melalui Teorema Bayes. Fokus utama adalah pada distribusi posterior dari parameter, yang mencerminkan ketidakpastian parameter setelah melihat data.
Motivasi utama: memperoleh distribusi penuh atas parameter (bukan sekadar titik estimasi) dan memasukkan informasi prior secara eksplisit.
Misalkan model regresi linear Gaussian:
\[ y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + \varepsilon_i,\quad \varepsilon_i \sim \mathrm{N}(0,\sigma^2). \]
Dalam kerangka Bayesian, tentukan:
- Likelihood (untuk vektor \(y\) berukuran \(n\) dan matriks desain \(X\) berukuran \(n\times p\)):
\[ p(y\mid \beta,\sigma^2) \propto (\sigma^2)^{-\frac{n}{2}}\exp\Big\{-\tfrac{1}{2\sigma^2}(y-X\beta)'(y-X\beta)\Big\}. \]
- Prior konjugat Normal–Inverse-Gamma:
\[ \beta\mid\sigma^2 \sim \mathrm{N}(\mu_0,\, \sigma^2\,\Sigma_0),\qquad \sigma^2 \sim \text{IG}(a_0,b_0). \]
- Posterior, yang kembali berbentuk Normal–Inverse-Gamma dengan pembaruan parameter:
\[ \begin{aligned} \Sigma_n &= (\Sigma_0^{-1} + X'X)^{-1} \\ \mu_n &= \Sigma_n(\Sigma_0^{-1}\mu_0 + X'y) \\ a_n &= a_0 + \tfrac{n}{2} \\ b_n &= b_0 + \tfrac{1}{2}\big(y' y + \mu_0'\Sigma_0^{-1}\mu_0 - \mu_n'\Sigma_n^{-1}\mu_n\big) \end{aligned} \]
Distribusi bersyaratnya: \(\beta\mid\sigma^2,y\sim\mathrm{N}(\mu_n,\sigma^2\Sigma_n)\) dan \(\sigma^2\mid y\sim \text{IG}(a_n,b_n)\).
Catatan: Formulasi ini mudah diperluas ke prior ridge/lasso Bayesian (mis. prior Laplace untuk lasso) dan model hierarkis.
Kita mulai dari model regresi linear klasik:
\[ y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + \varepsilon_i, \qquad \varepsilon_i \sim \mathrm{N}(0, \sigma^2). \]
Dalam bentuk matriks:
\[ y \mid \beta, \sigma^2 \sim \mathrm{N}(X\beta,\; \sigma^2 I_n), \]
dengan:
Kita gunakan prior konjugat:
\[ \beta \mid \sigma^2 \sim \mathrm{N}(\mu_0,\; \sigma^2 \Sigma_0), \qquad \sigma^2 \sim \operatorname{IG}(a_0, b_0), \]
di mana \(\operatorname{IG}(a_0,b_0)\) adalah distribusi Inverse-Gamma dengan densitas (untuk \(x>0\)):
\[ p(x) \propto x^{-(a_0+1)} \exp\bigg(-\frac{b_0}{x}\bigg). \]
Prior ini umum digunakan karena menghasilkan bentuk posterior yang tetap berada dalam keluarga Normal–Inverse-Gamma.
Dengan menggabungkan likelihood dan prior, didapat:
\[ \beta \mid \sigma^2, y \sim \mathrm{N}(\mu_n,\; \sigma^2 \Sigma_n), \]
dengan:
\[ \Sigma_n = \big(\Sigma_0^{-1} + X^{\top}X\big)^{-1}, \]
\[ \mu_n = \Sigma_n\big(\Sigma_0^{-1} \mu_0 + X^{\top} y\big). \]
Perhatikan bahwa \(\Sigma_n\) dan vektor tengah \(\mu_n\) bisa dihitung sekali di awal (tidak berubah antar iterasi).
Untuk \(\sigma^2\), kita peroleh:
\[ \sigma^2 \mid \beta, y \sim \operatorname{IG}(a^{\*}, b^{\*}), \]
dengan:
\[ a^{\*} = a_0 + \frac{n}{2}, \]
\[ b^{\*} = b_0 + \frac{1}{2} \Big[ (y - X\beta)^{\top}(y - X\beta) + (\beta - \mu_0)^{\top} \Sigma_0^{-1} (\beta - \mu_0) \Big]. \]
Jadi, kedua full conditional memiliki bentuk yang sangat mudah disampling.
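Catatan kecil sebelum implementasi: membangkitkan Inverse-Gamma lewat kebalikan Gamma mudah dicek (sketsa dengan \(a=3,\ b=2\) hipotetis; mean teoretis IG adalah \(b/(a-1)\) untuk \(a>1\)):
# Sketsa: IG(a, b) disampel sebagai 1 / Gamma(a, rate = b)
set.seed(1)
a <- 3; b <- 2
draws_ig <- 1 / rgamma(1e5, shape = a, rate = b)
mean(draws_ig)  # teoretis: b/(a-1) = 1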
Misalkan kita ingin menghasilkan \(M\) sampel MCMC \(\{(\beta^{(t)}, \sigma^{2\,(t)})\}_{t=1}^M\) dari posterior \(p(\beta,\sigma^2\mid y)\).
Langkah 0 — Inisialisasi
Pilih nilai awal \(\beta^{(0)}\) dan \(\sigma^{2\,(0)}\) (pada implementasi di bawah: \(\beta^{(0)} = 0\) dan \(\sigma^{2\,(0)} = 1\)).
Untuk \(t = 1,2,\dots,M\):
Update \(\beta^{(t)}\) dari \(p(\beta\mid\sigma^2,y)\)
Sampel:
\[ \beta^{(t)} \sim \mathrm{N}\big(\mu_n,\; \sigma^{2\,(t-1)} \Sigma_n\big). \]
Update \(\sigma^{2\,(t)}\) dari \(p(\sigma^2\mid\beta,y)\)
Hitung residual dan kuantitas kuadrat:
\[ r^{(t)} = y - X\beta^{(t)}, \]
\[ Q^{(t)} = \big(r^{(t)}\big)^{\top} r^{(t)} + \big(\beta^{(t)} - \mu_0\big)^{\top} \Sigma_0^{-1} \big(\beta^{(t)} - \mu_0\big). \]
Lalu:
\[ a^{\*} = a_0 + \frac{n}{2}, \qquad b^{\*} = b_0 + \frac{1}{2} Q^{(t)}. \]
Sampel:
\[ \sigma^{2\,(t)} \sim \operatorname{IG}(a^{\*}, b^{\*}). \]
Simpan pasangan \((\beta^{(t)}, \sigma^{2\,(t)})\).
Setelah itu, buang beberapa iterasi awal sebagai burn-in dan gunakan sisanya untuk inferensi.
Data Simulasi
Menyiapkan Data
Kita simulasi data dari regresi dengan dua prediktor (plus intercept).
set.seed(123)
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
X <- cbind(1, x1, x2) # kolom pertama: intercept
p <- ncol(X)
beta_true <- c(1, 2, -1) # beta0, beta1, beta2
sigma2_true <- 0.5
y <- as.numeric(X %*% beta_true + rnorm(n, sd = sqrt(sigma2_true)))
head(cbind(y, X))
## y x1 x2
## [1,] 2.144249 1 -0.56047565 -0.71040656
## [2,] 1.210777 1 -0.23017749 0.25688371
## [3,] 4.176623 1 1.55870831 -0.24669188
## [4,] 1.872656 1 0.07050839 -0.34754260
## [5,] 1.917211 1 0.12928774 -0.95161857
## [6,] 4.138400 1 1.71506499 -0.04502772
Sebagai perbandingan, kita lihat dulu hasil OLS:
fit_ols <- lm(y ~ x1 + x2)
summary(fit_ols)
##
## Call:
## lm(formula = y ~ x1 + x2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.3244 -0.4672 -0.0880 0.4394 1.4707
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.09551 0.06798 16.11 <2e-16 ***
## x1 1.90583 0.07415 25.70 <2e-16 ***
## x2 -0.98316 0.07000 -14.04 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6727 on 97 degrees of freedom
## Multiple R-squared: 0.9023, Adjusted R-squared: 0.9003
## F-statistic: 447.9 on 2 and 97 DF, p-value: < 2.2e-16
beta_true
## [1] 1 2 -1
sigma2_true
## [1] 0.5
Menentukan Hyperparameter Prior
Kita pilih prior yang relatif lemah-informatif:
mu0 <- rep(0, p)
Sigma0 <- diag(10, p) # varians prior besar -> lemah informatif
a0 <- 2
b0 <- 1
Implementasi Gibbs Sampler
Kita implementasikan fungsi Gibbs untuk model ini.
Inverse-Gamma \(\operatorname{IG}(a,b)\) dapat dihasilkan sebagai \(1 / \text{Gamma}(a, \text{rate}=b)\).
library(MASS) # untuk mvrnorm
library(dplyr)
library(ggplot2)
gibbs_reg <- function(y, X, mu0, Sigma0, a0, b0,
n_iter = 6000,
beta_init = NULL,
sigma2_init = 1) {
n <- length(y)
p <- ncol(X)
if (is.null(beta_init)) beta_init <- rep(0, p)
Sigma0_inv <- solve(Sigma0)
XtX <- t(X) %*% X
# komponen yang tetap
Sigma_n <- solve(Sigma0_inv + XtX)
mu_n <- Sigma_n %*% (Sigma0_inv %*% mu0 + t(X) %*% y)
beta <- matrix(NA_real_, nrow = n_iter, ncol = p)
sigma2 <- numeric(n_iter)
beta[1, ] <- beta_init
sigma2[1] <- sigma2_init
for (t in 2:n_iter) {
# 1) update beta | sigma2, y
Sigma_beta <- as.numeric(sigma2[t-1]) * Sigma_n
beta[t, ] <- as.numeric(MASS::mvrnorm(1,
mu = as.vector(mu_n),
Sigma = Sigma_beta))
# 2) update sigma2 | beta, y
resid <- y - X %*% beta[t, ]
quad_beta <- t(beta[t, ] - mu0) %*% Sigma0_inv %*% (beta[t, ] - mu0)
a_star <- a0 + n/2
b_star <- b0 + 0.5 * (sum(resid^2) + quad_beta)
sigma2[t] <- 1 / rgamma(1, shape = a_star, rate = b_star)
}
list(beta = beta, sigma2 = sigma2)
}
Menjalankan Gibbs Sampler
n_iter <- 6000
burn <- 1000
set.seed(123)
g_out <- gibbs_reg(y, X, mu0, Sigma0, a0, b0,
n_iter = n_iter,
beta_init = rep(0, p),
sigma2_init = 1)
beta_chain <- g_out$beta
sigma2_chain <- g_out$sigma2
# Buang burn-in
beta_post <- beta_chain[(burn+1):n_iter, , drop = FALSE]
sigma2_post <- sigma2_chain[(burn+1):n_iter]
Ringkasan Posterior
post_beta_mean <- colMeans(beta_post)
post_beta_ci <- apply(beta_post, 2, quantile, probs = c(0.025, 0.975))
post_sigma2_mean <- mean(sigma2_post)
post_sigma2_ci <- quantile(sigma2_post, probs = c(0.025, 0.975))
post_beta_mean
## [1] 1.0938133 1.9015148 -0.9827877
post_beta_ci
## [,1] [,2] [,3]
## 2.5% 0.9555183 1.754318 -1.1260956
## 97.5% 1.2303651 2.050764 -0.8426664
post_sigma2_mean
## [1] 0.4714988
post_sigma2_ci
## 2.5% 97.5%
## 0.3574618 0.6140671
Bandingkan dengan parameter sebenarnya:
beta_true
## [1] 1 2 -1
sigma2_true
## [1] 0.5
Traceplot dan Histogram
beta_df <- as.data.frame(beta_chain)
names(beta_df) <- paste0("beta_", 0:(p-1))
beta_df$iter <- 1:n_iter
ggplot(beta_df, aes(x = iter, y = beta_0)) +
geom_line() +
labs(title = "Traceplot beta_0", x = "Iterasi", y = expression(beta[0]))
beta0_post <- beta_post[, 1]
ggplot(data.frame(beta0 = beta0_post), aes(x = beta0)) +
geom_histogram(aes(y = ..density..), bins = 40, alpha = 0.7) +
geom_vline(xintercept = beta_true[1], linetype = "dashed", color = "red") +
labs(title = "Posterior beta_0", x = expression(beta[0]), y = "Densitas")
ggplot(data.frame(sigma2 = sigma2_post), aes(x = sigma2)) +
geom_histogram(aes(y = ..density..), bins = 40, alpha = 0.7) +
geom_vline(xintercept = sigma2_true, linetype = "dashed", color = "red") +
labs(title = "Posterior sigma^2", x = expression(sigma^2), y = "Densitas")
Dokumen ini memperlihatkan bagaimana Gibbs sampler untuk regresi linear Bayesian dapat diturunkan dari full conditional Normal dan Inverse-Gamma, diimplementasikan dari nol di R, dan menghasilkan posterior yang konsisten dengan parameter sebenarnya. Sebagai pembanding, berikut analisis yang sama menggunakan rjags.
# Paket
suppressPackageStartupMessages({
library(coda)
library(rjags) # install.packages('rjags') jika perlu
})
set.seed(123)
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
X <- cbind(1, x1, x2)
p <- ncol(X)
beta_true <- c(1, 2, -1)
sigma2_true <- 0.5
y <- as.numeric(X %*% beta_true + rnorm(n, sd = sqrt(sigma2_true)))
dat <- data.frame(y, x1, x2)
summary(dat)
## y x1 x2
## Min. :-3.533 Min. :-2.30917 Min. :-2.0532
## 1st Qu.: 0.308 1st Qu.:-0.49385 1st Qu.:-0.8011
## Median : 1.160 Median : 0.06176 Median :-0.2258
## Mean : 1.374 Mean : 0.09041 Mean :-0.1075
## 3rd Qu.: 2.708 3rd Qu.: 0.69182 3rd Qu.: 0.4678
## Max. : 6.448 Max. : 2.18733 Max. : 3.2410
Model RJAGS
library(rjags)
library(coda)
data_jags <- list(
y = dat$y,
x1 = dat$x1,
x2 = dat$x2,
n = nrow(dat)
)
model_string <- "
model {
for (i in 1:n) {
y[i] ~ dnorm(mu[i], tau)
mu[i] <- beta0 + beta1 * x1[i] + beta2 * x2[i]
}
beta0 ~ dnorm(0, 0.01)
beta1 ~ dnorm(0, 0.01)
beta2 ~ dnorm(0, 0.01)
tau ~ dgamma(0.01, 0.01)
sigma2 <- 1/tau
}
"
jm <- jags.model(textConnection(model_string), data=data_jags,
n.chains=3, n.adapt=1000)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 100
## Unobserved stochastic nodes: 4
## Total graph size: 609
##
## Initializing model
update(jm, 2000)
samps <- coda.samples(jm, c("beta0","beta1","beta2","sigma2"),
n.iter=6000, thin=2)
summary(samps)
##
## Iterations = 2002:8000
## Thinning interval = 2
## Number of chains = 3
## Sample size per chain = 3000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## beta0 1.0943 0.06821 0.0007190 0.0007269
## beta1 1.9066 0.07538 0.0007945 0.0007945
## beta2 -0.9826 0.07050 0.0007431 0.0007432
## sigma2 0.4627 0.06811 0.0007180 0.0007302
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## beta0 0.9566 1.049 1.0940 1.1399 1.2312
## beta1 1.7584 1.857 1.9060 1.9562 2.0548
## beta2 -1.1199 -1.030 -0.9838 -0.9351 -0.8444
## sigma2 0.3479 0.415 0.4564 0.5029 0.6138
Diagnostik
# Bersihkan device grafis dulu (jika ada sisa)
if(dev.cur() > 1) dev.off()
## null device
## 1
# Atur margin supaya cukup luas
par(mar = c(4, 4, 2, 1)) # bottom, left, top, right
# Plot trace + density untuk semua parameter
plot(samps)
# Gelman-Rubin diagnostic (untuk mcmc.list dengan >1 chain)
gelman.diag(samps)
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## beta0 1 1
## beta1 1 1
## beta2 1 1
## sigma2 1 1
##
## Multivariate psrf
##
## 1
# Autokorelasi
autocorr.plot(samps)
Posterior Summary
sumtab <- summary(samps)
sumtab$statistics
## Mean SD Naive SE Time-series SE
## beta0 1.0942948 0.06821328 0.0007190311 0.0007269269
## beta1 1.9066052 0.07537526 0.0007945250 0.0007945051
## beta2 -0.9825552 0.07050054 0.0007431409 0.0007431736
## sigma2 0.4627049 0.06811232 0.0007179669 0.0007302081
sumtab$quantiles[c("beta0","beta1","beta2","sigma2"),
c("2.5%","50%","97.5%")]
## 2.5% 50% 97.5%
## beta0 0.9566360 1.0940021 1.2312053
## beta1 1.7584255 1.9060189 2.0547692
## beta2 -1.1199028 -0.9837509 -0.8443520
## sigma2 0.3479486 0.4563995 0.6137781
beta_true
## [1] 1 2 -1
sigma2_true
## [1] 0.5
Posterior Predictive
post <- as.matrix(samps)
ix <- sample(seq_len(nrow(post)), 1000, replace = TRUE)
newx1 <- seq(min(x1), max(x1), length.out = 100)
newx2 <- 0
Xnew <- cbind(1, newx1, newx2)
pred_draws <- sapply(ix, function(k){
b0 <- post[k, "beta0"]
b1 <- post[k, "beta1"]
b2 <- post[k, "beta2"]
s2 <- post[k, "sigma2"]
mu <- as.vector(Xnew %*% c(b0, b1, b2))
rnorm(length(mu), mu, sqrt(s2))
})
pred_mean <- rowMeans(pred_draws)
pred_low <- apply(pred_draws, 1, quantile, probs = 0.025)
pred_high <- apply(pred_draws, 1, quantile, probs = 0.975)
plot(newx1, pred_mean, type="l", lwd=2,
ylab="y", xlab="x1")
lines(newx1, pred_low, lty=2)
lines(newx1, pred_high, lty=2)
points(x1, y, pch=16, cex=0.6)
# install.packages("INLA", repos=c(getOption("repos"), INLA="https://inla.r-inla-download.org/R/stable"))
suppressPackageStartupMessages(library(INLA))
Regresi Gaussian (sama seperti MCMC di atas)
res_inla <- inla(
y ~ x1 + x2,
data = dat,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE, cpo = TRUE),
control.fixed = list(
mean.intercept = 0, prec.intercept = 0.01, # prior Normal(0, var=100) -> prec=1/100=0.01
mean = 0, prec = 0.01
)
)
summary(res_inla)
## Time used:
## Pre = 1.01, Running = 0.183, Post = 0.00865, Total = 1.21
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) 1.095 0.068 0.962 1.095 1.229 1.095 119.298
## x1 1.906 0.074 1.760 1.906 2.051 1.906 211.850
## x2 -0.983 0.070 -1.120 -0.983 -0.846 -0.983 100.485
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 2.25 0.321 1.67 2.24
## 0.975quant mode
## Precision for the Gaussian observations 2.93 2.21
##
## Deviance Information Criterion (DIC) ...............: 209.45
## Deviance Information Criterion (DIC, saturated) ....: 106.40
## Effective number of parameters .....................: 4.00
##
## Watanabe-Akaike information criterion (WAIC) ...: 209.68
## Effective number of parameters .................: 4.03
##
## Marginal log-Likelihood: -125.77
## CPO, PIT is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
res_inla$dic$dic; res_inla$waic$waic
## [1] 209.4472
## [1] 209.6817
Regresi Poisson (contoh GLM lain)
# Simulasi data count
set.seed(42)
n2 <- 300
z1 <- rnorm(n2)
eta <- -0.5 + 0.8*z1
lam <- exp(eta)
ycount <- rpois(n2, lam)
df_pois <- data.frame(ycount, z1)
fit_pois <- inla(
ycount ~ z1,
data = df_pois,
family = "poisson",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE)
)
summary(fit_pois)
## Time used:
## Pre = 0.919, Running = 0.196, Post = 0.00671, Total = 1.12
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) -0.546 0.083 -0.709 -0.546 -0.383 -0.546 46.521
## z1 0.841 0.068 0.708 0.841 0.975 0.841 91.432
##
## Deviance Information Criterion (DIC) ...............: 611.16
## Deviance Information Criterion (DIC, saturated) ....: 282.92
## Effective number of parameters .....................: 2.00
##
## Watanabe-Akaike information criterion (WAIC) ...: 610.80
## Effective number of parameters .................: 1.62
##
## Marginal log-Likelihood: -311.54
## is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
Ekstensi: Model Hierarkis Ringkas (opsional)
Misal model random intercept per grup (\(g=1,\ldots,G\)) untuk Gaussian: \[ y_{ig} = \beta_0 + \beta_1 x_{ig} + u_g + \varepsilon_{ig},\quad u_g\sim\mathrm{N}(0, \tau_u^{-1}),\ \varepsilon_{ig}\sim\mathrm{N}(0,\sigma^2). \]
Di INLA:
set.seed(7)
G <- 20
ng <- 20
x_h <- rnorm(G*ng)
grp <- rep(1:G, each = ng)
u <- rnorm(G, 0, 0.6)
y_h <- 1 + 1.2*x_h + u[grp] + rnorm(G*ng, 0, 1)
df_h <- data.frame(y_h, x_h, grp = as.factor(grp))
fit_h <- inla(
y_h ~ x_h + f(grp, model = "iid"),
data = df_h,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE)
)
summary(fit_h)
## Time used:
## Pre = 0.982, Running = 0.193, Post = 0.0113, Total = 1.19
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) 1.046 0.127 0.794 1.046 1.297 1.046 32.165
## x_h 1.205 0.049 1.109 1.205 1.301 1.205 255.030
##
## Random effects:
## Name Model
## grp IID model
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 1.07 0.078 0.927 1.07
## Precision for grp 4.11 1.543 1.853 3.85
## 0.975quant mode
## Precision for the Gaussian observations 1.23 1.07
## Precision for grp 7.83 3.39
##
## Deviance Information Criterion (DIC) ...............: 1129.72
## Deviance Information Criterion (DIC, saturated) ....: 421.48
## Effective number of parameters .....................: 19.16
##
## Watanabe-Akaike information criterion (WAIC) ...: 1130.07
## Effective number of parameters .................: 18.66
##
## Marginal log-Likelihood: -601.15
## is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
Perbandingan MCMC vs INLA
| Aspek | MCMC | INLA |
|---|---|---|
| Prinsip | Sampling posterior | Aproksimasi Laplace bertingkat |
| Fleksibilitas | Sangat luas (dengan pemodelan yang tepat) | Luas untuk LGM; beberapa model khusus memerlukan trik |
| Kecepatan | Lebih lambat, perlu konvergensi | Umumnya jauh lebih cepat |
| Diagnostik | Perlu cek konvergensi (trace, R-hat, ESS) | Tidak perlu rantai; cek marginals & kriteria (DIC/WAIC/CPO) |
| Output | Sampel penuh posterior | Marginal posterior aproksimasi |
Praktik baik: gunakan INLA untuk eksplorasi/produksi cepat pada model LGM standar; validasi silang dengan MCMC pada subset kecil ketika perlu.
Pada bagian ini kita akan melakukan simulasi data regresi linear sederhana dengan slope \(\beta_1\) yang ditentukan. Kita akan membandingkan tiga estimator: (i) OLS, (ii) Bayesian dengan prior informatif (Normal-Inverse-Gamma), dan (iii) Bayesian dengan prior vague (hampir noninformatif).
Simulasi Data
set.seed(123)
n <- 20
beta0_true <- 1.0
beta1_true <- 2.0
sigma_true <- 1.2
x <- sort(runif(n, -2, 2))
X <- cbind(1, x)
y <- beta0_true + beta1_true * x + rnorm(n, 0, sigma_true)
OLS
ols_fit <- lm(y ~ x)
beta_ols <- coef(ols_fit)
beta_ols
## (Intercept) x
## 0.9323097 1.6955629
Bayesian Linear Regression dengan Prior Normal-Inverse-Gamma
bayes_linreg_nig <- function(X, y, m0, V0, a0, b0) {
# Posterior Normal-Inverse-Gamma untuk regresi linear (formula konjugat)
XtX <- crossprod(X) # X'X
Xty <- crossprod(X, y) # X'y
# Jika V0 singular, fallback ke presisi nol (prior flat)
V0_inv <- tryCatch(solve(V0), error = function(e) matrix(0, nrow(V0), ncol(V0)))
Vn <- solve(V0_inv + XtX) # kovarian posterior (tanpa faktor sigma^2)
mn <- Vn %*% (V0_inv %*% m0 + Xty) # mean posterior
a_n <- a0 + nrow(X) / 2
b_n <- as.numeric(
b0 + 0.5 * ( crossprod(y, y) + t(m0) %*% V0_inv %*% m0 - t(mn) %*% solve(Vn) %*% mn )
)
list(mn = as.vector(mn), Vn = Vn, a_n = a_n, b_n = b_n)
}
Spesifikasi Prior
beta1_target <- 0.5 # nilai slope target
tau_beta1 <- 0.2 # simpangan baku prior kecil -> prior kuat di sekitar target
# (opsional) guard agar tau_beta1 valid
stopifnot(is.numeric(tau_beta1), length(tau_beta1)==1, tau_beta1 > 0)
# Prior informatif
m0_inf <- c(0, beta1_target)
V0_inf <- diag(c(10^2, tau_beta1^2)) # varian prior: intercept longgar, slope ketat
a0_inf <- 2; b0_inf <- 1
# Prior vague
m0_vag <- c(0, 0)
V0_vag <- diag(c(1e6, 1e6))
a0_vag <- 1e-3; b0_vag <- 1e-3
Posterior Estimation
post_inf <- bayes_linreg_nig(X, y, m0_inf, V0_inf, a0_inf, b0_inf)
post_vag <- bayes_linreg_nig(X, y, m0_vag, V0_vag, a0_vag, b0_vag)
beta_bayes_inf <- post_inf$mn
beta_bayes_vag <- post_vag$mn
summary_tab <- data.frame(
model = c("True", "OLS", "Bayes (informative)", "Bayes (vague ≈ noninformative)"),
beta0 = c(beta0_true, beta_ols[1], beta_bayes_inf[1], beta_bayes_vag[1]),
beta1 = c(beta1_true, beta_ols[2], beta_bayes_inf[2], beta_bayes_vag[2]),
stringsAsFactors = FALSE
)
knitr::kable(summary_tab, digits = 3, caption = "Ringkasan estimasi beta0 dan beta1")
| model | beta0 | beta1 |
|---|---|---|
| True | 1.000 | 2.000 |
| OLS | 0.932 | 1.696 |
| Bayes (informative) | 1.042 | 1.151 |
| Bayes (vague ≈ noninformative) | 0.932 | 1.696 |
Visualisasi
x_grid <- seq(min(x), max(x), length.out = 200)
y_true <- beta0_true + beta1_true * x_grid
y_ols <- beta_ols[1] + beta_ols[2] * x_grid
y_inf <- beta_bayes_inf[1]+ beta_bayes_inf[2]* x_grid
y_vag <- beta_bayes_vag[1]+ beta_bayes_vag[2]* x_grid
plot(x, y, pch = 19, cex = 1.1, xlab = "x", ylab = "y",
main = "OLS vs Bayesian Linear Regression")
lines(x_grid, y_true, lwd = 2, lty = 2, col = "black")
lines(x_grid, y_ols, lwd = 2, col = "blue")
lines(x_grid, y_inf, lwd = 2, lty = 3, col = "red")
lines(x_grid, y_vag, lwd = 2, lty = 4, col = "darkgreen")
legend("topleft",
legend = c(
sprintf("True: beta1=%.2f", beta1_true),
sprintf("OLS: beta1=%.2f", beta_ols[2]),
sprintf("Bayes (informative): beta1=%.2f (target=%.2f)", beta_bayes_inf[2], beta1_target),
sprintf("Bayes (vague): beta1=%.2f", beta_bayes_vag[2])
),
lty = c(2,1,3,4), lwd = 2, col = c("black","blue","red","darkgreen"), bty = "n")
Kita gunakan model regresi tanpa intercept:
\[ y_i = \beta x_i + \varepsilon_i, \quad \varepsilon_i \sim N(0,\sigma^2). \]
set.seed(1)
n <- 5
x <- 1:5
beta_true <- 2
sigma2_true <- 1.3
y <- beta_true * x + rnorm(n, sd = sqrt(sigma2_true))
cbind(x, y)
## x y
## [1,] 1 1.285733
## [2,] 2 4.209386
## [3,] 3 5.047237
## [4,] 4 9.818900
## [5,] 5 10.375697
Estimator OLS:
beta_hat <- sum(x * y) / sum(x^2)
beta_hat
## [1] 2.109096
Posterior Analitik (Normal–Normal)
Dengan prior:
\[ \beta \sim N(\mu_0,\tau_0^2), \]
dan \(\sigma^2\) diketahui, posterior \(\beta|y\) adalah:
\[ \beta|y \sim N(m_n, v_n), \]
dengan
\[ v_n = \left( \frac{1}{\tau_0^2} + \frac{\sum x_i^2}{\sigma^2} \right)^{-1}, \]
\[ m_n = v_n \left( \frac{\sum x_i y_i}{\sigma^2} + \frac{\mu_0}{\tau_0^2} \right). \]
Kita buat fungsi praktis:
sigma2 <- sigma2_true
S <- sum(x^2) / sigma2 # "informasi" dari data
post_normal <- function(mu0, tau2) {
prec0 <- 1 / tau2
vn <- 1 / (S + prec0)
mn <- vn * (beta_hat * S + mu0 * prec0)
wdata <- S / (S + prec0)
wprior<- prec0 / (S + prec0)
list(mean = mn, var = vn, w_data = wdata, w_prior = wprior)
}
Tiga Prior: Lemah, Kuat ke 0, Kuat ke 5
prior_A <- post_normal(mu0 = 0, tau2 = 100^2) # prior lemah
prior_B <- post_normal(mu0 = 0, tau2 = 0.1^2) # prior kuat ke 0
prior_C <- post_normal(mu0 = 5, tau2 = 0.1^2) # prior kuat ke 5
prior_A
## $mean
## [1] 2.109091
##
## $var
## [1] 0.02363631
##
## $w_data
## [1] 0.9999976
##
## $w_prior
## [1] 2.363631e-06
prior_B
## $mean
## [1] 0.6270286
##
## $var
## [1] 0.007027027
##
## $w_data
## [1] 0.2972973
##
## $w_prior
## [1] 0.7027027
prior_C
## $mean
## [1] 4.140542
##
## $var
## [1] 0.007027027
##
## $w_data
## [1] 0.2972973
##
## $w_prior
## [1] 0.7027027
Visualisasi: Plot x vs y dan Garis Regresi
# grid x untuk menggambar garis
x_grid <- seq(0, max(x) * 1.1, length.out = 100)
# slope dari masing-masing posterior
beta_A <- prior_A$mean # prior lemah
beta_B <- prior_B$mean # prior kuat (mu0 = 0)
beta_C <- prior_C$mean # prior kuat (mu0 = 5)
# nilai y di sepanjang garis
y_true <- beta_true * x_grid
y_A <- beta_A * x_grid
y_B <- beta_B * x_grid
y_C <- beta_C * x_grid
# plot scatter x vs y
plot(x, y,
pch = 16,
xlab = "x",
ylab = "y",
main = "Pengaruh Prior terhadap Slope Regresi",
xlim = c(0, max(x_grid)),
ylim = range(c(y, y_true, y_A, y_B, y_C))
)
# garis true beta
lines(x_grid, y_true, lty = 2, lwd = 2, col = "darkgray")
# garis posterior untuk 3 prior
lines(x_grid, y_A, col = "black", lwd = 2) # prior lemah
lines(x_grid, y_B, col = "red", lwd = 2) # prior kuat (mu0=0)
lines(x_grid, y_C, col = "blue", lwd = 2) # prior kuat (mu0=5)
legend("topleft",
legend = c("Data (titik)",
expression(beta[true] == 2),
"Posterior (prior lemah)",
"Posterior (prior kuat, mu0 = 0)",
"Posterior (prior kuat, mu0 = 5)"),
col = c("black","darkgray","black","red","blue"),
pch = c(16, NA, NA, NA, NA),
lty = c(NA, 2, 1, 1, 1),
lwd = c(NA, 2, 2, 2, 2),
bty = "n"
)
Cara mudah dengan INLA
Simulasi Data
Kita gunakan model regresi tanpa intercept:
\[ y_i = \beta x_i + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2). \]
set.seed(1)
n <- 5
x <- 1:5
beta_true <- 2
sigma2_true <- 1.3
y <- beta_true * x + rnorm(n, sd = sqrt(sigma2_true))
dat <- data.frame(y = y, x = x)
dat
## y x
## 1 1.285733 1
## 2 4.209386 2
## 3 5.047237 3
## 4 9.818900 4
## 5 10.375697 5
Model INLA dengan Berbagai Prior untuk Slope
Kita akan membandingkan tiga model: (1) prior default INLA (vague), (2) prior kuat untuk slope \(\beta_1 \sim N(0,\,0.01)\), dan (3) prior kuat \(\beta_1 \sim N(5,\,0.01)\).
Ingat: di INLA, kita memasukkan precision (= 1/variance). Jadi var = 0.01 berarti precision = 100.
# Jika INLA belum terpasang:
# install.packages("INLA", repos = c(getOption("repos"), INLA = "https://inla.r-inla-download.org/R/stable"))
library(INLA)
# 2.1. Model dengan prior default (vague)
res_def <- inla(
y ~ x,
data = dat,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE, cpo = TRUE)
)
# 2.2. Prior kuat: slope ~ N(0, 0.01)
res0 <- inla(
y ~ x,
data = dat,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE, cpo = TRUE),
control.fixed = list(
mean.intercept = 0,
prec.intercept = 0.01, # var intercept = 100 (lemah, hanya supaya stabil)
mean = 0, # mean prior untuk semua koef non-intercept (di sini: slope x)
prec = 100 # var = 0.01 -> prior kuat ke 0
)
)
# 2.3. Prior kuat: slope ~ N(5, 0.01)
res5 <- inla(
y ~ x,
data = dat,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE, cpo = TRUE),
control.fixed = list(
mean.intercept = 0,
prec.intercept = 0.01,
mean = 5, # mean prior untuk slope x
prec = 100 # var = 0.01 -> prior kuat ke 5
)
)
Ringkasan koefisien tetap (intercept dan slope):
res_def$summary.fixed
## mean sd 0.025quant 0.5quant 0.975quant mode
## (Intercept) -0.9887381 1.0385708 -3.100168 -0.9888902 1.123733 -0.9888254
## x 2.3787095 0.3131363 1.741769 2.3787602 3.015303 2.3787386
## kld
## (Intercept) 0.4724183
## x 17.9150148
res0$summary.fixed
## mean sd 0.025quant 0.5quant 0.975quant mode
## (Intercept) 5.90853210 1.6296433 2.5668045 5.9236891 9.1380464 5.91620575
## x 0.02735905 0.1004341 -0.1695645 0.0273527 0.2243187 0.02735266
## kld
## (Intercept) 12.27211
## x 24.30560
res5$summary.fixed
## mean sd 0.025quant 0.5quant 0.975quant mode
## (Intercept) -8.503636 1.7834940 -11.981576 -8.538432 -4.786228 -8.52531
## x 4.974035 0.1003649 4.777215 4.974040 5.170826 4.97404
## kld
## (Intercept) 24.13514
## x 644.54225
Khususnya, kita fokus pada slope untuk x:
beta_def_hat <- summary(res_def)$fixed["x", "mean"]
beta0_hat <- summary(res0)$fixed["x", "mean"]
beta5_hat <- summary(res5)$fixed["x", "mean"]
c(beta_true = beta_true,
beta_default = beta_def_hat,
beta_prior0 = beta0_hat,
beta_prior5 = beta5_hat)
## beta_true beta_default beta_prior0 beta_prior5
## 2.000 2.379 0.027 4.974
Visualisasi: Garis Regresi untuk Prior yang Berbeda
x_grid <- seq(0, max(dat$x) * 1.1, length.out = 100)
y_true <- beta_true * x_grid
y_def <- beta_def_hat * x_grid
y0_hat <- beta0_hat * x_grid
y5_hat <- beta5_hat * x_grid
plot(dat$x, dat$y,
pch = 16,
xlab = "x",
ylab = "y",
main = "Efek Prior terhadap Slope (INLA)",
xlim = c(0, max(x_grid)),
ylim = range(c(dat$y, y_true, y_def, y0_hat, y5_hat))
)
# Garis parameter benar
lines(x_grid, y_true, lty = 2, lwd = 2, col = "darkgray")
# Garis posterior dari tiga model
lines(x_grid, y_def, col = "black", lwd = 2) # prior default (lemah)
lines(x_grid, y0_hat, col = "red", lwd = 2) # prior kuat ke 0
lines(x_grid, y5_hat, col = "blue", lwd = 2) # prior kuat ke 5
legend("topleft",
legend = c("Data (titik)",
expression(paste("True ", beta, " = 2")),
"Posterior (prior default)",
"Posterior (prior kuat, mean = 0)",
"Posterior (prior kuat, mean = 5)"),
col = c("black", "darkgray", "black", "red", "blue"),
pch = c(16, NA, NA, NA, NA),
lty = c(NA, 2, 1, 1, 1),
lwd = c(NA, 2, 2, 2, 2),
bty = "n"
)
Diskusi Singkat
Ilustrasi ini menunjukkan bahwa dengan data yang sangat sedikit (n = 5), prior kuat menarik posterior slope ke mean prior-nya (0 atau 5), sedangkan prior default/lemah menghasilkan estimasi yang didominasi data dan dekat dengan nilai sebenarnya.
Kesimpulan
Untuk mengurangi subjektivitas: gunakan prior lemah bila tidak ada informasi awal yang kuat, lakukan analisis sensitivitas terhadap beberapa pilihan prior, dan dokumentasikan alasan di balik setiap prior.
Rekomendasi: laporkan hasil untuk lebih dari satu prior sebagai uji ketahanan (robustness), terutama ketika ukuran sampel kecil.
Kita tinjau model regresi linear dengan notasi matriks: \[ \mathrm y \mid \boldsymbol\beta,\sigma^2 \sim \mathrm N\!\big(X\boldsymbol\beta,\ \sigma^2 I_n\big), \] dengan \(\mathrm y\in\mathbb R^n\), \(X\in\mathbb R^{n\times p}\), \(\boldsymbol\beta\in\mathbb R^p\), dan \(\sigma^2>0\).
Prior konjugat Normal–Inverse-Gamma (NIG): \[ \boldsymbol\beta\mid\sigma^2 \sim \mathrm N\!\big(m_0,\ \sigma^2 V_0\big),\qquad \sigma^2 \sim \mathrm{InvGamma}(a_0,b_0). \]
Parameterisasi Inverse-Gamma yang dipakai di sini:
\[ p(\sigma^2)=\frac{b_0^{a_0}}{\Gamma(a_0)}\,(\sigma^2)^{-(a_0+1)}\exp\!\Big(-\frac{b_0}{\sigma^2}\Big),\quad \sigma^2>0. \]
Likelihood dan Prior (kernel)
Likelihood (abaikan konstanta tak bergantung pada parameter):
\[ p(\mathrm y\mid\boldsymbol\beta,\sigma^2)\ \propto\ (\sigma^2)^{-n/2}\exp\!\Big\{-\tfrac{1}{2\sigma^2}(\mathrm y-X\boldsymbol\beta)'(\mathrm y-X\boldsymbol\beta)\Big\}. \] Prior bersama (joint prior):
\[ p(\boldsymbol\beta,\sigma^2)=p(\boldsymbol\beta\mid\sigma^2)\,p(\sigma^2) \ \propto\ (\sigma^2)^{-p/2}\exp\!\Big\{-\tfrac{1}{2\sigma^2}(\boldsymbol\beta-m_0)'V_0^{-1}(\boldsymbol\beta-m_0)\Big\}\cdot (\sigma^2)^{-(a_0+1)}\exp\!\Big(-\tfrac{b_0}{\sigma^2}\Big). \]
Posterior ∝ Likelihood × Prior
Gabungkan (kernel): \[ p(\boldsymbol\beta,\sigma^2\mid\mathrm y)\ \propto\ (\sigma^2)^{-\frac{n}{2}}\exp\!\Big\{-\tfrac{1}{2\sigma^2}\Vert\mathrm y-X\boldsymbol\beta\Vert^2\Big\}\cdot (\sigma^2)^{-\frac{p}{2}}\exp\!\Big\{-\tfrac{1}{2\sigma^2}(\boldsymbol\beta-m_0)'V_0^{-1}(\boldsymbol\beta-m_0)\Big\}\cdot (\sigma^2)^{-(a_0+1)}\exp\!\Big(-\tfrac{b_0}{\sigma^2}\Big). \]
Kumpulkan faktor \((\sigma^2)\) dan selesaikan kuadrat untuk \(\boldsymbol\beta\). Definisikan
\[ V_n=(V_0^{-1}+X'X)^{-1},\qquad m_n=V_n\big(V_0^{-1}m_0+X'\mathrm y\big). \] Gunakan identitas penyelesaian kuadrat:
\[ \Vert\mathrm y-X\boldsymbol\beta\Vert^2+(\boldsymbol\beta-m_0)'V_0^{-1}(\boldsymbol\beta-m_0) = (\boldsymbol\beta-m_n)'V_n^{-1}(\boldsymbol\beta-m_n) + C, \]
dengan konstanta (tidak bergantung pada \(\boldsymbol\beta\))
\[ C = \mathrm y'\mathrm y + m_0'V_0^{-1}m_0 - m_n'V_n^{-1}m_n. \]
Maka kernel posterior menjadi
\[ p(\boldsymbol\beta,\sigma^2\mid\mathrm y)\ \propto\ (\sigma^2)^{-\frac{n+p}{2}}\,(\sigma^2)^{-(a_0+1)}\exp\!\Big\{-\tfrac{1}{2\sigma^2}(\boldsymbol\beta-m_n)'V_n^{-1}(\boldsymbol\beta-m_n)\Big\} \times \exp\!\Big\{-\tfrac{1}{2\sigma^2}C\Big\}\,\exp\!\Big(-\tfrac{b_0}{\sigma^2}\Big). \]
Kelompokkan lagi eksponen bagian \(\sigma^2\) (yang tidak mengandung \(\boldsymbol\beta\)):
\[ \exp\!\Big\{-\tfrac{1}{\sigma^2}\Big(\tfrac{1}{2}C + b_0\Big)\Big\}. \]
Definisikan
\[ a_n = a_0 + \tfrac{n}{2},\qquad b_n = b_0 + \tfrac{1}{2}\Big(\mathrm y'\mathrm y + m_0'V_0^{-1}m_0 - m_n'V_n^{-1}m_n\Big). \]
Bentuk Posterior (tertutup)
Dari bentuk kernel di atas, kita memperoleh faktorisasi posterior konjugat:
\[ \boldsymbol\beta \mid \sigma^2, \mathrm y \sim \mathrm N\!\big(m_n,\ \sigma^2 V_n\big) \quad\text{dan}\quad \sigma^2 \mid \mathrm y \sim \mathrm{InvGamma}(a_n,\ b_n). \]
Marginal \(\boldsymbol\beta\mid\mathrm y\) ⇒ Multivariate Student-\(t\)
Dengan mengintegrasikan \(\sigma^2\), marginal \(\boldsymbol\beta\mid\mathrm y\) adalah sebaran Student-\(t\) multivariat:
\[ \boldsymbol\beta\mid\mathrm y \sim \mathsf{MVT}_{\nu}\Big(m_n,\ \frac{b_n}{a_n}V_n\Big),\quad \nu=2a_n. \]
Artinya, untuk vektor mana pun \(c\), \(c'\boldsymbol\beta\mid\mathrm y\) mengikuti Student-\(t_{\nu}\) dengan mean \(c'm_n\) dan varians \(\frac{b_n}{a_n}c'V_n c\).
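Sebagai sketsa numerik (fungsi ci_slope di bawah hanyalah ilustrasi penulis, bukan bagian dari paket mana pun), interval kredibel marginal untuk satu komponen \(\boldsymbol\beta\) dapat dihitung langsung dari kuantil Student-\(t\), memakai parameter posterior \((m_n, V_n, a_n, b_n)\) seperti pada verifikasi R di bawah:
ci_slope <- function(mn, Vn, an, bn, j = 2, level = 0.95) {
nu <- 2 * an # derajat bebas
s <- sqrt((bn / an) * Vn[j, j]) # skala t komponen ke-j
mn[j] + s * qt(c((1 - level) / 2, (1 + level) / 2), df = nu)
}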
Untuk kovariat baru \(x_*\in\mathbb R^p\) (termasuk 1 untuk intersep):
\[ y_*\mid\mathrm y \sim t_{\nu}\Big(\mu_*,\ s_*^2\Big),\quad \mu_* = x_*'m_n,\ \ s_*^2 = \frac{b_n}{a_n}\big(1 + x_*'V_n x_*\big),\ \nu=2a_n. \]
Verifikasi Numerik (R)
Simulasi data
set.seed(42)
n <- 40
p <- 2
beta_true <- c(1.0, 2.0) # (intercept, slope)
sigma_true <- 1.0
x <- sort(runif(n, -2, 2))
X <- cbind(1, x)
y <- as.vector(X %*% beta_true + rnorm(n, 0, sigma_true))
Prior (pilih yang cukup proper)
m0 <- c(0, 0)
V0 <- diag(c(10^2, 10^2))
a0 <- 2
b0 <- 1
Hitung parameter posterior (m_n, V_n, a_n, b_n)
XtX <- crossprod(X)
Xty <- crossprod(X, y)
V0_inv <- solve(V0)
Vn <- solve(V0_inv + XtX)
mn <- Vn %*% (V0_inv %*% m0 + Xty)
an <- a0 + n/2
bn <- as.numeric(b0 + 0.5 * ( crossprod(y,y) + t(m0) %*% V0_inv %*% m0 - t(mn) %*% solve(Vn) %*% mn ))
c(an = an, bn = bn)
## an bn
## 22.0000 22.9944
mn
## [,1]
## 0.849861
## x 2.038724
Vn
## x
## 0.027250856 -0.006255731
## x -0.006255731 0.017338219
Ringkasan posterior (mean & varians)
Untuk beta|y: mean = m_n, kovarian marginal = E[sigma^2|y] * V_n = (b_n/(a_n-1)) V_n (jika a_n > 1). Untuk sigma^2|y: mean = b_n/(a_n-1) (jika a_n > 1).
mean_sigma2 <- bn / (an - 1) # a_n > 1
cov_beta <- as.numeric(mean_sigma2) * Vn
list(mean_sigma2 = mean_sigma2, cov_beta = cov_beta)
## $mean_sigma2
## [1] 1.094972
##
## $cov_beta
## x
## 0.029838914 -0.006849848
## x -0.006849848 0.018984857
Bandingkan dengan OLS (prior sangat vague → mendekati OLS)
coef(lm(y ~ x))
## (Intercept) x
## 0.849965 2.039024
Prediktif Posterior di titik grid
x_grid <- seq(min(x), max(x), length.out = 100)
Xg <- cbind(1, x_grid)
mu_pred <- as.vector(Xg %*% mn)
scale2 <- as.numeric(bn/an) * (1 + rowSums((Xg %*% Vn) * Xg))
nu <- 2*an
# Interval kredibel prediktif 95% (Student-t)
qL <- mu_pred + sqrt(scale2) * qt(0.025, df = nu)
qU <- mu_pred + sqrt(scale2) * qt(0.975, df = nu)
plot(x, y, pch=19, xlab = "x", ylab = "y",
main = "Posterior Predictive t-interval (95%)")
lines(x_grid, mu_pred, lwd = 2)
lines(x_grid, qL, lty = 2)
lines(x_grid, qU, lty = 2)
Catatan Tambahan
knitr::opts_chunk$set(
message = FALSE,
warning = FALSE,
fig.width = 7,
fig.height = 4.5
)
library(ggplot2)
library(forecast) # jika belum ada: install.packages("forecast")
Secara sederhana, deret waktu (time series) adalah urutan pengamatan yang tersusun menurut waktu:
\[ \{Y_t\}_{t = 1,2,\dots, T}, \]
di mana \(Y_t\) adalah nilai pengamatan pada waktu \(t\) dan \(T\) adalah banyaknya pengamatan.
Contoh deret waktu: nilai tukar harian, inflasi bulanan, PDB kuartalan, atau harga saham harian.
Tujuan analisis deret waktu antara lain: mendeskripsikan pola (trend, musiman, siklus), membangun model yang menangkap dinamika data, dan meramalkan nilai di masa depan.
White noise adalah deret waktu \(\{\varepsilon_t\}\) yang memenuhi: rata-rata nol, varians konstan \(\sigma^2\), dan tidak ada korelasi antar waktu (autokorelasi nol untuk semua lag \(h \neq 0\)).
Ditulis:
\[ \varepsilon_t \sim \text{WN}(0, \sigma^2). \]
White noise sering dipakai sebagai error term (gangguan acak) dalam model deret waktu.
Suatu deret waktu \(\{Y_t\}\) disebut stasioner lemah jika:
Rata-rata konstan: \[ \mathbb E[Y_t] = \mu, \quad \text{tidak bergantung pada } t. \]
Varians konstan: \[ \operatorname{Var}(Y_t) = \sigma^2, \quad \text{tidak bergantung pada } t. \]
Kovariansi hanya bergantung pada lag: \[ \gamma(h) = \operatorname{Cov}(Y_t, Y_{t-h}) = \operatorname{Cov}(Y_{t+h}, Y_t), \]
artinya hanya fungsi dari jarak \(h\), bukan waktu absolut.
Banyak model deret waktu klasik (AR, MA, ARMA) diasumsikan stasioner.
set.seed(123)
n <- 200
wn <- ts(rnorm(n, mean = 0, sd = 1))
autoplot(wn) +
labs(title = "Simulasi White Noise N(0,1)",
x = "Waktu", y = "Nilai")
Plot ACF:
ggAcf(wn) +
labs(title = "ACF White Noise (Harus Dekat 0 di Semua Lag)")
Model AR(p):
\[ Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \varepsilon_t, \]
dengan \(\varepsilon_t \sim \text{WN}(0, \sigma^2)\).
Kasus paling sederhana: AR(1),
\[ Y_t = \phi Y_{t-1} + \varepsilon_t. \]
Syarat stasioneritas AR(1): \(|\phi| < 1\).
Model MA(q):
\[ Y_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}, \]
dengan \(\varepsilon_t \sim \text{WN}(0, \sigma^2)\).
Model ARMA(p,q) menggabungkan keduanya:
\[ Y_t = \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}. \]
Model ARMA berlaku untuk deret waktu stasioner.
Misal kita simulasikan AR(1):
\[ Y_t = 0.6 Y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim \text{WN}(0,1). \]
set.seed(123)
ar1 <- arima.sim(model = list(ar = 0.6), n = 300)
ar1 <- ts(ar1)
autoplot(ar1) +
labs(title = "Simulasi AR(1) dengan phi = 0.6",
x = "Waktu", y = "Y_t")
ACF dan PACF:
p1 <- ggAcf(ar1) + labs(title = "ACF AR(1)")
p2 <- ggPacf(ar1) + labs(title = "PACF AR(1)")
p1
p2
Gunakan fungsi arima():
fit_ar1 <- arima(ar1, order = c(1, 0, 0)) # AR(1)
fit_ar1
##
## Call:
## arima(x = ar1, order = c(1, 0, 0))
##
## Coefficients:
## ar1 intercept
## 0.551 0.0553
## s.e. 0.048 0.1227
##
## sigma^2 estimated as 0.9179: log likelihood = -413.02, aic = 832.04
Koefisien dan inferensi (approx z-stat):
coefs <- coef(fit_ar1)
se <- sqrt(diag(fit_ar1$var.coef))
zval <- coefs / se
pval <- 2 * (1 - pnorm(abs(zval)))
cbind(Estimate = coefs,
Std.Error = se,
z.value = zval,
p.value = pval)
## Estimate Std.Error z.value p.value
## ar1 0.55096701 0.0479752 11.4844138 0.000000
## intercept 0.05534586 0.1226877 0.4511117 0.651909
Residual dan diagnostik:
res_ar1 <- residuals(fit_ar1)
autoplot(res_ar1) +
labs(title = "Residual AR(1)",
x = "Waktu", y = "Residual")
ggAcf(res_ar1) +
labs(title = "ACF Residual AR(1)")
Jika model cocok, ACF residual harus mendekati 0 pada semua lag.
Simulasi:
\[ Y_t = \varepsilon_t + 0.7 \varepsilon_{t-1}. \]
set.seed(456)
ma1 <- arima.sim(model = list(ma = 0.7), n = 300)
ma1 <- ts(ma1)
autoplot(ma1) +
labs(title = "Simulasi MA(1) dengan theta = 0.7",
x = "Waktu", y = "Y_t")
ACF/PACF:
ggAcf(ma1) + labs(title = "ACF MA(1)")
ggPacf(ma1) + labs(title = "PACF MA(1)")
Estimasi:
fit_ma1 <- arima(ma1, order = c(0, 0, 1)) # MA(1)
fit_ma1
##
## Call:
## arima(x = ma1, order = c(0, 0, 1))
##
## Coefficients:
## ma1 intercept
## 0.6656 0.0536
## s.e. 0.0474 0.0939
##
## sigma^2 estimated as 0.9567: log likelihood = -419.34, aic = 844.68
Banyak deret waktu nyata tidak stasioner (misalnya ada trend). Model ARIMA(p,d,q) menggabungkan: komponen AR orde \(p\), differencing sebanyak \(d\) kali, dan komponen MA orde \(q\).
Operator differencing:
\[ \nabla Y_t = Y_t - Y_{t-1}, \]
\[ \nabla^d Y_t = (1 - B)^d Y_t, \]
dengan \(B\) adalah backshift operator: \(B Y_t = Y_{t-1}\).
Kita buat deret dengan integrasi (random walk dengan struktur ARMA):
set.seed(789)
arima111 <- arima.sim(model = list(order = c(1, 1, 1),
ar = 0.5, ma = -0.4),
n = 300)
arima111 <- ts(arima111)
autoplot(arima111) +
labs(title = "Simulasi ARIMA(1,1,1)",
x = "Waktu", y = "Y_t")
Cek ACF/PACF dari differenced series:
d_arima <- diff(arima111)
autoplot(d_arima) +
labs(title = "Differenced Series dari ARIMA(1,1,1)",
x = "Waktu", y = "ΔY_t")
ggAcf(d_arima) + labs(title = "ACF ΔY_t")
ggPacf(d_arima) + labs(title = "PACF ΔY_t")
Estimasi dengan auto.arima dan arima
fit_arima_auto <- auto.arima(arima111)
fit_arima_auto
## Series: arima111
## ARIMA(0,1,1)
##
## Coefficients:
## ma1
## 0.0969
## s.e. 0.0584
##
## sigma^2 = 1.002: log likelihood = -425.53
## AIC=855.05 AICc=855.09 BIC=862.46
# Atau secara manual (jika sudah tahu ordo)
fit_arima <- arima(arima111, order = c(1, 1, 1))
fit_arima
##
## Call:
## arima(x = arima111, order = c(1, 1, 1))
##
## Coefficients:
## ar1 ma1
## -0.0891 0.1844
## s.e. 0.3506 0.3429
##
## sigma^2 estimated as 0.9987: log likelihood = -425.5, aic = 856.99
Untuk deret waktu dengan musiman (misalnya data bulanan dengan pola tahunan), kita gunakan model SARIMA:
\[ \text{SARIMA}(p,d,q) \times (P,D,Q)_s, \]
dengan \((p,d,q)\) ordo non-musiman, \((P,D,Q)\) ordo musiman, dan \(s\) periode musiman (misalnya \(s = 12\) untuk data bulanan).
Kita simulasi data dengan pola musiman sederhana menggunakan komponen sinus + noise:
set.seed(42)
# Misal data bulanan selama 10 tahun
n_s <- 120
t_s <- 1:n_s
# Komponen musiman sederhana (sinus) + noise
seasonal <- 2 * sin(2 * pi * t_s / 12)
noise <- rnorm(n_s, sd = 1)
y_s <- seasonal + noise
y_s <- ts(y_s, frequency = 12)
autoplot(y_s) +
labs(title = "Simulasi Deret Waktu dengan Musiman (f=12)",
x = "Waktu", y = "Y_t")
Gunakan auto.arima dengan argumen seasonal = TRUE:
fit_sarima <- auto.arima(y_s, seasonal = TRUE)
fit_sarima
## Series: y_s
## ARIMA(0,0,0)(0,1,1)[12]
##
## Coefficients:
## sma1
## -0.8319
## s.e. 0.1238
##
## sigma^2 = 1.236: log likelihood = -171.1
## AIC=346.21 AICc=346.32 BIC=351.57
Atau tentukan sendiri, misal SARIMA(0,0,0)×(1,0,0)_12:
fit_sarima_manual <- Arima(y_s, order = c(0,0,0),
seasonal = list(order = c(1,0,0), period = 12))
fit_sarima_manual
## Series: y_s
## ARIMA(0,0,0)(1,0,0)[12] with non-zero mean
##
## Coefficients:
## sar1 mean
## 0.6089 0.0957
## s.e. 0.0723 0.2722
##
## sigma^2 = 1.808: log likelihood = -207.58
## AIC=421.15 AICc=421.36 BIC=429.52
Banyak deret ekonomi bersifat non-stasioner (misalnya I(1): menjadi stasioner setelah sekali differencing).
Dua deret waktu \(\{Y_t\}\) dan \(\{X_t\}\) disebut terkointegrasi jika masing-masing I(1), tetapi ada kombinasi linier keduanya yang I(0) (stasioner).
Artinya, walaupun masing-masing deret random walk, ada hubungan jangka panjang (equilibrium) di antara keduanya.
Bayangkan dua orang sahabat jalan kaki di taman.
TAPI… karena mereka sahabatan: mereka saling menunggu dan saling mendekat, sehingga jarak di antara mereka tidak pernah melebar tanpa batas.
Nah, kalau kita lihat selisih jarak mereka, meskipun masing-masing tidak stabil, selisihnya stabil (bounded & mean-reverting).
Itulah kointegrasi:
Dua (atau lebih) deret waktu non-stasioner tetapi memiliki kombinasi linier yang stasioner.
Misalkan dua deret waktu \(y_t\) dan \(x_t\),
keduanya adalah \(I(1)\) (integrated of order 1), artinya: level masing-masing non-stasioner, tetapi first difference-nya (\(\Delta y_t\), \(\Delta x_t\)) stasioner.
Jika ada konstanta dan koefisien \(\beta\) sehingga:
\[ u_t = y_t - \beta x_t \]
adalah \(I(0)\) (stasioner), maka:
\(y_t\) dan \(x_t\) dikatakan terkointegrasi.
Catatan penting: koefisien \(\beta\) menggambarkan hubungan keseimbangan jangka panjang, dan \(u_t\) menangkap deviasi jangka pendek dari keseimbangan tersebut.
Beberapa pasangan variabel yang sering dianggap terkointegrasi:
| Variabel | Kenapa mungkin terkointegrasi? |
|---|---|
| Harga minyak mentah & harga bensin | saling terkait produksi & pasar |
| Nilai tukar USD/IDR & inflasi | arbitrase & keseimbangan jangka panjang |
| Harga emas & harga perak | substitusi pasar di komoditas berdekatan |
| Permintaan listrik & GDP | pertumbuhan ekonomi & kebutuhan energi |
| Jumlah mahasiswa & jumlah dosen | kapasitas layanan akademik jangka panjang |
Walaupun masing-masing bisa naik turun tak stabil, hubungan jangka panjang tetap ada.
Tanpa Kointegrasi
Regresi linier OLS dua series non-stasioner (tanpa kointegrasi) dapat menghasilkan regresi lancung (spurious regression): \(R^2\) tinggi dan t-statistik yang tampak signifikan, padahal tidak ada hubungan yang sesungguhnya; lihat sketsa berikut.
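Sebagai sketsa singkat (murni ilustratif), dua random walk yang saling bebas pun sering menghasilkan \(R^2\) yang tinggi:
## Sketsa: regresi lancung dari dua random walk independen
set.seed(99)
a <- cumsum(rnorm(300)) # random walk 1
b <- cumsum(rnorm(300)) # random walk 2 (tidak berhubungan dengan a)
summary(lm(a ~ b))$r.squared # kerap "tinggi" walau tak ada hubungan nyata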
Dengan Kointegrasi
Jika \(y_t\) dan \(x_t\) terkointegrasi: OLS pada level tetap valid sebagai taksiran hubungan jangka panjang, dan deviasi dari keseimbangan dapat dimodelkan lewat error correction model (ECM), sehingga dinamika jangka pendek dan jangka panjang bisa dimodelkan sekaligus.
Secara sketsa:
| .
| . .
| . . .
|____________________ time
Dua random walk bebas—jarak di antara mereka cenderung makin lama makin melebar.
Secara sketsa:
| . . . . .
| . . . .
| . . . .
|____________________ time
selisih stabil (stationer)
Masing-masing non-stasioner, tetapi selisihnya stabil.
Inilah inti kointegrasi.
Pada bagian ini kita akan mensimulasikan dua random walk yang terkointegrasi, lalu memeriksa hubungan jangka panjangnya lewat regresi dan uji residu (Engle-Granger).
set.seed(123)
T <- 200
e1 <- rnorm(T, 0, 1)
e2 <- rnorm(T, 0, 1)
# Random walk pertama
x <- cumsum(e1)
# y mengikuti x dengan sedikit error (kointegrasi)
y <- 2 * x + e2
ts_plot <- data.frame(
time = 1:T,
x = x,
y = y
)
ggplot(ts_plot, aes(x = time)) +
geom_line(aes(y = y, colour = "y_t")) +
geom_line(aes(y = x, colour = "x_t")) +
scale_colour_manual(values = c("y_t" = "blue", "x_t" = "red"),
name = "Series") +
labs(
title = "Simulasi Dua Random Walk yang Terkointegrasi",
x = "t",
y = "Nilai"
)
Terlihat bahwa kedua deret mengembara bersama-sama: \(y_t\) bergerak mengikuti \(x_t\) (di sini \(y_t \approx 2x_t\)), sehingga selisih terbobotnya stabil.
Metode Engle-Granger (dua langkah):
Regresi jangka panjang: \[ y_t = \alpha + \beta x_t + u_t. \]
Uji stasioneritas terhadap residu \(u_t\) dengan uji akar unit (misalnya ADF).
Jika residu \(u_t\) stasioner (I(0)) \(\Rightarrow\) ada kointegrasi.
eg_model <- lm(y ~ x)
summary(eg_model)
##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.44898 -0.64740 -0.01813 0.67822 2.59433
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.01498 0.08321 0.18 0.857
## x 2.01590 0.02587 77.94 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9975 on 198 degrees of freedom
## Multiple R-squared: 0.9684, Adjusted R-squared: 0.9683
## F-statistic: 6074 on 1 and 198 DF, p-value: < 2.2e-16
res <- residuals(eg_model)
plot(res, type = "l",
main = "Residual Engle-Granger (u_t)",
xlab = "t", ylab = "Residual")
abline(h = 0, col = "red", lty = 2)
library(tseries) # menyediakan adf.test()
adf_test_res <- adf.test(res)
adf_test_res
##
## Augmented Dickey-Fuller Test
##
## data: res
## Dickey-Fuller = -7.5447, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
Interpretasi umum: jika p-value dari adf.test(res) < 0.05, kita menyimpulkan bahwa y dan x terkointegrasi.
Ringkasnya:
| Tanpa kointegrasi | Dengan kointegrasi |
|---|---|
| Tidak ada hubungan jangka panjang | Ada hubungan jangka panjang yang stabil |
| Regresi biasa \(\to\) spurious | OLS valid sebagai model jangka panjang |
| ARIMA terpisah per variabel | ECM/VECM untuk gabungan jangka pendek dan panjang |
Cara paling mudah untuk mengingat:
“Dua orang sahabat yang berjalan acak, tapi tak pernah terpisah jauh.”
Jarak di antara mereka stasioner \(\Rightarrow\) itulah kointegrasi.
Catatan ini bisa dikembangkan lebih lanjut menjadi: error correction model (ECM), uji Johansen/VECM untuk sistem multivariat, serta aplikasi pada data ekonomi riil.
Random walk adalah salah satu proses stokastik yang paling dasar dan penting dalam statistika dan time series analysis. Walaupun bentuk modelnya sederhana, random walk muncul di banyak bidang: keuangan (harga aset), ekonomi makro, fisika (pergerakan partikel), hingga proses sosial.
Dokumen ini akan membahas: definisi random walk (dengan dan tanpa drift), sifat-sifat statistiknya (mean, varians, non-stasioneritas), kaitannya dengan model AR(1), serta simulasi dan peramalannya di R.
Secara umum, suatu proses stokastik \(\{X_t\}\) dikatakan mengikuti random walk jika:
\[ X_t = X_{t-1} + \varepsilon_t, \qquad t = 1, 2, 3, \dots \]
dengan \(\varepsilon_t\) adalah white noise (i.i.d., mean nol, varians \(\sigma^2\)) dan \(X_0\) nilai awal proses.
Dengan melakukan substitusi berulang, kita dapat menuliskan:
\[ X_t = X_0 + \varepsilon_1 + \varepsilon_2 + \cdots + \varepsilon_t. \]
Jika \(X_0 = 0\), maka:
\[ X_t = \sum_{j=1}^t \varepsilon_j. \]
Dalam beberapa kasus, random walk ditambahkan drift \(\delta\):
\[ X_t = \delta + X_{t-1} + \varepsilon_t. \]
Jika \(\delta > 0\), maka secara rata-rata proses akan cenderung meningkat seiring waktu. Jika \(\delta < 0\), proses cenderung menurun.
Misalkan \(X_0 = 0\) dan \(\varepsilon_t \sim \mathrm{N}(0, \sigma^2)\) i.i.d.
Karena \(\mathbb{E}[\varepsilon_t] = 0\), maka:
\[ \mathbb{E}[X_t] = \mathbb{E}\left[ \sum_{j=1}^t \varepsilon_j \right] = \sum_{j=1}^t \mathbb{E}[\varepsilon_j] = 0. \]
Untuk random walk dengan drift \(\delta\):
\[ X_t = X_0 + t\delta + \sum_{j=1}^t \varepsilon_j \]
sehingga:
\[ \mathbb{E}[X_t] = X_0 + t\delta. \]
Karena \(\varepsilon_j\) saling bebas dan \(\mathrm{Var}(\varepsilon_j) = \sigma^2\):
\[ \mathrm{Var}(X_t) = \mathrm{Var}\left( \sum_{j=1}^t \varepsilon_j \right) = \sum_{j=1}^t \mathrm{Var}(\varepsilon_j) = t \sigma^2. \]
Jadi varians meningkat linear seiring waktu: semakin jauh ke depan, ketidakpastian makin besar.
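Hasil ini mudah dicek secara numerik; sketsa Monte Carlo kecil berikut (murni ilustratif) membandingkan varians empiris lintasan dengan garis teoretis \(t\sigma^2\):
## Cek Monte Carlo: varians empiris X_t tumbuh ~ t * sigma^2 (di sini sigma = 1)
set.seed(1)
M <- 2000; Tn <- 100
X <- apply(matrix(rnorm(M * Tn), M, Tn), 1, cumsum) # Tn x M: tiap kolom satu lintasan
plot(1:Tn, apply(X, 1, var), type = "l",
xlab = "t", ylab = "Var empiris", main = "Var(X_t) kira-kira t")
abline(a = 0, b = 1, col = "red", lty = 2) # garis teoretis t * sigma^2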
Sebuah proses dikatakan weakly stationary jika mean-nya konstan, variansnya konstan, dan autokovariansnya hanya bergantung pada lag.
Pada random walk: mean memang konstan (nol, tanpa drift), tetapi varians \(t\sigma^2\) membesar seiring waktu.
Karena varians tidak konstan, random walk adalah proses non-stasioner. Inilah alasan mengapa dalam analisis ARIMA, random walk biasanya perlu di-difference (differences orde 1):
\[ \nabla X_t = X_t - X_{t-1} = \varepsilon_t, \]
yang sudah merupakan white noise stasioner.
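Cek cepat (sketsa, dengan asumsi paket forecast sudah dimuat untuk ggAcf):
## Diff dari random walk seharusnya tampak seperti white noise (ACF ~ 0)
set.seed(3)
rw <- cumsum(rnorm(300))
ggAcf(diff(rw)) + labs(title = "ACF dari diff(Random Walk)")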
Model AR(1) dituliskan:
\[ Y_t = \phi Y_{t-1} + \varepsilon_t. \]
Jika \(\phi = 1\), model menjadi:
\[ Y_t = Y_{t-1} + \varepsilon_t. \]
Jadi random walk dapat dilihat sebagai batas dari model AR(1) ketika \(\phi \to 1\).
Beberapa contoh aplikasi:
Harga Saham (Log-Price)
Model klasik: log-price mengikuti random walk:
\[ p_t = p_{t-1} + \varepsilon_t \]
sehingga return \(r_t = p_t - p_{t-1} = \varepsilon_t\) menjadi white noise (kurang lebih).
Model Level dalam State-Space (Local Level)
Pada local level model:
\[ y_t = \mu_t + \varepsilon_t, \\ \mu_t = \mu_{t-1} + \eta_t. \]
Di sini \(\mu_t\) mengikuti random walk, menangkap level sistem yang berubah secara perlahan.
Pergerakan Partikel
Posisi partikel satu dimensi:
\[ X_t = X_{t-1} + \varepsilon_t \]
menggambarkan pergerakan acak (walk) di sepanjang garis.
Penyebaran Informasi / Proses Sosial
Kadang digunakan sebagai model sederhana untuk akumulasi opini/dampak kebijakan dari waktu ke waktu.
set.seed(123)
T <- 200
eps <- rnorm(T, mean = 0, sd = 1)
x <- cumsum(eps) # random walk (X_0 dianggap 0)
time <- 1:T
plot(time, x, type = "l", lwd = 2,
main = "Simulasi Random Walk (tanpa drift)",
xlab = "t", ylab = expression(X[t]))
abline(h = 0, col = "red", lty = 2)
set.seed(123)
T <- 200
delta <- 0.1 # drift positif
eps <- rnorm(T, mean = 0, sd = 1)
x_drift <- cumsum(delta + eps)
time <- 1:T
plot(time, x_drift, type = "l", lwd = 2,
main = "Simulasi Random Walk dengan Drift",
xlab = "t", ylab = expression(X[t]))
abline(a = 0, b = delta, col = "blue", lty = 2) # garis rata-rata kira-kira
Garis biru (\(\approx \delta t\)) memberi gambaran bahwa rata-rata proses naik linear.
Untuk melihat perbedaan, kita bandingkan white noise \(\varepsilon_t\) dengan random walk \(X_t = \sum_{j \le t} \varepsilon_j\):
set.seed(42)
T <- 200
eps <- rnorm(T)
x <- cumsum(eps)
par(mfrow = c(2, 1))
plot(1:T, eps, type = "l",
main = "White Noise",
xlab = "t", ylab = expression(epsilon[t]))
plot(1:T, x, type = "l",
main = "Random Walk = Cumulative Sum of White Noise",
xlab = "t", ylab = expression(X[t]))
par(mfrow = c(1, 1))
White noise berosilasi di sekitar nol dengan varians konstan. Random walk melayang ke atas/bawah dengan varians yang membesar.
Kita bandingkan AR(1) dengan \(\phi = 0.8\) (hampir satu) dan random walk (\(\phi = 1\)):
set.seed(777)
T <- 200
phi <- 0.8
eps1 <- rnorm(T)
eps2 <- rnorm(T)
# AR(1) dengan phi = 0.8
y_ar1 <- numeric(T)
y_ar1[1] <- 0
for (t in 2:T) {
y_ar1[t] <- phi * y_ar1[t-1] + eps1[t]
}
# Random walk
y_rw <- cumsum(eps2)
df_compare <- data.frame(
time = rep(1:T, 2),
value = c(y_ar1, y_rw),
process = rep(c("AR(1), phi=0.8", "Random Walk"), each = T)
)
ggplot(df_compare, aes(x = time, y = value, colour = process)) +
geom_line() +
labs(
title = "Perbandingan AR(1) Hampir Satu vs Random Walk",
x = "t",
y = "Nilai"
)
AR(1) dengan \(|\phi| < 1\) cenderung kembali ke suatu rentang (mean-reverting), sedangkan random walk cenderung mengembara lebih jauh.
Misalkan kita menggunakan random walk sebagai model:
\[ X_t = X_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{N}(0, \sigma^2). \]
Forecast \(h\)-langkah ke depan dari waktu \(t\):
Mean forecast: \[ \hat{X}_{t+h|t} = X_t. \]
Varians forecast: \[ \mathrm{Var}(X_{t+h} \mid X_t) = h \sigma^2. \]
Artinya: ramalan titik konstan pada nilai terakhir \(X_t\), sementara interval prediksi melebar sebanding \(\sqrt{h}\).
set.seed(123)
T <- 100
eps <- rnorm(T, sd = 1)
x <- cumsum(eps)
# kita anggap observasi terakhir di waktu T
x_T <- x[T]
h <- 20
sigma <- 1
# mean forecast konstan = x_T
fc_mean <- rep(x_T, h)
fc_lower <- x_T - qnorm(0.975) * sqrt(1:h * sigma^2)
fc_upper <- x_T + qnorm(0.975) * sqrt(1:h * sigma^2)
df_hist <- data.frame(
time = 1:T,
x = x
)
df_fc <- data.frame(
time = (T+1):(T+h),
mean = fc_mean,
lower = fc_lower,
upper = fc_upper
)
ggplot() +
geom_line(data = df_hist, aes(x = time, y = x), colour = "black") +
geom_line(data = df_fc, aes(x = time, y = mean), colour = "blue", linetype = "dashed") +
geom_ribbon(data = df_fc,
aes(x = time, ymin = lower, ymax = upper),
fill = "blue", alpha = 0.2) +
labs(
title = "Forecast Random Walk: Mean Konstan, Interval Melebar",
x = "t",
y = expression(X[t])
)
Terlihat bahwa mean forecast datar pada \(X_T\), sedangkan pita ketidakpastian 95% terus melebar seiring horizon.
Random walk sering muncul sebagai komponen laten dalam model state-space: misalnya level pada local level model (rw1) atau trend pada local trend model (rw2).
Dalam konteks Bayesian: random walk berperan sebagai prior pada komponen laten, dengan precision increment yang mengendalikan seberapa halus level boleh berubah.
Di Stan / bayesforecast, random walk diimplementasikan dengan prior pada state increment. Di INLA, random walk diimplementasikan melalui efek acak f(time, model = "rw1"), yang berarti tingkat (level) berubah secara smooth tetapi fleksibel mengikuti data.
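Sebagai sketsa (dengan asumsi paket INLA tersedia; nama objek di bawah hanyalah ilustrasi), local level model dapat ditaksir sebagai berikut:
## Local level: y_t = mu_t + e_t, dengan mu_t random walk (efek rw1 di INLA)
library(INLA)
set.seed(11)
Tn <- 150
mu <- cumsum(rnorm(Tn, 0, 0.3)) # level laten: random walk
df_ll <- data.frame(t = 1:Tn, y = mu + rnorm(Tn, 0, 0.5))
fit_ll <- inla(y ~ -1 + f(t, model = "rw1"),
data = df_ll, family = "gaussian",
control.predictor = list(compute = TRUE))
plot(df_ll$t, df_ll$y, pch = 16, col = "grey", xlab = "t", ylab = "y")
lines(df_ll$t, fit_ll$summary.random$t$mean, col = "blue", lwd = 2) # level terestimasi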
Ringkasan
Random walk adalah proses non-stasioner dengan varians yang tumbuh linear, merupakan kasus batas AR(1) saat \(\phi \to 1\), dan menjadi blok bangunan penting model state-space. Materi dan kode di atas dapat langsung digunakan sebagai bahan ajar, contoh praktikum, atau bagian bab dalam e-book Bayesian Time Series dan state-space modeling.
set.seed(999)
n_c <- 300
e1 <- rnorm(n_c, sd = 1)
e2 <- rnorm(n_c, sd = 1)
# Random walk 1
x_c <- cumsum(e1)
# y memiliki hubungan jangka panjang dengan x + noise kecil
y_c <- 1.5 * x_c + rnorm(n_c, sd = 2)
x_c <- ts(x_c)
y_c <- ts(y_c)
autoplot(cbind(x_c, y_c)) +
labs(title = "Dua Random Walk dengan Hubungan Jangka Panjang",
x = "Waktu", y = "Nilai")
Jika kita regresikan \(y_t\) terhadap \(x_t\):
fit_coint <- lm(y_c ~ x_c)
summary(fit_coint)
##
## Call:
## lm(formula = y_c ~ x_c)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.2779 -1.3196 -0.0169 1.3179 5.8573
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.1443 0.2908 -0.496 0.62
## x_c 1.4956 0.0166 90.116 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.956 on 298 degrees of freedom
## Multiple R-squared: 0.9646, Adjusted R-squared: 0.9645
## F-statistic: 8121 on 1 and 298 DF, p-value: < 2.2e-16
Residual:
res_c <- residuals(fit_coint)
res_df <- data.frame(
t = 1:length(res_c),
res = as.numeric(res_c)
)
ggplot(res_df, aes(x = t, y = res)) +
geom_line() +
labs(title = "Residual dari Regresi y_t ~ x_t",
x = "Waktu", y = "Residual")
ggAcf(res_c) +
labs(title = "ACF Residual (Diharapkan Stasioner jika Berkointegrasi)")
Secara formal, kointegrasi diuji dengan uji khusus (Engle–Granger, Johansen, dsb.), tetapi ilustrasi ini cukup untuk menggambarkan ide:
Dalam materi ini, kita telah: mengenal white noise dan stasioneritas, mensimulasikan serta menaksir model AR, MA, ARIMA, dan SARIMA, memperkenalkan kointegrasi, dan melakukan estimasi dengan arima() dan auto.arima.
Materi ini bisa menjadi pijakan awal untuk analisis deret waktu lebih lanjut, seperti: model VAR/VECM, model state-space, serta pendekatan Bayesian yang dibahas pada bab-bab berikut.
Dalam analisis deret waktu klasik, parameter model dipandang sebagai konstanta tidak diketahui dan diestimasi dengan metode seperti Maximum Likelihood (ML). Pada pendekatan Bayesian, parameter diperlakukan sebagai peubah acak yang memiliki distribusi prior. Informasi dari data masuk melalui likelihood sehingga menghasilkan distribusi posterior.
Dokumen singkat ini merangkum elemen dasar pendekatan Bayesian untuk pemodelan deret waktu. Fokusnya adalah pada struktur likelihood-prior-posterior, prediksi Bayesian, pilihan komputasi posterior, serta ringkasan posterior yang umum digunakan.
Pendekatan Bayesian sangat natural untuk deret waktu karena ketidakpastian parameter otomatis masuk ke dalam distribusi prediktif, informasi awal dapat dimasukkan lewat prior, dan keyakinan dapat diperbarui secara sekuensial saat data baru tiba.
Secara umum, inferensi mengikuti
Teorema Bayes:
\[ p(\theta \mid y_{1:T}) \propto p(y_{1:T} \mid \theta)\, p(\theta) \]
dengan \(y_{1:T}\) data yang diamati, \(p(y_{1:T} \mid \theta)\) likelihood, \(p(\theta)\) prior, dan \(p(\theta \mid y_{1:T})\) posterior.
Posterior menggabungkan informasi dari data dan prior. Dalam praktik, komputasi posterior sering kali tidak memiliki bentuk tertutup, sehingga dibutuhkan pendekatan numerik (MCMC, INLA, Laplace, dsb.).
Tujuan penting dalam deret waktu adalah memprediksi nilai di masa depan. Secara Bayesian, prediksi satu langkah ke depan diberikan oleh:
\[ p(y_{T+1} \mid y_{1:T}) = \int p(y_{T+1} \mid \theta, y_{1:T})\, p(\theta \mid y_{1:T})\, d\theta. \]
Artinya, kita mengintegrasikan ketidakpastian parameter \(\theta\) melalui posterior \(p(\theta \mid y_{1:T})\).
Dalam deret waktu stasioner (misal AR/ARIMA), umumnya:
\[ p(y_{T+1} \mid \theta, y_{1:T}) = p(y_{T+1} \mid \theta, y_T, y_{T-1}, \dots), \]
sehingga prediksi cukup bergantung pada nilai–nilai terakhir, bukan seluruh sejarah lengkap, selama asumsi stasioneritas terpenuhi.
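Integral prediktif di atas mudah dihampiri secara Monte Carlo. Sketsa berikut murni ilustratif: phi_samp dan sigma2_samp di sini dibuat sebagai draw dummy, sedangkan dalam praktik keduanya adalah draw dari posterior \(p(\theta \mid y_{1:T})\):
## Prediksi satu langkah AR(1) via Monte Carlo atas draw posterior
set.seed(1)
phi_samp <- rnorm(4000, 0.6, 0.05) # dummy: draw posterior phi
sigma2_samp <- 1 / rgamma(4000, 50, rate = 45) # dummy: draw posterior sigma^2
y_T <- 1.2 # nilai terakhir deret (contoh)
y_next <- rnorm(4000, mean = phi_samp * y_T, sd = sqrt(sigma2_samp))
quantile(y_next, c(0.025, 0.5, 0.975)) # interval prediktif 95%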
Bagian ini merangkum komponen yang sering muncul pada berbagai model deret waktu Bayesian.
Secara generik, model deret waktu dapat ditulis sebagai:
\[ y_t = f_t(\theta, y_{t-1}, y_{t-2}, \dots) + \varepsilon_t, \]
dengan \(\varepsilon_t\) merupakan komponen error. Dalam banyak model linear–Gaussian, diambil:
\[ \varepsilon_t \sim \mathrm{N}(0, \sigma^2), \]
sehingga \(y_t\) bersifat normal bersyarat pada masa lalu dan parameter \(\theta\).
Contoh sederhana:
Model AR(1):
\[ y_t = \phi\, y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim \mathrm{N}(0, \sigma^2). \]
Di sini \(\theta = (\phi, \sigma^2)\).
Untuk model linear–Gaussian, salah satu prior yang sering digunakan adalah Normal–Inverse-Gamma (NIG) untuk parameter mean/koefisien dan varians.
Misalkan \(\theta\) adalah vektor koefisien regresi dan \(\sigma^2\) adalah varians error, maka:
\[ \theta \mid \sigma^2 \sim \mathrm{N}(m_0,\; \sigma^2 V_0), \]
\[ \sigma^2 \sim \mathrm{InvGamma}(a_0,\; b_0). \]
Kombinasi Normal–Inverse-Gamma ini bersifat konjugat untuk model linear–Gaussian, sehingga posterior tetap berada dalam keluarga yang sama (Normal–Inverse-Gamma) dan dapat diturunkan secara analitik (untuk kasus tanpa struktur deret waktu yang rumit).
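Sebagai ilustrasi kecil (nilai hiperparameter di bawah hanya contoh), menarik sampel dari prior NIG dapat dilakukan dua tahap:
## Draw dari prior NIG: sigma^2 ~ Inv-Gamma(a0, b0), lalu theta | sigma^2 ~ Normal
set.seed(5)
a0 <- 2; b0 <- 1; m0 <- 0; V0 <- 10
sigma2_prior <- 1 / rgamma(5000, shape = a0, rate = b0)
theta_prior <- rnorm(5000, m0, sqrt(sigma2_prior * V0))
quantile(theta_prior, c(0.025, 0.5, 0.975)) # sebaran prior implikatif untuk theta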
Untuk model sederhana, posterior dapat dituliskan dalam bentuk tertutup. Namun, untuk banyak model deret waktu yang lebih kompleks (ARIMA umum, model keadaan–ruang, nonlinier, heteroskedastik, dsb.), bentuk tertutup sukar atau tidak tersedia.
Dalam kasus tersebut, digunakan metode numerik, misalnya:
MCMC (Markov Chain Monte Carlo)
Metode ini membangkitkan sampel dari posterior \(p(\theta \mid y_{1:T})\) melalui rantai Markov. Beberapa skema MCMC yang populer:
Gibbs sampler: meng-update masing-masing komponen \(\theta\) secara bergantian dari distribusi bersyarat lengkap. Contoh: \(\theta_1^{(s+1)} \sim p(\theta_1 \mid \theta_2^{(s)}, y)\), \(\theta_2^{(s+1)} \sim p(\theta_2 \mid \theta_1^{(s+1)}, y)\), dan seterusnya.
Metropolis–Hastings (MH): mengusulkan kandidat \(\theta^\star\) dari proposal \(q(\theta^\star \mid \theta^{(s)})\) dan menerima/menolak kandidat tersebut dengan probabilitas berdasarkan rasio posterior dan proposal; lihat sketsa setelah daftar ini.
Hamiltonian Monte Carlo (HMC): digunakan, misalnya, dalam Stan. Menggunakan ide dinamika Hamiltonian untuk menjelajahi ruang parameter secara lebih efisien, terutama pada dimensi tinggi.
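Sketsa minimal random-walk Metropolis–Hastings untuk \(\phi\) pada model AR(1) (asumsi ilustratif: \(\sigma^2 = 1\) diketahui dan prior \(\phi \sim N(0,1)\); bukan implementasi produksi):
## Random-walk MH untuk posterior phi pada AR(1)
set.seed(1)
y_mh <- as.numeric(arima.sim(list(ar = 0.6), n = 300)) # data simulasi
log_post <- function(phi) {
if (abs(phi) >= 1) return(-Inf) # batasi ke daerah stasioner
sum(dnorm(y_mh[-1], mean = phi * y_mh[-length(y_mh)], sd = 1, log = TRUE)) +
dnorm(phi, 0, 1, log = TRUE) # log-likelihood bersyarat + log-prior
}
S <- 5000
phi_chain <- numeric(S) # phi awal = 0
for (s in 2:S) {
cand <- phi_chain[s - 1] + rnorm(1, 0, 0.1) # proposal random walk
logr <- log_post(cand) - log_post(phi_chain[s - 1])
phi_chain[s] <- if (log(runif(1)) < logr) cand else phi_chain[s - 1]
}
quantile(phi_chain[-(1:1000)], c(0.025, 0.5, 0.975)) # ringkasan setelah burn-in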
Teknik lain di luar MCMC (misalnya INLA untuk model Gaussian Markov Random Field) juga sering digunakan untuk deret waktu, namun tidak dibahas detail di bagian singkat ini.
Setelah posterior \(p(\theta \mid y)\) (dan posterior prediktif) diperoleh, beberapa ringkasan utama yang sering digunakan:
Posterior mean:
\[ \mathbb{E}[\theta \mid y], \]
digunakan sebagai point estimate untuk parameter \(\theta\). Ini adalah analog Bayesian dari estimator titik (misal MLE), namun berbasis distribusi posterior.
Credible Interval
Credible interval 95% untuk parameter \(\theta\) adalah interval \([l, u]\) yang memenuhi:
\[ \Pr(l \le \theta \le u \mid y) = 0.95. \]
Interpretasinya: dengan mempertimbangkan model dan prior yang digunakan, peluang \(\theta\) berada di antara \(l\) dan \(u\) adalah 95%.
Berbeda dengan confidence interval frekuentis, probabilitas di sini melekat pada parameter (sebagai besaran acak) dalam kerangka Bayesian.
Untuk data baru \(y_{\text{new}}\), distribusi prediktif posterior adalah:
\[ p(y_{\text{new}} \mid y) = \int p(y_{\text{new}} \mid \theta)\, p(\theta \mid y)\, d\theta. \]
Dalam konteks deret waktu, ini menjadi:
\[ p(y_{T+1} \mid y_{1:T}) = \int p(y_{T+1} \mid \theta, y_{1:T})\, p(\theta \mid y_{1:T})\, d\theta. \]
Distribusi ini menggabungkan ketidakpastian model dan parameter, sehingga memberi gambaran penuh tentang ketidakpastian prediksi (bukan hanya satu angka ramalan).
Ringkasan:
Bagian–bagian selanjutnya dapat diisi dengan contoh konkret (misalnya model AR(1) Bayesian, ARIMA Bayesian, atau model keadaan–ruang (state–space) dengan INLA) beserta kode R dan visualisasi hasil.
Sebagai pemanasan, kita buat satu deret waktu AR(1) sederhana yang akan digunakan beberapa kali.
n <- 200
phi_true <- 0.6
sigma_true <- 1
y <- numeric(n)
y[1] <- rnorm(1, 0, sigma_true / sqrt(1 - phi_true^2))
for (t in 2:n) {
y[t] <- phi_true * y[t-1] + rnorm(1, 0, sigma_true)
}
df_y <- data.frame(
t = 1:n,
y = y
)
ggplot(df_y, aes(t, y)) +
geom_line() +
labs(
title = "Simulasi Deret Waktu AR(1)",
x = "Waktu",
y = "y_t"
) +
theme_minimal()
Plot memperlihatkan pola fluktuasi dengan memory jangka pendek: nilai sekarang cenderung mengikuti nilai sebelumnya karena \(\phi = 0.6\) positif.
knitr::opts_chunk$set(
message = FALSE,
warning = FALSE,
fig.width = 7,
fig.height = 4.5
)
library(ggplot2)
library(forecast)
set.seed(123)
Model AR(1) standar:
\[ y_t = \phi\, y_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm N(0,\sigma^2). \]
Model AR(\(p\)):
\[ y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t . \]
Dalam bentuk regresi:
\[ y_t = \mathrm{x}_t^\top \boldsymbol{\phi} + \varepsilon_t, \qquad \mathrm{x}_t = \begin{pmatrix} y_{t-1}\\ y_{t-2}\\ \vdots\\ y_{t-p} \end{pmatrix}, \]
dengan vektor parameter
\[ \boldsymbol{\phi} = \begin{pmatrix} \phi_1 \\ \phi_2 \\ \vdots\\ \phi_p \end{pmatrix}. \]
Jika kita kumpulkan semua observasi (dari \(t = p+1,\dots,T\)) menjadi:
\[ \mathrm{y} = \begin{pmatrix} y_{p+1}\\ y_{p+2}\\ \vdots\\ y_T \end{pmatrix}, \quad \mathrm{X} = \begin{pmatrix} y_p & y_{p-1} & \cdots & y_1 \\ y_{p+1} & y_p & \cdots & y_2 \\ \vdots & \vdots & & \vdots \\ y_{T-1} & y_{T-2} & \cdots & y_{T-p} \end{pmatrix}, \]
maka model dapat ditulis ringkas sebagai:
\[ \mathrm{y} = \mathrm{X}\boldsymbol{\phi} + \boldsymbol{\varepsilon}, \qquad \boldsymbol{\varepsilon} \sim \mathrm N(\mathrm{0},\, \sigma^2\mathrm{I}). \]
Ambil prior konjugat:
\[ \boldsymbol{\phi} \mid \sigma^2 \sim \mathrm N(\mathrm{m}_0,\; \sigma^2 \mathrm{V}_0), \]
\[ \sigma^2 \sim \text{Inv-Gamma}(a_0, b_0). \]
Untuk AR(1), \(\boldsymbol{\phi} = \phi\) skalar, sehingga \(\mathrm{V}_0\) adalah matriks \(1\times 1\) (sekadar varian prior).
Dengan likelihood Gaussian dan prior seperti di atas, posterior tetap berada dalam keluarga Normal–Inverse-Gamma.
\[ \mathrm{V}_n = \left( \mathrm{V}_0^{-1} + \mathrm{X}^\top\mathrm{X} \right)^{-1}. \]
\[ \mathrm{m}_n = \mathrm{V}_n \left( \mathrm{V}_0^{-1} \mathrm{m}_0 + \mathrm{X}^\top \mathrm{y} \right). \]
Dengan \(n = T-p\) banyaknya baris dalam \(\mathrm{X}\) dan \(\mathrm{y}\):
\[ a_n = a_0 + \frac{n}{2}, \]
\[ b_n = b_0 + \frac{1}{2} \left( \mathrm{y}^\top \mathrm{y} + \mathrm{m}_0^\top \mathrm{V}_0^{-1} \mathrm{m}_0 - \mathrm{m}_n^\top \mathrm{V}_n^{-1} \mathrm{m}_n \right). \]
\[ \boldsymbol{\phi} \mid \sigma^2, \mathrm{y} \sim \mathrm N(\mathrm{m}_n,\; \sigma^2 \mathrm{V}_n). \]
\[ \sigma^2 \mid \mathrm{y} \sim \text{Inv-Gamma}(a_n, b_n). \]
Posterior marginal \(p(\boldsymbol{\phi} \mid \mathrm{y})\) mengikuti distribusi Student-t multivariat, tetapi dalam praktik sering kali kita bekerja dengan sampling dua tahap dari posterior bersama: tarik \(\sigma^2\) dari Inverse-Gamma, lalu \(\boldsymbol{\phi} \mid \sigma^2\) dari normal.
Kita simulasi AR(1) dengan parameter benar \(\phi = 0.7\) dan \(\sigma^2 = 1\).
set.seed(123)
Tn <- 300
phi0 <- 0.7
sigma0 <- 1
y <- numeric(Tn)
y[1] <- 0
for (t in 2:Tn) {
y[t] <- phi0 * y[t-1] + rnorm(1, 0, sqrt(sigma0))
}
ts_y <- ts(y)
autoplot(ts_y) +
labs(title = "Simulasi AR(1): phi = 0.7",
x = "Waktu", y = "y_t")
Untuk AR(1): \(p=1\), sehingga
p <- 1
y_vec <- y[(p+1):Tn] # dari t=p+1 s/d T
X_mat <- cbind(y[p:(Tn-1)]) # kolom y_{t-1}
colnames(X_mat) <- "y_lag1"
n <- length(y_vec)
head(cbind(y = y_vec, X_mat))
## y y_lag1
## [1,] -0.5604756 0.0000000
## [2,] -0.6225104 -0.5604756
## [3,] 1.1229510 -0.6225104
## [4,] 0.8565741 1.1229510
## [5,] 0.7288896 0.8565741
## [6,] 2.2252877 0.7288896
Misalkan kita ingin prior lemah/informatif-ringan:
m0 <- 0 # prior mean phi
V0 <- matrix(10) # prior variance (1x1)
a0 <- 2
b0 <- 2
# Vn
Vn_inv <- solve(V0) + crossprod(X_mat) # V0^{-1} + X'X
Vn <- solve(Vn_inv)
# mn
mn <- Vn %*% (solve(V0) %*% m0 + crossprod(X_mat, y_vec))
# an, bn
an <- a0 + n/2
bn <- b0 + 0.5 * (
crossprod(y_vec) +
t(m0) %*% solve(V0) %*% m0 -
t(mn) %*% Vn_inv %*% mn
)
Vn; mn; an; bn
## y_lag1
## y_lag1 0.00222412
## [,1]
## y_lag1 0.6419021
## [1] 151.5
## [,1]
## [1,] 134.3901
Interpretasi: mean posterior \(m_n \approx 0.64\) cukup dekat dengan nilai benar \(\phi_0 = 0.7\), dan \(V_n\) yang kecil menunjukkan posterior yang presisi; \(a_n\) dan \(b_n\) menentukan posterior \(\sigma^2\).
Posterior marginal \(p(\phi \mid y)\) adalah Student-\(t\) dengan mean \(m_n\), derajat bebas \(\nu = 2a_n\), dan
\[ \operatorname{Var}(\phi \mid y) = \frac{b_n}{(a_n - 1)} V_n \]
(jika \(a_n > 1\)).
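Interval kredibel 95% juga bisa dihitung analitik dari kuantil Student-\(t\) (sketsa, memakai mn, Vn, an, bn di atas), sebagai pembanding hasil sampling di bawah:
## Interval kredibel 95% analitik dari marginal Student-t untuk phi
nu <- 2 * an # derajat bebas
s_phi <- sqrt(as.numeric((bn / an) * Vn)) # skala t
as.numeric(mn) + s_phi * qt(c(0.025, 0.975), df = nu)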
Untuk kepraktisan, kita ambil pendekatan Monte Carlo: sampling dari posterior untuk mendapatkan ringkasan dan credible interval.
n_samp <- 5000
# Sampel sigma^2 ~ Inv-Gamma(an, bn)
# (gunakan bahwa jika x ~ Gamma(shape, rate), maka 1/x ~ Inv-Gamma)
sigma2_samp <- 1 / rgamma(n_samp, shape = an, rate = bn)
phi_samp <- numeric(n_samp)
for (i in 1:n_samp) {
phi_samp[i] <- rnorm(1, mean = mn, sd = sqrt(sigma2_samp[i] * Vn))
}
summary(phi_samp)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.4671 0.6108 0.6412 0.6412 0.6716 0.7948
quantile(phi_samp, probs = c(0.025, 0.5, 0.975))
## 2.5% 50% 97.5%
## 0.5545854 0.6411548 0.7286380
Plot posterior \(\phi\):
phi_df <- data.frame(phi = phi_samp)
ggplot(phi_df, aes(x = phi)) +
geom_histogram(aes(y = ..density..), bins = 40, alpha = 0.7) +
geom_vline(xintercept = phi0, color = "red", linetype = "dashed", size = 1) +
labs(title = "Posterior phi (AR(1))",
x = expression(phi), y = "Densitas") +
theme_minimal()
Garis merah menunjukkan nilai true \(\phi_0 = 0.7\).
Asumsi utama AR(1): \(|\phi| < 1\) (stasioneritas) dan error Gaussian i.i.d. dengan varians konstan.
Dalam pendekatan Bayesian, kita bisa: menghitung probabilitas posterior \(P(|\phi| < 1 \mid y)\) secara langsung, dan memeriksa residual untuk menilai kecocokan model.
Cek numerik:
mean(abs(phi_samp) < 1)
## [1] 1
Jika nilai ini ~1, maka hampir seluruh posterior \(\phi\) berada pada nilai stasioner.
Plot posterior dengan garis di -1 dan 1:
ggplot(phi_df, aes(x = phi)) +
geom_histogram(aes(y = ..density..), bins = 40, alpha = 0.7) +
geom_vline(xintercept = c(-1, 1), linetype = "dotted") +
labs(title = "Posterior phi dan Batas Stasioneritas (-1, 1)",
x = expression(phi), y = "Densitas") +
theme_minimal()
Gunakan satu ringkasan titik (misalnya mean posterior \(m_n\)) sebagai plug-in estimator untuk \(\phi\), lalu bangun residual:
\[ \hat{\varepsilon}_t = y_t - \hat{\phi}\,y_{t-1}. \]
phi_hat <- as.numeric(mn)
eps_hat <- y[2:Tn] - phi_hat * y[1:(Tn-1)]
ts_eps <- ts(eps_hat)
autoplot(ts_eps) +
labs(title = "Residual Plug-in (phi = mean posterior)",
x = "Waktu", y = expression(hat(epsilon)[t]))
ggAcf(ts_eps) + labs(title = "ACF Residual AR(1) (phi posterior mean)")
Box.test(ts_eps, lag = 10, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: ts_eps
## X-squared = 7.861, df = 10, p-value = 0.6424
Jika p-value Ljung–Box besar, tidak ada bukti kuat adanya autokorelasi residual yang tersisa.
qqnorm(eps_hat)
qqline(eps_hat, col = "red", lwd = 2)
Untuk analisis yang lebih Bayesian, idealnya residual dihitung untuk banyak draw posterior \(\phi\) (posterior predictive checking), bukan hanya satu nilai plug-in.
Tetapi sebagai ilustrasi, plug-in menggunakan \(\hat{\phi} = m_n\) sudah cukup memberi intuisi.
Alih-alih uji hipotesis klasik (t-test), di Bayesian kita lebih suka credible interval dan probabilitas posterior langsung.
Pertanyaan klasik: apakah ada autokorelasi?
Hipotesis: \(H_0{:}\ \phi = 0\) (tidak ada autokorelasi) versus \(H_1{:}\ \phi \neq 0\).
Jika credible interval 95% untuk \(\phi\) tidak mencakup 0, kita dapat menyimpulkan bahwa secara Bayesian ada bukti kuat bahwa \(\phi \neq 0\).
quantile(phi_samp, probs = c(0.025, 0.5, 0.975))
## 2.5% 50% 97.5%
## 0.5545854 0.6411548 0.7286380
mean(phi_samp > 0)
## [1] 1
Jika nilai ini sangat dekat dengan 1 (misalnya > 0.99), interpretasinya:
Dengan mempertimbangkan prior dan data, probabilitas posterior bahwa \(\phi > 0\) adalah ~0.999 (misalnya), sehingga hampir pasti bahwa autokorelasi positif.
Demikian juga bisa dicek \(P(|\phi| < 1 \mid y)\) untuk stasioneritas:
mean(abs(phi_samp) < 1)
## [1] 1
Untuk pembanding, kita bisa menaksir AR(1) dengan fungsi arima/Arima:
fit_arima <- Arima(ts_y, order = c(1,0,0), include.mean = FALSE)
fit_arima
## Series: ts_y
## ARIMA(1,0,0) with zero mean
##
## Coefficients:
## ar1
## 0.6399
## s.e. 0.0441
##
## sigma^2 = 0.8854: log likelihood = -407.19
## AIC=818.38 AICc=818.42 BIC=825.79
Keunggulan Bayesian: kita memperoleh distribusi posterior penuh (bukan hanya titik dan standard error), dapat membuat pernyataan probabilistik langsung seperti \(P(\phi > 0 \mid y)\), dan dapat memasukkan informasi prior.
Untuk AR(\(p\)) umum, semua formula sebelumnya tetap berlaku, dengan:
\[ \boldsymbol{\phi} \mid \sigma^2 \sim \mathrm N(\mathrm{m}_0,\; \sigma^2 \mathrm{V}_0), \]
\[ \sigma^2 \sim \text{Inv-Gamma}(a_0,b_0). \]
Posterior:
\[ \mathrm{V}_n = \left( \mathrm{V}_0^{-1} + \mathrm{X}^\top\mathrm{X} \right)^{-1}, \]
\[ \mathrm{m}_n = \mathrm{V}_n \left( \mathrm{V}_0^{-1}\mathrm{m}_0 + \mathrm{X}^\top\mathrm{y} \right), \]
\[ a_n = a_0 + \frac{n}{2}, \]
\[ b_n = b_0 + \frac{1}{2} \left( \mathrm{y}^\top \mathrm{y} + \mathrm{m}_0^\top \mathrm{V}_0^{-1}\mathrm{m}_0 - \mathrm{m}_n^\top \mathrm{V}_n^{-1}\mathrm{m}_n \right). \]
Implementasi R untuk AR(\(p\)) mengikuti pola yang sama dengan kasus AR(1); sketsa fungsi generiknya diberikan di bawah.
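Sketsa minimal (fungsi ar_p_posterior hanyalah ilustrasi penulis, bukan bagian dari paket mana pun): bangun matriks lag, lalu terapkan formula \(m_n, V_n, a_n, b_n\) di atas:
## Posterior NIG untuk AR(p) dari deret y
ar_p_posterior <- function(y, p, m0 = rep(0, p), V0 = diag(10, p),
a0 = 2, b0 = 2) {
T_len <- length(y)
y_vec <- y[(p + 1):T_len]
X_mat <- sapply(1:p, function(j) y[(p + 1 - j):(T_len - j)]) # kolom y_{t-1}, ..., y_{t-p}
X_mat <- as.matrix(X_mat)
V0_inv <- solve(V0)
Vn <- solve(V0_inv + crossprod(X_mat))
mn <- Vn %*% (V0_inv %*% m0 + crossprod(X_mat, y_vec))
an <- a0 + length(y_vec) / 2
bn <- as.numeric(b0 + 0.5 * (crossprod(y_vec) +
t(m0) %*% V0_inv %*% m0 - t(mn) %*% solve(Vn) %*% mn))
list(mn = as.vector(mn), Vn = Vn, an = an, bn = bn)
}
# Contoh pemakaian: ar_p_posterior(y, p = 2)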
Ringkasan
Dalam bab ini kita sudah: menurunkan posterior Normal–Inverse-Gamma untuk model AR, menghitung parameter posterior secara analitik, mengambil sampel posterior untuk ringkasan dan credible interval, serta memeriksa residual dan stasioneritas.
Bagian ini bisa dijadikan fondasi untuk bab lanjutan, misalnya ARIMA Bayesian, model state-space, atau model volatilitas.
# Data y (dari simulasi global)
p <- 1
T_eff <- length(y) - p
y_vec <- y[(p+1):length(y)]
X_mat <- as.matrix(y[1:(length(y)-p)])
# Prior
m0 <- 0
V0 <- 10 # var prior besar = prior lemah
a0 <- 2
b0 <- 1
# Posterior
Vn <- 1 / (1/V0 + t(X_mat) %*% X_mat)
mn <- Vn * (m0 / V0 + t(X_mat) %*% y_vec)
an <- a0 + T_eff/2
bn <- b0 + 0.5 * (t(y_vec) %*% y_vec + m0^2 / V0 - mn^2 / Vn)
phi_post_mean <- mn
sigma2_post_mean <- bn / (an - 1)
phi_post_mean
## [,1]
## [1,] 0.6419021
sigma2_post_mean
## [,1]
## [1,] 0.8863132
Kita dapat menggambar posterior \(\phi\) dan \(\sigma^2\) menggunakan sampling:
n_samp <- 5000
sigma2_samp <- 1 / rgamma(n_samp, shape = an, rate = bn) # Inv-Gamma
phi_samp <- rnorm(n_samp, mean = mn, sd = sqrt(sigma2_samp * Vn))
summary(phi_samp)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.4610 0.6127 0.6423 0.6424 0.6714 0.8157
quantile(phi_samp, probs = c(0.025, 0.5, 0.975))
## 2.5% 50% 97.5%
## 0.5550878 0.6422816 0.7304690
Plot distribusi posterior:
phi_df <- data.frame(phi = phi_samp)
ggplot(phi_df, aes(phi)) +
geom_histogram(bins = 40, aes(y = ..density..)) +
geom_vline(xintercept = phi_true, linetype = "dashed", color = "red") +
labs(
title = "Posterior phi untuk Model AR(1)",
x = expression(phi),
y = "Kerapatan"
) +
theme_minimal()
knitr::opts_chunk$set(
message = FALSE,
warning = FALSE,
fig.width = 7,
fig.height = 4.5
)
library(ggplot2)
library(forecast)
set.seed(123)
Model ARIMA menangkap komponen autoregressive (AR), differencing (I), dan moving average (MA).
Secara umum:
\[ \phi(B) (1 - B)^d y_t = \theta(B) \, \varepsilon_t, \]
dengan \(\phi(B)\) polinomial AR orde \(p\), \(\theta(B)\) polinomial MA orde \(q\), \(B\) backshift operator, dan \(d\) ordo differencing.
Jika kita definisikan deret terdiferensi
\[ w_t = (1-B)^d y_t, \]
maka model dapat ditulis sebagai ARMA(\(p,q\)) untuk \(w_t\):
\[ \phi(B) w_t = \theta(B) \, \varepsilon_t. \]
Pada pendekatan Bayesian, secara konseptual kita memberikan prior untuk parameter \(\boldsymbol{\phi}\), \(\boldsymbol{\theta}\), dan \(\sigma^2\).
Secara umum:
\[ p(\boldsymbol{\phi}, \boldsymbol{\theta}, \sigma^2 \mid \mathbf{w}) \propto L(\mathbf{w} \mid \boldsymbol{\phi}, \boldsymbol{\theta}, \sigma^2) \; p(\boldsymbol{\phi}, \boldsymbol{\theta}, \sigma^2), \]
dengan \(\mathbf{w}\) adalah vektor observasi terdiferensi.
Untuk ARIMA(\(p,d,0\)) (tanpa komponen MA), model menjadi lebih sederhana:
\[ w_t = \phi_1 w_{t-1} + \cdots + \phi_p w_{t-p} + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm N(0,\sigma^2). \]
Ini persis model AR(\(p\)) pada deret \(w_t\), sehingga seluruh mesin konjugat Normal–Inverse-Gamma dari bab AR dapat dipakai langsung.
Untuk ARIMA(\(p,d,q\)) dengan \(q > 0\) (ada MA): likelihood tidak lagi linear dalam parameter, konjugasi hilang, dan estimasi umumnya memerlukan MCMC atau aproksimasi numerik.
Dalam file ini, kita fokus pada kasus yang punya bentuk tertutup yang elegan, yaitu ARIMA(1,1,0). Ini memberi contoh lengkap: differencing, posterior analitik, sampling, dan diagnostik.
ARIMA(1,1,0) dapat ditulis:
\[ (1 - \phi B)(1-B) y_t = \varepsilon_t, \qquad \varepsilon_t \sim \mathrm N(0,\sigma^2). \]
Definisikan
\[ w_t = (1-B) y_t = y_t - y_{t-1}. \]
Maka model menjadi:
\[ w_t = \phi w_{t-1} + \varepsilon_t, \]
yaitu AR(1) pada deret terdiferensi \(w_t\).
Jadi, pendekatan Bayesian untuk ARIMA(1,1,0) dapat dilakukan dengan men-differencing \(y_t\) satu kali, lalu menerapkan mesin AR(1) Normal–Inverse-Gamma pada \(w_t\).
Kita simulasi ARIMA(1,1,0) dengan parameter \(\phi = 0.6\) dan \(\sigma^2 = 1\):
set.seed(2024)
n <- 300
phi_true <- 0.6
sigma2_true <- 1
y_arima <- arima.sim(
list(order = c(1,1,0), ar = phi_true),
n = n
)
ts_y_arima <- ts(y_arima)
autoplot(ts_y_arima) +
labs(title = "Simulasi ARIMA(1,1,0): phi = 0.6",
x = "Waktu", y = "y_t")
Deret ini tidak stasioner (mengandung unit root karena terintegrasi orde 1).
w <- diff(ts_y_arima)
ts_w <- ts(w)
autoplot(ts_w) +
labs(title = "Deret Terdiferensi: w_t = y_t - y_{t-1}",
x = "Waktu", y = "w_t")
ggAcf(ts_w) + labs(title = "ACF w_t (seharusnya AR(1))")
ggPacf(ts_w) + labs(title = "PACF w_t")
Dari ACF/PACF, kita harapkan pola AR(1).
Untuk AR(1) pada \(w_t\):
\[ w_t = \phi w_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm N(0,\sigma^2). \]
Ambil \(w_t\) sebagai respon dan \(w_{t-1}\) sebagai prediktor:
w_vec <- as.numeric(ts_w)
n_w <- length(w_vec)
y_vec <- w_vec[2:n_w] # respon (w_t)
X_mat <- cbind(w_vec[1:(n_w-1)]) # predictor (w_{t-1})
colnames(X_mat) <- "w_lag1"
n_eff <- length(y_vec)
head(cbind(y = y_vec, X_mat))
## y w_lag1
## [1,] -3.0327990 0.4024780
## [2,] -1.9677902 -3.0327990
## [3,] -2.1106128 -1.9677902
## [4,] -0.8121534 -2.1106128
## [5,] -2.0933993 -0.8121534
## [6,] -3.2675109 -2.0933993
Gunakan prior lemah (diffuse):
m0 <- 0
V0 <- matrix(10)
a0 <- 2
b0 <- 2
Seperti pada kasus AR(1) umum:
\[ V_n^{-1} = V_0^{-1} + X^\top X, \]
\[ V_n = (V_0^{-1} + X^\top X)^{-1}, \]
\[ m_n = V_n \left( V_0^{-1} m_0 + X^\top y \right), \]
\[ a_n = a_0 + \frac{n_{\text{eff}}}{2}, \]
\[ b_n = b_0 + \frac{1}{2} \left( y^\top y + m_0^\top V_0^{-1} m_0 - m_n^\top V_n^{-1} m_n \right). \]
Vn_inv <- solve(V0) + crossprod(X_mat)
Vn <- solve(Vn_inv)
mn <- Vn %*% (solve(V0) %*% m0 + crossprod(X_mat, y_vec))
an <- a0 + n_eff/2
bn <- b0 + 0.5 * (
crossprod(y_vec) +
t(m0) %*% solve(V0) %*% m0 -
t(mn) %*% Vn_inv %*% mn
)
Vn; mn; an; bn
## w_lag1
## w_lag1 0.00221906
## [,1]
## w_lag1 0.6038559
## [1] 151.5
## [,1]
## [1,] 146.7676
Kita sampel dari posterior:
n_samp <- 5000
sigma2_samp <- 1 / rgamma(n_samp, shape = an, rate = bn)
phi_samp <- numeric(n_samp)
for (i in 1:n_samp) {
phi_samp[i] <- rnorm(1, mean = mn, sd = sqrt(sigma2_samp[i] * Vn))
}
summary(phi_samp)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.4188 0.5724 0.6022 0.6033 0.6351 0.7595
quantile(phi_samp, probs = c(0.025, 0.5, 0.975))
## 2.5% 50% 97.5%
## 0.5111274 0.6022018 0.6937221
Plot posterior \(\phi\):
phi_df <- data.frame(phi = phi_samp)
ggplot(phi_df, aes(x = phi)) +
geom_histogram(aes(y = ..density..), bins = 40, alpha = 0.7) +
geom_vline(xintercept = phi_true, color = "red", linetype = "dashed", size = 1) +
labs(title = "Posterior phi (AR(1) pada Δy_t)",
subtitle = "Simulasi ARIMA(1,1,0)",
x = expression(phi), y = "Densitas") +
theme_minimal()
Untuk AR(1), stasioneritas mensyaratkan:
\[ |\phi| < 1. \]
Hitung probabilitas posterior bahwa syarat ini terpenuhi:
mean(abs(phi_samp) < 1)
## [1] 1
Jika nilai ini \(\approx 1\), hampir semua massa posterior berada pada region stasioner.
Visual:
ggplot(phi_df, aes(x = phi)) +
geom_histogram(aes(y = ..density..), bins = 40, alpha = 0.7) +
geom_vline(xintercept = c(-1, 1), linetype = "dotted") +
labs(title = "Posterior phi dan Batas Stasioneritas (|phi| < 1)",
x = expression(phi), y = "Densitas") +
theme_minimal()
Gunakan estimator plug-in:
\[ \hat{\phi} = m_n, \]
dan residual:
\[ \hat{\varepsilon}_t = w_t - \hat{\phi} w_{t-1}. \]
phi_hat <- as.numeric(mn)
eps_hat <- w_vec[2:n_w] - phi_hat * w_vec[1:(n_w-1)]
ts_eps <- ts(eps_hat)
autoplot(ts_eps) +
labs(title = "Residual Plug-in (AR(1) pada Δy_t)",
x = "Waktu", y = expression(hat(epsilon)[t]))
ACF dan uji Ljung–Box:
ggAcf(ts_eps) + labs(title = "ACF Residual (AR(1) pada Δy_t)")
Box.test(ts_eps, lag = 10, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: ts_eps
## X-squared = 7.2258, df = 10, p-value = 0.704
Normalitas residual (sekadar cek visual):
qqnorm(eps_hat)
qqline(eps_hat, col = "red", lwd = 2)
Pertanyaan tipikal: apakah \(\phi\) berbeda dari nol, yaitu apakah ada dinamika AR pada deret terdiferensi?
Credible interval 95%:
quantile(phi_samp, probs = c(0.025, 0.5, 0.975))
## 2.5% 50% 97.5%
## 0.5111274 0.6022018 0.6937221
Jika interval ini tidak mencakup 0, secara Bayesian kita punya bukti kuat bahwa \(\phi \neq 0\).
mean(phi_samp > 0)
## [1] 1
Jika nilai ini mendekati 1, interpretasinya:
Dengan mempertimbangkan prior dan data, probabilitas posterior bahwa \(\phi > 0\) adalah sekitar 0.999 (misalnya), yang menunjukkan autokorelasi positif yang sangat kuat pada deret terdiferensi.
Karena:
\[ w_t = y_t - y_{t-1}, \quad w_t = \phi w_{t-1} + \varepsilon_t, \]
maka ramalan \(y_{T+h}\) diperoleh dengan meramalkan \(w\) secara rekursif lalu mengakumulasikannya: \(y_{T+h} = y_T + \sum_{k=1}^h \hat w_{T+k}\), seperti pada sketsa berikut.
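Sketsa plug-in (memakai phi_hat, w_vec, n_w, dan ts_y_arima dari bagian ini; murni ilustratif, mengabaikan ketidakpastian parameter):
## Ramalan h langkah ARIMA(1,1,0): ramal w lalu akumulasikan ke level y
h <- 10
w_fc <- numeric(h)
w_last <- w_vec[n_w]
for (k in 1:h) { w_fc[k] <- phi_hat * w_last; w_last <- w_fc[k] }
y_fc <- as.numeric(tail(ts_y_arima, 1)) + cumsum(w_fc) # y_{T+h} = y_T + sum w
y_fc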
Untuk ARIMA(\(p,d,q\)) dengan \(q > 0\): bentuk tertutup tidak lagi tersedia.
Estimasi Bayesian untuk ARIMA(\(p,d,q\)) umumnya membutuhkan MCMC (Gibbs, Metropolis–Hastings, HMC) atau aproksimasi numerik lain.
Dalam konteks bab pengantar ini, memahami kasus analitik ARIMA(1,1,0) sudah cukup sebagai jembatan ke model yang lebih umum.
Ringkasan
Di bagian ini, kita: menulis ARIMA(1,1,0) sebagai AR(1) pada deret terdiferensi, menurunkan posterior Normal–Inverse-Gamma, mengambil sampel posterior \(\phi\), memeriksa stasioneritas dan residual, serta membahas peramalan.
Bagian ini melengkapi bab AR(\(p\)) sehingga pembaca punya gambaran jelas: mulai dari AR stasioner, lalu berpindah ke ARIMA non-stasioner sederhana (1,1,0) yang masih dapat ditangani dengan alat Bayesian analitik, sebelum beranjak ke model-model yang memerlukan MCMC atau INLA.
library(ggplot2)
set.seed(123)
n <- 200
phi1_true <- 0.5
theta1_true <- -0.4
d_true <- 1
# Simulasi ARIMA(1,1,1)
y_arima <- arima.sim(
model = list(order = c(1, d_true, 1),
ar = phi1_true,
ma = theta1_true),
n = n
)
# Pangkas agar panjangnya = n
y_arima <- y_arima[1:n]
df_arima <- data.frame(
t = 1:n,
y = as.numeric(y_arima)
)
ggplot(df_arima, aes(t, y)) +
geom_line() +
labs(
title = "Simulasi ARIMA(1,1,1)",
x = "Waktu",
y = "y_t"
) +
theme_minimal()
brms (pendekatan AR)
Untuk kesederhanaan, kita dekati ARIMA(1,1,1) dengan AR(p) pada data yang sudah di-differencing.
library(brms)
# Differencing orde 1 (sudah dibuat di chunk sebelumnya)
dy <- diff(y_arima, differences = 1)
df_dy <- data.frame(
t = 1:length(dy), # pastikan berurutan 1,2,3,...,T
dy = as.numeric(dy)
)
# Prior untuk model AR(4)
priors_ar4 <- c(
prior(normal(0, 5), class = "Intercept"),
prior(normal(0, 1), class = "ar"), # prior semua koef AR
prior(exponential(1), class = "sigma") # prior sd error
)
fit_brms_ar4 <- brm(
dy ~ 1 + ar(time = t, p = 4), # AR(4) lewat helper ar()
data = df_dy,
family = gaussian(),
prior = priors_ar4,
chains = 2, iter = 2000, seed = 123
)
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.000108 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.08 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.276 seconds (Warm-up)
## Chain 1: 0.273 seconds (Sampling)
## Chain 1: 0.549 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 4e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.4 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.267 seconds (Warm-up)
## Chain 2: 0.289 seconds (Sampling)
## Chain 2: 0.556 seconds (Total)
## Chain 2:
summary(fit_brms_ar4)
## Family: gaussian
## Links: mu = identity
## Formula: dy ~ 1 + ar(time = t, p = 4)
## Data: df_dy (Number of observations: 199)
## Draws: 2 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 2000
##
## Correlation Structures:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## ar[1] 0.04 0.07 -0.10 0.18 1.00 2451 1694
## ar[2] -0.04 0.07 -0.18 0.10 1.00 3282 1541
## ar[3] 0.12 0.07 -0.02 0.26 1.00 2921 1410
## ar[4] -0.11 0.08 -0.27 0.04 1.00 2353 1307
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.00 0.07 -0.14 0.14 1.00 2809 1560
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.95 0.05 0.85 1.05 1.00 2758 1578
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Helper ar() pada brms di atas mengestimasi model AR dengan prior yang kita tetapkan:
\[ \phi_j \sim \mathrm{N}(0, 1), \qquad \sigma \sim \mathrm{Exponential}(1), \]
lalu menggunakan HMC (Hamiltonian Monte Carlo) untuk sampling.
Interpretasi output: deret \(\Delta y_t\) sebenarnya ARMA(1,1) dengan \(\phi = 0.5\) dan \(\theta = -0.4\) yang hampir saling meniadakan, sehingga wajar semua koefisien AR(4) kecil dan credible interval-nya memuat nol; Rhat = 1.00 dan ESS yang besar menandakan sampling konvergen.
Plot posterior predictive:
yrep <- posterior_predict(fit_brms_ar4, ndraws = 20)
matplot(
t(yrep), type = "l", lty = 1, col = rgb(0,0,1,0.2),
xlab = "t", ylab = "Simulated dy"
)
lines(df_dy$dy, col = "red", lwd = 2)
Plot ini membandingkan beberapa jalur simulasi dari posterior predictive dengan deret data terdiferensi dy. Kesesuaian pola menunjukkan model cukup baik.
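Sebagai alternatif yang lebih ringkas, brms menyediakan pp_check() (berbasis bayesplot) untuk posterior predictive check; berikut sketsa overlay densitas:
# Overlay densitas data vs 50 draw posterior predictive
pp_check(fit_brms_ar4, ndraws = 50)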
knitr::opts_chunk$set(
message = FALSE,
warning = FALSE,
fig.width = 7,
fig.height = 4.5
)
library(MASS)
library(ggplot2)
library(forecast)
set.seed(123)
Kita mulai dari model VAR(1) standar untuk vektor deret waktu \(\mathbf{y}_t \in \mathbb{R}^K\):
\[ \mathbf{y}_t = \mathbf{A}_0 + \mathbf{A}_1 \, \mathbf{y}_{t-1} + \boldsymbol{\varepsilon}_t, \qquad \boldsymbol{\varepsilon}_t \sim \mathrm N_K(\mathbf{0}, \boldsymbol{\Sigma}). \]
dengan \(\mathbf{A}_0 \in \mathbb{R}^K\) intersep, \(\mathbf{A}_1 \in \mathbb{R}^{K \times K}\) matriks koefisien lag, dan \(\boldsymbol{\Sigma}\) kovarian error.
Kita bisa menulis ulang dalam bentuk regresi multivariat dengan \(\mathbf{x}_t = (1, \mathbf{y}_{t-1}^\top)^\top\) dan \(\mathbf{B} = (\mathbf{A}_0, \mathbf{A}_1)^\top \in \mathbb{R}^{(1+K)\times K}\), sehingga:
\[ \mathbf{y}_t = \mathbf{B}^\top \mathbf{x}_t + \boldsymbol{\varepsilon}_t. \]
Untuk \(t = 2, \dots, T\), jika kita tumpuk:
\[ \mathbf{Y} = \begin{pmatrix} \mathbf{y}_2^\top \\ \mathbf{y}_3^\top \\ \vdots \\ \mathbf{y}_T^\top \end{pmatrix} \in \mathbb{R}^{(T-1) \times K}, \quad \mathbf{X} = \begin{pmatrix} \mathbf{x}_2^\top \\ \mathbf{x}_3^\top \\ \vdots \\ \mathbf{x}_T^\top \end{pmatrix} \in \mathbb{R}^{(T-1) \times (1+K)}, \]
maka:
\[ \mathbf{Y} = \mathbf{X}\mathbf{B} + \mathbf{E}, \]
dengan \(\mathbf{E}\) berisi baris-baris \(\boldsymbol{\varepsilon}_t^\top\).
Dokumen ini berisi catatan terstruktur tentang:
dalam konteks model VAR (Vector Autoregression) dan khususnya bagaimana keduanya diinterpretasikan, termasuk dalam kerangka Bayesian VAR (BVAR).
Impulse Response Function (IRF) adalah alat analisis dalam VAR yang digunakan untuk melihat:
bagaimana suatu shock (guncangan / perubahan tak terduga) pada satu variabel akan mempengaruhi variabel lain (termasuk dirinya sendiri) seiring waktu.
Bayangkan Anda menendang sebuah bola billiard di meja penuh bola lain:
IRF menjelaskan perjalanan efek tendangan itu, dari detik pertama sampai sistem kembali tenang (atau stabil).
Misalkan kita memiliki model VAR(p) dengan \(K\) variabel:
\[ Y_t = A_1 Y_{t-1} + A_2 Y_{t-2} + \cdots + A_p Y_{t-p} + u_t, \]
di mana \(Y_t \in \mathbb{R}^K\) vektor variabel, \(A_\ell\) matriks koefisien \(K \times K\), dan \(u_t\) vektor shock dengan kovarian \(\Sigma_u\).
IRF mempelajari seberapa besar respon \(Y_{t+h}\) terhadap shock pada variabel ke-\(i\) di waktu \(t\). Secara matematis, IRF dapat ditulis sebagai:
\[ \frac{\partial Y_{t+h}}{\partial u_{i,t}}, \]
yang dapat dibaca sebagai:
Perubahan pada vektor variabel \(Y_{t+h}\) akibat shock 1-unit pada error/shock variabel ke-\(i\) pada waktu \(t\).
Sering kita tertarik pada komponen ke-\(j\) dari \(Y_{t+h}\), sehingga:
\[ \frac{\partial Y_{j, t+h}}{\partial u_{i,t}} \]
menggambarkan:
“Seberapa besar perubahan pada variabel \(j\) pada horizon waktu \(h\) akibat 1-unit shock pada variabel \(i\) pada waktu \(t\).”
Dalam pendekatan Bayesian VAR (BVAR), parameter-parameter VAR diperlakukan sebagai acak dengan distribusi posterior. Konsekuensinya:
IRF bukan hanya satu garis tunggal, tetapi satu distribusi posterior.
Dari distribusi tersebut, kita biasa merangkum median (atau mean) posterior dan credible interval, misalnya 68% dan 95%. Sehingga IRF Bayesian sering digambarkan sebagai garis tengah yang diselimuti pita ketidakpastian (credible band); secara grafis, kita mendapatkan satu kurva respon dengan shading band untuk tiap pasangan shock dan respon.
Forecast Error Variance Decomposition (FEVD) menjawab pertanyaan:
Dari variabilitas error prediksi suatu variabel, berapa persen disebabkan oleh shock masing-masing variabel dalam sistem VAR?
Bahasa santainya:
“Siapa yang paling bertanggung jawab ketika prediksi meleset? Variabel mana biang keroknya?”
FEVD memecah varian kesalahan prediksi (forecast error variance) menjadi kontribusi shock dari setiap variabel dalam sistem.
Misalkan kita punya VAR dengan tiga variabel \(Y_t\), \(X_t\), dan \(Z_t\), dengan vektor:
\[ (Y_t, X_t, Z_t). \]
Kita ingin tahu, untuk horizon \(H\): berapa persen varians error ramalan \(Y\) yang berasal dari shock \(Y\) sendiri, shock \(X\), dan shock \(Z\).
Output FEVD biasanya berupa tabel seperti:
| Horizon (h) | Shock Y | Shock X | Shock Z |
|---|---|---|---|
| h = 1 | 90% | 8% | 2% |
| h = 5 | 65% | 22% | 13% |
| h = 10 | 55% | 30% | 15% |
Interpretasi: pada \(h = 1\), 90% varians error ramalan \(Y\) berasal dari shock-nya sendiri, sedangkan pada \(h = 10\) porsi itu turun ke 55% karena shock \(X\) dan \(Z\) makin berperan. Seiring horizon meningkat, pengaruh antar variabel cenderung makin kuat, sehingga FEVD sangat membantu untuk memetakan sumber ketidakpastian ramalan dan keterkaitan antar variabel.
Tabel ringkas:
| Aspek | IRF | FEVD |
|---|---|---|
| Fokus | Melihat bagaimana shock pada variabel mempengaruhi variabel lain seiring waktu | Mengukur seberapa besar kontribusi shock setiap variabel terhadap varians kesalahan prediksi |
| Orientasi | Dampak shock secara dinamis di waktu ke depan | Proporsi varians error prediksi yang dijelaskan oleh masing-masing shock |
| Analogi | Seperti melihat bentuk gelombang akibat tendangan bola billiard | Seperti melihat siapa yang paling menyumbang kesalahan posisi bola akhir |
Keduanya adalah alat utama untuk analisis dinamika hubungan antarvariabel dalam VAR, terutama di analisis makroekonomi, sektor eksternal, dan pasar keuangan. Sebagai ilustrasi, dalam VAR tiga variabel (misal nilai tukar, suku bunga, arus modal), IRF menunjukkan bentuk dan lamanya respon tiap variabel terhadap suatu shock, sedangkan FEVD menunjukkan porsi varians error ramalan yang dapat diatribusikan ke masing-masing shock. Dalam kerangka Bayesian VAR, IRF dan FEVD dihitung ulang untuk setiap draw posterior, sehingga ketidakpastian parameter otomatis terbawa ke band IRF dan ke sebaran share FEVD; inilah keunggulan utama pendekatan Bayesian dalam konteks ini.
Kesimpulan satu kalimat:
IRF menggambarkan bentuk respon dinamis terhadap shock, sedangkan FEVD mengukur berapa besar kontribusi shock setiap variabel terhadap varians kesalahan prediksi.
Analogi santai: IRF melihat bentuk gelombang akibat tendangan bola biliar; FEVD melihat siapa yang paling menyumbang melesetnya posisi bola di akhir.
Untuk regresi multivariat dengan error kovarian \(\boldsymbol{\Sigma}\), prior yang umum dipakai adalah:
Matrix Normal untuk koefisien: \[ \mathbf{B} \mid \boldsymbol{\Sigma} \sim \mathrm{MN}(\mathbf{B}_0, \mathbf{V}_0, \boldsymbol{\Sigma}), \] ekuivalen dengan \[ \text{vec}(\mathbf{B}) \mid \boldsymbol{\Sigma} \sim \mathrm N\big(\text{vec}(\mathbf{B}_0),\; \boldsymbol{\Sigma} \otimes \mathbf{V}_0\big). \]
Inverse-Wishart untuk kovarian error: \[ \boldsymbol{\Sigma} \sim \mathrm{IW}(\nu_0, \mathbf{S}_0). \]
Ini disebut prior Normal–Inverse-Wishart (NIW).
Interpretasi hyperparameter: \(\mathbf{B}_0\) adalah tebakan awal koefisien, \(\mathbf{V}_0\) mengatur keyakinan pada tebakan itu (diagonal besar = prior lemah), sementara \(\nu_0\) dan \(\mathbf{S}_0\) adalah derajat kebebasan dan matriks skala prior kovarian error. Untuk prior lemah, kita bisa ambil \(\mathbf{B}_0 = \mathbf{0}\), \(\mathbf{V}_0 = 10\,\mathbf{I}\), \(\nu_0 = K + 2\), dan \(\mathbf{S}_0 = \mathbf{I}_K\), persis seperti pada kode di bawah.
Dengan likelihood Gaussian multivariat dan prior NIW, posterior juga NIW.
Posterior:
\[ \mathbf{B} \mid \boldsymbol{\Sigma}, \mathbf{Y}, \mathbf{X} \sim \mathrm{MN}(\mathbf{B}_n, \mathbf{V}_n, \boldsymbol{\Sigma}), \]
dengan:
\[ \mathbf{V}_n^{-1} = \mathbf{V}_0^{-1} + \mathbf{X}^\top \mathbf{X}, \]
\[ \mathbf{B}_n = \mathbf{V}_n \big( \mathbf{V}_0^{-1} \mathbf{B}_0 + \mathbf{X}^\top \mathbf{Y} \big). \]
Posterior kovarian error:
\[ \boldsymbol{\Sigma} \mid \mathbf{Y}, \mathbf{X} \sim \mathrm{IW}(\nu_n, \mathbf{S}_n), \]
dengan:
\[ \nu_n = \nu_0 + (T-1), \]
\[ \mathbf{S}_n = \mathbf{S}_0 + (\mathbf{Y} - \mathbf{X}\mathbf{B}_n)^\top (\mathbf{Y} - \mathbf{X}\mathbf{B}_n) + (\mathbf{B}_n - \mathbf{B}_0)^\top \mathbf{V}_0^{-1} (\mathbf{B}_n - \mathbf{B}_0). \]
Dari sini, posterior dapat dieksplorasi langsung: kita bisa menghitung ringkasan analitik, atau menarik sampel \((\boldsymbol{\Sigma}, \mathbf{B})\) dengan menggambar \(\boldsymbol{\Sigma}\) dari Inverse-Wishart lalu \(\mathrm{vec}(\mathbf{B}) \mid \boldsymbol{\Sigma}\) dari Normal multivariat. Berikut kita terapkan pada data simulasi.
K <- 2
Tn <- 300
A0_true <- c(0, 0)
A1_true <- matrix(c(0.5, 0.2,
-0.1, 0.3),
nrow = 2, byrow = TRUE)
Sigma_true <- matrix(c(1, 0.3,
0.3, 0.5),
nrow = 2, byrow = TRUE)
Y <- matrix(0, nrow = Tn, ncol = K)
Y[1, ] <- c(0, 0)
for (t in 2:Tn) {
mean_t <- A0_true + A1_true %*% Y[t-1, ]
Y[t, ] <- as.numeric(mvrnorm(1, mu = mean_t, Sigma = Sigma_true))
}
ts_y1 <- ts(Y[,1])
ts_y2 <- ts(Y[,2])
autoplot(cbind(y1 = ts_y1, y2 = ts_y2)) +
labs(title = "Simulasi VAR(1) dengan K = 2",
x = "Waktu", y = "Nilai")
Scatter antar variabel:
as.data.frame(Y) |>
ggplot(aes(x = V1, y = V2)) +
geom_point(alpha = 0.6) +
labs(x = "y1_t", y = "y2_t",
title = "Scatter Plot y1_t vs y2_t") +
theme_minimal()
Untuk \(t=2,\dots,T\):
Y_resp <- Y[2:Tn, ] # (T-1) x K
X_reg <- cbind(1, Y[1:(Tn-1), ]) # (T-1) x (1+K)
colnames(X_reg) <- c("const", "y1_lag1", "y2_lag1")
dim(Y_resp); dim(X_reg)
## [1] 299 2
## [1] 299 3
head(cbind(Y_resp, X_reg))
## const y1_lag1 y2_lag1
## [1,] 0.483511548 0.3788606 1 0.000000000 0.0000000
## [2,] -1.172002413 -0.6790338 1 0.483511548 0.3788606
## [3,] -0.410683361 -1.0762974 1 -1.172002413 -0.6790338
## [4,] -1.188083759 0.1962814 1 -0.410683361 -1.0762974
## [5,] -0.003852139 0.7308002 1 -1.188083759 0.1962814
## [6,] -0.948097758 -0.5302211 1 -0.003852139 0.7308002
K <- 2
p_param <- 1 + K
B0 <- matrix(0, nrow = p_param, ncol = K)
V0 <- diag(10, p_param) # varians prior besar -> prior lemah
nu0 <- K + 2 # sedikit di atas K-1
S0 <- diag(1, K)
X <- X_reg
Ymat <- Y_resp
T_eff <- nrow(Ymat)
V0_inv <- solve(V0)
Vn_inv <- V0_inv + crossprod(X)
Vn <- solve(Vn_inv)
Bn <- Vn %*% (V0_inv %*% B0 + crossprod(X, Ymat))
nun <- nu0 + T_eff
resid_mat <- Ymat - X %*% Bn
Sn <- S0 +
t(resid_mat) %*% resid_mat +
t(Bn - B0) %*% V0_inv %*% (Bn - B0)
Vn
## const y1_lag1 y2_lag1
## const 0.0034654735 0.000570610 -0.0003730325
## y1_lag1 0.0005706100 0.002683276 -0.0013899728
## y2_lag1 -0.0003730325 -0.001389973 0.0085355445
Bn
## [,1] [,2]
## const -0.1161148 -0.009426235
## y1_lag1 0.4627350 -0.102332502
## y2_lag1 0.2816822 0.229211026
nun
## [1] 303
Sn
## [,1] [,2]
## [1,] 293.40229 71.75659
## [2,] 71.75659 122.50232
Ekstrak mean posterior:
A0_post_mean <- Bn[1, ] # panjang K
A1_post_mean <- t(Bn[-1, , drop = FALSE]) # K x K
A0_true
## [1] 0 0
A0_post_mean
## [1] -0.116114799 -0.009426235
A1_true
## [,1] [,2]
## [1,] 0.5 0.2
## [2,] -0.1 0.3
A1_post_mean
## y1_lag1 y2_lag1
## [1,] 0.4627350 0.2816822
## [2,] -0.1023325 0.2292110
rinvwishart <- function(df, S) {
W <- rWishart(1, df, solve(S))[,,1]
solve(W)
}
n_samp <- 4000
B_samp <- array(NA_real_, dim = c(p_param, K, n_samp))
Sigma_samp <- vector("list", n_samp)
for (i in 1:n_samp) {
Sigma_i <- rinvwishart(nun, Sn)
Sigma_samp[[i]] <- Sigma_i
# vec(B) | Sigma ~ N(vec(Bn), Sigma ⊗ Vn)
Sigma_kronecker <- kronecker(Sigma_i, Vn)
mean_vec <- as.vector(Bn)
b_vec <- mvrnorm(1, mu = mean_vec, Sigma = Sigma_kronecker)
B_samp[,,i] <- matrix(b_vec, nrow = p_param, ncol = K)
}
dim(B_samp)
## [1] 3 2 4000
Koefisien:
a11_samp <- B_samp[2, 1, ]
a12_samp <- B_samp[3, 1, ]
summary(a11_samp)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.2756 0.4273 0.4618 0.4615 0.4953 0.6662
quantile(a11_samp, probs = c(0.025, 0.5, 0.975))
## 2.5% 50% 97.5%
## 0.3590379 0.4617734 0.5610691
summary(a12_samp)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.06481 0.22204 0.28674 0.28563 0.34913 0.62561
quantile(a12_samp, probs = c(0.025, 0.5, 0.975))
## 2.5% 50% 97.5%
## 0.1037868 0.2867358 0.4652967
Plot posterior \(a_{11}\):
a11_df <- data.frame(a11 = a11_samp)
ggplot(a11_df, aes(x = a11)) +
geom_histogram(aes(y = ..density..), bins = 40, alpha = 0.7) +
geom_vline(xintercept = A1_true[1,1], color = "red", linetype = "dashed", size = 1) +
labs(title = "Posterior a[11] (VAR(1))",
x = expression(a[11]), y = "Densitas") +
theme_minimal()
VAR(1) stabil jika semua eigenvalue \(\mathbf{A}_1\) berada di dalam lingkaran satuan:
\[ \max_j |\lambda_j(\mathbf{A}_1)| < 1. \]
Kita aproksimasi \(P(\text{stabil} \mid \text{data})\) dengan sampel posterior:
is_stable <- logical(n_samp)
radius_max <- numeric(n_samp)
for (i in 1:n_samp) {
B_i <- B_samp[,,i]
A1_i <- t(B_i[-1, , drop = FALSE]) # K x K
eigs <- eigen(A1_i, only.values = TRUE)$values
radius_max[i] <- max(Mod(eigs))
is_stable[i] <- all(Mod(eigs) < 1)
}
mean(is_stable) # P(stable | data)
## [1] 1
Distribusi radius spektral:
data.frame(radius_max = radius_max) |>
ggplot(aes(x = radius_max)) +
geom_histogram(bins = 40, alpha = 0.7) +
geom_vline(xintercept = 1, linetype = "dotted") +
labs(title = "Posterior max |eigen(A1)|",
x = "Radius spektral", y = "Frekuensi") +
theme_minimal()
Gunakan \(\mathbf{B}_n\):
Y_hat <- X_reg %*% Bn
resid_hat <- Y_resp - Y_hat
ts_e1 <- ts(resid_hat[,1])
ts_e2 <- ts(resid_hat[,2])
autoplot(cbind(e1 = ts_e1, e2 = ts_e2)) +
labs(title = "Residual (plug-in, mean posterior B)",
x = "Waktu", y = "Residual")
ACF residual:
ggAcf(ts_e1) + labs(title = "ACF Residual Seri 1")
ggAcf(ts_e2) + labs(title = "ACF Residual Seri 2")
Tes Ljung–Box:
Box.test(ts_e1, lag = 10, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: ts_e1
## X-squared = 11.518, df = 10, p-value = 0.3186
Box.test(ts_e2, lag = 10, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: ts_e2
## X-squared = 16.094, df = 10, p-value = 0.09696
QQ-plot:
par(mfrow = c(1,2))
qqnorm(resid_hat[,1], main = "QQ-plot Residual Seri 1"); qqline(resid_hat[,1], col = "red")
qqnorm(resid_hat[,2], main = "QQ-plot Residual Seri 2"); qqline(resid_hat[,2], col = "red")
par(mfrow = c(1,1))
Y1_obs <- ts(Y_resp[,1])
Y2_obs <- ts(Y_resp[,2])
Y1_fit <- ts(Y_hat[,1], start = start(Y1_obs), frequency = frequency(Y1_obs))
Y2_fit <- ts(Y_hat[,2], start = start(Y2_obs), frequency = frequency(Y2_obs))
autoplot(cbind(y1 = Y1_obs, y1_hat = Y1_fit)) +
labs(title = "Seri 1: Observed vs Fitted (Mean Posterior)",
x = "Waktu", y = "Nilai")
autoplot(cbind(y2 = Y2_obs, y2_hat = Y2_fit)) +
labs(title = "Seri 2: Observed vs Fitted (Mean Posterior)",
x = "Waktu", y = "Nilai")
Untuk IRF, kita butuh representasi VAR(1) dalam bentuk structural:
\[ \mathbf{y}_t = \mathbf{A}_0 + \mathbf{A}_1 \mathbf{y}_{t-1} + \boldsymbol{\varepsilon}_t, \quad \boldsymbol{\varepsilon}_t \sim \mathrm N(0, \boldsymbol{\Sigma}). \]
Kita definisikan shock orthogonal \(\mathbf{u}_t\) lewat \(\boldsymbol{\varepsilon}_t = \mathbf{P}\mathbf{u}_t\) dengan \(\mathbf{P}\mathbf{P}^\top = \boldsymbol{\Sigma}\) (faktor Cholesky). IRF pada horizon \(h\) untuk shock awal \(\mathbf{u}_0\) adalah \(\mathbf{A}_1^h \mathbf{P}\mathbf{u}_0\).
Kita gunakan mean posterior \(\mathbf{A}_1\) dan \(\boldsymbol{\Sigma}\).
A1_hat <- A1_post_mean
Sigma_hat <- Sn / (nun - K - 1) # mean posterior Sigma (IW)
P_chol <- chol(Sigma_hat) # upper triangular: t(P_chol) %*% P_chol = Sigma
P <- t(P_chol) # ambil lower triangular
H <- 20 # horizon
irf_y1_on_u1 <- numeric(H+1)
irf_y2_on_u1 <- numeric(H+1)
u1 <- c(1, 0)
A1_power <- diag(1, K)
for (h in 0:H) {
irf_vec <- A1_power %*% P %*% u1
irf_y1_on_u1[h+1] <- irf_vec[1]
irf_y2_on_u1[h+1] <- irf_vec[2]
A1_power <- A1_power %*% A1_hat
}
irf_df <- data.frame(
h = 0:H,
y1 = irf_y1_on_u1,
y2 = irf_y2_on_u1
)
ggplot(irf_df, aes(x = h)) +
geom_line(aes(y = y1, colour = "y1")) +
geom_line(aes(y = y2, colour = "y2")) +
geom_hline(yintercept = 0, linetype = "dashed") +
labs(title = "IRF (Posterior Mean): Shock Struktural pada Variabel 1",
x = "Horizon (h)", y = "Respon") +
scale_colour_manual(values = c("y1" = "blue", "y2" = "red"),
name = "Seri") +
theme_minimal()
Catatan: identifikasi shock di sini memakai dekomposisi Cholesky (\(\mathbf{P}\) lower triangular), sehingga hasil IRF bergantung pada urutan variabel dalam sistem.
Untuk horizon \(h\), varians error ramalan (forecast error variance) dapat ditulis (VAR(1)) sebagai:
\[ \mathbb{V}\big( \mathbf{y}_{t+h} - \hat{\mathbf{y}}_{t+h|t} \big) = \sum_{j=0}^{h-1} \mathbf{A}_1^j \boldsymbol{\Sigma} (\mathbf{A}_1^j)^\top. \]
Jika kita pakai faktor orthogonal \(\mathbf{P}\):
\[ \boldsymbol{\Sigma} = \mathbf{P} \mathbf{P}^\top, \]
maka kontribusi shock ke-\(k\) terhadap varians komponen ke-\(i\) dapat dihitung dengan melihat komponen ke-\(i\) dari \(\mathbf{A}_1^j \mathbf{P} \mathbf{e}_k\) untuk \(j = 0,\dots,h-1\).
Untuk ringkasnya, berikut contoh FEVD berbasis posterior mean untuk kedua seri sampai horizon \(H\):
fevd_share <- function(A1, Sigma, H = 10) {
P_chol <- chol(Sigma)
P <- t(P_chol)
K <- nrow(A1)
# shock struktural basis
E <- diag(1, K)
# simpan kontribusi impulse masing-masing shock
cont <- array(0, dim = c(K, K, H + 1)) # [seri, shock, horizon]
A1_power <- diag(1, K)
for (h in 0:H) {
for (k in 1:K) {
imp <- A1_power %*% P %*% E[, k]
cont[, k, h + 1] <- imp
}
A1_power <- A1_power %*% A1
}
# varians forecast error tiap seri & shock
var_total <- numeric(K)
var_shock <- matrix(0, nrow = K, ncol = K)
for (h in 0:H) {
slice2 <- cont[, , h + 1]^2
var_total <- var_total + rowSums(slice2) # total varians per seri (sum over shocks)
var_shock <- var_shock + slice2 # akumulasi varians per seri×shock
}
# bagi per baris: setiap seri dibagi total variansnya
share <- sweep(var_shock, 1, var_total, "/")
list(
share = share, # [seri, shock]
H = H,
var_total = var_total,
var_shock = var_shock
)
}
fevd_res <- fevd_share(A1_hat, Sigma_hat, H = 10)
fevd_res$share
## [,1] [,2]
## [1,] 0.9669429 0.03305713
## [2,] 0.1529827 0.84701732
Interpretasi: share[1,1] adalah proporsi varians error ramalan seri 1 (hingga horizon H) yang disebabkan shock struktural seri 1. Untuk presentasi lebih rapi, hasilnya bisa dijadikan tabel, misalnya dengan sketsa berikut.
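Sketsa tabel ringkas memakai knitr::kable() (nama objek mengikuti fevd_res di atas):
share_pct <- round(100 * fevd_res$share, 1)
share_tab <- data.frame(
  Seri = c("y1", "y2"),
  `Shock y1 (%)` = share_pct[, 1],
  `Shock y2 (%)` = share_pct[, 2],
  check.names = FALSE
)
knitr::kable(share_tab, caption = "FEVD hingga horizon H = 10")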
Kita mulai dari simulasi yang sudah ada: VAR(1) dengan \(K = 2\) dan \(T = 300\).
K <- 2
Tn <- 300
A0_true <- c(0, 0)
A1_true <- matrix(c(0.5, 0.2,
-0.1, 0.3),
nrow = 2, byrow = TRUE)
Sigma_true <- matrix(c(1, 0.3,
0.3, 0.5),
nrow = 2, byrow = TRUE)
Y <- matrix(0, nrow = Tn, ncol = K)
Y[1, ] <- c(0, 0)
for (t in 2:Tn) {
mean_t <- A0_true + A1_true %*% Y[t-1, ]
Y[t, ] <- as.numeric(mvrnorm(1, mu = mean_t, Sigma = Sigma_true))
}
ts_y1 <- ts(Y[,1])
ts_y2 <- ts(Y[,2])
autoplot(cbind(y1 = ts_y1, y2 = ts_y2)) +
labs(title = "Simulasi VAR(1) dengan K = 2",
x = "Waktu", y = "Nilai")
Scatter antar variabel:
as.data.frame(Y) |>
ggplot(aes(x = V1, y = V2)) +
geom_point(alpha = 0.6) +
labs(x = "y1_t", y = "y2_t",
title = "Scatter Plot y1_t vs y2_t") +
theme_minimal()
vars
Sebagai pembanding, kita fit dulu model VAR(1) dengan pendekatan klasik menggunakan paket vars.
library(vars)
data_var <- ts(Y)
colnames(data_var) <- c("y1", "y2")
# Fit VAR klasik
fit_var <- VAR(data_var, p = 1, type = "const")
# Hitung IRF untuk VAR klasik
irf_var <- vars::irf(
fit_var,
impulse = "y1",
response = "y1",
n.ahead = 20,
boot = TRUE,
runs = 500
)
# EKSTRAK DATA IRF UNTUK ANIMASI
# Dari objek irf_var, kita bisa ekstrak data IRF-nya
irf_values <- irf_var$irf$y1[, "y1"] # nilai IRF
irf_lower <- irf_var$Lower$y1[, "y1"] # lower bound
irf_upper <- irf_var$Upper$y1[, "y1"] # upper bound
# Buat data frame untuk animasi
df_irf <- data.frame(
h = 0:20,
mean = irf_values,
lower = irf_lower,
upper = irf_upper
)
# Tampilkan data frame
print(head(df_irf))
## h mean lower upper
## 1 0 0.97449619 0.88964030 1.05713507
## 2 1 0.56294579 0.43309683 0.65939511
## 3 2 0.28991489 0.15763172 0.39323272
## 4 3 0.13644205 0.04099221 0.22140369
## 5 4 0.05895385 -0.00379181 0.12736367
## 6 5 0.02311940 -0.01305000 0.07212054
Koefisien \(A_1\) dari vars:
coef_var <- Bcoef(fit_var) # matriks koefisien
coef_var
## y1.l1 y2.l1 const
## y1 0.5163900 0.2172515 0.005606684
## y2 -0.1061494 0.3631314 -0.023795499
A1_true
## [,1] [,2]
## [1,] 0.5 0.2
## [2,] -0.1 0.3
BVAR: Model VAR dan Prior Minnesota
Secara umum, model VAR(p) dapat ditulis:
\[ \mathbf{y}_t = \mathbf{c} + \sum_{\ell=1}^p A_{\ell} \mathbf{y}_{t-\ell} + \boldsymbol{\varepsilon}_t, \qquad \boldsymbol{\varepsilon}_t \sim \mathrm{N}(\mathbf{0}, \Sigma), \]
dengan \(\mathbf{c}\) intersep, \(A_\ell\) matriks koefisien lag ke-\(\ell\), dan \(\Sigma\) kovarian error.
Dalam Bayesian VAR, kita memberikan prior pada parameter \(A_{\ell}\) (dan kadang \(\Sigma\)). Minnesota prior (versi klasik) biasanya berbentuk shrinkage: mean prior koefisien lag pertama variabel sendiri diset 1 (hipotesis random walk) atau 0 untuk data yang sudah stasioner, sedangkan koefisien lintas-variabel dan lag yang lebih jauh di-shrink makin ketat ke nol.
Dalam notasi matriks, prior Minnesota sering ditulis:
\[ \mathrm{vec}(A) \mid \Sigma \sim \mathrm{N}(a_0, \Sigma \otimes \Omega_0), \]
dengan \(a_0\) vektor mean prior dan \(\Omega_0\) matriks diagonal berisi varian prior tiap koefisien. Sketsa konstruksinya diberikan di bawah.
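Sebagai ilustrasi (sketsa dengan notasi sendiri, bukan API paket BVAR; fungsi minnesota_omega di bawah hipotetis), varian prior ala Litterman dapat dibentuk per persamaan; faktor \(s_i^2/s_j^2\) membuat bentuk Kronecker murni hanya aproksimasi:
# Varian prior koefisien lag-l variabel j pada persamaan i:
#   lag sendiri (j == i) : (lambda / l)^2
#   lag silang  (j != i) : (lambda * theta / l)^2 * s2[i] / s2[j]
minnesota_omega <- function(K, p, lambda = 0.2, theta = 0.5,
                            s2 = rep(1, K), var_const = 100) {
  omega <- matrix(NA_real_, nrow = 1 + K * p, ncol = K)
  omega[1, ] <- var_const                    # intersep: varian besar (difus)
  for (l in 1:p) for (j in 1:K) for (i in 1:K) {
    baris <- 1 + (l - 1) * K + j
    omega[baris, i] <- if (i == j) (lambda / l)^2
                       else (lambda * theta / l)^2 * s2[i] / s2[j]
  }
  omega                                      # baris = regressor, kolom = persamaan
}
minnesota_omega(K = 2, p = 1)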
Paket BVAR mengimplementasikan variasi Minnesota prior dan prior shrinkage lainnya dalam bentuk yang lebih praktis.
Fitting BVAR
Kita gunakan bvar() dengan lags = 1. BVAR secara default menerapkan prior shrinkage yang mirip Minnesota.
set.seed(1234)
data_bvar <- ts(Y)
fit_bvar <- bvar(
data_bvar,
lags = 1,
n_draw = 5000,
n_burn = 1000,
n_thin = 1
)
## Optimisation concluded.
## Posterior marginal likelihood: -724.227
## Hyperparameters: lambda = 0.48606
## Finished MCMC after 0.82 secs.
summary(fit_bvar)
## Bayesian VAR consisting of 299 observations, 2 variables and 1 lags.
## Time spent calculating: 0.82 secs
## Hyperparameters: lambda
## Hyperparameter values after optimisation: 0.48606
## Iterations (burnt / thinning): 5000 (1000 / 1)
## Accepted draws (rate): 3896 (0.974)
##
## Numeric array (dimensions 3, 2) of coefficient values from a BVAR.
## Median values:
## Series 1 Series 2
## constant 0.004 -0.024
## Series 1-lag1 0.520 -0.107
## Series 2-lag1 0.211 0.372
##
## Numeric array (dimensions 2, 2) of variance-covariance values from a BVAR.
## Median values:
## Series 1 Series 2
## Series 1 0.937 0.261
## Series 2 0.261 0.426
##
## Log-Likelihood: -682.9327
Ekstrak koefisien posterior (median). Struktur objek coef(fit_bvar) tergantung versi paket, tetapi umumnya berisi koefisien pada tiap lag.
coef_bvar <- coef(fit_bvar)
coef_bvar # lihat struktur
## Numeric array (dimensions 3, 2) of coefficient values from a BVAR.
## Median values:
## Series 1 Series 2
## constant 0.004 -0.024
## Series 1-lag1 0.520 -0.107
## Series 2-lag1 0.211 0.372
A1_true
## [,1] [,2]
## [1,] 0.5 0.2
## [2,] -0.1 0.3
Catatan: jika struktur coef_bvar adalah daftar, langkah berikutnya bisa dibuat spesifik (misalnya coef_bvar$A[,,1] untuk lag 1), menyesuaikan output versi BVAR yang terpasang.
IRF dari VAR Frequentist (vars)
colnames(data_var) <- c("y1", "y2")
# Fit VAR(1)
fit_var <- VAR(data_var, p = 1, type = "const")
summary(fit_var)
##
## VAR Estimation Results:
## =========================
## Endogenous variables: y1, y2
## Deterministic variables: const
## Sample size: 299
## Log Likelihood: -682.903
## Roots of the characteristic polynomial:
## 0.4589 0.4589
## Call:
## VAR(y = data_var, p = 1, type = "const")
##
##
## Estimation results for equation y1:
## ===================================
## y1 = y1.l1 + y2.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## y1.l1 0.516390 0.050744 10.176 <2e-16 ***
## y2.l1 0.217252 0.085815 2.532 0.0119 *
## const 0.005607 0.056441 0.099 0.9209
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.9745 on 296 degrees of freedom
## Multiple R-Squared: 0.3243, Adjusted R-squared: 0.3198
## F-statistic: 71.04 on 2 and 296 DF, p-value: < 2.2e-16
##
##
## Estimation results for equation y2:
## ===================================
## y2 = y1.l1 + y2.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## y1.l1 -0.10615 0.03416 -3.107 0.00207 **
## y2.l1 0.36313 0.05778 6.285 1.17e-09 ***
## const -0.02380 0.03800 -0.626 0.53166
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.6561 on 296 degrees of freedom
## Multiple R-Squared: 0.121, Adjusted R-squared: 0.115
## F-statistic: 20.37 on 2 and 296 DF, p-value: 5.156e-09
##
##
##
## Covariance matrix of residuals:
## y1 y2
## y1 0.9496 0.2679
## y2 0.2679 0.4304
##
## Correlation matrix of residuals:
## y1 y2
## y1 1.000 0.419
## y2 0.419 1.000
# IRF seperti pada contoh Canada
irf_var <- vars::irf(
fit_var,
impulse = "y1",
response = "y1",
n.ahead = 20,
boot = TRUE,
runs = 500
)
plot(irf_var)
IRF Bayesian dengan BVAR + Credible Interval
Paket BVAR menyediakan fungsi IRF (nama dan struktur output dapat berbeda sesuai versi; di bawah ini salah satu skema umum):
irf_bvar <- BVAR::irf(fit_bvar, horizon = 20)
plot(irf_bvar)
Misalkan objek irf_bvar berisi:
- irf_bvar$irf_mean: vektor mean IRF,
- irf_bvar$irf_lower: batas bawah credible interval,
- irf_bvar$irf_upper: batas atas credible interval,
maka kita dapat mem-plot IRF dengan band shading:
h <- 0:20
df_irf <- data.frame(
h = h,
mean = irf_bvar$irf_mean,
lower = irf_bvar$irf_lower,
upper = irf_bvar$irf_upper
)
ggplot(df_irf, aes(x = h, y = mean)) +
geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(size = 1) +
geom_hline(yintercept = 0, linetype = "dashed") +
labs(
title = "Bayesian IRF (posterior mean + credible bands)",
x = "Horizon (h)",
y = "Respon"
) +
theme_minimal()
Catatan penting: struktur persis objek dari irf() di BVAR bisa berbeda (misalnya berupa array [horizon, respon, impulse]). Kode di atas adalah template yang perlu disesuaikan dengan output aktual (str(irf_bvar)).
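Alternatif yang tidak bergantung pada struktur objek BVAR: hitung band IRF langsung dari sampel NIW (B_samp, Sigma_samp) yang sudah kita punya di bagian sebelumnya; hasilnya bisa diplot dengan pola ggplot yang sama di atas (ganti df_irf menjadi df_irf_niw):
H <- 20
irf_draws <- matrix(NA_real_, nrow = n_samp, ncol = H + 1)
for (i in 1:n_samp) {
  A1_i <- t(B_samp[-1, , i])           # K x K dari draw ke-i
  P_i  <- t(chol(Sigma_samp[[i]]))     # faktor struktural (Cholesky lower)
  Apow <- diag(1, K)
  for (h in 0:H) {
    irf_draws[i, h + 1] <- (Apow %*% P_i %*% c(1, 0))[1]  # respon y1 atas shock y1
    Apow <- Apow %*% A1_i
  }
}
df_irf_niw <- data.frame(
  h     = 0:H,
  mean  = colMeans(irf_draws),
  lower = apply(irf_draws, 2, quantile, probs = 0.025),
  upper = apply(irf_draws, 2, quantile, probs = 0.975)
)
head(df_irf_niw)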
Untuk membuat animasi IRF, kita dapat menampilkan respon sebagai fungsi horizon yang “bertumbuh” frame demi frame.
Kita membutuhkan paket gganimate:
library(gganimate)
# contoh data frame IRF (ambil dari df_irf di atas)
p_irf <- ggplot(df_irf, aes(x = h, y = mean)) +
geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2) +
geom_line(size = 1, colour = "blue") +
geom_point(size = 2, colour = "blue") +
geom_hline(yintercept = 0, linetype = "dashed") +
labs(
title = "IRF (Shock pada y1, Respon y1)",
subtitle = "Horizon: {frame_time}",
x = "Horizon (h)",
y = "Respon"
) +
transition_reveal(along = h) +
theme_minimal()
anim <- animate(p_irf, nframes = 50, fps = 10)
anim
Untuk menyimpan animasi sebagai GIF:
anim_save("irf_y1_on_y1.gif", animation = anim)
Ini menghasilkan animasi IRF yang memperlihatkan bagaimana respon berkembang seiring horizon, dengan credible band sebagai bayangan.
Koefisien
Secara konsep:
- vars::VAR memberikan estimasi titik OLS (frequentist) beserta standard error.
- BVAR::bvar memberikan distribusi posterior untuk parameter: koefisien, kovarian error, dan hyperparameter shrinkage.
Kita dapat membandingkan koefisien (posterior mean vs OLS) secara numerik:
# Contoh struktur: sesuaikan dengan output BVAR
A1_hat_bvar <- coef_bvar$A[,,1] # misal: matriks A1 dari BVAR
A1_true
coef_var # dari VAR (vars)
A1_hat_bvar # dari BVAR (posterior mean)
IRF
- vars::irf() = IRF frequentist, dengan interval berdasarkan bootstrap.
- BVAR::irf() = IRF Bayesian, dengan credible band berdasarkan distribusi posterior.
Secara visual keduanya sering mirip; bedanya ada pada interpretasi band (resampling frequentist vs ketidakpastian posterior).
Ringkasan: Teori Minnesota Prior
Model VAR(p):
\[ \mathbf{y}_t = \mathbf{c} + \sum_{\ell=1}^p A_{\ell} \mathbf{y}_{t-\ell} + \boldsymbol{\varepsilon}_t, \qquad \boldsymbol{\varepsilon}_t \sim \mathrm{N}(\mathbf{0}, \Sigma). \]
Tuliskan \(\mathbf{y}_t\) dan semua koefisien ke dalam bentuk regresi multivariat:
\[ Y = X B + E, \]
dengan \(Y\) matriks observasi bertumpuk, \(X\) matriks regressor (intersep dan lag), \(B\) matriks koefisien, dan \(E\) matriks error.
Minnesota prior klasik memberi prior untuk koefisien (dalam bentuk vektor):
\[ \mathrm{vec}(B) \mid \Sigma \sim \mathrm{N}(b_0, \Sigma \otimes \Omega_0), \]
di mana \(\Omega_0\) diagonal, dengan elemen diagonal yang dikendalikan hyperparameter \(\lambda\): makin kecil \(\lambda\), makin ketat shrinkage ke mean prior, dan varian prior mengecil untuk lag yang lebih jauh serta untuk koefisien lintas-variabel.
Prior untuk \(\Sigma\) sering diambil sebagai inverse-Wishart:
\[ \Sigma \sim \mathrm{IW}(\nu_0, S_0). \]
Gabungan prior ini menghasilkan prior conjugate Normal-Inverse-Wishart (NIW) yang memudahkan sampling posterior.
Paket BVAR mengimplementasikan varian dari pendekatan ini, dengan \(\lambda\) (dan hyperparameter lain) yang dapat dioptimasi atau disampel secara hierarkis; pada output di atas terlihat hasil optimasi lambda = 0.48606.
Catatan Akhir
- vars sangat bagus untuk VAR frequentist standar.
- BVAR memberikan kerangka Bayesian dengan prior shrinkage, draw posterior, serta credible band untuk koefisien, IRF, dan FEVD.
Dokumen ini dapat dijadikan template kerja dari simulasi, estimasi, diagnosa, hingga IRF/FEVD. Poin penting: selalu bandingkan hasil Bayesian dengan benchmark frequentist dan periksa sensitivitas terhadap pilihan prior.
R-INLA tidak dirancang untuk VAR multivariat penuh dengan kovarian error umum \(\boldsymbol{\Sigma}\) dan cross-equation dynamics eksplisit.
INLA fokus pada Latent Gaussian Models (LGM); efek waktu sering dimodelkan dengan f(time, model = "ar1"), rw1, dsb.
Namun, ada pendekatan praktis untuk mendekati VAR: estimasi tiap persamaan secara terpisah, dengan lag semua variabel sebagai regressor tetap dan efek laten AR1 untuk sisa dependensi serial.
Ini bukan VAR penuh karena kovarian error lintas-persamaan tidak dimodelkan bersama dan dinamika lintas-persamaan hanya masuk lewat regressor lag. Tetapi, untuk tujuan praktis (misal: efek lag antar seri) pendekatan ini sering cukup informatif.
Berikut contoh kode INLA untuk persamaan pertama (asumsi paket INLA sudah terinstal, dan kita pakai indeks waktu yang sama):
library(INLA)
dat_inla <- data.frame(
y1 = Y_resp[,1],
y1_l1 = Y[1:(Tn-1), 1],
y2_l1 = Y[1:(Tn-1), 2],
time = 2:Tn
)
res_inla_y1 <- inla(
y1 ~ y1_l1 + y2_l1 + f(time, model = "ar1"),
data = dat_inla,
family = "gaussian",
control.compute = list(dic = TRUE, waic = TRUE, cpo = TRUE)
)
summary(res_inla_y1)
## Time used:
## Pre = 1.06, Running = 0.221, Post = 0.0348, Total = 1.32
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) -0.209 0.120 -0.446 -0.209 0.029 -0.209 17.533
## y1_l1 -0.029 0.064 -0.155 -0.029 0.096 -0.029 60.739
## y2_l1 -0.005 0.092 -0.186 -0.005 0.176 -0.005 28.868
##
## Random effects:
## Name Model
## time AR1 model
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 2.54e+04 4.5e+04 928.507 1.22e+04
## Precision for time 7.28e-01 7.7e-02 0.584 7.26e-01
## Rho for time 5.13e-01 4.9e-02 0.415 5.13e-01
## 0.975quant mode
## Precision for the Gaussian observations 1.32e+05 2242.314
## Precision for time 8.87e-01 0.723
## Rho for time 6.08e-01 0.511
##
## Deviance Information Criterion (DIC) ...............: -1685.11
## Deviance Information Criterion (DIC, saturated) ....: 607.30
## Effective number of parameters .....................: 308.30
##
## Watanabe-Akaike information criterion (WAIC) ...: -1677.34
## Effective number of parameters .................: 248.47
##
## Marginal log-Likelihood: -453.72
## CPO, PIT is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
Hal yang sama bisa dilakukan untuk persamaan kedua:
dat_inla$y2 <- Y_resp[,2]
dat_inla$y2_l1 <- Y[1:(Tn-1), 2]
res_inla_y2 <- inla(
y2 ~ y2_l1 + y1_l1 + f(time, model = "ar1"),
data = dat_inla,
family = "gaussian"
)
summary(res_inla_y2)
## Time used:
## Pre = 1.03, Running = 0.229, Post = 0.0179, Total = 1.28
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) 0.012 0.046 -0.078 0.012 0.103 0.012 118.431
## y2_l1 -0.078 0.060 -0.194 -0.078 0.039 -0.078 70.398
## y1_l1 0.003 0.037 -0.070 0.003 0.075 0.003 181.585
##
## Random effects:
## Name Model
## time AR1 model
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 2.31e+04 3.16e+04 1066.421 1.32e+04
## Precision for time 2.32e+00 1.96e-01 1.952 2.31e+00
## Rho for time 1.79e-01 5.70e-02 0.067 1.79e-01
## 0.975quant mode
## Precision for the Gaussian observations 1.05e+05 2677.595
## Precision for time 2.72e+00 2.301
## Rho for time 2.91e-01 0.177
##
## Marginal log-Likelihood: -322.40
## is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
Catatan:
- f(time, model = "ar1") memodelkan dependensi serial pada residual sebagai proses AR(1) laten.
- Regressor y1_l1 dan y2_l1 memberi informasi mirip VAR (dengan perbedaan bahwa struktur error bersama tidak dimodelkan eksplisit).
- Untuk VAR Bayesian penuh dengan struktur kovarian lengkap, NIW + MCMC (atau teknik lain) jauh lebih natural dibanding memaksa semuanya ke dalam INLA.
Ringkasan
Dalam bab ini, kita menurunkan posterior NIW untuk VAR(1) secara analitik, menarik sampel posterior, memeriksa stabilitas dan residual, menghitung IRF dan FEVD, lalu membandingkan jalur frequentist (vars), Bayesian (BVAR), dan aproksimasi per-persamaan dengan INLA.
Struktur ini paralel dengan bab AR dan ARIMA Bayesian, sehingga bisa langsung dijahit menjadi satu chapter lengkap tentang pemodelan deret waktu Bayesian: univariat (AR/ARIMA) hingga multivariat (VAR).
set.seed(123)
T_var <- 200
K <- 2
A0_true <- matrix(c(0, 0), nrow = K)
A1_true <- matrix(c(0.5, 0.2,
0.1, 0.4), nrow = K, byrow = TRUE)
Sigma_true <- matrix(c(1, 0.3,
0.3, 1), nrow = K)
Y <- matrix(0, nrow = T_var, ncol = K)
colnames(Y) <- c("y1", "y2")
Y[1, ] <- c(0, 0)
for (t in 2:T_var) {
eps_t <- mvtnorm::rmvnorm(1, sigma = Sigma_true)
Y[t, ] <- A0_true + A1_true %*% Y[t-1, ] + t(eps_t)
}
df_var <- data.frame(
t = 1:T_var,
y1 = Y[, 1],
y2 = Y[, 2]
)
df_var_long <- df_var |>
  tidyr::pivot_longer(cols = c(y1, y2), names_to = "series", values_to = "value")
ggplot(df_var_long, aes(t, value, color = series)) +
geom_line() +
labs(
title = "Simulasi VAR(1) dengan Dua Seri",
x = "Waktu",
y = "Nilai"
) +
theme_minimal()
Y_lag <- Y[1:(T_var-1), ]
Y_now <- Y[2:T_var, ]
# Bentuk matriks X dengan intercept
X <- cbind(1, Y_lag)
T_eff <- nrow(X)
# Dimensi
K <- ncol(Y_now) # jumlah seri
M <- ncol(X) # intercept + lag
# Prior NIW
A_prior_mean <- matrix(0, nrow = M, ncol = K)
V0 <- diag(10, M) # prior var besar (lemah)
nu0 <- K + 2
S0 <- diag(K)
n_iter <- 5000
# Asumsikan:
# Y: matriks T_var x K
# Sudah ada: Y_lag, Y_now, X, T_eff, K, M, A_prior_mean, V0, nu0, S0, n_iter
A_store <- array(NA, dim = c(n_iter, M, K))
Sigma_store <- array(NA, dim = c(n_iter, K, K))
Sigma_curr <- diag(K)
A_curr <- A_prior_mean
for (iter in 1:n_iter) {
## 1. Draw A | Sigma, Y (Matrix-Normal)
# Posterior row-covariance V_n (M x M)
V0_inv <- solve(V0)
Vn <- solve(V0_inv + t(X) %*% X) # M x M
# Posterior mean A_n (M x K)
An <- Vn %*% (V0_inv %*% A_prior_mean + t(X) %*% Y_now)
# Sampling dari MN(An, Vn, Sigma_curr):
# jika U_V = chol(Vn) dan U_S = chol(Sigma) (keduanya upper triangular),
# maka Z = t(U_V) %*% E %*% U_S memenuhi vec(Z) ~ N(0, Sigma ⊗ Vn)
E <- matrix(rnorm(M * K), nrow = M, ncol = K)
L_V <- chol(Vn)
L_S <- chol(Sigma_curr)
A_curr <- An + t(L_V) %*% E %*% L_S
## 2. Draw Sigma | A, Y (Inverse-Wishart)
E_res <- Y_now - X %*% A_curr # T_eff x K
Sn <- S0 + t(E_res) %*% E_res # K x K
nun <- nu0 + T_eff
Sigma_curr <- solve(rWishart(1, df = nun, Sigma = solve(Sn))[,,1])
## Simpan sampel
A_store[iter, , ] <- A_curr
Sigma_store[iter, , ] <- Sigma_curr
}
# Contoh ringkasan koefisien lag pertama (VAR(1), K = 2)
# Baris 1: intercept, baris 2..(K+1): lag dari masing-masing variabel.
# Catatan: hasil apply di bawah berformat [regressor, persamaan],
# yaitu transpose dari A1 pada notasi y_t = A0 + A1 y_{t-1} + eps_t.
A1_samp <- A_store[, 2:(K+1), ] # koefisien lag
apply(A1_samp, c(2, 3), mean)
## [,1] [,2]
## [1,] 0.4617250 0.1101901
## [2,] 0.2078003 0.2180147
Kita bisa menghitung credible interval:
A1_mean <- apply(A1_samp, c(2,3), mean)
A1_ci <- apply(A1_samp, c(2,3), quantile, probs = c(0.025, 0.975))
A1_mean
## [,1] [,2]
## [1,] 0.4617250 0.1101901
## [2,] 0.2078003 0.2180147
A1_ci
## , , 1
##
## [,1] [,2]
## 2.5% 0.3312210 0.07606611
## 97.5% 0.5974021 0.33787986
##
## , , 2
##
## [,1] [,2]
## 2.5% -0.01992237 0.08819747
## 97.5% 0.24052021 0.34744481
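Sebelum memakai hasil Gibbs, ada baiknya cek konvergensi secara sederhana; berikut sketsa traceplot untuk satu koefisien (nama objek mengikuti kode di atas):
trace_df <- data.frame(iter = 1:n_iter, a11 = A_store[, 2, 1])
ggplot(trace_df, aes(iter, a11)) +
  geom_line(alpha = 0.6) +
  labs(title = "Traceplot a11 (Gibbs NIW)",
       x = "Iterasi", y = "a11") +
  theme_minimal()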
knitr::opts_chunk$set(
message = FALSE,
warning = FALSE,
fig.width = 7,
fig.height = 4.5
)
library(ggplot2)
library(forecast)
library(dlm) # untuk DLM/Kalman filter
set.seed(123)
Dynamic Linear Model (DLM; West & Harrison, 1997) menulis deret waktu dalam bentuk state-space yang sepenuhnya Gaussian.
Model umum:
\[ y_t = \mathbf{F}_t^\top \boldsymbol{\theta}_t + v_t, \qquad v_t \sim \mathrm{N}(0, V_t), \]
\[ \boldsymbol{\theta}_t = \mathbf{G}_t \boldsymbol{\theta}_{t-1} + \mathbf{w}_t, \qquad \mathbf{w}_t \sim \mathrm{N}(\mathbf{0}, \mathbf{W}_t). \]
Dengan \(\mathbf{F}_t\) vektor desain, \(\boldsymbol{\theta}_t\) vektor state laten, \(\mathbf{G}_t\) matriks evolusi state, \(V_t\) varians observasi, dan \(\mathbf{W}_t\) kovarian inovasi state.
Intuisinya: state bergerak perlahan mengikuti persamaan evolusi, observasi adalah pengukuran state yang tercemar noise, dan filter Kalman memperbarui keyakinan tentang state setiap kali data baru tiba.
Asumsikan pada waktu \(t-1\) kita punya prior untuk state:
\[ \boldsymbol{\theta}_{t-1} \mid D_{t-1} \sim \mathrm{N}(\mathbf{m}_{t-1}, \mathbf{C}_{t-1}), \]
di mana \(D_{t-1}\) adalah semua data sampai waktu \(t-1\).
Prediksi state sebelum melihat \(y_t\):
\[ \mathbf{a}_t = \mathbf{G}_t \mathbf{m}_{t-1}, \]
\[ \mathbf{R}_t = \mathbf{G}_t \mathbf{C}_{t-1} \mathbf{G}_t^\top + \mathbf{W}_t. \]
Ini adalah distribusi prior (evolusi) state pada waktu \(t\):
\[ \boldsymbol{\theta}_t \mid D_{t-1} \sim \mathrm{N}(\mathbf{a}_t, \mathbf{R}_t). \]
Prediksi observasi satu langkah ke depan:
\[ f_t = \mathbf{F}_t^\top \mathbf{a}_t, \]
\[ Q_t = \mathbf{F}_t^\top \mathbf{R}_t \mathbf{F}_t + V_t. \]
Sehingga:
\[ y_t \mid D_{t-1} \sim \mathrm{N}(f_t, Q_t). \]
Setelah mengamati \(y_t\), hitung forecast error dan gain:
\[ e_t = y_t - f_t, \]
\[ \mathbf{A}_t = \frac{\mathbf{R}_t \mathbf{F}_t}{Q_t}. \]
Posterior state:
\[ \mathbf{m}_t = \mathbf{a}_t + \mathbf{A}_t e_t, \]
\[ \mathbf{C}_t = \mathbf{R}_t - \mathbf{A}_t \mathbf{A}_t^\top Q_t. \]
Ini adalah rumus Kalman filter, diinterpretasikan sebagai update Bayesian conjugate pada setiap waktu.
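Sebagai ilustrasi numerik satu langkah (sketsa dengan angka karangan untuk local level: \(V = 1\), \(W = 0.25\), prior \(m_0 = 0\), \(C_0 = 1\), observasi \(y_1 = 1\)):
m0 <- 0; C0 <- 1; V <- 1; W <- 0.25; y1 <- 1
a <- m0;      R <- C0 + W        # prediksi state: a = 0, R = 1.25
f <- a;       Q <- R + V         # prediksi observasi: f = 0, Q = 2.25
e <- y1 - f;  A <- R / Q         # error e = 1, gain A = 0.556
m <- a + A * e                   # posterior mean m = 0.556
C <- R - A^2 * Q                 # posterior varians C = 0.556
c(a = a, R = R, f = f, Q = Q, A = A, m = m, C = C)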
Alih-alih menentukan \(\mathbf{W}_t\) secara eksplisit, pendekatan populer adalah discount factor \(\delta \in (0,1]\):
Untuk model dengan matriks evolusi \(\mathbf{G}_t\), kita langsung menetapkan:
\[ \mathbf{R}_t = \frac{1}{\delta} \mathbf{G}_t \mathbf{C}_{t-1} \mathbf{G}_t^\top, \]
maka \(\mathbf{W}_t\) implicit:
\[ \mathbf{W}_t = \mathbf{R}_t - \mathbf{G}_t \mathbf{C}_{t-1} \mathbf{G}_t^\top = \left(\frac{1}{\delta} - 1\right)\mathbf{G}_t \mathbf{C}_{t-1} \mathbf{G}_t^\top. \]
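Sketsa implementasi discount factor untuk local level (fungsi hipotetis kalman_ll_discount; \(\mathbf{W}_t\) tidak pernah ditentukan eksplisit, hanya lewat \(\delta\)); fungsi ini bisa dipanggil pada data simulasi di bawah, misalnya kalman_ll_discount(y, V = V_true, delta = 0.9):
kalman_ll_discount <- function(y, V, delta, m0 = 0, C0 = 1e4) {
  n <- length(y)
  m <- C <- numeric(n)
  m_prev <- m0; C_prev <- C0
  for (t in 1:n) {
    R <- C_prev / delta        # R_t = C_{t-1} / delta (W_t implisit)
    Q <- R + V                 # varians prediksi observasi
    A <- R / Q                 # Kalman gain
    e <- y[t] - m_prev         # untuk local level: f_t = a_t = m_{t-1}
    m[t] <- m_prev + A * e
    C[t] <- R - A^2 * Q
    m_prev <- m[t]; C_prev <- C[t]
  }
  list(m = m, C = C)
}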
Model local level:
\[ y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{N}(0, V), \]
\[ \mu_t = \mu_{t-1} + \eta_t, \qquad \eta_t \sim \mathrm{N}(0, W). \]
Ini adalah DLM dengan \(\mathbf{F}_t = 1\), \(\mathbf{G}_t = 1\), \(V_t = V\), dan \(\mathbf{W}_t = W\) (semua skalar). Dengan demikian, rekursi Kalman di atas menyusut menjadi operasi skalar sederhana.
Simulasi Data (manual, tanpa dlmSim)
Kita simulasi data dengan \(V = 0.5^2\) dan \(W = 0.1^2\):
n <- 150
V_true <- 0.5^2
W_true <- 0.1^2
mu <- numeric(n)
y <- numeric(n)
mu[1] <- 0
y[1] <- mu[1] + rnorm(1, sd = sqrt(V_true))
for (t in 2:n) {
mu[t] <- mu[t-1] + rnorm(1, sd = sqrt(W_true))
y[t] <- mu[t] + rnorm(1, sd = sqrt(V_true))
}
ts_y <- ts(y)
ts_mu <- ts(mu)
autoplot(cbind(y = ts_y, mu_true = ts_mu)) +
labs(title = "Simulasi Local Level DLM (Manual)",
x = "Waktu", y = "") +
theme_minimal()
Implementasi Kalman filter manual dengan \(V\) dan \(W\) diketahui:
kalman_local_level <- function(y, V, W, m0 = 0, C0 = 1e4) {
n <- length(y)
a <- m <- numeric(n)
R <- C <- numeric(n)
f <- Q <- e <- numeric(n)
A <- numeric(n)
# t = 1 (menggunakan prior awal)
a[1] <- m0
R[1] <- C0 + W
f[1] <- a[1]
Q[1] <- R[1] + V
e[1] <- y[1] - f[1]
A[1] <- R[1] / Q[1]
m[1] <- a[1] + A[1] * e[1]
C[1] <- R[1] - A[1]^2 * Q[1]
# t >= 2
for (t in 2:n) {
a[t] <- m[t-1]
R[t] <- C[t-1] + W
f[t] <- a[t]
Q[t] <- R[t] + V
e[t] <- y[t] - f[t]
A[t] <- R[t] / Q[t]
m[t] <- a[t] + A[t] * e[t]
C[t] <- R[t] - A[t]^2 * Q[t]
}
list(a = a, R = R, f = f, Q = Q,
m = m, C = C, e = e, A = A)
}
kf_res <- kalman_local_level(y, V = V_true, W = W_true, m0 = 0, C0 = 10)
str(kf_res)
## List of 8
## $ a: num [1:150] 0 -0.273 0.245 0.176 0.235 ...
## $ R: num [1:150] 10.01 0.2539 0.136 0.0981 0.0804 ...
## $ f: num [1:150] 0 -0.273 0.245 0.176 0.235 ...
## $ Q: num [1:150] 10.26 0.504 0.386 0.348 0.33 ...
## $ m: num [1:150] -0.273 0.245 0.176 0.235 0.101 ...
## $ C: num [1:150] 0.2439 0.126 0.0881 0.0704 0.0609 ...
## $ e: num [1:150] -0.28 1.03 -0.197 0.21 -0.55 ...
## $ A: num [1:150] 0.976 0.504 0.352 0.282 0.243 ...
Plot observed y vs filtered mean m_t dan true level μ_t:
df_kf <- data.frame(
t = 1:n,
y = y,
mu_true = mu,
m_filt = kf_res$m
)
ggplot(df_kf, aes(x = t)) +
geom_point(aes(y = y), colour = "grey60", alpha = 0.6) +
geom_line(aes(y = mu_true, colour = "True level"), linewidth = 1) +
geom_line(aes(y = m_filt, colour = "Filtered m_t"), linewidth = 1) +
scale_colour_manual(values = c("True level" = "black", "Filtered m_t" = "blue")) +
labs(title = "Local Level: Observasi vs Filtered State",
x = "Waktu", y = "") +
theme_minimal()
Untuk DLM Gaussian, distribusi posterior joint state \((\mu_1, \dots, \mu_n)\) adalah multivariat Normal. Cara praktis untuk sampling adalah algoritma Forward Filtering Backward Sampling (FFBS).
ffbs_local_level <- function(y, V, W, m0 = 0, C0 = 1e4) {
n <- length(y)
# Forward
kf <- kalman_local_level(y, V, W, m0, C0)
a <- kf$a
R <- kf$R
m <- kf$m
C <- kf$C
# Backward
theta <- numeric(n)
# t = n
theta[n] <- rnorm(1, mean = m[n], sd = sqrt(C[n]))
# t = n-1,...,1
for (t in (n-1):1) {
J_t <- C[t] / R[t+1]
mean_t <- m[t] + J_t * (theta[t+1] - a[t+1])
var_t <- C[t] - J_t^2 * R[t+1]
theta[t] <- rnorm(1, mean = mean_t, sd = sqrt(var_t))
}
theta
}
# Contoh satu draw
theta_draw <- ffbs_local_level(y, V_true, W_true, m0 = 0, C0 = 10)
Sampling banyak draw untuk membuat credible band:
n_samp <- 1000
theta_mat <- matrix(NA_real_, nrow = n, ncol = n_samp)
for (i in 1:n_samp) {
theta_mat[,i] <- ffbs_local_level(y, V_true, W_true, m0 = 0, C0 = 10)
}
theta_mean <- rowMeans(theta_mat)
theta_lo <- apply(theta_mat, 1, quantile, probs = 0.025)
theta_hi <- apply(theta_mat, 1, quantile, probs = 0.975)
df_smooth <- data.frame(
t = 1:n,
mu_true = mu,
y = y,
mean_sm = theta_mean,
lo_sm = theta_lo,
hi_sm = theta_hi
)
ggplot(df_smooth, aes(x = t)) +
geom_point(aes(y = y), colour = "grey70", alpha = 0.6) +
geom_ribbon(aes(ymin = lo_sm, ymax = hi_sm), fill = "skyblue", alpha = 0.4) +
geom_line(aes(y = mean_sm, colour = "Smoothed mean"), linewidth = 1) +
geom_line(aes(y = mu_true, colour = "True level"), linewidth = 1) +
scale_colour_manual(values = c("Smoothed mean" = "blue", "True level" = "black")) +
labs(title = "Local Level: Smoothed State (FFBS) vs True Level",
x = "Waktu", y = "") +
theme_minimal()
Local linear trend (dua state):
\[ y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{N}(0, V), \]
\[ \mu_t = \mu_{t-1} + \beta_{t-1} + \eta_t, \]
\[ \beta_t = \beta_{t-1} + \zeta_t, \]
dengan \(\eta_t \sim \mathrm{N}(0, W_\mu)\) dan \(\zeta_t \sim \mathrm{N}(0, W_\beta)\).
Bentuk DLM: state \(\boldsymbol{\theta}_t = (\mu_t, \beta_t)^\top\), \(\mathbf{F}_t = (1, 0)^\top\), dan \(\mathbf{G}_t = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\).
Simulasi Data (manual, tanpa dlmSim)
library(dlm)
library(ggplot2)
n_ll <- 200
V_ll <- 0.4^2
W_mu <- 0.1^2
W_beta <- 0.05^2
mu_ll <- numeric(n_ll)
beta_ll <- numeric(n_ll)
y_ll <- numeric(n_ll)
mu_ll[1] <- 0
beta_ll[1] <- 0
y_ll[1] <- mu_ll[1] + rnorm(1, sd = sqrt(V_ll))
for (t in 2:n_ll) {
mu_ll[t] <- mu_ll[t-1] + beta_ll[t-1] + rnorm(1, sd = sqrt(W_mu))
beta_ll[t] <- beta_ll[t-1] + rnorm(1, sd = sqrt(W_beta))
y_ll[t] <- mu_ll[t] + rnorm(1, sd = sqrt(V_ll))
}
ts_y_ll <- ts(y_ll)
ts_mu_ll <- ts(mu_ll)
ts_beta_ll <- ts(beta_ll)
autoplot(cbind(y = ts_y_ll, mu_true = ts_mu_ll)) +
labs(title = "Simulasi Local Linear Trend (Manual)",
x = "Waktu", y = "") +
theme_minimal()
dlmModPoly (Filter & Smooth)
Kita gunakan dlmModPoly(order = 2) dengan parameter yang sama seperti simulasi.
build_local_trend <- function(v, w_mu, w_beta) {
dlmModPoly(order = 2, dV = v, dW = c(w_mu, w_beta))
}
mod_true <- build_local_trend(V_ll, W_mu, W_beta)
filt_ll <- dlmFilter(y_ll, mod_true)
smooth_ll <- dlmSmooth(filt_ll)
## PERBAIKAN: buang state awal (t=0)
mu_filt <- drop(filt_ll$m[-1, 1])
mu_smooth <- drop(smooth_ll$s[-1, 1])
df_ll <- data.frame(
t = 1:length(y_ll),
y = y_ll,
mu_true = mu_ll,
mu_filt = mu_filt,
mu_smooth = mu_smooth
)
ggplot(df_ll, aes(x = t)) +
geom_point(aes(y = y), colour = "grey70", alpha = 0.6) +
geom_line(aes(y = mu_true, colour = "True level"), linewidth = 1) +
geom_line(aes(y = mu_filt, colour = "Filtered"), linewidth = 1) +
geom_line(aes(y = mu_smooth, colour = "Smoothed"), linewidth = 1,
linetype = "dashed") +
scale_colour_manual(values = c(
"True level" = "black",
"Filtered" = "blue",
"Smoothed" = "red"
)) +
labs(title = "Local Linear Trend: Observasi vs Level State",
x = "Waktu", y = "", colour = "") +
theme_minimal()
Regresi linear sederhana dengan parameter yang berubah terhadap waktu:
\[ y_t = \beta_{0,t} + \beta_{1,t} x_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{N}(0, V), \]
\[ \beta_{0,t} = \beta_{0,t-1} + \eta_{0,t}, \qquad \eta_{0,t} \sim \mathrm{N}(0, W_0), \]
\[ \beta_{1,t} = \beta_{1,t-1} + \eta_{1,t}, \qquad \eta_{1,t} \sim \mathrm{N}(0, W_1). \]
State:
\[ \boldsymbol{\theta}_t = \begin{pmatrix} \beta_{0,t} \\ \beta_{1,t} \end{pmatrix}, \qquad \mathbf{F}_t = \begin{pmatrix} 1 \\ x_t \end{pmatrix}, \qquad \mathbf{G}_t = \mathbf{I}_2. \]
library(dlm)
library(ggplot2)
set.seed(123)
n_tvp <- 200
x_t <- rnorm(n_tvp)
beta0_true <- cumsum(rnorm(n_tvp, sd = 0.05)) + 1
beta1_true <- cumsum(rnorm(n_tvp, sd = 0.03)) + 0.5
V_tvp <- 0.3^2
y_tvp <- beta0_true + beta1_true * x_t + rnorm(n_tvp, sd = sqrt(V_tvp))
ts_y_tvp <- ts(y_tvp)
autoplot(ts_y_tvp) +
labs(title = "Simulasi Time-Varying Parameter Regression",
x = "Waktu", y = "y_t") +
theme_minimal()
dlmModReg
Paket dlm menyediakan dlmModReg yang memformulasikan regresi dalam bentuk DLM dengan koefisien yang dapat berubah terhadap waktu.
# Model TVP dengan intercept (addInt = TRUE)
mod_tvp <- dlmModReg(x_t,
dV = V_tvp,
dW = c(0.05^2, 0.03^2),
addInt = TRUE)
filt_tvp <- dlmFilter(y_tvp, mod_tvp)
smooth_tvp <- dlmSmooth(filt_tvp)
## PERBAIKAN: buang state awal t = 0
beta0_hat <- drop(smooth_tvp$s[-1, 1])
beta1_hat <- drop(smooth_tvp$s[-1, 2])
df_tvp <- data.frame(
t = 1:n_tvp,
beta0_true = beta0_true,
beta1_true = beta1_true,
beta0_hat = beta0_hat,
beta1_hat = beta1_hat
)
# Contoh plot evolusi koefisien
ggplot(df_tvp, aes(x = t)) +
geom_line(aes(y = beta0_true, colour = "beta0 true")) +
geom_line(aes(y = beta0_hat, colour = "beta0 hat"), linetype = "dashed") +
labs(title = "Koefisien Intersep Time-Varying: True vs Smoothed",
x = "Waktu", y = "beta0", colour = "") +
theme_minimal()
ggplot(df_tvp, aes(x = t)) +
geom_line(aes(y = beta1_true, colour = "beta1 true")) +
geom_line(aes(y = beta1_hat, colour = "beta1 hat"), linetype = "dashed") +
labs(title = "TVP Regression: Slope",
x = "Waktu", y = expression(beta[1](t))) +
scale_colour_manual(values = c("beta1 true" = "black", "beta1 hat" = "red")) +
theme_minimal()
INLA memodelkan Latent Gaussian Models (LGM) di mana latent field dapat berupa random walk (RW1, RW2), seasonal, AR1, dsb. Banyak DLM (khususnya local level/trend) dapat dipetakan ke bentuk INLA.
Local level:
\[ y_t = \mu_t + \varepsilon_t, \qquad \mu_t - \mu_{t-1} \sim \mathrm{N}(0, \tau^{-1}), \]
adalah RW1. Dalam INLA (sketsa):
library(INLA)
dat_ll <- data.frame(
y = y,
time = 1:length(y)
)
res_inla_ll <- inla(
y ~ f(time, model = "rw1"),
data = dat_ll,
family = "gaussian",
control.compute = list(dic = TRUE, waic = TRUE, cpo = TRUE)
)
summary(res_inla_ll)
## Time used:
## Pre = 1.09, Running = 0.202, Post = 0.0143, Total = 1.31
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) -0.063 0.04 -0.141 -0.063 0.015 -0.063 160.104
##
## Random effects:
## Name Model
## time RW1 model
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 4.33 0.521 3.38 4.30
## Precision for time 2103.77 2583.645 204.13 1328.43
## 0.975quant mode
## Precision for the Gaussian observations 5.43 4.25
## Precision for time 8776.91 535.64
##
## Deviance Information Criterion (DIC) ...............: 216.06
## Deviance Information Criterion (DIC, saturated) ....: 158.96
## Effective number of parameters .....................: 6.72
##
## Watanabe-Akaike information criterion (WAIC) ...: 216.18
## Effective number of parameters .................: 6.58
##
## Marginal log-Likelihood: -123.84
## CPO, PIT is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
f(time, model="rw1") memodelkan state \(\mu_t\) sebagai random walk orde 1.Local linear trend (dua state) sering dipetakan ke RW2 pada level:
\[ \mu_t - 2\mu_{t-1} + \mu_{t-2} \sim \mathrm{N}(0, \tau^{-1}). \]
Di INLA:
dat_ll2 <- data.frame(
y = y_ll,
time = 1:length(y_ll)
)
res_inla_rw2 <- inla(
y ~ f(time, model = "rw2"),
data = dat_ll2,
family = "gaussian"
)
summary(res_inla_rw2)
## Time used:
## Pre = 1.02, Running = 0.195, Post = 0.0142, Total = 1.23
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) -4.768 0.029 -4.824 -4.768 -4.711 -4.768 7206.932
##
## Random effects:
## Name Model
## time RW2 model
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 6.13 0.67 4.91 6.10
## Precision for time 459.52 152.19 227.41 437.39
## 0.975quant mode
## Precision for the Gaussian observations 7.54 6.04
## Precision for time 819.52 396.54
##
## Marginal log-Likelihood: -170.95
## is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
rw2 memodelkan second-order random walk, analog dengan local linear trend.
TVP regression:
\[ y_t = \beta_{0,t} + \beta_{1} x_t + \varepsilon_t, \]
bisa dimodelkan dengan intersep time-varying lewat f(time, model = "rw1") dan slope \(\beta_1\) tetap sebagai efek linear x. Sketsa:
dat_tvp2 <- data.frame(
y = y_tvp,
x = x_t,
time = 1:length(y_tvp)
)
res_inla_tvp <- inla(
y ~ x + f(time, model = "rw1"),
data = dat_tvp2,
family = "gaussian"
)
summary(res_inla_tvp)
## Time used:
## Pre = 1.03, Running = 0.202, Post = 0.0138, Total = 1.25
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) 1.358 0.024 1.311 1.358 1.406 1.358 1211.023
## x 0.665 0.026 0.613 0.665 0.716 0.665 522.453
##
## Random effects:
## Name Model
## time RW1 model
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 8.59 0.899 6.94 8.55
## Precision for time 1006.58 595.894 282.03 868.95
## 0.975quant mode
## Precision for the Gaussian observations 10.48 8.47
## Precision for time 2538.20 639.49
##
## Marginal log-Likelihood: -102.31
## is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
Untuk slope juga time-varying, bisa dibuat dua efek RW1 dengan struktur desain khusus (lebih teknis, biasanya memakai trik indeks dan faktor untuk membedakan level vs slope).
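Sebagai gambaran, sketsa berikut (tidak dieksekusi; mengasumsikan data frame dat_tvp2 seperti di atas) memanfaatkan argumen bobot pada f() di INLA, sehingga satu efek RW1 mengalikan konstanta dan satu lagi mengalikan \(x_t\):
library(INLA)
# dua salinan indeks waktu, satu untuk tiap koefisien
dat_tvp2$time1 <- dat_tvp2$time   # untuk beta_{0,t}
dat_tvp2$time2 <- dat_tvp2$time   # untuk beta_{1,t}
res_inla_tvp2 <- inla(
  y ~ -1 +
    f(time1, model = "rw1") +     # intercept dinamis beta_{0,t}
    f(time2, x, model = "rw1"),   # argumen kedua = bobot: beta_{1,t} * x_t
  data = dat_tvp2,
  family = "gaussian"
)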
Dalam bab DLM ini, kita telah:
- membangun local level dan local linear trend dengan dlmModPoly,
- membangun regresi dinamis (TVP) dengan dlmModReg,
- memetakan model-model tersebut ke INLA lewat rw1 dan rw2.
Struktur ini paralel dengan bab AR, ARIMA, dan VAR Bayesian:
Dengan DLM, kita mendapat kerangka generik untuk banyak model deret waktu, dan dengan pendekatan Bayesian (Kalman + FFBS / INLA) kita bisa menyusun inferensi penuh (point estimate + interval + prediksi) secara konsisten.
set.seed(123)
n <- 150
mu <- beta <- numeric(n)
y_dlm <- numeric(n)
mu[1] <- 0
beta[1] <- 0.1 # trend awal
sigma_eps <- 0.5
sigma_eta <- 0.1
sigma_zeta <- 0.05
for (t in 2:n) {
beta[t] <- beta[t-1] + rnorm(1, 0, sigma_zeta)
mu[t] <- mu[t-1] + beta[t-1] + rnorm(1, 0, sigma_eta)
}
y_dlm <- mu + rnorm(n, 0, sigma_eps)
df_dlm <- data.frame(
t = 1:n,
y = y_dlm,
mu = mu
)
ggplot(df_dlm, aes(t, y)) +
geom_line(color = "grey40") +
geom_line(aes(y = mu), color = "blue", linewidth = 1) +
labs(
title = "Simulasi DLM: Local Level + Trend",
x = "Waktu",
y = "y_t"
) +
theme_minimal()
Garis biru adalah true state \(\mu_t\); garis abu-abu adalah observasi dengan noise.
Estimasi dengan paket dlm:
library(dlm)
library(ggplot2)
# Misal: y_dlm dan mu sudah didefinisikan sebelumnya
# n <- length(y_dlm)
build_dlm <- function(parm) {
V <- exp(parm[1]) # varians observasi
W_level <- exp(parm[2]) # varians level
W_trend <- exp(parm[3]) # varians trend
dlmModPoly(order = 2,
dV = V,
dW = c(W_level, W_trend))
}
# MLE hyperparameter
fit_mle <- dlmMLE(y_dlm, parm = rep(0, 3), build = build_dlm)
mod_dlm <- build_dlm(fit_mle$par)
# Filtering & smoothing
filt <- dlmFilter(y_dlm, mod_dlm)
smth <- dlmSmooth(filt)
# smth$s: (T+1) x 2 → buang baris pertama (state t = 0)
n <- length(y_dlm)
mu_hat <- drop(smth$s[-1, 1]) # sekarang panjang = n
# Pastikan mu juga panjangnya n
# kalau perlu: mu <- mu[1:n]
df_dlm_est <- data.frame(
t = 1:n,
y = y_dlm,
mu_true = mu,
mu_hat = mu_hat
)
ggplot(df_dlm_est, aes(t)) +
geom_line(aes(y = y), color = "grey70") +
geom_line(aes(y = mu_true), color = "blue", linewidth = 1) +
geom_line(aes(y = mu_hat), color = "red", linewidth = 1, linetype = "dashed") +
labs(
title = "DLM: True Level vs Posterior Mean Level",
x = "Waktu",
y = "Level"
) +
theme_minimal()
Meskipun di atas kita pakai MLE untuk hyper-parameter, pendekatan
Bayesian penuh dapat diperoleh dengan memberikan prior untuk \(V, W\)
dan melakukan MCMC; paket seperti dlm menyediakan fungsi
untuk forward filtering backward sampling (FFBS) yang
menghasilkan sampel state \(\boldsymbol{\theta}_t\) dari posterior.
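Sebagai sketsa (asumsi: prior invers-gamma untuk varians; nilai shape/rate di bawah hanyalah contoh, bukan rekomendasi), fungsi dlmGibbsDIG di paket dlm menggabungkan FFBS untuk state dengan update Gibbs untuk \(V\) dan diagonal \(W\):
library(dlm)
set.seed(123)
gibbs_out <- dlmGibbsDIG(
  y_dlm,
  mod = dlmModPoly(2),
  shape.y = 0.1, rate.y = 0.1,          # prior IG untuk V (contoh)
  shape.theta = 0.1, rate.theta = 0.1,  # prior IG untuk diag(W) (contoh)
  n.sample = 2000
)
str(gibbs_out$dV)  # sampel posterior varians observasi
Komponen gibbs_out$theta berisi sampel path state; buang beberapa ratus iterasi awal sebagai burn-in sebelum meringkas posterior.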
knitr::opts_chunk$set(
message = FALSE,
warning = FALSE,
fig.width = 7,
fig.height = 4.5
)
library(dlm)
library(ggplot2)
library(ggfortify)
set.seed(123)
Pada regresi klasik (OLS), kita menulis:
\[ y_t = \beta_0 + \beta_1 x_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{N}(0, \sigma^2), \]
dengan asumsi bahwa koefisien regresi \(\beta_0, \beta_1\) tetap sepanjang waktu.
Dalam banyak aplikasi deret waktu (ekonomi, keuangan, kebijakan, dsb.), hubungan antara \(y_t\) dan \(x_t\) bisa berubah:
Model Time-Varying Parameter (TVP) mengakomodasi hal ini dengan:
\[ y_t = \beta_{0,t} + \beta_{1,t} x_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{N}(0, \sigma^2), \]
di mana koefisien \(\beta_{0,t}, \beta_{1,t}\) dibiarkan berubah terhadap waktu.
Cara paling umum memodelkan evolusi koefisien adalah random walk:
\[ \beta_{j,t} = \beta_{j,t-1} + \eta_{j,t}, \qquad \eta_{j,t} \sim \mathrm{N}(0, \omega_j^2), \]
untuk \(j = 0, 1\).
Artinya:
Jika \(\omega_j^2\) sangat kecil, koefisien hampir konstan.
Jika \(\omega_j^2\) besar, koefisien sangat dinamis.
Model TVP di atas bisa ditulis dalam bentuk DLM.
Definisikan state:
\[ \boldsymbol{\theta}_t = \begin{pmatrix} \beta_{0,t} \\ \beta_{1,t} \end{pmatrix}. \]
Vektor desain:
\[ \mathbf{F}_t = \begin{pmatrix} 1 \\ x_t \end{pmatrix}. \]
Persamaan observasi:
\[ y_t = \mathbf{F}_t^\top \boldsymbol{\theta}_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{N}(0, V), \]
dengan \(V = \sigma^2\).
Untuk random walk pada koefisien, persamaan state:
\[ \boldsymbol{\theta}_t = \mathbf{G} \boldsymbol{\theta}_{t-1} + \mathbf{w}_t, \qquad \mathbf{w}_t \sim \mathrm{N}(\mathbf{0}, \mathbf{W}), \]
dengan:
\[ \mathbf{G} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \mathbf{I}_2, \qquad \mathbf{W} = \begin{pmatrix} \omega_0^2 & 0 \\ 0 & \omega_1^2 \end{pmatrix}. \]
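Sebagai ilustrasi struktur ini, sketsa berikut membangun model TVP secara manual dengan konstruktor dlm() (nilai varians hanyalah contoh, dan vektor x_t diasumsikan sudah ada); komponen JFF menandai elemen \(\mathbf{F}_t\) yang berubah waktu dan X menyimpan nilainya:
library(dlm)
mod_manual <- dlm(
  FF  = matrix(c(1, 0), nrow = 1),  # kolom ke-2 diisi x_t lewat JFF/X
  V   = 0.3^2,
  GG  = diag(2),                    # G = I_2 (random walk)
  W   = diag(c(0.05^2, 0.03^2)),    # W = diag(omega0^2, omega1^2)
  m0  = rep(0, 2),
  C0  = diag(1e7, 2),               # prior state yang sangat difus
  JFF = matrix(c(0, 1), nrow = 1),  # entri 1: ambil dari kolom 1 matriks X
  X   = matrix(x_t, ncol = 1)
)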
Jadi:
Dalam kerangka Bayesian:
Di sini kita ambil pendekatan conditionally Gaussian:
Kita akan mensimulasikan data dari model TVP supaya bisa:
n_tvp <- 200
x_t <- rnorm(n_tvp)
beta0_true <- cumsum(rnorm(n_tvp, sd = 0.05)) + 1
beta1_true <- cumsum(rnorm(n_tvp, sd = 0.03)) + 0.5
sigma_eps <- 0.3
V_tvp <- sigma_eps^2
y_tvp <- beta0_true + beta1_true * x_t + rnorm(n_tvp, sd = sigma_eps)
ts_y_tvp <- ts(y_tvp)
autoplot(ts_y_tvp) +
labs(title = "Simulasi Time-Varying Parameter Regression",
x = "Waktu", y = "y_t") +
theme_minimal()
df_beta_true <- data.frame(
t = 1:n_tvp,
beta0_true = beta0_true,
beta1_true = beta1_true
)
p0 <- ggplot(df_beta_true, aes(x = t)) +
geom_line(aes(y = beta0_true), colour = "blue") +
labs(title = expression("Koefisien Intercept " * beta[0](t) * " (True)"),
x = "Waktu", y = expression(beta[0](t))) +
theme_minimal()
p1 <- ggplot(df_beta_true, aes(x = t)) +
geom_line(aes(y = beta1_true), colour = "red") +
labs(title = expression("Koefisien Slope " * beta[1](t) * " (True)"),
x = "Waktu", y = expression(beta[1](t))) +
theme_minimal()
p0
p1
Paket dlm menyediakan fungsi dlmModReg() yang membantu membangun model regresi sebagai DLM.
Kita ingin:
mod_tvp <- dlmModReg(x_t, dV = V_tvp, dW = c(0.05^2, 0.03^2), addInt = TRUE)
mod_tvp
## $FF
## [,1] [,2]
## [1,] 1 1
##
## $V
## [,1]
## [1,] 0.09
##
## $GG
## [,1] [,2]
## [1,] 1 0
## [2,] 0 1
##
## $W
## [,1] [,2]
## [1,] 0.0025 0e+00
## [2,] 0.0000 9e-04
##
## $JFF
## [,1] [,2]
## [1,] 0 1
##
## $X
## [,1]
## [1,] -0.5605
## [2,] -0.2302
## [3,] ...
##
## $m0
## [1] 0 0
##
## $C0
## [,1] [,2]
## [1,] 1e+07 0e+00
## [2,] 0e+00 1e+07
Keterangan:
- addInt = TRUE menambahkan intercept sebagai state pertama,
- dW = c(0.05^2, 0.03^2) mengatur varians evolusi untuk intercept dan slope,
- dalam praktik, dV dan dW sering diestimasi (misalnya dengan dlmMLE), tetapi di sini kita set sama dengan nilai simulasi agar fokus pada estimasi state.
Kita jalankan dlmFilter() untuk mendapatkan distribusi posterior state \(\boldsymbol{\theta}_t\) berdasarkan data sampai waktu \(t\):
filt_tvp <- dlmFilter(y_tvp, mod_tvp)
str(filt_tvp$m)
## num [1:201, 1:2] 0 0.877 0.997 1.259 1.109 ...
filt_tvp$m adalah matriks \((n+1) \times 2\) yang berisi mean posterior (filtering) state pada tiap waktu; baris pertama adalah prior pada \(t = 0\).
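Misalnya, mean filtering kedua koefisien dapat diambil langsung (baris pertama, yaitu prior pada \(t = 0\), dibuang):
beta0_filt <- filt_tvp$m[-1, 1]
beta1_filt <- filt_tvp$m[-1, 2]
head(cbind(beta0_filt, beta1_filt))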
dlmSmooth() memberikan smoothing posterior \(p(\boldsymbol{\theta}_t \mid D_n)\), yaitu distribusi state pada waktu \(t\) menggunakan seluruh data dari 1 sampai \(n\):
smooth_tvp <- dlmSmooth(filt_tvp)
## buang state awal t = 0
beta0_hat <- drop(smooth_tvp$s[-1, 1])
beta1_hat <- drop(smooth_tvp$s[-1, 2])
df_tvp <- data.frame(
t = 1:n_tvp,
beta0_true = beta0_true,
beta1_true = beta1_true,
beta0_hat = beta0_hat,
beta1_hat = beta1_hat
)
ggplot(df_tvp, aes(x = t)) +
geom_line(aes(y = beta0_true, colour = "beta0 true")) +
geom_line(aes(y = beta0_hat, colour = "beta0 hat"), linetype = "dashed") +
scale_colour_manual(values = c("beta0 true" = "black", "beta0 hat" = "blue")) +
labs(title = "TVP Regression: Intercept (True vs Smoothed)",
x = "Waktu", y = expression(beta[0](t)),
colour = "") +
theme_minimal()
ggplot(df_tvp, aes(x = t)) +
geom_line(aes(y = beta1_true, colour = "beta1 true")) +
geom_line(aes(y = beta1_hat, colour = "beta1 hat"), linetype = "dashed") +
scale_colour_manual(values = c("beta1 true" = "black", "beta1 hat" = "red")) +
labs(title = "TVP Regression: Slope (True vs Smoothed)",
x = "Waktu", y = expression(beta[1](t)),
colour = "") +
theme_minimal()
Interpretasi:
Dari filter/smoother, kita bisa memperoleh fitted values \(_t\) dan residual:
# fitted berdasarkan smoothing state
y_hat <- beta0_hat + beta1_hat * x_t
resid <- y_tvp - y_hat
df_fit <- data.frame(
t = 1:n_tvp,
y = y_tvp,
y_hat = y_hat,
resid = resid
)
ggplot(df_fit, aes(x = t)) +
geom_line(aes(y = y, colour = "Observed")) +
geom_line(aes(y = y_hat, colour = "Fitted (TVP)"), linetype = "dashed") +
scale_colour_manual(values = c("Observed" = "black", "Fitted (TVP)" = "blue")) +
labs(title = "Observed vs Fitted (TVP Regression)",
x = "Waktu", y = "y_t",
colour = "") +
theme_minimal()
Plot residual:
ggplot(df_fit, aes(x = t, y = resid)) +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") +
geom_line() +
labs(title = "Residual TVP Regression",
x = "Waktu", y = "Residual") +
theme_minimal()
Ketika kita menggunakan dlmModReg(x_t, dV = V, dW = ...):
Secara Bayesian:
- dlmFilter memberikan mean dan kovarian posterior secara rekursif,
- dlmSmooth memberikan smoothing posterior.
Kalau ingin sampling seluruh path state dari posterior (bukan hanya mean), kita bisa gunakan dlmBSample:
# dlmBSample menghasilkan SATU path state dari posterior per panggilan;
# untuk banyak path, panggil berulang (misalnya dengan replicate()).
samp_states <- dlmBSample(filt_tvp)
dim(samp_states)
## [1] 201 2
# matriks (n+1) x 2: baris = waktu (termasuk t = 0), kolom = state
Kita bisa, misalnya, melihat credible band untuk \(\beta_{1,t}\):
library(dlm)
# -----------------------------------
# FILTERING (FFBS memakai kuantitas filtering m_t, C_t,
# bukan kuantitas smoothing)
# -----------------------------------
filt_tvp <- dlmFilter(y_tvp, mod_tvp)
# Ambil info model
G <- mod_tvp$GG   # matriks transisi (identitas untuk random walk)
W <- mod_tvp$W    # kovarian evolusi state
# Mean dan kovarian filtering untuk t = 0..n
m_filt <- filt_tvp$m
C_filt <- dlmSvd2var(filt_tvp$U.C, filt_tvp$D.C)
n      <- nrow(m_filt) - 1   # tanpa state t = 0
nstate <- ncol(m_filt)
# -----------------------------------
# FFBS MANUAL
# -----------------------------------
nsim <- 200
samples <- array(NA, dim = c(n, nstate, nsim))
for (k in 1:nsim) {
  theta_sim <- matrix(NA, nrow = n + 1, ncol = nstate)
  # Step 1: sample theta_n dari distribusi filtering N(m_n, C_n)
  theta_sim[n + 1, ] <- MASS::mvrnorm(1, m_filt[n + 1, ], C_filt[[n + 1]])
  # Step 2: backward sampling untuk t = n-1, ..., 0
  for (t in n:1) {
    m_t <- m_filt[t, ]
    C_t <- C_filt[[t]]
    # Gaussian conditioning: theta_{t-1} | theta_t dan y_{1:(t-1)}
    B_t    <- C_t %*% t(G) %*% solve(G %*% C_t %*% t(G) + W)
    mean_t <- m_t + B_t %*% (theta_sim[t + 1, ] - G %*% m_t)
    cov_t  <- C_t - B_t %*% G %*% C_t
    theta_sim[t, ] <- MASS::mvrnorm(1, drop(mean_t), cov_t)
  }
  samples[, , k] <- theta_sim[-1, ]  # buang state t = 0
}
# -----------------------------------
# Credible band untuk β1(t)
# -----------------------------------
beta1_samp <- samples[, 2, ] # dim = n × nsim
beta1_mean <- rowMeans(beta1_samp)
beta1_lo <- apply(beta1_samp, 1, quantile, 0.025)
beta1_hi <- apply(beta1_samp, 1, quantile, 0.975)
df_band <- data.frame(
t = 1:n_tvp,
mean = beta1_mean,
lo = beta1_lo,
hi = beta1_hi,
true = beta1_true
)
library(ggplot2)
ggplot(df_band, aes(x = t)) +
geom_ribbon(aes(ymin = lo, ymax = hi), fill = "skyblue", alpha = 0.4) +
geom_line(aes(y = mean, colour = "Posterior mean")) +
geom_line(aes(y = true, colour = "True beta1"), linewidth = 1) +
scale_colour_manual(values = c("Posterior mean" = "blue", "True beta1" = "black")) +
labs(title = "Credible Band untuk Koefisien Slope β₁(t)",
x = "Waktu", y = expression(beta[1](t)),
colour = "") +
theme_minimal()
INLA dirancang untuk Latent Gaussian Models (LGM) dengan struktur seperti:
\[ y_i \mid \eta_i \sim p(y_i \mid \eta_i), \qquad \eta_i = \mu + \sum_k f_k(z_{ik}) + \mathbf{x}_i^\top \boldsymbol{\beta}, \]
di mana efek acak \(f_k\) biasanya dimodelkan sebagai:
TVP regression:
\[ y_t = \beta_{0,t} + \beta_{1,t} x_t + \varepsilon_t \]
bisa didekati di INLA dengan beberapa trik, antara lain:
Intercept dinamis, slope tetap
Misal:
\[ y_t = \alpha_t + \beta x_t + \varepsilon_t, \]
dengan \(\alpha_t\) sebagai random walk:
# Sketsa, tidak dieksekusi di sini
library(INLA)
dat_inla <- data.frame(
y = y_tvp,
x = x_t,
time = 1:length(y_tvp)
)
res_inla <- inla(
y ~ x + f(time, model = "rw1"),
data = dat_inla,
family = "gaussian"
)
Slope juga time-varying
Ini lebih teknis: perlu mendesain dua efek random walk yang disusun
sehingga satu mengalikan konstanta, satu mengalikan \(x_t\). Biasanya
dilakukan dengan membangun desain
f(index, model="rw1", values = ...) atau memanfaatkan fitur
rgeneric di INLA.
Untuk e-book, cukup penting menyampaikan bahwa:
Memperkenalkan konsep TVP
Koefisien regresi \(\beta_{j,t}\) berubah terhadap
waktu,
misalnya untuk model sederhana:
\[ y_t = \beta_{0,t} + \beta_{1,t} x_t + \varepsilon_t. \]
Evolusi koefisien sering dimodelkan sebagai random walk:
\[ \beta_{j,t} = \beta_{j,t-1} + \eta_{j,t}, \quad \eta_{j,t} \sim \mathrm{N}(0, \sigma_{\eta,j}^2). \]
Menulis TVP secara eksplisit sebagai Dynamic Linear Model (DLM)
Definisikan state vector:
\[ \boldsymbol{\theta}_t = \begin{pmatrix} \beta_{0,t} \\ \beta_{1,t} \end{pmatrix}. \]
Persamaan observasi (observation equation):
\[ y_t = \mathbf{F}_t^\top \boldsymbol{\theta}_t + \varepsilon_t, \]
dengan
\[ \mathbf{F}_t = \begin{pmatrix} 1 \\ x_t \end{pmatrix}, \quad \varepsilon_t \sim \mathrm{N}(0, V). \]
Persamaan state (state evolution):
\[ \boldsymbol{\theta}_t = G \boldsymbol{\theta}_{t-1} + \mathbf{w}_t, \]
dengan
\[ G = I_2, \quad \mathbf{w}_t \sim \mathrm{N}\!\left( \mathbf{0}, W = \begin{pmatrix} \sigma_{\beta_0}^2 & 0 \\ 0 & \sigma_{\beta_1}^2 \end{pmatrix} \right). \]
Mensimulasikan data TVP regression
Bangun trajektori \(\beta_{0,t}\) dan \(\beta_{1,t}\) sebagai random walk:
\[ \beta_{0,t} = \beta_{0,t-1} + \eta_{0,t}, \quad \beta_{1,t} = \beta_{1,t-1} + \eta_{1,t}, \]
dengan \(\eta_{0,t}, \eta_{1,t}\) Gaussian dengan varians kecil (koefisien berubah perlahan).
Bangun observasi:
\[ y_t = \beta_{0,t} + \beta_{1,t} x_t + \varepsilon_t, \quad \varepsilon_t \sim \mathrm{N}(0, V). \]
Mengestimasi TVP regression dengan DLM
(dlmModReg)
Di R, model TVP regresi dua koefisien dapat dibangun dengan:
mod_tvp <- dlmModReg(x_t, dV = V_tvp, dW = c(sigma0^2, sigma1^2), addInt = TRUE)
Gunakan filtering dan smoothing:
filt_tvp <- dlmFilter(y_tvp, mod_tvp)
smooth_tvp <- dlmSmooth(filt_tvp)
Bandingkan koefisien true vs smoothed untuk setiap waktu \(t\):
Dapat pula diplot observed vs fitted:
\[ \hat{y}_t = \mathbf{F}_t^\top \hat{\boldsymbol{\theta}}_t^{(\text{smooth})}. \]
Menunjukkan bagaimana FFBS (Forward Filtering Backward Sampling) dapat digunakan
Pada kerangka Bayesian DLM, kita ingin sampling full path state \(\boldsymbol{\theta}_{1:T}\) dari distribusi posterior \(p(\boldsymbol{\theta}_{1:T} \mid y_{1:T})\).
Algoritma FFBS melakukan:
Di paket dlm, ini dapat dilakukan dengan:
draw_state <- dlmBSample(filt_tvp)
Dengan banyak sampel path, kita bisa membangun credible band untuk \(\beta_{j,t}\), misalnya 95% credible interval di setiap \(t\).
Menghubungkan TVP regression dengan INLA
Untuk kasus di mana intercept dinamis dan slope tetap, model bisa ditulis:
\[ y_t = \alpha_t + \beta x_t + \varepsilon_t, \]
dengan \(\alpha_t\) dimodelkan sebagai random walk:
\[ \alpha_t \sim \text{RW1}. \]
Dalam INLA, kita dapat menggunakan:
f(time, model = "rw1")
untuk memodelkan efek waktu \(\alpha_t\).
Untuk slope dinamis penuh \(\beta_t\), perlu desain model yang lebih teknis:
Bagian-bagian di atas bisa dijadikan struktur: mulai dari konsep TVP,
formulasi sebagai DLM, simulasi data, estimasi dengan dlm,
hingga koneksi ke INLA dan FFBS untuk inference Bayesian penuh.
TVP regression dapat dipandang sebagai jembatan alami antara regresi klasik dan DLM: begitu kita menerima bahwa parameter boleh berubah terhadap waktu, seluruh machinery state-space (Kalman, FFBS, dan INLA) menjadi alat utama untuk inference Bayesian yang kaya dan fleksibel.
library(dlm)
library(ggplot2)
set.seed(123)
n <- 200
x <- rnorm(n)
beta0 <- beta1 <- numeric(n)
beta0[1] <- 0
beta1[1] <- 0.5
sigma_eps <- 0.3
sigma_b0 <- 0.05
sigma_b1 <- 0.05
for (t in 2:n) {
beta0[t] <- beta0[t-1] + rnorm(1, 0, sigma_b0)
beta1[t] <- beta1[t-1] + rnorm(1, 0, sigma_b1)
}
y_tvp <- beta0 + beta1 * x + rnorm(n, 0, sigma_eps)
df_tvp <- data.frame(
t = 1:n,
x = x,
y = y_tvp,
beta0 = beta0,
beta1 = beta1
)
# Cek data
ggplot(df_tvp, aes(t, y)) +
geom_line() +
labs(
title = "Simulasi TVP Regression",
x = "Waktu",
y = "y_t"
) +
theme_minimal()
# --------------------------------------------------
# 1) Definisikan model DLM untuk TVP regression
# y_t = [1, x_t] %*% theta_t + eps_t
# theta_t = theta_{t-1} + noise
# --------------------------------------------------
build_tvp <- function(parm) {
V = exp(parm[1]) # var observasi
W0 = exp(parm[2]) # var beta0
W1 = exp(parm[3]) # var beta1
dlmModReg(x,
dV = V,
dW = c(W0, W1)) # random walk untuk beta0, beta1
}
# MLE untuk hyperparameter
fit_tvp <- dlmMLE(y_tvp,
parm = rep(0, 3),
build = build_tvp)
fit_tvp$par
## [1] -2.397554 -5.539475 -6.842776
mod_tvp <- build_tvp(fit_tvp$par)
# --------------------------------------------------
# 2) Filtering dan smoothing
# --------------------------------------------------
filt_tvp <- dlmFilter(y_tvp, mod_tvp)
smth_tvp <- dlmSmooth(filt_tvp) # <-- ini yang tadi belum ada
# smth_tvp$s: (n+1) x 2 (state t=0,1,...,n)
# buang state awal t=0
beta0_hat <- drop(smth_tvp$s[-1, 1])
beta1_hat <- drop(smth_tvp$s[-1, 2])
df_coef <- data.frame(
t = 1:n,
beta0 = beta0,
beta1 = beta1,
beta0_hat = beta0_hat,
beta1_hat = beta1_hat
)
# Plot perbandingan beta0(t)
ggplot(df_coef, aes(t)) +
geom_line(aes(y = beta0), color = "blue") +
geom_line(aes(y = beta0_hat), color = "red", linetype = "dashed") +
labs(
title = "TVP Regression: beta0(t) True vs Posterior Mean",
x = "Waktu",
y = expression(beta[0](t))
) +
theme_minimal()
# Plot perbandingan beta1(t)
ggplot(df_coef, aes(t)) +
geom_line(aes(y = beta1), color = "blue") +
geom_line(aes(y = beta1_hat), color = "red", linetype = "dashed") +
labs(
title = "TVP Regression: beta1(t) True vs Posterior Mean",
x = "Waktu",
y = expression(beta[1](t))
) +
theme_minimal()
Dalam notasi DLM:
\[ y_t = \begin{pmatrix} 1 & x_t \end{pmatrix} \boldsymbol{\theta}_t + \varepsilon_t, \qquad \boldsymbol{\theta}_t = \boldsymbol{\theta}_{t-1} + \mathbf{w}_t, \quad \mathbf{w}_t \sim \mathrm{N}(\mathbf{0}, \mathbf{W}), \]
dengan \(\mathbf{W} = \mathrm{diag}(\omega_0^2, \omega_1^2)\).
Estimasi dengan dlm (FFBS):
library(dlm)
library(dplyr)
library(tidyr)
library(ggplot2)
# ---- Model TVP dengan dlmModReg ----
mod_tvp <- dlmModReg(x, dW = c(0.01, 0.01), addInt = TRUE)
filt_tvp <- dlmFilter(y_tvp, mod_tvp)
smth_tvp <- dlmSmooth(filt_tvp)
# smth_tvp$s: (n+1) x 2 → buang state t = 0
beta0_hat <- drop(smth_tvp$s[-1, 1])
beta1_hat <- drop(smth_tvp$s[-1, 2])
length(beta0_hat); length(beta0) # sekarang dua-duanya 200
## [1] 200
## [1] 200
df_beta <- data.frame(
t = 1:n,
beta0_true = beta0,
beta1_true = beta1,
beta0_hat = beta0_hat,
beta1_hat = beta1_hat
)
library(dplyr)
library(tidyr)
library(ggplot2)
df_beta_long <- df_beta %>%
dplyr::select(t, beta0_true, beta0_hat, beta1_true, beta1_hat) %>%
tidyr::pivot_longer(
-t,
names_to = "series",
values_to = "value"
)
ggplot(df_beta_long, aes(t, value, color = series)) +
geom_line() +
labs(
title = "Koefisien Time-Varying: True vs Posterior Mean",
x = "Waktu",
y = "Nilai"
) +
theme_minimal()
- beta0_true vs beta0_hat: intercept sebenarnya vs estimasi posterior mean.
- beta1_true vs beta1_hat: slope terhadap \(x_t\) yang berubah seiring waktu.
Analisis deret waktu (time series) adalah bidang statistika yang mempelajari data yang tersusun menurut urutan waktu. Fokus utamanya adalah memahami struktur ketergantungan waktu dan melakukan peramalan (forecasting). Dalam kerangka Bayesian, model time series dapat diperkaya dengan prior dan regularisasi, menghasilkan estimasi yang stabil, terutama ketika data kecil atau model kompleks.
Salah satu pendekatan Bayesian yang sangat efisien untuk model time series berbasis latent Gaussian adalah INLA (Integrated Nested Laplace Approximation). Metode ini memberikan perkiraan posterior yang akurat secara deterministik tanpa MCMC, sehingga sangat efisien untuk model besar atau kompleks.
Dokumen ini memuat materi yang sangat lengkap:
Tujuannya agar pembaca mendapatkan pemahaman menyeluruh tentang bagaimana model time series diestimasi menggunakan INLA.
knitr::opts_chunk$set(
message = FALSE,
warning = FALSE,
fig.width = 7,
fig.height = 4.5
)
library(INLA)
library(ggplot2)
library(forecast) # untuk autoplot.ts, ggAcf, dll
set.seed(123)
Pada pendekatan INLA, ketergantungan waktu tidak dimodelkan dengan mem-regresi \(Y_t\) langsung pada lag-nya sebagai kovariat, tetapi dengan membuat efek laten (random effect) yang diindeks oleh waktu dan memberikan struktur korelasi tertentu, misalnya model = "ar1", model = "rw1", dsb.
Secara umum, untuk deret \(y_t\):
- f(time, model = "ar1") memberikan efek laten AR(1),
- f(time, model = "rw1") memberikan efek laten random walk orde 1.
Dalam Rmd ini:
- MA(1) didekati (bukan diestimasi eksak) dengan rw1,
- local level/DLM dan TVP dimodelkan lewat rw1.
Untuk setiap model, kita tampilkan:
model = "ar1"Model AR(1) klasik:
\[ Y_t = \phi Y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim N(0,\sigma^2). \]
Dalam kerangka INLA, kita tulis sebagai model hirarkis:
Vektor \(\boldsymbol\eta = (\eta_1,\dots,\eta_n)\) memiliki struktur AR(1) dan di-INLA-kan dengan:
f(time, model = "ar1")
n <- 300
phi_true <- 0.8 # parameter AR(1) laten
tau_eta <- 10 # precision proses laten (var = 0.1)
sigma_obs <- 0.3 # sd error observasi
eta <- numeric(n)
eta[1] <- rnorm(1, 0, sqrt(1 / tau_eta))
for (t in 2:n) {
eta[t] <- phi_true * eta[t - 1] + rnorm(1, 0, sqrt(1 / tau_eta))
}
y <- eta + rnorm(n, 0, sigma_obs)
ts_eta <- ts(eta)
ts_y <- ts(y)
autoplot(cbind(Laten = ts_eta, Observasi = ts_y)) +
labs(title = "Simulasi Proses Laten AR(1) dan Observasi",
x = "Waktu", y = "Nilai")
dat_ar1 <- data.frame(
y = as.numeric(ts_y),
time = 1:n
)
head(dat_ar1)
## y time
## 1 -0.39181062 1
## 2 -0.44038557 2
## 3 0.03968213 3
## 4 -0.03646228 4
## 5 0.13316988 5
## 6 0.85315911 6
res_ar1 <- inla(
y ~ 1 + f(time, model = "ar1"),
data = dat_ar1,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE, cpo = TRUE)
)
summary(res_ar1)
## Time used:
## Pre = 1.04, Running = 0.206, Post = 0.0211, Total = 1.27
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) 0.054 0.072 -0.087 0.053 0.197 0.053 48.528
##
## Random effects:
## Name Model
## time AR1 model
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 9.727 1.714 6.771 9.59
## Precision for time 6.683 1.499 4.208 6.52
## Rho for time 0.782 0.065 0.631 0.79
## 0.975quant mode
## Precision for the Gaussian observations 13.497 9.319
## Precision for time 10.077 6.221
## Rho for time 0.887 0.805
##
## Deviance Information Criterion (DIC) ...............: 281.72
## Deviance Information Criterion (DIC, saturated) ....: 413.44
## Effective number of parameters .....................: 111.94
##
## Watanabe-Akaike information criterion (WAIC) ...: 289.01
## Effective number of parameters .................: 95.63
##
## Marginal log-Likelihood: -201.06
## CPO, PIT is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
Hyperparameter (\(\phi\), prec, dll):
res_ar1$summary.hyperpar
## mean sd 0.025quant
## Precision for the Gaussian observations 9.7271304 1.71405462 6.7710003
## Precision for time 6.6834536 1.49866489 4.2080764
## Rho for time 0.7817723 0.06544702 0.6314693
## 0.5quant 0.975quant mode
## Precision for the Gaussian observations 9.5857836 13.4974060 9.3190646
## Precision for time 6.5244042 10.0772138 6.2211361
## Rho for time 0.7896956 0.8866999 0.8051098
eta_hat <- res_ar1$summary.random$time$mean
plot(eta, type = "l", col = "black", lwd = 2,
ylab = "Nilai", xlab = "Waktu",
main = "AR(1) Laten: True vs Posterior Mean")
lines(eta_hat, col = "red", lwd = 2, lty = 2)
legend("topleft",
legend = c("True eta_t", "Posterior mean (INLA)"),
col = c("black", "red"),
lty = c(1, 2),
lwd = c(2, 2),
bty = "n")
Posterior mean untuk fitted values (\(\mathbb{E}[y_t \mid data]\)) tersedia di:
res_ar1$summary.fitted.values
Kita plot:
fit_ar1 <- res_ar1$summary.fitted.values$mean
ts_fit_ar1 <- ts(fit_ar1, start = start(ts_y), frequency = frequency(ts_y))
autoplot(cbind(Observed = ts_y, Fitted = ts_fit_ar1)) +
labs(title = "AR(1) dengan INLA: Observed vs Fitted",
x = "Waktu", y = "Y_t")
Interpretasi: garis fitted harus mengikuti pola \(y_t\) dengan smoothing sesuai komponen laten AR(1).
Model MA(1) klasik:
\[ Y_t = \varepsilon_t + \theta \, \varepsilon_{t-1}, \quad \varepsilon_t \sim N(0, \sigma^2). \]
Di sini, ketergantungan waktu ada pada error, bukan pada proses laten yang eksplisit. Sampai saat ini:
R-INLA tidak menyediakan kernel khusus model = "ma1" atau model = "arma" untuk error term.
Artinya:
f(time, model="ma1") dan berharap INLA mengenali struktur
MA(1).Jadi, secara eksak, MA(1) tidak dimodelkan oleh INLA seperti AR(1).
set.seed(456)
theta_true <- 0.6
y_ma <- arima.sim(model = list(ma = theta_true), n = 300)
ts_ma <- ts(y_ma)
autoplot(ts_ma) +
labs(title = "Simulasi MA(1)",
x = "Waktu", y = "Y_t")
Kalau tujuannya hanya smoothing / ilustrasi, kita bisa memandang deret ini sebagai sinyal laten halus plus noise, dengan sinyal dimodelkan sebagai rw1.
dat_ma <- data.frame(
y = as.numeric(ts_ma),
time = 1:length(ts_ma)
)
res_ma_rw1 <- inla(
y ~ 1 + f(time, model = "rw1"),
data = dat_ma,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE, cpo = TRUE)
)
summary(res_ma_rw1)
## Time used:
## Pre = 1.06, Running = 0.229, Post = 0.0171, Total = 1.3
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) 0.055 0.064 -0.071 0.055 0.182 0.055 59.843
##
## Random effects:
## Name Model
## time RW1 model
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 8.05e-01 6.60e-02 0.682 8.02e-01
## Precision for time 2.04e+04 2.34e+04 1086.930 1.28e+04
## 0.975quant mode
## Precision for the Gaussian observations 9.41e-01 0.799
## Precision for time 8.25e+04 2817.047
##
## Deviance Information Criterion (DIC) ...............: 921.61
## Deviance Information Criterion (DIC, saturated) ....: 304.97
## Effective number of parameters .....................: 2.77
##
## Watanabe-Akaike information criterion (WAIC) ...: 921.80
## Effective number of parameters .................: 2.95
##
## Marginal log-Likelihood: -475.63
## CPO, PIT is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
fit_ma <- res_ma_rw1$summary.fitted.values$mean
ts_fit_ma <- ts(fit_ma, start = start(ts_ma), frequency = frequency(ts_ma))
autoplot(cbind(Observed = ts_ma, Fitted_RW1 = ts_fit_ma)) +
labs(title = "MA(1) (Simulasi) + RW1 (Aproksimasi INLA)",
x = "Waktu", y = "Y_t")
Penting:
Plot ini bukan estimasi parameter \(\theta\) MA(1). Ini hanya menunjukkan bahwa
INLA bisa memberikan komponen halus untuk deret yang secara generatif
adalah MA(1).
Model ARIMA(1,1,0):
set.seed(101)
n_arima <- 300
y_arima <- arima.sim(
list(order = c(1, 1, 0), ar = 0.6),
n = n_arima
)
ts_y_arima <- ts(y_arima)
autoplot(ts_y_arima) +
labs(title = "Simulasi ARIMA(1,1,0)",
x = "Waktu", y = "Y_t")
dy <- diff(ts_y_arima)
ts_dy <- ts(dy)
autoplot(ts_dy) +
labs(title = "Differenced Series ΔY_t (ARIMA(1,1,0))",
x = "Waktu", y = "ΔY_t")
ggAcf(ts_dy) + labs(title = "ACF ΔY_t")
ggPacf(ts_dy) + labs(title = "PACF ΔY_t")
Deret \(\Delta Y_t\) kira-kira AR(1).
dat_dy <- data.frame(
y = as.numeric(ts_dy),
time = 1:length(ts_dy)
)
res_dy_ar1 <- inla(
y ~ 1 + f(time, model = "ar1"),
data = dat_dy,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE, cpo = TRUE)
)
res_dy_ar1$summary.hyperpar
## mean sd 0.025quant
## Precision for the Gaussian observations 2.202320e+04 2.434934e+04 1386.6605963
## Precision for time 7.419565e-01 8.746926e-02 0.5819337
## Rho for time 5.927750e-01 4.638758e-02 0.4974373
## 0.5quant 0.975quant mode
## Precision for the Gaussian observations 1.433749e+04 8.666124e+04 3740.7492188
## Precision for time 7.377919e-01 9.257767e-01 0.7313908
## Rho for time 5.942465e-01 6.797469e-01 0.5961666
fit_dy <- res_dy_ar1$summary.fitted.values$mean
ts_fit_dy <- ts(fit_dy, start = start(ts_dy), frequency = frequency(ts_dy))
autoplot(cbind(Observed = ts_dy, Fitted = ts_fit_dy)) +
labs(title = "ΔY_t (ARIMA(1,1,0)) : Observed vs Fitted (AR(1) laten)",
x = "Waktu", y = "ΔY_t")
ARIMA(1,1,1):
INLA menyediakan kernel AR untuk efek laten (model = "ar1", atau model = "ar" untuk orde lebih tinggi), tetapi tidak menyediakan kernel MA (ma1, arma, dsb.), sehingga komponen MA tidak dapat direpresentasikan secara eksak.
Kesimpulan:
- ARIMA(1,1,0) (tanpa MA) nyaman ditangani (AR(1) laten pada differenced series).
- ARIMA(1,1,1) tidak bisa diestimasi secara eksak dan langsung dengan R-INLA menggunakan kernel bawaan; hanya bisa didekati, misalnya dengan AR orde lebih tinggi (sketsa di bawah) atau DLM khusus.
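Sketsa aproksimasi (asumsi: AR(3) cukup sebagai pemotongan; kernel "ar" di INLA menerima argumen order) untuk mendekati komponen ARMA pada \(\Delta Y_t\):
res_dy_arp <- inla(
  y ~ 1 + f(time, model = "ar", order = 3),
  data = dat_dy,
  family = "gaussian"
)
summary(res_dy_arp)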
Local level dengan rw1
Local level model:
\[ \begin{aligned} y_t &= \mu_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2), \\ \mu_t &= \mu_{t-1} + \eta_t, \quad \eta_t \sim N(0, \tau^{-1}). \end{aligned} \]
Di INLA, \(\mu_t\) dimodelkan dengan
f(time, model = "rw1").
set.seed(202)
n_dlm <- 200
mu <- numeric(n_dlm)
mu[1] <- 0
for (t in 2:n_dlm) {
mu[t] <- mu[t - 1] + rnorm(1, 0, 0.2) # RW1
}
sigma_obs_dlm <- 0.5
y_dlm <- mu + rnorm(n_dlm, 0, sigma_obs_dlm)
ts_mu <- ts(mu)
ts_y_dlm <- ts(y_dlm)
autoplot(cbind(Level = ts_mu, Observasi = ts_y_dlm)) +
labs(title = "Simulasi Local Level Model (RW1)",
x = "Waktu", y = "Nilai")
dat_dlm <- data.frame(
y = as.numeric(ts_y_dlm),
time = 1:n_dlm
)
res_dlm <- inla(
y ~ 1 + f(time, model = "rw1"),
data = dat_dlm,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE, cpo = TRUE)
)
summary(res_dlm)
## Time used:
## Pre = 1.01, Running = 0.212, Post = 0.0149, Total = 1.24
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) 1.546 0.034 1.479 1.546 1.613 1.546 723.076
##
## Random effects:
## Name Model
## time RW1 model
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 4.32 0.611 3.24 4.28
## Precision for time 16.67 5.268 8.63 15.90
## 0.975quant mode
## Precision for the Gaussian observations 5.64 4.20
## Precision for time 29.15 14.47
##
## Deviance Information Criterion (DIC) ...............: 329.84
## Deviance Information Criterion (DIC, saturated) ....: 254.16
## Effective number of parameters .....................: 52.03
##
## Watanabe-Akaike information criterion (WAIC) ...: 333.44
## Effective number of parameters .................: 46.62
##
## Marginal log-Likelihood: -210.44
## CPO, PIT is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
mu_hat <- res_dlm$summary.random$time$mean
plot(mu, type = "l", col = "black", lwd = 2,
ylim = range(c(mu, mu_hat)),
xlab = "Waktu", ylab = "μ_t",
main = "Local Level: True vs Posterior Mean (rw1)")
lines(mu_hat, col = "red", lwd = 2, lty = 2)
legend("topleft",
legend = c("True μ_t", "Posterior mean (INLA)"),
col = c("black", "red"),
lty = c(1, 2),
lwd = c(2, 2),
bty = "n")
fit_dlm <- res_dlm$summary.fitted.values$mean
ts_fit_dlm <- ts(fit_dlm, start = start(ts_y_dlm), frequency = frequency(ts_y_dlm))
autoplot(cbind(Observed = ts_y_dlm, Fitted = ts_fit_dlm)) +
labs(title = "Local Level (rw1) dengan INLA: Observed vs Fitted",
x = "Waktu", y = "Y_t")
Model TVP yang paling sederhana dan natural di INLA:
\[ y_t = \alpha_t + \beta x_t + \varepsilon_t, \]
dengan:
\[ \alpha_t = \alpha_{t-1} + u_t, \quad u_t \sim N(0, \tau^{-1}), \]
sehingga intercept (level) berubah terhadap waktu (RW1), sedangkan slope \(\beta\) tetap.
set.seed(303)
n_tvp <- 250
x_t <- rnorm(n_tvp)
alpha <- numeric(n_tvp)
alpha[1] <- 0
for (t in 2:n_tvp) {
alpha[t] <- alpha[t - 1] + rnorm(1, 0, 0.1) # RW1
}
beta_true <- 1.5
sigma_eps <- 0.5
y_tvp <- alpha + beta_true * x_t + rnorm(n_tvp, 0, sigma_eps)
ts_alpha <- ts(alpha)
ts_y_tvp <- ts(y_tvp)
autoplot(cbind(alpha = ts_alpha, y = ts_y_tvp)) +
labs(title = "Simulasi TVP: Intercept ~ RW1",
x = "Waktu", y = "Nilai")
dat_tvp <- data.frame(
y = as.numeric(ts_y_tvp),
x = x_t,
time = 1:n_tvp
)
res_tvp <- inla(
y ~ x + f(time, model = "rw1"),
data = dat_tvp,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE, cpo = TRUE)
)
summary(res_tvp)
## Time used:
## Pre = 1.01, Running = 0.221, Post = 0.0174, Total = 1.25
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) 1.021 0.031 0.959 1.021 1.083 1.021 518.834
## x 1.477 0.034 1.410 1.477 1.544 1.477 682.631
##
## Random effects:
## Name Model
## time RW1 model
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 4.10 0.40 3.36 4.08
## Precision for time 137.12 55.41 58.52 127.31
## 0.975quant mode
## Precision for the Gaussian observations 4.94 4.06
## Precision for time 272.81 109.65
##
## Deviance Information Criterion (DIC) ...............: 384.82
## Deviance Information Criterion (DIC, saturated) ....: 277.57
## Effective number of parameters .....................: 25.27
##
## Watanabe-Akaike information criterion (WAIC) ...: 386.28
## Effective number of parameters .................: 24.50
##
## Marginal log-Likelihood: -226.70
## CPO, PIT is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
alpha_hat <- res_tvp$summary.random$time$mean
plot(alpha, type = "l", col = "black", lwd = 2,
ylim = range(c(alpha, alpha_hat)),
xlab = "Waktu", ylab = "α_t",
main = "TVP Intercept (rw1): True vs Posterior Mean")
lines(alpha_hat, col = "red", lwd = 2, lty = 2)
legend("topleft",
legend = c("True α_t", "Posterior mean (INLA)"),
col = c("black", "red"),
lty = c(1, 2),
lwd = c(2, 2),
bty = "n")
fit_tvp <- res_tvp$summary.fitted.values$mean
ts_fit_tvp <- ts(fit_tvp, start = start(ts_y_tvp), frequency = frequency(ts_y_tvp))
autoplot(cbind(Observed = ts_y_tvp, Fitted = ts_fit_tvp)) +
labs(title = "TVP Intercept (rw1) dengan INLA: Observed vs Fitted",
x = "Waktu", y = "Y_t")
Model ARDL(1,1):
\[ y_t = a_0 + a_1 y_{t-1} + b_0 x_t + b_1 x_{t-1} + e_t, \quad e_t \sim N(0,\sigma^2). \]
Di sini, ketergantungan waktu ada pada lag sebagai kovariat. Dalam INLA Gaussian, ini cukup sebagai regresi biasa, kecuali kita ingin menambah efek laten (misal trend).
set.seed(404)
n_ardl <- 300
# Proses X_t (AR(1))
x_ardl <- as.numeric(arima.sim(list(ar = 0.4), n = n_ardl))
# Parameter true
a0_true <- 0.5
a1_true <- 0.6
b0_true <- 0.8
b1_true <- -0.3
sigma_e <- 0.7
y_ardl <- numeric(n_ardl)
y_ardl[1] <- 0 # initial
for (t in 2:n_ardl) {
y_ardl[t] <- a0_true +
a1_true * y_ardl[t - 1] +
b0_true * x_ardl[t] +
b1_true * x_ardl[t - 1] +
rnorm(1, 0, sigma_e)
}
ts_y_ardl <- ts(y_ardl)
ts_x_ardl <- ts(x_ardl)
autoplot(cbind(y = ts_y_ardl, x = ts_x_ardl)) +
labs(title = "Simulasi ARDL(1,1)",
x = "Waktu", y = "Nilai")
dat_ardl <- data.frame(
y = y_ardl[-1],
y_lag = y_ardl[-n_ardl],
x = x_ardl[-1],
x_lag = x_ardl[-n_ardl]
)
head(dat_ardl)
## y y_lag x x_lag
## 1 1.9804646 0.0000000 1.5957038 -0.1017597
## 2 0.9333709 1.9804646 0.4301183 1.5957038
## 3 0.7637402 0.9333709 -0.7078686 0.4301183
## 4 1.4016635 0.7637402 -0.8555508 -0.7078686
## 5 1.4920445 1.4016635 -0.3545484 -0.8555508
## 6 0.2158763 1.4920445 -1.5755921 -0.3545484
res_ardl <- inla(
y ~ 1 + y_lag + x + x_lag,
data = dat_ardl,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE, cpo = TRUE)
)
summary(res_ardl)
## Time used:
## Pre = 0.936, Running = 0.195, Post = 0.0126, Total = 1.14
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) 0.506 0.070 0.369 0.506 0.642 0.506 64.206
## y_lag 0.601 0.045 0.512 0.601 0.690 0.601 164.336
## x 0.763 0.042 0.681 0.763 0.845 0.763 227.266
## x_lag -0.342 0.056 -0.451 -0.342 -0.233 -0.342 89.682
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 2.05 0.168 1.73 2.04
## 0.975quant mode
## Precision for the Gaussian observations 2.39 2.04
##
## Deviance Information Criterion (DIC) ...............: 642.05
## Deviance Information Criterion (DIC, saturated) ....: 306.39
## Effective number of parameters .....................: 5.00
##
## Watanabe-Akaike information criterion (WAIC) ...: 642.18
## Effective number of parameters .................: 5.03
##
## Marginal log-Likelihood: -348.98
## CPO, PIT is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
res_ardl$summary.fixed
## mean sd 0.025quant 0.5quant 0.975quant mode
## (Intercept) 0.5055722 0.06968573 0.3688295 0.5055722 0.6423150 0.5055722
## y_lag 0.6007873 0.04544470 0.5116121 0.6007873 0.6899624 0.6007873
## x 0.7631597 0.04168931 0.6813536 0.7631597 0.8449657 0.7631597
## x_lag -0.3420890 0.05565618 -0.4513018 -0.3420890 -0.2328760 -0.3420890
## kld
## (Intercept) 64.20563
## y_lag 164.33650
## x 227.26596
## x_lag 89.68219
fit_ardl <- res_ardl$summary.fitted.values$mean
ts_y_ardl_sub <- ts_y_ardl[-1] # karena kita hilangkan observasi pertama
ts_fit_ardl <- ts(fit_ardl, start = start(ts_y_ardl_sub), frequency = frequency(ts_y_ardl_sub))
autoplot(cbind(Observed = ts_y_ardl_sub, Fitted = ts_fit_ardl)) +
labs(title = "ARDL(1,1) dengan INLA: Observed vs Fitted",
x = "Waktu", y = "Y_t")
Jika posterior mean mendekati:
- koefisien y_lag ≈ \(a_1 = 0.6\),
- koefisien x ≈ \(b_0 = 0.8\),
- koefisien x_lag ≈ \(b_1 = -0.3\),
maka model berhasil merekonstruksi struktur ARDL.
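Pemeriksaan cepat: sandingkan posterior mean dengan nilai true dari simulasi (memakai objek res_ardl di atas):
# urutan baris summary.fixed: (Intercept), y_lag, x, x_lag
cbind(
  true           = c(a0 = 0.5, a1 = 0.6, b0 = 0.8, b1 = -0.3),
  posterior_mean = res_ardl$summary.fixed$mean
)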
model = "ar1" atau "ar").rw1.rw1 di indeks
waktu.model = "ma1".Dengan revisi ini, setiap model utama sekarang:
Dalam e-book ini kita sudah mendiskusikan:
- stan_ar sebagai aproksimasi.
Beberapa arah pengayaan yang bisa Anda tambahkan untuk mendekati target 10.000 kata:
bayesforecast
Pendekatan Bayesian dalam analisis deret waktu menawarkan alternatif yang lebih komprehensif dibandingkan pendekatan frequentist tradisional, karena memberikan:
Model-model deret waktu klasik seperti ARIMA dan GARCH selama ini biasanya diestimasi menggunakan Maximum Likelihood Estimation (MLE). Namun, MLE hanya memberi taksiran titik (point estimate), sehingga ketidakpastian sering dipresentasikan sebagai interval kepercayaan yang bersifat asimtotik.
Dengan Bayesian, kita bekerja dengan posterior distribution:
\[ p(\theta \mid y) \propto p(y \mid \theta) \, p(\theta), \]
di mana:
Untuk memperoleh posterior, digunakan algoritma Hamiltonian Monte Carlo (HMC), yang lebih efisien dibandingkan Metropolis–Hastings klasik, terutama pada parameter berdimensi tinggi.
Paket bayesforecast menyediakan wrapper yang memudahkan pemodelan Bayesian untuk berbagai model deret waktu tanpa perlu menulis kode Stan manual, tetapi tetap menghasilkan objek Stan yang terbuka untuk inspeksi mendalam.
Jenis model yang dapat diestimasi antara lain:
| Kategori Model | Contoh |
|---|---|
| ARIMA/SARIMA | stan_arima(), stan_sarima() |
| ARMAX | stan_arimax() |
| GARCH | stan_garch() |
| Stochastic Volatility | stan_sv() |
| ETS (Exponential Smoothing) | stan_ets() |
| Additive time-series (Prophet-like) | forecast() |
Semua berujung pada objek kelas varstan.
Catatan: dalam bayesforecast ada juga fungsi stan_LocalLevel dan stan_ssm.
Fokus catatan ini adalah membedakan dua fungsi pemodelan deret waktu
berbasis state-space yang sering digunakan dalam
konteks Bayesian time series dengan Stan (misalnya melalui package
bayesforecast), yaitu:
- stan_LocalLevel
- stan_ssm
Keduanya sama-sama bekerja dengan model state-space Gaussian, tetapi:
- stan_LocalLevel dirancang khusus untuk Local Level Model (ETS(A,N,N)).
- stan_ssm adalah state-space model yang jauh lebih fleksibel (bisa menampung musiman, regresor, AR, dan lain-lain).
Analogi singkat:
- stan_LocalLevel itu indomie instan rasa original: simpel, tinggal seduh.
- stan_ssm itu indomie goreng custom topping sosis, telur, keju, ekstra cabai: fleksibel luar biasa, tapi bikin keringetan saat masak.
Model stan_LocalLevel
stan_LocalLevel adalah model khusus
untuk Local Level Model, yang dalam terminologi ETS
dikenal sebagai:
Model ini cocok ketika:
Formulasi Matematis
Model Local Level dapat ditulis sebagai:
Persamaan observasi:
\[ y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{N}(0, \sigma_\varepsilon^2) \]
Persamaan state (level):
\[ \mu_t = \mu_{t-1} + \eta_t, \qquad \eta_t \sim \mathrm{N}(0, \sigma_\eta^2) \]
di mana:
Model ini merepresentasikan bahwa level deret waktu mengikuti random walk.
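Sketsa pemakaian (asumsi: antarmuka stan_LocalLevel serupa fungsi stan_* lain di bayesforecast; iter dan chains di bawah hanyalah contoh):
library(bayesforecast)
data("birth")
fit_ll <- stan_LocalLevel(ts = birth, iter = 1000, chains = 2)
summary(fit_ll)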
library(bayesforecast)
Karena paket ini membungkus Stan, penting untuk memastikan:
rstan bekerja baik (test compile),Untuk versi pengembangan dari GitHub:
if (!requireNamespace("remotes")) install.packages("remotes")
remotes::install_github("asael697/bayesforecast", dependencies = TRUE)
Paket bayesforecast menyediakan beberapa dataset deret waktu, misalnya:
- birth – U.S. Monthly Live Births,
- air – Australian Air Transport Passengers,
- aust – International Tourists to Australia,
- demgbp – DEM/GBP exchange-rate log-returns.
Sebagai ilustrasi, kita gunakan birth.
library(bayesforecast)
library(ggplot2)
data("birth")
autoplot(birth) +
ggtitle("U.S. Monthly Live Births (1948–1979)") +
xlab("Tahun") + ylab("Jumlah kelahiran")
Dataset ini berfrekuensi bulanan dan sering menunjukkan pola musiman tahunan.
Secara umum, prosedur analisis dengan bayesforecast: eksplorasi data dan ACF/PACF, spesifikasi dan fitting model Bayesian (fungsi stan_*), diagnostik residual dan konvergensi, lalu forecasting.
par(mfrow = c(2,1))
acf(birth, main = "ACF Birth")
pacf(birth, main = "PACF Birth")
par(mfrow = c(1,1))
Pola musiman sering terlihat setiap lag 12, yang mengindikasikan adanya komponen musiman tahunan. Dengan melihat bentuk ACF/PACF setelah differencing, kita dapat mengusulkan model:
\[ \text{SARIMA}(1,1,1)(1,1,1)[12]. \]
Kita fit model SARIMA Bayesian dengan stan_sarima():
sf1 <- stan_sarima(
ts = birth,
order = c(1, 1, 1), # (p,d,q)
seasonal = c(1, 1, 1), # (P,D,Q)
prior_mu0 = student(mu = 0, sd = 1, df = 7)
)
##
## SAMPLING FOR MODEL 'Sarima' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.000562 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 5.62 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 1.471 seconds (Warm-up)
## Chain 1: 1.46 seconds (Sampling)
## Chain 1: 2.931 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'Sarima' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 0.000102 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 1.02 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 1.275 seconds (Warm-up)
## Chain 2: 1.311 seconds (Sampling)
## Chain 2: 2.586 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'Sarima' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 0.000103 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 1.03 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 3: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 3: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 3: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 3: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 1.356 seconds (Warm-up)
## Chain 3: 1.391 seconds (Sampling)
## Chain 3: 2.747 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'Sarima' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 0.000102 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 1.02 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 4: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 4: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 4: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 4: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 1.308 seconds (Warm-up)
## Chain 4: 1.404 seconds (Sampling)
## Chain 4: 2.712 seconds (Total)
## Chain 4:
Beberapa poin penting:
- order dan seasonal merepresentasikan struktur ARIMA dan komponen musiman.
- prior_mu0 mendefinisikan prior Student-t untuk mean proses.
- Hasil fitting adalah objek kelas varstan.
sf1
##
## y ~ Sarima(1,1,1)(1,1,1)[12]
## 373 observations and 1 dimension
## Differences: 1 seasonal Differences: 1
## Current observations: 360
##
## mean se 5% 95% ess Rhat
## mu0 0.0032 0.0020 -0.1982 0.2095 3961.905 1.0007
## sigma0 7.3490 0.0043 6.9191 7.8087 3870.751 1.0000
## ar -0.2522 0.0012 -0.3663 -0.1190 3983.610 0.9999
## ma -0.0344 0.0014 -0.2014 0.0813 3988.143 0.9999
## sar 0.0071 0.0015 -0.1572 0.1467 3899.113 1.0001
## sma -0.6725 0.0015 -0.7878 -0.4895 4242.670 1.0001
## loglik -1231.6743 0.0301 -1235.3450 -1229.2140 3978.216 1.0001
##
## Samples were drawn using sampling(NUTS). For each parameter, ess
## is the effective sample size, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
summary(sf1)
## mean se 5% 95% ess Rhat
## mu0 0.0032 0.0020 -0.1982 0.2095 3961.905 1.0007
## sigma0 7.3490 0.0043 6.9191 7.8087 3870.751 1.0000
## ar -0.2522 0.0012 -0.3663 -0.1190 3983.610 0.9999
## ma -0.0344 0.0014 -0.2014 0.0813 3988.143 0.9999
## sar 0.0071 0.0015 -0.1572 0.1467 3899.113 1.0001
## sma -0.6725 0.0015 -0.7878 -0.4895 4242.670 1.0001
## loglik -1231.6743 0.0301 -1235.3450 -1229.2140 3978.216 1.0001
Ringkasan posterior mencakup:
- koefisien ar, ma,
- koefisien musiman sar, sma,
- deviasi standar sigma0,
- diagnostik n_eff dan Rhat.
Interpretasi singkat:
- Rhat yang dekat 1 menandakan bahwa rantai MCMC telah konvergen.
- n_eff yang besar mengindikasikan sampel efektif tinggi (estimasi lebih stabil).
check_residuals(sf1)
Biasanya akan menampilkan:
Kriteria informal:
Sebagai model berbasis Stan, kita dapat melihat traceplot dan densitas posterior secara langsung:
autoplot(sf1)
Biasanya akan muncul:
Traceplot yang baik:
Untuk pengguna lanjutan, objek varstan dapat diekstrak
menjadi stanfit dan diperiksa dengan fungsi dari
rstan atau bayesplot, misalnya:
library(bayesplot)
mcmc_plot(sf1)
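Sketsa lain (asumsi: fungsi extract_stan() tersedia pada versi bayesforecast yang Anda pakai) untuk menarik sampel posterior mentah satu parameter:
# asumsi: extract_stan() mengembalikan list sampel per parameter
post_ar <- extract_stan(sf1, pars = "ar")
hist(post_ar$ar, main = "Posterior AR(1)", xlab = "ar")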
Hal ini bermanfaat bila Anda ingin mengajarkan:
Setelah model di-fit, kita dapat melakukan prediksi deret waktu ke depan.
fc1 <- forecast(object = sf1, h = 12) # 12 bulan ke depan
fc1
## Point Forecast Lo 0.8 Hi 0.8 Lo 0.9 Hi 0.9
## Feb 1979 259.9892 250.0487 269.0776 248.1898 271.7703
## Mar 1979 285.1902 275.7282 294.7230 273.1562 296.9838
## Apr 1979 266.0691 256.5118 275.1977 254.0761 277.5773
## May 1979 277.5248 268.7863 286.6856 265.6823 289.5428
## Jun 1979 273.3999 264.7065 282.3757 261.5980 285.2977
## Jul 1979 299.2620 289.0300 308.5841 286.9002 311.0207
## Aug 1979 306.3336 296.8058 316.3620 293.8361 319.4893
## Sep 1979 297.5937 288.6591 306.8253 285.8230 309.2628
## Oct 1979 291.2026 282.0151 300.3891 279.5768 302.4320
## Nov 1979 274.7729 265.4530 284.4903 262.6284 287.2950
## Dec 1979 286.5449 277.0472 295.9851 275.3133 299.0047
## Jan 1980 279.7973 270.1443 289.5902 267.1712 292.5459
autoplot(fc1)
Objek fc1 biasanya memuat:
Interpretasi interval kredibel:
Misalnya, interval kredibel 95% untuk bulan tertentu adalah [a, b]. Artinya, berdasarkan model dan data yang ada, probabilitas bahwa nilai sebenarnya berada dalam interval [a, b] adalah 95% (dengan asumsi model benar).
Ini berbeda dengan interval kepercayaan frequentist yang memiliki interpretasi berbasis long-run frequency.
Selain ARIMA/SARIMA, bayesforecast juga mendukung model
volatilitas seperti GARCH, yang cocok untuk data keuangan (return saham,
kurs, dll.) dengan volatility clustering.
library(bayesforecast)
library(ggplot2)
data("demgbp")
autoplot(demgbp) +
ggtitle("DEM/GBP Exchange-Rate Log-Returns") +
ylab("Log-returns") + xlab("Waktu")
# Model GARCH(1,1): order = c(1,1,0)
sf_garch <- stan_garch(
ts = demgbp,
order = c(1, 1, 0),
iter = 1000,
chains = 2
)
##
## SAMPLING FOR MODEL 'tgarch' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.000867 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 8.67 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 1: Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 1: Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 1: Iteration: 300 / 1000 [ 30%] (Warmup)
## Chain 1: Iteration: 400 / 1000 [ 40%] (Warmup)
## Chain 1: Iteration: 500 / 1000 [ 50%] (Warmup)
## Chain 1: Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 1: Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 1: Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 1: Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 1: Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 1: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 1.364 seconds (Warm-up)
## Chain 1: 1.119 seconds (Sampling)
## Chain 1: 2.483 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'tgarch' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 9.3e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.93 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 2: Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 2: Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 2: Iteration: 300 / 1000 [ 30%] (Warmup)
## Chain 2: Iteration: 400 / 1000 [ 40%] (Warmup)
## Chain 2: Iteration: 500 / 1000 [ 50%] (Warmup)
## Chain 2: Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 2: Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 2: Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 2: Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 2: Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 2: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 1.53 seconds (Warm-up)
## Chain 2: 1.493 seconds (Sampling)
## Chain 2: 3.023 seconds (Total)
## Chain 2:
summary(sf_garch)
## mean se 5% 95% ess Rhat
## mu0 -0.0062 0.0007 -0.0458 0.0313 1022.9616 1.0054
## sigma0 0.0813 0.0011 0.0306 0.1401 776.9680 1.0044
## arch 0.1805 0.0021 0.0819 0.2949 789.4296 1.0047
## garch 0.4180 0.0061 0.1120 0.7227 794.3957 1.0037
## loglik -199.9436 0.0480 -202.8444 -197.9061 884.2589 1.0055
check_residuals(sf_garch)
Secara konseptual, model GARCH(1,1):
\[ y_t = \sigma_t \varepsilon_t, \quad \varepsilon_t \sim \mathrm{N}(0,1), \]
\[ \sigma_t^2 = \omega + \alpha y_{t-1}^2 + \beta \sigma_{t-1}^2. \]
Dalam setting Bayesian, \(\omega, \alpha, \beta\) diperlakukan sebagai acak dengan prior tertentu, dan kita memperoleh posterior untuk parameter serta distribusi prediktif \(\sigma_t^2\) di masa depan.
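Seperti pada SARIMA, distribusi prediktif dapat diambil dengan forecast() pada objek varstan; h = 12 di bawah hanyalah contoh horizon:
fc_garch <- forecast(sf_garch, h = 12)
autoplot(fc_garch)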
| Aspek | Frequentist (MLE) | Bayesian (HMC/Stan) |
|---|---|---|
| Estimasi | Taksiran titik (point estimate) | Distribusi posterior penuh |
| Ketidakpastian | Interval kepercayaan asimtotik | Interval kredibel dengan interpretasi probabilistik |
| Informasi luar (expert) | Sulit dimasukkan | Mudah melalui prior |
| Prediksi | Point forecast + interval | Full predictive distribution |
| Kompleksitas model | Diatur dengan penalti (AIC/BIC) | Diatur via prior dan marginal likelihood |
Untuk pengajaran, tabel seperti ini sangat membantu menjelaskan mengapa kita repot memakai HMC dan Bayesian—bukan sekadar “karena lagi tren”.
Poin-poin utama:
Modul ini dapat dikembangkan lebih lanjut dengan:
Diberikan deret waktu bulanan jumlah penumpang pesawat domestik di
Australia air (dari paket bayesforecast
atau dataset serupa). Modelkan dengan pendekatan Bayesian:
Fit model dengan varstan(), misalnya dengan iter = 1000, chains = 2.
Pertanyaan:
library(bayesforecast)
library(ggplot2)
library(bayesplot)
# Data
data("air")
ts_air <- air
# Misal kita pakai SARIMA(1,1,1)(1,0,1)[12]
order_nonseasonal <- c(1, 1, 1) # (p,d,q)
order_seasonal <- c(1, 0, 1) # (P,D,Q)
# 1) Bangun objek model SARIMA
model_sarima <- Sarima(
ts_air,
order = order_nonseasonal,
seasonal = order_seasonal,
period = 12
)
# 2) Set prior: Normal(0,1) untuk AR, MA, SAR, SMA
model_sarima <- set_prior(
model_sarima,
par = "ar",
dist = normal(0, 1)
)
model_sarima <- set_prior(
model_sarima,
par = "ma",
dist = normal(0, 1)
)
model_sarima <- set_prior(
model_sarima,
par = "sar",
dist = normal(0, 1)
)
model_sarima <- set_prior(
model_sarima,
par = "sma",
dist = normal(0, 1)
)
# 3) Fitting dengan varstan
fit_sarima <- varstan(
model_sarima,
iter = 1000,
chains = 2,
seed = 123
)
##
## SAMPLING FOR MODEL 'Sarima' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.7e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.17 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 1: Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 1: Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 1: Iteration: 300 / 1000 [ 30%] (Warmup)
## Chain 1: Iteration: 400 / 1000 [ 40%] (Warmup)
## Chain 1: Iteration: 500 / 1000 [ 50%] (Warmup)
## Chain 1: Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 1: Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 1: Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 1: Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 1: Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 1: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.154 seconds (Warm-up)
## Chain 1: 0.118 seconds (Sampling)
## Chain 1: 0.272 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'Sarima' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 1.4e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 2: Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 2: Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 2: Iteration: 300 / 1000 [ 30%] (Warmup)
## Chain 2: Iteration: 400 / 1000 [ 40%] (Warmup)
## Chain 2: Iteration: 500 / 1000 [ 50%] (Warmup)
## Chain 2: Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 2: Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 2: Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 2: Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 2: Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 2: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.133 seconds (Warm-up)
## Chain 2: 0.098 seconds (Sampling)
## Chain 2: 0.231 seconds (Total)
## Chain 2:
# Posterior summary
summary(fit_sarima)
## mean se 5% 95% ess Rhat
## mu0 2.7901 0.0239 1.6153 4.0727 1027.9264 1.0071
## sigma0 2.0486 0.0102 1.5814 2.6595 777.3069 1.0037
## ar -0.4078 0.0089 -0.7699 0.1121 1183.0037 1.0057
## ma 0.2669 0.0136 -0.3760 0.9331 1212.8471 0.9994
## sar 0.0781 0.0100 -0.4461 0.6042 945.0107 1.0144
## sma -0.6348 0.0084 -0.9689 -0.1171 965.7012 1.0011
## loglik -56.3035 0.0584 -59.4385 -53.5170 1030.7259 1.0023
# Convergence: R-hat, ESS, traceplots
print(fit_sarima, pars = c("ar1", "ma1", "sar1", "sma1"))
##
## y ~ Sarima(1,1,1)(1,0,1)[12]
## 27 observations and 1 dimension
## Differences: 1 seasonal Differences: 0
## Current observations: 26
##
## mean se 5% 95% ess Rhat
## mu0 2.7901 0.0239 1.6153 4.0727 1027.9264 1.0071
## sigma0 2.0486 0.0102 1.5814 2.6595 777.3069 1.0037
## ar -0.4078 0.0089 -0.7699 0.1121 1183.0037 1.0057
## ma 0.2669 0.0136 -0.3760 0.9331 1212.8471 0.9994
## sar 0.0781 0.0100 -0.4461 0.6042 945.0107 1.0144
## sma -0.6348 0.0084 -0.9689 -0.1171 965.7012 1.0011
## loglik -56.3035 0.0584 -59.4385 -53.5170 1030.7259 1.0023
##
## Samples were drawn using sampling(NUTS). For each parameter, ess
## is the effective sample size, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# If your bayesforecast version stores the underlying Stan model,
# as.stan() converts the varstan fit into a stanfit object
# (check the documentation of the version you use).
# Example (pseudo):
# fit_stan <- as.stan(fit_sarima)
# mcmc_trace(as.array(fit_stan), pars = c("ar[1]", "ma[1]", "sar[1]", "sma[1]"))
# mcmc_acf(as.array(fit_stan), pars = c("ar[1]", "ma[1]"))
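Making the sketch above concrete: a minimal runnable version, assuming your bayesforecast version provides as.stan() (it is used later in this module). The object names fit_stan_sarima and post_arr are illustrative, and the example uses parameters that certainly exist in this fit (mu0, sigma0); the exact AR/MA names may differ by version, so check the dimnames first. bayesplot was already loaded above.
fit_stan_sarima <- as.stan(fit_sarima) # varstan -> stanfit
post_arr <- as.array(fit_stan_sarima) # iterations x chains x parameters
dimnames(post_arr)$parameters # check the exact parameter names
mcmc_trace(post_arr, pars = c("mu0", "sigma0")) # per-chain traceplots
mcmc_acf(post_arr, pars = c("mu0", "sigma0")) # per-chain autocorrelation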
fc_sarima <- forecast(
fit_sarima,
h = 12,
level = 0.90 # 90% credible interval
)
autoplot(fc_sarima) +
ggtitle("Forecast 12 Bulan ke Depan (SARIMA Bayesian)") +
xlab("Waktu") + ylab("Penumpang (air)")
# The following helper functions are provided by bayesforecast (see the documentation):
res_sarima <- residuals(fit_sarima)
# ACF & PACF residual
bayesforecast::ggacf(res_sarima)
bayesforecast::ggpacf(res_sarima)
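As a complement to the visual ACF/PACF check, a Ljung-Box test on the residuals (base R) is a quick formal diagnostic; a large p-value is consistent with white-noise residuals:
Box.test(as.numeric(res_sarima), lag = 12, type = "Ljung-Box")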
Brief interpretation (narrative): in summary(fit_sarima), inspect the posterior means and credible intervals (the 5% and 95% columns form a 90% interval) for AR(1), MA(1), SAR(1), SMA(1).
library(INLA)
library(bayesforecast)
library(ggplot2)
INLA::inla.setOption(num.threads = "1:1") # for more reproducible results
# ------------------------------------------------------------
# Data: the same as for bayesforecast (monthly air series)
# ------------------------------------------------------------
data("air")
ts_air <- air
n <- length(ts_air)
h <- 12 # forecast 12 bulan ke depan
# Dalam INLA, observasi yang akan diramal dibuat NA
y_inla <- c(as.numeric(ts_air), rep(NA, h))
time <- 1:(n + h)
season <- rep(1:12, length.out = n + h)
# ------------------------------------------------------------
# INLA model:
# y_t = intercept + f(time, AR1) + f(season, seasonal(12)) + noise
# This is a Gaussian state-space analogue of a seasonal SARIMA.
# Default priors suffice, or they can be tightened via hyper.
# ------------------------------------------------------------
formula_inla <- y_inla ~ 1 +
f(time,
model = "ar1") +
f(season,
model = "seasonal",
season.length = 12)
res_inla <- inla(
formula_inla,
data = list(y_inla = y_inla, time = time, season = season),
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE),
quantiles = c(0.05, 0.5, 0.95) # for a 90% credible interval
)
# Model summary: intercept + AR1 & seasonal hyperparameters
summary(res_inla)
## Time used:
## Pre = 1.48, Running = 0.203, Post = 0.0165, Total = 1.69
## Fixed effects:
## mean sd 0.05quant 0.5quant 0.95quant mode kld
## (Intercept) 44.7 17.426 16.69 44.597 72.991 44.6 576.595
##
## Random effects:
## Name Model
## time AR1 model
## season Seasonal model
##
## Model hyperparameters:
## mean sd 0.05quant 0.5quant
## Precision for the Gaussian observations 2.73e+04 3.26e+04 3048.401 1.73e+04
## Precision for time 5.00e-03 4.00e-03 0.001 4.00e-03
## Rho for time 9.66e-01 2.70e-02 0.915 9.74e-01
## Precision for season 5.60e-02 1.40e-01 0.001 2.00e-02
## 0.95quant mode
## Precision for the Gaussian observations 8.40e+04 5890.320
## Precision for time 1.20e-02 0.002
## Rho for time 9.93e-01 0.985
## Precision for season 2.17e-01 0.001
##
## Deviance Information Criterion (DIC) ...............: -146.59
## Deviance Information Criterion (DIC, saturated) ....: 62.36
## Effective number of parameters .....................: 35.36
##
## Watanabe-Akaike information criterion (WAIC) ...: -156.62
## Effective number of parameters .................: 19.95
##
## Marginal log-Likelihood: -83.53
## is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
# ------------------------------------------------------------
# Extract the 12-month-ahead forecast + 90% credible interval
# ------------------------------------------------------------
idx_forecast <- (n + 1):(n + h)
inla_mean <- res_inla$summary.fitted.values[idx_forecast, "mean"]
inla_lower <- res_inla$summary.fitted.values[idx_forecast, "0.05quant"]
inla_upper <- res_inla$summary.fitted.values[idx_forecast, "0.95quant"]
# ------------------------------------------------------------
# INLA-specific plot: historical data + forecast + 90% CI
# ------------------------------------------------------------
time_hist <- 1:n
time_fc <- (n + 1):(n + h)
df_hist <- data.frame(
time = time_hist,
value = as.numeric(ts_air)
)
df_fc <- data.frame(
time = time_fc,
mean = inla_mean,
lower = inla_lower,
upper = inla_upper
)
ggplot() +
# Historical data
geom_line(
data = df_hist,
aes(x = time, y = value),
colour = "black"
) +
# 90% credible interval
geom_ribbon(
data = df_fc,
aes(x = time, ymin = lower, ymax = upper),
fill = "blue",
alpha = 0.2
) +
# Mean forecast
geom_line(
data = df_fc,
aes(x = time, y = mean),
colour = "blue",
linewidth = 1
) +
labs(
x = "Waktu (bulan)",
y = "Jumlah penumpang (scaled)",
title = "Forecast 12 bulan dengan INLA (AR1 + Seasonal 12)",
subtitle = "Garis biru: mean posterior, pita biru: credible interval 90%"
) +
theme_minimal()
library(INLA)
library(bayesforecast)
library(ggplot2)
INLA::inla.setOption(num.threads = "1:1")
# ------------------------------------------------------------
# 1. Data
# ------------------------------------------------------------
data("air")
ts_air <- air
n <- length(ts_air)
h <- 12
time_hist <- 1:n
time_fc <- (n + 1):(n + h)
# ------------------------------------------------------------
# 2. Index setup for INLA
# ------------------------------------------------------------
y_inla <- c(as.numeric(ts_air), rep(NA, h))
time_idx <- 1:(n + h) # untuk komponen RW1
season <- rep(1:12, length.out = n + h) # indeks musiman
time_lin <- (1:(n + h)) / 12 # tren linear (tahun)
# ------------------------------------------------------------
# 3. Model INLA: trend + seasonal
# y_t = beta0 + beta1 * time_lin + f(time_idx, RW1) + seasonal(12) + noise
# ------------------------------------------------------------
formula_inla_trend <- y_inla ~ 1 +
time_lin + # linear trend
f(time_idx,
model = "rw1",
hyper = list(
prec = list(prior = "loggamma", param = c(1, 0.01))
)) +
f(season,
model = "seasonal",
season.length = 12,
hyper = list(
prec = list(prior = "loggamma", param = c(1, 0.01))
))
res_inla_trend <- inla(
formula_inla_trend,
data = list(
y_inla = y_inla,
time_idx = time_idx,
time_lin = time_lin,
season = season
),
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE),
quantiles = c(0.05, 0.5, 0.95) # 90% CI
)
summary(res_inla_trend)$fixed
## mean sd 0.05quant 0.5quant 0.95quant mode kld
## (Intercept) 15.260 8.312 1.688 15.220 28.972 15.222 75.837
## time_lin 24.669 5.475 15.637 24.696 33.606 24.695 164.217
summary(res_inla_trend)$hyperpar
## mean sd 0.05quant 0.5quant
## Precision for the Gaussian observations 21169.003 20660.311 2880.973 15009.516
## Precision for time_idx 0.184 0.060 0.101 0.176
## Precision for season 0.082 0.106 0.008 0.049
## 0.95quant mode
## Precision for the Gaussian observations 59315.874 5854.767
## Precision for time_idx 0.295 0.161
## Precision for season 0.260 0.015
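Since both INLA specifications were fitted with control.compute = list(dic = TRUE, waic = TRUE), they can be compared directly (lower DIC/WAIC indicates better fit); a minimal sketch using the objects above:
data.frame(
model = c("AR1 + seasonal", "trend + RW1 + seasonal"),
DIC = c(res_inla$dic$dic, res_inla_trend$dic$dic),
WAIC = c(res_inla$waic$waic, res_inla_trend$waic$waic)
)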
# ------------------------------------------------------------
# 4. Extract the forecast + 90% credible interval
# ------------------------------------------------------------
idx_forecast <- (n + 1):(n + h)
inla_mean_tr <- res_inla_trend$summary.fitted.values[idx_forecast, "mean"]
inla_lower_tr <- res_inla_trend$summary.fitted.values[idx_forecast, "0.05quant"]
inla_upper_tr <- res_inla_trend$summary.fitted.values[idx_forecast, "0.95quant"]
# ------------------------------------------------------------
# 5. INLA plot (with trend): data + forecast + 90% CI
# ------------------------------------------------------------
df_hist <- data.frame(
time = time_hist,
value = as.numeric(ts_air)
)
df_fc_tr <- data.frame(
time = time_fc,
mean = inla_mean_tr,
lower = inla_lower_tr,
upper = inla_upper_tr
)
ggplot() +
geom_line(
data = df_hist,
aes(x = time, y = value),
colour = "black"
) +
geom_ribbon(
data = df_fc_tr,
aes(x = time, ymin = lower, ymax = upper),
fill = "blue",
alpha = 0.2
) +
geom_line(
data = df_fc_tr,
aes(x = time, y = mean),
colour = "blue",
linewidth = 1
) +
labs(
x = "Waktu (bulan)",
y = "Jumlah penumpang (scaled)",
title = "Forecast 12 bulan dengan INLA: Trend + Seasonal(12)",
subtitle = "Garis biru: mean posterior, pita biru: credible interval 90%"
) +
theme_minimal()
Given the demgbp currency log-return data (from the bayesforecast package), we want to model its volatility with a GARCH(1,1) model in a Bayesian framework.
Questions:
library(bayesforecast)
data("demgbp")
y_ret <- demgbp
# GARCH(1,1) model with Normal innovations
# using the stan_garch wrapper (following the vignette)
fit_g <- stan_garch(
ts = y_ret,
order = c(1, 1, 0), # GARCH(1,1), no MGARCH component
arma = c(0, 0), # constant-only mean (no ARMA part)
genT = FALSE, # Normal innovations (not Student-t)
chains = 2,
iter = 2000,
prior_arch = beta(2, 5), # prior for α (ARCH)
prior_garch = beta(2, 5), # prior for β (GARCH)
prior_sigma0 = exponential(1) # prior for ω (variance intercept)
)
##
## SAMPLING FOR MODEL 'tgarch' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.00015 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.5 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 2.07 seconds (Warm-up)
## Chain 1: 2.119 seconds (Sampling)
## Chain 1: 4.189 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'tgarch' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 9.9e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.99 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 2.295 seconds (Warm-up)
## Chain 2: 1.907 seconds (Sampling)
## Chain 2: 4.202 seconds (Total)
## Chain 2:
summary(fit_g)
## mean se 5% 95% ess Rhat
## mu0 -0.0076 0.0005 -0.0450 0.0303 2065.684 0.9995
## sigma0 0.0895 0.0006 0.0443 0.1371 2090.894 1.0016
## arch 0.2000 0.0015 0.1005 0.3271 2037.022 1.0066
## garch 0.3603 0.0034 0.1185 0.6209 2143.256 1.0028
## loglik -200.0808 0.0325 -202.7733 -198.0751 2111.159 0.9999
fit_stan_g <- as.stan(fit_g) # convert to a stanfit object
post_g <- rstan::extract(fit_stan_g)
names(post_g) # list the parameter names
## [1] "mu0" "sigma0" "arch" "garch" "mu" "epsilon"
## [7] "sigma" "loglik" "log_lik" "fit" "residuals" "lp__"
# suppose the parameters are named "arch" and "garch" (check names above)
alpha_post <- post_g$arch
beta_post <- post_g$garch
sum_term <- alpha_post + beta_post
mean(sum_term < 1) # proportion of draws in the stationary region
## [1] 1
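Beyond this stationarity indicator, the posterior of the persistence \(\alpha + \beta\) itself is informative (values close to 1 mean volatility shocks decay slowly); a one-line sketch:
quantile(sum_term, probs = c(0.05, 0.5, 0.95)) # posterior quantiles of alpha + beta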
# Forecast 20 steps ahead
fc_g <- forecast(fit_g, h = 20)
autoplot(fc_g)
# Check the residuals
check_residuals(fit_g)
Brief discussion:
Goal: compare the Bayesian approach using INLA against the one using Stan (via bayesforecast or a hand-written model). Choose one time series, for example birth from the bayesforecast package.
Questions:
data("birth")
ts_birth <- birth
model_bf <- Sarima(
ts_birth, # pass the series directly (no y = argument)
order = c(0, 1, 2),
seasonal = c(1, 0, 1),
period = 12
)
fit_bf <- varstan(
model_bf,
iter = 2000,
chains = 2,
seed = 123
)
##
## SAMPLING FOR MODEL 'Sarima' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.000109 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.09 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 3.571 seconds (Warm-up)
## Chain 1: 1.957 seconds (Sampling)
## Chain 1: 5.528 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'Sarima' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 0.000104 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 1.04 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 31.691 seconds (Warm-up)
## Chain 2: 104.441 seconds (Sampling)
## Chain 2: 136.132 seconds (Total)
## Chain 2:
summary(fit_bf)
## mean se 5% 95% ess Rhat
## mu0 0.2306 0.0054 -0.0513 0.4779 1.3050 2.1222
## sigma0 18.8089 0.2469 7.4416 31.1777 1.3067 2.1222
## ma.1 -0.4733 0.0027 -0.6020 -0.2939 1.3059 2.1222
## ma.2 -0.2604 0.0021 -0.3634 -0.1150 1.3035 2.1222
## sar 0.1326 0.0185 -0.6973 0.9810 1.3027 2.1222
## sma -0.2682 0.0030 -0.4573 -0.1219 1.3042 2.1223
## loglik -1544.4862 5.5751 -1796.8869 -1293.0451 1.3054 2.1225
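Note that these diagnostics are alarming: Rhat ≈ 2.12 (far above the usual ≤ 1.01 threshold) and effective sample sizes of about 1.3 mean the two chains have not mixed, so the posterior summaries above should not be interpreted as they stand. A first remedy (a sketch; it assumes longer chains suffice rather than a reparameterisation being needed, and fit_bf2 is an illustrative name):
fit_bf2 <- varstan(
model_bf,
iter = 4000, # longer chains
chains = 4, # more chains sharpen the Rhat diagnostic
seed = 123
)
summary(fit_bf2) # check that Rhat is near 1 and ess is large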
fc_bf <- forecast(fit_bf, h = 12)
autoplot(fc_bf) +
ggtitle("Forecast 12 bulan - Stan/bayesforecast")
fc_bf <- forecast(fit_bf, h = h, level = 90)
bf_mean <- as.numeric(fc_bf$mean)
bf_lower <- as.numeric(fc_bf$lower[, "90%"])
bf_upper <- as.numeric(fc_bf$upper[, "90%"])
For INLA, we need the INLA package and must write the ARIMA model in state-space form (a latent Gaussian model). Sketch (pseudo):
# =================================================
# PART B: INLA state-space AR1 + seasonal
# =================================================
data("birth")
ts_birth <- birth
n <- length(ts_birth) # jangan pakai nilai n dari chunk lain
h <- 12
time_hist <- 1:n
time_fc <- (n + 1):(n + h)
# ------------------------------------------------------------
# Variables for INLA (ALL THE SAME LENGTH)
# ------------------------------------------------------------
y_inla <- c(as.numeric(ts_birth), rep(NA, h)) # length = n+h
time_idx <- 1:(n + h) # length sama
season <- rep(1:12, length.out = n + h) # length sama
length(y_inla); length(time_idx); length(season) # cek: harus sama
## [1] 385
## [1] 385
## [1] 385
# ------------------------------------------------------------
# MODEL INLA: AR1 + seasonal
# ------------------------------------------------------------
formula_inla <- y_inla ~ 1 +
f(time_idx, model = "ar1") +
f(season, model = "seasonal", season.length = 12)
res_inla <- inla(
formula_inla,
data = list(
y_inla = y_inla,
time_idx = time_idx,
season = season
),
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE),
quantiles = c(0.05, 0.5, 0.95)
)
summary(res_inla)
## Time used:
## Pre = 1.1, Running = 0.441, Post = 0.0251, Total = 1.57
## Fixed effects:
## mean sd 0.05quant 0.5quant 0.95quant mode kld
## (Intercept) 308.834 14.567 284.71 308.975 332.447 308.966 24009.49
##
## Random effects:
## Name Model
## time_idx AR1 model
## season Seasonal model
##
## Model hyperparameters:
## mean sd 0.05quant 0.5quant
## Precision for the Gaussian observations 0.060 0.011 0.044 0.058
## Precision for time_idx 0.001 0.000 0.001 0.001
## Rho for time_idx 0.986 0.003 0.981 0.986
## Precision for season 0.000 0.000 0.000 0.000
## 0.95quant mode
## Precision for the Gaussian observations 0.080 0.056
## Precision for time_idx 0.002 0.001
## Rho for time_idx 0.990 0.986
## Precision for season 0.000 0.000
##
## Deviance Information Criterion (DIC) ...............: 2294.07
## Deviance Information Criterion (DIC, saturated) ....: 544.70
## Effective number of parameters .....................: 176.95
##
## Watanabe-Akaike information criterion (WAIC) ...: 2300.43
## Effective number of parameters .................: 141.88
##
## Marginal log-Likelihood: -1287.39
## is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
idx_forecast <- (n + 1):(n + h)
inla_mean <- res_inla$summary.fitted.values[idx_forecast, "mean"]
inla_lower <- res_inla$summary.fitted.values[idx_forecast, "0.05quant"]
inla_upper <- res_inla$summary.fitted.values[idx_forecast, "0.95quant"]
# ------------------------------------------------------------
# 4. Assemble the data for the INLA-specific plot
# ------------------------------------------------------------
df_hist <- data.frame(
time = time_hist,
value = as.numeric(ts_birth),
type = "Data historis"
)
df_fc <- data.frame(
time = time_fc,
mean = inla_mean,
lower = inla_lower,
upper = inla_upper
)
# ------------------------------------------------------------
# 5. INLA plot: data + mean + 90% credible interval
# ------------------------------------------------------------
ggplot() +
# Historical data
geom_line(
data = df_hist,
aes(x = time, y = value),
colour = "black"
) +
# 90% credible interval (ribbon)
geom_ribbon(
data = df_fc,
aes(x = time, ymin = lower, ymax = upper),
fill = "blue",
alpha = 0.2
) +
# Mean forecast
geom_line(
data = df_fc,
aes(x = time, y = mean),
colour = "blue",
linewidth = 1
) +
labs(
x = "Waktu (bulan)",
y = "Birth",
title = "Forecast 12 Bulan dengan INLA",
subtitle = "Garis biru: mean posterior, pita biru: credible interval 90%"
) +
theme_minimal()
**Comparison: bayesforecast vs INLA**
# =================================================
# PART C: Assemble the data frames for the plot
# =================================================
# Historical data
df_hist <- data.frame(
time = time_hist,
value = as.numeric(ts_birth),
series = "Historical data"
)
# Mean forecast Stan
df_bf_mean <- data.frame(
time = time_fc,
value = bf_mean,
series = "Stan (mean)"
)
# Mean forecast INLA
df_inla_mean <- data.frame(
time = time_fc,
value = inla_mean,
series = "INLA (mean)"
)
# CI ribbons
df_bf_ribbon <- data.frame(
time = time_fc,
lower = bf_lower,
upper = bf_upper,
model = "Stan"
)
df_inla_ribbon <- data.frame(
time = time_fc,
lower = inla_lower,
upper = inla_upper,
model = "INLA"
)
# =================================================
# PART D: Comparison plot
# =================================================
ggplot() +
geom_line(
data = df_hist,
aes(x = time, y = value, colour = "Data historis"),
linewidth = 0.8
) +
geom_ribbon(
data = df_bf_ribbon,
aes(x = time, ymin = lower, ymax = upper, fill = "Stan (CI 90%)"),
alpha = 0.20
) +
geom_ribbon(
data = df_inla_ribbon,
aes(x = time, ymin = lower, ymax = upper, fill = "INLA (CI 90%)"),
alpha = 0.20
) +
geom_line(
data = df_bf_mean,
aes(x = time, y = value, colour = "Stan (mean)"),
linewidth = 1
) +
geom_line(
data = df_inla_mean,
aes(x = time, y = value, colour = "INLA (mean)"),
linewidth = 1,
linetype = "dashed"
) +
scale_colour_manual(
values = c(
"Data historis" = "black",
"Stan (mean)" = "red",
"INLA (mean)" = "blue"
),
name = "Garis"
) +
scale_fill_manual(
values = c(
"Stan (CI 90%)" = "red",
"INLA (CI 90%)" = "blue"
),
name = "Credible interval"
) +
labs(
x = "Waktu (bulan)",
y = "Birth",
title = "Perbandingan Forecast 12 Bulan — Stan/bayesforecast vs INLA",
subtitle = "Garis: Mean Posterior | Pita: Credible Interval 90%"
) +
theme_minimal()
library(dplyr)
library(knitr)
library(kableExtra)
df_compare <- data.frame(
Horizon = 1:h,
Stan_Mean = round(bf_mean, 3),
Stan_Lower = round(bf_lower, 3),
Stan_Upper = round(bf_upper, 3),
INLA_Mean = round(inla_mean, 3),
INLA_Lower = round(inla_lower, 3),
INLA_Upper = round(inla_upper, 3)
)
kable(df_compare,
caption = "Perbandingan Forecast 12 Bulan: Stan/bayesforecast vs INLA",
align = "c") %>%
kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover"))
| Horizon | Stan_Mean | Stan_Lower | Stan_Upper | INLA_Mean | INLA_Lower | INLA_Upper |
|---|---|---|---|---|---|---|
| 1 | 277.483 | 248.679 | 331.427 | 255.834 | 246.613 | 265.099 |
| 2 | 275.638 | 227.700 | 304.514 | 279.853 | 268.165 | 291.609 |
| 3 | 276.424 | 245.845 | 320.025 | 259.721 | 246.055 | 273.467 |
| 4 | 276.655 | 234.455 | 313.558 | 271.112 | 255.767 | 286.545 |
| 5 | 276.455 | 238.723 | 313.775 | 272.064 | 255.248 | 288.973 |
| 6 | 276.841 | 210.823 | 312.476 | 298.583 | 280.452 | 316.811 |
| 7 | 274.997 | 198.744 | 322.411 | 306.296 | 286.974 | 325.717 |
| 8 | 276.418 | 215.943 | 308.952 | 302.598 | 282.185 | 323.111 |
| 9 | 277.414 | 226.580 | 305.907 | 296.523 | 275.105 | 318.041 |
| 10 | 276.336 | 238.645 | 307.477 | 277.199 | 254.849 | 299.651 |
| 11 | 277.219 | 227.833 | 308.302 | 286.977 | 263.757 | 310.298 |
| 12 | 276.223 | 233.597 | 308.982 | 279.892 | 255.861 | 304.025 |
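To quantify the visual impression from the table, we can compare the average width of the 90% intervals, using the df_compare frame built above:
mean(df_compare$Stan_Upper - df_compare$Stan_Lower) # average Stan interval width
mean(df_compare$INLA_Upper - df_compare$INLA_Lower) # average INLA interval width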
Comparative analysis (narrative):
Discussion of strengths and weaknesses:
Stan (HMC) yields a full posterior and is very flexible with respect to priors and complex models, but it needs more computation time and requires convergence checks (R-hat, ESS, traceplots).
INLA is very fast for latent Gaussian models (including many standard ARIMA and state-space models), but it relies on Laplace approximations; prior specification can be more restricted, and it is mainly suited to a particular class of models.
Suppose we have a monthly time series of product sales over 10 years, together with an exogenous binary variable promosi (0/1) indicating whether a promotional campaign is running.
Model: ARIMAX(2,0,1) without a seasonal part, with xreg = promosi.
\[ y_t = \mu_0 + \beta_{\text{promosi}} \cdot \text{promosi}_t + \phi_{1} y_{t-1} + \phi_{2} y_{t-2} + \theta_{1} \varepsilon_{t-1} + \varepsilon_t \]
Priors:
Questions:
set.seed(123)
n <- 120 # 10 tahun bulanan
t <- 1:n
promosi <- rbinom(n, size = 1, prob = 0.3)
# Suppose the true promotion effect = 0.8; for simplicity the sketch adds
# i.i.d. Gaussian noise rather than a true ARMA(2,1) process
beta_promosi_true <- 0.8
# For the exercise you can use real data directly if available;
# here this is only a simulation sketch / placeholder.
ts_penjualan <- ts(rnorm(n, mean = 100 + beta_promosi_true * promosi, sd = 5), frequency = 12)
library(bayesforecast)
set.seed(123)
n <- 120 # 10 tahun bulanan
t <- 1:n
promosi <- rbinom(n, size = 1, prob = 0.3)
beta_promosi_true <- 0.8
ts_penjualan <- ts(
rnorm(n, mean = 100 + beta_promosi_true * promosi, sd = 5),
frequency = 12
)
# --- Modeling ARIMAX(2,0,1) with stan_sarima ---
fit_arimax <- stan_sarima(
ts = ts_penjualan,
order = c(2, 0, 1),
seasonal = c(0, 0, 0), # non-seasonal ARIMAX
xreg = matrix(promosi, ncol = 1), # xreg MUST be a matrix
period = 12,
chains = 2,
iter = 2000,
prior_ar = normal(0, 0.5), # prior for AR(i): N(0, 0.5^2)
prior_ma = normal(0, 0.5), # prior for MA(1): N(0, 0.5^2)
prior_breg = normal(0, 1) # prior for beta_promosi: N(0, 1)
)
##
## SAMPLING FOR MODEL 'Sarima' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 3.6e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.36 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 3.217 seconds (Warm-up)
## Chain 1: 3.22 seconds (Sampling)
## Chain 1: 6.437 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'Sarima' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 3.6e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.36 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 3.023 seconds (Warm-up)
## Chain 2: 2.994 seconds (Sampling)
## Chain 2: 6.017 seconds (Total)
## Chain 2:
summary(fit_arimax)
## mean se 5% 95% ess Rhat
## mu0 100.4386 0.1159 91.8326 108.7799 1829.242 1.0015
## sigma0 4.7715 0.0068 4.2956 5.2891 1732.396 0.9997
## ar.1 -0.0208 0.0015 -0.1290 0.0922 2008.913 0.9998
## ar.2 0.0132 0.0009 -0.0528 0.0810 1936.467 1.0036
## ma -0.0361 0.0025 -0.2148 0.1489 2090.741 0.9999
## breg 1.4542 0.0158 0.3254 2.6578 1984.823 1.0009
## loglik -360.3911 0.0425 -364.0209 -357.8046 1935.233 1.0001
h <- 6
# Scenario 1: promosi = 0
xreg_future_0 <- matrix(0, nrow = h, ncol = 1)
# Scenario 2: promosi = 1
xreg_future_1 <- matrix(1, nrow = h, ncol = 1)
fc_arimax_0 <- forecast(fit_arimax, h = h, xreg = xreg_future_0)
fc_arimax_1 <- forecast(fit_arimax, h = h, xreg = xreg_future_1)
autoplot(fc_arimax_0) + ggtitle("6-month forecast (promosi = 0)")
autoplot(fc_arimax_1) + ggtitle("6-month forecast (promosi = 1)")
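The two scenario forecasts can also be compared directly. Assuming the forecast objects expose a $mean series (as used earlier in this module), the per-horizon posterior-mean uplift implied by the promotion is simply:
as.numeric(fc_arimax_1$mean) - as.numeric(fc_arimax_0$mean) # promotion uplift per horizon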
Also check the residuals (e.g. with check_residuals()).
library(INLA)
library(ggplot2)
set.seed(123)
# -------------------------------------------
# 1) Data (the same as before)
# -------------------------------------------
n <- 120 # 10 tahun bulanan
t <- 1:n
promosi <- rbinom(n, size = 1, prob = 0.3)
beta_promosi_true <- 0.8
ts_penjualan <- ts(
rnorm(n, mean = 100 + beta_promosi_true * promosi, sd = 5),
frequency = 12
)
dat_inla <- data.frame(
y = as.numeric(ts_penjualan),
time_id = 1:n,
promosi = promosi
)
formula_inla <- y ~ 1 + promosi +
f(time_id, model = "ar", order = 2)
fit_inla <- inla(
formula_inla,
data = dat_inla,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE)
)
# -------------------------------------------
# 2) Forecast 6 months ahead
# Scenario 1: promosi = 0
# -------------------------------------------
h <- 6
time_future <- (n + 1):(n + h)
dat_future_0 <- data.frame(
y = rep(NA, h),
time_id = time_future,
promosi = rep(0, h)
)
dat_inla_0 <- rbind(dat_inla, dat_future_0)
fit_inla_0 <- inla(
formula_inla,
data = dat_inla_0,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE)
)
# Extract fitted values (including those for y = NA)
fv0 <- fit_inla_0$summary.fitted.values
# the last 6 points = the forecast
idx_future_0 <- (n + 1):(n + h)
fc_mean_0 <- fv0[idx_future_0, "mean"]
fc_lower_0 <- fv0[idx_future_0, "0.025quant"]
fc_upper_0 <- fv0[idx_future_0, "0.975quant"]
# Historical data and forecast kept in separate frames
df_hist <- data.frame(
time = 1:n,
y = as.numeric(ts_penjualan)
)
df_fc_0 <- data.frame(
time = time_future,
mean = fc_mean_0,
lower = fc_lower_0,
upper = fc_upper_0
)
ggplot() +
geom_line(data = df_hist, aes(x = time, y = y)) +
geom_line(data = df_fc_0, aes(x = time, y = mean), linetype = "dashed") +
geom_ribbon(
data = df_fc_0,
aes(x = time, ymin = lower, ymax = upper),
alpha = 0.2
) +
labs(
title = "Forecast 6 bulan (promosi = 0) - INLA",
x = "Waktu",
y = "Penjualan"
)
# -------------------------------------------
# 3) Scenario 2: promosi = 1
# -------------------------------------------
dat_future_1 <- data.frame(
y = rep(NA, h),
time_id = time_future,
promosi = rep(1, h)
)
dat_inla_1 <- rbind(dat_inla, dat_future_1)
fit_inla_1 <- inla(
formula_inla,
data = dat_inla_1,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE)
)
fv1 <- fit_inla_1$summary.fitted.values
fc_mean_1 <- fv1[idx_future_1 <- (n + 1):(n + h), "mean"]
fc_lower_1 <- fv1[idx_future_1, "0.025quant"]
fc_upper_1 <- fv1[idx_future_1, "0.975quant"]
df_fc_1 <- data.frame(
time = time_future,
mean = fc_mean_1,
lower = fc_lower_1,
upper = fc_upper_1
)
ggplot() +
geom_line(data = df_hist, aes(x = time, y = y)) +
geom_line(data = df_fc_1, aes(x = time, y = mean), linetype = "dashed") +
geom_ribbon(
data = df_fc_1,
aes(x = time, ymin = lower, ymax = upper),
alpha = 0.2
) +
labs(
title = "Forecast 6 bulan (promosi = 1) - INLA",
x = "Waktu",
y = "Penjualan"
)
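On the INLA side, the same scenario comparison is immediate, since both forecasts were extracted above; a sketch:
data.frame(
horizon = 1:h,
uplift = fc_mean_1 - fc_mean_0 # INLA posterior-mean effect of promosi on the forecast
)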
We want to build a simple trend + seasonal ("local level + seasonal") model in a Bayesian framework for a weekly series of website visit counts spanning 3 years.
State-space model:
\[ y_t = \mu_t + \gamma_t + \varepsilon_t, \\ \mu_t = \mu_{t-1} + \eta_t, \]
where:
Use the LocalLevel() function or its equivalent in bayesforecast (the package supports models of this kind).
Priors:
Questions:
library(bayesforecast)
set.seed(123)
T <- 156
s <- 52
time <- 1:T
mu_true <- cumsum(rnorm(T, sd = 0.1))
gamma_true <- rep(rnorm(s, sd = 0.5), length.out = T)
y_web <- mu_true + gamma_true + rnorm(T, sd = 0.3)
ts_web <- ts(y_web, frequency = s)
# Fit the Local Level model (ETS(A,N,N)) with Stan
fit_ll <- stan_LocalLevel(
ts = ts_web,
chains = 2,
iter = 1500,
prior_sigma0 = normal(0, 1), # roughly a prior for σ_ε
prior_level = normal(0, 1) # prior for the initial level μ_1
)
##
## SAMPLING FOR MODEL 'ets' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.000216 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 2.16 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 1500 [ 0%] (Warmup)
## Chain 1: Iteration: 150 / 1500 [ 10%] (Warmup)
## Chain 1: Iteration: 300 / 1500 [ 20%] (Warmup)
## Chain 1: Iteration: 450 / 1500 [ 30%] (Warmup)
## Chain 1: Iteration: 600 / 1500 [ 40%] (Warmup)
## Chain 1: Iteration: 750 / 1500 [ 50%] (Warmup)
## Chain 1: Iteration: 751 / 1500 [ 50%] (Sampling)
## Chain 1: Iteration: 900 / 1500 [ 60%] (Sampling)
## Chain 1: Iteration: 1050 / 1500 [ 70%] (Sampling)
## Chain 1: Iteration: 1200 / 1500 [ 80%] (Sampling)
## Chain 1: Iteration: 1350 / 1500 [ 90%] (Sampling)
## Chain 1: Iteration: 1500 / 1500 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.198 seconds (Warm-up)
## Chain 1: 0.172 seconds (Sampling)
## Chain 1: 0.37 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'ets' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 3.4e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.34 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 1500 [ 0%] (Warmup)
## Chain 2: Iteration: 150 / 1500 [ 10%] (Warmup)
## Chain 2: Iteration: 300 / 1500 [ 20%] (Warmup)
## Chain 2: Iteration: 450 / 1500 [ 30%] (Warmup)
## Chain 2: Iteration: 600 / 1500 [ 40%] (Warmup)
## Chain 2: Iteration: 750 / 1500 [ 50%] (Warmup)
## Chain 2: Iteration: 751 / 1500 [ 50%] (Sampling)
## Chain 2: Iteration: 900 / 1500 [ 60%] (Sampling)
## Chain 2: Iteration: 1050 / 1500 [ 70%] (Sampling)
## Chain 2: Iteration: 1200 / 1500 [ 80%] (Sampling)
## Chain 2: Iteration: 1350 / 1500 [ 90%] (Sampling)
## Chain 2: Iteration: 1500 / 1500 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.187 seconds (Warm-up)
## Chain 2: 0.162 seconds (Sampling)
## Chain 2: 0.349 seconds (Total)
## Chain 2:
summary(fit_ll)
## mean se 5% 95% ess Rhat
## sigma0 0.6459 0.0009 0.5899 0.7110 1565.806 1.0005
## level 0.1128 0.0009 0.0626 0.1780 1472.178 1.0067
## level1 0.2960 0.0073 -0.1741 0.7813 1474.195 1.0019
## loglik -152.4155 0.0350 -155.0325 -150.9943 1540.224 0.9996
fit_ll_seasonal <- stan_ssm(
ts = ts_web,
trend = FALSE,
seasonal = TRUE,
period = s,
chains = 2,
iter = 1500
)
##
## SAMPLING FOR MODEL 'ets' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.000361 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 3.61 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 1500 [ 0%] (Warmup)
## Chain 1: Iteration: 150 / 1500 [ 10%] (Warmup)
## Chain 1: Iteration: 300 / 1500 [ 20%] (Warmup)
## Chain 1: Iteration: 450 / 1500 [ 30%] (Warmup)
## Chain 1: Iteration: 600 / 1500 [ 40%] (Warmup)
## Chain 1: Iteration: 750 / 1500 [ 50%] (Warmup)
## Chain 1: Iteration: 751 / 1500 [ 50%] (Sampling)
## Chain 1: Iteration: 900 / 1500 [ 60%] (Sampling)
## Chain 1: Iteration: 1050 / 1500 [ 70%] (Sampling)
## Chain 1: Iteration: 1200 / 1500 [ 80%] (Sampling)
## Chain 1: Iteration: 1350 / 1500 [ 90%] (Sampling)
## Chain 1: Iteration: 1500 / 1500 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.867 seconds (Warm-up)
## Chain 1: 0.689 seconds (Sampling)
## Chain 1: 1.556 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'ets' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 6.1e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.61 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 1500 [ 0%] (Warmup)
## Chain 2: Iteration: 150 / 1500 [ 10%] (Warmup)
## Chain 2: Iteration: 300 / 1500 [ 20%] (Warmup)
## Chain 2: Iteration: 450 / 1500 [ 30%] (Warmup)
## Chain 2: Iteration: 600 / 1500 [ 40%] (Warmup)
## Chain 2: Iteration: 750 / 1500 [ 50%] (Warmup)
## Chain 2: Iteration: 751 / 1500 [ 50%] (Sampling)
## Chain 2: Iteration: 900 / 1500 [ 60%] (Sampling)
## Chain 2: Iteration: 1050 / 1500 [ 70%] (Sampling)
## Chain 2: Iteration: 1200 / 1500 [ 80%] (Sampling)
## Chain 2: Iteration: 1350 / 1500 [ 90%] (Sampling)
## Chain 2: Iteration: 1500 / 1500 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.844 seconds (Warm-up)
## Chain 2: 0.675 seconds (Sampling)
## Chain 2: 1.519 seconds (Total)
## Chain 2:
library(bayesforecast)
library(ggplot2)
set.seed(123)
T <- 156
s <- 52
time <- 1:T
mu_true <- cumsum(rnorm(T, sd = 0.1))
gamma_true <- rep(rnorm(s, sd = 0.5), length.out = T) # kita simulasikan seasonal
y_web <- mu_true + gamma_true + rnorm(T, sd = 0.3)
ts_web <- ts(y_web, frequency = s)
# ---- Local Level with stan_LocalLevel (no explicit seasonal) ----
fit_ll <- stan_LocalLevel(
ts = ts_web,
chains = 2,
iter = 1500,
prior_sigma0 = normal(0, 1),
prior_level = normal(0, 1)
)
##
## SAMPLING FOR MODEL 'ets' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 3.7e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.37 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 1500 [ 0%] (Warmup)
## Chain 1: Iteration: 150 / 1500 [ 10%] (Warmup)
## Chain 1: Iteration: 300 / 1500 [ 20%] (Warmup)
## Chain 1: Iteration: 450 / 1500 [ 30%] (Warmup)
## Chain 1: Iteration: 600 / 1500 [ 40%] (Warmup)
## Chain 1: Iteration: 750 / 1500 [ 50%] (Warmup)
## Chain 1: Iteration: 751 / 1500 [ 50%] (Sampling)
## Chain 1: Iteration: 900 / 1500 [ 60%] (Sampling)
## Chain 1: Iteration: 1050 / 1500 [ 70%] (Sampling)
## Chain 1: Iteration: 1200 / 1500 [ 80%] (Sampling)
## Chain 1: Iteration: 1350 / 1500 [ 90%] (Sampling)
## Chain 1: Iteration: 1500 / 1500 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.197 seconds (Warm-up)
## Chain 1: 0.171 seconds (Sampling)
## Chain 1: 0.368 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'ets' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 3.6e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.36 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 1500 [ 0%] (Warmup)
## Chain 2: Iteration: 150 / 1500 [ 10%] (Warmup)
## Chain 2: Iteration: 300 / 1500 [ 20%] (Warmup)
## Chain 2: Iteration: 450 / 1500 [ 30%] (Warmup)
## Chain 2: Iteration: 600 / 1500 [ 40%] (Warmup)
## Chain 2: Iteration: 750 / 1500 [ 50%] (Warmup)
## Chain 2: Iteration: 751 / 1500 [ 50%] (Sampling)
## Chain 2: Iteration: 900 / 1500 [ 60%] (Sampling)
## Chain 2: Iteration: 1050 / 1500 [ 70%] (Sampling)
## Chain 2: Iteration: 1200 / 1500 [ 80%] (Sampling)
## Chain 2: Iteration: 1350 / 1500 [ 90%] (Sampling)
## Chain 2: Iteration: 1500 / 1500 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.186 seconds (Warm-up)
## Chain 2: 0.162 seconds (Sampling)
## Chain 2: 0.348 seconds (Total)
## Chain 2:
summary(fit_ll)
## mean se 5% 95% ess Rhat
## sigma0 0.6459 0.0009 0.5899 0.7110 1565.806 1.0005
## level 0.1128 0.0009 0.0626 0.1780 1472.178 1.0067
## level1 0.2960 0.0073 -0.1741 0.7813 1474.195 1.0019
## loglik -152.4155 0.0350 -155.0325 -150.9943 1540.224 0.9996
# Forecast 26 weeks ahead
fc_ll <- forecast(fit_ll, h = 26)
autoplot(fc_ll) +
ggtitle("Local Level (Bayesian) - Forecast 26 minggu") +
xlab("Waktu (minggu)") + ylab("Kunjungan")
# Residual & ACF
check_residuals(fit_ll)
res_ll <- residuals(fit_ll)
bayesforecast::ggacf(res_ll)
bayesforecast::ggpacf(res_ll)
fit_sarima_week <- stan_sarima(
ts = ts_web,
order = c(0, 0, 0),
seasonal = c(1, 0, 1), # a simple seasonal SARMA example
period = 52,
chains = 2,
iter = 2000
)
##
## SAMPLING FOR MODEL 'Sarima' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.3 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.403 seconds (Warm-up)
## Chain 1: 0.341 seconds (Sampling)
## Chain 1: 0.744 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'Sarima' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 2.4e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.24 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.401 seconds (Warm-up)
## Chain 2: 0.341 seconds (Sampling)
## Chain 2: 0.742 seconds (Total)
## Chain 2:
fc_week <- forecast(fit_sarima_week, h = 26)
autoplot(fc_week) +
ggtitle("SARIMA musiman (periode 52) - Forecast 26 minggu")
This document presents the INLA version of the weekly time-series examples previously fitted with bayesforecast, namely:
stan_LocalLevel() — the Local Level model (ETS(A,N,N)).
stan_ssm(trend = FALSE, seasonal = TRUE) — the seasonal SSM.
stan_sarima() with the seasonal component \((0,0,0)(1,0,1)_s\).
The main focus here is the INLA formulation of these models (and their correspondence with bayesforecast), along with the forecasting workflow.
Simulating Weekly Web Traffic Data
We use the same simulation scenario:
set.seed(123)
T <- 156
s <- 52
time <- 1:T
# Level component (mu_t) as a random walk
mu_true <- cumsum(rnorm(T, sd = 0.1))
# Seasonal component gamma_t with period s
gamma_true <- rep(rnorm(s, sd = 0.5), length.out = T)
# Observations (adding observation noise)
y_web <- mu_true + gamma_true + rnorm(T, sd = 0.3)
ts_web <- ts(y_web, frequency = s)
autoplot(ts_web) +
ggtitle("Simulasi Weekly Web Traffic: Local Level + Musiman") +
xlab("Minggu") + ylab("Kunjungan (y_t)")
The Local Level & Seasonal Model: Mathematical Formulation
Observation Model
Suppose the series \(\{y_t\}_{t=1}^T\) follows the model:
\[ y_t = \mu_t + \gamma_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{N}(0, \sigma_\varepsilon^2), \quad t = 1, \dots, T. \]
where:
Level Evolution: Local Level (Random Walk)
The level component is assumed to follow the process:
\[ \mu_t = \mu_{t-1} + \eta_t, \qquad \eta_t \sim \mathrm{N}(0, \sigma_\mu^2), \quad t = 2, \dots, T, \]
with a prior on the initial level:
\[ \mu_1 \sim \mathrm{N}(m_0, \sigma_0^2). \]
In the INLA framework, this is realised as a random effect with the rw1 model:
\[ \mu_t = f_{\text{level}}(t), \qquad f_{\text{level}}(\cdot) \sim \text{RW1}(\sigma_\mu^2), \]
which means the first differences \(f_{\text{level}}(t) - f_{\text{level}}(t-1)\) are i.i.d. Gaussian.
Seasonal Component
The seasonal component with period \(s\) is modelled as a Gaussian process with a repeating pattern and a sum-to-zero constraint (so that the level and seasonal components are identified):
\[ \gamma_t = f_{\text{season}}(t), \qquad t = 1, \dots, T, \]
with prior:
In INLA, this is implemented with:
f(time_season, model = "seasonal", season.length = s)
The Full Model in INLA
In brief, the model we fit in INLA is:
\[ y_t = \beta_0 + f_{\text{level}}(t) + f_{\text{season}}(t) + \varepsilon_t, \]
where:
This is the INLA version of the combination of:
stan_LocalLevel() (the level part), and
stan_ssm(trend = FALSE, seasonal = TRUE) (the seasonal part) in bayesforecast.
INLA Implementation: Local Level + Seasonal
Preparing the Data
dat_inla <- data.frame(
y = as.numeric(ts_web),
time_level = 1:T, # index for the level component (RW1)
time_season = 1:T # index for the seasonal component
)
head(dat_inla)
## y time_level time_season
## 1 0.72071944 1 1
## 2 -0.28149313 2 2
## 3 0.60106578 3 3
## 4 -0.03032784 4 4
## 5 0.99288363 5 5
## 6 -0.41111602 6 6
Specifying the Model Formula
s <- frequency(ts_web) # 52
formula_inla <- y ~ 1 +
f(time_level, model = "rw1") +
f(time_season, model = "seasonal", season.length = s)
formula_inla
## y ~ 1 + f(time_level, model = "rw1") + f(time_season, model = "seasonal",
## season.length = s)
Fitting the Model with INLA
fit_inla <- inla(
formula_inla,
data = dat_inla,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE)
)
summary(fit_inla)
## Time used:
## Pre = 1.13, Running = 0.353, Post = 0.0192, Total = 1.5
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) 0.27 0.025 0.221 0.27 0.319 0.27 424.939
##
## Random effects:
## Name Model
## time_level RW1 model
## time_season Seasonal model
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 10.37 1.68 7.42 10.24
## Precision for time_level 256.34 171.12 64.57 213.19
## Precision for time_season 22074.06 24144.89 1482.65 14510.82
## 0.975quant mode
## Precision for the Gaussian observations 14.01 10.02
## Precision for time_level 704.52 147.52
## Precision for time_season 86183.98 4047.92
##
## Deviance Information Criterion (DIC) ...............: 147.03
## Deviance Information Criterion (DIC, saturated) ....: 224.18
## Effective number of parameters .....................: 65.84
##
## Watanabe-Akaike information criterion (WAIC) ...: 149.11
## Effective number of parameters .................: 53.43
##
## Marginal log-Likelihood: -87.65
## is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
We can look at the fixed effects:
fit_inla$summary.fixed
## mean sd 0.025quant 0.5quant 0.975quant mode
## (Intercept) 0.2701256 0.02511042 0.220752 0.2701255 0.3194998 0.2701255
## kld
## (Intercept) 424.9391
and at the hyperparameters:
fit_inla$summary.hyperpar
## mean sd 0.025quant
## Precision for the Gaussian observations 10.36462 1.677045 7.421421
## Precision for time_level 256.34438 171.124057 64.568693
## Precision for time_season 22074.06182 24144.887397 1482.647276
## 0.5quant 0.975quant mode
## Precision for the Gaussian observations 10.24316 14.00567 10.02424
## Precision for time_level 213.18829 704.52045 147.51986
## Precision for time_season 14510.81716 86183.97533 4047.91794
as well as the model-fit measures:
fit_inla$dic$dic
## [1] 147.0323
fit_inla$waic$waic
## [1] 149.1134
Decomposition: Level, Seasonal, and Fitted Values
Extracting Posterior-Mean Components
level_effect <- fit_inla$summary.random$time_level[ , c("ID", "mean")]
season_effect <- fit_inla$summary.random$time_season[ , c("ID", "mean")]
df_decomp <- data.frame(
time = 1:T,
y = as.numeric(ts_web),
level = level_effect$mean,
season = season_effect$mean,
fitted = fit_inla$summary.fitted.values[1:T, "mean"]
)
head(df_decomp)
## time y level season fitted
## 1 1 0.72071944 -0.07637168 0.15619266 0.3498892
## 2 2 -0.28149313 -0.09821891 -0.32874512 -0.1568650
## 3 3 0.60106578 -0.11228180 0.46407819 0.6219156
## 4 4 -0.03032784 -0.12518175 0.03121308 0.1761691
## 5 5 0.99288363 -0.12589481 0.80593914 0.9501814
## 6 6 -0.41111602 -0.12955761 -0.49948511 -0.3589004
Plot Data vs Fitted
ggplot(df_decomp, aes(x = time)) +
geom_line(aes(y = y), colour = "grey40") +
geom_line(aes(y = fitted), colour = "blue") +
labs(
title = "Data & Nilai Fitted (INLA)",
x = "Minggu",
y = "Kunjungan"
)
Plot of the Level Component
ggplot(df_decomp, aes(x = time, y = level)) +
geom_line() +
labs(
title = "Komponen Level (RW1)",
x = "Minggu",
y = expression(mu[t])
)
Plot of the Seasonal Component
ggplot(df_decomp, aes(x = time, y = season)) +
geom_line() +
labs(
title = "Komponen Musiman (Periode 52)",
x = "Minggu",
y = expression(gamma[t])
)
Brief interpretation:
Forecasting 26 Weeks Ahead with INLA
In bayesforecast, the forecast is produced with:
fc_ll <- forecast(fit_ll, h = 26)
autoplot(fc_ll)
In INLA, we instead use data augmentation: append future rows with y = NA, extend the time_level and time_season indices, and read off summary.fitted.values at the future indices.
Preparing the Forecast Data
h <- 26
time_future <- (T + 1):(T + h)
dat_future <- data.frame(
y = rep(NA, h),
time_level = time_future,
time_season = time_future
)
dat_all <- rbind(dat_inla, dat_future)
nrow(dat_all)
## [1] 182
Fitting the Model on the Extended Data
fit_forecast <- inla(
formula_inla,
data = dat_all,
family = "gaussian",
control.predictor = list(compute = TRUE),
control.compute = list(dic = TRUE, waic = TRUE)
)
fv <- fit_forecast$summary.fitted.values
idx_future <- (T + 1):(T + h)
forecast_mean <- fv[idx_future, "mean"]
forecast_lower <- fv[idx_future, "0.025quant"]
forecast_upper <- fv[idx_future, "0.975quant"]
forecast_mean
## [1] -0.118128813 -0.599136531 0.192065216 -0.239620075 0.533513997
## [6] -0.772067548 -0.794028336 1.470521187 -0.561622497 -0.115988297
## [11] -0.129828769 -0.490468189 -0.082261139 -0.128040825 -0.250757129
## [16] -0.180442130 -0.149903875 0.803941355 -0.576201006 -1.115070743
## [21] -0.448251343 -0.085367954 0.077527569 -0.866782356 -1.091267565
## [26] -0.001055076
Plotting the Forecast
df_hist <- data.frame(
time = 1:T,
y = as.numeric(ts_web)
)
df_fc <- data.frame(
time = time_future,
mean = forecast_mean,
lower = forecast_lower,
upper = forecast_upper
)
ggplot() +
geom_line(data = df_hist, aes(x = time, y = y)) +
geom_line(data = df_fc, aes(x = time, y = mean), linetype = "dashed") +
geom_ribbon(
data = df_fc,
aes(x = time, ymin = lower, ymax = upper),
alpha = 0.2
) +
labs(
title = "Forecast 26 Minggu (Local Level + Musiman, INLA)",
x = "Minggu",
y = "Kunjungan"
)
Interpretation:
Relationship to the bayesforecast Models
stan_LocalLevel()
Model:
It covers only the level component:
\[ y_t = \mu_t + \varepsilon_t, \qquad \mu_t = \mu_{t-1} + \eta_t. \]
INLA version, as an rw1 random effect:
formula_ll_inla <- y ~ 1 + f(time_level, model = "rw1")
This is the direct analogue of stan_LocalLevel().
stan_ssm(trend = FALSE, seasonal = TRUE)
Model:
INLA version, as a seasonal random effect:
formula_season_inla <- y ~ 1 + f(time_season, model = "seasonal", season.length = s)
Combining Local Level + Seasonal
The bayesforecast code that combines the Local Level and Seasonal components (through two functions) is conceptually equivalent to the INLA model we used:
formula_inla <- y ~ 1 +
f(time_level, model = "rw1") +
f(time_season, model = "seasonal", season.length = s)
A Note on the Seasonal SARIMA
For the model:
fit_sarima_week <- stan_sarima(
ts = ts_web,
order = c(0, 0, 0),
seasonal = c(1, 0, 1),
period = 52,
chains = 2,
iter = 2000
)
Summary
In this document we have:
built the level component via f(time_level, model = "rw1"),
built the seasonal component via f(time_season, model = "seasonal", season.length = s),
and related both to stan_LocalLevel(), stan_ssm(seasonal = TRUE), and stan_sarima() in bayesforecast.
This template can be used directly as e-book / workshop material for comparing Bayesian time series with Stan vs INLA in the same setting.
Simulate 200 observations (say, monthly) for the variables C, I, U with a simple cross-lag structure as below:
where \(\varepsilon_{C,t}, \varepsilon_{I,t}, \varepsilon_{U,t}\) are Gaussian errors with different variances.
Estimate a BVAR(2) model for the three variables.
Estimate a model with INLA for C as the response to lag1(C), lag1(I), lag1(U).
Based on the results, discuss:
library(BVAR) # for the Bayesian VAR
library(INLA) # for the alternative Bayesian approach
library(ggplot2)
set.seed(123)
T <- 200 # panjang deret waktu
C <- numeric(T)
I <- numeric(T)
U <- numeric(T)
# initial values
C[1:2] <- c(100, 102)
I[1:2] <- c(50, 51)
U[1:2] <- c(6, 6.1)
# simulation loop following the equation structure above
for (t in 3:T) {
C[t] <- 0.6 * C[t-1] - 0.2 * I[t-1] + 0.1 * U[t-1] + rnorm(1, 0, 2)
I[t] <- 0.4 * I[t-1] + 0.3 * C[t-2] - 0.1 * U[t-1] + rnorm(1, 0, 1.5)
U[t] <- 0.5 * U[t-1] + 0.1 * C[t-1] + 0.05 * I[t-1] + rnorm(1, 0, 0.5)
}
data_mat <- cbind(C = C, I = I, U = U)
head(data_mat)
## C I U
## [1,] 100.000000 50.00000 6.000000
## [2,] 102.000000 51.00000 6.100000
## [3,] 50.489049 49.44473 16.579354
## [4,] 22.203435 48.91389 16.668351
## [5,] 6.127950 31.14784 12.656787
## [6,] -2.178444 19.69061 8.678488
df_ts <- data.frame(
t = 1:T,
C = C,
I = I,
U = U
)
ggplot(df_ts, aes(x = t)) +
geom_line(aes(y = C, colour = "C")) +
geom_line(aes(y = I, colour = "I")) +
geom_line(aes(y = U, colour = "U")) +
labs(
title = "Deret Waktu Simulasi: C, I, U",
x = "Waktu (t)",
y = "Nilai"
) +
scale_colour_manual(
name = "Variabel",
values = c("C" = "black", "I" = "blue", "U" = "red")
) +
theme_minimal()
bvar_fit <- bvar(
data = data_mat,
lags = 2, # VAR(2)
n_draw = 2000L, # number of MCMC draws after burn-in
n_burn = 500L, # burn-in
verbose = FALSE
)
summary(bvar_fit)
## Bayesian VAR consisting of 198 observations, 3 variables and 2 lags.
## Time spent calculating: 0.39 secs
## Hyperparameters: lambda
## Hyperparameter values after optimisation: 0.36948
## Iterations (burnt / thinning): 2000 (500 / 1)
## Accepted draws (rate): 1390 (0.927)
##
## Numeric array (dimensions 7, 3) of coefficient values from a BVAR.
## Median values:
## C I U
## constant 0.017 -0.049 0.036
## C-lag1 0.561 -0.027 0.104
## I-lag1 -0.247 0.413 0.047
## U-lag1 -0.136 -0.182 0.533
## C-lag2 0.056 0.334 0.002
## I-lag2 0.023 -0.043 0.009
## U-lag2 0.167 0.132 -0.058
##
## Numeric array (dimensions 3, 3) of variance-covariance values from a BVAR.
## Median values:
## C I U
## C 3.500 0.040 -0.073
## I 0.040 1.969 -0.026
## U -0.073 -0.026 0.264
##
## Log-Likelihood: -892.4051
The summary above displays:
bvar_pred <- predict(
object = bvar_fit,
horizon = 12L
)
# Plot the forecasts (with uncertainty bands); BVAR's plot() method uses
# base graphics, so ggplot2's ggtitle() cannot be chained onto it
plot(bvar_pred, area = TRUE)
Discuss the shape of the forecasts for C, I, and U, including the width of the prediction intervals.
# Compute impulse responses (IRFs) from the BVAR
bvar_irf <- irf(
x = bvar_fit, # or simply irf(bvar_fit, ...)
horizon = 20L,
identification = FALSE,
n_thin = 1L
)
# Plot the IRFs for a shock to U (base graphics again, without ggtitle)
plot(
bvar_irf,
vars_impulse = "U",
vars_response = c("C", "I", "U"),
area = TRUE
)
Interpretasi singkat:
Jawaban
1️⃣ Bagaimana respon C terhadap shock positif pada U?
Dari IRF:
Di beberapa periode awal, garis respon C berada di bawah nol → konsumsi turun ketika terjadi shock kenaikan pengangguran.
Setelah beberapa periode, respon pelan-pelan naik ke arah nol, lalu mendekati 0 dan stabil.
Ini konsisten dengan koefisien BVAR:
U-lag1 di persamaan C = −0.136 (negatif)
U-lag2 di persamaan C = 0.167` (positif kecil, efek koreksi)
Jadi:
Shock positif pada U (pengangguran naik) menurunkan C di awal, lalu efek negatifnya berangsur mengecil dan menghilang.
2️⃣ Apakah efeknya sementara atau persisten?
Dari grafik IRF semua panel:
Respon C, I, dan U kembali mendekati nol sekitar horizon 8–10.
Tidak ada respon yang “nempel” jauh dari nol dalam jangka panjang.
Artinya:
Efek shock U bersifat sementara (transitory), bukan permanen. Sistem VAR mean-reverting: setelah beberapa periode, perekonomian kembali ke path asalnya.
3️⃣ How do I and U themselves respond to a shock to U?
a) Response of I (investment) to a shock to U
From the IRF:
In the first 1–2 periods, the response of I is negative, so investment falls.
The response then gradually rises, approaching zero and stabilising.
This matches the coefficients in the I equation:
U-lag1 = −0.182 (fairly large and negative)
U-lag2 = 0.132 (small positive, a correction effect)
Rising unemployment depresses investment in the short run; the negative effect then fades.
b) Response of U to its own shock
In the third panel ("Shock U on U"):
In the first period the response is strongly positive: unemployment jumps sharply on impact.
The response then falls quickly, approaching zero and stabilising.
The BVAR coefficients in the U equation support this:
U-lag1 = 0.533 (large and positive: short-run persistence)
U-lag2 = −0.058 (small negative: a pull back towards the mean)
So a shock to U raises unemployment immediately, but because of mean reversion (the negative lag-2 term), unemployment gradually falls back to its normal level.
Summary of the answers for students
Response of C to a U shock: consumption C falls initially (negative response), then the effect shrinks back to zero, so the U shock depresses consumption only temporarily.
Temporary or persistent? The effect of the U shock is not permanent. All responses (C, I, U) return to roughly zero around horizons 8–10, so the shock is transitory / mean-reverting.
Responses of I and U to the U shock:
I: investment falls in the short run (negative response), then recovers gradually to its normal level.
U: unemployment itself jumps sharply right after the shock, then declines gradually and stabilises again; there is short-run persistence, but no permanent effect.
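The transitory, mean-reverting behaviour described above can also be checked algebraically. A minimal sketch using the posterior mean coefficients from coef(): build the VAR(2) companion matrix and inspect its eigenvalues; moduli below one imply that shocks die out.
# Companion-matrix check of stationarity / mean reversion
B  <- coef(bvar_fit, type = "mean")   # rows: constant, three lag-1 terms, three lag-2 terms
A1 <- t(B[2:4, ])                     # lag-1 coefficient matrix (3 x 3)
A2 <- t(B[5:7, ])                     # lag-2 coefficient matrix (3 x 3)
companion <- rbind(cbind(A1, A2),
                   cbind(diag(3), matrix(0, 3, 3)))
Mod(eigen(companion)$values)          # all moduli < 1 -> transitory shock responses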
We now focus on the lag1(U) → C coefficient, that is, the effect of U_{t-1} on C_t in the consumption equation.
coef_bvar_all <- coef(bvar_fit, type = "mean")
str(coef_bvar_all)
## 'bvar_coefs' num [1:7, 1:3] 0.0216 0.5597 -0.2477 -0.1386 0.0565 ...
## - attr(*, "dimnames")=List of 2
## ..$ : chr [1:7] "constant" "C-lag1" "I-lag1" "U-lag1" ...
## ..$ : chr [1:3] "C" "I" "U"
dimnames(coef_bvar_all)
## [[1]]
## [1] "constant" "C-lag1" "I-lag1" "U-lag1" "C-lag2" "I-lag2" "U-lag2"
##
## [[2]]
## [1] "C" "I" "U"
The dimnames output shows the row names (lags and the constant) and the column names (the C, I, U equations). Row names can differ slightly across package versions; for instance, the first-lag rows for C and U might be named "C.l1" and "U.l1" instead of "C-lag1" and "U-lag1". Using the names shown above, we can extract the coefficients:
# Adjust the row names to match the output of dimnames(coef_bvar_all)
coef_Clag1_to_C <- coef_bvar_all["C-lag1", "C"]   # C_{t-1} → C_t
coef_Ulag1_to_C <- coef_bvar_all["U-lag1", "C"]   # U_{t-1} → C_t
cat("BVAR: coefficient C_{t-1} → C_t =", coef_Clag1_to_C, "\n")
cat("BVAR: coefficient U_{t-1} → C_t =", coef_Ulag1_to_C, "\n")
Note: participants should adjust the row names to match the output produced on their own machines.
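Beyond posterior means, coef() can also return posterior quantiles, which show the uncertainty around each coefficient. A sketch, assuming the conf_bands argument of coef.bvar in the BVAR package (the dimension order of the returned array may vary across versions; check its dimnames):
# 5% / 50% / 95% posterior bands for the coefficients
coef_q <- coef(bvar_fit, type = "quantile", conf_bands = 0.05)
dimnames(coef_q)                # verify the ordering on your machine
coef_q[, "U-lag1", "C"]         # credible band for U_{t-1} → C_t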
We now turn to a different Bayesian regression approach: INLA.
The focus is on modelling C_t as a function of lag1(C), lag1(I), and lag1(U).
For the regression we build a data frame of lagged variables:
df_inla <- data.frame(
t = 3:T,
C = C[3:T],
lagC = C[2:(T-1)],
lagI = I[2:(T-1)],
lagU = U[2:(T-1)]
)
head(df_inla)
## t C lagC lagI lagU
## 1 3 50.489049 102.000000 51.000000 6.100000
## 2 4 22.203435 50.489049 49.444734 16.579354
## 3 5 6.127950 22.203435 48.913890 16.668351
## 4 6 -2.178444 6.127950 31.147844 12.656787
## 5 7 -3.575797 -2.178444 19.690612 8.678488
## 6 8 0.108588 -3.575797 9.012805 4.828009
formula_C <- C ~ 1 + lagC + lagI + lagU
inla_fit <- inla(
formula = formula_C,
data = df_inla,
family = "gaussian",
control.predictor = list(compute = TRUE)
)
summary(inla_fit)
## Time used:
## Pre = 0.969, Running = 0.243, Post = 0.00885, Total = 1.22
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) 0.030 0.138 -0.241 0.030 0.302 0.030 12.597
## lagC 0.593 0.027 0.540 0.593 0.646 0.593 462.202
## lagI -0.194 0.059 -0.309 -0.194 -0.079 -0.194 74.662
## lagU 0.076 0.143 -0.205 0.076 0.358 0.076 11.721
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 0.28 0.028 0.227 0.279
## 0.975quant mode
## Precision for the Gaussian observations 0.338 0.277
##
## Marginal log-Likelihood: -439.55
## is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
From the output, note the posterior mean and the 0.025quant–0.975quant interval for each coefficient, in particular for lagU (the effect of U_{t-1} on C_t).
coef_lagU_to_C_inla <- inla_fit$summary.fixed["lagU", "mean"]
coef_lagU_to_C_inla
## [1] 0.07637149
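INLA also returns the full posterior marginal of each fixed effect, not just the summaries. A minimal sketch plotting the marginal of lagU (marginals.fixed is a standard element of an inla fit: a list of (x, y) matrices):
# Posterior marginal density of the lagU coefficient
marg_lagU <- inla_fit$marginals.fixed$lagU
plot(marg_lagU, type = "l",
     xlab = "Coefficient of lagU", ylab = "Posterior density")
abline(v = 0, lty = 2)   # reference line at zero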
This coefficient (≈ 0.076) can be compared with the corresponding BVAR coefficient (U-lag1 in the C equation, ≈ −0.139). The two need not agree: the INLA model conditions on only one lag of each variable and uses different priors, whereas the BVAR includes two lags and Minnesota-style shrinkage.
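A small table makes the comparison explicit, using only objects computed above:
# Side-by-side comparison of the U_{t-1} → C_t estimates
comparison <- data.frame(
  model = c("BVAR (U-lag1, C equation)", "INLA (lagU)"),
  estimate = c(coef_Ulag1_to_C, coef_lagU_to_C_inla)
)
comparison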
Write a narrative answer for this part (not code):
Present your answer as a critical paragraph in a separate report or below this section.
Software requirements: bayesforecast and rstan, plus a C++ toolchain (HMC requires compilation).
Example datasets (air, birth, demgbp, or internal workshop data) are provided so that participants can experiment immediately.

References
Carpenter, B., et al. (2017). Stan: A probabilistic programming language. Journal of Statistical Software, 76(1).
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). Bayesian Data Analysis (3rd ed.). CRC Press.
Jaya, I. G. N. M., Handoko, B., Andriyana, Y., Chadidjah, A., Kristiani, F., & Antikasari, M. (2023). Multivariate Bayesian Semiparametric Regression Model for Forecasting and Mapping HIV and TB Risks in West Java, Indonesia. Mathematics, 11(17), 3641.
Jaya, I. G. N. M., & Folmer, H. (2020). Bayesian spatiotemporal mapping of relative dengue disease risk in Bandung, Indonesia. Journal of Geographical Systems, 22(1), 105–142.
Jaya, I. G. N. M., Andriyana, Y., Tantular, B., Sunengsih, N., & Zulhanif, Z. (2016). A Bayesian Spatio-Temporal for Forecasting of Infectious Diseases by Means CAR Bayes. Proceedings International Conference on Applied Statistics, 2(1), 22–26.
Kruschke, J. K. (2014). Doing Bayesian Data Analysis (2nd ed.). Academic Press.
Rue, H., Martino, S., & Chopin, N. (2009). Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(2), 319–392.