Input Data

Data yang digunakan adalah data Indeks Pembangunan Manusia (IPM) di Provinsi Sumatra Barat dari tahun 2010 - 2021. Menurut BPS, IPM adalah indikator komposit untuk mengukur capaian pembangunan kualitas hidup manusia. Indeks ini terbentuk dari rata-rata ukur capaian tiga dimensi utama pembangunan manusia, yaitu umur panjang dan hidup sehat, pengetahuan, dan standar hidup layak.

data <- read_excel("C:/R/tugas mpdw.xlsx", sheet = "Sheet1")
x <- data$Tahun
y <- data$HDI

head(data)

## # A tibble: 6 x 2
##   Tahun   HDI
##   <dbl> <dbl>
## 1  2010  67.2
## 2  2011  67.8
## 3  2012  68.4
## 4  2013  68.9
## 5  2014  69.4
## 6  2015  70.0

Eksplorasi Data

Terdapat korelasi positif yang sangat tinggi antara nilai IPM dengan tahun, terlihat bahwa IPM di Sumatra Barat terus meningkat dari tahun 2010 hingga 2021

ggplot(data, aes(x = Tahun, y = HDI)) +
  geom_line(col = "red") +
  geom_point() +
  geom_text(aes(label = paste(HDI, "%", sep = "")), vjust = -0.8, size = 2.5) +
  labs(title = "Indeks Pembangunan Manusia (IPM) Sumatra Barat",
       subtitle = "Tahun 2010-2021") +
  theme(plot.subtitle = element_text(hjust = 0.5)) +
  easy_center_title()

cor(x,y)

## [1] 0.9932977

Pembuatan Model Awal

model <- lm(y ~ x, data = data)
summary(model)

## 
## Call:
## lm(formula = y ~ x, data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.45231 -0.09554 -0.03215  0.20099  0.33126 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -981.4216    38.6982  -25.36 2.08e-10 ***
## x              0.5218     0.0192   27.18 1.05e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2296 on 10 degrees of freedom
## Multiple R-squared:  0.9866, Adjusted R-squared:  0.9853 
## F-statistic: 738.5 on 1 and 10 DF,  p-value: 1.053e-10

Pendeteksian Autokorelasi

Eksploratif

par(mfrow = c(2,2))
plot(model, 1, pch = 16, cex = 0.9, col = "steelblue")
plot(x = 1:dim(data)[1], y = model$residuals,
     type = "b", main = "Residuals vs Order", xlab = "Observations",
     ylab = "Residuals", pch = 16, cex = 0.9, col = "steelblue")
hist(model$residuals, col = "steelblue", breaks = 5,  ylab = "Residuals",
     main = "Histogram of Residuals")
plot(model, 2, pch = 16, cex = 0.9, col = "steelblue")

ACF dan PACF plot

res <- model$residuals

par(mfrow = c(2,1))
acf(res)
pacf(res)

Secara eksploratif autokorelasi dapat diidentifikasi melalui plot sisaan vs order dan plot ACF dan PACF. Melalui plot sisaan vs order dapat dilihat bahwa naik turunnya sisaan cenderung memiliki pola dan tidak acak sedangkan pada ACF dan PACF plot tidak ada garis yang melewati garis biru kecuali pada lag = 0, yang berarti tidak terdeteksi adanya autokorelasi pada plot ini

Uji Statistik

1. Durbin-Watson

dwtest(model, alternative = "two.sided")

## 
##  Durbin-Watson test
## 
## data:  model
## DW = 0.82777, p-value = 0.004736
## alternative hypothesis: true autocorrelation is not 0

2. Breusch-Godfrey

bgtest(y ~ x, data = data, order = 1)

## 
##  Breusch-Godfrey test for serial correlation of order up to 1
## 
## data:  y ~ x
## LM test = 3.7277, df = 1, p-value = 0.05352

3. Run’s Test

runs.test(resid(model), alternative = "two.sided")

## 
##  Runs Test - Two sided
## 
## data:  resid(model)
## Standardized Runs Statistic = -1.2111, p-value = 0.2259

Dari ketiga uji statistik, uji Durbin-Watson menghasilkan p-value < 0.05, yang menunjukkan adanya autokorelasi pada sisaan model

Penanganan Autokorelasi

Cochrane-Orcutt

modelco <- cochrane.orcutt(model, max.iter = 1000, convergence = 4)
modelco

## Cochrane-orcutt estimation for first order autocorrelation 
##  
## Call:
## lm(formula = y ~ x, data = data)
## 
##  number of interaction: 174
##  rho 0.881252
## 
## Durbin-Watson statistic 
## (original):    0.82777 , p-value: 2.368e-03
## (transformed): 1.86312 , p-value: 2.634e-01
##  
##  coefficients: 
## (Intercept)           x 
## -506.428827    0.286928

rho.co <- modelco$rho

y.trans.rho <- y[-1] - y[-12]*rho.co
x.trans.rho <- x[-1] -  x[-12]*rho.co

modelco.rho <- lm(y.trans.rho ~ x.trans.rho)
summary(modelco.rho)

## 
## Call:
## lm(formula = y.trans.rho ~ x.trans.rho)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.35490 -0.09616 -0.03495  0.08117  0.27080 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -60.1372    36.3162  -1.656   0.1321  
## x.trans.rho   0.2869     0.1511   1.898   0.0901 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1882 on 9 degrees of freedom
## Multiple R-squared:  0.2859, Adjusted R-squared:  0.2066 
## F-statistic: 3.604 on 1 and 9 DF,  p-value: 0.09012

Dari proses Cochrane-Orcutt didapat nilai rho optimum sebesar 0.881252, yang akan digunakan sebagai parameter tarnsformasi peubah untuk membuat model baru

dwtest(modelco.rho)

## 
##  Durbin-Watson test
## 
## data:  modelco.rho
## DW = 1.8631, p-value = 0.2634
## alternative hypothesis: true autocorrelation is greater than 0

Uji Durbin-Watson pada model dengan peubah yang sudah di-transformasi menghasilkan p-value > 0.05, yang berarti sudah tidak ada autokorelasi pada sisaan model.

Fungsi Hildreth-Lu

hildreth.lu.func <- function(r, model){
  x <- model.matrix(model)[,-1]
  y <- model.response(model.frame(model))
  n <- length(y)
  t <- 2:n
  y <- y[t] - r * y[t-1]
  x <- x[t] - r * x[t-1]
  
  return(lm(y~x))
}

Penentuan rho optimum

r <- c(seq(0.1, 0.8, by = 0.1), seq(0.90, 0.99, by = 0.01))
tab <- data.frame("rho" = r, "SSE" = sapply(r, function(i){deviance(hildreth.lu.func(i, model))}))

plot(tab$SSE ~ tab$rho, type = "l")
abline(v = tab[which.min(tab$SSE), "rho"], lty = 3)

tab[which.min(tab$SSE),]

##   rho       SSE
## 9 0.9 0.3188584

r <- c(seq(0.8, 0.99, by = 0.01))
tab <- data.frame("rho" = r, "SSE" = sapply(r, function(i){deviance(hildreth.lu.func(i, model))}))

plot(tab$SSE ~ tab$rho, type = "l")
abline(v = tab[which.min(tab$SSE), "rho"], lty = 3)

tab[which.min(tab$SSE),]

##    rho       SSE
## 11 0.9 0.3188584

Penentuan rho optimum pada metode Hildreth-Lu dilakukan dengan iterasi berulang dari nilai-nilai rho hingga didapatkan nilai error yang paling kecil, didapatkan nilai rho optimum sebesar 0.9

Modeling Hildreth-Lu

modelhl <- hildreth.lu.func(0.9, model)
summary(modelhl)

## 
## Call:
## lm(formula = y ~ x)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.35849 -0.09568 -0.03470  0.08090  0.26934 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -40.7580    36.3418  -1.122    0.291
## x             0.2383     0.1795   1.328    0.217
## 
## Residual standard error: 0.1882 on 9 degrees of freedom
## Multiple R-squared:  0.1638, Adjusted R-squared:  0.07087 
## F-statistic: 1.763 on 1 and 9 DF,  p-value: 0.217

dwtest(modelhl, alternative = "two.sided")

## 
##  Durbin-Watson test
## 
## data:  modelhl
## DW = 1.8946, p-value = 0.564
## alternative hypothesis: true autocorrelation is not 0

Uji Durbin-Watson pada model baru memiliki p-value > 0.05 yang berarti sudah tidak ada autokorelasi pada taraf nyata 5%

Kesimpulan

1. Transformasi Balik Cochrane-Orcutt

cat("y = ", coef(modelco.rho)[1]/(1-0.881252), "+", coef(modelco.rho)[2],"x", sep = " ")

## y =  -506.427 + 0.2869277 x

\[ Y = -506.427 + 0.2869277x \] Dengan menggunakan nilai rho melalui metode cochrane-orcutt didapatkan bahwa ada peningkatan dugaan rataan nilai IPM di Sumatra Barat sebesar 0.2869 % setiap tahunnya.

2. Transformasi Balik Hildreth-Lu

cat("y = ", coef(modelhl)[1]/(1-0.9), "+", coef(modelhl)[2],"x", sep = " ")

## y =  -407.5805 + 0.2382727 x

\[ Y = -407.5805 + 0.2382727x \]

Dengan nilai rho yang didapat dari iterasi Hildreth-Lu menunjukkan bahwa ada peningkatan nilai dugaan rataan IPM di Sumatra Barat sebesar 0.2382% setiap pertambahan satu tahun.

Perbandingan Nilai Akurasi

error.awal <- sum(model$residuals^2)
error.cochrane <- sum(modelco.rho$residuals^2)
error.hildreth <- sum(modelhl$residuals^2)

tabel <- rbind(error.awal, error.cochrane, error.hildreth)
colnames(tabel) <- "SSE"
rownames(tabel) <- c("Model Awal","Cochrane-Orcutt","Hildreth-Lu")

kable(tabel)

	SSE
Model Awal	0.5271703
Cochrane-Orcutt	0.3189063
Hildreth-Lu	0.3188584

Dari kedua metode menghasilkan error yang hampir sama karena nilai rho yang diperoleh dari kedua metode tersebut juga hampir sama dan keduanya lebih kecil dari model awal yang menandakan penanganan autokorelasi juga meningkatkan akurasi model.

Daftar Pustaka

Montgomery,D.C.,et.al.2008.Forecasting Time Series Analysis 2nd.John Wiley

Aprianto A, Debataraja NN, Nurfitri I.2020.Metode Cochrane-Orcutt untuk mengatasi autokorelasi pada estimasi parameter ordinary least square. Buletin ilmiah Mat, Stat,dan terapannya.9(01)

Pendeteksian dan Penanganan Autokorelasi pada Regresi Time-Series

Muhammad Irsyad Robbani

9/7/2022