| Contact | \(\downarrow\) |
| dhelaagatha@gmail.com | |
| https://www.instagram.com/dhelaagatha/ | |
| RPubs | https://rpubs.com/dhelaasafiani/ |
| Name | DHELA AGATHA |
| NIM | 20214920009 |
Import data
This dataset includes the following variables:
- Month: The month of observation.
- Advertising_Expense: The amount spent on advertising (in dollars).
- Product_Quality: The quality score of the product (on a scale from 1 to 100).
- Product_Price: The price of the product (in dollars).
- Sales_Promotion: The expenditure on sales promotion (in dollars).
- Online_Marketing: The expenditure on online marketing (in dollars).
- Offline_Marketing: The expenditure on offline marketing (in dollars).
- Product_Sales: The number of products sold.
Case 1
1.1 Regression Analysis
# Load necessary library
library(ggplot2)
# sales_data is assumed to already be in the workspace from the import step.
# It is not a packaged dataset, so data(sales_data) cannot load it.
# Fit a linear regression model using all variables except Month as predictors
reg_model_all <- lm(Product_Sales ~ . - Month, data = sales_data)
# Summary of the regression model
summary(reg_model_all)
##
## Call:
## lm(formula = Product_Sales ~ . - Month, data = sales_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.6031 -0.0697 -0.0051 0.0594 3.3179
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 297.58588 17.69853 16.814 < 2e-16 ***
## Advertising_Expense 0.46830 0.09830 4.764 1.31e-05 ***
## Product_Quality 0.12252 0.04597 2.665 0.009955 **
## Product_Price -0.82782 0.03965 -20.880 < 2e-16 ***
## Sales_Promotion 4.68575 0.51594 9.082 9.71e-13 ***
## Online_Marketing -0.85343 0.17905 -4.767 1.30e-05 ***
## Offline_Marketing -0.58501 0.15959 -3.666 0.000537 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7478 on 58 degrees of freedom
## Multiple R-squared: 0.9999, Adjusted R-squared: 0.9999
## F-statistic: 1.78e+05 on 6 and 58 DF, p-value: < 2.2e-16
# Display the coefficients of the model
coefficients(reg_model_all)
## (Intercept) Advertising_Expense Product_Quality Product_Price
## 297.5858825 0.4682976 0.1225180 -0.8278169
## Sales_Promotion Online_Marketing Offline_Marketing
## 4.6857461 -0.8534349 -0.5850136
# Plot residuals to check the assumptions of the regression model
par(mfrow = c(2, 2))
plot(reg_model_all)
# Reset the plotting layout
par(mfrow = c(1, 1))
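Model equation: \(Y = 297.58588 + 0.46830 X_1 + 0.12252 X_2 - 0.82782 X_3 + 4.68575 X_4 - 0.85343 X_5 - 0.58501 X_6\),
where \(X_1\) to \(X_6\) are Advertising_Expense, Product_Quality, Product_Price, Sales_Promotion, Online_Marketing, and Offline_Marketing, in that order.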
- Estimate: The value of the regression coefficient.
- (Intercept): The predicted value of Y when all independent variables are 0.
- X1 to X6: Coefficients giving the average change in Y for each 1-unit increase in that variable, holding the other variables constant.
- Std. Error: The standard error of the coefficient.
- t value: The t-statistic for testing the null hypothesis that the coefficient is 0.
- Pr(>|t|): The p-value indicating the statistical significance of the coefficient. A small p-value (for example, < 0.05) indicates that the coefficient is statistically significant.
- The asterisks (*) indicate the significance level:
  - *** highly significant (p < 0.001)
  - ** significant (p < 0.01)
  - * moderately significant (p < 0.05)
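As a quick check on how the columns of the coefficient table relate to each other, the t value is the estimate divided by its standard error, and Pr(>|t|) is the two-sided tail probability of a t distribution with the residual degrees of freedom. The sketch below redoes this by hand for Advertising_Expense using the numbers printed in the summary output; the variable names are illustrative only.

```r
# Values copied from the summary() output for Advertising_Expense
estimate  <- 0.46830
std_error <- 0.09830
df_resid  <- 58  # residual degrees of freedom reported by summary()

# t value = estimate / standard error
t_value <- estimate / std_error

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_value), df = df_resid)

round(t_value, 3)
signif(p_value, 3)
```

The results should agree with the corresponding row of the table up to rounding of the printed inputs.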
# Linearity check
plot(reg_model_all$fitted.values, reg_model_all$residuals,
xlab = "Fitted Values",
ylab = "Residuals",
main = "Residuals vs Fitted Values")
abline(h = 0, col = "red")
# Durbin-Watson test for independence of residuals
#install.packages("lmtest")
library(lmtest)
dwtest(reg_model_all)
##
## Durbin-Watson test
##
## data: reg_model_all
## DW = 2.2033, p-value = 0.6809
## alternative hypothesis: true autocorrelation is greater than 0
# Breusch-Pagan test for homoskedasticity
bptest(reg_model_all)
##
## studentized Breusch-Pagan test
##
## data: reg_model_all
## BP = 40.3, df = 6, p-value = 3.977e-07
# Shapiro-Wilk test for normality of residuals
shapiro.test(reg_model_all$residuals)
##
## Shapiro-Wilk normality test
##
## data: reg_model_all$residuals
## W = 0.67943, p-value = 1.242e-10
# Q-Q plot for normality of residuals
qqnorm(reg_model_all$residuals)
qqline(reg_model_all$residuals, col = "red")
# Compute VIF to check for multicollinearity
library(regclass)
VIF(reg_model_all)
## Advertising_Expense Product_Quality Product_Price Sales_Promotion
## 9994.0793 592.2988 1549.4610 43328.2612
## Online_Marketing Offline_Marketing
## 5182.5024 4096.6433
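Note: The VIF values are far above 10, so multicollinearity is severe. The
Durbin-Watson test (p-value = 0.6809) shows no evidence of positive
autocorrelation, but the Breusch-Pagan test (p-value = 3.977e-07) rejects
homoskedasticity and the Shapiro-Wilk test (p-value = 1.242e-10) rejects
normality of the residuals. The regression can still be fitted, but the
coefficients and their standard errors should be interpreted with caution.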
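To make the VIF figures easier to read, the following is a minimal sketch of what the statistic measures: for predictor j, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on all the other predictors. The data here are simulated and the names (x1, x2, x3, vif_by_hand) are hypothetical; this is not the sales data, and it is not the internals of regclass::VIF().

```r
set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.05)  # nearly a copy of x1
x3 <- rnorm(n)                  # unrelated to the other two
predictors <- data.frame(x1, x2, x3)

# Regress each predictor on the remaining ones and convert R^2 to a VIF
vif_by_hand <- sapply(names(predictors), function(v) {
  others <- setdiff(names(predictors), v)
  fit <- lm(reformulate(others, response = v), data = predictors)
  1 / (1 - summary(fit)$r.squared)
})

round(vif_by_hand, 2)
```

In this simulation x1 and x2 get very large VIFs because each is almost fully explained by the other, while x3 stays close to 1.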
1.2 Time series analysis
# Load necessary library
library(forecast)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
# Convert Product_Sales to time series object
sales_ts <- ts(sales_data$Product_Sales, start = c(2019, 1), frequency = 12)
# Plot the time series data
plot(sales_ts, main = "Product Sales Over Time", ylab = "Product Sales", xlab = "Time")
# Fit an ARIMA model
arima_model <- auto.arima(sales_ts)
summary(arima_model)
## Series: sales_ts
## ARIMA(2,1,2) with drift
##
## Coefficients:
## ar1 ar2 ma1 ma2 drift
## -1.3317 -0.9433 1.8308 0.8406 5.2913
## s.e. 0.1659 0.0541 0.2005 0.2023 0.2992
##
## sigma^2 = 4.932: log likelihood = -144.16
## AIC=300.32 AICc=301.79 BIC=313.27
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.0108795 2.115745 1.02334 0.02892078 0.3669924 0.01663712
## ACF1
## Training set -0.3067911
# Forecast the next 12 months
forecasted_values <- forecast(arima_model, h = 12)
plot(forecasted_values)
# Optional: Diagnostic plots for the ARIMA model
checkresiduals(arima_model)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(2,1,2) with drift
## Q* = 35.166, df = 9, p-value = 5.57e-05
##
## Model df: 4. Total lags used: 13
Note
- Ljung-Box test: Tests whether the model residuals have significant
autocorrelation up to a given lag.
- Q*: The Ljung-Box test statistic.
- df: The degrees of freedom of the test.
- p-value: The p-value of the test. A very small p-value (for example,
< 0.05) indicates that the residuals have significant autocorrelation,
which means the model may not have captured all the patterns in the data.
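The behaviour of the Ljung-Box test can be illustrated with base R's Box.test() on simulated data. This is only a sketch under stated assumptions: the two series below are simulated, not the sales residuals, and the object names are hypothetical. The last line reproduces the degrees of freedom shown in the output above (total lags used minus model df).

```r
set.seed(42)
white_noise <- rnorm(200)                                             # no autocorrelation
ar_series   <- as.numeric(arima.sim(model = list(ar = 0.8), n = 200)) # strong autocorrelation

# Ljung-Box test on the first 10 lags of each series
lb_noise <- Box.test(white_noise, lag = 10, type = "Ljung-Box")
lb_ar    <- Box.test(ar_series,   lag = 10, type = "Ljung-Box")

lb_noise$p.value  # usually large for white noise, though not guaranteed
lb_ar$p.value     # very small: autocorrelation is detected

# Degrees of freedom reported by checkresiduals(): lags used minus model df
lb_df <- 13 - 4
lb_df
```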
Conclusion: From these results, an ARIMA(2,1,2) model with drift was
estimated, with coefficients for the AR, MA, and drift components. The
model appears to fit the data, but the Ljung-Box test (p-value = 5.57e-05)
shows significant autocorrelation in the residuals, which indicates
that the model could still be improved.
1.3 Forecasting Comparison
# Fit an ARIMA model
arima_model <- auto.arima(sales_ts)
arima_forecast <- forecast(arima_model, h = 12)
# Fit an ETS model
ets_model <- ets(sales_ts)
ets_forecast <- forecast(ets_model, h = 12)
plot(ets_forecast)
# Plot the forecasts
plot(arima_forecast, main = "Comparison of ARIMA and ETS Forecasts", col = "blue")
lines(ets_forecast$mean, col = "red")
legend("topleft", legend = c("ARIMA", "ETS"), col = c("blue", "red"), lty = 1)
# Calculate accuracy metrics
arima_accuracy <- accuracy(arima_forecast)
ets_accuracy <- accuracy(ets_forecast)
# Print the accuracy metrics
print("ARIMA Model Accuracy:")
## [1] "ARIMA Model Accuracy:"
print(arima_accuracy)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.0108795 2.115745 1.02334 0.02892078 0.3669924 0.01663712
## ACF1
## Training set -0.3067911
print("ETS Model Accuracy:")
## [1] "ETS Model Accuracy:"
print(ets_accuracy)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.03208506 2.832047 1.295938 0.02843245 0.5086829 0.02106893
## ACF1
## Training set -0.02732228
Note
- Data preparation: The sales_data dataset is created with the tibble package.
- Conversion to time series: The Product_Sales column is converted to a monthly time series object starting in January 2019.
- Model fitting: ARIMA and ETS models are fitted to the time series with auto.arima() and ets() from the forecast package.
- Forecasting: Forecasts for the next 12 months are produced with both models.
- Forecast plot: The forecasts from both models are plotted together for comparison.
- Accuracy metrics: Forecast accuracy is computed and printed with accuracy().
Conclusion: The ARIMA model gives the better result on the training-set
error measures, with lower RMSE (2.116 vs 2.832), MAE (1.023 vs 1.296),
and MAPE (0.367 vs 0.509) than the ETS model.
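For reference, the three error measures used in the comparison can be written out in a few lines of base R. This is a minimal sketch with hypothetical numbers, not the sales data; the helper names rmse, mae, and mape are illustrative and are not functions from the forecast package.

```r
# Error measures written out explicitly
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
mae  <- function(actual, predicted) mean(abs(actual - predicted))
mape <- function(actual, predicted) mean(abs((actual - predicted) / actual)) * 100

# Hypothetical observed and predicted values
actual    <- c(100, 200, 400)
predicted <- c(110, 190, 400)

rmse(actual, predicted)
mae(actual, predicted)
mape(actual, predicted)
```

Lower values are better for all three. Because accuracy() was applied to the fitted models here, the reported values are in-sample measures.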
Case 2
2.1 Data Collection
Obtain the extended dataset containing monthly observations of economic indicators from January 2010 to December 2021.
library(readxl)
consumerconfidence <- read_excel("CONSUMER_CONFIDENCE_FIX.xlsx")
exchange <- read_excel("EXCHANGE_RATE_FIX.xlsx")
gdp <- read_excel("GDP_FIX.xlsx")
saham <- read_excel("INDEX_SAHAM_FIX.xlsx")
inflasi <- read_excel("INFLATION_FIX.xlsx")
interest <- read_excel("INTEREST_FIX.xlsx")
unemployement <- read_excel("UNEMPLOYEMENT_FIX.xlsx")
library(dplyr)
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:randomForest':
##
## combine
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(stringr)
# Replace decimal commas with points and convert to numeric in every
# column whose name contains "var"; assign the result so it is kept
unemployement <- unemployement %>%
  mutate(across(contains("var"),
                ~ as.numeric(str_replace(.x, pattern = ",", replacement = "."))))
soal2 <- tibble::tibble(consumerconfidence,
exchange = as.numeric(exchange$exchange_rate),
gdp = as.numeric(gdp$GDP),
saham = as.numeric(saham$index_saham),
inflasi = as.numeric(inflasi$Inflation_Rate),
interest = as.numeric(interest$Interest_Rate),
unemployement = as.numeric(unemployement$Unemployement_Rate))
soal2
2.2 Data Exploration
- Load the dataset into your preferred statistical software (e.g., R,
Python, or Excel).
- Examine the structure of the dataset, including variable names and
data types.
- Explore summary statistics and visualize the distributions of each
variable.
summary(soal2)
## Tahun Bulan consumer_confidence exchange
## Min. :2010 Length:144 Min. : 77.3 Min. : 918
## 1st Qu.:2013 Class :character 1st Qu.:106.8 1st Qu.: 9669
## Median :2016 Mode :character Median :114.3 Median :13310
## Mean :2016 Mean :111.9 Mean :12215
## 3rd Qu.:2018 3rd Qu.:120.2 3rd Qu.:14103
## Max. :2021 Max. :128.2 Max. :16367
## gdp saham inflasi interest
## Min. :0.1700 Min. :2549 Min. :-0.610 Min. :3.500
## 1st Qu.:0.4075 1st Qu.:4272 1st Qu.: 0.890 1st Qu.:4.750
## Median :0.4150 Median :5078 Median : 1.710 Median :5.750
## Mean :0.4125 Mean :5008 Mean : 2.112 Mean :5.788
## 3rd Qu.:0.4700 3rd Qu.:5949 3rd Qu.: 2.663 3rd Qu.:6.750
## Max. :0.5300 Max. :6606 Max. : 8.380 Max. :7.750
## unemployement
## Min. :0.8233
## 1st Qu.:0.9096
## Median :0.9850
## Mean :1.0033
## 3rd Qu.:1.0667
## Max. :1.2467
The plots below show each series over time.
#install.packages("ggplot2") # if not already installed
library(ggplot2)
library(tidyr)
CC.ts <- ts(soal2$consumer_confidence, start = c(2010, 1), frequency = 12)
EX.ts <- ts(soal2$exchange, start = c(2010, 1), frequency = 12)
gdp.ts <- ts(soal2$gdp, start = c(2010, 1), frequency = 12)
saham.ts <- ts(soal2$saham, start = c(2010, 1), frequency = 12)
inflasi.ts <- ts(soal2$inflasi, start = c(2010, 1), frequency = 12)
interest.ts <- ts(soal2$interest, start = c(2010, 1), frequency = 12)
unemp.ts <- ts(soal2$unemployement, start = c(2010, 1), frequency = 12)
plot(CC.ts, main = "Consumer Confidence")
plot(EX.ts, main = "Exchange Rate")
plot(gdp.ts, main = "GDP growth")
plot(saham.ts, main = "Stock Index")
plot(inflasi.ts, main = "Inflation Rate")
plot(interest.ts, main = "Interest Rate")
plot(unemp.ts, main = "Unemployment Rate")
2.3 Correlation Analysis
1. Calculate the correlation matrix to examine the relationships between pairs of economic indicators.
#install.packages("ggplot2")
#install.packages("ggcorrplot")
#install.packages("corrplot")
library(ggplot2)
library(ggcorrplot)
library(corrplot)
## corrplot 0.92 loaded
# Select specific variables (columns) for the correlation matrix
selected_variables <- soal2[, c("consumer_confidence", "exchange", "gdp", "saham","inflasi","interest","unemployement")]
correlation_matrix <- cor(selected_variables)
print(correlation_matrix)
## consumer_confidence exchange gdp saham
## consumer_confidence 1.00000000 0.02842479 0.4045857 0.3017572
## exchange 0.02842479 1.00000000 -0.7202913 0.7989091
## gdp 0.40458566 -0.72029134 1.0000000 -0.5406636
## saham 0.30175721 0.79890906 -0.5406636 1.0000000
## inflasi 0.11237064 -0.25839419 0.3422833 -0.2534055
## interest 0.21135648 -0.42660619 0.5377455 -0.5449807
## unemployement -0.44952057 -0.64612561 0.2789882 -0.6656832
## inflasi interest unemployement
## consumer_confidence 0.1123706 0.2113565 -0.4495206
## exchange -0.2583942 -0.4266062 -0.6461256
## gdp 0.3422833 0.5377455 0.2789882
## saham -0.2534055 -0.5449807 -0.6656832
## inflasi 1.0000000 0.3067269 0.1521063
## interest 0.3067269 1.0000000 0.1606966
## unemployement 0.1521063 0.1606966 1.0000000
2. Visualize the correlations using a heatmap or correlation plot.
# Visualize the correlation matrix using a heatmap (ggcorrplot)
ggcorrplot(correlation_matrix,
method = "circle",
type = "lower",
lab = TRUE,
title = "Correlation Matrix of Economic Indicators",
lab_size = 3,
colors = c("red", "white", "green"))
# Visualize the correlation matrix using corrplot
corrplot(correlation_matrix, method = "color", type = "lower",
addCoef.col = "black", # Add correlation coefficients
tl.col = "black", tl.srt = 45, # Text label color and rotation
diag = FALSE) # Hide the diagonal
Conclusion: The heatmap above shows that the exchange rate and the stock
index have a fairly strong positive correlation (about 0.80), while
consumer confidence and the exchange rate are only weakly correlated
(about 0.03).
2.4 Time Series Analysis
1. Conduct time series analysis for each economic indicator, including: Checking for stationarity using appropriate tests (e.g., Augmented Dickey-Fuller test).
library(tseries)
adf_test <- function(series) {
test_result <- adf.test(series)
return(test_result$p.value)
}
s <- soal2[, c("Tahun","consumer_confidence", "exchange", "gdp", "saham","inflasi","interest","unemployement")]
adf_results <- sapply(s[, -1], adf_test)
## Warning in adf.test(series): p-value smaller than printed p-value
print(adf_results)
## consumer_confidence exchange gdp saham
## 0.4334868 0.8701992 0.2420838 0.2506362
## inflasi interest unemployement
## 0.0100000 0.3952948 0.5430416
- Consumer Confidence (0.4334868): The p-value is greater than 0.05, so
there is not enough evidence to reject the null hypothesis. This suggests
that the consumer confidence series may be non-stationary.
- Exchange (0.8701992): The p-value is greater than 0.05, so there is not
enough evidence to reject the null hypothesis. This suggests that the
exchange rate series may be non-stationary.
- GDP (0.2420838): The p-value is greater than 0.05, so there is not enough
evidence to reject the null hypothesis. This suggests that the GDP series
may be non-stationary.
- Saham (0.2506362): The p-value is greater than 0.05, so there is not
enough evidence to reject the null hypothesis. This suggests that the
stock index series may be non-stationary.
- Inflasi (0.0100000): The p-value is less than 0.05 (the warning shows it
is in fact smaller than the printed 0.01), so the null hypothesis can be
rejected. This suggests that the inflation series may be stationary.
- Interest (0.3952948): The p-value is greater than 0.05, so there is not
enough evidence to reject the null hypothesis. This suggests that the
interest rate series may be non-stationary.
- Unemployment (0.5430416): The p-value is greater than 0.05, so there is not enough evidence to reject the null hypothesis. This suggests that the unemployment rate series may be non-stationary.
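The decision rule applied in the list above can be sketched in a few lines. The p-values are copied from the adf_results output; the 0.05 threshold and the object names (adf_p, alpha, decision) are assumptions made for illustration.

```r
# ADF p-values copied from the printed adf_results
adf_p <- c(consumer_confidence = 0.4334868,
           exchange            = 0.8701992,
           gdp                 = 0.2420838,
           saham               = 0.2506362,
           inflasi             = 0.0100000,
           interest            = 0.3952948,
           unemployement       = 0.5430416)

alpha <- 0.05  # assumed significance level

# Reject the unit-root null hypothesis only when p < alpha
decision <- ifelse(adf_p < alpha, "stationary", "non-stationary")

data.frame(p_value = adf_p, decision = decision)
```

Under this rule only inflasi is classified as stationary; the other six series are candidates for differencing.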
2. Identifying trends, seasonality, and autocorrelation patterns.
To identify trends, seasonality, and autocorrelation patterns in time series data, we can use several visual and statistical techniques in R, as in the following steps:
df_long <- s %>%
pivot_longer(cols = -Tahun, names_to = "indicator", values_to = "value")
ggplot(df_long, aes(x = Tahun, y = value, color = indicator)) +
geom_line() +
facet_wrap(~ indicator, scales = "free_y") +
labs(title = "Time Series of Economic Indicators",
x = "Date", y = "Value") +
theme_minimal()
decomposed_GDP <- stl(ts(soal2$gdp, frequency = 12), s.window = "periodic")
plot(decomposed_GDP)
decomposed_Inflation <- stl(ts(soal2$inflasi, frequency = 12), s.window = "periodic")
plot(decomposed_Inflation)
decomposed_Unemployment <- stl(ts(soal2$unemployement, frequency = 12), s.window = "periodic")
plot(decomposed_Unemployment)
decomposed_InterestRate <- stl(ts(soal2$interest, frequency = 12), s.window = "periodic")
plot(decomposed_InterestRate)
decomposed_StockMarketIndex <- stl(ts(soal2$saham, frequency = 12), s.window = "periodic")
plot(decomposed_StockMarketIndex)
Inspect the ACF and PACF to judge whether each series is stationary and whether it is seasonal.
acf(soal2$gdp, main = "ACF of GDP")
pacf(soal2$gdp, main = "PACF of GDP")
acf(soal2$inflasi, main = "ACF of Inflation")
pacf(soal2$inflasi, main = "PACF of Inflation")
acf(soal2$unemployement, main = "ACF of Unemployment")
pacf(soal2$unemployement, main = "PACF of Unemployment")
acf(soal2$interest, main = "ACF of Interest Rate")
pacf(soal2$interest, main = "PACF of Interest Rate")
acf(soal2$saham, main = "ACF of Stock Market Index")
pacf(soal2$saham, main = "PACF of Stock Market Index")
Conclusion: All of the ACF plots above tail off (decay slowly toward
zero), which indicates that the data are non-stationary. The PACF plots
show no sign of seasonality. The next step is therefore to difference
the data.
Explanation
- Time Series Plot: The ggplot2 package is used to plot the time series data for visual inspection. Faceting is applied to visualize each economic indicator separately.
- Decomposition: The stl function is used to decompose the time series into trend, seasonal, and residual components. This helps to identify underlying patterns.
- Autocorrelation and Partial Autocorrelation: The acf and pacf functions are used to plot the autocorrelation and partial autocorrelation functions, which help to identify the autocorrelation patterns and lag effects in the time series.
These steps will help you identify trends, seasonality, and autocorrelation patterns in your time series data. Adjust the variable names and dataframe as needed for your specific data.
3. Applying transformations if necessary to achieve stationarity (e.g., first differencing).
library(forecast)
library(tseries)
library(dplyr)
# The economic data is stored in the data frame `s`
diff <- s %>%
mutate(diff_consumer_confidence = c(NA, diff(consumer_confidence)),
diff_exchange = c(NA, diff(exchange)),
diff_gdp = c(NA, diff(gdp)),
diff_saham= c(NA, diff(saham)),
diff_inflasi = c(NA, diff(inflasi)),
diff_interest = c(NA, diff(interest)),
diff_unemployment = c(NA, diff(unemployement)))
print(diff)
## # A tibble: 144 × 15
## Tahun consumer_confidence exchange gdp saham inflasi interest unemployement
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2010 107. 9365 0.53 2611. 0.84 6.5 1.19
## 2 2010 102. 9335 0.53 2549. 1.14 6.5 1.24
## 3 2010 102. 9115 0.53 2777. 0.99 6.5 1.24
## 4 2010 106. 9012 0.53 2971. 1.15 6.5 1.24
## 5 2010 106. 918 0.53 2797. 1.44 6.5 1.24
## 6 2010 105. 9083 0.53 2914. 2.42 6.5 1.24
## 7 2010 99 8952 0.53 3069. 4.02 6.5 1.24
## 8 2010 98.1 9041 0.53 3082. 4.82 6.5 1.19
## 9 2010 102. 8924 0.53 3501. 5.28 6.5 1.19
## 10 2010 106. 8928 0.53 3635. 5.35 6.5 1.19
## # ℹ 134 more rows
## # ℹ 7 more variables: diff_consumer_confidence <dbl>, diff_exchange <dbl>,
## # diff_gdp <dbl>, diff_saham <dbl>, diff_inflasi <dbl>, diff_interest <dbl>,
## # diff_unemployment <dbl>
diff.cc <- ts(diff$diff_consumer_confidence, start = c(2010, 1), frequency = 12)
diff.Ex <- ts(diff$diff_exchange, start = c(2010, 1), frequency = 12)
diff.gdp <- ts(diff$diff_gdp, start = c(2010, 1), frequency = 12)
diff.saham <- ts(diff$diff_saham, start = c(2010, 1), frequency = 12)
diff.inflasi <- ts(diff$diff_inflasi, start = c(2010, 1), frequency = 12)
diff.interest <- ts(diff$diff_interest, start = c(2010, 1), frequency = 12)
diff.unemp <- ts(diff$diff_unemployment, start = c(2010, 1), frequency = 12)
diff <- diff %>%
mutate(diff_consumer_confidence = diff.cc,
diff_exchange = diff.Ex,
diff_gdp =diff.gdp,
diff_saham= diff.saham ,
diff_inflasi = diff.inflasi ,
diff_interest =diff.interest,
diff_unemployment = diff.unemp )
print(diff)
## # A tibble: 144 × 15
## Tahun consumer_confidence exchange gdp saham inflasi interest unemployement
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2010 107. 9365 0.53 2611. 0.84 6.5 1.19
## 2 2010 102. 9335 0.53 2549. 1.14 6.5 1.24
## 3 2010 102. 9115 0.53 2777. 0.99 6.5 1.24
## 4 2010 106. 9012 0.53 2971. 1.15 6.5 1.24
## 5 2010 106. 918 0.53 2797. 1.44 6.5 1.24
## 6 2010 105. 9083 0.53 2914. 2.42 6.5 1.24
## 7 2010 99 8952 0.53 3069. 4.02 6.5 1.24
## 8 2010 98.1 9041 0.53 3082. 4.82 6.5 1.19
## 9 2010 102. 8924 0.53 3501. 5.28 6.5 1.19
## 10 2010 106. 8928 0.53 3635. 5.35 6.5 1.19
## # ℹ 134 more rows
## # ℹ 7 more variables: diff_consumer_confidence <dbl>, diff_exchange <dbl>,
## # diff_gdp <dbl>, diff_saham <dbl>, diff_inflasi <dbl>, diff_interest <dbl>,
## # diff_unemployment <dbl>
Note: A series that is non-stationary in the mean must be differenced
first to make it stationary. The number of differences applied gives the
value of d for the ARIMA model. Differencing is applied to the series
identified as non-stationary in the previous step.
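The effect of first differencing can be seen on a simulated random walk, which is non-stationary by construction. This is a minimal sketch on simulated data, not the economic indicators, and the object names are hypothetical. The lag-1 autocorrelation is close to 1 for the level series and close to 0 after one difference, which corresponds to d = 1.

```r
set.seed(123)
random_walk <- cumsum(rnorm(200))  # non-stationary in the mean
differenced <- diff(random_walk)   # first difference, one observation shorter

# acf() stores lag 0 in position 1, so position 2 is the lag-1 autocorrelation
acf_level <- acf(random_walk, plot = FALSE)$acf[2]
acf_diff  <- acf(differenced, plot = FALSE)$acf[2]

acf_level
acf_diff
length(differenced)
```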
4. Building time series models (e.g., ARIMA, GARCH) to forecast future values of each indicator.
#install.packages("forecast")
#install.packages("tseries")
#install.packages("ggplot2")
#install.packages("rugarch")
library(forecast)
library(tseries)
library(ggplot2)
library(rugarch)
check_stationarity <- function(series) {
  adf_result <- adf.test(series)
  cat("ADF Statistic:", adf_result$statistic, "\n")
  cat("p-value:", adf_result$p.value, "\n")
  # adf.test() does not return critical values, so decide from the p-value
  cat("Result:", ifelse(adf_result$p.value < 0.05, "Stationary", "Non-Stationary"), "\n")
}
# Example: run the stationarity test for each variable
variables <- colnames(soal2)[-c(1, 2)] # Excluding 'Tahun' and 'Bulan'
for (var in variables) {
  cat("Stationarity Test for", var, "\n")
  check_stationarity(soal2[[var]])
  cat("\n")
}
## Stationarity Test for consumer_confidence
## ADF Statistic: -2.343315
## p-value: 0.4334868
## Result: Non-Stationary
##
## Stationarity Test for exchange
## ADF Statistic: -1.294441
## p-value: 0.8701992
## Result: Non-Stationary
##
## Stationarity Test for gdp
## ADF Statistic: -2.803017
## p-value: 0.2420838
## Result: Non-Stationary
##
## Stationarity Test for saham
## ADF Statistic: -2.782476
## p-value: 0.2506362
## Result: Non-Stationary
##
## Stationarity Test for inflasi
## Warning in adf.test(series): p-value smaller than printed p-value
## ADF Statistic: -6.205589
## p-value: 0.01
## Result: Stationary
##
## Stationarity Test for interest
## ADF Statistic: -2.435042
## p-value: 0.3952948
## Result: Non-Stationary
##
## Stationarity Test for unemployement
## ADF Statistic: -2.080191
## p-value: 0.5430416
## Result: Non-Stationary
identify_patterns <- function(values, name) {
  # Build a small data frame so ggplot has explicit x and y columns
  df <- data.frame(time = seq_along(values), value = values)
  ggplot(df, aes(x = time, y = value)) +
    geom_line() +
    ggtitle(paste("Time Series Plot:", name)) +
    xlab("Time") +
    ylab("Value")
}
# Example: plot each variable for visual inspection
for (var in variables) {
  cat("Patterns Identification for", var, "\n")
  # ggplot objects must be printed explicitly inside a loop
  print(identify_patterns(soal2[[var]], var))
  cat("\n\n")
}
## Patterns Identification for consumer_confidence
##
##
## Patterns Identification for exchange
##
##
## Patterns Identification for gdp
##
##
## Patterns Identification for saham
##
##
## Patterns Identification for inflasi
##
##
## Patterns Identification for interest
##
##
## Patterns Identification for unemployement
# Example: apply first differencing to reach stationarity.
# `var` still holds the last loop value ("unemployement").
soal2fix <- diff(soal2[[var]], lag = 1)
# Example: fit an ARIMA model and forecast future values.
# `var` still holds the last loop value, so this models unemployement.
model <- auto.arima(soal2[[var]])
forecast_result <- forecast(model)
# Print the forecast
print(forecast_result)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 145 1.077932 1.0072381 1.148626 0.9698149 1.186049
## 146 1.075849 0.9878727 1.163825 0.9413009 1.210397
## 147 1.074687 0.9771380 1.172236 0.9254987 1.223875
## 148 1.074039 0.9699059 1.178172 0.9147813 1.233296
## 149 1.073677 0.9643621 1.182993 0.9064941 1.240861
## 150 1.073476 0.9597167 1.187235 0.8994963 1.247455
## 151 1.073363 0.9555877 1.191139 0.8932411 1.253485
## 152 1.073300 0.9517783 1.194823 0.8874483 1.259153
## 153 1.073265 0.9481820 1.198349 0.8819667 1.264564
## 154 1.073246 0.9447392 1.201753 0.8767119 1.269780
plot(forecast_result)
# Build an ARIMA model for consumer confidence.
# auto.arima() chooses the differencing order itself, so the
# undifferenced series CC.ts is passed in directly.
cc_arima <- auto.arima(CC.ts)
# Forecast future values using the ARIMA model
cc_forecast <- forecast(cc_arima)
plot(cc_forecast)
# Building a GARCH model for volatility forecasting
#spec <- ugarchspec(variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
# mean.model = list(armaOrder = c(1, 0)))
#a<- ugarchfit(spec = spec, data = diff$consumer_confidence)
#show(a)
#b<- ugarchfit(spec = spec, data = diff$exchange)
#show(b)
#c<- ugarchfit(spec = spec, data = diff$gdp)
#show(c)
#d<- ugarchfit(spec = spec, data = diff$saham)
#show(d)
#e<- ugarchfit(spec = spec, data = diff$inflasi)
#show(e)
#f<- ugarchfit(spec = spec, data = diff$interest)
#show(f)
#g<- ugarchfit(spec = spec, data = diff$unemployement)
#show(g)
# Forecasting using GARCH model
#garch_forecast <- ugarchforecast(a, n.ahead = 12)
#plot(garch_forecast)
2.5 Regression Analysis
Formulate a regression model with one or more economic indicators as dependent variables and other indicators as independent variables.
mudel <- lm(Tahun ~ consumer_confidence + exchange + gdp + saham + inflasi + interest +unemployement, data = s)
summary(mudel)
##
## Call:
## lm(formula = Tahun ~ consumer_confidence + exchange + gdp + saham +
## inflasi + interest + unemployement, data = s)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.4195 -0.3723 -0.0084 0.3502 3.6302
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.013e+03 1.523e+00 1322.165 < 2e-16 ***
## consumer_confidence -1.887e-02 7.708e-03 -2.448 0.0156 *
## exchange 4.158e-04 5.135e-05 8.097 2.83e-13 ***
## gdp -9.037e+00 1.106e+00 -8.173 1.86e-13 ***
## saham 1.378e-03 1.227e-04 11.230 < 2e-16 ***
## inflasi -1.281e-01 3.134e-02 -4.086 7.45e-05 ***
## interest -4.533e-01 5.910e-02 -7.670 2.97e-12 ***
## unemployement -8.435e-01 6.864e-01 -1.229 0.2212
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6376 on 136 degrees of freedom
## Multiple R-squared: 0.9678, Adjusted R-squared: 0.9661
## F-statistic: 583.5 on 7 and 136 DF, p-value: < 2.2e-16
# Linearity check
plot(mudel$fitted.values, mudel$residuals,
xlab = "Fitted Values",
ylab = "Residuals",
main = "Residuals vs Fitted Values")
abline(h = 0, col = "red")
# Durbin-Watson test for independence of residuals
#install.packages("lmtest")
library(lmtest)
dwtest(mudel)
##
## Durbin-Watson test
##
## data: mudel
## DW = 0.78445, p-value = 2.164e-15
## alternative hypothesis: true autocorrelation is greater than 0
# Breusch-Pagan test for homoskedasticity
bptest(mudel)
##
## studentized Breusch-Pagan test
##
## data: mudel
## BP = 46.117, df = 7, p-value = 8.296e-08
# Shapiro-Wilk test for normality of residuals
shapiro.test(mudel$residuals)
##
## Shapiro-Wilk normality test
##
## data: mudel$residuals
## W = 0.92389, p-value = 6.019e-07
# Q-Q plot for normality of residuals
qqnorm(mudel$residuals)
qqline(mudel$residuals, col = "red")
# Compute VIF to check for multicollinearity
library(regclass)
VIF(mudel)
## consumer_confidence exchange gdp saham
## 2.813195 5.199290 3.797061 5.461254
## inflasi interest unemployement
## 1.167776 2.025616 2.580351
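Conclusion: The assumption tests above are not all satisfied. The VIF
values are all below 10, so multicollinearity is not a concern, but the
Durbin-Watson test (DW = 0.78445, p-value = 2.164e-15) indicates positive
autocorrelation, the Breusch-Pagan test (p-value = 8.296e-08) rejects
homoskedasticity, and the Shapiro-Wilk test (p-value = 6.019e-07) rejects
normality of the residuals. The linear regression model should therefore
be interpreted with caution.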
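The Durbin-Watson statistic reported above (DW = 0.78445) can be read more easily with its formula in view: it is the sum of squared successive differences of the residuals divided by the sum of squared residuals, so values near 2 indicate no first-order autocorrelation and values well below 2 indicate positive autocorrelation. The sketch below computes it by hand on simulated data with AR(1) errors; the data and names are hypothetical and this is not the lmtest implementation.

```r
set.seed(7)
n <- 200
x <- rnorm(n)
e <- as.numeric(arima.sim(model = list(ar = 0.8), n = n))  # autocorrelated errors
y <- 1 + 2 * x + e

fit <- lm(y ~ x)
res <- residuals(fit)

# Durbin-Watson statistic: sum of squared successive differences over sum of squares
dw <- sum(diff(res)^2) / sum(res^2)
dw
```

With strongly autocorrelated errors such as these, the statistic falls well below 2.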
2. Check for multicollinearity among independent variables.
library(regclass)
VIF(mudel)
## consumer_confidence exchange gdp saham
## 2.813195 5.199290 3.797061 5.461254
## inflasi interest unemployement
## 1.167776 2.025616 2.580351
3. Estimate the regression model using appropriate techniques (e.g., Ordinary Least Squares).
model <- lm(Tahun ~ consumer_confidence + exchange + gdp + saham + inflasi + interest +unemployement, data = s)
summary(model)
##
## Call:
## lm(formula = Tahun ~ consumer_confidence + exchange + gdp + saham +
## inflasi + interest + unemployement, data = s)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.4195 -0.3723 -0.0084 0.3502 3.6302
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.013e+03 1.523e+00 1322.165 < 2e-16 ***
## consumer_confidence -1.887e-02 7.708e-03 -2.448 0.0156 *
## exchange 4.158e-04 5.135e-05 8.097 2.83e-13 ***
## gdp -9.037e+00 1.106e+00 -8.173 1.86e-13 ***
## saham 1.378e-03 1.227e-04 11.230 < 2e-16 ***
## inflasi -1.281e-01 3.134e-02 -4.086 7.45e-05 ***
## interest -4.533e-01 5.910e-02 -7.670 2.97e-12 ***
## unemployement -8.435e-01 6.864e-01 -1.229 0.2212
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6376 on 136 degrees of freedom
## Multiple R-squared: 0.9678, Adjusted R-squared: 0.9661
## F-statistic: 583.5 on 7 and 136 DF, p-value: < 2.2e-16
4. Assess the significance and interpret the coefficients of the independent variables.
# Assess significance and interpret coefficients
summary(model)
##
## Call:
## lm(formula = Tahun ~ consumer_confidence + exchange + gdp + saham +
## inflasi + interest + unemployement, data = s)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.4195 -0.3723 -0.0084 0.3502 3.6302
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.013e+03 1.523e+00 1322.165 < 2e-16 ***
## consumer_confidence -1.887e-02 7.708e-03 -2.448 0.0156 *
## exchange 4.158e-04 5.135e-05 8.097 2.83e-13 ***
## gdp -9.037e+00 1.106e+00 -8.173 1.86e-13 ***
## saham 1.378e-03 1.227e-04 11.230 < 2e-16 ***
## inflasi -1.281e-01 3.134e-02 -4.086 7.45e-05 ***
## interest -4.533e-01 5.910e-02 -7.670 2.97e-12 ***
## unemployement -8.435e-01 6.864e-01 -1.229 0.2212
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6376 on 136 degrees of freedom
## Multiple R-squared: 0.9678, Adjusted R-squared: 0.9661
## F-statistic: 583.5 on 7 and 136 DF, p-value: < 2.2e-16
2.6 Model Evaluation
Evaluate the performance of time series and regression models using appropriate metrics (e.g., Mean Absolute Error for forecasting, R-squared for regression).
r_squared <- summary(model)$r.squared
r_squared
## [1] 0.9677776
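As a reminder of what this number measures, R-squared is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below checks that on R's built-in cars dataset (used only for illustration, not the economic data), then recomputes the adjusted R-squared of the model above from its printed values: R-squared 0.9677776, 144 observations, and 7 predictors.

```r
# R-squared by hand on a built-in dataset
fit <- lm(dist ~ speed, data = cars)
ss_res <- sum(residuals(fit)^2)
ss_tot <- sum((cars$dist - mean(cars$dist))^2)
r2_by_hand <- 1 - ss_res / ss_tot
all.equal(r2_by_hand, summary(fit)$r.squared)

# Adjusted R-squared from the values printed for the indicator model
r2 <- 0.9677776
n  <- 144  # monthly observations, January 2010 to December 2021
p  <- 7    # number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
round(adj_r2, 4)
```

The second result should agree with the Adjusted R-squared line in the summary output.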
Compare the accuracy and goodness-of-fit of different models.
2.7 Policy Implications
Discuss the implications of the analysis results for policymakers and stakeholders. Provide insights into how changes in economic indicators may impact each other and the overall economy. Offer recommendations for managing economic policy and promoting economic stability.
2.8 Documentation
Document your analysis process, including data preprocessing steps, model selection criteria, and key findings. Prepare a report summarizing the study case, including visualizations, statistical analyses, and interpretations.
2.9 Presentation
Present your findings and insights to relevant stakeholders, such as policymakers, economists, or business leaders. Use flexdashboard to visualize your analysis, such as charts, graphs, and tables, to effectively communicate your analysis results.