1 Evidence # 2

Get access to the Tec de Monterrey – Centro Internacional de Casos through mitec.itesm.mx and read the business case “Mexico and Its Attractiveness for Nearshoring”. Please use the required dataset and the business case’s background to prepare an R Markdown file (PDF or HTML) that addresses the following instructions:

1.1 Loading Libraries

library(ggplot2)
library(lubridate)
library(dplyr)
library(xts)
library(zoo)
library(tseries)
library(stats)
library(forecast)
library(astsa)
library(corrplot)
library(AER)
library(vars)
library(dynlm)
library(TSstudio)
library(Metrics)  # For RMSE calculation
library(tidyverse)
library(sarima)
library(readxl)
library(dygraphs)

1.2 Load data

# PANEL DATA

# Reading the dataset
sp_data <- read.csv("/Users/daviddrums180/Desktop/sp_data.csv")

# Getting a glimpse of the data
head(sp_data)

##   periodo IED_Flujos    IED_M Exportaciones Exportaciones_m Empleo Educacion
## 1    1997   12145.60 294151.2       9087.62        220090.8     NA      7.20
## 2    1998    8373.50 210875.6       9875.07        248690.6     NA      7.31
## 3    1999   13960.32 299734.4      10990.01        235960.5     NA      7.43
## 4    2000   18248.69 362631.8      12482.96        248057.2  97.83      7.56
## 5    2001   30057.18 546548.4      11300.44        205482.9  97.36      7.68
## 6    2002   24099.21 468332.0      11923.10        231707.6  97.66      7.80
##   Salario_Diario Innovacion Inseguridad_Robo Inseguridad_Homicidio
## 1          24.30      11.30           266.51                 14.55
## 2          31.91      11.37           314.78                 14.32
## 3          31.91      12.46           272.89                 12.64
## 4          35.12      13.15           216.98                 10.86
## 5          37.57      13.47           214.53                 10.25
## 6          39.74      12.80           197.80                  9.94
##   Tipo_de_Cambio Densidad_Carretera Densidad_Poblacion CO2_Emisiones
## 1           8.06               0.05              47.44          3.68
## 2           9.94               0.05              48.76          3.85
## 3           9.52               0.06              49.48          3.69
## 4           9.60               0.06              50.58          3.87
## 5           9.17               0.06              51.28          3.81
## 6          10.36               0.06              51.95          3.82
##   PIB_Per_Capita  INPC crisis_financiera
## 1       127570.1 33.28                 0
## 2       126738.8 39.47                 0
## 3       129164.7 44.34                 0
## 4       130874.9 48.31                 0
## 5       128083.4 50.43                 0
## 6       128205.9 53.31                 0

# TIME SERIES DATA

# Reading the dataset
sp_time <- read_excel("SP.xlsx")

# Getting a glimpse of the data
head(sp_time)

## # A tibble: 6 × 3
##     Año Trimestre IED_Flujos
##   <dbl> <chr>          <dbl>
## 1  1999 I              3596.
## 2  1999 II             3396.
## 3  1999 III            3028.
## 4  1999 IV             3940.
## 5  2000 I              4601.
## 6  2000 II             4857.

1.3 I. Introduction

Briefly describe what is time series analysis (1-2 paragraphs). Please cite at least 1 external reference to develop your explanation.

Time series analysis refers to the study of data points indexed or listed at successive points in time. Typically, a time series is used to track the movement of a particular metric over a specific period, whether it’s daily, monthly, annually, or at other regular intervals. This type of analysis is essential because the patterns observed in a historical dataset can often be used to forecast future data points or to identify underlying patterns or trends.

The primary goal is to extract meaningful statistics and characteristics from the data, such as trend, seasonality, and cycles. One key aspect of time series analysis is the emphasis on data stationarity. A stationary time series has properties that don’t change over time, making it easier to model and predict. When a series is not stationary, it may need to be transformed to become stationary before analysis.
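A minimal sketch of the stationarity idea, using simulated data rather than the case dataset (the objects `shocks` and `walk` and the seed are illustrative assumptions): a random walk has no fixed mean, but taking its first difference recovers the stationary shocks that generated it.

```r
# Simulated example: a random walk is non-stationary,
# but its first difference is the stationary shock series.
set.seed(123)                 # arbitrary seed, for reproducibility
shocks <- rnorm(200)          # white noise (stationary)
walk   <- cumsum(shocks)      # random walk (non-stationary)

walk_diff <- diff(walk)       # first difference of the random walk

# Differencing recovers the shocks (all but the first one)
all.equal(walk_diff, shocks[-1])
```

This is the same transformation that the `d` term of an ARIMA(p,d,q) model applies before fitting the ARMA part.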

Source: Arzamendia, O. (2019, May 22). Introducción al Análisis de series Cronológicas con python Y pandas. Medium. https://medium.com/datos-y-ciencia/introducci%C3%B3n-al-an%C3%A1lisis-de-series-cronol%C3%B3gicas-con-python-y-pandas-99fc8d4bb56d

1.4 II. Background

What are the latest trends of Nearshoring in Mexico? Please cite at least 1 external reference to develop your explanation.

Mexico is increasingly emerging as a pivotal nearshoring hub, particularly for companies seeking proximity to the U.S. market. Recent data from the National Institute of Statistics and Geography highlighted a 5.8% year-over-year rise in Mexican exports in May, totaling $52.9 billion, with a significant portion being manufactured goods, which includes a 31% increase in automotive shipments to the U.S. from the previous year. Furthermore, Mexico’s foreign direct investment surged by 48% in the first quarter, attesting to its growing attractiveness for companies looking to relocate or expand their operations closer to the U.S.

This nearshoring trend, often at the expense of distant locations like China, is not only being adopted by U.S. companies but also by some Chinese firms setting up in Mexico to circumvent American tariffs. Notably, investments in infrastructure, such as Ternium’s $2.2 billion steel slab mill and related roadway developments in Nuevo León, further support this trend. Additionally, cities with a strong manufacturing presence are witnessing soaring rental prices in their industrial parks, exemplified by a 21% increase in major Mexican cities in 2022, driven partly by big players like Tesla seeking the benefits of lower costs and geographical proximity to the U.S. market.

Source: DevSpark. (2014, December 22). Nearshoring: The case for South America, part II. Medium. https://medium.com/@devspark/nearshoring-the-case-for-south-america-part-ii-32c11a6c1e07

Alvim, L., & Averbuch, M. (2023a, June 28). Supply chain latest: US nearshoring proof grows as Mexico exports jump. Bloomberg.com. https://www.bloomberg.com/news/newsletters/2023-06-28/supply-chain-latest-us-nearshoring-proof-grows-as-mexico-exports-jump?utm_medium=cpc_search&utm_campaign=NB_ENG_DSAXX_DSAXXXXXXXXXX_EVG_XXXX_XXX_COUSA_EN_EN_X_BLOM_GO_SE_XXX_XXXXXXXXXX&gclid=EAIaIQobChMI-tfX5MSggQMV2gatBh0v4ACSEAAYAyAAEgKTEfD_BwE&gclsrc=aw.ds&embedded-checkout=true

1.5 III. Description of the Problem Situation

What is the problem situation? How to address the problem situation?

The problem at hand revolves around understanding the behavior and predicting future trends of the Foreign Direct Investment Flows variable in the context of Mexico’s attractiveness for nearshoring. Time series analysis offers a structured approach to deciphering this puzzle. Initially, with only the “ied” data, one can delve into its historical patterns to unearth inherent trends, seasonality, or cyclicity. Essential diagnostics like checking for stationarity, a cornerstone for many time series models, will dictate whether transformations are required to stabilize the data. Additionally, autocorrelation checks can provide insights into the data’s lagged relationships, giving a hint of how past values might influence the future.

But the real world is rarely that simple, so examining “ied” in isolation might not capture the complete picture. Vector Autoregression (VAR) models, which allow for the inclusion of multiple time-dependent variables, can come into play here. With VAR, we can introduce other pertinent variables that might interact with “ied”, reflecting a more interconnected understanding. This multivariate approach would enable businesses to see how changes in one variable might ripple across others, providing a holistic view of the intricate web of dependencies.
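For reference, a VAR of order $p$ for a vector $y_t$ of $K$ variables is usually written as follows (standard textbook notation, not taken from the business case):

```latex
y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \dots + A_p y_{t-p} + \varepsilon_t
```

Here $c$ is a $K \times 1$ vector of constants, each $A_i$ is a $K \times K$ coefficient matrix, and $\varepsilon_t$ is a vector of error terms. Every variable in the system is therefore regressed on the lags of all variables, including its own.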

However, while time series models can forecast based on historical patterns, they cannot inherently account for unforeseen external shocks, such as abrupt geopolitical events or sudden global crises (for example, COVID-19). This limitation underscores the importance of continually updating models with fresh data and perhaps coupling statistical forecasts with scenario planning. The latter can incorporate potential external events, ensuring a blend of data-driven predictions with qualitative assessments of the ever-evolving global landscape. By combining these techniques, stakeholders can be better positioned to understand and navigate the complexities of nearshoring trends in Mexico.

1.6 IV. Data and Methodology

Briefly describe the dataset’s selected variables.

The dataset comprises a blend of economic, societal, and environmental variables that offer a comprehensive snapshot of Mexico’s dynamics. Here’s a succinct description of the selected variables:

IED_Flujos: Captures the Foreign Direct Investment flows, reflecting investor confidence and Mexico’s attractiveness to foreign entities.

Exportaciones: Represents non-petroleum exports, signifying the country’s manufacturing and export prowess, inclusive of contributions from the Export Maquiladora Industry.

Empleo: Denotes the employment rate among the economically active populace, serving as a barometer for the job market’s vitality.

Educación: Indicates the average years of education the populace undergoes, providing insights into the workforce’s educational standard and potential productivity.

Salario_Diario: Reflects the daily minimum wage, a window into the basic living standard and labor cost in Mexico.

Innovación: Measures the country’s inventive spirit, represented by the rate of patents applied within its borders.

Inseguridad_Robo & Inseguridad_Homicidio: These variables gauge societal safety by capturing the rates of violent robberies and homicides, respectively. Their values can deeply influence both the local quality of life and external investor sentiment.

Tipo_de_Cambio: Signifies the exchange rate, with its stability or volatility often mirroring broader economic health and policy dynamics.

Densidad_Carretera & Densidad_Población: These metrics, denoting infrastructure density and population distribution respectively, provide insights into the country’s developmental trajectory and demographic layout.

CO2_Emisiones: Presents an environmental perspective, marking the per capita carbon emissions—a key consideration in today’s sustainability-focused global ecosystem.

PIB_Per_Cápita: An adjusted representation of individual economic output, reflecting both the overall economic health and the typical living standard in the country.

INPC: Acts as a sentinel for inflationary trends via the Consumer Price Index, influencing both internal purchasing power and external economic perceptions.

Collectively, these variables create a multidimensional mosaic of Mexico, crucial for understanding its nearshoring potential and broader socio-economic landscape.

Plot the variable IED_Flujos using a time series format:

# Combine the 'Año' and 'Trimestre' columns into a single 'Date' column
sp_time$Date <- as.Date(paste0(sp_time$Año, "-", ifelse(sp_time$Trimestre == "I", "01-01",
                                                ifelse(sp_time$Trimestre == "II", "04-01",
                                                ifelse(sp_time$Trimestre == "III", "07-01",
                                                       "10-01")))), format="%Y-%m-%d")

# Plotting the IED_Flujos variable in a time series format
ggplot(sp_time, aes(x = Date, y = IED_Flujos)) +
  geom_line(color = "blue", size = 1) + 
  labs(title = "Time Series of IED_Flujos", 
       x = "Date", 
       y = "IED_Flujos (in Million Dollars)") + 
  theme_minimal()
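As an alternative to the nested `ifelse()` calls above, the quarter labels can be mapped with a named lookup vector. The sketch below uses a small hypothetical data frame (`demo`, with `year` and `quarter` standing in for the year and quarter columns of `sp_time`) so that it runs on its own:

```r
# Hypothetical stand-in for the year / quarter columns of sp_time
demo <- data.frame(
  year    = c(1999, 1999, 1999, 1999, 2000),
  quarter = c("I", "II", "III", "IV", "I"),
  stringsAsFactors = FALSE
)

# Named lookup vector: Roman-numeral quarter -> first day of that quarter
quarter_start <- c(I = "01-01", II = "04-01", III = "07-01", IV = "10-01")

# Index the lookup vector by the quarter label and build the date
demo$Date <- as.Date(paste0(demo$year, "-", quarter_start[demo$quarter]))
demo
```

One design difference: with the lookup vector, an unrecognized quarter label yields a missing lookup value instead of being silently treated as the fourth quarter, which is what the final `ifelse()` branch does.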

1. Decompose the time series data into trend, seasonal, and random components. Briefly describe the decomposition time series plot. Do the time series data show a trend? Do the time series data show seasonality?

# Convert the data into a time series object
# Considering that the data is quarterly, frequency = 4
ts_IED <- ts(sp_time$IED_Flujos, start = c(1999, 1), frequency = 4)

# Decompose the time series
ts_decomposed <- decompose(ts_IED)

# Plot the decomposed time series
plot(ts_decomposed)
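To make the decomposition concrete, here is a minimal sketch on simulated quarterly data (the series `sim_ts`, its trend, and its seasonal pattern are invented for illustration and are not the IED_Flujos data). By default `decompose()` assumes an additive model, so the three components add back up to the observed series:

```r
# Simulated quarterly series: linear trend + fixed quarterly pattern + noise
set.seed(42)
n        <- 40
trend    <- seq(100, 139, length.out = n)
seasonal <- rep(c(5, -3, 8, -10), times = n / 4)
sim_ts   <- ts(trend + seasonal + rnorm(n, sd = 2),
               start = c(1999, 1), frequency = 4)

parts <- decompose(sim_ts)   # additive decomposition by default

# observed = trend + seasonal + random
# (the first two and last two quarters are NA because the trend
#  is estimated with a centered moving average)
rebuilt <- parts$trend + parts$seasonal + parts$random
all.equal(as.numeric(rebuilt[3:38]), as.numeric(sim_ts[3:38]))
```

If the seasonal swings grow with the level of the series, `decompose(sim_ts, type = "multiplicative")` may be the more appropriate choice.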

1. Detect the presence of stationarity

# Graphical Method
plot(ts_IED, main = "IED_Flujos Time Series", ylab = "Million Dollars")

# Conduct the ADF test
adf_result <- adf.test(ts_IED, alternative = "stationary")

# Print the result
print(adf_result)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  ts_IED
## Dickey-Fuller = -4.1994, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary

1. Detect the presence of serial autocorrelation

# Ljung-Box test
lb_result <- Box.test(ts_IED, type = "Ljung-Box")

# Print the result
print(lb_result)

## 
##  Box-Ljung test
## 
## data:  ts_IED
## X-squared = 0.028693, df = 1, p-value = 0.8655

Observations: The time series plot for IED_Flujos presents a mild upward trend, suggesting that values tend to increase over time. Noticeable seasonality is also evident, with peaks consistently appearing around the middle of each year. The Augmented Dickey-Fuller test gives a Dickey-Fuller statistic of -4.1994 with a p-value of 0.01, so the null hypothesis of a unit root is rejected at the 5% level and the series can be treated as stationary in mean. Visually, however, the variance of the series appears to fluctuate over time. Lastly, the Box-Ljung test, with an X-squared value of 0.028693 and a p-value of 0.8655, does not reject the null hypothesis of no autocorrelation at lag 1. Because the test was run with the default single lag, it says nothing about autocorrelation at higher (for example, seasonal) lags.
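A lag-1 Ljung-Box test can miss seasonal dependence, because a pattern that repeats every four quarters need not be correlated with the immediately preceding quarter. A minimal sketch on simulated data (not the IED_Flujos series; `sim_seasonal` and the seed are illustrative assumptions) shows how the conclusion can change once the test covers a full year of lags:

```r
set.seed(1)
n       <- 96                              # 24 years of quarterly data
t_index <- seq_len(n)

# Period-4 cycle plus noise
sim_seasonal <- ts(10 * sin(pi * t_index / 2) + rnorm(n), frequency = 4)

# Same test, two different lag choices
lb_lag1 <- Box.test(sim_seasonal, lag = 1, type = "Ljung-Box")
lb_lag4 <- Box.test(sim_seasonal, lag = 4, type = "Ljung-Box")

c(lag1 = lb_lag1$p.value, lag4 = lb_lag4$p.value)
```

In this simulated case the lag-1 test finds no autocorrelation while the lag-4 test rejects strongly. Running `Box.test(ts_IED, lag = 4, type = "Ljung-Box")` would be a comparable check for the real series.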

1.7 V. Time Series Regression Analysis

1.7.1 a. Time Series Model 1

Estimate 2 different time series regression models. You might want to consider ARMA (p,q) and / or ARIMA (p,d,q).
1. ARMA (p,q) Model:

# Fit ARMA model
arma_model <- Arima(ts_IED, order=c(1,0,1))
summary(arma_model)

## Series: ts_IED 
## ARIMA(1,0,1) with non-zero mean 
## 
## Coefficients:
##           ar1     ma1       mean
##       -0.3272  0.3942  7032.6478
## s.e.   0.4053  0.3839   423.1518
## 
## sigma^2 = 16087631:  log likelihood = -931.19
## AIC=1870.38   AICc=1870.82   BIC=1880.63
## 
## Training set error measures:
##                    ME     RMSE      MAE       MPE     MAPE     MASE        ACF1
## Training set 1.331509 3947.771 2780.021 -31.52498 51.81354 1.076896 -0.03902898

1. ARIMA (p,d,q) Model:

# Fit ARIMA model 
arima_model <- Arima(ts_IED, order=c(1,1,1))
summary(arima_model)

## Series: ts_IED 
## ARIMA(1,1,1) 
## 
## Coefficients:
##           ar1      ma1
##       -0.0520  -0.9399
## s.e.   0.1072   0.0323
## 
## sigma^2 = 15868763:  log likelihood = -922.46
## AIC=1850.91   AICc=1851.17   BIC=1858.57
## 
## Training set error measures:
##                    ME     RMSE      MAE       MPE    MAPE     MASE        ACF1
## Training set 623.1432 3920.824 2632.375 -19.31158 47.1274 1.019703 -0.05334378

Based on diagnostic tests, compare the 2 estimated time series regression models, and select the results that you consider might generate the best forecast.
1. Model 1 (ARMA) Diagnostic Tests:

# AIC for Model 1
cat("Model 1 AIC:", AIC(arma_model), "\n")

## Model 1 AIC: 1870.376

# Ljung-Box Test for Model 1
ljung_box_armamodel <- Box.test(residuals(arma_model), type="Ljung-Box", lag=log(length(residuals(arma_model))))
cat("Model 1 Ljung-Box p-value:", ljung_box_armamodel$p.value, "\n")

## Model 1 Ljung-Box p-value: 0.0006915839

# Dickey-Fuller Test for Stationarity for Model 1
df_test_armamodel<- adf.test(residuals(arma_model))
cat("Model 1 Stationarity p-value:", df_test_armamodel$p.value, "\n")

## Model 1 Stationarity p-value: 0.01

# Standardized Residuals for Model 1
plot(residuals(arma_model)/sd(residuals(arma_model)), main="Standardized Residuals for Model 1")

# ACF of Residuals for Model 1
acf(residuals(arma_model), main="ACF of Residuals for Model 1")

1. Model 2 (ARIMA) Diagnostic Tests:

# AIC for Model 2
cat("Model 2 AIC:", AIC(arima_model), "\n")

## Model 2 AIC: 1850.911

# Ljung-Box Test for Model 2
ljung_box_arimamodel <- Box.test(residuals(arima_model), type="Ljung-Box", lag=log(length(residuals(arima_model))))
cat("Model 2 Ljung-Box p-value:", ljung_box_arimamodel$p.value, "\n")

## Model 2 Ljung-Box p-value: 2.899821e-05

# Dickey-Fuller Test for Stationarity for Model 2
df_test_model2 <- adf.test(residuals(arima_model))
cat("Model 2 Stationarity p-value:", df_test_model2$p.value, "\n")

## Model 2 Stationarity p-value: 0.01

# Standardized Residuals for Model 2
plot(residuals(arima_model)/sd(residuals(arima_model)), main="Standardized Residuals for Model 2")

# ACF of Residuals for Model 2
acf(residuals(arima_model), main="ACF of Residuals for Model 2")

> Model Selection:
>
> * The ARIMA model has the lower AIC (1850.91 vs. 1870.38), suggesting a better fit. This comparison is only indicative, because AIC values are not directly comparable between models with different orders of differencing.
> * Both models have stationary residuals (ADF p-value of 0.01 in each case), which is good.
> * Both Ljung-Box p-values are well below 0.05 (about 0.0007 for ARMA and 0.00003 for ARIMA), so neither model fully captures the autocorrelation in the series; the evidence of leftover autocorrelation is somewhat weaker for the ARMA model.
> * The ARIMA model’s residuals seem to exhibit a stationary process with constant variance and mean, an improvement over the ARMA model.
>
> Given this information, it’s a close call. The ARIMA model seems to fit the data better (lower AIC and slightly lower training RMSE, 3920.8 vs. 3947.8), but both models leave autocorrelation in their residuals. Since the goal is to forecast with the most precise fit, the ARIMA model is selected.
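One detail of the residual checks above: `lag = log(length(...))` passes a non-integer lag to `Box.test()`, and the default `fitdf = 0` does not adjust the degrees of freedom for the estimated ARMA coefficients. A hedged sketch of an alternative follows, on a simulated ARMA(1,1) series and with base R's `arima()` standing in for `forecast::Arima()`. The lag of 8 (two years of quarterly data) is an assumed choice, not a value from the original analysis:

```r
set.seed(2023)
sim <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 200)

fit <- arima(sim, order = c(1, 0, 1))   # base-R ARMA(1,1) fit

# Integer lag; fitdf = p + q = 2 accounts for the two estimated ARMA coefficients
lb_resid <- Box.test(residuals(fit), lag = 8, type = "Ljung-Box", fitdf = 2)

lb_resid$parameter   # degrees of freedom: lag - fitdf
lb_resid$p.value
```

Applied to the models above, the same call would take `residuals(arma_model)` or `residuals(arima_model)` as its first argument.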

By using the selected model, make a forecast for the next 5 periods. In doing so, include a time series plot showing your forecast.

# Forecast using the ARIMA model
forecast_results <- forecast(arima_model, h=5)

# Print the forecasted results
cat("\nForecasted Results for the Next 5 Periods:\n")

## 
## Forecasted Results for the Next 5 Periods:

print(forecast_results$mean)

##          Qtr1     Qtr2     Qtr3     Qtr4
## 2023 8227.637 7892.280 7909.715 7908.808
## 2024 7908.856

cat("\nLower Confidence Interval (95%):\n")

## 
## Lower Confidence Interval (95%):

print(forecast_results$lower[,2])  # Assuming 95% confidence interval; adjust index if different

##           Qtr1      Qtr2      Qtr3      Qtr4
## 2023 419.99682  84.38254  87.92538  74.36801
## 2024  61.72284

cat("\nUpper Confidence Interval (95%):\n")

## 
## Upper Confidence Interval (95%):

print(forecast_results$upper[,2])  # Assuming 95% confidence interval; adjust index if different

##          Qtr1     Qtr2     Qtr3     Qtr4
## 2023 16035.28 15700.18 15731.50 15743.25
## 2024 15755.99

# Plot the forecasts
plot(forecast_results, main="5-period Forecast Using ARIMA Model")
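The 95% bounds printed above can be checked by hand for the first forecast step. For a one-step-ahead ARIMA forecast the standard error equals the square root of the estimated innovation variance (`sigma^2` in the model summary). The sketch below applies the usual normal-quantile interval; treating the bounds as point forecast plus or minus `z` times the standard error is an assumption about how they were formed, though the numbers agree with the printed output:

```r
# Values copied from the ARIMA(1,1,1) output above
point_forecast <- 8227.637     # first forecast step (2023 Q1)
sigma2         <- 15868763     # estimated innovation variance

se_h1 <- sqrt(sigma2)          # one-step-ahead forecast standard error
z     <- qnorm(0.975)          # about 1.96 for a 95% interval

lower_95 <- point_forecast - z * se_h1
upper_95 <- point_forecast + z * se_h1
round(c(lower = lower_95, upper = upper_95), 1)
```

The result is roughly 420 and 16035, in line with the printed bounds of 419.99682 and 16035.28. An interval this wide relative to the point forecast indicates considerable uncertainty even one quarter ahead.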

1.7.2 b. Time Series Model 2

From the time series dataset, select the explanatory variables that might explain the Nearshoring in Mexico.

Selected Variables for Nearshoring Model and Justifications:

1. Economic Indicators:

PIB_Per_Cápita:
- Represents the economic health and affluence of the country. A strong per capita GDP suggests a healthy economy, which can signal an environment conducive for business operations. Companies look for economically stable regions for nearshoring to ensure consistent operational conditions.
Tipo_de_Cambio:
- The currency exchange rate directly impacts cost-effectiveness. A favorable exchange rate can reduce operational costs for international businesses, making it crucial in the nearshoring decision process.
Exportaciones:
- Indicates Mexico’s capability in global trade, suggesting existing infrastructure and expertise in producing goods and services that meet international standards.

2. Labor Market Dynamics:

Empleo & Salario_Diario:
- Employment rates combined with daily wages present the labor market’s health. While employment rates can signal economic stability, daily wages provide insight into the labor cost. Both are vital as businesses consider the availability and cost of labor in nearshoring decisions.
Educación:
- Companies in specialized industries prioritize regions with an educated workforce. This variable can indicate the availability of skilled labor, crucial for sectors requiring technical expertise.

3. Infrastructure & Logistics:

Densidad_Carretera:
- An efficient transportation system is vital, especially for manufacturing businesses. Road density can indicate the country’s logistical capabilities, affecting the ease and cost of transporting goods.

By considering these variables, we can understand the primary drivers influencing nearshoring decisions in Mexico. They provide a holistic view, capturing economic, labor, and infrastructural aspects essential for business operations.

Describe the hypothetical relationship / impact between each selected factor and the dependent variable IED_Flujos. For example, how does the exchange rate increase / reduce the foreign direct investment flows in Mexico? In describing the above relationships, please include a time series plot that displays the selected variables’ performance over the time period.

sp_data$Date <- as.Date(paste0(sp_data$periodo, "-01-01"))

PIB_Per_Cápita (GDP per Capita):
- A higher GDP per capita typically indicates a prosperous and stable economy. This prosperity can be attractive for foreign investors, as it suggests a thriving market with potential for growth and returns. Thus, an increase in PIB_Per_Cápita could be associated with an increase in IED_Flujos.

ggplot(sp_data, aes(x = Date, y = PIB_Per_Capita)) + 
  geom_line(color = "blue") + 
  labs(title = "Time Series of PIB_Per_Cápita", 
       x = "Date", 
       y = "PIB_Per_Cápita") + 
  theme_minimal()

Tipo_de_Cambio (Exchange Rate):
- The exchange rate can have a dual effect on FDI. A weaker local currency can make investments cheaper for foreign entities, potentially increasing IED_Flujos. However, a very volatile exchange rate can be perceived as economic instability, which might deter potential investors.

ggplot(sp_data, aes(x = Date, y = Tipo_de_Cambio)) + 
  geom_line(color = "red") + 
  labs(title = "Time Series of Tipo_de_Cambio", 
       x = "Date", 
       y = "Tipo_de_Cambio") + 
  theme_minimal()

Inseguridad_Robo and Inseguridad_Homicidio (Security Issues):
- An increase in crime rates, whether robbery or homicide, can deter foreign investments. High crime rates can signal instability and increased operational risks for businesses, leading to a potential decrease in IED_Flujos.

ggplot(sp_data, aes(x = Date)) + 
  geom_line(aes(y = Inseguridad_Robo, color="Robo")) +
  geom_line(aes(y = Inseguridad_Homicidio, color="Homicidio")) + 
  labs(title = "Time Series of Inseguridad Variables", 
       x = "Date", 
       y = "Count") + 
  scale_color_manual(values=c(Robo="green", Homicidio="purple")) +
  theme_minimal()

Innovación (Innovation, measured as Patent Rates):
- A higher rate of patent applications can be an indicator of an innovative and dynamic economic environment. This can be attractive for foreign investors, especially those from sectors that thrive on innovation, potentially leading to an increase in IED_Flujos.

ggplot(sp_data, aes(x = Date, y = Innovacion)) + 
  geom_line(color = "cyan") + 
  labs(title = "Time Series of Innovación", 
       x = "Date", 
       y = "Innovación") + 
  theme_minimal()

Salario_Diario (Daily Wages):
- Lower daily wages can reduce operational costs for businesses, making the country an attractive location for foreign investors, potentially increasing IED_Flujos. Conversely, very low wages might indicate a lower-skilled workforce or potential socio-economic challenges that could deter some investors.

ggplot(sp_data, aes(x = Date, y = Salario_Diario)) + 
  geom_line(color = "magenta") + 
  labs(title = "Time Series of Salario_Diario", 
       x = "Date", 
       y = "Salario_Diario") + 
  theme_minimal()

Estimate a VAR_Model that includes at least 1 explanatory factor that might affect the dependent variable IED_Flujos.

First, it’s essential to assess whether the variables under study are stationary.

# Replacing NA values with median for each column
for(column in names(sp_data)) {
  if(is.numeric(sp_data[[column]])) {
    sp_data[[column]][is.na(sp_data[[column]])] <- median(sp_data[[column]], na.rm = TRUE)
  }
}

# Check whether the variables are stationary
adf.test(sp_data$IED_Flujos)             # Check stationarity for IED_Flujos

## 
##  Augmented Dickey-Fuller Test
## 
## data:  sp_data$IED_Flujos
## Dickey-Fuller = -3.0832, Lag order = 2, p-value = 0.1597
## alternative hypothesis: stationary

# List of variables to check for stationarity
vars_to_check <- c("IED_M", "Exportaciones", "Empleo", "Educacion", 
                   "Salario_Diario", "Innovacion", "Tipo_de_Cambio", 
                   "Densidad_Carretera", "Densidad_Poblacion", 
                   "CO2_Emisiones", "PIB_Per_Capita", "INPC")

# Checking for stationarity
lapply(vars_to_check, function(var){
  adf.test(sp_data[[var]])
})

## [[1]]
## 
##  Augmented Dickey-Fuller Test
## 
## data:  sp_data[[var]]
## Dickey-Fuller = -2.0122, Lag order = 2, p-value = 0.5677
## alternative hypothesis: stationary
## 
## 
## [[2]]
## 
##  Augmented Dickey-Fuller Test
## 
## data:  sp_data[[var]]
## Dickey-Fuller = -0.88228, Lag order = 2, p-value = 0.9379
## alternative hypothesis: stationary
## 
## 
## [[3]]
## 
##  Augmented Dickey-Fuller Test
## 
## data:  sp_data[[var]]
## Dickey-Fuller = -1.3805, Lag order = 2, p-value = 0.8084
## alternative hypothesis: stationary
## 
## 
## [[4]]
## 
##  Augmented Dickey-Fuller Test
## 
## data:  sp_data[[var]]
## Dickey-Fuller = 1.1162, Lag order = 2, p-value = 0.99
## alternative hypothesis: stationary
## 
## 
## [[5]]
## 
##  Augmented Dickey-Fuller Test
## 
## data:  sp_data[[var]]
## Dickey-Fuller = 6.0073, Lag order = 2, p-value = 0.99
## alternative hypothesis: stationary
## 
## 
## [[6]]
## 
##  Augmented Dickey-Fuller Test
## 
## data:  sp_data[[var]]
## Dickey-Fuller = -3.8976, Lag order = 2, p-value = 0.02874
## alternative hypothesis: stationary
## 
## 
## [[7]]
## 
##  Augmented Dickey-Fuller Test
## 
## data:  sp_data[[var]]
## Dickey-Fuller = -2.386, Lag order = 2, p-value = 0.4254
## alternative hypothesis: stationary
## 
## 
## [[8]]
## 
##  Augmented Dickey-Fuller Test
## 
## data:  sp_data[[var]]
## Dickey-Fuller = -2.4107, Lag order = 2, p-value = 0.4159
## alternative hypothesis: stationary
## 
## 
## [[9]]
## 
##  Augmented Dickey-Fuller Test
## 
## data:  sp_data[[var]]
## Dickey-Fuller = -1.8778, Lag order = 2, p-value = 0.6189
## alternative hypothesis: stationary
## 
## 
## [[10]]
## 
##  Augmented Dickey-Fuller Test
## 
## data:  sp_data[[var]]
## Dickey-Fuller = -2.1377, Lag order = 2, p-value = 0.5199
## alternative hypothesis: stationary
## 
## 
## [[11]]
## 
##  Augmented Dickey-Fuller Test
## 
## data:  sp_data[[var]]
## Dickey-Fuller = -2.5796, Lag order = 2, p-value = 0.3516
## alternative hypothesis: stationary
## 
## 
## [[12]]
## 
##  Augmented Dickey-Fuller Test
## 
## data:  sp_data[[var]]
## Dickey-Fuller = 2.4665, Lag order = 2, p-value = 0.99
## alternative hypothesis: stationary
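Because `lapply()` over the variable names returns an unnamed list, the twelve results above are identified only by position. A minimal sketch of a more readable summary, assuming simulated data and using base R's `PP.test()` (a Phillips-Perron unit root test) as a stand-in so that the example runs without `tseries`:

```r
set.seed(7)
demo <- data.frame(
  walk  = cumsum(rnorm(100)),   # random walk
  noise = rnorm(100)            # white noise
)

# vapply over the column names keeps the names, so each p-value is labeled
p_values <- vapply(names(demo),
                   function(v) PP.test(demo[[v]])$p.value,
                   numeric(1))
round(p_values, 3)
```

The same pattern should work with `adf.test()` in place of `PP.test()` and `vars_to_check` in place of `names(demo)`, since both tests return their p-value in the `p.value` element.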

Estimating the VAR model:

# Differencing the series and adding them to the original dataset
sp_data$IED_Flujos_diff <- c(NA, diff(sp_data$IED_Flujos))
sp_data$Tipo_Cambio_diff <- c(NA, diff(sp_data$Tipo_de_Cambio))
sp_data$Innovacion_diff <- c(NA, diff(sp_data$Innovacion))
sp_data$Salario_Diario_diff <- c(NA, diff(sp_data$Salario_Diario))
sp_data$PIB_Per_Capita_diff <- c(NA, diff(sp_data$PIB_Per_Capita))

# Constructing the differenced dataset
diff_data <- data.frame(
  IED_Flujos_diff = sp_data$IED_Flujos_diff,
  Tipo_Cambio_diff = sp_data$Tipo_Cambio_diff,
  Innovacion_diff = sp_data$Innovacion_diff,
  Salario_Diario_diff = sp_data$Salario_Diario_diff,
  PIB_Per_Capita_diff = sp_data$PIB_Per_Capita_diff
)

# Remove the first row because of NA values introduced by differencing
diff_data <- diff_data[-1, ]

# Preparing data for VAR modeling
# diff_data already contains the five differenced series, so it can be used directly
VAR_ts <- diff_data
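Before fixing the lag order, it helps to count how many coefficients the VAR has to estimate. With K variables, p lags, and a constant, each equation has K × p + 1 coefficients, and p observations are lost to the lags. The helper below (`var_residual_df` is a hypothetical name, not part of the `vars` package) does this arithmetic for the five differenced annual series, assuming 25 rows remain after differencing, which is consistent with the sample size of 22 reported for p = 3:

```r
# Hypothetical helper: residual degrees of freedom per VAR equation
var_residual_df <- function(n_obs, k, p, const = TRUE) {
  n_used <- n_obs - p                   # observations left after creating p lags
  n_coef <- k * p + as.integer(const)   # coefficients per equation
  n_used - n_coef
}

var_residual_df(n_obs = 25, k = 5, p = 3)   # lag order used in this section
var_residual_df(n_obs = 25, k = 5, p = 1)   # a more parsimonious alternative
```

With p = 3 this leaves 6 residual degrees of freedom per equation, the figure that appears in the estimation summary, while p = 1 would leave 18. With so few degrees of freedom, individual coefficient estimates are likely to be imprecise.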

# We are manually setting the lag order
# Estimating the VAR model using the specified number of lags
VAR_model <- VAR(VAR_ts, p=3, type="const")
summary(VAR_model)

## 
## VAR Estimation Results:
## ========================= 
## Endogenous variables: IED_Flujos_diff, Tipo_Cambio_diff, Innovacion_diff, Salario_Diario_diff, PIB_Per_Capita_diff 
## Deterministic variables: const 
## Sample size: 22 
## Log Likelihood: -476.861 
## Roots of the characteristic polynomial:
## 1.277 1.023 1.023 0.9517 0.9517 0.9439 0.9439 0.8641 0.8641 0.7378 0.7378 0.5115 0.5115 0.3395 0.241
## Call:
## VAR(y = VAR_ts, p = 3, type = "const")
## 
## 
## Estimation results for equation IED_Flujos_diff: 
## ================================================ 
## IED_Flujos_diff = IED_Flujos_diff.l1 + Tipo_Cambio_diff.l1 + Innovacion_diff.l1 + Salario_Diario_diff.l1 + PIB_Per_Capita_diff.l1 + IED_Flujos_diff.l2 + Tipo_Cambio_diff.l2 + Innovacion_diff.l2 + Salario_Diario_diff.l2 + PIB_Per_Capita_diff.l2 + IED_Flujos_diff.l3 + Tipo_Cambio_diff.l3 + Innovacion_diff.l3 + Salario_Diario_diff.l3 + PIB_Per_Capita_diff.l3 + const 
## 
##                          Estimate Std. Error t value Pr(>|t|)  
## IED_Flujos_diff.l1        -1.1440     0.4197  -2.726   0.0344 *
## Tipo_Cambio_diff.l1    -4535.4638  2124.1099  -2.135   0.0767 .
## Innovacion_diff.l1      1973.9568  3510.3912   0.562   0.5943  
## Salario_Diario_diff.l1     0.1369   592.1873   0.000   0.9998  
## PIB_Per_Capita_diff.l1     0.8810     1.3163   0.669   0.5282  
## IED_Flujos_diff.l2        -0.6836     0.5609  -1.219   0.2687  
## Tipo_Cambio_diff.l2     -404.1922  2682.2096  -0.151   0.8852  
## Innovacion_diff.l2      3004.3732  4073.0378   0.738   0.4886  
## Salario_Diario_diff.l2   502.3638   633.5768   0.793   0.4580  
## PIB_Per_Capita_diff.l2     0.4020     1.2886   0.312   0.7657  
## IED_Flujos_diff.l3        -0.2586     0.4318  -0.599   0.5710  
## Tipo_Cambio_diff.l3     -897.1591  2329.3556  -0.385   0.7134  
## Innovacion_diff.l3      1792.8779  4200.5978   0.427   0.6844  
## Salario_Diario_diff.l3  -163.7761   711.5913  -0.230   0.8256  
## PIB_Per_Capita_diff.l3    -0.1743     1.2471  -0.140   0.8934  
## const                   3187.8866  4914.4771   0.649   0.5406  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 9150 on 6 degrees of freedom
## Multiple R-Squared: 0.7212,  Adjusted R-squared: 0.02405 
## F-statistic: 1.035 on 15 and 6 DF,  p-value: 0.5202 
## 
## 
## Estimation results for equation Tipo_Cambio_diff: 
## ================================================= 
## Tipo_Cambio_diff = IED_Flujos_diff.l1 + Tipo_Cambio_diff.l1 + Innovacion_diff.l1 + Salario_Diario_diff.l1 + PIB_Per_Capita_diff.l1 + IED_Flujos_diff.l2 + Tipo_Cambio_diff.l2 + Innovacion_diff.l2 + Salario_Diario_diff.l2 + PIB_Per_Capita_diff.l2 + IED_Flujos_diff.l3 + Tipo_Cambio_diff.l3 + Innovacion_diff.l3 + Salario_Diario_diff.l3 + PIB_Per_Capita_diff.l3 + const 
## 
##                          Estimate Std. Error t value Pr(>|t|)
## IED_Flujos_diff.l1      4.375e-05  7.160e-05   0.611    0.564
## Tipo_Cambio_diff.l1     2.237e-01  3.624e-01   0.617    0.560
## Innovacion_diff.l1     -3.555e-02  5.989e-01  -0.059    0.955
## Salario_Diario_diff.l1  7.361e-02  1.010e-01   0.729    0.494
## PIB_Per_Capita_diff.l1  1.201e-04  2.246e-04   0.535    0.612
## IED_Flujos_diff.l2      4.435e-06  9.569e-05   0.046    0.965
## Tipo_Cambio_diff.l2    -5.058e-01  4.576e-01  -1.105    0.311
## Innovacion_diff.l2     -3.450e-01  6.949e-01  -0.497    0.637
## Salario_Diario_diff.l2 -4.405e-02  1.081e-01  -0.408    0.698
## PIB_Per_Capita_diff.l2  8.037e-05  2.199e-04   0.366    0.727
## IED_Flujos_diff.l3      5.589e-05  7.366e-05   0.759    0.477
## Tipo_Cambio_diff.l3    -2.042e-01  3.974e-01  -0.514    0.626
## Innovacion_diff.l3      5.371e-01  7.167e-01   0.749    0.482
## Salario_Diario_diff.l3 -6.185e-02  1.214e-01  -0.509    0.629
## PIB_Per_Capita_diff.l3  5.381e-05  2.128e-04   0.253    0.809
## const                   3.819e-01  8.385e-01   0.455    0.665
## 
## 
## Residual standard error: 1.561 on 6 degrees of freedom
## Multiple R-Squared: 0.5911,  Adjusted R-squared: -0.4313 
## F-statistic: 0.5781 on 15 and 6 DF,  p-value: 0.8179 
## 
## 
## Estimation results for equation Innovacion_diff: 
## ================================================ 
## Innovacion_diff = IED_Flujos_diff.l1 + Tipo_Cambio_diff.l1 + Innovacion_diff.l1 + Salario_Diario_diff.l1 + PIB_Per_Capita_diff.l1 + IED_Flujos_diff.l2 + Tipo_Cambio_diff.l2 + Innovacion_diff.l2 + Salario_Diario_diff.l2 + PIB_Per_Capita_diff.l2 + IED_Flujos_diff.l3 + Tipo_Cambio_diff.l3 + Innovacion_diff.l3 + Salario_Diario_diff.l3 + PIB_Per_Capita_diff.l3 + const 
## 
##                          Estimate Std. Error t value Pr(>|t|)  
## IED_Flujos_diff.l1     -4.860e-05  4.489e-05  -1.083   0.3205  
## Tipo_Cambio_diff.l1    -4.486e-01  2.272e-01  -1.974   0.0958 .
## Innovacion_diff.l1     -4.543e-02  3.755e-01  -0.121   0.9077  
## Salario_Diario_diff.l1  1.065e-01  6.335e-02   1.681   0.1437  
## PIB_Per_Capita_diff.l1  2.292e-04  1.408e-04   1.628   0.1547  
## IED_Flujos_diff.l2      8.365e-07  6.000e-05   0.014   0.9893  
## Tipo_Cambio_diff.l2    -5.151e-02  2.869e-01  -0.180   0.8634  
## Innovacion_diff.l2      6.551e-01  4.357e-01   1.503   0.1834  
## Salario_Diario_diff.l2  2.787e-02  6.778e-02   0.411   0.6952  
## PIB_Per_Capita_diff.l2 -7.490e-05  1.379e-04  -0.543   0.6065  
## IED_Flujos_diff.l3      8.482e-06  4.619e-05   0.184   0.8604  
## Tipo_Cambio_diff.l3    -1.640e-01  2.492e-01  -0.658   0.5349  
## Innovacion_diff.l3     -4.717e-01  4.494e-01  -1.050   0.3343  
## Salario_Diario_diff.l3 -1.782e-01  7.612e-02  -2.341   0.0578 .
## PIB_Per_Capita_diff.l3 -7.449e-05  1.334e-04  -0.558   0.5968  
## const                   3.657e-01  5.257e-01   0.696   0.5127  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 0.9788 on 6 degrees of freedom
## Multiple R-Squared: 0.7067,  Adjusted R-squared: -0.02666 
## F-statistic: 0.9636 on 15 and 6 DF,  p-value: 0.5604 
## 
## 
## Estimation results for equation Salario_Diario_diff: 
## ==================================================== 
## Salario_Diario_diff = IED_Flujos_diff.l1 + Tipo_Cambio_diff.l1 + Innovacion_diff.l1 + Salario_Diario_diff.l1 + PIB_Per_Capita_diff.l1 + IED_Flujos_diff.l2 + Tipo_Cambio_diff.l2 + Innovacion_diff.l2 + Salario_Diario_diff.l2 + PIB_Per_Capita_diff.l2 + IED_Flujos_diff.l3 + Tipo_Cambio_diff.l3 + Innovacion_diff.l3 + Salario_Diario_diff.l3 + PIB_Per_Capita_diff.l3 + const 
## 
##                          Estimate Std. Error t value Pr(>|t|)   
## IED_Flujos_diff.l1     -1.710e-04  1.469e-04  -1.164   0.2887   
## Tipo_Cambio_diff.l1     8.262e-01  7.436e-01   1.111   0.3091   
## Innovacion_diff.l1     -5.402e-01  1.229e+00  -0.440   0.6757   
## Salario_Diario_diff.l1  7.308e-02  2.073e-01   0.353   0.7365   
## PIB_Per_Capita_diff.l1 -3.070e-04  4.608e-04  -0.666   0.5301   
## IED_Flujos_diff.l2     -4.013e-04  1.964e-04  -2.043   0.0870 . 
## Tipo_Cambio_diff.l2    -7.303e-01  9.390e-01  -0.778   0.4663   
## Innovacion_diff.l2     -6.388e-02  1.426e+00  -0.045   0.9657   
## Salario_Diario_diff.l2  7.878e-01  2.218e-01   3.552   0.0120 * 
## PIB_Per_Capita_diff.l2  1.963e-06  4.511e-04   0.004   0.9967   
## IED_Flujos_diff.l3     -4.036e-04  1.512e-04  -2.670   0.0370 * 
## Tipo_Cambio_diff.l3    -1.473e+00  8.155e-01  -1.806   0.1209   
## Innovacion_diff.l3     -4.453e-03  1.471e+00  -0.003   0.9977   
## Salario_Diario_diff.l3  1.095e+00  2.491e-01   4.394   0.0046 **
## PIB_Per_Capita_diff.l3  8.653e-04  4.366e-04   1.982   0.0948 . 
## const                  -4.078e-01  1.721e+00  -0.237   0.8205   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 3.203 on 6 degrees of freedom
## Multiple R-Squared: 0.9563,  Adjusted R-squared: 0.8471 
## F-statistic: 8.753 on 15 and 6 DF,  p-value: 0.006792 
## 
## 
## Estimation results for equation PIB_Per_Capita_diff: 
## ==================================================== 
## PIB_Per_Capita_diff = IED_Flujos_diff.l1 + Tipo_Cambio_diff.l1 + Innovacion_diff.l1 + Salario_Diario_diff.l1 + PIB_Per_Capita_diff.l1 + IED_Flujos_diff.l2 + Tipo_Cambio_diff.l2 + Innovacion_diff.l2 + Salario_Diario_diff.l2 + PIB_Per_Capita_diff.l2 + IED_Flujos_diff.l3 + Tipo_Cambio_diff.l3 + Innovacion_diff.l3 + Salario_Diario_diff.l3 + PIB_Per_Capita_diff.l3 + const 
## 
##                          Estimate Std. Error t value Pr(>|t|)  
## IED_Flujos_diff.l1      9.591e-02  1.040e-01   0.922   0.3919  
## Tipo_Cambio_diff.l1    -4.497e+02  5.263e+02  -0.854   0.4256  
## Innovacion_diff.l1      1.990e+02  8.697e+02   0.229   0.8266  
## Salario_Diario_diff.l1  1.058e+02  1.467e+02   0.721   0.4979  
## PIB_Per_Capita_diff.l1  7.182e-01  3.261e-01   2.202   0.0699 .
## IED_Flujos_diff.l2      2.339e-01  1.390e-01   1.684   0.1433  
## Tipo_Cambio_diff.l2     1.709e+03  6.645e+02   2.572   0.0422 *
## Innovacion_diff.l2      9.764e+02  1.009e+03   0.968   0.3706  
## Salario_Diario_diff.l2  1.599e+01  1.570e+02   0.102   0.9222  
## PIB_Per_Capita_diff.l2 -9.282e-01  3.193e-01  -2.907   0.0271 *
## IED_Flujos_diff.l3      2.225e-01  1.070e-01   2.080   0.0827 .
## Tipo_Cambio_diff.l3    -5.894e+01  5.771e+02  -0.102   0.9220  
## Innovacion_diff.l3     -2.385e+03  1.041e+03  -2.291   0.0618 .
## Salario_Diario_diff.l3 -6.354e+02  1.763e+02  -3.604   0.0113 *
## PIB_Per_Capita_diff.l3 -1.476e-01  3.090e-01  -0.478   0.6498  
## const                   1.884e+03  1.218e+03   1.548   0.1727  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 2267 on 6 degrees of freedom
## Multiple R-Squared: 0.8424,  Adjusted R-squared: 0.4485 
## F-statistic: 2.139 on 15 and 6 DF,  p-value: 0.1783 
## 
## 
## 
## Covariance matrix of residuals:
##                     IED_Flujos_diff Tipo_Cambio_diff Innovacion_diff
## IED_Flujos_diff            83721738       -4551.7001       2504.3316
## Tipo_Cambio_diff              -4552           2.4369          0.3248
## Innovacion_diff                2504           0.3248          0.9581
## Salario_Diario_diff            1690           3.1698          1.0228
## PIB_Per_Capita_diff         1787583         273.6407        726.9152
##                     Salario_Diario_diff PIB_Per_Capita_diff
## IED_Flujos_diff                1689.603           1787582.6
## Tipo_Cambio_diff                  3.170               273.6
## Innovacion_diff                   1.023               726.9
## Salario_Diario_diff              10.261              2507.3
## PIB_Per_Capita_diff            2507.342           5139055.8
## 
## Correlation matrix of residuals:
##                     IED_Flujos_diff Tipo_Cambio_diff Innovacion_diff
## IED_Flujos_diff             1.00000         -0.31866          0.2796
## Tipo_Cambio_diff           -0.31866          1.00000          0.2125
## Innovacion_diff             0.27962          0.21255          1.0000
## Salario_Diario_diff         0.05764          0.63388          0.3262
## PIB_Per_Capita_diff         0.08618          0.07732          0.3276
##                     Salario_Diario_diff PIB_Per_Capita_diff
## IED_Flujos_diff                 0.05764             0.08618
## Tipo_Cambio_diff                0.63388             0.07732
## Innovacion_diff                 0.32620             0.32759
## Salario_Diario_diff             1.00000             0.34528
## PIB_Per_Capita_diff             0.34528             1.00000

Detect if the estimated VAR_Model residuals are stationary.

# Extracting residuals from the VAR model
VAR_model_residuals <- data.frame(residuals(VAR_model))

# Check stationarity for one of the residuals (repeat for others)
adf_result <- adf.test(VAR_model_residuals$IED_Flujos_diff)
print(adf_result)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  VAR_model_residuals$IED_Flujos_diff
## Dickey-Fuller = -2.2091, Lag order = 2, p-value = 0.4927
## alternative hypothesis: stationary

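The ADF test fails to reject the null hypothesis of a unit root (p-value = 0.4927 > 0.05), so it provides no evidence that the IED_Flujos_diff residuals are stationary. With only 22 residuals available, the test has low power, so this result should be read with caution.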
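The code comment above says to repeat the test for the other residual series. A minimal sketch of that loop follows, assuming the tseries package is installed. `resid_demo` is a simulated stand-in, not the real `VAR_model_residuals`, so the printed p-values are illustrative only.

```r
library(tseries)

# Hypothetical stand-in for VAR_model_residuals: three simulated white-noise columns
set.seed(123)
resid_demo <- data.frame(
  IED_Flujos_diff  = rnorm(60),
  Tipo_Cambio_diff = rnorm(60),
  Innovacion_diff  = rnorm(60)
)

# Run the ADF test on every column and collect the p-values
# (suppressWarnings hides the note adf.test emits when a p-value falls outside its table)
adf_pvalues <- sapply(resid_demo, function(x) suppressWarnings(adf.test(x)$p.value))
print(round(adf_pvalues, 4))
```

With the real model, passing `VAR_model_residuals` to the same `sapply()` call would test all five residual series at once.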

Detect if the estimated VAR_Model residuals show serial autocorrelation.

# Perform the serial autocorrelation test on the VAR model residuals
Box.test(VAR_model_residuals$IED_Flujos_diff,lag=3,type="Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  VAR_model_residuals$IED_Flujos_diff
## X-squared = 0.47912, df = 3, p-value = 0.9235

The residuals of IED_Flujos_diff show no evidence of serial autocorrelation: the Ljung-Box p-value of 0.9235 is well above the 0.05 significance level, so the null hypothesis of no autocorrelation up to lag 3 is not rejected.
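The Ljung-Box test above looks at one residual series at a time. The vars package also provides `serial.test()`, which tests all residual series of a fitted VAR jointly. The sketch below uses simulated data (`demo_data` and `demo_var` are hypothetical names, not objects from this report) to show the call.

```r
library(vars)

# Hypothetical two-variable data set standing in for the differenced series
set.seed(42)
demo_data <- data.frame(y1 = rnorm(100), y2 = rnorm(100))
demo_var  <- VAR(demo_data, p = 2, type = "const")

# Adjusted Portmanteau test: H0 is no serial correlation in the residuals up to lag 10
pt_test <- serial.test(demo_var, lags.pt = 10, type = "PT.adjusted")
print(pt_test$serial)
```

The adjusted variant (`type = "PT.adjusted"`) is intended for small samples, which may make it a reasonable choice for a data set as short as the one used here.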

Based on the regression results and diagnostic tests, select the VAR_Model that you consider might generate the best forecast.

In the transition from the first to the second model, PIB_Per_Capita_diff was removed due to its perceived insignificance. Additionally, the lag order was reduced from 3 to 2 in the second model to preserve more data entries from the limited dataset.

# Making a model 2 with only (ESTIMATED) significant variables
VAR_ts2 <- as.data.frame(cbind(
  IED_Flujos_diff = diff_data$IED_Flujos_diff, 
  Tipo_Cambio_diff = diff_data$Tipo_Cambio_diff, 
  Innovacion_diff = diff_data$Innovacion_diff, 
  Salario_Diario_diff = diff_data$Salario_Diario_diff
))

# Setting the lag order manually and estimating the VAR model with it
# The lag order is reduced from 3 to 2: with p = 3 only 22 usable observations remain,
# because every lag costs one observation at the start of the sample
VAR_model2 <- VAR(VAR_ts2, p=2, type="const")
summary(VAR_model2)

## 
## VAR Estimation Results:
## ========================= 
## Endogenous variables: IED_Flujos_diff, Tipo_Cambio_diff, Innovacion_diff, Salario_Diario_diff 
## Deterministic variables: const 
## Sample size: 23 
## Log Likelihood: -354.539 
## Roots of the characteristic polynomial:
## 1.228 0.7333 0.7333 0.7082 0.5884 0.5884 0.4102 0.3658
## Call:
## VAR(y = VAR_ts2, p = 2, type = "const")
## 
## 
## Estimation results for equation IED_Flujos_diff: 
## ================================================ 
## IED_Flujos_diff = IED_Flujos_diff.l1 + Tipo_Cambio_diff.l1 + Innovacion_diff.l1 + Salario_Diario_diff.l1 + IED_Flujos_diff.l2 + Tipo_Cambio_diff.l2 + Innovacion_diff.l2 + Salario_Diario_diff.l2 + const 
## 
##                          Estimate Std. Error t value Pr(>|t|)   
## IED_Flujos_diff.l1        -0.9836     0.2537  -3.877  0.00168 **
## Tipo_Cambio_diff.l1    -3587.2585  1384.2498  -2.591  0.02133 * 
## Innovacion_diff.l1      3193.9608  1612.2089   1.981  0.06757 . 
## Salario_Diario_diff.l1  -222.1135   292.7729  -0.759  0.46064   
## IED_Flujos_diff.l2        -0.3795     0.2574  -1.474  0.16253   
## Tipo_Cambio_diff.l2      480.8237  1392.3345   0.345  0.73498   
## Innovacion_diff.l2      1848.9845  2321.0299   0.797  0.43897   
## Salario_Diario_diff.l2   266.5967   329.1393   0.810  0.43150   
## const                   3302.0301  2339.5681   1.411  0.17997   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 6690 on 14 degrees of freedom
## Multiple R-Squared: 0.6544,  Adjusted R-squared: 0.4569 
## F-statistic: 3.314 on 8 and 14 DF,  p-value: 0.02421 
## 
## 
## Estimation results for equation Tipo_Cambio_diff: 
## ================================================= 
## Tipo_Cambio_diff = IED_Flujos_diff.l1 + Tipo_Cambio_diff.l1 + Innovacion_diff.l1 + Salario_Diario_diff.l1 + IED_Flujos_diff.l2 + Tipo_Cambio_diff.l2 + Innovacion_diff.l2 + Salario_Diario_diff.l2 + const 
## 
##                          Estimate Std. Error t value Pr(>|t|)
## IED_Flujos_diff.l1     -1.519e-06  5.120e-05  -0.030    0.977
## Tipo_Cambio_diff.l1     1.325e-01  2.794e-01   0.474    0.643
## Innovacion_diff.l1      3.337e-01  3.254e-01   1.026    0.323
## Salario_Diario_diff.l1  3.828e-02  5.908e-02   0.648    0.528
## IED_Flujos_diff.l2     -3.491e-05  5.194e-05  -0.672    0.512
## Tipo_Cambio_diff.l2    -3.002e-01  2.810e-01  -1.068    0.303
## Innovacion_diff.l2      4.560e-02  4.684e-01   0.097    0.924
## Salario_Diario_diff.l2 -1.099e-01  6.642e-02  -1.655    0.120
## const                   8.151e-01  4.721e-01   1.726    0.106
## 
## 
## Residual standard error: 1.35 on 14 degrees of freedom
## Multiple R-Squared: 0.2888,  Adjusted R-squared: -0.1175 
## F-statistic: 0.7108 on 8 and 14 DF,  p-value: 0.6791 
## 
## 
## Estimation results for equation Innovacion_diff: 
## ================================================ 
## Innovacion_diff = IED_Flujos_diff.l1 + Tipo_Cambio_diff.l1 + Innovacion_diff.l1 + Salario_Diario_diff.l1 + IED_Flujos_diff.l2 + Tipo_Cambio_diff.l2 + Innovacion_diff.l2 + Salario_Diario_diff.l2 + const 
## 
##                          Estimate Std. Error t value Pr(>|t|)
## IED_Flujos_diff.l1     -4.885e-05  4.154e-05  -1.176    0.259
## Tipo_Cambio_diff.l1    -2.156e-01  2.266e-01  -0.951    0.358
## Innovacion_diff.l1      1.539e-01  2.640e-01   0.583    0.569
## Salario_Diario_diff.l1 -7.522e-03  4.794e-02  -0.157    0.878
## IED_Flujos_diff.l2     -1.508e-05  4.214e-05  -0.358    0.726
## Tipo_Cambio_diff.l2    -1.665e-01  2.280e-01  -0.730    0.477
## Innovacion_diff.l2      2.200e-01  3.800e-01   0.579    0.572
## Salario_Diario_diff.l2  3.931e-02  5.389e-02   0.730    0.478
## const                   1.278e-01  3.831e-01   0.334    0.744
## 
## 
## Residual standard error: 1.095 on 14 degrees of freedom
## Multiple R-Squared: 0.1626,  Adjusted R-squared: -0.316 
## F-statistic: 0.3397 on 8 and 14 DF,  p-value: 0.9355 
## 
## 
## Estimation results for equation Salario_Diario_diff: 
## ==================================================== 
## Salario_Diario_diff = IED_Flujos_diff.l1 + Tipo_Cambio_diff.l1 + Innovacion_diff.l1 + Salario_Diario_diff.l1 + IED_Flujos_diff.l2 + Tipo_Cambio_diff.l2 + Innovacion_diff.l2 + Salario_Diario_diff.l2 + const 
## 
##                          Estimate Std. Error t value Pr(>|t|)   
## IED_Flujos_diff.l1     -3.555e-05  2.053e-04  -0.173  0.86500   
## Tipo_Cambio_diff.l1     7.385e-01  1.120e+00   0.659  0.52041   
## Innovacion_diff.l1      1.024e-01  1.305e+00   0.078  0.93856   
## Salario_Diario_diff.l1  5.825e-01  2.369e-01   2.458  0.02759 * 
## IED_Flujos_diff.l2      2.601e-05  2.083e-04   0.125  0.90242   
## Tipo_Cambio_diff.l2    -3.934e-01  1.127e+00  -0.349  0.73221   
## Innovacion_diff.l2     -5.371e-02  1.878e+00  -0.029  0.97759   
## Salario_Diario_diff.l2  8.074e-01  2.664e-01   3.031  0.00898 **
## const                  -2.510e-01  1.893e+00  -0.133  0.89643   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 5.414 on 14 degrees of freedom
## Multiple R-Squared: 0.7106,  Adjusted R-squared: 0.5452 
## F-statistic: 4.297 on 8 and 14 DF,  p-value: 0.008547 
## 
## 
## 
## Covariance matrix of residuals:
##                     IED_Flujos_diff Tipo_Cambio_diff Innovacion_diff
## IED_Flujos_diff          44754717.3       -2.532e+03      2363.93823
## Tipo_Cambio_diff            -2532.2        1.823e+00         0.07743
## Innovacion_diff              2363.9        7.743e-02         1.19977
## Salario_Diario_diff           900.7       -9.935e-01        -1.57269
##                     Salario_Diario_diff
## IED_Flujos_diff                900.7255
## Tipo_Cambio_diff                -0.9935
## Innovacion_diff                 -1.5727
## Salario_Diario_diff             29.3101
## 
## Correlation matrix of residuals:
##                     IED_Flujos_diff Tipo_Cambio_diff Innovacion_diff
## IED_Flujos_diff             1.00000         -0.28036         0.32260
## Tipo_Cambio_diff           -0.28036          1.00000         0.05236
## Innovacion_diff             0.32260          0.05236         1.00000
## Salario_Diario_diff         0.02487         -0.13593        -0.26521
##                     Salario_Diario_diff
## IED_Flujos_diff                 0.02487
## Tipo_Cambio_diff               -0.13593
## Innovacion_diff                -0.26521
## Salario_Diario_diff             1.00000

# Checking to see if stationary
VAR_model_residuals2 <- data.frame(residuals(VAR_model2))
adf_result <- adf.test(VAR_model_residuals2$IED_Flujos_diff)
print(adf_result)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  VAR_model_residuals2$IED_Flujos_diff
## Dickey-Fuller = -2.7697, Lag order = 2, p-value = 0.2792
## alternative hypothesis: stationary
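For VAR_model2 the ADF p-value (0.2792) again fails to reject a unit root in the IED_Flujos_diff residuals. A complementary diagnostic is model stability: the summary of VAR_model2 lists a largest characteristic root of 1.228, and a stable VAR requires every root modulus reported by vars to be below 1. The sketch below shows how this could be checked with `roots()`. The data are simulated and `demo_var` is a hypothetical stand-in for `VAR_model2`.

```r
library(vars)

# Hypothetical two-variable data set (simulated white noise)
set.seed(1)
demo_data <- data.frame(y1 = rnorm(100), y2 = rnorm(100))
demo_var  <- VAR(demo_data, p = 2, type = "const")

# Moduli of the eigenvalues of the companion matrix (K * p = 4 values here)
root_moduli <- roots(demo_var, modulus = TRUE)

# The estimated VAR is stable only if every modulus is below 1
is_stable <- all(root_moduli < 1)
print(root_moduli)
print(is_stable)
```

Running `roots(VAR_model2)` on the fitted model should reproduce the values printed in the summary, so the same `all(... < 1)` check applies directly.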

Briefly interpret the regression results. That is, is there a statistically significant relationship between the explanatory variable(s) and the main dependent variable?

In the equation with IED_Flujos_diff as the dependent variable, two coefficients are significant at the 0.05 level: its own first lag and the first lag of Tipo_Cambio_diff. A one-unit increase in the first lag of IED_Flujos_diff is associated with a decrease of about 0.98 units in its current value (p-value = 0.00168), and a one-unit increase in the first lag of Tipo_Cambio_diff is associated with a decrease of about 3,587 units (p-value = 0.02133). The first lag of Innovacion_diff is only marginally significant, at the 10% level (p-value = 0.06757).

For the other dependent variables, the relationships with the lagged explanatory variables are weaker. The Tipo_Cambio_diff and Innovacion_diff equations contain no statistically significant coefficient, and their overall F-tests are not significant (p-values of 0.6791 and 0.9355). In the Salario_Diario_diff equation, however, both of its own lags are significant and positive (p-values of 0.02759 and 0.00898).

The main variable of interest, IED_Flujos_diff, appears to be most influenced by its own past values and the past values of Tipo_Cambio_diff. However, the other dependent variables in the system show weaker relationships with their respective explanatory variables.

Key Points:

* IED_Flujos_diff is significantly influenced by its own lagged value and the lagged value of Tipo_Cambio_diff.
* Most relationships in the system for the other dependent variables are not statistically significant.
* Past values play an essential role in predicting the current state of these economic variables.
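Reading significance off a long summary is error-prone, so the significant coefficients can also be filtered programmatically. The sketch below is one possible way to do it. It relies on the fact that a fitted `VAR` object stores each equation as an `lm` object in `$varresult`. The data are simulated, and `demo_var` is a hypothetical stand-in for `VAR_model2`.

```r
library(vars)

# Simulated data in which y depends on the first lag of x by construction
set.seed(7)
n <- 120
x <- rnorm(n)
y <- c(0, 0.8 * x[-n]) + rnorm(n, sd = 0.5)
demo_var <- VAR(data.frame(y = y, x = x), p = 1, type = "const")

# Each equation is stored as an lm object; pull its coefficient table
coef_table <- coef(summary(demo_var$varresult$y))

# Keep only the rows with a p-value below 0.05
significant <- coef_table[coef_table[, "Pr(>|t|)"] < 0.05, , drop = FALSE]
print(round(significant, 4))
```

For this report the analogous table would come from `coef(summary(VAR_model2$varresult$IED_Flujos_diff))`.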

Is there an instantaneous causality between IED_Flujos and the selected explanatory variables? Estimate a Granger Causality Test to either reject or fail to reject the hypothesis of instantaneous causality.

# Granger causality test with IED_Flujos_diff as the cause:
# H0 is that IED_Flujos_diff does not Granger-cause the other three variables.
# Causality between variables can be unidirectional, bidirectional, or absent.
granger_IED_Flujos <- causality(VAR_model2, cause="IED_Flujos_diff")
print(granger_IED_Flujos)

## $Granger
## 
##  Granger causality H0: IED_Flujos_diff do not Granger-cause
##  Tipo_Cambio_diff Innovacion_diff Salario_Diario_diff
## 
## data:  VAR object VAR_model2
## F-Test = 0.49753, df1 = 6, df2 = 56, p-value = 0.8075
## 
## 
## $Instant
## 
##  H0: No instantaneous causality between: IED_Flujos_diff and
##  Tipo_Cambio_diff Innovacion_diff Salario_Diario_diff
## 
## data:  VAR object VAR_model2
## Chi-squared = 3.8121, df = 3, p-value = 0.2825

The Granger test uses IED_Flujos_diff as the cause, so its null hypothesis is that IED_Flujos_diff does not Granger-cause Tipo_Cambio_diff, Innovacion_diff, and Salario_Diario_diff. With a p-value of 0.8075, which is greater than the conventional significance level (0.05), we fail to reject it: past values of IED_Flujos_diff do not help predict the other three variables. The instantaneous causality test also fails to reject its null hypothesis (p-value = 0.2825), so there is no evidence of contemporaneous causality between IED_Flujos_diff and the selected explanatory variables. This test does not cover the reverse direction, that is, whether the explanatory variables Granger-cause IED_Flujos_diff. The significant Tipo_Cambio_diff.l1 coefficient in the IED_Flujos_diff equation suggests that they may.
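The direction of a Granger test in the vars package is set by the `cause` argument of `causality()`. The sketch below illustrates this on simulated data in which `x` drives `y` by construction. All names here (`x`, `y`, `demo_var`) are hypothetical and not part of the report's data.

```r
library(vars)

# Simulated data: y depends on the previous value of x, but x does not depend on y
set.seed(7)
n <- 120
x <- rnorm(n)
y <- c(0, 0.8 * x[-n]) + rnorm(n, sd = 0.5)
demo_var <- VAR(data.frame(y = y, x = x), p = 1, type = "const")

# The 'cause' argument sets the direction being tested
x_causes_y <- causality(demo_var, cause = "x")$Granger
y_causes_x <- causality(demo_var, cause = "y")$Granger
print(x_causes_y$p.value)
print(y_causes_x$p.value)
```

For the model in this report, the analogous call would be `causality(VAR_model2, cause = c("Tipo_Cambio_diff", "Innovacion_diff", "Salario_Diario_diff"))`, which tests whether the three explanatory variables jointly Granger-cause IED_Flujos_diff.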

Based on the selected VAR_Model, forecast the increasing / decreasing trend of FDI inflows in Mexico for the next 5 periods. Display the forecast in a time series plot.

The forecasted differences are cumulatively summed and then added to the most recent known value of the original series. This returns the forecast to the original (undifferenced) scale.
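The cumulative-sum step can be checked on a toy series before applying it to the forecast. The numbers below are made up for illustration and are not the FDI data.

```r
# Toy level series (made-up numbers, not the FDI data)
levels_demo <- c(100, 104, 101, 110)

# First differences: 4, -3, 9
diffs_demo <- diff(levels_demo)

# Cumulative sums of the differences plus the starting level recover the later levels
rebuilt <- levels_demo[1] + cumsum(diffs_demo)
stopifnot(isTRUE(all.equal(rebuilt, levels_demo[-1])))

# The same step applied to hypothetical forecasted differences
forecast_diffs  <- c(2, 5)
forecast_levels <- tail(levels_demo, 1) + cumsum(forecast_diffs)
print(forecast_levels)  # 112 117
```

The key point is that the cumulative sum must be anchored to the last observed level of the original series, not to the last observed difference.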

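# Forecasting for model 2
# n.ahead = 4 produces four forecast periods (27 to 30); set n.ahead = 5 to cover the five periods requested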
forecast_2_diff <- predict(VAR_model2, n.ahead=4)$fcst$IED_Flujos_diff[,1]

# Cumulatively summing up the forecasted differences
cumsum_forecast_2 <- cumsum(forecast_2_diff)

# Adding the last known value to the cumulative sums to get the forecast in original scale
last_known_value <- tail(diff_data$IED_Flujos, n=1)
forecast_2_original_scale <- last_known_value + cumsum_forecast_2

# Combine the original and forecasted values
combined_series <- c(sp_data$IED_Flujos, forecast_2_original_scale)

# Create a combined time series for plotting
time_series <- 1:length(combined_series)

# Plot the combined series
plot(time_series, combined_series, type="l", main="Forecast of IED Flujos", xlab="Time", ylab="Value", col="blue", lwd=2)

# Highlight the forecasted portion in red
lines((length(sp_data$IED_Flujos)+1):length(combined_series), forecast_2_original_scale, col="red", lwd=2)

# Add a legend to differentiate between actual and forecasted values, moved to top left
legend("topleft", legend=c("Original Data", "Forecast"), col=c("blue", "red"), lty=1, lwd=2)

1.8 VI. Conclusions and Recommendations

Briefly describe the main insights from previous sections.

In our exploration of Mexico's economic indicators, the data's underlying patterns became more apparent. The IED_Flujos series shows a recurring pattern: a noticeable peak every year around its midpoint. This suggests a periodic event or change in conditions that warrants further investigation. The series oscillates around an overall trend, and several variables appear to contribute to these shifts.

The Vector Autoregression (VAR) model gave deeper insight into these relationships. In the selected model, the change in FDI flows (IED_Flujos_diff) is significantly related to its own first lag and to the first lag of Tipo_Cambio_diff, so the two series should be analyzed together. The other variables in the dataset were harder to pin down in terms of predictability and causation.

Building on the VAR analysis, we also forecast the trend for the upcoming periods. Starting from the last known value of 36215.37, the predicted values of IED_Flujos for periods 27 to 30 rise steadily: 39272.73, 42757.51, 49373.68, and 54356.47 respectively. This upward trajectory suggests that, at least in the short term, the outlook for FDI inflows is positive.

Based on the selected results, please share at least 1 recommendation that address the problem situation.

The evident interconnectedness between IED_Flujos and Tipo_Cambio_diff as revealed by the VAR model suggests a more integrated approach to economic planning. Understanding the causal factors behind the periodic mid-year peak in the IED_Flujos data can be a starting point for policymakers and stakeholders. Furthermore, given the positive forecast for IED_Flujos in the upcoming periods, it might be beneficial to focus on consolidating this growth. Ensuring that factors influencing the Tipo_Cambio_diff remain stable could be key to maintaining this positive momentum. The predictions provide a favorable short-term economic outlook, and leveraging this knowledge can help in better financial and policy planning.

Evidence 2

David Dominguez - A01570975

2023-09-09