Briefly describe what time series analysis is.
According to TIBCO, “Time series analysis is a technique in statistics that deals with time series data and trend analysis, it follows periodic time intervals that have been measured in regular time intervals or have been collected in particular time intervals.” In other words, it is an analysis that, over a specific period of time, seeks to understand the sequence of data points collected during a time interval. What distinguishes time series data from other data is that the analysis can show how variables change over time: time plays a central role because it orders the observations and adds information to the result. It should be noted that it is a vital tool for statistics and data analysis, since it can be used in a wide variety of fields such as finance, meteorology, and economics, among others.
What are the latest trends of Nearshoring in Mexico? Please cite at least 1 external reference to develop your explanation.
Complementing evidence 1, nearshoring is becoming more and more popular as companies look for new ways to reorganize their supply chains and place distribution centers in profitable locations; among its main advantages are cost savings, speed to market, and quality control. Nearshoring is boosting job creation and both the Mexican and US economies, particularly in the manufacturing, financial, and IT sectors (Abdel Haisam, 2023). In my opinion, digital transformation is an upcoming opportunity focused on building a company’s digital presence and capability through digital infrastructure; IT and AI are fast-growing trends for companies in the United States, China, and many other countries.
What is the problem situation? How to address the problem situation?
There are many factors to take into consideration when answering these questions, but in my opinion they seek to address Maria’s concerns by analyzing the data through econometrics. The problem situation is that, due to the trade war between China and the United States and the ongoing war between Russia and Ukraine, Maria wants to find an attractive alternative country that offers innovation and development. Looking for a third option, Maria came across Mexico and learned about its production potential and recent investments such as Tesla’s announced 10-billion-dollar production center. This event, among many others, raises the question of whether, considering all the relevant factors, Mexico is a good country in which to implement a nearshoring strategy given its proximity to the United States market.
The aim is to have a model that predicts events based on time series, so we can understand how behavior over time has affected the variables and estimate whether entering the Mexican production market is a good idea according to the dependent variable “IED_Flujos” (FDI flows), since it is one way of measuring whether nearshoring is occurring in the country. To do this we set the data in time series format, estimate ARMA and ARIMA models, and finally a VAR model, which helps us forecast when using multiple time series variables; each equation of the VAR model is estimated by OLS.
# Loading libraries
library(readxl)
library(tidyverse)
library(ggplot2)
library(corrplot)
library(gmodels)
library(effects)
library(stargazer)
library(olsrr)
#library(kableExtra)
library(jtools)
library(fastmap)
library(Hmisc)
library(naniar)
library(glmnet)
library(caret)
library(car)
library(lmtest)
library(dplyr)
library(xts)
library(zoo)
library(tseries)
library(stats)
library(forecast)
library(astsa)
library(AER)
library(dynlm)
library(vars)
#library(mFilter)
library(TSstudio)
library(sarima)
bd <-read.csv("C:\\Users\\sebastian\\Downloads\\SP_SeriesTiempo.csv")
bd1<- read.csv("C:\\Users\\sebastian\\Downloads\\sp_data.csv")
This time series file contains quarterly values of the variable “IED_Flujos” (FDI flows), from a dataset that records the evolution of Foreign Direct Investment (FDI). “IED_Flujos” refers to the amount of foreign direct investment that enters or leaves the country in a specific quarter, and it is used here to evaluate nearshoring in Mexico. The periods are broken down on a quarterly basis, so each year has 4 quarters.
# setting time series format
bd$date <- as.yearmon(bd$periodo, format="%m/%d/%Y")
IEDF <-ts(bd$IED_Flujos,frequency=4,start=c(1999,1))
# Descriptive statistics of the dependent variable
summary(bd$IED_Flujos)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1341 4351 6238 7036 8053 22794
plot(bd$date, bd$IED_Flujos, type="l", col="red", lwd=2, xlab="Time Period", ylab="IED Flujos (FDI flows)", main="Quarterly IED Flujos")
In this graph we can see the behavior of IED_Flujos (FDI flows) on a quarterly basis from 1999 to 2023.
# Decompose a time series
# 1) observed: data observations
# 2) trend: increasing / decreasing value of data observations
# 3) seasonality: repeating short-term cycle in time series
# 4) noise: random variation in time series
flujo_time<-decompose(IEDF)
plot(flujo_time)
Briefly, describe the decomposition time series plot.
Do the time series data show a trend?
Based on the decomposition we can see that there is a trend, although with a lot of volatility: across the periods the series rises and falls drastically, with very high and very low peaks within the quarters. Taking a broader view, we can infer a positive trend from the start of the sample to the end date; there are obviously highs and lows, but on average the series has gone up since its first quarter. There is also a drastic increase in late 2014, after which the series falls back and remains roughly constant.
Do the time series data show seasonality?
Since we are observing quarterly periods, the graph is very detailed: over time we can see some very high and some very low peaks, and this pattern repeats within each year. The seasonal component of the decomposition shows the same behavior year after year, so at first glance the series does appear to have seasonality, and this pattern remains constant over time.
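As a complementary check of the seasonal pattern (a minimal sketch, assuming the flujo_time decomposition computed above), the seasonal component can be inspected quarter by quarter:
# Inspect the average seasonal effect by quarter (uses flujo_time from decompose())
monthplot(flujo_time$seasonal, main = "Seasonal component by quarter",
          xlab = "Quarter", ylab = "Seasonal effect")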
# Stationary Test
# H0: Non-stationary and HA: Stationary. p-values < 0.05 reject the H0.
adf.test(bd$IED_Flujos)
##
## Augmented Dickey-Fuller Test
##
## data: bd$IED_Flujos
## Dickey-Fuller = -4.1994, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
# The p-value of 0.01 is < 0.05, so we reject H0: the time series is stationary.
# Serial Autocorrelation
acf(bd$IED_Flujos,main="Significant Autocorrelations")
# The ACF shows little significant serial autocorrelation across the lags of the variable
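As a complement to the ACF, the partial autocorrelation function can help suggest a tentative AR order for the models estimated below (a short sketch on the same series):
# Partial autocorrelations of the dependent variable
pacf(bd$IED_Flujos, main = "Significant Partial Autocorrelations")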
a. Time Series Model 1 - Estimate 2 different time series regression models. You might want to consider ARMA (p,q) and / or ARIMA (p,d,q).
plot(bd$date,bd$IED_Flujos,type="l",col="blue",lwd=2,xlab="Time Period",ylab="IED_Flujos",main="IED_Flujos")
plot(bd$date,log(bd$IED_Flujos),type="l",col="red",lwd=2,xlab="Time Period",ylab="IED_Flujos",main="IED_Flujos")
plot(diff(log(bd$IED_Flujos)),type="l",ylab="first order difference",main = "Differences IED_Flujos")
adf.test(log(bd$IED_Flujos))
##
## Augmented Dickey-Fuller Test
##
## data: log(bd$IED_Flujos)
## Dickey-Fuller = -3.7217, Lag order = 4, p-value = 0.02635
## alternative hypothesis: stationary
# The p-value of 0.02635 is smaller than 0.05, so we reject H0: log(IED_Flujos) is stationary
adf.test(diff(log(bd$IED_Flujos)))
## Warning in adf.test(diff(log(bd$IED_Flujos))): p-value smaller than printed
## p-value
##
## Augmented Dickey-Fuller Test
##
## data: diff(log(bd$IED_Flujos))
## Dickey-Fuller = -5.9411, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
# The p-value of 0.01 is less than 0.05, so we reject H0: the differenced log series is stationary.
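As an optional cross-check of the differencing order (a sketch, assuming the forecast package loaded above), ndiffs() estimates how many differences the log series needs to become stationary:
# Suggested number of differences for log(IED_Flujos) based on unit-root tests
ndiffs(log(bd$IED_Flujos))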
# Model 1: ARIMA(2,1,1) on log(IED_Flujos)
IED_ARIMA <- Arima(log(bd$IED_Flujos), order = c(2, 1, 1))
print(IED_ARIMA)
## Series: log(bd$IED_Flujos)
## ARIMA(2,1,1)
##
## Coefficients:
## ar1 ar2 ma1
## 0.0012 -0.415 -0.8601
## s.e. 0.1056 0.102 0.0708
##
## sigma^2 = 0.246: log likelihood = -67.79
## AIC=143.59 AICc=144.03 BIC=153.8
# plot model arima
plot(IED_ARIMA$residuals, main = "ARIMA(2,1,1) - IED Flujos")
acf(IED_ARIMA$residuals, main = "ACF - ARIMA 2,1,1")
Box.test(IED_ARIMA$residuals, lag = 1, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: IED_ARIMA$residuals
## X-squared = 0.14398, df = 1, p-value = 0.7044
adf.test(IED_ARIMA$residuals)
## Warning in adf.test(IED_ARIMA$residuals): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: IED_ARIMA$residuals
## Dickey-Fuller = -4.646, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
# The p-value of 0.01 is less than 0.05, meaning the residuals are stationary.
# Model 2: ARIMA(0,1,2) on IED_Flujos (levels)
IED_ARIMA2 <- Arima(bd$IED_Flujos, order = c(0, 1, 2))
print(IED_ARIMA2)
## Series: bd$IED_Flujos
## ARIMA(0,1,2)
##
## Coefficients:
## ma1 ma2
## -1.0896 0.1489
## s.e. 0.1718 0.1724
##
## sigma^2 = 15775204: log likelihood = -922.22
## AIC=1850.44 AICc=1850.71 BIC=1858.11
#Plot model
plot(IED_ARIMA2$residuals, main = "ARIMA(0,1,2) - IED Flujos")
acf(IED_ARIMA2$residuals, main = "ACF - ARIMA (0,1,2) ")
Box.test(IED_ARIMA2$residuals, lag = 1, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: IED_ARIMA2$residuals
## X-squared = 0.0083572, df = 1, p-value = 0.9272
adf.test(IED_ARIMA2$residuals)
## Warning in adf.test(IED_ARIMA2$residuals): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: IED_ARIMA2$residuals
## Dickey-Fuller = -4.4229, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
# The p-value is less than 0.05, meaning the residuals are stationary
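As an optional sanity check, not part of the required model set, auto.arima() can search over (p,d,q) automatically; its AIC can then be compared against the manually specified models (a sketch on the log series):
# Automatic (p,d,q) search on the log series; compare its AIC with the models above
auto_fit <- auto.arima(log(bd$IED_Flujos))
summary(auto_fit)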
# Model 3: ARMA(1,1) on log(IED_Flujos)
summary(IED_ARMA<-arma(log(bd$IED_Flujos),order=c(1,1)))
##
## Call:
## arma(x = log(bd$IED_Flujos), order = c(1, 1))
##
## Model:
## ARMA(1,1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.41237 -0.35244 -0.00571 0.27709 1.50759
##
## Coefficient(s):
## Estimate Std. Error t value Pr(>|t|)
## ar1 -0.2976 0.2271 -1.310 0.19003
## ma1 0.5173 0.1999 2.588 0.00967 **
## intercept 11.3149 1.9794 5.716 1.09e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Fit:
## sigma^2 estimated as 0.2688, Conditional Sum-of-Squares = 25.26, AIC = 152.3
# plot model
# plot(IED_ARMA)
# (commented out: it takes a long time to render the plots)
IED_estimated<-exp(IED_ARMA$fitted.values)
plot(IED_estimated)
IED_ARMA_residuales<- IED_ARMA$residuals
Box.test(IED_ARMA_residuales,lag=5,type="Ljung-Box")
##
## Box-Ljung test
##
## data: IED_ARMA_residuales
## X-squared = 13.689, df = 5, p-value = 0.01771
# The p-value of 0.01771 is < 0.05, indicating that the ARMA model's residuals do show serial autocorrelation.
IED_ARMA$residuals <- na.omit(IED_ARMA$residuals)
adf.test(IED_ARMA$residuals)
##
## Augmented Dickey-Fuller Test
##
## data: IED_ARMA$residuals
## Dickey-Fuller = -3.644, Lag order = 4, p-value = 0.03336
## alternative hypothesis: stationary
#Show summary
summary(IED_ARMA)
##
## Call:
## arma(x = log(bd$IED_Flujos), order = c(1, 1))
##
## Model:
## ARMA(1,1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.41237 -0.35244 -0.00571 0.27709 1.50759
##
## Coefficient(s):
## Estimate Std. Error t value Pr(>|t|)
## ar1 -0.2976 0.2271 -1.310 0.19003
## ma1 0.5173 0.1999 2.588 0.00967 **
## intercept 11.3149 1.9794 5.716 1.09e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Fit:
## sigma^2 estimated as 0.2688, Conditional Sum-of-Squares = 25.26, AIC = 152.3
# The Ljung-Box p-value of 0.01771 is smaller than 0.05, meaning the ARMA residuals do show serial autocorrelation at lag 5. The ADF test tells us the ARMA residuals are stationary, since its p-value of 0.03336 is lower than 0.05.
# model 1
AIC(IED_ARIMA)
## [1] 143.5873
# Arima 1
f_arima1<- fitted(IED_ARIMA)
r_arima1 <- sqrt(mean((f_arima1 - bd$IED_Flujos)^2))
print(r_arima1)
## [1] 8065.557
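Note that IED_ARIMA was fitted on log(IED_Flujos), so the RMSE above compares log-scale fitted values against the series in levels. A like-for-like comparison would back-transform the fitted values first (a sketch under that assumption, without bias adjustment):
# Back-transform log-scale fitted values before computing the RMSE in levels
r_arima1_levels <- sqrt(mean((exp(f_arima1) - bd$IED_Flujos)^2))
print(r_arima1_levels)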
# model 2
AIC(IED_ARIMA2)
## [1] 1850.445
fvalues_arima2 <- fitted(IED_ARIMA2)
R_ARIMA2 <- sqrt(mean((fvalues_arima2 - bd$IED_Flujos)^2))
print(R_ARIMA2)
## [1] 3909.249
# ARMA(1,0,1): AIC and residual-based fit measure
ARMA <- arima(log(bd$IED_Flujos), order = c(1, 0, 1))
AICAA <- AIC(ARMA)
AICAA
## [1] 152.6035
ARMA <- arima(log(bd$IED_Flujos), order = c(1, 0, 1))
residuales_arma <- ARMA$residuals
# Note: log(IED_Flujos) minus the residuals equals the fitted values, so this
# measures the magnitude of the fitted log series rather than a forecast error
r_arma <- sqrt(mean((log(bd$IED_Flujos) - residuales_arma)^2))
print(r_arma)
## [1] 8.721015
#AIC evaluation
AIC(IED_ARIMA)
## [1] 143.5873
#
AIC(IED_ARIMA2)
## [1] 1850.445
AICAA
## [1] 152.6035
Through the evaluation, 3 proposed models were analyzed. Among the log-scale models, the ARIMA(2,1,1) had the lowest AIC (143.59) and the ARMA(1,1) had an AIC of 152.60; the AIC of the level-scale ARIMA(0,1,2) (1850.4) is not directly comparable because its dependent variable is on a different scale. The ARMA model also produced a small residual-based fit measure on the log scale, although, as noted above, these measures are computed on different scales and are not strictly comparable. With the ARMA fitted series we generate a forecast 5 periods ahead.
IED_Model_forecast<-forecast(IED_estimated,h=5)
## Warning in ets(object, lambda = lambda, biasadj = biasadj,
## allow.multiplicative.trend = allow.multiplicative.trend, : Missing values
## encountered. Using longest contiguous portion of time series
IED_Model_forecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 97 6188.802 5222.402 7155.202 4710.821 7666.783
## 98 6188.802 5222.402 7155.202 4710.821 7666.783
## 99 6188.802 5222.402 7155.202 4710.821 7666.783
## 100 6188.802 5222.402 7155.202 4710.821 7666.783
## 101 6188.802 5222.402 7155.202 4710.821 7666.783
plot(IED_Model_forecast)
autoplot(IED_Model_forecast)
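An alternative sketch is to forecast directly from one of the fitted ARIMA objects and back-transform from the log scale, rather than re-fitting an ETS model on the fitted values (assumes IED_ARIMA from above; a horizon of 8 steps corresponds to 2 years of quarterly data):
# Forecast from the ARIMA(2,1,1) model on the log scale, then convert to levels
arima_fc <- forecast(IED_ARIMA, h = 8)
plot(arima_fc, main = "ARIMA(2,1,1) forecast of log(IED_Flujos)")
exp(arima_fc$mean)  # point forecasts back in levels (no bias adjustment)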
b. Time Series Model 2 - From the time series dataset, select the explanatory variables that might explain Nearshoring in Mexico.
Education: The quality and access to teaching and learning in a society. It can be measured by indicators such as the level of education achieved by the population, the literacy rate, etc. Education is essential for the development of a society, since it influences economic growth.
Innovation: The process of creating and applying new ideas, products, or methods to improve and advance in different areas, from technology to the economy. It involves creativity, development, and implementation, and seeks to have a positive impact.
Exchange rate: The value of one currency with respect to another. It can be fixed or floating, depending on whether exchange rates are determined by the market or by government intervention.
Exports: Goods and services produced in a country and sold to other countries. These sales may include manufactured products, raw materials, and technological services, among others.
# 1 educacion
ggplot(bd1, aes(x = periodo, y = Educacion)) +
geom_line(color = "purple") +
labs(title = "Time Series of Educacion",
x = "Date",
y = "Educacion") +
theme_minimal()
## Warning: Removed 3 rows containing missing values (`geom_line()`).
ggplot(bd1, aes(x = periodo, y = Innovacion)) +
geom_line(color = "blue") +
labs(title = "Time Series of Innovacion",
x = "Date",
y = "Innovacion") +
theme_minimal()
## Warning: Removed 2 rows containing missing values (`geom_line()`).
ggplot(bd1, aes(x = periodo, y = Tipo_de_Cambio)) +
geom_line(color = "green") +
labs(title = "Time Series of Exchange rate",
x = "Date",
y = "Tipo de Cambio") +
theme_minimal()
ggplot(bd1, aes(x = periodo, y = Exportaciones)) +
geom_line(color = "orange") +
labs(title = "Time Series of Exportaciones",
x = "Date",
y = "Exportaciones") +
theme_minimal()
ggplot(bd1, aes(x = periodo, y = Salario_Diario)) +
geom_line(color = "purple") +
labs(title = "Time Series of Salario Diario",
x = "Date",
y = "Salario Diario") +
theme_minimal()
ggplot(bd1, aes(x = periodo, y = Inseguridad_Homicidio)) +
geom_line(color = "red") +
labs(title = "Time Series of Inseguridad_Homicidio",
x = "Date",
y = "Inseguridad_Homicidio") +
theme_minimal()
## Warning: Removed 1 row containing missing values (`geom_line()`).
Describe the hypothetical relationship / impact between each selected factor and the dependent variable IED_Flujos. For example, how does the exchange rate increase / reduce the foreign direct investment flows in Mexico?
At first glance we can identify positive and negative variables. The variables with a favorable influence on nearshoring in Mexico are education, the exchange rate, exports, and the daily wage; the variables with a negative impact are the homicide/insecurity rate and, to some extent, innovation, since the graph shows it decreasing.
The variables with the greatest impact on IED_Flujos were Education, significant at the 1% level, and Insecurity due to Homicide, significant at the 5% level. For each one-unit increase in “Inseguridad_Homicidio”, the dependent variable is estimated to decrease by approximately 0.0221 units.
The coefficient for “Educacion” tells us that for each one-unit increase in education, the dependent variable increases by approximately 0.3271 units. For “Tipo_de_Cambio” the coefficient is approximately 0.0345, indicating that a positive change in the exchange rate is associated with an estimated increase of about 0.0345 units in the dependent variable. For “Salario_Diario”, each one-unit increase is estimated to raise the dependent variable by 0.0006 units, a rather small effect.
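The regression behind these coefficients is not shown in this section; below is a minimal sketch of how such a model could be estimated, assuming the named columns in bd1 and log(IED_Flujos) as the dependent variable (the exact specification used to obtain the coefficients may differ, and missing values would need to be handled first, as in the imputation step that follows):
# Hedged sketch: OLS with the selected explanatory variables
mod_ols <- lm(log(IED_Flujos) ~ Educacion + Tipo_de_Cambio + Salario_Diario +
                Inseguridad_Homicidio, data = bd1)
summary(mod_ols)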
# Impute missing values in numeric columns with the column median
for(column in names(bd1)) {
if(is.numeric(bd1[[column]])) {
bd1[[column]][is.na(bd1[[column]])] <- median(bd1[[column]], na.rm = TRUE)
}
}
adf.test(bd1$IED_Flujos)
##
## Augmented Dickey-Fuller Test
##
## data: bd1$IED_Flujos
## Dickey-Fuller = -3.0832, Lag order = 2, p-value = 0.1597
## alternative hypothesis: stationary
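# The p-value of 0.1597 is greater than 0.05, so IED_Flujos in bd1 is non-stationary in levels; the series are therefore differenced below before estimating the VAR.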
VAR <- cbind(bd1$IED_Flujos, bd1$PIB_Per_Capita, bd1$Tipo_de_Cambio)
# Lag selection; note that season = 52 implies weekly seasonality, which may not
# match this dataset -- the -Inf criteria below suggest the selection is not
# informative with so few observations.
lag_select<-VARselect(VAR,lag.max=5,type="const", season=52)
lag_select$selection
## AIC(n) HQ(n) SC(n) FPE(n)
## 1 1 1 1
lag_select$criteria
## 1 2 3 4 5
## AIC(n) -Inf -Inf -Inf -Inf -Inf
## HQ(n) -Inf -Inf -Inf -Inf -Inf
## SC(n) -Inf -Inf -Inf -Inf -Inf
## FPE(n) 0 0 0 0 0
d_IED<-diff(bd1$IED_Flujos)
d_GDP<-diff(bd1$PIB_Per_Capita)
d_Exchange<-diff(bd1$Tipo_de_Cambio)
VAR_ld<- cbind(d_IED, d_GDP, d_Exchange)
colnames(VAR_ld)<-cbind("IED","GDP","Exchange rate")
VAR_m1<-VAR(VAR_ld,p=1,type="const",season=NULL,exog=NULL)
summary(VAR_m1)
##
## VAR Estimation Results:
## =========================
## Endogenous variables: IED, GDP, Exchange.rate
## Deterministic variables: const
## Sample size: 24
## Log Likelihood: -504.546
## Roots of the characteristic polynomial:
## 0.3893 0.3421 0.2301
## Call:
## VAR(y = VAR_ld, p = 1, type = "const", exogen = NULL)
##
##
## Estimation results for equation IED:
## ====================================
## IED = IED.l1 + GDP.l1 + Exchange.rate.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## IED.l1 -0.7119 0.1733 -4.107 0.000548 ***
## GDP.l1 0.8930 0.5135 1.739 0.097380 .
## Exchange.rate.l1 -3124.8934 1202.0980 -2.600 0.017144 *
## const 2792.4588 1566.8882 1.782 0.089911 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 6844 on 20 degrees of freedom
## Multiple R-Squared: 0.489, Adjusted R-squared: 0.4124
## F-statistic: 6.38 on 3 and 20 DF, p-value: 0.003284
##
##
## Estimation results for equation GDP:
## ====================================
## GDP = IED.l1 + GDP.l1 + Exchange.rate.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## IED.l1 -0.04424 0.07501 -0.590 0.562
## GDP.l1 0.36042 0.22220 1.622 0.120
## Exchange.rate.l1 -5.24003 520.20423 -0.010 0.992
## const 647.20008 678.06610 0.954 0.351
##
##
## Residual standard error: 2962 on 20 degrees of freedom
## Multiple R-Squared: 0.1195, Adjusted R-squared: -0.01258
## F-statistic: 0.9047 on 3 and 20 DF, p-value: 0.4563
##
##
## Estimation results for equation Exchange.rate:
## ==============================================
## Exchange.rate = IED.l1 + GDP.l1 + Exchange.rate.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## IED.l1 3.862e-05 3.286e-05 1.175 0.254
## GDP.l1 2.543e-05 9.734e-05 0.261 0.797
## Exchange.rate.l1 7.416e-02 2.279e-01 0.325 0.748
## const 3.087e-01 2.970e-01 1.039 0.311
##
##
## Residual standard error: 1.297 on 20 degrees of freedom
## Multiple R-Squared: 0.07951, Adjusted R-squared: -0.05856
## F-statistic: 0.5759 on 3 and 20 DF, p-value: 0.6375
##
##
##
## Covariance matrix of residuals:
## IED GDP Exchange.rate
## IED 46844023 4114725.2 -1934.792
## GDP 4114725 8772476.7 -72.103
## Exchange.rate -1935 -72.1 1.683
##
## Correlation matrix of residuals:
## IED GDP Exchange.rate
## IED 1.0000 0.20298 -0.21788
## GDP 0.2030 1.00000 -0.01876
## Exchange.rate -0.2179 -0.01876 1.00000
# Residual diagnostics: stationarity and serial autocorrelation of the VAR residuals
VAR_m1_residuals<-data.frame(residuals(VAR_m1))
adf.test(VAR_m1_residuals$IED)
##
## Augmented Dickey-Fuller Test
##
## data: VAR_m1_residuals$IED
## Dickey-Fuller = -3.5146, Lag order = 2, p-value = 0.06187
## alternative hypothesis: stationary
# The p-value of 0.062 is greater than 0.05, so we fail to reject H0: the residuals are non-stationary at the 5% level
Box.test(VAR_m1_residuals$IED,lag=1,type="Ljung-Box")
##
## Box-Ljung test
##
## data: VAR_m1_residuals$IED
## X-squared = 0.47979, df = 1, p-value = 0.4885
# The p-value of 0.4885 is much greater than 0.05, so there is no evidence of autocorrelation at lag 1
Based on the information from the diagnostic tests, the VAR model is a reasonable choice given its fit to the data, after applying first differences to “IED_Flujos”, “PIB_Per_Capita”, and “Tipo_de_Cambio” to make them stationary.
Since the Ljung-Box p-value is greater than 0.05, we fail to reject the null hypothesis of no autocorrelation at lag 1 in the model’s residuals.
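As a multivariate complement to the univariate Ljung-Box test (a sketch using the vars package already loaded), a Portmanteau test can be run on all the VAR residuals jointly:
# Joint serial-correlation (Portmanteau) test on the VAR residuals
serial.test(VAR_m1, lags.pt = 8, type = "PT.asymptotic")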
gran <- causality(VAR_m1,cause="IED")
gran
## $Granger
##
## Granger causality H0: IED do not Granger-cause GDP Exchange.rate
##
## data: VAR object VAR_m1
## F-Test = 0.85197, df1 = 2, df2 = 60, p-value = 0.4317
##
##
## $Instant
##
## H0: No instantaneous causality between: IED and GDP Exchange.rate
##
## data: VAR object VAR_m1
## Chi-squared = 1.9218, df = 2, p-value = 0.3826
# The Granger test p-value of 0.4317 is greater than 0.05, so we cannot conclude that IED Granger-causes GDP and the exchange rate; likewise, the instantaneous-causality p-value of 0.3826 is greater than 0.05, so there is no instantaneous causality between IED and the selected explanatory variables.
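The same test can also be run in the other direction, for example to check whether the exchange rate Granger-causes IED and GDP (a sketch; the variable name follows the column names stored in VAR_m1):
# Reverse-direction Granger causality test
causality(VAR_m1, cause = "Exchange.rate")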
f_1 <- predict(VAR_m1,n.ahead=60,ci=0.95)
# Note: the variable in VAR_m1 is named "IED", so names = "IED" would target it
# directly; with "IED_Flujos" the function falls back to the first variable (see
# the warning below).
fanchart(f_1,names="IED_Flujos",main="IED_Flujos",xlab="Time Period",ylab="IED_Flujos")
## Warning in fanchart(f_1, names = "IED_Flujos", main = "IED_Flujos", xlab = "Time Period", :
## Invalid variable name(s) supplied, using first variable.
model_forecast11 <-forecast(IED_estimated,h=5)
## Warning in ets(object, lambda = lambda, biasadj = biasadj,
## allow.multiplicative.trend = allow.multiplicative.trend, : Missing values
## encountered. Using longest contiguous portion of time series
model_forecast11
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 97 6188.802 5222.402 7155.202 4710.821 7666.783
## 98 6188.802 5222.402 7155.202 4710.821 7666.783
## 99 6188.802 5222.402 7155.202 4710.821 7666.783
## 100 6188.802 5222.402 7155.202 4710.821 7666.783
## 101 6188.802 5222.402 7155.202 4710.821 7666.783
plot(model_forecast11)
autoplot(model_forecast11)
Through the evaluation and based on the results of the diagnostic tests, the selected model was the ARMA model, with an AIC of 152.60 and a low residual-based fit measure, which suggests it fits the data closely. It is a valuable forecasting tool because it combines autoregressive (AR) and moving average (MA) components to capture the autocorrelation patterns in the data. The main difference between the ARMA and ARIMA models is that ARIMA can handle non-stationary data through differencing, while ARMA requires stationary data; when the data are non-stationary, with trends or seasonality, ARIMA is generally recommended. This report will help Maria with her problem because it gives her more tools to make accurate and informed decisions about expanding to Mexico.
The primary variables to consider when evaluating nearshoring in any country are the level of education, exports, insecurity, wages, and GDP, in order to have a more diverse view. Another useful tool for determining the viability of a country is the PESTLE analysis, which identifies external factors that may affect companies’ activity and normal functioning (IZO, 2019). I also recommend using a dataset that covers the same time period and is expressed in the same currency, so the information is more accurate and less time is wasted converting it.
Lastly, Mexico presents a favorable environment for nearshoring, standing out in terms of innovation and opportunities. Its geographic position, diversity, growing economy, and educated population make it an attractive destination for companies looking to outsource services and manufacturing. Geographic proximity to the United States, supported by trade agreements such as the USMCA, provides strategic access to one of the largest markets in the world. Tesla, for example, chose to nearshore in Mexico due to its strategic location near the United States, lower labor costs, and the favorable trade agreements the country offers. Thanks to these agreements, Mexico is attracting more attention from foreign companies that want to invest.
Time series analysis: Definition, types, techniques, and when it’s used. (n.d.). Tableau. Retrieved September 6, 2023, from https://www.tableau.com/learn/articles/time-series-analysis
What is time series analysis? (n.d.). TIBCO. Retrieved September 6, 2023, from https://www.tibco.com/reference-center/what-is-time-series-analysis
Mexico: The future of nearshoring. (n.d.). Tango.io. Retrieved September 6, 2023, from https://www.tango.io/blog/mexico-the-future-of-nearshoring
IZO. (2019, July 11). Análisis PESTEL: ¿Qué es y cómo ayuda en la estrategia? IZO. https://izo.es/que-es-analisis-pestel/