What is “Nearshoring” and why might Mexico be attractive for
it?
Nearshoring is a business strategy that has gained considerable
strength in recent months. It is based on shortening value chains by
bringing production closer to the final market. Unlike its predecessor,
“offshoring”, it does not merely seek the services of a third country;
it also seeks to reduce transportation and logistics costs and transit
times, improving the efficiency of the chain and allowing a better
connection between different areas of the company (Duran, 2023).
Mexico is considered one of the best alternatives for relocating value chains, mainly those serving the North American market, one of the largest markets in the world. Mexico has sufficiently developed infrastructure and high-level human capital to attract the attention of American companies. Another factor that makes Mexico a strong alternative for this market is the shared land border, which would greatly reduce transportation costs and times. Nearshoring can also be very attractive for Mexico because it can strengthen the country's economic sectors; the T-MEC can even be used to achieve greater benefits in sectors such as the automotive industry (Duran, 2023).
What is “Predictive Analytics”?
Predictive analytics is one of the three main types of data analysis.
It is used to estimate what will happen in the future: historical
information is gathered and used to build statistical models that
define the likelihood of future outcomes, and machine learning
techniques can be applied to improve the analysis. Good predictive
analytics requires good descriptive analytics beforehand, in order to
know the nature of the data and understand the database being handled
(University of Bath, 2021).
Regression analysis is the main tool of predictive analytics. It is a statistical process that looks for relationships between variables in order to predict future values of a dependent variable from at least one independent variable. Several types of regression models are used in predictive analytics; the main ones are linear regression (with all its variants), lasso regression, and ridge regression. Lasso and ridge help identify which variables matter for predicting the dependent variable. For a good predictive analysis, it is recommended to fit multiple models and compare their fit and predictive capacity (Wohlwend, 2023).
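The three regression types named above differ only in the penalty added to the least-squares objective. The notation below is an assumption introduced for illustration (it does not appear in the cited sources): $y_i$ is the dependent variable, $x_{ij}$ the $j$-th independent variable for observation $i$, $\beta$ the coefficients, and $\lambda \ge 0$ a tuning parameter chosen by the analyst, typically by cross-validation.

```latex
\begin{aligned}
\hat{\beta}^{\text{OLS}}   &= \arg\min_{\beta} \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 \\
\hat{\beta}^{\text{ridge}} &= \arg\min_{\beta} \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2 \\
\hat{\beta}^{\text{lasso}} &= \arg\min_{\beta} \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j|
\end{aligned}
```

With $\lambda = 0$ both penalized forms reduce to ordinary linear regression. The absolute-value penalty of the lasso can set some coefficients exactly to zero, while the squared penalty of ridge only shrinks them toward zero.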
How can regression analysis help us predict the occurrence
of “Nearshoring” in the Mexican case?
In Mexico, a large number of economic and social variables could
plausibly explain the nearshoring phenomenon, and many of them have
records available for analysis. A good database is essential for
regression analysis, which examines these independent variables and
estimates their relationship with the variable used to measure
nearshoring, which in the case of Mexico is foreign direct investment.
Problem Situation
Starting with the COVID-19 pandemic, many sectors and countries were affected in various ways. One of the most affected countries was China, as it was the epicenter of the pandemic. The effect of the pandemic in China had an international impact: because the production of the largest economic markets was concentrated in that country, supply chains were broken, affecting the global economy.
What happened in China made these markets reconsider concentrating their production there and look for alternatives. For the American market, the most attractive option appears to be nearshoring: transferring production processes to countries like Mexico, which have large capacity, an available labor force, and a proximity that would reduce costs and facilitate shipments to the American market.
# Import the database to the Rmarkdown
bd = read.csv("C:\\Users\\Silva\\Documents\\Tec\\CSV\\Semestre5\\sp_data.csv")
# Load the libraries used in the analysis
library(tidyverse)
library(ggplot2)
library(corrplot)
library(gmodels)
library(effects)
library(stargazer)
library(olsrr)
library(kableExtra)
library(jtools)
library(fastmap)
library(dlookr)
library(Hmisc)
library(naniar)
library(glmnet)
library(caret)
library(car)
library(lmtest)
library(xts)
library(dygraphs)
library(tseries)
# Identifying missing values
missing_values = colSums(is.na(bd))
missing_values
## periodo IED_M Exportaciones_m
## 0 0 0
## Empleo Educacion Salario_Diario
## 3 3 0
## Innovacion Inseguridad_Robo Inseguridad_Homicidio
## 2 0 1
## Tipo_de_Cambio Densidad_Carretera Densidad_Poblacion
## 0 0 0
## CO2_Emisiones PIB_Per_Capita INPC
## 3 0 0
## crisis_financiera
## 0
## Missing values were found in several variables. Because the database has few records, it is important to keep them all, so the missing values are imputed with the median of each variable to affect the analysis as little as possible.
## Impute missing values with the median
bd <- bd %>%
  mutate(across(everything(), ~ifelse(is.na(.), median(., na.rm = TRUE), .)))
## Structure of data
str(bd)
## 'data.frame': 26 obs. of 16 variables:
## $ periodo : int 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 ...
## $ IED_M : num 294151 210876 299734 362632 546548 ...
## $ Exportaciones_m : num 220091 248691 235961 248057 205483 ...
## $ Empleo : num 96.5 96.5 96.5 97.8 97.4 ...
## $ Educacion : num 7.2 7.31 7.43 7.56 7.68 7.8 7.93 8.04 8.14 8.26 ...
## $ Salario_Diario : num 24.3 31.9 31.9 35.1 37.6 ...
## $ Innovacion : num 11.3 11.4 12.5 13.2 13.5 ...
## $ Inseguridad_Robo : num 267 315 273 217 215 ...
## $ Inseguridad_Homicidio: num 14.6 14.3 12.6 10.9 10.2 ...
## $ Tipo_de_Cambio : num 8.06 9.94 9.52 9.6 9.17 ...
## $ Densidad_Carretera : num 0.05 0.05 0.06 0.06 0.06 0.06 0.06 0.06 0.06 0.06 ...
## $ Densidad_Poblacion : num 47.4 48.8 49.5 50.6 51.3 ...
## $ CO2_Emisiones : num 3.68 3.85 3.69 3.87 3.81 3.82 3.95 3.98 4.1 4.19 ...
## $ PIB_Per_Capita : num 127570 126739 129165 130875 128083 ...
## $ INPC : num 33.3 39.5 44.3 48.3 50.4 ...
## $ crisis_financiera : int 0 0 0 0 0 0 0 0 0 0 ...
summary(bd)
## periodo IED_M Exportaciones_m Empleo
## Min. :1997 Min. :210876 Min. :205483 Min. :95.06
## 1st Qu.:2003 1st Qu.:368560 1st Qu.:262337 1st Qu.:96.08
## Median :2010 Median :497054 Median :366294 Median :96.53
## Mean :2010 Mean :493596 Mean :433856 Mean :96.48
## 3rd Qu.:2016 3rd Qu.:578606 3rd Qu.:632356 3rd Qu.:97.01
## Max. :2022 Max. :754438 Max. :785655 Max. :97.83
## Educacion Salario_Diario Innovacion Inseguridad_Robo
## Min. :7.200 Min. : 24.30 Min. :11.28 Min. :120.5
## 1st Qu.:7.957 1st Qu.: 41.97 1st Qu.:12.60 1st Qu.:148.3
## Median :8.460 Median : 54.48 Median :13.09 Median :181.8
## Mean :8.428 Mean : 65.16 Mean :13.10 Mean :185.4
## 3rd Qu.:8.925 3rd Qu.: 72.31 3rd Qu.:13.61 3rd Qu.:209.9
## Max. :9.580 Max. :172.87 Max. :15.11 Max. :314.8
## Inseguridad_Homicidio Tipo_de_Cambio Densidad_Carretera Densidad_Poblacion
## Min. : 8.04 Min. : 8.06 Min. :0.05000 Min. :47.44
## 1st Qu.:10.40 1st Qu.:10.75 1st Qu.:0.06000 1st Qu.:52.77
## Median :16.93 Median :13.02 Median :0.07000 Median :58.09
## Mean :17.28 Mean :13.91 Mean :0.07115 Mean :57.33
## 3rd Qu.:22.34 3rd Qu.:18.49 3rd Qu.:0.08000 3rd Qu.:61.39
## Max. :29.59 Max. :20.66 Max. :0.09000 Max. :65.60
## CO2_Emisiones PIB_Per_Capita INPC crisis_financiera
## Min. :3.590 Min. :126739 Min. : 33.28 Min. :0.00000
## 1st Qu.:3.842 1st Qu.:130964 1st Qu.: 56.15 1st Qu.:0.00000
## Median :3.930 Median :136845 Median : 73.35 Median :0.00000
## Mean :3.943 Mean :138550 Mean : 75.17 Mean :0.07692
## 3rd Qu.:4.090 3rd Qu.:146148 3rd Qu.: 91.29 3rd Qu.:0.00000
## Max. :4.220 Max. :153236 Max. :126.48 Max. :1.00000
describe(bd$IED_M)
## bd$IED_M
## n missing distinct Info Mean Gmd .05 .10
## 26 0 26 1 493596 167384 295547 312026
## .25 .50 .75 .90 .95
## 368560 497054 578606 691611 700045
##
## lowest : 210876 294151 299734 324318 350979, highest: 671018 683318 699904 700092 754437
# Histogram of the dependent variable to check the normality of its distribution
hist(bd$IED_M)
# Histogram of the natural logarithm of the dependent variable to ...
hist(log(bd$IED_M))
Foreign direct investment, the dependent variable, is highly volatile. It has had many ups and downs over the years but generally maintains a positive trend, with marked lows in 2009 and 2011.
The employment variable shows a lot of variation in its relationship with the dependent variable. It is hard to identify a positive or negative trend, so the relationship between the two may be non-linear.
The average level of schooling in Mexico seems to have a positive effect on foreign direct investment. This variable may be an indicator of labor capacity, which may attract the attention of foreign companies.
The relationship between foreign direct investment and the homicide rate is also volatile. The plot suggests a positive relationship, which raises many doubts; the homicide rate may therefore turn out to be irrelevant for explaining the dependent variable.
GDP per capita may be one of the economic variables with the greatest impact on foreign direct investment. The graph shows a positive relationship between the two: the better Mexico does economically, the greater foreign investment will be.
Most of the variables in the database have strong positive correlations with each other and with the dependent variable. Employment and theft stand out for having negative correlations with the others. A financial crisis in the country seems to be unrelated to any other variable.
Estimation method
The estimation method used is Ordinary Least Squares (OLS), the most widely used method for linear regression models. It is used here to analyze and predict foreign direct investment. OLS minimizes the sum of the squared differences between the observed values and the values predicted by the model (XLSTAT, 2023).
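As a minimal sketch of what OLS does, the example below estimates coefficients by solving the normal equations directly and compares them with `lm()`. The data, seed, and variable names (`x1`, `x2`, `y`) are simulated assumptions for illustration only; they are not the database used in this report.

```r
# Simulated data for illustration only (not the report's database)
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 2 + 1.5 * x1 - 0.5 * x2 + rnorm(n, sd = 0.3)

# Design matrix with an intercept column
X <- cbind(Intercept = 1, x1 = x1, x2 = x2)

# OLS estimate: solve the normal equations (X'X) b = X'y
beta_hat <- drop(solve(crossprod(X), crossprod(X, y)))

# Sum of squared residuals, the quantity that OLS minimizes
ssr <- function(b) sum((y - X %*% b)^2)

# Same fit through lm()
fit <- lm(y ~ x1 + x2)

print(beta_hat)
print(coef(fit))
print(ssr(beta_hat))
```

Moving any coefficient away from `beta_hat` increases `ssr()`, which is the sense in which the OLS estimate is the best-fitting line.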
Hypothesis 1
H0: The variable “PIB Per Capita” has a positive and significant impact
on the dependent variable.
H1: The variable “PIB Per Capita” has no significant impact on the
dependent variable.
Hypothesis 2
H0: The variable “Empleo” has a linear impact on the dependent
variable.
H1: The variable “Empleo” does not have a linear impact on the
dependent variable.
Hypothesis 3
H0: The variable “Inseguridad_Homicidio” has a positive impact on the
dependent variable.
H1: The variable “Inseguridad_Homicidio” does not have a positive
impact on the dependent variable.
modelo_2 = lm(IED_M ~ Tipo_de_Cambio + Empleo + Salario_Diario + PIB_Per_Capita + CO2_Emisiones + Inseguridad_Robo,data= bd)
summary(modelo_2)
##
## Call:
## lm(formula = IED_M ~ Tipo_de_Cambio + Empleo + Salario_Diario +
## PIB_Per_Capita + CO2_Emisiones + Inseguridad_Robo, data = bd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -148196 -35045 -14655 27300 216923
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.474e+06 4.055e+06 -0.857 0.4022
## Tipo_de_Cambio 6.856e+03 1.519e+04 0.451 0.6569
## Empleo 2.553e+04 3.583e+04 0.712 0.4848
## Salario_Diario -1.285e+03 1.159e+03 -1.109 0.2814
## PIB_Per_Capita 1.195e+01 4.824e+00 2.477 0.0228 *
## CO2_Emisiones -6.299e+03 1.718e+05 -0.037 0.9711
## Inseguridad_Robo -7.392e+02 6.260e+02 -1.181 0.2522
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 95320 on 19 degrees of freedom
## Multiple R-squared: 0.6663, Adjusted R-squared: 0.5609
## F-statistic: 6.323 on 6 and 19 DF, p-value: 0.0008732
# AIC of the model, used to compare and select among the candidate models.
AIC(modelo_2)
## [1] 677.8074
modelo_5 = lm(log(IED_M) ~ Tipo_de_Cambio + Empleo + Salario_Diario + PIB_Per_Capita + CO2_Emisiones + Inseguridad_Robo,data= bd)
summary(modelo_5)
##
## Call:
## lm(formula = log(IED_M) ~ Tipo_de_Cambio + Empleo + Salario_Diario +
## PIB_Per_Capita + CO2_Emisiones + Inseguridad_Robo, data = bd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.31681 -0.07548 -0.03732 0.09449 0.40316
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.937e+00 8.801e+00 0.561 0.5814
## Tipo_de_Cambio 1.159e-02 3.297e-02 0.352 0.7290
## Empleo 5.597e-02 7.777e-02 0.720 0.4805
## Salario_Diario -2.671e-03 2.516e-03 -1.061 0.3018
## PIB_Per_Capita 2.432e-05 1.047e-05 2.323 0.0314 *
## CO2_Emisiones -3.903e-02 3.729e-01 -0.105 0.9177
## Inseguridad_Robo -2.566e-03 1.359e-03 -1.888 0.0743 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2069 on 19 degrees of freedom
## Multiple R-squared: 0.6772, Adjusted R-squared: 0.5752
## F-statistic: 6.643 on 6 and 19 DF, p-value: 0.0006555
# AIC of the model, used to compare and select among the candidate models.
AIC(modelo_5)
## [1] -0.3009011
modelo_6 = lm(log(IED_M) ~ log(Tipo_de_Cambio) + log(Empleo) + log(Salario_Diario) + log(PIB_Per_Capita) + log(CO2_Emisiones) + log(Inseguridad_Robo),data= bd)
summary(modelo_6)
##
## Call:
## lm(formula = log(IED_M) ~ log(Tipo_de_Cambio) + log(Empleo) +
## log(Salario_Diario) + log(PIB_Per_Capita) + log(CO2_Emisiones) +
## log(Inseguridad_Robo), data = bd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.33129 -0.10150 -0.02987 0.07535 0.43537
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -58.8950 41.9032 -1.405 0.1760
## log(Tipo_de_Cambio) 0.1503 0.5673 0.265 0.7938
## log(Empleo) 6.2551 8.0883 0.773 0.4488
## log(Salario_Diario) -0.1777 0.2911 -0.610 0.5489
## log(PIB_Per_Capita) 3.8096 1.5341 2.483 0.0225 *
## log(CO2_Emisiones) 0.3394 1.5077 0.225 0.8243
## log(Inseguridad_Robo) -0.3563 0.3023 -1.178 0.2532
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2187 on 19 degrees of freedom
## Multiple R-squared: 0.6394, Adjusted R-squared: 0.5255
## F-statistic: 5.615 on 6 and 19 DF, p-value: 0.0017
# AIC of the model, used to compare and select among the candidate models.
AIC(modelo_6)
## [1] 2.575984
stargazer(modelo_2,modelo_5,modelo_6,type="text",title="OLS Regression Results",single.row=TRUE,ci=FALSE,ci.level=0.9)
##
## OLS Regression Results
## =================================================================================================
## Dependent variable:
## -------------------------------------------------------------------
## IED_M log(IED_M)
## (1) (2) (3)
## -------------------------------------------------------------------------------------------------
## Tipo_de_Cambio 6,855.639 (15,189.400) 0.012 (0.033)
## Empleo 25,527.140 (35,829.660) 0.056 (0.078)
## Salario_Diario -1,285.295 (1,159.315) -0.003 (0.003)
## PIB_Per_Capita 11.949** (4.824) 0.00002** (0.00001)
## CO2_Emisiones -6,298.513 (171,784.000) -0.039 (0.373)
## Inseguridad_Robo -739.248 (626.042) -0.003* (0.001)
## log(Tipo_de_Cambio) 0.150 (0.567)
## log(Empleo) 6.255 (8.088)
## log(Salario_Diario) -0.178 (0.291)
## log(PIB_Per_Capita) 3.810** (1.534)
## log(CO2_Emisiones) 0.339 (1.508)
## log(Inseguridad_Robo) -0.356 (0.302)
## Constant -3,474,435.000 (4,054,579.000) 4.937 (8.801) -58.895 (41.903)
## -------------------------------------------------------------------------------------------------
## Observations 26 26 26
## R2 0.666 0.677 0.639
## Adjusted R2 0.561 0.575 0.526
## Residual Std. Error (df = 19) 95,316.190 0.207 0.219
## F Statistic (df = 6; 19) 6.323*** 6.643*** 5.615***
## =================================================================================================
## Note: *p<0.1; **p<0.05; ***p<0.01
## multicollinearity
vif(modelo_5)
## Tipo_de_Cambio Empleo Salario_Diario PIB_Per_Capita
## 10.934894 1.835855 4.752951 5.027690
## CO2_Emisiones Inseguridad_Robo
## 2.631591 2.450383
## heteroscedasticity
bptest(modelo_5)
##
## studentized Breusch-Pagan test
##
## data: modelo_5
## BP = 5.7852, df = 6, p-value = 0.4477
## normality of residuals
histogram(modelo_5$residuals)
In model 2 (modelo_5), the VIF values give no indication of serious multicollinearity for most variables, so the model's estimates can be relied on. The only exception is the exchange rate, which had the highest VIF (10.93), slightly above the commonly used threshold of 10, so its coefficient should be read with caution. The Breusch-Pagan test of the model returned a p-value greater than 0.05 (0.45), so we fail to reject H0 of constant error variance and conclude that there is no evidence of heteroscedasticity in the model.
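To make the two diagnostics concrete, the sketch below computes them by hand in base R on simulated data. The variable names and data are assumptions for illustration. The manual formulas are the standard definitions: VIF for a predictor is 1 / (1 - R2) from regressing it on the other predictors, and the studentized Breusch-Pagan statistic is n times the R2 of regressing the squared residuals on the predictors.

```r
# Simulated data for illustration only (not the report's database)
set.seed(2)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n, sd = 0.3)   # deliberately correlated with x1
x3 <- rnorm(n)
y  <- 1 + x1 + x2 + x3 + rnorm(n)
d  <- data.frame(y, x1, x2, x3)

fit <- lm(y ~ x1 + x2 + x3, data = d)

# VIF_j = 1 / (1 - R2_j), where R2_j comes from regressing predictor j
# on the remaining predictors
predictors <- c("x1", "x2", "x3")
vif_manual <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  r2 <- summary(lm(reformulate(others, response = p), data = d))$r.squared
  1 / (1 - r2)
})
print(vif_manual)

# Studentized (Koenker) Breusch-Pagan statistic: n * R2 of the auxiliary
# regression of squared residuals on the predictors.
# H0: constant error variance (homoscedasticity).
aux     <- lm(resid(fit)^2 ~ x1 + x2 + x3, data = d)
bp_stat <- n * summary(aux)$r.squared
bp_p    <- pchisq(bp_stat, df = length(predictors), lower.tail = FALSE)
print(c(BP = bp_stat, p_value = bp_p))
```

A large p-value here means the data give no reason to reject constant variance; it does not prove that the variance is constant.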
Model selection is based mainly on comparing the AIC statistic, which reflects the expected predictive quality of each model: the lower a model's AIC relative to the others, the better. AIC values are directly comparable only between models with the same dependent variable, which here means models 2 and 3, both fitted to log(IED_M). The R2 of each model is also taken into account as a measure of the share of variance in the dependent variable that the model explains. Finally, the selection is confirmed once multicollinearity and heteroscedasticity problems have been ruled out.
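One point worth checking when the candidate models use different dependent variables (levels versus logarithms) is that their raw AIC values are on different scales. A common adjustment, sketched below on simulated data, re-expresses the log model's likelihood in terms of the original variable by adding 2 * sum(log(y)) to its AIC. The data and object names are assumptions for illustration, not results from this report.

```r
# Simulated positive response for illustration only
set.seed(3)
n <- 40
x <- runif(n, 1, 10)
y <- exp(1 + 0.2 * x + rnorm(n, sd = 0.2))

fit_level <- lm(y ~ x)        # response in levels
fit_log   <- lm(log(y) ~ x)   # response in logs

# Raw AICs are on different scales because the responses differ
aic_level <- AIC(fit_level)
aic_log   <- AIC(fit_log)

# Jacobian adjustment: express the log model's likelihood in terms of y,
# which adds 2 * sum(log(y)) to its AIC
aic_log_adjusted <- aic_log + 2 * sum(log(y))

print(c(level = aic_level, log_raw = aic_log, log_adjusted = aic_log_adjusted))
```

After the adjustment, `aic_level` and `aic_log_adjusted` refer to the same variable and can be compared in the usual lower-is-better way.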
Model 2 is selected because it has the lowest AIC and the highest R2 of the three models, and no serious multicollinearity or heteroscedasticity was found. This model uses the natural logarithm of the dependent variable, so each coefficient, multiplied by 100, is read approximately as the percentage change in foreign direct investment for a one-unit increase in the corresponding variable. The explanatory variables of this model are: “Tipo_de_Cambio”, “Empleo”, “Salario_Diario”, “PIB_Per_Capita”, “CO2_Emisiones” and “Inseguridad_Robo”.
The selected model has an R2 of 0.68 (adjusted R2 of 0.58) and a negative AIC. Among the independent variables, “PIB_Per_Capita” is the most significant for predicting foreign direct investment (p = 0.03); it has a positive impact and also has the smallest percentage change per unit increase. “Inseguridad_Robo” is the second most significant variable, although only at the 10% level (p = 0.07); it has a negative impact: the higher the theft, the lower the foreign direct investment. “Tipo_de_Cambio” and “Empleo” have positive coefficients, while “CO2_Emisiones” and “Salario_Diario” have negative ones, but none of these four is statistically significant.
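Because the dependent variable is in logarithms and the predictors are in levels, a coefficient b translates into an exact percentage change of 100 * (exp(b * delta) - 1) for a change of delta units in the predictor. The sketch below applies this to two coefficients copied from the model summary; the helper name `pct_change` and the step size of 1,000 units are illustrative assumptions.

```r
# Coefficients copied from the summary of the selected model above
b_pib  <- 2.432e-05   # PIB_Per_Capita
b_robo <- -2.566e-03  # Inseguridad_Robo

# Exact percentage change in the dependent variable for a change of
# `delta` units in a predictor, in a log-level model
pct_change <- function(b, delta = 1) 100 * (exp(b * delta) - 1)

pct_change(b_pib)          # one-unit increase in GDP per capita
pct_change(b_pib, 1000)    # 1,000-unit increase (illustrative step size)
pct_change(b_robo)         # one-unit increase in the robbery indicator
```

For small coefficients the exact value is very close to 100 * b, which is why the coefficients are often read directly as percentages.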
Glossary of variables
- “Tipo_de_Cambio” = Exchange rate
- “Empleo” = Employment rate
- “Salario_Diario” = Daily salary
- “PIB_Per_Capita” = GDP per capita
- “CO2_Emisiones” = CO2 emissions
- “Inseguridad_Robo” = Insecurity (robbery)
avPlots(modelo_5)
set.seed(123)
training.samples<-bd$IED_M %>%
createDataPartition(p=0.75,list=FALSE)
train.data<-bd[training.samples, ]
test.data<-bd[-training.samples, ]
selected_model = lm(log(IED_M) ~ Tipo_de_Cambio + Empleo + Salario_Diario + PIB_Per_Capita + CO2_Emisiones + Inseguridad_Robo, data=bd)
summary(selected_model)
##
## Call:
## lm(formula = log(IED_M) ~ Tipo_de_Cambio + Empleo + Salario_Diario +
## PIB_Per_Capita + CO2_Emisiones + Inseguridad_Robo, data = bd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.31681 -0.07548 -0.03732 0.09449 0.40316
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.937e+00 8.801e+00 0.561 0.5814
## Tipo_de_Cambio 1.159e-02 3.297e-02 0.352 0.7290
## Empleo 5.597e-02 7.777e-02 0.720 0.4805
## Salario_Diario -2.671e-03 2.516e-03 -1.061 0.3018
## PIB_Per_Capita 2.432e-05 1.047e-05 2.323 0.0314 *
## CO2_Emisiones -3.903e-02 3.729e-01 -0.105 0.9177
## Inseguridad_Robo -2.566e-03 1.359e-03 -1.888 0.0743 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2069 on 19 degrees of freedom
## Multiple R-squared: 0.6772, Adjusted R-squared: 0.5752
## F-statistic: 6.643 on 6 and 19 DF, p-value: 0.0006555
RMSE(selected_model$fitted.values,test.data$IED_M)
## [1] 532900
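An out-of-sample RMSE is only meaningful when the predictions and the observed values cover the same rows and are on the same scale. The sketch below shows one way to do that for a model with a log-transformed dependent variable: fit on the training rows, predict the held-out rows, and back-transform with exp() before comparing. The data and column names (`fdi`, `gdp`) are simulated assumptions, and the base R random split is a simple stand-in for `createDataPartition()`.

```r
# Simulated data for illustration only (column names are hypothetical)
set.seed(4)
n   <- 60
gdp <- runif(n, 100, 200)
fdi <- exp(5 + 0.01 * gdp + rnorm(n, sd = 0.2))
d   <- data.frame(fdi, gdp)

# Simple random 75/25 split (base R stand-in for createDataPartition)
train_idx <- sample(seq_len(n), size = floor(0.75 * n))
train <- d[train_idx, ]
test  <- d[-train_idx, ]

# Fit on the training rows only
fit <- lm(log(fdi) ~ gdp, data = train)

# Predict the held-out rows and back-transform from logs to levels
# (plain exp() of the log-scale prediction, with no bias correction)
pred_level <- exp(predict(fit, newdata = test))

# RMSE between vectors of the same length and on the same scale
rmse <- sqrt(mean((pred_level - test$fdi)^2))
print(rmse)
```

Computed this way, the RMSE of the OLS model is expressed in the units of the dependent variable and can be set against the RMSE of a lasso model evaluated on the same test rows.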
x = model.matrix(log(IED_M) ~ Tipo_de_Cambio + Empleo + Salario_Diario + PIB_Per_Capita + CO2_Emisiones + Inseguridad_Robo, train.data)[,-1]
y = train.data$IED_M
set.seed(123)
cv.lasso<-cv.glmnet(x,y,alpha=1)
cv.lasso$lambda.min
## [1] 2865.87
lassomodel<-glmnet(x,y,alpha=1,lambda=cv.lasso$lambda.min)
coef(lassomodel)
## 7 x 1 sparse Matrix of class "dgCMatrix"
## s0
## (Intercept) -3.500878e+06
## Tipo_de_Cambio 1.397916e+04
## Empleo 2.945387e+04
## Salario_Diario -1.154902e+03
## PIB_Per_Capita 8.214213e+00
## CO2_Emisiones .
## Inseguridad_Robo -6.260518e+02
x.test<-model.matrix(log(IED_M) ~ Tipo_de_Cambio + Empleo + Salario_Diario + PIB_Per_Capita + CO2_Emisiones + Inseguridad_Robo,test.data)[,-1]
lassopredictions <- lassomodel %>% predict(x.test) %>% as.vector()
data.frame(
RMSE = RMSE(lassopredictions, test.data$IED_M),
Rsquare = R2(lassopredictions, test.data$IED_M))
## RMSE Rsquare
## 1 133038 0.606517
# Lasso model graph
# Helper that labels each coefficient path at the smallest lambda
lbs_fun <- function(fit, offset_x=1, ...) {
  L <- length(fit$lambda)
  x <- log(fit$lambda[L]) + offset_x
  y <- fit$beta[, L]
  labs <- names(y)
  text(x, y, labels=labs, ...)
}
lasso<-glmnet(scale(x),y,alpha=1)
plot(lasso,xvar="lambda",label=TRUE)
lbs_fun(lasso)
# The x-axis is log(lambda), so the reference lines are drawn on the log scale
abline(v=log(cv.lasso$lambda.min),col="red",lty=2)
abline(v=log(cv.lasso$lambda.1se),col="blue",lty=2)
bd$periodo<- as.Date(paste0(bd$periodo, "-01-01"))
bdxts<-xts(bd$IED_M,order.by=bd$periodo)
dygraph(bdxts, main = "Foreign Direct Investment Flows") %>%
dyOptions(colors = RColorBrewer::brewer.pal(4, "Dark2")) %>%
dyShading(from = "2018-12-3",
to = "2022-12-26",
color = "#FFE6E6")
# Stationarity test
adf.test(bd$IED_M)
##
## Augmented Dickey-Fuller Test
##
## data: bd$IED_M
## Dickey-Fuller = -2.0122, Lag order = 2, p-value = 0.5677
## alternative hypothesis: stationary
With a p-value of 0.57, we fail to reject H0: the time series is non-stationary.
acf(bd$IED_M,main="Significant Autocorrelations")
The dependent variable shows some serial autocorrelation at the first lags (T-1 to T-2); at further lags, the autocorrelation cannot be considered significant.
Duran, R. (2023). Nearshoring: 10 preguntas y respuestas sobre el tema del que todos hablan. EGADE. https://egade.tec.mx/es/egade-ideas/investigacion/nearshoring-10-preguntas-y-respuestas-sobre-el-tema-del-que-todos-hablan
University of Bath. (2021). Descriptive, predictive and prescriptive: three types of business analytics. The University of Bath. https://online.bath.ac.uk/content/descriptive-predictive-and-prescriptive-three-types-business-analytics#:~:text=There%20are%20three%20types%20of,should%20happen%20in%20the%20future
Wohlwend, B. (2023). Three Regression Models for Data Science: Linear Regression, Lasso Regression, and Ridge Regression. Medium. https://medium.com/@brandon93.w/three-regression-models-for-data-science-linear-regression-lasso-regression-and-ridge-regression-6aac73c0d7a5#:~:text=Comparison%20of%20Linear%2C%20Lasso%2C%20and%20Ridge%20Regression&text=Model%20Complexity%20and%20Overfitting%3A%20All,to%20limit%20the%20model’s%20complexity
XLSTAT. (2023). Ordinary Least Squares regression (OLS). XLSTAT. https://www.xlstat.com/en/solutions/features/ordinary-least-squares-regression-ols