Nearshoring is the relocation of business operations as part of a restructuring of the supply chain. It shortens the supply chain by moving production centers closer to the target market, which reduces both production costs and risk for the company.
Mexico offers several advantages that may attract nearshoring investment. One of the greatest is its proximity to the United States market, since, as mentioned above, companies seek to shorten the distance to their production centers. Another is the gap in wages and labor costs between Mexico and other countries, which leads some United States corporations to relocate to Mexico to reduce their costs. According to a Reuters report, US manufacturers pay workers in Mexican factories between $8 and $10 USD per hour, whereas General Motors pays up to $68 USD per hour to workers at its Detroit factory (BBC World, 2017).
Predictive analytics is the practice of using historical data, statistical algorithms, and machine learning techniques to forecast future events or outcomes. It consists of analyzing past patterns in data to develop informed predictions about future behavior or events, which helps businesses and organizations make better-informed decisions.
A fundamental method within predictive analytics is regression analysis, a statistical method used to estimate the relationship between a dependent variable and one or more independent variables. The estimated mathematical relationship can then be used to make more precise predictions from the values of the independent variables.
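As a minimal sketch of the idea, the following R snippet fits a simple linear regression on simulated data and uses it to predict a new observation. The data, the true coefficients (3 and 2), and the object names are hypothetical and are not part of the Mexico dataset.

```r
# Toy data (hypothetical): y depends linearly on x plus random noise
set.seed(42)
x <- seq(1, 20)
y <- 3 + 2 * x + rnorm(20, sd = 1)

# Estimate y = b0 + b1 * x by least squares
toy_fit <- lm(y ~ x)
toy_coefs <- coef(toy_fit)
print(toy_coefs)

# Use the estimated relationship to predict y for a new value of x
toy_pred <- predict(toy_fit, newdata = data.frame(x = 25))
print(toy_pred)
```

The estimated slope should land close to the value used to simulate the data, which is the sense in which regression recovers the relationship between the variables.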
Because regression analysis quantifies the relationship between the independent variables and the dependent variable, it is well suited to studying nearshoring in the Mexican case. The method helps us identify which of the given independent variables (exchange rate, insecurity due to homicide, education, minimum wage, etc.) has the greatest impact on attracting nearshoring investors to Mexico, that is, which socioeconomic, technological, ecological, or security variables will be most attractive to investors considering relocating their companies to Mexico in the coming years.
According to the document “Mexico and Its Attractiveness for Nearshoring”, the problem situation focuses on discovering which conditions Mexico offers to attract nearshoring, that is, which socioeconomic, environmental, security, or technological conditions foreign investors weigh most heavily when relocating their companies to Mexico. Several data analyses must be carried out, and methods such as regression analysis must be applied, to detect which variables have the greatest impact on foreign investment in Mexico.
# Here we import the libraries:
library(foreign)
library(dplyr) # data manipulation
library(forcats) # to work with categorical variables
library(ggplot2) # data visualization
library(readr) # read specific csv files
library(janitor) # data exploration and cleaning
library(Hmisc) # several useful functions for data analysis
library(psych) # functions for multivariate analysis
library(naniar) # summaries and visualization of missing values NA's
library(dlookr) # data diagnosis, exploration, and transformation
library(corrplot) # correlation plots
library(jtools) # presentation of regression analysis
library(lmtest) # diagnostic checks - linear regression analysis
library(car) # diagnostic checks - linear regression analysis
library(olsrr) # diagnostic checks - linear regression analysis
library(stargazer) # create publication quality tables
library(effects) # displays for linear and other regression models
library(tidyverse) # collection of R packages designed for data science
library(caret) # Classification and Regression Training
library(glmnet) # methods for prediction and plotting, and functions for cross-validation
library(readxl) # Read excel files
# Here we upload the database from excel and select only the first sheet.
dataset <- read_excel("/Users/yessicaacosta/Downloads/SP_DataMexicoAtractiveness_alumn-VF_corrected.xlsx",sheet=1,range="A6:R32",na="-")
dataset
## # A tibble: 26 × 18
## Año IED_Flujos IED_Flujos_MXN Exportaciones Exportaciones_MXN Empleo
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1997 12146. 294298. 9088. 220201. NA
## 2 1998 8374. 210849. 9875. 248659. NA
## 3 1999 13960. 299834. 10990. 236039. NA
## 4 2000 18249. 362638. 12483. 248061. 97.8
## 5 2001 30057. 546448. 11300. 205445. 97.4
## 6 2002 24099. 468391. 11923. 231737. 97.7
## 7 2003 18250. 368747. 13156 265822. 97.1
## 8 2004 25016. 481300. 13573. 261147. 96.5
## 9 2005 25796. 458581. 16466. 292718. 97.2
## 10 2006 21233. 368329. 17486. 303335. 96.5
## # ℹ 16 more rows
## # ℹ 12 more variables: Educación <dbl>, Salario_Diario <dbl>, Innovación <dbl>,
## # Inseguridad_Robo <dbl>, Inseguridad_Homicidio <dbl>, Tipo_de_Cambio <dbl>,
## # Densidad_Carretera <dbl>, Densidad_Población <dbl>, CO2_Emisiones <dbl>,
## # PIB_Per_Cápita <dbl>, INPC <dbl>, Crisis_Financiera <dbl>
# Which are the variables we are working with?
variables <- colnames(dataset)
variables
## [1] "Año" "IED_Flujos" "IED_Flujos_MXN"
## [4] "Exportaciones" "Exportaciones_MXN" "Empleo"
## [7] "Educación" "Salario_Diario" "Innovación"
## [10] "Inseguridad_Robo" "Inseguridad_Homicidio" "Tipo_de_Cambio"
## [13] "Densidad_Carretera" "Densidad_Población" "CO2_Emisiones"
## [16] "PIB_Per_Cápita" "INPC" "Crisis_Financiera"
# Getting to know the data type of our variables
str(dataset) ## all of our variables show to be numerical
## tibble [26 × 18] (S3: tbl_df/tbl/data.frame)
## $ Año : num [1:26] 1997 1998 1999 2000 2001 ...
## $ IED_Flujos : num [1:26] 12146 8374 13960 18249 30057 ...
## $ IED_Flujos_MXN : num [1:26] 294298 210849 299834 362638 546448 ...
## $ Exportaciones : num [1:26] 9088 9875 10990 12483 11300 ...
## $ Exportaciones_MXN : num [1:26] 220201 248659 236039 248061 205445 ...
## $ Empleo : num [1:26] NA NA NA 97.8 97.4 ...
## $ Educación : num [1:26] 7.2 7.31 7.43 7.56 7.68 ...
## $ Salario_Diario : num [1:26] 24.3 31.9 31.9 35.1 37.6 ...
## $ Innovación : num [1:26] 11.3 11.4 12.5 13.1 13.5 ...
## $ Inseguridad_Robo : num [1:26] 267 315 273 217 215 ...
## $ Inseguridad_Homicidio: num [1:26] 14.6 14.3 12.6 10.9 10.2 ...
## $ Tipo_de_Cambio : num [1:26] 8.06 9.94 9.52 9.6 9.17 ...
## $ Densidad_Carretera : num [1:26] 0.0521 0.053 0.055 0.0552 0.0565 ...
## $ Densidad_Población : num [1:26] 47.4 48.8 49.5 50.6 51.3 ...
## $ CO2_Emisiones : num [1:26] 3.68 3.85 3.69 3.87 3.81 ...
## $ PIB_Per_Cápita : num [1:26] 127570 126739 129165 130875 128083 ...
## $ INPC : num [1:26] 33.3 39.5 44.3 48.3 50.4 ...
## $ Crisis_Financiera : num [1:26] 0 0 0 0 0 0 0 0 0 0 ...
# Here we can see the first values of our dataset
head(dataset)
## # A tibble: 6 × 18
## Año IED_Flujos IED_Flujos_MXN Exportaciones Exportaciones_MXN Empleo
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1997 12146. 294298. 9088. 220201. NA
## 2 1998 8374. 210849. 9875. 248659. NA
## 3 1999 13960. 299834. 10990. 236039. NA
## 4 2000 18249. 362638. 12483. 248061. 97.8
## 5 2001 30057. 546448. 11300. 205445. 97.4
## 6 2002 24099. 468391. 11923. 231737. 97.7
## # ℹ 12 more variables: Educación <dbl>, Salario_Diario <dbl>, Innovación <dbl>,
## # Inseguridad_Robo <dbl>, Inseguridad_Homicidio <dbl>, Tipo_de_Cambio <dbl>,
## # Densidad_Carretera <dbl>, Densidad_Población <dbl>, CO2_Emisiones <dbl>,
## # PIB_Per_Cápita <dbl>, INPC <dbl>, Crisis_Financiera <dbl>
# Calculate descriptive statistics
statistics_dt <- summary(dataset)
statistics_dt
## Año IED_Flujos IED_Flujos_MXN Exportaciones
## Min. :1997 Min. : 8374 Min. :210849 Min. : 9088
## 1st Qu.:2003 1st Qu.:21367 1st Qu.:368434 1st Qu.:13260
## Median :2010 Median :27698 Median :497116 Median :21188
## Mean :2010 Mean :26770 Mean :493603 Mean :23601
## 3rd Qu.:2016 3rd Qu.:32183 3rd Qu.:578789 3rd Qu.:31601
## Max. :2022 Max. :48354 Max. :754160 Max. :46478
##
## Exportaciones_MXN Empleo Educación Salario_Diario
## Min. :205445 Min. :95.06 Min. :7.198 Min. : 24.30
## 1st Qu.:262316 1st Qu.:95.90 1st Qu.:7.864 1st Qu.: 41.97
## Median :366363 Median :96.53 Median :8.457 Median : 54.48
## Mean :433858 Mean :96.47 Mean :8.424 Mean : 65.16
## 3rd Qu.:632411 3rd Qu.:97.08 3rd Qu.:9.004 3rd Qu.: 72.31
## Max. :785503 Max. :97.83 Max. :9.579 Max. :172.87
## NA's :3 NA's :3
## Innovación Inseguridad_Robo Inseguridad_Homicidio Tipo_de_Cambio
## Min. :11.28 Min. :120.5 Min. : 8.037 Min. : 8.064
## 1st Qu.:12.56 1st Qu.:148.3 1st Qu.:10.250 1st Qu.:10.752
## Median :13.09 Median :181.8 Median :16.928 Median :13.016
## Mean :13.11 Mean :185.4 Mean :17.292 Mean :13.910
## 3rd Qu.:13.75 3rd Qu.:209.9 3rd Qu.:22.433 3rd Qu.:18.489
## Max. :15.11 Max. :314.8 Max. :29.592 Max. :20.664
## NA's :2 NA's :1
## Densidad_Carretera Densidad_Población CO2_Emisiones PIB_Per_Cápita
## Min. :0.05205 Min. :47.44 Min. :3.592 Min. :126739
## 1st Qu.:0.05954 1st Qu.:52.78 1st Qu.:3.832 1st Qu.:130964
## Median :0.06989 Median :58.09 Median :3.925 Median :136845
## Mean :0.07106 Mean :57.33 Mean :3.946 Mean :138550
## 3rd Qu.:0.08275 3rd Qu.:61.39 3rd Qu.:4.106 3rd Qu.:146148
## Max. :0.09020 Max. :65.60 Max. :4.221 Max. :153236
## NA's :3
## INPC Crisis_Financiera
## Min. : 33.28 Min. :0.00000
## 1st Qu.: 56.15 1st Qu.:0.00000
## Median : 73.35 Median :0.00000
## Mean : 75.17 Mean :0.07692
## 3rd Qu.: 91.29 3rd Qu.:0.00000
## Max. :126.48 Max. :1.00000
##
#Identify missing values in our dataset:
is.na(dataset)
## Año IED_Flujos IED_Flujos_MXN Exportaciones Exportaciones_MXN Empleo
## [1,] FALSE FALSE FALSE FALSE FALSE TRUE
## [2,] FALSE FALSE FALSE FALSE FALSE TRUE
## [3,] FALSE FALSE FALSE FALSE FALSE TRUE
## [4,] FALSE FALSE FALSE FALSE FALSE FALSE
## [5,] FALSE FALSE FALSE FALSE FALSE FALSE
## [6,] FALSE FALSE FALSE FALSE FALSE FALSE
## [7,] FALSE FALSE FALSE FALSE FALSE FALSE
## [8,] FALSE FALSE FALSE FALSE FALSE FALSE
## [9,] FALSE FALSE FALSE FALSE FALSE FALSE
## [10,] FALSE FALSE FALSE FALSE FALSE FALSE
## [11,] FALSE FALSE FALSE FALSE FALSE FALSE
## [12,] FALSE FALSE FALSE FALSE FALSE FALSE
## [13,] FALSE FALSE FALSE FALSE FALSE FALSE
## [14,] FALSE FALSE FALSE FALSE FALSE FALSE
## [15,] FALSE FALSE FALSE FALSE FALSE FALSE
## [16,] FALSE FALSE FALSE FALSE FALSE FALSE
## [17,] FALSE FALSE FALSE FALSE FALSE FALSE
## [18,] FALSE FALSE FALSE FALSE FALSE FALSE
## [19,] FALSE FALSE FALSE FALSE FALSE FALSE
## [20,] FALSE FALSE FALSE FALSE FALSE FALSE
## [21,] FALSE FALSE FALSE FALSE FALSE FALSE
## [22,] FALSE FALSE FALSE FALSE FALSE FALSE
## [23,] FALSE FALSE FALSE FALSE FALSE FALSE
## [24,] FALSE FALSE FALSE FALSE FALSE FALSE
## [25,] FALSE FALSE FALSE FALSE FALSE FALSE
## [26,] FALSE FALSE FALSE FALSE FALSE FALSE
## Educación Salario_Diario Innovación Inseguridad_Robo
## [1,] FALSE FALSE FALSE FALSE
## [2,] FALSE FALSE FALSE FALSE
## [3,] FALSE FALSE FALSE FALSE
## [4,] FALSE FALSE FALSE FALSE
## [5,] FALSE FALSE FALSE FALSE
## [6,] FALSE FALSE FALSE FALSE
## [7,] FALSE FALSE FALSE FALSE
## [8,] FALSE FALSE FALSE FALSE
## [9,] FALSE FALSE FALSE FALSE
## [10,] FALSE FALSE FALSE FALSE
## [11,] FALSE FALSE FALSE FALSE
## [12,] FALSE FALSE FALSE FALSE
## [13,] FALSE FALSE FALSE FALSE
## [14,] FALSE FALSE FALSE FALSE
## [15,] FALSE FALSE FALSE FALSE
## [16,] FALSE FALSE FALSE FALSE
## [17,] FALSE FALSE FALSE FALSE
## [18,] FALSE FALSE FALSE FALSE
## [19,] FALSE FALSE FALSE FALSE
## [20,] FALSE FALSE FALSE FALSE
## [21,] FALSE FALSE FALSE FALSE
## [22,] FALSE FALSE FALSE FALSE
## [23,] FALSE FALSE FALSE FALSE
## [24,] TRUE FALSE FALSE FALSE
## [25,] TRUE FALSE TRUE FALSE
## [26,] TRUE FALSE TRUE FALSE
## Inseguridad_Homicidio Tipo_de_Cambio Densidad_Carretera
## [1,] FALSE FALSE FALSE
## [2,] FALSE FALSE FALSE
## [3,] FALSE FALSE FALSE
## [4,] FALSE FALSE FALSE
## [5,] FALSE FALSE FALSE
## [6,] FALSE FALSE FALSE
## [7,] FALSE FALSE FALSE
## [8,] FALSE FALSE FALSE
## [9,] FALSE FALSE FALSE
## [10,] FALSE FALSE FALSE
## [11,] FALSE FALSE FALSE
## [12,] FALSE FALSE FALSE
## [13,] FALSE FALSE FALSE
## [14,] FALSE FALSE FALSE
## [15,] FALSE FALSE FALSE
## [16,] FALSE FALSE FALSE
## [17,] FALSE FALSE FALSE
## [18,] FALSE FALSE FALSE
## [19,] FALSE FALSE FALSE
## [20,] FALSE FALSE FALSE
## [21,] FALSE FALSE FALSE
## [22,] FALSE FALSE FALSE
## [23,] FALSE FALSE FALSE
## [24,] FALSE FALSE FALSE
## [25,] FALSE FALSE FALSE
## [26,] TRUE FALSE FALSE
## Densidad_Población CO2_Emisiones PIB_Per_Cápita INPC Crisis_Financiera
## [1,] FALSE FALSE FALSE FALSE FALSE
## [2,] FALSE FALSE FALSE FALSE FALSE
## [3,] FALSE FALSE FALSE FALSE FALSE
## [4,] FALSE FALSE FALSE FALSE FALSE
## [5,] FALSE FALSE FALSE FALSE FALSE
## [6,] FALSE FALSE FALSE FALSE FALSE
## [7,] FALSE FALSE FALSE FALSE FALSE
## [8,] FALSE FALSE FALSE FALSE FALSE
## [9,] FALSE FALSE FALSE FALSE FALSE
## [10,] FALSE FALSE FALSE FALSE FALSE
## [11,] FALSE FALSE FALSE FALSE FALSE
## [12,] FALSE FALSE FALSE FALSE FALSE
## [13,] FALSE FALSE FALSE FALSE FALSE
## [14,] FALSE FALSE FALSE FALSE FALSE
## [15,] FALSE FALSE FALSE FALSE FALSE
## [16,] FALSE FALSE FALSE FALSE FALSE
## [17,] FALSE FALSE FALSE FALSE FALSE
## [18,] FALSE FALSE FALSE FALSE FALSE
## [19,] FALSE FALSE FALSE FALSE FALSE
## [20,] FALSE FALSE FALSE FALSE FALSE
## [21,] FALSE FALSE FALSE FALSE FALSE
## [22,] FALSE FALSE FALSE FALSE FALSE
## [23,] FALSE FALSE FALSE FALSE FALSE
## [24,] FALSE TRUE FALSE FALSE FALSE
## [25,] FALSE TRUE FALSE FALSE FALSE
## [26,] FALSE TRUE FALSE FALSE FALSE
# Counting the missing values in the dataset
sum(is.na(dataset)) ### 12 missing values are identified
## [1] 12
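The full logical matrix above is hard to scan. A more compact view counts the missing values per column. The sketch below uses a small hypothetical data frame (`toy`, with made-up values); on the real data the same idea would be `colSums(is.na(dataset))`.

```r
# Hypothetical data frame with a few missing values
toy <- data.frame(
  year   = 2000:2005,
  employ = c(NA, NA, 97.8, 97.4, 97.7, 97.1),
  co2    = c(3.7, 3.9, 3.7, 3.9, NA, 3.8)
)

# Number of missing values in each column
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# Share of missing values in each column, in percent
na_share <- round(100 * colMeans(is.na(toy)), 1)
print(na_share)
```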
# Visual representation of missing values in the dataset
gg_miss_var(dataset)
# Replacing missing values with the median of each variable
# Median imputation keeps all 26 observations, but it reduces the variability of the imputed variables
dataset <- dataset %>%
  mutate(across(everything(), ~ ifelse(is.na(.x), median(.x, na.rm = TRUE), .x)))
print(dataset)
## # A tibble: 26 × 18
## Año IED_Flujos IED_Flujos_MXN Exportaciones Exportaciones_MXN Empleo
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1997 12146. 294298. 9088. 220201. 96.5
## 2 1998 8374. 210849. 9875. 248659. 96.5
## 3 1999 13960. 299834. 10990. 236039. 96.5
## 4 2000 18249. 362638. 12483. 248061. 97.8
## 5 2001 30057. 546448. 11300. 205445. 97.4
## 6 2002 24099. 468391. 11923. 231737. 97.7
## 7 2003 18250. 368747. 13156 265822. 97.1
## 8 2004 25016. 481300. 13573. 261147. 96.5
## 9 2005 25796. 458581. 16466. 292718. 97.2
## 10 2006 21233. 368329. 17486. 303335. 96.5
## # ℹ 16 more rows
## # ℹ 12 more variables: Educación <dbl>, Salario_Diario <dbl>, Innovación <dbl>,
## # Inseguridad_Robo <dbl>, Inseguridad_Homicidio <dbl>, Tipo_de_Cambio <dbl>,
## # Densidad_Carretera <dbl>, Densidad_Población <dbl>, CO2_Emisiones <dbl>,
## # PIB_Per_Cápita <dbl>, INPC <dbl>, Crisis_Financiera <dbl>
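To illustrate the effect of median imputation on dispersion, here is a small sketch on a hypothetical vector. The helper `impute_median()` and the values are illustrative assumptions, not part of the original analysis.

```r
# Hypothetical helper: replace NA values with the median of the observed values
impute_median <- function(v) {
  v[is.na(v)] <- median(v, na.rm = TRUE)
  v
}

toy_v <- c(95.1, 96.0, NA, 96.5, 97.0, NA, 97.8)
toy_filled <- impute_median(toy_v)
print(toy_filled)

# Standard deviation before (ignoring NAs) and after imputation
sd_before <- sd(toy_v, na.rm = TRUE)
sd_after <- sd(toy_filled)
print(c(before = sd_before, after = sd_after))
```

Because every imputed value sits at the center of the distribution, the standard deviation after imputation is smaller than before, which is worth keeping in mind when reading the dispersion measures of the imputed variables.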
# Transformation of variables
# The descriptive statistics show a large gap between the minimum and maximum of "Exportaciones_MXN", which gives the variable a skewed distribution (see the histograms below)
# Replacing the maximum and minimum values with the median can help obtain a more balanced distribution
dataset$Exportaciones_MXN[dataset$Exportaciones_MXN == max(dataset$Exportaciones_MXN)] <- median(dataset$Exportaciones_MXN)
dataset$Exportaciones_MXN[dataset$Exportaciones_MXN == min(dataset$Exportaciones_MXN)] <- median(dataset$Exportaciones_MXN)
dataset
## # A tibble: 26 × 18
## Año IED_Flujos IED_Flujos_MXN Exportaciones Exportaciones_MXN Empleo
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1997 12146. 294298. 9088. 220201. 96.5
## 2 1998 8374. 210849. 9875. 248659. 96.5
## 3 1999 13960. 299834. 10990. 236039. 96.5
## 4 2000 18249. 362638. 12483. 248061. 97.8
## 5 2001 30057. 546448. 11300. 362218. 97.4
## 6 2002 24099. 468391. 11923. 231737. 97.7
## 7 2003 18250. 368747. 13156 265822. 97.1
## 8 2004 25016. 481300. 13573. 261147. 96.5
## 9 2005 25796. 458581. 16466. 292718. 97.2
## 10 2006 21233. 368329. 17486. 303335. 96.5
## # ℹ 16 more rows
## # ℹ 12 more variables: Educación <dbl>, Salario_Diario <dbl>, Innovación <dbl>,
## # Inseguridad_Robo <dbl>, Inseguridad_Homicidio <dbl>, Tipo_de_Cambio <dbl>,
## # Densidad_Carretera <dbl>, Densidad_Población <dbl>, CO2_Emisiones <dbl>,
## # PIB_Per_Cápita <dbl>, INPC <dbl>, Crisis_Financiera <dbl>
# Showing statistics of our dataset with complete values
summary(dataset)
## Año IED_Flujos IED_Flujos_MXN Exportaciones
## Min. :1997 Min. : 8374 Min. :210849 Min. : 9088
## 1st Qu.:2003 1st Qu.:21367 1st Qu.:368434 1st Qu.:13260
## Median :2010 Median :27698 Median :497116 Median :21188
## Mean :2010 Mean :26770 Mean :493603 Mean :23601
## 3rd Qu.:2016 3rd Qu.:32183 3rd Qu.:578789 3rd Qu.:31601
## Max. :2022 Max. :48354 Max. :754160 Max. :46478
## Exportaciones_MXN Empleo Educación Salario_Diario
## Min. :220201 Min. :95.06 Min. :7.198 Min. : 24.30
## 1st Qu.:272546 1st Qu.:96.09 1st Qu.:7.955 1st Qu.: 41.97
## Median :364291 Median :96.53 Median :8.457 Median : 54.48
## Mean :423767 Mean :96.48 Mean :8.428 Mean : 65.16
## 3rd Qu.:571383 3rd Qu.:97.01 3rd Qu.:8.929 3rd Qu.: 72.31
## Max. :749408 Max. :97.83 Max. :9.579 Max. :172.87
## Innovación Inseguridad_Robo Inseguridad_Homicidio Tipo_de_Cambio
## Min. :11.28 Min. :120.5 Min. : 8.037 Min. : 8.064
## 1st Qu.:12.60 1st Qu.:148.3 1st Qu.:10.402 1st Qu.:10.752
## Median :13.09 Median :181.8 Median :16.928 Median :13.016
## Mean :13.10 Mean :185.4 Mean :17.278 Mean :13.910
## 3rd Qu.:13.60 3rd Qu.:209.9 3rd Qu.:22.346 3rd Qu.:18.489
## Max. :15.11 Max. :314.8 Max. :29.592 Max. :20.664
## Densidad_Carretera Densidad_Población CO2_Emisiones PIB_Per_Cápita
## Min. :0.05205 Min. :47.44 Min. :3.592 Min. :126739
## 1st Qu.:0.05954 1st Qu.:52.78 1st Qu.:3.843 1st Qu.:130964
## Median :0.06989 Median :58.09 Median :3.925 Median :136845
## Mean :0.07106 Mean :57.33 Mean :3.944 Mean :138550
## 3rd Qu.:0.08275 3rd Qu.:61.39 3rd Qu.:4.088 3rd Qu.:146148
## Max. :0.09020 Max. :65.60 Max. :4.221 Max. :153236
## INPC Crisis_Financiera
## Min. : 33.28 Min. :0.00000
## 1st Qu.: 56.15 1st Qu.:0.00000
## Median : 73.35 Median :0.00000
## Mean : 75.17 Mean :0.07692
## 3rd Qu.: 91.29 3rd Qu.:0.00000
## Max. :126.48 Max. :1.00000
# Measures of Dispersion
describe(dataset)
## # A tibble: 18 × 26
## described_variables n na mean sd se_mean IQR skewness
## <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Año 26 0 2.01e+3 7.65e+0 1.5 e+0 1.25e+1 0
## 2 IED_Flujos 26 0 2.68e+4 8.77e+3 1.72e+3 1.08e+4 -0.0209
## 3 IED_Flujos_MXN 26 0 4.94e+5 1.44e+5 2.82e+4 2.10e+5 -0.0125
## 4 Exportaciones 26 0 2.36e+4 1.13e+4 2.23e+3 1.83e+4 0.514
## 5 Exportaciones_MXN 26 0 4.24e+5 1.77e+5 3.47e+4 2.99e+5 0.665
## 6 Empleo 26 0 9.65e+1 7.21e-1 1.41e-1 9.22e-1 -0.191
## 7 Educación 26 0 8.43e+0 6.76e-1 1.33e-1 9.74e-1 -0.130
## 8 Salario_Diario 26 0 6.52e+1 3.58e+1 7.03e+0 3.03e+1 1.62
## 9 Innovación 26 0 1.31e+1 1.07e+0 2.10e-1 1.01e+0 0.146
## 10 Inseguridad_Robo 26 0 1.85e+2 4.77e+1 9.35e+0 6.16e+1 1.00
## 11 Inseguridad_Homicidio 26 0 1.73e+1 7.12e+0 1.40e+0 1.19e+1 0.430
## 12 Tipo_de_Cambio 26 0 1.39e+1 4.15e+0 8.14e-1 7.74e+0 0.498
## 13 Densidad_Carretera 26 0 7.11e-2 1.34e-2 2.62e-3 2.32e-2 0.201
## 14 Densidad_Población 26 0 5.73e+1 5.41e+0 1.06e+0 8.61e+0 -0.210
## 15 CO2_Emisiones 26 0 3.94e+0 1.81e-1 3.54e-2 2.46e-1 -0.121
## 16 PIB_Per_Cápita 26 0 1.39e+5 8.86e+3 1.74e+3 1.52e+4 0.310
## 17 INPC 26 0 7.52e+1 2.48e+1 4.87e+0 3.51e+1 0.295
## 18 Crisis_Financiera 26 0 7.69e-2 2.72e-1 5.33e-2 0 3.37
## # ℹ 18 more variables: kurtosis <dbl>, p00 <dbl>, p01 <dbl>, p05 <dbl>,
## # p10 <dbl>, p20 <dbl>, p25 <dbl>, p30 <dbl>, p40 <dbl>, p50 <dbl>,
## # p60 <dbl>, p70 <dbl>, p75 <dbl>, p80 <dbl>, p90 <dbl>, p95 <dbl>,
## # p99 <dbl>, p100 <dbl>
# Boxplot of log(IED_Flujos_MXN) by "Crisis_Financiera", where 0 stands for NO and 1 for YES.
boxplot(log(IED_Flujos_MXN)~Crisis_Financiera,data=dataset)
# Graph 1: foreign investment flows per year, with bars shaded by the exchange rate
graph1 <- ggplot(data=dataset,aes(x=Año,y=IED_Flujos_MXN,fill=Tipo_de_Cambio)) +
  geom_bar(stat="identity")
graph1
This graph shows how foreign direct investment in Mexico has changed over
the years, with each bar shaded by the exchange rate (Tipo_de_Cambio). It
suggests that the exchange rate may explain part of the variation in
foreign investment.
ggplot(dataset, aes(x = Tipo_de_Cambio , y = IED_Flujos_MXN)) +
  geom_point(color = "pink", size = 4, shape = 18) +
  labs(title = "Exchange Rate vs. Investment Flow",
       x = "Exchange Rate",
       y = "Foreign Investment Flow")
We can observe that investment flows are not constant across exchange
rate levels.
# Line plot of log foreign investment flows against innovation
x <- dataset$Innovación
y <- log(dataset$IED_Flujos_MXN)
df <- data.frame(x = x, y = y)
graph4 <- ggplot(df, aes(x = x, y = y)) +
  geom_line()
graph4 +
  labs(title = "Innovation vs. Investment Flow",
       x = "Innovation",
       y = "log(Investment Flow)")
ggplot(dataset, aes(x=log(Exportaciones),y=IED_Flujos_MXN)) +
geom_point(aes(size=Tipo_de_Cambio))
We can interpret that there is a relationship between the exchange rate
and exports that might explain the behavior of the investment flows.
# A logarithmic transformation is applied to IED_Flujos to reduce the range of values
ggplot(dataset, aes(x=Salario_Diario, y= log(IED_Flujos), col=Salario_Diario)) + geom_point() +
  ggtitle("Investment Flows & Wage") +
  xlab("Daily wage") +
  ylab("log(IED_Flujos)") +
  scale_color_gradient(high="green",low="red",
                       name="Wage")
This graph shows that the higher the daily wage, the larger the
investment flow tends to be.
detach("package:dplyr", unload = TRUE)
## Warning: 'dplyr' namespace cannot be unloaded:
## namespace 'dplyr' is imported by 'recipes', 'tidyr', 'dlookr' so cannot be unloaded
library(dplyr)
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:car':
##
## recode
## The following objects are masked from 'package:Hmisc':
##
## src, summarize
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
dataset_cor<-dataset %>% select(-Exportaciones, -IED_Flujos)
corrplot(cor(dataset_cor),type='upper',order='hclust',addCoef.col='black',
tl.col = "black", tl.srt = 80, number.cex = 0.4)
“Inseguridad_Robo” has a negative correlation with the investment flow.
“Tipo_de_Cambio”, “Densidad_Carretera”, “Densidad_Población”, and
“Educación” seem to have the highest correlation values.
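Reading individual coefficients off a correlation plot can be error-prone. One way to make the comparison explicit is to extract the column of correlations with the dependent variable and rank it by absolute value. The sketch below uses a small hypothetical data frame (`toy`, with made-up numbers); with the real data the same steps could be applied to `cor(dataset_cor)[, "IED_Flujos_MXN"]`.

```r
# Hypothetical values for two predictors and a response
toy <- data.frame(
  innovation = c(11, 12, 12.5, 13, 13.5, 14, 15),
  theft      = c(300, 280, 250, 260, 200, 180, 150),
  fdi        = c(12.2, 12.5, 12.6, 12.9, 13.0, 13.1, 13.4)
)

# Correlations of every predictor with the response
cor_matrix <- cor(toy)
cor_with_fdi <- cor_matrix[, "fdi"]
cor_with_fdi <- cor_with_fdi[names(cor_with_fdi) != "fdi"]

# Rank predictors by the strength of the correlation, regardless of sign
ranked <- cor_with_fdi[order(abs(cor_with_fdi), decreasing = TRUE)]
print(round(ranked, 2))
```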
# Histograms of the dependent variable and of candidate predictors (raw and log-transformed)
hist(dataset$IED_Flujos_MXN)
hist(dataset$Exportaciones_MXN)
hist(log(dataset$Exportaciones_MXN))
hist(dataset$Tipo_de_Cambio)
hist(log(dataset$Tipo_de_Cambio))
hist(dataset$Educación)
hist(log(dataset$Educación))
hist(dataset$Salario_Diario)
hist(log(dataset$Salario_Diario))
hist(dataset$Innovación)
hist(log(dataset$Innovación))
hist(dataset$Inseguridad_Robo)
hist(log(dataset$Inseguridad_Robo))
#### 2. Which is the estimation method to be used to estimate the linear regression model?

Ordinary Least Squares (OLS) estimation.
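For reference, OLS chooses the coefficients that minimize the sum of squared residuals, which has the closed-form solution `(X'X)^(-1) X'y`. The sketch below, on simulated data with hypothetical names, checks that this formula reproduces what `lm()` returns.

```r
# Simulated data (hypothetical): two predictors and a response
set.seed(1)
n <- 30
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 0.5 * x1 - 0.3 * x2 + rnorm(n, sd = 0.2)

# Design matrix with an intercept column
X <- cbind(Intercept = 1, x1 = x1, x2 = x2)

# Closed-form OLS estimate: solve (X'X) b = X'y
beta_manual <- solve(crossprod(X), crossprod(X, y))

# Same model estimated with lm()
beta_lm <- coef(lm(y ~ x1 + x2))

# Largest absolute difference between the two sets of estimates
max_diff <- max(abs(drop(beta_manual) - unname(beta_lm)))
print(max_diff)
```

`lm()` uses a QR decomposition rather than this formula directly, so the two results agree only up to floating-point error.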
Hypothesis 1

H0: There is no significant association between innovation (Innovación) and the foreign investment flow (IED_Flujos_MXN).

H1: There is a significant association between innovation (Innovación) and the foreign investment flow (IED_Flujos_MXN).

Hypothesis 2

H0: The daily minimum wage in pesos (Salario_Diario) has a positive impact on the foreign investment flow (IED_Flujos).

H1: The daily minimum wage in pesos (Salario_Diario) does not have a positive impact on the foreign investment flow (IED_Flujos).

Hypothesis 3

H0: Growth in the average years of education (Educación) does not have a negative effect on the foreign investment flow (IED_Flujos_MXN).

H1: Growth in the average years of education (Educación) has a negative effect on the foreign investment flow (IED_Flujos_MXN).
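Hypotheses 2 and 3 are directional, while the p-values printed by `summary()` for an `lm` object are two-sided. A minimal sketch of how one-sided p-values could be derived from the t statistic is shown below. The data are simulated and the variable names (`wage`, `fit`) are hypothetical.

```r
# Simulated data (hypothetical): log investment as a function of wage
set.seed(3)
n <- 26
wage <- runif(n, 20, 170)
y <- 12 + 0.004 * wage + rnorm(n, sd = 0.2)
fit <- lm(y ~ wage)

# Two-sided test reported by summary()
coef_table <- summary(fit)$coefficients
t_value <- coef_table["wage", "t value"]
p_two_sided <- coef_table["wage", "Pr(>|t|)"]

# One-sided p-values from the t distribution with the residual degrees of freedom
df_resid <- df.residual(fit)
p_greater <- pt(t_value, df = df_resid, lower.tail = FALSE)  # alternative: coefficient > 0
p_less <- pt(t_value, df = df_resid, lower.tail = TRUE)      # alternative: coefficient < 0

print(c(two_sided = p_two_sided, greater = p_greater, less = p_less))
```

The two-sided p-value is twice the smaller of the two one-sided values, so a directional claim should be judged against the one-sided value that matches its sign.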
### Estimate 3 different linear regression models.
# Model 1
model1<-lm(log(IED_Flujos_MXN) ~ log(Educación)+Salario_Diario+log(Tipo_de_Cambio)+Innovación+Inseguridad_Robo,data=dataset)
summary(model1)
##
## Call:
## lm(formula = log(IED_Flujos_MXN) ~ log(Educación) + Salario_Diario +
## log(Tipo_de_Cambio) + Innovación + Inseguridad_Robo, data = dataset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.34682 -0.05638 0.01831 0.08234 0.37679
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.4477467 1.6353827 5.166 4.7e-05 ***
## log(Educación) 1.3618792 1.1281557 1.207 0.2414
## Salario_Diario 0.0010122 0.0027752 0.365 0.7191
## log(Tipo_de_Cambio) 0.1697680 0.4604660 0.369 0.7162
## Innovación 0.1024386 0.0470436 2.178 0.0416 *
## Inseguridad_Robo -0.0007032 0.0011682 -0.602 0.5540
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1994 on 20 degrees of freedom
## Multiple R-squared: 0.6843, Adjusted R-squared: 0.6053
## F-statistic: 8.669 on 5 and 20 DF, p-value: 0.0001682
# DIAGNOSIS TESTS 1
vif(model1)
## log(Educación) Salario_Diario log(Tipo_de_Cambio) Innovación
## 5.259843 6.223420 11.462719 1.600779
## Inseguridad_Robo
## 1.949690
bptest(model1)
##
## studentized Breusch-Pagan test
##
## data: model1
## BP = 5.6171, df = 5, p-value = 0.3453
histogram(model1$residuals)
AIC(model1)
## [1] -2.883926
# RMSE on the log scale: the fitted values are in logs, so the observed values are logged as well
cat("RMSE:",RMSE(model1$fitted.values,log(dataset$IED_Flujos_MXN)))
Model 1:
• Innovación is the only statistically significant predictor, at the 95% confidence level (p = 0.0416).
• The adjusted R^2 is 0.6053, which means the model explains about 60% of the variation in log(IED_Flujos_MXN).
• According to the VIF test, log(Tipo_de_Cambio) presents multicollinearity (VIF > 10).
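The VIF of a predictor is `1 / (1 - R^2)`, where `R^2` comes from regressing that predictor on all the other predictors. The following sketch computes it by hand on simulated data. The predictors `x1`, `x2`, `x3` are hypothetical, and `x2` is built to be strongly related to `x1`.

```r
# Simulated predictors (hypothetical): x2 is nearly a copy of x1
set.seed(10)
n <- 50
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.3)
x3 <- rnorm(n)
predictors <- data.frame(x1, x2, x3)

# VIF for each predictor from its auxiliary regression on the others
manual_vif <- sapply(names(predictors), function(v) {
  others <- setdiff(names(predictors), v)
  aux <- lm(reformulate(others, response = v), data = predictors)
  1 / (1 - summary(aux)$r.squared)
})
print(round(manual_vif, 2))
```

Predictors that can be almost reproduced from the others get a large VIF, while an unrelated predictor stays close to 1. The threshold of 10 used in this report is a common rule of thumb rather than a formal test.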
# Model 2
model2<-lm(log(IED_Flujos_MXN) ~ I(Educación^2) + Salario_Diario + log(Innovación) +
log(Inseguridad_Robo) + I(Tipo_de_Cambio^2) + Crisis_Financiera, data = dataset)
summary(model2)
##
## Call:
## lm(formula = log(IED_Flujos_MXN) ~ I(Educación^2) + Salario_Diario +
## log(Innovación) + log(Inseguridad_Robo) + I(Tipo_de_Cambio^2) +
## Crisis_Financiera, data = dataset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.34099 -0.05821 0.00056 0.06975 0.39241
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.5452750 2.3803948 4.010 0.000749 ***
## I(Educación^2) 0.0092675 0.0059392 1.560 0.135165
## Salario_Diario -0.0001260 0.0027265 -0.046 0.963632
## log(Innovación) 1.4263237 0.6175989 2.309 0.032319 *
## log(Inseguridad_Robo) -0.1800064 0.2342055 -0.769 0.451589
## I(Tipo_de_Cambio^2) 0.0006970 0.0008515 0.819 0.423169
## Crisis_Financiera -0.1571136 0.1490812 -1.054 0.305156
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1966 on 19 degrees of freedom
## Multiple R-squared: 0.7085, Adjusted R-squared: 0.6164
## F-statistic: 7.696 on 6 and 19 DF, p-value: 0.0002691
#DIAGNOSIS TESTS 2
vif(model2)
## I(Educación^2) Salario_Diario log(Innovación)
## 2.942890 6.180613 1.656510
## log(Inseguridad_Robo) I(Tipo_de_Cambio^2) Crisis_Financiera
## 2.111939 7.207039 1.061785
bptest(model2)
##
## studentized Breusch-Pagan test
##
## data: model2
## BP = 5.2933, df = 6, p-value = 0.5068
histogram(model2$residuals)
AIC(model2)
## [1] -2.958056
# RMSE on the log scale: the fitted values are in logs, so the observed values are logged as well
cat("RMSE:",RMSE(model2$fitted.values,log(dataset$IED_Flujos_MXN)))
Model 2:
• log(Innovación) is the only statistically significant predictor, at the 95% confidence level (p = 0.0323).
• The adjusted R^2 is 0.6164, which means the model explains about 61.64% of the variation in log(IED_Flujos_MXN).
• According to the VIF test, this model does not present multicollinearity (all VIF < 10).
• “Salario_Diario”, “Inseguridad_Robo”, and “Crisis_Financiera” have negative coefficient estimates, although none of them is statistically significant.
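The studentized Breusch-Pagan statistic reported for each model can be understood as `n * R^2` from an auxiliary regression of the squared residuals on the predictors, compared against a chi-squared distribution. The sketch below reproduces that logic in base R on simulated data. The data and names are hypothetical, and the error spread is made to grow with `x` on purpose.

```r
# Simulated data (hypothetical) with heteroscedastic errors by construction
set.seed(5)
n <- 60
x <- runif(n, 1, 10)
y <- 2 + 0.5 * x + rnorm(n, sd = 0.3 * x)
fit <- lm(y ~ x)

# Auxiliary regression: squared residuals on the predictors
aux <- lm(resid(fit)^2 ~ x)

# Test statistic, degrees of freedom, and p-value
bp_stat <- n * summary(aux)$r.squared
bp_df <- length(coef(aux)) - 1
bp_p <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)
print(c(BP = bp_stat, df = bp_df, p_value = bp_p))
```

A large p-value, as in the three models of this report, means the null hypothesis of constant error variance is not rejected.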
# Model 3
model3<-lm(log(IED_Flujos_MXN) ~ log(Educación^2) + I(Tipo_de_Cambio^2) + Salario_Diario + log(Innovación) + Inseguridad_Robo + Crisis_Financiera, data = dataset)
summary(model3)
##
## Call:
## lm(formula = log(IED_Flujos_MXN) ~ log(Educación^2) + I(Tipo_de_Cambio^2) +
## Salario_Diario + log(Innovación) + Inseguridad_Robo + Crisis_Financiera,
## data = dataset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.34802 -0.06981 0.00206 0.07454 0.39740
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.1827994 2.0827260 3.449 0.00269 **
## log(Educación^2) 0.5891649 0.4157956 1.417 0.17268
## I(Tipo_de_Cambio^2) 0.0008428 0.0008241 1.023 0.31929
## Salario_Diario -0.0005940 0.0025985 -0.229 0.82163
## log(Innovación) 1.3488115 0.6049355 2.230 0.03803 *
## Inseguridad_Robo -0.0011749 0.0011821 -0.994 0.33275
## Crisis_Financiera -0.1706405 0.1470412 -1.160 0.26022
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1936 on 19 degrees of freedom
## Multiple R-squared: 0.7174, Adjusted R-squared: 0.6281
## F-statistic: 8.037 on 6 and 19 DF, p-value: 0.0002049
plot(effect("I(Tipo_de_Cambio^2)", model3))
plot(effect("log(Educación^2)", model3))
# Diagnosis Tests 3
# The VIF test shows no multicollinearity among the variables (all VIF values are below 10)
vif(model3)
## log(Educación^2) I(Tipo_de_Cambio^2) Salario_Diario log(Innovación)
## 3.032954 6.962937 5.790208 1.639236
## Inseguridad_Robo Crisis_Financiera
## 2.118541 1.065396
bptest(model3)
##
## studentized Breusch-Pagan test
##
## data: model3
## BP = 5.6662, df = 6, p-value = 0.4616
histogram(model3$residuals)
AIC(model3)
## [1] -3.7628
# RMSE on the log scale: the fitted values are in logs, so the observed values are logged as well
cat("RMSE:",RMSE(model3$fitted.values,log(dataset$IED_Flujos_MXN)))
Model 3:
• log(Innovación) is the only statistically significant predictor, at the 95% confidence level (p = 0.0380).
• The adjusted R^2 is 0.6281, which means the model explains about 62.81% of the variation in log(IED_Flujos_MXN).
• According to the VIF test, this model does not present multicollinearity (all VIF < 10).
• “Salario_Diario”, “Inseguridad_Robo”, and “Crisis_Financiera” have negative coefficient estimates, although none of them is statistically significant.
• This model has the lowest AIC value (-3.76), so it is the preferred model of the three.
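When the response is modeled in logs, the RMSE is only meaningful if fitted and observed values are on the same scale: either both in logs, or both in levels after applying `exp()` to the fitted values. The sketch below uses made-up numbers and a hypothetical helper `rmse()` to show the two consistent options and what happens when the scales are mixed.

```r
# Hypothetical helper: root mean squared error
rmse <- function(pred, obs) sqrt(mean((pred - obs)^2))

# Made-up observed flows (levels) and fitted values from a log-response model (logs)
obs_level <- c(300000, 450000, 520000, 610000)
fitted_log <- log(c(320000, 430000, 500000, 640000))

rmse_log <- rmse(fitted_log, log(obs_level))     # both in logs
rmse_level <- rmse(exp(fitted_log), obs_level)   # both in levels
rmse_mixed <- rmse(fitted_log, obs_level)        # mixed scales: not interpretable

print(c(log_scale = rmse_log, level_scale = rmse_level, mixed = rmse_mixed))
```

With mixed scales the result is roughly the size of the observed values themselves and barely depends on the model, which is why such a figure comes out the same for every specification.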
plot_summs(model1,model2,model3,scale=TRUE)
## Registered S3 methods overwritten by 'broom':
## method from
## tidy.glht jtools
## tidy.summary.glht jtools
## Loading required namespace: broom.mixed
## Loading required namespace: broom.mixed
## Loading required namespace: broom.mixed
# SELECTING MODEL
# Model 3 has the lowest AIC and, unlike model 1, none of its variables has a VIF above 10, so the selected model is MODEL 3
AIC(model1,model2,model3)
## df AIC
## model1 7 -2.883926
## model2 8 -2.958056
## model3 8 -3.762800
The chosen model is model 3. It has the lowest AIC value, which means it offers the best trade-off between goodness of fit and model complexity among the three models.
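For a linear model, AIC equals `-2 * logLik + 2 * k`, where `k` counts the regression coefficients plus the residual variance (this is the `df` column in the table above). The sketch below verifies the formula against `AIC()` on simulated data with hypothetical names.

```r
# Simulated data (hypothetical)
set.seed(2)
n <- 26
x <- rnorm(n)
y <- 1 + 0.8 * x + rnorm(n)
fit <- lm(y ~ x)

# Number of estimated parameters: coefficients plus the residual variance
k <- length(coef(fit)) + 1

# AIC computed by hand and with the built-in function
aic_manual <- -2 * as.numeric(logLik(fit)) + 2 * k
aic_builtin <- AIC(fit)
print(c(manual = aic_manual, builtin = aic_builtin))
```

Because each extra parameter adds 2 to the AIC, a model with more terms only gets a lower AIC if its fit improves enough to offset that penalty.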
# Presenting OLS Regression Results of our selected model
stargazer(model3,type="text",title="Regression Results",single.row=TRUE,ci=FALSE,ci.level=0.9) ### present OLS Regression results in text format
##
## Regression Results
## ===============================================
## Dependent variable:
## ---------------------------
## log(IED_Flujos_MXN)
## -----------------------------------------------
## log(Educación2) 0.589 (0.416)
## I(Tipo_de_Cambio2) 0.001 (0.001)
## Salario_Diario -0.001 (0.003)
## log(Innovación) 1.349** (0.605)
## Inseguridad_Robo -0.001 (0.001)
## Crisis_Financiera -0.171 (0.147)
## Constant 7.183*** (2.083)
## -----------------------------------------------
## Observations 26
## R2 0.717
## Adjusted R2 0.628
## Residual Std. Error 0.194 (df = 19)
## F Statistic 8.037*** (df = 6; 19)
## ===============================================
## Note: *p<0.1; **p<0.05; ***p<0.01
### Detecting Heteroscedasticity
### Display regression residuals vs fitted values to detect heteroscedasticity
residual <- resid(model3)
valAdjusted <- fitted(model3)
plot(valAdjusted, residual,
xlab = "Fitted Values",
ylab = "Residuals",
main = "Residuals vs. Fitted Plot")
abline(h = 0, col = "blue", lty = 2)
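The residuals vs. fitted plot is a visual check. As a complement, a formal test can be sketched in base R. The example below is a minimal sketch of the studentized Breusch-Pagan (Koenker) statistic, computed as n * R^2 from an auxiliary regression of the squared residuals on the regressors. It runs on simulated data with hypothetical names (x, y) and is not a result from this study.

```r
# Simulated data with error variance that grows with x (hypothetical example)
set.seed(3)
n <- 200
x <- runif(n, 1, 10)
y <- 2 + 0.5 * x + rnorm(n, sd = 0.3 * x)

fit <- lm(y ~ x)

# Auxiliary regression: squared residuals on the same regressors
aux <- lm(resid(fit)^2 ~ x)

# LM statistic = n * R^2 of the auxiliary regression, chi-squared with
# degrees of freedom equal to the number of regressors (here 1)
lm_stat <- n * summary(aux)$r.squared
p_value <- pchisq(lm_stat, df = 1, lower.tail = FALSE)

print(c(LM = lm_stat, p_value = p_value))
```

A small p-value suggests heteroscedasticity; in that case robust standard errors or a transformation of the dependent variable would be worth considering.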
### Split the Data in Training Data vs Test Data
# Let's randomly split the data into train and test sets
set.seed(123) ### sets the random seed for reproducibility of results
training.samples<-dataset$IED_Flujos_MXN %>%
createDataPartition(p=0.75,list=FALSE) ### Let's use 75% of the data to build the predictive model
train.data<-dataset[training.samples, ] ### training data to fit the linear regression model
test.data<-dataset[-training.samples, ] ### testing data to test the linear regression model
# LASSO regression via glmnet package can only take numerical observations. Then, the dataset is transformed to model.matrix() format.
# Independent variables
x<-model.matrix(log(IED_Flujos_MXN) ~ log(Educación^2) + I(Tipo_de_Cambio^2) + Salario_Diario + log(Innovación) + Inseguridad_Robo + Crisis_Financiera,train.data)[,-1] ### OLS model specification
y<-train.data$IED_Flujos_MXN ### dependent variable in levels (not log-transformed as in model3), so the coefficients and RMSE below are in the original units of IED_Flujos_MXN
# Cross-validation ensures that every observation in the training data has a chance of appearing in both the fitting and the validation folds.
# Find the best lambda using cross-validation.
set.seed(123)
cv.lasso<-cv.glmnet(x,y,alpha=1) # alpha = 1 for LASSO
## Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per
## fold
# Display the best lambda value
cv.lasso$lambda.min
## [1] 12963.74
# Fit the final model on the training data
lassomodel<-glmnet(x,y,alpha=1,lambda=cv.lasso$lambda.min)
# Display regression coefficients
coef(lassomodel)
## 7 x 1 sparse Matrix of class "dgCMatrix"
## s0
## (Intercept) -1642194.3084
## log(Educación^2) 72608.5823
## I(Tipo_de_Cambio^2) 615.3063
## Salario_Diario .
## log(Innovación) 655909.9264
## Inseguridad_Robo .
## Crisis_Financiera .
# Make predictions on the test data
x.test<-model.matrix(log(IED_Flujos_MXN) ~ log(Educación^2) + I(Tipo_de_Cambio^2) + Salario_Diario + log(Innovación) + Inseguridad_Robo + Crisis_Financiera,test.data)[,-1] ### OLS model specification
lassopredictions <- lassomodel %>% predict(x.test) %>% as.vector()
# Model Accuracy
data.frame(
RMSE = RMSE(lassopredictions, test.data$IED_Flujos_MXN),
Rsquare = R2(lassopredictions, test.data$IED_Flujos_MXN))
## RMSE Rsquare
## 1 148288.6 0.4261465
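The two accuracy metrics reported above can be written out in base R. The sketch below assumes that caret's R2() uses its default definition (the squared correlation between predictions and observations) and contrasts it with the 1 - SSE/SST definition. The obs and pred vectors are made-up numbers for illustration only.

```r
# Made-up observed and predicted values (illustrative only)
obs  <- c(10, 12, 15, 11, 18)
pred <- c(11, 11, 14, 13, 17)

# Root mean squared error: typical size of the prediction error, in the units of obs
rmse_by_hand <- sqrt(mean((pred - obs)^2))

# Squared correlation: assumed here to match caret::R2() with its default settings
r2_corr <- cor(pred, obs)^2

# Alternative definition: 1 - SSE / SST, which also penalizes biased predictions
r2_traditional <- 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)

print(c(RMSE = rmse_by_hand, R2_corr = r2_corr, R2_traditional = r2_traditional))
```

Because the squared correlation ignores scale and offset, it can look acceptable even when the RMSE is large, so the two metrics are best read together.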
### visualizing lasso regression results
lbs_fun <- function(fit, offset_x=1, ...) {
L <- length(fit$lambda)
x <- log(fit$lambda[L])+ offset_x
y <- fit$beta[, L]
labs <- names(y)
text(x, y, labels=labs, ...)
}
lasso<-glmnet(scale(x),y,alpha=1)
plot(lasso,xvar="lambda",label=T)
lbs_fun(lasso)
abline(v=cv.lasso$lambda.min,col="red",lty=2)
abline(v=cv.lasso$lambda.1se,col="blue",lty=2)
# Find the best lambda using cross-validation
set.seed(123) # x: independent variables | y: dependent variable
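cv.ridge <- cv.glmnet(x,y,alpha=0.1) # NOTE: alpha = 0.1 is an elastic-net mix; pure RIDGE uses alpha = 0 (as in the final fit below), so the lambda reported here was selected under alpha = 0.1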
## Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per
## fold
# Display the best lambda value
cv.ridge$lambda.min # lambda.min: the value of lambda (amount of shrinkage) that minimizes the cross-validated error; the higher the value of lambda, the more penalization there is
## [1] 2162.483
# Fit the final model on the training data
ridgemodel<-glmnet(x,y,alpha=0,lambda=cv.ridge$lambda.min)
# Display regression coefficients
coef(ridgemodel)
## 7 x 1 sparse Matrix of class "dgCMatrix"
## s0
## (Intercept) -1763479.0597
## log(Educación^2) 62817.7395
## I(Tipo_de_Cambio^2) 804.8380
## Salario_Diario -587.0118
## log(Innovación) 739704.6514
## Inseguridad_Robo -274.9666
## Crisis_Financiera -51648.6369
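To illustrate how the penalty shrinks the coefficients, the sketch below computes ridge estimates from the closed-form solution (X'X + lambda * I)^(-1) X'y on simulated, standardized data. The names X, y_c, and ridge_coef are hypothetical. Note that glmnet scales its objective differently, so the lambda values in this sketch are not comparable to the cv.glmnet output above.

```r
# Simulated, standardized predictors (hypothetical example)
set.seed(4)
n <- 60
X <- scale(matrix(rnorm(n * 3), n, 3))
colnames(X) <- c("x1", "x2", "x3")
y <- as.numeric(X %*% c(2, -1, 0) + rnorm(n))
y_c <- y - mean(y)   # centering y removes the need for an intercept

# Closed-form ridge estimator: solve (X'X + lambda * I) beta = X'y
ridge_coef <- function(X, y, lambda) {
  p <- ncol(X)
  drop(solve(crossprod(X) + lambda * diag(p), crossprod(X, y)))
}

lambdas <- c(0, 10, 100)
coefs <- sapply(lambdas, function(l) ridge_coef(X, y_c, l))
colnames(coefs) <- paste0("lambda=", lambdas)

# Euclidean norm of the coefficient vector for each lambda
norms <- sqrt(colSums(coefs^2))

print(round(coefs, 3))
print(round(norms, 3))
```

With lambda = 0 the estimates coincide with OLS; as lambda grows, all coefficients are pulled toward zero but, unlike LASSO, none is set exactly to zero.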
# Make predictions on the test data
x.test<-model.matrix(log(IED_Flujos_MXN) ~ log(Educación^2) + I(Tipo_de_Cambio^2) + Salario_Diario + log(Innovación) + Inseguridad_Robo + Crisis_Financiera,test.data)[,-1]
ridgepredictions<-ridgemodel %>% predict(x.test) %>% as.vector()
# Model Accuracy
data.frame(
RMSE = RMSE(ridgepredictions, test.data$IED_Flujos_MXN),
Rsquare = R2(ridgepredictions, test.data$IED_Flujos_MXN)
)
## RMSE Rsquare
## 1 154101.5 0.4280225
### visualizing ridge regression results
ridge<-glmnet(scale(x),y,alpha=0)
plot(ridge, xvar = "lambda", label=T)
lbs_fun(ridge)
abline(v=cv.ridge$lambda.min, col = "red", lty=2)
abline(v=cv.ridge$lambda.1se, col="blue", lty=2)
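# Summary table of RMSE and R2 by model. These values are hard-coded rather than computed from the objects above, and they differ from the Lasso and Ridge test metrics printed earlier (RMSE 148288.6 and 154101.5; R2 0.426 and 0.428).
tab <- matrix(c(17587,6324,6322,0.77,0.71,0.71), ncol=2, byrow=FALSE)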
colnames(tab) <- c('RMSE','R2')
rownames(tab) <- c('Linear Regression','Lasso','Ridge')
tab <- as.table(tab)
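When a table like this compares a log-scale model with models fitted in levels, the error metrics need to be on the same scale. The following sketch shows one way to do that: predictions from a log-log linear model are back-transformed with exp() before computing RMSE. It uses simulated data with hypothetical names (train, test, fit_log), and the naive back-transformation ignores retransformation bias.

```r
# Simulated data following a log-log relationship (hypothetical example)
set.seed(5)
n <- 80
x <- runif(n, 1, 5)
y <- exp(1 + 0.8 * log(x) + rnorm(n, sd = 0.2))

train <- data.frame(x = x[1:60],  y = y[1:60])
test  <- data.frame(x = x[61:80], y = y[61:80])

fit_log <- lm(log(y) ~ log(x), data = train)

pred_log   <- predict(fit_log, newdata = test)  # predictions on the log scale
pred_level <- exp(pred_log)                     # naive back-transform to the original units

rmse <- function(pred, obs) sqrt(mean((pred - obs)^2))

comparison <- data.frame(
  scale = c("log", "level"),
  RMSE  = c(rmse(pred_log, log(test$y)), rmse(pred_level, test$y))
)
print(comparison)
```

Only the "level" row would be comparable with an RMSE computed from a model whose dependent variable is in its original units.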
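The selected model is model3, since it is the one that best explains the relationship between the independent variables and the dependent variable. Because the dependent variable and several predictors are log-transformed, the corresponding coefficients are read as elasticities.
• Education enters as log(Educación^2) with a coefficient of 0.589. Since log(Educación^2) = 2 * log(Educación), a 1% increase in Educación is associated with an increase of roughly 1.18% in IED_Flujos_MXN, although this estimate is not statistically significant.
• A 1% increase in Innovation is associated with an increase of about 1.35% in IED_Flujos_MXN (coefficient 1.349, significant at the 5% level).
• Minimum Wage, Insecurity, and Financial Crisis have negative coefficients, but none of them is statistically significant.
This model explains 62.81% (adjusted R^2) of the variation in the log of the Foreign Investment Flow.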
• To choose the regression model that best fits the data, we compare two options. Option 1 is Ridge regression. The best lambda value (the amount of shrinkage) is determined using cross-validation and equals 2162.483. The final model is fitted with this lambda value, and its accuracy on the test data is evaluated with the RMSE (root mean square error) and R-squared: the RMSE is 154101.5 and the R-squared is 0.4280225. Option 2 is the polynomial regression estimated as model 3. Its formula includes the independent variables Education, Exchange Rate, Daily Wage, Innovation, Theft Insecurity, and Financial Crisis, and its precision is assessed using the residual standard error (0.1936), the multiple R-squared (0.7174), the adjusted R-squared (0.6281), and the F-statistic (8.037). Note that the Ridge RMSE is measured in the original units of IED_Flujos_MXN, whereas the residual standard error of model 3 is on the log scale, so the two error measures are not directly comparable.
• If Mexico wants to keep attracting new investors, it should focus on Innovation: in this model it is the only statistically significant predictor, and it has a positive effect on Nearshoring investment.
• Insecurity due to theft has a negative estimated effect on the attraction of new foreign investors (although it is not statistically significant), which makes it an area of opportunity for Mexico.
• Average Education has a positive estimated effect on the attraction of new foreign investors: the higher the average education level in Mexico, the more foreign businesses will consider relocating their production centers to the country.
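The elasticity reading of log-log coefficients used in these conclusions can be checked with a small simulation. The sketch below uses hypothetical variables (innovation, fdi) generated with a known elasticity of 1.3; it also shows why a coefficient on log(x^2) is half the elasticity with respect to x, which is the case for the Education term.

```r
# Simulated data with a known elasticity of 1.3 (hypothetical example)
set.seed(6)
n <- 200
innovation <- runif(n, 5, 20)
fdi <- exp(2 + 1.3 * log(innovation) + rnorm(n, sd = 0.1))

# Log-log model: the slope is the elasticity of fdi with respect to innovation
fit  <- lm(log(fdi) ~ log(innovation))
beta <- unname(coef(fit)[2])

# Exact percentage change in fdi for a 1% increase in innovation
pct_change <- (1.01^beta - 1) * 100

# Same model written with log(x^2) = 2 * log(x): the slope is halved
fit_sq  <- lm(log(fdi) ~ log(innovation^2))
beta_sq <- unname(coef(fit_sq)[2])

print(c(elasticity = beta, pct_change_for_1pct = pct_change, slope_on_log_x2 = beta_sq))
```

For small changes the percentage effect is approximately equal to the coefficient itself, which is why a coefficient of 1.349 is read as "about 1.35% per 1%".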
####Show the predicted values of the dependent variable
effect_plot(model3, pred=Innovación, data=dataset, interval=TRUE) ### the data is passed explicitly through the "data =" argument
effect_plot(model3, pred=Educación, data=dataset, interval=TRUE)
effect_plot(model3, pred=Salario_Diario, data=dataset, interval=TRUE)
effect_plot(model3, pred=Inseguridad_Robo, data=dataset, interval=TRUE)
• BBC News Mundo. (2017, January 10). Cómo se beneficia Estados Unidos fabricando productos en México. BBC News Mundo. https://www.bbc.com/mundo/noticias-38536136
• ¿Qué es el análisis predictivo? - Explicación del análisis predictivo - AWS. (n.d.). Amazon Web Services, Inc. https://aws.amazon.com/es/what-is/predictive-analytics/
• Achilles. (2021, September 15). Deslocalización de la cadena de suministro: enfoque en los datos. Achilles. https://www.achilles.com/es/industry-insights/te-estas-planteando-optar-por-la-deslocalizacion-cercana-un-enfoque-basado-en-los-datos/#:~:text=Como%20parte%20de%20estos%20cambios,o%20de%20un%20pa%C3%ADs%20cercano.
• Saucedo, D. (2023). Mexico and its Attractiveness for Nearshoring. Retrieved from: CIC: Centro Internacional de Casos, Instituto Tecnológico de Estudios Superiores Monterrey.