Background
What is nearshoring? “Nearshoring” refers to the practice of transferring a business process to companies in a nearby country, usually with the aim of reducing production or operating costs and taking advantage of geographical and temporal advantages. It is similar to “offshoring”, which involves moving businesses or production processes to other countries, but the key difference is in geographical proximity.
Under this strategy, a company moves part of its production closer to its final market. For example, after the economic disruption caused by COVID-19, many companies are looking for shorter and more resilient production chains that can stay in operation through shocks.
Nearshoring can reduce costs: one of the main reasons companies consider it is the potential to access skilled labor at a lower cost, since wages in nearshore locations are often significantly lower than in the home country. It can also make travel easier: when employees need to move between offices, geographic proximity makes trips less expensive. Finally, companies that move to a nearby country typically take on lower regulatory and political risk than those that move to more distant locations.
(Vargas C 2023)
Why might Mexico be attractive for nearshoring? Mexico has become an attractive nearshoring destination for many companies, especially those based in nearby countries such as the United States and Canada.
One of the most important advantages that Mexico offers as a nearshoring destination is its geographical proximity to the United States and Canada. There is also some cultural proximity and, above all, a highly skilled workforce at competitive costs which, although not as low as those in Asian regions, come with other kinds of advantages. (Samuel Garcia Mexico’s Industry Supply Chain 2023)
In recent years, Mexico has faced challenges and limitations in transport and communications infrastructure due to low investment in these sectors. However, Mexico has a strong and growing business ecosystem, with domestic and foreign companies operating in various sectors, which can facilitate collaboration and the exchange of knowledge between companies. (EGADE 2023)
What is predictive analytics? According to IBM (2023), predictive analytics is a branch of advanced analytics that makes predictions about future outcomes using historical data combined with statistical modeling, data mining techniques and machine learning.
Predictive analytics relies on data, statistical algorithms, and machine learning techniques. Its main purpose is to learn from what has already happened in order to provide an accurate estimate of what will happen in the future. Essentially, companies use predictive analytics to find patterns in their data and to identify risks and new opportunities.
(IBM 2023)
What is the use of regression analysis in predictive analytics? Regression analysis is a foundational statistical tool used in predictive analytics to understand relationships between variables and predict future outcomes. At its core, regression analysis estimates the relationship between a dependent variable and one or more independent variables.
Predictive analytics draws on machine learning and big data, and regression modeling is one of its core tools. Regression analysis examines a dependent variable (the outcome) and one or more independent variables (the actions or factors). It evaluates whether a relationship exists between the variables and how strong that relationship is. (GutCheck 2017)
How can regression analysis help us predict the occurrence of nearshoring in the Mexican case?
Regression analysis can be used to predict the occurrence of nearshoring to Mexico by analyzing the factors that may influence a company's decision to relocate. Historical data on companies that have already relocated to Mexico could include the cost savings they achieved, type of industry, company size, previous nearshore destinations, global economic indicators, trade policies and more. With this information we can identify which variables (factors) are statistically most relevant to the decision to move to Mexico.
Regression analysis shows how different variables influence the nearshoring decision. Once a model has been selected, the probability that a company nearshores to Mexico can be predicted from the values of the independent variables, and the model helps quantify how much each factor contributes to the decision. (T. Osvarauld 2023)
Problem Situation According to the document “Mexico and Its Attractiveness for Nearshoring”, what is the problem situation? How should the problem situation be addressed?
The purpose of this evidence is to help Maria, an analyst at a Mexican company, determine whether Mexico is attractive to foreign companies that want to nearshore there. She has compiled data from INEGI, the Bank of Mexico and the Ministry of Economy on variables such as GDP per capita, daily wage, exports in millions of dollars, exchange rate, road density, etc.
Specifically, she wants to know which econometric model she should use to predict the consequences of nearshoring in Mexico, why the country may be attractive for nearshoring, and what opportunities Mexico has for attracting relocated businesses.
With this work we want to identify the factors that attract nearshoring to Mexico and those that deter foreign investors.
Data and Methodology. Exploratory Data Analysis (EDA)
# Import the database (the path below is specific to the author's machine)
library(foreign)
bd <- read.csv("C:\\Users\\85171075\\Desktop\\Mariana\\TEC\\Econometrics\\sp_data.csv")
summary(bd)
## periodo IED_Flujos IED_M Exportaciones
## Min. :1997 Min. : 8374 Min. :210876 Min. : 9088
## 1st Qu.:2003 1st Qu.:21367 1st Qu.:368560 1st Qu.:13260
## Median :2010 Median :27698 Median :497054 Median :21188
## Mean :2010 Mean :26770 Mean :493596 Mean :23601
## 3rd Qu.:2016 3rd Qu.:32183 3rd Qu.:578606 3rd Qu.:31601
## Max. :2022 Max. :48354 Max. :754438 Max. :46478
##
## Exportaciones_m Empleo Educacion Salario_Diario
## Min. :205483 Min. :95.06 Min. :7.200 Min. : 24.30
## 1st Qu.:262337 1st Qu.:95.89 1st Qu.:7.865 1st Qu.: 41.97
## Median :366294 Median :96.53 Median :8.460 Median : 54.48
## Mean :433856 Mean :96.47 Mean :8.423 Mean : 65.16
## 3rd Qu.:632356 3rd Qu.:97.08 3rd Qu.:9.000 3rd Qu.: 72.31
## Max. :785655 Max. :97.83 Max. :9.580 Max. :172.87
## NA's :3 NA's :3
## Innovacion Inseguridad_Robo Inseguridad_Homicidio Tipo_de_Cambio
## Min. :11.28 Min. :120.5 Min. : 8.04 Min. : 8.06
## 1st Qu.:12.56 1st Qu.:148.3 1st Qu.:10.25 1st Qu.:10.75
## Median :13.09 Median :181.8 Median :16.93 Median :13.02
## Mean :13.11 Mean :185.4 Mean :17.29 Mean :13.91
## 3rd Qu.:13.75 3rd Qu.:209.9 3rd Qu.:22.43 3rd Qu.:18.49
## Max. :15.11 Max. :314.8 Max. :29.59 Max. :20.66
## NA's :2 NA's :1
## Densidad_Carretera Densidad_Poblacion CO2_Emisiones PIB_Per_Capita
## Min. :0.05000 Min. :47.44 Min. :3.590 Min. :126739
## 1st Qu.:0.06000 1st Qu.:52.77 1st Qu.:3.830 1st Qu.:130964
## Median :0.07000 Median :58.09 Median :3.930 Median :136845
## Mean :0.07115 Mean :57.33 Mean :3.945 Mean :138550
## 3rd Qu.:0.08000 3rd Qu.:61.39 3rd Qu.:4.105 3rd Qu.:146148
## Max. :0.09000 Max. :65.60 Max. :4.220 Max. :153236
## NA's :3
## INPC crisis_financiera
## Min. : 33.28 Min. :0.00000
## 1st Qu.: 56.15 1st Qu.:0.00000
## Median : 73.35 Median :0.00000
## Mean : 75.17 Mean :0.07692
## 3rd Qu.: 91.29 3rd Qu.:0.00000
## Max. :126.48 Max. :1.00000
##
# Load libraries
#library(psych)
library(readxl)
library(tidyverse)
library(ggplot2)
library(corrplot)
library(gmodels)
library(effects)
library(stargazer)
library(olsrr)
library(kableExtra)
library(jtools)
library(fastmap)
library(dlookr)
library(Hmisc)
library(naniar)
library(glmnet)
library(caret)
library(car)
library(lmtest)
library(dplyr)
#Identify missing values
missing_values<-colSums(is.na(bd))
missing_values
## periodo IED_Flujos IED_M
## 0 0 0
## Exportaciones Exportaciones_m Empleo
## 0 0 3
## Educacion Salario_Diario Innovacion
## 3 0 2
## Inseguridad_Robo Inseguridad_Homicidio Tipo_de_Cambio
## 0 1 0
## Densidad_Carretera Densidad_Poblacion CO2_Emisiones
## 0 0 3
## PIB_Per_Capita INPC crisis_financiera
## 0 0 0
#Display data set structure
str(bd)
## 'data.frame': 26 obs. of 18 variables:
## $ periodo : int 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 ...
## $ IED_Flujos : num 12146 8374 13960 18249 30057 ...
## $ IED_M : num 294151 210876 299734 362632 546548 ...
## $ Exportaciones : num 9088 9875 10990 12483 11300 ...
## $ Exportaciones_m : num 220091 248691 235961 248057 205483 ...
## $ Empleo : num NA NA NA 97.8 97.4 ...
## $ Educacion : num 7.2 7.31 7.43 7.56 7.68 7.8 7.93 8.04 8.14 8.26 ...
## $ Salario_Diario : num 24.3 31.9 31.9 35.1 37.6 ...
## $ Innovacion : num 11.3 11.4 12.5 13.2 13.5 ...
## $ Inseguridad_Robo : num 267 315 273 217 215 ...
## $ Inseguridad_Homicidio: num 14.6 14.3 12.6 10.9 10.2 ...
## $ Tipo_de_Cambio : num 8.06 9.94 9.52 9.6 9.17 ...
## $ Densidad_Carretera : num 0.05 0.05 0.06 0.06 0.06 0.06 0.06 0.06 0.06 0.06 ...
## $ Densidad_Poblacion : num 47.4 48.8 49.5 50.6 51.3 ...
## $ CO2_Emisiones : num 3.68 3.85 3.69 3.87 3.81 3.82 3.95 3.98 4.1 4.19 ...
## $ PIB_Per_Capita : num 127570 126739 129165 130875 128083 ...
## $ INPC : num 33.3 39.5 44.3 48.3 50.4 ...
## $ crisis_financiera : int 0 0 0 0 0 0 0 0 0 0 ...
# Include descriptive statistics (mean, median, standard deviation, minimum, maximum)
summary(bd)
## periodo IED_Flujos IED_M Exportaciones
## Min. :1997 Min. : 8374 Min. :210876 Min. : 9088
## 1st Qu.:2003 1st Qu.:21367 1st Qu.:368560 1st Qu.:13260
## Median :2010 Median :27698 Median :497054 Median :21188
## Mean :2010 Mean :26770 Mean :493596 Mean :23601
## 3rd Qu.:2016 3rd Qu.:32183 3rd Qu.:578606 3rd Qu.:31601
## Max. :2022 Max. :48354 Max. :754438 Max. :46478
##
## Exportaciones_m Empleo Educacion Salario_Diario
## Min. :205483 Min. :95.06 Min. :7.200 Min. : 24.30
## 1st Qu.:262337 1st Qu.:95.89 1st Qu.:7.865 1st Qu.: 41.97
## Median :366294 Median :96.53 Median :8.460 Median : 54.48
## Mean :433856 Mean :96.47 Mean :8.423 Mean : 65.16
## 3rd Qu.:632356 3rd Qu.:97.08 3rd Qu.:9.000 3rd Qu.: 72.31
## Max. :785655 Max. :97.83 Max. :9.580 Max. :172.87
## NA's :3 NA's :3
## Innovacion Inseguridad_Robo Inseguridad_Homicidio Tipo_de_Cambio
## Min. :11.28 Min. :120.5 Min. : 8.04 Min. : 8.06
## 1st Qu.:12.56 1st Qu.:148.3 1st Qu.:10.25 1st Qu.:10.75
## Median :13.09 Median :181.8 Median :16.93 Median :13.02
## Mean :13.11 Mean :185.4 Mean :17.29 Mean :13.91
## 3rd Qu.:13.75 3rd Qu.:209.9 3rd Qu.:22.43 3rd Qu.:18.49
## Max. :15.11 Max. :314.8 Max. :29.59 Max. :20.66
## NA's :2 NA's :1
## Densidad_Carretera Densidad_Poblacion CO2_Emisiones PIB_Per_Capita
## Min. :0.05000 Min. :47.44 Min. :3.590 Min. :126739
## 1st Qu.:0.06000 1st Qu.:52.77 1st Qu.:3.830 1st Qu.:130964
## Median :0.07000 Median :58.09 Median :3.930 Median :136845
## Mean :0.07115 Mean :57.33 Mean :3.945 Mean :138550
## 3rd Qu.:0.08000 3rd Qu.:61.39 3rd Qu.:4.105 3rd Qu.:146148
## Max. :0.09000 Max. :65.60 Max. :4.220 Max. :153236
## NA's :3
## INPC crisis_financiera
## Min. : 33.28 Min. :0.00000
## 1st Qu.: 56.15 1st Qu.:0.00000
## Median : 73.35 Median :0.00000
## Mean : 75.17 Mean :0.07692
## 3rd Qu.: 91.29 3rd Qu.:0.00000
## Max. :126.48 Max. :1.00000
##
# For our dependent variable "IED_M", the minimum value is 210,876 and the maximum value is 754,438. Because the data are not broken down by region, we cannot tell which areas of the country receive the largest or the smallest investment flows. Regional data would show which areas are most attractive for nearshoring. The same applies to daily wages: a company moving to another country has to consider local wages to judge whether the move is viable and whether the jobs will be attractive to the population near the site.
# The database covers 1997 to 2022. The summary function shows the minimum and maximum values of each variable (column), along with the quartiles and the mean.
# Transform variables if required
#bd$Salario_Diario = as.factor(bd$Salario_Diario)
# Which estimation method is used to estimate the linear regression model?
# In linear regression analysis, the most common estimation method is Ordinary Least Squares (OLS). OLS fits a linear model by minimizing the sum of the squared differences (the "errors" or residuals) between the observed values of the dependent variable and the values predicted by the model. In R, the OLS fit corresponds to the regression line that best fits the data in a scatter plot.
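The OLS criterion described above can be illustrated with a minimal sketch. It uses only the first ten rows (1997 to 2006) of Educacion and IED_M as printed in this report, and a single regressor chosen for illustration; it is not one of the models estimated later. The names `educacion`, `ied_m`, `sse`, and `beta_hat` are introduced here for the example.

```r
# First ten rows (1997-2006) of Educacion and IED_M, copied from the printed data
educacion <- c(7.20, 7.31, 7.43, 7.56, 7.68, 7.80, 7.93, 8.04, 8.14, 8.26)
ied_m <- c(294151.2, 210875.6, 299734.4, 362631.8, 546548.4,
           468332.0, 368752.8, 481349.2, 458544.8, 368495.8)

# Sum of squared errors for a candidate intercept/slope pair
sse <- function(beta) sum((ied_m - (beta[1] + beta[2] * educacion))^2)

# OLS via the normal equations: solve (X'X) beta = X'y
X <- cbind(1, educacion)
beta_hat <- as.numeric(solve(t(X) %*% X, t(X) %*% ied_m))

# lm() uses the same criterion, so the coefficients should agree
fit <- lm(ied_m ~ educacion)
print(cbind(normal_equations = beta_hat, lm = unname(coef(fit))))

# Any other coefficient pair gives a larger sum of squared errors
print(c(at_ols = sse(beta_hat), perturbed = sse(beta_hat + c(0, 1000))))
```

The comparison at the end is the point of the exercise: moving the slope away from the OLS estimate can only increase the sum of squared errors.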
# Replace missing values with the median of each column
bd <- bd %>%
  mutate(across(everything(), ~ifelse(is.na(.), median(., na.rm = TRUE), .)))
bd
## periodo IED_Flujos IED_M Exportaciones Exportaciones_m Empleo Educacion
## 1 1997 12145.60 294151.2 9087.62 220090.8 96.53 7.20
## 2 1998 8373.50 210875.6 9875.07 248690.6 96.53 7.31
## 3 1999 13960.32 299734.4 10990.01 235960.5 96.53 7.43
## 4 2000 18248.69 362631.8 12482.96 248057.2 97.83 7.56
## 5 2001 30057.18 546548.4 11300.44 205482.9 97.36 7.68
## 6 2002 24099.21 468332.0 11923.10 231707.6 97.66 7.80
## 7 2003 18249.97 368752.8 13156.00 265825.7 97.06 7.93
## 8 2004 25015.57 481349.2 13573.13 261173.9 96.48 8.04
## 9 2005 25795.82 458544.8 16465.81 292695.1 97.17 8.14
## 10 2006 21232.54 368495.8 17485.93 303472.5 96.53 8.26
## 11 2007 32393.33 542793.7 19103.85 320110.6 96.60 8.36
## 12 2008 29502.46 586217.7 16924.76 336297.2 95.68 8.46
## 13 2009 17849.95 324318.4 19702.63 357980.1 95.20 8.56
## 14 2010 27189.28 449223.7 22673.14 374607.6 95.06 8.63
## 15 2011 25632.52 460653.8 24333.02 437299.9 95.49 8.75
## 16 2012 21769.32 350978.6 26297.98 423992.5 95.53 8.85
## 17 2013 48354.42 754437.5 27687.57 431988.2 95.75 8.95
## 18 2014 30351.25 512758.2 31676.78 535151.9 96.24 9.05
## 19 2015 35943.75 699904.1 29959.94 583386.1 96.04 9.15
## 20 2016 31188.98 700091.6 31375.06 704268.5 96.62 9.25
## 21 2017 34017.05 683318.0 33322.62 669368.6 96.85 9.35
## 22 2018 34100.43 671018.4 35341.90 695447.7 96.64 9.45
## 23 2019 34577.16 615945.4 36414.73 648679.3 97.09 9.58
## 24 2020 28205.89 514711.7 41077.34 749594.7 96.21 8.46
## 25 2021 31553.52 551937.8 44914.78 785654.5 96.49 8.46
## 26 2022 36215.37 555771.9 46477.59 713259.0 97.24 8.46
## Salario_Diario Innovacion Inseguridad_Robo Inseguridad_Homicidio
## 1 24.30 11.30 266.51 14.55
## 2 31.91 11.37 314.78 14.32
## 3 31.91 12.46 272.89 12.64
## 4 35.12 13.15 216.98 10.86
## 5 37.57 13.47 214.53 10.25
## 6 39.74 12.80 197.80 9.94
## 7 41.53 11.81 183.22 9.81
## 8 43.30 12.61 146.28 8.92
## 9 45.24 13.41 136.94 9.22
## 10 47.05 14.23 135.59 9.60
## 11 48.88 15.04 145.92 8.04
## 12 50.84 14.82 158.17 12.52
## 13 53.19 12.59 175.77 17.46
## 14 55.77 12.69 201.94 22.43
## 15 58.06 12.10 212.61 23.42
## 16 60.75 13.03 190.28 22.09
## 17 63.12 13.22 185.56 19.74
## 18 65.58 13.65 154.41 16.93
## 19 70.10 15.11 180.44 17.37
## 20 73.04 14.40 160.57 20.31
## 21 88.36 14.05 230.43 26.22
## 22 88.36 13.25 184.25 29.59
## 23 102.68 12.70 173.45 29.21
## 24 123.22 11.28 133.90 28.98
## 25 141.70 13.09 127.13 27.89
## 26 172.87 13.09 120.49 16.93
## Tipo_de_Cambio Densidad_Carretera Densidad_Poblacion CO2_Emisiones
## 1 8.06 0.05 47.44 3.68
## 2 9.94 0.05 48.76 3.85
## 3 9.52 0.06 49.48 3.69
## 4 9.60 0.06 50.58 3.87
## 5 9.17 0.06 51.28 3.81
## 6 10.36 0.06 51.95 3.82
## 7 11.20 0.06 52.61 3.95
## 8 11.22 0.06 53.27 3.98
## 9 10.71 0.06 54.78 4.10
## 10 10.88 0.06 55.44 4.19
## 11 10.90 0.06 56.17 4.22
## 12 13.77 0.07 56.96 4.19
## 13 13.04 0.07 57.73 4.04
## 14 12.38 0.07 58.45 4.11
## 15 13.98 0.07 59.15 4.19
## 16 12.99 0.07 59.85 4.20
## 17 13.07 0.08 59.49 4.06
## 18 14.73 0.08 60.17 3.89
## 19 17.34 0.08 60.86 3.93
## 20 20.66 0.08 61.57 3.89
## 21 19.74 0.09 62.28 3.84
## 22 19.66 0.09 63.11 3.65
## 23 18.87 0.09 63.90 3.59
## 24 19.94 0.09 64.59 3.93
## 25 20.52 0.09 65.16 3.93
## 26 19.41 0.09 65.60 3.93
## PIB_Per_Capita INPC crisis_financiera
## 1 127570.1 33.28 0
## 2 126738.8 39.47 0
## 3 129164.7 44.34 0
## 4 130874.9 48.31 0
## 5 128083.4 50.43 0
## 6 128205.9 53.31 0
## 7 128737.9 55.43 0
## 8 132563.5 58.31 0
## 9 132941.1 60.25 0
## 10 135894.9 62.69 0
## 11 137795.7 65.05 0
## 12 135176.0 69.30 1
## 13 131233.0 71.77 1
## 14 134991.7 74.93 0
## 15 138891.9 77.79 0
## 16 141530.2 80.57 0
## 17 144112.0 83.77 0
## 18 147277.4 87.19 0
## 19 149433.5 89.05 0
## 20 152275.4 92.04 0
## 21 153235.7 98.27 0
## 22 153133.8 99.91 0
## 23 150233.1 105.93 0
## 24 142609.3 109.27 0
## 25 142772.0 117.31 0
## 26 146826.7 126.48 0
#Identify missing values
missing_values<-colSums(is.na(bd))
missing_values
## periodo IED_Flujos IED_M
## 0 0 0
## Exportaciones Exportaciones_m Empleo
## 0 0 0
## Educacion Salario_Diario Innovacion
## 0 0 0
## Inseguridad_Robo Inseguridad_Homicidio Tipo_de_Cambio
## 0 0 0
## Densidad_Carretera Densidad_Poblacion CO2_Emisiones
## 0 0 0
## PIB_Per_Capita INPC crisis_financiera
## 0 0 0
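The median imputation used above can be sketched on a small vector in base R. The vector `x` and the helper `impute_median` are hypothetical and are not taken from sp_data.csv; they only illustrate the mechanics.

```r
# Hypothetical toy vector with gaps (illustrative values, not from sp_data.csv)
x <- c(NA, 97.8, 97.4, NA, 96.5, 95.1)

# Replace every NA with the median of the observed values
impute_median <- function(v) {
  v[is.na(v)] <- median(v, na.rm = TRUE)
  v
}

x_imputed <- impute_median(x)
print(x_imputed)
```

One caveat for trending series: in the printed data, Educacion for 2020 to 2022 is 8.46, which equals the column median and sits below the 2019 value of 9.58, so those three years appear to be median-filled. For a variable with a steady upward trend, carrying the last observation forward or interpolating may distort the series less than the median does.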
# Dependent variable (Y): IED_M
# Histogram of exports (Exportaciones_m)
hist1 <- ggplot(bd, aes(x = Exportaciones_m)) +
  geom_histogram(aes(y = after_stat(density)), bins = 10, fill = "pink", color = "black") +
  geom_density(color = "red") +
  labs(title = "Distribution of Exports", x = "Value", y = "Density") +
  theme(plot.title = element_text(hjust = 0.5))
print(hist1)
# Histogram of daily wage (Salario_Diario)
hist2 <- ggplot(bd, aes(x = Salario_Diario)) +
  geom_histogram(aes(y = after_stat(density)), bins = 10, fill = "purple", color = "black") +
  geom_density(color = "red") +
  labs(title = "Distribution of Daily Wage", x = "Value", y = "Density") +
  theme(plot.title = element_text(hjust = 0.5))
print(hist2)
# Histogram of years of education (Educacion)
hist3 <- ggplot(bd, aes(x = Educacion)) +
  geom_histogram(aes(y = after_stat(density)), bins = 10, fill = "yellow", color = "black") +
  geom_density(color = "red") +
  labs(title = "Distribution of Years of Education", x = "Value", y = "Density") +
  theme(plot.title = element_text(hjust = 0.5))
print(hist3)
# Histogram of GDP per capita (PIB_Per_Capita)
hist4 <- ggplot(bd, aes(x = PIB_Per_Capita)) +
  geom_histogram(aes(y = after_stat(density)), bins = 10, fill = "blue", color = "black") +
  geom_density(color = "red") +
  labs(title = "Distribution of GDP per Capita", x = "Value", y = "Density") +
  theme(plot.title = element_text(hjust = 0.5))
print(hist4)
# Scatter plot 1: GDP per capita vs. FDI
ggplot(data = bd, aes(x = PIB_Per_Capita, y = IED_M)) +
  geom_point() +
  labs(title = "Scatter Plot: GDP per Capita vs FDI", x = "PIB_Per_Capita", y = "IED_M") +
  theme_minimal()
# Scatter plot 2: CO2 emissions vs. FDI
ggplot(data = bd, aes(x = CO2_Emisiones, y = IED_M)) +
  geom_point() +
  labs(title = "Scatter Plot: CO2 Emissions vs FDI", x = "CO2_Emisiones", y = "IED_M") +
  theme_minimal()
# These histograms show that some variables are slightly skewed: the tail of the distribution extends further to one side (right or left) because a few extreme values pull it out in that direction. Skewness can influence the choice of tests or models. Many statistical techniques assume approximately normal (symmetric, bell-shaped) data; when the data are skewed, those assumptions may be violated and a transformation, such as the logarithm, may be needed.
# When the points in a scatter plot of two variables follow a general direction, either ascending or descending, the two variables may be correlated. Scatter plots also help reveal general patterns in the data even when the linear correlation is weak.
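The effect of a log transformation on skewness can be checked numerically. The sketch below uses the 26 Salario_Diario values printed in this report and a simple moment-based skewness formula; the helper `skewness` is defined here for the example and does not come from a package.

```r
# Salario_Diario, 1997-2022, copied from the printed data
salario_diario <- c(24.30, 31.91, 31.91, 35.12, 37.57, 39.74, 41.53, 43.30,
                    45.24, 47.05, 48.88, 50.84, 53.19, 55.77, 58.06, 60.75,
                    63.12, 65.58, 70.10, 73.04, 88.36, 88.36, 102.68, 123.22,
                    141.70, 172.87)

# Moment-based sample skewness: third central moment over variance^(3/2)
skewness <- function(v) {
  m <- mean(v)
  mean((v - m)^3) / mean((v - m)^2)^1.5
}

skew_raw <- skewness(salario_diario)       # skewness in levels
skew_log <- skewness(log(salario_diario))  # skewness after taking logs

print(c(raw = skew_raw, log = skew_log))
```

A positive value indicates a right tail. Taking logs compresses large values more than small ones, which is why the log specification is considered for the models below.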
# Display a histogram of the dependent variable (histogram() is the lattice function)
hist_ied <- histogram(bd$IED_M)
hist_ied
# Display a histogram of the log of the dependent variable
ggplot(data = bd, aes(x = log(IED_M))) +
  geom_histogram(bins = 10, fill = "lightpink", color = "black", boundary = 15) +
  labs(title = "Frequency of log foreign direct investment flows", x = "log(IED_M)", y = "Frequency") +
  theme(plot.title = element_text(hjust = 0.5))
res <- cor(bd)
round(res, 2)
## periodo IED_Flujos IED_M Exportaciones Exportaciones_m
## periodo 1.00 0.72 0.69 0.98 0.95
## IED_Flujos 0.72 1.00 0.94 0.66 0.60
## IED_M 0.69 0.94 1.00 0.61 0.64
## Exportaciones 0.98 0.66 0.61 1.00 0.97
## Exportaciones_m 0.95 0.60 0.64 0.97 1.00
## Empleo -0.21 -0.06 0.02 -0.13 -0.09
## Educacion 0.84 0.73 0.74 0.73 0.75
## Salario_Diario 0.88 0.56 0.48 0.94 0.88
## Innovacion 0.25 0.53 0.58 0.16 0.17
## Inseguridad_Robo -0.59 -0.55 -0.45 -0.54 -0.45
## Inseguridad_Homicidio 0.78 0.40 0.42 0.78 0.82
## Tipo_de_Cambio 0.94 0.60 0.68 0.93 0.98
## Densidad_Carretera 0.96 0.73 0.72 0.95 0.95
## Densidad_Poblacion 1.00 0.72 0.67 0.96 0.93
## CO2_Emisiones 0.02 0.09 -0.06 -0.07 -0.18
## PIB_Per_Capita 0.89 0.73 0.78 0.85 0.89
## INPC 0.99 0.70 0.65 0.99 0.95
## crisis_financiera -0.04 -0.10 -0.08 -0.14 -0.13
## Empleo Educacion Salario_Diario Innovacion
## periodo -0.21 0.84 0.88 0.25
## IED_Flujos -0.06 0.73 0.56 0.53
## IED_M 0.02 0.74 0.48 0.58
## Exportaciones -0.13 0.73 0.94 0.16
## Exportaciones_m -0.09 0.75 0.88 0.17
## Empleo 1.00 -0.32 0.04 0.01
## Educacion -0.32 1.00 0.51 0.45
## Salario_Diario 0.04 0.51 1.00 0.05
## Innovacion 0.01 0.45 0.05 1.00
## Inseguridad_Robo 0.02 -0.44 -0.54 -0.42
## Inseguridad_Homicidio -0.33 0.68 0.64 -0.17
## Tipo_de_Cambio -0.09 0.78 0.85 0.22
## Densidad_Carretera -0.13 0.82 0.86 0.21
## Densidad_Poblacion -0.26 0.85 0.86 0.28
## CO2_Emisiones -0.51 0.07 -0.11 0.33
## PIB_Per_Capita -0.11 0.91 0.67 0.43
## INPC -0.14 0.78 0.93 0.22
## crisis_financiera -0.42 0.04 -0.11 0.16
## Inseguridad_Robo Inseguridad_Homicidio Tipo_de_Cambio
## periodo -0.59 0.78 0.94
## IED_Flujos -0.55 0.40 0.60
## IED_M -0.45 0.42 0.68
## Exportaciones -0.54 0.78 0.93
## Exportaciones_m -0.45 0.82 0.98
## Empleo 0.02 -0.33 -0.09
## Educacion -0.44 0.68 0.78
## Salario_Diario -0.54 0.64 0.85
## Innovacion -0.42 -0.17 0.22
## Inseguridad_Robo 1.00 -0.08 -0.45
## Inseguridad_Homicidio -0.08 1.00 0.79
## Tipo_de_Cambio -0.45 0.79 1.00
## Densidad_Carretera -0.47 0.81 0.94
## Densidad_Poblacion -0.62 0.76 0.92
## CO2_Emisiones -0.41 -0.25 -0.17
## PIB_Per_Capita -0.40 0.70 0.88
## INPC -0.59 0.75 0.94
## crisis_financiera -0.11 -0.09 -0.04
## Densidad_Carretera Densidad_Poblacion CO2_Emisiones
## periodo 0.96 1.00 0.02
## IED_Flujos 0.73 0.72 0.09
## IED_M 0.72 0.67 -0.06
## Exportaciones 0.95 0.96 -0.07
## Exportaciones_m 0.95 0.93 -0.18
## Empleo -0.13 -0.26 -0.51
## Educacion 0.82 0.85 0.07
## Salario_Diario 0.86 0.86 -0.11
## Innovacion 0.21 0.28 0.33
## Inseguridad_Robo -0.47 -0.62 -0.41
## Inseguridad_Homicidio 0.81 0.76 -0.25
## Tipo_de_Cambio 0.94 0.92 -0.17
## Densidad_Carretera 1.00 0.95 -0.17
## Densidad_Poblacion 0.95 1.00 0.09
## CO2_Emisiones -0.17 0.09 1.00
## PIB_Per_Capita 0.89 0.87 -0.11
## INPC 0.96 0.98 -0.01
## crisis_financiera -0.03 0.00 0.28
## PIB_Per_Capita INPC crisis_financiera
## periodo 0.89 0.99 -0.04
## IED_Flujos 0.73 0.70 -0.10
## IED_M 0.78 0.65 -0.08
## Exportaciones 0.85 0.99 -0.14
## Exportaciones_m 0.89 0.95 -0.13
## Empleo -0.11 -0.14 -0.42
## Educacion 0.91 0.78 0.04
## Salario_Diario 0.67 0.93 -0.11
## Innovacion 0.43 0.22 0.16
## Inseguridad_Robo -0.40 -0.59 -0.11
## Inseguridad_Homicidio 0.70 0.75 -0.09
## Tipo_de_Cambio 0.88 0.94 -0.04
## Densidad_Carretera 0.89 0.96 -0.03
## Densidad_Poblacion 0.87 0.98 0.00
## CO2_Emisiones -0.11 -0.01 0.28
## PIB_Per_Capita 1.00 0.85 -0.18
## INPC 0.85 1.00 -0.06
## crisis_financiera -0.18 -0.06 1.00
# Display a correlation plot
cor_matrix <- cor(bd, use = "complete.obs")
corrplot(cor_matrix, method = "circle", type = "upper")
# Correlation plot with coefficients, variables ordered by hierarchical clustering
corrplot(cor(bd), type = 'upper', order = 'hclust', addCoef.col = 'black')
Linear Regression Analysis
1st hypothesis
H0: Years of education have no significant effect on the flow of foreign direct investment.
H1: Years of education have a significant effect on the flow of foreign direct investment.
2nd hypothesis
H0: The financial crisis has no significant effect on the flow of foreign direct investment.
H1: The financial crisis has a significant effect on the flow of foreign direct investment.
3rd hypothesis
H0: The minimum wage has no significant effect on foreign direct investment.
H1: The minimum wage has a significant effect on foreign direct investment.
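Each hypothesis can be checked with the t-test that `summary()` reports for the corresponding coefficient, whose null is that the coefficient equals zero. The sketch below does this for education only, in a simple log-log regression on the Educacion and IED_M values printed in this report. It omits the control variables used in the models that follow, so it illustrates the mechanics and does not replace them. The 5% threshold is an assumption made for the example.

```r
# Educacion and IED_M, 1997-2022, copied from the printed (imputed) data
educacion <- c(7.20, 7.31, 7.43, 7.56, 7.68, 7.80, 7.93, 8.04, 8.14, 8.26,
               8.36, 8.46, 8.56, 8.63, 8.75, 8.85, 8.95, 9.05, 9.15, 9.25,
               9.35, 9.45, 9.58, 8.46, 8.46, 8.46)
ied_m <- c(294151.2, 210875.6, 299734.4, 362631.8, 546548.4, 468332.0,
           368752.8, 481349.2, 458544.8, 368495.8, 542793.7, 586217.7,
           324318.4, 449223.7, 460653.8, 350978.6, 754437.5, 512758.2,
           699904.1, 700091.6, 683318.0, 671018.4, 615945.4, 514711.7,
           551937.8, 555771.9)

# Simple log-log regression (illustrative; no control variables)
fit <- lm(log(ied_m) ~ log(educacion))
coefs <- summary(fit)$coefficients

# Two-sided p-value for the null that the education coefficient is zero
p_value <- coefs["log(educacion)", "Pr(>|t|)"]
reject_null <- p_value < 0.05  # assumed 5% significance level

print(coefs)
print(reject_null)
```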
Models.
# Initial model (Model 1): all variables in levels
mod1 <- lm(IED_M ~ Exportaciones_m + Educacion + Inseguridad_Homicidio + Salario_Diario + Tipo_de_Cambio + crisis_financiera, data = bd)
summary(mod1)
##
## Call:
## lm(formula = IED_M ~ Exportaciones_m + Educacion + Inseguridad_Homicidio +
## Salario_Diario + Tipo_de_Cambio + crisis_financiera, data = bd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -140411 -35956 -6718 37603 230766
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -8.172e+05 3.718e+05 -2.198 0.0405 *
## Exportaciones_m -3.719e-01 7.759e-01 -0.479 0.6372
## Educacion 1.358e+05 5.124e+04 2.650 0.0158 *
## Inseguridad_Homicidio -8.057e+03 4.996e+03 -1.612 0.1233
## Salario_Diario 9.475e+00 1.342e+03 0.007 0.9944
## Tipo_de_Cambio 3.403e+04 3.027e+04 1.124 0.2749
## crisis_financiera -8.978e+04 8.382e+04 -1.071 0.2975
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 95540 on 19 degrees of freedom
## Multiple R-squared: 0.6647, Adjusted R-squared: 0.5589
## F-statistic: 6.279 on 6 and 19 DF, p-value: 0.0009093
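# Model 2: log-log specification with lagged FDI (with dplyr loaded, lag() shifts the series by one period, so the first observation is dropped)
mod2 <- lm(log(IED_M) ~ log(lag(IED_M)) + log(Exportaciones_m) + log(Educacion) + log(Inseguridad_Homicidio) + log(Salario_Diario) + log(Tipo_de_Cambio) + crisis_financiera, data = bd)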
summary(mod2)
##
## Call:
## lm(formula = log(IED_M) ~ log(lag(IED_M)) + log(Exportaciones_m) +
## log(Educacion) + log(Inseguridad_Homicidio) + log(Salario_Diario) +
## log(Tipo_de_Cambio) + crisis_financiera, data = bd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.26148 -0.15268 0.02053 0.09231 0.35313
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 18.2555 5.8840 3.103 0.00647 **
## log(lag(IED_M)) -0.3087 0.2307 -1.338 0.19858
## log(Exportaciones_m) -1.1575 0.6432 -1.799 0.08972 .
## log(Educacion) 4.0847 1.1024 3.705 0.00176 **
## log(Inseguridad_Homicidio) -0.3817 0.1815 -2.103 0.05069 .
## log(Salario_Diario) 0.3481 0.2690 1.294 0.21291
## log(Tipo_de_Cambio) 1.8107 0.8306 2.180 0.04361 *
## crisis_financiera -0.2093 0.1651 -1.268 0.22195
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1956 on 17 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.7157, Adjusted R-squared: 0.5986
## F-statistic: 6.113 on 7 and 17 DF, p-value: 0.001103
# Model 3: reduced log-log specification
mod3 <- lm(log(IED_M) ~ log(Educacion) + log(Inseguridad_Homicidio) + log(Salario_Diario) + crisis_financiera, data = bd)
summary(mod3)
##
## Call:
## lm(formula = log(IED_M) ~ log(Educacion) + log(Inseguridad_Homicidio) +
## log(Salario_Diario) + crisis_financiera, data = bd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.37219 -0.09881 0.01057 0.11007 0.37397
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.6697 1.2952 5.149 4.21e-05 ***
## log(Educacion) 2.9459 0.7722 3.815 0.00101 **
## log(Inseguridad_Homicidio) -0.2959 0.1412 -2.096 0.04840 *
## log(Salario_Diario) 0.2346 0.1350 1.738 0.09679 .
## crisis_financiera -0.1221 0.1525 -0.801 0.43231
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2046 on 21 degrees of freedom
## Multiple R-squared: 0.6512, Adjusted R-squared: 0.5847
## F-statistic: 9.8 on 4 and 21 DF, p-value: 0.0001235
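# Model 4: same as Model 2 but with log(Educacion^2). Since log(Educacion^2) = 2*log(Educacion), this only rescales the education coefficient; the fit is identical to Model 2.
mod4 <- lm(log(IED_M) ~ log(lag(IED_M)) + log(Exportaciones_m) + log(Educacion^2) + log(Inseguridad_Homicidio) + log(Salario_Diario) + log(Tipo_de_Cambio) + crisis_financiera, data = bd)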
summary(mod4)
##
## Call:
## lm(formula = log(IED_M) ~ log(lag(IED_M)) + log(Exportaciones_m) +
## log(Educacion^2) + log(Inseguridad_Homicidio) + log(Salario_Diario) +
## log(Tipo_de_Cambio) + crisis_financiera, data = bd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.26148 -0.15268 0.02053 0.09231 0.35313
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 18.2555 5.8840 3.103 0.00647 **
## log(lag(IED_M)) -0.3087 0.2307 -1.338 0.19858
## log(Exportaciones_m) -1.1575 0.6432 -1.799 0.08972 .
## log(Educacion^2) 2.0423 0.5512 3.705 0.00176 **
## log(Inseguridad_Homicidio) -0.3817 0.1815 -2.103 0.05069 .
## log(Salario_Diario) 0.3481 0.2690 1.294 0.21291
## log(Tipo_de_Cambio) 1.8107 0.8306 2.180 0.04361 *
## crisis_financiera -0.2093 0.1651 -1.268 0.22195
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1956 on 17 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.7157, Adjusted R-squared: 0.5986
## F-statistic: 6.113 on 7 and 17 DF, p-value: 0.001103
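The Model 4 output is identical to the Model 2 output except for the education coefficient (2.0423 versus 4.0847). This follows from log(x^2) = 2*log(x): squaring inside the log rescales the regressor by a constant and adds no curvature. The sketch below shows this on a simplified one-regressor version of the models, using the Educacion and IED_M values printed in this report.

```r
# Educacion and IED_M, 1997-2022, copied from the printed (imputed) data
educacion <- c(7.20, 7.31, 7.43, 7.56, 7.68, 7.80, 7.93, 8.04, 8.14, 8.26,
               8.36, 8.46, 8.56, 8.63, 8.75, 8.85, 8.95, 9.05, 9.15, 9.25,
               9.35, 9.45, 9.58, 8.46, 8.46, 8.46)
ied_m <- c(294151.2, 210875.6, 299734.4, 362631.8, 546548.4, 468332.0,
           368752.8, 481349.2, 458544.8, 368495.8, 542793.7, 586217.7,
           324318.4, 449223.7, 460653.8, 350978.6, 754437.5, 512758.2,
           699904.1, 700091.6, 683318.0, 671018.4, 615945.4, 514711.7,
           551937.8, 555771.9)

fit_log <- lm(log(ied_m) ~ log(educacion))       # analogous to Model 2
fit_log_sq <- lm(log(ied_m) ~ log(educacion^2))  # analogous to Model 4

# Fitted values coincide; only the slope is rescaled by a factor of 2
same_fit <- isTRUE(all.equal(unname(fitted(fit_log)), unname(fitted(fit_log_sq))))
coef_ratio <- unname(coef(fit_log)[2] / coef(fit_log_sq)[2])

print(same_fit)
print(coef_ratio)
```

To test for a nonlinear effect of education, one option would be to add a separate squared term such as `I(log(Educacion)^2)` alongside `log(Educacion)`; that is a suggestion and not something the original analysis does.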
Compare models
stargazer(mod1,mod2,mod3,type="text",title="OLS Regression Results",single.row=TRUE,ci=FALSE,ci.level=0.9)
##
## OLS Regression Results
## ===================================================================================================
## Dependent variable:
## ------------------------------------------------------------------------
## IED_M log(IED_M)
## (1) (2) (3)
## ---------------------------------------------------------------------------------------------------
## Exportaciones_m -0.372 (0.776)
## Educacion 135,770.100** (51,237.110)
## Inseguridad_Homicidio -8,056.819 (4,996.485)
## Salario_Diario 9.475 (1,342.384)
## Tipo_de_Cambio 34,033.200 (30,274.240)
## log(lag(IED_M)) -0.309 (0.231)
## log(Exportaciones_m) -1.157* (0.643)
## log(Educacion) 4.085*** (1.102) 2.946*** (0.772)
## log(Inseguridad_Homicidio) -0.382* (0.182) -0.296** (0.141)
## log(Salario_Diario) 0.348 (0.269) 0.235* (0.135)
## log(Tipo_de_Cambio) 1.811** (0.831)
## crisis_financiera -89,784.880 (83,821.190) -0.209 (0.165) -0.122 (0.152)
## Constant -817,189.700** (371,769.400) 18.255*** (5.884) 6.670*** (1.295)
## ---------------------------------------------------------------------------------------------------
## Observations 26 25 26
## R2 0.665 0.716 0.651
## Adjusted R2 0.559 0.599 0.585
## Residual Std. Error 95,540.370 (df = 19) 0.196 (df = 17) 0.205 (df = 21)
## F Statistic 6.279*** (df = 6; 19) 6.113*** (df = 7; 17) 9.800*** (df = 4; 21)
## ===================================================================================================
## Note: *p<0.1; **p<0.05; ***p<0.01
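# Model graphs: observed vs. predicted values
par(mfrow = c(1, 2))
plot(x = predict(mod1), y = bd$IED_M,
     xlab = 'Predicted values', ylab = 'Observed values',
     main = 'Model 1')
abline(a = 0, b = 1, col = "yellow")
# Model 3 predicts log(IED_M), so the observed values are compared on the log scale
plot(x = predict(mod3), y = log(bd$IED_M),
     xlab = 'Predicted values (log)', ylab = 'Observed values (log)',
     main = 'Model 3')
abline(a = 0, b = 1, col = "yellow")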
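# Compare model fit with the Akaike Information Criterion (AIC); lower values indicate a better trade-off between fit and complexity.
# AIC values are only directly comparable across models with the same dependent variable and the same sample: Model 1 is in levels, and Models 2 and 4 drop one observation because of the lag.
# Model 1 - AIC
AIC(mod1)
## [1] 677.9295
# Model 2 - AIC
AIC(mod2)
## [1] -2.283171
# Model 3 - AIC
AIC(mod3)
## [1] -2.285823
# Model 4 - AIC
AIC(mod4)
## [1] -2.283171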
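The AIC of Model 1 (677.93) and the AICs of the log models (about -2.28) are on different scales because the dependent variable differs. A standard change-of-variables adjustment puts a log-response model on the scale of the original variable by adding 2*sum(log(y)) to its AIC. The sketch below applies it to two simplified one-regressor models built from the Educacion and IED_M values printed in this report; it does not re-estimate the models above.

```r
# Educacion and IED_M, 1997-2022, copied from the printed (imputed) data
educacion <- c(7.20, 7.31, 7.43, 7.56, 7.68, 7.80, 7.93, 8.04, 8.14, 8.26,
               8.36, 8.46, 8.56, 8.63, 8.75, 8.85, 8.95, 9.05, 9.15, 9.25,
               9.35, 9.45, 9.58, 8.46, 8.46, 8.46)
ied_m <- c(294151.2, 210875.6, 299734.4, 362631.8, 546548.4, 468332.0,
           368752.8, 481349.2, 458544.8, 368495.8, 542793.7, 586217.7,
           324318.4, 449223.7, 460653.8, 350978.6, 754437.5, 512758.2,
           699904.1, 700091.6, 683318.0, 671018.4, 615945.4, 514711.7,
           551937.8, 555771.9)

fit_level <- lm(ied_m ~ educacion)     # response in levels
fit_log <- lm(log(ied_m) ~ educacion)  # response in logs

# Jacobian adjustment: express the log model's AIC on the scale of ied_m
aic_level <- AIC(fit_level)
aic_log_adjusted <- AIC(fit_log) + 2 * sum(log(ied_m))

print(c(level = aic_level, log_adjusted = aic_log_adjusted))
```

After the adjustment the two numbers can be compared directly, with the lower one preferred. The comparison still requires both models to be fitted on the same observations.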
Model selection. Select the regression model that best fits the data, taking the diagnostic tests into account.
#Diagnostic tests
# Model 1:
vif(mod1)
## Exportaciones_m Educacion Inseguridad_Homicidio
## 62.709115 3.278016 3.462635
## Salario_Diario Tipo_de_Cambio crisis_financiera
## 6.342684 43.235405 1.421025
bptest(mod1)
##
## studentized Breusch-Pagan test
##
## data: mod1
## BP = 5.8766, df = 6, p-value = 0.4372
AIC(mod1)
## [1] 677.9295
histogram(mod1$residuals)
# Model 2:
vif(mod2)
## log(lag(IED_M)) log(Exportaciones_m)
## 3.467282 50.247929
## log(Educacion) log(Inseguridad_Homicidio)
## 4.416202 3.834936
## log(Salario_Diario) log(Tipo_de_Cambio)
## 9.422846 34.018824
## crisis_financiera
## 1.311019
bptest(mod2)
##
## studentized Breusch-Pagan test
##
## data: mod2
## BP = 8.1307, df = 7, p-value = 0.3212
AIC(mod2)
## [1] -2.283171
histogram(mod2$residuals)
# Model 3:
vif(mod3)
## log(Educacion) log(Inseguridad_Homicidio)
## 2.333886 2.124843
## log(Salario_Diario) crisis_financiera
## 2.508881 1.025598
bptest(mod3)
##
## studentized Breusch-Pagan test
##
## data: mod3
## BP = 3.1743, df = 4, p-value = 0.5291
AIC(mod3)
## [1] -2.285823
histogram(mod3$residuals)
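To make the Breusch-Pagan output easier to read, here is a minimal sketch of how the studentized version of the statistic can be computed by hand, assuming simulated data (the data frame `d` and its columns are hypothetical). The squared residuals are regressed on the same predictors, and the statistic is n times the R-squared of that auxiliary regression, compared with a chi-squared distribution.

```r
# Simulated data (hypothetical, not the bd dataset)
set.seed(2)
n <- 40
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(n)

fit <- lm(y ~ x1 + x2, data = d)

# Auxiliary regression: squared residuals on the original predictors
d$u2 <- residuals(fit)^2
aux <- lm(u2 ~ x1 + x2, data = d)

bp_stat <- n * summary(aux)$r.squared          # studentized (Koenker) BP statistic
bp_df   <- length(coef(aux)) - 1               # one degree of freedom per predictor
bp_p    <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

c(BP = bp_stat, df = bp_df, p.value = bp_p)
```

A large p-value, as in the three models above, means the null hypothesis of constant error variance is not rejected.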
# VIF
vif(mod1)
## Exportaciones_m Educacion Inseguridad_Homicidio
## 62.709115 3.278016 3.462635
## Salario_Diario Tipo_de_Cambio crisis_financiera
## 6.342684 43.235405 1.421025
vif(mod2)
## log(lag(IED_M)) log(Exportaciones_m)
## 3.467282 50.247929
## log(Educacion) log(Inseguridad_Homicidio)
## 4.416202 3.834936
## log(Salario_Diario) log(Tipo_de_Cambio)
## 9.422846 34.018824
## crisis_financiera
## 1.311019
vif(mod3)
## log(Educacion) log(Inseguridad_Homicidio)
## 2.333886 2.124843
## log(Salario_Diario) crisis_financiera
## 2.508881 1.025598
vif(mod4)
## log(lag(IED_M)) log(Exportaciones_m)
## 3.467282 50.247929
## log(Educacion^2) log(Inseguridad_Homicidio)
## 4.416202 3.834936
## log(Salario_Diario) log(Tipo_de_Cambio)
## 9.422846 34.018824
## crisis_financiera
## 1.311019
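The VIF values above can be reproduced by hand: for each predictor, regress it on the remaining predictors and compute 1 / (1 - R-squared). A minimal sketch, assuming simulated predictors (`x1`, `x2`, `x3` are hypothetical and `x2` is built to be nearly collinear with `x1`):

```r
# Simulated predictors (hypothetical, not the bd dataset)
set.seed(3)
n  <- 60
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)   # nearly collinear with x1
x3 <- rnorm(n)                  # unrelated to the others
X  <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R2_j), where R2_j comes from regressing predictor j on the rest
vif_manual <- sapply(names(X), function(v) {
  f  <- reformulate(setdiff(names(X), v), response = v)
  r2 <- summary(lm(f, data = X))$r.squared
  1 / (1 - r2)
})

vif_manual
```

In this sketch `x1` and `x2` get very large VIFs while `x3` stays close to 1, the same pattern as Exportaciones_m and Tipo_de_Cambio against the other variables.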
Show the predicted values of the dependent variable (e.g., effects plot)
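# Model 3 predicts log(IED_M), so observed values are plotted on the log scale too
plot(x=predict(mod3),y=log(bd$IED_M),
xlab='Predicted values (log)',ylab='Observed values (log)',
main='Model 3')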
EXTRA LASSO
set.seed(123)
training.samples<-bd$IED_M %>%
createDataPartition(p=0.75,list=FALSE)
train.data<-bd[training.samples, ]
test.data<-bd[-training.samples, ]
selected_model = lm(log(IED_M) ~ log(lag(IED_M)) + log(Exportaciones_m)+ log(Educacion^2)+log(Inseguridad_Homicidio)+log(Salario_Diario)+log(Tipo_de_Cambio)+crisis_financiera, data=bd)
summary(selected_model)
##
## Call:
## lm(formula = log(IED_M) ~ log(lag(IED_M)) + log(Exportaciones_m) +
## log(Educacion^2) + log(Inseguridad_Homicidio) + log(Salario_Diario) +
## log(Tipo_de_Cambio) + crisis_financiera, data = bd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.26148 -0.15268 0.02053 0.09231 0.35313
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 18.2555 5.8840 3.103 0.00647 **
## log(lag(IED_M)) -0.3087 0.2307 -1.338 0.19858
## log(Exportaciones_m) -1.1575 0.6432 -1.799 0.08972 .
## log(Educacion^2) 2.0423 0.5512 3.705 0.00176 **
## log(Inseguridad_Homicidio) -0.3817 0.1815 -2.103 0.05069 .
## log(Salario_Diario) 0.3481 0.2690 1.294 0.21291
## log(Tipo_de_Cambio) 1.8107 0.8306 2.180 0.04361 *
## crisis_financiera -0.2093 0.1651 -1.268 0.22195
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1956 on 17 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.7157, Adjusted R-squared: 0.5986
## F-statistic: 6.113 on 7 and 17 DF, p-value: 0.001103
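# Note: the fitted values come from the full sample and are on the log scale, while test.data$IED_M is a shorter vector in levels (hence the warning), so the value below is not a valid out-of-sample RMSE
RMSE(selected_model$fitted.values,test.data$IED_M)
## Warning in pred - obs: longer object length is not a multiple of shorter
## object length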
## [1] 535321.2
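A held-out RMSE needs predictions and observations of the same length and on the same scale. The following is a minimal sketch with simulated data (the data frame `d`, the split and the model are hypothetical, not the ones used in this report): the model is fitted on the training rows only, predictions are made for the test rows, and the RMSE is computed both on the log scale and after transforming back to levels.

```r
# Simulated data (hypothetical, not the bd dataset)
set.seed(4)
n <- 40
d <- data.frame(x = runif(n, 1, 5))
d$y <- exp(2 + 0.5 * d$x + rnorm(n, sd = 0.1))

# Hold out 10 rows for testing
test_idx <- sample(n, size = 10)
train <- d[-test_idx, ]
test  <- d[test_idx, ]

# Fit on the training rows only
fit <- lm(log(y) ~ x, data = train)

# Predict the held-out rows (predictions are on the log scale)
pred_log <- predict(fit, newdata = test)

rmse_log   <- sqrt(mean((pred_log - log(test$y))^2))   # both in logs
rmse_level <- sqrt(mean((exp(pred_log) - test$y)^2))   # both in levels; exp() gives a median-type prediction

c(rmse_log = rmse_log, rmse_level = rmse_level)
```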
#mod3 <- lm(log(IED_M) ~log(lag(IED_M)) + log(Exportaciones_m)+ log(Educacion^2)+log(Inseguridad_Homicidio)+log(Salario_Diario)+log(Tipo_de_Cambio)+crisis_financiera, data = bd)
#summary(mod3)
x = model.matrix(log(IED_M) ~ log(lag(IED_M)) + log(Exportaciones_m)+ log(Educacion^2)+log(Inseguridad_Homicidio)+log(Salario_Diario)+log(Tipo_de_Cambio)+crisis_financiera, train.data)[,-1]
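# y is taken in levels (IED_M, not log(IED_M)), so the Lasso coefficients below are on the scale of IED_M
# [-1] drops the first training row, which model.matrix() removes because lag() produces an NA there
y = train.data$IED_M[-1]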
set.seed(123)
cv.lasso<-cv.glmnet(x,y,alpha=1)
## Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per
## fold
cv.lasso$lambda.min
## [1] 1122.986
lassomodel<-glmnet(x,y,alpha=1,lambda=cv.lasso$lambda.min)
coef(lassomodel)
## 8 x 1 sparse Matrix of class "dgCMatrix"
## s0
## (Intercept) 1653473.85
## log(lag(IED_M)) -60727.67
## log(Exportaciones_m) -312043.98
## log(Educacion^2) 447115.13
## log(Inseguridad_Homicidio) -143672.53
## log(Salario_Diario) -25170.11
## log(Tipo_de_Cambio) 864080.72
## crisis_financiera -80593.82
# Build the test matrix with the same formula used for training so that its columns match lassomodel
x.test<-model.matrix(log(IED_M) ~ log(lag(IED_M)) + log(Exportaciones_m)+ log(Educacion^2)+log(Inseguridad_Homicidio)+log(Salario_Diario)+log(Tipo_de_Cambio)+crisis_financiera,test.data)[,-1]
#lassopredictions <- lassomodel %>% predict(x.test) %>% as.vector()
#data.frame(
# RMSE = RMSE(lassopredictions, test.data$IED_M[-1]),
# Rsquare = R2(lassopredictions, test.data$IED_M[-1]))
# Lasso model graph
lbs_fun <- function(fit, offset_x=1, ...) {
L <- length(fit$lambda)
x <- log(fit$lambda[L])+ offset_x
y <- fit$beta[, L]
labs <- names(y)
text(x, y, labels=labs, ...)
}
lasso<-glmnet(scale(x),y,alpha=1)
plot(lasso,xvar="lambda",label=T)
lbs_fun(lasso)
abline(v=cv.lasso$lambda.min,col="red",lty=2)
abline(v=cv.lasso$lambda.1se,col="blue",lty=2)
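As we can see, the signs of the Lasso coefficients match those of my winning model, Model 3, for
log(Educacion) (positive), log(Inseguridad_Homicidio) (negative) and crisis_financiera (negative); only log(Salario_Diario) changes sign.
This supports the choice of Model 3 as the most appropriate of my models.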
A Lasso regression estimates a linear regression model with an L1 penalty on the coefficients, for one dependent variable and one or more independent variables. Implementations usually offer trace plots of the coefficient paths and select the regularization hyperparameter (called alpha in some software and lambda in glmnet) by cross-validation.
Lasso regression generates “sparse coefficients”: coefficient vectors in which many entries are exactly zero. This means that the model ignores some of the predictive features, which can be considered a type of automatic feature selection. A model with fewer features is simpler to interpret and can reveal the most important features of the dataset. When some predictive features are correlated with each other, Lasso tends to keep one of them, somewhat arbitrarily, and shrink the others to zero.
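The reason the L1 penalty sets coefficients exactly to zero can be seen in a simplified case. A minimal sketch, assuming an orthonormal design and the objective (1/2) * RSS + lambda * sum(|beta|): in that case each Lasso coefficient is the OLS coefficient pulled toward zero by lambda and set to zero when it is smaller than lambda (soft-thresholding). The function name `soft_threshold` and the coefficient values are hypothetical, chosen only for illustration; this is not how glmnet is implemented internally.

```r
# Soft-thresholding: shrink each coefficient by lambda, and set it to zero
# when its absolute value is below lambda
soft_threshold <- function(b, lambda) sign(b) * pmax(abs(b) - lambda, 0)

# Hypothetical OLS coefficients (illustrative values only)
ols <- c(education = 2.9, homicide = -0.3, wage = 0.2, crisis = -0.1)

shrunk <- soft_threshold(ols, lambda = 0.25)
shrunk
# The two coefficients smaller than 0.25 in absolute value become exactly zero;
# the other two are moved 0.25 closer to zero
```

A larger lambda removes more variables, which is what the coefficient path plot above shows as lines reaching zero.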
EXTRA DETECT AUTOCORRELATION (JUST A GRAPH)
ts_data <- ts(bd)
acf(ts_data)
Box.test(mod3$residuals,lag=5,type="Ljung-Box")
##
## Box-Ljung test
##
## data: mod3$residuals
## X-squared = 4.3962, df = 5, p-value = 0.4939
Box.test(mod1$residuals,lag=5,type="Ljung-Box")
##
## Box-Ljung test
##
## data: mod1$residuals
## X-squared = 5.5708, df = 5, p-value = 0.3502
tsmodel1_res<-ts(mod3$residuals,start=1997,end=2020,frequency=1)
tsmodel2_res<-ts(mod1$residuals,start=1997,end=2020,frequency=1)
### Detect serial autocorrelation in acf plots
acf(tsmodel1_res)
acf(tsmodel2_res)
# The Ljung-Box test checks for autocorrelation in the residuals of a time series model. The lag argument sets the number of lags considered, so here the test checks for autocorrelation up to the 5th lag.
# Model 3
# The p-value of 0.4939 is greater than the conventional significance level of 0.05, so I fail to reject the null hypothesis: there is no evidence of significant autocorrelation in the residuals of the model.
# Model 1
# The p-value of 0.3502 is greater than 0.05, so I fail to reject the null hypothesis: there is no evidence of significant autocorrelation in the residuals of this model.
# Overall, the Ljung-Box test gives no significant evidence of autocorrelation in the residuals. This is a positive result: it suggests that the models capture the underlying patterns in the data without leaving structure (autocorrelation) in the residuals.
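To show what the Ljung-Box statistic measures, here is a minimal sketch that computes it by hand and checks it against `Box.test()`. The residual series `e` is simulated white noise (hypothetical, not the residuals of my models). The statistic is Q = n (n + 2) * sum over lags k = 1..h of r_k^2 / (n - k), where r_k is the sample autocorrelation at lag k.

```r
# Simulated residuals (hypothetical white noise, same length as the sample in this report)
set.seed(5)
e <- rnorm(26)
h <- 5                      # number of lags tested
n <- length(e)

# Sample autocorrelations at lags 1..h (the first element of $acf is lag 0)
r <- acf(e, lag.max = h, plot = FALSE)$acf[-1]

# Ljung-Box statistic and its chi-squared p-value with h degrees of freedom
Q <- n * (n + 2) * sum(r^2 / (n - seq_len(h)))
p_value <- pchisq(Q, df = h, lower.tail = FALSE)

# Same test with the built-in function, for comparison
bt <- Box.test(e, lag = h, type = "Ljung-Box")
c(manual_Q = Q, builtin_Q = unname(bt$statistic), manual_p = p_value)
```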
Conclusions. General interpretations (VIF, multicollinearity and Insights) The winning model in my case was Model 3, for the following reason: it has the smallest VIFs (Variance Inflation Factors) overall, all below 3, which tells us that there is very little multicollinearity and that the coefficient estimates are more stable. The exchange rate variable, which showed a high VIF in Models 1 and 2, is not included.
As we have seen in class, multicollinearity exists when there is correlation between multiple independent variables in a multiple regression model. To obtain the most accurate model, I eliminated variables until none had a VIF greater than 10, a common threshold for problematic multicollinearity. The variables with the most impact on our dependent variable are Educacion and Inseguridad_Homicidio.
Insights:
A 1% increase in “Educacion” is associated with an estimated increase of about 2.95% in IED_M (coefficient of 2.946 on log(Educacion)), holding the other variables constant. This is statistically significant at the 1% level.
A 1% increase in “Inseguridad_Homicidio” is associated with a decrease of about 0.30% in IED_M (coefficient of -0.296), holding the other variables constant. This is statistically significant at the 5% level.
Tipo_de_Cambio is not included in Model 3 because of its high VIF (above 30 in Models 1 and 2). In Model 2 it has a positive association with the dependent variable (1.811, p = 0.044), but given that collinearity the evidence isn’t strong enough to interpret it reliably.
The adjusted R-squared of Model 3 is 0.585, close to the 0.599 of Model 2 but obtained with fewer predictors and without multicollinearity problems. Model 3 also has the largest F statistic of my models (9.800). Educacion and Inseguridad_Homicidio are significant predictors in the model. Salario_Diario is significant only at the 10% level, and crisis_financiera is not statistically significant for the dependent variable IED_M.
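Because Model 3 is a log-log model, its coefficients are read as elasticities. A minimal sketch of the exact calculation behind that reading (the helper name `pct_effect` is hypothetical; the coefficients are the Model 3 estimates from the regression table): for a coefficient beta, a p% increase in the predictor multiplies the response by (1 + p/100)^beta.

```r
# Exact percentage change in the response for a pct_change% increase in a
# predictor, in a model where both enter in logs
pct_effect <- function(beta, pct_change = 1) {
  ((1 + pct_change / 100)^beta - 1) * 100
}

pct_effect(2.946)        # log(Educacion): close to the coefficient itself for a 1% change
pct_effect(-0.296)       # log(Inseguridad_Homicidio): negative effect
pct_effect(2.946, 10)    # for a 10% change the approximation "beta * 10" is less accurate
```

For small changes the result is close to the coefficient, which is why the coefficient can be read directly as "percent per percent".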
Vargas, C. (2023, March 29). Nearshoring, la nueva frontera de México. https://egade.tec.mx/es/egade-ideas/opinion/nearshoring-la-nueva-frontera-de-mexico
Garcia, S. (2023, August 16). Invita Samuel García a capitalizar ventajas que ofrece “Near Nuevo León”. https://www.nl.gob.mx/boletines-comunicados-y-avisos/invita-samuel-garcia-capitalizar-ventajas-que-ofrece-el-near-nuevo
GutCheck. (2017, December 5). Predictive analytics and regression models explained. https://gutcheckit.com/blog/predictive-analytics-regression-models-explained/
IBM. (2022, May 20). What is predictive analytics? https://www.ibm.com/topics/predictive-analytics
Osvarauld, T. (2023, February 10). Predictive analytics. https://www.cio.com/article/228901/what-is-predictive-analytics-transforming-data-into-future-insights.html