The Oxford dictionary defines nearshoring as the “outsourcing of work to an adjacent country at an equivalent level of economic development”. The term refers to the strategic action in which companies reconfigure their value chain processes to make them shorter and less expensive by placing production centers close to their consumer markets. Another important term is offshoring, a production model in which production centers are relocated to a lower-cost country; for example, many retail brands implement this strategy by placing production centers where labor costs are lower. China is currently a leader in manufacturing and holds the largest foreign-exchange reserves in the world, which puts it one step away from becoming a leading world power. It is considered the “factory of the world” because of its leadership in manufacturing, and a large part of its economy also comes from agriculture, which contributes 10% of GDP (Rios, 2014).
“Near shoring: practice in which a company transfers its activities to a country close to the destination market or the final consumer.”
Three years ago, a respiratory virus so deadly that it had a global impact changed the way we live our daily lives. Wuhan, a city in China, was the first to report cases, and the virus then spread to other cities inside and outside China. The outbreak disrupted and closed centers that were vital to the supply chain, causing a significant decrease in international trade in various raw materials and products (RS, s/f) (BBVA, 2022). More recently, the conflict between Russia and Ukraine has restricted the supply of raw metals destined for the United States: 30% of platinum group elements, 13% of titanium, and 11% of nickel were sourced from Russia, and these are no longer available (Hamilton, 2023). As a result, companies are seeking alternative supply channels in other countries by forming strategic alliances for mutual benefit, for example in countries where labor and raw metals are cheaper.
Many international companies and investors have found it vital to consider new locations outside China for their regional production chains, in order to reduce the risk and uncertainty of depending on the Asian economy (BBVA, 2022) (Egade Ideas, 2023). Mexico is currently the leading country for nearshoring in Latin America. In the first quarter of 2022, the number of square meters oriented to nearshoring represented 50% of the total registered during all of 2021. The furniture sector was the most prominent, with an occupancy rate of 97%. Mexico has therefore become an investment center for international companies: it is estimated that the Mexican Association of Industrial Parks will obtain an investment of around 1,600 million dollars from real estate developers, which will help its demand grow (Mundi, 2022). According to the Real Estate portal, Vianey Macías mentions “In terms of infrastructure, mainly energy, in the following years the consolidation of Mexico as part of Latin America will be registered. What we can see in terms of infrastructure is that we currently have more than 180 new projects in the country, of which 110 are in force, around 80 belong to hydrocarbon projects.” This positions Mexico as a country full of opportunities and innovation, and as a result it has attracted more investors and talent for its development.
Briefly describe what is Predictive Analytics. In doing so, please also explain what the use is of regression analysis in predictive analytics.
In the world of business analytics, it is known that a solid decision must first pass through testing and analysis that indicate the better choice for a company. Three types of analysis help companies make decisions: descriptive analysis, which explains what has already happened; predictive analysis, which tells us what could happen; and prescriptive analysis, which tells us what should happen in the future (University of Bath, 2023). According to the consultancy firm Gartner, “predictive analytics is a form of advanced analytics that examines data or content to answer the question: what is likely to happen in the future?” In other words, it uses the data collected to produce a possible outcome or prediction.
Predictive analysis is therefore a data-driven approach backed by historical data. It applies statistical algorithms, equations, and machine learning to identify relationships and patterns in the data and to build models from valid information for decision making, using regression and time-series techniques to forecast the likely outcome. In the words of the University of Bath, predictive analytics is a more advanced method of data analysis that uses probabilities to assess what could happen in the future. Using machine learning techniques, it takes existing data and attempts to fill in the missing data with the best possible guesses from its models.
Regression analysis is a widely used technique in predictive analysis, since it is a statistical method that helps model the relationship between a dependent variable and one or more independent variables. The main goal of regression analysis is to find the best-fit line or curve that represents the relationship between the variables, so that a future result can be predicted from historical information. In predictive analysis, regression therefore serves to predict future values of the dependent variable from the values of the independent variables.
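As a minimal sketch of this idea, the block below fits a simple regression of IED_MXN on Educacion and predicts a value for a new observation. The data are the 1997-2019 rows printed later in this report (the years with an observed Educacion value); the education level of 9.0 used for the prediction is a hypothetical input chosen only for illustration.

```r
# Observed values copied from the data printed in this report (1997-2019).
educacion <- c(7.20, 7.31, 7.43, 7.56, 7.68, 7.80, 7.93, 8.04, 8.14, 8.26, 8.36, 8.46,
               8.56, 8.63, 8.75, 8.85, 8.95, 9.05, 9.15, 9.25, 9.35, 9.45, 9.58)
ied_mxn <- c(294151.2, 210875.6, 299734.4, 362631.8, 546548.4, 468332.0, 368752.8,
             481349.2, 458544.8, 368495.8, 542793.7, 586217.7, 324318.4, 449223.7,
             460653.8, 350978.6, 754437.5, 512758.2, 699904.1, 700091.6, 683318.0,
             671018.4, 615945.4)
obs <- data.frame(educacion, ied_mxn)

# Fit the best-fit line: IED_MXN as a linear function of Educacion.
fit <- lm(ied_mxn ~ educacion, data = obs)
coef(fit)  # intercept and slope of the fitted line

# Predict IED_MXN for a hypothetical education level (illustrative input),
# with a 95% prediction interval around the point forecast.
new_obs <- data.frame(educacion = 9.0)
pred <- predict(fit, newdata = new_obs, interval = "prediction")
pred
```

The prediction interval is usually more informative than the point forecast alone, because with so few observations the uncertainty around any single predicted value is wide.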
Briefly explain how regression analysis can help us to predict the occurrence of “Nearshoring” for the Mexican case.
Mexico's geographical position as a neighbor of the United States has a positive impact, since it implies lower transportation costs and fewer secondary participants in the supply chain; this proximity attracts more investors to Mexico. The United States is seeking to relocate manufacturing that is currently in China, and Mexico benefits from its proximity and from the USMCA (T-MEC) trade agreement, which provides certainty to trade and investment and makes regional trade between the US, Mexico, and Canada more inclusive (Secretaria de Economia, 2018). In 2023, Tesla announced an investment of ten billion dollars to build production centers in the state of Nuevo León, taking advantage of its proximity to the United States, the second largest automotive market in the world after China.
This is a clear example of nearshoring. Mexico has been a center of automotive manufacturing and has an extensive auto parts industry, which multinational companies use to manufacture their products. The medium-sized automotive industry has benefited from the T-MEC provision under which vehicles produced in Mexico that contain at least 75% manufacturing with local components do not pay the 25% tariff that must be paid by cars from other countries, so that cars “Made in Mexico” cross the border and reach their final destination in a few hours (Barría, 2023). Regression analysis can therefore help us predict nearshoring in Mexico by analyzing factors such as trade agreements with the United States, geographic proximity, and costs. Logistic regression can model the probability of nearshoring based on these factors, and the coefficients of the model show the influence of each factor, which improves prediction and understanding for decision making.
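The logistic regression idea can be sketched as follows. This report does not contain a binary nearshoring indicator, so the block assumes a hypothetical one purely for illustration: a year counts as "high FDI" when IED_MXN is above the sample median. The data are the 1997-2019 rows printed later in this report, and the education level of 9.0 is again a hypothetical input.

```r
# Observed values copied from the data printed in this report (1997-2019).
educacion <- c(7.20, 7.31, 7.43, 7.56, 7.68, 7.80, 7.93, 8.04, 8.14, 8.26, 8.36, 8.46,
               8.56, 8.63, 8.75, 8.85, 8.95, 9.05, 9.15, 9.25, 9.35, 9.45, 9.58)
ied_mxn <- c(294151.2, 210875.6, 299734.4, 362631.8, 546548.4, 468332.0, 368752.8,
             481349.2, 458544.8, 368495.8, 542793.7, 586217.7, 324318.4, 449223.7,
             460653.8, 350978.6, 754437.5, 512758.2, 699904.1, 700091.6, 683318.0,
             671018.4, 615945.4)

# ASSUMPTION (illustration only): a year is "high FDI" when IED_MXN is above
# the sample median. This indicator is hypothetical, not a measured
# nearshoring outcome.
high_fdi <- as.integer(ied_mxn > median(ied_mxn))
obs <- data.frame(educacion, high_fdi)

# Logistic regression: models the log-odds of a "high FDI" year.
logit_fit <- glm(high_fdi ~ educacion, family = binomial, data = obs)
summary(logit_fit)$coefficients

# Odds ratio: multiplicative change in the odds per one-unit increase in Educacion.
exp(coef(logit_fit)[["educacion"]])

# Predicted probability of a "high FDI" year at a hypothetical education level.
prob <- predict(logit_fit, newdata = data.frame(educacion = 9.0), type = "response")
prob
```

With a real binary outcome and several factors on the right-hand side, the same `glm` call would give one coefficient per factor, which is what allows the influence of each one to be compared.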
According to the document “Mexico and Its Attractiveness for Near shoring”, what is the problem situation?
Many factors must be taken into consideration when answering this question, but in my opinion the exercise seeks to address Maria's concerns by analyzing the data with econometric methods. The problem situation is as follows: because of the trade war between China and the United States and the ongoing war between Russia and Ukraine, Maria wants to find an alternative country that is attractive and offers innovation and development. Maria came across Mexico and learned about its production potential and recent investments, such as Tesla's 10-billion-dollar investment to create a production center. This event and many others raise the question of which factors matter and whether Mexico is a good country in which to implement a nearshoring strategy, given its proximity to the United States market. The aim is to build a model that explains the variables or factors that affect nearshoring in Mexico, using “IED_MXN” as the dependent variable, since it is a way of measuring whether nearshoring is occurring in the country.
bd<- read.csv("C:\\Users\\sebastian\\Downloads\\sp_data.csv")
bd
## periodo IED_Flujos IED_MXN Exportaciones Exportaciones_MXN Empleo Educacion
## 1 1997 12145.60 294151.2 9087.62 220090.8 NA 7.20
## 2 1998 8373.50 210875.6 9875.07 248690.6 NA 7.31
## 3 1999 13960.32 299734.4 10990.01 235960.5 NA 7.43
## 4 2000 18248.69 362631.8 12482.96 248057.2 97.83 7.56
## 5 2001 30057.18 546548.4 11300.44 205482.9 97.36 7.68
## 6 2002 24099.21 468332.0 11923.10 231707.6 97.66 7.80
## 7 2003 18249.97 368752.8 13156.00 265825.7 97.06 7.93
## 8 2004 25015.57 481349.2 13573.13 261173.9 96.48 8.04
## 9 2005 25795.82 458544.8 16465.81 292695.1 97.17 8.14
## 10 2006 21232.54 368495.8 17485.93 303472.5 96.53 8.26
## 11 2007 32393.33 542793.7 19103.85 320110.6 96.60 8.36
## 12 2008 29502.46 586217.7 16924.76 336297.2 95.68 8.46
## 13 2009 17849.95 324318.4 19702.63 357980.1 95.20 8.56
## 14 2010 27189.28 449223.7 22673.14 374607.6 95.06 8.63
## 15 2011 25632.52 460653.8 24333.02 437299.9 95.49 8.75
## 16 2012 21769.32 350978.6 26297.98 423992.5 95.53 8.85
## 17 2013 48354.42 754437.5 27687.57 431988.2 95.75 8.95
## 18 2014 30351.25 512758.2 31676.78 535151.9 96.24 9.05
## 19 2015 35943.75 699904.1 29959.94 583386.1 96.04 9.15
## 20 2016 31188.98 700091.6 31375.06 704268.5 96.62 9.25
## 21 2017 34017.05 683318.0 33322.62 669368.6 96.85 9.35
## 22 2018 34100.43 671018.4 35341.90 695447.7 96.64 9.45
## 23 2019 34577.16 615945.4 36414.73 648679.3 97.09 9.58
## 24 2020 28205.89 514711.7 41077.34 749594.7 96.21 NA
## 25 2021 31553.52 551937.8 44914.78 785654.5 96.49 NA
## 26 2022 36215.37 555771.9 46477.59 713259.0 97.24 NA
## Salario_Diario Innovacion Inseguridad_Robo Inseguridad_Homicidio
## 1 24.30 11.30 266.51 14.55
## 2 31.91 11.37 314.78 14.32
## 3 31.91 12.46 272.89 12.64
## 4 35.12 13.15 216.98 10.86
## 5 37.57 13.47 214.53 10.25
## 6 39.74 12.80 197.80 9.94
## 7 41.53 11.81 183.22 9.81
## 8 43.30 12.61 146.28 8.92
## 9 45.24 13.41 136.94 9.22
## 10 47.05 14.23 135.59 9.60
## 11 48.88 15.04 145.92 8.04
## 12 50.84 14.82 158.17 12.52
## 13 53.19 12.59 175.77 17.46
## 14 55.77 12.69 201.94 22.43
## 15 58.06 12.10 212.61 23.42
## 16 60.75 13.03 190.28 22.09
## 17 63.12 13.22 185.56 19.74
## 18 65.58 13.65 154.41 16.93
## 19 70.10 15.11 180.44 17.37
## 20 73.04 14.40 160.57 20.31
## 21 88.36 14.05 230.43 26.22
## 22 88.36 13.25 184.25 29.59
## 23 102.68 12.70 173.45 29.21
## 24 123.22 11.28 133.90 28.98
## 25 141.70 NA 127.13 27.89
## 26 172.87 NA 120.49 NA
## Tipo_de_Cambio Densidad_Carretera Densidad_Poblacion CO2_Emisiones
## 1 8.06 0.05 47.44 3.68
## 2 9.94 0.05 48.76 3.85
## 3 9.52 0.06 49.48 3.69
## 4 9.60 0.06 50.58 3.87
## 5 9.17 0.06 51.28 3.81
## 6 10.36 0.06 51.95 3.82
## 7 11.20 0.06 52.61 3.95
## 8 11.22 0.06 53.27 3.98
## 9 10.71 0.06 54.78 4.10
## 10 10.88 0.06 55.44 4.19
## 11 10.90 0.06 56.17 4.22
## 12 13.77 0.07 56.96 4.19
## 13 13.04 0.07 57.73 4.04
## 14 12.38 0.07 58.45 4.11
## 15 13.98 0.07 59.15 4.19
## 16 12.99 0.07 59.85 4.20
## 17 13.07 0.08 59.49 4.06
## 18 14.73 0.08 60.17 3.89
## 19 17.34 0.08 60.86 3.93
## 20 20.66 0.08 61.57 3.89
## 21 19.74 0.09 62.28 3.84
## 22 19.66 0.09 63.11 3.65
## 23 18.87 0.09 63.90 3.59
## 24 19.94 0.09 64.59 NA
## 25 20.52 0.09 65.16 NA
## 26 19.41 0.09 65.60 NA
## PIB_Per_Capita INPC crisis_financiera
## 1 127570.1 33.28 0
## 2 126738.8 39.47 0
## 3 129164.7 44.34 0
## 4 130874.9 48.31 0
## 5 128083.4 50.43 0
## 6 128205.9 53.31 0
## 7 128737.9 55.43 0
## 8 132563.5 58.31 0
## 9 132941.1 60.25 0
## 10 135894.9 62.69 0
## 11 137795.7 65.05 0
## 12 135176.0 69.30 1
## 13 131233.0 71.77 1
## 14 134991.7 74.93 0
## 15 138891.9 77.79 0
## 16 141530.2 80.57 0
## 17 144112.0 83.77 0
## 18 147277.4 87.19 0
## 19 149433.5 89.05 0
## 20 152275.4 92.04 0
## 21 153235.7 98.27 0
## 22 153133.8 99.91 0
## 23 150233.1 105.93 0
## 24 142609.3 109.27 0
## 25 142772.0 117.31 0
## 26 146826.7 126.48 0
# Load libraries
#library(psych)
library(tidyverse)
library(ggplot2)
library(corrplot)
library(gmodels)
library(effects)
library(stargazer)
library(olsrr)
library(kableExtra)
library(jtools)
library(fastmap)
library(dlookr)
library(Hmisc)
library(naniar)
library(glmnet)
library(caret)
library(car)
library(lmtest)
library(MASS)
#Identify missing values
missing_values = colSums(is.na(bd))
missing_values
## periodo IED_Flujos IED_MXN
## 0 0 0
## Exportaciones Exportaciones_MXN Empleo
## 0 0 3
## Educacion Salario_Diario Innovacion
## 3 0 2
## Inseguridad_Robo Inseguridad_Homicidio Tipo_de_Cambio
## 0 1 0
## Densidad_Carretera Densidad_Poblacion CO2_Emisiones
## 0 0 3
## PIB_Per_Capita INPC crisis_financiera
## 0 0 0
#Display data set structure
str(bd)
## 'data.frame': 26 obs. of 18 variables:
## $ periodo : int 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 ...
## $ IED_Flujos : num 12146 8374 13960 18249 30057 ...
## $ IED_MXN : num 294151 210876 299734 362632 546548 ...
## $ Exportaciones : num 9088 9875 10990 12483 11300 ...
## $ Exportaciones_MXN : num 220091 248691 235961 248057 205483 ...
## $ Empleo : num NA NA NA 97.8 97.4 ...
## $ Educacion : num 7.2 7.31 7.43 7.56 7.68 7.8 7.93 8.04 8.14 8.26 ...
## $ Salario_Diario : num 24.3 31.9 31.9 35.1 37.6 ...
## $ Innovacion : num 11.3 11.4 12.5 13.2 13.5 ...
## $ Inseguridad_Robo : num 267 315 273 217 215 ...
## $ Inseguridad_Homicidio: num 14.6 14.3 12.6 10.9 10.2 ...
## $ Tipo_de_Cambio : num 8.06 9.94 9.52 9.6 9.17 ...
## $ Densidad_Carretera : num 0.05 0.05 0.06 0.06 0.06 0.06 0.06 0.06 0.06 0.06 ...
## $ Densidad_Poblacion : num 47.4 48.8 49.5 50.6 51.3 ...
## $ CO2_Emisiones : num 3.68 3.85 3.69 3.87 3.81 3.82 3.95 3.98 4.1 4.19 ...
## $ PIB_Per_Capita : num 127570 126739 129165 130875 128083 ...
## $ INPC : num 33.3 39.5 44.3 48.3 50.4 ...
## $ crisis_financiera : int 0 0 0 0 0 0 0 0 0 0 ...
# Include descriptive statistic
summary(bd)
## periodo IED_Flujos IED_MXN Exportaciones
## Min. :1997 Min. : 8374 Min. :210876 Min. : 9088
## 1st Qu.:2003 1st Qu.:21367 1st Qu.:368560 1st Qu.:13260
## Median :2010 Median :27698 Median :497054 Median :21188
## Mean :2010 Mean :26770 Mean :493596 Mean :23601
## 3rd Qu.:2016 3rd Qu.:32183 3rd Qu.:578606 3rd Qu.:31601
## Max. :2022 Max. :48354 Max. :754438 Max. :46478
##
## Exportaciones_MXN Empleo Educacion Salario_Diario
## Min. :205483 Min. :95.06 Min. :7.200 Min. : 24.30
## 1st Qu.:262337 1st Qu.:95.89 1st Qu.:7.865 1st Qu.: 41.97
## Median :366294 Median :96.53 Median :8.460 Median : 54.48
## Mean :433856 Mean :96.47 Mean :8.423 Mean : 65.16
## 3rd Qu.:632356 3rd Qu.:97.08 3rd Qu.:9.000 3rd Qu.: 72.31
## Max. :785655 Max. :97.83 Max. :9.580 Max. :172.87
## NA's :3 NA's :3
## Innovacion Inseguridad_Robo Inseguridad_Homicidio Tipo_de_Cambio
## Min. :11.28 Min. :120.5 Min. : 8.04 Min. : 8.06
## 1st Qu.:12.56 1st Qu.:148.3 1st Qu.:10.25 1st Qu.:10.75
## Median :13.09 Median :181.8 Median :16.93 Median :13.02
## Mean :13.11 Mean :185.4 Mean :17.29 Mean :13.91
## 3rd Qu.:13.75 3rd Qu.:209.9 3rd Qu.:22.43 3rd Qu.:18.49
## Max. :15.11 Max. :314.8 Max. :29.59 Max. :20.66
## NA's :2 NA's :1
## Densidad_Carretera Densidad_Poblacion CO2_Emisiones PIB_Per_Capita
## Min. :0.05000 Min. :47.44 Min. :3.590 Min. :126739
## 1st Qu.:0.06000 1st Qu.:52.77 1st Qu.:3.830 1st Qu.:130964
## Median :0.07000 Median :58.09 Median :3.930 Median :136845
## Mean :0.07115 Mean :57.33 Mean :3.945 Mean :138550
## 3rd Qu.:0.08000 3rd Qu.:61.39 3rd Qu.:4.105 3rd Qu.:146148
## Max. :0.09000 Max. :65.60 Max. :4.220 Max. :153236
## NA's :3
## INPC crisis_financiera
## Min. : 33.28 Min. :0.00000
## 1st Qu.: 56.15 1st Qu.:0.00000
## Median : 73.35 Median :0.00000
## Mean : 75.17 Mean :0.07692
## 3rd Qu.: 91.29 3rd Qu.:0.00000
## Max. :126.48 Max. :1.00000
##
# Replace missing values
bd <- bd %>%
mutate(across(everything(), ~ifelse(is.na(.), median(., na.rm = TRUE), .)))
bd
## periodo IED_Flujos IED_MXN Exportaciones Exportaciones_MXN Empleo Educacion
## 1 1997 12145.60 294151.2 9087.62 220090.8 96.53 7.20
## 2 1998 8373.50 210875.6 9875.07 248690.6 96.53 7.31
## 3 1999 13960.32 299734.4 10990.01 235960.5 96.53 7.43
## 4 2000 18248.69 362631.8 12482.96 248057.2 97.83 7.56
## 5 2001 30057.18 546548.4 11300.44 205482.9 97.36 7.68
## 6 2002 24099.21 468332.0 11923.10 231707.6 97.66 7.80
## 7 2003 18249.97 368752.8 13156.00 265825.7 97.06 7.93
## 8 2004 25015.57 481349.2 13573.13 261173.9 96.48 8.04
## 9 2005 25795.82 458544.8 16465.81 292695.1 97.17 8.14
## 10 2006 21232.54 368495.8 17485.93 303472.5 96.53 8.26
## 11 2007 32393.33 542793.7 19103.85 320110.6 96.60 8.36
## 12 2008 29502.46 586217.7 16924.76 336297.2 95.68 8.46
## 13 2009 17849.95 324318.4 19702.63 357980.1 95.20 8.56
## 14 2010 27189.28 449223.7 22673.14 374607.6 95.06 8.63
## 15 2011 25632.52 460653.8 24333.02 437299.9 95.49 8.75
## 16 2012 21769.32 350978.6 26297.98 423992.5 95.53 8.85
## 17 2013 48354.42 754437.5 27687.57 431988.2 95.75 8.95
## 18 2014 30351.25 512758.2 31676.78 535151.9 96.24 9.05
## 19 2015 35943.75 699904.1 29959.94 583386.1 96.04 9.15
## 20 2016 31188.98 700091.6 31375.06 704268.5 96.62 9.25
## 21 2017 34017.05 683318.0 33322.62 669368.6 96.85 9.35
## 22 2018 34100.43 671018.4 35341.90 695447.7 96.64 9.45
## 23 2019 34577.16 615945.4 36414.73 648679.3 97.09 9.58
## 24 2020 28205.89 514711.7 41077.34 749594.7 96.21 8.46
## 25 2021 31553.52 551937.8 44914.78 785654.5 96.49 8.46
## 26 2022 36215.37 555771.9 46477.59 713259.0 97.24 8.46
## Salario_Diario Innovacion Inseguridad_Robo Inseguridad_Homicidio
## 1 24.30 11.30 266.51 14.55
## 2 31.91 11.37 314.78 14.32
## 3 31.91 12.46 272.89 12.64
## 4 35.12 13.15 216.98 10.86
## 5 37.57 13.47 214.53 10.25
## 6 39.74 12.80 197.80 9.94
## 7 41.53 11.81 183.22 9.81
## 8 43.30 12.61 146.28 8.92
## 9 45.24 13.41 136.94 9.22
## 10 47.05 14.23 135.59 9.60
## 11 48.88 15.04 145.92 8.04
## 12 50.84 14.82 158.17 12.52
## 13 53.19 12.59 175.77 17.46
## 14 55.77 12.69 201.94 22.43
## 15 58.06 12.10 212.61 23.42
## 16 60.75 13.03 190.28 22.09
## 17 63.12 13.22 185.56 19.74
## 18 65.58 13.65 154.41 16.93
## 19 70.10 15.11 180.44 17.37
## 20 73.04 14.40 160.57 20.31
## 21 88.36 14.05 230.43 26.22
## 22 88.36 13.25 184.25 29.59
## 23 102.68 12.70 173.45 29.21
## 24 123.22 11.28 133.90 28.98
## 25 141.70 13.09 127.13 27.89
## 26 172.87 13.09 120.49 16.93
## Tipo_de_Cambio Densidad_Carretera Densidad_Poblacion CO2_Emisiones
## 1 8.06 0.05 47.44 3.68
## 2 9.94 0.05 48.76 3.85
## 3 9.52 0.06 49.48 3.69
## 4 9.60 0.06 50.58 3.87
## 5 9.17 0.06 51.28 3.81
## 6 10.36 0.06 51.95 3.82
## 7 11.20 0.06 52.61 3.95
## 8 11.22 0.06 53.27 3.98
## 9 10.71 0.06 54.78 4.10
## 10 10.88 0.06 55.44 4.19
## 11 10.90 0.06 56.17 4.22
## 12 13.77 0.07 56.96 4.19
## 13 13.04 0.07 57.73 4.04
## 14 12.38 0.07 58.45 4.11
## 15 13.98 0.07 59.15 4.19
## 16 12.99 0.07 59.85 4.20
## 17 13.07 0.08 59.49 4.06
## 18 14.73 0.08 60.17 3.89
## 19 17.34 0.08 60.86 3.93
## 20 20.66 0.08 61.57 3.89
## 21 19.74 0.09 62.28 3.84
## 22 19.66 0.09 63.11 3.65
## 23 18.87 0.09 63.90 3.59
## 24 19.94 0.09 64.59 3.93
## 25 20.52 0.09 65.16 3.93
## 26 19.41 0.09 65.60 3.93
## PIB_Per_Capita INPC crisis_financiera
## 1 127570.1 33.28 0
## 2 126738.8 39.47 0
## 3 129164.7 44.34 0
## 4 130874.9 48.31 0
## 5 128083.4 50.43 0
## 6 128205.9 53.31 0
## 7 128737.9 55.43 0
## 8 132563.5 58.31 0
## 9 132941.1 60.25 0
## 10 135894.9 62.69 0
## 11 137795.7 65.05 0
## 12 135176.0 69.30 1
## 13 131233.0 71.77 1
## 14 134991.7 74.93 0
## 15 138891.9 77.79 0
## 16 141530.2 80.57 0
## 17 144112.0 83.77 0
## 18 147277.4 87.19 0
## 19 149433.5 89.05 0
## 20 152275.4 92.04 0
## 21 153235.7 98.27 0
## 22 153133.8 99.91 0
## 23 150233.1 105.93 0
## 24 142609.3 109.27 0
## 25 142772.0 117.31 0
## 26 146826.7 126.48 0
#Identify missing values
miss_values<-colSums(is.na(bd))
miss_values
## periodo IED_Flujos IED_MXN
## 0 0 0
## Exportaciones Exportaciones_MXN Empleo
## 0 0 0
## Educacion Salario_Diario Innovacion
## 0 0 0
## Inseguridad_Robo Inseguridad_Homicidio Tipo_de_Cambio
## 0 0 0
## Densidad_Carretera Densidad_Poblacion CO2_Emisiones
## 0 0 0
## PIB_Per_Capita INPC crisis_financiera
## 0 0 0
According to Professor Antony Unwin of the University of Augsburg, data visualization, facilitated by the power of the computer, represents one of the fundamental tools of modern data science.
# Histogram of Exportaciones_MXN (graph 1)
hist1 <- ggplot(data = bd, aes(x = Exportaciones_MXN)) +
  geom_histogram(bins = 10, fill = "yellow", color = "black", boundary = 15) +
  labs(title = "Distribution of Exportaciones_MXN", x = "Exportaciones_MXN", y = "Count") +
  theme(plot.title = element_text(hjust = 0.5))
hist1
# Scatter plot of IED_MXN against Educacion (graph 2)
ggplot(bd, aes(x = Educacion, y = IED_MXN)) +
  geom_point(color = "green", alpha = 0.7) +
  labs(title = "IED_MXN vs Educacion", x = "Educacion", y = "IED_MXN") +
  theme_minimal()
# Histogram of PIB_Per_Capita (graph 3)
hist3 <- ggplot(data = bd, aes(x = PIB_Per_Capita)) +
  geom_histogram(bins = 10, fill = "red", color = "black", boundary = 15) +
  labs(title = "Distribution of PIB_Per_Capita", x = "PIB_Per_Capita", y = "Count") +
  theme(plot.title = element_text(hjust = 0.5))
hist3
# Histogram of Salario_Diario (graph 4)
hist4 <- ggplot(data = bd, aes(x = Salario_Diario)) +
  geom_histogram(bins = 10, fill = "orange", color = "black", boundary = 15) +
  labs(title = "Distribution of Salario_Diario", x = "Salario_Diario", y = "Count") +
  theme(plot.title = element_text(hjust = 0.5))
hist4
# Histogram of Innovacion (graph 5)
hist5 <- ggplot(data = bd, aes(x = Innovacion)) +
  geom_histogram(bins = 10, fill = "grey", color = "black", boundary = 15) +
  labs(title = "Distribution of Innovacion", x = "Innovacion", y = "Count") +
  theme(plot.title = element_text(hjust = 0.5))
hist5
Ordinary Least Squares Method (OLS): a linear regression technique used to estimate the unknown parameters of a model by minimizing the sum of squared residuals between the actual and predicted values (Kumar, s/f). It is one of the most common methods for linear regression, and here it is used to minimize the sum of squared differences between the actual and fitted values of the Foreign Direct Investment flows variable.
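To make the minimization concrete, the sketch below solves the OLS normal equations by hand and checks the result against `lm()`. It uses the 1997-2019 values of Educacion and IED_MXN printed in this report; the single-predictor model is a simplification chosen for illustration and is not one of the three models estimated below.

```r
# Observed values copied from the data printed in this report (1997-2019).
educacion <- c(7.20, 7.31, 7.43, 7.56, 7.68, 7.80, 7.93, 8.04, 8.14, 8.26, 8.36, 8.46,
               8.56, 8.63, 8.75, 8.85, 8.95, 9.05, 9.15, 9.25, 9.35, 9.45, 9.58)
ied_mxn <- c(294151.2, 210875.6, 299734.4, 362631.8, 546548.4, 468332.0, 368752.8,
             481349.2, 458544.8, 368495.8, 542793.7, 586217.7, 324318.4, 449223.7,
             460653.8, 350978.6, 754437.5, 512758.2, 699904.1, 700091.6, 683318.0,
             671018.4, 615945.4)

# Design matrix with an intercept column, and the response vector.
X <- cbind(intercept = 1, educacion = educacion)
y <- ied_mxn

# OLS estimate: solve the normal equations (X'X) beta = X'y.
beta_hat <- solve(crossprod(X), crossprod(X, y))

# The same estimate from lm(), for comparison.
fit <- lm(ied_mxn ~ educacion)
all.equal(as.numeric(beta_hat), unname(coef(fit)))

# Residual sum of squares as a function of the coefficients.
rss <- function(beta) sum((y - X %*% beta)^2)

# Moving the slope away from the OLS estimate always increases the RSS.
rss(beta_hat) < rss(beta_hat + c(0, 1000))
```

In practice `lm()` is preferable to solving the normal equations directly, since it uses a QR decomposition that is numerically more stable when predictors are strongly correlated.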
H0: The Education variable has no impact on the IED flow.
H1: The Education variable has a significant impact on the IED flow.
H0: A higher daily minimum wage has no significant impact on the IED flow.
H1: A higher daily minimum wage has a significant impact on the IED flow.
H0: The insecurity by homicide variable has no impact on the IED flow.
H1: The insecurity by homicide variable has a negative impact on the IED flow.
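Hypotheses of this kind are usually checked with the t-test that `summary()` reports for each coefficient, where the null hypothesis tested by the software is that the coefficient equals zero. The sketch below shows one way to read those p-values. It uses the 1997-2019 rows printed in this report, a reduced three-predictor model chosen for illustration, and a significance level of 0.05 that is an assumption rather than a value stated in the report.

```r
# Observed values copied from the data printed in this report (1997-2019).
educacion <- c(7.20, 7.31, 7.43, 7.56, 7.68, 7.80, 7.93, 8.04, 8.14, 8.26, 8.36, 8.46,
               8.56, 8.63, 8.75, 8.85, 8.95, 9.05, 9.15, 9.25, 9.35, 9.45, 9.58)
salario_diario <- c(24.30, 31.91, 31.91, 35.12, 37.57, 39.74, 41.53, 43.30, 45.24,
                    47.05, 48.88, 50.84, 53.19, 55.77, 58.06, 60.75, 63.12, 65.58,
                    70.10, 73.04, 88.36, 88.36, 102.68)
inseguridad_homicidio <- c(14.55, 14.32, 12.64, 10.86, 10.25, 9.94, 9.81, 8.92, 9.22,
                           9.60, 8.04, 12.52, 17.46, 22.43, 23.42, 22.09, 19.74, 16.93,
                           17.37, 20.31, 26.22, 29.59, 29.21)
ied_mxn <- c(294151.2, 210875.6, 299734.4, 362631.8, 546548.4, 468332.0, 368752.8,
             481349.2, 458544.8, 368495.8, 542793.7, 586217.7, 324318.4, 449223.7,
             460653.8, 350978.6, 754437.5, 512758.2, 699904.1, 700091.6, 683318.0,
             671018.4, 615945.4)
obs <- data.frame(ied_mxn, educacion, salario_diario, inseguridad_homicidio)

# Reduced illustrative model with the three variables named in the hypotheses.
fit <- lm(ied_mxn ~ educacion + salario_diario + inseguridad_homicidio, data = obs)
coef_table <- summary(fit)$coefficients

# Two-sided p-values for each slope (the intercept row is dropped).
p_values <- coef_table[-1, "Pr(>|t|)"]

# ASSUMPTION: 5% significance level.
alpha <- 0.05
decision <- ifelse(p_values < alpha, "reject H0", "fail to reject H0")

data.frame(estimate = coef_table[-1, "Estimate"],
           p_value = round(p_values, 4),
           decision = decision)
```

For a directional hypothesis, such as a negative effect of homicides, the sign of the estimate also has to agree with the alternative before the null is rejected in its favor.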
#Code given by the Professor to know the correlation of the variables
res <- cor(bd)
round(res, 2)
## periodo IED_Flujos IED_MXN Exportaciones
## periodo 1.00 0.72 0.69 0.98
## IED_Flujos 0.72 1.00 0.94 0.66
## IED_MXN 0.69 0.94 1.00 0.61
## Exportaciones 0.98 0.66 0.61 1.00
## Exportaciones_MXN 0.95 0.60 0.64 0.97
## Empleo -0.21 -0.06 0.02 -0.13
## Educacion 0.84 0.73 0.74 0.73
## Salario_Diario 0.88 0.56 0.48 0.94
## Innovacion 0.25 0.53 0.58 0.16
## Inseguridad_Robo -0.59 -0.55 -0.45 -0.54
## Inseguridad_Homicidio 0.78 0.40 0.42 0.78
## Tipo_de_Cambio 0.94 0.60 0.68 0.93
## Densidad_Carretera 0.96 0.73 0.72 0.95
## Densidad_Poblacion 1.00 0.72 0.67 0.96
## CO2_Emisiones 0.02 0.09 -0.06 -0.07
## PIB_Per_Capita 0.89 0.73 0.78 0.85
## INPC 0.99 0.70 0.65 0.99
## crisis_financiera -0.04 -0.10 -0.08 -0.14
## Exportaciones_MXN Empleo Educacion Salario_Diario
## periodo 0.95 -0.21 0.84 0.88
## IED_Flujos 0.60 -0.06 0.73 0.56
## IED_MXN 0.64 0.02 0.74 0.48
## Exportaciones 0.97 -0.13 0.73 0.94
## Exportaciones_MXN 1.00 -0.09 0.75 0.88
## Empleo -0.09 1.00 -0.32 0.04
## Educacion 0.75 -0.32 1.00 0.51
## Salario_Diario 0.88 0.04 0.51 1.00
## Innovacion 0.17 0.01 0.45 0.05
## Inseguridad_Robo -0.45 0.02 -0.44 -0.54
## Inseguridad_Homicidio 0.82 -0.33 0.68 0.64
## Tipo_de_Cambio 0.98 -0.09 0.78 0.85
## Densidad_Carretera 0.95 -0.13 0.82 0.86
## Densidad_Poblacion 0.93 -0.26 0.85 0.86
## CO2_Emisiones -0.18 -0.51 0.07 -0.11
## PIB_Per_Capita 0.89 -0.11 0.91 0.67
## INPC 0.95 -0.14 0.78 0.93
## crisis_financiera -0.13 -0.42 0.04 -0.11
## Innovacion Inseguridad_Robo Inseguridad_Homicidio
## periodo 0.25 -0.59 0.78
## IED_Flujos 0.53 -0.55 0.40
## IED_MXN 0.58 -0.45 0.42
## Exportaciones 0.16 -0.54 0.78
## Exportaciones_MXN 0.17 -0.45 0.82
## Empleo 0.01 0.02 -0.33
## Educacion 0.45 -0.44 0.68
## Salario_Diario 0.05 -0.54 0.64
## Innovacion 1.00 -0.42 -0.17
## Inseguridad_Robo -0.42 1.00 -0.08
## Inseguridad_Homicidio -0.17 -0.08 1.00
## Tipo_de_Cambio 0.22 -0.45 0.79
## Densidad_Carretera 0.21 -0.47 0.81
## Densidad_Poblacion 0.28 -0.62 0.76
## CO2_Emisiones 0.33 -0.41 -0.25
## PIB_Per_Capita 0.43 -0.40 0.70
## INPC 0.22 -0.59 0.75
## crisis_financiera 0.16 -0.11 -0.09
## Tipo_de_Cambio Densidad_Carretera Densidad_Poblacion
## periodo 0.94 0.96 1.00
## IED_Flujos 0.60 0.73 0.72
## IED_MXN 0.68 0.72 0.67
## Exportaciones 0.93 0.95 0.96
## Exportaciones_MXN 0.98 0.95 0.93
## Empleo -0.09 -0.13 -0.26
## Educacion 0.78 0.82 0.85
## Salario_Diario 0.85 0.86 0.86
## Innovacion 0.22 0.21 0.28
## Inseguridad_Robo -0.45 -0.47 -0.62
## Inseguridad_Homicidio 0.79 0.81 0.76
## Tipo_de_Cambio 1.00 0.94 0.92
## Densidad_Carretera 0.94 1.00 0.95
## Densidad_Poblacion 0.92 0.95 1.00
## CO2_Emisiones -0.17 -0.17 0.09
## PIB_Per_Capita 0.88 0.89 0.87
## INPC 0.94 0.96 0.98
## crisis_financiera -0.04 -0.03 0.00
## CO2_Emisiones PIB_Per_Capita INPC crisis_financiera
## periodo 0.02 0.89 0.99 -0.04
## IED_Flujos 0.09 0.73 0.70 -0.10
## IED_MXN -0.06 0.78 0.65 -0.08
## Exportaciones -0.07 0.85 0.99 -0.14
## Exportaciones_MXN -0.18 0.89 0.95 -0.13
## Empleo -0.51 -0.11 -0.14 -0.42
## Educacion 0.07 0.91 0.78 0.04
## Salario_Diario -0.11 0.67 0.93 -0.11
## Innovacion 0.33 0.43 0.22 0.16
## Inseguridad_Robo -0.41 -0.40 -0.59 -0.11
## Inseguridad_Homicidio -0.25 0.70 0.75 -0.09
## Tipo_de_Cambio -0.17 0.88 0.94 -0.04
## Densidad_Carretera -0.17 0.89 0.96 -0.03
## Densidad_Poblacion 0.09 0.87 0.98 0.00
## CO2_Emisiones 1.00 -0.11 -0.01 0.28
## PIB_Per_Capita -0.11 1.00 0.85 -0.18
## INPC -0.01 0.85 1.00 -0.06
## crisis_financiera 0.28 -0.18 -0.06 1.00
# Display the correlation plot
co_matrix <- cor(bd, use = "complete.obs")
corrplot(co_matrix, method = "circle",type="upper")
# Correlation plot with coefficients, variables ordered by hierarchical clustering
corrplot(cor(bd), type = "upper", order = 'hclust',addCoef.col='purple')
Estimate 3 different linear regression models
## Model 1
m1 <- lm(IED_MXN ~ Exportaciones_MXN + Educacion + Inseguridad_Homicidio + Salario_Diario + Tipo_de_Cambio + crisis_financiera, data = bd)
summary(m1)
##
## Call:
## lm(formula = IED_MXN ~ Exportaciones_MXN + Educacion + Inseguridad_Homicidio +
## Salario_Diario + Tipo_de_Cambio + crisis_financiera, data = bd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -140411 -35956 -6718 37603 230766
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -8.172e+05 3.718e+05 -2.198 0.0405 *
## Exportaciones_MXN -3.719e-01 7.759e-01 -0.479 0.6372
## Educacion 1.358e+05 5.124e+04 2.650 0.0158 *
## Inseguridad_Homicidio -8.057e+03 4.996e+03 -1.612 0.1233
## Salario_Diario 9.475e+00 1.342e+03 0.007 0.9944
## Tipo_de_Cambio 3.403e+04 3.027e+04 1.124 0.2749
## crisis_financiera -8.978e+04 8.382e+04 -1.071 0.2975
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 95540 on 19 degrees of freedom
## Multiple R-squared: 0.6647, Adjusted R-squared: 0.5589
## F-statistic: 6.279 on 6 and 19 DF, p-value: 0.0009093
## Model 2: Linear model logarithmic
m2 <- lm(log(IED_MXN) ~ Exportaciones_MXN + Educacion + Inseguridad_Homicidio + Salario_Diario + Tipo_de_Cambio + crisis_financiera, data = bd)
summary(m2)
##
## Call:
## lm(formula = log(IED_MXN) ~ Exportaciones_MXN + Educacion + Inseguridad_Homicidio +
## Salario_Diario + Tipo_de_Cambio + crisis_financiera, data = bd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.36451 -0.09749 0.01704 0.09774 0.38631
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.954e+00 8.170e-01 12.184 2e-10 ***
## Exportaciones_MXN -1.046e-06 1.705e-06 -0.614 0.54677
## Educacion 3.357e-01 1.126e-01 2.982 0.00767 **
## Inseguridad_Homicidio -1.914e-02 1.098e-02 -1.743 0.09754 .
## Salario_Diario 1.546e-03 2.950e-03 0.524 0.60631
## Tipo_de_Cambio 7.045e-02 6.653e-02 1.059 0.30289
## crisis_financiera -2.003e-01 1.842e-01 -1.088 0.29040
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.21 on 19 degrees of freedom
## Multiple R-squared: 0.6675, Adjusted R-squared: 0.5625
## F-statistic: 6.358 on 6 and 19 DF, p-value: 0.000846
## Model 3: polynomial
mp = lm(log(IED_MXN) ~ Exportaciones_MXN + Educacion + I(Educacion^2) + Inseguridad_Homicidio + Salario_Diario + Tipo_de_Cambio + crisis_financiera, data = bd)
summary(mp)
##
## Call:
## lm(formula = log(IED_MXN) ~ Exportaciones_MXN + Educacion + I(Educacion^2) +
## Inseguridad_Homicidio + Salario_Diario + Tipo_de_Cambio +
## crisis_financiera, data = bd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.35136 -0.08603 0.00789 0.08305 0.38928
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.425e+00 8.057e+00 0.922 0.369
## Exportaciones_MXN -1.066e-06 1.748e-06 -0.610 0.550
## Educacion 9.453e-01 1.935e+00 0.489 0.631
## I(Educacion^2) -3.704e-02 1.174e-01 -0.316 0.756
## Inseguridad_Homicidio -1.781e-02 1.201e-02 -1.482 0.156
## Salario_Diario 1.094e-03 3.344e-03 0.327 0.747
## Tipo_de_Cambio 7.445e-02 6.933e-02 1.074 0.297
## crisis_financiera -2.190e-01 1.978e-01 -1.107 0.283
##
## Residual standard error: 0.2151 on 18 degrees of freedom
## Multiple R-squared: 0.6694, Adjusted R-squared: 0.5408
## F-statistic: 5.206 on 7 and 18 DF, p-value: 0.002223
# Model comparison
stargazer(m1,m2,mp,type="text",title="OLS Regression Results",single.row=TRUE,ci=FALSE,ci.level=0.9)
##
## OLS Regression Results
## ==============================================================================================
## Dependent variable:
## ------------------------------------------------------------------------
## IED_MXN log(IED_MXN)
## (1) (2) (3)
## ----------------------------------------------------------------------------------------------
## Exportaciones_MXN -0.372 (0.776) -0.00000 (0.00000) -0.00000 (0.00000)
## Educacion 135,770.100** (51,237.110) 0.336*** (0.113) 0.945 (1.935)
## I(Educacion2) -0.037 (0.117)
## Inseguridad_Homicidio -8,056.819 (4,996.485) -0.019* (0.011) -0.018 (0.012)
## Salario_Diario 9.475 (1,342.384) 0.002 (0.003) 0.001 (0.003)
## Tipo_de_Cambio 34,033.200 (30,274.240) 0.070 (0.067) 0.074 (0.069)
## crisis_financiera -89,784.880 (83,821.190) -0.200 (0.184) -0.219 (0.198)
## Constant -817,189.700** (371,769.400) 9.954*** (0.817) 7.425 (8.057)
## ----------------------------------------------------------------------------------------------
## Observations 26 26 26
## R2 0.665 0.668 0.669
## Adjusted R2 0.559 0.563 0.541
## Residual Std. Error 95,540.370 (df = 19) 0.210 (df = 19) 0.215 (df = 18)
## F Statistic 6.279*** (df = 6; 19) 6.358*** (df = 6; 19) 5.206*** (df = 7; 18)
## ==============================================================================================
## Note: *p<0.1; **p<0.05; ***p<0.01
Diagnostic tests play an important role in regression work: they help verify and improve the reliability of the linear regression results, and of any predictions built on them, by examining how the variables relate to one another and whether the model assumptions hold.
Applying them to the three models lets us identify the most reliable one before using it for business decision making, process optimization, and similar tasks.
# Model 1
AIC(m1)
## [1] 677.9295
# Model 2
AIC(m2)
## [1] 0.464849
# Model 3
AIC(mp)
## [1] 2.321377
AIC: “Estimator of the relative quality of the model that takes into account its complexity. As the number of input parameters of a polynomial increases, the value of R will be better, because the mean square error decreases. The Akaike information criterion (aic metric) penalizes complex models in favor of simple ones to avoid overfitting.”(KeepCoding,2023)
VIF: The Variance Inflation Factor helps us diagnose multicollinearity. As a rule of thumb, when a variable's VIF is greater than 10, it is preferable to eliminate the variable that is causing the multicollinearity.
bptest: The Breusch-Pagan test checks for the presence of heteroscedasticity. A p-value at or above the chosen significance level (for example, 0.05) fails to reject the null hypothesis of homoscedasticity.
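To make the Breusch-Pagan decision rule concrete, the following is a minimal sketch in base R of the studentized (Koenker) form of the statistic: regress the squared residuals on the same regressors and use n times the R² of that auxiliary regression. The data are simulated for illustration and are not the study's `bd` data set; the 0.05 threshold is an assumption.

```r
# Toy data (simulated; NOT the study's data set `bd`)
set.seed(1)
n  <- 200
x1 <- runif(n, 1, 10)
x2 <- rnorm(n)
# The error spread grows with x1, so the data are heteroscedastic by construction
y  <- 2 + 0.5 * x1 - 0.3 * x2 + rnorm(n, sd = 0.3 * x1)

fit <- lm(y ~ x1 + x2)

# Auxiliary regression: squared residuals on the same regressors
aux <- lm(resid(fit)^2 ~ x1 + x2)

# Studentized (Koenker) statistic: n * R^2 of the auxiliary regression
bp_stat <- n * summary(aux)$r.squared
bp_df   <- length(coef(aux)) - 1   # number of regressors, intercept excluded
bp_p    <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

alpha <- 0.05                      # assumed significance level
reject_homoscedasticity <- bp_p < alpha
```

If `bp_p` is at or above `alpha`, the null hypothesis of homoscedasticity is not rejected, which is the situation reported for the three models in this document.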
# Run the diagnostics (VIF, Breusch-Pagan, residual histogram) for each linear regression model
# Model 1
vif(m1)
## Exportaciones_MXN Educacion Inseguridad_Homicidio
## 62.709115 3.278016 3.462635
## Salario_Diario Tipo_de_Cambio crisis_financiera
## 6.342684 43.235405 1.421025
bptest(m1)
##
## studentized Breusch-Pagan test
##
## data: m1
## BP = 5.8766, df = 6, p-value = 0.4372
histogram(m1$residuals)
# Model 2
vif(m2)
## Exportaciones_MXN Educacion Inseguridad_Homicidio
## 62.709115 3.278016 3.462635
## Salario_Diario Tipo_de_Cambio crisis_financiera
## 6.342684 43.235405 1.421025
bptest(m2)
##
## studentized Breusch-Pagan test
##
## data: m2
## BP = 4.5779, df = 6, p-value = 0.599
histogram(m2$residuals)
# Model 3
vif(mp)
## Exportaciones_MXN Educacion I(Educacion^2)
## 62.790629 922.033359 956.049826
## Inseguridad_Homicidio Salario_Diario Tipo_de_Cambio
## 3.947527 7.764539 44.728842
## crisis_financiera
## 1.561342
bptest(mp)
##
## studentized Breusch-Pagan test
##
## data: mp
## BP = 5.0375, df = 7, p-value = 0.6554
histogram(mp$residuals)
Three regression models were fitted, and two measures were used to decide which one best fits the data. The first is R squared (R²), which evaluates the fit of a model to the data set by measuring how much of the variation in the dependent variable it explains, here “IED_MXN”, the foreign investment flows to Mexico. The second is the AIC, which favors simpler models and helps us make predictions within context. The sign of an AIC value carries no meaning on its own; the value serves only as a comparison metric to identify which model is preferable. AIC values are also only comparable between models fitted to the same dependent variable, so Models 2 and 3 (both on log(IED_MXN)) can be compared directly, whereas the 677.9295 of Model 1 is on a different scale.
The AIC, vif and bptest diagnostics were applied to all 3 models. Our first selection criterion was the highest R². At first glance Model 3 meets it (0.669, against 0.668 for Model 2), but this metric is not decisive and only gives a first idea. Applying the AIC, Model 2 has a value of 0.464849 against 2.321377 for Model 3. Since the model with the lowest AIC is preferred, Model 2 is the preferable model for the variables examined.
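As a hedged illustration of how the AIC is built and why a model on IED_MXN and a model on log(IED_MXN) cannot be compared on their raw AIC values, the sketch below recomputes the AIC by hand (2k minus twice the log-likelihood) on simulated data. It then restates the log model's AIC on the scale of the original response by adding the Jacobian term 2·Σlog(y). The data and the names `fit_level`, `fit_log` and `aic_by_hand` are illustrative assumptions, not objects from this analysis.

```r
# Simulated, strictly positive response (illustrative only)
set.seed(42)
n <- 60
x <- runif(n, 5, 10)
y <- exp(1 + 0.3 * x + rnorm(n, sd = 0.2))

fit_level <- lm(y ~ x)        # response on its original scale
fit_log   <- lm(log(y) ~ x)   # response on the log scale

# AIC by hand: 2k - 2*logLik, where k counts the coefficients plus the error variance
aic_by_hand <- function(fit) {
  k <- length(coef(fit)) + 1
  2 * k - 2 * as.numeric(logLik(fit))
}

# The log model's likelihood is a likelihood for log(y). Adding 2*sum(log(y)),
# the Jacobian of the log transform, restates its AIC on the scale of y.
aic_log_on_y_scale <- AIC(fit_log) + 2 * sum(log(y))

c(level        = AIC(fit_level),
  log_raw      = AIC(fit_log),
  log_adjusted = aic_log_on_y_scale)
```

Only `level` and `log_adjusted` refer to the same response, so only those two can be ranked against each other. Two models that share the same transformed response can be compared on their raw values.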
Finally, the vif is applied to detect multicollinearity among the explanatory variables of “IED_MXN”. For this measure it is preferable to keep every variable below 10. In our case “Exportaciones_MXN” has a vif of 62.709115 and “Tipo_de_Cambio” a vif of 43.235405. It makes sense that these two variables come out very high, because they were the only explanatory variables involved in the currency conversion from dollars to pesos, which explains the large gap between them and the others. These variables therefore show multicollinearity that must be eliminated; the elimination process is shown below.
# Eliminate multicollinearity: drop the Exportaciones_MXN variable
m221 <- lm(log(IED_MXN) ~ Educacion + Inseguridad_Homicidio+ Tipo_de_Cambio + Salario_Diario + crisis_financiera, data = bd)
summary(m221)
##
## Call:
## lm(formula = log(IED_MXN) ~ Educacion + Inseguridad_Homicidio +
## Tipo_de_Cambio + Salario_Diario + crisis_financiera, data = bd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.3595 -0.1061 0.0312 0.0789 0.4050
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.1825198 0.7156825 14.228 6.35e-12 ***
## Educacion 0.3271492 0.1099697 2.975 0.00749 **
## Inseguridad_Homicidio -0.0221244 0.0096855 -2.284 0.03342 *
## Tipo_de_Cambio 0.0344618 0.0308980 1.115 0.27793
## Salario_Diario 0.0005884 0.0024641 0.239 0.81369
## crisis_financiera -0.1420174 0.1553137 -0.914 0.37140
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2067 on 20 degrees of freedom
## Multiple R-squared: 0.6609, Adjusted R-squared: 0.5762
## F-statistic: 7.798 on 5 and 20 DF, p-value: 0.0003279
# VIF (highest value is now about 9.6)
vif(m221)
## Educacion Inseguridad_Homicidio Tipo_de_Cambio
## 3.227493 2.781000 9.625698
## Salario_Diario crisis_financiera
## 4.567722 1.042775
# AIC (-1.025)
AIC(m221)
## [1] -1.025014
For “Educacion” we obtained p = 0.00749 and for “Inseguridad_Homicidio” p = 0.03342, so both are significant at the 5% level. “Tipo_de_Cambio”, “Salario_Diario” and “crisis_financiera”, on the other hand, are not significant. After eliminating “Exportaciones_MXN” because of its high vif, the vif of every remaining variable fell; “Tipo_de_Cambio” in particular went from 43.235405 to 9.625698. This reduced model (m221, derived from Model 2) was the most accurate: in the diagnostics R² decreased only slightly (from 0.668 to 0.661) while the AIC improved from 0.464849 to -1.025014. Of these two criteria, we give the AIC more weight than the R².
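The drop in vif after removing one collinear variable can be reproduced in a small base-R sketch. It uses the textbook definition VIF_j = 1 / (1 - R²_j), where R²_j comes from regressing predictor j on the remaining predictors. The data are simulated and the helper `vif_by_hand` is a hypothetical name introduced for illustration; it is not the `vif()` function used above, although for single-column numeric predictors it follows the same definition.

```r
# Simulated predictors (illustrative only)
set.seed(7)
n  <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)   # nearly a copy of x1, so the two are collinear
x3 <- rnorm(n)
toy <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R^2_j), with R^2_j from regressing predictor j on the others
vif_by_hand <- function(predictors) {
  sapply(names(predictors), function(v) {
    others <- setdiff(names(predictors), v)
    r2 <- summary(lm(reformulate(others, response = v), data = predictors))$r.squared
    1 / (1 - r2)
  })
}

vif_full    <- vif_by_hand(toy)                   # x1 and x2 inflate each other
vif_reduced <- vif_by_hand(toy[, c("x1", "x3")])  # after dropping x2
```

Dropping one member of the collinear pair is enough to bring the other back to a low value, which mirrors what happened to “Tipo_de_Cambio” once “Exportaciones_MXN” was removed.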
Variables with a negative estimated effect:
-Inseguridad_Homicidio: homicides, measured as the rate per 100,000 inhabitants (significant at the 5% level)
-crisis_financiera: disturbance that produces a significant loss of value of financial assets, 2008 and 2009 (not significant)
Variables with a positive estimated effect:
-Educacion: average years of education (significant at the 1% level)
-Tipo_de_Cambio: FIX exchange rate, pesos per dollar (not significant)
-Salario_Diario: minimum wage in pesos per day (not significant)
The coefficients indicate how each variable affects our Foreign Direct Investment flows variable, “IED_MXN”.
# Effect plots
library(car)
lm(formula = log(IED_MXN) ~ Educacion + Inseguridad_Homicidio+ Tipo_de_Cambio + Salario_Diario + crisis_financiera, data = bd)
##
## Call:
## lm(formula = log(IED_MXN) ~ Educacion + Inseguridad_Homicidio +
## Tipo_de_Cambio + Salario_Diario + crisis_financiera, data = bd)
##
## Coefficients:
## (Intercept) Educacion Inseguridad_Homicidio
## 10.1825198 0.3271492 -0.0221244
## Tipo_de_Cambio Salario_Diario crisis_financiera
## 0.0344618 0.0005884 -0.1420174
avPlots(m221)
“Educacion”: At a glance the plot shows a positive pattern: on average, as people’s educational level increases, the IED_MXN variable tends to increase. This suggests that a more educated population contributes to economic growth.
“Inseguridad_Homicidio”: The results indicate that the higher the homicide-based insecurity index, the lower the variable “IED_MXN”, although the decrease is small. This implies that rising insecurity has an impact, since investors can redirect their investment to a safer area.
“Tipo_de_Cambio”: The plot shows a positive relationship between the exchange rate and “IED_MXN”: a higher exchange rate is associated with more investment, which helps the country’s economy and affects international trade by facilitating purchases or agreements with other countries.
“Salario_Diario”: The relationship is fairly flat: higher daily wages come with only a slight increase in the variable “IED_MXN”. We can infer that people have more money to spend, which helps the growth of the country.
“crisis_financiera”: The presence of a financial crisis tends to have a negative impact on the country’s economy. It results in economic instability, so people become more careful with their money and the amount of investment tends to decrease.
# Lasso regression
library(glmnet)  # provides glmnet() and cv.glmnet()
x <- as.matrix(bd[, c("Educacion", "Inseguridad_Homicidio", "Tipo_de_Cambio", "Salario_Diario", "crisis_financiera")])
y <- log(bd$IED_MXN)
# Fit the Lasso model
set.seed(123)
lasso_model <- glmnet(x, y, alpha = 1)
cv_lasso <- cv.glmnet(x, y, alpha = 1)
## Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per
## fold
optimal_lambda <- cv_lasso$lambda.min
print(coef(lasso_model, s = optimal_lambda))
## 6 x 1 sparse Matrix of class "dgCMatrix"
## s1
## (Intercept) 11.01204948
## Educacion 0.22651296
## Inseguridad_Homicidio .
## Tipo_de_Cambio 0.01028804
## Salario_Diario .
## crisis_financiera .
## Prediction analysis
predictions <- predict(lasso_model, newx = x, s = optimal_lambda)
### Visualizing the coefficient paths
lasso <- glmnet(scale(x), y, alpha = 1)  # alpha = 1 is the lasso penalty (alpha = 0 would fit ridge)
plot(lasso, xvar = "lambda", label = TRUE)
abline(v=cv_lasso$lambda.min, col = "red", lty=2)
abline(v=cv_lasso$lambda.1se, col="blue", lty=2)
-Educacion is the black line
-Tipo_de_Cambio is the green line
-Salario_Diario is the blue line
-crisis_financiera is the light blue line
-Inseguridad_Homicidio is the red line
The Lasso regression model keeps only “Educacion” and “Tipo_de_Cambio” as predictors of log(IED_MXN). Holding the other variable constant, each additional unit of “Educacion” increases log(IED_MXN) by 0.2265, and each additional unit of “Tipo_de_Cambio” increases it by 0.0103. The coefficients of Inseguridad_Homicidio, Salario_Diario and crisis_financiera, on the other hand, are shown as a point (“.”), which means the penalty shrank them to exactly zero: at the selected lambda they provide no additional information about the behavior of “IED_MXN”, so only the first two variables are retained as predictors.
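Why the Lasso produces exact zeros (the points in the coefficient table) can be seen in the simplest setting, orthonormal predictors, where the Lasso solution is the soft-thresholded least-squares estimate. The sketch below implements that operator in base R. The function name `soft_threshold` and the input coefficients are hypothetical and chosen only for illustration; this is not how glmnet is called, only the idea behind its penalty.

```r
# Soft-thresholding operator: shrink each value towards zero by lambda and set
# to exactly zero anything whose absolute value is at most lambda
soft_threshold <- function(z, lambda) {
  sign(z) * pmax(abs(z) - lambda, 0)
}

# Hypothetical unpenalized coefficients (illustrative numbers only)
z <- c(a = 0.90, b = -0.15, c = 0.05, d = -0.60)

shrunk <- soft_threshold(z, lambda = 0.2)
shrunk
```

Coefficients `b` and `c` are smaller in absolute value than the threshold and are removed entirely, while `a` and `d` survive with reduced magnitude. A larger lambda removes more variables.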
Model 2 was the most accurate of all once its multicollinearity was addressed. The vif analysis showed that two variables in our winning model were highly collinear, which is explained by the conversion made in the Excel sheet from US dollars to Mexican pesos. To correct this, the “Exportaciones_MXN” variable was eliminated, which brought the vif of every remaining variable below 10. This elimination produced a more precise model while keeping the AIC in a good range. The variables with a positive effect were Educacion, Tipo_de_Cambio and Salario_Diario, which contribute positively to the country’s economy. The variables with a negative effect were Inseguridad_Homicidio and crisis_financiera, both of which are situations that harm Foreign Direct Investment.
The estimated intercept is 10.1825, which is the expected value of the dependent variable, log(IED_MXN), when all independent variables are zero. Because the dependent variable is in logs, the units mentioned below are log points.
The variables with the greatest impact on IED_MXN were Educacion, significant at the 1% level, and Inseguridad_Homicidio, significant at the 5% level.
The coefficient for “Educacion” tells us that for each one-unit increase in education, the dependent variable increases by approximately 0.3271 units.
The coefficient for “Tipo_de_Cambio” is approximately 0.0345, indicating that a one-unit increase in the exchange rate is associated with an estimated increase of approximately 0.0345 units in the dependent variable.
For “Salario_Diario”, each one-unit increase is estimated to raise the dependent variable by 0.0006 units, a very small effect.
The coefficients for “Inseguridad_Homicidio” and “crisis_financiera” are negative. For each one-unit increase in “Inseguridad_Homicidio”, the dependent variable is estimated to decrease by approximately 0.0221 units, and the presence of a financial crisis is associated with a decrease of approximately 0.1420 units.
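Since the model is fitted on log(IED_MXN), each coefficient can also be read as a percentage change in IED_MXN itself, using 100·(exp(b) − 1). The sketch below applies that conversion to the estimates printed in the summary(m221) output. The vector `b` is typed in by hand from that output and `pct_change` is an illustrative name.

```r
# Coefficients copied from the summary(m221) output in this document
b <- c(Educacion             =  0.3271492,
       Inseguridad_Homicidio = -0.0221244,
       Tipo_de_Cambio        =  0.0344618,
       Salario_Diario        =  0.0005884,
       crisis_financiera     = -0.1420174)

# Percentage change in IED_MXN associated with a one-unit increase in each
# predictor, holding the others constant
pct_change <- 100 * (exp(b) - 1)
round(pct_change, 2)
```

Read this way, one more year of average education is associated with roughly 39% more foreign direct investment, and a financial-crisis year with roughly 13% less. For small coefficients such as that of Salario_Diario, the percentage is close to 100 times the coefficient.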
This document has been updated from the original for better understanding and use. The FDI_Flows and Exportation variables were originally expressed in US dollars (USD); multiplying them by the exchange rate table converted them to Mexican pesos (MXN), so that the information is more precise and all monetary variables share the same currency.
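The currency adjustment described above amounts to a row-wise multiplication. The sketch below shows it on three made-up rows. The column names `IED_USD` and `Exportaciones_USD` and all the values are hypothetical, since the original spreadsheet is not reproduced here.

```r
# Made-up rows for illustration only; IED_USD and Exportaciones_USD are hypothetical names
toy <- data.frame(
  IED_USD           = c(100, 120, 90),
  Exportaciones_USD = c(1000, 1100, 950),
  Tipo_de_Cambio    = c(18.5, 19.2, 20.1)   # pesos per dollar
)

# Convert each dollar amount to pesos with the exchange rate of the same row
toy$IED_MXN           <- toy$IED_USD * toy$Tipo_de_Cambio
toy$Exportaciones_MXN <- toy$Exportaciones_USD * toy$Tipo_de_Cambio
```

Because the converted columns contain the exchange rate as a factor, they tend to move together with Tipo_de_Cambio, which is consistent with the high vif values reported for “Exportaciones_MXN” and “Tipo_de_Cambio”.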
Nearshoring. (n.d.). Oxfordreference.com. https://doi.org/10.1093/oi/authority.20110803100226581
PricewaterhouseCoopers. (n.d.). ¿Por qué el nearshoring es una oportunidad para la economía mexicana? PwC. Retrieved August 24, 2023, from https://www.pwc.com/mx/es/opinion/por-que-el-nearshoring-es-una-oportunidad-para-la-economia-mexicana.html
Fernández, R. D. (n.d.). Nearshoring: 10 preguntas y respuestas sobre el tema del que todos hablan. EGADE. Retrieved August 24, 2023, from https://egade.tec.mx/es/egade-ideas/investigacion/nearshoring-10-preguntas-y-respuestas-sobre-el-tema-del-que-todos-hablan
Hamilton, E. (2023). The global supply chain consequences of the Russia-Ukraine war. Ufl.edu. Retrieved August 24, 2023, from https://news.ufl.edu/2023/02/russia-ukraine-global-supply-chain/
Mundi. (2022, August 26). ¿Qué es nearshoring y cómo se aplica en México? Mundi. https://mundi.io/exportacion/que-es-nearshoring/
Descriptive, predictive and prescriptive: three types of business analytics. (2021, April 22). The University of Bath. https://online.bath.ac.uk/content/descriptive-predictive-and-prescriptive-thre-types-business-analytics
BBC News Mundo. (2023, March 1). Tesla llega a México: las ventajas del país para ser el mayor fabricante de autos eléctricos de América Latina (y qué gran obstáculo enfrenta). BBC. https://www.bbc.com/mundo/noticias-america-latina-64819256
Kumar, A. (2023, April 14). Ordinary Least Squares Method: Concepts & Examples. Data Analytics. https://vitalflux.com/ordinary-least-squares-method-concepts-examples/
¿Qué es el criterio de información Akaike AIC? (2022, November 4). KeepCoding Bootcamps. https://keepcoding.io/blog/que-es-el-criterio-de-informacion-akaik/