Introduction to Econometrics (Gpo 103)
Professor: David Saucedo de la Fuente
Abraham Castañon Alfaro - A01747966
library(foreign)
library(dplyr) # data manipulation
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(forcats) # to work with categorical variables
library(ggplot2) # data visualization
library(readr) # read specific csv files
library(janitor) # data exploration and cleaning
##
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
##
## chisq.test, fisher.test
library(Hmisc) # several useful functions for data analysis
##
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:dplyr':
##
## src, summarize
## The following objects are masked from 'package:base':
##
## format.pval, units
library(psych) # functions for multivariate analysis
##
## Attaching package: 'psych'
## The following object is masked from 'package:Hmisc':
##
## describe
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
library(naniar) # summaries and visualization of missing values NAs
library(dlookr) # data diagnosis, exploration, and transformation
##
## Attaching package: 'dlookr'
## The following object is masked from 'package:psych':
##
## describe
## The following object is masked from 'package:Hmisc':
##
## describe
## The following object is masked from 'package:base':
##
## transform
library(corrplot) # correlation plots
## corrplot 0.92 loaded
library(jtools) # presentation of regression analysis
##
## Attaching package: 'jtools'
## The following object is masked from 'package:Hmisc':
##
## %nin%
library(lmtest) # diagnostic checks - linear regression analysis
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
library(car) # diagnostic checks - linear regression analysis
## Loading required package: carData
##
## Attaching package: 'car'
## The following object is masked from 'package:psych':
##
## logit
## The following object is masked from 'package:dplyr':
##
## recode
library(olsrr) # diagnostic checks - linear regression analysis
##
## Attaching package: 'olsrr'
## The following object is masked from 'package:datasets':
##
## rivers
library(stargazer) # create publication quality tables
##
## Please cite as:
## Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
## R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
library(effects) # displays for linear and other regression models
## Registered S3 method overwritten by 'survey':
## method from
## summary.pps dlookr
## lattice theme set by effectsTheme()
## See ?effectsTheme for details.
library(tidyverse) # collection of R packages designed for data science
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ lubridate 1.9.2 ✔ tibble 3.2.1
## ✔ purrr 1.0.1 ✔ tidyr 1.3.0
## ✔ stringr 1.5.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ psych::%+%() masks ggplot2::%+%()
## ✖ psych::alpha() masks ggplot2::alpha()
## ✖ tidyr::extract() masks dlookr::extract()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ✖ car::recode() masks dplyr::recode()
## ✖ purrr::some() masks car::some()
## ✖ Hmisc::src() masks dplyr::src()
## ✖ Hmisc::summarize() masks dplyr::summarize()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(caret) # Classification and Regression Training
## Loading required package: lattice
##
## Attaching package: 'caret'
##
## The following object is masked from 'package:purrr':
##
## lift
library(glmnet) # methods for prediction and plotting, and functions for cross-validation
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
##
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
##
## Loaded glmnet 4.1-7
library(readxl) #Read the excel file
library(xts)
##
## ######################### Warning from 'xts' package ##########################
## # #
## # The dplyr lag() function breaks how base R's lag() function is supposed to #
## # work, which breaks lag(my_xts). Calls to lag(my_xts) that you type or #
## # source() into this session won't work correctly. #
## # #
## # Use stats::lag() to make sure you're not using dplyr::lag(), or you can add #
## # conflictRules('dplyr', exclude = 'lag') to your .Rprofile to stop #
## # dplyr from breaking base R's lag() function. #
## # #
## # Code in packages is not affected. It's protected by R's namespace mechanism #
## # Set `options(xts.warn_dplyr_breaks_lag = FALSE)` to suppress this warning. #
## # #
## ###############################################################################
##
## Attaching package: 'xts'
##
## The following objects are masked from 'package:dplyr':
##
## first, last
library(zoo)
library(tseries)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
library(stats)
library(forecast)
library(astsa)
##
## Attaching package: 'astsa'
##
## The following object is masked from 'package:forecast':
##
## gas
##
## The following object is masked from 'package:psych':
##
## scatter.hist
library(AER)
## Loading required package: sandwich
## Loading required package: survival
##
## Attaching package: 'survival'
##
## The following object is masked from 'package:caret':
##
## cluster
library(vars)
## Loading required package: MASS
##
## Attaching package: 'MASS'
##
## The following object is masked from 'package:olsrr':
##
## cement
##
## The following object is masked from 'package:dplyr':
##
## select
##
## Loading required package: strucchange
##
## Attaching package: 'strucchange'
##
## The following object is masked from 'package:stringr':
##
## boundary
##
## Loading required package: urca
##
## Attaching package: 'vars'
##
## The following object is masked from 'package:dlookr':
##
## normality
library(dynlm)
library(TSstudio)
library(sarima)
## Loading required package: stats4
##
## Attaching package: 'sarima'
##
## The following object is masked from 'package:astsa':
##
## sarima
##
## The following object is masked from 'package:stats':
##
## spectrum
library(dygraphs)
setwd("/Users/abrahamcast/Desktop/")
data <-read_excel("SP_DataMexicoAtractiveness_alumn-VF_1.xlsx", na = '-')
Nearshoring refers to the practice by which a company outsources parts of its production to third parties located in another country that is as close as possible and shares a similar time zone. The purpose is to reduce risk by distributing production and suppliers rather than keeping everything in the same place, so that the company can react and relocate if a crisis hits the foreign country. The practice arose as a response to offshoring, which sought to reduce production costs by moving to any destination, however distant, without considering diversification. Destinations in Asia were typically selected, which resulted in situations such as those experienced during the COVID-19 pandemic.
In recent years, many countries that are major market powers, such as the United States, have begun to move their production from Asian countries to countries that are closer and with which they have better relations. This represents a large inflow of foreign investment for the country selected as a destination. However, many variables affect where companies and countries decide to establish themselves. Mexico has grown as a destination thanks to its proximity to the United States, the slow growth of recent years, lower labor costs, easy maritime, land, and air transport of goods, its absence from international conflicts, and a favorable exchange rate. It has been targeted mainly by American companies seeking to relocate their production, but also by companies from European and even Asian countries.
Predictive analysis uses historical data, machine learning, and statistical procedures to estimate probable outcomes or, as the name suggests, predictive patterns that help anticipate the result of the topic being investigated. Different industries can benefit from analyzing historical and real-time data, and several methods are available to forecast the event or events of interest.
“Regression analysis is a statistical technique for determining the relationship between a single dependent (criterion) variable and one or more independent (predictor) variables.” (Palmer, 2009). The regression analysis method assumes that the relationship between the variables is one of dependence or causality. It is the most commonly used technique because it reliably identifies which variables have an impact; in other words, which factors matter and which can be ignored when studying the effect on a given topic of investigation. In predictive analysis, it is used to identify which variables will affect the dependent variable, based on the relationships already observed in the past.
Now that we know the context of both nearshoring and predictive analysis, we can relate the two topics by asking how regression analysis can help us predict the occurrence of nearshoring in Mexico. A regression model could be helpful in different ways:
Collect information on key independent variables that could influence the decision of foreign countries to invest in Mexico as a nearshoring strategy.
Predict the likelihood of nearshoring in Mexico based on the country's historical data, and assess whether the predicted outcome is competitive with other countries that have already experienced nearshoring.
The problem situation is that Maria, an econometrics analyst from Mexico, believes that a database covering Mexico's socioeconomic, business-environment, technological, environmental, and security conditions makes it possible to determine whether the country is attractive for nearshoring from a foreign company's point of view.
head(data)
## # A tibble: 6 × 18
## Año IED_Flujos IED_Flujos_MXN Exportaciones Exportaciones_MXN Empleo
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1997 12146. 294151. 9088. 220091. NA
## 2 1998 8374. 210876. 9875. 248691. NA
## 3 1999 13960. 299734. 10990. 235961. NA
## 4 2000 18249. 362632. 12483. 248057. 97.8
## 5 2001 30057. 546548. 11300. 205483. 97.4
## 6 2002 24099. 468332. 11923. 231708. 97.7
## # ℹ 12 more variables: Educación <dbl>, Salario_Diario <dbl>, Innovación <dbl>,
## # Inseguridad_Robo <dbl>, Inseguridad_Homicidio <dbl>, Tipo_de_Cambio <dbl>,
## # Densidad_Carretera <dbl>, Densidad_Población <dbl>, CO2_Emisiones <dbl>,
## # PIB_Per_Cápita <dbl>, INPC <dbl>, Crisis_Financiera <dbl>
str(data)
## tibble [26 × 18] (S3: tbl_df/tbl/data.frame)
## $ Año : num [1:26] 1997 1998 1999 2000 2001 ...
## $ IED_Flujos : num [1:26] 12146 8374 13960 18249 30057 ...
## $ IED_Flujos_MXN : num [1:26] 294151 210876 299734 362632 546548 ...
## $ Exportaciones : num [1:26] 9088 9875 10990 12483 11300 ...
## $ Exportaciones_MXN : num [1:26] 220091 248691 235961 248057 205483 ...
## $ Empleo : num [1:26] NA NA NA 97.8 97.4 ...
## $ Educación : num [1:26] 7.2 7.3 7.4 7.6 7.7 7.8 7.9 8 8.1 8.3 ...
## $ Salario_Diario : num [1:26] 24.3 31.9 31.9 35.1 37.6 ...
## $ Innovación : num [1:26] 11.3 11.4 12.5 13.2 13.5 ...
## $ Inseguridad_Robo : num [1:26] 267 315 273 217 215 ...
## $ Inseguridad_Homicidio: num [1:26] 14.6 14.3 12.6 10.9 10.2 ...
## $ Tipo_de_Cambio : num [1:26] 8.06 9.94 9.52 9.6 9.17 ...
## $ Densidad_Carretera : num [1:26] 0.0521 0.053 0.055 0.0552 0.0565 0.0576 0.0596 0.0595 0.0625 0.0628 ...
## $ Densidad_Población : num [1:26] 47.4 48.8 49.5 50.6 51.3 ...
## $ CO2_Emisiones : num [1:26] 3.68 3.85 3.69 3.87 3.81 ...
## $ PIB_Per_Cápita : num [1:26] 127570 126739 129165 130875 128083 ...
## $ INPC : num [1:26] 33.3 39.5 44.3 48.3 50.4 ...
## $ Crisis_Financiera : num [1:26] 0 0 0 0 0 0 0 0 0 0 ...
names(data)
## [1] "Año" "IED_Flujos" "IED_Flujos_MXN"
## [4] "Exportaciones" "Exportaciones_MXN" "Empleo"
## [7] "Educación" "Salario_Diario" "Innovación"
## [10] "Inseguridad_Robo" "Inseguridad_Homicidio" "Tipo_de_Cambio"
## [13] "Densidad_Carretera" "Densidad_Población" "CO2_Emisiones"
## [16] "PIB_Per_Cápita" "INPC" "Crisis_Financiera"
FDI_Flows (IED_Flujos): Foreign direct investment flows, in millions of dollars.
FDI_FLOWS_MXN (IED_Flujos_MXN): Foreign direct investment flows, in millions of pesos.
Exports (Exportaciones): Non-oil exports, in millions of dollars. The value of exports from the Maquiladora Export Industry is included.
Exports_MXN (Exportaciones_MXN): Non-oil exports, in millions of pesos. The value of exports from the Maquiladora Export Industry is included.
Employment (Empleo): Percentage of the economically active population that is employed.
Education (Educación): Average years of education.
Daily_Salary (Salario_Diario): Minimum daily salary, in pesos.
Innovation (Innovación): Patents applied for in Mexico, as a rate per 100,000 inhabitants.
Insecurity_Robbery (Inseguridad_Robo): Robbery rate per 100,000 inhabitants. Covers mainly robbery with violence of homes, vehicles, passers-by, carriers, banking institutions, businesses, livestock, machinery, and auto parts.
Insecurity_Homicide (Inseguridad_Homicidio): Homicide rate per 100,000 inhabitants.
Exchange_Rate (Tipo_de_Cambio): FIX exchange rate, in pesos per dollar.
Road_Density (Densidad_Carretera): Kilometers of paved road per km2 of territorial surface.
Population_Density (Densidad_Población): Population per km2, that is, the population divided by the territorial extension of Mexico in km2.
CO2_Emissions (CO2_Emisiones): Carbon dioxide emissions, in metric tons per capita.
GDP_Per_Capita (PIB_Per_Cápita): Gross Domestic Product (GDP) divided by the population, in real 2013 pesos (adjusted to 2013 prices).
INPC (INPC): National Consumer Price Index (INPC). Base 2018 = 100.
Financial_Crisis (Crisis_Financiera): Indicates whether there was a financial crisis that year (coded 0 or 1 in the data).
# Replace missing values with the median of each column
data <- data %>%
mutate_all(~ifelse(is.na(.), median(., na.rm = TRUE), .))
sum(is.na(data))
## [1] 0
colSums(is.na(data))
## Año IED_Flujos IED_Flujos_MXN
## 0 0 0
## Exportaciones Exportaciones_MXN Empleo
## 0 0 0
## Educación Salario_Diario Innovación
## 0 0 0
## Inseguridad_Robo Inseguridad_Homicidio Tipo_de_Cambio
## 0 0 0
## Densidad_Carretera Densidad_Población CO2_Emisiones
## 0 0 0
## PIB_Per_Cápita INPC Crisis_Financiera
## 0 0 0
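As a minimal sketch of what the imputation step does, the base-R snippet below applies the same rule (replace each NA with the column median) to a small made-up data frame. The helper name `impute_median` and the toy values are assumptions for illustration and are not part of the course data set. For a yearly series, note that this fills missing early years with a mid-sample value, which may or may not be appropriate.

```r
# Hypothetical helper: replace NAs in a numeric vector with its median
impute_median <- function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
}

# Toy data frame (made-up values, not the course data set)
toy <- data.frame(
  year   = 2000:2004,
  employ = c(NA, 97.8, 97.4, NA, 97.7)
)

# Apply the helper to every column, keeping the data-frame structure
toy_imputed <- as.data.frame(lapply(toy, impute_median))

sum(is.na(toy_imputed))   # number of remaining NAs
toy_imputed$employ
```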
gg_miss_var(data)
summary(data)
## Año IED_Flujos IED_Flujos_MXN Exportaciones
## Min. :1997 Min. : 8374 Min. :210876 Min. : 9088
## 1st Qu.:2003 1st Qu.:21367 1st Qu.:368560 1st Qu.:13260
## Median :2010 Median :27698 Median :497054 Median :21188
## Mean :2010 Mean :26770 Mean :493596 Mean :23601
## 3rd Qu.:2016 3rd Qu.:32183 3rd Qu.:578606 3rd Qu.:31601
## Max. :2022 Max. :48354 Max. :754438 Max. :46478
## Exportaciones_MXN Empleo Educación Salario_Diario
## Min. :205483 Min. :95.06 Min. :7.200 Min. : 24.30
## 1st Qu.:262337 1st Qu.:96.08 1st Qu.:7.925 1st Qu.: 41.97
## Median :366294 Median :96.53 Median :8.500 Median : 54.48
## Mean :433856 Mean :96.48 Mean :8.450 Mean : 65.16
## 3rd Qu.:632356 3rd Qu.:97.01 3rd Qu.:8.975 3rd Qu.: 72.31
## Max. :785654 Max. :97.83 Max. :9.600 Max. :172.87
## Innovación Inseguridad_Robo Inseguridad_Homicidio Tipo_de_Cambio
## Min. :11.28 Min. :120.5 Min. : 8.04 Min. : 8.06
## 1st Qu.:12.60 1st Qu.:148.3 1st Qu.:10.40 1st Qu.:10.75
## Median :13.09 Median :181.8 Median :16.93 Median :13.02
## Mean :13.10 Mean :185.4 Mean :17.28 Mean :13.91
## 3rd Qu.:13.61 3rd Qu.:209.9 3rd Qu.:22.34 3rd Qu.:18.49
## Max. :15.11 Max. :314.8 Max. :29.59 Max. :20.66
## Densidad_Carretera Densidad_Población CO2_Emisiones PIB_Per_Cápita
## Min. :0.05210 Min. :47.44 Min. :3.592 Min. :126739
## 1st Qu.:0.05953 1st Qu.:52.77 1st Qu.:3.842 1st Qu.:130964
## Median :0.06990 Median :58.09 Median :3.925 Median :136846
## Mean :0.07106 Mean :57.33 Mean :3.944 Mean :138550
## 3rd Qu.:0.08273 3rd Qu.:61.39 3rd Qu.:4.088 3rd Qu.:146148
## Max. :0.09020 Max. :65.60 Max. :4.221 Max. :153236
## INPC Crisis_Financiera
## Min. : 33.28 Min. :0.00000
## 1st Qu.: 56.15 1st Qu.:0.00000
## Median : 73.35 Median :0.00000
## Mean : 75.17 Mean :0.07692
## 3rd Qu.: 91.29 3rd Qu.:0.00000
## Max. :126.48 Max. :1.00000
summary(data$IED_Flujos_MXN)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 210876 368560 497054 493596 578606 754438
summary(log(data$IED_Flujos_MXN))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 12.26 12.82 13.12 13.06 13.27 13.53
describe(log(data))
## # A tibble: 18 × 26
## described_variables n na mean sd se_mean IQR
## <chr> <int> <int> <dbl> <dbl> <dbl> <dbl>
## 1 Año 26 0 7.61 0.00381 0.000746 0.00622
## 2 IED_Flujos 26 0 10.1 0.385 0.0755 0.410
## 3 IED_Flujos_MXN 26 0 13.1 0.317 0.0623 0.451
## 4 Exportaciones 26 0 9.95 0.501 0.0983 0.869
## 5 Exportaciones_MXN 26 0 12.9 0.447 0.0877 0.879
## 6 Empleo 26 0 4.57 0.00748 0.00147 0.00958
## 7 Educación 26 0 2.13 0.0831 0.0163 0.124
## 8 Salario_Diario 26 0 4.06 0.480 0.0942 0.544
## 9 Innovación 26 0 2.57 0.0819 0.0161 0.0771
## 10 Inseguridad_Robo 26 0 5.19 0.244 0.0478 0.348
## 11 Inseguridad_Homicidio 26 0 2.77 0.422 0.0828 0.765
## 12 Tipo_de_Cambio 26 0 2.59 0.293 0.0575 0.541
## 13 Densidad_Carretera 26 0 -2.66 0.189 0.0370 0.329
## 14 Densidad_Población 26 0 4.04 0.0959 0.0188 0.151
## 15 CO2_Emisiones 26 0 1.37 0.0460 0.00902 0.0620
## 16 PIB_Per_Cápita 26 0 11.8 0.0635 0.0125 0.110
## 17 INPC 26 0 4.26 0.347 0.0680 0.486
## 18 Crisis_Financiera 26 0 -Inf NaN NaN NaN
## # ℹ 19 more variables: skewness <dbl>, kurtosis <dbl>, p00 <dbl>, p01 <dbl>,
## # p05 <dbl>, p10 <dbl>, p20 <dbl>, p25 <dbl>, p30 <dbl>, p40 <dbl>,
## # p50 <dbl>, p60 <dbl>, p70 <dbl>, p75 <dbl>, p80 <dbl>, p90 <dbl>,
## # p95 <dbl>, p99 <dbl>, p100 <dbl>
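The `-Inf` and `NaN` entries for `Crisis_Financiera` arise because that column contains zeros and the logarithm of 0 is `-Inf`. Below is a minimal sketch of one way to avoid this, assuming we only want to log-transform columns that are strictly positive and are not identifiers such as the year. The helper `log_positive_cols` and the toy data are hypothetical.

```r
# Hypothetical helper: log-transform only strictly positive numeric columns,
# leaving excluded columns (e.g. a year or a 0/1 dummy) untouched
log_positive_cols <- function(df, exclude = character(0)) {
  for (col in setdiff(names(df), exclude)) {
    x <- df[[col]]
    if (is.numeric(x) && all(x > 0, na.rm = TRUE)) {
      df[[col]] <- log(x)
    }
  }
  df
}

# Toy data (made-up values)
toy <- data.frame(
  year   = c(2007, 2008, 2009),
  flows  = c(100, 1000, 10000),
  crisis = c(0, 1, 1)          # contains 0, so log() would give -Inf
)

toy_log <- log_positive_cols(toy, exclude = "year")
toy_log
```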
ggplot(data = data, aes(IED_Flujos_MXN)) +
geom_histogram(fill = 'lightgreen', color='black')+
labs(x = 'Foreign Investment Flow in Pesos') +
ggtitle('Foreign Investment Flow Histogram in MXN')
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
ggplot(data, aes(x = log(IED_Flujos_MXN))) +
geom_histogram(fill = "lightgreen", color = "black") +
labs(x = "Log of Foreign Investment Flow in Pesos", y = "Frequency") +
ggtitle("Histogram of Log Foreign Investment Flow in Pesos")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
ggplot(data = data, aes(x = IED_Flujos, y = Exportaciones)) +
geom_point() +
labs(x = "FDI Flows (Millions of Dollars)",
y = "Exports (Millions of Dollars)",
title = "Scatter Plot: FDI Flows vs. Exports")
ggplot(data = data, aes(x = factor(Crisis_Financiera), y = IED_Flujos)) +
geom_boxplot() +
labs(x = "Financial Crisis",
y = "FDI Flows (Millions of Dollars)",
title = "Box Plot: FDI Flows by Financial Crisis")
ggplot(data =data, aes(x = Densidad_Población, y = IED_Flujos)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE) +
labs(x = "Population Density",
y = "FDI Flows (Millions of Dollars)",
title = "Linear Regression Plot")
## `geom_smooth()` using formula = 'y ~ x'
data %>%
gather(key, val, -Año, -IED_Flujos, -Exportaciones) %>%
ggplot(aes(x=val)) +
geom_histogram(fill = "lightgreen", color="black" ) +
facet_wrap(~key, scales = "free")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
datacor<-data
corrplot(cor(datacor),method = "color",
type = "full", order = "hclust", addCoef.col = "black",
tl.col = "black", tl.srt = 90, diag = FALSE, number.cex = 0.5)
colnames(data)
## [1] "Año" "IED_Flujos" "IED_Flujos_MXN"
## [4] "Exportaciones" "Exportaciones_MXN" "Empleo"
## [7] "Educación" "Salario_Diario" "Innovación"
## [10] "Inseguridad_Robo" "Inseguridad_Homicidio" "Tipo_de_Cambio"
## [13] "Densidad_Carretera" "Densidad_Población" "CO2_Emisiones"
## [16] "PIB_Per_Cápita" "INPC" "Crisis_Financiera"
In class we learned the Ordinary Least Squares (OLS) method. It describes the relationship between one or more quantitative independent variables and the dependent variable (simple or multiple linear regression). The least squares estimates are the coefficients that minimize the sum of squared errors (SSE).
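To make the least squares idea concrete, here is a small sketch on simulated data (the variable names and the true coefficients are made up). It computes the coefficients from the normal equations, beta_hat = (X'X)^(-1) X'y, and compares them with the estimates returned by `lm()`.

```r
set.seed(123)

# Simulated data: y depends linearly on x1 and x2 plus noise
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n, sd = 0.3)

# Design matrix with an intercept column
X <- cbind(Intercept = 1, x1 = x1, x2 = x2)

# Solve the normal equations (X'X) b = X'y
beta_hat <- solve(crossprod(X), crossprod(X, y))

# Sum of squared errors at the OLS solution
sse <- sum((y - X %*% beta_hat)^2)

# Same estimates from lm()
fit <- lm(y ~ x1 + x2)
cbind(normal_equations = drop(beta_hat), lm = coef(fit))
```

Any other coefficient vector gives a larger SSE on the same data, which is the sense in which these estimates are "least squares".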
H0: Higher non-oil exports (millions of pesos) do not have a negative effect on foreign direct investment flows (millions of pesos).
H1: Higher non-oil exports (millions of pesos) have a negative effect on foreign direct investment flows (millions of pesos).
H0: A financial crisis does not have a positive effect on foreign direct investment flows (millions of pesos).
H1: A financial crisis has a positive effect on foreign direct investment flows (millions of pesos).
H0: Population_Density does not have an effect on foreign direct investment flows (millions of pesos).
H1: Population_Density has an effect on foreign direct investment flows (millions of pesos).
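Because the first two alternative hypotheses are directional, the two-sided p-values reported by `summary()` are not exactly the ones these hypotheses call for. Below is a minimal sketch, assuming the usual t-test on a single coefficient, of converting a t statistic into a one-sided p-value. The function name `one_sided_p` and the example numbers are hypothetical.

```r
# Hypothetical helper: one-sided p-value for a coefficient t statistic
#   alternative = "less"    tests H1: beta < 0
#   alternative = "greater" tests H1: beta > 0
one_sided_p <- function(t_value, df, alternative = c("less", "greater")) {
  alternative <- match.arg(alternative)
  if (alternative == "less") {
    pt(t_value, df)
  } else {
    pt(t_value, df, lower.tail = FALSE)
  }
}

# Example with made-up numbers: t = -2.1 on 20 residual degrees of freedom
one_sided_p(-2.1, df = 20, alternative = "less")
one_sided_p(-2.1, df = 20, alternative = "greater")
```

When the estimated sign agrees with the alternative, the one-sided p-value is half of the two-sided one.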
model1<-lm(log(IED_Flujos_MXN) ~Exportaciones_MXN+Salario_Diario+Tipo_de_Cambio+Densidad_Población+Densidad_Carretera+Educación+Inseguridad_Robo+Inseguridad_Homicidio+PIB_Per_Cápita,data=data)
summary(model1)
##
## Call:
## lm(formula = log(IED_Flujos_MXN) ~ Exportaciones_MXN + Salario_Diario +
## Tipo_de_Cambio + Densidad_Población + Densidad_Carretera +
## Educación + Inseguridad_Robo + Inseguridad_Homicidio + PIB_Per_Cápita,
## data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.33651 -0.09014 -0.02813 0.10138 0.39561
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.473e+00 3.547e+00 2.389 0.0296 *
## Exportaciones_MXN -4.338e-06 2.790e-06 -1.555 0.1395
## Salario_Diario -3.538e-03 7.376e-03 -0.480 0.6380
## Tipo_de_Cambio 1.051e-01 7.342e-02 1.432 0.1714
## Densidad_Población -1.883e-03 8.137e-02 -0.023 0.9818
## Densidad_Carretera 3.975e+01 4.834e+01 0.822 0.4230
## Educación -4.126e-01 5.635e-01 -0.732 0.4747
## Inseguridad_Robo -2.631e-03 2.109e-03 -1.247 0.2302
## Inseguridad_Homicidio 2.231e-03 2.288e-02 0.098 0.9235
## PIB_Per_Cápita 4.662e-05 3.226e-05 1.445 0.1677
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2101 on 16 degrees of freedom
## Multiple R-squared: 0.7197, Adjusted R-squared: 0.562
## F-statistic: 4.565 on 9 and 16 DF, p-value: 0.004104
vif(model1)
## Exportaciones_MXN Salario_Diario Tipo_de_Cambio
## 167.64883 39.61086 52.58801
## Densidad_Población Densidad_Carretera Educación
## 109.81181 236.68622 86.80646
## Inseguridad_Robo Inseguridad_Homicidio PIB_Per_Cápita
## 5.72725 15.02041 46.29822
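For reference, the VIF of a regressor is 1 / (1 - R2_j), where R2_j comes from regressing that regressor on all the others; values above roughly 10 are commonly read as a sign of serious multicollinearity. The sketch below computes it by hand on simulated data, assuming only numeric regressors. The helper `manual_vif` and the variables are made up for illustration.

```r
set.seed(1)

# Simulated regressors: x2 is almost a copy of x1, x3 is independent
n  <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)
x3 <- rnorm(n)
X  <- data.frame(x1, x2, x3)

# Hypothetical helper: VIF_j = 1 / (1 - R^2_j)
manual_vif <- function(X) {
  sapply(names(X), function(v) {
    others <- setdiff(names(X), v)
    r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
    1 / (1 - r2)
  })
}

manual_vif(X)
```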
bptest(model1)
##
## studentized Breusch-Pagan test
##
## data: model1
## BP = 6.2615, df = 9, p-value = 0.7135
cat("AIC:", AIC(model1),"\n")
## AIC: 2.027688
selected_model1<-model1
cat("RMSE:",RMSE(selected_model1$fitted.values,data$IED_Flujos_MXN)) # root mean squared error; note that the fitted values are in logs while IED_Flujos_MXN is in levels
## RMSE: 513342.8
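The RMSE above compares fitted values on the log scale with observed values in levels, so it mostly reflects the size of the dependent variable itself. Below is a minimal sketch, on simulated data, of computing the RMSE with both vectors on the same scale. The helper `rmse` and the data are made up, and back-transforming with `exp()` is a simple choice that ignores retransformation bias.

```r
set.seed(42)

# Simulated positive outcome with a log-linear relationship
n <- 40
x <- runif(n, 1, 10)
y <- exp(10 + 0.2 * x + rnorm(n, sd = 0.1))

fit <- lm(log(y) ~ x)

# Hypothetical helper: root mean squared error of two equal-length vectors
rmse <- function(pred, obs) {
  stopifnot(length(pred) == length(obs))
  sqrt(mean((pred - obs)^2))
}

rmse_log    <- rmse(fitted(fit), log(y))   # both in logs
rmse_levels <- rmse(exp(fitted(fit)), y)   # both in original units
rmse_mixed  <- rmse(fitted(fit), y)        # mismatched scales (what to avoid)

c(log = rmse_log, levels = rmse_levels, mixed = rmse_mixed)
```

For models that drop rows (such as those with a lagged regressor), the observed values can be taken from `model.response(model.frame(fit))`, which is on the same scale and has the same length as the fitted values.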
plot(model1)
hist(model1$residuals)
shapiro.test(model1$residuals)
##
## Shapiro-Wilk normality test
##
## data: model1$residuals
## W = 0.97442, p-value = 0.7393
model2<-lm(log(IED_Flujos_MXN) ~log(lag(IED_Flujos_MXN))+
log(Exportaciones_MXN)+Educación+I(Educación^2)+I(Exportaciones^2)+log(Tipo_de_Cambio)+log(Densidad_Población)+Crisis_Financiera+log(PIB_Per_Cápita)+log(Salario_Diario)+Inseguridad_Homicidio,data=data)
summary(model2)
##
## Call:
## lm(formula = log(IED_Flujos_MXN) ~ log(lag(IED_Flujos_MXN)) +
## log(Exportaciones_MXN) + Educación + I(Educación^2) + I(Exportaciones^2) +
## log(Tipo_de_Cambio) + log(Densidad_Población) + Crisis_Financiera +
## log(PIB_Per_Cápita) + log(Salario_Diario) + Inseguridad_Homicidio,
## data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.38260 -0.07348 0.00691 0.07571 0.25273
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.891e+01 5.594e+01 -1.053 0.3115
## log(lag(IED_Flujos_MXN)) -1.849e-01 2.466e-01 -0.750 0.4668
## log(Exportaciones_MXN) -3.356e+00 1.305e+00 -2.571 0.0233 *
## Educación 4.893e+00 6.697e+00 0.731 0.4779
## I(Educación^2) -2.780e-01 3.695e-01 -0.752 0.4653
## I(Exportaciones^2) 1.027e-09 8.358e-10 1.229 0.2408
## log(Tipo_de_Cambio) 3.223e+00 1.277e+00 2.523 0.0255 *
## log(Densidad_Población) 3.068e+00 1.110e+01 0.276 0.7865
## Crisis_Financiera -1.627e-01 2.184e-01 -0.745 0.4696
## log(PIB_Per_Cápita) 6.736e+00 4.563e+00 1.476 0.1637
## log(Salario_Diario) -1.223e+00 1.591e+00 -0.768 0.4560
## Inseguridad_Homicidio 1.190e-03 2.026e-02 0.059 0.9541
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1913 on 13 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.792, Adjusted R-squared: 0.616
## F-statistic: 4.5 on 11 and 13 DF, p-value: 0.006194
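The message "1 observation deleted due to missingness" comes from the lagged regressor: with dplyr attached, `lag()` resolves to `dplyr::lag()`, which shifts the vector by one position and puts `NA` first. The sketch below, in base R with made-up values, reproduces that behavior through a hypothetical helper `lag_by_one` and contrasts it with `stats::lag()`, which leaves the values of a plain numeric vector unshifted.

```r
# Hypothetical helper: previous-period value, NA for the first period
# (mirrors the default behavior of dplyr::lag(x), i.e. a shift by one)
lag_by_one <- function(x) {
  c(NA, head(x, -1))
}

flows <- c(100, 120, 90, 150)   # made-up yearly values

lagged <- lag_by_one(flows)
lagged

# stats::lag() only changes the time attributes of a series;
# the values of a plain vector are returned unshifted
unshifted <- as.numeric(stats::lag(flows))
unshifted
```

Writing `dplyr::lag()` explicitly in the model formula is one way to make the intended function unambiguous regardless of package load order.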
vif(model2)
## log(lag(IED_Flujos_MXN)) log(Exportaciones_MXN) Educación
## 4.140384 216.357363 12796.415174
## I(Educación^2) I(Exportaciones^2) log(Tipo_de_Cambio)
## 11183.397028 169.220090 84.108497
## log(Densidad_Población) Crisis_Financiera log(PIB_Per_Cápita)
## 653.472774 2.399339 53.578935
## log(Salario_Diario) Inseguridad_Homicidio
## 344.646336 14.111623
bptest(model2)
##
## studentized Breusch-Pagan test
##
## data: model2
## BP = 11.176, df = 11, p-value = 0.4287
cat("AIC:", AIC(model2),"\n")
## AIC: -2.09908
selected_model2<-model2
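cat("RMSE:",RMSE(selected_model2$fitted.values,data$IED_Flujos_MXN)) # root mean squared error; note that the fitted values are in logs (25 values) while IED_Flujos_MXN is in levels (26 values)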
## Warning in pred - obs: longer object length is not a multiple of shorter object
## length
## RMSE: 513342.8
plot(model2)
hist(model2$residuals)
shapiro.test(model2$residuals)
##
## Shapiro-Wilk normality test
##
## data: model2$residuals
## W = 0.97045, p-value = 0.6565
model3<-lm(log(IED_Flujos_MXN) ~ log(lag(IED_Flujos_MXN))+
log(Exportaciones_MXN)+Tipo_de_Cambio+log(Densidad_Población)+Crisis_Financiera+log(PIB_Per_Cápita)+log(Salario_Diario)+Inseguridad_Homicidio,data=data)
summary(model3)
##
## Call:
## lm(formula = log(IED_Flujos_MXN) ~ log(lag(IED_Flujos_MXN)) +
## log(Exportaciones_MXN) + Tipo_de_Cambio + log(Densidad_Población) +
## Crisis_Financiera + log(PIB_Per_Cápita) + log(Salario_Diario) +
## Inseguridad_Homicidio, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.34465 -0.08638 -0.00315 0.10122 0.33533
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -37.35332 18.16190 -2.057 0.05640 .
## log(lag(IED_Flujos_MXN)) -0.20985 0.19890 -1.055 0.30708
## log(Exportaciones_MXN) -2.28851 0.65174 -3.511 0.00289 **
## Tipo_de_Cambio 0.15130 0.05195 2.912 0.01018 *
## log(Densidad_Población) 6.82734 2.61992 2.606 0.01911 *
## Crisis_Financiera -0.17269 0.18002 -0.959 0.35168
## log(PIB_Per_Cápita) 4.59791 2.05420 2.238 0.03977 *
## log(Salario_Diario) -0.31766 0.40823 -0.778 0.44784
## Inseguridad_Homicidio -0.01238 0.01047 -1.182 0.25440
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1775 on 16 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.7797, Adjusted R-squared: 0.6696
## F-statistic: 7.079 on 8 and 16 DF, p-value: 0.000477
vif(model3)
## log(lag(IED_Flujos_MXN)) log(Exportaciones_MXN) Tipo_de_Cambio
## 3.129644 62.660354 33.852695
## log(Densidad_Población) Crisis_Financiera log(PIB_Per_Cápita)
## 42.322720 1.893510 12.615028
## log(Salario_Diario) Inseguridad_Homicidio
## 26.364516 4.379455
bptest(model3)
##
## studentized Breusch-Pagan test
##
## data: model3
## BP = 9.3066, df = 8, p-value = 0.3171
cat("AIC:", AIC(model3),"\n")
## AIC: -6.661234
selected_model3<-model3
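cat("RMSE:",RMSE(selected_model3$fitted.values,data$IED_Flujos_MXN)) # root mean squared error; note that the fitted values are in logs (25 values) while IED_Flujos_MXN is in levels (26 values)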
## RMSE: 513342.8
plot(model3)
hist(model3$residuals)
shapiro.test(model3$residuals)
##
## Shapiro-Wilk normality test
##
## data: model3$residuals
## W = 0.97629, p-value = 0.8032
effect_plot(model3,pred=Densidad_Población, data=data, interval=TRUE)
effect_plot(model3,pred=Exportaciones_MXN, data=data, interval=TRUE)
effect_plot(model3,pred=PIB_Per_Cápita, data=data, interval=TRUE)
acf(data$IED_Flujos_MXN, main = "Autocorrelation")
Box.test(model3$residuals, lag=5, type="Ljung-Box")
##
## Box-Ljung test
##
## data: model3$residuals
## X-squared = 4.068, df = 5, p-value = 0.5397
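To read this test, recall that the null hypothesis of the Ljung-Box test is that the series has no autocorrelation up to the chosen lag, so a large p-value means the null is not rejected. Below is a minimal sketch on simulated series (not the model residuals) that contrasts white noise with a strongly autocorrelated AR(1) process.

```r
set.seed(2023)

# White noise: no autocorrelation by construction
white_noise <- rnorm(200)

# AR(1) with coefficient 0.8: strong positive autocorrelation
ar1 <- as.numeric(arima.sim(model = list(ar = 0.8), n = 200))

p_white <- Box.test(white_noise, lag = 5, type = "Ljung-Box")$p.value
p_ar1   <- Box.test(ar1,         lag = 5, type = "Ljung-Box")$p.value

# A small p-value (e.g. < 0.05) is evidence of autocorrelation
c(white_noise = p_white, ar1 = p_ar1)
```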
According to the results of the analysis, model 3 is the one that fits best. It has the lowest AIC (-6.661234) and, although its multiple R-squared (0.7797) is only the second highest (model 2 reaches 0.792), it has the best adjusted R-squared (0.6696). The three models report the same RMSE of 513342.8, so this metric does not discriminate between them; the value was computed from log-scale fitted values against FDI flows in levels, so it is not a meaningful measure of fit. Model 3 also shows the least multicollinearity according to the VIF results, although several of its VIFs remain well above 10. Its residuals have the highest Shapiro-Wilk p-value (0.8032), which gives the least evidence against normality. Models 1 and 2 have higher Breusch-Pagan p-values (0.7135 and 0.4287 versus 0.3171; a higher p-value gives less evidence against homoscedasticity), but none of the three tests rejects homoscedasticity at the 5% level. Overall, model 3 is the preferred choice: its multiple R-squared is only about 0.01 lower than that of model 2, whereas the difference in the VIFs is very large, and severe multicollinearity can degrade the precision of the estimates. The effect plots are consistent with the signs of the three largest coefficients. Because these variables enter the model in logs, the coefficients can be read as elasticities: a 1% increase in population density is associated with roughly a 6.8% increase in FDI flows, a 1% increase in exports with roughly a 2.3% decrease, and a 1% increase in GDP per capita (PIB Per Cápita) with roughly a 4.6% increase. Finally, the Ljung-Box test on the residuals gives a p-value of 0.5397 > 0.05, so the null hypothesis of no autocorrelation is not rejected; there is no evidence of residual autocorrelation up to lag 5.
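As a small worked illustration of how coefficients are read when the dependent variable is in logs (the helper names are hypothetical): for a logged regressor, a p% increase changes the outcome by 100 * ((1 + p/100)^beta - 1) percent, and for a regressor in levels, a one-unit increase changes it by 100 * (exp(beta) - 1) percent, holding the other regressors fixed.

```r
# Hypothetical helper: % change in y for a pct% increase in a logged regressor
pct_effect_loglog <- function(beta, pct = 1) {
  100 * ((1 + pct / 100)^beta - 1)
}

# Hypothetical helper: % change in y for a one-unit increase in a level regressor
pct_effect_loglevel <- function(beta) {
  100 * (exp(beta) - 1)
}

# Coefficients taken from the model 3 summary above
pct_effect_loglog(6.82734)    # log(Densidad_Población)
pct_effect_loglog(-2.28851)   # log(Exportaciones_MXN)
pct_effect_loglog(4.59791)    # log(PIB_Per_Cápita)
pct_effect_loglevel(0.15130)  # Tipo_de_Cambio (in levels)
```

For small coefficients and small percentage changes these values are close to the coefficient itself, which is why the coefficient is often quoted directly as the elasticity.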
Now that the data analysis is finished, the insights can be listed:
The three variables with the largest effect on the dependent variable (FDI_FLOWS_MXN) are GDP per capita (PIB Per Cápita), population density, and exports.
Exports: When a country exports much more than it imports, it has a large influence on the supply of certain products, which means a loss when that supply increases. A country that exports in large quantities may also be vulnerable to global trends, which runs counter to the idea of nearshoring.
PIB Per Cápita: A solid and high GDP per capita means that there is more money to construct buildings and houses or to buy machinery, and that more goods and services will be produced. This benefits everyone because there will be more employment and more business opportunities, so a foreign company faces a lower risk when arriving in Mexico.
Population density: A dense population may represent a larger potential market for the products and services of foreign companies. Companies may be more willing to invest in countries with a significant population, as this increases the consumer base for their products. In terms of labor, a higher population density can also mean greater availability of skilled and unskilled workers, which can be attractive to foreign companies seeking access to a large and diverse workforce.
Arriaga, C. (2017). Inversión extranjera directa en México: comparación entre la inversión procedente de los Estados Unidos y del resto del mundo. Retrieved from: https://www.scielo.org.mx/scielo.php?script=sci_arttext&pid=S0185-013X2017000200317
Dabla-Norris, E., & Duval, R. (2016). La reducción de las barreras comerciales puede reactivar la productividad y el crecimiento mundial. Retrieved from: https://www.imf.org/es/Blogs/Articles/2016/06/20/how-lowering-trade-barriers-can-revive-global-productivity-and-growth
León, J. (2010). Economía Aplicada. Retrieved from: https://economia.unmsm.edu.pe/org/arch_doc/JLeonM/publ/Interiores_Economia_Aplicada.pdf
Palmer, P. (2009). Regression Analysis for Prediction: Understanding the Process. Retrieved from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2845248/#:~:text=In%20most%20cases%2C%20the%20investigators,more%20independent%20(predictor)%20variables.
Saucedo, D. (2023). Mexico and Its Attractiveness for Nearshoring. Retrieved from: https://cic.itesm.mx/Paginas/Pagina-DocumentoCic.aspx?id=1860