This analysis examines the dynamics of international trade and foreign direct investment across Mexican states using data from the Instituto Nacional de Estadística y Geografía (INEGI) covering the period 2015-2023. The dataset comprises four interconnected components: state-level non-oil exports, time-series data on exports and imports, foreign direct investment (FDI) flows, and a comprehensive set of socioeconomic indicators. All monetary values are expressed in Mexican pesos with 2018 as the base year (2018 = 100).
Explore and briefly describe how Mexico’s exports performed over the period 2015 – 2025. What are the 3 main characteristics of this performance over approximately the last 10 years?
Mexico’s exports between 2015 and 2025 show a clear upward trend compared to previous decades, with monthly values generally moving to higher ranges over time.
From 2015 to 2019, exports remained strong and stable, frequently fluctuating between 550,000 and 700,000, indicating consolidation and sustained growth.
In 2020, exports experienced a sharp drop, especially in April and May, reflecting a major external shock. However, there was a rapid recovery in the second half of the year.
Between 2021 and 2025, exports remained at historically high levels, often exceeding 700,000 and reaching new peaks such as over 800,000 in 2024 and 2025.
The three main characteristics of this performance are: a long-term upward trend, strong resilience after shocks, and higher volatility at elevated export levels.
## Summary Statistics:
## Average annual growth rate: 0 %
## Highest growth year: 1992/01
## Lowest growth year: 1992/01
## Total growth 2015-2023: 1004.64 %
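The figures above (a 0% average, and the same period reported as both the highest and the lowest growth year) look inconsistent with the narrative, so the calculation is worth rechecking. As a minimal sketch of one way to derive these statistics, assuming a hypothetical data frame `exports_monthly` with `date` and `value` columns and toy numbers rather than INEGI data:

```r
# Hypothetical monthly series: the column names `date` and `value` are
# assumptions, and the numbers are toy values, not INEGI data.
exports_monthly <- data.frame(
  date  = seq(as.Date("2015-01-01"), as.Date("2017-12-01"), by = "month"),
  value = c(rep(100, 12), rep(110, 12), rep(132, 12))
)

# 1. Aggregate to annual totals so growth is measured year over year.
exports_monthly$year <- as.integer(format(exports_monthly$date, "%Y"))
annual <- aggregate(value ~ year, data = exports_monthly, FUN = sum)
annual <- annual[order(annual$year), ]

# 2. Year-over-year growth in percent (the first year has no predecessor).
annual$growth_pct <- c(NA, 100 * diff(annual$value) / head(annual$value, -1))

# 3. Summary figures.
avg_growth   <- mean(annual$growth_pct, na.rm = TRUE)
best_year    <- annual$year[which.max(annual$growth_pct)]
worst_year   <- annual$year[which.min(annual$growth_pct)]
total_growth <- 100 * (tail(annual$value, 1) / annual$value[1] - 1)
```

Sorting by year before calling `diff()` and filtering to the study window before computing total growth are the two steps most likely to matter if the real series starts well before 2015.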
How can panel data analysis contribute to understanding Mexico’s exports performance?
Panel data analysis makes it possible to analyze several states over several years at the same time. This allows us to see differences between states, observe how exports change over time, and identify which variables influence exports the most, such as the exchange rate, GDP, or foreign direct investment.
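To illustrate why the panel structure matters, here is a minimal sketch of a state fixed-effects regression estimated with dummy variables in base R. The data are simulated, the state names are hypothetical, and `log_fdi` and `log_exports` are illustrative names rather than columns of the dataset.

```r
set.seed(42)

# Simulated balanced panel: 5 hypothetical states observed over 8 years.
# All values are synthetic.
panel <- expand.grid(state = paste0("state_", 1:5), year = 2016:2023)
state_effect <- c(state_1 = 0, state_2 = 2, state_3 = 4, state_4 = 6, state_5 = 8)

# FDI is built to be correlated with the unobserved state effect.
panel$log_fdi <- state_effect[as.character(panel$state)] + rnorm(nrow(panel))
panel$log_exports <- 1 + 0.5 * panel$log_fdi +
  3 * state_effect[as.character(panel$state)] + rnorm(nrow(panel), sd = 0.1)

# Pooled OLS ignores state heterogeneity and overstates the FDI slope here.
pooled <- lm(log_exports ~ log_fdi, data = panel)

# State dummies (least-squares dummy variables) absorb time-invariant
# state characteristics; the slope equals the within estimator.
fixed <- lm(log_exports ~ log_fdi + factor(state), data = panel)

coef(pooled)["log_fdi"]  # biased upward in this simulation
coef(fixed)["log_fdi"]   # close to the true value of 0.5
```

The gap between the two slopes is the kind of bias that a purely cross-sectional or purely time-series analysis cannot separate out.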
How can panel data analysis be useful in predicting Mexico’s exports performance over the next 1 – 2 years?
Panel data analysis can help because statistical models can be built to estimate how exports change when one variable changes while the others are held constant. This allows us to project possible future behavior from the historical relationships between exports and key economic variables.
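As a rough sketch of how an estimated relationship becomes a short-horizon projection, the example below fits a simple model on a synthetic annual series and applies `predict()` to a hypothetical scenario for the next two years. The series, the coefficients, and the scenario exchange rates are all assumptions made for illustration.

```r
set.seed(1)

# Synthetic annual series (not INEGI data): log exports follow a linear
# trend plus an exchange-rate effect.
hist_data <- data.frame(year = 2016:2023)
hist_data$exchange_rate <- c(18, 19, 19, 20, 22, 20, 20, 18)
hist_data$log_exports <- 10 + 0.03 * (hist_data$year - 2016) +
  0.02 * hist_data$exchange_rate + rnorm(8, sd = 0.01)

fit <- lm(log_exports ~ year + exchange_rate, data = hist_data)

# Hypothetical scenario inputs for the next two years.
scenario <- data.frame(year = c(2024, 2025), exchange_rate = c(18.5, 19.0))

# Point forecast with a 95% prediction interval (columns fit, lwr, upr).
forecast <- predict(fit, newdata = scenario, interval = "prediction")

# Back-transform from logs to levels.
exp(forecast)
```

With only eight annual observations the prediction interval is the more informative output; the point forecast alone would overstate how much the data can say about the next two years.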
How can Business Intelligence tools (e.g., predictive analytics) help to identify which Mexican states or sectors were most vulnerable to changes in U.S. tariffs and trade rules during the USMCA negotiations?
Business Intelligence tools can help by analyzing trade dependence on the United States, identifying which states depend more on specific sectors, comparing declines or instabilities during certain periods, and simulating scenarios using predictive models.
## state region year pop_density
## Aguascalientes : 8 Length:256 2016 :32 Min. : 9.791
## Baja California : 8 Class :character 2017 :32 1st Qu.: 42.534
## Baja California Sur: 8 Mode :character 2018 :32 Median : 65.441
## Campeche : 8 2019 :32 Mean : 309.866
## Chiapas : 8 2020 :32 3rd Qu.: 162.363
## Chihuahua : 8 2021 :32 Max. :6233.409
## (Other) :208 (Other):64
## gdp_per_capita_2018 lq_primary lq_secondary lq_tertiary
## Min. : 603.7 Min. :0.01493 Min. :0.3685 Min. :0.7867
## 1st Qu.:1307.2 1st Qu.:0.43199 1st Qu.:0.6470 1st Qu.:0.9339
## Median :1789.2 Median :0.82275 Median :0.9740 Median :0.9985
## Mean :1977.1 Mean :1.05431 Mean :0.9989 Mean :0.9971
## 3rd Qu.:2317.1 3rd Qu.:1.30673 3rd Qu.:1.2569 3rd Qu.:1.0679
## Max. :7731.2 Max. :4.63480 Max. :1.9911 Max. :1.1847
##
## average_daily_salary real_public_investment_pc border_economic_activity
## Min. :250.0 Min. : 4.139 Min. :-2.772
## 1st Qu.:299.1 1st Qu.: 277.511 1st Qu.:-2.137
## Median :327.7 Median : 483.944 Median :-1.982
## Mean :334.4 Mean : 599.266 Mean :-1.761
## 3rd Qu.:362.4 3rd Qu.: 835.546 3rd Qu.:-1.789
## Max. :505.7 Max. :2423.360 Max. : 2.530
##
## crime_rate exchange_rate border_distance inpc
## Min. : 1.984 Min. :12.76 Min. : 8.83 Min. : 92.04
## 1st Qu.: 10.790 1st Qu.:16.94 1st Qu.: 613.26 1st Qu.:101.83
## Median : 18.525 Median :18.16 Median : 751.64 Median :107.60
## Mean : 28.654 Mean :17.91 Mean : 704.92 Mean :110.59
## 3rd Qu.: 38.870 3rd Qu.:19.52 3rd Qu.: 875.76 3rd Qu.:119.60
## Max. :116.950 Max. :22.29 Max. :1252.66 Max. :132.37
##
## ts_exports fdi
## Min. :2.688e+05 Min. : -7413
## 1st Qu.:2.152e+07 1st Qu.: 4048
## Median :9.754e+07 Median : 10478
## Mean :2.288e+08 Mean : 18536
## 3rd Qu.:3.502e+08 3rd Qu.: 22572
## Max. :1.163e+09 Max. :171239
##
In addition to the main variable of interest (i.e., inflow of exports), please select 2 – 3 variables of interest and plot the average of each at the regional level.
The main variable of the analysis is ts_exports, since the objective is to analyze the behavior of exports at the regional level in Mexico. Additional variables selected were fdi and inpc. fdi was included because foreign direct investment can increase productive capacity and, therefore, exports, while inpc was included to consider the general economic environment, since inflation can influence costs and competitiveness.
The regional averages were plotted to compare which region exports the most, which one receives the most investment, and how the INPC behaves on average in each region.
The Northeast region (Noreste) presents the highest average exports, followed by the Northwest (Noroeste). The central and southern regions show considerably lower levels. This indicates that export activity in Mexico is mainly concentrated in the northern part of the country, which may be related to greater industrial development and proximity to the United States.
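The regional averages behind this comparison can be computed in one step. The sketch below uses a toy data frame with invented values; only the column names mirror the dataset.

```r
# Toy rows standing in for the state-year panel; the values are invented.
toy <- data.frame(
  region     = c("Noreste", "Noreste", "Sur", "Sur"),
  ts_exports = c(400, 600, 50, 150),
  fdi        = c(20, 30, 5, 15),
  inpc       = c(100, 110, 100, 110)
)

# Mean of each selected variable within each region.
regional_avg <- aggregate(cbind(ts_exports, fdi, inpc) ~ region,
                          data = toy, FUN = mean)
regional_avg
```

Each column of `regional_avg` can then be passed to a bar plot, one plot per variable.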
Display a histogram for each of the above selected variables.
The distribution of ts_exports is right-skewed. This means that most observations have moderate or low levels of exports, while a few show very high values. This could indicate that exports are concentrated in a few regions.
Most observations show low or moderate levels of foreign direct investment, while only a few present very high values. This indicates that investment is concentrated in certain regions and not evenly distributed.
The histogram shows that INPC does not change much and most values are within a similar range. There are no extreme values, which indicates that prices remained relatively stable between 2016 and 2023. Unlike exports and FDI, there is not much variation here.
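The right skew described for exports and FDI can be quantified with a moment-based skewness statistic. This is a minimal sketch on synthetic log-normal draws, not the actual `ts_exports` values, and the helper `skewness()` is defined here rather than taken from a package.

```r
set.seed(7)

# Synthetic right-skewed values standing in for ts_exports.
x <- rlnorm(256, meanlog = 18, sdlog = 1.5)

# Moment-based sample skewness: values above 0 indicate a longer right tail.
skewness <- function(v) {
  v <- v[!is.na(v)]
  mean((v - mean(v))^3) / (mean((v - mean(v))^2))^(3 / 2)
}

skewness(x)       # clearly positive for the raw values
skewness(log(x))  # close to zero after the log transform
```

When the raw histogram is dominated by a few very large observations, plotting `hist(log(x))` usually makes the bulk of the distribution easier to read.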
In addition to the main variable of interest (i.e., inflow of exports), please select 2 – 3 variables of interest and display a boxplot for each variable to visualize and compare data distributions of these variables across regions.
## # A tibble: 1 × 5
## mean_exports sd_exports var_exports iqr_exports range_exports
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 228822084. 274331648. 7.53e16 328732074. 1162848164.
The standard deviation is high compared to the mean, which means that exports vary significantly across regions. The range is also very large, showing a big difference between the region that exports the most and the one that exports the least. Therefore, not all regions export at the same level and there is considerable inequality.
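The comparison between the standard deviation and the mean can be made explicit with the coefficient of variation, using the values reported in the tibble above:

```r
# Values copied from the summary tibble above.
mean_exports <- 228822084
sd_exports   <- 274331648

# Coefficient of variation: standard deviation relative to the mean.
cv <- sd_exports / mean_exports
round(cv, 2)  # about 1.2: the spread is larger than the mean itself

# The variance is the squared standard deviation, consistent with the
# reported var_exports of 7.53e16.
signif(sd_exports^2, 3)
```

A coefficient of variation above 1 is a common rule-of-thumb signal of a highly dispersed, right-skewed variable.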
Exports (ts_exports): They are concentrated in the North, especially the Northeast and Northwest, and the boxplots show large differences between regions.
FDI (fdi): Mexico City stands out compared to the other regions. Investment is concentrated in a few regions, and there are isolated high values.
INPC (inpc): The distribution is very similar across all regions. It varies far less than exports and FDI.
Line plot of inflow exports over study time period.
library(dplyr)    # data manipulation
library(ggplot2)  # plotting
library(scales)   # comma() axis labels

# Aggregate state-level exports into one national total per year
data_total_nacional <- data %>%
  mutate(
    year = as.numeric(as.character(year)),  # factor -> numeric year
    ts_exports = as.numeric(ts_exports)
  ) %>%
  group_by(year) %>%
  summarise(total_exportado = sum(ts_exports, na.rm = TRUE))

# Line plot of total exports by year
ggplot(data_total_nacional, aes(x = year, y = total_exportado)) +
  geom_line(color = "#2c3e50", linewidth = 1.2) +
  geom_point(color = "#e74c3c", size = 3) +
  theme_minimal() +
  labs(
    title = "Total Inflow Exports over Study Time Period",
    subtitle = "Aggregated data from all regions (2016-2023)",
    x = "Year",
    y = "Total Exports (Value)"
  ) +
  scale_x_continuous(breaks = seq(min(data_total_nacional$year),
                                  max(data_total_nacional$year), by = 1)) +
  scale_y_continuous(labels = comma)
Line plots of exports inflows by region over study time period.
# The panel has several states per region and year, so average them first;
# otherwise geom_line() connects multiple points within the same year.
data_plot_facets <- data %>%
  mutate(
    year = as.numeric(as.character(year)),
    ts_exports = as.numeric(ts_exports)
  ) %>%
  group_by(region, year) %>%
  summarise(ts_exports = mean(ts_exports, na.rm = TRUE), .groups = "drop")

# One panel per region, each with its own y-axis scale
ggplot(data_plot_facets, aes(x = year, y = ts_exports, group = region)) +
  geom_line(color = "steelblue", linewidth = 1) +
  geom_point(color = "darkblue", size = 1.5) +
  facet_wrap(~region, scales = "free_y", ncol = 2) +
  theme_minimal() +
  labs(
    title = "Export Inflows by Region (Individual Panels)",
    subtitle = "Analysis from 2016 to 2023",
    x = "Year",
    y = "Average Export Value (ts_exports)"
  ) +
  scale_x_continuous(breaks = seq(2016, 2023, by = 3)) +
  scale_y_continuous(labels = scales::comma) +
  theme(
    strip.background = element_rect(fill = "gray90"),
    strip.text = element_text(face = "bold"),
    axis.text.x = element_text(angle = 45, hjust = 1)
  )
Display the performance of both exports inflows and exchange rate over the study time period. Is there any relevant shock?
# Aggregate to one row per year: total exports and mean exchange rate
data_analisis <- data %>%
  group_by(year) %>%
  summarise(
    total_exports = sum(as.numeric(ts_exports), na.rm = TRUE),
    avg_ex_rate = mean(as.numeric(exchange_rate), na.rm = TRUE)
  ) %>%
  mutate(year = as.numeric(as.character(year)))

# Factor that maps the exchange rate onto the exports scale (secondary axis)
rescale_factor <- max(data_analisis$total_exports) / max(data_analisis$avg_ex_rate)

# Plot
ggplot(data_analisis, aes(x = year)) +
  # Exports line
  geom_line(aes(y = total_exports, color = "Exports (ts_exports)"), linewidth = 1.2) +
  geom_point(aes(y = total_exports, color = "Exports (ts_exports)"), size = 2) +
  # Exchange rate line
  geom_line(aes(y = avg_ex_rate * rescale_factor, color = "Exchange Rate"),
            linewidth = 1.2, linetype = "twodash") +
  geom_point(aes(y = avg_ex_rate * rescale_factor, color = "Exchange Rate"), size = 2) +
  # Configure the two y-axes
  scale_y_continuous(
    name = "Total Export Inflows",
    labels = scales::comma,
    sec.axis = sec_axis(~ . / rescale_factor, name = "Exchange Rate (MXN/USD)")
  ) +
  theme_minimal() +
  labs(
    title = "Performance of Exports and Exchange Rate (2016-2023)",
    x = "Year",
    color = "Variable"
  ) +
  scale_color_manual(values = c("Exports (ts_exports)" = "#2c3e50", "Exchange Rate" = "#e74c3c")) +
  theme(legend.position = "bottom")
Bar Plot: Compare exports inflows of northern vs southern regions over the study time period.
# a.1. Panel Structure - Standardize column names
# rename_with() replaces the superseded rename_all()
data_pd <- data_pd %>%
  rename_with(tolower)

# a.2. Regional Classification - Reclassify existing regions into North vs South
data_pd <- data_pd %>%
  mutate(region_group = case_when(
    region %in% c("Noroeste", "Noreste") ~ "North",
    region %in% c("Sur", "Centro_Sur_Oriente") ~ "South",
    TRUE ~ "Center"  # This includes "Occidente_Bajio" and "CdMx"
  ))

# Verify the classification
table(data_pd$region_group, data_pd$region)
##
## CdMx Centro_Sur_Oriente Noreste Noroeste Occidente_Bajio Sur
## Center 8 0 0 0 64 0
## North 0 0 32 48 0 0
## South 0 48 0 0 0 56
# a.3. Bar Plot -- Export Inflows: North vs South
# Average state-level exports per macro-region and year (Center excluded)
exports_region_year <- data_pd %>%
  filter(region_group %in% c("North", "South")) %>%
  group_by(region_group, year) %>%
  summarise(avg_exports = mean(ts_exports, na.rm = TRUE), .groups = "drop")

ggplot(exports_region_year, aes(x = as.factor(year), y = avg_exports, fill = region_group)) +
  geom_bar(stat = "identity", position = "dodge", width = 0.7) +
  labs(
    title = "Export Performance: Northern vs Southern Mexican States",
    subtitle = "Average export inflows by region (2016-2023)",
    x = "Year",
    y = "Average Exports (MXN Pesos, 2018 = 100)",
    fill = "Region",
    caption = "Source: INEGI"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    legend.position = "bottom"
  ) +
  scale_fill_manual(values = c("North" = "#2E86AB", "South" = "#A23B72")) +
  scale_y_continuous(labels = scales::comma)
Map average exports inflows at Mexico’s state level.
library(rnaturalearth)  # provides ne_states()
library(sf)             # needed for sf objects and geom_sf()

# Use the original 'data' dataset (not the panel version)
avg_exports_state <- data %>%
  group_by(state) %>%
  summarise(avg_exports = mean(ts_exports, na.rm = TRUE))

# Get Mexico map
mexico <- ne_states(country = "mexico", returnclass = "sf")

# Merge the data; states whose names differ between the two sources
# will not match and are drawn with the na.value color
mexico_exports <- mexico %>%
  left_join(avg_exports_state, by = c("name" = "state"))

# Create the map (linewidth sets the border width in ggplot2 >= 3.4)
ggplot(mexico_exports) +
  geom_sf(aes(fill = avg_exports), color = "white", linewidth = 0.3) +
  scale_fill_gradient(
    low = "#E8F4F8",
    high = "#2E86AB",
    name = "Average Exports\n(MXN, 2018=100)",
    labels = scales::comma,
    na.value = "grey90"
  ) +
  labs(
    title = "Average Export Inflows by Mexican State (2016-2023)",
    subtitle = "State-level export performance",
    caption = "Source: INEGI"
  ) +
  theme_void() +
  theme(
    plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
    plot.subtitle = element_text(size = 12, hjust = 0.5),
    legend.position = "right"
  )
How do northern vs. southern states differ in exposure to international markets? What are 3 - 5 characteristics that explain this difference?
The bar chart shows a persistent export gap between regions from 2016–2023, with Northern states exporting roughly four to five times more than Southern states each year. The map reinforces this pattern by showing export activity spatially concentrated in northern border states like Chihuahua, Coahuila, Nuevo León, and Baja California, which dominate national export performance.
In contrast, southern states such as Chiapas and Oaxaca exhibit much lower export averages. The regional classification table confirms that this comparison groups official statistical regions into broader North and South macro-regions, making the observed divide economically meaningful rather than arbitrary. Key characteristics:
- Physical integration: Northern regions, both east and west, are physically integrated with the United States through dense border crossings.
- Export-oriented manufacturing: Northern Mexico specializes in manufacturing sectors designed for export, such as cars, electronics, and machinery.
- Trade connectivity: The North has dense highway and rail networks linked to border crossings, industrial parks and logistics hubs, and faster customs access.
Calculate a correlation matrix by considering a heatmap format.
1. Border Proximity Drives Economic Activity (r = -0.81) States closer to the US border show significantly stronger cross-border economic integration, suggesting geographic location is a critical determinant of international trade exposure.
2. Foreign Investment Sensitive to Currency Fluctuations (r = -0.75) FDI flows demonstrate strong inverse relationship with exchange rates, indicating foreign investors respond sharply to peso valuation changes.
3. Export Performance Shows Wage Correlation (r = 0.64) Higher export volumes associate with elevated daily salaries, though exports remain relatively independent of most other economic indicators, suggesting specialized or sector-specific drivers.
4. Sectoral Specialization Trade-offs Primary sector concentration shows a moderate negative correlation with GDP per capita (r = -0.46) and with secondary sector concentration (r = -0.46), revealing distinct state development pathways between agriculture and manufacturing.
5. Crime and Economic Outcomes (r = -0.37 with wages) Security conditions show moderate inverse relationship with wage levels, potentially reflecting either economic impacts of insecurity or underlying institutional factors affecting both variables.
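The correlation coefficients listed above come from a pairwise correlation matrix. As a minimal sketch of how such a matrix can be built and reshaped for a heatmap, the example below uses synthetic values; only the column names are borrowed from the dataset.

```r
set.seed(3)

# Synthetic stand-in for the numeric panel columns (invented values).
n <- 100
border_distance <- runif(n, 10, 1250)
toy <- data.frame(
  border_distance = border_distance,
  ts_exports      = 1000 - 0.6 * border_distance + rnorm(n, sd = 100),
  crime_rate      = runif(n, 2, 117)
)

# Pairwise Pearson correlations, using complete pairs only.
cor_mat <- cor(toy, use = "pairwise.complete.obs")

# Long format (one row per cell) is what tile-based heatmaps expect.
cor_long <- as.data.frame(as.table(cor_mat))
names(cor_long) <- c("var1", "var2", "r")
cor_long
```

With ggplot2 loaded, `ggplot(cor_long, aes(var1, var2, fill = r)) + geom_tile()` would render the long table as a heatmap.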
In addition to the main variable of interest (i.e., inflow of exports), please select 2 – 3 variables of interest. Please create and display 2 – 3 scatterplots to explore the relationship between exports inflows and the other selected variables.
GDP per Capita - Economic development’s relationship with export activity
Border Distance - Geographic advantage/disadvantage for trade
Average Daily Salary - Labor market conditions and export competitiveness
1. Export Performance and Economic Development (GDP per Capita): Positive but Moderate Relationship
States with higher GDP per capita tend to have higher export volumes, suggesting that economic development supports export capacity.
Notable dispersion: Several high-GDP states (6,000-8,000 MXN range) show relatively low export volumes, indicating that GDP per capita alone doesn’t guarantee export success.
Cluster of low performers: A large concentration of states with GDP per capita below 2,500 and minimal exports suggests structural barriers in less developed states. This implies that export competitiveness requires more than economic size: likely infrastructure, industry specialization, or trade logistics.
2. Geographic Proximity and Export Activity (Border Distance): Strong Negative Relationship
Clear inverse pattern: States closer to the US border demonstrate significantly higher export volumes.
Border states dominate: States within 400 km show the highest export activity (clustering in the 400-1,200 million MXN range).
Geographic penalty: Export volumes decline sharply as distance increases, with states beyond 800 km showing minimal activity.
Strategic advantage: Proximity to US markets provides a major competitive edge, likely due to lower transportation costs and established trade infrastructure.
3. Labor Costs and Export Competitiveness (Average Daily Salary): Positive Correlation
Higher wages correlate with higher exports, contradicting the traditional “low-wage competitiveness” narrative.
Quality over cost: This suggests Mexican exports compete on value-added production rather than purely on low labor costs.
Strongest performers: States paying 350-450 MXN daily show the strongest export performance, indicating that skilled labor and productivity drive export success.
Two clusters: Low-wage/low-export states (250-300 MXN) versus higher-wage/higher-export states (350-500 MXN), pointing to a development threshold where states transition from domestic to export-oriented economies.
According to the results of the exploratory panel data analysis, please set 4 – 5 relevant hypothesis statements to be explored and tested by doing panel data analysis.
Hypothesis: States with closer proximity to the US border exhibit significantly higher export volumes, controlling for other economic factors. - Reason: The map and scatterplot show clear concentration of exports in northern border states. The negative correlation between border distance and exports suggests geographic advantage.
Hypothesis: States with higher average daily salaries demonstrate greater export performance, as higher wages reflect skilled labor, productivity advantages, and participation in value-added manufacturing sectors. - Reason: Scatterplot shows positive correlation between wages and exports, contradicting the low-wage competitiveness narrative. States paying 350-450 MXN daily show strongest export performance, suggesting quality over cost competition.
Hypothesis: Peso depreciation (higher exchange rate values) positively impacts export volumes by improving price competitiveness in international markets. - Reason: The dual-axis chart shows exports generally increasing while exchange rate (MXN/USD) rises, suggesting currency depreciation may stimulate exports.
Hypothesis: Higher crime rates negatively affect export volumes by disrupting supply chains, increasing operational costs, and deterring foreign investment. - Reason: Correlation matrix reveals negative relationship between crime rates and wages (-0.37), suggesting insecurity undermines economic activity and potentially trade operations.
Hypothesis: Northern states (Noreste, Noroeste) maintain persistently higher export performance compared to southern states (Sur, Centro_Sur_Oriente), even after controlling for economic and geographic factors, indicating structural export capacity differences. - Reason: The North vs South bar chart shows consistent 4-5x export gap across all years (2016-2023), while regional panel charts reveal distinct performance tiers that persist over time.
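One practical point for testing the hypotheses on border proximity and on the North vs South divide: `border_distance` and the regional grouping do not change over time within a state, so a state fixed-effects model cannot estimate their coefficients, and a pooled or random-effects specification would be needed for those terms. A minimal sketch on synthetic data (hypothetical states and invented coefficients) shows the issue:

```r
set.seed(11)

# Synthetic panel: 6 hypothetical states over 8 years. Values are invented.
panel <- expand.grid(state = paste0("state_", 1:6), year = 2016:2023)
dist_by_state <- c(50, 200, 400, 700, 900, 1200)
names(dist_by_state) <- paste0("state_", 1:6)
panel$border_distance <- dist_by_state[as.character(panel$state)]
panel$salary <- 300 + rnorm(nrow(panel), sd = 30)
panel$log_exports <- 20 - 0.002 * panel$border_distance +
  0.004 * panel$salary + rnorm(nrow(panel), sd = 0.1)

# With state dummies, border_distance is perfectly collinear with them,
# so lm() reports NA for its coefficient.
fe <- lm(log_exports ~ factor(state) + border_distance + salary, data = panel)
coef(fe)["border_distance"]

# Pooled OLS (no state dummies) can estimate the distance effect.
pooled <- lm(log_exports ~ border_distance + salary, data = panel)
coef(pooled)["border_distance"]
```

Time-varying regressors such as salaries, crime rates, and the exchange rate are unaffected by this limitation and can be tested within states.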
Based on the EDA’s results, please briefly summarize and describe the main 4 – 7 findings. Generally, an answer clear and specific enhances understanding, eliminates confusion, and ensures accurate communication by focusing on facts, and avoiding jargon or vague terminology.
Northern states (Noreste, Noroeste) consistently export 4-5 times more than southern states (Sur, Centro_Sur_Oriente) across all years (2016-2023). Northern states average 450-500 million MXN annually while southern states average only 90-120 million MXN. This gap persists throughout the entire study period without convergence.
States located within 400 km of the US border dominate export activity, with volumes reaching 800-1,200 million MXN. Export performance declines sharply with distance; states beyond 800 km from the border show minimal export activity. The scatterplot shows a strong negative relationship between border distance and exports, pointing to geographic location as a critical determinant.
Contrary to low-cost competition models, states paying higher average daily salaries (350-500 MXN) demonstrate superior export performance. The positive correlation (0.64) suggests Mexican exports compete on productivity and value-added manufacturing rather than cheap labor. This indicates participation in skilled manufacturing sectors like automotive and electronics.
From 2016 to 2023, the exchange rate fell from 22.3 to 12.8 MXN/USD (peso strengthening), while total exports fluctuated between 6.8 and 8.0 billion MXN. The 2021-2022 period saw both peso strengthening and peak export volumes (8.0 billion MXN), suggesting exports respond to factors beyond currency competitiveness alone.
Total exports dropped from 7.5 billion MXN (2019) to 6.9 billion MXN (2020)—an 8% decline. However, recovery was rapid, with exports reaching record levels of 8.0 billion MXN by 2022. Northern states recovered faster than southern states, widening the regional gap post-pandemic.
The spatial map reveals only 3-4 states (Chihuahua, Nuevo León, Coahuila, Baja California) account for the majority of national exports, shown by darkest blue coloring (>750 million MXN average). Meanwhile, over 20 states show minimal export activity (light blue/gray, <250 million MXN), indicating highly concentrated export capacity.
States with higher crime rates (above 40 per 100,000) tend to have lower wages and reduced export activity. The correlation between crime and wages (-0.37) suggests insecurity undermines productive economic activity, though the relationship with exports requires further panel regression analysis to establish causality.
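The percentage changes quoted for the 2020 shock and the subsequent recovery can be reproduced from the totals stated in the findings above. The helper `pct_change()` is defined here for illustration.

```r
# Percent change from an old value to a new value.
pct_change <- function(old, new) 100 * (new - old) / old

# Totals quoted in the findings above, in billion MXN.
pct_change(7.5, 6.9)  # 2019 -> 2020: a decline of about 8%
pct_change(6.9, 8.0)  # 2020 -> 2022: a recovery of roughly 16%
```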