This document analyzes the export performance of Mexico's 32 states
in 2022, using INEGI data. The dependent variable is `real_exports`
(real exports in USD), complemented by six explanatory variables
selected from the dataset.
# Install any packages that are not yet available
paquetes <- c("readxl", "dplyr", "tidyr", "ggplot2",
              "scales", "patchwork", "knitr", "e1071")
nuevos <- paquetes[!paquetes %in% installed.packages()[, "Package"]]
if (length(nuevos)) install.packages(nuevos)
library(readxl)
library(dplyr)
library(tidyr)
library(ggplot2)
library(scales)
library(patchwork)
library(knitr)
library(e1071)
# ── Import the "exports" and "data" sheets ────────────────────────────────────
file_path <- "/Users/mariirobles/Downloads/inegi_mx_state_exports.xlsx" # Absolute local path; adjust it to wherever your copy of the file lives
exports_sheet <- read_excel(file_path, sheet = "exports")
data_sheet <- read_excel(file_path, sheet = "data")
# ── Filter year 2022 and merge ─────────────────────────────────────────────────
data_2022 <- data_sheet |>
  filter(year == 2022) |>
  select(state, region,
         gdp_per_capita_2018,
         lq_secondary,
         average_daily_salary,
         border_distance,
         college_education,
         crime_rate)
exports_2022 <- exports_sheet |>
  select(state, real_exports = real_exports_2022)
df <- data_2022 |>
  left_join(exports_2022, by = "state") |>
  mutate(region = factor(region))
kable(head(df), caption = "Dataset preview (2022, n = 32 states)",
      format.args = list(big.mark = ",", scientific = FALSE))
| state | region | gdp_per_capita_2018 | lq_secondary | average_daily_salary | border_distance | college_education | crime_rate | real_exports |
|---|---|---|---|---|---|---|---|---|
| Aguascalientes | Occidente_Bajio | 2,135.7824 | 1.2701215 | 363.8170 | 625.59 | 0.3136129 | 5.634732 | 181,116,479 |
| Baja California | Noroeste | 2,421.0493 | 1.5095911 | 411.7818 | 8.83 | 0.3518042 | 75.873393 | 817,710,013 |
| Baja California Sur | Noroeste | 2,128.7127 | 0.5157059 | 362.6095 | 800.32 | 0.3564740 | 9.679779 | 6,598,584 |
| Campeche | Sur | 5,023.4661 | 0.6871682 | 425.1225 | 978.33 | 0.2789125 | 9.883691 | 296,052,336 |
| Chiapas | Sur | 659.4383 | 0.6859139 | 317.7348 | 1,111.82 | 0.1915258 | 8.733395 | 21,460,417 |
| Chihuahua | Noroeste | 2,378.7155 | 1.8605077 | 379.4896 | 350.02 | 0.2989659 | 55.887096 | 1,163,116,947 |
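Because a left join keeps every row of `data_2022` even when no export figure matches, a quick check on the merged table can catch misspelled or missing state names. The sketch below illustrates the idea in base R with two small made-up data frames (`data_toy` and `exports_toy` are hypothetical stand-ins, and the values are not INEGI figures); `merge(..., all.x = TRUE)` is used here as the base-R counterpart of `left_join()`.

```r
# Toy stand-ins for data_2022 and exports_2022 (illustrative values only)
data_toy <- data.frame(
  state  = c("Aguascalientes", "Chiapas", "Chihuahua"),
  region = c("Occidente_Bajio", "Sur", "Noroeste")
)
exports_toy <- data.frame(
  state        = c("Aguascalientes", "Chihuahua"),
  real_exports = c(100, 900)
)

# all.x = TRUE mimics a left join: every row of data_toy is kept
merged <- merge(data_toy, exports_toy, by = "state", all.x = TRUE)

# States without a match in the exports table end up with NA
unmatched <- as.character(merged$state[is.na(merged$real_exports)])

# A left join on a unique key should not change the row count
same_rows <- nrow(merged) == nrow(data_toy)

print(unmatched)
print(same_rows)
```

Applied to the real data, the same two checks would be run on `df` right after the join, comparing against `nrow(data_2022)`.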
The following table summarizes the selected variables and their theoretical rationale:
| Variable | Description | Rationale |
|---|---|---|
| `real_exports` | Real exports (USD) | Dependent variable |
| `gdp_per_capita_2018` | GDP per capita at 2018 prices (MXN) | State productive capacity |
| `lq_secondary` | Location quotient of the secondary sector | Manufacturing specialization |
| `average_daily_salary` | Average daily salary (MXN) | Labor competitiveness |
| `border_distance` | Distance to the nearest border crossing (km) | Logistics cost to the U.S. |
| `college_education` | Share of the population with a college education | Human capital |
| `crime_rate` | Crime rate (incidents per 100k inhabitants) | Business climate / risk |
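For readers unfamiliar with `lq_secondary`: a location quotient is commonly computed as a state's share of activity in a sector divided by the national share in that same sector, so values above 1 point to relative specialization. The dataset does not say which base (employment, GDP, or something else) the INEGI figure uses, so the sketch below assumes employment counts and uses made-up numbers.

```r
# Hypothetical employment counts (not INEGI data)
emp <- data.frame(
  state     = c("A", "B", "C"),
  secondary = c(300, 100, 200),    # workers in the secondary sector
  total     = c(1000, 1000, 1000)  # all workers in the state
)

# National share of secondary-sector employment
national_share <- sum(emp$secondary) / sum(emp$total)

# Location quotient: state share relative to the national share
emp$lq <- (emp$secondary / emp$total) / national_share

print(emp)
```

Under this definition, state A would count as specialized in the secondary sector (quotient above 1) and state B as under-represented.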
var_labels <- c(
  real_exports         = "Real Exports (USD)",
  gdp_per_capita_2018  = "GDP per Capita 2018 (MXN)",
  lq_secondary         = "LQ Secondary Sector",
  average_daily_salary = "Avg. Daily Salary (MXN)",
  border_distance      = "Border Distance (km)",
  college_education    = "College Education (%)",
  crime_rate           = "Crime Rate (per 100k)"
)
vars_of_interest <- names(var_labels)
desc_stats <- df |>
  group_by(region) |>
  summarise(across(
    all_of(vars_of_interest),
    list(
      Mean   = \(x) mean(x, na.rm = TRUE),
      Median = \(x) median(x, na.rm = TRUE),
      Min    = \(x) min(x, na.rm = TRUE),
      Max    = \(x) max(x, na.rm = TRUE)
    ),
    .names = "{.col}__{.fn}"
  )) |>
  pivot_longer(-region,
               names_to = c("Variable", "Statistic"),
               names_sep = "__") |>
  pivot_wider(names_from = Statistic, values_from = value) |>
  mutate(Variable = recode(Variable, !!!var_labels),
         across(where(is.numeric), \(x) round(x, 2)))
# scientific = FALSE prevents wide-ranging columns from being printed in scientific notation
kable(desc_stats,
      caption = "Descriptive statistics by region (2022)",
      format.args = list(big.mark = ",", scientific = FALSE))
| region | Variable | Mean | Median | Min | Max |
|---|---|---|---|---|---|
| CdMx | Real Exports (USD) | 53,147,597.65 | 53,147,597.65 | 53,147,597.65 | 5.314760e+07 |
| CdMx | GDP per Capita 2018 (MXN) | 3,920.52 | 3,920.52 | 3,920.52 | 3.920520e+03 |
| CdMx | LQ Secondary Sector | 0.56 | 0.56 | 0.56 | 5.600000e-01 |
| CdMx | Avg. Daily Salary (MXN) | 482.42 | 482.42 | 482.42 | 4.824200e+02 |
| CdMx | Border Distance (km) | 738.20 | 738.20 | 738.20 | 7.382000e+02 |
| CdMx | College Education (%) | 0.44 | 0.44 | 0.44 | 4.400000e-01 |
| CdMx | Crime Rate (per 100k) | 7.97 | 7.97 | 7.97 | 7.970000e+00 |
| Centro_Sur_Oriente | Real Exports (USD) | 148,230,033.15 | 95,440,001.09 | 33,105,906.83 | 3.191216e+08 |
| Centro_Sur_Oriente | GDP per Capita 2018 (MXN) | 1,250.40 | 1,277.59 | 1,045.22 | 1.353270e+03 |
| Centro_Sur_Oriente | LQ Secondary Sector | 0.97 | 0.96 | 0.57 | 1.700000e+00 |
| Centro_Sur_Oriente | Avg. Daily Salary (MXN) | 332.48 | 333.78 | 304.23 | 3.593300e+02 |
| Centro_Sur_Oriente | Border Distance (km) | 738.86 | 750.06 | 661.85 | 7.956000e+02 |
| Centro_Sur_Oriente | College Education (%) | 0.28 | 0.27 | 0.26 | 3.200000e-01 |
| Centro_Sur_Oriente | Crime Rate (per 100k) | 21.11 | 14.10 | 9.30 | 5.924000e+01 |
| Noreste | Real Exports (USD) | 631,749,721.53 | 653,135,533.64 | 273,330,407.64 | 9.473974e+08 |
| Noreste | GDP per Capita 2018 (MXN) | 2,504.30 | 2,470.00 | 1,867.82 | 3.209390e+03 |
| Noreste | LQ Secondary Sector | 1.34 | 1.29 | 1.17 | 1.630000e+00 |
| Noreste | Avg. Daily Salary (MXN) | 386.86 | 381.29 | 372.29 | 4.125900e+02 |
| Noreste | Border Distance (km) | 344.95 | 321.34 | 216.33 | 5.207900e+02 |
| Noreste | College Education (%) | 0.27 | 0.26 | 0.24 | 3.200000e-01 |
| Noreste | Crime Rate (per 100k) | 16.79 | 18.00 | 4.85 | 2.633000e+01 |
| Noroeste | Real Exports (USD) | 409,274,781.50 | 210,389,805.36 | 6,598,584.34 | 1.163117e+09 |
| Noroeste | GDP per Capita 2018 (MXN) | 2,139.72 | 2,253.71 | 1,594.17 | 2.633290e+03 |
| Noroeste | LQ Secondary Sector | 1.19 | 1.25 | 0.52 | 1.860000e+00 |
| Noroeste | Avg. Daily Salary (MXN) | 346.07 | 352.80 | 283.06 | 4.117800e+02 |
| Noroeste | Border Distance (km) | 476.05 | 507.98 | 8.83 | 8.003200e+02 |
| Noroeste | College Education (%) | 0.32 | 0.33 | 0.26 | 3.600000e-01 |
| Noroeste | Crime Rate (per 100k) | 37.49 | 37.46 | 6.42 | 7.587000e+01 |
| Occidente_Bajio | Real Exports (USD) | 193,328,819.11 | 146,806,516.42 | 4,463,690.27 | 4.903483e+08 |
| Occidente_Bajio | GDP per Capita 2018 (MXN) | 1,790.71 | 1,853.24 | 1,302.82 | 2.393330e+03 |
| Occidente_Bajio | LQ Secondary Sector | 0.94 | 0.95 | 0.59 | 1.420000e+00 |
| Occidente_Bajio | Avg. Daily Salary (MXN) | 343.73 | 337.06 | 299.21 | 4.114400e+02 |
| Occidente_Bajio | Border Distance (km) | 737.70 | 712.62 | 576.27 | 9.781300e+02 |
| Occidente_Bajio | College Education (%) | 0.27 | 0.29 | 0.21 | 3.200000e-01 |
| Occidente_Bajio | Crime Rate (per 100k) | 45.55 | 34.03 | 5.63 | 1.110600e+02 |
| Sur | Real Exports (USD) | 78,741,131.18 | 20,967,017.37 | 1,008,475.84 | 2.960523e+08 |
| Sur | GDP per Capita 2018 (MXN) | 1,930.79 | 1,558.00 | 659.44 | 5.023470e+03 |
| Sur | LQ Secondary Sector | 0.80 | 0.75 | 0.38 | 1.120000e+00 |
| Sur | Avg. Daily Salary (MXN) | 334.65 | 317.73 | 296.50 | 4.251200e+02 |
| Sur | Border Distance (km) | 1,035.46 | 992.44 | 952.30 | 1.252660e+03 |
| Sur | College Education (%) | 0.26 | 0.28 | 0.19 | 3.500000e-01 |
| Sur | Crime Rate (per 100k) | 17.89 | 14.19 | 1.98 | 3.855000e+01 |
# Regions with a single state (CdMx) return SD = NA, since sd() needs at least two observations
disp_stats <- df |>
  group_by(region) |>
  summarise(across(
    all_of(vars_of_interest),
    list(SD = \(x) sd(x, na.rm = TRUE)),
    .names = "{.col}__{.fn}"
  )) |>
  pivot_longer(-region,
               names_to = c("Variable", "Statistic"),
               names_sep = "__") |>
  pivot_wider(names_from = Statistic, values_from = value) |>
  mutate(Variable = recode(Variable, !!!var_labels),
         SD = round(SD, 2))
kable(disp_stats,
      caption = "Standard deviation by region (2022)",
      format.args = list(big.mark = ","))
| region | Variable | SD |
|---|---|---|
| CdMx | Real Exports (USD) | NA |
| CdMx | GDP per Capita 2018 (MXN) | NA |
| CdMx | LQ Secondary Sector | NA |
| CdMx | Avg. Daily Salary (MXN) | NA |
| CdMx | Border Distance (km) | NA |
| CdMx | College Education (%) | NA |
| CdMx | Crime Rate (per 100k) | NA |
| Centro_Sur_Oriente | Real Exports (USD) | 131,190,070.85 |
| Centro_Sur_Oriente | GDP per Capita 2018 (MXN) | 107.62 |
| Centro_Sur_Oriente | LQ Secondary Sector | 0.41 |
| Centro_Sur_Oriente | Avg. Daily Salary (MXN) | 20.83 |
| Centro_Sur_Oriente | Border Distance (km) | 48.11 |
| Centro_Sur_Oriente | College Education (%) | 0.02 |
| Centro_Sur_Oriente | Crime Rate (per 100k) | 18.99 |
| Noreste | Real Exports (USD) | 297,334,225.63 |
| Noreste | GDP per Capita 2018 (MXN) | 632.17 |
| Noreste | LQ Secondary Sector | 0.21 |
| Noreste | Avg. Daily Salary (MXN) | 17.85 |
| Noreste | Border Distance (km) | 132.66 |
| Noreste | College Education (%) | 0.04 |
| Noreste | Crime Rate (per 100k) | 9.94 |
| Noroeste | Real Exports (USD) | 480,581,191.61 |
| Noroeste | GDP per Capita 2018 (MXN) | 421.14 |
| Noroeste | LQ Secondary Sector | 0.50 |
| Noroeste | Avg. Daily Salary (MXN) | 49.29 |
| Noroeste | Border Distance (km) | 321.93 |
| Noroeste | College Education (%) | 0.04 |
| Noroeste | Crime Rate (per 100k) | 29.38 |
| Occidente_Bajio | Real Exports (USD) | 183,335,302.77 |
| Occidente_Bajio | GDP per Capita 2018 (MXN) | 418.29 |
| Occidente_Bajio | LQ Secondary Sector | 0.33 |
| Occidente_Bajio | Avg. Daily Salary (MXN) | 36.78 |
| Occidente_Bajio | Border Distance (km) | 136.01 |
| Occidente_Bajio | College Education (%) | 0.05 |
| Occidente_Bajio | Crime Rate (per 100k) | 39.91 |
| Sur | Real Exports (USD) | 112,467,854.03 |
| Sur | GDP per Capita 2018 (MXN) | 1,516.06 |
| Sur | LQ Secondary Sector | 0.25 |
| Sur | Avg. Daily Salary (MXN) | 45.08 |
| Sur | Border Distance (km) | 109.27 |
| Sur | College Education (%) | 0.06 |
| Sur | Crime Rate (per 100k) | 13.39 |
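The `NA` values for CdMx arise because that region contains a single state, and the sample standard deviation divides by n − 1. The minimal base-R sketch below (invented values, hypothetical object names) reports the group size next to the SD so that single-observation groups are easy to spot.

```r
# Toy data: one region with a single observation, one with three
toy <- data.frame(
  region = c("CdMx", "Sur", "Sur", "Sur"),
  value  = c(10, 2, 4, 6)
)

# Group size and sample standard deviation per region
n_by_region  <- tapply(toy$value, toy$region, length)
sd_by_region <- tapply(toy$value, toy$region, sd)

summary_tbl <- data.frame(
  region = names(n_by_region),
  n      = as.integer(n_by_region),
  sd     = as.numeric(sd_by_region)
)
print(summary_tbl)
```

Showing `n` alongside the SD makes it clear that the missing value reflects the group size, not a data problem.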
region_colors <- c(
  "CdMx"               = "#E63946",
  "Centro_Sur_Oriente" = "#457B9D",
  "Noreste"            = "#2A9D8F",
  "Noroeste"           = "#E9C46A",
  "Occidente_Bajio"    = "#F4A261",
  "Sur"                = "#6A0572"
)
theme_mx <- theme_minimal(base_size = 11) +
  theme(
    plot.title       = element_text(face = "bold", size = 11),
    plot.subtitle    = element_text(size = 8.5, color = "grey40"),
    axis.title       = element_text(size = 9),
    panel.grid.minor = element_blank()
  )
hist_vars <- c("real_exports", "gdp_per_capita_2018", "lq_secondary",
               "average_daily_salary", "border_distance", "college_education")
# Helper that labels the skewness of a variable
skew_label <- function(x) {
  sk <- e1071::skewness(x, na.rm = TRUE)
  direction <- ifelse(abs(sk) < 0.5, "≈ Symmetric",
                      ifelse(sk > 0, "Right skew ▶", "◀ Left skew"))
  paste0("Skew: ", round(sk, 2), " | ", direction)
}
hist_plots <- lapply(hist_vars, function(v) {
  ggplot(df, aes(x = .data[[v]])) +
    geom_histogram(bins = 12, fill = "#457B9D", color = "white", alpha = 0.88) +
    # A constant xintercept draws one mean line instead of one per row
    geom_vline(xintercept = mean(df[[v]], na.rm = TRUE),
               color = "#E63946", linetype = "dashed", linewidth = 0.9) +
    annotate("text", x = Inf, y = Inf,
             label = skew_label(df[[v]]),
             hjust = 1.05, vjust = 1.5, size = 3, color = "grey30") +
    scale_x_continuous(labels = label_comma()) +
    labs(title = var_labels[[v]],
         subtitle = "Red line = mean",
         x = NULL, y = "Frequency") +
    theme_mx
})
wrap_plots(hist_plots, ncol = 3) +
  plot_annotation(
    title = "Histograms of Selected Variables: Mexico (2022)",
    subtitle = "Source: INEGI",
    theme = theme(plot.title = element_text(face = "bold", size = 14))
  )
Histograms of the selected variables (2022)
# The Outliers and Transformacion columns are entered by hand and must follow the order of vars_of_interest
skew_df <- data.frame(
  Variable = var_labels[vars_of_interest],
  Skewness = sapply(vars_of_interest, \(v) round(e1071::skewness(df[[v]], na.rm = TRUE), 3)),
  Direccion = sapply(vars_of_interest, function(v) {
    sk <- e1071::skewness(df[[v]], na.rm = TRUE)
    ifelse(abs(sk) < 0.5, "≈ Symmetric",
           ifelse(sk > 0, "Right skew", "Left skew"))
  }),
  Outliers = c("Yes (Nuevo León, Chihuahua)", "Yes (CdMx, Q. Roo)", "Mild",
               "Mild (CdMx)", "Mild", "No", "Yes (Colima, Jalisco)"),
  Transformacion = c("Log recommended", "Log recommended", "Not required",
                     "Log optional", "Not required", "Not required", "Log recommended"),
  row.names = NULL
)
kable(skew_df,
      col.names = c("Variable", "Skewness", "Direction",
                    "Outliers?", "Suggested transformation"),
      caption = "Distribution diagnostics by variable (2022)")
| Variable | Skewness | Direction | Outliers? | Suggested transformation |
|---|---|---|---|---|
| Real Exports (USD) | 1.438 | Right skew | Yes (Nuevo León, Chihuahua) | Log recommended |
| GDP per Capita 2018 (MXN) | 1.397 | Right skew | Yes (CdMx, Q. Roo) | Log recommended |
| LQ Secondary Sector | 0.391 | ≈ Symmetric | Mild | Not required |
| Avg. Daily Salary (MXN) | 0.815 | Right skew | Mild (CdMx) | Log optional |
| Border Distance (km) | -0.521 | Left skew | Mild | Not required |
| College Education (%) | 0.377 | ≈ Symmetric | No | Not required |
| Crime Rate (per 100k) | 1.339 | Right skew | Yes (Colima, Jalisco) | Log recommended |
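The skewness figures come from `e1071::skewness()`, whose default (`type = 3`) divides the third central moment by the cube of the sample standard deviation. The base-R sketch below reimplements that formula together with the ±0.5 cut-off used in this report, and shows on a synthetic sample how a log transformation can remove right skew. The helper names `skew_type3` and `classify_skew` and all data are made up for illustration.

```r
# Skewness as m3 / s^3, with s the sample standard deviation (n - 1 denominator)
skew_type3 <- function(x) {
  x  <- x[!is.na(x)]
  m3 <- mean((x - mean(x))^3)  # third central moment
  m3 / sd(x)^3
}

# Same cut-off as the report: |skew| < 0.5 is treated as roughly symmetric
classify_skew <- function(sk, cutoff = 0.5) {
  if (abs(sk) < cutoff) {
    "approx. symmetric"
  } else if (sk > 0) {
    "right skew"
  } else {
    "left skew"
  }
}

sym   <- c(-2, -1, 0, 1, 2)
right <- c(1, 1, 2, 2, 3, 20)

print(classify_skew(skew_type3(sym)))
print(classify_skew(skew_type3(right)))
print(classify_skew(skew_type3(-right)))

# Synthetic sample whose logarithm is symmetric by construction
lognorm_like <- exp(c(-2, -1, 0, 1, 2))
print(skew_type3(lognorm_like))       # positive: long right tail
print(skew_type3(log(lognorm_like)))  # close to zero after the log
```

Real variables will rarely become perfectly symmetric after a log transformation; the synthetic sample only illustrates the direction of the effect.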
box_plots <- lapply(vars_of_interest, function(v) {
  ggplot(df, aes(x = region, y = .data[[v]], fill = region)) +
    geom_boxplot(alpha = 0.82,
                 outlier.color = "black",
                 outlier.shape = 21,
                 outlier.size = 1.8) +
    scale_fill_manual(values = region_colors) +
    scale_y_continuous(labels = label_comma()) +
    labs(title = var_labels[[v]], x = NULL, y = NULL) +
    theme_mx +
    theme(axis.text.x = element_text(angle = 35, hjust = 1, size = 7.5),
          legend.position = "none")
})
wrap_plots(box_plots, ncol = 3) +
  plot_annotation(
    title = "Boxplots by Region: Mexico (2022)",
    subtitle = "Source: INEGI | Points represent outliers",
    theme = theme(plot.title = element_text(face = "bold", size = 14))
  )
Boxplots of the selected variables by region (2022)
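The points drawn by `geom_boxplot()` follow Tukey's rule: observations lying more than 1.5 times the interquartile range beyond the quartiles. The same rule can be applied programmatically, which offers a way to cross-check outlier states that were identified by eye. The sketch below uses base R with invented values; `flag_outliers` is a hypothetical helper, and `boxplot.stats()` relies on hinges that can differ slightly from the quartiles ggplot2 computes in small samples.

```r
# Return the labels of observations flagged by the 1.5 * IQR rule
flag_outliers <- function(values, labels) {
  out <- boxplot.stats(values)$out
  labels[values %in% out]
}

# Toy data: five ordinary values and one extreme one
toy <- data.frame(
  state = c("A", "B", "C", "D", "E", "F"),
  crime = c(1, 2, 3, 4, 5, 100)
)

flagged <- flag_outliers(toy$crime, toy$state)
print(flagged)
```

To mirror the regional boxplots, the helper could be applied within each region (for example after `split()`), bearing in mind that regions with very few states give unstable quartiles.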
- Right-skewed variables (`real_exports`, `gdp_per_capita_2018`, `crime_rate`): applying a logarithmic transformation is recommended before including them in regression models.
- `border_distance` shows a slight left skew, reflecting that most states lie far from the northern border, with few exceptions (states in the Noreste/Noroeste regions).
- `lq_secondary` and `college_education` have approximately symmetric distributions and do not require transformation.
- [...] `crime_rate` and `gdp_per_capita_2018`.

Document generated automatically with R Markdown. Data: INEGI.