Introduction

The automotive industry is one of the most powerful engines of the Mexican economy. Its strength not only drives economic growth but also consolidates Mexico as a key player in global value chains.

This analysis focuses on two representative states: Aguascalientes, a consolidated automotive manufacturing hub, and Morelos, a region with logistics and production potential. It relies on interactive visualizations and data analysis.

Plan México seeks to strengthen the strategic relocation of investment (nearshoring) and to diversify exports beyond the traditional United States market. To that end, we identify new trade opportunities and evaluate how the global tariff structure can be used as a competitive advantage to attract foreign investment, particularly from players such as Nissan, which has a strong presence in the country.

Libraries

# Set working directory
setwd("~/PROFESIONAL TEC/4° Semestre C.S/Ciencia de datos para la toma de decisiones/Reto CSII")

# Libraries: install any missing package, then attach each one in the original order
# Note: once attached, tidygraph::filter() masks dplyr::filter() and stats::filter()
paquetes <- c(
  "haven", "ggplot2", "scales", "comtradr", "showtext", "dplyr", "readxl",
  "stringr", "tidyr", "igraph", "tidygraph", "ggraph", "grid", "sysfonts",
  "ggimage", "sf", "rnaturalearth", "viridis", "tidyverse", "countrycode"
)

for (p in paquetes) {
  if (!requireNamespace(p, quietly = TRUE)) install.packages(p)
  library(p, character.only = TRUE)
}

Automotive industry in Mexico: an economic engine

1 million direct jobs.
3 million indirect jobs.
4.5% of national GDP in 2024.
5th-largest producer in the world: 3,989,401 vehicles produced nationally.

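# Read the UN Comtrade API key from the COMTRADE_PRIMARY environment variable
# (e.g. set in .Renviron) instead of hard-coding it in the report
set_primary_comtrade_key(Sys.getenv("COMTRADE_PRIMARY"))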

billion <- function(value) {
  value / 1e9
}

# Fuente Lato
font_add_google("Lato", "lato")
showtext_auto()

# Paleta de colores
color_guinda <- "#9F2241"
color_guinda_claro <- "#F8C3C3"
color_fondo <- "#FFFFFF"
color_texto <- "#1A1A1A"
color_verde <- "#D4C19C" 
color_gris <- "#D9D9D9"

# Años de interés
years <- c(2014, 2018, 2022, 2024)

# Iniciar una lista vacia para guardar datos 
MEX_data_list <- list()

# Loop por cada año y obtener datos
for (yr in years) {
  data <- ct_get_data(
    reporter = "MEX",
    partner = "all_countries",
    flow_direction = c("Import", "Export"),
    start_date = yr,
    end_date = yr,
    commodity_code = "8703", #"TOTAL"
    frequency = "A"
  )
  MEX_data_list[[as.character(yr)]] <- data
}


# Combinar todos los datos en un dataframe
MEX <- bind_rows(MEX_data_list)

# Procesar y resumir datos
data1 <- MEX %>%
  filter(partner_iso != "WLD") %>%
  mutate(trade_value_usd_bn = billion(primary_value)) %>%
  group_by(flow_desc, ref_year) %>%
  summarise(trade_value_usd_bn = sum(trade_value_usd_bn, na.rm = TRUE), .groups = 'drop')

# Colores institucionales: rojo para importaciones, dorado para exportaciones
colores_flujo <- c(
  "Import" = "#9F2241",  # Rojo institucional
  "Export" = "#D4C19C"   # Dorado institucional
)

ggplot(data1, aes(x = ref_year, y = trade_value_usd_bn, color = flow_desc)) +
  geom_line(linewidth = 1.2) +  # linewidth replaces size for lines since ggplot2 3.4.0
  geom_point(size = 3) +
  scale_y_continuous(
  labels = scales::label_number(suffix = "B"),
  breaks = seq(0, 100, 20)) +
  scale_color_manual(values = colores_flujo,
                     labels = c(
                       "Import" = "Importaciones",
                       "Export" = "Exportaciones"
                     )) +
  scale_x_continuous(breaks = seq(2014, 2024, 2), expand = c(0.01, 0.01)) +
  labs(
    title = "Importaciones y Exportaciones de México (Sector automotriz)", 
    x = "Año",
    y = "Valor (miles de millones USD)",
    caption = "Data: UN Comtrade",
    color = "Flujo Comercial"
  ) +
  theme_bw(base_family = "lato") +
  theme(
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    axis.title = element_text(size = 12, face = "bold"),
    axis.text = element_text(size = 10),
    legend.title = element_text(size = 12),
    legend.text = element_text(size = 10),
    plot.caption = element_text(size = 10, face = "italic", hjust = 1)
  )


Favorable trade balance: an overall trade surplus of US$110,734M in 2024.
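As an illustration of how a trade balance is derived from import and export totals, the following minimal sketch computes exports minus imports per year. The values are hypothetical placeholders, not UN Comtrade figures, and do not reproduce the surplus quoted above; the column names simply mirror those of `data1`. It uses base R only, so it runs without extra packages.

```r
# Hypothetical yearly totals in billions of USD (illustrative values only)
flows <- data.frame(
  ref_year = c(2022, 2022, 2024, 2024),
  flow_desc = c("Export", "Import", "Export", "Import"),
  trade_value_usd_bn = c(50, 12, 60, 15)
)

# Year x flow table of totals
totals <- tapply(
  flows$trade_value_usd_bn,
  list(flows$ref_year, flows$flow_desc),
  sum
)

# Balance = exports - imports; a positive value means a surplus
balance <- totals[, "Export"] - totals[, "Import"]
print(balance)
```

With real data, the same subtraction could be applied to the yearly totals stored in `data1`.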

Tariff situation

Imposition of 25% tariffs on all countries (Mexico included).
A 15% tariff reduction was agreed for the automotive sector.
Tariffs will apply only to products that do not meet the rules of origin established by the T-MEC (USMCA).
Mexico has not imposed retaliatory tariffs.

Macroeconomic analysis

# Exportaciones a Canadá, China y Estados Unidos 

# Asignar colores 
colores_paises <- c(
  "CAN" = "#FF0000",  
  "CHN" = "#FFCC00",  
  "USA" = "#0033A0"   
)

# Flag image URLs for each partner (consumed by geom_image below)
banderas <- c(
  "CAN" = "https://flagcdn.com/w40/ca.png",
  "CHN" = "https://flagcdn.com/w40/cn.png",
  "USA" = "https://flagcdn.com/w40/us.png"
)
# Datos
q_export <- ct_get_data(
  reporter = "MEX",
  partner = c("DEU", "USA", "JPN", "CHN", "CAN"),
  flow_direction = "Export",
  start_date = 2014,
  end_date = 2024,
  commodity_code = "8703",
  frequency = "A"
)


# Gráfico de exportaciones
q_export %>%
  filter(partner_iso %in% c("USA", "CHN", "CAN")) %>%
  mutate(
    trade_value_usd_billion = primary_value / 1e9,
    flag = banderas[partner_iso]
  ) %>%
  ggplot(aes(x = ref_year, y = trade_value_usd_billion, fill = partner_iso)) +
  geom_col(position = "dodge", alpha = 0.9) +
  # Agregar banderas sobre las barras
  geom_image(
    aes(image = flag),
    position = position_dodge(width = 0.9),
    size = 0.035,
    by = "width"
  ) +
  scale_fill_manual(
    values = colores_paises,
    labels = c("CAN" = "Canadá", "CHN" = "China", "USA" = "Estados Unidos")
  ) +
  scale_y_continuous(labels = label_number(suffix = "B", accuracy = 0.1)) +
  scale_x_continuous(breaks = seq(2014, 2024, 2)) +
  labs(
    title = "Exportaciones de México a Canadá, China y Estados Unidos",
    subtitle = "Sector automotriz",
    x = "Año",
    y = "Valor (miles de millones USD)",
    fill = "Socio comercial",
    caption = "Fuente: UN Comtrade"
  ) +
  theme_bw(base_family = "lato") +
  theme(
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    plot.subtitle = element_text(size = 14, color = color_texto, hjust = 0.5),
    axis.title = element_text(size = 12, face = "bold"),
    axis.text = element_text(size = 10),
    legend.title = element_text(size = 12),
    legend.text = element_text(size = 10),
    plot.caption = element_text(size = 10, face = "italic", hjust = 1, color = color_texto)
  )

# Imports from Canada, China and the United States

# Datos
q <- ct_get_data(
  reporter = "MEX",
  partner = c("DEU", "USA", "JPN", "CHN", "CAN"),
  flow_direction = "Import",
  start_date = 2014,
  end_date = 2024,
  commodity_code = "8703",
  frequency = "A"
)

q %>%
  filter(partner_iso %in% c("USA", "CHN", "CAN")) %>%
  mutate(
    trade_value_usd_billion = primary_value / 1e9,
    flag = banderas[partner_iso]
  ) %>%
  ggplot(aes(x = ref_year, y = trade_value_usd_billion, fill = partner_iso)) +
  geom_col(position = "dodge", alpha = 0.9) +
  # Añadir banderas sobre las barras
  geom_image(
    aes(image = flag),
    position = position_dodge(width = 0.9),
    size = 0.035,  # Ajusta el tamaño según lo necesites
    by = "width"
  ) +
  scale_fill_manual(
    values = colores_paises,
    labels = c("CAN" = "Canadá", "CHN" = "China", "USA" = "Estados Unidos")
  ) +
  scale_y_continuous(labels = label_number(suffix = "B", accuracy = 0.1)) +
  scale_x_continuous(breaks = seq(2014, 2024, 2)) +
  labs(
    title = "Importaciones a México desde Canadá, China y Estados Unidos",
    subtitle = "Sector automotriz",
    x = "Año",
    y = "Valor (miles de millones USD)",
    fill = "Socio comercial",
    caption = "Fuente: UN Comtrade"
  ) +
  theme_bw(base_family = "lato") +
  theme(
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    axis.title = element_text(size = 12, face = "bold"),
    plot.subtitle = element_text(size = 14, color = color_texto, hjust = 0.5),
    axis.text = element_text(size = 10),
    legend.title = element_text(size = 12),
    legend.text = element_text(size = 10),
    plot.caption = element_text(size = 10, face = "italic", hjust = 1, color = color_texto)
  )

# Top 5 socios comerciales 

# Procesamiento
top_n <- 5
yr <- 2024

top_partners <- MEX %>% 
  filter(partner_iso != "WLD", ref_year == yr) %>%
  group_by(partner_iso) %>%
  summarise(total = sum(primary_value), .groups = "drop") %>%
  arrange(desc(total)) %>%
  slice_head(n = top_n) %>%
  pull(partner_iso)

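# Note: the chart uses a fixed list of five partners; top_partners (computed above) is not applied here
data_top <- MEX %>%
  filter(ref_year == yr, partner_iso %in% c("USA", "CHN", "CAN", "DEU", "JPN")) %>%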
  mutate(
    partner_iso = as.character(as_factor(partner_iso)),  # Convierte labelled a character
    trade_value_usd_bn = primary_value / 1e9,
    partner_name = dplyr::recode(partner_iso,
                                 "USA" = "Estados Unidos",
                                 "CHN" = "China",
                                 "CAN" = "Canadá",
                                 "DEU" = "Alemania",
                                 "JPN" = "Japón")
  )


# Gráfico combinado
ggplot(data_top, aes(x = reorder(partner_name, -trade_value_usd_bn), y = trade_value_usd_bn, fill = flow_desc)) +
  geom_col(position = "dodge", color = "grey20", width = 0.6) +
  geom_text(
    aes(label = number(trade_value_usd_bn, accuracy = 0.1)),
    position = position_dodge(width = 0.6),
    vjust = -0.5,
    size = 3,
    family = "lato"
  ) +
  scale_fill_manual(
    values = c("Import" = color_guinda, "Export" = color_verde),
    labels = c("Import" = "Importaciones", "Export" = "Exportaciones")
  ) +
  scale_y_continuous(
    expand = expansion(mult = c(0, 0.1)),
    labels = label_number(suffix = "B")
  ) +
  labs(
    title = "Top 5 socios comerciales de México (2024)",
    subtitle = "Importaciones y exportaciones del sector automotriz (Cód. 8703)",
    x = NULL,
    y = "Valor en miles de millones USD",
    fill = "Flujo comercial",
    caption = "Fuente: UN Comtrade"
  ) +
  theme_minimal(base_family = "lato") +
  theme(
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    plot.subtitle = element_text(size = 13, hjust = 0.5, color = color_texto),
    axis.text = element_text(size = 10, color = color_texto),
    axis.title = element_text(size = 12, face = "bold", color = color_texto),
    legend.position = "bottom",
    legend.title = element_text(face = "bold"),
    plot.caption = element_text(size = 10, face = "italic", hjust = 1)
  )

# Relationship between the automotive sector and other economic activities: node graph

# Cargar base de datos
mip <- MIP_44_JustSCIAN <- read_excel("~/PROFESIONAL TEC/4° Semestre C.S/Ciencia de datos para la toma de decisiones/Reto CSII/data/MIP_44_JustSCIAN.xlsx", skip =5)

## New names:
## • `` -> `...1`
## • `` -> `...2`
## • `Total` -> `Total...3`
## • `Total` -> `Total...82`

# Renombrar y limpiar nombres
names(mip)[1] <- "from"
colnames(mip) <- str_trim(colnames(mip))

# Seleccionar columnas SCIAN y transformar a formato largo
scian_cols <- colnames(mip)[str_detect(colnames(mip), "^\\d{3}")]

mip_long <- mip %>%
  select(from, all_of(scian_cols)) %>%
  pivot_longer(cols = -from, names_to = "to", values_to = "valor") %>%
  drop_na(valor) %>%
  filter(valor > 0) %>%
  mutate(
    from_code = str_extract(from, "^\\d{3}"),
    to_code = str_extract(to, "^\\d{3}")
  )

# Crear red dirigida
network <- as_tbl_graph(mip_long, directed = TRUE)

# Nodo clave
sector_336 <- "336 - Fabricación de equipo de transporte"

# Graficar red centrada en sector 336
network %>%
  activate(nodes) %>%
  mutate(code = str_extract(name, "^\\d{3}")) %>%
  activate(edges) %>%
  filter(
    from == which(V(.)$name == sector_336) |
      to == which(V(.)$name == sector_336)
  ) %>%
  ggraph(layout = "linear") +
  geom_edge_arc(
    aes(edge_colour = valor),
    arrow = arrow(angle = 15, length = unit(0.3, "cm"), ends = "last", type = "closed"),
    edge_width = 1.1,
    show.legend = TRUE
  ) +
  scale_edge_colour_gradient(
    low = color_guinda_claro,
    high = color_guinda,
    name = "Valor económico",
    labels = label_number(scale_cut = cut_short_scale()),
    breaks = scales::pretty_breaks(n = 4)
  ) +
  geom_node_point(color = color_guinda, size = 4) +
  geom_node_label(
    aes(label = code),
    size = 3.5,
    family = "lato",
    fill = color_fondo,
    color = color_texto,
    label.size = 0.25,
    repel = TRUE,
    max.overlaps = Inf
  ) +
  theme_graph(base_family = "lato", base_size = 14) +
  labs(
    title = "Relaciones Insumo-Producto centradas en el sector 336",
    subtitle = "Fabricación de equipo de transporte\n(Millones de Pesos a Precios Básicos)",
    caption = "Fuente: INEGI, Matriz Insumo Producto 2018"
  ) +
  theme(
    plot.background = element_rect(fill = color_fondo, color = NA),
    plot.title = element_text(size = 18, face = "bold", hjust = 0.5, color = color_texto),
    plot.subtitle = element_text(size = 13, hjust = 0.5, color = color_texto),
    plot.caption = element_text(size = 10, hjust = 1, color = color_texto),
    legend.position = "bottom",
    legend.direction = "horizontal",
    legend.box.margin = margin(t = 15, r = 0, b = 0, l = 0),
    legend.title = element_text(face = "bold", color = color_texto, family = "lato", size = 11),
    legend.text = element_text(color = color_texto, family = "lato", size = 10),
    legend.key.width = unit(1, "cm")
  )

Explanation: This is a graphical analysis, from an economic perspective, of the input-output relationships of sector "336 - Fabricación de equipo de transporte" in Mexico, based on the 2018 Input-Output Matrix (Matriz de Insumo-Producto, MIP). Using packages such as tidyverse, tidygraph and ggraph, a directed network was built to show how this sector relates to the rest of the country's productive system, both as a consumer of inputs and as a supplier of intermediate products.

The methodology starts by loading MIP_44_JustSCIAN.xlsx, a modified version of the matrix that includes only actual economic sectors coded under the SCIAN system. The first five rows were skipped and the column names were trimmed. Only columns whose headers begin with a three-digit SCIAN code were kept, which removes aggregate items such as final consumption, taxes and exports. The matrix was then reshaped from wide to long format with pivot_longer(), so that each sectoral relationship becomes one observation with an origin sector, a destination sector and the economic value of the flow. All relationships with missing or zero values were removed. From this table a directed graph was built, and only the relationships in which sector 336 takes part as origin node (supplier) or destination node (buyer) were kept.

The final visualization was produced with ggraph(), using a linear layout to make the flows easier to read and tell apart. Arrows show the direction of each flow, and their color ranges from light pink to burgundy (guinda) according to the magnitude of the value. Each node represents an economic sector identified by its SCIAN code. Arrows pointing to node 336 are inputs that the sector buys from other sectors; arrows leaving node 336 are goods and services that it delivers to other sectors.

The result shows two interconnected spindle-shaped bundles of arcs with sector 336 at the center. This is consistent with the nature of the automotive sector, which depends on a wide variety of inputs (metals, electronic components, plastics, energy, specialized services) and in turn supplies other production chains such as manufacturing, logistics and technical services. The many light-colored arrows indicate numerous low-intensity economic relationships, while the few dark arrows reveal strong dependence on certain key sectors.

Among the sectors most connected to 336, either as suppliers or as users of its output, the following stand out: 484 - Autotransporte de carga, 461 - Comercio al por menor, 339 - Otras industrias manufactureras, 431 - Comercio al por mayor, 332 - Fabricación de productos metálicos, 335 - Fabricación de equipo eléctrico y aparatos electrónicos, 236 - Edificación, 811 - Servicios de reparación y mantenimiento, and 336 - Fabricación de equipo de transporte itself, which is linked to itself.
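The reshaping and filtering steps described above can be sketched on a toy matrix. This is a minimal illustration under stated assumptions: the three sectors, their short labels and all flow values are hypothetical, and base R (`as.table()`, `substr()`) stands in for the `pivot_longer()` and `str_extract()` calls used in the report.

```r
# Hypothetical 3-sector input-output matrix (rows sell to columns); values are made up
sectors <- c("331 - Metals", "336 - Transport equipment", "484 - Freight")
io <- matrix(
  c(0, 120, 5,
    10, 300, 40,
    0, 25, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(from = sectors, to = sectors)
)

# Wide to long: one row per (from, to) pair with its flow value
io_long <- as.data.frame(as.table(io), responseName = "valor", stringsAsFactors = FALSE)

# Drop empty flows
io_long <- io_long[io_long$valor > 0, ]

# Keep only flows where sector 336 is the origin or the destination
code_from <- substr(io_long$from, 1, 3)
code_to <- substr(io_long$to, 1, 3)
edges_336 <- io_long[code_from == "336" | code_to == "336", ]
print(edges_336)
```

Each remaining row corresponds to one arrow in the network plot; the only flow dropped by the last filter is the one between the two sectors other than 336.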

# Top 10 sectores más vinculados económicamente al sector automotriz

# Dependencia por compras AL sector 336 (ellos venden al 336)
hacia_336 <- mip_long %>%
  filter(to_code == "336") %>%
  group_by(from) %>%
  summarise(valor = sum(valor, na.rm = TRUE)) %>%
  rename(sector = from, valor_hacia_336 = valor)

# Dependencia por ventas DESDE el sector 336 (ellos compran al 336)
desde_336 <- mip_long %>%
  filter(from_code == "336") %>%
  group_by(to) %>%
  summarise(valor = sum(valor, na.rm = TRUE)) %>%
  rename(sector = to, valor_desde_336 = valor)

# Unir ambas dependencias
dependencia_total <- full_join(hacia_336, desde_336, by = "sector") %>%
  mutate(across(starts_with("valor"), ~replace_na(., 0))) %>%
  mutate(valor_total = valor_hacia_336 + valor_desde_336) %>%
  arrange(desc(valor_total)) %>%
  slice_head(n = 10)

# Ver resultado
print(dependencia_total)

## # A tibble: 10 × 4
##    sector                            valor_hacia_336 valor_desde_336 valor_total
##    <chr>                                       <dbl>           <dbl>       <dbl>
##  1 336 - Fabricación de equipo de t…        1050959.        1050959.    2101918.
##  2 334 - Fabricación de equipo de c…         203891.          68605.     272496.
##  3 331 - Industrias metálicas básic…         222255.          12690.     234945.
##  4 461 - Comercio al por menor de a…         176569           43108.     219677.
##  5 431 - Comercio al por mayor de a…         178045.          39449.     217494.
##  6 326 - Industria del plástico y d…         191794.          10083.     201877.
##  7 333 - Fabricación de maquinaria …         143195.          32745.     175940.
##  8 335 - Fabricación de accesorios,…         105097.          29624.     134721.
##  9 332 - Fabricación de productos m…         113993.          20109.     134102.
## 10 484 - Autotransporte de carga              50055.          72534.     122589.
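In the table above, the first row adds the same 336-to-336 flow twice, because that single cell of the matrix is captured both by the filter on to_code and by the filter on from_code. The sketch below shows one possible way to count the self-flow once; it is an assumption about how the total could be adjusted, not part of the original analysis. The values are the rounded figures printed in the table, and `valor_total_adj` is a hypothetical column name.

```r
# Values copied from the printed table (rounded as displayed); sector codes only
dep <- data.frame(
  sector = c("336", "334", "484"),
  valor_hacia_336 = c(1050959, 203891, 50055),
  valor_desde_336 = c(1050959, 68605, 72534)
)

# The 336 -> 336 flow is the same matrix cell in both columns,
# so count it once for that row and add both directions for the other rows
is_self <- dep$sector == "336"
dep$valor_total_adj <- ifelse(
  is_self,
  dep$valor_hacia_336,
  dep$valor_hacia_336 + dep$valor_desde_336
)
print(dep)
```

Whether the self-flow should be counted once or excluded altogether depends on the question being asked; the ranking of the other sectors is unaffected either way.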

# Reorganizar datos en formato largo para graficar más fácil
dependencia_larga <- dependencia_total %>%
  select(sector, valor_hacia_336, valor_desde_336) %>%
  pivot_longer(
    cols = starts_with("valor_"),
    names_to = "tipo_relacion",
    values_to = "valor"
  ) %>%
  mutate(
    tipo_relacion = as.character(tipo_relacion),
    tipo_relacion = dplyr::recode(tipo_relacion,
                                  "valor_hacia_336" = "Vende al sector (proveedor)",
                                  "valor_desde_336" = "Compra al sector (cliente)"
    ),
    sector = str_trim(str_remove(sector, "^\\d{3}\\s*-\\s*"))  # Limpiar nombre del sector
  ) %>%
  group_by(sector) %>%
  mutate(valor_total = sum(valor)) %>%
  ungroup()

# Identify the most linked sector (highest total value); n = 1 keeps a single sector
top_sectores <- dependencia_larga %>%
  group_by(sector) %>%
  summarise(valor_total = sum(valor)) %>%
  slice_max(order_by = valor_total, n = 1) %>%
  pull(sector)

# Add conditional labels only for the sector(s) selected in top_sectores
dependencia_larga <- dependencia_larga %>%
  mutate(
    mostrar_label = ifelse(sector %in% top_sectores, comma(valor, accuracy = 1), NA)
  )

# Graficar
# Formato 
tema_SE <- function(base_size = 12) {
  theme_minimal(base_family = "lato", base_size = base_size) +
    theme(
      plot.background = element_rect(fill = color_fondo, color = NA),
      panel.grid.major = element_line(color = color_gris),
      panel.grid.minor = element_blank(),
      plot.title = element_text(size = base_size + 4, face = "bold", color = color_texto, hjust = 0.5),
      plot.subtitle = element_text(size = base_size + 1, color = color_texto, hjust = 0.5),
      plot.caption = element_text(size = base_size - 2, color = color_texto, hjust = 1, face = "italic"),
      axis.title = element_text(face = "bold", color = color_texto),
      axis.text = element_text(color = color_texto),
      legend.position = "bottom",
      legend.title = element_text(face = "bold", color = color_texto),
      legend.text = element_text(color = color_texto)
    )
}
  
# Añadir etiquetas centradas y envolver nombres de sector
dependencia_larga <- dependencia_larga %>%
  mutate(
    mostrar_label = ifelse(sector %in% top_sectores, comma(valor, accuracy = 1), NA),
    label_y = valor / 2,
    sector_wrapped = str_wrap(sector, width = 25)
  )

# Gráfico
ggplot(dependencia_larga, aes(x = reorder(sector_wrapped, valor_total), y = valor, fill = tipo_relacion)) +
  geom_col(position = position_dodge(width = 0.7), width = 0.6, color = "grey20") +
  geom_text(
    aes(y = label_y, label = mostrar_label),
    position = position_dodge(width = 0.7),
    size = 3,
    color = "black",
    family = "lato",
    na.rm = TRUE
  ) +
  scale_fill_manual(
    values = c(
      "Compra al sector (cliente)" = color_verde,
      "Vende al sector (proveedor)" = color_guinda
    )
  ) +
  scale_y_continuous(
    labels = label_number(scale = 1, suffix = "M", accuracy = 1),
    expand = expansion(mult = c(0, 0.05))
  ) +
  coord_flip() +
  labs(
    title = "Top 10 sectores más vinculados económicamente al sector automotriz",
    subtitle = "Relaciones económicas por compras y ventas al sector",
    x = NULL,
    y = "Millones de pesos a precios básicos",
    caption = "Fuente: INEGI, Matriz Insumo Producto 2018",
    fill = "Tipo de relación"
  ) +
  tema_SE(base_size = 12) +
  theme(
    plot.margin = margin(t = 10, r = 10, b = 10, l = 25),
    legend.key.width = unit(2, "cm"),
    axis.text.y = element_text(margin = margin(r = 5))
  )

## FDI (IED) in Mexico, 1999 - 2024

ied <- read_csv("~/PROFESIONAL TEC/4° Semestre C.S/Ciencia de datos para la toma de decisiones/Reto CSII/data/Flujo-anual-de-IED-en-Fabricacion-de-Automoviles-y-Camiones.csv")

## Rows: 26 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Industry Group, Year_
## dbl (3): Industry Group ID, Time ID, Investment (USD)
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# Limpiar datos
ied <- ied %>%
  mutate(`Time ID` = as.numeric(`Time ID`)) %>%
  filter(!is.na(`Investment (USD)`))

# Crear gráfico
ggplot(ied, aes(x = `Time ID`, y = `Investment (USD)`)) +
  geom_line(color = color_guinda, linewidth = 1.5) +  # linewidth replaces size for lines since ggplot2 3.4.0
  geom_point(color = color_guinda, size = 4) +
  scale_y_continuous(
    labels = label_number(scale = 1e-9, suffix = "B", accuracy = 0.1),
    expand = expansion(mult = c(0, 0.1))
  ) +
  scale_x_continuous(breaks = unique(ied$`Time ID`))+
  labs(
    title = "Evolución Anual de la Inversión Extranjera Directa (IED)",
    subtitle = "Fabricación de Automóviles y Camiones en México",
    x = "Año",
    y = "Inversión (miles de millones de USD)",
    caption = "Fuente: Secretaría de Economía"
  ) +
  theme_minimal(base_size = 14, base_family = "Arial") +
  theme(
    plot.background = element_rect(fill = color_fondo, color = NA),
    panel.grid.major = element_line(color = color_gris),
    panel.grid.minor = element_blank(),
    plot.title = element_text(size = 20, face = "bold", hjust = 0.5, color = color_texto),
    plot.subtitle = element_text(size = 14, hjust = 0.5, margin = margin(b = 15), color = color_texto),
    plot.caption = element_text(size = 10, hjust = 1, face = "italic", color = color_texto),
    axis.text.x = element_text(angle = 45, hjust = 1, color = color_texto),
    axis.text.y = element_text(color = color_texto),
    axis.title.x = element_text(margin = margin(t = 15), face = "bold", color = color_texto),
    axis.title.y = element_text(margin = margin(r = 15), face = "bold", color = color_texto)
  )
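
The y-axis labels in the chart above come from `label_number(scale = 1e-9, suffix = "B", accuracy = 0.1)`, which rescales raw USD amounts to billions before printing them. A minimal base-R sketch of the same arithmetic follows. The helper `to_billions()` and the sample values are hypothetical and only illustrate the conversion; rounding of exact ties may differ from what `scales` does.

```r
# Hypothetical helper: format raw USD amounts as billions with one decimal,
# roughly what label_number(scale = 1e-9, suffix = "B", accuracy = 0.1) prints.
to_billions <- function(usd, digits = 1) {
  paste0(formatC(usd / 1e9, format = "f", digits = digits), "B")
}

# Made-up investment values in USD (not taken from the dataset)
sample_usd <- c(2.5e9, 1.234e10, 7.4e8)
labels <- to_billions(sample_usd)
print(labels)
```

Keeping the data in raw USD and rescaling only in the labels means the plotted values stay untouched, so the same column can be reused in other charts with a different unit.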

## Warning in grid.Call(C_stringMetric, as.graphicsAnnot(x$label)): font family
## 'Arial' not found, will use 'sans' instead

Strategic regions

## Investment by state, January - December 2024
# Data and name equivalences (same as before)
ied_2024 <- read_csv("~/PROFESIONAL TEC/4° Semestre C.S/Ciencia de datos para la toma de decisiones/Reto CSII/data/IED-segun-entidad-federativa--Periodo-enero-a-diciembre-de-2024.csv",
                locale = locale(decimal_mark = ".", grouping_mark = ","))

## Rows: 15 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): State, Industry Group
## dbl (4): State ID, Year, Industry Group ID, Investment (USD)
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

ied_limpio <- ied_2024 %>%
  filter(`Industry Group` == "Fabricación de Automóviles y Camiones",
         `Investment (USD)` >= 0) %>%
  select(State, inversion = `Investment (USD)`)

equivalencias <- tibble(
  State = c("Aguascalientes", "Baja California", "Baja California Sur", "Campeche",
            "Coahuila de Zaragoza", "Colima", "Chiapas", "Chihuahua", "Ciudad de México",
            "Durango", "Guanajuato", "Guerrero", "Hidalgo", "Jalisco", "Estado de México",
            "Michoacán de Ocampo", "Morelos", "Nayarit", "Nuevo León", "Oaxaca",
            "Puebla", "Querétaro", "Quintana Roo", "San Luis Potosí",
            "Sinaloa", "Sonora", "Tabasco", "Tamaulipas", "Tlaxcala",
            "Veracruz de Ignacio de la Llave", "Yucatán", "Zacatecas"),
  name = c("Aguascalientes", "Baja California", "Baja California Sur", "Campeche",
           "Coahuila", "Colima", "Chiapas", "Chihuahua", "Ciudad de Mexico",
           "Durango", "Guanajuato", "Guerrero", "Hidalgo", "Jalisco", "Mexico",
           "Michoacan", "Morelos", "Nayarit", "Nuevo Leon", "Oaxaca",
           "Puebla", "Queretaro", "Quintana Roo", "San Luis Potosi",
           "Sinaloa", "Sonora", "Tabasco", "Tamaulipas", "Tlaxcala",
           "Veracruz", "Yucatan", "Zacatecas")
)

ied_mapa <- ied_limpio %>%
  left_join(equivalencias, by = "State")

mexico_mapa <- ne_states(country = "Mexico", returnclass = "sf")

mapa_ied <- mexico_mapa %>%
  left_join(ied_mapa, by = c("name" = "name"))

# Keep only Aguascalientes and Morelos for the highlight layer
destacados <- mapa_ied %>%
  filter(name %in% c("Aguascalientes", "Morelos"))

# Plot
ggplot(mapa_ied) +
  geom_sf(aes(fill = inversion), color = "white", size = 0.2) +
  scale_fill_viridis(
    option = "plasma",
    name = "Inversión Extranjera Directa (USD)",
    direction = -1,
    labels = scales::comma
  ) +
  # Layer that highlights Aguascalientes and Morelos
  geom_sf(data = destacados, fill = NA, color = "black", size = 2.5) +  # thick black border
  geom_sf(data = destacados, aes(fill = inversion), color = NA) +        # regular fill so the states keep their color
  labs(
    title = "Inversión Extranjera Directa en Fabricación de Automóviles y Camiones",
    subtitle = "Por entidad federativa en México, Enero - Diciembre 2024",
    caption = "Fuente: Secretaría de Economía"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    plot.subtitle = element_text(size = 12, hjust = 0.5, margin = margin(b = 10)),
    plot.caption = element_text(size = 9, hjust = 1, face = "italic"),
    legend.position = "right",
    legend.title = element_text(size = 11),
    legend.text = element_text(size = 10),
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    panel.grid = element_blank()
  )
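
The map above depends on a hand-written lookup table (`equivalencias`) that translates the state names in the CSV into the names used by the map geometry. With a left join, any state whose spelling is missing from the lookup gets `NA` for `name` and then never matches a polygon. A minimal sketch, in base R and with made-up numbers, of how unmatched names could be detected before plotting; `lookup`, `ied_demo` and the investment figures are hypothetical.

```r
# Hypothetical lookup: CSV state names -> map names (small ASCII-only subset)
lookup <- data.frame(
  State = c("Aguascalientes", "Coahuila de Zaragoza", "Morelos",
            "Veracruz de Ignacio de la Llave"),
  name  = c("Aguascalientes", "Coahuila", "Morelos", "Veracruz"),
  stringsAsFactors = FALSE
)

# Hypothetical data: the last row uses a spelling the lookup does not cover
ied_demo <- data.frame(
  State     = c("Aguascalientes", "Morelos", "Coahuila"),
  inversion = c(100, 50, 75),
  stringsAsFactors = FALSE
)

# Base-R counterpart of a left join: keep every data row, add `name` when matched
joined <- merge(ied_demo, lookup, by = "State", all.x = TRUE)

# State names that found no match in the lookup table
unmatched <- joined$State[is.na(joined$name)]
print(unmatched)
```

Running a check like this right after the join makes a misspelled or missing state visible as a name in the console instead of as an unexplained blank area on the map.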

## Fabricación de Automóviles y Camiones

# Load the data
datos <- read_csv("~/PROFESIONAL TEC/4° Semestre C.S/Ciencia de datos para la toma de decisiones/Reto CSII/data/Distribucion-de-Unidades-economicas-segun-entidades-federativas-2019.csv")

## Rows: 16 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): State, Industry Group
## dbl (3): State ID, Industry Group ID, Economic Unit
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# Select and rename the relevant columns
datos_limpios <- datos %>%
  select(State, unidades = `Economic Unit`)

# Build the lookup table between the state names in the CSV and those in the shapefile
equivalencias <- tibble(
  State = c("Aguascalientes", "Baja California", "Baja California Sur", "Campeche",
            "Coahuila de Zaragoza", "Colima", "Chiapas", "Chihuahua", "Ciudad de México",
            "Durango", "Guanajuato", "Guerrero", "Hidalgo", "Jalisco", "Estado de México",
            "Michoacán de Ocampo", "Morelos", "Nayarit", "Nuevo León", "Oaxaca",
            "Puebla", "Querétaro", "Quintana Roo", "San Luis Potosí",
            "Sinaloa", "Sonora", "Tabasco", "Tamaulipas", "Tlaxcala",
            "Veracruz de Ignacio de la Llave", "Yucatán", "Zacatecas"),
  name = c("Aguascalientes", "Baja California", "Baja California Sur", "Campeche",
           "Coahuila", "Colima", "Chiapas", "Chihuahua", "Ciudad de Mexico",
           "Durango", "Guanajuato", "Guerrero", "Hidalgo", "Jalisco", "Mexico",
           "Michoacan", "Morelos", "Nayarit", "Nuevo Leon", "Oaxaca",
           "Puebla", "Queretaro", "Quintana Roo", "San Luis Potosi",
           "Sinaloa", "Sonora", "Tabasco", "Tamaulipas", "Tlaxcala",
           "Veracruz", "Yucatan", "Zacatecas")
)

# Join the lookup table to the data
datos_mapa <- datos_limpios %>%
  left_join(equivalencias, by = "State")

# Load the map of Mexico
mexico_mapa <- ne_states(country = "Mexico", returnclass = "sf")

# Join the data to the shapefile
mapa_datos <- mexico_mapa %>%
  left_join(datos_mapa, by = c("name" = "name"))

# Coordinates of the Nissan plants
nissan_coords <- data.frame(
  lon = c(-99.17287261213131, -102.28118936691124, -102.27862730540257, -102.29207190845837),
  lat = c(18.90886934393236, 21.80447038234521, 21.7420485056283, 21.72599408991313),
  planta = c("Nissan Morelos", "Nissan A1", "Nissan A2", "Compass")
)

# Flag the states to highlight
mapa_datos <- mapa_datos %>%
  mutate(
    resaltar = ifelse(name %in% c("Morelos", "Aguascalientes"), "Resaltado", "Normal")
  )

ggplot(mapa_datos) +
  # Base map layer
  geom_sf(aes(fill = unidades), color = "white", size = 0.2) +
  
  # Highlight layer with a thick black outline
  geom_sf(
    data = filter(mapa_datos, resaltar == "Resaltado"),
    fill = NA,
    color = "black",       
    size = 2.5,            
    show.legend = FALSE
  ) +
  
  # Nissan plants as black triangles (positions jittered by up to 0.1 degrees)
  geom_jitter(data = nissan_coords, 
              aes(x = lon, y = lat, color = "Fábricas automotrices (Nissan)"), 
              shape = 17, size = 2.8, width = 0.1, height = 0.1) +
  scale_color_manual(
    name = "", 
    values = c("Fábricas automotrices (Nissan)" = "black")
  ) +
  
  # Map fill using the viridis palette
  scale_fill_viridis(
    option = "plasma",
    name = "Unidades Económicas",
    direction = -1,
    labels = scales::comma
  ) +
  
  # Chart styling
  theme_minimal(base_size = 12) +
  labs(
    title = "Distribución Estatal de Unidades Económicas en la Industria Automotriz",
    subtitle = "Fabricación de Automóviles y Camiones en México",
    caption = "Fuente: INEGI, Censos Económicos 2019"
  ) +
  theme(
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    plot.subtitle = element_text(size = 12, hjust = 0.5, margin = margin(b = 10)),
    plot.caption = element_text(size = 9, hjust = 1, face = "italic"),
    legend.position = "right",
    legend.title = element_text(size = 11),
    legend.text = element_text(size = 10),
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    axis.title = element_blank(),
    panel.grid = element_blank()
  )
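
The Nissan plants are drawn with `geom_jitter()`, which moves each point by a random offset of up to `width` and `height` in each direction, so the markers land in slightly different places every time the map is rendered. ggplot2's `position_jitter()` accepts a `seed` argument for reproducible offsets. The base-R sketch below illustrates the idea with a hypothetical `jitter_coords()` helper; it is an illustration of seeded jitter, not a copy of ggplot2's implementation.

```r
# Hypothetical helper: add a uniform random offset in [-amount, amount],
# resetting the random seed first so the result is repeatable.
jitter_coords <- function(x, amount, seed) {
  set.seed(seed)
  x + runif(length(x), min = -amount, max = amount)
}

# Longitudes of the three Aguascalientes plants listed above (rounded)
lon <- c(-102.2812, -102.2786, -102.2921)

run_1 <- jitter_coords(lon, amount = 0.1, seed = 123)
run_2 <- jitter_coords(lon, amount = 0.1, seed = 123)

# Same seed, same offsets; every point stays within 0.1 degrees of its origin
print(identical(run_1, run_2))
print(all(abs(run_1 - lon) <= 0.1))
```

An offset of 0.1 degrees is on the order of ten kilometers at this latitude, which is worth keeping in mind when reading the markers as exact plant locations.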

Morelos and Aguascalientes

library(readr)
library(purrr)
library(janitor)

## Warning: package 'janitor' was built under R version 4.4.3

## 
## Adjuntando el paquete: 'janitor'

## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test

# Base path to the data files
ruta_base <- "~/PROFESIONAL TEC/4° Semestre C.S/Ciencia de datos para la toma de decisiones/Reto CSII/data"

# Available years
anios <- 2018:2024

# Read every yearly file for a state and tag each row with its year and state
leer_datos_estado <- function(estado) {
  nombres_archivos <- paste0("Productos-", anios, "-Clic-en-el-grafico-para-seleccionar.csv")
  rutas_completas <- file.path(ruta_base, estado, nombres_archivos)

  map2_dfr(rutas_completas, anios, ~{
    if (file.exists(.x)) {
      df <- read_csv(.x)
      df$anio <- .y
      df$estado <- estado
      df
    } else {
      warning(paste("Archivo no encontrado:", .x))
      NULL
    }
  })
}

# Read the data for both states
datos_morelos <- leer_datos_estado("Morelos")

## Rows: 159 Columns: 8

## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): Chapter 4 Digit, HS2 4 Digit, HS4 4 Digit
## dbl (5): Chapter 4 Digit ID, HS2 4 Digit ID, HS4 4 Digit ID, Trade Value, Share
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
## (The same column specification is reported for the remaining six files: 123, 124, 144, 114, 112, and 121 rows, 8 columns each.)

datos_ags <- leer_datos_estado("Aguascalientes")

## Rows: 127 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): Chapter 4 Digit, HS2 4 Digit, HS4 4 Digit
## dbl (5): Chapter 4 Digit ID, HS2 4 Digit ID, HS4 4 Digit ID, Trade Value, Share
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
## (The same column specification is reported for the remaining six files: 121, 111, 123, 124, 112, and 122 rows, 8 columns each.)

# Combine
datos_comb <- bind_rows(datos_morelos, datos_ags)

# Filter by HS2 product code 1787
datos_filtrados <- datos_comb %>%
  filter(`HS2 4 Digit ID` == 1787)

# Colors for the states
colores <- c("Morelos" = "#800000", "Aguascalientes" = "#004B8D")

# Combined plot
ggplot(datos_filtrados, aes(x = factor(anio), y = `Trade Value`, fill = estado)) +
  geom_col(position = position_dodge(width = 0.7), width = 0.6) +
  scale_fill_manual(values = colores) +
  scale_y_continuous(labels = scales::label_number(scale = 1e-6, suffix = " M", accuracy = 1),
                     expand = expansion(mult = c(0, 0.05))) +
  labs(
    title = "Exportaciones de vehículos y autopartes",
    subtitle = "Producto HS2 1787 – Evolución anual del valor exportado",
    x = "Año",
    y = "Valor de exportación (Millones USD)",
    fill = "Estado",
    caption = "Fuente: Secretaría de Economía"
  ) +
  tema_SE(base_size = 12) +
  theme(legend.position = "top")

# Share of total exports
share_morelos <- datos_morelos %>%
  filter(`HS2 4 Digit ID` == 1787, !is.na(Share)) %>%
  distinct(anio, .keep_all = TRUE) %>%
  arrange(anio)

share_ags <- datos_ags %>%
  filter(`HS2 4 Digit ID` == 1787, !is.na(Share)) %>%
  distinct(anio, .keep_all = TRUE) %>%
  arrange(anio)

# Add the Estado column
share_morelos <- share_morelos %>% mutate(Estado = "Morelos")
share_ags <- share_ags %>% mutate(Estado = "Aguascalientes")

# Join the datasets
share_todos <- bind_rows(share_morelos, share_ags)

# Plot the comparison
ggplot(share_todos, aes(x = anio, y = Share / 100, color = Estado)) +
  geom_line(linewidth = 1.5) +
  geom_point(size = 3.5) +
  geom_text(
    aes(label = scales::percent(Share / 100, accuracy = 0.1)),
    vjust = -1,
    size = 3.5,
    family = "lato",
    color = "black",
    show.legend = FALSE
  ) +
  scale_x_continuous(breaks = unique(share_todos$anio)) +
  scale_y_continuous(
    labels = label_percent(),
    limits = c(
      min(share_todos$Share) / 100 * 0.95,
      max(share_todos$Share) / 100 * 1.05
    )
  ) +
  scale_color_manual(values = c("Morelos" = "#800000", "Aguascalientes" = "#004B8D")) +
  labs(
    title = "Participación del sector automotriz en las exportaciones totales",
    subtitle = "Comparación anual Morelos vs. Aguascalientes",
    x = "Año",
    y = "Participación (%)",
    color = "Estado",
    caption = "Fuente: Secretaría de Economía"
  ) +
  tema_SE(base_size = 12)
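
The `limits` in the plot above convert `Share` to a proportion and pad the observed range by 5% on each side. A minimal sketch of that arithmetic, assuming `Share` is stored in percentage points (as the division by 100 in the plot suggests); `share_demo` holds made-up values, not figures from the dataset:

```r
# Hypothetical shares in percentage points (not values from the dataset)
share_demo <- c(35.2, 48.7, 60.1)

# Convert to proportions, then pad the range by 5% below and above
limites <- c(
  min(share_demo) / 100 * 0.95,
  max(share_demo) / 100 * 1.05
)

# Equivalent to multiplying the raw shares by 0.0095 and 0.0105
print(limites)
```

Writing the conversion and the padding as separate factors makes it easier to change the padding later without recomputing the combined constant.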

Nissan: An opportunity not to be missed

Tariffs have affected not only Mexico but also dozens of exporting countries and companies. This year Nissan will close plants in Thailand, India, and Japan.
Mexico has not been hurt by Nissan's situation; on the contrary, pickup production is being moved from Argentina to Mexican plants, and the brand's new CEO is Mexican.
These plants already employ more than 15,000 people.
They produced 678,000 units in 2024.
Nissan has been the best-selling brand in Mexico since 2009.

And the BEST part? Nissan already has three plants in Mexico: two in Aguascalientes (A1 and A2) and one in Morelos (CIVAC).

# Industrial parks
# Load the dataset (adjust the path if needed)
INEGI_DENUE_28052025 <- readr::read_csv("~/PROFESIONAL TEC/4° Semestre C.S/Ciencia de datos para la toma de decisiones/Reto CSII/data/INEGI_DENUE_NACIONAL/INEGI_DENUE_28052025.csv", 
                                        locale = locale(encoding = "ISO-8859-1"))

## Rows: 3236 Columns: 42
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (35): Clee, Nombre de la Unidad Económica, Razón social, Nombre de clase...
## dbl  (7): ID, Código de la clase de actividad SCIAN, Número exterior o kilóm...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# Filter the data for Morelos, normalizing the CIVAC park name
filtrar_espacios_productivos_MOR <- INEGI_DENUE_28052025 %>%
  mutate(`Corredor industrial, centro comercial o mercado público` = 
           if_else(`Corredor industrial, centro comercial o mercado público` == "CIVAC", 
                   "CIUDAD INDUSTRIAL DEL VALLE DE CUERNAVACA", 
                   `Corredor industrial, centro comercial o mercado público`)) %>% 
  filter(grepl("IND", `Corredor industrial, centro comercial o mercado público`, ignore.case = TRUE)) %>%
  filter(`Entidad federativa` == "MORELOS") %>%
  select(`Nombre de clase de la actividad`, `Entidad federativa`)

# Filter the data for Aguascalientes
filtrar_espacios_productivos_AGS <- INEGI_DENUE_28052025 %>%
  filter(grepl("IND", `Corredor industrial, centro comercial o mercado público`, ignore.case = TRUE)) %>%
  filter(`Entidad federativa` == "AGUASCALIENTES") %>%
  select(`Nombre de clase de la actividad`, `Entidad federativa`)

# Add a column with the state name to both datasets
datos_MOR <- filtrar_espacios_productivos_MOR %>%
  mutate(Entidad = "MORELOS")

datos_AGS <- filtrar_espacios_productivos_AGS %>%
  mutate(Entidad = "AGUASCALIENTES")

# Bind both data frames
datos_comparacion <- bind_rows(datos_MOR, datos_AGS)

# Group and count activities by state
clasificacion_actividades_comparada <- datos_comparacion %>%
  group_by(Entidad, `Nombre de clase de la actividad`) %>%
  summarise(Total = n(), .groups = "drop")

ggplot(clasificacion_actividades_comparada, 
       aes(x = fct_reorder(`Nombre de clase de la actividad`, Total), 
           y = Total, fill = Entidad)) +
  geom_bar(stat = "identity", position = "dodge") +
  coord_flip() +
  labs(
    title = "Comparación de actividades económicas en parques industriales",
    subtitle = "Aguascalientes vs. Morelos",
    x = "Actividad Económica (SCIAN)",
    y = "Número de Establecimientos",
    fill = "Entidad",
    caption = "Fuente: INEGI DENUE"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    text = element_text(family = "lato"),
    plot.title = element_text(size = 16, face = "bold", hjust = 0),        # Left-aligned
    plot.subtitle = element_text(size = 12, hjust = 0, margin = margin(b = 10)),  # Left-aligned
    plot.caption = element_text(size = 9, hjust = 0, face = "italic", margin = margin(t = 10)),  # Left-aligned, with a small top margin
    axis.title.y = element_text(margin = margin(r = 10)),
    axis.title.x = element_text(margin = margin(t = 10)),
    legend.position = "bottom"
  ) +
  scale_fill_manual(values = c("MORELOS" = "#800000", "AGUASCALIENTES" = "#004B8D")) +
  scale_y_continuous(labels = scales::comma)

A large share of the industrial parks in both states host automotive-sector activities (47 of 60 in Aguascalientes and 4 of 7 in Morelos). They have the capacity to absorb still more investment and production. Both states have a long history in automotive production and a base of highly qualified operators and technicians.
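
The two counts quoted above are easier to compare as proportions. A short sketch that uses only the figures from the text; the data frame and its column names are illustrative and not objects from the analysis:

```r
# Counts quoted in the text: automotive count over the state total
parques <- data.frame(
  estado = c("Aguascalientes", "Morelos"),
  con_automotriz = c(47, 4),
  total = c(60, 7)
)

# Proportion and a formatted percentage label per state
parques$proporcion <- parques$con_automotriz / parques$total
parques$etiqueta <- sprintf("%.1f%%", 100 * parques$proporcion)

print(parques)
```

With a total of only 7 in Morelos, its proportion moves by about 14 percentage points for every single unit reclassified, so it should be read with more caution than the Aguascalientes figure.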

Plan México

Increase national production by 10%.
Raise the domestic content of cars built in Mexico by 15%.
Double the dual-education programs.
Promote technical training and worker specialization.
Streamline business procedures to expand the industry.

Where, and what, can Nissan export from Mexico?

# Trade partners: Morelos
# Years
years <- 2018:2024

# Read each file and add the year
leer_exportaciones <- function(year) {
  file <- paste0("~/PROFESIONAL TEC/4° Semestre C.S/Ciencia de datos para la toma de decisiones/Reto CSII/data/Morelos/Partes-y-Accesorios-de-Vehiculos-Automotores-Destinos-comerciales-", year, ".csv")
  read_csv(file, show_col_types = FALSE) %>%
    mutate(anio = year)
}

# Read and bind
datos_exportacion <- map_dfr(years, leer_exportaciones)

traducciones <- c(
  "Alemania" = "Germany",
  "Bélgica" = "Belgium",
  "Brasil" = "Brazil",
  "Canadá" = "Canada",
  "Chequia" = "Czech Republic",
  "Corea del Sur" = "South Korea",
  "Eslovaquia" = "Slovakia",
  "España" = "Spain",
  "Estados Unidos" = "United States",
  "Francia" = "France",
  "Italia" = "Italy",
  "Japón" = "Japan",
  "Noruega" = "Norway",
  "Países Bajos" = "Netherlands",
  "Panamá" = "Panama",
  "Polonia" = "Poland",
  "Reino Unido" = "United Kingdom",
  "Sudáfrica" = "South Africa",
  "Suecia" = "Sweden",
  "Tailandia" = "Thailand",
  "Turquía" = "Turkey"
)

# Prepare and translate the data
datos_exportacion <- datos_exportacion %>%
  rename(pais = Country, valor = `Trade Value`) %>%
  mutate(
    pais_ingles = if_else(pais %in% names(traducciones), traducciones[pais], pais),
    iso_a3 = countrycode(pais_ingles, origin = "country.name", destination = "iso3c")
  )

# Average annual exports by country
datos_exportacion_promedio <- datos_exportacion %>%
  group_by(iso_a3) %>%
  summarise(exportaciones_promedio = mean(valor, na.rm = TRUE), .groups = "drop")

# Get the world map
world <- ne_countries(scale = "medium", returnclass = "sf")

# Join the data to the map
mapa_exportaciones <- world %>%
  left_join(datos_exportacion_promedio, by = "iso_a3")

# Optional: highlight countries whose average exports exceed the 75th percentile
destacados <- mapa_exportaciones %>%
  filter(exportaciones_promedio > quantile(exportaciones_promedio, 0.75, na.rm = TRUE))

# Styled plot
ggplot(mapa_exportaciones) +
  geom_sf(aes(fill = exportaciones_promedio), color = "white", size = 0.15) +  # thin white borders
  geom_sf(data = destacados, fill = NA, color = "#800000", size = 1.2) +        # maroon border for highlighted countries
  scale_fill_viridis(
    option = "plasma",
    trans = "log10",
    na.value = "grey90",
    name = "Exportaciones promedio (USD)",
    labels = scales::label_comma()
  ) +
  labs(
    title = "Socios comerciales de Morelos (2018–2024)",
    subtitle = "Valor promedio anual de las exportaciones del sector automotriz",
    caption = "Fuente: Secretaría de Economía"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    text = element_text(family = "lato"),
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    plot.subtitle = element_text(size = 12, hjust = 0.5, margin = margin(b = 10)),
    plot.caption = element_text(size = 9, hjust = 1, face = "italic"),
    legend.position = "right",
    legend.title = element_text(size = 11),
    legend.text = element_text(size = 10),
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    panel.grid = element_blank()
  )

# Trade partners: Aguascalientes
# Years
years <- 2018:2024

# Function that reads the yearly data for Aguascalientes
leer_exportaciones <- function(year) {
  file <- paste0("~/PROFESIONAL TEC/4° Semestre C.S/Ciencia de datos para la toma de decisiones/Reto CSII/data/Aguascalientes/Partes-y-Accesorios-de-Vehiculos-Automotores-Destinos-comerciales-", year, ".csv")
  read_csv(file, show_col_types = FALSE) %>%
    mutate(anio = year)
}

# Read and bind all years
datos_exportacion <- map_dfr(years, leer_exportaciones)

# Country-name translations used to obtain ISO3 codes
traducciones <- c(
  "Alemania" = "Germany",
  "Bélgica" = "Belgium",
  "Brasil" = "Brazil",
  "Canadá" = "Canada",
  "Chequia" = "Czech Republic",
  "Corea del Sur" = "South Korea",
  "Eslovaquia" = "Slovakia",
  "España" = "Spain",
  "Estados Unidos" = "United States",
  "Francia" = "France",
  "Italia" = "Italy",
  "Japón" = "Japan",
  "Noruega" = "Norway",
  "Países Bajos" = "Netherlands",
  "Panamá" = "Panama",
  "Polonia" = "Poland",
  "Reino Unido" = "United Kingdom",
  "Sudáfrica" = "South Africa",
  "Suecia" = "Sweden",
  "Tailandia" = "Thailand",
  "Turquía" = "Turkey",
  # Additional destinations that appear in the Aguascalientes files
  "Emiratos Árabes Unidos" = "United Arab Emirates",
  "Filipinas" = "Philippines",
  "Pakistán" = "Pakistan",
  "Perú" = "Peru",
  "República Dominicana" = "Dominican Republic",
  "Rusia" = "Russia"
)

# Prepare and translate the data
datos_exportacion <- datos_exportacion %>%
  rename(pais = Country, valor = `Trade Value`) %>%
  mutate(
    pais_ingles = if_else(pais %in% names(traducciones), traducciones[pais], pais),
    iso_a3 = countrycode(pais_ingles, origin = "country.name", destination = "iso3c")
  )
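
Any source name missing from a lookup vector like `traducciones` passes through in Spanish, and `countrycode()` may then fail to assign it an ISO3 code, which leaves that country blank on the map. A minimal base-R check for such gaps, run before the join; the `_demo` objects are hypothetical stand-ins, not the real data:

```r
# Hypothetical inputs: a small lookup vector and a few source names
traducciones_demo <- c("Alemania" = "Germany", "Francia" = "France")
paises_demo <- c("Alemania", "Filipinas", "Francia", "Rusia", "Filipinas")

# Names with no entry in the lookup vector (each reported once)
sin_traduccion <- setdiff(unique(paises_demo), names(traducciones_demo))
print(sin_traduccion)

# Translate, falling back to the original name when no entry exists
traducido <- ifelse(
  paises_demo %in% names(traducciones_demo),
  unname(traducciones_demo[paises_demo]),
  paises_demo
)
print(traducido)
```

Printing `sin_traduccion` right after loading each state's files shows which entries the lookup vector still needs.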

# Average annual exports by country
datos_exportacion_promedio <- datos_exportacion %>%
  group_by(iso_a3) %>%
  summarise(exportaciones_promedio = mean(valor, na.rm = TRUE), .groups = "drop")

# Get the world map
world <- ne_countries(scale = "medium", returnclass = "sf")

# Join the data to the map
mapa_exportaciones <- world %>%
  left_join(datos_exportacion_promedio, by = "iso_a3")

# Highlight countries with average exports above the 75th percentile
destacados <- mapa_exportaciones %>%
  filter(exportaciones_promedio > quantile(exportaciones_promedio, 0.75, na.rm = TRUE))

# Plot the styled map
ggplot(mapa_exportaciones) +
  geom_sf(aes(fill = exportaciones_promedio), color = "white", size = 0.15) +   # thin white borders
  geom_sf(data = destacados, fill = NA, color = "#800000", size = 1.2) +         # maroon border for highlighted countries
  scale_fill_viridis(
    option = "plasma",
    trans = "log10",
    na.value = "grey90",
    name = "Exportaciones promedio (USD)",
    labels = scales::label_comma()
  ) +
  labs(
    title = "Socios comerciales de Aguascalientes (2018–2024)",
    subtitle = "Valor promedio anual de las exportaciones del sector automotriz",
    caption = "Fuente: Secretaría de Economía"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    text = element_text(family = "lato"),
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    plot.subtitle = element_text(size = 12, hjust = 0.5, margin = margin(b = 10)),
    plot.caption = element_text(size = 9, hjust = 1, face = "italic"),
    legend.position = "right",
    legend.title = element_text(size = 11),
    legend.text = element_text(size = 10),
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    panel.grid = element_blank()
  )

# Export clusters
# Load the dataset --------------------------------------------------------
exportaciones = read_excel("~/PROFESIONAL TEC/4° Semestre C.S/Ciencia de datos para la toma de decisiones/Reto CSII/data/RAIAVL_10.xlsx", skip = 7, col_names = FALSE)

## New names:
## • `` -> `...1`
## • `` -> `...2`
## • `` -> `...3`
## • `` -> `...4`

names(exportaciones) = c("anio", "mes", "detalle", "exportacion")

# Clean the export values
exportaciones = exportaciones %>%
  filter(exportacion != "-" & !is.na(exportacion)) %>%
  mutate(exportacion = as.numeric(exportacion))

# Split the detail column -------------------------------------------------
exportaciones = exportaciones %>%
  separate(detalle, into = c("marca", "modelo", "tipo", "segmento", "pais_destino"), sep = " - ", remove = FALSE)

# -------------------------------------------------------------------------
# Grouping by destination country
exportaciones_pais = exportaciones %>%
  group_by(pais_destino) %>%
  summarise(exportaciones = sum(exportacion, na.rm = TRUE)) %>%
  drop_na()

# Clustering matrix by destination country
exportaciones_pais_mat = exportaciones_pais %>%
  column_to_rownames(var = "pais_destino")

distancias_pais = dist(exportaciones_pais_mat, method = "euclidean")

# Clustering by country - Single method
cluster_pais_single = hclust(distancias_pais, method = "single")
plot(cluster_pais_single, main = "País Destino - Exportaciones, Nissan, 2018-2025 - Método Single", ylab = "Distancia")
rect.hclust(cluster_pais_single, k = 3, border = 2:4)

# Clustering by country - Ward.D2 method
cluster_pais_ward = hclust(distancias_pais, method = "ward.D2")
plot(cluster_pais_ward, main = "País Destino - Exportaciones, Nissan, 2018-2025 - Método Ward.D2", ylab = "Distancia")
rect.hclust(cluster_pais_ward, k = 3, border = 2:4)

# -------------------------------------------------------------------------
# Grouping by exported model
exportaciones_modelo = exportaciones %>%
  group_by(modelo) %>%
  summarise(exportaciones = sum(exportacion, na.rm = TRUE)) %>%
  filter(exportaciones > 100) %>%
  drop_na()

# Clustering matrix by model
exportaciones_modelo_mat = exportaciones_modelo %>%
  column_to_rownames(var = "modelo")

distancias_modelo = dist(exportaciones_modelo_mat, method = "euclidean")

# Clustering by model - Single method
cluster_modelo_single = hclust(distancias_modelo, method = "single")
plot(cluster_modelo_single, main = "Modelo(s) - Exportaciones, Nissan, 2018-2025 - Método Single", xlab = "Modelos", ylab = "Distancia")
rect.hclust(cluster_modelo_single, k = 3, border = 2:4)

# Clustering by model - Ward.D2 method
cluster_modelo_ward = hclust(distancias_modelo, method = "ward.D2")
plot(cluster_modelo_ward, main = "Modelo(s) - Exportaciones, Nissan, 2018-2025 - Método Ward.D2", xlab = "Modelos", ylab = "Distancia")
rect.hclust(cluster_modelo_ward, k = 3, border = 2:4)

color_nissan <- "#006341"  # example of a more institutional green

exportaciones_top10 <- exportaciones_pais %>%
  filter(!is.na(exportaciones), exportaciones > 0) %>%  # drop zeros and NAs
  arrange(desc(exportaciones)) %>%                       # sort from largest to smallest
  slice_head(n = 10)

# Apply the unified format
ggplot(exportaciones_top10, aes(x = reorder(pais_destino, exportaciones), y = exportaciones)) +
  geom_col(fill = color_nissan) +
  coord_flip() +
  labs(
    title = "Exportaciones por país destino (2018–2025)",
    subtitle = "Vehículos Nissan",
    x = "País destino",
    y = "Unidades exportadas",
    caption = "Fuente: INEGI"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    text = element_text(family = "lato"),
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    plot.subtitle = element_text(size = 12, hjust = 0.5, margin = margin(b = 10)),
    plot.caption = element_text(size = 9, hjust = 1, face = "italic"),
    axis.title.y = element_text(margin = margin(r = 10)),
    axis.title.x = element_text(margin = margin(t = 10)),
    legend.position = "none"
  ) +
  scale_y_continuous(labels = scales::comma)

# Institutional or more sober color (swap in a corporate color if preferred)
color_modelo <- "#D62728"  # dark red, Nissan style

ggplot(exportaciones_modelo, aes(x = reorder(modelo, exportaciones), y = exportaciones)) +
  geom_col(fill = color_modelo) +
  coord_flip() +
  labs(
    title = "Modelos más exportados por Nissan (2018–2025)",
    x = "Modelo",
    y = "Unidades exportadas",
    caption = "Fuente: INEGI"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    text = element_text(family = "lato"),
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    plot.subtitle = element_text(size = 12, hjust = 0.5, margin = margin(b = 10)),
    plot.caption = element_text(size = 9, hjust = 1, face = "italic"),
    axis.title.y = element_text(margin = margin(r = 10)),
    axis.title.x = element_text(margin = margin(t = 10)),
    legend.position = "none"
  ) +
  scale_y_continuous(labels = scales::comma)

Sales

# Load the dataset ------------------------------------------------------------
datos = read_excel("~/PROFESIONAL TEC/4° Semestre C.S/Ciencia de datos para la toma de decisiones/Reto CSII/data/RAIAVL_8_9.xlsx", skip = 7, col_names = FALSE)

## New names:
## • `` -> `...1`
## • `` -> `...2`
## • `` -> `...3`
## • `` -> `...4`

names(datos) = c("anio", "mes", "detalle", "ventas")

# Clean the sales values and drop invalid records
datos = datos %>%
  filter(ventas != "-" & !is.na(ventas)) %>%
  mutate(ventas = as.numeric(ventas))

# Split the detail column -----------------------------------------------------
datos = datos %>%
  separate(detalle, into = c("marca", "modelo", "tipo", "segmento", "origen", "pais_origen"), sep = " - ", remove = FALSE)

# Group by country of origin --------------------------------------------------
ventas_pais = datos %>%
  group_by(pais_origen) %>%
  summarise(ventas = sum(ventas, na.rm = TRUE)) %>%
  drop_na()

# Clustering matrix by country ------------------------------------------------
ventas_pais_mat = ventas_pais %>%
  column_to_rownames(var = "pais_origen")

distancias_pais = dist(ventas_pais_mat, method = "euclidean")

# Clustering by country - single method
cluster_pais_single = hclust(distancias_pais, method = "single")
plot(cluster_pais_single, main = "País - Venta al público y producción (México), Nissan, 2018-2025 - Single", ylab = "Distancia")
rect.hclust(cluster_pais_single, k = 3, border = 2:4)

# Clustering by country - ward.D2 method
cluster_pais_ward = hclust(distancias_pais, method = "ward.D2")
plot(cluster_pais_ward, main = "País - Venta al público y producción (México), Nissan, 2018-2025 - Ward.D2", ylab = "Distancia")
rect.hclust(cluster_pais_ward, k = 3, border = 2:4)

# Group by model --------------------------------------------------------------
ventas_modelo = datos %>%
  group_by(modelo) %>%
  summarise(ventas = sum(ventas, na.rm = TRUE)) %>%
  filter(ventas > 100) %>%
  drop_na()

# Clustering matrix by model --------------------------------------------------
ventas_modelo_mat = ventas_modelo %>%
  column_to_rownames(var = "modelo")

distancias_modelo = dist(ventas_modelo_mat, method = "euclidean")

# Clustering by model - single method
cluster_modelo_single = hclust(distancias_modelo, method = "single")
plot(cluster_modelo_single, main = "Modelo - Venta al público y producción (México), Nissan, 2018-2025 - Single", xlab = "Modelos", sub = "", ylab = "Distancia")
rect.hclust(cluster_modelo_single, k = 3, border = 2:4)

# Clustering by model - ward.D2 method
cluster_modelo_ward = hclust(distancias_modelo, method = "ward.D2")
plot(cluster_modelo_ward, main = "Modelo - Venta al público y producción (México), Nissan, 2018-2025 - Ward.D2", xlab = "Modelos", sub = "", ylab = "Distancia")
rect.hclust(cluster_modelo_ward, k = 3, border = 2:4)

# Keep the top 20 countries with sales > 0
ventas_pais_top20 <- ventas_pais %>%
  filter(!is.na(ventas), ventas > 0) %>%
  arrange(desc(ventas)) %>%
  slice_head(n = 20)

# Plot with the unified format
ggplot(ventas_pais_top20, aes(x = reorder(pais_origen, ventas), y = ventas)) +
  geom_col(fill = "#006341") +
  coord_flip() +
  labs(
    title = "Ventas totales por país de origen (2018-2025)",
    x = "País de origen",
    y = "Ventas",
    caption = "Fuente: INEGI"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    text = element_text(family = "lato"),
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    plot.subtitle = element_text(size = 12, hjust = 0.5, margin = margin(b = 10)),
    plot.caption = element_text(size = 9, hjust = 1, face = "italic"),
    axis.title.y = element_text(margin = margin(r = 10)),
    axis.title.x = element_text(margin = margin(t = 10)),
    legend.position = "none"
  ) +
  scale_y_continuous(labels = scales::comma)

# Keep the top 20 models with sales > 0
ventas_top20 <- ventas_modelo %>%
  filter(!is.na(ventas), ventas > 0) %>%
  arrange(desc(ventas)) %>%
  slice_head(n = 20)

# Plot with the unified format
ggplot(ventas_top20, aes(x = reorder(modelo, ventas), y = ventas)) +
  geom_col(fill = "#D62728") +
  coord_flip() +
  labs(
    title = "Ventas por modelo (2018-2025)",
    x = "Modelo",
    y = "Unidades vendidas",
    caption = "Fuente: INEGI"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    text = element_text(family = "lato"),
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    plot.subtitle = element_text(size = 12, hjust = 0.5, margin = margin(b = 10)),
    plot.caption = element_text(size = 9, hjust = 1, face = "italic"),
    axis.title.y = element_text(margin = margin(r = 10)),
    axis.title.x = element_text(margin = margin(t = 10)),
    legend.position = "none"
  ) +
  scale_y_continuous(labels = scales::comma)

Interpretation of the generated charts

Charts by country (sales and exports)

Single, country of origin (sales): The dendrogram shows countries with low sales, such as India or Spain, merging early, while Mexico, with more than 1.5 million units sold, stays isolated until the final merge. Mexico is an outlier in volume, and single linkage identifies it clearly, although the remaining groups show little structure.

Ward.D2, country of origin (sales): The split is more balanced. Mexico is still separate, but the grouping of Japan, the U.S., and India with Spain is more coherent.

Single, destination country (exports): As with sales, countries with low export volumes merge immediately, and those with higher volumes stay separate.

Ward.D2, destination country (exports): The structure here is clearer and more useful. High-volume destinations such as the U.S., Brazil, or Canada group together, while smaller destinations form separate clusters. This method brings out the structural differences in Nissan's export market more clearly.

Charts by model (sales and exports)

Single, model: Models with very low sales or exports tend to merge early. Dominant models such as Versa, March, or NP300 stay isolated, which reflects their distinct position in the market.

Ward.D2, model: This is the clearest chart for defining tiers of models. It shows a high-sales group (Versa, March, NP300), a group of mid-performing models (such as Altima, Sentra, Xtrail), and a group of models with lower export volumes. This kind of classification can be useful for defining strategies differentiated by segment.
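
To make the outlier behavior described above concrete, here is a minimal sketch that runs both linkage methods on a one-column matrix, mirroring the shape of the inputs used earlier. The values and labels are invented for illustration and do not come from the INEGI files:

```r
# Hypothetical unit totals: three small markets, two mid-sized ones,
# and one dominant outlier (labels A-F are placeholders)
unidades <- matrix(
  c(10, 12, 15, 200, 220, 1500),
  ncol = 1,
  dimnames = list(c("A", "B", "C", "D", "E", "F"), "unidades")
)

distancias <- dist(unidades, method = "euclidean")

arbol_single <- hclust(distancias, method = "single")
arbol_ward   <- hclust(distancias, method = "ward.D2")

# Cut both trees into three groups, matching rect.hclust(k = 3)
grupos_single <- cutree(arbol_single, k = 3)
grupos_ward   <- cutree(arbol_ward, k = 3)
print(rbind(single = grupos_single, ward.D2 = grupos_ward))

# Merge heights differ by method even when the groups coincide
print(arbol_single$height)
print(arbol_ward$height)
```

On data this cleanly separated, both methods leave the outlier `F` in a cluster of its own. The difference shows up in the merge heights and, on less tidy data, in single linkage's tendency to chain neighboring observations together.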

Key findings and recommendations for the automotive industry ✅

Take advantage of the tariff war to relocate Nissan production to Mexico.
Exploit the comparative advantages and potential of Morelos and Aguascalientes.
Canada should be a priority trade partner given its high consumption of Nissan vehicles, the existing economic relationship, and the USMCA (T-MEC).
South America is a new market for diversifying our exports, especially Colombia, Chile, and Brazil.
Too many vehicles are being imported from China that domestic production could replace.

Benefits for Mexico 🇲🇽

Economic spillover into other sectors closely tied to the automotive sector.
Social as well as economic development in Morelos and Aguascalientes.
Creation of skilled jobs.
Export diversification.
Reduced imports.

Global tariffs on the automotive sector: an opportunity for Morelos and Aguascalientes

Diego Ceciliano, Santiago Colín, Aldo García, Sergio Paulino, Santiago González

2025-06-06