Menstruation-related spending based on ENGHo 2017/2018 (INDEC)

The following code was written for an undergraduate thesis on menstruation and feminist economics. The data come from the Encuesta Nacional de Gasto de los Hogares (ENGHo, National Household Expenditure Survey), carried out by INDEC between November 2017 and November 2018 and published in 2020.

The goal is to understand the spending structure of Argentinians, which is why the surveyed amounts and quantities were transformed to represent annual values. For the same reason, the figures lose relevance in absolute terms as time passes. The intended report compares the structure of spending on menstrual management products across different households, and against the other expenses those households incur.

##Datasets and libraries

library(tidyverse)
library(readr)
library(kableExtra) 
library(magick)
library(scales)
library(rmarkdown)

gastos <- read_delim("engho2018_gastos.txt", "|")
personas <- read_delim("engho2018_personas.txt", "|")
hogares <- read_delim("engho2018_hogares.txt", "|")
articulos <- read_delim("engho2018_articulos.txt", "|")

The aim is to obtain information on expenses linked both to households and to individual persons, so the tables are joined through their keys: id and miembro. The result should have 901,804 observations (the total number of observations in the expenses table) with all household and person information attached. It has 225 variables (columns).

##Joining the tables

#Attach person and household attributes to every expense record
base <- left_join(gastos, personas, by = c("id", "miembro")) %>% 
  left_join(hogares, by = "id") %>% 
  mutate(sexo = case_when(miembro == 0 ~ "hogar", #expense for the household's common use
                          cp13 == 1 ~ "Varón",
                          cp13 == 2 ~ "Mujer"))
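
A quick way to confirm that a left join on the composite key keeps exactly one row per expense record is to try it on a small example. The sketch below uses base R's `merge()` instead of dplyr so it runs without extra packages. The `*_demo` tables and their values are hypothetical, not survey data.

```r
# Toy stand-ins for the ENGHo tables (hypothetical values, not survey data)
gastos_demo <- data.frame(
  id      = c(1, 1, 1, 2, 2),
  miembro = c(0, 1, 2, 0, 1),
  monto   = c(120, 50, 30, 80, 40)
)
personas_demo <- data.frame(
  id      = c(1, 1, 2),
  miembro = c(1, 2, 1),
  cp13    = c(1, 2, 2)
)

# The key must be unique on the right-hand side, otherwise rows multiply
stopifnot(!anyDuplicated(personas_demo[, c("id", "miembro")]))

# Left join on the composite key; all.x = TRUE keeps every expense row
base_demo <- merge(gastos_demo, personas_demo,
                   by = c("id", "miembro"), all.x = TRUE)
stopifnot(nrow(base_demo) == nrow(gastos_demo))

# Household-level expenses (miembro == 0) have no matching person,
# so they are labelled before cp13 is looked at
base_demo$sexo <- ifelse(base_demo$miembro == 0, "hogar",
                  ifelse(base_demo$cp13 == 1, "Varón", "Mujer"))
table(base_demo$sexo)
```

The same two checks (unique key on the right, unchanged row count on the left) can be run on the real tables to confirm the 901,804 observations.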

The variable “sexo” takes three possible values: Mujer (woman), Varón (man), and a third value for expenses made by any member for the common use of the whole household. All surveyed expenses on menstrual management products fall under this third value. For that reason, from here on the analysis filters on expenses recorded as household expenses. In those records, made to cover the household's common needs, the variable MIEMBRO equals 0; this value is relabelled “hogar” (household) so it can be identified in the rest of the work.

Next, the master table is cut down to menstrual management products (PGM, productos de gestión menstrual) only. These appear exclusively among records whose sexo is “hogar”, that is, as expenses for the common use of the household.

##Base of menstrual management products recorded as household expenses

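base_pgm <- base %>%
  filter(sexo == "hogar") %>% #keep only expenses recorded for the household
  mutate(articulo = ifelse(articulo == "A1213108", "pgm", articulo)) %>% #relabel the PGM item code
  filter(articulo == "pgm")

#One row per household: total PGM spending plus household attributes
base_pgm_hogares <- base_pgm %>% 
  group_by(id) %>% 
  summarise(monto = sum(monto)) %>% 
  left_join(hogares, by = "id") %>% 
  mutate(monto_pc = monto / cantmiem) #per capita amount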

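##Households with recorded PGM spending

#Expanded (weighted) number of households with PGM spending
h_pgm <- base_pgm_hogares %>% 
  summarise(sum(pondera))

#Share of households with PGM spending, in percent (values hard-coded)
E <- (858573/12642525)*100 #presumably expanded households: with PGM spending / total

M <- (1355/21547)*100 #presumably sample households: with PGM spending / total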
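
The two ratios above evaluate to roughly 6.79 % and 6.29 %. Assuming they are the expanded and the sample share of households with PGM spending, the difference between them can be illustrated with a small sketch. The data frame `hogares_demo` and the column `tiene_pgm` are hypothetical names introduced only for this example.

```r
# Hypothetical mini-sample: one row per household
hogares_demo <- data.frame(
  id        = 1:5,
  pondera   = c(500, 300, 800, 400, 1000),  # expansion weights
  tiene_pgm = c(TRUE, FALSE, FALSE, TRUE, FALSE)
)

# Sample share: households with PGM spending over households surveyed
share_muestral <- 100 * mean(hogares_demo$tiene_pgm)

# Expanded share: same ratio, but each household counts `pondera` times
share_expandido <- 100 * sum(hogares_demo$pondera[hogares_demo$tiene_pgm]) /
  sum(hogares_demo$pondera)

c(muestral = share_muestral, expandido = share_expandido)
```

Computing both shares from the data in this way, rather than typing the totals by hand, keeps the percentages in sync if the filters change.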

Several tables follow with the average PGM spending recorded for the household, broken down by variables of interest.

##Region

pgm_region <- base_pgm_hogares %>%
  mutate(.,
        region = case_when(region == 1 ~ "Metropolitana", #Rename the regions
                  region == 2 ~ "Pampeana",
                  region == 3 ~ "Noroeste",
                  region == 4 ~ "Noreste",
                  region == 5 ~ "Cuyo",
                  region == 6 ~ "Patagonia"))%>%
  group_by(region) %>% #Group by region
  summarise(monto = weighted.mean(monto, pondera)) #Weighted mean amount per region
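
To see why the expansion weight matters in these averages, the following sketch compares a weighted and an unweighted mean by group. It uses base R (`split()` and `sapply()`) so it runs on its own. The values in `demo` are invented for illustration.

```r
# Hypothetical households: amount spent and expansion weight
demo <- data.frame(
  region  = c("Cuyo", "Cuyo", "Cuyo", "Noreste", "Noreste"),
  monto   = c(100, 200, 300, 150, 250),
  pondera = c(1, 1, 8, 5, 5)
)

# Split by region, then take the weighted mean within each group
por_region <- sapply(split(demo, demo$region),
                     function(d) weighted.mean(d$monto, d$pondera))

# Unweighted means for comparison
sin_ponderar <- tapply(demo$monto, demo$region, mean)

rbind(ponderado = por_region, simple = sin_ponderar)
```

In the toy "Cuyo" group the household with weight 8 pulls the weighted mean up to 270, while the simple mean stays at 200. With equal weights, as in "Noreste", both coincide.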

##Region per capita

region_pc <- base_pgm_hogares %>%
  mutate(.,
        region = case_when(region == 1 ~ "Metropolitana", #Rename the regions
                  region == 2 ~ "Pampeana",
                  region == 3 ~ "Noroeste",
                  region == 4 ~ "Noreste",
                  region == 5 ~ "Cuyo",
                  region == 6 ~ "Patagonia"))%>%
  group_by(region) %>% #Group by region
  summarise(monto_pc = weighted.mean(monto_pc, pondera)) #Weighted mean per capita amount per region

##Colors

colores <- c("#F94144", "#F3722C", "#F8961E", "#90BE6D", "#43AA8B", "#577590") #for regions - discrete
paleta <- c("#641220", "#6E1423", "#85182A", "#A11D33", "#A71E34", "#B21E35", "#BD1F36", "#C71F37", "#DA1E37", "#E01E37") #for continuous scales

##Table 1

indice_region <- pgm_region %>% 
  mutate(.,
         "Región" = region,
         Monto = round(monto, 1),
         indice = round(monto/first(monto) * 100), #base 100 = first row of pgm_region (Cuyo, by alphabetical order)
         "Índice" = indice) %>% 
  arrange(desc(indice))

tabla_indice <- indice_region [, c (3, 4, 6)] #keep Región, Monto and Índice

tabla1 <- tabla_indice %>%
 knitr::kable(align = "lcc", caption = "Fuente: Elaboración propia en base a ENGHo 2017/18 - INDEC", digits = 2) %>% 
 kable_styling(c ("responsive", "striped", "hover"), full_width = F, position = "center",
                fixed_thead = T, font_size = 18) %>%
  column_spec(1, width = "5cm") %>% 
  column_spec(2, width = "5cm") %>% 
  column_spec(3, bold = TRUE, width = "5cm") %>% 
  row_spec(0, bold = T, color = "white", background = paleta[7]) 

print(tabla1)

## <table class="table table-responsive table-striped table-hover" style="font-size: 18px; width: auto !important; margin-left: auto; margin-right: auto;">
## <caption style="font-size: initial !important;">Fuente: Elaboración propia en base a ENGHo 2017/18 - INDEC</caption>
##  <thead>
##   <tr>
##    <th style="text-align:left;position: sticky; top:0; background-color: #FFFFFF;font-weight: bold;color: white !important;background-color: #BD1F36 !important;"> Región </th>
##    <th style="text-align:center;position: sticky; top:0; background-color: #FFFFFF;font-weight: bold;color: white !important;background-color: #BD1F36 !important;"> Monto </th>
##    <th style="text-align:center;position: sticky; top:0; background-color: #FFFFFF;font-weight: bold;color: white !important;background-color: #BD1F36 !important;"> Índice </th>
##   </tr>
##  </thead>
## <tbody>
##   <tr>
##    <td style="text-align:left;width: 5cm; "> Metropolitana </td>
##    <td style="text-align:center;width: 5cm; "> 330.0 </td>
##    <td style="text-align:center;width: 5cm; font-weight: bold;"> 122 </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;width: 5cm; "> Patagonia </td>
##    <td style="text-align:center;width: 5cm; "> 322.4 </td>
##    <td style="text-align:center;width: 5cm; font-weight: bold;"> 119 </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;width: 5cm; "> Cuyo </td>
##    <td style="text-align:center;width: 5cm; "> 270.4 </td>
##    <td style="text-align:center;width: 5cm; font-weight: bold;"> 100 </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;width: 5cm; "> Pampeana </td>
##    <td style="text-align:center;width: 5cm; "> 264.7 </td>
##    <td style="text-align:center;width: 5cm; font-weight: bold;"> 98 </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;width: 5cm; "> Noroeste </td>
##    <td style="text-align:center;width: 5cm; "> 241.4 </td>
##    <td style="text-align:center;width: 5cm; font-weight: bold;"> 89 </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;width: 5cm; "> Noreste </td>
##    <td style="text-align:center;width: 5cm; "> 227.2 </td>
##    <td style="text-align:center;width: 5cm; font-weight: bold;"> 84 </td>
##   </tr>
## </tbody>
## </table>
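
The Índice column of Table 1 expresses each regional average relative to a base region set to 100. Cuyo is the base because it is the first row of the regional summary. A minimal sketch that picks the base by name instead of by row position is shown below, using the rounded amounts reported in the table. The names `pgm_region_demo`, `base_region` and `monto_base` are introduced only for this example.

```r
# Regional averages as reported in Table 1 (rounded to one decimal)
pgm_region_demo <- data.frame(
  region = c("Cuyo", "Metropolitana", "Noreste",
             "Noroeste", "Pampeana", "Patagonia"),
  monto  = c(270.4, 330.0, 227.2, 241.4, 264.7, 322.4)
)

# Pick the base region by name instead of relying on row order
base_region <- "Cuyo"
monto_base  <- pgm_region_demo$monto[pgm_region_demo$region == base_region]

# Index: regional amount as a percentage of the base amount
pgm_region_demo$indice <- round(pgm_region_demo$monto / monto_base * 100)

# Sort from highest to lowest index, as in the table
pgm_region_demo[order(-pgm_region_demo$indice), ]
```

Selecting the base explicitly means the index does not change if the rows are reordered before the calculation.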

Plots derived from the previous exercise:

##Plot by region

ggplot(data = pgm_region, aes(x = reorder(region, monto), y = monto, fill = region)) + 
  scale_fill_manual(values = colores)+
  geom_bar(stat="identity", position="dodge")+
  labs( x = "", y = "Monto en pesos", fill = "Región",
  title ="",
  caption = "Elaboración propia en base a INDEC. Encuesta Nacional de Gastos de los Hogares 2017-2018.")+
  guides(fill = "none")+
  theme_minimal()+
  theme(text = element_text(size = 14))

##Plot by region, per capita

ggplot(data = region_pc, aes(x = reorder(region, monto_pc), y = monto_pc, fill = region)) + 
  scale_fill_manual(values = colores)+
  geom_bar(stat = "identity", position = "dodge")+
  labs( x = "", y = "Monto en pesos", fill = "Región",
  title ="",
  caption = "Elaboración propia en base a INDEC. Encuesta Nacional de Gastos de los Hogares 2017-2018.")+
  guides(fill = "none")+
  theme_minimal()+
  theme(text = element_text(size = 14))

##Comparison with the regional CBT (Canasta Básica Total)

cbt18 <- read_delim(file = "cbt_2018.csv", ",")

canastas <- left_join(pgm_region, cbt18, by = "region") %>% 
  mutate(.,
        "Región" = region,
        Monto = round(monto, 1),
        "Canasta Promedio" = round(canprom18, 1),
        proporcion = round(monto/canprom18, digits = 4)*100, #kept numeric so that sorting is by value
        "Proporción PGM" = paste(proporcion, "%")) %>% 
  arrange(desc(proporcion))


##Table 2
tabla_canastas <- canastas [, c(16, 17, 18, 20)]

tabla2 <- tabla_canastas %>%
 knitr::kable(align = "lccc", caption = "Fuente: Elaboración propia en base a ENGHo 2017/18 - INDEC", digits = 2) %>% 
 kable_styling(c ("striped", "hover", "responsive"), full_width = F, position = "center",
                fixed_thead = T, font_size = 18) %>%
  column_spec(1, width = "4cm") %>% 
  column_spec(2, width = "2cm") %>% 
  column_spec(3, width = "6cm") %>% 
  column_spec(4, bold = TRUE, width = "6cm") %>% 
  row_spec(0, bold = T, color = "white", background = paleta[7]) 

print(tabla2)

## <table class="table table-striped table-hover table-responsive" style="font-size: 18px; width: auto !important; margin-left: auto; margin-right: auto;">
## <caption style="font-size: initial !important;">Fuente: Elaboración propia en base a ENGHo 2017/18 - INDEC</caption>
##  <thead>
##   <tr>
##    <th style="text-align:left;position: sticky; top:0; background-color: #FFFFFF;font-weight: bold;color: white !important;background-color: #BD1F36 !important;"> Región </th>
##    <th style="text-align:center;position: sticky; top:0; background-color: #FFFFFF;font-weight: bold;color: white !important;background-color: #BD1F36 !important;"> Monto </th>
##    <th style="text-align:center;position: sticky; top:0; background-color: #FFFFFF;font-weight: bold;color: white !important;background-color: #BD1F36 !important;"> Canasta Promedio </th>
##    <th style="text-align:center;position: sticky; top:0; background-color: #FFFFFF;font-weight: bold;color: white !important;background-color: #BD1F36 !important;"> Proporción PGM </th>
##   </tr>
##  </thead>
## <tbody>
##   <tr>
##    <td style="text-align:left;width: 4cm; "> Metropolitana </td>
##    <td style="text-align:center;width: 2cm; "> 330.0 </td>
##    <td style="text-align:center;width: 6cm; "> 5883.1 </td>
##    <td style="text-align:center;width: 6cm; font-weight: bold;"> 5.61 % </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;width: 4cm; "> Noroeste </td>
##    <td style="text-align:center;width: 2cm; "> 241.4 </td>
##    <td style="text-align:center;width: 6cm; "> 4739.0 </td>
##    <td style="text-align:center;width: 6cm; font-weight: bold;"> 5.09 % </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;width: 4cm; "> Cuyo </td>
##    <td style="text-align:center;width: 2cm; "> 270.4 </td>
##    <td style="text-align:center;width: 6cm; "> 5584.7 </td>
##    <td style="text-align:center;width: 6cm; font-weight: bold;"> 4.84 % </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;width: 4cm; "> Patagonia </td>
##    <td style="text-align:center;width: 2cm; "> 322.4 </td>
##    <td style="text-align:center;width: 6cm; "> 6886.9 </td>
##    <td style="text-align:center;width: 6cm; font-weight: bold;"> 4.68 % </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;width: 4cm; "> Noreste </td>
##    <td style="text-align:center;width: 2cm; "> 227.2 </td>
##    <td style="text-align:center;width: 6cm; "> 4908.2 </td>
##    <td style="text-align:center;width: 6cm; font-weight: bold;"> 4.63 % </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;width: 4cm; "> Pampeana </td>
##    <td style="text-align:center;width: 2cm; "> 264.7 </td>
##    <td style="text-align:center;width: 6cm; "> 5835.2 </td>
##    <td style="text-align:center;width: 6cm; font-weight: bold;"> 4.54 % </td>
##   </tr>
## </tbody>
## </table>
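
Ordering a table by a percentage is safest when done on the numeric value rather than on a formatted label, because text is compared character by character. The sketch below illustrates the difference. The first two values come from Table 2 and the third (10.20) is a hypothetical value added to make the effect visible.

```r
proporcion <- c(4.54, 5.61, 10.20)  # the last value is hypothetical
etiqueta   <- paste(proporcion, "%")

# Sorting the labels compares characters, so "10.2 %" ends up last
orden_texto <- sort(etiqueta, decreasing = TRUE)

# Sorting on the numbers and labelling afterwards gives the intended order
orden_numerico <- etiqueta[order(proporcion, decreasing = TRUE)]

orden_texto
orden_numerico
```

In Table 2 every proportion has a single integer digit, so both approaches give the same order. The difference only shows once a value reaches 10 % or more.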

##Update by CPI (IPC), households, January 2018 to October 2020

ipc_enero18 <- read_delim(file = "ipc_enero18.csv", ",")

ipc_h_0118 <- left_join(pgm_region, ipc_enero18, by = "region") %>% 
  mutate(vp_ng = (monto*ipc_ng),
         vp_bsv = (monto*ipc_bsv),
         vp_cp = (monto*ipc_cp), #monthly present value
         anual = (vp_cp*12), #annualised present value
         prom_nacional = (mean(anual))) #simple (unweighted) mean of the regional values

ipc_ene18h <- ipc_h_0118 [, c(1, 2, 6, 9, 10)]

##Update by CPI (IPC), per capita

ipc_ene18pc <- left_join(region_pc, ipc_enero18, by = "region") %>% 
  mutate(vp_ng = (monto_pc*ipc_ng),
        vp_bsv = (monto_pc*ipc_bsv),
        vp_cp = (monto_pc*ipc_cp),
        anual_pc = (vp_cp*12),
        prom_nacional = (mean(anual_pc)))

ipc_pcene18 <- ipc_ene18pc [, c(1, 2, 8, 9, 10)]

ipc_mayo18 <- read_delim(file = "ipc_mayo18.csv", ",")  

ipc_mayo18pc <- left_join(region_pc, ipc_mayo18, by = "region") %>% 
  mutate(vp_ng = (monto_pc*ipc_ng1),
        vp_bsv = (monto_pc*ipc_bsv1),
        vp_cp = (monto_pc*ipc_cp1),
        anual_pc = (vp_cp*12),
        prom_nacional = (mean(anual_pc)))

tabla_ipc <- ipc_mayo18pc %>% 
  mutate(.,
         "Región" = region,
         "Monto per cápita" = round(monto_pc, 2),
         "Valor presente mensual" = round(vp_cp, 2),
          vp_anual = round(anual_pc, 2),
         "Valor presente anual" = vp_anual,
         "Promedio nacional anual" = round(prom_nacional, 2)) %>% 
  arrange(desc(vp_anual))

tabla_ipc <- tabla_ipc [, c(11:13, 15, 16)]

ipc_nov18 <- read_delim(file = "ipc_nov18.csv", ",")  

ipc_nov18pc <- left_join(region_pc, ipc_nov18, by = "region") %>% 
  mutate(vp_ng = (monto_pc*ipc_ng2),
        vp_bsv = (monto_pc*ipc_bsv2),
        vp_cp = (monto_pc*ipc_cp2),
        anual_pc = (vp_cp*12),
        prom_nacional = (mean(anual_pc)))

ipc_pcnov18 <- ipc_nov18pc [, c(1, 2, 8, 9, 10)]

##Table 3

tabla3 <- tabla_ipc %>%
 knitr::kable(align = "lcccc", caption = "Fuente: Elaboración propia en base a ENGHo 2017/18 - INDEC") %>% 
 kable_styling(c ("striped", "hover", "responsive"), full_width = F, position = "center",
                fixed_thead = T, font_size = 20) %>%
  column_spec(1, width = "4cm") %>% 
  column_spec(2, width = "4cm") %>% 
  column_spec(3, width = "4cm") %>% 
  column_spec(4, bold = TRUE, width = "4cm") %>%
  column_spec(5, width = "4cm") %>% 
  row_spec(0, bold = T, color = "white", background = paleta[7]) 

print(tabla3)

## <table class="table table-striped table-hover table-responsive" style="font-size: 20px; width: auto !important; margin-left: auto; margin-right: auto;">
## <caption style="font-size: initial !important;">Fuente: Elaboración propia en base a ENGHo 2017/18 - INDEC</caption>
##  <thead>
##   <tr>
##    <th style="text-align:left;position: sticky; top:0; background-color: #FFFFFF;font-weight: bold;color: white !important;background-color: #BD1F36 !important;"> Región </th>
##    <th style="text-align:center;position: sticky; top:0; background-color: #FFFFFF;font-weight: bold;color: white !important;background-color: #BD1F36 !important;"> Monto per cápita </th>
##    <th style="text-align:center;position: sticky; top:0; background-color: #FFFFFF;font-weight: bold;color: white !important;background-color: #BD1F36 !important;"> Valor presente mensual </th>
##    <th style="text-align:center;position: sticky; top:0; background-color: #FFFFFF;font-weight: bold;color: white !important;background-color: #BD1F36 !important;"> Valor presente anual </th>
##    <th style="text-align:center;position: sticky; top:0; background-color: #FFFFFF;font-weight: bold;color: white !important;background-color: #BD1F36 !important;"> Promedio nacional anual </th>
##   </tr>
##  </thead>
## <tbody>
##   <tr>
##    <td style="text-align:left;width: 4cm; "> Metropolitana </td>
##    <td style="text-align:center;width: 4cm; "> 127.90 </td>
##    <td style="text-align:center;width: 4cm; "> 447.64 </td>
##    <td style="text-align:center;width: 4cm; font-weight: bold;"> 5371.63 </td>
##    <td style="text-align:center;width: 4cm; "> 4097.97 </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;width: 4cm; "> Patagonia </td>
##    <td style="text-align:center;width: 4cm; "> 104.82 </td>
##    <td style="text-align:center;width: 4cm; "> 377.35 </td>
##    <td style="text-align:center;width: 4cm; font-weight: bold;"> 4528.20 </td>
##    <td style="text-align:center;width: 4cm; "> 4097.97 </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;width: 4cm; "> Pampeana </td>
##    <td style="text-align:center;width: 4cm; "> 102.08 </td>
##    <td style="text-align:center;width: 4cm; "> 357.29 </td>
##    <td style="text-align:center;width: 4cm; font-weight: bold;"> 4287.51 </td>
##    <td style="text-align:center;width: 4cm; "> 4097.97 </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;width: 4cm; "> Cuyo </td>
##    <td style="text-align:center;width: 4cm; "> 91.07 </td>
##    <td style="text-align:center;width: 4cm; "> 327.84 </td>
##    <td style="text-align:center;width: 4cm; font-weight: bold;"> 3934.06 </td>
##    <td style="text-align:center;width: 4cm; "> 4097.97 </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;width: 4cm; "> Noroeste </td>
##    <td style="text-align:center;width: 4cm; "> 75.16 </td>
##    <td style="text-align:center;width: 4cm; "> 270.57 </td>
##    <td style="text-align:center;width: 4cm; font-weight: bold;"> 3246.84 </td>
##    <td style="text-align:center;width: 4cm; "> 4097.97 </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;width: 4cm; "> Noreste </td>
##    <td style="text-align:center;width: 4cm; "> 72.51 </td>
##    <td style="text-align:center;width: 4cm; "> 268.30 </td>
##    <td style="text-align:center;width: 4cm; font-weight: bold;"> 3219.58 </td>
##    <td style="text-align:center;width: 4cm; "> 4097.97 </td>
##   </tr>
## </tbody>
## </table>
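
The arithmetic behind Table 3 can be sketched in three steps: multiply the per capita monthly amount by a cumulative price coefficient, multiply by 12 to annualise, and average across regions. In the example below the regions "A" and "B" and the column `coef_ipc` are hypothetical. The last lines check, from the figures printed in the table, that the national average is the simple mean of the six regional annual values.

```r
# Hypothetical inputs: per capita monthly amount and a cumulative price
# coefficient (current price level / survey-period price level)
region_demo <- data.frame(
  region   = c("A", "B"),
  monto_pc = c(100, 80),
  coef_ipc = c(3.5, 3.6)
)

# Monthly amount at current prices, then annualised
region_demo$vp_mensual <- region_demo$monto_pc * region_demo$coef_ipc
region_demo$vp_anual   <- region_demo$vp_mensual * 12

# Simple (unweighted) mean of the regional annual values
prom_simple <- mean(region_demo$vp_anual)

# Annual values printed in Table 3, in the order shown there
tabla3_anual <- c(5371.63, 4528.20, 4287.51, 3934.06, 3246.84, 3219.58)
prom_tabla3  <- round(mean(tabla3_anual), 2)
prom_tabla3
```

Because the mean is not weighted by population, each region counts equally regardless of how many households it contains. A population-weighted national figure would require the expansion weights.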

##Household spending quintiles

quintiles <- base_pgm_hogares %>% 
  group_by(quigapht) %>% 
  summarise(monto = weighted.mean(monto, pondera))

#Plot
ggplot(data = quintiles, aes(x = reorder(quigapht, monto), y = monto, fill = quigapht)) + 
  scale_fill_gradient(low = "#641220", high = "#E01E37")+
  geom_bar(stat="identity", position="stack")+
  labs( x = "", y = "Monto en pesos", fill = "Quintil de Gasto",
  title ="",
  caption = "Elaboración propia en base a INDEC. Encuesta Nacional de Gastos de los Hogares 2017-2018.")+
  guides(fill = "none")+
  theme_minimal()+
  theme(text = element_text(size = 14))

##Quintiles, per capita

quintiles_pc <- base_pgm_hogares %>% 
  group_by(quigapht) %>% 
  summarise(monto_pc = weighted.mean(monto_pc, pondera))

#Plot
ggplot(data = quintiles_pc, aes(x = reorder(quigapht, monto_pc), y = monto_pc, fill = quigapht)) +
  scale_fill_gradient(low = "#641220", high = "#E01E37")+
  geom_bar(stat="identity", position="stack")+
  labs( x = "", y = "Monto en pesos", fill = "Quintil de Gasto",
  title ="",
  caption = "Elaboración propia en base a INDEC. Encuesta Nacional de Gastos de los Hogares 2017-2018.")+
  guides(fill = "none")+
  theme_minimal()+
  theme(text = element_text(size = 14))

##Household educational climate

clima <- base_pgm_hogares %>% 
  mutate(.,
        clima_educativo = case_when(clima_educativo == 1 ~ "Muy bajo",
                                    clima_educativo == 2 ~ "Bajo",
                                    clima_educativo == 3 ~ "Medio",
                                    clima_educativo == 4 ~ "Alto",
                                    clima_educativo == 5 ~ "Muy alto")) %>%
  group_by(clima_educativo)%>% 
  summarise(monto = weighted.mean(monto, pondera))

#Plot
ggplot(data = clima, aes(x = reorder(clima_educativo, monto), y = monto, fill = clima_educativo)) + 
  geom_bar(stat = "identity", position = "dodge")+
  scale_fill_manual(values = colores)+
  labs( x = "", y = "Monto en pesos", fill = "Clima educativo del Hogar",
  title ="",
  caption = "Elaboración propia en base a INDEC. Encuesta Nacional de Gastos de los Hogares 2017-2018.")+
  guides(fill = "none")+
  theme_minimal()+
  theme(text = element_text(size = 14))