ESdata is an R package that provides access to statistical
information from Spain structured as tidy data.
If you have already installed the devtools package, you
can install ESdata from GitHub by running:
devtools::install_github("jmsallan/ESdata")
Once the package is installed, load it to access the data:
library(ESdata)
The package provides the data as a set of data
frames, and its only dependency is R version 3.5.0 or
higher. A good way to explore this data
is with the tidyverse packages:
library(tidyverse)
In its current version, the package covers several areas of Spanish statistics.
This document shows the information available on the Consumer Price Index (IPC).
The Consumer Price Index or Índice de Precios al Consumo (IPC) is an indicator of the price level of consumer goods and services in an economy. It is obtained from the prices of a family basket that represents the consumption habits of families at a given moment. The prices of the goods in the family basket are evaluated periodically to obtain an approximate measure of the price level. Therefore, maintaining an IPC requires defining a representative family basket, keeping it up to date, and periodically collecting the prices of the goods in it.
The IPC can be presented as an index, as the variation of prices with respect to the previous period, or as the variation with respect to the IPC value of a year earlier. The year-on-year variation corrects for the seasonality of some of the prices used to calculate the IPC.
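
The three presentations described above can be sketched in base R. This is only an illustration: the index values are made up (they are not INE data) and the helper variacion is a hypothetical name, not part of ESdata.

```r
# Hypothetical monthly index values (base = 100), NOT real INE data.
# Thirteen consecutive months, so the last one has a value a year earlier.
indice <- c(100.0, 100.2, 100.5, 100.4, 100.9, 101.3,
            101.1, 101.0, 101.4, 101.8, 102.0, 102.1, 102.5)

# Percentage change with respect to the value `lag` periods earlier.
# The first `lag` results are NA because there is nothing to compare with.
variacion <- function(x, lag) {
  n <- length(x)
  anterior <- c(rep(NA_real_, lag), x[seq_len(n - lag)])
  100 * (x / anterior - 1)
}

mensual <- variacion(indice, lag = 1)   # change vs. the previous month
anual   <- variacion(indice, lag = 12)  # change vs. the same month a year ago

round(mensual[2], 2)   # 0.2
round(anual[13], 2)    # 2.5
```

Because the yearly variation compares each month with the same month of the previous year, recurring seasonal movements affect both terms of the comparison and largely cancel out.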
The Instituto Nacional de Estadística (INE) provides monthly
updates of IPC data, and updates the family basket every year. The IPC
data incorporated into ESdata have been obtained from the
tables at https://www.ine.es/dynt3/inebase/es/index.htm?padre=3470&capsel=3466.
The standardized methodological report can be accessed from https://www.ine.es/dynt3/metadatos/es/RespuestaDatos.html?oe=30138.
The purchase price of housing is not part of the IPC, since housing
is considered an investment, and not a consumer good. The INE maintains
a Housing Price Index or Índice de Precios de la
Vivienda (IPV), in which the prices of new and second-hand housing
are weighted. The IPV values that have been incorporated into
ESdata have been obtained from https://www.ine.es/dyngs/INEbase/es/operacion.htm?c=Estadistica_C&cid=1254736152838&menu=ultiDatos&idp=1254735976607.
The methodological report for the IPV can be accessed from https://www.ine.es/dynt3/metadatos/es/RespuestaDatos.html?oe=30457.
The information on the IPC from the INE is incorporated into
ESdata through tables on the composition and weighting of
the family basket, and IPC tables.
The elements that make up the family basket are found in the
ipc_clas_grupos table, which shows how the goods that make
up the family basket are aggregated into homogeneous groups. The classification
levels, from largest to smallest, are:

- grupos, coded from G01 to G12.
- subgrupos, coded from G011 to G127.
- clases, coded from G0111 to G1270.
- subclases, coded from G01111 to G01111.
- grupos_especiales, coded from GE01 to GE29.
- rubricas, coded from R01 to R57.

In ipc_pond you can find the weighting of each of the
groups for the calculation of the IPC in each of the autonomous
communities and the whole of Spain for each year.
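
As a rough illustration of how such weights can be read, the sketch below turns the weights of one region and year into shares of the total. It assumes a data frame with the columns periodo, region, grupo and valor; the values are made up (not INE data), only three groups are included, and the column cuota is a hypothetical name added for the example.

```r
# Hypothetical weights shaped like ipc_pond (made-up values, NOT INE data).
pond <- data.frame(
  periodo = c(2020, 2020, 2020, 2020, 2019),
  region  = c("ES", "ES", "ES", "ES-CT", "ES"),
  grupo   = c("G01", "G07", "G11", "G01", "G01"),
  valor   = c(200, 150, 150, 180, 190)
)

# Keep a single region and a single year.
es_2020 <- pond[pond$periodo == 2020 & pond$region == "ES", ]

# Share of each group in the total weight of the selection.
es_2020$cuota <- es_2020$valor / sum(es_2020$valor)

# Largest share first.
es_2020 <- es_2020[order(es_2020$cuota, decreasing = TRUE), ]
es_2020[, c("grupo", "valor", "cuota")]
```

Dividing by the total makes the result independent of the unit in which the weights are expressed.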
I have structured the IPC information provided by the INE in the following tables:

- ipc_ccaa contains IPC information at the group level for Spain and the autonomous communities since 2002.
- ipc_clasif contains IPC information for Spain at all classification levels since 2002.
- ipc_hist contains the historical series of the Spanish IPC since 1961.
- ipc_hist_grupos contains IPC information for Spain at the group level since 1993.

Table variables are defined as follows:

- periodo: the last day of the month to which the row refers, stored as a Date.
- region: the ISO 3166-2 code of Spain or of one of its autonomous communities (see ccaa_iso for the official name).
- nivel: the level of data aggregation of the row.
- grupo: the code of the aggregation of products for which the price evolution is displayed. See ipc_clas_grupos for the name of each group.
- dato: the type of value, which can be the index (indice), the monthly variation (mensual), the yearly variation (anual) or the cumulative yearly variation (acumulado).
- valor: the value, as an index or as a percentage.

The following graph shows year-on-year inflation for the IPC historical series. The data from ipc_hist has been filtered to keep only the year-on-year values.
ipc_hist %>%
  filter(dato == "anual") %>%          # keep the year-on-year variation
  mutate(valor = valor / 100) %>%      # percentage points to proportion
  ggplot(aes(periodo, valor)) +
  geom_line(size = 1.3) +
  labs(title = "Evolución de la inflación de España (serie histórica)", x = "tiempo", y = "inflación interanual") +
  scale_y_continuous(labels = scales::percent) +
  theme_bw()
To display the values as percentages, I have divided the original data
by 100 with mutate and used
scale_y_continuous(labels = scales::percent).
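
The IPC tables are stored in long format: each row holds one value, and dato says what kind of value it is. When the index and the variation are needed side by side, the table can be reshaped to wide format. The following is a minimal sketch in base R, assuming a data frame with the columns periodo, dato and valor; the values are made up (not INE data) and the names largo and ancho are hypothetical.

```r
# Hypothetical rows shaped like ipc_hist (made-up values, NOT INE data).
largo <- data.frame(
  periodo = as.Date(c("2020-01-31", "2020-01-31", "2020-02-29", "2020-02-29")),
  dato    = c("indice", "anual", "indice", "anual"),
  valor   = c(104.0, 1.1, 103.9, 0.7)
)

# One row per periodo, one column per value of dato.
ancho <- reshape(largo, idvar = "periodo", timevar = "dato",
                 direction = "wide")

# reshape() names the new columns valor.<dato>; drop the prefix.
names(ancho) <- sub("^valor\\.", "", names(ancho))
ancho
```

With the tidyverse loaded, tidyr::pivot_wider(names_from = dato, values_from = valor) is an alternative way to obtain the same layout.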
In this graph we use ipc_hist_grupos to compare the
evolution of leisure and culture prices with the general evolution of
prices:
ipc_hist_grupos %>%
  filter(dato == "anual", grupo %in% c("general", "G09")) %>%
  mutate(valor = valor / 100) %>%      # percentage points to proportion
  ggplot(aes(periodo, valor, col = grupo)) +
  geom_line(size = 1.2) +
  scale_color_manual(name = "índice", labels = c("ocio y cultura", "general"), values = c("#FF4500", "#606060")) +
  labs(title = "Evolución de la inflación del ocio y cultura en España", x = "tiempo", y = "inflación interanual") +
  scale_y_continuous(labels = scales::percent) +
  # legend placed inside the plot area, with matching background colors
  theme(legend.position = c(0.1, 0.15),
        panel.background = element_rect(fill = "#F5F5DC"),
        legend.background = element_rect(fill = "#F5F5DC"),
        legend.key = element_rect(fill = "#F5F5DC"))
In this graph I have placed the legend inside the plot area to make use of the empty space, and I have changed the background color of the plot and of the legend.
The following graph shows the weighting of the IPC groups as a pie
chart. ggplot does not provide this type of plot
directly, but we can use coord_polar to turn a stacked bar into a pie
chart. To distinguish the twelve IPC groups, I have used the qualitative
Brewer palette Paired, which has twelve colors.
ipc_pond %>%
  filter(periodo == 2020 & region == "ES") %>%
  ggplot(aes(x = "", y = valor, fill = grupo)) +
  scale_fill_brewer(palette = "Paired") +
  geom_bar(stat = "identity") +        # a single stacked bar
  coord_polar("y", start = 0) +        # bent into a pie
  theme_void()
The following table shows the codes of the twelve IPC groups:
| codigo | nombre |
|---|---|
| G01 | Alimentos y bebidas no alcohólicas |
| G02 | Bebidas alcohólicas y tabaco |
| G03 | Vestido y calzado |
| G04 | Vivienda, agua, electricidad, gas y otros combustibles |
| G05 | Muebles, artículos del hogar y artículos para el mantenimiento corriente del hogar |
| G06 | Sanidad |
| G07 | Transporte |
| G08 | Comunicaciones |
| G09 | Ocio y cultura |
| G10 | Enseñanza |
| G11 | Restaurantes y hoteles |
| G12 | Otros bienes y servicios |
From the graph we see that the groups with the highest weighting are food (G01), transport (G07) and restaurants and hotels (G11).
This chart shows the evolution of the IPC for Catalonia and the whole of Spain, using a line chart with colors defined specifically for it.
ipc_ccaa %>%
  filter(region %in% c("ES", "ES-CT") & dato == "indice" & grupo == "general") %>%
  ggplot(aes(periodo, valor, col = region)) +
  geom_line(size = 1.2) +
  scale_color_manual(name = "indice", labels = c("España", "Cataluña"), values = c("#606060", "#A0522D")) +
  labs(title = "IPC de Cataluña y España", x = "", y = "índice") +
  # legend placed inside the plot area, with matching background colors
  theme(legend.position = c(0.8, 0.15),
        panel.background = element_rect(fill = "#FAEBD7"),
        legend.background = element_rect(fill = "#FAEBD7"),
        legend.key = element_rect(fill = "#FAEBD7"))
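
The same comparison between Catalonia and Spain can also be made numerically. The sketch below, in base R, puts the two series side by side and computes their difference. It assumes a data frame with the columns periodo, region and valor that has already been filtered to the general index; the values are made up (not INE data) and the names ccaa, comparacion and diferencia are hypothetical.

```r
# Hypothetical index values shaped like ipc_ccaa (made-up, NOT INE data),
# already restricted to dato == "indice" and grupo == "general".
ccaa <- data.frame(
  periodo = rep(as.Date(c("2020-01-31", "2020-02-29")), times = 2),
  region  = rep(c("ES", "ES-CT"), each = 2),
  valor   = c(104.0, 103.9, 104.6, 104.4)
)

# Split by region and keep only the columns needed for the join.
es <- ccaa[ccaa$region == "ES",    c("periodo", "valor")]
ct <- ccaa[ccaa$region == "ES-CT", c("periodo", "valor")]

# Join on periodo so that each row holds both indices for the same month.
comparacion <- merge(es, ct, by = "periodo", suffixes = c("_es", "_ct"))

# Difference in index points between Catalonia and Spain.
comparacion$diferencia <- comparacion$valor_ct - comparacion$valor_es
comparacion
```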
If you have any proposals or have detected any errors, you can make a pull request at: https://github.com/jmsallan/ESdata