By the end of this topic, we will have a clear picture of the following:
Data profiling is the exhaustive review of the information sources used in a data analysis project: we study their structure, content and interrelationships in order to identify their potential and their problems.
As a data-cleaning tool, R reduces (or avoids) licensing costs and is a powerful language for statistical analysis thanks to its large ecosystem of specialised packages; it is also well documented and comparatively quick to learn.
It is easy to automate: a script that reads data or performs operations on it can be rerun unchanged whenever needed.
It can read practically any kind of structured data (rows and columns), such as delimited text (csv, txt), Excel (xlsx), SAS, Stata, SPSS and SQL sources, as well as unstructured data: text, audio, images and video.
Up to a point, it can handle large data sets.
It has advanced graphics capabilities, so we can build charts (Top 50 ggplot2 Visualizations) and dashboards (shiny) to present results attractively.
Its functionality improves constantly, because a large community creates new functions, fixes bugs and, above all, documents its work well, which makes functions and methods easy to use.
1- The first pane holds a script (.R), where we can write, run and save our code.
2- Below it is the console, which shows the results of the code we run and where we can try things out.
3- The third pane shows our workspace (environment), with the tables, functions and variables we create.
4- In the bottom-right corner we find the plots we create, the installed packages, the files and a help window.
Today there are several statistical techniques that provide information about the qualitative characteristics of data; one of their main goals is to discover and validate the metadata of the data set at hand.
Common uses of data profiling include:
In any of them, profiling makes it possible to discover problems in the data sources and their possible causes (for example, human error or data corruption), and what needs to be done to correct them when the data are integrated.
A data profiling process takes the following considerations into account:
Common situations we will encounter (there are many more):
Confusing data profiling with data quality is a very common mistake; each deals with a different stage of data management. We will look more closely at the concept of data quality later on.
Profiling is a diagnostic and analysis stage in which quality requirements are determined; to talk about data quality, we must evaluate the state of our data against a desired parameter and quantify it.
In general, the data quality stage comes after data profiling.
The following diagram shows the data quality life cycle:
Tidyverse is a collection of packages that covers most of the tasks a data scientist has to perform. It is a contribution from one of the leading figures in the R community: Hadley Wickham.
It includes packages for data mining, as shown below:
ggplot2.– The best-known package. It is a grammar of graphics for exploring data and communicating conclusions. It lets us map data to different geometries, scales, coordinate systems, facets, zooms, etc.
dplyr.– The second best-known package. Built for transforming data, it plays a role similar to SQL and offers comparable operations (select, filter, group, summarise, join).
readr.– A package for reading rectangular data from text files such as csv. Its advantage over other ways of reading data in R is that it fits naturally with the two packages above by chaining commands with the pipe operator %>%.
purrr.– A package of functional programming tools that apply a function to every element of a list or vector, complementing R's vectorised style. The following example illustrates it:
We only need to install tidyverse!
library(tidyverse)
library(purrr) # load the package (tidyverse already attaches it; shown for clarity)
mtcars %>% str()
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
## $ qsec: num 16.5 17 18.6 19.4 17 ...
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : num 1 1 1 0 0 0 0 0 0 0 ...
## $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
mtcars %>%                            # choose the data set to work with
  split(.$cyl) %>%                    # split the data by the distinct values of cyl
  map(~ lm(mpg ~ wt, data = .)) %>%   # fit a linear regression for each subset
  map(summary) %>%                    # compute the model summary for each subset
  map_dbl("r.squared")                # extract the coefficient of determination for cyl = 4, 6 and 8
## 4 6 8
## 0.5086326 0.4645102 0.4229655
There are three main types of analysis:
In addition, knowledge of our business is key: how the data behave depending on the industry, company and systems we work with. There is an additional (often informal) analysis concerned with validating data rules, which checks that instances and data sets conform to predefined rules. For example:
Examples:
# the # symbol lets us add comments to our code
comunidades <- read.csv('https://storage.googleapis.com/datasets-academy/Profiling/Data/communities.csv' , sep=',', na.strings ="?") # read a csv file, treating "?" as missing
str(comunidades) # compactly display the structure of an R object
## 'data.frame': 1994 obs. of 128 variables:
## $ state : int 8 53 24 34 42 6 44 6 21 29 ...
## $ county : int NA NA NA 5 95 NA 7 NA NA NA ...
## $ community : int NA NA NA 81440 6096 NA 41500 NA NA NA ...
## $ communityname : Factor w/ 1828 levels "Aberdeencity",..: 796 1626 2 1788 142 1520 840 1462 669 288 ...
## $ fold : int 1 1 1 1 1 1 1 1 1 1 ...
## $ population : num 0.19 0 0 0.04 0.01 0.02 0.01 0.01 0.03 0.01 ...
## $ householdsize : num 0.33 0.16 0.42 0.77 0.55 0.28 0.39 0.74 0.34 0.4 ...
## $ racepctblack : num 0.02 0.12 0.49 1 0.02 0.06 0 0.03 0.2 0.06 ...
## $ racePctWhite : num 0.9 0.74 0.56 0.08 0.95 0.54 0.98 0.46 0.84 0.87 ...
## $ racePctAsian : num 0.12 0.45 0.17 0.12 0.09 1 0.06 0.2 0.02 0.3 ...
## $ racePctHisp : num 0.17 0.07 0.04 0.1 0.05 0.25 0.02 1 0 0.03 ...
## $ agePct12t21 : num 0.34 0.26 0.39 0.51 0.38 0.31 0.3 0.52 0.38 0.9 ...
## $ agePct12t29 : num 0.47 0.59 0.47 0.5 0.38 0.48 0.37 0.55 0.45 0.82 ...
## $ agePct16t24 : num 0.29 0.35 0.28 0.34 0.23 0.27 0.23 0.36 0.28 0.8 ...
## $ agePct65up : num 0.32 0.27 0.32 0.21 0.36 0.37 0.6 0.35 0.48 0.39 ...
## $ numbUrban : num 0.2 0.02 0 0.06 0.02 0.04 0.02 0 0.04 0.02 ...
## $ pctUrban : num 1 1 0 1 0.9 1 0.81 0 1 1 ...
## $ medIncome : num 0.37 0.31 0.3 0.58 0.5 0.52 0.42 0.16 0.17 0.54 ...
## $ pctWWage : num 0.72 0.72 0.58 0.89 0.72 0.68 0.5 0.44 0.47 0.59 ...
## $ pctWFarmSelf : num 0.34 0.11 0.19 0.21 0.16 0.2 0.23 1 0.36 0.22 ...
## $ pctWInvInc : num 0.6 0.45 0.39 0.43 0.68 0.61 0.68 0.23 0.34 0.86 ...
## $ pctWSocSec : num 0.29 0.25 0.38 0.36 0.44 0.28 0.61 0.53 0.55 0.42 ...
## $ pctWPubAsst : num 0.15 0.29 0.4 0.2 0.11 0.15 0.21 0.97 0.48 0.02 ...
## $ pctWRetire : num 0.43 0.39 0.84 0.82 0.71 0.25 0.54 0.41 0.43 0.31 ...
## $ medFamInc : num 0.39 0.29 0.28 0.51 0.46 0.62 0.43 0.15 0.21 0.85 ...
## $ perCapInc : num 0.4 0.37 0.27 0.36 0.43 0.72 0.47 0.1 0.23 0.89 ...
## $ whitePerCap : num 0.39 0.38 0.29 0.4 0.41 0.76 0.44 0.12 0.23 0.94 ...
## $ blackPerCap : num 0.32 0.33 0.27 0.39 0.28 0.77 0.4 0.08 0.19 0.11 ...
## $ indianPerCap : num 0.27 0.16 0.07 0.16 0 0.28 0.24 0.17 0.1 0.09 ...
## $ AsianPerCap : num 0.27 0.3 0.29 0.25 0.74 0.52 0.86 0.27 0.26 0.33 ...
## $ OtherPerCap : num 0.36 0.22 0.28 0.36 0.51 0.48 0.24 0.18 0.29 0.17 ...
## $ HispPerCap : num 0.41 0.35 0.39 0.44 0.48 0.6 0.36 0.21 0.22 0.8 ...
## $ NumUnderPov : num 0.08 0.01 0.01 0.01 0 0.01 0.01 0.03 0.04 0 ...
## $ PctPopUnderPov : num 0.19 0.24 0.27 0.1 0.06 0.12 0.11 0.64 0.45 0.11 ...
## $ PctLess9thGrade : num 0.1 0.14 0.27 0.09 0.25 0.13 0.29 0.96 0.52 0.04 ...
## $ PctNotHSGrad : num 0.18 0.24 0.43 0.25 0.3 0.12 0.41 0.82 0.59 0.03 ...
## $ PctBSorMore : num 0.48 0.3 0.19 0.31 0.33 0.8 0.36 0.12 0.17 1 ...
## $ PctUnemployed : num 0.27 0.27 0.36 0.33 0.12 0.1 0.28 1 0.55 0.11 ...
## $ PctEmploy : num 0.68 0.73 0.58 0.71 0.65 0.65 0.54 0.26 0.43 0.44 ...
## $ PctEmplManu : num 0.23 0.57 0.32 0.36 0.67 0.19 0.44 0.43 0.59 0.2 ...
## $ PctEmplProfServ : num 0.41 0.15 0.29 0.45 0.38 0.77 0.53 0.34 0.36 1 ...
## $ PctOccupManu : num 0.25 0.42 0.49 0.37 0.42 0.06 0.33 0.71 0.64 0.02 ...
## $ PctOccupMgmtProf : num 0.52 0.36 0.32 0.39 0.46 0.91 0.49 0.18 0.29 0.96 ...
## $ MalePctDivorce : num 0.68 1 0.63 0.34 0.22 0.49 0.25 0.38 0.62 0.3 ...
## $ MalePctNevMarr : num 0.4 0.63 0.41 0.45 0.27 0.57 0.34 0.47 0.26 0.85 ...
## $ FemalePctDiv : num 0.75 0.91 0.71 0.49 0.2 0.61 0.28 0.59 0.66 0.39 ...
## $ TotalPctDiv : num 0.75 1 0.7 0.44 0.21 0.58 0.28 0.52 0.67 0.36 ...
## $ PersPerFam : num 0.35 0.29 0.45 0.75 0.51 0.44 0.42 0.78 0.37 0.31 ...
## $ PctFam2Par : num 0.55 0.43 0.42 0.65 0.91 0.62 0.77 0.45 0.51 0.65 ...
## $ PctKids2Par : num 0.59 0.47 0.44 0.54 0.91 0.69 0.81 0.43 0.55 0.73 ...
## $ PctYoungKids2Par : num 0.61 0.6 0.43 0.83 0.89 0.87 0.79 0.34 0.58 0.78 ...
## $ PctTeen2Par : num 0.56 0.39 0.43 0.65 0.85 0.53 0.74 0.34 0.47 0.67 ...
## $ PctWorkMomYoungKids : num 0.74 0.46 0.71 0.85 0.4 0.3 0.57 0.29 0.65 0.72 ...
## $ PctWorkMom : num 0.76 0.53 0.67 0.86 0.6 0.43 0.62 0.27 0.64 0.71 ...
## $ NumIlleg : num 0.04 0 0.01 0.03 0 0 0 0.02 0.02 0 ...
## $ PctIlleg : num 0.14 0.24 0.46 0.33 0.06 0.11 0.13 0.5 0.29 0.07 ...
## $ NumImmig : num 0.03 0.01 0 0.02 0 0.04 0.01 0.02 0 0.01 ...
## $ PctImmigRecent : num 0.24 0.52 0.07 0.11 0.03 0.3 0 0.5 0.12 0.41 ...
## $ PctImmigRec5 : num 0.27 0.62 0.06 0.2 0.07 0.35 0.02 0.59 0.09 0.44 ...
## $ PctImmigRec8 : num 0.37 0.64 0.15 0.3 0.2 0.43 0.02 0.65 0.07 0.52 ...
## $ PctImmigRec10 : num 0.39 0.63 0.19 0.31 0.27 0.47 0.1 0.59 0.13 0.48 ...
## $ PctRecentImmig : num 0.07 0.25 0.02 0.05 0.01 0.5 0 0.69 0 0.22 ...
## $ PctRecImmig5 : num 0.07 0.27 0.02 0.08 0.02 0.5 0.01 0.72 0 0.21 ...
## $ PctRecImmig8 : num 0.08 0.25 0.04 0.11 0.04 0.56 0.01 0.71 0 0.22 ...
## $ PctRecImmig10 : num 0.08 0.23 0.05 0.11 0.05 0.57 0.03 0.6 0 0.19 ...
## $ PctSpeakEnglOnly : num 0.89 0.84 0.88 0.81 0.88 0.45 0.73 0.12 0.99 0.85 ...
## $ PctNotSpeakEnglWell : num 0.06 0.1 0.04 0.08 0.05 0.28 0.05 0.93 0.01 0.03 ...
## $ PctLargHouseFam : num 0.14 0.16 0.2 0.56 0.16 0.25 0.12 0.74 0.12 0.09 ...
## $ PctLargHouseOccup : num 0.13 0.1 0.2 0.62 0.19 0.19 0.13 0.75 0.12 0.06 ...
## $ PersPerOccupHous : num 0.33 0.17 0.46 0.85 0.59 0.29 0.42 0.8 0.35 0.15 ...
## $ PersPerOwnOccHous : num 0.39 0.29 0.52 0.77 0.6 0.53 0.54 0.68 0.38 0.34 ...
## $ PersPerRentOccHous : num 0.28 0.17 0.43 1 0.37 0.18 0.24 0.92 0.33 0.05 ...
## $ PctPersOwnOccup : num 0.55 0.26 0.42 0.94 0.89 0.39 0.65 0.39 0.5 0.48 ...
## $ PctPersDenseHous : num 0.09 0.2 0.15 0.12 0.02 0.26 0.03 0.89 0.1 0.03 ...
## $ PctHousLess3BR : num 0.51 0.82 0.51 0.01 0.19 0.73 0.46 0.66 0.64 0.58 ...
## $ MedNumBR : num 0.5 0 0.5 0.5 0.5 0 0.5 0 0 0 ...
## $ HousVacant : num 0.21 0.02 0.01 0.01 0.01 0.02 0.01 0.01 0.04 0.02 ...
## $ PctHousOccup : num 0.71 0.79 0.86 0.97 0.89 0.84 0.89 0.91 0.72 0.72 ...
## $ PctHousOwnOcc : num 0.52 0.24 0.41 0.96 0.87 0.3 0.57 0.46 0.49 0.38 ...
## $ PctVacantBoarded : num 0.05 0.02 0.29 0.6 0.04 0.16 0.09 0.22 0.05 0.07 ...
## $ PctVacMore6Mos : num 0.26 0.25 0.3 0.47 0.55 0.28 0.49 0.37 0.49 0.47 ...
## $ MedYrHousBuilt : num 0.65 0.65 0.52 0.52 0.73 0.25 0.38 0.6 0.5 0.04 ...
## $ PctHousNoPhone : num 0.14 0.16 0.47 0.11 0.05 0.02 0.05 0.28 0.57 0.01 ...
## $ PctWOFullPlumb : num 0.06 0 0.45 0.11 0.14 0.05 0.05 0.23 0.22 0 ...
## $ OwnOccLowQuart : num 0.22 0.21 0.18 0.24 0.31 0.94 0.37 0.15 0.07 0.63 ...
## $ OwnOccMedVal : num 0.19 0.2 0.17 0.21 0.31 1 0.38 0.13 0.07 0.71 ...
## $ OwnOccHiQuart : num 0.18 0.21 0.16 0.19 0.3 1 0.39 0.13 0.08 0.79 ...
## $ RentLowQ : num 0.36 0.42 0.27 0.75 0.4 0.67 0.26 0.21 0.14 0.44 ...
## $ RentMedian : num 0.35 0.38 0.29 0.7 0.36 0.63 0.35 0.24 0.17 0.42 ...
## $ RentHighQ : num 0.38 0.4 0.27 0.77 0.38 0.68 0.42 0.25 0.16 0.47 ...
## $ MedRent : num 0.34 0.37 0.31 0.89 0.38 0.62 0.35 0.24 0.15 0.41 ...
## $ MedRentPctHousInc : num 0.38 0.29 0.48 0.63 0.22 0.47 0.46 0.64 0.38 0.23 ...
## $ MedOwnCostPctInc : num 0.46 0.32 0.39 0.51 0.51 0.59 0.44 0.59 0.13 0.27 ...
## $ MedOwnCostPctIncNoMtg: num 0.25 0.18 0.28 0.47 0.21 0.11 0.31 0.28 0.36 0.28 ...
## $ NumInShelters : num 0.04 0 0 0 0 0 0 0 0.01 0 ...
## $ NumStreet : num 0 0 0 0 0 0 0 0 0 0 ...
## $ PctForeignBorn : num 0.12 0.21 0.14 0.19 0.11 0.7 0.15 0.59 0.01 0.22 ...
## $ PctBornSameState : num 0.42 0.5 0.49 0.3 0.72 0.42 0.81 0.58 0.78 0.42 ...
## $ PctSameHouse85 : num 0.5 0.34 0.54 0.73 0.64 0.49 0.77 0.52 0.48 0.34 ...
## [list output truncated]
comunidades %>%        # data set
  select(state) %>%    # select one or more columns
  nrow()               # number of rows
## [1] 1994
comunidades %>%
  select(state) %>%
  summary() # generic function that produces summaries of its input
## state
## Min. : 1.00
## 1st Qu.:12.00
## Median :34.00
## Mean :28.68
## 3rd Qu.:42.00
## Max. :56.00
comunidades %>%
  select(state) %>%
  unique() %>% # returns the rows with duplicates removed
  nrow()
## [1] 46
comunidades %>%
  select(state) %>%
  duplicated() %>% # flags which rows repeat an earlier row (TRUE/FALSE)
  sum()            # total number of observations flagged as duplicates
## [1] 1948
comunidades %>%
  select(state) %>%
  group_by(state) %>%      # group by one or more variables
  summarize(n=n()) %>%     # collapse each group to one row holding its count
  arrange(n)               # sort the rows by the values of one or more columns
## # A tibble: 46 x 2
## state n
## <int> <int>
## 1 10 1
## 2 11 1
## 3 20 1
## 4 2 3
## 5 50 4
## 6 32 5
## 7 16 7
## 8 27 7
## 9 56 7
## 10 38 8
## # ... with 36 more rows
library(ggplot2)
comunidades %>%
  select(state) %>%
  group_by(state) %>%
  summarize(NumeroObservaciones=n()) %>%
  ggplot(aes(NumeroObservaciones)) + # create a new plot mapped to the number of observations
  geom_histogram(bins = 30)          # histogram; bins is the number of intervals
This analysis looks for possible errors in the individual records of the data source, identifying which specific rows of a table contain problems and which systemic problems occur across different columns.
It is sometimes complemented with business rules that verify the accuracy and integrity of the data.
Examples:
Also known as dependency profile.
Also called cross-table analysis or “join profile”, it identifies how parts of the data relate to each other; it mainly tries to determine similarities or differences in syntax and data types between different tables.
Examples:
Understanding relationships is crucial for reusing data.
Note: For a proper relationship analysis, it is advisable first to consolidate related data sources into a single database, or otherwise to take great care when importing the data sets so that the important relationships are preserved.
This course aims to identify missing-data problems and prepare analytical structures for machine learning modelling; we will therefore concentrate on structure analysis, or column profile, and its main tasks.
Column profiling tasks can be classified as follows:
Cardinality: count the number of elements (finite or infinite) in a column.
Patterns: search for values that do not match the pattern of a “correct” value; regular expressions are very useful for finding occurrences of correct or incorrect values. This criterion is usually a combination of one or more of the other tasks, for example length + null values.
Distribution: detect whether the data follow a known distribution or behaviour.
Uniqueness: count the number of unique entries in a given column; this criterion is useful, for example, for identifying duplicated values.
Length: check that fields have the correct format in terms of length and data type, for example telephone numbers or postal codes.
Range: compute the minimum and maximum values of the field, not only for numbers but also for text.
Frequency: count the occurrences of particular values.
Null values: count the number of null or empty values, to identify incomplete records or corrupted data.
Dependencies: determine whether relationships or embedded structures exist within a data set. For example, a functional dependency between column X and column Y may indicate that the values of both columns are equal or are related by some business rule. Functional dependency analysis can be used to identify redundant data and mapped values, and it helps to suggest opportunities for data normalisation.
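Several of the column profiling tasks above can be sketched in a few lines of base R. The data frame `customers`, its columns and the ten-digit phone rule are assumptions made up for this illustration; they do not come from the course data.

```r
# Hypothetical toy data: names, values and rules are illustrative only.
customers <- data.frame(
  id    = c(1, 2, 3, 4, 5, 5),
  phone = c("0991234567", "099123456", "0987654321", NA,
            "0971112223", "0971112223"),
  city  = c("Quito", "Quito", "Cuenca", "Loja", NA, NA),
  stringsAsFactors = FALSE
)

# Profile one column: null values, uniqueness and range in a single row.
profile_column <- function(x) {
  observed <- x[!is.na(x)]
  data.frame(
    n_rows     = length(x),
    n_missing  = sum(is.na(x)),            # null values
    n_distinct = length(unique(observed)), # uniqueness
    n_repeated = sum(duplicated(observed)),# entries that repeat an earlier one
    min        = as.character(min(observed)), # range (also works for text)
    max        = as.character(max(observed)),
    stringsAsFactors = FALSE
  )
}

# One profile row per column of the data frame.
profile <- do.call(rbind, lapply(customers, profile_column))
print(profile)

# Frequency: occurrences of each value, including missing ones.
city_freq <- table(customers$city, useNA = "ifany")
print(city_freq)

# Length and pattern: assumed rule that a valid phone has exactly 10 digits.
phone_length <- nchar(customers$phone)
bad_phone <- !is.na(customers$phone) & !grepl("^[0-9]{10}$", customers$phone)
print(customers[bad_phone, ])
```

Keeping each check as a small function makes it easy to apply the same profile to every column of a larger table such as `comunidades`.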
By the end of this topic, you should be able to:
Today we live in a heterogeneous world, coexisting with different technologies and operating multiple platforms, sensors and devices that generate an impressive volume of data every second. That alone is a hard scenario to manage, and harder still when companies lack a clear data management strategy, which often results in duplicated, inconsistent, ambiguous and incomplete data that are of no use for operations or decision-making.
Data are said to be the oil of the 21st century because of their business potential and impact on society; good information is therefore a most valuable asset, while bad information can seriously damage a business and its credibility.
Data quality is a perception or assessment of the fitness of data to serve its purpose in a given context.
“Even though quality cannot be defined, you know what it is” Robert Pirsig
Data quality, then, is a perception or assessment of how fit data are to serve their purpose in a given context. Because it is a perception, we intuitively think of certain aspects of the data: that they are accurate, complete, up to date, and so on.
This is why data quality is called a “multifaceted” concept: it depends on the dimensions that define it.
“Garbage in, garbage out” Wilf Hey
“You can have data without information, but you cannot have information without data” Daniel Keys Moran.
Completeness: the proportion of stored data against the potential of “100% complete”.
Uniqueness: nothing is recorded more than once, based on how that thing is identified.
Timeliness: the degree to which data represent reality from the required point in time.
Validity: data are valid if they conform to the syntax (format, type, range) of their definition.
Accuracy: the degree to which data correctly describe the “real world” object or event.
Consistency: the absence of difference when comparing two or more representations of a thing against a definition.
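Some of these dimensions can be quantified directly. The following is a minimal sketch in base R, assuming a made-up `orders` table and an invented validity rule (amounts must not be negative); neither comes from the course material.

```r
# Hypothetical toy data; the rule "amount >= 0" is an assumption.
orders <- data.frame(
  order_id = c("A1", "A2", "A2", "A4", "A5"),
  amount   = c(10.5, NA, 20, -3, 15),
  stringsAsFactors = FALSE
)

# Completeness: share of stored (non-missing) values in each column.
completeness <- colMeans(!is.na(orders))

# Uniqueness: share of rows whose identifier does not repeat an earlier row.
uniqueness <- mean(!duplicated(orders$order_id))

# Validity: share of observed amounts that satisfy the assumed rule.
observed_amount <- orders$amount[!is.na(orders$amount)]
validity <- mean(observed_amount >= 0)

print(completeness)
print(uniqueness)
print(validity)
```

Expressing each dimension as a proportion between 0 and 1 makes it possible to compare the measured state of the data with a target value, which is what distinguishes data quality from profiling.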
By the end of this topic, the student will be able to:
Missing data (missing values) occur when no value is stored for a variable in an observation. They are common and can have a significant effect on the conclusions drawn from the data.
In the real world, many data sets contain missing values for a variety of reasons. They are usually encoded as NaN, NULL, blanks or some other placeholder. Any analysis run on a data set with many missing values has a high probability of reaching distorted conclusions.
They can arise from non-response by a unit of analysis, when no information is provided for one or more variables. Some variables are more likely than others to produce non-response, generally those involving private information; income is a clear example.
In longitudinal studies, data are often lost through attrition: a measurement is repeated after a certain period and participants drop out before the study ends, so one or more measurements are missing.
Data are also often missing in economics, sociology and political science research because governments or private entities choose not to report, or fail to report, critical statistics, or because the information is simply not available.
Missing values can also be caused by the researcher, through methodological errors or incorrect data collection or entry.
For these reasons, missing data affect the validity of a study's conclusions differently depending on the type of missingness.
Finally, one way to handle the problem is to discard observations that have a missing value in any variable; doing so, however, risks losing valuable information. A better strategy is often to impute the missing values, that is, to infer them from the data that do exist.
Structurally missing data are missing for a logical reason, in many cases because of the nature of the business or the normal behaviour of the data we are handling. In other words, they are missing because they simply should not exist.
In the example below, the first and third observations have missing values for the variable “EdadHijoMenor” (age of the youngest child) because these people have no children (“Hijos” is NO).
ID | Hijos | EdadHijoMenor |
---|---|---|
1 | NO | NA |
2 | SI | 18 |
3 | NO | NA |
4 | SI | 13 |
5 | SI | 8 |
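The table above can be rebuilt in R to check that every missing value is structural. This is a minimal sketch; the helper columns `structural_na` and `unexplained_na` are names introduced here for illustration.

```r
# The example table, with the column names used in the text.
people <- data.frame(
  ID            = 1:5,
  Hijos         = c("NO", "SI", "NO", "SI", "SI"),
  EdadHijoMenor = c(NA, 18, NA, 13, 8),
  stringsAsFactors = FALSE
)

# A missing age is structural when the person has no children.
people$structural_na <- is.na(people$EdadHijoMenor) & people$Hijos == "NO"

# A missing age for someone who does have children would need investigation.
people$unexplained_na <- is.na(people$EdadHijoMenor) & people$Hijos == "SI"

# Summaries of the age should only consider people who have children.
mean_age <- mean(people$EdadHijoMenor[people$Hijos == "SI"])

print(people)
print(mean_age)
```

Because structural gaps are explained by another column, they are analysed by restricting the calculation to the rows where the value can exist, rather than by imputing them.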
When we cannot draw conclusions about the likely value of the missing data, and the values are missing for reasons we simply do not understand, we speak of data missing not at random (NMAR, also written MNAR), or non-ignorable missing data.
Structurally missing data can be seen as a special case of data missing not at random, but the distinction matters: structurally missing data are easy to analyse, whereas other forms of non-random missingness are highly problematic.
When data are missing not at random, the standard methods for handling missing data (for example, imputation or algorithms designed specifically for missing values) cannot be relied upon, because standard calculations will generally give biased answers.
In these cases the missing values usually cannot be recovered: we normally cannot reproduce their behaviour or identify the data source, or they correspond to data collected in different time periods that are not comparable with each other. As a result, combining such data in a single analysis could produce incorrect or misleading conclusions.
Finally, treating them as data missing at random would also be inappropriate.
This is the easiest and quickest method: by doing nothing, we let the algorithm we train handle the missing data.
Some algorithms can take missing values into account and learn the best imputation values (or even treat them as a separate category) based on the loss function computed during training; tree-based models are an example. Others can simply ignore the affected cases, as regression models do. Depending on the library used, some algorithms will not run at all and will ask us to deal with the missing values first.
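As a small illustration of an algorithm that ignores incomplete cases, the sketch below fits a linear regression with base R's `lm()`. The data frame `df` is invented for the example; `na.omit` and `na.fail` are standard `na.action` choices in base R.

```r
# Hypothetical toy data with two missing responses.
df <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  y = c(2.1, NA, 6.2, 7.9, NA, 12.1)
)

# With na.action = na.omit, rows containing NA are silently dropped.
fit <- lm(y ~ x, data = df, na.action = na.omit)
n_used <- nobs(fit)  # number of rows actually used in the fit

# With na.action = na.fail, the same model refuses to run.
refused <- tryCatch({
  lm(y ~ x, data = df, na.action = na.fail)
  FALSE
}, error = function(e) TRUE)

print(n_used)
print(refused)
```

Comparing `nobs(fit)` with `nrow(df)` is a quick way to notice how many observations a model has discarded without warning.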
This method can only be used with numeric data.
To impute with the mean or the median, we follow the steps described below:
The choice depends on the symmetry of the distribution of the observed data. If the distribution is symmetric, the mean and the median are approximately equal and either value can be used. If the distribution is skewed to the right, the mean will be greater than the median, and the median is preferable so as not to distort the distribution. Finally, if the distribution is skewed to the left, the mean will be less than the median, and the median is again the more robust choice.
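The effect of skewness on the two candidates can be seen with a small numeric example. The vector `income` is made up for this sketch; one very large value makes it right-skewed.

```r
# Hypothetical right-skewed variable with two missing values.
income <- c(300, 320, 350, 400, 420, 450, 5000, NA, NA)

mean_obs   <- mean(income, na.rm = TRUE)    # pulled upwards by the value 5000
median_obs <- median(income, na.rm = TRUE)  # stays in the centre of the data

# Two imputed versions of the same variable.
income_mean   <- ifelse(is.na(income), mean_obs, income)
income_median <- ifelse(is.na(income), median_obs, income)

print(c(mean = mean_obs, median = median_obs))
print(income_median)
```

Here the mean is more than twice the median, so imputing with the mean would place the filled-in values far from where most observations lie.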
Its advantages are:
Its disadvantages are:
A similar statistical method for imputing missing data uses the most frequent value (the mode). This technique can work with categorical variables and consists of:
Its advantages are:
Its disadvantages are:
This is a very simple method. As its name suggests, it replaces missing values with zero or any other constant we specify.
It is used, for example, when the missing value actually corresponds to a zero.
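Imputation with the most frequent value and with a constant can both be sketched in base R. The helper `most_frequent()` and the vectors `payment` and `calls` are hypothetical names and data introduced only for this illustration.

```r
# Most frequent non-missing value; ties go to the first value in sorted order.
most_frequent <- function(x) {
  counts <- table(x)                # table() ignores NA by default
  names(counts)[which.max(counts)]
}

# Categorical variable: fill the gaps with the mode.
payment <- c("card", "cash", NA, "card", "transfer", NA)
payment_mode <- ifelse(is.na(payment), most_frequent(payment), payment)

# Numeric variable where a missing value is assumed to mean "no calls".
calls <- c(3, NA, 0, 7, NA)
calls_zero <- ifelse(is.na(calls), 0, calls)

print(payment_mode)
print(calls_zero)
```

Replacing with zero is only reasonable when the business meaning of a gap really is zero; otherwise the constant introduces values that never occurred.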
There are currently several imputation methods based on supervised and unsupervised learning. Unsupervised models, mainly clustering, are the basis for many other methods.
These methods include:
With k-nearest neighbours, for example, the basic principle is to find a predefined number of training examples closest to a point and predict its label from them.
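The nearest-neighbour idea can be sketched by hand in base R. The function `knn_impute()` below is a simplified illustration written for this text, not the implementation of any package: it assumes the predictor columns are numeric and complete, and it fills each gap with the average of the `k` closest rows whose target is observed. The `plans` data are invented.

```r
# Simplified k-nearest-neighbour imputation (illustrative, not a package API).
knn_impute <- function(data, target, predictors, k = 3) {
  # Standardise the predictors so that no column dominates the distance.
  x <- scale(as.matrix(data[, predictors, drop = FALSE]))
  y <- data[[target]]
  donors <- which(!is.na(y))        # rows with an observed target
  for (i in which(is.na(y))) {
    # Euclidean distance from row i to every donor row.
    diffs <- sweep(x[donors, , drop = FALSE], 2, x[i, ])
    d <- sqrt(rowSums(diffs^2))
    nearest <- donors[order(d)[seq_len(min(k, length(donors)))]]
    y[i] <- mean(data[[target]][nearest])  # average of the k nearest donors
  }
  data[[target]] <- y
  data
}

# Hypothetical data: two clearly separated groups of customers.
plans <- data.frame(
  monthly = c(20, 22, 25, 80, 85, 90, 23, 88),
  tenure  = c(1, 2, 1, 24, 30, 28, 2, 26),
  total   = c(20, 45, 25, 1900, 2500, 2400, NA, NA)
)

imputed <- knn_impute(plans, target = "total",
                      predictors = c("monthly", "tenure"), k = 3)
print(imputed)
```

Unlike a single global mean, each imputed value here depends on the rows that most resemble the incomplete one, which is why the two gaps receive very different totals.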
Some of its advantages are:
Some of its disadvantages are:
Next we will go through an example of missing data in R, using tidyverse-compatible tools.
# Install the development version of the package (optional)
# remotes::install_github("njtierney/naniar")
# Load packages
library(dplyr)
library(visdat)
library(naniar)
library(simputation)
# Load the data
mis_dataset <- read.csv(url("https://raw.githubusercontent.com/dataoptimal/posts/master/data%20cleaning%20with%20R%20and%20the%20tidyverse/telecom.csv"))
head(mis_dataset)
## customerID MonthlyCharges TotalCharges PaymentMethod Churn
## 1 7590-VHVEG 29.85 109.9 Electronic check yes
## 2 5575-GNVDE 56.95 na Mailed check yes
## 3 3668-QPYBK NA 108.15 -- yes
## 4 7795-CFOCW 42.30 1840.75 Bank transfer no
## 5 9237-HQITU 70.70 <NA> Electronic check no
## 6 9305-CDSKC NaN 820.5 -- yes
# Inspect the structure of the data
mis_dataset %>% glimpse()
## Rows: 10
## Columns: 5
## $ customerID <fct> 7590-VHVEG, 5575-GNVDE, 3668-QPYBK, 7795-CFOCW, 9237-HQ~
## $ MonthlyCharges <dbl> 29.85, 56.95, NA, 42.30, 70.70, NaN, 89.10, NA, 104.80,~
## $ TotalCharges <fct> 109.9, na, 108.15, 1840.75, NA, 820.5, 1949.4, N/A, 304~
## $ PaymentMethod <fct> Electronic check, Mailed check, --, Bank transfer, Elec~
## $ Churn <fct> yes, yes, yes, no, no, yes, no, yes, no, no
# Fix the import: declare every marker that represents a missing value
mis_dataset <- read.csv(url("https://raw.githubusercontent.com/dataoptimal/posts/master/data%20cleaning%20with%20R%20and%20the%20tidyverse/telecom.csv"), na.strings = c("NaN","NA","N/A","na","--"))
head(mis_dataset)
## customerID MonthlyCharges TotalCharges PaymentMethod Churn
## 1 7590-VHVEG 29.85 109.90 Electronic check yes
## 2 5575-GNVDE 56.95 NA Mailed check yes
## 3 3668-QPYBK NA 108.15 <NA> yes
## 4 7795-CFOCW 42.30 1840.75 Bank transfer no
## 5 9237-HQITU 70.70 NA Electronic check no
## 6 9305-CDSKC NaN 820.50 <NA> yes
# Check the column types
mis_dataset %>% glimpse()
## Rows: 10
## Columns: 5
## $ customerID <fct> 7590-VHVEG, 5575-GNVDE, 3668-QPYBK, 7795-CFOCW, 9237-HQ~
## $ MonthlyCharges <dbl> 29.85, 56.95, NA, 42.30, 70.70, NaN, 89.10, NA, 104.80,~
## $ TotalCharges <dbl> 109.90, NA, 108.15, 1840.75, NA, 820.50, 1949.40, NA, 3~
## $ PaymentMethod <fct> Electronic check, Mailed check, NA, Bank transfer, Elec~
## $ Churn <fct> yes, yes, yes, no, no, yes, no, yes, no, no
# Visual inspection of missing data
mis_dataset %>% vis_miss()
# Missing cases per variable
mis_dataset %>% gg_miss_var()
# Simple imputation: fill gaps with the mean and with the median
mis_dataset <- mis_dataset %>%
  mutate(TotalChargesImpMean = replace(TotalCharges, is.na(TotalCharges), mean(TotalCharges, na.rm = TRUE)),
         TotalChargesImpMedian = replace(TotalCharges, is.na(TotalCharges), median(TotalCharges, na.rm = TRUE)))
mis_dataset %>%
  select(TotalCharges,
         TotalChargesImpMean,
         TotalChargesImpMedian) %>%
  head()
## TotalCharges TotalChargesImpMean TotalChargesImpMedian
## 1 109.90 109.900 109.90
## 2 NA 1175.671 820.50
## 3 108.15 108.150 108.15
## 4 1840.75 1840.750 1840.75
## 5 NA 1175.671 820.50
## 6 820.50 820.500 820.50
# Multivariate imputation: predict TotalCharges with a linear model
mis_dataset <- mis_dataset %>%
  impute_lm(TotalCharges ~ MonthlyCharges + Churn)
mis_dataset %>%
  select(TotalCharges, TotalChargesImpMean, TotalChargesImpMedian) %>%
  head()
## TotalCharges TotalChargesImpMean TotalChargesImpMedian
## 1 109.9000 109.900 109.90
## 2 828.0137 1175.671 820.50
## 3 108.1500 108.150 108.15
## 4 1840.7500 1840.750 1840.75
## 5 1748.1025 1175.671 820.50
## 6 820.5000 820.500 820.50
By the end of this chapter, the student will be able to:
Feature engineering rests on extensive theory, so we will present only its basic concepts and some practical tools from the tidyverse.
Real-world data can be very messy and chaotic, whether they come from a relational database, a text file or any other source.
Although data are generally organised as tables in which each row (unit of analysis) has its own values for each column (feature), they can be hard to understand and process. To make the data easier for machine learning models to read and to improve model performance, we can apply feature engineering.
In general, data cleaning can be thought of as a process of subtraction and feature engineering as a process of addition. It is often one of the most valuable tasks a data scientist can perform to improve model performance.
In short, feature engineering can be defined as the process of creating and preparing new input variables from those already present in a data source.
Its approaches are the following:
When analysing data, many of the operations used to create variables are repeated across data sets. Once we realise that these operations do not depend on the underlying data, it is better to abstract the process into a framework that can create variables for any data set.
This is the main idea behind automated feature engineering: the same basic building blocks, called feature primitives, are applied to different data sets to build predictor variables.
For example, a recurring operation in a data model is computing maximum values. We can abstract this process as a max primitive that can be applied to different purposes and data sets.
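The idea of a reusable primitive can be sketched as a small base R function. `apply_primitive()` and the two toy tables are hypothetical names and data introduced for this illustration; dedicated feature engineering frameworks offer far richer versions of the same idea.

```r
# A "primitive": group a table by a key and aggregate one column with fn.
apply_primitive <- function(data, key, value, fn, fn_name) {
  result <- tapply(data[[value]], data[[key]], fn)
  out <- data.frame(names(result), as.vector(result), stringsAsFactors = FALSE)
  names(out) <- c(key, paste(fn_name, value, sep = "_"))
  out
}

# Two unrelated, made-up data sets.
purchases <- data.frame(
  customer = c("a", "a", "b", "b", "b"),
  amount   = c(10, 40, 5, 25, 15),
  stringsAsFactors = FALSE
)
readings <- data.frame(
  sensor = c("s1", "s2", "s1", "s2"),
  temp   = c(21.5, 19.0, 23.1, 18.2),
  stringsAsFactors = FALSE
)

# The same max primitive builds a feature in both contexts.
max_purchase <- apply_primitive(purchases, "customer", "amount", max, "max")
max_temp     <- apply_primitive(readings, "sensor", "temp", max, "max")

print(max_purchase)
print(max_temp)
```

Passing another function, such as `mean` or `min`, in place of `max` yields a different primitive without changing the surrounding code.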
The cutoff time specifies the last point in time at which a row's data may be used to compute a feature; any data after that point is filtered out before the features are calculated.
For example, consider a data set of customer transactions (each with a date and time) where we want to predict whether customers 1, 2, and 3 will spend $500 between 04:00 and 23:59 on 2020-06-01. When we define the features for this problem, we need to make sure that no data after 04:00 is used in the calculations.
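A minimal base R sketch of this filtering step follows. The transaction values are invented for illustration, and treating a transaction exactly at the cutoff as usable is an assumption of this sketch.

```r
# Made-up transactions for customers 1, 2 and 3 (illustration only)
transactions <- data.frame(
  customer_id = c(1, 1, 2, 2, 3, 3),
  time = as.POSIXct(c("2020-06-01 01:15:00", "2020-06-01 05:30:00",
                      "2020-06-01 02:00:00", "2020-06-01 03:45:00",
                      "2020-06-01 03:59:00", "2020-06-01 12:00:00"),
                    tz = "UTC"),
  amount = c(120, 300, 80, 60, 200, 450)
)

# Cutoff: start of the prediction window
cutoff <- as.POSIXct("2020-06-01 04:00:00", tz = "UTC")

# Keep only what was known at the cutoff; later rows would leak the outcome
history <- transactions[transactions$time <= cutoff, ]

# Feature: total spent by each customer before the cutoff
features <- aggregate(amount ~ customer_id, data = history, FUN = sum)
names(features)[2] <- "spent_before_cutoff"
print(features)
```

The rows at 05:30 and 12:00 fall inside the prediction window, so they are excluded from the feature calculation.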
Informative features can often be designed by drawing on your own (or your team's) experience in a specific domain. The goal is to isolate specific information for the model.
For example, suppose we are working with a real-estate data set (properties and prices) and one of the business specialists recalls that the European housing crisis took place during the same period. That knowledge is a warning sign for the model: the specialist would likely tell us that property prices were affected during the crisis, so we can create an indicator variable for transactions made during that period.
Indicator variables are binary variables that tell us whether an observation meets a given condition, which makes them very useful for isolating key characteristics of the data.
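As a sketch, an indicator variable of this kind can be built with a logical comparison. The sales data and the crisis window below are hypothetical placeholders; in practice the dates would come from the business specialist.

```r
# Made-up property sales (illustration only)
sales_homes <- data.frame(
  sale_date = as.Date(c("2006-03-15", "2008-11-02", "2010-07-20", "2015-01-10")),
  price     = c(250000, 180000, 175000, 260000)
)

# Hypothetical crisis window, assumed for this example
crisis_start <- as.Date("2008-01-01")
crisis_end   <- as.Date("2012-12-31")

# 1 if the sale happened during the window, 0 otherwise
sales_homes$during_crisis <- as.integer(
  sales_homes$sale_date >= crisis_start & sales_homes$sale_date <= crisis_end
)
print(sales_homes)
```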
Business knowledge is broad, open-ended, and valuable, but it depends on the business experience of the team members. At some point the ideas will run out, and we can then turn to other technical tools of feature engineering.
This technique consists of assessing whether two or more variables can be meaningfully combined, based on how they interact in the business. The variables can be combined through operations such as sums, differences, and products.
Interaction terms let us model relationships between variables when the effect of one variable on the target is influenced by another.
A general tip is to ask of each set of variables: could I combine this information in a way that is even more useful?
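A short sketch of interaction features as products of existing columns follows. The `listings` data and its column names are hypothetical.

```r
# Made-up property listings (illustration only)
listings <- data.frame(
  n_schools           = c(3, 0, 5),   # schools near the property
  median_school_score = c(8, 0, 6),   # median quality score of those schools
  length_m            = c(10, 12, 8),
  width_m             = c(8, 10, 6)
)

# Product interaction: quantity and quality of nearby schools combined
listings$school_score <- listings$n_schools * listings$median_school_score

# Product interaction: lot area from its two dimensions
listings$area_m2 <- listings$length_m * listings$width_m

print(listings)
```

Neither original column alone captures what the product does: many low-quality schools and few high-quality schools can yield the same combined score.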
Some considerations for grouping sparse classes are:
There is no formal rule for how many observations each class needs.
It depends on the size of the data set.
The particularities of the business should be taken into account.
As a general suggestion (not a rule), classes can be combined until each one has at least roughly 50 observations.
After combining sparse classes there will be fewer unique classes, but each one will have more observations.
Sparse categories:
Regrouped categories:
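The regrouping can be sketched in base R by counting each class and relabeling the rare ones. The `exterior` variable is made up, and the threshold of 50 follows the general suggestion above rather than a fixed rule.

```r
# Made-up categorical variable with several sparse classes
exterior <- c(rep("Brick", 60), rep("Wood", 55),
              rep("Stucco", 12), rep("Stone", 7), rep("Metal", 3))

# Sparse categories: counts per class before grouping
counts <- table(exterior)
print(counts)

# Classes below the suggested minimum are merged into a single "Other" class
min_n <- 50
rare <- names(counts)[counts < min_n]
exterior_grouped <- ifelse(exterior %in% rare, "Other", exterior)

# Regrouped categories: fewer classes, more observations in each
print(table(exterior_grouped))
```

Here the merged class still has fewer than 50 observations, which illustrates why the threshold is a guideline: whether to keep, merge further, or drop it depends on the data set and the business context.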
Most machine learning algorithms cannot handle categorical variables directly, because they cannot operate on text values. We therefore need to create dummy variables for the categorical variables in our data set.
Dummy variables are a set of binary variables (0 or 1), each representing a single class of a categorical variable. The information they carry is exactly the same, but this numeric representation can be processed by the algorithm.
Variables of this type are normally discarded and not used in the analytical model.
Most machine learning algorithms cannot work with categorical data directly; they require all input and output variables to be numeric.
Categorical variables must therefore be converted to a numeric form. If the categorical variable is an output variable, the model's predictions may also need to be converted back to a categorical form to present them or use them in an application.
There are two main types of numeric conversion:
Integer Encoding
Each unique category value is assigned an integer, for example:
Category | Value |
---|---|
Blue | 1 |
Red | 2 |
Green | 3 |
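In R, one way to sketch this mapping is with a factor whose levels are listed in the desired order; `as.integer()` then returns the position of each value among the levels. The `color` vector is illustrative.

```r
# Illustrative categorical values
color <- c("Red", "Blue", "Green", "Blue")

# The order of the levels fixes the integer assigned to each category
color_factor <- factor(color, levels = c("Blue", "Red", "Green"))

# Integer encoding: Blue = 1, Red = 2, Green = 3
color_int <- as.integer(color_factor)
print(color_int)

# Decoding: map the integers back to the category labels
color_decoded <- levels(color_factor)[color_int]
print(color_decoded)
```

The decoding step is what converts numeric predictions back to a categorical form when the encoded variable is the model output.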
One-Hot Encoding
This is used when there is no ordinal relationship. Using an integer encoding and letting the model assume a natural order between categories can lead to poor performance or unexpected results.
In this case, one-hot encoding can be applied to the integer representation: the integer-encoded variable is removed and a new binary variable is added for each unique integer value.
Color: Blue, Red, Green
Blue | Red | Green |
---|---|---|
1 | 0 | 0 |
0 | 1 | 0 |
0 | 0 | 1 |
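The table above can be reproduced in base R with `model.matrix()`. This is one possible sketch; removing the intercept with `- 1` is what yields one column per category.

```r
# One observation per color, in the same order as the table
color <- factor(c("Blue", "Red", "Green"), levels = c("Blue", "Red", "Green"))

# "- 1" drops the intercept so that every level gets its own 0/1 column
one_hot <- model.matrix(~ color - 1)

# Replace the default names (colorBlue, colorRed, ...) with the plain labels
colnames(one_hot) <- levels(color)
print(one_hot)
```

For linear models it is common to keep the intercept instead (`model.matrix(~ color)`), which produces one column fewer and avoids perfectly collinear predictors.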
By the end of this topic, the student will be able to:
Below is a list of software and documents related to automated exploratory data analysis, which includes:
Information on libraries for automated exploratory analysis [here](https://github.com/mstaniak/autoEDA-resources)
autoEDA aims to automate exploratory data analysis in a univariate or bivariate manner. It can generate plots built with the ggplot2 library and themes inspired by RColorBrewer.
Its core capability is cleaning and preprocessing the data so that the plots display properly.
#install.packages('devtools')
library(devtools)
#devtools::install_github("XanderHorn/autoEDA")
library(autoEDA)
overview_1 <- autoEDA(x = iris)
## autoEDA | Setting color theme
## autoEDA | Removing constant features
## autoEDA | 0 constant features removed
## autoEDA | 0 zero spread features removed
## autoEDA | Removing features containing majority missing values
## autoEDA | 0 majority missing features removed
## autoEDA | Cleaning data
## autoEDA | Correcting sparse categorical feature levels
## autoEDA | Performing univariate analysis
## autoEDA | Visualizing data
overview_1
## Feature Observations FeatureClass FeatureType PercentageMissing
## 1 Sepal.Length 150 numeric Continuous 0
## 2 Sepal.Width 150 numeric Continuous 0
## 3 Petal.Length 150 numeric Continuous 0
## 4 Petal.Width 150 numeric Continuous 0
## 5 Species 150 character Categorical 0
## PercentageUnique ConstantFeature ZeroSpreadFeature LowerOutliers
## 1 23.33 No No 0
## 2 15.33 No No 1
## 3 28.67 No No 0
## 4 14.67 No No 0
## 5 2.00 No No 0
## UpperOutliers ImputationValue MinValue FirstQuartile Median Mean Mode
## 1 0 5.8 4.3 5.1 5.80 5.84 5
## 2 3 3 2.0 2.8 3.00 3.06 3
## 3 0 4.35 1.0 1.6 4.35 3.76 1.4
## 4 0 1.3 0.1 0.3 1.30 1.20 0.2
## 5 0 SETOSA 0.0 0.0 0.00 0.00 SETOSA
## ThirdQuartile MaxValue LowerOutlierValue UpperOutlierValue
## 1 6.4 7.9 3.15 8.35
## 2 3.3 4.4 2.05 4.05
## 3 5.1 6.9 -3.65 10.35
## 4 1.8 2.5 -1.95 4.05
## 5 0.0 0.0 0.00 0.00
overview_2 <- autoEDA(x = iris,
y = "Sepal.Length")
## autoEDA | Setting color theme
## autoEDA | Removing constant features
## autoEDA | 0 constant features removed
## autoEDA | Removing zero spread features
## autoEDA | 0 zero spread features removed
## autoEDA | Removing features containing majority missing values
## autoEDA | 0 majority missing features removed
## autoEDA | Cleaning data
## autoEDA | Correcting sparse categorical feature levels
## autoEDA | Sorting features
## autoEDA | Regression outcome detected
## autoEDA | Calculating feature predictive power
## autoEDA | Visualizing data
overview_2
## Feature Observations FeatureClass FeatureType PercentageMissing
## 1 Petal.Length 150 numeric Continuous 0
## 2 Petal.Width 150 numeric Continuous 0
## 3 Sepal.Length 150 numeric Continuous 0
## 4 Sepal.Width 150 numeric Continuous 0
## 5 Species 150 character Categorical 0
## PercentageUnique ConstantFeature ZeroSpreadFeature LowerOutliers
## 1 28.67 No No 0
## 2 14.67 No No 0
## 3 23.33 No No 0
## 4 15.33 No No 1
## 5 2.00 No No 0
## UpperOutliers ImputationValue MinValue FirstQuartile Median Mean Mode
## 1 0 4.35 1.0 1.6 4.35 3.76 1.4
## 2 0 1.3 0.1 0.3 1.30 1.20 0.2
## 3 0 5.8 4.3 5.1 5.80 5.84 5
## 4 3 3 2.0 2.8 3.00 3.06 3
## 5 0 SETOSA 0.0 0.0 0.00 0.00 SETOSA
## ThirdQuartile MaxValue LowerOutlierValue UpperOutlierValue
## 1 5.1 6.9 -3.65 10.35
## 2 1.8 2.5 -1.95 4.05
## 3 6.4 7.9 3.15 8.35
## 4 3.3 4.4 2.05 4.05
## 5 0.0 0.0 0.00 0.00
## PredictivePowerPercentage PredictivePower
## 1 87 High
## 2 82 High
## 3 0 Low
## 4 12 Low
## 5 78 High
overview_3 <- autoEDA(x = iris,
y = "Species")
## autoEDA | Setting color theme
## autoEDA | Removing constant features
## autoEDA | 0 constant features removed
## autoEDA | Removing zero spread features
## autoEDA | 0 zero spread features removed
## autoEDA | Removing features containing majority missing values
## autoEDA | 0 majority missing features removed
## autoEDA | Cleaning data
## autoEDA | Correcting sparse categorical feature levels
## autoEDA | Sorting features
## autoEDA | Multi-class classification outcome detected
## autoEDA | Calculating feature predictive power
## autoEDA | Visualizing data
overview_3
## Feature Observations FeatureClass FeatureType PercentageMissing
## 1 Petal.Length 150 numeric Continuous 0
## 2 Petal.Width 150 numeric Continuous 0
## 3 Sepal.Length 150 numeric Continuous 0
## 4 Sepal.Width 150 numeric Continuous 0
## 5 Species 150 character Categorical 0
## PercentageUnique ConstantFeature ZeroSpreadFeature LowerOutliers
## 1 28.67 No No 0
## 2 14.67 No No 0
## 3 23.33 No No 0
## 4 15.33 No No 1
## 5 2.00 No No 0
## UpperOutliers ImputationValue MinValue FirstQuartile Median Mean Mode
## 1 0 4.35 1.0 1.6 4.35 3.76 1.4
## 2 0 1.3 0.1 0.3 1.30 1.20 0.2
## 3 0 5.8 4.3 5.1 5.80 5.84 5
## 4 3 3 2.0 2.8 3.00 3.06 3
## 5 0 SETOSA 0.0 0.0 0.00 0.00 SETOSA
## ThirdQuartile MaxValue LowerOutlierValue UpperOutlierValue
## 1 5.1 6.9 -3.65 10.35
## 2 1.8 2.5 -1.95 4.05
## 3 6.4 7.9 3.15 8.35
## 4 3.3 4.4 2.05 4.05
## 5 0.0 0.0 0.00 0.00
## PredictivePowerPercentage PredictivePower
## 1 86 High
## 2 88 High
## 3 46 Medium
## 4 24 Low
## 5 0 Low
#install.packages("arsenal")
library(arsenal)
#install.packages("knitr")
library(knitr)
data(mockstudy) # load the data
dim(mockstudy) # number of rows and columns
## [1] 1499 14
str(mockstudy) # quick look at the data
## 'data.frame': 1499 obs. of 14 variables:
## $ case : int 110754 99706 105271 105001 112263 86205 99508 90158 88989 90515 ...
## $ age : int 67 74 50 71 69 56 50 57 51 63 ...
## ..- attr(*, "label")= chr "Age in Years"
## $ arm : chr "F: FOLFOX" "A: IFL" "A: IFL" "G: IROX" ...
## ..- attr(*, "label")= chr "Treatment Arm"
## $ sex : Factor w/ 2 levels "Male","Female": 1 2 2 2 2 1 1 1 2 1 ...
## $ race : chr "Caucasian" "Caucasian" "Caucasian" "Caucasian" ...
## ..- attr(*, "label")= chr "Race"
## $ fu.time : int 922 270 175 128 233 120 369 421 387 363 ...
## $ fu.stat : int 2 2 2 2 2 2 2 2 2 2 ...
## $ ps : int 0 1 1 1 0 0 0 0 1 1 ...
## $ hgb : num 11.5 10.7 11.1 12.6 13 10.2 13.3 12.1 13.8 12.1 ...
## $ bmi : num 25.1 19.5 NA 29.4 26.4 ...
## ..- attr(*, "label")= chr "Body Mass Index (kg/m^2)"
## $ alk.phos : int 160 290 700 771 350 569 162 152 231 492 ...
## $ ast : int 35 52 100 68 35 27 16 12 25 18 ...
## $ mdquality.s: int NA 1 1 1 NA 1 1 1 1 1 ...
## $ age.ord : Ord.factor w/ 8 levels "10-19"<"20-29"<..: 6 7 4 7 6 5 4 5 5 6 ...
tab1 <- tableby(arm ~ sex + age, data=mockstudy)
tab1
## tableby Object
##
## Function Call:
## tableby(formula = arm ~ sex + age, data = mockstudy)
##
## Variable(s):
## arm ~ sex, age
summary(tab1)
##
##
## | | A: IFL (N=428) | F: FOLFOX (N=691) | G: IROX (N=380) | Total (N=1499) | p value|
## |:---------------------------|:---------------:|:-----------------:|:---------------:|:---------------:|-------:|
## |**sex** | | | | | 0.190|
## | Male | 277 (64.7%) | 411 (59.5%) | 228 (60.0%) | 916 (61.1%) | |
## | Female | 151 (35.3%) | 280 (40.5%) | 152 (40.0%) | 583 (38.9%) | |
## |**Age in Years** | | | | | 0.614|
## | Mean (SD) | 59.673 (11.365) | 60.301 (11.632) | 59.763 (11.499) | 59.985 (11.519) | |
## | Range | 27.000 - 88.000 | 19.000 - 88.000 | 26.000 - 85.000 | 19.000 - 88.000 | |
tab2 <- as.data.frame(tab1)
tab2
## group.term group.label strata.term variable term label
## 1 arm Treatment Arm sex sex sex
## 2 arm Treatment Arm sex countpct Male
## 3 arm Treatment Arm sex countpct Female
## 4 arm Treatment Arm age age Age in Years
## 5 arm Treatment Arm age meansd Mean (SD)
## 6 arm Treatment Arm age range Range
## variable.type A: IFL F: FOLFOX G: IROX
## 1 categorical
## 2 categorical 277.00000, 64.71963 411.00000, 59.47902 228, 60
## 3 categorical 151.00000, 35.28037 280.00000, 40.52098 152, 40
## 4 numeric
## 5 numeric 59.67290, 11.36454 60.30101, 11.63225 59.76316, 11.49930
## 6 numeric 27, 88 19, 88 26, 85
## Total test p.value
## 1 Pearson's Chi-squared test 0.1904388
## 2 916.0000, 61.1074 Pearson's Chi-squared test 0.1904388
## 3 583.0000, 38.8926 Pearson's Chi-squared test 0.1904388
## 4 Linear Model ANOVA 0.6143859
## 5 59.98532, 11.51877 Linear Model ANOVA 0.6143859
## 6 19, 88 Linear Model ANOVA 0.6143859
dat <- data.frame(
tp = paste0("Time Point ", c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2)),
id = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 6),
Cat = c("A", "A", "A", "B", "B", "B", "B", "A", NA, "B"),
Fac = factor(c("A", "B", "C", "A", "B", "C", "A", "B", "C", "A")),
Num = c(1, 2, 3, 4, 4, 3, 3, 4, 0, NA),
Ord = ordered(c("I", "II", "II", "III", "III", "III", "I", "III", "II", "I")),
Lgl = c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE),
Dat = as.Date("2018-05-01") + c(1, 1, 2, 2, 3, 4, 5, 6, 3, 4),
stringsAsFactors = FALSE
)
p <- paired(tp ~ Cat + Fac + Num + Ord + Lgl + Dat, data = dat, id = id, signed.rank.exact = FALSE)
summary(p)
##
##
## | | Time Point 1 (N=4) | Time Point 2 (N=4) | Difference (N=4) | p value|
## |:---------------------------|:-----------------------:|:-----------------------:|:----------------:|-------:|
## |**Cat** | | | | 1.000|
## | A | 2 (50.0%) | 2 (50.0%) | 1 (50.0%) | |
## | B | 2 (50.0%) | 2 (50.0%) | 1 (50.0%) | |
## |**Fac** | | | | 0.261|
## | A | 2 (50.0%) | 1 (25.0%) | 2 (100.0%) | |
## | B | 1 (25.0%) | 2 (50.0%) | 1 (100.0%) | |
## | C | 1 (25.0%) | 1 (25.0%) | 1 (100.0%) | |
## |**Num** | | | | 0.391|
## | Mean (SD) | 2.750 (1.258) | 3.250 (0.957) | 0.500 (1.000) | |
## | Range | 1.000 - 4.000 | 2.000 - 4.000 | -1.000 - 1.000 | |
## |**Ord** | | | | 0.174|
## | I | 2 (50.0%) | 0 (0.0%) | 2 (100.0%) | |
## | II | 1 (25.0%) | 1 (25.0%) | 1 (100.0%) | |
## | III | 1 (25.0%) | 3 (75.0%) | 0 (0.0%) | |
## |**Lgl** | | | | 1.000|
## | FALSE | 2 (50.0%) | 1 (25.0%) | 2 (100.0%) | |
## | TRUE | 2 (50.0%) | 3 (75.0%) | 1 (50.0%) | |
## |**Dat** | | | | 0.182|
## | Median | 2018-05-03 | 2018-05-04 | 0.500 | |
## | Range | 2018-05-02 - 2018-05-06 | 2018-05-02 - 2018-05-07 | 0.000 - 1.000 | |
tab3 <- modelsum(bmi ~ sex + age, data=mockstudy)
summary(tab3, text=TRUE)
##
##
## | |estimate |std.error |p.value |adj.r.squared |Nmiss |
## |:------------|:--------|:---------|:-------|:-------------|:-----|
## |(Intercept) |27.491 |0.181 |< 0.001 |0.004 |33 |
## |sex Female |-0.731 |0.290 |0.012 | | |
## |(Intercept) |26.424 |0.752 |< 0.001 |0.000 |33 |
## |Age in Years |0.013 |0.012 |0.290 | | |
df1 <- data.frame(id = paste0("person", 1:3),
a = c("a", "b", "c"),
b = c(1, 3, 4),
c = c("f", "e", "d"),
row.names = paste0("rn", 1:3),
stringsAsFactors = FALSE)
df2 <- data.frame(id = paste0("person", 3:1),
a = c("c", "b", "a"),
b = c(1, 3, 4),
d = paste0("rn", 1:3),
row.names = paste0("rn", c(1,3,2)),
stringsAsFactors = FALSE)
comparedf(df1, df2)
## Compare Object
##
## Function Call:
## comparedf(x = df1, y = df2)
##
## Shared: 3 non-by variables and 3 observations.
## Not shared: 2 variables and 0 observations.
##
## Differences found in 2/3 variables compared.
## 0 variables compared have non-identical attributes.
summary(comparedf(df1, df2))
##
##
## Table: Summary of data.frames
##
## version arg ncol nrow
## -------- ---- ----- -----
## x df1 4 3
## y df2 4 3
##
##
##
## Table: Summary of overall comparison
##
## statistic value
## ------------------------------------------------------------ ------
## Number of by-variables 0
## Number of non-by variables in common 3
## Number of variables compared 3
## Number of variables in x but not y 1
## Number of variables in y but not x 1
## Number of variables compared with some values unequal 2
## Number of variables compared with all values equal 1
## Number of observations in common 3
## Number of observations in x but not y 0
## Number of observations in y but not x 0
## Number of observations with some compared variables unequal 2
## Number of observations with all compared variables equal 1
## Number of values unequal 4
##
##
##
## Table: Variables not shared
##
## version variable position class
## -------- --------- --------- ----------
## x c 4 character
## y d 4 character
##
##
##
## Table: Other variables not compared
##
## | |
## |:-------------------------------|
## |No other variables not compared |
##
##
##
## Table: Observations not shared
##
## | |
## |:--------------------------|
## |No observations not shared |
##
##
##
## Table: Differences detected by variable
##
## var.x var.y n NAs
## ------ ------ --- ----
## id id 2 0
## a a 2 0
## b b 0 0
##
##
##
## Table: Differences detected
##
## var.x var.y ..row.names.. values.x values.y row.x row.y
## ------ ------ -------------- --------- --------- ------ ------
## id id 1 person1 person3 1 1
## id id 3 person3 person1 3 3
## a a 1 a c 1 1
## a a 3 c a 3 3
##
##
##
## Table: Non-identical attributes
##
## | |
## |:---------------------------|
## |No non-identical attributes |
Exploratory data analysis (EDA) is the initial, and an important, phase of data analysis and predictive modeling. During this process, analysts and modelers take a first look at the data, generate relevant hypotheses, and decide on the next steps.
However, the EDA process can sometimes be cumbersome. The DataExplorer R package aims to automate most of the data handling and visualization so that users can focus on studying the data and extracting insights.
When we run create_report(base_de_datos), the create_report function generates an HTML report in the working directory named report.html.
Important: set the working directory first by navigating in the Files pane to the folder where the report should be saved and then calling setwd().
#install.packages("DataExplorer")
library(DataExplorer)
data(airquality) # load the data set
#create_report(airquality)
The report looks like this:
The janitor package is an R package with simple functions for examining and cleaning dirty data. It was built with beginner and intermediate R users in mind and is optimized for ease of use.
The main functions of janitor:
Format the column names of the data set.
Isolate duplicate records.
Provide quick tabulations (that is, frequency tables and cross-tabulations).
Other janitor functions format the results of these tabulations nicely. Together, these tabulation and reporting functions approximate popular features of SPSS and Microsoft Excel.
Data: MyMSA
#install.packages("janitor")
#install.packages("readxl")
library(janitor)
library(readxl)
mymsa = read_excel("data/mymsa.xlsx")
x = janitor::clean_names(mymsa)
data.frame(mymsa = colnames(mymsa), x = colnames(x))
## mymsa x
## 1 RFID rfid
## 2 Plant plant
## 3 KillDate kill_date
## 4 BodyNo body_no
## 5 LeftSideScanTime left_side_scan_time
## 6 RightSideScanTime right_side_scan_time
## 7 HangMethod hang_method
## 8 Hgp hgp
## 9 Sex sex
## 10 LeftHscw left_hscw
## 11 RightHscw right_hscw
## 12 TotalHscw total_hscw
## 13 P8Fat p8fat
## 14 Lot lot
## 15 Est % BI est_percent_bi
## 16 HumpCold hump_cold
## 17 Ema ema
## 18 OssificationCold ossification_cold
## 19 AusMarbling aus_marbling
## 20 MsaMarbling msa_marbling
## 21 MeatColour meat_colour
## 22 FatColour fat_colour
## 23 RibfatCold ribfat_cold
## 24 Ph ph
## 25 LoinTemp loin_temp
## 26 FeedType feed_type
## 27 NoDaysOnFeed no_days_on_feed
## 28 MSAIndex msa_index
## 29 spare spare
x %>% tabyl(meat_colour) # returns a frequency table
## meat_colour n percent
## 1B 87 0.02175
## 1C 657 0.16425
## 2 1730 0.43250
## 3 1478 0.36950
## 4 30 0.00750
## 5 14 0.00350
## 6 4 0.00100
x %>%
tabyl(meat_colour) %>%
adorn_pct_formatting(digits = 0, affix_sign = TRUE) # format the proportions as percentages
## meat_colour n percent
## 1B 87 2%
## 1C 657 16%
## 2 1730 43%
## 3 1478 37%
## 4 30 1%
## 5 14 0%
## 6 4 0%
x %>% tabyl(spare)
## spare n percent valid_percent
## NA 4000 1 NA
x = remove_empty(x, which = c("rows","cols")) # removes columns and rows that are completely empty
x = read_excel("data/mymsa.xlsx") %>%
clean_names() %>% remove_empty() # these steps can also be chained when reading the data
x %>% tabyl(meat_colour, plant) # cross-tabulation
## meat_colour 1 2
## 1B 0 87
## 1C 87 570
## 2 1443 287
## 3 1477 1
## 4 27 3
## 5 9 5
## 6 1 3
# totals row
x %>%
tabyl(meat_colour, plant) %>%
adorn_totals(where = "row")
## meat_colour 1 2
## 1B 0 87
## 1C 87 570
## 2 1443 287
## 3 1477 1
## 4 27 3
## 5 9 5
## 6 1 3
## Total 3044 956
# totals column
x %>%
tabyl(meat_colour, plant) %>%
adorn_totals(where = "col")
## meat_colour 1 2 Total
## 1B 0 87 87
## 1C 87 570 657
## 2 1443 287 1730
## 3 1477 1 1478
## 4 27 3 30
## 5 9 5 14
## 6 1 3 4
# totals row and column
x %>%
tabyl(meat_colour, plant) %>%
adorn_totals(where = c("row","col"))
## meat_colour 1 2 Total
## 1B 0 87 87
## 1C 87 570 657
## 2 1443 287 1730
## 3 1477 1 1478
## 4 27 3 30
## 5 9 5 14
## 6 1 3 4
## Total 3044 956 4000
x %>%
tabyl(meat_colour, plant) %>%
adorn_totals(where = c("row","col")) %>%
adorn_percentages(denominator = "col") %>%
adorn_pct_formatting(digits = 0) # with percentages
## meat_colour 1 2 Total
## 1B 0% 9% 2%
## 1C 3% 60% 16%
## 2 47% 30% 43%
## 3 49% 0% 37%
## 4 1% 0% 1%
## 5 0% 1% 0%
## 6 0% 0% 0%
## Total 100% 100% 100%
# counts alongside the percentages
x %>%
tabyl(meat_colour, plant) %>%
adorn_totals(where = c("row","col")) %>%
adorn_percentages(denominator = "col") %>%
adorn_pct_formatting(digits = 0) %>%
adorn_ns(position = "front")
## meat_colour 1 2 Total
## 1B 0 (0%) 87 (9%) 87 (2%)
## 1C 87 (3%) 570 (60%) 657 (16%)
## 2 1443 (47%) 287 (30%) 1730 (43%)
## 3 1477 (49%) 1 (0%) 1478 (37%)
## 4 27 (1%) 3 (0%) 30 (1%)
## 5 9 (0%) 5 (1%) 14 (0%)
## 6 1 (0%) 3 (0%) 4 (0%)
## Total 3044 (100%) 956 (100%) 4000 (100%)
# check whether there are duplicates
x %>% get_dupes(rfid)
## # A tibble: 0 x 29
## # ... with 29 variables: rfid <chr>, dupe_count <int>, plant <dbl>,
## # kill_date <dttm>, body_no <dbl>, left_side_scan_time <dbl>,
## # right_side_scan_time <dbl>, hang_method <chr>, hgp <chr>, sex <chr>,
## # left_hscw <dbl>, right_hscw <dbl>, total_hscw <dbl>, p8fat <dbl>,
## # lot <dbl>, est_percent_bi <chr>, hump_cold <dbl>, ema <dbl>,
## # ossification_cold <dbl>, aus_marbling <dbl>, msa_marbling <dbl>,
## # meat_colour <chr>, fat_colour <dbl>, ribfat_cold <dbl>, ph <dbl>,
## # loin_temp <dbl>, feed_type <chr>, no_days_on_feed <dbl>, msa_index <dbl>
# create artificial duplicates (slice() and bind_rows() come from dplyr)
library(dplyr)
x1 = x %>% slice(1:3)
x2 = bind_rows(x1,x)
x2 %>% get_dupes(rfid)
## # A tibble: 6 x 29
## rfid dupe_count plant kill_date body_no left_side_scan_ti~
## <chr> <int> <dbl> <dttm> <dbl> <dbl>
## 1 201 553126081~ 2 1 2016-08-15 00:00:00 193 423
## 2 201 553126081~ 2 1 2016-08-15 00:00:00 193 423
## 3 253 120151214~ 2 1 2016-08-15 00:00:00 257 542
## 4 253 120151214~ 2 1 2016-08-15 00:00:00 257 542
## 5 818 415178538~ 2 1 2016-08-02 00:00:00 99 445
## 6 818 415178538~ 2 1 2016-08-02 00:00:00 99 445
## # ... with 23 more variables: right_side_scan_time <dbl>, hang_method <chr>,
## # hgp <chr>, sex <chr>, left_hscw <dbl>, right_hscw <dbl>, total_hscw <dbl>,
## # p8fat <dbl>, lot <dbl>, est_percent_bi <chr>, hump_cold <dbl>, ema <dbl>,
## # ossification_cold <dbl>, aus_marbling <dbl>, msa_marbling <dbl>,
## # meat_colour <chr>, fat_colour <dbl>, ribfat_cold <dbl>, ph <dbl>,
## # loin_temp <dbl>, feed_type <chr>, no_days_on_feed <dbl>, msa_index <dbl>
# Have you ever read data from Excel and seen a value like 42223 where a date should be? This function converts those serial numbers to the Date class.
excel_numeric_to_date(41103)
## [1] "2012-07-13"
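For reference, the same conversion can be sketched in base R. This assumes the workbook uses the modern Windows Excel date system, whose serial numbers count days from 1899-12-30; other date systems use a different origin.

```r
# Excel serial number as it appears after import
excel_serial <- 41103

# Assumed origin: modern Windows Excel date system
converted <- as.Date(excel_serial, origin = "1899-12-30")
print(format(converted))
```

This gives the same date as the excel_numeric_to_date() call above.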
rio library: streamlined data import and export. Web-based import is natively supported (including over SSL/HTTPS), compressed files can be read directly without explicit decompression, and fast import packages are used where appropriate.
#install.packages("rio")
library(rio)
library(janitor)
library(tidyverse)
# Data from the Banco Central del Ecuador ====
urlData <- "https://contenido.bce.fin.ec/documentos/Estadisticas/SectorReal/CuentasProvinciales/Can2019.xlsx"
data <- import(urlData, sheet = "VAB CANTONAL", skip = 6, col_names = TRUE) # import the data set
data = data %>%
pivot_longer(-c(PROVINCIA,"CÓDIGO PROVINCIA","CANTÓN","CÓDIGO CANTÓN"),
names_to="Sector",values_to="VAB") %>%
mutate(Sector = str_to_sentence(str_squish(str_replace_all(Sector, "\r|\n", "")))) %>%
mutate(PROVINCIA = str_to_sentence(PROVINCIA))
data <- clean_names(dat = data,case = "upper_camel")
data = data %>% filter(!is.na(CodigoCanton))
data %>% glimpse()
## Rows: 3,315
## Columns: 6
## $ Provincia <chr> "Azuay", "Azuay", "Azuay", "Azuay", "Azuay", "Azuay", ~
## $ CodigoProvincia <chr> "01", "01", "01", "01", "01", "01", "01", "01", "01", ~
## $ Canton <chr> "Cuenca", "Cuenca", "Cuenca", "Cuenca", "Cuenca", "Cue~
## $ CodigoCanton <chr> "0101", "0101", "0101", "0101", "0101", "0101", "0101"~
## $ Sector <chr> "Agricultura, ganadería, silvicultura y pesca", "Explo~
## $ Vab <dbl> 92901.6333, 69016.8075, 902215.8441, 77681.5980, 83595~
The World Bank makes a large amount of data from the World Development Indicators available through its web API. The WDI package for R makes it easy to search for and download data series from the WDI.
#install.packages('WDI')
library(WDI)
library(ggplot2)
WDIsearch('gdp')
## indicator
## [1,] "5.51.01.10.gdp"
## [2,] "6.0.GDP_current"
## [3,] "6.0.GDP_growth"
## [4,] "6.0.GDP_usd"
## [5,] "6.0.GDPpc_constant"
## [6,] "BG.GSR.NFSV.GD.ZS"
## [7,] "BG.KAC.FNEI.GD.PP.ZS"
## [8,] "BG.KAC.FNEI.GD.ZS"
## [9,] "BG.KLT.DINV.GD.PP.ZS"
## [10,] "BG.KLT.DINV.GD.ZS"
## [11,] "BI.WAG.TOTL.GD.ZS"
## [12,] "BM.GSR.MRCH.ZS"
## [13,] "BM.KLT.DINV.GD.ZS"
## [14,] "BM.KLT.DINV.WD.GD.ZS"
## [15,] "BN.CAB.XOKA.GD.ZS"
## [16,] "BN.CAB.XOKA.GDP.ZS"
## [17,] "BN.CAB.XOTR.ZS"
## [18,] "BN.CUR.GDPM.ZS"
## [19,] "BN.GSR.FCTY.CD.ZS"
## [20,] "BN.KLT.DINV.CD.ZS"
## [21,] "BN.KLT.DINV.DRS.GDP.ZS"
## [22,] "BN.KLT.PRVT.GD.ZS"
## [23,] "BN.TRF.CURR.CD.ZS"
## [24,] "BX.GSR.MRCH.ZS"
## [25,] "BX.KLT.DINV.DT.GD.ZS"
## [26,] "BX.KLT.DINV.WD.GD.ZS"
## [27,] "BX.TRF.MGR.DT.GD.ZS"
## [28,] "BX.TRF.PWKR.DT.GD.ZS"
## [29,] "BX.TRF.PWKR.GD.ZS"
## [30,] "CM.FIN.INTL.GD.ZS"
## [31,] "CM.MKT.LCAP.GD.ZS"
## [32,] "CM.MKT.TRAD.GD.ZS"
## [33,] "DP.DOD.DECD.CR.BC.Z1"
## [34,] "DP.DOD.DECD.CR.CG.Z1"
## [35,] "DP.DOD.DECD.CR.FC.Z1"
## [36,] "DP.DOD.DECD.CR.GG.Z1"
## [37,] "DP.DOD.DECD.CR.NF.Z1"
## [38,] "DP.DOD.DECF.CR.BC.Z1"
## [39,] "DP.DOD.DECF.CR.CG.Z1"
## [40,] "DP.DOD.DECF.CR.FC.Z1"
## [41,] "DP.DOD.DECF.CR.GG.Z1"
## [42,] "DP.DOD.DECF.CR.NF.Z1"
## [43,] "DP.DOD.DECN.CR.BC.Z1"
## [44,] "DP.DOD.DECN.CR.CG.Z1"
## [45,] "DP.DOD.DECN.CR.FC.Z1"
## [46,] "DP.DOD.DECN.CR.GG.Z1"
## [47,] "DP.DOD.DECN.CR.NF.Z1"
## [48,] "DP.DOD.DECT.CR.BC.Z1"
## [49,] "DP.DOD.DECT.CR.CG.Z1"
## [50,] "DP.DOD.DECT.CR.FC.Z1"
## [51,] "DP.DOD.DECT.CR.GG.Z1"
## [52,] "DP.DOD.DECT.CR.NF.Z1"
## [53,] "DP.DOD.DECX.CR.BC.Z1"
## [54,] "DP.DOD.DECX.CR.CG.Z1"
## [55,] "DP.DOD.DECX.CR.FC.Z1"
## [56,] "DP.DOD.DECX.CR.GG.Z1"
## [57,] "DP.DOD.DECX.CR.NF.Z1"
## [58,] "DP.DOD.DLCD.CR.BC.Z1"
## [59,] "DP.DOD.DLCD.CR.CG.Z1"
## [60,] "DP.DOD.DLCD.CR.FC.Z1"
## [61,] "DP.DOD.DLCD.CR.GG.Z1"
## [62,] "DP.DOD.DLCD.CR.L1.BC.Z1"
## [63,] "DP.DOD.DLCD.CR.L1.CG.Z1"
## [64,] "DP.DOD.DLCD.CR.L1.FC.Z1"
## [65,] "DP.DOD.DLCD.CR.L1.GG.Z1"
## [66,] "DP.DOD.DLCD.CR.L1.NF.Z1"
## [67,] "DP.DOD.DLCD.CR.M1.BC.Z1"
## [68,] "DP.DOD.DLCD.CR.M1.CG.Z1"
## [69,] "DP.DOD.DLCD.CR.M1.FC.Z1"
## [70,] "DP.DOD.DLCD.CR.M1.GG.Z1"
## [71,] "DP.DOD.DLCD.CR.M1.NF.Z1"
## [72,] "DP.DOD.DLCD.CR.NF.Z1"
## [73,] "DP.DOD.DLD1.CR.CG.Z1"
## [74,] "DP.DOD.DLD1.CR.GG.Z1"
## [75,] "DP.DOD.DLD2.CR.CG.Z1"
## [76,] "DP.DOD.DLD2.CR.GG.Z1"
## [77,] "DP.DOD.DLD2A.CR.CG.Z1"
## [78,] "DP.DOD.DLD2A.CR.GG.Z1"
## [79,] "DP.DOD.DLD3.CR.CG.Z1"
## [80,] "DP.DOD.DLD3.CR.GG.Z1"
## [81,] "DP.DOD.DLD4.CR.CG.Z1"
## [82,] "DP.DOD.DLD4.CR.GG.Z1"
## [83,] "DP.DOD.DLDS.CR.BC.Z1"
## [84,] "DP.DOD.DLDS.CR.CG.Z1"
## [85,] "DP.DOD.DLDS.CR.FC.Z1"
## [86,] "DP.DOD.DLDS.CR.GG.Z1"
## [87,] "DP.DOD.DLDS.CR.L1.BC.Z1"
## [88,] "DP.DOD.DLDS.CR.L1.CG.Z1"
## [89,] "DP.DOD.DLDS.CR.L1.FC.Z1"
## [90,] "DP.DOD.DLDS.CR.L1.GG.Z1"
## [91,] "DP.DOD.DLDS.CR.L1.NF.Z1"
## [92,] "DP.DOD.DLDS.CR.M1.BC.Z1"
## [93,] "DP.DOD.DLDS.CR.M1.CG.Z1"
## [94,] "DP.DOD.DLDS.CR.M1.FC.Z1"
## [95,] "DP.DOD.DLDS.CR.M1.GG.Z1"
## [96,] "DP.DOD.DLDS.CR.M1.NF.Z1"
## [97,] "DP.DOD.DLDS.CR.MV.BC.Z1"
## [98,] "DP.DOD.DLDS.CR.MV.CG.Z1"
## [99,] "DP.DOD.DLDS.CR.MV.FC.Z1"
## [100,] "DP.DOD.DLDS.CR.MV.GG.Z1"
## [101,] "DP.DOD.DLDS.CR.MV.NF.Z1"
## [102,] "DP.DOD.DLDS.CR.NF.Z1"
## [103,] "DP.DOD.DLIN.CR.BC.Z1"
## [104,] "DP.DOD.DLIN.CR.CG.Z1"
## [105,] "DP.DOD.DLIN.CR.FC.Z1"
## [106,] "DP.DOD.DLIN.CR.GG.Z1"
## [107,] "DP.DOD.DLIN.CR.L1.BC.Z1"
## [108,] "DP.DOD.DLIN.CR.L1.CG.Z1"
## [109,] "DP.DOD.DLIN.CR.L1.FC.Z1"
## [110,] "DP.DOD.DLIN.CR.L1.GG.Z1"
## [111,] "DP.DOD.DLIN.CR.L1.NF.Z1"
## [112,] "DP.DOD.DLIN.CR.M1.BC.Z1"
## [113,] "DP.DOD.DLIN.CR.M1.CG.Z1"
## [114,] "DP.DOD.DLIN.CR.M1.FC.Z1"
## [115,] "DP.DOD.DLIN.CR.M1.GG.Z1"
## [116,] "DP.DOD.DLIN.CR.M1.NF.Z1"
## [117,] "DP.DOD.DLIN.CR.NF.Z1"
## [118,] "DP.DOD.DLLO.CR.BC.Z1"
## [119,] "DP.DOD.DLLO.CR.CG.Z1"
## [120,] "DP.DOD.DLLO.CR.FC.Z1"
## [121,] "DP.DOD.DLLO.CR.GG.Z1"
## [122,] "DP.DOD.DLLO.CR.L1.BC.Z1"
## [123,] "DP.DOD.DLLO.CR.L1.CG.Z1"
## [124,] "DP.DOD.DLLO.CR.L1.FC.Z1"
## [125,] "DP.DOD.DLLO.CR.L1.GG.Z1"
## [126,] "DP.DOD.DLLO.CR.L1.NF.Z1"
## [127,] "DP.DOD.DLLO.CR.M1.BC.Z1"
## [128,] "DP.DOD.DLLO.CR.M1.CG.Z1"
## [129,] "DP.DOD.DLLO.CR.M1.FC.Z1"
## [130,] "DP.DOD.DLLO.CR.M1.GG.Z1"
## [131,] "DP.DOD.DLLO.CR.M1.NF.Z1"
## [132,] "DP.DOD.DLLO.CR.NF.Z1"
## [133,] "DP.DOD.DLOA.CR.BC.Z1"
## [134,] "DP.DOD.DLOA.CR.CG.Z1"
## [135,] "DP.DOD.DLOA.CR.FC.Z1"
## [136,] "DP.DOD.DLOA.CR.GG.Z1"
## [137,] "DP.DOD.DLOA.CR.L1.BC.Z1"
## [138,] "DP.DOD.DLOA.CR.L1.CG.Z1"
## [139,] "DP.DOD.DLOA.CR.L1.FC.Z1"
## [140,] "DP.DOD.DLOA.CR.L1.GG.Z1"
## [141,] "DP.DOD.DLOA.CR.L1.NF.Z1"
## [142,] "DP.DOD.DLOA.CR.M1.BC.Z1"
## [143,] "DP.DOD.DLOA.CR.M1.CG.Z1"
## [144,] "DP.DOD.DLOA.CR.M1.FC.Z1"
## [145,] "DP.DOD.DLOA.CR.M1.GG.Z1"
## [146,] "DP.DOD.DLOA.CR.M1.NF.Z1"
## [147,] "DP.DOD.DLOA.CR.NF.Z1"
## [148,] "DP.DOD.DLSD.CR.BC.Z1"
## [149,] "DP.DOD.DLSD.CR.CG.Z1"
## [150,] "DP.DOD.DLSD.CR.FC.Z1"
## [151,] "DP.DOD.DLSD.CR.GG.Z1"
## [152,] "DP.DOD.DLSD.CR.M1.BC.Z1"
## [153,] "DP.DOD.DLSD.CR.M1.CG.Z1"
## [154,] "DP.DOD.DLSD.CR.M1.FC.Z1"
## [155,] "DP.DOD.DLSD.CR.M1.GG.Z1"
## [156,] "DP.DOD.DLSD.CR.M1.NF.Z1"
## [157,] "DP.DOD.DLSD.CR.NF.Z1"
## [158,] "DP.DOD.DLTC.CR.BC.Z1"
## [159,] "DP.DOD.DLTC.CR.CG.Z1"
## [160,] "DP.DOD.DLTC.CR.FC.Z1"
## [161,] "DP.DOD.DLTC.CR.GG.Z1"
## [162,] "DP.DOD.DLTC.CR.L1.BC.Z1"
## [163,] "DP.DOD.DLTC.CR.L1.CG.Z1"
## [164,] "DP.DOD.DLTC.CR.L1.FC.Z1"
## [165,] "DP.DOD.DLTC.CR.L1.GG.Z1"
## [166,] "DP.DOD.DLTC.CR.L1.NF.Z1"
## [167,] "DP.DOD.DLTC.CR.M1.BC.Z1"
## [168,] "DP.DOD.DLTC.CR.M1.CG.Z1"
## [169,] "DP.DOD.DLTC.CR.M1.FC.Z1"
## [170,] "DP.DOD.DLTC.CR.M1.GG.Z1"
## [171,] "DP.DOD.DLTC.CR.M1.NF.Z1"
## [172,] "DP.DOD.DLTC.CR.NF.Z1"
## [173,] "DP.DOD.DSCD.CR.BC.Z1"
## [174,] "DP.DOD.DSCD.CR.CG.Z1"
## [175,] "DP.DOD.DSCD.CR.FC.Z1"
## [176,] "DP.DOD.DSCD.CR.GG.Z1"
## [177,] "DP.DOD.DSCD.CR.NF.Z1"
## [178,] "DP.DOD.DSDS.CR.BC.Z1"
## [179,] "DP.DOD.DSDS.CR.CG.Z1"
## [180,] "DP.DOD.DSDS.CR.FC.Z1"
## [181,] "DP.DOD.DSDS.CR.GG.Z1"
## [182,] "DP.DOD.DSDS.CR.NF.Z1"
## [183,] "DP.DOD.DSIN.CR.BC.Z1"
## [184,] "DP.DOD.DSIN.CR.CG.Z1"
## [185,] "DP.DOD.DSIN.CR.FC.Z1"
## [186,] "DP.DOD.DSIN.CR.GG.Z1"
## [187,] "DP.DOD.DSIN.CR.NF.Z1"
## [188,] "DP.DOD.DSLO.CR.BC.Z1"
## [189,] "DP.DOD.DSLO.CR.CG.Z1"
## [190,] "DP.DOD.DSLO.CR.FC.Z1"
## [191,] "DP.DOD.DSLO.CR.GG.Z1"
## [192,] "DP.DOD.DSLO.CR.NF.Z1"
## [193,] "DP.DOD.DSOA.CR.BC.Z1"
## [194,] "DP.DOD.DSOA.CR.CG.Z1"
## [195,] "DP.DOD.DSOA.CR.FC.Z1"
## [196,] "DP.DOD.DSOA.CR.GG.Z1"
## [197,] "DP.DOD.DSOA.CR.NF.Z1"
## [198,] "DP.DOD.DSTC.CR.BC.Z1"
## [199,] "DP.DOD.DSTC.CR.CG.Z1"
## [200,] "DP.DOD.DSTC.CR.FC.Z1"
## [201,] "DP.DOD.DSTC.CR.GG.Z1"
## [202,] "DP.DOD.DSTC.CR.NF.Z1"
## [203,] "DT.DOD.ALLC.ZSG"
## [204,] "DT.DOD.ALLN.ZSG"
## [205,] "DT.DOD.DECT.CD.ZSG"
## [206,] "DT.ODA.ALLD.GD.ZS"
## [207,] "DT.ODA.DACD.ZSG"
## [208,] "DT.ODA.MULT.ZSG"
## [209,] "DT.ODA.NDAC.ZSG"
## [210,] "DT.ODA.ODAT.GD.ZS"
## [211,] "DT.TDS.DECT.GD.ZS"
## [212,] "EG.EGY.PRIM.PP.KD"
## [213,] "EG.GDP.PUSE.KO.87"
## [214,] "EG.GDP.PUSE.KO.KD"
## [215,] "EG.GDP.PUSE.KO.PP"
## [216,] "EG.GDP.PUSE.KO.PP.KD"
## [217,] "EG.USE.COMM.GD.PP.KD"
## [218,] "EN.ATM.CO2E.GDP"
## [219,] "EN.ATM.CO2E.KD.87.GD"
## [220,] "EN.ATM.CO2E.KD.GD"
## [221,] "EN.ATM.CO2E.PP.GD"
## [222,] "EN.ATM.CO2E.PP.GD.KD"
## [223,] "ER.GDP.FWTL.M3.KD"
## [224,] "EU.EGY.USES.GDP"
## [225,] "FB.DPT.INSU.PC.ZS"
## [226,] "FD.AST.PRVT.GD.ZS"
## [227,] "FI.RES.TOTL.CD.ZS"
## [228,] "FM.AST.GOVT.CN.ZS"
## [229,] "FM.LBL.BMNY.GD.ZS"
## [230,] "FM.LBL.MQMY.GD.ZS"
## [231,] "FM.LBL.MQMY.GDP.ZS"
## [232,] "FM.LBL.MQMY.XD"
## [233,] "FM.LBL.QMNY.GDP.ZS"
## [234,] "FM.LBL.SEIG.GDP.ZS"
## [235,] "FS.AST.CGOV.GD.ZS"
## [236,] "FS.AST.DOMO.GD.ZS"
## [237,] "FS.AST.DOMS.GD.ZS"
## [238,] "FS.AST.DTOT.ZS"
## [239,] "FS.AST.PRVT.GD.ZS"
## [240,] "FS.AST.PRVT.GDP.ZS"
## [241,] "FS.LBL.LIQU.GD.ZS"
## [242,] "FS.LBL.LIQU.GDP.ZS"
## [243,] "FS.LBL.QLIQ.GD.ZS"
## [244,] "GB.BAL.OVRL.GD.ZS"
## [245,] "GB.BAL.OVRL.GDP.ZS"
## [246,] "GB.DOD.TOTL.GD.ZS"
## [247,] "GB.DOD.TOTL.GDP.ZS"
## [248,] "GB.FIN.ABRD.GD.ZS"
## [249,] "GB.FIN.ABRD.GDP.ZS"
## [250,] "GB.FIN.DOMS.GD.ZS"
## [251,] "GB.FIN.DOMS.GDP.ZS"
## [252,] "GB.REV.CTOT.GD.ZS"
## [253,] "GB.REV.TOTL.GDP.ZS"
## [254,] "GB.REV.XAGT.CN.ZS"
## [255,] "GB.RVC.TOTL.GD.ZS"
## [256,] "GB.SOE.DECT.ZS"
## [257,] "GB.SOE.ECON.GD.ZS"
## [258,] "GB.SOE.ECON.GDP.ZS"
## [259,] "GB.SOE.NFLW.GD.ZS"
## [260,] "GB.SOE.NFLW.GDP.ZS"
## [261,] "GB.SOE.OVRL.GD.ZS"
## [262,] "GB.TAX.TOTL.GD.ZS"
## [263,] "GB.TAX.TOTL.GDP.ZS"
## [264,] "GB.XPD.DEFN.GDP.ZS"
## [265,] "GB.XPD.RSDV.GD.ZS"
## [266,] "GB.XPD.TOTL.GD.ZS"
## [267,] "GB.XPD.TOTL.GDP.ZS"
## [268,] "GC.AST.TOTL.GD.ZS"
## [269,] "GC.BAL.CASH.GD.ZS"
## [270,] "GC.DOD.TOTL.GD.ZS"
## [271,] "GC.FIN.DOMS.GD.ZS"
## [272,] "GC.FIN.FRGN.GD.ZS"
## [273,] "GC.LBL.TOTL.GD.ZS"
## [274,] "GC.NFN.TOTL.GD.ZS"
## [275,] "GC.NLD.TOTL.GD.ZS"
## [276,] "GC.REV.XGRT.GD.ZS"
## [277,] "GC.TAX.TOTL.GD.ZS"
## [278,] "GC.XPN.TOTL.GD.ZS"
## [279,] "GD.ZS"
## [280,] "GFDD.DI.01"
## [281,] "GFDD.DI.02"
## [282,] "GFDD.DI.03"
## [283,] "GFDD.DI.05"
## [284,] "GFDD.DI.06"
## [285,] "GFDD.DI.07"
## [286,] "GFDD.DI.08"
## [287,] "GFDD.DI.09"
## [288,] "GFDD.DI.10"
## [289,] "GFDD.DI.11"
## [290,] "GFDD.DI.12"
## [291,] "GFDD.DI.13"
## [292,] "GFDD.DI.14"
## [293,] "GFDD.DM.01"
## [294,] "GFDD.DM.02"
## [295,] "GFDD.DM.03"
## [296,] "GFDD.DM.04"
## [297,] "GFDD.DM.05"
## [298,] "GFDD.DM.06"
## [299,] "GFDD.DM.07"
## [300,] "GFDD.DM.08"
## [301,] "GFDD.DM.09"
## [302,] "GFDD.DM.10"
## [303,] "GFDD.DM.11"
## [304,] "GFDD.DM.12"
## [305,] "GFDD.DM.13"
## [306,] "GFDD.EI.08"
## [307,] "GFDD.OI.02"
## [308,] "GFDD.OI.08"
## [309,] "GFDD.OI.09"
## [310,] "GFDD.OI.13"
## [311,] "GFDD.OI.14"
## [312,] "GFDD.OI.17"
## [313,] "GFDD.OI.18"
## [314,] "IE.ICT.TOTL.GD.ZS"
## [315,] "IS.RRS.GOOD.KM.PP.ZS"
## [316,] "IS.RRS.PASG.K2.PP.ZS"
## [317,] "IT.TEL.REVN.GD.ZS"
## [318,] "MS.MIL.XPND.GD.ZS"
## [319,] "NA.GDP.ACC.FB.SNA08.CR"
## [320,] "NA.GDP.ACC.FB.SNA08.KR"
## [321,] "NA.GDP.AGR.CR"
## [322,] "NA.GDP.AGR.KR"
## [323,] "NA.GDP.AGR.SNA08.CR"
## [324,] "NA.GDP.AGR.SNA08.KR"
## [325,] "NA.GDP.BUSS.SNA08.CR"
## [326,] "NA.GDP.BUSS.SNA08.KR"
## [327,] "NA.GDP.CNST.CR"
## [328,] "NA.GDP.CNST.KR"
## [329,] "NA.GDP.CNST.SNA08.CR"
## [330,] "NA.GDP.CNST.SNA08.KR"
## [331,] "NA.GDP.EDUS.SNA08.CR"
## [332,] "NA.GDP.EDUS.SNA08.KR"
## [333,] "NA.GDP.ELEC.GAS.SNA08.CR"
## [334,] "NA.GDP.ELEC.GAS.SNA08.KR"
## [335,] "NA.GDP.EXC.OG.CR"
## [336,] "NA.GDP.EXC.OG.KR"
## [337,] "NA.GDP.FINS.CR"
## [338,] "NA.GDP.FINS.KR"
## [339,] "NA.GDP.FINS.SNA08.CR"
## [340,] "NA.GDP.FINS.SNA08.KR"
## [341,] "NA.GDP.HLTH.SOCW.SNA08.CR"
## [342,] "NA.GDP.HLTH.SOCW.SNA08.KR"
## [343,] "NA.GDP.INC.OG.CR"
## [344,] "NA.GDP.INC.OG.KR"
## [345,] "NA.GDP.INC.OG.SNA08.CR"
## [346,] "NA.GDP.INC.OG.SNA08.KR"
## [347,] "NA.GDP.INF.COMM.SNA08.CR"
## [348,] "NA.GDP.INF.COMM.SNA08.KR"
## [349,] "NA.GDP.MINQ.CR"
## [350,] "NA.GDP.MINQ.KR"
## [351,] "NA.GDP.MINQ.SNA08.CR"
## [352,] "NA.GDP.MINQ.SNA08.KR"
## [353,] "NA.GDP.MNF.CR"
## [354,] "NA.GDP.MNF.KR"
## [355,] "NA.GDP.MNF.SNA08.CR"
## [356,] "NA.GDP.MNF.SNA08.KR"
## [357,] "NA.GDP.PADM.DEF.SNA08.CR"
## [358,] "NA.GDP.PADM.DEF.SNA08.KR"
## [359,] "NA.GDP.REST.SNA08.CR"
## [360,] "NA.GDP.REST.SNA08.KR"
## [361,] "NA.GDP.SRV.OTHR.CR"
## [362,] "NA.GDP.SRV.OTHR.KR"
## [363,] "NA.GDP.SRV.OTHR.SNA08.CR"
## [364,] "NA.GDP.SRV.OTHR.SNA08.KR"
## [365,] "NA.GDP.TRAN.COMM.CR"
## [366,] "NA.GDP.TRAN.COMM.KR"
## [367,] "NA.GDP.TRAN.STOR.SNA08.CR"
## [368,] "NA.GDP.TRAN.STOR.SNA08.KR"
## [369,] "NA.GDP.TRD.HTL.CR"
## [370,] "NA.GDP.TRD.HTL.KR"
## [371,] "NA.GDP.TRD.SNA08.CR"
## [372,] "NA.GDP.TRD.SNA08.KR"
## [373,] "NA.GDP.UTL.CR"
## [374,] "NA.GDP.UTL.KR"
## [375,] "NA.GDP.WTR.WST.SNA08.CR"
## [376,] "NA.GDP.WTR.WST.SNA08.KR"
## [377,] "NE.CON.GOVT.ZS"
## [378,] "NE.CON.PETC.ZS"
## [379,] "NE.CON.PRVT.ZS"
## [380,] "NE.CON.TETC.ZS"
## [381,] "NE.CON.TOTL.ZG"
## [382,] "NE.CON.TOTL.ZS"
## [383,] "NE.DAB.TOTL.ZS"
## [384,] "NE.EXP.GNFS.ZS"
## [385,] "NE.GDI.CON.GOVT.CR"
## [386,] "NE.GDI.CON.GOVT.SNA08.CR"
## [387,] "NE.GDI.CON.NPI.CR"
## [388,] "NE.GDI.CON.NPI.SNA08.CR"
## [389,] "NE.GDI.CON.PRVT.CR"
## [390,] "NE.GDI.CON.PRVT.SNA08.CR"
## [391,] "NE.GDI.EXPT.CR"
## [392,] "NE.GDI.EXPT.SNA08.CR"
## [393,] "NE.GDI.FPRV.ZS"
## [394,] "NE.GDI.FPUB.ZS"
## [395,] "NE.GDI.FTOT.CR"
## [396,] "NE.GDI.FTOT.SNA08.CR"
## [397,] "NE.GDI.FTOT.ZS"
## [398,] "NE.GDI.IMPT.CR"
## [399,] "NE.GDI.IMPT.SNA08.CR"
## [400,] "NE.GDI.INEX.SNA08.CR"
## [401,] "NE.GDI.STKB.CR"
## [402,] "NE.GDI.STKB.SNA08.CR"
## [403,] "NE.GDI.TOTL.CR"
## [404,] "NE.GDI.TOTL.SNA08.CR"
## [405,] "NE.GDI.TOTL.ZG"
## [406,] "NE.GDI.TOTL.ZS"
## [407,] "NE.IMP.GNFS.ZS"
## [408,] "NE.MRCH.GDP.ZS"
## [409,] "NE.RSB.GNFS.ZG"
## [410,] "NE.RSB.GNFS.ZS"
## [411,] "NE.TRD.GNFS.ZS"
## [412,] "NP.AGR.TOTL.ZG"
## [413,] "NP.IND.TOTL.ZG"
## [414,] "NP.SRV.TOTL.ZG"
## [415,] "NV.AGR.PCAP.KD.ZG"
## [416,] "NV.AGR.TOTL.ZG"
## [417,] "NV.AGR.TOTL.ZS"
## [418,] "NV.IND.MANF.ZS"
## [419,] "NV.IND.TOTL.ZG"
## [420,] "NV.IND.TOTL.ZS"
## [421,] "NV.SRV.DISC.CD"
## [422,] "NV.SRV.DISC.CN"
## [423,] "NV.SRV.DISC.KN"
## [424,] "NV.SRV.TETC.ZG"
## [425,] "NV.SRV.TETC.ZS"
## [426,] "NV.SRV.TOTL.ZS"
## [427,] "NY.AGR.SUBS.GD.ZS"
## [428,] "NY.GDP.COAL.RT.ZS"
## [429,] "NY.GDP.DEFL.87.ZG"
## [430,] "NY.GDP.DEFL.KD.ZG"
## [431,] "NY.GDP.DEFL.KD.ZG.AD"
## [432,] "NY.GDP.DEFL.ZS"
## [433,] "NY.GDP.DEFL.ZS.87"
## [434,] "NY.GDP.DEFL.ZS.AD"
## [435,] "NY.GDP.DISC.CD"
## [436,] "NY.GDP.DISC.CN"
## [437,] "NY.GDP.DISC.KN"
## [438,] "NY.GDP.FCST.KD.87"
## [439,] "NY.GDP.FCST.KN.87"
## [440,] "NY.GDP.FRST.RT.ZS"
## [441,] "NY.GDP.MINR.RT.ZS"
## [442,] "NY.GDP.MKTP.CD"
## [443,] "NY.GDP.MKTP.CD.XD"
## [444,] "NY.GDP.MKTP.CN"
## [445,] "NY.GDP.MKTP.CN.AD"
## [446,] "NY.GDP.MKTP.CN.XD"
## [447,] "NY.GDP.MKTP.IN"
## [448,] "NY.GDP.MKTP.KD"
## [449,] "NY.GDP.MKTP.KD.87"
## [450,] "NY.GDP.MKTP.KD.ZG"
## [451,] "NY.GDP.MKTP.KN"
## [452,] "NY.GDP.MKTP.KN.87"
## [453,] "NY.GDP.MKTP.KN.87.ZG"
## [454,] "NY.GDP.MKTP.PP.CD"
## [455,] "NY.GDP.MKTP.PP.KD"
## [456,] "NY.GDP.MKTP.PP.KD.87"
## [457,] "NY.GDP.MKTP.XD"
## [458,] "NY.GDP.MKTP.XU.E"
## [459,] "NY.GDP.NGAS.RT.ZS"
## [460,] "NY.GDP.PCAP.CD"
## [461,] "NY.GDP.PCAP.CN"
## [462,] "NY.GDP.PCAP.KD"
## [463,] "NY.GDP.PCAP.KD.ZG"
## [464,] "NY.GDP.PCAP.KN"
## [465,] "NY.GDP.PCAP.PP.CD"
## [466,] "NY.GDP.PCAP.PP.KD"
## [467,] "NY.GDP.PCAP.PP.KD.87"
## [468,] "NY.GDP.PCAP.PP.KD.ZG"
## [469,] "NY.GDP.PETR.RT.ZS"
## [470,] "NY.GDP.TOTL.RT.ZS"
## [471,] "NY.GDS.TOTL.ZS"
## [472,] "NY.GEN.AEDU.GD.ZS"
## [473,] "NY.GEN.DCO2.GD.ZS"
## [474,] "NY.GEN.DFOR.GD.ZS"
## [475,] "NY.GEN.DKAP.GD.ZS"
## [476,] "NY.GEN.DMIN.GD.ZS"
## [477,] "NY.GEN.DNGY.GD.ZS"
## [478,] "NY.GEN.NDOM.GD.ZS"
## [479,] "NY.GEN.SVNG.GD.ZS"
## [480,] "NY.GNS.ICTR.ZS"
## [481,] "NYGDPMKTPKDZ"
## [482,] "NYGDPMKTPSACD"
## [483,] "NYGDPMKTPSACN"
## [484,] "NYGDPMKTPSAKD"
## [485,] "NYGDPMKTPSAKN"
## [486,] "PA.NUS.PPP"
## [487,] "PA.NUS.PPP.05"
## [488,] "PA.NUS.PPPC.RF"
## [489,] "S02"
## [490,] "SE.XPD.EDUC.ZS"
## [491,] "SE.XPD.PRIM.GDP.ZS"
## [492,] "SE.XPD.PRIM.PC.ZS"
## [493,] "SE.XPD.SECO.GDP.ZS"
## [494,] "SE.XPD.SECO.PC.ZS"
## [495,] "SE.XPD.TERT.GDP.ZS"
## [496,] "SE.XPD.TERT.PC.ZS"
## [497,] "SE.XPD.TOTL.GD.ZS"
## [498,] "SF.TRN.RAIL.KM.ZS"
## [499,] "SH.XPD.CHEX.GD.ZS"
## [500,] "SH.XPD.GHED.GD.ZS"
## [501,] "SH.XPD.HLTH.ZS"
## [502,] "SH.XPD.KHEX.GD.ZS"
## [503,] "SH.XPD.PRIV.ZS"
## [504,] "SH.XPD.PUBL.ZS"
## [505,] "SH.XPD.TOTL.ZS"
## [506,] "SL.GDP.PCAP.EM.KD"
## [507,] "SL.GDP.PCAP.EM.KD.ZG"
## [508,] "SL.GDP.PCAP.EM.XD"
## [509,] "TG.VAL.TOTL.GD.PP.ZS"
## [510,] "TG.VAL.TOTL.GD.ZS"
## [511,] "TG.VAL.TOTL.GG.ZS"
## [512,] "UIS.XGDP.0.FSGOV"
## [513,] "UIS.XGDP.02.FSGOV.FFNTR"
## [514,] "UIS.XGDP.1.FSGOV"
## [515,] "UIS.XGDP.1.FSGOV.FFNTR"
## [516,] "UIS.XGDP.1.FSHH.FFNTR"
## [517,] "UIS.XGDP.2.FSGOV"
## [518,] "UIS.XGDP.2.FSGOV.FFNTR"
## [519,] "UIS.XGDP.23.FSGOV"
## [520,] "UIS.XGDP.23.FSHH.FFNTR"
## [521,] "UIS.XGDP.2T3.FSGOV.FFNTR"
## [522,] "UIS.XGDP.2T4.V.FSGOV"
## [523,] "UIS.XGDP.3.FSGOV"
## [524,] "UIS.XGDP.3.FSGOV.FFNTR"
## [525,] "UIS.XGDP.4.FSGOV"
## [526,] "UIS.XGDP.56.FSGOV"
## [527,] "UIS.XGDP.5T8.FSGOV.FFNTR"
## [528,] "UIS.XGDP.5T8.FSHH.FFNTR"
## [529,] "UIS.XGDP.FSGOV.FFNTR"
## [530,] "UIS.XGDP.FSHH.FFNTR"
## [531,] "UIS.XUNIT.GDPCAP.02.FSGOV"
## [532,] "UIS.XUNIT.GDPCAP.1.FSGOV"
## [533,] "UIS.XUNIT.GDPCAP.1.FSHH"
## [534,] "UIS.XUNIT.GDPCAP.2.FSGOV"
## [535,] "UIS.XUNIT.GDPCAP.23.FSGOV"
## [536,] "UIS.XUNIT.GDPCAP.23.FSHH"
## [537,] "UIS.XUNIT.GDPCAP.3.FSGOV"
## [538,] "UIS.XUNIT.GDPCAP.5T8.FSGOV"
## [539,] "UIS.XUNIT.GDPCAP.5T8.FSHH"
## name
## [1,] "Per capita GDP growth"
## [2,] "GDP (current $)"
## [3,] "GDP growth (annual %)"
## [4,] "GDP (constant 2005 $)"
## [5,] "GDP per capita, PPP (constant 2011 international $) "
## [6,] "Trade in services (% of GDP)"
## [7,] "Gross private capital flows (% of GDP, PPP)"
## [8,] "Gross private capital flows (% of GDP)"
## [9,] "Gross foreign direct investment (% of GDP, PPP)"
## [10,] "Gross foreign direct investment (% of GDP)"
## [11,] "Wage bill as a percentage of GDP"
## [12,] "Merchandise imports (BOP): percentage of GDP (%)"
## [13,] "Foreign direct investment, net outflows (% of GDP)"
## [14,] "Foreign direct investment, net outflows (% of GDP)"
## [15,] "Current account balance (% of GDP)"
## [16,] "Current account balance (% of GDP)"
## [17,] "Curr. acc. bal. before official transf. (% of GDP)"
## [18,] "Current account balance excluding net official capital grants (% of GDP)"
## [19,] "Net income (% of GDP)"
## [20,] "Foreign direct investment (% of GDP)"
## [21,] "Foreign direct investment, net inflows (% of GDP)"
## [22,] "Private capital flows, total (% of GDP)"
## [23,] "Net current transfers (% of GDP)"
## [24,] "Merchandise exports (BOP): percentage of GDP (%)"
## [25,] "Foreign direct investment, net inflows (% of GDP)"
## [26,] "Foreign direct investment, net inflows (% of GDP)"
## [27,] "Migrant remittance inflows (% of GDP)"
## [28,] "Personal remittances, received (% of GDP)"
## [29,] "Workers' remittances, receipts (% of GDP)"
## [30,] "Financing via international capital markets (gross inflows, % of GDP)"
## [31,] "Market capitalization of listed domestic companies (% of GDP)"
## [32,] "Stocks traded, total value (% of GDP)"
## [33,] "Gross PSD, Budgetary Central Gov., All maturities, All instruments, Domestic creditors, Nominal Value, % of GDP"
## [34,] "Gross PSD, Central Gov., All maturities, All instruments, Domestic creditors, Nominal Value, % of GDP"
## [35,] "Gross PSD, Financial Public Corp., All maturities, All instruments, Domestic creditors, Nominal Value, % of GDP"
## [36,] "Gross PSD, General Gov., All maturities, All instruments, Domestic creditors, Nominal Value, % of GDP"
## [37,] "Gross PSD, Nonfinancial Public Corp., All maturities, All instruments, Domestic creditors, Nominal Value, % of GDP"
## [38,] "Gross PSD, Budgetary Central Gov., All maturities, All instruments, Foreign currency, Nominal Value, % of GDP"
## [39,] "Gross PSD, Central Gov., All maturities, All instruments, Foreign currency, Nominal Value, % of GDP"
## [40,] "Gross PSD, Financial Public Corp., All maturities, All instruments, Foreign currency, Nominal Value, % of GDP"
## [41,] "Gross PSD, General Gov., All maturities, All instruments, Foreign currency, Nominal Value, % of GDP"
## [42,] "Gross PSD, Nonfinancial Public Corp., All maturities, All instruments, Foreign currency, Nominal Value, % of GDP"
## [43,] "Gross PSD, Budgetary Central Gov., All maturities, All instruments, Domestic currency, Nominal Value, % of GDP"
## [44,] "Gross PSD, Central Gov., All maturities, All instruments, Domestic currency, Nominal Value, % of GDP"
## [45,] "Gross PSD, Financial Public Corp., All maturities, All instruments, Domestic currency, Nominal Value, % of GDP"
## [46,] "Gross PSD, General Gov., All maturities, All instruments, Domestic currency, Nominal Value, % of GDP"
## [47,] "Gross PSD, Nonfinancial Public Corp., All maturities, All instruments, Domestic currency, Nominal Value, % of GDP"
## [48,] "Gross PSD, Budgetary Central Gov., All maturities, All instruments, Nominal Value, % of GDP"
## [49,] "Gross PSD, Central Gov., All maturities, All instruments, Nominal Value, % of GDP"
## [50,] "Gross PSD, Financial Public Corp., All maturities, All instruments, Nominal Value, % of GDP"
## [51,] "Gross PSD, General Gov., All maturities, All instruments, Nominal Value, % of GDP"
## [52,] "Gross PSD, Nonfinancial Public Corp., All maturities, All instruments, Nominal Value, % of GDP"
## [53,] "Gross PSD, Budgetary Central Gov., All maturities, All instruments, External creditors, Nominal Value, % of GDP"
## [54,] "Gross PSD, Central Gov., All maturities, All instruments, External creditors, Nominal Value, % of GDP"
## [55,] "Gross PSD, Financial Public Corp., All maturities, All instruments, External creditors, Nominal Value, % of GDP"
## [56,] "Gross PSD, General Gov., All maturities, All instruments, External creditors, Nominal Value, % of GDP"
## [57,] "Gross PSD, Nonfinancial Public Corp., All maturities, All instruments, External creditors, Nominal Value, % of GDP"
## [58,] "Gross PSD, Budgetary Central Gov., All maturities, Currency and deposits, Nominal Value, % of GDP"
## [59,] "Gross PSD, Central Gov., All maturities, Currency and deposits, Nominal Value, % of GDP"
## [60,] "Gross PSD, Financial Public Corp., All maturities, Currency and deposits, Nominal Value, % of GDP"
## [61,] "Gross PSD, General Gov., All maturities, Currency and deposits, Nominal Value, % of GDP"
## [62,] "Gross PSD, Budgetary Central Gov., Long-term, With payment due in one year or less, Currency and deposits, Nominal Value, % of GDP"
## [63,] "Gross PSD, Central Gov., Long-term, With payment due in one year or less, Currency and deposits, Nominal Value, % of GDP"
## [64,] "Gross PSD, Financial Public Corp., Long-term, With payment due in one year or less, Currency and deposits, Nominal Value, % of GDP"
## [65,] "Gross PSD, General Gov., Long-term, With payment due in one year or less, Currency and deposits, Nominal Value, % of GDP"
## [66,] "Gross PSD, Nonfinancial Public Corp., Long-term, With payment due in one year or less, Currency and deposits, Nominal Value, % of GDP"
## [67,] "Gross PSD, Budgetary Central Gov., Long-term, With payment due in more than one year, Currency and deposits, Nominal Value, % of GDP"
## [68,] "Gross PSD, Central Gov., Long-term, With payment due in more than one year, Currency and deposits, Nominal Value, % of GDP"
## [69,] "Gross PSD, Financial Public Corp., Long-term, With payment due in more than one year, Currency and deposits, Nominal Value, % of GDP"
## [70,] "Gross PSD, General Gov., Long-term, With payment due in more than one year, Currency and deposits, Nominal Value, % of GDP"
## [71,] "Gross PSD, Nonfinancial Public Corp., Long-term, With payment due in more than one year, Currency and deposits, Nominal Value, % of GDP"
## [72,] "Gross PSD, Nonfinancial Public Corp., All maturities, Currency and deposits, Nominal Value, % of GDP"
## [73,] "Gross PSD, Central Gov.-D1, All maturities, Debt securities + loans, Nominal Value, % of GDP"
## [74,] "Gross PSD, General Gov.-D1, All maturities, Debt securities + loans, Nominal Value, % of GDP"
## [75,] "Gross PSD, Central Gov.-D2, All maturities, D1+ SDRs + currency and deposits, Nominal Value, % of GDP"
## [76,] "Gross PSD, General Gov.-D2, All maturities, D1+ SDRs + currency and deposits, Nominal Value, % of GDP"
## [77,] "Gross PSD, Central Gov.-D2A, All maturities, D1+ currency and deposits, Maastricht debt, % of GDP"
## [78,] "Gross PSD, General Gov.-D2A, All maturities, D1+ currency and deposits, Maastricht debt, % of GDP"
## [79,] "Gross PSD, Central Gov.-D3, All maturities, D2+other accounts payable, Nominal Value, % of GDP"
## [80,] "Gross PSD, General Gov.-D3, All maturities, D2+other accounts payable, Nominal Value, % of GDP"
## [81,] "Gross PSD, Central Gov.-D4, All maturities, D3+insurance, pensions, and standardized guarantees, Nominal Value, % of GDP"
## [82,] "Gross PSD, General Gov.-D4, All maturities, D3+insurance, pensions, and standardized guarantees, Nominal Value, % of GDP"
## [83,] "Gross PSD, Budgetary Central Gov., All maturities, Debt securities, Nominal Value, % of GDP"
## [84,] "Gross PSD, Central Gov., All maturities, Debt securities, Nominal Value, % of GDP"
## [85,] "Gross PSD, Financial Public Corp., All maturities, Debt securities, Nominal Value, % of GDP"
## [86,] "Gross PSD, General Gov., All maturities, Debt securities, Nominal Value, % of GDP"
## [87,] "Gross PSD, Budgetary Central Gov., Long-term, With payment due in one year or less, Debt securities, Nominal Value, % of GDP"
## [88,] "Gross PSD, Central Gov., Long-term, With payment due in one year or less, Debt securities, Nominal Value, % of GDP"
## [89,] "Gross PSD, Financial Public Corp., Long-term, With payment due in one year or less, Debt securities, Nominal Value, % of GDP"
## [90,] "Gross PSD, General Gov., Long-term, With payment due in one year or less, Debt securities, Nominal Value, % of GDP"
## [91,] "Gross PSD, Nonfinancial Public Corp., Long-term, With payment due in one year or less, Debt securities, Nominal Value, % of GDP"
## [92,] "Gross PSD, Budgetary Central Gov., Long-term, With payment due in more than one year, Debt securities, Nominal Value, % of GDP"
## [93,] "Gross PSD, Central Gov., Long-term, With payment due in more than one year, Debt securities, Nominal Value, % of GDP"
## [94,] "Gross PSD, Financial Public Corp., Long-term, With payment due in more than one year, Debt securities, Nominal Value, % of GDP"
## [95,] "Gross PSD, General Gov., Long-term, With payment due in more than one year, Debt securities, Nominal Value, % of GDP"
## [96,] "Gross PSD, Nonfinancial Public Corp., Long-term, With payment due in more than one year, Debt securities, Nominal Value, % of GDP"
## [97,] "Gross PSD, Budgetary Central Gov., All maturities, Debt Securities, Market value, % of GDP"
## [98,] "Gross PSD, Central Gov., All maturities, Debt Securities, Market value, % of GDP"
## [99,] "Gross PSD, Financial Public Corp., All maturities, Debt Securities, Market value, % of GDP"
## [100,] "Gross PSD, General Gov., All maturities, Debt Securities, Market value, % of GDP"
## [101,] "Gross PSD, Nonfinancial Public Corp., All maturities, Debt Securities, Market value, % of GDP"
## [102,] "Gross PSD, Nonfinancial Public Corp., All maturities, Debt securities, Nominal Value, % of GDP"
## [103,] "Gross PSD, Budgetary Central Gov., All maturities, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [104,] "Gross PSD, Central Gov., All maturities, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [105,] "Gross PSD, Financial Public Corp., All maturities, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [106,] "Gross PSD, General Gov., All maturities, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [107,] "Gross PSD, Budgetary Central Gov., Long-term, With payment due in one year or less, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [108,] "Gross PSD, Central Gov., Long-term, With payment due in one year or less, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [109,] "Gross PSD, Financial Public Corp., Long-term, With payment due in one year or less, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [110,] "Gross PSD, General Gov., Long-term, With payment due in one year or less, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [111,] "Gross PSD, Nonfinancial Public Corp., Long-term, With payment due in one year or less, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [112,] "Gross PSD, Budgetary Central Gov., Long-term, With payment due in more than one year, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [113,] "Gross PSD, Central Gov., Long-term, With payment due in more than one year, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [114,] "Gross PSD, Financial Public Corp., Long-term, With payment due in more than one year, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [115,] "Gross PSD, General Gov., Long-term, With payment due in more than one year, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [116,] "Gross PSD, Nonfinancial Public Corp., Long-term, With payment due in more than one year, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [117,] "Gross PSD, Nonfinancial Public Corp., All maturities, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [118,] "Gross PSD, Budgetary Central Gov., All maturities, Loans, Nominal Value, % of GDP"
## [119,] "Gross PSD, Central Gov., All maturities, Loans, Nominal Value, % of GDP"
## [120,] "Gross PSD, Financial Public Corp., All maturities, Loans, Nominal Value, % of GDP"
## [121,] "Gross PSD, General Gov., All maturities, Loans, Nominal Value, % of GDP"
## [122,] "Gross PSD, Budgetary Central Gov., Long-term, With payment due in one year or less, Loans, Nominal Value, % of GDP"
## [123,] "Gross PSD, Central Gov., Long-term, With payment due in one year or less, Loans, Nominal Value, % of GDP"
## [124,] "Gross PSD, Financial Public Corp., Long-term, With payment due in one year or less, Loans, Nominal Value, % of GDP"
## [125,] "Gross PSD, General Gov., Long-term, With payment due in one year or less, Loans, Nominal Value, % of GDP"
## [126,] "Gross PSD, Nonfinancial Public Corp., Long-term, With payment due in one year or less, Loans, Nominal Value, % of GDP"
## [127,] "Gross PSD, Budgetary Central Gov., Long-term, With payment due in more than one year, Loans, Nominal Value, % of GDP"
## [128,] "Gross PSD, Central Gov., Long-term, With payment due in more than one year, Loans, Nominal Value, % of GDP"
## [129,] "Gross PSD, Financial Public Corp., Long-term, With payment due in more than one year, Loans, Nominal Value, % of GDP"
## [130,] "Gross PSD, General Gov., Long-term, With payment due in more than one year, Loans, Nominal Value, % of GDP"
## [131,] "Gross PSD, Nonfinancial Public Corp., Long-term, With payment due in more than one year, Loans, Nominal Value, % of GDP"
## [132,] "Gross PSD, Nonfinancial Public Corp., All maturities, Loans, Nominal Value, % of GDP"
## [133,] "Gross PSD, Budgetary Central Gov., All maturities, Other accounts payable, Nominal Value, % of GDP"
## [134,] "Gross PSD, Central Gov., All maturities, Other accounts payable, Nominal Value, % of GDP"
## [135,] "Gross PSD, Financial Public Corp., All maturities, Other accounts payable, Nominal Value, % of GDP"
## [136,] "Gross PSD, General Gov., All maturities, Other accounts payable, Nominal Value, % of GDP"
## [137,] "Gross PSD, Budgetary Central Gov., Long-term, With payment due in one year or less, Other accounts payable, Nominal Value, % of GDP"
## [138,] "Gross PSD, Central Gov., Long-term, With payment due in one year or less, Other accounts payable, Nominal Value, % of GDP"
## [139,] "Gross PSD, Financial Public Corp., Long-term, With payment due in one year or less, Other accounts payable, Nominal Value, % of GDP"
## [140,] "Gross PSD, General Gov., Long-term, With payment due in one year or less, Other accounts payable, Nominal Value, % of GDP"
## [141,] "Gross PSD, Nonfinancial Public Corp., Long-term, With payment due in one year or less, Other accounts payable, Nominal Value, % of GDP"
## [142,] "Gross PSD, Budgetary Central Gov., Long-term, With payment due in more than one year, Other accounts payable, Nominal Value, % of GDP"
## [143,] "Gross PSD, Central Gov., Long-term, With payment due in more than one year, Other accounts payable, Nominal Value, % of GDP"
## [144,] "Gross PSD, Financial Public Corp., Long-term, With payment due in more than one year, Other accounts payable, Nominal Value, % of GDP"
## [145,] "Gross PSD, General Gov., Long-term, With payment due in more than one year, Other accounts payable, Nominal Value, % of GDP"
## [146,] "Gross PSD, Nonfinancial Public Corp., Long-term, With payment due in more than one year, Other accounts payable, Nominal Value, % of GDP"
## [147,] "Gross PSD, Nonfinancial Public Corp., All maturities, Other accounts payable, Nominal Value, % of GDP"
## [148,] "Gross PSD, Budgetary Central Gov., All maturities, Special Drawing Rights, Nominal Value, % of GDP"
## [149,] "Gross PSD, Central Gov., All maturities, Special Drawing Rights, Nominal Value, % of GDP"
## [150,] "Gross PSD, Financial Public Corp., All maturities, Special Drawing Rights, Nominal Value, % of GDP"
## [151,] "Gross PSD, General Gov., All maturities, Special Drawing Rights, Nominal Value, % of GDP"
## [152,] "Gross PSD, Budgetary Central Gov., Long-term, With payment due in more than one year, Special Drawing Rights, Nominal Value, % of GDP"
## [153,] "Gross PSD, Central Gov., Long-term, With payment due in more than one year, Special Drawing Rights, Nominal Value, % of GDP"
## [154,] "Gross PSD, Financial Public Corp., Long-term, With payment due in more than one year, Special Drawing Rights, Nominal Value, % of GDP"
## [155,] "Gross PSD, General Gov., Long-term, With payment due in more than one year, Special Drawing Rights, Nominal Value, % of GDP"
## [156,] "Gross PSD, Nonfinancial Public Corp., Long-term, With payment due in more than one year, Special Drawing Rights, Nominal Value, % of GDP"
## [157,] "Gross PSD, Nonfinancial Public Corp., All maturities, Special Drawing Rights, Nominal Value, % of GDP"
## [158,] "Gross PSD, Budgetary Central Gov., Long-term, All instruments, Nominal Value, % of GDP"
## [159,] "Gross PSD, Central Gov., Long-term, All instruments, Nominal Value, % of GDP"
## [160,] "Gross PSD, Financial Public Corp., Long-term, All instruments, Nominal Value, % of GDP"
## [161,] "Gross PSD, General Gov., Long-term, All instruments, Nominal Value, % of GDP"
## [162,] "Gross PSD, Budgetary Central Gov., Long-term, With payment due in one year or less, All instruments, Nominal Value, % of GDP"
## [163,] "Gross PSD, Central Gov., Long-term, With payment due in one year or less, All instruments, Nominal Value, % of GDP"
## [164,] "Gross PSD, Financial Public Corp., Long-term, With payment due in one year or less, All instruments, Nominal Value, % of GDP"
## [165,] "Gross PSD, General Gov., Long-term, With payment due in one year or less, All instruments, Nominal Value, % of GDP"
## [166,] "Gross PSD, Nonfinancial Public Corp., Long-term, With payment due in one year or less, All instruments, Nominal Value, % of GDP"
## [167,] "Gross PSD, Budgetary Central Gov., Long-term, With payment due in more than one year, All instruments, Nominal Value, % of GDP"
## [168,] "Gross PSD, Central Gov., Long-term, With payment due in more than one year, All instruments, Nominal Value, % of GDP"
## [169,] "Gross PSD, Financial Public Corp., Long-term, With payment due in more than one year, All instruments, Nominal Value, % of GDP"
## [170,] "Gross PSD, General Gov., Long-term, With payment due in more than one year, All instruments, Nominal Value, % of GDP"
## [171,] "Gross PSD, Nonfinancial Public Corp., Long-term, With payment due in more than one year, All instruments, Nominal Value, % of GDP"
## [172,] "Gross PSD, Nonfinancial Public Corp., Long-term, All instruments, Nominal Value, % of GDP"
## [173,] "Gross PSD, Budgetary Central Gov., Short-term, Currency and deposits, Nominal Value, % of GDP"
## [174,] "Gross PSD, Central Gov., Short-term, Currency and deposits, Nominal Value, % of GDP"
## [175,] "Gross PSD, Financial Public Corp., Short-term, Currency and deposits, Nominal Value, % of GDP"
## [176,] "Gross PSD, General Gov., Short-term, Currency and deposits, Nominal Value, % of GDP"
## [177,] "Gross PSD, Nonfinancial Public Corp., Short-term, Currency and deposits, Nominal Value, % of GDP"
## [178,] "Gross PSD, Budgetary Central Gov., Short-term, Debt securities, Nominal Value, % of GDP"
## [179,] "Gross PSD, Central Gov., Short-term, Debt securities, Nominal Value, % of GDP"
## [180,] "Gross PSD, Financial Public Corp., Short-term, Debt securities, Nominal Value, % of GDP"
## [181,] "Gross PSD, General Gov., Short-term, Debt securities, Nominal Value, % of GDP"
## [182,] "Gross PSD, Nonfinancial Public Corp., Short-term, Debt securities, Nominal Value, % of GDP"
## [183,] "Gross PSD, Budgetary Central Gov., Short-term, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [184,] "Gross PSD, Central Gov., Short-term, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [185,] "Gross PSD, Financial Public Corp., Short-term, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [186,] "Gross PSD, General Gov., Short-term, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [187,] "Gross PSD, Nonfinancial Public Corp., Short-term, Insurance, pensions, and standardized guarantee schemes, Nominal Value, % of GDP"
## [188,] "Gross PSD, Budgetary Central Gov., Short-term, Loans, Nominal Value, % of GDP"
## [189,] "Gross PSD, Central Gov., Short-term, Loans, Nominal Value, % of GDP"
## [190,] "Gross PSD, Financial Public Corp., Short-term, Loans, Nominal Value, % of GDP"
## [191,] "Gross PSD, General Gov., Short-term, Loans, Nominal Value, % of GDP"
## [192,] "Gross PSD, Nonfinancial Public Corp., Short-term, Loans, Nominal Value, % of GDP"
## [193,] "Gross PSD, Budgetary Central Gov., Short-term, Other accounts payable, Nominal Value, % of GDP"
## [194,] "Gross PSD, Central Gov., Short-term, Other accounts payable, Nominal Value, % of GDP"
## [195,] "Gross PSD, Financial Public Corp., Short-term, Other accounts payable, Nominal Value, % of GDP"
## [196,] "Gross PSD, General Gov., Short-term, Other accounts payable, Nominal Value, % of GDP"
## [197,] "Gross PSD, Nonfinancial Public Corp., Short-term, Other accounts payable, Nominal Value, % of GDP"
## [198,] "Gross PSD, Budgetary Central Gov., Short-term, All instruments, Nominal Value, % of GDP"
## [199,] "Gross PSD, Central Gov., Short-term, All instruments, Nominal Value, % of GDP"
## [200,] "Gross PSD, Financial Public Corp., Short-term, All instruments, Nominal Value, % of GDP"
## [201,] "Gross PSD, General Gov., Short-term, All instruments, Nominal Value, % of GDP"
## [202,] "Gross PSD, Nonfinancial Public Corp., Short-term, All instruments, Nominal Value, % of GDP"
## [203,] "Debt on Concessional terms to GDP (% of GDP)"
## [204,] "Debt on Non-concessional terms to GDP (% of GDP)"
## [205,] "Debt outstanding and disbursed, Total to GDP (% of GDP)"
## [206,] "Net ODA received (% of GDP)"
## [207,] "Net ODA received from DAC donors (% of recipient's GDP)"
## [208,] "Net ODA received from multilateral donors (% of GDP)"
## [209,] "Net ODA received from non-DAC bilateral donors (% of GDP)"
## [210,] "Net ODA received (% of GDP)"
## [211,] "Total debt service (% of GDP)"
## [212,] "Energy intensity level of primary energy (MJ/$2011 PPP GDP)"
## [213,] "GDP per unit of energy use (1987 US$ per kg of oil equivalent)"
## [214,] "GDP per unit of energy use (2000 US$ per kg of oil equivalent)"
## [215,] "GDP per unit of energy use (PPP $ per kg of oil equivalent)"
## [216,] "GDP per unit of energy use (constant 2017 PPP $ per kg of oil equivalent)"
## [217,] "Energy use (kg of oil equivalent) per $1,000 GDP (constant 2017 PPP)"
## [218,] "CO2 emissions, industrial (kg per 1987 US$ of GDP)"
## [219,] "CO2 emissions, industrial (kg per 1987 US$ of GDP)"
## [220,] "CO2 emissions (kg per 2010 US$ of GDP)"
## [221,] "CO2 emissions (kg per PPP $ of GDP)"
## [222,] "CO2 emissions (kg per 2017 PPP $ of GDP)"
## [223,] "Water productivity, total (constant 2010 US$ GDP per cubic meter of total freshwater withdrawal)"
## [224,] "GDP per unit of energy use (1987 US$ per kg of oil equivalent)"
## [225,] "Deposit insurance coverage (% of GDP per capita)"
## [226,] "Domestic credit to private sector by banks (% of GDP)"
## [227,] "Total reserves includes gold (% of GDP)"
## [228,] "Claims on governments and other public entities (% of GDP)"
## [229,] "Broad money (% of GDP)"
## [230,] "Money and quasi money (M2) as % of GDP"
## [231,] "Money and quasi money (M2) as % of GDP"
## [232,] "Income velocity of money (GDP/M2)"
## [233,] "Quasi-liquid liabilities (% of GDP)"
## [234,] "Seignorage (% of GDP)"
## [235,] "Claims on central government, etc. (% GDP)"
## [236,] "Claims on other sectors of the domestic economy (% of GDP)"
## [237,] "Domestic credit provided by financial sector (% of GDP)"
## [238,] "Domestic credit provided by banking sector (% of GDP)"
## [239,] "Domestic credit to private sector (% of GDP)"
## [240,] "Credit to private sector (% of GDP)"
## [241,] "Liquid liabilities (M3) as % of GDP"
## [242,] "Liquid liabilities (M3) as % of GDP"
## [243,] "Quasi-liquid liabilities (% of GDP)"
## [244,] "Overall budget balance, including grants (% of GDP)"
## [245,] "Overall budget deficit, including grants (% of GDP)"
## [246,] "Central government debt, total (% of GDP)"
## [247,] "Central government debt, total (% of GDP)"
## [248,] "Financing from abroad (% of GDP)"
## [249,] "Financing from abroad (% of GDP)"
## [250,] "Domestic financing, total (% of GDP)"
## [251,] "Domestic finanacing (% of GDP)"
## [252,] "Current revenue, excluding grants (% of GDP)"
## [253,] "Current revenue (% of GDP)"
## [254,] "Central government revenues, excluding all grants (% of GDP)"
## [255,] "Current revenue, excluding grants (% of GDP)"
## [256,] "SOE external debt (% of GDP)"
## [257,] "State-owned enterprises, economic activity (% of GDP)"
## [258,] "SOE economic activity (% of GDP)"
## [259,] "State-owned enterprises, net financial flows from government (% of GDP)"
## [260,] "SOE net financial flows from government (% of GDP)"
## [261,] "State-owned enterprises, overall balance before transfers (% of GDP)"
## [262,] "Tax revenue (% of GDP)"
## [263,] "Tax revenue (% of GDP)"
## [264,] "Defense expenditure (% of GDP)"
## [265,] "Research and development expenditure (% of GDP)"
## [266,] "Expenditure, total (% of GDP)"
## [267,] "Total expenditure (% of GDP)"
## [268,] "Net acquisition of financial assets (% of GDP)"
## [269,] "Cash surplus/deficit (% of GDP)"
## [270,] "Central government debt, total (% of GDP)"
## [271,] "Net incurrence of liabilities, domestic (% of GDP)"
## [272,] "Net incurrence of liabilities, foreign (% of GDP)"
## [273,] "Net incurrence of liabilities, total (% of GDP)"
## [274,] "Net investment in nonfinancial assets (% of GDP)"
## [275,] "Net lending (+) / net borrowing (-) (% of GDP)"
## [276,] "Revenue, excluding grants (% of GDP)"
## [277,] "Tax revenue (% of GDP)"
## [278,] "Expense (% of GDP)"
## [279,] "Expenditure shares of GDP (percentage share, GDP=100, XR term)"
## [280,] "Private credit by deposit money banks to GDP (%)"
## [281,] "Deposit money banks'' assets to GDP (%)"
## [282,] "Nonbank financial institutions’ assets to GDP (%)"
## [283,] "Liquid liabilities to GDP (%)"
## [284,] "Central bank assets to GDP (%)"
## [285,] "Mutual fund assets to GDP (%)"
## [286,] "Financial system deposits to GDP (%)"
## [287,] "Life insurance premium volume to GDP (%)"
## [288,] "Non-life insurance premium volume to GDP (%)"
## [289,] "Insurance company assets to GDP (%)"
## [290,] "Private credit by deposit money banks and other financial institutions to GDP (%)"
## [291,] "Pension fund assets to GDP (%)"
## [292,] "Domestic credit to private sector (% of GDP)"
## [293,] "Stock market capitalization to GDP (%)"
## [294,] "Stock market total value traded to GDP (%)"
## [295,] "Outstanding domestic private debt securities to GDP (%)"
## [296,] "Outstanding domestic public debt securities to GDP (%)"
## [297,] "Outstanding international private debt securities to GDP (%)"
## [298,] "Outstanding international public debt securities to GDP (%)"
## [299,] "International debt issues to GDP (%)"
## [300,] "Gross portfolio equity liabilities to GDP (%)"
## [301,] "Gross portfolio equity assets to GDP (%)"
## [302,] "Gross portfolio debt liabilities to GDP (%)"
## [303,] "Gross portfolio debt assets to GDP (%)"
## [304,] "Syndicated loan issuance volume to GDP (%)"
## [305,] "Corporate bond issuance volume to GDP (%)"
## [306,] "Credit to government and state-owned enterprises to GDP (%)"
## [307,] "Bank deposits to GDP (%)"
## [308,] "Loans from nonresident banks (net) to GDP (%)"
## [309,] "Loans from nonresident banks (amounts outstanding) to GDP (%)"
## [310,] "Remittance inflows to GDP (%)"
## [311,] "Consolidated foreign claims of BIS reporting banks to GDP (%)"
## [312,] "Global leasing volume to GDP (%)"
## [313,] "Total factoring volume to GDP (%)"
## [314,] "Information and communication technology expenditure (% of GDP)"
## [315,] "Railways, goods transported (ton-km per PPP $ million of GDP)"
## [316,] "Railways, passenger-km (per PPP $ million of GDP)"
## [317,] "Telecommunications revenue (% GDP)"
## [318,] "Military expenditure (% of GDP)"
## [319,] "GDP on Accommodation & Food Beverages Activity Sector (in IDR Million), SNA 2008, Current Price"
## [320,] "GDP on Accommodation & Food Beverages Activity Sector (in IDR Million), SNA 2008, Constant Price"
## [321,] "GDP on Agriculture Sector (in IDR Million), Current Price"
## [322,] "GDP on Agriculture Sector (in IDR Million), Constant Price"
## [323,] "GDP on Agriculture, Forestry & Fisheries Sector (in IDR Million), SNA 2008, Current Price"
## [324,] "GDP on Agriculture, Forestry & Fisheries Sector (in IDR Million), SNA 2008, Constant Price"
## [325,] "GDP on Business Services Sector (in IDR Million), SNA 2008, Current Price"
## [326,] "GDP on Business Services Sector (in IDR Million), SNA 2008, Constant Price"
## [327,] "GDP on Construction Sector (in IDR Million), Current Price"
## [328,] "GDP on Construction Sector (in IDR Million), Constant Price"
## [329,] "GDP on Construction Sector (in IDR Million), SNA 2008, Current Price"
## [330,] "GDP on Construction Sector (in IDR Million), SNA 2008, Constant Price"
## [331,] "GDP on Education Services Sector (in IDR Million), SNA 2008, Current Price"
## [332,] "GDP on Education Services Sector (in IDR Million), SNA 2008, Constant Price"
## [333,] "GDP on Electricity & Gas Supply Sector (in IDR Million), SNA 2008, Current Price"
## [334,] "GDP on Electricity & Gas Supply Sector (in IDR Million), SNA 2008, Constant Price"
## [335,] "Total GDP excluding Oil and Gas (in IDR Million), Current Price"
## [336,] "Total GDP excluding Oil and Gas (in IDR Million), Constant Price"
## [337,] "GDP on Financial Service Sector (in IDR Million), Current Price"
## [338,] "GDP on Financial Service Sector (in IDR Million), Constant Price"
## [339,] "GDP on Financial & Insurance Activity Sector (in IDR Million), SNA 2008, Current Price"
## [340,] "GDP on Financial & Insurance Activity Sector (in IDR Million), SNA 2008, Constant Price"
## [341,] "GDP on Human Health & Social Work Activity Sector (in IDR Million), SNA 2008, Current Price"
## [342,] "GDP on Human Health & Social Work Activity Sector (in IDR Million), SNA 2008, Constant Price"
## [343,] "Total GDP including Oil and Gas (in IDR Million), Current Price"
## [344,] "Total GDP including Oil and Gas (in IDR Million), Constant Price"
## [345,] "Total GDP including Oil and Gas (in IDR Million), SNA 2008, Current Price"
## [346,] "Total GDP including Oil and Gas (in IDR Million), SNA 2008, Constant Price"
## [347,] "GDP on Information & Communication Sector (in IDR Million), SNA 2008, Current Price"
## [348,] "GDP on Information & Communication Sector (in IDR Million), SNA 2008, Constant Price"
## [349,] "GDP on Mining and Quarrying Sector (in IDR Million), Current Price"
## [350,] "GDP on Mining and Quarrying Sector (in IDR Million), Constant Price"
## [351,] "GDP on Mining & Quarrying Sector (in IDR Million), SNA 2008, Current Price"
## [352,] "GDP on Mining & Quarrying Sector (in IDR Million), SNA 2008, Constant Price"
## [353,] "GDP on Manufacturing Sector (in IDR Million), Current Price"
## [354,] "GDP on Manufacturing Sector (in IDR Million), Constant Price"
## [355,] "GDP on Manufacturing Industry Sector (in IDR Million), SNA 2008, Current Price"
## [356,] "GDP on Manufacturing Industry Sector (in IDR Million), SNA 2008, Constant Price"
## [357,] "GDP on Public Administration, Defense & Compulsory Social Security Sector (in IDR Million), SNA 2008, Current Price"
## [358,] "GDP on Public Administration, Defense & Compulsory Social Security Sector (in IDR Million), SNA 2008, Constant Price"
## [359,] "GDP on Real Estate Sector (in IDR Million), SNA 2008, Current Price"
## [360,] "GDP on Real Estate Sector (in IDR Million), SNA 2008, Constant Price"
## [361,] "GDP on Other Service Sector (in IDR Million), Current Price"
## [362,] "GDP on Other Service Sector (in IDR Million), Constant Price"
## [363,] "GDP on Other Services Sector (in IDR Million), SNA 2008, Current Price"
## [364,] "GDP on Other Services Sector (in IDR Million), SNA 2008, Constant Price"
## [365,] "GDP on Transportation and Telecommunication Sector (in IDR Million), Current Price"
## [366,] "GDP on Transportation and Telecommunication Sector (in IDR Million), Constant Price"
## [367,] "GDP on Transportation & Storage Sector (in IDR Million), SNA 2008, Current Price"
## [368,] "GDP on Transportation & Storage Sector (in IDR Million), SNA 2008, Constant Price"
## [369,] "GDP on Trade, Hotel and Restaurant Sector (in IDR Million), Current Price"
## [370,] "GDP on Trade, Hotel and Restaurant Sector (in IDR Million), Constant Price"
## [371,] "GDP on Wholesales & Retail Trade, Repair of Motor Vehicles & Motorcycles Sector (in IDR Million), SNA 2008, Current Price"
## [372,] "GDP on Wholesales & Retail Trade, Repair of Motor Vehicles & Motorcycles Sector (in IDR Million), SNA 2008, Constant Price"
## [373,] "GDP on Utilities Sector (in IDR Million), Current Price"
## [374,] "GDP on Utilities Sector (in IDR Million), Constant Price"
## [375,] "GDP on Water Supply, Sewerage, Waste & Recycling Management Sector (in IDR Million), SNA 2008, Current Price"
## [376,] "GDP on Water Supply, Sewerage, Waste & Recycling Management Sector (in IDR Million), SNA 2008, Constant Price"
## [377,] "General government final consumption expenditure (% of GDP)"
## [378,] "Household final consumption expenditure, etc. (% of GDP)"
## [379,] "Households and NPISHs final consumption expenditure (% of GDP)"
## [380,] "Final consumption expenditure, etc. (% of GDP)"
## [381,] "Total consumption: contribution to growth of GDP (%)"
## [382,] "Final consumption expenditure (% of GDP)"
## [383,] "Gross national expenditure (% of GDP)"
## [384,] "Exports of goods and services (% of GDP)"
## [385,] "GDP expenditure on general government consumption (in IDR Million)"
## [386,] "GDP expenditure on general government consumption (in IDR Million), SNA 2008, Current Price"
## [387,] "GDP expenditure on non profit private institution consumption (in IDR Million)"
## [388,] "GDP expenditure on non profit private institution consumption (in IDR Million), SNA 2008, Current Price"
## [389,] "GDP expenditure on private consumption (in IDR Million)"
## [390,] "GDP expenditure on private consumption (in IDR Million), SNA 2008, Current Price"
## [391,] "GDP expenditure on exports (in IDR Million)"
## [392,] "GDP expenditure on exports (in IDR Million), SNA 2008, Current Price"
## [393,] "Gross fixed capital formation, private sector (% of GDP)"
## [394,] "Gross public investment (% of GDP)"
## [395,] "GDP expenditure on gross fixed capital formation (in IDR Million)"
## [396,] "GDP expenditure on gross fixed capital formation (in IDR Million), SNA 2008, Current Price"
## [397,] "Gross fixed capital formation (% of GDP)"
## [398,] "GDP expenditure on imports (in IDR Million)"
## [399,] "GDP expenditure on imports (in IDR Million), SNA 2008, Current Price"
## [400,] "GDP expenditure on inter-region net exports (in IDR Million), SNA 2008, Current Price"
## [401,] "GDP expenditure on changes in stock (in IDR Million)"
## [402,] "GDP expenditure on changes in stock (in IDR Million), SNA 2008, Current Price"
## [403,] "Total GDP based on expenditure (in IDR Million)"
## [404,] "Total GDP based on expenditure (in IDR Million), SNA 2008, Current Price"
## [405,] "Gross domestic investment: contr. to growth of GDP(%)"
## [406,] "Gross capital formation (% of GDP)"
## [407,] "Imports of goods and services (% of GDP)"
## [408,] "Merchandise trade to GDP ratio (%)"
## [409,] "Resource balance: contribution to growth of GDP (%)"
## [410,] "External balance on goods and services (% of GDP)"
## [411,] "Trade (% of GDP)"
## [412,] "Agriculture: contribution to growth of GDP (%)"
## [413,] "Industry: contribution to growth of GDP (%)"
## [414,] "Services: contribution to growth of GDP (%)"
## [415,] "Real agricultural GDP per capita growth rate (%)"
## [416,] "Real agricultural GDP growth rates (%)"
## [417,] "Agriculture, forestry, and fishing, value added (% of GDP)"
## [418,] "Manufacturing, value added (% of GDP)"
## [419,] "Industry: contribution to growth of GDP (%)"
## [420,] "Industry (including construction), value added (% of GDP)"
## [421,] "Discrepancy in GDP, value added (current US$)"
## [422,] "Discrepancy in GDP, value added (current LCU)"
## [423,] "Discrepancy in GDP, value added (constant LCU)"
## [424,] "Services: contribution to growth of GDP (%)"
## [425,] "Services, etc., value added (% of GDP)"
## [426,] "Services, value added (% of GDP)"
## [427,] "Agricultural support estimate (% of GDP)"
## [428,] "Coal rents (% of GDP)"
## [429,] "Inflation, GDP deflator (annual %)"
## [430,] "Inflation, GDP deflator (annual %)"
## [431,] "Inflation, GDP deflator: linked series (annual %)"
## [432,] "GDP deflator (base year varies by country)"
## [433,] "GDP deflator (1987 = 100)"
## [434,] "GDP deflator: linked series (base year varies by country)"
## [435,] "Discrepancy in expenditure estimate of GDP (current US$)"
## [436,] "Discrepancy in expenditure estimate of GDP (current LCU)"
## [437,] "Discrepancy in expenditure estimate of GDP (constant LCU)"
## [438,] "GDP at factor cost (constant 1987 US$)"
## [439,] "GDP at factor cost (constant 1987 LCU)"
## [440,] "Forest rents (% of GDP)"
## [441,] "Mineral rents (% of GDP)"
## [442,] "GDP (current US$)"
## [443,] "GDP deflator, index (2000=100; US$ series)"
## [444,] "GDP (current LCU)"
## [445,] "GDP: linked series (current LCU)"
## [446,] "GDP deflator, period average (LCU index 2000=100)"
## [447,] "GDP Deflator"
## [448,] "GDP (constant 2010 US$)"
## [449,] "GDP at market prices (constant 1987 US$)"
## [450,] "GDP growth (annual %)"
## [451,] "GDP (constant LCU)"
## [452,] "GDP at market prices (constant 1987 LCU)"
## [453,] "GDP growth (annual %)"
## [454,] "GDP, PPP (current international $)"
## [455,] "GDP, PPP (constant 2017 international $)"
## [456,] "GDP, PPP (constant 1987 international $)"
## [457,] "GDP deflator (1987=100,Index)"
## [458,] "GDP deflator, end period (base year varies by country)"
## [459,] "Natural gas rents (% of GDP)"
## [460,] "GDP per capita (current US$)"
## [461,] "GDP per capita (current LCU)"
## [462,] "GDP per capita (constant 2010 US$)"
## [463,] "GDP per capita growth (annual %)"
## [464,] "GDP per capita (constant LCU)"
## [465,] "GDP per capita, PPP (current international $)"
## [466,] "GDP per capita, PPP (constant 2017 international $)"
## [467,] "GDP per capita, PPP (constant 1987 international $)"
## [468,] "GDP per capita, PPP annual growth (%)"
## [469,] "Oil rents (% of GDP)"
## [470,] "Total natural resources rents (% of GDP)"
## [471,] "Gross domestic savings (% of GDP)"
## [472,] "Genuine savings: education expenditure (% of GDP)"
## [473,] "Genuine savings: carbon dioxide damage (% of GDP)"
## [474,] "Genuine savings: net forest depletion (% of GDP)"
## [475,] "Genuine savings: consumption of fixed capital (% of GDP)"
## [476,] "Genuine savings: mineral depletion (% of GDP)"
## [477,] "Genuine savings: energy depletion (% of GDP)"
## [478,] "Genuine savings: net domestic savings (% of GDP)"
## [479,] "Genuine domestic savings (% of GDP)"
## [480,] "Gross savings (% of GDP)"
## [481,] "Annual percentage growth rate of GDP at market prices based on constant 2010 US Dollars."
## [482,] "GDP,current US$,millions,seas. adj.,"
## [483,] "GDP,current LCU,millions,seas. adj.,"
## [484,] "GDP,constant 2010 US$,millions,seas. adj.,"
## [485,] "GDP,constant 2010 LCU,millions,seas. adj.,"
## [486,] "PPP conversion factor, GDP (LCU per international $)"
## [487,] "2005 PPP conversion factor, GDP (LCU per international $)"
## [488,] "Price level ratio of PPP conversion factor (GDP) to market exchange rate"
## [489,] "EXPENDITURE SHARES (GDP=100)"
## [490,] "Public Expenditure on Education (% GDP)"
## [491,] "Public spending on education, primary (% of GDP)"
## [492,] "Government expenditure per student, primary (% of GDP per capita)"
## [493,] "Public spending on education, secondary (% of GDP)"
## [494,] "Government expenditure per student, secondary (% of GDP per capita)"
## [495,] "Public spending on education, tertiary (% of GDP)"
## [496,] "Government expenditure per student, tertiary (% of GDP per capita)"
## [497,] "Government expenditure on education, total (% of GDP)"
## [498,] "Rail traffic (km per million US$ GDP)"
## [499,] "Current health expenditure (% of GDP)"
## [500,] "Domestic general government health expenditure (% of GDP)"
## [501,] "Public Expenditure on Health (% GDP)"
## [502,] "Capital health expenditure (% of GDP)"
## [503,] "Health expenditure, private (% of GDP)"
## [504,] "Health expenditure, public (% of GDP)"
## [505,] "Health expenditure, total (% of GDP)"
## [506,] "GDP per person employed (constant 2017 PPP $)"
## [507,] "GDP per person employed (annual % growth)"
## [508,] "GDP per person employed, index (1980 = 100)"
## [509,] "Trade (% of GDP, PPP)"
## [510,] "Merchandise trade (% of GDP)"
## [511,] "Trade in goods (% of goods GDP)"
## [512,] "Government expenditure on pre-primary education as % of GDP (%)"
## [513,] "Initial government funding of pre-primary education as a percentage of GDP (%)"
## [514,] "Government expenditure on primary education as % of GDP (%)"
## [515,] "Initial government funding of primary education as a percentage of GDP (%)"
## [516,] "Initial household funding of primary education as a percentage of GDP"
## [517,] "Government expenditure on lower secondary education as a percentage of GDP (%)"
## [518,] "Initial government funding of lower secondary education as a percentage of GDP (%)"
## [519,] "Government expenditure on secondary education as % of GDP (%)"
## [520,] "Initial household funding of secondary education as a percentage of GDP"
## [521,] "Initial government funding of secondary education as a percentage of GDP (%)"
## [522,] "Government expenditure on secondary and post-secondary non-tertiary vocational education as % of GDP (%)"
## [523,] "Government expenditure on upper secondary education as a percentage of GDP (%)"
## [524,] "Initial government funding of upper secondary education as a percentage of GDP (%)"
## [525,] "Government expenditure on post-secondary non-tertiary education as % of GDP (%)"
## [526,] "Government expenditure on tertiary education as % of GDP (%)"
## [527,] "Initial government funding of tertiary education as a percentage of GDP (%)"
## [528,] "Initial household funding of tertiary education as a percentage of GDP"
## [529,] "Initial government funding of education as a percentage of GDP (%)"
## [530,] "Initial household funding of education as a percentage of GDP"
## [531,] "Initial government funding per pre-primary student as a percentage of GDP per capita"
## [532,] "Initial government funding per primary student as a percentage of GDP per capita"
## [533,] "Initial household funding per primary student as a percentage of GDP per capita"
## [534,] "Initial government funding per lower secondary student as a percentage of GDP per capita"
## [535,] "Initial government funding per secondary student as a percentage of GDP per capita"
## [536,] "Initial household funding per secondary student as a percentage of GDP per capita"
## [537,] "Initial government funding per upper secondary student as a percentage of GDP per capita"
## [538,] "Initial government funding per tertiary student as a percentage of GDP per capita"
## [539,] "Initial household funding per tertiary student as a percentage of GDP per capita"
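A listing this long is hard to scan by eye, so it can help to filter the search result by a phrase in the indicator name. The following is a minimal sketch, assuming the search result is a character matrix with columns `indicator` and `name` (the `[n,]` row labels in the printed output suggest a matrix). The small matrix `res` is a hypothetical stand-in built by hand so that the example runs without a network connection. In practice it would be the object returned by the search.

```r
# Hypothetical stand-in for the search result: a character matrix with
# one row per indicator and the columns "indicator" and "name".
res <- cbind(
  indicator = c("NY.GDP.MKTP.CD", "NY.GDP.PCAP.CD", "NY.GDP.PCAP.KD"),
  name      = c("GDP (current US$)",
                "GDP per capita (current US$)",
                "GDP per capita (constant 2010 US$)")
)

# Keep only the rows whose name contains the literal phrase "per capita".
# drop = FALSE keeps the result a matrix even if a single row matches.
hits <- res[grepl("per capita", res[, "name"], fixed = TRUE), , drop = FALSE]

print(hits)
```

The `indicator` column of `hits` then gives the code to pass to `WDI()`, such as `NY.GDP.PCAP.KD` in the next step.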
# GDP per capita: gross domestic product divided by population.
# The indicator NY.GDP.PCAP.KD is expressed in constant US dollars.
dat = WDI(indicator='NY.GDP.PCAP.KD', country=c('MX','CA','US'), start=1960, end=2012)
head(dat)
## iso2c country NY.GDP.PCAP.KD year
## 1 CA Canada 48785.94 2012
## 2 CA Canada 48464.50 2011
## 3 CA Canada 47448.01 2010
## 4 CA Canada 46540.64 2009
## 5 CA Canada 48495.20 2008
## 6 CA Canada 48534.17 2007
tail(dat)
## iso2c country NY.GDP.PCAP.KD year
## 154 US United States 20831.39 1965
## 155 US United States 19824.67 1964
## 156 US United States 18999.97 1963
## 157 US United States 18463.01 1962
## 158 US United States 17671.22 1961
## 159 US United States 17562.67 1960
ggplot(dat, aes(year, NY.GDP.PCAP.KD, color=country)) + geom_line() +
  xlab('Year') + ylab('GDP per capita')
dat2 = WDI(indicator='NY.GDP.PCAP.KD', country=c("EC"), start=1960, end=2012) # the ISO 2-letter code for Ecuador is "EC"
head(dat2)
## iso2c country NY.GDP.PCAP.KD year
## 1 EC Ecuador 5122.180 2012
## 2 EC Ecuador 4921.848 2011
## 3 EC Ecuador 4633.590 2010
## 4 EC Ecuador 4547.509 2009
## 5 EC Ecuador 4596.145 2008
## 6 EC Ecuador 4393.724 2007
ggplot(dat2, aes(year, NY.GDP.PCAP.KD, color=country)) + geom_line() +
  xlab('Year') + ylab('GDP per capita')
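Beyond plotting the level, the same series can be summarized as a year-over-year growth rate. The sketch below is an illustration rather than part of the original analysis. It rebuilds the six Ecuador rows printed by `head(dat2)` above so that it runs offline, and the column name `growth_pct` is an assumption introduced here.

```r
# Rows copied from the head(dat2) output above (Ecuador, 2007-2012).
ec <- data.frame(
  year = 2012:2007,
  NY.GDP.PCAP.KD = c(5122.180, 4921.848, 4633.590,
                     4547.509, 4596.145, 4393.724)
)

# The data arrive with the most recent year first, so sort ascending
# before taking differences between consecutive years.
ec <- ec[order(ec$year), ]

# Year-over-year growth in percent. The first year has no predecessor,
# so its growth is NA.
ec$growth_pct <- c(NA, diff(ec$NY.GDP.PCAP.KD) / head(ec$NY.GDP.PCAP.KD, -1) * 100)

print(ec)
```

With the full `dat2` data frame, the same three steps apply unchanged, because only the `year` and `NY.GDP.PCAP.KD` columns are used.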