install.packages(c("readxl", "tidyr", "dplyr", "knitr", "kableExtra", "plotly", "readr", "ggplot2", "rworldmap"))
This dataset is used to monitor progress toward Sustainable Development Goal (SDG) 7, which aims to ensure access to affordable, reliable, sustainable, and modern energy for all. Specifically, it reports the percentage of the population with access to electricity, a key indicator for measuring progress toward this global goal.
The dataset spans multiple years, which makes it possible to observe trends in electricity access over time. It also covers the whole world, with breakdowns by country, region, and continent, which allows geographic comparisons.
These data are crucial for governments and international organizations when formulating development policies and energy planning. They help identify regions that lag behind in electrification and where more investment and effort are needed. They also allow researchers to study correlations between electricity access and other development indicators, such as education, health, and economic growth.
The data come from “Tracking SDG 7: The Energy Progress Report” (2023), a collaborative report by the International Energy Agency (IEA), the International Renewable Energy Agency (IRENA), the United Nations Statistics Division (UNSD), the World Bank, and the World Health Organization (WHO).
NOTE: The data were last updated on June 28, 2024. Although the data come from reliable sources, accuracy can vary by region and with each country's data collection capacity; in this case, the electrification data are obtained from industry, national surveys, and international sources. The dataset provides a general overview, but it may not capture disparities within countries, such as differences between urban and rural areas.
# Read the data
library(readxl)  # read_excel()
library(tidyr)   # pivot_longer()
archivo <- file.choose()
datas <- read_excel(archivo, skip = 3)
# Transform the dataset from wide to long format for better organization: instead of one column per year, a single Year column holds all the years.
data <- pivot_longer(datas,
                     cols = starts_with("19") | starts_with("20"),
                     names_to = "Year",
                     values_to = "Percentage")
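To make the wide-to-long transformation concrete, here is a minimal base R sketch on a made-up two-country table. It does not use pivot_longer(); stack() is swapped in so the example runs without extra packages, and the data frame wide and its values are hypothetical.

```r
# Hypothetical wide table: one row per country, one column per year
wide <- data.frame(
  country = c("A", "B"),
  `1990`  = c(50, 90),
  `1991`  = c(55, 92),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

year_cols <- c("1990", "1991")

# stack() piles the year columns on top of each other:
# 'values' holds the cell values, 'ind' the name of the source column
stacked <- stack(wide[, year_cols])

long <- data.frame(
  country    = rep(wide$country, times = length(year_cols)),
  Year       = as.numeric(as.character(stacked$ind)),
  Percentage = stacked$values
)

long
dim(long)  # 2 countries x 2 years = 4 rows, 3 columns
```

pivot_longer() returns the rows grouped by country rather than by year, but the content is the same.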
kable(data[1:20, ], caption = "Table 1: Access to electricity (% of population) - World") %>%
  kable_styling(full_width = TRUE) %>%
  scroll_box(width = "900px", height = "450px")
| Country Name | Country Code | Indicator Name | Indicator Code | Year | Percentage |
|---|---|---|---|---|---|
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1960 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1961 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1962 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1963 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1964 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1965 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1966 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1967 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1968 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1969 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1970 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1971 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1972 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1973 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1974 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1975 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1976 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1977 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1978 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1979 | NA |
# Check whether there are any problems in the dataset; none are reported
print(problems(data))
## # A tibble: 0 × 4
## # ℹ 4 variables: row <int>, col <int>, expected <chr>, actual <chr>
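This indicates that no problems were found in the data: the tibble has 0 rows and 4 columns, that is, it is empty. Since problems() is a readr function that reports parsing issues, this result should not be taken as a full validation of data read with read_excel().
The variables are:
# Show the variable names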
variables <- names(data)
variables
## [1] "Country Name" "Country Code" "Indicator Name" "Indicator Code"
## [5] "Year" "Percentage"
Meaning of each variable:
-Country Name: The name of the country or region for which the data are recorded; useful for per-country analysis or for comparing countries and regions.
-Country Code: A standardized code that represents the country or region; it makes it easier to identify and merge data at the country or region level.
-Indicator Name: The name of the indicator or metric being measured, describing which aspect of the data is evaluated. Here it is: Access to electricity (% of population).
-Indicator Code: A unique, standardized identifier for the indicator. Here it is EG.ELC.ACCS.ZS, which refers to the percentage of the population with access to electricity.
-Year: The year in which the data were recorded; it allows analyzing how the data evolve over time.
-Percentage: The value of the indicator for the given country or region and year, that is, the measured percentage of the population with access to electricity.
We verify the data by looking at the head and tail of the dataset:
head(data)
## # A tibble: 6 × 6
## `Country Name` `Country Code` `Indicator Name` `Indicator Code` Year
## <chr> <chr> <chr> <chr> <chr>
## 1 Aruba ABW Access to electricity (%… EG.ELC.ACCS.ZS 1960
## 2 Aruba ABW Access to electricity (%… EG.ELC.ACCS.ZS 1961
## 3 Aruba ABW Access to electricity (%… EG.ELC.ACCS.ZS 1962
## 4 Aruba ABW Access to electricity (%… EG.ELC.ACCS.ZS 1963
## 5 Aruba ABW Access to electricity (%… EG.ELC.ACCS.ZS 1964
## 6 Aruba ABW Access to electricity (%… EG.ELC.ACCS.ZS 1965
## # ℹ 1 more variable: Percentage <dbl>
tail(data)
## # A tibble: 6 × 6
## `Country Name` `Country Code` `Indicator Name` `Indicator Code` Year
## <chr> <chr> <chr> <chr> <chr>
## 1 Zimbabwe ZWE Access to electricity (%… EG.ELC.ACCS.ZS 2018
## 2 Zimbabwe ZWE Access to electricity (%… EG.ELC.ACCS.ZS 2019
## 3 Zimbabwe ZWE Access to electricity (%… EG.ELC.ACCS.ZS 2020
## 4 Zimbabwe ZWE Access to electricity (%… EG.ELC.ACCS.ZS 2021
## 5 Zimbabwe ZWE Access to electricity (%… EG.ELC.ACCS.ZS 2022
## 6 Zimbabwe ZWE Access to electricity (%… EG.ELC.ACCS.ZS 2023
## # ℹ 1 more variable: Percentage <dbl>
Dimensions (number of rows and columns):
dim(datas)
## [1] 266 68
Before converting from wide to long we had:
Rows: 266
Columns: 68
dim(data)
## [1] 17024 6
Now we have:
Rows: 17024
Columns: 6
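The change in dimensions can be checked with simple arithmetic. The sketch below assumes, based on the column names shown earlier, that 4 of the 68 original columns are identifier columns and the rest are year columns; the variable names are illustrative.

```r
# Dimensions of the wide table reported by dim(datas)
n_rows_wide <- 266
n_cols_wide <- 68

# Assumption: the four identifier columns are Country Name, Country Code,
# Indicator Name and Indicator Code; every other column is a year.
n_id_cols   <- 4
n_year_cols <- n_cols_wide - n_id_cols

# In long format each original row is repeated once per year column,
# and the year columns collapse into Year and Percentage.
n_rows_long <- n_rows_wide * n_year_cols
n_cols_long <- n_id_cols + 2

c(rows = n_rows_long, cols = n_cols_long)
```

The result agrees with dim(data), and the 64 year columns correspond to the years 1960 through 2023 seen in the head and tail of the data.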
We now analyze the structure of the dataset:
# Convert Year to numeric: pivot_longer() stores the former column names as character
data$Year <- as.numeric(data$Year)
str(data)
## tibble [17,024 × 6] (S3: tbl_df/tbl/data.frame)
## $ Country Name : chr [1:17024] "Aruba" "Aruba" "Aruba" "Aruba" ...
## $ Country Code : chr [1:17024] "ABW" "ABW" "ABW" "ABW" ...
## $ Indicator Name: chr [1:17024] "Access to electricity (% of population)" "Access to electricity (% of population)" "Access to electricity (% of population)" "Access to electricity (% of population)" ...
## $ Indicator Code: chr [1:17024] "EG.ELC.ACCS.ZS" "EG.ELC.ACCS.ZS" "EG.ELC.ACCS.ZS" "EG.ELC.ACCS.ZS" ...
## $ Year : num [1:17024] 1960 1961 1962 1963 1964 ...
## $ Percentage : num [1:17024] NA NA NA NA NA NA NA NA NA NA ...
This shows the type of each variable:
-Country Name: character (nominal), since it classifies the data into different countries with no implicit order.
-Country Code: character (nominal); it represents each country with a unique identifier, with no ordering among the codes.
-Indicator Name: character (nominal); it describes the type of indicator measured and has no intrinsic order.
-Indicator Code: character (nominal); there is no order, it only provides a unique identifier for the indicator.
-Year: numeric (discrete). Time is continuous by nature; however, in this analysis the data are grouped by year, so Year is treated as a discrete variable that takes specific year values with no intermediate values.
-Percentage: numeric (continuous); the values are percentages that can take any value within a range.
Percentage of the population with access to electricity, by year:
data$Year <- as.numeric(data$Year)
# Compute descriptive statistics by year
summary_stats_by_year <- data %>%
  filter(Year >= 1990 & Year <= 2022) %>%
  group_by(Year) %>%
  summarise(
    Media = mean(Percentage, na.rm = TRUE),
    Mediana = median(Percentage, na.rm = TRUE),
    DE = sd(Percentage, na.rm = TRUE),
    CV = (sd(Percentage, na.rm = TRUE) / mean(Percentage, na.rm = TRUE)) * 100,
    P10 = quantile(Percentage, 0.10, na.rm = TRUE),
    P25 = quantile(Percentage, 0.25, na.rm = TRUE),
    P50 = quantile(Percentage, 0.50, na.rm = TRUE),
    P75 = quantile(Percentage, 0.75, na.rm = TRUE),
    P90 = quantile(Percentage, 0.90, na.rm = TRUE)
  )
# Show results
kable(summary_stats_by_year,
      format = "html",
      caption = "Descriptive Statistics of the Percentage of Access to Electricity by Year") %>%
  kable_styling(full_width = TRUE,
                bootstrap_options = c("striped", "hover", "condensed", "responsive")) %>%
  scroll_box(width = "100%", height = "450px")
| Year | Media | Mediana | DE | CV | P10 | P25 | P50 | P75 | P90 |
|---|---|---|---|---|---|---|---|---|---|
| 1990 | 96.60344 | 100.00000 | 13.17630 | 13.63958 | 93.66691 | 99.98384 | 100.00000 | 100.00000 | 100 |
| 1991 | 92.22483 | 100.00000 | 20.62335 | 22.36204 | 72.29256 | 99.70235 | 100.00000 | 100.00000 | 100 |
| 1992 | 87.97584 | 100.00000 | 25.75285 | 29.27264 | 46.14000 | 92.98772 | 100.00000 | 100.00000 | 100 |
| 1993 | 83.84999 | 100.00000 | 28.28227 | 33.72960 | 31.14730 | 79.73337 | 100.00000 | 100.00000 | 100 |
| 1994 | 82.80036 | 100.00000 | 29.15067 | 35.20597 | 28.65856 | 76.88352 | 100.00000 | 100.00000 | 100 |
| 1995 | 81.96506 | 100.00000 | 29.45562 | 35.93680 | 29.08626 | 74.03815 | 100.00000 | 100.00000 | 100 |
| 1996 | 79.53655 | 99.73293 | 30.74388 | 38.65378 | 27.29021 | 67.16998 | 99.73293 | 100.00000 | 100 |
| 1997 | 78.53863 | 99.27018 | 31.83974 | 40.54023 | 22.60707 | 67.33043 | 99.27018 | 100.00000 | 100 |
| 1998 | 77.66918 | 98.46965 | 31.83716 | 40.99072 | 22.06097 | 63.86436 | 98.46965 | 100.00000 | 100 |
| 1999 | 77.16029 | 97.00000 | 31.98964 | 41.45868 | 20.82163 | 57.95112 | 97.00000 | 100.00000 | 100 |
| 2000 | 74.56921 | 93.40000 | 32.53102 | 43.62528 | 17.64000 | 54.17194 | 93.40000 | 100.00000 | 100 |
| 2001 | 75.10784 | 94.00000 | 32.25984 | 42.95137 | 19.89532 | 53.80000 | 94.00000 | 99.99699 | 100 |
| 2002 | 75.36080 | 94.40000 | 32.01645 | 42.48423 | 19.36763 | 51.47500 | 94.40000 | 100.00000 | 100 |
| 2003 | 76.03507 | 94.20000 | 31.35657 | 41.23962 | 20.83110 | 53.35000 | 94.20000 | 99.92310 | 100 |
| 2004 | 76.47813 | 94.38374 | 30.96661 | 40.49080 | 23.63856 | 54.40000 | 94.38374 | 99.90000 | 100 |
| 2005 | 76.95829 | 94.80000 | 30.66280 | 39.84340 | 23.46864 | 58.47500 | 94.80000 | 99.82979 | 100 |
| 2006 | 77.89855 | 96.25000 | 30.09695 | 38.63609 | 26.40000 | 59.02500 | 96.25000 | 99.90000 | 100 |
| 2007 | 77.71576 | 95.35000 | 30.10853 | 38.74186 | 25.97172 | 57.40000 | 95.35000 | 99.99743 | 100 |
| 2008 | 78.38178 | 96.35000 | 29.81651 | 38.04010 | 25.99276 | 60.50000 | 96.35000 | 100.00000 | 100 |
| 2009 | 78.52833 | 96.50000 | 29.75137 | 37.88616 | 26.26873 | 60.60000 | 96.50000 | 100.00000 | 100 |
| 2010 | 79.39060 | 96.80000 | 29.25521 | 36.84971 | 29.60000 | 64.19702 | 96.80000 | 100.00000 | 100 |
| 2011 | 80.18188 | 96.90000 | 28.42810 | 35.45452 | 34.45601 | 66.20000 | 96.90000 | 100.00000 | 100 |
| 2012 | 80.86841 | 97.20000 | 27.82299 | 34.40526 | 36.45625 | 68.05561 | 97.20000 | 100.00000 | 100 |
| 2013 | 81.47073 | 98.00000 | 27.53789 | 33.80096 | 34.90000 | 69.47448 | 98.00000 | 100.00000 | 100 |
| 2014 | 82.17171 | 98.10000 | 26.99668 | 32.85398 | 34.43980 | 71.50000 | 98.10000 | 100.00000 | 100 |
| 2015 | 82.74035 | 98.60000 | 26.47847 | 32.00188 | 37.90131 | 73.40000 | 98.60000 | 100.00000 | 100 |
| 2016 | 83.84689 | 99.00526 | 25.49844 | 30.41072 | 41.32000 | 76.15000 | 99.00526 | 100.00000 | 100 |
| 2017 | 84.59178 | 99.27957 | 24.93902 | 29.48161 | 42.98000 | 79.35000 | 99.27957 | 100.00000 | 100 |
| 2018 | 85.21799 | 99.60000 | 24.08158 | 28.25880 | 45.32000 | 80.85000 | 99.60000 | 100.00000 | 100 |
| 2019 | 85.77764 | 99.60000 | 23.82623 | 27.77674 | 46.62000 | 83.75000 | 99.60000 | 100.00000 | 100 |
| 2020 | 86.31129 | 99.70000 | 23.42005 | 27.13439 | 47.67711 | 85.80000 | 99.70000 | 100.00000 | 100 |
| 2021 | 86.87050 | 99.97632 | 22.87507 | 26.33238 | 49.06000 | 86.07485 | 99.97632 | 100.00000 | 100 |
| 2022 | 87.16016 | 100.00000 | 22.49044 | 25.80358 | 50.20000 | 85.56137 | 100.00000 | 100.00000 | 100 |
Analysis:
-The mean (Media) is the average of the data, but it is sensitive to extreme or atypical values; in this analysis it was computed per year.
-The median (Mediana) is the value that splits a dataset into two equal halves, and it is not influenced by extreme or atypical values. It can also be obtained from the percentiles, since it corresponds to the 50th percentile, which is why P50 and the median are identical in every row.
-Percentiles divide a dataset into 100 equal parts, which helps describe how the data are distributed by position. For example, P25 is the first quartile, the value below which 25% of the data fall. In 2001 the P25 is 53.8, so in that year a quarter of the countries and regions with data had an access rate of 53.8% or less.
-The standard deviation (DE) measures the dispersion or variability of the data around the mean, that is, how far the values deviate from the average. Continuing the example, in 2001 the standard deviation was 32.25984, a wide spread around the mean of 75.10784: half of the values lie at or above 94% (the median), while countries with very low access (P10 is 19.89532) pull the mean down.
-The coefficient of variation (CV) measures relative dispersion, expressed as a percentage of the mean. It is typically used to compare variability between datasets with different units or magnitudes; this dataset has no other numeric variables, but the CV still lets us compare the variability of the percentages across years, since it normalizes the dispersion by the mean. For instance, the CV was 42.95137 in 2001 and 39.84340 in 2005, so the relative variability was higher in 2001.
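The measures above can be reproduced on a small vector to see how they react to a skewed distribution. This is only an illustrative sketch: the five values are invented, not taken from the dataset, and the variable names simply mirror the column names used in the summary table.

```r
# Hypothetical access rates (%) for five countries; not taken from the dataset
access <- c(10, 95, 100, 100, 100)

media   <- mean(access)                    # pulled down by the single low value
mediana <- median(access)                  # unaffected by the low value
de      <- sd(access)                      # sample standard deviation
cv      <- de / media * 100                # dispersion relative to the mean, in %
p25     <- unname(quantile(access, 0.25))  # first quartile (R's default method)

# P25 describes the position of countries within the distribution:
# it is not the share of the population that has electricity.
round(c(Media = media, Mediana = mediana, DE = de, CV = cv, P25 = p25), 2)
```

As in the yearly table, the mean sits well below the median because a few low values weigh on it, while the median stays at the level of the majority.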
This behavior can be seen graphically using the mean:
plot <- ggplot(summary_stats_by_year, aes(x = Year, y = Media)) +
  geom_line(color = "blue") +
  geom_point(color = "purple") +
  labs(title = "Mean Percentage of Access to Electricity by Year",
       x = "Year",
       y = "Mean Percentage of Access to Electricity") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
interactive <- ggplotly(plot, tooltip = c("x", "y"))
interactive
As seen above, the categorical variables in this dataset are “Country Name”, “Country Code”, “Indicator Name”, and “Indicator Code”. Counting them is not informative, since the counts do not represent a meaningful frequency; however, they can be used to verify the information described earlier.
The only indicator is the percentage of the population with access to electricity:
unique(data$`Indicator Name`)
## [1] "Access to electricity (% of population)"
unique(data$`Indicator Code`)
## [1] "EG.ELC.ACCS.ZS"
We can also look at the countries and codes; only some are shown because there are too many to list.
summary(as.factor(data$`Country Name`))
## Afghanistan
## 64
## Africa Eastern and Southern
## 64
## Africa Western and Central
## 64
## Albania
## 64
## Algeria
## 64
## American Samoa
## 64
## Andorra
## 64
## Angola
## 64
## Antigua and Barbuda
## 64
## Arab World
## 64
## Argentina
## 64
## Armenia
## 64
## Aruba
## 64
## Australia
## 64
## Austria
## 64
## Azerbaijan
## 64
## Bahamas, The
## 64
## Bahrain
## 64
## Bangladesh
## 64
## Barbados
## 64
## Belarus
## 64
## Belgium
## 64
## Belize
## 64
## Benin
## 64
## Bermuda
## 64
## Bhutan
## 64
## Bolivia
## 64
## Bosnia and Herzegovina
## 64
## Botswana
## 64
## Brazil
## 64
## British Virgin Islands
## 64
## Brunei Darussalam
## 64
## Bulgaria
## 64
## Burkina Faso
## 64
## Burundi
## 64
## Cabo Verde
## 64
## Cambodia
## 64
## Cameroon
## 64
## Canada
## 64
## Caribbean small states
## 64
## Cayman Islands
## 64
## Central African Republic
## 64
## Central Europe and the Baltics
## 64
## Chad
## 64
## Channel Islands
## 64
## Chile
## 64
## China
## 64
## Colombia
## 64
## Comoros
## 64
## Congo, Dem. Rep.
## 64
## Congo, Rep.
## 64
## Costa Rica
## 64
## Cote d'Ivoire
## 64
## Croatia
## 64
## Cuba
## 64
## Curacao
## 64
## Cyprus
## 64
## Czechia
## 64
## Denmark
## 64
## Djibouti
## 64
## Dominica
## 64
## Dominican Republic
## 64
## Early-demographic dividend
## 64
## East Asia & Pacific
## 64
## East Asia & Pacific (excluding high income)
## 64
## East Asia & Pacific (IDA & IBRD countries)
## 64
## Ecuador
## 64
## Egypt, Arab Rep.
## 64
## El Salvador
## 64
## Equatorial Guinea
## 64
## Eritrea
## 64
## Estonia
## 64
## Eswatini
## 64
## Ethiopia
## 64
## Euro area
## 64
## Europe & Central Asia
## 64
## Europe & Central Asia (excluding high income)
## 64
## Europe & Central Asia (IDA & IBRD countries)
## 64
## European Union
## 64
## Faroe Islands
## 64
## Fiji
## 64
## Finland
## 64
## Fragile and conflict affected situations
## 64
## France
## 64
## French Polynesia
## 64
## Gabon
## 64
## Gambia, The
## 64
## Georgia
## 64
## Germany
## 64
## Ghana
## 64
## Gibraltar
## 64
## Greece
## 64
## Greenland
## 64
## Grenada
## 64
## Guam
## 64
## Guatemala
## 64
## Guinea
## 64
## Guinea-Bissau
## 64
## Guyana
## 64
## (Other)
## 10688
summary(as.factor(data$`Country Code`))
## ABW AFE AFG AFW AGO ALB AND ARB ARE ARG
## 64 64 64 64 64 64 64 64 64 64
## ARM ASM ATG AUS AUT AZE BDI BEL BEN BFA
## 64 64 64 64 64 64 64 64 64 64
## BGD BGR BHR BHS BIH BLR BLZ BMU BOL BRA
## 64 64 64 64 64 64 64 64 64 64
## BRB BRN BTN BWA CAF CAN CEB CHE CHI CHL
## 64 64 64 64 64 64 64 64 64 64
## CHN CIV CMR COD COG COL COM CPV CRI CSS
## 64 64 64 64 64 64 64 64 64 64
## CUB CUW CYM CYP CZE DEU DJI DMA DNK DOM
## 64 64 64 64 64 64 64 64 64 64
## DZA EAP EAR EAS ECA ECS ECU EGY EMU ERI
## 64 64 64 64 64 64 64 64 64 64
## ESP EST ETH EUU FCS FIN FJI FRA FRO FSM
## 64 64 64 64 64 64 64 64 64 64
## GAB GBR GEO GHA GIB GIN GMB GNB GNQ GRC
## 64 64 64 64 64 64 64 64 64 64
## GRD GRL GTM GUM GUY HIC HKG HND HPC (Other)
## 64 64 64 64 64 64 64 64 64 10688
The frequency of every country and code is 64. This is a consequence of the earlier wide-to-long transformation: each country has one row for each of the 64 years analyzed.
library(dplyr)
latin_america <- c("Argentina", "Brazil", "Chile", "Mexico", "Colombia", "Venezuela, RB") # You can add more countries
data_latam <- filter(data, `Country Name` %in% latin_america, Year >= 1988)
ggplot(data_latam, aes(x = Year, y = Percentage, color = `Country Name`)) +
  geom_line() +
  labs(title = "Evolution of access in Latin America and the Caribbean", x = "Year", y = "Percentage of access") +
  theme_minimal()
## Warning: Removed 22 rows containing missing values or values outside the scale range
## (`geom_line()`).
This shows that the percentage of access increases over time for these countries.
african_countries <- c("Nigeria", "Ethiopia", "Kenya", "South Africa", "Egypt, Arab Rep.")
data_africa <- filter(data, `Country Name` %in% african_countries & Year >= 1988)
ggplot(data_africa, aes(x = Year, y = Percentage, color = `Country Name`)) +
  geom_line() +
  labs(title = "Evolution of access in Africa", x = "Year", y = "Percentage of access") +
  theme_minimal()
## Warning: Removed 36 rows containing missing values or values outside the scale range
## (`geom_line()`).
We could conclude that Egypt has had the highest access to electricity among these countries, and that it has remained relatively constant over the years.
data_inferior <- filter(data, Year == 2022 & Percentage < 50)
ggplot(data_inferior, aes(x = reorder(`Country Name`, Percentage), y = Percentage)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  coord_flip() + # This makes the bars horizontal
  labs(title = "Percentage of access below 50% in 2022",
       x = "Country",
       y = "Percentage of access") +
  theme_minimal()
Several countries in Africa have low access to electricity. This may point to shortcomings in government policy, among other possible causes, which would need to be analyzed in more depth.
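One caveat when reading this chart: the Country Name column mixes individual countries with regional aggregates such as “Africa Eastern and Southern” or “Arab World”, so aggregates can appear among the bars. Below is a minimal base R sketch of excluding them before filtering. The data frame toy and its values are invented for illustration, and the vector of aggregate codes is an assumption that would need to be completed from the World Bank metadata.

```r
# Toy data in the same long format; the values are invented for illustration
toy <- data.frame(
  `Country Code` = c("AFE", "ARB", "COL", "TCD", "BDI"),
  Year           = 2022,
  Percentage     = c(48.7, 91.0, 100.0, 11.7, 10.3),
  check.names      = FALSE,
  stringsAsFactors = FALSE
)

# Assumption: a hand-written vector of aggregate codes (incomplete on purpose)
aggregate_codes <- c("AFE", "AFW", "ARB")

# Drop the aggregates, then keep countries below 50% access in 2022
countries_only <- toy[!(toy$`Country Code` %in% aggregate_codes), ]
below_50 <- countries_only[countries_only$Year == 2022 &
                             countries_only$Percentage < 50, ]
below_50$`Country Code`
```

The same exclusion could be applied to the real data with filter() before plotting, so that only individual countries are compared.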
library(rworldmap)
## Cargando paquete requerido: sp
## ### Welcome to rworldmap ###
## For a short introduction type : vignette('rworldmap')
library(dplyr)
data_1990 <- filter(data, Year == 1990 & Percentage == 100)
world_map <- joinCountryData2Map(data_1990, joinCode = "ISO3", nameJoinColumn = "Country Code")
## 69 codes from your data successfully matched countries in the map
## 9 codes from your data failed to match with a country code in the map
## 174 codes from the map weren't represented in your data
mapCountryData(world_map, nameColumnToPlot = "Percentage", mapTitle = "Countries with 100% access in 1990", catMethod = "pretty", colourPalette = c("lightblue", "blue"), addLegend = FALSE)
## You asked for 7 categories, 2 were used due to pretty() classification
For the most part, it is the developed countries that have had high access to electricity for a long time.
data_col <- filter(data, `Country Code` == 'COL' & Year > 1990)
ggplot(data_col, aes(x = Year, y = Percentage, color = `Country Name`)) +
  geom_line() +
  labs(title = "Evolution of access in Colombia", x = "Year", y = "Percentage of access") +
  theme_minimal()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
Colombia has been steadily increasing its percentage of access to electricity over the years.
Here, the table() function is replaced by kable():
kable(data[1:20, ], caption = "Table 1: Access to electricity (% of population) - World") %>%
  kable_styling(full_width = TRUE) %>%
  scroll_box(width = "900px", height = "450px")
| Country Name | Country Code | Indicator Name | Indicator Code | Year | Percentage |
|---|---|---|---|---|---|
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1960 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1961 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1962 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1963 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1964 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1965 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1966 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1967 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1968 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1969 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1970 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1971 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1972 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1973 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1974 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1975 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1976 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1977 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1978 | NA |
| Aruba | ABW | Access to electricity (% of population) | EG.ELC.ACCS.ZS | 1979 | NA |
data_no_na <- data %>%
filter(!is.na(Percentage))
year_non_na_counts <- data_no_na %>%
group_by(Year) %>%
summarise(Total_NonNA_Count = n())
year_non_na_counts_df <- as.data.frame(year_non_na_counts)
kable(year_non_na_counts_df, caption = "Count of non-NA values by year") %>%
  kable_styling(full_width = TRUE) %>%
  scroll_box(width = "900px", height = "450px")
| Year | Total_NonNA_Count |
|---|---|
| 1990 | 107 |
| 1991 | 118 |
| 1992 | 135 |
| 1993 | 150 |
| 1994 | 156 |
| 1995 | 163 |
| 1996 | 178 |
| 1997 | 185 |
| 1998 | 194 |
| 1999 | 203 |
| 2000 | 259 |
| 2001 | 259 |
| 2002 | 260 |
| 2003 | 260 |
| 2004 | 260 |
| 2005 | 260 |
| 2006 | 260 |
| 2007 | 262 |
| 2008 | 262 |
| 2009 | 263 |
| 2010 | 263 |
| 2011 | 263 |
| 2012 | 263 |
| 2013 | 263 |
| 2014 | 263 |
| 2015 | 263 |
| 2016 | 263 |
| 2017 | 263 |
| 2018 | 263 |
| 2019 | 263 |
| 2020 | 263 |
| 2021 | 263 |
| 2022 | 263 |
data_no_na <- data %>%
filter(!is.na(Percentage))
country_non_na_counts <- data_no_na %>%
group_by(`Country Name`) %>%
summarise(Total_NonNA_Count = n())
country_non_na_counts_df <- as.data.frame(country_non_na_counts)
kable(country_non_na_counts_df, caption = "Total count of non-NA values by country") %>%
  kable_styling(full_width = TRUE) %>%
  scroll_box(width = "900px", height = "450px")
| Country Name | Total_NonNA_Count |
|---|---|
| Afghanistan | 23 |
| Africa Eastern and Southern | 23 |
| Africa Western and Central | 30 |
| Albania | 33 |
| Algeria | 23 |
| Andorra | 33 |
| Angola | 23 |
| Antigua and Barbuda | 33 |
| Arab World | 27 |
| Argentina | 33 |
| Armenia | 23 |
| Aruba | 33 |
| Australia | 33 |
| Austria | 33 |
| Azerbaijan | 24 |
| Bahamas, The | 33 |
| Bahrain | 33 |
| Bangladesh | 32 |
| Barbados | 33 |
| Belarus | 33 |
| Belgium | 33 |
| Belize | 32 |
| Benin | 27 |
| Bermuda | 33 |
| Bhutan | 23 |
| Bolivia | 31 |
| Bosnia and Herzegovina | 33 |
| Botswana | 32 |
| Brazil | 33 |
| British Virgin Islands | 33 |
| Brunei Darussalam | 33 |
| Bulgaria | 33 |
| Burkina Faso | 30 |
| Burundi | 25 |
| Cabo Verde | 23 |
| Cambodia | 25 |
| Cameroon | 32 |
| Canada | 33 |
| Caribbean small states | 30 |
| Cayman Islands | 33 |
| Central African Republic | 28 |
| Central Europe and the Baltics | 33 |
| Chad | 26 |
| Channel Islands | 33 |
| Chile | 33 |
| China | 23 |
| Colombia | 33 |
| Comoros | 27 |
| Congo, Dem. Rep. | 23 |
| Congo, Rep. | 23 |
| Costa Rica | 23 |
| Cote d’Ivoire | 29 |
| Croatia | 33 |
| Cuba | 23 |
| Curacao | 33 |
| Cyprus | 33 |
| Czechia | 33 |
| Denmark | 33 |
| Djibouti | 27 |
| Dominica | 25 |
| Dominican Republic | 32 |
| Early-demographic dividend | 30 |
| East Asia & Pacific | 23 |
| East Asia & Pacific (IDA & IBRD countries) | 23 |
| East Asia & Pacific (excluding high income) | 23 |
| Ecuador | 28 |
| Egypt, Arab Rep. | 31 |
| El Salvador | 32 |
| Equatorial Guinea | 23 |
| Eritrea | 28 |
| Estonia | 33 |
| Eswatini | 23 |
| Ethiopia | 23 |
| Euro area | 33 |
| Europe & Central Asia | 33 |
| Europe & Central Asia (IDA & IBRD countries) | 33 |
| Europe & Central Asia (excluding high income) | 23 |
| European Union | 33 |
| Faroe Islands | 33 |
| Fiji | 27 |
| Finland | 33 |
| Fragile and conflict affected situations | 23 |
| France | 33 |
| French Polynesia | 33 |
| Gabon | 23 |
| Gambia, The | 30 |
| Georgia | 23 |
| Germany | 33 |
| Ghana | 30 |
| Gibraltar | 33 |
| Greece | 33 |
| Greenland | 33 |
| Grenada | 25 |
| Guam | 33 |
| Guatemala | 28 |
| Guinea | 24 |
| Guinea-Bissau | 21 |
| Guyana | 30 |
| Haiti | 28 |
| Heavily indebted poor countries (HIPC) | 24 |
| High income | 33 |
| Honduras | 32 |
| Hong Kong SAR, China | 33 |
| Hungary | 33 |
| IBRD only | 23 |
| IDA & IBRD total | 23 |
| IDA blend | 25 |
| IDA only | 24 |
| IDA total | 25 |
| Iceland | 33 |
| India | 30 |
| Indonesia | 32 |
| Iran, Islamic Rep. | 23 |
| Iraq | 23 |
| Ireland | 33 |
| Isle of Man | 33 |
| Israel | 33 |
| Italy | 33 |
| Jamaica | 33 |
| Japan | 33 |
| Jordan | 33 |
| Kazakhstan | 28 |
| Kenya | 30 |
| Kiribati | 23 |
| Korea, Dem. People’s Rep. | 14 |
| Korea, Rep. | 33 |
| Kosovo | 10 |
| Kuwait | 33 |
| Kyrgyz Republic | 26 |
| Lao PDR | 30 |
| Late-demographic dividend | 23 |
| Latin America & Caribbean | 31 |
| Latin America & Caribbean (excluding high income) | 31 |
| Latin America & the Caribbean (IDA & IBRD countries) | 31 |
| Latvia | 33 |
| Least developed countries: UN classification | 23 |
| Lebanon | 23 |
| Lesotho | 23 |
| Liberia | 16 |
| Libya | 23 |
| Liechtenstein | 33 |
| Lithuania | 33 |
| Low & middle income | 23 |
| Low income | 23 |
| Lower middle income | 30 |
| Luxembourg | 33 |
| Macao SAR, China | 33 |
| Madagascar | 31 |
| Malawi | 31 |
| Malaysia | 23 |
| Maldives | 23 |
| Mali | 27 |
| Malta | 33 |
| Marshall Islands | 24 |
| Mauritania | 23 |
| Mauritius | 33 |
| Mexico | 31 |
| Micronesia, Fed. Sts. | 23 |
| Middle East & North Africa | 23 |
| Middle East & North Africa (IDA & IBRD countries) | 23 |
| Middle East & North Africa (excluding high income) | 23 |
| Middle income | 23 |
| Moldova | 33 |
| Monaco | 33 |
| Mongolia | 23 |
| Montenegro | 33 |
| Morocco | 31 |
| Mozambique | 26 |
| Myanmar | 23 |
| Namibia | 31 |
| Nauru | 33 |
| Nepal | 27 |
| Netherlands | 33 |
| New Caledonia | 33 |
| New Zealand | 33 |
| Nicaragua | 30 |
| Niger | 31 |
| Nigeria | 33 |
| North America | 33 |
| North Macedonia | 33 |
| Northern Mariana Islands | 33 |
| Norway | 33 |
| OECD members | 33 |
| Oman | 33 |
| Other small states | 27 |
| Pacific island small states | 24 |
| Pakistan | 25 |
| Palau | 33 |
| Panama | 33 |
| Papua New Guinea | 27 |
| Paraguay | 28 |
| Peru | 31 |
| Philippines | 30 |
| Poland | 33 |
| Portugal | 33 |
| Post-demographic dividend | 33 |
| Pre-demographic dividend | 26 |
| Puerto Rico | 33 |
| Qatar | 33 |
| Romania | 33 |
| Russian Federation | 33 |
| Rwanda | 27 |
| Samoa | 32 |
| San Marino | 33 |
| Sao Tome and Principe | 23 |
| Saudi Arabia | 33 |
| Senegal | 30 |
| Serbia | 33 |
| Seychelles | 29 |
| Sierra Leone | 23 |
| Singapore | 33 |
| Sint Maarten (Dutch part) | 33 |
| Slovak Republic | 33 |
| Slovenia | 33 |
| Small states | 27 |
| Solomon Islands | 24 |
| Somalia | 23 |
| South Africa | 27 |
| South Asia | 30 |
| South Asia (IDA & IBRD) | 30 |
| South Sudan | 16 |
| Spain | 33 |
| Sri Lanka | 23 |
| St. Kitts and Nevis | 33 |
| St. Lucia | 23 |
| St. Martin (French part) | 33 |
| St. Vincent and the Grenadines | 32 |
| Sub-Saharan Africa | 27 |
| Sub-Saharan Africa (IDA & IBRD countries) | 27 |
| Sub-Saharan Africa (excluding high income) | 27 |
| Sudan | 33 |
| Suriname | 24 |
| Sweden | 33 |
| Switzerland | 33 |
| Syrian Arab Republic | 23 |
| Tajikistan | 24 |
| Tanzania | 31 |
| Thailand | 23 |
| Timor-Leste | 23 |
| Togo | 25 |
| Tonga | 29 |
| Trinidad and Tobago | 23 |
| Tunisia | 29 |
| Turkiye | 23 |
| Turkmenistan | 23 |
| Turks and Caicos Islands | 33 |
| Tuvalu | 23 |
| Uganda | 30 |
| Ukraine | 33 |
| United Arab Emirates | 33 |
| United Kingdom | 33 |
| United States | 33 |
| Upper middle income | 23 |
| Uruguay | 31 |
| Uzbekistan | 27 |
| Vanuatu | 29 |
| Venezuela, RB | 31 |
| Viet Nam | 26 |
| Virgin Islands (U.S.) | 33 |
| West Bank and Gaza | 26 |
| World | 25 |
| Yemen, Rep. | 31 |
| Zambia | 33 |
| Zimbabwe | 31 |
country_stats <- data %>%
  group_by(`Country Name`) %>%
  summarise(
    Min_Value    = min(Percentage, na.rm = TRUE),
    Max_Value    = max(Percentage, na.rm = TRUE),
    Mean_Value   = mean(Percentage, na.rm = TRUE),
    Median_Value = median(Percentage, na.rm = TRUE)
  )
## Warning: There were 4 warnings in `summarise()`.
## The first warning was:
## ℹ In argument: `Min_Value = min(Percentage, na.rm = TRUE)`.
## ℹ In group 6: `Country Name = "American Samoa"`.
## Caused by warning in `min()`:
## ! no non-missing arguments to min; returning Inf
## ℹ Run `dplyr::last_dplyr_warnings()` to see the 3 remaining warnings.
country_stats_df <- as.data.frame(country_stats)
kable(country_stats_df, caption = "Estadísticas descriptivas por país") %>%
kable_styling(full_width = TRUE) %>%
scroll_box(width = "900px", height = "450px")
| Country Name | Min_Value | Max_Value | Mean_Value | Median_Value |
|---|---|---|---|---|
| Afghanistan | 4.4000000 | 97.70000 | 57.091304 | 48.30000 |
| Africa Eastern and Southern | 19.9638818 | 48.71199 | 31.756021 | 28.91159 |
| Africa Western and Central | 31.5751303 | 55.43758 | 41.742255 | 41.24249 |
| Albania | 99.4000000 | 100.00000 | 99.809091 | 100.00000 |
| Algeria | 98.6000000 | 100.00000 | 99.095652 | 99.00000 |
| American Samoa | Inf | -Inf | NaN | NA |
| Andorra | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Angola | 20.0000000 | 48.50000 | 36.286957 | 37.30000 |
| Antigua and Barbuda | 92.2000000 | 100.00000 | 98.202290 | 98.50000 |
| Arab World | 76.9643258 | 91.02640 | 85.446424 | 85.80413 |
| Argentina | 92.1548004 | 100.00000 | 96.929798 | 97.00000 |
| Armenia | 98.0000000 | 100.00000 | 99.521739 | 99.70000 |
| Aruba | 91.7000000 | 100.00000 | 99.472445 | 100.00000 |
| Australia | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Austria | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Azerbaijan | 97.0000000 | 100.00000 | 99.575000 | 99.95000 |
| Bahamas, The | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Bahrain | 99.9413986 | 100.00000 | 99.996754 | 100.00000 |
| Bangladesh | 9.9054193 | 99.40000 | 50.852567 | 48.50000 |
| Barbados | 99.9000000 | 100.00000 | 99.993224 | 100.00000 |
| Belarus | 89.0000000 | 100.00000 | 96.503030 | 98.00000 |
| Belgium | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Belize | 67.0000000 | 98.60000 | 84.969722 | 85.65000 |
| Benin | 14.5000000 | 56.50000 | 30.682590 | 29.60000 |
| Bermuda | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Bhutan | 31.2000000 | 100.00000 | 76.378261 | 81.80000 |
| Bolivia | 60.0279007 | 99.90000 | 80.085928 | 80.20000 |
| Bosnia and Herzegovina | 98.5000000 | 100.00000 | 99.693939 | 100.00000 |
| Botswana | 9.3982267 | 75.90000 | 41.699681 | 40.80000 |
| Brazil | 87.4751160 | 100.00000 | 96.405326 | 97.60000 |
| British Virgin Islands | 94.6065598 | 100.00000 | 97.761719 | 97.80000 |
| Brunei Darussalam | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Bulgaria | 88.1000000 | 100.00000 | 95.972727 | 97.30000 |
| Burkina Faso | 6.1000000 | 19.50000 | 12.513256 | 12.25000 |
| Burundi | 2.0709493 | 10.30000 | 5.950838 | 5.40000 |
| Cabo Verde | 58.6000000 | 97.10000 | 78.517391 | 82.30000 |
| Cambodia | 8.8200369 | 92.30000 | 43.387602 | 31.10000 |
| Cameroon | 29.0000000 | 71.00000 | 49.191549 | 48.60000 |
| Canada | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Caribbean small states | 82.1285600 | 97.59329 | 90.221499 | 89.99393 |
| Cayman Islands | 99.8469238 | 100.00000 | 99.988969 | 100.00000 |
| Central African Republic | 3.0000000 | 15.70000 | 9.349798 | 8.90000 |
| Central Europe and the Baltics | 97.9262635 | 100.00000 | 99.337108 | 99.64098 |
| Chad | 2.3000000 | 11.70000 | 6.651023 | 6.35000 |
| Channel Islands | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Chile | 92.2574270 | 100.00000 | 97.986104 | 98.20000 |
| China | 96.7000000 | 100.00000 | 99.043478 | 99.90000 |
| Colombia | 89.9000000 | 100.00000 | 95.888771 | 96.10000 |
| Comoros | 28.9000000 | 89.90000 | 60.739953 | 60.10000 |
| Congo, Dem. Rep. | 6.0000000 | 21.50000 | 13.782609 | 13.70000 |
| Congo, Rep. | 29.4000000 | 50.60000 | 40.095652 | 40.80000 |
| Costa Rica | 96.9000000 | 100.00000 | 99.213043 | 99.40000 |
| Cote d’Ivoire | 36.5000000 | 71.10000 | 56.690086 | 57.30000 |
| Croatia | 99.8000000 | 100.00000 | 99.981818 | 100.00000 |
| Cuba | 95.5000000 | 100.00000 | 98.073913 | 98.00000 |
| Curacao | 99.7000000 | 100.00000 | 99.936364 | 100.00000 |
| Cyprus | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Czechia | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Denmark | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Djibouti | 49.7000000 | 65.40000 | 58.498590 | 57.60000 |
| Dominica | 75.0000000 | 100.00000 | 90.992877 | 90.90000 |
| Dominican Republic | 78.2000000 | 100.00000 | 92.671822 | 93.95000 |
| Early-demographic dividend | 55.7968255 | 95.28516 | 76.448489 | 76.56665 |
| East Asia & Pacific | 92.0806563 | 98.21324 | 95.665923 | 96.03563 |
| East Asia & Pacific (IDA & IBRD countries) | 91.1583142 | 98.56329 | 95.605101 | 96.41113 |
| East Asia & Pacific (excluding high income) | 91.1581941 | 98.02619 | 95.180562 | 95.59909 |
| Ecuador | 89.8301670 | 100.00000 | 96.300337 | 96.85000 |
| Egypt, Arab Rep. | 93.4000000 | 100.00000 | 98.329994 | 99.00000 |
| El Salvador | 69.6078800 | 100.00000 | 88.303516 | 90.10000 |
| Equatorial Guinea | 64.8000000 | 67.00000 | 65.752174 | 65.90000 |
| Eritrea | 22.9000000 | 55.40000 | 38.600703 | 38.35000 |
| Estonia | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Eswatini | 20.4000000 | 82.90000 | 51.934783 | 51.20000 |
| Ethiopia | 10.2000000 | 55.00000 | 28.995652 | 25.50000 |
| Euro area | 99.9873264 | 100.00000 | 99.997249 | 100.00000 |
| Europe & Central Asia | 99.1137059 | 100.00000 | 99.704259 | 99.87138 |
| Europe & Central Asia (IDA & IBRD countries) | 98.2465834 | 100.00000 | 99.408165 | 99.71983 |
| Europe & Central Asia (excluding high income) | 98.9925837 | 99.99146 | 99.561447 | 99.61635 |
| European Union | 99.4761939 | 100.00000 | 99.839130 | 99.91700 |
| Faroe Islands | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Fiji | 66.9000000 | 97.10000 | 86.039404 | 88.90000 |
| Finland | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Fragile and conflict affected situations | 40.5982968 | 57.86153 | 48.554678 | 47.92781 |
| France | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| French Polynesia | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Gabon | 73.6000000 | 93.50000 | 84.921739 | 86.40000 |
| Gambia, The | 17.7000000 | 65.40000 | 41.136884 | 40.65000 |
| Georgia | 97.9000000 | 100.00000 | 99.878261 | 100.00000 |
| Germany | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Ghana | 30.3694057 | 86.30000 | 58.133095 | 56.70000 |
| Gibraltar | 99.7000000 | 100.00000 | 99.921212 | 100.00000 |
| Greece | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Greenland | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Grenada | 85.0400000 | 94.20000 | 89.526779 | 89.40000 |
| Guam | 99.8443069 | 100.00000 | 99.985737 | 100.00000 |
| Guatemala | 60.8000000 | 99.10000 | 82.678862 | 83.40000 |
| Guinea | 15.2000000 | 47.70000 | 29.320833 | 27.40000 |
| Guinea-Bissau | 1.3000000 | 37.40000 | 16.771429 | 14.90000 |
| Guyana | 70.7139816 | 93.00000 | 80.795530 | 78.05000 |
| Haiti | 30.0130177 | 49.30000 | 37.274968 | 36.35000 |
| Heavily indebted poor countries (HIPC) | 17.8675254 | 48.65539 | 31.392645 | 28.94794 |
| High income | 99.4422836 | 99.97989 | 99.776153 | 99.86585 |
| Honduras | 54.7819370 | 94.40000 | 75.524396 | 72.40000 |
| Hong Kong SAR, China | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Hungary | 99.9000000 | 100.00000 | 99.993939 | 100.00000 |
| IBRD only | 84.5263170 | 99.11695 | 92.133669 | 92.51408 |
| IDA & IBRD total | 74.0831791 | 90.10598 | 82.055856 | 82.01881 |
| IDA blend | 54.5467065 | 77.22022 | 65.656884 | 64.19403 |
| IDA only | 25.5857160 | 60.62342 | 41.633015 | 39.29038 |
| IDA total | 36.4025530 | 66.11973 | 49.447553 | 47.23159 |
| Iceland | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| India | 49.8113098 | 99.60000 | 73.623353 | 73.20000 |
| Indonesia | 48.9000000 | 100.00000 | 86.663955 | 90.85000 |
| Iran, Islamic Rep. | 97.9000000 | 100.00000 | 99.247826 | 99.50000 |
| Iraq | 96.8000000 | 100.00000 | 98.573913 | 98.40000 |
| Ireland | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Isle of Man | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Israel | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Italy | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Jamaica | 70.3345860 | 100.00000 | 88.803576 | 88.70000 |
| Japan | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Jordan | 96.8000000 | 100.00000 | 99.182538 | 99.40000 |
| Kazakhstan | 97.0000000 | 100.00000 | 99.760869 | 100.00000 |
| Kenya | 3.4734368 | 76.50000 | 32.047151 | 25.45000 |
| Kiribati | 55.6000000 | 94.40000 | 75.465217 | 74.90000 |
| Korea, Dem. People’s Rep. | 26.0000000 | 54.70000 | 40.900000 | 41.05000 |
| Korea, Rep. | 99.8828430 | 100.00000 | 99.990797 | 100.00000 |
| Kosovo | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Kuwait | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Kyrgyz Republic | 98.3000000 | 100.00000 | 99.496900 | 99.55970 |
| Lao PDR | 25.0000000 | 100.00000 | 63.435890 | 61.60000 |
| Late-demographic dividend | 95.2456142 | 99.48252 | 98.119241 | 98.63846 |
| Latin America & Caribbean | 88.5995058 | 98.57913 | 94.182952 | 94.47682 |
| Latin America & Caribbean (excluding high income) | 87.7184813 | 98.47625 | 93.763802 | 94.11299 |
| Latin America & the Caribbean (IDA & IBRD countries) | 88.4902080 | 98.54513 | 94.095765 | 94.37738 |
| Latvia | 99.6000000 | 100.00000 | 99.921212 | 100.00000 |
| Least developed countries: UN classification | 20.1822161 | 56.82640 | 37.130198 | 34.42001 |
| Lebanon | 97.8000000 | 100.00000 | 99.552174 | 99.70000 |
| Lesotho | 1.3000000 | 50.40000 | 23.578261 | 20.60000 |
| Liberia | 1.3000000 | 31.80000 | 14.768750 | 12.50000 |
| Libya | 67.0000000 | 99.80000 | 81.143478 | 80.30000 |
| Liechtenstein | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Lithuania | 99.4000000 | 100.00000 | 99.809091 | 100.00000 |
| Low & middle income | 72.8544678 | 89.57009 | 81.174157 | 81.04695 |
| Low income | 15.8054673 | 44.94833 | 29.182925 | 26.74294 |
| Lower middle income | 47.1700205 | 90.62382 | 69.086854 | 68.94455 |
| Luxembourg | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Macao SAR, China | 99.8492813 | 100.00000 | 99.987361 | 100.00000 |
| Madagascar | 9.2000000 | 36.50000 | 18.519544 | 16.00000 |
| Malawi | 1.5535558 | 18.00000 | 7.552962 | 7.40000 |
| Malaysia | 98.6000000 | 100.00000 | 99.382609 | 99.50000 |
| Maldives | 83.8000000 | 100.00000 | 95.813044 | 99.20000 |
| Mali | 4.2885408 | 53.40000 | 26.214657 | 24.00000 |
| Malta | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Marshall Islands | 68.5242359 | 100.00000 | 84.746843 | 89.65000 |
| Mauritania | 18.2000000 | 49.00000 | 34.178261 | 35.30000 |
| Mauritius | 99.0000000 | 100.00000 | 99.314512 | 99.22089 |
| Mexico | 93.1459810 | 100.00000 | 98.054563 | 98.90000 |
| Micronesia, Fed. Sts. | 46.0000000 | 85.30000 | 65.917391 | 66.30000 |
| Middle East & North Africa | 92.4180532 | 97.62500 | 95.421978 | 95.91908 |
| Middle East & North Africa (IDA & IBRD countries) | 91.3401715 | 97.19838 | 94.686917 | 95.21742 |
| Middle East & North Africa (excluding high income) | 91.4262186 | 97.22911 | 94.741805 | 95.26890 |
| Middle income | 77.2998233 | 94.88556 | 86.275832 | 86.57554 |
| Moldova | 98.6000000 | 100.00000 | 99.593939 | 99.90000 |
| Monaco | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Mongolia | 67.3000000 | 100.00000 | 84.091304 | 81.20000 |
| Montenegro | 97.7000000 | 100.00000 | 99.539394 | 99.80000 |
| Morocco | 49.2000000 | 100.00000 | 82.320367 | 84.10000 |
| Mozambique | 3.6631756 | 33.20000 | 17.617012 | 16.95000 |
| Myanmar | 41.9000000 | 73.70000 | 55.408696 | 52.00000 |
| Namibia | 26.4000000 | 56.20000 | 41.626589 | 42.30000 |
| Nauru | 98.9990845 | 100.00000 | 99.325576 | 99.20000 |
| Nepal | 17.9000000 | 93.90000 | 59.392324 | 59.30000 |
| Netherlands | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| New Caledonia | 99.8443069 | 100.00000 | 99.985737 | 100.00000 |
| New Zealand | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Nicaragua | 69.0471030 | 86.50000 | 77.773054 | 77.20000 |
| Niger | 4.0121398 | 19.50000 | 11.553075 | 11.20000 |
| Nigeria | 27.3000000 | 60.50000 | 47.672948 | 48.00000 |
| North America | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| North Macedonia | 99.0000000 | 100.00000 | 99.666667 | 99.90000 |
| Northern Mariana Islands | 99.7000000 | 100.00000 | 99.959674 | 100.00000 |
| Norway | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Not classified | Inf | -Inf | NaN | NA |
| OECD members | 99.1309867 | 100.00000 | 99.673866 | 99.69257 |
| Oman | 99.9000000 | 100.00000 | 99.993939 | 100.00000 |
| Other small states | 72.4961542 | 94.60278 | 84.842423 | 85.67029 |
| Pacific island small states | 55.4438772 | 86.42255 | 70.860692 | 72.11623 |
| Pakistan | 70.2653503 | 95.00000 | 84.269014 | 87.10000 |
| Palau | 96.6253510 | 100.00000 | 98.451128 | 98.30000 |
| Panama | 70.1900000 | 95.30000 | 84.845012 | 85.50000 |
| Papua New Guinea | 2.1584826 | 23.60000 | 15.002979 | 15.30000 |
| Paraguay | 77.4691140 | 100.00000 | 94.765092 | 96.80000 |
| Peru | 64.7489929 | 96.20000 | 82.121535 | 82.00000 |
| Philippines | 65.4000000 | 97.50000 | 82.587473 | 83.80000 |
| Poland | 99.8000000 | 100.00000 | 99.969697 | 100.00000 |
| Portugal | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Post-demographic dividend | 99.7152424 | 100.00000 | 99.915077 | 99.95712 |
| Pre-demographic dividend | 22.9176844 | 51.12034 | 35.506965 | 32.47673 |
| Puerto Rico | 99.9834290 | 100.00000 | 99.999247 | 100.00000 |
| Qatar | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Romania | 94.4000000 | 100.00000 | 98.233333 | 99.10000 |
| Russian Federation | 95.5000000 | 100.00000 | 99.200000 | 100.00000 |
| Rwanda | 1.0278360 | 50.60000 | 17.808057 | 11.40000 |
| Samoa | 78.8000000 | 100.00000 | 92.101682 | 94.45000 |
| San Marino | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Sao Tome and Principe | 51.5000000 | 78.50000 | 63.382609 | 61.40000 |
| Saudi Arabia | 99.9000000 | 100.00000 | 99.993939 | 100.00000 |
| Senegal | 26.0000000 | 70.40000 | 48.866904 | 49.80000 |
| Serbia | 99.6000000 | 100.00000 | 99.863636 | 99.90000 |
| Seychelles | 90.0000000 | 100.00000 | 96.679008 | 96.80000 |
| Sierra Leone | 7.7000000 | 29.40000 | 16.778261 | 14.60000 |
| Singapore | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Sint Maarten (Dutch part) | 99.7000000 | 100.00000 | 99.936364 | 100.00000 |
| Slovak Republic | 99.9000000 | 100.00000 | 99.993939 | 100.00000 |
| Slovenia | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Small states | 72.7326038 | 93.91070 | 83.929486 | 84.75684 |
| Solomon Islands | 4.7000000 | 76.30000 | 38.050000 | 37.25000 |
| Somalia | 2.1000000 | 52.30000 | 34.634783 | 49.20000 |
| South Africa | 57.6000000 | 90.00000 | 80.067227 | 82.60000 |
| South Asia | 46.3227012 | 98.78642 | 71.823672 | 71.26464 |
| South Asia (IDA & IBRD) | 46.3227012 | 98.78642 | 71.823672 | 71.26464 |
| South Sudan | 0.8000000 | 8.40000 | 4.475000 | 4.25000 |
| Spain | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Sri Lanka | 63.6000000 | 100.00000 | 87.304348 | 87.10000 |
| St. Kitts and Nevis | 91.3068161 | 100.00000 | 96.703563 | 96.70000 |
| St. Lucia | 86.0000000 | 100.00000 | 94.752174 | 95.30000 |
| St. Martin (French part) | 99.7000000 | 100.00000 | 99.936364 | 100.00000 |
| St. Vincent and the Grenadines | 66.8000000 | 100.00000 | 87.218849 | 88.15000 |
| Sub-Saharan Africa | 25.6591470 | 51.43474 | 35.675339 | 32.53974 |
| Sub-Saharan Africa (IDA & IBRD countries) | 25.6591470 | 51.43474 | 35.675339 | 32.53974 |
| Sub-Saharan Africa (excluding high income) | 25.6507452 | 51.42993 | 35.668416 | 32.53319 |
| Sudan | 20.0000000 | 63.20000 | 37.219592 | 30.77676 |
| Suriname | 91.0000000 | 99.62406 | 95.584336 | 95.05000 |
| Sweden | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Switzerland | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Syrian Arab Republic | 86.0000000 | 99.50000 | 90.882609 | 90.70000 |
| Tajikistan | 97.0000000 | 100.00000 | 98.958333 | 99.00000 |
| Tanzania | 2.3301878 | 45.80000 | 17.504912 | 13.80000 |
| Thailand | 82.1000000 | 100.00000 | 96.656522 | 99.30000 |
| Timor-Leste | 17.8000000 | 100.00000 | 55.865217 | 52.70000 |
| Togo | 15.3000000 | 57.20000 | 35.690920 | 33.70000 |
| Tonga | 80.0000000 | 100.00000 | 91.298752 | 92.00000 |
| Trinidad and Tobago | 91.3000000 | 100.00000 | 98.656522 | 100.00000 |
| Tunisia | 86.8000000 | 100.00000 | 97.482759 | 99.40000 |
| Turkiye | 99.7000000 | 100.00000 | 99.895652 | 100.00000 |
| Turkmenistan | 99.5000000 | 100.00000 | 99.808696 | 99.90000 |
| Turks and Caicos Islands | 88.7000000 | 100.00000 | 96.266964 | 96.00000 |
| Tuvalu | 94.6000000 | 100.00000 | 97.121739 | 97.00000 |
| Uganda | 0.5338985 | 47.10000 | 17.072143 | 11.85000 |
| Ukraine | 99.1000000 | 100.00000 | 99.851515 | 100.00000 |
| United Arab Emirates | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| United Kingdom | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| United States | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| Upper middle income | 94.3274649 | 99.48512 | 97.665075 | 98.38205 |
| Uruguay | 95.9000000 | 100.00000 | 98.445199 | 98.70000 |
| Uzbekistan | 99.3797455 | 100.00000 | 99.686785 | 99.60000 |
| Vanuatu | 12.9416437 | 70.00000 | 38.153676 | 33.30000 |
| Venezuela, RB | 95.7000000 | 100.00000 | 99.063458 | 99.10000 |
| Viet Nam | 78.4000000 | 100.00000 | 94.764247 | 96.75000 |
| Virgin Islands (U.S.) | 100.0000000 | 100.00000 | 100.000000 | 100.00000 |
| West Bank and Gaza | 97.9000000 | 100.00000 | 99.556248 | 99.80000 |
| World | 73.3510572 | 91.42033 | 83.906972 | 83.56808 |
| Yemen, Rep. | 40.7747154 | 79.20000 | 57.514358 | 55.80000 |
| Zambia | 12.7527027 | 47.80000 | 25.470565 | 22.00000 |
| Zimbabwe | 28.1000000 | 52.70000 | 37.973419 | 36.90000 |
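Rows such as American Samoa and Not classified show `Inf`, `-Inf`, `NaN` and `NA` because those groups have no observed values at all, which is also what triggers the `summarise()` warnings. A minimal sketch of one way to avoid this is to drop all-missing groups before summarising. The `toy` data frame and its values are made up for illustration and only stand in for `data`.

```r
library(dplyr)

# Toy stand-in for `data`; group "B" has no observed values.
toy <- data.frame(
  `Country Name` = c("A", "A", "B", "B"),
  Percentage     = c(90, 100, NA, NA),
  check.names    = FALSE
)

toy_stats <- toy %>%
  group_by(`Country Name`) %>%
  # Keep only groups with at least one non-missing value.
  filter(any(!is.na(Percentage))) %>%
  summarise(
    Min_Value    = min(Percentage, na.rm = TRUE),
    Max_Value    = max(Percentage, na.rm = TRUE),
    Mean_Value   = mean(Percentage, na.rm = TRUE),
    Median_Value = median(Percentage, na.rm = TRUE),
    .groups      = "drop"
  )

toy_stats
```

Whether all-missing groups should be dropped or kept as explicit `NA` rows depends on how the table will be read; dropping them is only one option.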
na_count <- colSums(is.na(data))
na_count
## Country Name Country Code Indicator Name Indicator Code Year
## 0 0 0 0 0
## Percentage
## 9411
There are 9411 NA values in total in the Percentage variable.
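A single total does not show where the missing values are concentrated. As a rough sketch, the count and share of missing values can be broken down by year. The `toy` data frame below is invented for illustration; with the real data the same pipeline would be applied to `data`.

```r
library(dplyr)

# Made-up values standing in for `data`.
toy <- data.frame(
  Year       = c(1960, 1960, 2022, 2022),
  Percentage = c(NA, NA, 99.5, NA)
)

na_by_year <- toy %>%
  group_by(Year) %>%
  summarise(
    n_missing     = sum(is.na(Percentage)),   # number of NA values in the year
    share_missing = mean(is.na(Percentage)),  # fraction of rows that are NA
    .groups       = "drop"
  )

na_by_year
```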
library(Amelia)
## Loading required package: Rcpp
## ##
## ## Amelia II: Multiple Imputation
## ## (Version 1.8.2, built: 2024-04-10)
## ## Copyright (C) 2005-2024 James Honaker, Gary King and Matthew Blackwell
## ## Refer to http://gking.harvard.edu/amelia/ for more information
## ##
library(Rcpp)
PERCENTAGES_NA <- is.na(data$Percentage)  # the column is named Percentage
NAS_ALL <- !complete.cases(data)
head(data[NAS_ALL, ], 20)
## # A tibble: 20 × 6
## `Country Name` `Country Code` `Indicator Name` `Indicator Code` Year
## <chr> <chr> <chr> <chr> <dbl>
## 1 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1960
## 2 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1961
## 3 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1962
## 4 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1963
## 5 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1964
## 6 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1965
## 7 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1966
## 8 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1967
## 9 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1968
## 10 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1969
## 11 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1970
## 12 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1971
## 13 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1972
## 14 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1973
## 15 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1974
## 16 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1975
## 17 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1976
## 18 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1977
## 19 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1978
## 20 Aruba ABW Access to electricity (… EG.ELC.ACCS.ZS 1979
## # ℹ 1 more variable: Percentage <dbl>
suppressWarnings(require(Amelia))
suppressWarnings(missmap(data))
summary(data)
## Country Name Country Code Indicator Name Indicator Code
## Length:17024 Length:17024 Length:17024 Length:17024
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## Year Percentage
## Min. :1960 Min. : 0.534
## 1st Qu.:1976 1st Qu.: 69.608
## Median :1992 Median : 98.400
## Mean :1992 Mean : 81.042
## 3rd Qu.:2007 3rd Qu.:100.000
## Max. :2023 Max. :100.000
## NA's :9411
par(mfrow=c(1,3))
boxplot(data$Percentage, main="PORCENTAJES", ylab="Percentage", col="lightblue")
The boxplot of the full distribution shows no outliers. If we filter the data to 2022, however, the resulting distribution does show outliers.
data_2022 <- data[data$Year == 2022, ]
boxplot(data_2022$Percentage, main="PORCENTAJES en 2022", ylab="Porcentaje", col="lightblue")
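The points a boxplot draws as outliers can also be listed numerically. The sketch below uses `boxplot.stats()` from base R, which applies the same 1.5 × IQR rule that `boxplot()` uses by default. The vector `x` holds made-up values, not the 2022 data.

```r
# Made-up percentages: most values sit at or near 100, one is very low.
x <- c(8.4, 45, 60, 85, 95, 99, 100, 100, 100, 100, 100, 100)

bs <- boxplot.stats(x)  # default coef = 1.5, as in boxplot()

bs$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
bs$out    # values that fall beyond the whiskers
```

With the real data, `boxplot.stats(data_2022$Percentage)$out` would return the values flagged in the 2022 plot.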
Next, we apply the Kolmogorov-Smirnov test to check whether this distribution is normal.
data$Percentage <- as.numeric(data$Percentage)
data_2022 <- data %>%
filter(Year == 2022 & !is.na(Percentage) & is.finite(Percentage))
mean_2022 <- mean(data_2022$Percentage)
sd_2022 <- sd(data_2022$Percentage)
ks_test_result <- ks.test(data_2022$Percentage, "pnorm", mean = mean_2022, sd = sd_2022)
## Warning in ks.test.default(data_2022$Percentage, "pnorm", mean = mean_2022, :
## ties should not be present for the one-sample Kolmogorov-Smirnov test
ks_test_result
##
## Asymptotic one-sample Kolmogorov-Smirnov test
##
## data: data_2022$Percentage
## D = 0.31786, p-value < 2.2e-16
## alternative hypothesis: two-sided
The D value (0.31786) indicates a notable difference between the observed distribution of the data and the theoretical normal distribution. A relatively high D suggests that the data do not fit the normal distribution well.
The p-value, below 2.2e-16, is extremely low. Because it is far below the commonly used 0.05 threshold, we reject the null hypothesis of normality: the data do not follow a normal distribution.
hist(data_2022$Percentage,
probability = TRUE,
col = "lightblue",
main = "Distribucion del acceso a electricidad en 2022",
xlab = "Porcentaje de acceso a electricidad")
curve(dnorm(x, mean = mean_2022, sd = sd_2022),
col = "red", lwd = 2, add = TRUE)
qqnorm(data_2022$Percentage,
main = "Q-Q Plot - Porcentaje de acceso a electricidad en 2022")
qqline(data_2022$Percentage, col = "red")
The superimposed red curve represents a normal distribution. In both the histogram and the Q-Q plot the data do not follow the red lines, so the plots also indicate that the distribution is not normal.
Therefore, after running the Kolmogorov-Smirnov test and inspecting the plots, we conclude that the data do not fit a normal distribution.
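The Kolmogorov-Smirnov run above warned about ties, and its mean and standard deviation were estimated from the same sample, so its p-value is only approximate. One possible cross-check is the Shapiro-Wilk test (`shapiro.test()` in base R). The sketch below runs it on simulated values that only imitate the shape seen in 2022 (many values at 100 plus a spread of lower ones); it is not a result for the dataset.

```r
set.seed(1)

# Simulated stand-in, not the real data: 120 values at the 100% ceiling
# and 80 values spread between 8 and 100.
sim <- c(rep(100, 120), runif(80, min = 8, max = 100))

sw <- shapiro.test(sim)
sw

# TRUE means normality is rejected at the 5% level.
sw$p.value < 0.05
```

On the real data the call would be `shapiro.test(data_2022$Percentage)` after removing missing values.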
require(tidyr)
library(Amelia)
library(Rcpp)
data_2022 <- filter(data, Year == 2022)
summary(data_2022$Percentage)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 8.40 85.56 100.00 87.16 100.00 100.00 3
data_2022$Percentage <- as.numeric(as.character(data_2022$Percentage))
hist(data_2022$Percentage,
probability = TRUE,
col = "lightblue",
main = "Distribucion del acceso a electricidad en 2022",
xlab = "Porcentaje de acceso a electricidad")
median_value <- median(data_2022$Percentage, na.rm = TRUE)
head(filter(data_2022, Percentage < 85), 20)
## # A tibble: 20 × 6
## `Country Name` `Country Code` `Indicator Name` `Indicator Code` Year
## <chr> <chr> <chr> <chr> <dbl>
## 1 Africa Eastern and So… AFE Access to elect… EG.ELC.ACCS.ZS 2022
## 2 Africa Western and Ce… AFW Access to elect… EG.ELC.ACCS.ZS 2022
## 3 Angola AGO Access to elect… EG.ELC.ACCS.ZS 2022
## 4 Burundi BDI Access to elect… EG.ELC.ACCS.ZS 2022
## 5 Benin BEN Access to elect… EG.ELC.ACCS.ZS 2022
## 6 Burkina Faso BFA Access to elect… EG.ELC.ACCS.ZS 2022
## 7 Botswana BWA Access to elect… EG.ELC.ACCS.ZS 2022
## 8 Central African Repub… CAF Access to elect… EG.ELC.ACCS.ZS 2022
## 9 Cote d'Ivoire CIV Access to elect… EG.ELC.ACCS.ZS 2022
## 10 Cameroon CMR Access to elect… EG.ELC.ACCS.ZS 2022
## 11 Congo, Dem. Rep. COD Access to elect… EG.ELC.ACCS.ZS 2022
## 12 Congo, Rep. COG Access to elect… EG.ELC.ACCS.ZS 2022
## 13 Djibouti DJI Access to elect… EG.ELC.ACCS.ZS 2022
## 14 Eritrea ERI Access to elect… EG.ELC.ACCS.ZS 2022
## 15 Ethiopia ETH Access to elect… EG.ELC.ACCS.ZS 2022
## 16 Fragile and conflict … FCS Access to elect… EG.ELC.ACCS.ZS 2022
## 17 Guinea GIN Access to elect… EG.ELC.ACCS.ZS 2022
## 18 Gambia, The GMB Access to elect… EG.ELC.ACCS.ZS 2022
## 19 Guinea-Bissau GNB Access to elect… EG.ELC.ACCS.ZS 2022
## 20 Equatorial Guinea GNQ Access to elect… EG.ELC.ACCS.ZS 2022
## # ℹ 1 more variable: Percentage <dbl>
data_2022$Percentage[data_2022$Percentage < 85] <- NA
boxplot(data_2022$Percentage,
main="Porcentajes en 2022",
ylab="Porcentaje",
col="lightblue"
)
na.omit(data_2022$Percentage < 85)
## [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [61] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [73] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [85] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [97] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [109] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [121] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [133] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [145] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [157] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [169] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [181] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [193] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## attr(,"na.action")
## [1] 2 4 5 12 17 19 20 34 35 42 43 44 45 57 70 73 75 86 87
## [20] 88 89 99 101 105 106 108 111 122 132 133 136 137 142 152 159 161 166 167
## [39] 169 172 174 175 190 192 194 204 207 208 210 211 214 216 217 218 220 225 230
## [58] 233 242 247 248 259 262 263 265 266
## attr(,"class")
## [1] "omit"
hist(data_2022$Percentage,
probability = TRUE,
col = "lightblue",
main = "Distribucion del acceso a electricidad en 2022",
xlab = "Porcentaje de acceso a electricidad")
The boxplot drawn after the treatment still shows outliers. For that reason, we drew a histogram to examine the distribution of all the values, removed the percentages below 85, and then drew the histogram again. The shape of the distribution was not affected. We now repeat the outlier treatment.
data_2022$Percentage[data_2022$Percentage < 85] <- NA
na.omit(data_2022$Percentage < 85)
## [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [61] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [73] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [85] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [97] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [109] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [121] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [133] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [145] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [157] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [169] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [181] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [193] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## attr(,"na.action")
## [1] 2 4 5 12 17 19 20 34 35 42 43 44 45 57 70 73 75 86 87
## [20] 88 89 99 101 105 106 108 111 122 132 133 136 137 142 152 159 161 166 167
## [39] 169 172 174 175 190 192 194 204 207 208 210 211 214 216 217 218 220 225 230
## [58] 233 242 247 248 259 262 263 265 266
## attr(,"class")
## [1] "omit"
summary(data_2022$Percentage)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 85.10 99.46 100.00 98.56 100.00 100.00 66
median_value <- median(data_2022$Percentage, na.rm = TRUE)
data_2022$Percentage[data_2022$Percentage < 100] <- NA
data_2022$Percentage[is.na(data_2022$Percentage)] <- median_value
boxplot(data_2022$Percentage,
main="Porcentajes en 2022",
ylab="Porcentaje",
col="lightblue"
)
suppressWarnings(missmap(data_2022))
summary(data_2022$Percentage)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 100 100 100 100 100 100
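The treatment used above (set values below a threshold to `NA`, then fill every `NA` with the median) can be sketched as a small function that works on a copy, so the raw column stays available for comparison. `impute_below` is a hypothetical helper name and the input values are made up.

```r
# Hypothetical helper: values below `threshold` become NA, then every NA
# is replaced by the median of the values that remain.
impute_below <- function(x, threshold) {
  x[!is.na(x) & x < threshold] <- NA
  fill <- median(x, na.rm = TRUE)
  x[is.na(x)] <- fill
  x
}

raw     <- c(50, 90, 100, 100, NA)          # made-up values
treated <- impute_below(raw, threshold = 85)

raw      # unchanged
treated  # values below 85 and the original NA replaced by the median
```

Note that with `threshold = 100` every value is either kept at 100 or replaced by a median of 100, so the result is constant, which matches the final summary above where all statistics equal 100.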