Prerequisite Reading:
OBJECTIVE
The objective of this assignment is to familiarize the student with dynamic charts, specifically visualizations created with the rCharts package. As discussed in the presentation, rCharts builds dynamic charts on top of several JavaScript libraries; this assignment focuses only on NVD3, a library built on D3.js that supports several types of visualization.
knitr::opts_chunk$set(
echo = TRUE,
warning = FALSE,
message = FALSE,
comment = NA
)
Problem Statement
Consider the file on the effects of the COVID-19 virus worldwide. The data may be provided by the instructor or downloaded from one of the worldwide repositories, following the link provided at the beginning of this document. The dataset contains several columns, and its rows hold the values for each country, day by day, since the start of the pandemic. Only the totals are of interest, so the student must filter to the latest date and apply whatever aggregations are appropriate.
The data can be downloaded from: https://covid19.who.int/WHO-COVID-19-global-data.csv
Read the data directly from the Internet into the R environment. Note that it contains day-by-day information, which is not needed here; only the data as of the latest date is wanted, so filter to that date.
# Read the data from the Internet as of the latest date, 14-07-2024
covid_ultimo_corte <- read.csv("https://covid19.who.int/WHO-COVID-19-global-data.csv",
header=TRUE, sep=",")
#View(covid_ultimo_corte)
names(covid_ultimo_corte)
[1] "Date_reported" "Country_code" "Country"
[4] "WHO_region" "New_cases" "Cumulative_cases"
[7] "New_deaths" "Cumulative_deaths"
dim(covid_ultimo_corte)
[1] 57120 8
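library(dplyr)
# Latest reporting date (assumes ISO-formatted date strings, which sort chronologically)
ultima_fecha <- max(covid_ultimo_corte$Date_reported, na.rm = TRUE)
# Note: ultima_fecha is reassigned here from the date to the filtered data frame
ultima_fecha <- covid_ultimo_corte %>%
filter(Date_reported == ultima_fecha)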
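As a hedged sketch of the read-and-filter step, the toy CSV below (invented values, not WHO data) illustrates two details that may matter here: passing `na.strings = ""` so that Namibia's ISO code `NA` is kept as text instead of being parsed as a missing value, and converting `Date_reported` to `Date` before taking the maximum. The names `toy_csv`, `toy`, and `latest` are illustrative and do not come from the assignment.

```r
# Toy CSV with the same column names as the WHO file; values are invented.
toy_csv <- "Date_reported,Country_code,Country,Cumulative_cases
2024-07-07,NA,Namibia,100
2024-07-14,NA,Namibia,120
2024-07-07,AD,Andorra,40
2024-07-14,AD,Andorra,45"

# na.strings = "" keeps the literal code "NA" (Namibia) as text;
# with the default na.strings = "NA" it would become a missing value.
toy <- read.csv(text = toy_csv, na.strings = "", stringsAsFactors = FALSE)

# Parse the ISO dates so max() compares dates rather than strings.
toy$Date_reported <- as.Date(toy$Date_reported)

# Keep only the rows for the most recent reporting date.
latest <- toy[toy$Date_reported == max(toy$Date_reported, na.rm = TRUE), ]
print(latest)
```

The same `na.strings` argument could be passed to the `read.csv()` call on the real URL; doing so would change the row counts reported in the rest of this document.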
# Select the columns of interest:
# WHO_region
# Country_code
# Cumulative_cases
# Cumulative_deaths
# Select those columns
covid <- ultima_fecha[, c("WHO_region", "Country_code", "Cumulative_cases", "Cumulative_deaths" )]
#View(covid)
dim(covid)
[1] 240 4
colSums(covid == 0)
WHO_region Country_code Cumulative_cases Cumulative_deaths
0 NA 2 13
# Filter out countries with no recorded cases or deaths
covid <- covid[covid$Cumulative_cases != 0, ]
covid <- covid[covid$Cumulative_deaths != 0, ]
colSums(covid == 0)
WHO_region Country_code Cumulative_cases Cumulative_deaths
0 NA 0 0
#View(covid)
dim(covid)
[1] 227 4
# Filter out the countries that belong to the "OTHER" region
# unique(covid$WHO_region)
covid <- covid[covid$WHO_region != "OTHER", ]
#View(covid)
unique(covid$WHO_region)
[1] "EMRO" "EURO" "AFRO" "WPRO" "AMRO" "SEARO" ""
dim(covid)
[1] 226 4
# Make sure the data is complete, that is, there are no blank values
# Empty values in WHO_region
sum(covid$WHO_region == "")
[1] 14
covid <- covid[covid$WHO_region != "", ]
# Check for zero values
colSums(covid == 0)
WHO_region Country_code Cumulative_cases Cumulative_deaths
0 NA 0 0
dim(covid)
[1] 212 4
# Missing values in Country_code (likely Namibia: read.csv parses its ISO code "NA" as missing by default)
covid <- covid[!is.na(covid$Country_code), ]
colSums(covid == 0)
WHO_region Country_code Cumulative_cases Cumulative_deaths
0 0 0 0
# Final dataset
head(covid)
WHO_region Country_code Cumulative_cases Cumulative_deaths
1 EMRO AF 235214 7998
2 EURO AL 335047 3605
3 AFRO DZ 272083 6881
4 WPRO AS 8359 34
5 EURO AD 48015 159
6 AFRO AO 107481 1937
dim(covid)
[1] 211 4
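The cleaning steps above can be collected into a single dplyr pipeline. The following is a minimal sketch on a toy data frame, not the assignment's required solution; `snapshot`, `clean`, and the placeholder codes `XX`, `YY`, and `ZZ` are illustrative, and only the first two rows reuse values from the output above.

```r
library(dplyr)

# Toy snapshot; column names follow the WHO file, most values are invented.
snapshot <- data.frame(
  WHO_region        = c("EURO", "AFRO", "OTHER", "", "WPRO"),
  Country_code      = c("AD", "AO", "XX", "YY", "ZZ"),
  Cumulative_cases  = c(48015, 107481, 764, 10, 80),
  Cumulative_deaths = c(159, 1937, 13, 1, 0),
  stringsAsFactors  = FALSE
)

clean <- snapshot %>%
  select(WHO_region, Country_code, Cumulative_cases, Cumulative_deaths) %>%
  filter(
    Cumulative_cases != 0,             # drop countries with no recorded cases
    Cumulative_deaths != 0,            # drop countries with no recorded deaths
    !WHO_region %in% c("OTHER", ""),   # drop "OTHER" and blank regions
    !is.na(Country_code)               # drop rows without a country code
  )

print(clean)
```

Conditions separated by commas inside `filter()` are combined with a logical AND, so a row must pass every check to be kept.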
1.- Develop the following visualizations using rCharts:
Render the file and verify that it displays correctly inside the RMD notebook in a browser. Note the dynamic settings it provides.
# Load the required libraries
require(rCharts)
require(dplyr)
require(reshape2)
require(tidyr)
require(knitr)
# Group and summarize the data by country and region
covidm <- covid %>%
group_by(Country_code, WHO_region) %>%
summarise(
Cumulative_cases = sum(Cumulative_cases, na.rm = TRUE),
Cumulative_deaths = sum(Cumulative_deaths, na.rm = TRUE),
.groups = 'drop'
)
# Check the dimensions of the resulting dataset
print(dim(covidm))
[1] 211 4
# Reshape the data to long format so that cases and deaths can both feed a stacked bar chart
covidm_long <- covidm %>%
pivot_longer(cols = c(Cumulative_cases, Cumulative_deaths), names_to = "Metric", values_to = "Count")
# Check the structure of the reshaped dataset
head(covidm_long)
# A tibble: 6 × 4
Country_code WHO_region Metric Count
<chr> <chr> <chr> <int>
1 AD EURO Cumulative_cases 48015
2 AD EURO Cumulative_deaths 159
3 AE EMRO Cumulative_cases 1067030
4 AE EMRO Cumulative_deaths 2349
5 AF EMRO Cumulative_cases 235214
6 AF EMRO Cumulative_deaths 7998
# Create the stacked bar chart with both metrics, grouped by WHO_region
chart <- nPlot(Count ~ Country_code, group = 'WHO_region', data = covidm_long, type = 'multiBarChart')
# Customize the chart (optional)
chart$xAxis(axisLabel = 'Country Code')
chart$yAxis(axisLabel = 'Count')
# Display the chart
#chart
# Save the chart as a standalone HTML file for inclusion in RMarkdown
chart$save('covid_chart.html', standalone = TRUE)
The bars are stacked by number of cases and deaths. Because of the scale, the smaller variable is almost imperceptible next to the other: only a faint shaded band can be seen in some bars. The information can also be read by hovering the mouse over the differently colored area of each bar.
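To put a number on that scale gap, the sketch below computes what fraction of a stacked bar the deaths segment would occupy, using three rows copied from the `head(covid)` output. It also computes a log10 transform as one possible way to make both metrics readable; the transform is a suggestion rather than part of the assignment, and it suits grouped bars better than stacked ones, since stacked heights on a log scale no longer add up. The names `covid_sample` and `death_share` are illustrative.

```r
# Three rows taken from the head(covid) output shown earlier.
covid_sample <- data.frame(
  Country_code      = c("AF", "AL", "DZ"),
  Cumulative_cases  = c(235214, 335047, 272083),
  Cumulative_deaths = c(7998, 3605, 6881)
)

# Share of each stacked bar that the deaths segment would occupy.
covid_sample$death_share <- with(
  covid_sample,
  Cumulative_deaths / (Cumulative_cases + Cumulative_deaths)
)

# log10 compresses the range so both metrics can share one axis.
covid_sample$log_cases  <- log10(covid_sample$Cumulative_cases)
covid_sample$log_deaths <- log10(covid_sample$Cumulative_deaths)

print(covid_sample)
```

For these three countries the deaths segment is below 5% of the bar height, which is consistent with the faint band described above.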
The instructor will provide the following examples as templates for solving this assignment.
The file index44.html will serve as the base template on which the student must make the following changes.
To create the D3 script, follow these steps:
The covid data must be extracted into 3 sets (lists) that will be passed to the D3 code.
# Build the lists.
# Print the regions as a comma-separated list of quoted strings
cat(paste0('"', covid$WHO_region, '"', collapse = ","))
"EMRO","EURO","AFRO","WPRO","EURO","AFRO","AMRO","AMRO","AMRO","EURO","AMRO","WPRO","EURO","EURO","AMRO","EMRO","SEARO","AMRO","EURO","EURO","AMRO","AFRO","AMRO","SEARO","AMRO","AMRO","EURO","AFRO","AMRO","AMRO","WPRO","EURO","AFRO","AFRO","AFRO","WPRO","AFRO","AMRO","AMRO","AFRO","AFRO","AMRO","WPRO","AMRO","AFRO","AFRO","WPRO","AMRO","AFRO","EURO","AMRO","AMRO","EURO","EURO","AFRO","EURO","EMRO","AMRO","AMRO","AMRO","EMRO","AMRO","AFRO","AFRO","EURO","AFRO","AFRO","WPRO","EURO","EURO","WPRO","AFRO","AFRO","EURO","EURO","AFRO","EURO","EURO","AMRO","WPRO","AMRO","AFRO","AFRO","AMRO","AMRO","AMRO","EURO","EURO","SEARO","SEARO","EMRO","EMRO","EURO","EURO","EURO","AMRO","WPRO","EMRO","EURO","AFRO","WPRO","EURO","EMRO","EURO","WPRO","EURO","EMRO","AFRO","AFRO","EMRO","EURO","EURO","AFRO","AFRO","WPRO","SEARO","AFRO","EURO","WPRO","AFRO","AFRO","AMRO","WPRO","EURO","WPRO","EURO","AMRO","EMRO","AFRO","SEARO","WPRO","SEARO","EURO","WPRO","WPRO","AMRO","AFRO","AFRO","EURO","WPRO","EURO","EMRO","EMRO","EMRO","WPRO","AMRO","WPRO","AMRO","AMRO","WPRO","EURO","EURO","AMRO","EMRO","WPRO","EURO","EURO","EURO","AFRO","AMRO","AMRO","AMRO","WPRO","EURO","AFRO","EMRO","AFRO","EURO","AFRO","AFRO","WPRO","AMRO","EURO","EURO","WPRO","EMRO","AFRO","AFRO","EURO","SEARO","EMRO","AMRO","EURO","EURO","EMRO","EURO","SEARO","EURO","SEARO","AFRO","WPRO","AMRO","EMRO","EURO","AMRO","WPRO","AFRO","EURO","EMRO","AFRO","AMRO","AMRO","AMRO","EURO","WPRO","AMRO","WPRO","WPRO","EMRO","AFRO","AFRO"
# Print the country codes as a comma-separated list of quoted strings
cat(paste0('"', covid$Country_code, '"', collapse = ","))
"AF","AL","DZ","AS","AD","AO","AI","AG","AR","AM","AW","AU","AT","AZ","BS","BH","BD","BB","BY","BE","BZ","BJ","BM","BT","BO","BQ","BA","BW","BR","VG","BN","BG","BF","BI","CV","KH","CM","CA","KY","CF","TD","CL","CN","CO","KM","CG","CK","CR","CI","HR","CU","CW","CY","CZ","CD","DK","DJ","DM","DO","EC","EG","SV","GQ","ER","EE","SZ","ET","FJ","FI","FR","PF","GA","GM","GE","DE","GH","GR","GL","GD","GU","GT","GN","GW","GY","HT","HN","HU","IS","IN","ID","IR","IQ","IE","IL","IT","JM","JP","JO","KZ","KE","KI","XK","KW","KG","LA","LV","LB","LS","LR","LY","LT","LU","MG","MW","MY","MV","ML","MT","MH","MR","MU","MX","FM","MC","MN","ME","MS","MA","MZ","MM","NR","NP","NL","NC","NZ","NI","NE","NG","MK","MP","NO","PS","OM","PK","PW","PA","PG","PY","PE","PH","PL","PT","PR","QA","KR","MD","RO","RU","RW","KN","LC","VC","WS","SM","ST","SA","SN","RS","SC","SL","SG","SX","SK","SI","SB","SO","ZA","SS","ES","LK","SD","SR","SE","CH","SY","TJ","TH","GB","TL","TG","TO","TT","TN","TR","TC","TV","UG","UA","AE","TZ","US","VI","UY","UZ","VU","VE","VN","WF","YE","ZM","ZW"
# Print [cases,deaths] pairs as a comma-separated list
cat(paste0("[", covid$Cumulative_cases, ",", covid$Cumulative_deaths, "]", collapse = ","))
[235214,7998],[335047,3605],[272083,6881],[8359,34],[48015,159],[107481,1937],[3904,12],[9106,146],[10113624,130685],[451944,8777],[44224,292],[11861161,25236],[6082414,22534],[835636,10353],[39064,849],[696614,1536],[2051272,29498],[107794,593],[994037,7118],[4870157,34339],[71412,688],[28036,163],[18860,165],[62697,21],[1212145,22387],[11922,41],[403662,16391],[330696,2801],[37511921,702116],[7557,64],[347069,178],[1329588,38700],[22139,400],[54569,15],[64474,417],[139310,3056],[125241,1974],[4818690,55280],[31472,37],[15441,113],[7702,194],[5400928,62727],[99369029,122280],[6389838,142727],[9109,160],[25227,389],[7344,2],[1230653,9368],[88433,835],[1317144,18752],[1113662,8530],[45883,305],[696031,1451],[4760891,43507],[100933,1471],[3435679,9693],[15690,189],[16047,74],[661103,4384],[1077122,36050],[516023,24830],[201912,4230],[17130,183],[10189,103],[610471,2998],[75356,1427],[501185,7574],[69047,885],[1499712,11466],[38997490,168091],[79340,650],[49051,307],[12627,372],[1863615,17150],[38437756,174979],[172057,1462],[5661735,39167],[11971,21],[19693,238],[52287,419],[1250369,20203],[38572,468],[9614,177],[74357,1302],[34298,860],[472888,11114],[2230591,49053],[210315,186],[45041192,533623],[6829353,162059],[7627863,146837],[2465545,25375],[1743422,9699],[4841558,12707],[26727644,197081],[157137,3604],[33803572,74694],[1746997,14122],[1504370,19072],[344105,5689],[5085,24],[274279,3212],[667290,2570],[88953,1024],[219060,671],[977765,7475],[1239904,10947],[36138,709],[7930,294],[507269,6437],[1367900,9809],[393140,1000],[68564,1427],[89168,2686],[5306834,37351],[186694,316],[33166,743],[122577,918],[16252,17],[63869,997],[328167,1073],[7617685,334524],[26460,65],[17181,67],[1011489,2136],[251280,2654],[1403,8],[1279115,16305],[233841,2252],[642839,19494],[5393,1],[1003450,12031],[8639162,22986],[80163,314],[2635247,4246],[16179,245],[9518,315],[267188,3155],[350731,9977],[14837,41],[1511220,5732],[703228,5708],[399449,4628],[1580631,30656],[6372,10],[1044783,87
38],[46864,670],[735759,19880],[4526977,220975],[4140383,66864],[6666637,120718],[5661625,28695],[1252713,5938],[514524,690],[34571873,35934],[636266,12241],[3533987,68815],[24254803,403155],[133264,1468],[6607,46],[30271,410],[9674,124],[17057,31],[25292,126],[6771,80],[841469,9646],[89409,1971],[2583470,18057],[51790,172],[7979,125],[3006155,2024],[11051,92],[1877895,21226],[1356318,10073],[25954,199],[27334,1361],[4072759,102595],[18823,147],[13980340,121852],[672797,16907],[63993,5046],[82501,1406],[2753981,27339],[4456903,14170],[57423,3163],[17786,125],[4797692,34712],[24964791,232112],[23460,138],[39530,290],[16976,12],[191496,4390],[1153361,29423],[17004718,101419],[6805,40],[2943,1],[172154,3632],[5532518,109920],[1067030,2349],[43230,846],[103436829,1191632],[25389,132],[1041317,7682],[175081,1016],[12019,14],[552695,5856],[11624000,43206],[3760,9],[11945,2159],[349749,4069],[266385,5740]
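Instead of copying the printed lists by hand, the three sets could be wrapped directly into JavaScript variable declarations. The sketch below shows one way to do that in base R. The helpers `to_js_array` and `to_js_pairs` and the JavaScript variable names `regions` and `values` are hypothetical; the index44.html template may expect different names.

```r
# Hypothetical helper: format an R vector as a JavaScript array literal.
to_js_array <- function(x, quote = is.character(x)) {
  items <- if (quote) paste0('"', x, '"') else as.character(x)
  paste0("[", paste(items, collapse = ","), "]")
}

# Hypothetical helper: format two vectors as [[a1,b1],[a2,b2],...].
to_js_pairs <- function(a, b) {
  paste0("[", paste0("[", a, ",", b, "]", collapse = ","), "]")
}

# First three rows of the final dataset shown earlier.
regions <- c("EMRO", "EURO", "AFRO")
cases   <- c(235214L, 335047L, 272083L)
deaths  <- c(7998L, 3605L, 6881L)

# One JavaScript declaration per line (variable names are assumptions).
js <- c(
  paste0("var regions = ", to_js_array(regions), ";"),
  paste0("var values = ", to_js_pairs(cases, deaths), ";")
)
writeLines(js)
```

The resulting lines could be written to a file with `writeLines(js, "data.js")` and referenced from the HTML template, assuming the template is adapted to load it.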
We compute the minimum and maximum of the variables Cumulative_cases and Cumulative_deaths
# Cumulative Cases
min(covid$Cumulative_cases)
[1] 1403
max(covid$Cumulative_cases)
[1] 103436829
# Cumulative Deaths
min(covid$Cumulative_deaths)
[1] 1
max(covid$Cumulative_deaths)
[1] 1191632
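Minimum and maximum values like these are presumably what the D3 scales will take as their domains. As a minimal sketch, `range()` returns both extremes in one call, and the log10 difference shows how many orders of magnitude each variable spans. The vectors below hold only the extremes printed above plus one mid-range row from the dataset; the names `cases_domain`, `deaths_domain`, `cases_span`, and `deaths_span` are illustrative.

```r
# Extremes copied from the output above, plus one mid-range row.
cases  <- c(1403, 267188, 103436829)
deaths <- c(1, 3155, 1191632)

# range() returns c(min, max) in a single call.
cases_domain  <- range(cases, na.rm = TRUE)
deaths_domain <- range(deaths, na.rm = TRUE)

# Orders of magnitude spanned by each variable.
cases_span  <- diff(log10(cases_domain))
deaths_span <- diff(log10(deaths_domain))

print(cases_domain)
print(deaths_domain)
print(c(cases = cases_span, deaths = deaths_span))
```

Both variables span more than four orders of magnitude, which suggests a logarithmic or square-root scale may be easier to read than a linear one.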
We compute the quartiles of the variable Cumulative_cases
quantile(covid$Cumulative_cases)
0% 25% 50% 75% 100%
1403 32319 267188 1502041 103436829
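One possible use of these quartiles is to assign each country to a bin that the D3 code can map to a color or a marker size. The sketch below uses `cut()` with the breakpoints printed above and a handful of `Cumulative_cases` values from the dataset; the labels `Q1` to `Q4` and the variable names are illustrative choices.

```r
# Quartile breakpoints copied from the quantile() output above.
breaks <- c(1403, 32319, 267188, 1502041, 103436829)

# A few Cumulative_cases values from the dataset shown earlier.
cases <- c(235214, 8359, 48015, 10113624, 1403)

# include.lowest = TRUE keeps the minimum value inside the first bin.
quartile <- cut(cases, breaks = breaks,
                labels = c("Q1", "Q2", "Q3", "Q4"),
                include.lowest = TRUE)

print(data.frame(cases, quartile))
```

Without `include.lowest = TRUE`, the country holding the minimum (1403 cases) would fall outside every interval and be assigned `NA`.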