Assignment

Objective

The objective of this assignment is to familiarize the student with dynamic (interactive) charts, specifically visualizations created with the rCharts package. As discussed in the presentation, rCharts supports building dynamic charts on top of several JavaScript libraries; this assignment focuses only on NVD3, a library built on D3.js that supports several types of visualizations.

EXERCISE

Consider the file on the effects of the COVID-19 virus around the world. The data can be provided by the instructor or downloaded from one of the worldwide repositories, following the link provided at the beginning of this document. The dataset contains several columns, and its rows hold each country's values day by day since the start of the pandemic. We are interested only in the totals, so the student must filter to the latest date and apply whatever aggregations they consider appropriate.

The data can be downloaded from:

https://covid19.who.int/WHO-COVID-19-global-data.csv

  • Read the data directly from the Internet into the R environment. Note that it contains day-by-day information, which is not of interest here; we want the data as of the latest date, so you must filter to that date.

Of all the columns in the file, we are interested only in the following:

WHO_region, Country_code, Cumulative_cases, Cumulative_deaths

  • Select those columns

  • Filter out the countries that have no recorded cases or deaths

  • Filter out the countries that belong to the region “Other”

  • Make sure the data is complete, that is, there are no blank values

Complete the following work:

1.- Develop the following visualizations using rCharts:

a) Present the best rCharts diagram, using the NVD3 library, to plot the position of each country with respect to cases and deaths, grouped by region. Only one diagram is required. Review the rCharts documentation provided in the virtual classroom. Note the dynamic features it offers. Build the chart first in RStudio and, once you are satisfied with how it looks, save the generated JavaScript code and include it in the Rmd file using an iframe tag. Use the lab as a template:
covid_data <-  read.csv("https://covid19.who.int/WHO-COVID-19-global-data.csv",
                   header=TRUE, sep=",")

EXPLORING THE DATA

We check which columns are available:

## [1] "Date_reported"     "Country_code"      "Country"          
## [4] "WHO_region"        "New_cases"         "Cumulative_cases" 
## [7] "New_deaths"        "Cumulative_deaths"
dim(covid_data)
## [1] 57120     8
head(covid_data)
##   Date_reported Country_code     Country WHO_region New_cases
## 1    2020-01-05           AF Afghanistan       EMRO        NA
## 2    2020-01-12           AF Afghanistan       EMRO        NA
## 3    2020-01-19           AF Afghanistan       EMRO        NA
## 4    2020-01-26           AF Afghanistan       EMRO        NA
## 5    2020-02-02           AF Afghanistan       EMRO        NA
## 6    2020-02-09           AF Afghanistan       EMRO        NA
##   Cumulative_cases New_deaths Cumulative_deaths
## 1                0         NA                 0
## 2                0         NA                 0
## 3                0         NA                 0
## 4                0         NA                 0
## 5                0         NA                 0
## 6                0         NA                 0
summary(covid_data)
##  Date_reported      Country_code         Country         
##  Length:57120       Length:57120       Length:57120      
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##                                                          
##   WHO_region          New_cases        Cumulative_cases   
##  Length:57120       Min.   :  -65079   Min.   :        0  
##  Class :character   1st Qu.:      43   1st Qu.:     3946  
##  Mode  :character   Median :     399   Median :    45562  
##                     Mean   :   20021   Mean   :  1774440  
##                     3rd Qu.:    4031   3rd Qu.:   516023  
##                     Max.   :40475477   Max.   :103436829  
##                     NA's   :18374                         
##    New_deaths      Cumulative_deaths  
##  Min.   :-3432.0   Min.   :      0.0  
##  1st Qu.:    4.0   1st Qu.:     28.0  
##  Median :   20.0   Median :    544.5  
##  Mean   :  283.1   Mean   :  19891.8  
##  3rd Qu.:  105.0   3rd Qu.:   6881.0  
##  Max.   :47687.0   Max.   :1191632.0  
##  NA's   :32201
str(covid_data)
## 'data.frame':    57120 obs. of  8 variables:
##  $ Date_reported    : chr  "2020-01-05" "2020-01-12" "2020-01-19" "2020-01-26" ...
##  $ Country_code     : chr  "AF" "AF" "AF" "AF" ...
##  $ Country          : chr  "Afghanistan" "Afghanistan" "Afghanistan" "Afghanistan" ...
##  $ WHO_region       : chr  "EMRO" "EMRO" "EMRO" "EMRO" ...
##  $ New_cases        : int  NA NA NA NA NA NA NA NA 1 NA ...
##  $ Cumulative_cases : int  0 0 0 0 0 0 0 0 1 1 ...
##  $ New_deaths       : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ Cumulative_deaths: int  0 0 0 0 0 0 0 0 0 0 ...
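library(dplyr)
# Latest reporting date; ISO "YYYY-MM-DD" strings sort chronologically.
latest_date <- max(covid_data$Date_reported, na.rm = TRUE)
# Keep only the rows for that date.
covid19 <- covid_data %>%
  filter(Date_reported == latest_date)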
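
Filtering on the single global maximum date assumes that every country reports on that same date. As a hedged alternative, the sketch below keeps each country's own most recent row. The data frame `toy` and its values are invented for illustration and are not taken from the WHO file.

```r
library(dplyr)

# Hypothetical toy data: country BB stops reporting one week earlier than AA.
toy <- data.frame(
  Date_reported    = c("2024-01-07", "2024-01-14", "2024-01-07"),
  Country_code     = c("AA", "AA", "BB"),
  Cumulative_cases = c(10L, 15L, 7L),
  stringsAsFactors = FALSE
)

# Parse the ISO dates so the comparison is on real dates, not on text.
toy$Date_reported <- as.Date(toy$Date_reported)

# Keep, for every country, the row with that country's latest date.
latest_by_country <- toy %>%
  group_by(Country_code) %>%
  filter(Date_reported == max(Date_reported)) %>%
  ungroup()

print(latest_by_country)
```

With a global-maximum filter, BB would be dropped from this toy data; the per-country version keeps it.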

Select the columns of interest

WHO_region, Country_code, Cumulative_cases, Cumulative_deaths.

Select those columns

covid <- covid19[, c("WHO_region", "Country_code", "Cumulative_cases", "Cumulative_deaths" )]
#View(covid)
dim(covid)
## [1] 240   4
colSums(covid == 0)
##        WHO_region      Country_code  Cumulative_cases Cumulative_deaths 
##                 0                NA                 2                13

Filter out the countries that have no recorded cases or deaths

covid <- covid[covid$Cumulative_cases != 0, ]
covid <- covid[covid$Cumulative_deaths != 0, ]
colSums(covid == 0)
##        WHO_region      Country_code  Cumulative_cases Cumulative_deaths 
##                 0                NA                 0                 0
dim(covid)
## [1] 227   4

Filter out the countries that belong to the region “Other”

unique(covid$WHO_region)

covid <- covid[covid$WHO_region != "OTHER", ]
#View(covid)
unique(covid$WHO_region)
## [1] "EMRO"  "EURO"  "AFRO"  "WPRO"  "AMRO"  "SEARO" ""

Make sure the data is complete, that is, there are no blank values

Empty values in WHO_region

sum(covid$WHO_region == "")
## [1] 14
covid<- covid[covid$WHO_region != "", ]
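
A blank can appear either as NA or as an empty string, and the two are not caught by the same test. The helper below, `count_blanks` (a hypothetical name, not part of any package), counts both kinds per column. The toy data frame is invented for illustration.

```r
# Count values that are NA or empty strings in each column of a data frame.
count_blanks <- function(df) {
  sapply(df, function(col) sum(is.na(col) | col == ""))
}

# Invented toy data with one empty string and two NA values.
toy <- data.frame(
  WHO_region       = c("EMRO", "", "AFRO", NA),
  Cumulative_cases = c(10L, 20L, NA, 40L),
  stringsAsFactors = FALSE
)

blanks <- count_blanks(toy)
print(blanks)

# Keep only the rows with no NA and no empty string in any column.
is_blank <- is.na(toy) | toy == ""
complete <- toy[rowSums(is_blank) == 0, ]
print(complete)
```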

Check for zero values

colSums(covid == 0)
##        WHO_region      Country_code  Cumulative_cases Cumulative_deaths 
##                 0                NA                 0                 0

Missing values in Country_code

covid <- covid[!is.na(covid$Country_code), ]
colSums(covid == 0)
##        WHO_region      Country_code  Cumulative_cases Cumulative_deaths 
##                 0                 0                 0                 0
head(covid)
##   WHO_region Country_code Cumulative_cases Cumulative_deaths
## 1       EMRO           AF           235214              7998
## 2       EURO           AL           335047              3605
## 3       AFRO           DZ           272083              6881
## 4       WPRO           AS             8359                34
## 5       EURO           AD            48015               159
## 6       AFRO           AO           107481              1937
dim(covid)
## [1] 211   4
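
The NA that `colSums` reports for Country_code is consistent with a known `read.csv` behavior: by default the text "NA" is parsed as a missing value, and "NA" is also the ISO 3166 code for Namibia. If that is the cause here (an assumption; the source does not confirm it), dropping rows with a missing Country_code removes a real country. The sketch below reproduces the effect on an invented two-row CSV and shows `na.strings = ""` as one way to avoid it.

```r
# Hypothetical inline CSV; the numbers are invented for illustration.
csv_text <- "Country_code,Country,Cumulative_cases
NA,Namibia,100
AF,Afghanistan,200"

# Default parsing: the string "NA" becomes a missing value.
default_read <- read.csv(text = csv_text, stringsAsFactors = FALSE)
print(is.na(default_read$Country_code))

# Treat only empty fields as missing, so the code "NA" survives as text.
safe_read <- read.csv(text = csv_text, na.strings = "",
                      stringsAsFactors = FALSE)
print(safe_read$Country_code)
```

With this setting, empty WHO_region fields would be read as NA rather than as empty strings, so the blank-value filter would need `is.na()` instead of a comparison with "".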

Load the required libraries

require(devtools)
install_github('ramnathv/rCharts' ,type="source",  force=TRUE)
## Downloading GitHub repo ramnathv/rCharts@HEAD
## ── R CMD build ──────────────────────────────────────────────────────────
##      checking for file ‘/tmp/RtmpXVb533/remotes14957e84865/ramnathv-rCharts-479a4f9/DESCRIPTION’ ...  ✔  checking for file ‘/tmp/RtmpXVb533/remotes14957e84865/ramnathv-rCharts-479a4f9/DESCRIPTION’ (547ms)
##   ─  preparing ‘rCharts’:
##    checking DESCRIPTION meta-information ...  ✔  checking DESCRIPTION meta-information
##   ─  checking for LF line-endings in source and make files and shell scripts (509ms)
##   ─  checking for empty or unneeded directories
##   ─  building ‘rCharts_0.4.5.tar.gz’
##      Warning: invalid uid value replaced by that for user 'nobody'
##      
## 
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.4'
## (as 'lib' is unspecified)
library(rCharts)
library(dplyr)
library(reshape2)
library(tidyr)
library(knitr)
names(covid)
## [1] "WHO_region"        "Country_code"      "Cumulative_cases" 
## [4] "Cumulative_deaths"

Group and summarize the data by country and region

covidm <- covid %>%
  group_by(Country_code, WHO_region) %>%
  dplyr::summarise(across(c(Cumulative_deaths, Cumulative_cases), sum, na.rm = TRUE))
## `summarise()` has grouped output by 'Country_code'. You can override
## using the `.groups` argument.
covidr <- covid %>%
  group_by(Country_code, WHO_region) %>%
  summarise(
    Cumulative_cases = sum(Cumulative_cases, na.rm = TRUE),
    Cumulative_deaths = sum(Cumulative_deaths, na.rm = TRUE),
    .groups = 'drop'
  )

Verify the dimensions of the resulting dataset

print(dim(covidr))
## [1] 211   4
covidm_l <- covidr %>%
  pivot_longer(cols = c(Cumulative_cases, Cumulative_deaths), names_to = "Metric", values_to = "Count")

Verify the structure of the reshaped dataset

head(covidm_l)
## # A tibble: 6 × 4
##   Country_code WHO_region Metric              Count
##   <chr>        <chr>      <chr>               <int>
## 1 AD           EURO       Cumulative_cases    48015
## 2 AD           EURO       Cumulative_deaths     159
## 3 AE           EMRO       Cumulative_cases  1067030
## 4 AE           EMRO       Cumulative_deaths    2349
## 5 AF           EMRO       Cumulative_cases   235214
## 6 AF           EMRO       Cumulative_deaths    7998

Convert the data to long format for the chart

covidm_long <- covidm %>%
  pivot_longer(cols = c(Cumulative_deaths, Cumulative_cases), 
               names_to = "Metric", 
               values_to = "Count")

Create the bar chart (NVD3 multiBarChart, switchable between grouped and stacked)

chart <- nPlot(Count ~ Country_code, group = 'Metric', data = covidm_long, type = 'multiBarChart')

Customize the chart

chart$xAxis(axisLabel = 'Country Code')
chart$yAxis(axisLabel = 'Count')
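
Since item a) asks for each country's position with respect to cases and deaths, grouped by region, a scatter chart is another candidate alongside the bar chart. The sketch below is one possible approach, not the required answer: it uses the `nPlot` function from rCharts with `type = "scatterChart"`, runs the plotting step only if rCharts is installed, and works on an invented toy data frame with the same four columns as `covid`.

```r
# Toy data with the same four columns as `covid`; values are invented.
toy <- data.frame(
  WHO_region        = c("EURO", "EURO", "AFRO"),
  Country_code      = c("AA", "BB", "CC"),
  Cumulative_cases  = c(1000L, 5000L, 300L),
  Cumulative_deaths = c(10L, 80L, 4L),
  stringsAsFactors  = FALSE
)

scatter <- NULL
if (requireNamespace("rCharts", quietly = TRUE)) {
  # One point per country: x = cases, y = deaths, colored by region.
  scatter <- rCharts::nPlot(Cumulative_deaths ~ Cumulative_cases,
                            group = "WHO_region", data = toy,
                            type = "scatterChart")
  scatter$xAxis(axisLabel = "Cumulative cases")
  scatter$yAxis(axisLabel = "Cumulative deaths")
}
```

Replacing `toy` with `covid` would plot the real data.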

Save the chart as a standalone HTML file for inclusion in R Markdown

chart$save('D:\\NO ELIMINAR\\Desktop\\PUCE\\VISUALIZACIÓN\\covid_chart.html', standalone = TRUE)
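
To embed the saved HTML file in the Rmd document, one option is to emit an iframe tag from a chunk declared with `results = 'asis'`. The helper `iframe_tag` below is hypothetical (base R only), and the file name `covid_chart.html` assumes the chart was saved next to the Rmd file.

```r
# Build an HTML iframe tag pointing at a saved chart file.
iframe_tag <- function(src, width = "100%", height = 500) {
  sprintf('<iframe src="%s" width="%s" height="%s" frameborder="0"></iframe>',
          src, width, height)
}

tag <- iframe_tag("covid_chart.html")

# Inside an Rmd chunk declared with results = 'asis', this prints raw HTML.
cat(tag, "\n")
```

A relative path keeps the document portable; an absolute local path would only resolve on the machine where the chart was saved.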

In the same file we will include a D3 chart based on the same COVID data. The instructor will provide the following examples as templates for completing this assignment: Index44.html, Index42.html, and Index31.html. The file index44.html will serve as the base template, on which the student must make the following changes.

To create the D3 script, follow these steps:

We need to extract the COVID data into 3 sets (lists) that will be passed to the D3 code.

The first list contains the regions.

The second list contains the countries.

The third list contains the cumulative cases (the code below emits them as [cases, deaths] pairs).

Code to create the lists.

listareg <- as.list(covid$WHO_region)
for (r in 1:length(listareg))
{
  if (r < length(listareg)) {
    cat('"',as.character(listareg[r]),'"', ",", sep="")}
  else{
    cat('"',as.character(listareg[r]),'"',  sep="")
  }
}
## "EMRO","EURO","AFRO","WPRO","EURO","AFRO","AMRO","AMRO","AMRO","EURO","AMRO","WPRO","EURO","EURO","AMRO","EMRO","SEARO","AMRO","EURO","EURO","AMRO","AFRO","AMRO","SEARO","AMRO","AMRO","EURO","AFRO","AMRO","AMRO","WPRO","EURO","AFRO","AFRO","AFRO","WPRO","AFRO","AMRO","AMRO","AFRO","AFRO","AMRO","WPRO","AMRO","AFRO","AFRO","WPRO","AMRO","AFRO","EURO","AMRO","AMRO","EURO","EURO","AFRO","EURO","EMRO","AMRO","AMRO","AMRO","EMRO","AMRO","AFRO","AFRO","EURO","AFRO","AFRO","WPRO","EURO","EURO","WPRO","AFRO","AFRO","EURO","EURO","AFRO","EURO","EURO","AMRO","WPRO","AMRO","AFRO","AFRO","AMRO","AMRO","AMRO","EURO","EURO","SEARO","SEARO","EMRO","EMRO","EURO","EURO","EURO","AMRO","WPRO","EMRO","EURO","AFRO","WPRO","EURO","EMRO","EURO","WPRO","EURO","EMRO","AFRO","AFRO","EMRO","EURO","EURO","AFRO","AFRO","WPRO","SEARO","AFRO","EURO","WPRO","AFRO","AFRO","AMRO","WPRO","EURO","WPRO","EURO","AMRO","EMRO","AFRO","SEARO","WPRO","SEARO","EURO","WPRO","WPRO","AMRO","AFRO","AFRO","EURO","WPRO","EURO","EMRO","EMRO","EMRO","WPRO","AMRO","WPRO","AMRO","AMRO","WPRO","EURO","EURO","AMRO","EMRO","WPRO","EURO","EURO","EURO","AFRO","AMRO","AMRO","AMRO","WPRO","EURO","AFRO","EMRO","AFRO","EURO","AFRO","AFRO","WPRO","AMRO","EURO","EURO","WPRO","EMRO","AFRO","AFRO","EURO","SEARO","EMRO","AMRO","EURO","EURO","EMRO","EURO","SEARO","EURO","SEARO","AFRO","WPRO","AMRO","EMRO","EURO","AMRO","WPRO","AFRO","EURO","EMRO","AFRO","AMRO","AMRO","AMRO","EURO","WPRO","AMRO","WPRO","WPRO","EMRO","AFRO","AFRO"
# Country codes, kept in their own variable instead of reusing listareg.
listapais <- as.list(covid$Country_code)
for (r in seq_along(listapais))
{
  if (r < length(listapais)) {
    cat('"',as.character(listapais[r]),'"', ",", sep="")}
  else{
    cat('"',as.character(listapais[r]),'"',  sep="")
  }
}
## "AF","AL","DZ","AS","AD","AO","AI","AG","AR","AM","AW","AU","AT","AZ","BS","BH","BD","BB","BY","BE","BZ","BJ","BM","BT","BO","BQ","BA","BW","BR","VG","BN","BG","BF","BI","CV","KH","CM","CA","KY","CF","TD","CL","CN","CO","KM","CG","CK","CR","CI","HR","CU","CW","CY","CZ","CD","DK","DJ","DM","DO","EC","EG","SV","GQ","ER","EE","SZ","ET","FJ","FI","FR","PF","GA","GM","GE","DE","GH","GR","GL","GD","GU","GT","GN","GW","GY","HT","HN","HU","IS","IN","ID","IR","IQ","IE","IL","IT","JM","JP","JO","KZ","KE","KI","XK","KW","KG","LA","LV","LB","LS","LR","LY","LT","LU","MG","MW","MY","MV","ML","MT","MH","MR","MU","MX","FM","MC","MN","ME","MS","MA","MZ","MM","NR","NP","NL","NC","NZ","NI","NE","NG","MK","MP","NO","PS","OM","PK","PW","PA","PG","PY","PE","PH","PL","PT","PR","QA","KR","MD","RO","RU","RW","KN","LC","VC","WS","SM","ST","SA","SN","RS","SC","SL","SG","SX","SK","SI","SB","SO","ZA","SS","ES","LK","SD","SR","SE","CH","SY","TJ","TH","GB","TL","TG","TO","TT","TN","TR","TC","TV","UG","UA","AE","TZ","US","VI","UY","UZ","VU","VE","VN","WF","YE","ZM","ZW"
for (r in 1:length(covid$Cumulative_cases))
{
  if (r < length(covid$Cumulative_cases)) {
    cat('[',as.character(covid$Cumulative_cases[r]),',' ,as.character(covid$Cumulative_deaths[r]), "]", ",", sep="")     }
  else{
    cat('[',as.character(covid$Cumulative_cases[r]),',' ,as.character(covid$Cumulative_deaths[r]), "]",  sep="")     }  
}
## [235214,7998],[335047,3605],[272083,6881],[8359,34],[48015,159],[107481,1937],[3904,12],[9106,146],[10113624,130685],[451944,8777],[44224,292],[11861161,25236],[6082414,22534],[835636,10353],[39064,849],[696614,1536],[2051272,29498],[107794,593],[994037,7118],[4870157,34339],[71412,688],[28036,163],[18860,165],[62697,21],[1212145,22387],[11922,41],[403662,16391],[330696,2801],[37511921,702116],[7557,64],[347069,178],[1329588,38700],[22139,400],[54569,15],[64474,417],[139310,3056],[125241,1974],[4818690,55280],[31472,37],[15441,113],[7702,194],[5400928,62727],[99369029,122280],[6389838,142727],[9109,160],[25227,389],[7344,2],[1230653,9368],[88433,835],[1317144,18752],[1113662,8530],[45883,305],[696031,1451],[4760891,43507],[100933,1471],[3435679,9693],[15690,189],[16047,74],[661103,4384],[1077122,36050],[516023,24830],[201912,4230],[17130,183],[10189,103],[610471,2998],[75356,1427],[501185,7574],[69047,885],[1499712,11466],[38997490,168091],[79340,650],[49051,307],[12627,372],[1863615,17150],[38437756,174979],[172057,1462],[5661735,39167],[11971,21],[19693,238],[52287,419],[1250369,20203],[38572,468],[9614,177],[74357,1302],[34298,860],[472888,11114],[2230591,49053],[210315,186],[45041192,533623],[6829353,162059],[7627863,146837],[2465545,25375],[1743422,9699],[4841558,12707],[26727644,197081],[157137,3604],[33803572,74694],[1746997,14122],[1504370,19072],[344105,5689],[5085,24],[274279,3212],[667290,2570],[88953,1024],[219060,671],[977765,7475],[1239904,10947],[36138,709],[7930,294],[507269,6437],[1367900,9809],[393140,1000],[68564,1427],[89168,2686],[5306834,37351],[186694,316],[33166,743],[122577,918],[16252,17],[63869,997],[328167,1073],[7617685,334524],[26460,65],[17181,67],[1011489,2136],[251280,2654],[1403,8],[1279115,16305],[233841,2252],[642839,19494],[5393,1],[1003450,12031],[8639162,22986],[80163,314],[2635247,4246],[16179,245],[9518,315],[267188,3155],[350731,9977],[14837,41],[1511220,5732],[703228,5708],[399449,4628],[1580631,30656],[6372,10],[1044783
,8738],[46864,670],[735759,19880],[4526977,220975],[4140383,66864],[6666637,120718],[5661625,28695],[1252713,5938],[514524,690],[34571873,35934],[636266,12241],[3533987,68815],[24254803,403155],[133264,1468],[6607,46],[30271,410],[9674,124],[17057,31],[25292,126],[6771,80],[841469,9646],[89409,1971],[2583470,18057],[51790,172],[7979,125],[3006155,2024],[11051,92],[1877895,21226],[1356318,10073],[25954,199],[27334,1361],[4072759,102595],[18823,147],[13980340,121852],[672797,16907],[63993,5046],[82501,1406],[2753981,27339],[4456903,14170],[57423,3163],[17786,125],[4797692,34712],[24964791,232112],[23460,138],[39530,290],[16976,12],[191496,4390],[1153361,29423],[17004718,101419],[6805,40],[2943,1],[172154,3632],[5532518,109920],[1067030,2349],[43230,846],[103436829,1191632],[25389,132],[1041317,7682],[175081,1016],[12019,14],[552695,5856],[11624000,43206],[3760,9],[11945,2159],[349749,4069],[266385,5740]
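
The three loops above can also be written without explicit iteration. This is an equivalent sketch, not the instructor's template: `paste0` with `collapse` builds each comma-separated string in one call. The helper names `js_strings` and `js_pairs` are hypothetical, and the sample values are the first three rows of `head(covid)` shown earlier.

```r
# Quote each element and join with commas: "a","b","c"
js_strings <- function(x) {
  paste0('"', x, '"', collapse = ",")
}

# Join two numeric vectors as [x1,y1],[x2,y2],...
js_pairs <- function(x, y) {
  # format() avoids scientific notation such as 1e+05 for large counts.
  fx <- format(x, scientific = FALSE, trim = TRUE)
  fy <- format(y, scientific = FALSE, trim = TRUE)
  paste0("[", fx, ",", fy, "]", collapse = ",")
}

regions <- c("EMRO", "EURO", "AFRO")
cases   <- c(235214, 335047, 272083)
deaths  <- c(7998, 3605, 6881)

cat(js_strings(regions), "\n")
cat(js_pairs(cases, deaths), "\n")
```

Applied to the real data, the calls would be `js_strings(covid$WHO_region)`, `js_strings(covid$Country_code)`, and `js_pairs(covid$Cumulative_cases, covid$Cumulative_deaths)`.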

Cumulative Cases

min(covid$Cumulative_cases)
## [1] 1403
max(covid$Cumulative_cases)
## [1] 103436829

Cumulative Deaths

min(covid$Cumulative_deaths)
## [1] 1
max(covid$Cumulative_deaths)
## [1] 1191632

We compute the quartiles of the variable Cumulative_cases

quantile(covid$Cumulative_cases)
##        0%       25%       50%       75%      100% 
##      1403     32319    267188   1502041 103436829
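
If the quartiles are meant to drive a size or color scale in the D3 chart (an assumption; the source does not say how they are used), each country can be assigned to its quartile bin with `cut`. The vector below is invented sample data, and the labels Q1 to Q4 are arbitrary.

```r
# Invented sample of cumulative case counts.
cases <- c(1403, 8359, 48015, 107481, 235214, 272083, 335047, 10113624)

# Quartile break points: 0%, 25%, 50%, 75%, 100%.
breaks <- quantile(cases)

# Label each value with the quartile interval it falls into.
bins <- cut(cases, breaks = breaks, include.lowest = TRUE,
            labels = c("Q1", "Q2", "Q3", "Q4"))

print(table(bins))
```

`include.lowest = TRUE` matters here: without it the minimum value falls outside the first interval and is returned as NA.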