Keywords: Visualization, R coding, dygraph, plotly, manipulateWidget, Time Series, Covid-19, Peru.
Hello! I’ve been learning R programming with RStudio, and this is my first completed visualization project, where I show what I’ve learned about data visualization. I studied communication and have experience in film production, and I was drawn to analytics through visualization. I am amazed by how much information a single graph can contain. Visualization is like a good movie: it shows the story and at the same time generates conversation and analysis.
This project is about the evolution of Covid-19 in Peru from March 6, 2020 to March 31, 2021. Peru, located in South America, has a population of 33 million and three distinct geographical regions: the desert west coast, the central high Andean mountains, and the eastern tropical Amazon. For more information about Peru, click here.
The data, presented at the national level, are cumulative counts across daily reports of a) confirmed cases, b) recovered cases, and c) deaths. Since March 15, 2020, Peru has been under a series of lockdowns that were relaxed as the number of cases slowed down and tightened as it started to increase again. The advance of Covid-19 reflects a combination of environmental conditions, health care measures, and the human response to continuous restrictions that mainly affected social and economic interactions. An important driver of the early increase in people affected by Covid-19 was the need of a large segment of Peru’s population to keep working through the pandemic in very low-paying jobs, which made it impossible for them to follow the government’s ‘stay in place’ regulations.
My visualization objectives are to graph the three cumulative series in a single plot, to graph the daily cases of each series in its own interactive plot, and to bring these four graphs together in a single figure that keeps the interactivity of the individual plots. Needless to say, I learned a lot completing these tasks and enjoyed each step of my learning curve.
The datasets were downloaded from the GitHub repository of the CSSE at Johns Hopkins University. Each dataset contains information at the national level: 274 rows in the confirmed and deaths datasets and 259 rows in the recovered dataset, with one column per day, for a total of 439 columns. Each date column is named after its date in month-day-year order. Each country corresponds to one row, except for Australia, Canada, China, Denmark, France, Netherlands and UK, which report information at the province level. Last access was on 03-31-2021.
The downloaded datasets are in wide format. Once they are read with read.csv, the date column names carry the prefix ‘X’, which R adds because a column name cannot start with a digit. I’ve used the base::substring command to remove this prefix. Next, with the command dplyr::inner_join, I’ve added to each dataset the variables from continents, which give the continent and region where each country is located.
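As a small illustration of where the prefix comes from, here is a sketch on a single made-up column name rather than the full dataset. It assumes the default read.csv behaviour (check.names = TRUE), which sanitises names with base::make.names; the variable names are only for this example.

```r
# Illustrative column name in the style of the date headers (an assumption)
raw_name <- "1/22/20"

# read.csv() applies make.names() to headers when check.names = TRUE (the default)
r_name <- make.names(raw_name)

# Drop the first character, the same way the real column names are cleaned
clean_name <- substring(r_name, 2)

print(r_name)
print(clean_name)
```

The cleaned name keeps the month, day and year separated by dots, which is a form lubridate::mdy can parse later.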
Then, I’ve converted each dataset from wide format to long format using the command tidyr::pivot_longer. As a result, each dataset has 5 columns and 67860 rows. The columns are Country.Region, continent, region, dates, and confirmed (for the confirmed cases dataset), recovered (for the recovered cases dataset), or deaths (for the deaths dataset). Finally, the variable dates was converted from character to date with the command lubridate::mdy.
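The wide-to-long step can be sketched on a tiny made-up table. The data frame wide_toy and its values are hypothetical; only the column naming pattern follows the real datasets.

```r
library(tidyr)
library(lubridate)

# Hypothetical wide table: one row per country, one column per date
wide_toy <- data.frame(
  Country.Region = c("A", "B"),
  `1.22.20` = c(0, 1),
  `1.23.20` = c(2, 3),
  check.names = FALSE
)

# Every date column becomes a row within each country
long_toy <- pivot_longer(
  wide_toy,
  cols = !Country.Region,
  names_to = "dates",
  values_to = "confirmed"
)

# Column names arrive as character strings; mdy() turns them into Date values
long_toy$dates <- mdy(long_toy$dates)

print(long_toy)
```

Two countries with two date columns give four rows, which is the same multiplication that produces the row count of the real long datasets.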
This project was made with:
library(readxl)
library(tidyverse)
library(magrittr)
library(lubridate)
library(dplyr)
library(xts)
library(tsibble)
library(slider)
library(dygraphs)
library(plotly)
library(manipulateWidget)
library(knitr)
Read Cumulative Confirmed Cases
confirmed <- read.csv("/Users/marcoarellano/Desktop/DATA SCIENCE/Covid 19/03.31.2021/DATA/time_series_covid19_confirmed_global.csv")
n_colsc <- dim(confirmed)[2]
n_rowsc <- dim(confirmed)[1]
names(confirmed)[5:n_colsc] <- substring(names(confirmed)[5:n_colsc],2)
tail(confirmed[, 1:6])
## Province.State Country.Region Lat Long 1.22.20 1.23.20
## 269 Venezuela 6.42380 -66.58970 0 0
## 270 Vietnam 14.05832 108.27720 0 2
## 271 West Bank and Gaza 31.95220 35.23320 0 0
## 272 Yemen 15.55273 48.51639 0 0
## 273 Zambia -13.13390 27.84933 0 0
## 274 Zimbabwe -19.01544 29.15486 0 0
confirmed dataset has 274 rows and 439 columns.
Read Cumulative Recovered Cases
recovered <- read.csv("/Users/marcoarellano/Desktop/DATA SCIENCE/COVID 19/03.31.2021/DATA/time_series_covid19_recovered_global.csv")
n_colsr <- dim(recovered)[2]
n_rowsr <- dim(recovered)[1]
names(recovered)[5:n_colsr] <- substring(names(recovered)[5:n_colsr],2)
tail(recovered[, 1:6])
## Province.State Country.Region Lat Long 1.22.20 1.23.20
## 254 Venezuela 6.42380 -66.58970 0 0
## 255 Vietnam 14.05832 108.27720 0 0
## 256 West Bank and Gaza 31.95220 35.23320 0 0
## 257 Yemen 15.55273 48.51639 0 0
## 258 Zambia -13.13390 27.84933 0 0
## 259 Zimbabwe -19.01544 29.15486 0 0
recovered dataset has 259 rows and 439 columns.
Read Cumulative Deaths Cases
deaths <- read.csv("/Users/marcoarellano/Desktop/DATA SCIENCE/COVID 19/03.31.2021/DATA/time_series_covid19_deaths_global.csv")
n_colsd <- dim(deaths)[2]
n_rowsd <- dim(deaths)[1]
names(deaths)[5:n_colsd] <- substring(names(deaths)[5:n_colsd],2)
tail(deaths[, 1:6])
## Province.State Country.Region Lat Long 1.22.20 1.23.20
## 269 Venezuela 6.42380 -66.58970 0 0
## 270 Vietnam 14.05832 108.27720 0 0
## 271 West Bank and Gaza 31.95220 35.23320 0 0
## 272 Yemen 15.55273 48.51639 0 0
## 273 Zambia -13.13390 27.84933 0 0
## 274 Zimbabwe -19.01544 29.15486 0 0
deaths dataset has 274 rows and 439 columns.
Read Continent and country list
continents <- read_excel("~/Desktop/DATA SCIENCE/COVID 19/03.31.2021/DATA/continents_Corrected.xlsx")
head(continents)
## # A tibble: 6 x 3
## Country.Region continent region
## <chr> <chr> <chr>
## 1 Afghanistan Asia Southern Asia
## 2 Albania Eastern Europe Southern Europe
## 3 Algeria Africa Northern Africa
## 4 Andorra Western Europe and other States Northern Europe
## 5 Angola Africa Middle Africa
## 6 Antigua and Barbuda Latin America and Caribbean States Caribbean
Here I create the long format for each dataset following these steps:
Use dplyr::inner_join() to add the continent and region of each country.
Use tidyr::pivot_longer to turn each date column into a row within each country.
Create the long format for accumulated confirmed cases (confirmed_long)
confirmed_long <- confirmed %>%
inner_join(continents, by = "Country.Region") %>%
pivot_longer (
cols = !c(Province.State, Country.Region, Lat, Long, continent, region),
names_to = c("dates"),
values_to = "confirmed") %>%
mutate(dates = mdy(dates)) %>%
group_by(Country.Region, continent, region, dates) %>%
summarise(confirmed = sum(confirmed)) %>%
ungroup()
n_colscl <- dim(confirmed_long)[2]
n_rowscl <- dim(confirmed_long)[1]
tail(confirmed_long)
## # A tibble: 6 x 5
## Country.Region continent region dates confirmed
## <chr> <chr> <chr> <date> <int>
## 1 Zimbabwe Africa Eastern Africa 2021-03-26 36805
## 2 Zimbabwe Africa Eastern Africa 2021-03-27 36818
## 3 Zimbabwe Africa Eastern Africa 2021-03-28 36822
## 4 Zimbabwe Africa Eastern Africa 2021-03-29 36839
## 5 Zimbabwe Africa Eastern Africa 2021-03-30 36839
## 6 Zimbabwe Africa Eastern Africa 2021-03-31 36882
confirmed_long dataset has 67860 rows and 5 columns.
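The group_by and summarise steps in the pipeline matter for the countries that report at province level, because they collapse several province rows into one country row per date. A minimal sketch with made-up numbers (toy_provinces is a hypothetical table, not part of the project data):

```r
library(dplyr)

# Hypothetical long table: one country split by province, one reported as a whole
toy_provinces <- tibble(
  Country.Region = c("Canada", "Canada", "Peru"),
  Province.State = c("Alberta", "Ontario", ""),
  dates = as.Date("2021-03-31"),
  confirmed = c(10L, 20L, 5L)
)

# Sum the provinces so that each country has a single row per date
toy_country <- toy_provinces %>%
  group_by(Country.Region, dates) %>%
  summarise(confirmed = sum(confirmed), .groups = "drop")

print(toy_country)
```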
Create the long format for accumulated recovered cases (recovered_long)
recovered_long <- recovered %>%
inner_join(continents, by = "Country.Region") %>%
pivot_longer (
cols = !c(Province.State, Country.Region, Lat, Long, continent, region),
names_to = c("dates"),
values_to = "recovered") %>%
mutate(dates = mdy(dates))%>%
group_by(Country.Region, continent, region, dates) %>%
summarise(recovered = sum(recovered)) %>%
ungroup()
n_colsrl <- dim(recovered_long)[2]
n_rowsrl <- dim(recovered_long)[1]
tail(recovered_long)
## # A tibble: 6 x 5
## Country.Region continent region dates recovered
## <chr> <chr> <chr> <date> <int>
## 1 Zimbabwe Africa Eastern Africa 2021-03-26 34572
## 2 Zimbabwe Africa Eastern Africa 2021-03-27 34575
## 3 Zimbabwe Africa Eastern Africa 2021-03-28 34603
## 4 Zimbabwe Africa Eastern Africa 2021-03-29 34617
## 5 Zimbabwe Africa Eastern Africa 2021-03-30 34617
## 6 Zimbabwe Africa Eastern Africa 2021-03-31 34686
recovered_long dataset has 67860 rows and 5 columns.
Create the long format for accumulated deaths (deaths_long)
deaths_long <- deaths %>%
inner_join(continents, by = "Country.Region") %>%
pivot_longer (
cols = !c(Province.State, Country.Region, Lat, Long, continent, region),
names_to = c("dates"),
values_to = "deaths") %>%
mutate(dates = mdy(dates)) %>%
group_by(Country.Region, continent, region, dates) %>%
summarise(deaths= sum(deaths)) %>%
ungroup()
n_colsdl <- dim(deaths_long)[2]
n_rowsdl <- dim(deaths_long)[1]
tail(deaths_long)
## # A tibble: 6 x 5
## Country.Region continent region dates deaths
## <chr> <chr> <chr> <date> <int>
## 1 Zimbabwe Africa Eastern Africa 2021-03-26 1518
## 2 Zimbabwe Africa Eastern Africa 2021-03-27 1519
## 3 Zimbabwe Africa Eastern Africa 2021-03-28 1520
## 4 Zimbabwe Africa Eastern Africa 2021-03-29 1520
## 5 Zimbabwe Africa Eastern Africa 2021-03-30 1520
## 6 Zimbabwe Africa Eastern Africa 2021-03-31 1523
deaths_long dataset has 67860 rows and 5 columns.
Below, I create a new column that corresponds to the number of daily cases in each dataset.
Daily cases are obtained by subtracting the cumulative count on day (j-1) from the cumulative count on day (j); the difference gives us the increase in cases in a single day. To do this, I use the dplyr::lag command, which returns the number of cases on the day before. With the lag() option default = 0, the lagged value for the first observation is 0, so the daily value for the first day equals the cumulative value observed that day.
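A minimal sketch of this subtraction on a short vector. The five numbers are the first cumulative confirmed counts for Peru that appear in the output further down; the variable names are only for this example.

```r
library(dplyr)

# First five cumulative confirmed counts for Peru (2020-03-06 to 2020-03-10)
cumulative <- c(1, 1, 6, 7, 11)

# lag() shifts the series by one day; default = 0 fills the missing first value
daily <- cumulative - lag(cumulative, default = 0)

print(daily)
```

The result matches the confirmed_dailycases column for those dates, and summing the daily values recovers the last cumulative count.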
Create the variable confirmed_dailycases
confirmed_long <- confirmed_long %>%
arrange(dates) %>%
group_by(Country.Region) %>%
mutate(confirmed_dailycases = confirmed - lag(confirmed, default = 0)) %>%
ungroup()
tail(confirmed_long)
## # A tibble: 6 x 6
## Country.Region continent region dates confirmed confirmed_daily…
## <chr> <chr> <chr> <date> <int> <dbl>
## 1 Vanuatu Asia Melanesia 2021-03-31 3 0
## 2 Venezuela Latin America … South Am… 2021-03-31 160497 1348
## 3 Vietnam Asia South-Ea… 2021-03-31 2603 9
## 4 Yemen Asia Western … 2021-03-31 4357 110
## 5 Zambia Africa Eastern … 2021-03-31 88418 219
## 6 Zimbabwe Africa Eastern … 2021-03-31 36882 43
Create the variable recovered_dailycases
recovered_long <- recovered_long %>%
arrange(dates) %>%
group_by(Country.Region) %>%
mutate(recovered_dailycases = recovered - lag(recovered, default = 0)) %>%
ungroup()
tail(recovered_long)
## # A tibble: 6 x 6
## Country.Region continent region dates recovered recovered_daily…
## <chr> <chr> <chr> <date> <int> <dbl>
## 1 Vanuatu Asia Melanesia 2021-03-31 1 0
## 2 Venezuela Latin America … South Am… 2021-03-31 147846 683
## 3 Vietnam Asia South-Ea… 2021-03-31 2359 0
## 4 Yemen Asia Western … 2021-03-31 1676 9
## 5 Zambia Africa Eastern … 2021-03-31 84592 73
## 6 Zimbabwe Africa Eastern … 2021-03-31 34686 69
Create the variable deaths_dailycases
deaths_long <- deaths_long %>%
arrange(dates) %>%
group_by(Country.Region) %>%
mutate(deaths_dailycases = deaths - lag(deaths, default = 0)) %>%
ungroup()
tail(deaths_long)
## # A tibble: 6 x 6
## Country.Region continent region dates deaths deaths_dailycas…
## <chr> <chr> <chr> <date> <int> <dbl>
## 1 Vanuatu Asia Melanesia 2021-03-31 0 0
## 2 Venezuela Latin America an… South Ame… 2021-03-31 1602 13
## 3 Vietnam Asia South-Eas… 2021-03-31 35 0
## 4 Yemen Asia Western A… 2021-03-31 888 6
## 5 Zambia Africa Eastern A… 2021-03-31 1208 6
## 6 Zimbabwe Africa Eastern A… 2021-03-31 1523 3
Here I select the data for my country, Peru, following these steps:
Use dplyr::filter to select the Peru data in each dataset.
Use dplyr::full_join to create a single dataset with all three series: confirmed, recovered and deaths. The variable dates is used to join the three Peru datasets.
Peru_confirmed <- confirmed_long %>%
filter(Country.Region %in% "Peru") %>%
select(Country.Region, dates, confirmed, confirmed_dailycases)
Peru_confirmed <- Peru_confirmed[,-c(1)]
Peru_recovered <- recovered_long %>%
filter(Country.Region %in% "Peru") %>%
select(Country.Region, dates, recovered, recovered_dailycases)
Peru_recovered <- Peru_recovered[,-c(1)]
Peru_deaths <- deaths_long %>%
filter(Country.Region %in% "Peru") %>%
select(Country.Region, dates, deaths, deaths_dailycases)
Peru_deaths <- Peru_deaths[,-c(1)]
Combine the three datasets in the new dataset Peru_global
Peru_global <- Peru_confirmed %>%
full_join(Peru_recovered, by = "dates") %>%
full_join(Peru_deaths, by = "dates") %>%
mutate(deaths_100k = ceiling((deaths/32625948)*10^5))
n_colspg <- dim(Peru_global)[2]
n_rowspg <- dim(Peru_global)[1]
tail(Peru_global)
## # A tibble: 6 x 8
## dates confirmed confirmed_daily… recovered recovered_daily… deaths
## <date> <int> <dbl> <int> <dbl> <int>
## 1 2021-03-26 1512384 11919 1423259 16955 51032
## 2 2021-03-27 1520973 8589 1432450 9191 51238
## 3 2021-03-28 1529882 8909 1442405 9955 51469
## 4 2021-03-29 1533121 3239 1451112 8707 51635
## 5 2021-03-30 1533121 0 1451112 0 51635
## 6 2021-03-31 1548807 15686 1468457 17345 52008
## # … with 2 more variables: deaths_dailycases <dbl>, deaths_100k <dbl>
Peru_global dataset has 435 rows and 8 columns.
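The join and the deaths per 100,000 calculation can be checked on two rows. This is a sketch that reuses the last two dates from the output above and the same population figure as the code; the toy_ tables are hypothetical names.

```r
library(dplyr)

# Two dates taken from the tail of Peru_global shown above
toy_confirmed <- tibble(
  dates = as.Date("2021-03-30") + 0:1,
  confirmed = c(1533121L, 1548807L)
)
toy_deaths <- tibble(
  dates = as.Date("2021-03-30") + 0:1,
  deaths = c(51635L, 52008L)
)

# Join on the shared date column, then scale deaths to 100,000 inhabitants
toy_global <- toy_confirmed %>%
  full_join(toy_deaths, by = "dates") %>%
  mutate(deaths_100k = ceiling((deaths / 32625948) * 10^5))

print(toy_global)
```

Because ceiling rounds up, 52008 deaths (about 159.4 per 100,000) is reported as 160.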
Next, I create a 7-day rolling average variable for the confirmed, recovered, and deaths variables in the dataset Peru_global.
First, I transform the Peru_global dataset into a time series object with the command tsibble::as_tsibble.
nr <- nrow(Peru_global)
Peru_global$rid <- seq(1, nr ,1)
Peru_global_ts <- as_tsibble(Peru_global,
key = rid,
index = dates)
The 7-day rolling average takes seven consecutive values and calculates their average. This average is paired with the central date of the 7-day interval, which corresponds to the 4th date. The following 7-day interval is created by dropping the earliest date of the interval and adding the date after its latest date.
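The centered window can be tried on seven days of data. This sketch uses the first seven daily confirmed values for Peru from the output in this post; the vector names are only for this example, and it relies on slide_index_dbl computing partial windows at the edges, which I understand to be its default.

```r
library(slider)

# Seven consecutive dates and their daily confirmed cases
toy_dates <- as.Date("2020-03-06") + 0:6
toy_daily <- c(1, 0, 5, 1, 4, 0, 4)

# Three days before and three days after each date: a centered 7-day window
toy_avg7 <- slide_index_dbl(
  .x = toy_daily,
  .i = toy_dates,
  .f = mean,
  .before = 3,
  .after = 3
)

print(round(toy_avg7, 2))
```

Only the 4th date has a full window here, so its value is the mean of all seven observations. The first date averages just itself and the three days after it.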
I compute the rolling averages with slider::slide_index_dbl.
Peru_global_ts <- Peru_global_ts %>%
filter_index("2020-03-06" ~ .) %>%
mutate(confirmed7_dailycases = slide_index_dbl(.i = dates,
.x = confirmed_dailycases,
.f = mean,
.before = 3,
.after= 3),
recovered7_dailycases = slide_index_dbl(.i = dates,
.x = recovered_dailycases,
.f = mean,
.before = 3,
.after = 3),
deaths7_dailycases = slide_index_dbl(.i = dates,
.x = deaths_dailycases,
.f = mean,
.before = 3,
.after = 3),
confirmed7 = slide_index_dbl(.i = dates,
.x = confirmed,
.f = mean,
.before = 3,
.after = 3),
recovered7 = slide_index_dbl(.i = dates,
.x = recovered,
.f = mean,
.before = 3,
.after = 3),
deaths7 = slide_index_dbl(.i = dates,
.x = deaths,
.f = mean,
.before = 3,
.after = 3))
head(Peru_global_ts)
## # A tsibble: 6 x 15 [1D]
## # Key: rid [6]
## dates confirmed confirmed_daily… recovered recovered_daily… deaths
## <date> <int> <dbl> <int> <dbl> <int>
## 1 2020-03-06 1 1 0 0 0
## 2 2020-03-07 1 0 0 0 0
## 3 2020-03-08 6 5 0 0 0
## 4 2020-03-09 7 1 0 0 0
## 5 2020-03-10 11 4 0 0 0
## 6 2020-03-11 11 0 0 0 0
## # … with 9 more variables: deaths_dailycases <dbl>, deaths_100k <dbl>,
## # rid <dbl>, confirmed7_dailycases <dbl>, recovered7_dailycases <dbl>,
## # deaths7_dailycases <dbl>, confirmed7 <dbl>, recovered7 <dbl>, deaths7 <dbl>
print(Peru_global_ts)
## # A tsibble: 391 x 15 [1D]
## # Key: rid [391]
## dates confirmed confirmed_daily… recovered recovered_daily… deaths
## <date> <int> <dbl> <int> <dbl> <int>
## 1 2020-03-06 1 1 0 0 0
## 2 2020-03-07 1 0 0 0 0
## 3 2020-03-08 6 5 0 0 0
## 4 2020-03-09 7 1 0 0 0
## 5 2020-03-10 11 4 0 0 0
## 6 2020-03-11 11 0 0 0 0
## 7 2020-03-12 15 4 0 0 0
## 8 2020-03-13 28 13 0 0 0
## 9 2020-03-14 38 10 0 0 0
## 10 2020-03-15 43 5 0 0 0
## # … with 381 more rows, and 9 more variables: deaths_dailycases <dbl>,
## # deaths_100k <dbl>, rid <dbl>, confirmed7_dailycases <dbl>,
## # recovered7_dailycases <dbl>, deaths7_dailycases <dbl>, confirmed7 <dbl>,
## # recovered7 <dbl>, deaths7 <dbl>
I use the dygraphs library to graph interactive time series of daily confirmed cases, recovered cases and deaths. Each plot has 2 variables: the daily number of cases and its 7-day rolling average.
The interactive graph allows zooming in on selected time intervals for a more detailed view of the series.
First Graph is for the Number of Daily Confirmed Cases.
peru_int_confirmeddaily <- cbind(Peru_global_ts[c(1, 3, 10)])
peru_int_confirmeddaily$confirmed7_dailycases <- round(peru_int_confirmeddaily$confirmed7_dailycases, 0)
rownames( peru_int_confirmeddaily) <- as.POSIXlt( peru_int_confirmeddaily[, 1])
ts_peru_int_confirmeddaily <- peru_int_confirmeddaily[, -1]
dygraph(ts_peru_int_confirmeddaily,
main = "Confirmed Covid-19 Daily cases") %>%
dySeries("confirmed_dailycases", stepPlot = TRUE,
fillGraph = TRUE, color = "lightblue", label = "Confirmed Daily Cases") %>%
dySeries("confirmed7_dailycases", drawPoints = TRUE,
pointShape = "square", color = "darkblue", label = "Rolling Avg 7") %>%
dyRangeSelector(height = 20) %>%
dyLegend(width = 300)
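As an alternative sketch, not the method used above, the same kind of series can be handed to dygraph as an xts object, which carries the dates as its index instead of as row names. The rolling average values below are made up for illustration, and toy_series is a hypothetical name.

```r
library(xts)
library(dygraphs)

# Five dates with daily cases from this post and made-up smoothed values
toy_dates <- as.Date("2020-03-06") + 0:4
toy_values <- cbind(
  confirmed_dailycases = c(1, 0, 5, 1, 4),
  confirmed7_dailycases = c(2, 2, 2, 3, 3)  # illustrative only
)

# xts pairs the numeric matrix with its date index
toy_series <- xts(toy_values, order.by = toy_dates)

# Build the widget step by step without relying on a pipe operator
g <- dygraph(toy_series, main = "Confirmed Covid-19 Daily cases")
g <- dySeries(g, "confirmed_dailycases", stepPlot = TRUE,
              fillGraph = TRUE, color = "lightblue")
g <- dyRangeSelector(g, height = 20)
```

Printing g in RStudio or in an R Markdown document displays the interactive chart.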
Second Graph is for the Number of Daily Recovered Cases.
peru_int_recovereddaily <- cbind(Peru_global_ts[c(1, 5, 11)])
peru_int_recovereddaily$recovered7_dailycases <- round(peru_int_recovereddaily$recovered7_dailycases, 0)
rownames( peru_int_recovereddaily) <- as.POSIXlt( peru_int_recovereddaily[, 1])
ts_peru_int_recovereddaily <- peru_int_recovereddaily[, -1]
dygraph(ts_peru_int_recovereddaily,
main = " Recovered Covid-19 Daily cases") %>%
dySeries("recovered_dailycases", stepPlot = TRUE,
fillGraph = TRUE, color = "turquoise", label = "Recovered Daily Cases") %>%
dySeries("recovered7_dailycases", drawPoints = TRUE,
pointShape = "circle", color = "green", label = "Rolling Avg 7") %>%
dyRangeSelector(height = 20) %>%
dyLegend(width = 300)
Third Graph is for the Number of Daily Deaths.
peru_int_deathsdaily <- cbind(Peru_global_ts[c(1, 7, 12)])
peru_int_deathsdaily$deaths7_dailycases <- round(peru_int_deathsdaily$deaths7_dailycases, 0)
rownames( peru_int_deathsdaily) <- as.POSIXlt( peru_int_deathsdaily[, 1])
ts_peru_int_deathsdaily <- peru_int_deathsdaily[, -1]
dygraph(ts_peru_int_deathsdaily,
main = "Covid-19 Daily Deaths") %>%
dySeries("deaths_dailycases", stepPlot = TRUE, fillGraph = TRUE,
color = "orange", label = "Deaths Daily Cases") %>%
dySeries("deaths7_dailycases", drawPoints = TRUE, pointShape = "square",
color = "red", label = "Rolling Avg 7") %>%
dyRangeSelector(height = 20) %>%
dyLegend(width = 285)
The plotly library is used to create the last graph. For this graph I use the 7-day rolling average of the accumulated values of our 3 variables: confirmed, recovered and deaths.
This chart has 2 y-axes. The left y-axis corresponds to the values of confirmed and recovered cases, and the right y-axis corresponds to the values of deaths. I chose two y-axes because the confirmed and recovered values have a similar range, while the death values have a much lower range. The second y-axis makes the trend in deaths easier to see.
plot_ly() %>%
add_trace(x = ~Peru_global_ts$dates, y = ~ round(Peru_global_ts$confirmed7, 0), name = "Confirmed",
type = 'scatter', mode = 'lines', line = list(color = 'blue', width = 4),
hoverinfo = "text",
text = ~paste("Date: ", Peru_global_ts$dates,
"<br>",
"Confirmed: ", round(Peru_global_ts$confirmed7, 0))) %>%
add_trace(x = ~Peru_global_ts$dates, y = ~round(Peru_global_ts$recovered7, 0), name = "Recovered",
type = 'scatter', mode = 'lines', line = list(color = 'green', width = 4),
hoverinfo = "text",
text = ~paste("Date: ", Peru_global_ts$dates,
"<br>",
"Recovered: ", round(Peru_global_ts$recovered7, 0))) %>%
add_trace(x = ~Peru_global_ts$dates, y = ~round(Peru_global_ts$deaths7, 0), name = "Deaths", yaxis = "y2",
type = 'scatter', mode = 'lines', line = list(color = 'red', width = 4),
hoverinfo = "text",
text = ~paste("Date: ",Peru_global_ts$dates,
"<br>",
"Deaths: ", round(Peru_global_ts$deaths7, 0))) %>%
layout( title = list(text ="7-day Rolling Average Cumulative Covid-19 cases Peru 2020-2021",
size = 10),
yaxis2 = list(tickfont = list(color = "red"),
overlaying = "y",
side = "right",
title = "Cumulative Deaths",
showgrid = FALSE),
xaxis = list(title = "Dates",
color = "black"),
yaxis = list(tickangle = 0,
title = "Cumulative Confirmed and Recovered <br><br><br>",
standoff = 90,
showgrid = FALSE),
legend = list(orientation = "h",
xanchor = "center",
x = 0.5,
y = -0.2),
autosize = T,
margin = list(l = 100, r = 100, b = 100, t = 100, pad = 20))
After all the graphs are ready, I combine them in a single figure that has two columns: the left column has the combined cumulative series graph, and the right column has the three individual graphs of daily cases.
I use the command manipulateWidget::combineWidgets. This command joins our interactive graphs in a single figure quickly and easily.
First, I create a function that combines the three cumulative series in a single graph.
cumulates_plotly <- function(id){plot_ly() %>%
add_trace(x = ~Peru_global_ts$dates, y = ~ round(Peru_global_ts$confirmed7, 0), name = "Confirmed",
type = 'scatter', mode = 'lines', line = list(color = 'blue', width = 4),
hoverinfo = "text",
text = ~paste("Date: ", Peru_global_ts$dates,
"<br>",
"Confirmed: ", round(Peru_global_ts$confirmed7, 0))) %>%
add_trace(x = ~Peru_global_ts$dates, y = ~round(Peru_global_ts$recovered7, 0), name = "Recovered",
type = 'scatter', mode = 'lines', line = list(color = 'green', width = 4),
hoverinfo = "text",
text = ~paste("Date: ", Peru_global_ts$dates,
"<br>",
"Recovered: ", round(Peru_global_ts$recovered7, 0))) %>%
add_trace(x = ~Peru_global_ts$dates, y = ~round(Peru_global_ts$deaths7, 0), name = "Deaths", yaxis = "y2",
type = 'scatter', mode = 'lines', line = list(color = 'red', width = 4),
hoverinfo = "text",
text = ~paste("Date: ", Peru_global_ts$dates,
"<br>",
"Deaths: ", round(Peru_global_ts$deaths7, 0))) %>%
layout( title = list(text = "7-day Rolling Average Cumulative Covid-19 cases Peru 2020-2021",
size = 10),
yaxis2 = list(tickfont = list(color = "red"),
overlaying = "y",
side = "right",
title = "Cumulative Deaths",
showgrid= FALSE),
xaxis = list(title = "Dates",
color = "black"),
yaxis = list(tickangle =0,
title = "Cumulative Confirmed and Recovered <br><br><br>",
standoff = 90,
showgrid = FALSE),
legend = list(orientation = "h",
xanchor = "center",
x = 0.5,
y = -0.2),
autosize = T,
margin = list(l = 100, r = 100, b = 100, t = 100, pad = 20))}
Second, I create a function to define each component of the right column in the final figure.
c1 <-function(id){dygraph(ts_peru_int_confirmeddaily,
main = "Confirmed Covid-19 Daily cases") %>%
dySeries("confirmed_dailycases", stepPlot = TRUE,
fillGraph = TRUE, color = "lightblue", label = "Confirmed Daily Cases") %>%
dySeries("confirmed7_dailycases", drawPoints = TRUE,
pointShape = "square", color = "darkblue", label = "Rolling Avg 7") %>%
dyRangeSelector(height = 20) %>%
dyLegend(width = 300)}
r1<-function(id){dygraph(ts_peru_int_recovereddaily,
main = " Recovered Covid-19 Daily cases") %>%
dySeries("recovered_dailycases", stepPlot = TRUE,
fillGraph = TRUE, color = "turquoise", label = "Recovered Daily Cases") %>%
dySeries("recovered7_dailycases", drawPoints = TRUE,
pointShape = "circle", color = "green", label = "Rolling Avg 7") %>%
dyRangeSelector(height = 20) %>%
dyLegend(width = 300)}
d1<-function(id){dygraph(ts_peru_int_deathsdaily,
main = "Covid-19 Daily Deaths") %>%
dySeries("deaths_dailycases", stepPlot = TRUE, fillGraph = TRUE,
color = "orange", label = "Deaths Daily Cases") %>%
dySeries("deaths7_dailycases", drawPoints = TRUE, pointShape = "square",
color = "red", label = "Rolling Avg 7") %>%
dyRangeSelector(height = 20) %>%
dyLegend(width = 285)}
To conclude, I use combineWidgets to arrange the charts in the final figure. Additionally, I create the function write_alt_text to add an alternative text to the graph.
write_alt_text <- function(
chart_type,
type_of_data,
reason,
source){glue::glue(
"{chart_type} of {type_of_data} where {reason}.<br> \n\nData source from {source}")}
combineWidgets(
ncol = 2, colsize = c(2,1),
cumulates_plotly(1),
title = "Covid-19 Peru Interactive Time Series",
footer = write_alt_text(
"<br/>Time Series",
"confirmed, recovered and deaths cases from Covid-19 in Peru",
"information about the evolution of Covid-19 is needed",
"MINSA-Peru/ CSSE-Johns Hopkins University.<br>Made by Marco Arellano B. Twitter: marellanob93, Github: marellanob"),
combineWidgets(
ncol = 1,
c1(2),
r1(3),
d1(4)))