---
title: "COVID-19 Global Analysis Dashboard"
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    theme: cosmo
    social: menu
    source_code: embed
---
```{r setup, include=FALSE}
# Load required libraries
library(flexdashboard)
library(tidyverse)
library(plotly)
library(DT)
library(lubridate)
library(scales)
library(viridis)
# Set options
knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)
```
```{r load-data}
# Load the JHU CSSE wide-format time series (one column per date)
covid <- read.csv("time_series_covid19_confirmed_global.csv")
# Transform to long format: one row per region and date.
# Columns 1-4 hold Province.State, Country.Region, Lat and Long.
covid_long <- covid %>%
  pivot_longer(
    cols = 5:ncol(covid),
    names_to = "Date",
    values_to = "Cases"
  ) %>%
  mutate(
    Date = str_replace(Date, "^X", ""),  # read.csv() prefixes date columns with "X"
    Date = mdy(Date)
  )
# Aggregate provinces/states to country level, then derive daily new cases
covid_country <- covid_long %>%
  group_by(Country.Region, Date) %>%
  summarise(Total_Cases = sum(Cases, na.rm = TRUE), .groups = "drop") %>%
  arrange(Country.Region, Date) %>%
  group_by(Country.Region) %>%
  mutate(
    # The first day has no previous value, so its total counts as new cases
    Daily_New_Cases = Total_Cases - lag(Total_Cases, default = 0),
    # Downward data corrections produce negative differences; clip them to zero
    Daily_New_Cases = pmax(Daily_New_Cases, 0)
  ) %>%
  ungroup()
# Global totals
covid_global <- covid_country %>%
  group_by(Date) %>%
  summarise(
    Global_Total_Cases = sum(Total_Cases),
    Global_Daily_Cases = sum(Daily_New_Cases),
    .groups = "drop"
  )
# Get latest date and statistics
latest_date <- max(covid_global$Date)
latest_global <- covid_global %>% filter(Date == latest_date)
# Top 10 countries by cumulative cases on the latest date
top10_latest <- covid_country %>%
  filter(Date == latest_date) %>%
  arrange(desc(Total_Cases)) %>%
  slice(1:10)
top10_countries_names <- top10_latest$Country.Region
# Filter data for top 10 countries over time
top10_trends <- covid_country %>%
  filter(Country.Region %in% top10_countries_names)
```
# Overview {data-icon="fa-globe"}
## Column {data-width="350"}
### Global Total Cases
```{r}
valueBox(
  comma(latest_global$Global_Total_Cases),
  icon = "fa-users",
  color = "#d9534f"
)
```
### Daily New Cases
```{r}
valueBox(
  comma(latest_global$Global_Daily_Cases),
  icon = "fa-chart-line",
  color = "#f0ad4e"
)
```
### Countries Affected
```{r}
n_countries <- covid_country %>%
  filter(Date == latest_date, Total_Cases > 0) %>%
  nrow()
valueBox(
  n_countries,
  icon = "fa-flag",
  color = "#5bc0de"
)
```
### Data Coverage Period
```{r}
date_range <- paste(
  format(min(covid_global$Date), "%b %d, %Y"),
  "to",
  format(max(covid_global$Date), "%b %d, %Y")
)
valueBox(
  date_range,
  icon = "fa-calendar",
  color = "#5cb85c"
)
```
## Column {data-width="650"}
### Global Cumulative Cases Over Time
```{r}
p_global <- ggplot(covid_global, aes(x = Date, y = Global_Total_Cases)) +
  # linewidth replaces the size argument for lines (deprecated in ggplot2 3.4.0)
  geom_line(color = "#d9534f", linewidth = 1) +
  scale_y_continuous(labels = comma) +
  labs(
    x = "Date",
    y = "Total Confirmed Cases",
    title = NULL
  ) +
  theme_minimal(base_size = 12)
ggplotly(p_global) %>%
  layout(hovermode = "x unified")
```
### Top 10 Countries by Total Cases
```{r}
p_top10 <- ggplot(top10_latest,
                  aes(x = reorder(Country.Region, Total_Cases),
                      y = Total_Cases,
                      fill = Total_Cases)) +
  geom_col() +
  scale_y_continuous(labels = comma) +
  scale_fill_viridis_c(option = "plasma", labels = comma) +
  coord_flip() +
  labs(
    x = NULL,
    y = "Total Confirmed Cases",
    title = NULL
  ) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "none")
ggplotly(p_top10)
```
# Country Trends {data-icon="fa-chart-line"}
## Column {data-width="600"}
### Time Series: Top 10 Countries
```{r}
p_trends <- ggplot(top10_trends,
                   aes(x = Date, y = Total_Cases, color = Country.Region)) +
  # linewidth replaces the size argument for lines (deprecated in ggplot2 3.4.0)
  geom_line(linewidth = 0.8) +
  scale_y_continuous(labels = comma) +
  scale_color_viridis_d(option = "turbo") +
  labs(
    x = "Date",
    y = "Total Confirmed Cases",
    color = "Country",
    title = NULL
  ) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "right")
ggplotly(p_trends) %>%
  layout(hovermode = "x unified", legend = list(orientation = "v"))
```
## Column {data-width="400"}
### Daily New Cases: Top 10 Countries
```{r}
p_daily <- ggplot(top10_trends,
                  aes(x = Date, y = Daily_New_Cases, color = Country.Region)) +
  # linewidth replaces the size argument for lines (deprecated in ggplot2 3.4.0)
  geom_line(linewidth = 0.6, alpha = 0.7) +
  scale_y_continuous(labels = comma) +
  scale_color_viridis_d(option = "turbo") +
  labs(
    x = "Date",
    y = "Daily New Cases",
    color = "Country",
    title = NULL
  ) +
  theme_minimal(base_size = 11) +
  theme(legend.position = "right")
ggplotly(p_daily) %>%
  layout(hovermode = "x unified")
```
### Peak Daily Cases by Country
```{r}
# Highest single-day count per country and the date it occurred
peak_daily <- top10_trends %>%
  group_by(Country.Region) %>%
  summarise(
    Max_Daily_Cases = max(Daily_New_Cases, na.rm = TRUE),
    Peak_Date = Date[which.max(Daily_New_Cases)],
    .groups = "drop"
  ) %>%
  arrange(desc(Max_Daily_Cases))
datatable(
  peak_daily,
  colnames = c("Country", "Peak Daily Cases", "Peak Date"),
  options = list(
    pageLength = 10,
    dom = 't',
    ordering = TRUE
  ),
  rownames = FALSE
) %>%
  formatCurrency("Max_Daily_Cases", currency = "", digits = 0) %>%
  formatDate("Peak_Date", method = "toLocaleDateString")
```
# Comparative Analysis {data-icon="fa-balance-scale"}
## Column {data-width="500"}
### Heatmap: Cases Intensity Over Time (Top 10 Countries)
```{r}
# Prepare data for heatmap: keep one observation per week to reduce size
heatmap_data <- top10_trends %>%
  filter(wday(Date) == 1) %>%  # Keep only Sundays (lubridate default: 1 = Sunday)
  mutate(
    Month_Year = format(Date, "%Y-%m"),
    Cases_Log = log10(Total_Cases + 1)  # Log scale; +1 avoids log10(0)
  )
p_heatmap <- ggplot(heatmap_data,
                    aes(x = Date, y = Country.Region, fill = Cases_Log)) +
  geom_tile() +
  scale_fill_viridis_c(
    option = "magma",
    name = "Cases (log10)",
    labels = function(x) comma(10^x - 1)  # Back-transform legend labels to counts
  ) +
  labs(
    x = "Date",
    y = NULL,
    title = NULL
  ) +
  theme_minimal(base_size = 11) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    panel.grid = element_blank()
  )
ggplotly(p_heatmap)
```
## Column {data-width="500"}
### Growth Rate Comparison
```{r}
# Calculate growth rate (% change in cumulative cases over 7 days)
growth_data <- covid_country %>%
  filter(Country.Region %in% top10_countries_names) %>%
  arrange(Country.Region, Date) %>%
  group_by(Country.Region) %>%
  mutate(
    Cases_7d_ago = lag(Total_Cases, 7),
    Growth_Rate = ((Total_Cases - Cases_7d_ago) / Cases_7d_ago) * 100
  ) %>%
  filter(Date >= latest_date - days(90)) %>%  # Last 90 days
  ungroup()
p_growth <- ggplot(growth_data,
                   aes(x = Date, y = Growth_Rate, color = Country.Region)) +
  # linewidth replaces the size argument for lines (deprecated in ggplot2 3.4.0)
  geom_line(linewidth = 0.7, alpha = 0.8) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray50") +
  scale_color_viridis_d(option = "turbo") +
  labs(
    x = "Date",
    y = "7-Day Growth Rate (%)",
    color = "Country",
    title = NULL
  ) +
  theme_minimal(base_size = 11)
ggplotly(p_growth) %>%
  layout(hovermode = "x unified")
```
### Latest Statistics: Top 10 Countries
```{r}
latest_stats <- covid_country %>%
  filter(Date == latest_date) %>%
  arrange(desc(Total_Cases)) %>%
  slice(1:10) %>%
  select(Country.Region, Total_Cases, Daily_New_Cases)
datatable(
  latest_stats,
  colnames = c("Country", "Total Cases", "Daily New Cases"),
  options = list(
    pageLength = 10,
    dom = 't',
    ordering = FALSE
  ),
  rownames = FALSE
) %>%
  formatCurrency(c("Total_Cases", "Daily_New_Cases"),
                 currency = "",
                 digits = 0)
```
# Data Table {data-icon="fa-table"}
### Complete Dataset (Latest Date)
```{r}
# Full data table for latest date
full_data <- covid_country %>%
  filter(Date == latest_date) %>%
  arrange(desc(Total_Cases)) %>%
  select(Country.Region, Total_Cases, Daily_New_Cases) %>%
  mutate(Rank = row_number())
datatable(
  full_data,
  colnames = c("Country/Region", "Total Cases", "Daily New Cases", "Rank"),
  options = list(
    pageLength = 25,
    order = list(list(3, 'asc')),  # Zero-based column index: 3 is Rank
    searchHighlight = TRUE,
    autoWidth = TRUE
  ),
  filter = 'top',
  rownames = FALSE
) %>%
  formatCurrency(c("Total_Cases", "Daily_New_Cases"),
                 currency = "",
                 digits = 0)
```
# About {data-icon="fa-info-circle"}
## Column
### About This Dashboard
**COVID-19 Global Analysis Dashboard**
This interactive dashboard provides comprehensive visualization and analysis of COVID-19 confirmed cases globally.
**Data Source:**

- Johns Hopkins University CSSE COVID-19 Data
- Time Series Data: `time_series_covid19_confirmed_global.csv`
- Repository: [GitHub](https://github.com/CSSEGISandData/COVID-19)

**Key Features:**
1. **Overview Tab**: Global statistics and key metrics
2. **Country Trends**: Time series analysis for top 10 affected countries
3. **Comparative Analysis**: Heatmaps and growth rate comparisons
4. **Data Table**: Searchable and sortable dataset
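
The growth rate comparison in item 3 is a percentage change in cumulative cases over 7 days. A minimal sketch on made-up numbers, in base R so it runs on its own. The vector `total_cases` and its values are hypothetical, and the zero-baseline guard is added here for illustration; the dashboard chunk divides directly.

```r
# Hypothetical cumulative totals for 10 consecutive days.
total_cases <- c(100, 110, 120, 130, 140, 150, 160, 200, 220, 240)

# Value 7 days earlier; the first 7 days have no comparison point (NA).
cases_7d_ago <- c(rep(NA, 7), head(total_cases, -7))

# Percentage change over 7 days; a zero baseline gives NA instead of Inf.
growth_rate <- ifelse(cases_7d_ago > 0,
                      (total_cases - cases_7d_ago) / cases_7d_ago * 100,
                      NA_real_)

print(round(growth_rate, 1))
```

Because the rate is computed on cumulative totals, it shrinks over time even when daily counts are steady, so it is best read as a relative comparison between countries over the same window.
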
**Visualizations Include:**
- Line charts showing cumulative and daily trends
- Bar charts comparing countries
- Heatmaps displaying intensity over time
- Interactive tables with filtering capabilities
**Technical Stack:**
- R version 4.4.2
- Packages: `flexdashboard`, `tidyverse`, `plotly`, `DT`, `lubridate`, `scales`, `viridis`
**Project Context:**
Data Visualization project to demonstrate:
- Data collection and preprocessing skills
- Exploratory data analysis techniques
- Interactive dashboard design principles
- Effective visual communication of complex data
**Data Coverage:**
- Start Date: January 22, 2020
- End Date: March 9, 2023
- Geographic Coverage: Global (290+ regions/countries)
**Interactivity Features:**
- Hover tooltips for detailed information
- Zoom and pan capabilities on charts
- Searchable and sortable data tables
- Responsive design for various screen sizes
**Notes:**
- Data reflects confirmed cases only (does not include deaths or recoveries)
- Some countries are reported by province/state in the source file; these rows are summed to country-level totals
- Daily new cases calculated as day-over-day differences
- Negative values (due to data corrections) are set to zero
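
The last two notes can be illustrated with a small sketch on a made-up cumulative series rather than the JHU data. It uses base R so it runs on its own; the vector `total_cases` and its values are hypothetical. The leading zero mirrors `lag(Total_Cases, default = 0)` in the data-loading chunk, so the first day's total counts as new cases.

```r
# Hypothetical cumulative counts for one country; the dip from 15 to 14
# stands in for a downward data correction.
total_cases <- c(2, 5, 9, 15, 14, 20)

# Day-over-day difference; the leading 0 makes day one count as new cases.
daily_raw <- diff(c(0, total_cases))

# Clip negative differences (corrections) to zero.
daily_new <- pmax(daily_raw, 0)

print(daily_raw)
print(daily_new)
```

One side effect of clipping is that the daily values no longer sum exactly to the final cumulative total whenever a correction occurred, so totals should be read from the cumulative series rather than rebuilt from daily counts.
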