Rendered dashboard snapshot (Overview value boxes):

  • Global Total Cases: 676,570,149
  • Daily New Cases: 177,325
  • Countries Affected: 201
  • Data Coverage Period: Jan 22, 2020 to Mar 09, 2023

Rendered panels: Global Cumulative Cases Over Time; Top 10 Countries by Total Cases; Heatmap: Cases Intensity Over Time (Top 10 Countries); Growth Rate Comparison; Latest Statistics: Top 10 Countries; Complete Dataset (Latest Date).

---
title: "COVID-19 Global Analysis Dashboard"
output: 
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    theme: cosmo
    social: menu
    source_code: embed
---

```{r setup, include=FALSE}
# Load required libraries
library(flexdashboard)
library(tidyverse)
library(plotly)
library(DT)
library(lubridate)
library(scales)
library(viridis)

# Set options
knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)
```

```{r load-data}
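# Load and process data.
# read.csv() rewrites headers into syntactic names: "Country/Region" becomes
# Country.Region and date columns such as "1/22/20" become X1.22.20.
covid <- read.csv("time_series_covid19_confirmed_global.csv")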

# Transform to long format
covid_long <- covid %>%
  pivot_longer(
    cols = 5:ncol(covid),
    names_to = "Date",
    values_to = "Cases"
  ) %>%
  mutate(
    Date = str_replace(Date, "^X", ""),
    Date = mdy(Date)
  )

# Aggregate by country
covid_country <- covid_long %>%
  group_by(Country.Region, Date) %>%
  summarise(Total_Cases = sum(Cases, na.rm = TRUE), .groups = "drop") %>%
  arrange(Country.Region, Date) %>%
  group_by(Country.Region) %>%
  mutate(
    Daily_New_Cases = Total_Cases - lag(Total_Cases, default = 0),
    Daily_New_Cases = pmax(Daily_New_Cases, 0)
  ) %>%
  ungroup()

# Global totals
covid_global <- covid_country %>%
  group_by(Date) %>%
  summarise(
    Global_Total_Cases = sum(Total_Cases),
    Global_Daily_Cases = sum(Daily_New_Cases),
    .groups = "drop"
  )

# Get latest date and statistics
latest_date <- max(covid_global$Date)
latest_global <- covid_global %>% filter(Date == latest_date)

# Top 10 countries
top10_latest <- covid_country %>%
  filter(Date == latest_date) %>%
  arrange(desc(Total_Cases)) %>%
  slice(1:10)

top10_countries_names <- top10_latest$Country.Region

# Filter data for top 10 countries over time
top10_trends <- covid_country %>%
  filter(Country.Region %in% top10_countries_names)
```

# Overview {data-icon="fa-globe"}

## Column {data-width="350"}

### Global Total Cases

```{r}
valueBox(
  comma(latest_global$Global_Total_Cases),
  icon = "fa-users",
  color = "#d9534f"
)
```

### Daily New Cases

```{r}
valueBox(
  comma(latest_global$Global_Daily_Cases),
  icon = "fa-chart-line",
  color = "#f0ad4e"
)
```

### Countries Affected

```{r}
n_countries <- covid_country %>%
  filter(Date == latest_date, Total_Cases > 0) %>%
  nrow()

valueBox(
  n_countries,
  icon = "fa-flag",
  color = "#5bc0de"
)
```

### Data Coverage Period

```{r}
date_range <- paste(
  format(min(covid_global$Date), "%b %d, %Y"),
  "to",
  format(max(covid_global$Date), "%b %d, %Y")
)

valueBox(
  date_range,
  icon = "fa-calendar",
  color = "#5cb85c"
)
```

## Column {data-width="650"}

### Global Cumulative Cases Over Time

```{r}
p_global <- ggplot(covid_global, aes(x = Date, y = Global_Total_Cases)) +
  geom_line(color = "#d9534f", linewidth = 1) +
  scale_y_continuous(labels = comma) +
  labs(
    x = "Date",
    y = "Total Confirmed Cases",
    title = NULL
  ) +
  theme_minimal(base_size = 12)

ggplotly(p_global) %>%
  layout(hovermode = "x unified")
```

### Top 10 Countries by Total Cases

```{r}
p_top10 <- ggplot(top10_latest, 
                  aes(x = reorder(Country.Region, Total_Cases), 
                      y = Total_Cases,
                      fill = Total_Cases)) +
  geom_col() +
  scale_y_continuous(labels = comma) +
  scale_fill_viridis_c(option = "plasma", labels = comma) +
  coord_flip() +
  labs(
    x = NULL,
    y = "Total Confirmed Cases",
    title = NULL
  ) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "none")

ggplotly(p_top10)
```

# Country Trends {data-icon="fa-chart-line"}

## Column {data-width="600"}

### Time Series: Top 10 Countries

```{r}
p_trends <- ggplot(top10_trends, 
                   aes(x = Date, y = Total_Cases, color = Country.Region)) +
  geom_line(linewidth = 0.8) +
  scale_y_continuous(labels = comma) +
  scale_color_viridis_d(option = "turbo") +
  labs(
    x = "Date",
    y = "Total Confirmed Cases",
    color = "Country",
    title = NULL
  ) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "right")

ggplotly(p_trends) %>%
  layout(hovermode = "x unified", legend = list(orientation = "v"))
```

## Column {data-width="400"}

### Daily New Cases: Top 10 Countries

```{r}
p_daily <- ggplot(top10_trends, 
                  aes(x = Date, y = Daily_New_Cases, color = Country.Region)) +
  geom_line(linewidth = 0.6, alpha = 0.7) +
  scale_y_continuous(labels = comma) +
  scale_color_viridis_d(option = "turbo") +
  labs(
    x = "Date",
    y = "Daily New Cases",
    color = "Country",
    title = NULL
  ) +
  theme_minimal(base_size = 11) +
  theme(legend.position = "right")

ggplotly(p_daily) %>%
  layout(hovermode = "x unified")
```

### Peak Daily Cases by Country

```{r}
peak_daily <- top10_trends %>%
  group_by(Country.Region) %>%
  summarise(
    Max_Daily_Cases = max(Daily_New_Cases, na.rm = TRUE),
    Peak_Date = Date[which.max(Daily_New_Cases)]
  ) %>%
  arrange(desc(Max_Daily_Cases))

datatable(
  peak_daily,
  colnames = c("Country", "Peak Daily Cases", "Peak Date"),
  options = list(
    pageLength = 10,
    dom = 't',
    ordering = TRUE
  ),
  rownames = FALSE
) %>%
  formatCurrency("Max_Daily_Cases", currency = "", digits = 0) %>%
  formatDate("Peak_Date", method = "toLocaleDateString")
```

# Comparative Analysis {data-icon="fa-balance-scale"}

## Column {data-width="500"}

### Heatmap: Cases Intensity Over Time (Top 10 Countries)

```{r}
# Prepare data for heatmap - sample every 7 days to reduce size
heatmap_data <- top10_trends %>%
  filter(wday(Date, week_start = 7) == 1) %>%  # Keep only Sundays (day 1 when weeks start on Sunday)
  mutate(
    Month_Year = format(Date, "%Y-%m"),
    Cases_Log = log10(Total_Cases + 1)  # Log scale for better visualization; +1 avoids log10(0)
  )

p_heatmap <- ggplot(heatmap_data, 
                    aes(x = Date, y = Country.Region, fill = Cases_Log)) +
  geom_tile() +
  scale_fill_viridis_c(
    option = "magma",
    name = "Cases (log scale)",
    labels = function(x) comma(10^x - 1)
  ) +
  labs(
    x = "Date",
    y = NULL,
    title = NULL
  ) +
  theme_minimal(base_size = 11) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    panel.grid = element_blank()
  )

ggplotly(p_heatmap)
```

## Column {data-width="500"}

### Growth Rate Comparison

```{r}
# Calculate growth rate (% change over 7 days)
growth_data <- covid_country %>%
  filter(Country.Region %in% top10_countries_names) %>%
  arrange(Country.Region, Date) %>%
  group_by(Country.Region) %>%
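  mutate(
    Cases_7d_ago = lag(Total_Cases, 7),
    # Guard against division by zero when no cases existed 7 days earlier
    Growth_Rate = if_else(
      Cases_7d_ago > 0,
      (Total_Cases - Cases_7d_ago) / Cases_7d_ago * 100,
      NA_real_
    )
  ) %>%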
  filter(Date >= latest_date - days(90)) %>%  # Last 90 days
  ungroup()

p_growth <- ggplot(growth_data, 
                   aes(x = Date, y = Growth_Rate, color = Country.Region)) +
  geom_line(linewidth = 0.7, alpha = 0.8) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray50") +
  scale_color_viridis_d(option = "turbo") +
  labs(
    x = "Date",
    y = "7-Day Growth Rate (%)",
    color = "Country",
    title = NULL
  ) +
  theme_minimal(base_size = 11)

ggplotly(p_growth) %>%
  layout(hovermode = "x unified")
```

### Latest Statistics: Top 10 Countries

```{r}
latest_stats <- covid_country %>%
  filter(Date == latest_date) %>%
  arrange(desc(Total_Cases)) %>%
  slice(1:10) %>%
  select(Country.Region, Total_Cases, Daily_New_Cases)

datatable(
  latest_stats,
  colnames = c("Country", "Total Cases", "Daily New Cases"),
  options = list(
    pageLength = 10,
    dom = 't',
    ordering = FALSE
  ),
  rownames = FALSE
) %>%
  formatCurrency(c("Total_Cases", "Daily_New_Cases"), 
                 currency = "", 
                 digits = 0)
```

# Data Table {data-icon="fa-table"}

### Complete Dataset (Latest Date)

```{r}
# Full data table for latest date
full_data <- covid_country %>%
  filter(Date == latest_date) %>%
  arrange(desc(Total_Cases)) %>%
  select(Country.Region, Total_Cases, Daily_New_Cases) %>%
  mutate(Rank = row_number())

datatable(
  full_data,
  colnames = c("Country/Region", "Total Cases", "Daily New Cases", "Rank"),
  options = list(
    pageLength = 25,
    order = list(list(3, 'asc')),
    searchHighlight = TRUE,
    autoWidth = TRUE
  ),
  filter = 'top',
  rownames = FALSE
) %>%
  formatCurrency(c("Total_Cases", "Daily_New_Cases"), 
                 currency = "", 
                 digits = 0)
```

# About {data-icon="fa-info-circle"}

## Column

### About This Dashboard

**COVID-19 Global Analysis Dashboard**

This interactive dashboard visualizes cumulative and daily COVID-19 confirmed cases worldwide, by country and over time.

**Data Source:**

-   Johns Hopkins University CSSE COVID-19 Data
-   Time Series Data: `time_series_covid19_confirmed_global.csv`
-   Repository: [GitHub](https://github.com/CSSEGISandData/COVID-19)

**Key Features:**

1.  **Overview Tab**: Global statistics and key metrics
2.  **Country Trends**: Time series analysis for top 10 affected countries
3.  **Comparative Analysis**: Heatmaps and growth rate comparisons
4.  **Data Table**: Searchable and sortable dataset

**Visualizations Include:**

-   Line charts showing cumulative and daily trends
-   Bar charts comparing countries
-   Heatmaps displaying intensity over time
-   Interactive tables with filtering capabilities

**Technical Stack:**

-   R version 4.4.2
-   Packages: `flexdashboard`, `tidyverse`, `plotly`, `DT`, `lubridate`, `scales`, `viridis`

**Project Context:**

This data visualization project demonstrates:

-   Data collection and preprocessing skills
-   Exploratory data analysis techniques
-   Interactive dashboard design principles
-   Effective visual communication of complex data

**Data Coverage:**

-   Start Date: January 22, 2020
-   End Date: March 9, 2023
-   Geographic Coverage: Global (201 countries/regions with confirmed cases after aggregation; the source file also lists province/state rows)

**Interactivity Features:**

-   Hover tooltips for detailed information
-   Zoom and pan capabilities on charts
-   Searchable and sortable data tables
-   Responsive design for various screen sizes

**Notes:**

-   Data reflects confirmed cases only (does not include deaths or recoveries)
-   Some countries include province/state subdivisions
-   Daily new cases are calculated as day-over-day differences in the cumulative totals
-   Negative differences (caused by downward data corrections) are set to zero
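
The last two notes can be illustrated with a small base-R sketch. The numbers are made up for a hypothetical country and are not taken from the dataset; the dip on day 4 stands in for a downward correction. The block uses a plain `r` fence, so knitr displays it without running it.

```r
# Toy cumulative series for one hypothetical country (illustrative values only).
total_cases <- c(0, 3, 10, 8, 15)

# Day-over-day difference, treating the count before the first day as 0.
# This mirrors Total_Cases - lag(Total_Cases, default = 0) in the dashboard.
daily_raw <- diff(c(0, total_cases))

# Clamp negative differences to zero, as pmax() does in the dashboard.
daily_new <- pmax(daily_raw, 0)

print(daily_raw)  # 0 3 7 -2 7
print(daily_new)  # 0 3 7 0 7

# Side effect of clamping: the daily values no longer sum to the final total.
print(sum(daily_new))         # 17
print(tail(total_cases, 1))   # 15
```

Because corrections are clamped rather than redistributed, summed daily figures can slightly overstate the change in the cumulative total, which is worth keeping in mind when comparing the two charts.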