Quarto Report

Author

Yash/Souvik

Quarto Report: Global COVID-19 Analysis

Introduction

The COVID-19 pandemic has affected countries and regions worldwide. In this report, we explore global trends by analyzing key COVID-19 metrics such as confirmed cases, deaths, recoveries, and active cases. The goal is to highlight insights about the most impacted countries, compare specific countries, and present a broader view of the global pandemic situation.


Data Overview

We used the COVID-19 Global Dataset to analyze the following key metrics:

  • Confirmed cases: Total number of confirmed cases of COVID-19.

  • Deaths: Total number of deaths due to COVID-19.

  • Recovered cases: Total number of recovered individuals.

  • Active cases: Number of currently active cases of COVID-19.

Below is a preview of the dataset:

# Load necessary libraries
library(readr)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(tidyr)

# Load the COVID-19 dataset (adjust the path if needed)
covid_data <- read_csv("C:/Users/yashp/OneDrive/Documents/worldometer_coronavirus_summary_data.csv")
Rows: 226 Columns: 12
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (2): country, continent
dbl (10): total_confirmed, total_deaths, total_recovered, active_cases, seri...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Preview the first few rows of the dataset
head(covid_data)
# A tibble: 6 × 12
  country    continent total_confirmed total_deaths total_recovered active_cases
  <chr>      <chr>               <dbl>        <dbl>           <dbl>        <dbl>
1 Afghanist… Asia               179267         7690          162202         9375
2 Albania    Europe             275574         3497          271826          251
3 Algeria    Africa             265816         6875          178371        80570
4 Andorra    Europe              42156          153           41021          982
5 Angola     Africa              99194         1900           97149          145
6 Anguilla   North Am…            2984            9            2916           59
# ℹ 6 more variables: serious_or_critical <dbl>,
#   total_cases_per_1m_population <dbl>, total_deaths_per_1m_population <dbl>,
#   total_tests <dbl>, total_tests_per_1m_population <dbl>, population <dbl>
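
The four metrics appear to be linked by simple accounting: confirmed cases should equal deaths plus recoveries plus active cases. The following is a minimal sketch of that sanity check, run on five rows copied from the preview above (the row whose country name is truncated in the preview is left out). The helper name `check_case_accounting` is our own and is not part of dplyr, and the sketch does not handle missing values that the full dataset may contain.

```r
library(dplyr)

# Five rows copied from the head(covid_data) preview in this report.
preview <- data.frame(
  country         = c("Albania", "Algeria", "Andorra", "Angola", "Anguilla"),
  total_confirmed = c(275574, 265816, 42156, 99194, 2984),
  total_deaths    = c(3497, 6875, 153, 1900, 9),
  total_recovered = c(271826, 178371, 41021, 97149, 2916),
  active_cases    = c(251, 80570, 982, 145, 59)
)

# Hypothetical helper: compares confirmed cases with the sum of the
# three outcome columns and reports the difference per country.
check_case_accounting <- function(data) {
  data %>%
    mutate(
      accounted = total_deaths + total_recovered + active_cases,
      gap       = total_confirmed - accounted
    ) %>%
    select(country, total_confirmed, accounted, gap)
}

accounting <- check_case_accounting(preview)
print(accounting)
```

For these five rows the gap is zero. With the full data, `check_case_accounting(covid_data)` could be used to flag countries where the totals disagree.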

Top 10 Countries by Confirmed Cases

Let’s start by analyzing the top 10 countries by the total number of confirmed COVID-19 cases.

# Filter and arrange top 10 countries by confirmed cases
top_countries <- covid_data %>%
  arrange(desc(total_confirmed)) %>%
  slice_head(n = 10)

# Plotting the top 10 countries by confirmed cases
library(ggplot2)

ggplot(top_countries, aes(x = reorder(country, total_confirmed), y = total_confirmed)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(
    title = "Top 10 Countries by Confirmed COVID-19 Cases",
    x = "Country",
    y = "Total Confirmed Cases"
  ) +
  theme_minimal()

Interpretation:

The plot above shows the 10 countries with the highest number of confirmed COVID-19 cases. The USA, India, and Brazil are among the most impacted. Raw totals alone do not explain why: they also reflect population size and testing volume, so factors such as population density, healthcare system capacity, and government response are plausible contributors rather than conclusions this plot can support.
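
Since raw totals favor populous countries, it can help to rank by other columns as well, such as the `total_cases_per_1m_population` column listed in the dataset preview. The sketch below wraps the arrange-and-slice step in a small helper so the ranking column can be swapped. The name `top_by` is hypothetical (it is not a dplyr function), and the demonstration uses rows copied from the preview rather than the full dataset.

```r
library(dplyr)

# Hypothetical helper: return the n rows with the largest values
# in the column whose name is given as a string.
top_by <- function(data, column, n = 10) {
  data %>%
    arrange(desc(.data[[column]])) %>%
    slice_head(n = n)
}

# Rows copied from the head(covid_data) preview in this report.
preview <- data.frame(
  country         = c("Albania", "Algeria", "Andorra", "Angola", "Anguilla"),
  total_confirmed = c(275574, 265816, 42156, 99194, 2984)
)

top3 <- top_by(preview, "total_confirmed", n = 3)
print(top3)
```

With the full data loaded, `top_by(covid_data, "total_cases_per_1m_population")` would give the per-capita ranking, which may differ noticeably from the ranking by raw totals.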

Comparison of Two Countries

We now compare the COVID-19 metrics (confirmed cases, deaths, recoveries, and active cases) of two countries. Any pair from the dataset can be used; here, we use the USA and India as examples.

# Filter data for USA and India
country_comparison <- covid_data %>%
  filter(country %in% c("USA", "India")) %>%
  select(country, total_confirmed, total_deaths, total_recovered, active_cases) %>%
  pivot_longer(cols = -country, names_to = "metric", values_to = "value")

# Plotting the comparison between USA and India
ggplot(country_comparison, aes(x = metric, y = value, fill = country)) +
  geom_col(position = "dodge") +
  labs(
    title = "COVID-19 Comparison: USA vs India",
    x = "Metric",
    y = "Value"
  ) +
  theme_minimal()

Interpretation:

From this plot, we can observe how the USA and India compare across key COVID-19 metrics:

  • The USA has a significantly higher number of total confirmed cases, deaths, and active cases than India.

  • India shows a high number of recoveries relative to its confirmed cases. This is consistent with a high recovery rate, although totals alone cannot show how well the pandemic was managed.

  • Active cases in the USA remain high, so ongoing infections are still a significant concern there.
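
Because confirmed cases are far larger than deaths, plotting all four metrics on one shared axis can make the smaller metrics hard to read. One option, sketched here, is to give each metric its own panel and y-axis with `facet_wrap(scales = "free_y")`. The two countries below are stand-ins copied from the dataset preview, since the USA and India values are not printed in this report; with the full data, the same plotting code should apply to the `country_comparison` object built above.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Stand-in rows copied from the head(covid_data) preview (not USA/India).
preview <- data.frame(
  country         = c("Albania", "Algeria"),
  total_confirmed = c(275574, 265816),
  total_deaths    = c(3497, 6875),
  total_recovered = c(271826, 178371),
  active_cases    = c(251, 80570)
)

# Reshape to one row per country and metric, as in the report.
comparison_long <- preview %>%
  pivot_longer(cols = -country, names_to = "metric", values_to = "value")

# One panel per metric, each with its own y-axis scale.
comparison_plot <- ggplot(comparison_long, aes(x = country, y = value, fill = country)) +
  geom_col() +
  facet_wrap(~ metric, scales = "free_y") +
  labs(
    title = "COVID-19 Comparison by Metric",
    x = "Country",
    y = "Value"
  ) +
  theme_minimal()

print(comparison_plot)
```

The trade-off is that bar heights are no longer comparable across panels, so the panel axes need to be read individually.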

Insights

1. Most Impacted Countries

From the earlier analysis of the top 10 countries by confirmed cases, we can conclude that countries like the USA, India, and Brazil have been the hardest hit by COVID-19. This reflects larger outbreaks, more significant impacts on the healthcare system, and the need for continued interventions.

2. Recovery and Mortality

A closer look at countries with high confirmed cases shows that India has a large number of recoveries relative to its case count. The USA, on the other hand, has recorded a much higher number of deaths, which suggests significant challenges in managing the pandemic despite its high number of recoveries. Because these figures are totals rather than rates, they should be read alongside each country's confirmed case count.

3. Active Cases

Comparing active cases between countries shows that although India has one of the highest total confirmed case counts, its active cases are considerably lower than those of the USA. This suggests better control over ongoing outbreaks in India than in the USA, which still has a significant number of active cases.
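
The recovery, mortality, and active-case comparisons in this section rely on totals. One way to put countries on a common scale, sketched below, is to express each outcome as a share of confirmed cases. The column names `death_share`, `recovered_share`, and `active_share` are our own; `death_share` is a crude deaths-per-confirmed-case ratio, not a population mortality rate. The rows are copied from the dataset preview, not from the USA or India.

```r
library(dplyr)

# Rows copied from the head(covid_data) preview in this report.
preview <- data.frame(
  country         = c("Albania", "Algeria", "Andorra", "Angola", "Anguilla"),
  total_confirmed = c(275574, 265816, 42156, 99194, 2984),
  total_deaths    = c(3497, 6875, 153, 1900, 9),
  total_recovered = c(271826, 178371, 41021, 97149, 2916),
  active_cases    = c(251, 80570, 982, 145, 59)
)

# Express each outcome as a fraction of confirmed cases.
shares <- preview %>%
  mutate(
    death_share     = total_deaths / total_confirmed,
    recovered_share = total_recovered / total_confirmed,
    active_share    = active_cases / total_confirmed
  ) %>%
  select(country, death_share, recovered_share, active_share)

print(shares)
```

Applied to the full data, the same `mutate()` step would allow the USA and India to be compared by share of cases rather than by raw counts.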

Conclusion

In conclusion, the global COVID-19 dataset provides valuable insights into the pandemic’s progress. Through the analysis of total confirmed cases, deaths, recoveries, and active cases, we can draw the following conclusions:

  • Countries like the USA and India have been among the most impacted by COVID-19 in terms of confirmed cases.

  • High recovery rates in some countries, such as India, suggest effective management of the virus, while others, like the USA, face challenges in controlling the virus despite high recovery rates.

  • Active cases in some regions, especially in the USA, remain a concern and continue to pose a threat to the healthcare system.

As the pandemic evolves, countries must continue to monitor these metrics closely to implement targeted responses, ensuring the continued safety and health of their populations.

References

  1. COVID-19 Global Dataset from Kaggle (https://www.kaggle.com/datasets/josephassaker/covid19-global-dataset)