Introduction
This project demonstrates some basic data visualizations in R using a
single data set. The data set combines two data sets on Covid-19 cases
in Brazil and Peru, merged into one table for ease of use. It contains
total cases, daily cases, active cases, total deaths and daily deaths
for both countries from the 15th of February to the 7th of June, 2020.
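The merge described above is not shown in this report, so the following is only a minimal sketch of how two per-country tables could be combined on their date column. The data frames `peru` and `brazil` and all of their values are invented for illustration; only the column naming pattern follows the merged table used here.

```r
# Toy per-country tables (hypothetical values, not the real data)
peru <- data.frame(
  date = as.Date(c("2020-03-06", "2020-03-07", "2020-03-08")),
  Peru_total_cases = c(1, 6, 7)
)
brazil <- data.frame(
  date = as.Date(c("2020-03-06", "2020-03-07", "2020-03-08")),
  brazil_total_cases = c(13, 19, 25)
)

# Join the two tables on the shared date column;
# all = TRUE keeps dates that appear in only one of the tables
merged <- merge(peru, brazil, by = "date", all = TRUE)
print(merged)
```

Keeping one row per date with one column per country and measure gives the wide layout that the plots below read from.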
library(readxl)
covid19stats_2020 <- read_excel("brazil_&_peru_covid19stats_2020.xlsx")
library(ggplot2)
library(gapminder)
library(gganimate)
## No renderer backend detected. gganimate will default to writing frames to separate files
## Consider installing:
## - the `gifski` package for gif output
## - the `av` package for video output
## and restarting the R session
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
print(covid19stats_2020)
## # A tibble: 114 × 11
## date Peru_total_cases brazil_total_cases Peru_daily_cases
## <dttm> <dbl> <dbl> <dbl>
## 1 2020-02-15 00:00:00 0 0 0
## 2 2020-02-16 00:00:00 0 0 0
## 3 2020-02-17 00:00:00 0 0 0
## 4 2020-02-18 00:00:00 0 0 0
## 5 2020-02-19 00:00:00 0 0 0
## 6 2020-02-20 00:00:00 0 0 0
## 7 2020-02-21 00:00:00 0 0 0
## 8 2020-02-22 00:00:00 0 0 0
## 9 2020-02-23 00:00:00 0 0 0
## 10 2020-02-24 00:00:00 0 0 0
## # ℹ 104 more rows
## # ℹ 7 more variables: brazil_daily_cases <dbl>, Peru_active_cases <dbl>,
## # brazil_active_cases <dbl>, Peru_total_deaths <dbl>,
## # brazil_total_deaths <dbl>, Peru_daily_deaths <dbl>,
## # brazil_daily_deaths <dbl>
Peru Daily Cases displayed through a bar chart
Now we look at some data visualizations based on the table above. We
first plot a bar chart of the daily cases in Peru over the period
covered by the data (15th of February to 7th of June, 2020).
# geom_col() draws bars whose heights are the y values (same as geom_bar(stat = "identity"))
bar_chart_1 <- ggplot(data = covid19stats_2020, aes(x = date, y = Peru_daily_cases)) +
  geom_col(fill = "red") +
  labs(title = "Peru Daily Cases", x = "Months", y = "Total Daily Cases")
ggplotly(bar_chart_1)
From the bar chart above we can see that daily cases rose steadily
from the middle of March to a peak of 8,805 cases on the 31st of May,
and then fell in the first week of June.
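A peak like the one read off the chart can also be located directly in the data. The sketch below uses a small invented data frame (`daily`, with made-up counts) rather than the real table, and relies only on base R's `which.max()`.

```r
# Toy daily counts (hypothetical values, not the real data)
daily <- data.frame(
  date = as.Date("2020-05-29") + 0:3,
  Peru_daily_cases = c(100, 250, 400, 300)
)

# which.max() returns the index of the first maximum;
# use it to pull out the whole row for the peak day
peak_row <- daily[which.max(daily$Peru_daily_cases), ]
print(peak_row)
```

Applied to the real table, the same pattern would be `covid19stats_2020[which.max(covid19stats_2020$Peru_daily_cases), ]`.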
Comparative total daily cases between Brazil & Peru
Next we turn to a different kind of visualization: a comparative
scatter plot of the daily cases in the two countries over the same
period.
# Map colour inside aes() so that scale_color_manual() takes effect and a legend is drawn
scatter_plot_1 <- ggplot(data = covid19stats_2020, aes(x = date)) +
  geom_point(aes(y = brazil_daily_cases, color = "Brazil"), size = 3) +
  geom_point(aes(y = Peru_daily_cases, color = "Peru"), size = 3) +
  labs(title = "Comparative Scatterplot of daily cases of Brazil & Peru", x = "Months", y = "Total Daily Cases", color = "Country") +
  scale_color_manual(values = c("Brazil" = "blue", "Peru" = "red"))
ggplotly(scatter_plot_1)
The plot above is a comparative scatter plot of daily COVID-19 cases
in Brazil and Peru. The red points correspond to Peru and the blue
points to Brazil. The plot shows that Brazil recorded far more daily
cases than Peru. This may be partly due to Brazil’s larger population,
which means more people were exposed to the virus.
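A common alternative for a two-country comparison like this one is to reshape the wide table into long format, with one row per date and country, so that ggplot2 can assign colours from a `country` column. The sketch below is an illustration on invented values; the names `wide`, `long`, `country` and `daily_cases` are assumptions and do not appear in the original data set.

```r
# Toy wide table (hypothetical values, not the real data)
wide <- data.frame(
  date = as.Date("2020-04-01") + 0:2,
  brazil_daily_cases = c(10, 20, 30),
  Peru_daily_cases = c(1, 2, 3)
)

# Stack the two country columns into one long table
long <- rbind(
  data.frame(date = wide$date, country = "Brazil",
             daily_cases = wide$brazil_daily_cases, stringsAsFactors = FALSE),
  data.frame(date = wide$date, country = "Peru",
             daily_cases = wide$Peru_daily_cases, stringsAsFactors = FALSE)
)

# Plot only if ggplot2 is installed; colour is mapped from the country column
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(long, ggplot2::aes(x = date, y = daily_cases, color = country)) +
    ggplot2::geom_point(size = 3)
}
```

With the long layout, adding a third country would only mean adding rows, not another `geom_point()` layer.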
Total Cases of Covid-19 outbreak in Brazil
Now we look at an interactive line graph of the total cases of the
Covid-19 outbreak in Brazil.
# linewidth replaces the size aesthetic for lines, which is deprecated since ggplot2 3.4.0
line_chart <- ggplot(covid19stats_2020, aes(x = date, y = brazil_total_cases, group = 1)) +
  geom_line(color = "red", linewidth = 1.5) +
  labs(title = "Total Cases of Covid-19 outbreak in Brazil",
       x = "Months",
       y = "Total Cases") +
  theme(
    plot.title = element_text(size = 18, face = "bold", color = "blue"),
    panel.background = element_rect(fill = "lightgray"),
    panel.grid.major = element_line(color = "gray", linetype = "dashed"),
    panel.grid.minor = element_line(color = "gray", linetype = "dotted")
  )
ggplotly(line_chart)
From the interactive line graph above, we can see that total Covid-19
cases in Brazil increased sharply over the period covered. Cases
climbed steeply from the first case on the 25th of February to 691,962
cases by the 7th of June, 2020.
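The two figures quoted above, the date of the first case and the final cumulative total, can be read from the data rather than from the chart. The following is a minimal sketch on an invented cumulative series (`totals`, with made-up counts), using base R only.

```r
# Toy cumulative series (hypothetical counts, not the real data)
totals <- data.frame(
  date = as.Date("2020-02-23") + 0:4,
  brazil_total_cases = c(0, 0, 1, 1, 3)
)

# Date of the first row where the cumulative count is above zero
first_case_date <- totals$date[which(totals$brazil_total_cases > 0)[1]]

# Cumulative count on the last row of the table
final_total <- tail(totals$brazil_total_cases, 1)

print(first_case_date)
print(final_total)
```

The same two lines would apply to `covid19stats_2020$brazil_total_cases`, assuming the rows are sorted by date as in the printed table.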
Conclusion
The purpose of this project was to illustrate basic data
visualizations using R, focusing on the COVID-19 outbreak data-set for
Brazil and Peru from February 15th to June 7th, 2020.