R assignment 1: COVID-19 mortality

1. Preparatory steps in the R script: install packages, set the working directory and load the data

Visit the website of Sciensano (https://epistat.wiv-isp.be/covid/) and download the dataset including the mortality data by age, sex and region over time (csv-file; Mortality by date, age, sex and region).

The analysis was done using the incidence2 package.

#load the incidence2 package, set working directory (path) and import COVID-19 mortality dataset (Sciensano)
  library(incidence2)
  setwd("C:/Users/naima/Documents/R_studio_files")
  hosp_dat <- read.table("COVID19BE_MORT.csv", header = TRUE, sep = ",")
  
#view data
  str(hosp_dat)

## 'data.frame':    10443 obs. of  5 variables:
##  $ DATE    : chr  "2020-03-07" "2020-03-10" "2020-03-11" "2020-03-11" ...
##  $ REGION  : chr  "Brussels" "Brussels" "Flanders" "Brussels" ...
##  $ AGEGROUP: chr  "75-84" "85+" "85+" "65-74" ...
##  $ SEX     : chr  "M" "F" "M" "M" ...
##  $ DEATHS  : int  1 1 1 1 1 1 1 1 1 2 ...

  summary(hosp_dat)

##      DATE              REGION            AGEGROUP             SEX           
##  Length:10443       Length:10443       Length:10443       Length:10443      
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##      DEATHS      
##  Min.   : 1.000  
##  1st Qu.: 1.000  
##  Median : 2.000  
##  Mean   : 3.156  
##  3rd Qu.: 3.000  
##  Max.   :55.000
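
Because DATE, REGION, AGEGROUP and SEX are read as character, summary() only reports their length, class and mode. The following is a minimal sketch of one way to make the summary more informative, by converting DATE to the Date class and the grouping variables to factors. It uses a small data frame, toy_dat, built from the first four rows shown in the str() output above; the name toy_dat is introduced here for illustration only.

```r
# First four rows from the str() output above, rebuilt as a small example
toy_dat <- data.frame(
  DATE     = c("2020-03-07", "2020-03-10", "2020-03-11", "2020-03-11"),
  REGION   = c("Brussels", "Brussels", "Flanders", "Brussels"),
  AGEGROUP = c("75-84", "85+", "85+", "65-74"),
  SEX      = c("M", "F", "M", "M"),
  DEATHS   = c(1L, 1L, 1L, 1L),
  stringsAsFactors = FALSE
)

# Convert the date strings (year-month-day) to Date objects
toy_dat$DATE <- as.Date(toy_dat$DATE, format = "%Y-%m-%d")

# Convert the grouping variables to factors so summary() counts each level
group_cols <- c("REGION", "AGEGROUP", "SEX")
toy_dat[group_cols] <- lapply(toy_dat[group_cols], factor)

summary(toy_dat)
```

After the conversion, summary() gives the date range and the number of rows per region, age group and sex instead of only the column class.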

2. Question 1

Produce a plot showing the daily incidence of deaths over time aggregated over age, sex and region and indicate on the x-axis the first day of each month.

daily_incidence <- incidence(hosp_dat, date_index = DATE, count = DEATHS, interval = "day")

plot(daily_incidence,
     title = "Daily incidence of deaths over time",
     xlab = "Time", ylab = "Daily incidence of death",
     fill = "blue", angle = 30,
     date_format = "%d/%m/%Y", date_breaks = "1 month")

This figure shows the number of daily deaths during the different COVID-19 waves between March 2020 and November 2022. The number of daily deaths was highest during the first two waves.
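
To make explicit what "aggregated over age, sex and region" means, here is a minimal base R sketch that sums the deaths per day and computes the first day of each month for the x-axis. It is not how incidence2 works internally; the data frame toy_dat and its values are invented for illustration and are not Sciensano figures.

```r
# Invented toy data: several rows per day, one per subgroup
toy_dat <- data.frame(
  DATE   = as.Date(c("2020-03-10", "2020-03-11", "2020-03-11", "2020-04-02")),
  REGION = c("Brussels", "Flanders", "Brussels", "Wallonia"),
  DEATHS = c(1L, 1L, 2L, 3L)
)

# Sum the deaths of all subgroups for each date
daily_total <- aggregate(DEATHS ~ DATE, data = toy_dat, FUN = sum)

# First day of each month between the first and the last date
first_month   <- as.Date(format(min(daily_total$DATE), "%Y-%m-01"))
month_breaks  <- seq(first_month, max(daily_total$DATE), by = "month")
month_labels  <- format(month_breaks, "%d/%m/%Y")

print(daily_total)
print(month_labels)
```

The vector month_breaks could then be passed as tick positions to a plotting function of choice.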

Note that incidence here refers to the daily number of new deaths (incidence count) and not to the incidence rate (new deaths / population at risk).
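
The difference between the two measures can be illustrated with a short calculation. The numbers below are hypothetical (both the daily counts and the population size are invented) and only serve to show the arithmetic.

```r
# Hypothetical daily death counts (incidence counts)
daily_deaths <- c(1, 3, 3)

# Hypothetical size of the population at risk
population_at_risk <- 1000000

# Incidence rate, expressed per 100,000 persons per day
rate_per_100k <- daily_deaths / population_at_risk * 100000

print(rate_per_100k)
```

The counts depend on the size of the population, whereas the rate allows comparison between populations of different sizes.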

3. Question 2

Show the cumulative incidence of deaths over time.

cum_incidence <- cumulate(daily_incidence)
  
plot(cum_incidence,
     title = "Cumulative incidence of deaths over time",
     xlab = "Time", ylab = "Cumulative incidence of deaths",
     fill = "blue", angle = 30,
     date_format = "%d/%m/%Y", date_breaks = "1 month")
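
The cumulative incidence is the running total of the daily counts. As a rough base R sketch of the same idea (not the internals of cumulate()), using an invented data frame daily_total with made-up values:

```r
# Invented daily totals, deliberately not in date order
daily_total <- data.frame(
  DATE   = as.Date(c("2020-03-11", "2020-03-10", "2020-04-02")),
  DEATHS = c(3, 1, 3)
)

# The running total only makes sense in chronological order
daily_total <- daily_total[order(daily_total$DATE), ]

# Cumulative number of deaths up to and including each date
daily_total$CUM_DEATHS <- cumsum(daily_total$DEATHS)

print(daily_total)
```

The last value of CUM_DEATHS equals the total number of deaths in the data, which is a quick consistency check.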

4. Question 3

Produce the same plots for the daily number of deaths over time, by age group, region and sex. Briefly explain the key differences from an epidemiological point of view.

Initially, I produced plots that combined all subgroups, but I switched to one plot per subgroup because this makes the findings easier to interpret visually.
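
The idea of one data set per subgroup can be sketched in base R as follows. This is an illustration under assumptions, not the code used for the plots in this report: toy_dat and its values are invented, and only the variable SEX is shown.

```r
# Invented toy data with one grouping variable
toy_dat <- data.frame(
  DATE   = as.Date(c("2020-03-10", "2020-03-10", "2020-03-11", "2020-03-11")),
  SEX    = c("M", "F", "M", "M"),
  DEATHS = c(1L, 2L, 1L, 3L)
)

# Daily deaths per sex (summing over all other variables)
by_sex <- aggregate(DEATHS ~ DATE + SEX, data = toy_dat, FUN = sum)

# One data frame per subgroup, ready to be plotted separately
per_group <- split(by_sex, by_sex$SEX)

print(per_group)
```

Each element of per_group can then be plotted on its own, and the same steps apply to AGEGROUP and REGION.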