The present document aims to provide understanding on use of IDM concepts to undertake the situational assessment of an epidemic/ outbreak/ pandemic by plotting epidemic curves and calculating effective reproduction numbers \(R_{eff}\), growth rates and doubling time. We shall be using incidence
and tidyverse
packages for the same. Additional packages recommended for learning are provided at Free and open training resources to respond to outbreaks, health emergencies and humanitarian crises website https://www.reconlearn.org/ . We shall not be discussing other packages but interested students are encouraged to go through the website and discuss on moodle platform/ individual sessions over the next week.
In an outbreak situation, in addition to on ground prevention and control (as discussed in outbreak investigation part of this module), a public health specialist is expected to:-
Generate and interpret epidemic curves.
Calculate Effective reproduction numbers.
Calculate growth rates.
Calculate doubling/ halving time,.
We will fetch COVID-19 data as available from covid19india.org website for India.
# Fetch and load data
dat <- data.table::fread("https://api.covid19india.org/csv/latest/state_wise_daily.csv")
# Convert to tidy format
dat <- dat %>% pivot_longer(
cols = 4:42,
names_to = "States",
values_to = "Cases"
)
# EDA
glimpse(dat)
## Rows: 29,367
## Columns: 5
## $ Date <chr> "14-Mar-20", "14-Mar-20", "14-Mar-20", "14-Mar-20", "14-Ma...
## $ Date_YMD <date> 2020-03-14, 2020-03-14, 2020-03-14, 2020-03-14, 2020-03-1...
## $ Status <chr> "Confirmed", "Confirmed", "Confirmed", "Confirmed", "Confi...
## $ States <chr> "TT", "AN", "AP", "AR", "AS", "BR", "CH", "CT", "DN", "DD"...
## $ Cases <int> 81, 0, 1, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 14, 0, 2, 0, 6, 19...
unique(dat$Status)
## [1] "Confirmed" "Recovered" "Deceased"
# filter only confirmed cases
dat <- dat %>% filter(
dat$Status == "Confirmed"
)
# Data Cleaning
dat$Cases <- if_else(dat$Cases>0,
dat$Cases,
abs(dat$Cases))
dat$Date_YMD <- lubridate::ymd(dat$Date_YMD)
listofstates <- unique(dat$States)
listofstates
## [1] "TT" "AN" "AP" "AR" "AS" "BR" "CH" "CT" "DN" "DD" "DL" "GA" "GJ" "HR" "HP"
## [16] "JK" "JH" "KA" "KL" "LA" "LD" "MP" "MH" "MN" "ML" "MZ" "NL" "OR" "PY" "PB"
## [31] "RJ" "SK" "TN" "TG" "TR" "UP" "UT" "WB" "UN"
We can develop iterations for all the states, but avoiding same to avoid complexity. Hence, we have to manually choose the state of choice for demo purposes.
n= 19 # Change to state of your choice
states = listofstates[[n]]
states
## [1] "KL"
df <- dat %>% filter(dat$States == states)
tail(df)
ggplot(data = df, aes(x = Date_YMD, y = Cases)) +
geom_line(color = "red")+
scale_color_discrete()+
labs(y = "Daily COVID-19 cases",
x = "Timeline")+
theme_classic() +
scale_x_date(date_breaks = "1 month", date_labels = "%b")+
theme(legend.position="bottom")
df %>% filter(df$Date_YMD > dmy("01092020")) %>% ggplot(aes(x = Date_YMD, y = Cases)) +
geom_line(color = "red")+
scale_color_discrete()+
labs(y = "Daily COVID-19 cases",
x = "Timeline")+
theme_classic() +
scale_x_date(date_breaks = "1 month", date_labels = "%b")+
theme(legend.position="bottom")
To work with ìncidence`package, an incidence object is required to be created. The same takes atleast two arguments viz. the timeline and the number of cases on each day. There are alternate methods also available which can create incidence objects from a line list also.
i <- as.incidence(df$Cases, df$Date_YMD)
i
## <incidence object>
## [545642 cases from days 2020-03-14 to 2020-11-19]
##
## $counts: matrix with 251 rows and 1 columns
## $n: 545642 cases in total
## $dates: 251 dates marking the left-side of bins
## $interval: 1 day
## $timespan: 251 days
## $cumulative: FALSE
plot(i)
plot(i[-30:-1])
f <- fit(i[200:250])
f
## <incidence_fit object>
##
## $model: regression of log-incidence over time
##
## $info: list containing the following items:
## $r (daily growth rate):
## [1] -0.009662796
##
## $r.conf (confidence interval):
## 2.5 % 97.5 %
## [1,] -0.01420663 -0.005118962
##
## $halving (halving time in days):
## [1] 71.73361
##
## $halving.conf (confidence interval):
## 2.5 % 97.5 %
## [1,] 48.7904 135.4078
##
## $pred: data.frame of incidence predictions (51 rows, 5 columns)
plot(i[200:250], fit = f)
best.fit <- fit_optim_split(i)
best.fit
## $df
## dates mean.R2
## 1 2020-09-09 0.4806508
## 2 2020-09-10 0.4732215
## 3 2020-09-11 0.4655507
## 4 2020-09-12 0.4565719
## 5 2020-09-13 0.4478886
## 6 2020-09-14 0.4409851
## 7 2020-09-15 0.4343875
## 8 2020-09-16 0.4323145
## 9 2020-09-17 0.4331681
## 10 2020-09-18 0.4355704
## 11 2020-09-19 0.4403081
## 12 2020-09-20 0.4458940
## 13 2020-09-21 0.4528206
## 14 2020-09-22 0.4803731
## 15 2020-09-23 0.5052529
## 16 2020-09-24 0.5209989
## 17 2020-09-25 0.5296152
## 18 2020-09-26 0.5376711
## 19 2020-09-27 0.5420031
## 20 2020-09-28 0.5430704
## 21 2020-09-29 0.5797107
## 22 2020-09-30 0.5835311
## 23 2020-10-01 0.5771983
## 24 2020-10-02 0.5746098
## 25 2020-10-03 0.5651661
## 26 2020-10-04 0.5635809
## 27 2020-10-05 0.5566840
## 28 2020-10-06 0.5880725
## 29 2020-10-07 0.5866284
## 30 2020-10-08 0.5691645
## 31 2020-10-09 0.5961152
## 32 2020-10-10 0.5839718
## 33 2020-10-11 0.5605408
## 34 2020-10-12 0.5443080
## 35 2020-10-13 0.5594027
## 36 2020-10-14 0.5453853
## 37 2020-10-15 0.5556621
## 38 2020-10-16 0.5483301
## 39 2020-10-17 0.5454343
## 40 2020-10-18 0.5273345
## 41 2020-10-19 0.5184269
## 42 2020-10-20 0.5477682
## 43 2020-10-21 0.5524239
## 44 2020-10-22 0.5361109
## 45 2020-10-23 0.5272639
## 46 2020-10-24 0.5075949
## 47 2020-10-25 0.4889441
## 48 2020-10-26 0.4822917
## 49 2020-10-27 0.5261437
## 50 2020-10-28 0.5516430
## 51 2020-10-29 0.5219627
## 52 2020-10-30 0.5117550
## 53 2020-10-31 0.5066843
## 54 2020-11-01 0.4806629
## 55 2020-11-02 0.4647556
## 56 2020-11-03 0.5242427
## 57 2020-11-04 0.5109508
## 58 2020-11-05 0.4661591
## 59 2020-11-06 0.4450809
## 60 2020-11-07 0.4219229
## 61 2020-11-08 0.4069926
## 62 2020-11-09 0.4012606
## 63 2020-11-10 0.4242461
##
## $split
## [1] "2020-10-09"
##
## $fit
## <list of incidence_fit objects>
##
## attr(x, 'locations'): list of vectors with the locations of each incidence_fit object
##
## 'before'
## 'after'
##
## $model: regression of log-incidence over time
##
## $info: list containing the following items:
## $r (daily growth rate):
## before after
## 0.03987393 -0.01240162
##
## $r.conf (confidence interval):
## 2.5 % 97.5 %
## before 0.03812660 0.041621255
## after -0.01844847 -0.006354774
##
## $doubling (doubling time in days):
## before
## 17.38347
##
## $doubling.conf (confidence interval):
## 2.5 % 97.5 %
## before 16.65368 18.18014
##
## $halving (halving time in days):
## after
## 55.89165
##
## $halving.conf (confidence interval):
## 2.5 % 97.5 %
## after 37.57207 109.075
##
## $pred: data.frame of incidence predictions (245 rows, 6 columns)
##
## $plot
plot(i, fit = best.fit$fit)