Warning in install.packages :
package ‘grid’ is not available (for R version 3.6.3)
Warning in install.packages :
package ‘grid’ is a base package, and should not be updated
also installing the dependencies ‘sys’, ‘askpass’, ‘curl’, ‘mime’, ‘openssl’, ‘yaml’, ‘later’, ‘igraph’, ‘matlab’, ‘expm’, ‘RcppParallel’, ‘RcppArmadillo’, ‘tidyselect’, ‘BH’, ‘plogr’, ‘httr’, ‘jsonlite’, ‘base64enc’, ‘htmltools’, ‘htmlwidgets’, ‘tidyr’, ‘hexbin’, ‘lazyeval’, ‘crosstalk’, ‘purrr’, ‘data.table’, ‘promises’
There is a binary version available but the source version is later:
Error in install.packages : Unrecognized response “install.packages(c("ggplot2", "gridExtra", "markovchain", "plotly", "reshape"))”
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
Keep up to date with changes at https://www.tidyverse.org/blog/
Attaching package: ‘gridExtra’
The following object is masked from ‘package:dplyr’:
combine
No renderer backend detected. gganimate will default to writing frames to separate files
Consider installing:
- the `gifski` package for gif output
- the `av` package for video output
and restarting the R session
Registered S3 method overwritten by 'data.table':
method from
print.data.table
Registered S3 method overwritten by 'htmlwidgets':
method from
print.htmlwidget tools:rstudio
Attaching package: ‘plotly’
The following object is masked from ‘package:ggplot2’:
last_plot
The following object is masked from ‘package:stats’:
filter
The following object is masked from ‘package:graphics’:
layout
Attaching package: ‘reshape’
The following object is masked from ‘package:plotly’:
rename
The following object is masked from ‘package:dplyr’:
rename
SIR is an epidemiological model to study the dynamical spread of the diseases. S stands for susceptible, I for infected and R for recovered. It implicitly includes the deads. 95% of Susceptible people will remain in the Susceptible state,5% infected 80% of those who are Infected will move to the Recovered Category and 20% remain infected between successive timesteps. 14% of the recovered are still susceptible and could relapse while 86% of the recovered will remain at that state. All the numbers are backed by the data reported in the studies cited by the Word Economic Forum article https://www.weforum.org/agenda/2020/03/coronavirus-recovery-what-happens-after-covid19/
Running the markov model and calibration with actual parameters extracted from the behavior COVID-19 in the most affected countries
####SIR for COVID-19 in the US
mcSIRSL <- new("markovchain", states=c("S","I","R"),
transitionMatrix=matrix(data=c(0.95,0.05,0,0,0.2,0.80,0.14,0,0.86),
byrow=TRUE, nrow=3), name="SIR")
initialState <- c(350000000,17000,16775)
show(mcSIRSL)SIR
A 3 - dimensional discrete Markov Chain defined by the following states:
S, I, R
The transition matrix (by rows) is defined as follows:
S I R
S 0.95 0.05 0.00
I 0.00 0.20 0.80
R 0.14 0.00 0.86
A simple graphical representation of the SIR model with vertex the states and the edges representing the probability of transition of one state to another.
Generating data based on the model simulation and storing in a data frame.
timesteps <- 12
df <- data.frame( "timestep" = numeric(),
"S" = numeric(), "I" = numeric(),"R" = numeric(),
stringsAsFactors=FALSE)
for (i in 0:timesteps) {
newrow <- as.list(c(i,round(as.numeric(initialState * mcSIRSL ^ i),0)))
df[nrow(df) + 1, ] <- newrow
}
head(df)Plotting the simulation results of the model
p<-ggplot(df_new, aes(x = timestep, y = value, color = variable)) +
geom_line(size = 1) + geom_point(size = 1) +
scale_color_brewer(palette = "Set1")
pI query the data from data world at https://data.world/covid-19-data-resource-hub/covid-19-case-counts/workspace/query?queryid=sample-0
data= read.csv('/Users/borishouenou/Downloads/covid19cases-covid-19-case-counts-QueryResult.csv')
summary(data) date country_region province_state
2020-02-21: 100 Australia : 228 Florida : 450
2020-03-15: 100 Canada : 456 British Columbia : 228
2020-02-28: 97 China :1368 Cayman Islands : 228
2020-01-26: 96 Denmark : 73 Channel Islands : 228
2020-02-10: 96 France : 228 District of Columbia: 228
2020-02-29: 96 United Kingdom: 456 Indiana : 228
(Other) :4415 US :2191 (Other) :3410
case_type cases difference
Active :1248 Min. : 0.00 Min. :-71.000
Confirmed:1249 1st Qu.: 0.00 1st Qu.: 0.000
Deaths :1250 Median : 0.00 Median : 0.000
Recovered:1253 Mean : 66.89 Mean : 1.961
3rd Qu.: 2.00 3rd Qu.: 0.000
Max. :1233.00 Max. :203.000
prep_flow_runtime latest_date lat
2020-03-20T07:43:39:5000 2020-03-19:5000 Min. :-41.45
1st Qu.: 27.61
Median : 32.97
Mean : 29.76
3rd Qu.: 38.90
Max. : 71.71
long location
Min. :-123.87 POINT(-107.3025 42.756) : 228
1st Qu.: -86.26 POINT(-123.1207 49.2827): 228
Median : -77.03 POINT(-2.3644 49.3723) : 228
Mean : -13.28 POINT(-64.8963 18.3358) : 228
3rd Qu.: 112.29 POINT(-77.0268 38.8974) : 228
Max. : 145.97 POINT(-80.945 33.8569) : 228
(Other) :3632
'data.frame': 5000 obs. of 11 variables:
$ date : Factor w/ 57 levels "2020-01-23","2020-01-24",..: 49 49 8 46 21 36 30 30 30 30 ...
$ country_region : Factor w/ 7 levels "Australia","Canada",..: 7 7 7 7 7 7 7 7 7 7 ...
$ province_state : Factor w/ 23 levels "Arizona","British Columbia",..: 1 1 1 1 3 7 3 3 3 3 ...
$ case_type : Factor w/ 4 levels "Active","Confirmed",..: 4 3 2 4 4 4 4 3 2 1 ...
$ cases : int 0 0 0 0 0 0 0 0 0 0 ...
$ difference : int 0 0 0 0 0 0 0 0 0 0 ...
$ prep_flow_runtime: Factor w/ 1 level "2020-03-20T07:43:39": 1 1 1 1 1 1 1 1 1 1 ...
$ latest_date : Factor w/ 1 level "2020-03-19": 1 1 1 1 1 1 1 1 1 1 ...
$ lat : num 32.1 32.1 32.1 32.8 37.5 ...
$ long : num -112 -112 -112 -111 -121 ...
$ location : Factor w/ 47 levels "POINT(-107.3025 42.756)",..: 3 3 3 2 11 26 8 8 8 8 ...
Recovered, death and infected in the US summary
data_us<-data %>%
filter(country_region=='US') %>%
group_by(date, case_type) %>%
summarise(Total=sum(cases))Subseting the period that correponds to the parameters used in the Markov model
Plotting the Actual data of the cases in the US. Frankly, the US just got in to the hod bed of the diseases and the actual data plotted here will drastically evolve, and so will the shape of the curves. If the parameters of the experience in China and worldwide start kicking in the US model. In that hypothesis,the simulation in this model could match the US data.
g<-ggplot(data_us, aes(x = date, y = Total, color = case_type)) +
geom_line(size = 3) + geom_point(size = 2) +
theme_bw() +
scale_color_brewer(palette = "Set1")
#g+transition_reveal(as.numeric(date))
g + theme(axis.text.x = element_text(angle = 90))COVID-19 is just picking up in the US at the time of this analysis. Flattening out the curves will require on one hand, that the large infection and death rates other countries experience be lower in the US, and on the other the recovery rate gets higher; which entails opposing a radical response to the disease.