1 SIR dynamics with varying parameters

In the first week, you gained first experience with coding simple models in R using the deSolve package. Last week, you went into further details about the drivers of an epidemic and the dynamics of the SIR model. This week, we are bringing this together to think more deeply about the roles of \(\beta\) and \(\gamma\).

1.0.1 Question: Imagine a disease where every person infects 1 person on average, every 2 days, and is infectious for 4 days. What are the values for \(\beta\) and \(\gamma\)?

So far, when looking at the dynamics of an epidemic, we have always plotted the number of people in each compartment. Sometimes however it is more informative to look at the proportion of the population in each compartment rather than the actual number, particularly if the population size changes over time. Although we are modelling an epidemic in a closed population with constant population size in this etivity, it is a good opportunity to practice plotting the prevalence as a proportion of the total population who are in the \(S\), \(I\) and \(R\) compartments. To do this, it is helpful to first think about which variables you need to calculate this at each timestep.

Model the epidemic with the \(\beta\) and \(\gamma\) values you calculated above, assuming introduction of a single infected person in a totally susceptible population of 1 million. Remember, if you have been saving your code somewhere, you can open that and copy-paste the SIR code you have previously developed. You might also have to try different durations to run the model for. Then plot the proportion of susceptible, infected and recovered people over time.

1.0.2 Question: What do you observe?

2 YOUR CODE GOES HERE

Now imagine an intervention is introduced to control infection, for example infected people are isolated so that they cannot spread infection. As a result, \(\beta\) drops to 0.1. Model the epidemic under this new scenario.

2.0.1 Question: What do you observe?

2.0.2 YOUR CODE GOES HERE

In your code above, vary \(\gamma\) with \(\beta\) held constant at 0.1 days\(^{-1}\). Notice that there are some values where the initial infected case fails to cause an epidemic.

2.0.3 Question: Assuming \(\beta\) equals 0.1 days\(^{-1}\), what value of \(\gamma\) do you need in order to get an epidemic? In real life, what could give rise to such a change in \(\gamma\)?

Hint: as the infection rate becomes lower, i.e. the infection spreads more slowly, you have to follow the population over a longer period to observe a visible change in the proportion infected.

You should see that different combinations of infection and recovery rates can give rise to very different infection dynamics.

2.0.4 Question: Based on your answers to the previous question, can you think of a condition involving \(\beta\) and \(\gamma\) that is necessary for an epidemic? Test this condition using your code above.

Don’t forget to check the correct answers provided in the Solutions!

3 Solution: SIR dynamics with varying parameters

We are modelling a disease where every infectious person infects 1 person on average, every 2 days, and is infectious for 4 days.

\(\beta\) = 1 person/2 days = 0.5 days\(^{-1}\)
\(\gamma\) = 1/4 = 0.25 days\(^{-1}\)

Modelling this epidemic for a duration of 100 days:

Question # LOAD THE PACKAGES:

library(deSolve)
library(reshape2)
library(ggplot2)

4 MODEL INPUTS:

# Vector storing the initial number of people in each compartment (at timestep 0)
initial_state_values <- c(S = 1000000-1,  # the whole population we are modelling is susceptible to infection
                          I = 1,          # the epidemic starts with a single infected person
                          R = 0)          # there is no prior immunity in the population

# Vector storing the parameters describing the transition rates in units of days^-1
parameters <- c(beta = 0.5,     # the infection rate, which acts on susceptibles
                gamma = 0.25)   # the rate of recovery, which acts on those infected

# TIMESTEPS:

# Vector storing the sequence of timesteps to solve the model at
times <- seq(from = 0, to = 100, by = 1)   # from 0 to 100 days in daily intervals

# SIR MODEL FUNCTION: 

# The model function takes as input arguments (in the following order): time, state and parameters
sir_model <- function(time, state, parameters) {  

    with(as.list(c(state, parameters)), {   # tell R to unpack variable names from the state and parameters inputs    
        
    # Calculating the total population size N (the sum of the number of people in each compartment)
      N <- S+I+R
      
    # Defining lambda as a function of beta and I:
      lambda <- beta * I/N
        
    # The differential equations
      dS <- -lambda * S               # people move out of (-) the S compartment at a rate lambda (force of infection)
      dI <- lambda * S - gamma * I    # people move into (+) the I compartment from S at a rate lambda, 
                                      # and move out of (-) the I compartment at a rate gamma (recovery)
      dR <- gamma * I                 # people move into (+) the R compartment from I at a rate gamma
      
    # Return the number of people in the S, I and R compartments at each timestep 
    # (in the same order as the input state variables)
    return(list(c(dS, dI, dR))) 
    })
  
}

5 MODEL OUTPUT (solving the differential equations):

# Solving the differential equations using the ode integration algorithm
output <- as.data.frame(ode(y = initial_state_values, 
                            times = times, 
                            func = sir_model,
                            parms = parameters))

# PLOTTING THE OUTPUT
output_long <- melt(as.data.frame(output), id = "time")                  # turn output dataset into long format

# Adding a column for the proportion of the population in each compartment at each timestep
# One way of calculating this is dividing the number in each compartment by the total initial population size
# We can do this in this case because our population is closed, so the population size stays the same
# at every timestep
output_long$proportion <- output_long$value/sum(initial_state_values)

# Plot this new column
ggplot(data = output_long,                                               # specify object containing data to plot
       aes(x = time, y = proportion, colour = variable, group = variable)) +  # assign columns to axes and groups
  geom_line() +                                                          # represent data as lines
  xlab("Time (days)")+                                                   # add label for x axis
  ylab("Proportion of the population") +                                 # add label for y axis
  labs(colour = "Compartment")                                           # add legend title

5.0.1 What do you observe when \(\beta\) = 0.5 and \(\gamma\) = 0.25?

An epidemic occurs, reaching a peak 56 days after introduction of the first infectious case, at which point about 15% of the population are infected. By the end of the epidemic, about 80% of the population have been infected and recovered.

Modelling a scenario where beta drops to 0.1 because an infection control measure is introduced:

# Vector storing the parameters describing the transition rates in units of days^-1
parameters <- c(beta = 0.1,      # the infection rate
                gamma = 0.25)    # the rate of recovery, which acts on those infected

# Solving the differential equations using the ode integration algorithm
output <- as.data.frame(ode(y = initial_state_values, 
                            times = times, 
                            func = sir_model,
                            parms = parameters))

# PLOTTING THE OUTPUT
output_long <- melt(as.data.frame(output), id = "time")                  # turn output dataset into long format

# Calculating the proportion in each compartment
output_long$proportion <- output_long$value/sum(initial_state_values)

ggplot(data = output_long,                                               # specify object containing data to plot
       aes(x = time, y = proportion, colour = variable, group = variable)) +  # assign columns to axes and groups
  geom_line() +                                                          # represent data as lines
  xlab("Time (days)")+                                                   # add label for x axis
  ylab("Proportion of the population") +                                 # add label for y axis
  labs(colour = "Compartment")                                           # add legend title

5.0.2 What do you observe when \(\beta\) is reduced to 0.1 instead, with \(\gamma\) remaining at 0.25?

Under this set of conditions, no epidemic occurs - the number of infected people does not increase following the introduction of a first infectious case.

5.0.3 Assuming \(\beta\) equals 0.1, what value of \(\gamma\) do you need in order to get an epidemic? In real life, what could give rise to this change in \(\gamma\)?

With \(\gamma\) around 0.09 or lower, we start to see a small epidemic if we run the model for long enough (ca. 1000 days). Different mechanisms can lead to such a decrease in the recovery rate, corresponding to an increase of the average infectious period, for example strain evolution of the infectious agent or changes in social behaviour.

5.0.4 Based on your answers to the previous question, can you think of a condition involving \(\beta\) and \(\gamma\) that is necessary for an epidemic? Test this condition using your code above.

For an epidemic to happen, the ratio \(\beta\)/\(\gamma\) has to be greater than 1. In other words, to give rise to an epidemic, infectious people have to be infectious enough (\(\beta\) has to be high enough) for long enough (\(\gamma\) has to be low enough) to pass on the pathogen - \(\beta\) has to be higher than \(\gamma\). Because of the relationship between these two parameters, a low infection rate can still lead an epidemic if infected people are infectious for long enough, as you modelled in the previous question.

6 week 2:

In the last etivity, you derived the expression for the basic reproduction number R0 in the simple SIR model: we get an epidemic only if \(\beta\)/\(\gamma\) > 1, i.e. if the average number of secondary cases caused by a single infected case in a totally susceptible population is greater than 1. As susceptibility declines over the course of the epidemic, the effective reproduction number Reff determines the shape of the epidemic curve as it reflects the amount of immunity in the population at any given time.

Remember that in a simple homogenous SIR model, Reff is directly related to the proportion of the population that is susceptible: \[\begin{align} R_{eff} = R_{0} \frac{S}{N} \end{align}\]

In this etivity, your task is to model an epidemic and study the connection between the behaviour of Reff and the epidemic curve. Like in the previous etivities, we are looking at a closed fully susceptible population, into which a single infected person is introduced. The rest is up to you! Use your previously developed SIR code and chose your combination of parameters so that R0 > 1. Solve the differential equations, and use the output to calculate Reff at each timestep. Then, generate 2 plots for comparison: one of the proportion of susceptible, infected and recovered individuals over time, and one of the effective reproduction number over time.

Comparing the changes over time is easiest if you can align the Reff plot directly below the SIR proportion plot. To achieve this, include the code for the 2 plots right after each other in the same cell, and move the legend of the proportion plot to the bottom using the theme command. Use help(theme) or search the internet to find out how!

7 Solution: Simulating the effective reproduction number Reff

# LOAD THE PACKAGES:
library(deSolve)
library(reshape2)
library(ggplot2)

# MODEL INPUTS:

# Vector storing the initial number of people in each compartment (at timestep 0)
initial_state_values <- c(S = 1000000-1,   # the whole population we are modelling is susceptible to infection
                          I = 1,           # the epidemic starts with a single infected person
                          R = 0)           # there is no prior immunity in the population

# Vector storing the parameters describing the transition rates in units of days^-1
parameters <- c(beta = 0.4,      # the infection rate, which acts on susceptibles
                gamma = 0.1)     # the rate of recovery, which acts on those infected

# TIMESTEPS:

# Vector storing the sequence of timesteps to solve the model at
times <- seq(from = 0, to = 100, by = 1)   # from 0 to 100 days in daily intervals

# SIR MODEL FUNCTION: 

# The model function takes as input arguments (in the following order): time, state and parameters
sir_model <- function(time, state, parameters) {  

    with(as.list(c(state, parameters)), {  # tell R to unpack variable names from the state and parameters inputs    
        
    # Calculating the total population size N (the sum of the number of people in each compartment)
      N <- S+I+R
      
    # Defining lambda as a function of beta and I:
      lambda <- beta * I/N
        
    # The differential equations
      dS <- -lambda * S               # people move out of (-) the S compartment at a rate lambda (force of infection)
      dI <- lambda * S - gamma * I    # people move into (+) the I compartment from S at a rate lambda, 
                                      # and move out of (-) the I compartment at a rate gamma (recovery)
      dR <- gamma * I                 # people move into (+) the R compartment from I at a rate gamma
      
    # Return the number of people in the S, I and R compartments at each timestep 
    # (in the same order as the input state variables)
    return(list(c(dS, dI, dR))) 
    })
  
}

# MODEL OUTPUT (solving the differential equations):

# Solving the differential equations using the ode integration algorithm
output <- as.data.frame(ode(y = initial_state_values, 
                            times = times, 
                            func = sir_model,
                            parms = parameters))

Generating 2 plots:

output_long <- melt(as.data.frame(output), id = "time")                  # turn output dataset into long format

# Calculating the proportion in each compartment as a column in the long-format output
output_long$proportion <- output_long$value/sum(initial_state_values)

# Plot the proportion of people in the S, I and R compartments over time
ggplot(data = output_long,                                               # specify object containing data to plot
       aes(x = time, y = proportion, colour = variable, group = variable)) +  # assign columns to axes and groups
  geom_line() +                                                          # represent data as lines
  xlab("Time (days)")+                                                   # add label for x axis
  ylab("Proportion of the population") +                                 # add label for y axis
  labs(colour = "Compartment",                                           # add legend title  
       title = "Proportion susceptible, infected and recovered over time") +                                                               
  theme(legend.position = "bottom")                                      # move legend to the bottom of the plot

# Calculating the effective reproduction number in a new column
output$reff <- parameters["beta"]/parameters["gamma"] *                  # R0 = beta/gamma
                output$S/(output$S+output$I+output$R)                    # multiply R0 by the proportion susceptible
                                                                         # at each timestep/for each row
# In this calculation, the total population size (output$S+output$I+output$R) is calculated for each timestep
# so this approach would also be appropriate if the population size varies over time

# Plot Reff
ggplot(data = output,                                                    # specify object containing data to plot
       aes(x = time, y = reff)) +                                        # assign columns to axes and groups
  geom_line() +                                                          # represent data as lines
  xlab("Time (days)")+                                                   # add label for x axis
  ylab("Reff") +                                                         # add label for y axis
  labs(title = "Effective reproduction number over time")                # add plot title

7.0.1 How does Reff vary over the course of the epidemic? What do you notice about the connection between the change in Reff and the epidemic curve over time? In particular, in relation to Reff, when does the epidemic peak and start to decline?

The effective reproduction number is highest when everyone is susceptible: at the beginning, Reff = R0. At this point in our example, every infected cases causes an average of 4 secondary infections. Over the course of the epidemic, Reff declines in proportion to susceptibility.

The peak of the epidemic happens when Reff goes down to 1 (in the example here, after 50 days). As Reff decreases further below 1, the epidemic prevalence goes into decline. This is exactly what you would expect, given your understanding of the meaning of Reff: once the epidemic reaches the point where every infected case cannot cause at least one more infected case (that is, when Reff < 1), the epidemic cannot sustain itself and comes to an end.

---
title: "W2 SIR dynamic with vary parametters"
author: "Binh Thang Tran"
date: "5/18/2020"
output:
  html_document:
    code_download: yes
    code_folding: hide
    number_sections: yes
    theme: journal
    toc: yes
    toc_float: yes
  word_document:
    toc: yes
---


# SIR dynamics with varying parameters

In the first week, you gained first experience with coding simple models in R using the deSolve package. Last week, you went into further details about the drivers of an epidemic and the dynamics of the SIR model. This week, we are bringing this together to think more deeply about the roles of $\beta$ and $\gamma$.

### Question: Imagine a disease where every person infects 1 person on average, every 2 days, and is infectious for 4 days. What are the values for $\beta$ and $\gamma$?

So far, when looking at the dynamics of an epidemic, we have always plotted the number of people in each compartment. Sometimes however it is more informative to look at the proportion of the population in each compartment rather than the actual number, particularly if the population size changes over time. Although we are modelling an epidemic in a closed population with constant population size in this etivity, it is a good opportunity to practice plotting the prevalence as a proportion of the total population who are in the $S$, $I$ and $R$ compartments. To do this, it is helpful to first think about which variables you need to calculate this at each timestep.

Model the epidemic with the $\beta$ and $\gamma$ values you calculated above, assuming introduction of a single infected person in a totally susceptible population of 1 million. Remember, if you have been saving your code somewhere, you can open that and copy-paste the SIR code you have previously developed. You might also have to try different durations to run the model for. Then plot the proportion of susceptible, infected and recovered people over time.

### Question: What do you observe?



# YOUR CODE GOES HERE ###


Now imagine an intervention is introduced to control infection, for example infected people are isolated so that they cannot spread infection. As a result, $\beta$ drops to 0.1. Model the epidemic under this new scenario. 

### Question: What do you observe?


### YOUR CODE GOES HERE ###


In your code above, vary $\gamma$ with $\beta$ held constant at 0.1 days$^{-1}$. Notice that there are some values where the initial infected case fails to cause an epidemic.

### Question: Assuming $\beta$ equals 0.1 days$^{-1}$, what value of $\gamma$ do you need in order to get an epidemic? In real life, what could give rise to such a change in $\gamma$?
*Hint: as the infection rate becomes lower, i.e. the infection spreads more slowly, you have to follow the population over a longer period to observe a visible change in the proportion infected.*

You should see that different combinations of infection and recovery rates can give rise to very different infection dynamics. 

### Question: Based on your answers to the previous question, can you think of a condition involving $\beta$ and $\gamma$ that is necessary for an epidemic? Test this condition using your code above.

**Don't forget to check the correct answers provided in the Solutions!**




# Solution: SIR dynamics with varying parameters

We are modelling a disease where every infectious person infects 1 person on average, every 2 days, and is infectious for 4 days.   

$\beta$ = 1 person/2 days = 0.5 days$^{-1}$  
$\gamma$ = 1/4 = 0.25 days$^{-1}$

**Modelling this epidemic for a duration of 100 days:**

Question
# LOAD THE PACKAGES:
```{r}

library(deSolve)
library(reshape2)
library(ggplot2)
```

# MODEL INPUTS:

```{R}


# Vector storing the initial number of people in each compartment (at timestep 0)
initial_state_values <- c(S = 1000000-1,  # the whole population we are modelling is susceptible to infection
                          I = 1,          # the epidemic starts with a single infected person
                          R = 0)          # there is no prior immunity in the population

# Vector storing the parameters describing the transition rates in units of days^-1
parameters <- c(beta = 0.5,     # the infection rate, which acts on susceptibles
                gamma = 0.25)   # the rate of recovery, which acts on those infected

# TIMESTEPS:

# Vector storing the sequence of timesteps to solve the model at
times <- seq(from = 0, to = 100, by = 1)   # from 0 to 100 days in daily intervals

# SIR MODEL FUNCTION: 

# The model function takes as input arguments (in the following order): time, state and parameters
sir_model <- function(time, state, parameters) {  

    with(as.list(c(state, parameters)), {   # tell R to unpack variable names from the state and parameters inputs    
        
    # Calculating the total population size N (the sum of the number of people in each compartment)
      N <- S+I+R
      
    # Defining lambda as a function of beta and I:
      lambda <- beta * I/N
        
    # The differential equations
      dS <- -lambda * S               # people move out of (-) the S compartment at a rate lambda (force of infection)
      dI <- lambda * S - gamma * I    # people move into (+) the I compartment from S at a rate lambda, 
                                      # and move out of (-) the I compartment at a rate gamma (recovery)
      dR <- gamma * I                 # people move into (+) the R compartment from I at a rate gamma
      
    # Return the number of people in the S, I and R compartments at each timestep 
    # (in the same order as the input state variables)
    return(list(c(dS, dI, dR))) 
    })
  
}


```

# MODEL OUTPUT (solving the differential equations):

```{r}


# Solving the differential equations using the ode integration algorithm
output <- as.data.frame(ode(y = initial_state_values, 
                            times = times, 
                            func = sir_model,
                            parms = parameters))

# PLOTTING THE OUTPUT
output_long <- melt(as.data.frame(output), id = "time")                  # turn output dataset into long format

# Adding a column for the proportion of the population in each compartment at each timestep
# One way of calculating this is dividing the number in each compartment by the total initial population size
# We can do this in this case because our population is closed, so the population size stays the same
# at every timestep
output_long$proportion <- output_long$value/sum(initial_state_values)

# Plot this new column
ggplot(data = output_long,                                               # specify object containing data to plot
       aes(x = time, y = proportion, colour = variable, group = variable)) +  # assign columns to axes and groups
  geom_line() +                                                          # represent data as lines
  xlab("Time (days)")+                                                   # add label for x axis
  ylab("Proportion of the population") +                                 # add label for y axis
  labs(colour = "Compartment")                                           # add legend title
```


### What do you observe when $\beta$ = 0.5 and $\gamma$ = 0.25?

An epidemic occurs, reaching a peak 56 days after introduction of the first infectious case, at which point about 15% of the population are infected. By the end of the epidemic, about 80% of the population have been infected and recovered.

**Modelling a scenario where *beta* drops to 0.1 because an infection control measure is introduced:**


```{R}
# Vector storing the parameters describing the transition rates in units of days^-1
parameters <- c(beta = 0.1,      # the infection rate
                gamma = 0.25)    # the rate of recovery, which acts on those infected

# Solving the differential equations using the ode integration algorithm
output <- as.data.frame(ode(y = initial_state_values, 
                            times = times, 
                            func = sir_model,
                            parms = parameters))

# PLOTTING THE OUTPUT
output_long <- melt(as.data.frame(output), id = "time")                  # turn output dataset into long format

# Calculating the proportion in each compartment
output_long$proportion <- output_long$value/sum(initial_state_values)

ggplot(data = output_long,                                               # specify object containing data to plot
       aes(x = time, y = proportion, colour = variable, group = variable)) +  # assign columns to axes and groups
  geom_line() +                                                          # represent data as lines
  xlab("Time (days)")+                                                   # add label for x axis
  ylab("Proportion of the population") +                                 # add label for y axis
  labs(colour = "Compartment")                                           # add legend title
```




### What do you observe when $\beta$ is reduced to 0.1 instead, with $\gamma$ remaining at 0.25?

Under this set of conditions, no epidemic occurs - the number of infected people does not increase following the introduction of a first infectious case.

### Assuming $\beta$ equals 0.1, what value of $\gamma$ do you need in order to get an epidemic? In real life, what could give rise to this change in $\gamma$?

With $\gamma$ around 0.09 or lower, we start to see a small epidemic if we run the model for long enough (ca. 1000 days). Different mechanisms can lead to such a decrease in the recovery rate, corresponding to an increase of the average infectious period, for example strain evolution of the infectious agent or changes in social behaviour.

### Based on your answers to the previous question, can you think of a condition involving $\beta$ and $\gamma$ that is necessary for an epidemic? Test this condition using your code above. 

For an epidemic to happen, the ratio $\beta$/$\gamma$ has to be greater than 1. In other words, to give rise to an epidemic, infectious people have to be **infectious enough** ($\beta$ has to be high enough) for **long enough** ($\gamma$ has to be low enough) to pass on the pathogen - $\beta$ has to be higher than $\gamma$. Because of the relationship between these two parameters, a low infection rate can still lead an epidemic if infected people are infectious for long enough, as you modelled in the previous question.


```{R}

```



# week 2: 


In the last etivity, you derived the expression for the basic reproduction number R<sub>0</sub> in the simple SIR model: we get an epidemic only if $\beta$/$\gamma$ > 1, i.e. if the average number of secondary cases caused by a single infected case in a totally susceptible population is greater than 1. As susceptibility declines over the course of the epidemic, the effective reproduction number R<sub>eff</sub> determines the shape of the epidemic curve as it reflects the amount of immunity in the population at any given time.

Remember that in a simple homogenous SIR model, R<sub>eff</sub> is directly related to the proportion of the population that is susceptible:
\begin{align}
R_{eff} = R_{0} \frac{S}{N}
\end{align}

In this etivity, your task is to model an epidemic and study the connection between the behaviour of R<sub>eff</sub> and the epidemic curve. Like in the previous etivities, we are looking at a closed fully susceptible population, into which a single infected person is introduced. The rest is up to you! Use your previously developed SIR code and chose your combination of parameters so that R<sub>0</sub> > 1. Solve the differential equations, and use the output to calculate R<sub>eff</sub> at each timestep. Then, generate 2 plots for comparison: one of the proportion of susceptible, infected and recovered individuals over time, and one of the effective reproduction number over time. 

Comparing the changes over time is easiest if you can align the R<sub>eff</sub> plot directly below the SIR proportion plot. To achieve this, include the code for the 2 plots right after each other in the same cell, and move the legend of the proportion plot to the bottom using the theme command. Use help(theme) or search the internet to find out how!

# Solution: Simulating the effective reproduction number R<sub>eff</sub>

```{r}
# LOAD THE PACKAGES:
library(deSolve)
library(reshape2)
library(ggplot2)

# MODEL INPUTS:

# Vector storing the initial number of people in each compartment (at timestep 0)
initial_state_values <- c(S = 1000000-1,   # the whole population we are modelling is susceptible to infection
                          I = 1,           # the epidemic starts with a single infected person
                          R = 0)           # there is no prior immunity in the population

# Vector storing the parameters describing the transition rates in units of days^-1
parameters <- c(beta = 0.4,      # the infection rate, which acts on susceptibles
                gamma = 0.1)     # the rate of recovery, which acts on those infected

# TIMESTEPS:

# Vector storing the sequence of timesteps to solve the model at
times <- seq(from = 0, to = 100, by = 1)   # from 0 to 100 days in daily intervals

# SIR MODEL FUNCTION: 

# The model function takes as input arguments (in the following order): time, state and parameters
sir_model <- function(time, state, parameters) {  

    with(as.list(c(state, parameters)), {  # tell R to unpack variable names from the state and parameters inputs    
        
    # Calculating the total population size N (the sum of the number of people in each compartment)
      N <- S+I+R
      
    # Defining lambda as a function of beta and I:
      lambda <- beta * I/N
        
    # The differential equations
      dS <- -lambda * S               # people move out of (-) the S compartment at a rate lambda (force of infection)
      dI <- lambda * S - gamma * I    # people move into (+) the I compartment from S at a rate lambda, 
                                      # and move out of (-) the I compartment at a rate gamma (recovery)
      dR <- gamma * I                 # people move into (+) the R compartment from I at a rate gamma
      
    # Return the number of people in the S, I and R compartments at each timestep 
    # (in the same order as the input state variables)
    return(list(c(dS, dI, dR))) 
    })
  
}

# MODEL OUTPUT (solving the differential equations):

# Solving the differential equations using the ode integration algorithm
output <- as.data.frame(ode(y = initial_state_values, 
                            times = times, 
                            func = sir_model,
                            parms = parameters))
```


Generating 2 plots:
```{r}
output_long <- melt(as.data.frame(output), id = "time")                  # turn output dataset into long format

# Calculating the proportion in each compartment as a column in the long-format output
output_long$proportion <- output_long$value/sum(initial_state_values)

# Plot the proportion of people in the S, I and R compartments over time
ggplot(data = output_long,                                               # specify object containing data to plot
       aes(x = time, y = proportion, colour = variable, group = variable)) +  # assign columns to axes and groups
  geom_line() +                                                          # represent data as lines
  xlab("Time (days)")+                                                   # add label for x axis
  ylab("Proportion of the population") +                                 # add label for y axis
  labs(colour = "Compartment",                                           # add legend title  
       title = "Proportion susceptible, infected and recovered over time") +                                                               
  theme(legend.position = "bottom")                                      # move legend to the bottom of the plot

# Calculating the effective reproduction number in a new column
output$reff <- parameters["beta"]/parameters["gamma"] *                  # R0 = beta/gamma
                output$S/(output$S+output$I+output$R)                    # multiply R0 by the proportion susceptible
                                                                         # at each timestep/for each row
# In this calculation, the total population size (output$S+output$I+output$R) is calculated for each timestep
# so this approach would also be appropriate if the population size varies over time

# Plot Reff
ggplot(data = output,                                                    # specify object containing data to plot
       aes(x = time, y = reff)) +                                        # assign columns to axes and groups
  geom_line() +                                                          # represent data as lines
  xlab("Time (days)")+                                                   # add label for x axis
  ylab("Reff") +                                                         # add label for y axis
  labs(title = "Effective reproduction number over time")                # add plot title
```


### How does R<sub>eff</sub> vary over the course of the epidemic? What do you notice about the connection between the change in R<sub>eff</sub> and the epidemic curve over time? In particular, in relation to R<sub>eff</sub>, when does the epidemic peak and start to decline?


The effective reproduction number is highest when everyone is susceptible: at the beginning, R<sub>eff</sub> = R<sub>0</sub>. At this point in our example, every infected cases causes an average of 4 secondary infections. Over the course of the epidemic, R<sub>eff</sub> declines in proportion to susceptibility. 

The peak of the epidemic happens when R<sub>eff</sub> goes down to 1 (in the example here, after 50 days). As R<sub>eff</sub> decreases further below 1, the epidemic prevalence goes into decline. This is exactly what you would expect, given your understanding of the meaning of R<sub>eff</sub>: once the epidemic reaches the point where every infected case cannot cause at least one more infected case (that is, when R<sub>eff</sub> < 1), the epidemic cannot sustain itself and comes to an end.


