The COVID-19 pandemic has highlighted the importance of infectious disease modeling (IDM) in public health decision making for infectious disease prevention and control. Although the concepts and applications of IDM have been in use for a long time, very few public health specialists receive formal training in it.
This module has been prepared to provide formal training in IDM to second-year students undergoing the Masters in Public Health at Achutha Menon Centre for Health Science Studies, Sree Chitra Tirunal Institute for Medical Sciences and Technology. The goal of the module is to develop skills and competencies among students that enable them to understand, develop and interpret infectious disease models.
A model is a simplified representation of a complex phenomenon. We are all familiar with the use of models in various contexts, by architects, economists and many branches of biomedicine; for example, laboratory animals are used as models when carrying out research on drugs or toxic materials.
The steps in developing a model are revisited multiple times during the process. The steps proposed by Habbema et al., based on their development of detailed models for estimating transmission and control of onchocerciasis and schistosomiasis, are as under:-
Figure 3.1: Steps in development of a model.
It is very important to formulate a research question, as it determines the type of model that needs to be constructed. For example:-
If one infectious person infected with a new disease enters a town of 100,000 susceptible individuals, how will the average number of people who are susceptible, infectious and immune to the disease change during the subsequent few weeks?
How many people in a population of 100,000 need to be vaccinated to achieve herd immunity against COVID-19, if the vaccine is 90% effective in preventing infection and 40% of the population has already been affected by COVID-19?
What are the factors responsible for death within the first 10 days of infection with COVID-19 among persons aged 35-49 years receiving treatment at our institution?
It is important to identify the question carefully as it forms the basis for choosing the type of model to be developed. Classically, models have been categorized into the following:-
These models are based on available datasets, such as research data or routine data. They are useful for identifying potential risk factors, as required by the third question above.
These models are based on existing or hypothesized understanding of the causal relationships leading to disease occurrence. This type of model is not useful for identifying risk factors, as a priori knowledge about these factors and their interrelationships is used to define the model.
Models can also be categorised based on the purpose for which they are developed. The categories include the following:-
The purpose of a descriptive model is to describe or illustrate characteristics of some data.
The goal of an inferential model is to produce a decision for a research question or to test a specific hypothesis. The goal is to make some statement of truth regarding a predefined conjecture or idea. In many (but not all) cases, a qualitative statement is produced (e.g., that a difference was “statistically significant”).
Here, the primary goal is that the predicted values have the highest possible fidelity to the true values of the new data.
In IDM, three types of models have been used.
This approach is expensive and is now used only to a limited extent.
It includes models based on statistical and machine learning methods such as regression, time series analysis, etc.
These were first developed at Johns Hopkins University to explore the implications of chance for infection transmission in populations. The original mechanical models consisted of colored beads arranged in trays and were very clumsy, but this approach has continued with the use of computers and computer simulation, and is now an important method of analysis of infectious disease phenomena.
Population parameters are described by symbols and linked by algebraic formulae, and nowadays, analysed using computers. These models are generally categorized as under:-
Individuals in the population are subdivided into broad subgroups (‘compartments’) and the model tracks the infection process for these individuals collectively. These models can be either deterministic or stochastic.
The model tracks the infection process for every individual in the population.
The model incorporates contact (and therefore transmission) between individuals.
A model in which the network of contacts between individuals is explicitly modelled.
Modeling an infectious disease requires an understanding of the following commonly used concepts and terms:-
An infected person who does not have any recognized signs of the infection.
The theoretical average of all infectious persons in a population, averaging over.
The number of new events, such as infections or cases, in the population at risk (usually susceptibles), per person per unit time.
Average number of secondary infectious persons resulting from one infectious person following their introduction into a totally susceptible population. It is sometimes referred to using different names, e.g. the basic reproductive number, the basic reproductive rate, the basic reproduction ratio, etc. It is a number, rather than a rate, since there are no units of time in the definition.
Average number of secondary infectious persons resulting from one infectious person in a given population in which some individuals may already be immune because of infection or vaccination. In the modelling literature, this quantity is also often referred to as the ‘effective reproduction number’.
The per capita rate at which two specific individuals come into effective contact per unit time. It is also sometimes referred to by other names, such as the transmission coefficient, transmission rate, contact parameter, etc.
A contact that is sufficient to lead to transmission if it occurs between an infectious and a susceptible person.
The rate at which susceptible individuals become infected per unit time. It is also known as the incidence rate or the hazard rate.
The time taken for the number of infected individuals to double.
Rate at which the prevalence of infectious persons increases, typically calculated during the early stages of an epidemic.
This is also commonly used to refer to the indirect protection experienced by unvaccinated individuals resulting from the presence of immune individuals in a population.
The proportion of the population which needs to be immune in the population for the infection incidence to be stable.
Time interval between successive infections in a chain of transmission. Also referred to as the generation time.
Figure 5.1: Commonly Used compartment models.
The proportion of individuals in a population that have the outcome of interest at a given time. If an infection is endemic, then prevalence ≈ incidence × duration of the condition.
Contact pattern whereby individuals are equally likely to contact all individuals, irrespective of their age or other characteristic.
Contact pattern whereby the rate at which groups of individuals (characterized by age, gender or some other criterion) come into contact with others depends on the proportion of the total contacts generated by each group.
A summary measure of mixing between individuals.
Non-random mixing, whereby the rate at which individuals come into effective contact depends on their age, sex, or some other characteristic.
Different hazards acting on a single compartment in a model; for example, infected people may be subject to hazards of recovery and death.
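Several of the quantities defined above are linked by simple, widely used relationships. The sketch below illustrates them in Python, assuming random mixing in a homogeneous population and exponential growth early in an epidemic. The function names and the numerical inputs (a basic reproduction number of 3, a growth rate of 0.1 per day) are illustrative assumptions and are not values taken from this module.

```python
import math


def net_reproduction_number(basic_r, proportion_susceptible):
    """Net reproduction number = basic reproduction number x proportion susceptible.

    Assumes individuals mix randomly.
    """
    return basic_r * proportion_susceptible


def herd_immunity_threshold(basic_r):
    """Proportion immune at which the net reproduction number equals 1: 1 - 1/R0."""
    if basic_r <= 1:
        return 0.0  # the infection cannot sustain itself, so no threshold applies
    return 1.0 - 1.0 / basic_r


def doubling_time(growth_rate):
    """Time for the number of infected individuals to double: ln(2) / growth rate."""
    return math.log(2) / growth_rate


def endemic_prevalence(incidence_rate, mean_duration):
    """Approximate prevalence of an endemic condition: incidence x duration."""
    return incidence_rate * mean_duration


basic_r = 3.0  # assumed value, for illustration only
print("Herd immunity threshold:", herd_immunity_threshold(basic_r))
print("Net reproduction number with 50% susceptible:",
      net_reproduction_number(basic_r, 0.5))
print("Doubling time (days) at growth rate 0.1/day:", doubling_time(0.1))
```

With half of the population susceptible, the net reproduction number in this example is still above 1, so incidence would continue to rise until the immune proportion reaches the threshold.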
There are three main considerations when deciding on the model structure:
1) The natural history of the infection.
2) The accuracy and time period over which model predictions are required.
3) The research question.
Figure 6.1: Natural History: Illustrative Example
The time period between infection and onset of clinical symptoms. Note that for many infections, including measles, individuals can be infectious before they show any clinical symptoms.
The time period during which individuals are infectious.
Time period between infection and onset of infectiousness. Sometimes referred to as the ‘latent’ period.
Based on the natural history of a disease, the simplest form of model is the Susceptible-Infected (SI) model (e.g. for HIV/AIDS). With increasing complexity, models can be Susceptible-Infected-Recovered (SIR), Susceptible-Pre-infectious-Infectious-Recovered (SEIR), etc. Common structures for models used to describe the transmission of infections are as under:-
Figure 6.2: Commonly Used compartment models.
The structure also depends on how accurate the model predictions need to be. For example, estimates of the daily numbers of influenza cases based on an SIR model are likely to be less reliable than those from an SEIR model. Similarly, to describe the long-term transmission of an infection, a model may need to incorporate key aspects of the demography of the population (births, deaths, and migration).
A model in which it is assumed that there is no migration into or out of the population, and in which there are no births or deaths.
A model in which it is assumed that the population size does not change.
How complex should models be? The quote, often attributed to Einstein, that ‘models should be as simple as possible and no simpler’ should be borne in mind when designing a model. A model is a simplification of reality that allows us to explore patterns in data and, hopefully, discover fundamental insights that explain these patterns. Thus, a model built to determine the impact of a treatment method may or may not have the same structure as a model built to estimate the impact of vaccination for the same disease agent.
A model in which individuals in the population are subdivided into broad subgroups (compartments) and the model tracks individuals collectively. Models may be deterministic or stochastic. Deterministic models describe what happens ‘on average’ in a population. In these models, the input parameters (e.g. the rate of disease onset or the rate at which people recover) are fixed, and therefore the model’s predictions, such as the number of cases which will be seen over time, are ‘predetermined’. Stochastic models, on the other hand, allow the number of individuals who move between compartments to vary through chance, for example, the rate at which people are infected or infectious individuals recover from disease may vary randomly.
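The difference between the two kinds of model can be seen in a small simulation. The following is a minimal sketch, not a definitive implementation: it assumes a closed population of 1,000 with daily time steps, a hypothetical `contact_rate` (effective contacts per person per day) and `recovery_risk` (daily probability of recovery), and it draws the stochastic transitions from binomial distributions. All names and values are assumptions made for illustration.

```python
import random


def binomial(rng, n, p):
    """Number of successes in n independent trials, each with probability p."""
    return sum(rng.random() < p for _ in range(n))


def simulate(days, s0, i0, contact_rate, recovery_risk, stochastic, seed=None):
    """Return the number of infectious individuals on each day (day 0 included)."""
    rng = random.Random(seed)
    s, i, r = s0, i0, 0
    n = s0 + i0  # closed population: no births, deaths or migration
    infectious = [i]
    for _ in range(days):
        infection_risk = min(1.0, contact_rate * i / n)
        if stochastic:
            # Chance decides how many individuals move between compartments.
            new_infections = binomial(rng, s, infection_risk)
            new_recoveries = binomial(rng, i, recovery_risk)
        else:
            # The expected ("on average") numbers move between compartments.
            new_infections = s * infection_risk
            new_recoveries = i * recovery_risk
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        infectious.append(i)
    return infectious


average = simulate(60, 999, 1, contact_rate=0.5, recovery_risk=0.2, stochastic=False)
run_a = simulate(60, 999, 1, contact_rate=0.5, recovery_risk=0.2, stochastic=True, seed=1)
run_b = simulate(60, 999, 1, contact_rate=0.5, recovery_risk=0.2, stochastic=True, seed=2)
print("Peak infectious:", max(average), max(run_a), max(run_b))
```

The deterministic run returns the same curve every time it is called with the same inputs, whereas each stochastic run depends on its random seed and may even die out after the first few cases.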
The assumption that the risk of infection increases as the population size increases. This might occur if the boundary in which the population is confined remains unchanged with increasing population size, and therefore crowding increases. It is sometimes referred to as the ‘pseudo mass action’ assumption.
The assumption that the risk of infection remains unchanged as the population size increases. This might occur if the boundary in which the population is confined increases with increasing population size, and therefore crowding remains unchanged, or if the behaviour of individuals does not change as the population size increases. It is sometimes referred to as the ‘true mass action’ assumption.
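The two assumptions above lead to different expressions for the force of infection. As a hedged illustration, the sketch below takes the density-dependent form to be β × I (with β the per-pair effective contact rate) and the frequency-dependent form to be c × I / N (with c the number of effective contacts per person per unit time). The symbol c, the function names and the numerical values are assumptions introduced here for illustration.

```python
def force_of_infection_density(beta, infectious):
    """Density-dependent assumption: risk grows with the number infectious (beta * I)."""
    return beta * infectious


def force_of_infection_frequency(contacts_per_person, infectious, population):
    """Frequency-dependent assumption: risk depends on the proportion infectious (c * I / N)."""
    return contacts_per_person * infectious / population


# Keep the proportion infectious at 1% while the population size doubles.
for population in (10_000, 20_000):
    infectious = population // 100
    density = force_of_infection_density(2e-5, infectious)
    frequency = force_of_infection_frequency(0.2, infectious, population)
    print(population, density, frequency)
```

In this example the density-dependent risk doubles when the population doubles, while the frequency-dependent risk stays the same, which is the distinction the two definitions draw.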
Deterministic compartmental models can be set up using either ‘difference’ equations or ‘differential’ equations. Difference equations describe the transitions between different disease categories using discrete (e.g. daily) time steps by expressing the number of individuals at a given time t + 1 (e.g. tomorrow) in terms of the number at an earlier time t (e.g. today).
Once the compartments are identified, the parameters that determine the rates of transition between them need to be specified. These include parameters such as the force of infection, the recovery rate, etc.
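As an illustration of difference equations, the sketch below steps a simple SIR model forward one day at a time for the scenario of one infectious person entering a town of 100,000 susceptible individuals. It assumes a closed population and a force of infection on day t of β × I(t). The parameter values (a recovery rate of 0.2 per day and a β chosen to give a basic reproduction number of about 2) are assumptions made purely for illustration.

```python
def sir_difference(days, s0, i0, immune0, beta, recovery_rate):
    """Iterate the SIR difference equations with a time step of one day.

    S(t+1) = S(t) - lambda(t) * S(t)
    I(t+1) = I(t) + lambda(t) * S(t) - recovery_rate * I(t)
    R(t+1) = R(t) + recovery_rate * I(t)
    where lambda(t) = beta * I(t) is the force of infection.
    """
    s, i, r = float(s0), float(i0), float(immune0)
    history = [(0, s, i, r)]
    for t in range(1, days + 1):
        force_of_infection = beta * i
        new_infections = force_of_infection * s
        new_recoveries = recovery_rate * i
        # Tomorrow's values are written in terms of today's values.
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
        history.append((t, s, i, r))
    return history


population = 100_001
recovery_rate = 0.2                      # assumed: average infectious period of 5 days
beta = 2.0 * recovery_rate / population  # assumed: basic reproduction number of about 2
history = sir_difference(120, 100_000, 1, 0, beta, recovery_rate)

for t, s, i, r in history[::20]:
    print(f"day {t:3d}: susceptible={s:9.0f} infectious={i:8.0f} immune={r:9.0f}")
```

Because nobody enters or leaves the population, the three compartments always sum to the initial population size, which is a useful check when setting up such a model.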