Constructing a Markov model for cost-effectiveness analysis using Excel: A tutorial

Mark Bounthavong

25 January 2024

Introduction

In pharmacoeconomics, modeling chronic diseases can be difficult using a decision tree. The branches continue to grow and become more complex due to the potential pathways that might occur in the long run. In a decision tree model, the path is unidirectional, which means that subjects move from left to right. Chronic diseases tend to have long time horizons, which can lead to various paths and endpoints and complicate the model further. To simplify this, we can use disease (or transition) states to introduce the concept of “re-entry.” Markov models use disease states to emulate a disease course, particularly for chronic diseases. These disease states may be represented as branches on a decision tree.

For example, let’s take a look at Figure 1, which is a decision tree with two treatment strategies (e.g., Treatment A and Treatment B). Subjects who receive Treatment A might experience an “illness” and either “live” or “die.” Alternatively, subjects who receive Treatment A might experience “no illness” and either “live” or “die.”

Figure 1. Decision tree.

What happens after they live?

Let’s assume that subjects who live may experience either an “illness” or “no illness.” Then, the subjects can either “live” or “die” once again. If we add these extra branches to Treatment A and Treatment B, the decision tree becomes quite complex (Figure 2).

Figure 2. Decision tree with more branches.

There are a couple of features in this complex decision tree that we should be aware of (Figure 3). First, the “illness” node branches to the “live” and “die” nodes. Next, if the subject lives, they can experience another “illness” or “no illness.” This can repeat as much as the modeler wants or for as long as the time horizon continues. Notice that there is a pattern: the decision tree branches that occur after the “live” node are exactly the same as the earlier branches.

Figure 3. Decision tree with more branches.

We can think of these repeated patterns as loops where each node can re-enter at a previous node. Figure 4 illustrates how one can do this with the “illness” -> “illness” nodes and the “no illness” -> “no illness” nodes.

Figure 4. Decision tree - loops or re-entry points.

This can be redrawn as three states. Figure 5 illustrates the three states that are based on the decision tree.

Figure 5. 3-state Markov model.

In this Markov model example, there are three states (“Well”, “Illness”, and “Dead”). The “Well” (or “No Illness”) and “Illness” states are the disease (or transition) states. The “Dead” state is the absorbing state, which means that subjects who enter it are unable to transition to another state in the Markov model. This makes sense because subjects who die are not going to transition to “Well” afterwards; they will remain dead for the rest of the model.

Motivating example

Markov models can be built in R or Excel. In this tutorial, we will construct a 3-state Markov model using Excel. The example above will serve as our motivating example.

The three states are:

  • Disease state 1 - “Well” or “No illness”

  • Disease state 2 - “Illness”

  • Absorbing state - “Dead”

Let’s assume that we are simulating a disease for a group of 20-year-old adults. The time horizon for our three-state Markov model will be 100 years. Each cycle will be 1 year.

It’s important to declare how long the model will run (the time horizon) and the length of each cycle. Each iteration of the Markov simulation is called a cycle. If each cycle is 1 year, then a time horizon of 100 years will involve 100 cycles or iterations.

The Markov model is iterated through a sequential process in which each cycle is updated conditional on the previous one. For example, the number of subjects in the “Well” state at cycle 1 may be greater than the number in the “Well” state at cycle 2. Therefore, the number of subjects that transition out of the “Well” state in cycle 2 may be less than the number that transitioned out in cycle 1.

Markov process versus Markov chain

Markov models are iterated at each cycle (or sequence). The engine that is used to perform these iterations is called a Markov process. But there is an important distinction to make.

When the transition probabilities are dynamic (changing with each cycle), we call this iteration a Markov process (Figure 6). For example, in some diseases, the likelihood of dying increases with age; hence, the transition probability of dying will change with each cycle.

Figure 6. Markov process.

However, when the transition probabilities are static (not changing with each cycle), this is a special type of Markov process called a Markov chain (Figure 7).

Figure 7. Markov chain.

In this example, our three-state Markov model will use static transition probabilities; hence, we’ll iterate our model as a Markov chain.
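To make the distinction concrete, here is a minimal sketch in Python. The static value is the “Well -> Dead” probability from our example; the age-dependent version is a hypothetical illustration of a Markov process, with made-up base and growth parameters.

```python
# Minimal sketch: static vs. dynamic transition probabilities.
# The static value (0.02, Well -> Dead) comes from our example; the
# age-dependent version is hypothetical and for illustration only.

def tp_die_chain(cycle: int) -> float:
    """Markov chain: the probability of dying is the same every cycle."""
    return 0.02

def tp_die_process(cycle: int, base: float = 0.02, growth: float = 0.001) -> float:
    """Markov process: the probability of dying increases with each cycle."""
    return base + growth * cycle

for cycle in (0, 10, 50):
    print(cycle, tp_die_chain(cycle), tp_die_process(cycle))
```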

Step 1: Transition probabilities

We will need to define the parameters to be used in the Markov model. Since the Markov model has transition probabilities, we will need to define these for our example.

Let’s visualize the different transition pathways (indicated by the arrows) that occur in our three-state Markov model (Figure 8).

Figure 8. 3-state Markov model - Transition probabilities.

We can redraw this as a matrix (Figure 8.1). The state in the column is the origin and the state in the row is the destination.

Figure 8.1. Markov probability matrix.

In total, we have seven transition probabilities denoted as \(tp_{i}\), where \(i\) represents the transition path. In our example, we allow subjects to transition from the “Illness” state back to the “Well” state. Depending on your model, you may not allow this to occur.

We can create a table to enter these values in Excel (Figure 9).

Figure 9. 3-state Markov model - Transition probabilities.

Since we are applying probability theory, we need to make sure that the sum of the transition probabilities for each state is equal to 100%. For instance, the “Well” state has three transition pathways:

  • Well -> Well (\(tp_{1}\) = 0.83)

  • Well -> Illness (\(tp_{2}\) = 0.15)

  • Well -> Dead (\(tp_{3}\) = 0.02)

The sum of these needs to be 100%. In Figure 9, we have the transition probabilities for each transition pathway. Notice that \(tp_{1} = 1 - tp_{2} - tp_{3}\). Because probability theory requires the transition probabilities out of a state to sum to 100%, one of the transition probabilities can always be calculated as the balance of the others.

Notice that the “Dead” state’s transition probability is 100%. That’s because it’s the absorbing state, which means that once a subject enters it, they cannot leave. This makes sense because a subject who dies cannot be resurrected to the “Well” or “Illness” states.

We can redraw this as a Markov probability matrix (Figure 9.1).

Figure 9.1. Markov probability matrix.
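As a sketch of the same matrix in Python (assuming numpy), we can lay the seven transition probabilities out with rows as origins and columns as destinations (the transpose of Figure 8.1’s layout) and check the 100% rule programmatically. Note that \(tp_{1}\) through \(tp_{3}\) and \(tp_{7}\) come from the text, \(tp_{5} = 0.30\) is implied by the cycle-2 value reported in Step 4, and \(tp_{4}\) and \(tp_{6}\) are illustrative placeholders; substitute the actual values from Figure 9.

```python
import numpy as np

# Rows are origin states, columns are destinations: [Well, Illness, Dead].
# tp1-tp3 and tp7 come from the text; tp5 = 0.30 is implied by the
# cycle-2 value in Step 4; tp4 and tp6 are illustrative placeholders.
P = np.array([
    [0.83, 0.15, 0.02],   # Well    -> Well (tp1), Illness (tp2), Dead (tp3)
    [0.30, 0.60, 0.10],   # Illness -> Well (tp5), Illness (tp4), Dead (tp6)
    [0.00, 0.00, 1.00],   # Dead    -> Dead (tp7); the absorbing state
])

# Probability theory check: each state's outbound probabilities sum to 100%.
assert np.allclose(P.sum(axis=1), 1.0)
```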

Step 2: Costs and Payoffs

Once you have determined the transition probabilities for the Markov model, you will need to determine the costs and payoffs accrued in each state. For instance, a subject in the “Well” state accrues some costs, such as medication and office-based visits, while also generating some payoffs, such as an improved utility score or quality of life (QoL) value.

Every time a subject occupies a specific state, some costs and payoffs are accrued.

For this example, we’ll make some assumptions about the costs and payoffs associated with each disease state. Figure 10 contains the costs and payoffs (utilities) for each disease state. Note that the “Dead” state has zero costs and zero utility.

In pharmacoeconomics, the utility score is the patient preference value for being in a health state. It’s not to be confused with the utility that economists use in welfare theory; the two fields use the term “utility” differently, which is a common source of confusion. For this example, we will refer to the utility score as the patient preference.

The utility score has a range between 0 and 1, where 0 is equivalent to death and 1 is equivalent to perfect health. The utility score is used to estimate the quality-adjusted life year (QALY). For instance, 1 QALY is equal to living in perfect health (utility score = 1) for 1 year. Equivalently, 1 QALY is equal to living in moderate health (utility score = 0.50) for two years.

Figure 10. 3-state Markov model - Costs and Payoffs.

Based on our costs and payoffs, a subject in the “Well” state will accrue $500 and 1.00 utility for the cycle (e.g., year). Alternatively, a subject in the “Illness” state will accrue $2000 and 0.50 utility for the cycle. Finally, a subject in the “Dead” state accrues $0 and 0 utility since they are dead.
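As a sketch, Figure 10’s inputs can be held in two vectors, ordered the same way as the states in the transition matrix sketch above:

```python
import numpy as np

# Per-cycle cost and utility for each state, from Figure 10:
# order is [Well, Illness, Dead].
state_costs = np.array([500.0, 2000.0, 0.0])   # dollars per cycle
state_utils = np.array([1.0, 0.5, 0.0])        # utility (QALY weight) per cycle
```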

Step 3: Setup the Markov process

Now that we have the transition probabilities and the costs/payoffs for each disease state, we can begin to build the mathematical model in Excel. The Excel file can be downloaded from the tutorial’s GitHub page. (Note: To download the file, click on the file until you see the Raw option on the upper right. Right-click on the Raw option and use “Save link as” to save the file somewhere on your computer. This will prevent file corruption. See Figure 11.)

Figure 11. Saving Excel file from GitHub.

Cycle 0 (zero)

You will need to download the Excel file to follow along. The file is located on the Markov model tutorial GitHub site. (Note: Make sure to follow the directions above to properly download the Excel file; otherwise, it will be corrupt.)

We start the Markov model with cycle 0. In the example, we populate the cycle number from 0 to 100 starting in cell E12 (Figure 12.1).

Figure 12.1. Location of cell E12.

At cycle 0, we assume that all patients will start in the “Well” state (Figure 12.2). The cohort size can be any value, but for ease of calculation, we’ll set it at 1 subject. This way, when we interpret the values, we can interpret them as probabilities instead of the number of subjects. The cohort size is set in cell G6 in the Excel file.

In cell G12, we enter the formula =$G$6 * 1, which multiplies the cohort size by the probability of being in the “Well” state. Since everyone begins in the “Well” state, the probability is equal to 1.

In cells H12 and I12, we enter the formula =$G$6 * 0, which multiplies the cohort size by the probability of being in the “Illness” and “Dead” states. Since everyone begins in the “Well” state, the probability of being in the “Illness” and “Dead” states is 0 (zero).

Figure 12.2. Building the mathematical model - Cycle 0.
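In the Python sketch, cycle 0 is just the cohort size times the initial membership probabilities:

```python
import numpy as np

cohort_size = 1.0   # cell G6; with a cohort of 1, counts read as probabilities

# Cycle 0 (cells G12, H12, I12): everyone starts in the Well state.
state = cohort_size * np.array([1.0, 0.0, 0.0])   # [Well, Illness, Dead]
print(state)   # [1., 0., 0.]
```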

Cycle 1

Once we set up the formula for cycle 0, we can begin to set up the rest of the Markov model for cycles 1 to 100.

In cycle 1, we need to make sure we are using the size of the cohort from the previous cycle. The number of subjects eligible for transition will be based on the number in the previous cycle.

In cell G13, we need to identify all the transition pathways that go into the “Well” state in cycle 1. In Figure 13, these are denoted by the red arrows. According to our three-state Markov model, there are two paths to the “Well” state: “Well -> Well” and “Illness -> Well.”

In other words, we multiply the probability of transitioning from the “Well” state to the “Well” state by the size of the cohort in the “Well” state in cycle 0, plus the probability of transitioning from the “Illness” state to the “Well” state times the size of the cohort in the “Illness” state in cycle 0. The formula is =(G12 * tp_1) + (H12 * tp_5), where \(tp_{1}\) is the transition probability from “Well -> Well” and \(tp_{5}\) is the transition probability from “Illness -> Well.”

Figure 13. Building the mathematical model - Well state in Cycle 1.

In cell H13, we need to identify all the transition pathways that go into the “Illness” state in cycle 1. In Figure 14, these are denoted by the red arrows. According to our three-state Markov model, there are two paths to the “Illness” state: “Illness -> Illness” and “Well -> Illness.”

In other words, we multiply the probability of transitioning from the “Illness” state to the “Illness” state by the size of the “Illness” cohort in cycle 0, plus the probability of transitioning from the “Well” state to the “Illness” state times the size of the cohort in the “Well” state in cycle 0. The formula is =(G12 * tp_2) + (H12 * tp_4), where \(tp_{2}\) is the transition probability from “Well -> Illness” and \(tp_{4}\) is the transition probability from “Illness -> Illness.”

Figure 14. Building the mathematical model - Illness state in Cycle 1.

In cell I13, we need to identify all the transition pathways that go into the “Dead” state in cycle 1. In Figure 15, these are denoted by the red arrows. According to our three-state Markov model, there are three paths to the “Dead” state: “Dead -> Dead”, “Illness -> Dead,” and “Well -> Dead.”

In other words, we multiply the probability of transitioning from the “Well” state to the “Dead” state by the size of the “Well” cohort in cycle 0, plus the probability of transitioning from the “Illness” state to the “Dead” state times the size of the cohort in the “Illness” state, plus the probability of transitioning from the “Dead” state to the “Dead” state (which is 100%) times the size of the cohort in the “Dead” state in cycle 0. The formula is =(G12 * tp_3) + (H12 * tp_6) + (I12 * tp_7), where \(tp_{3}\) is the transition probability from “Well -> Dead,” \(tp_{6}\) is the transition probability from “Illness -> Dead,” and \(tp_{7}\) is the transition probability from “Dead -> Dead.”

Figure 15. Building the mathematical model - Dead state in Cycle 1.
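Taken together, the three cell formulas (G13, H13, and I13) are one vector-matrix multiplication: each destination state collects all of its inbound pathways. A minimal sketch, reusing the transition matrix from Step 1 (with the assumed \(tp_{4}\) and \(tp_{6}\) placeholders):

```python
import numpy as np

P = np.array([
    [0.83, 0.15, 0.02],   # Well    -> Well, Illness, Dead
    [0.30, 0.60, 0.10],   # Illness -> Well, Illness, Dead (tp4, tp6 assumed)
    [0.00, 0.00, 1.00],   # Dead    -> Dead
])
state0 = np.array([1.0, 0.0, 0.0])   # cycle 0: everyone in the Well state

# Cycle 1: the same arithmetic as the cell formulas for G13, H13, and I13.
state1 = state0 @ P
print(state1)   # [0.83, 0.15, 0.02]
```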

Cycle 2 and greater

Hopefully, you see a pattern in cycle 1, because this pattern is the same for the remaining cycles all the way to the end of our time horizon at cycle 100. You can select the cells in cycle 1 ($G$13 to $J$13) and drag them all the way down to cycle 100; Excel will populate the remaining values. Make sure to check the last column to ensure that the Total is equal to 1. This is a check that the sum of the probabilities equals 1.
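Excel’s drag-and-fill is equivalent to iterating the same recurrence in a loop. A sketch (again with the assumed \(tp_{4}\) and \(tp_{6}\) placeholders), including the Total check:

```python
import numpy as np

P = np.array([[0.83, 0.15, 0.02],
              [0.30, 0.60, 0.10],   # tp4 and tp6 are assumed placeholders
              [0.00, 0.00, 1.00]])
n_cycles = 100

trace = np.zeros((n_cycles + 1, 3))   # one row per cycle, including cycle 0
trace[0] = [1.0, 0.0, 0.0]
for t in range(1, n_cycles + 1):
    trace[t] = trace[t - 1] @ P       # the same drag-and-fill recurrence

# The Total check: the three states must sum to 1 at every cycle.
assert np.allclose(trace.sum(axis=1), 1.0)
print(trace[2, 0])   # Well at cycle 2: ~0.7339, matching Figure 16.2
```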

Step 4: Estimate the expected costs and payoffs

Estimate the expected costs for each disease state at each cycle

Once you have set up the Markov model in Excel, the next step is to estimate the expected costs and payoffs at each cycle.

Let’s take a look at cycle 1. We have the number of subjects (probabilities, because our cohort size is 1) in each state for cycle 1 (0.83 in “Well,” 0.15 in “Illness,” and 0.02 in “Dead”). We multiply these values by the costs associated with each state at cycle 1.

Let’s refer back to Figure 10, which summarizes the costs and utility score for each state. For cycle 1, the expected cost for the “Well” state is 0.83 * $500 = $415 (Figure 16.1).

Figure 16.1. Estimating the expected costs at cycle 1.

We repeat this procedure for the remaining cycles (Figure 16.2). For example, the expected cost in the “Well” state for cycle 2 is 0.7339 * $500 = $366.95.

Figure 16.2. Estimating the expected costs at cycle 2.

We perform the same procedure for the “Illness” state and the “Dead” state. However, in the “Dead” state, the expected cost is zero.
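In the Python sketch, the expected cost of every state at a cycle is an element-wise product of that cycle’s state membership and the per-cycle state costs:

```python
import numpy as np

state_costs = np.array([500.0, 2000.0, 0.0])   # Well, Illness, Dead (Figure 10)
cycle1 = np.array([0.83, 0.15, 0.02])          # state membership at cycle 1

# Expected cost per state at cycle 1 = membership * per-cycle state cost.
expected_costs = cycle1 * state_costs
print(expected_costs)   # [415., 300., 0.]; the Well value matches Figure 16.1
```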

Estimate the expected utility for each disease state at each cycle

Next, we can estimate the expected utility score in the “Well” state for cycle 1.

Similar to how we estimated the expected costs, we can estimate the expected utilities for each disease state at cycle 1. In Figure 19.1, we have the number of subjects in each state at cycle 1. We multiply these values by the utility scores associated with each state at cycle 1. At cycle 1, the expected utility for the “Well” state is 0.83 * 1 = 0.83 QALYs (Figure 19.1).

Note: You’ll notice that the unit is quality-adjusted life years (QALYs) instead of utility. This is because we are capturing the utility value for 1 year. In pharmacoeconomics, the QALY is equal to the utility score multiplied by the time period experienced by the patient. In this case, each cycle is 1 year, and when multiplied by the utility of the disease state, the QALY is the utility score experienced by the subject for 1 year.

Figure 19.1. Estimating the expected utility for the Well state at cycle 1.

We can repeat this process for cycle 2 (Figure 19.2). The expected utility score of the “Well” state for cycle 2 is 0.7339 * 1 = 0.73 QALYs.

Figure 19.2. Estimating the expected utility for the Well state at cycle 2.

Sum the expected costs and QALYs for all the cycles

To generate the total expected costs and QALYs for the entire time horizon (cycles 0 to 100), you will need to sum all of the expected costs and QALYs. Figure 20 illustrates this; make sure that you sum all the values in each column. We have a total expected cost of approximately $31,129 and a total expected benefit of 27.91 QALYs.

Figure 20. Sum the total expected costs and QALYs.
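Putting the pieces together, here is a self-contained sketch of the whole un-discounted model. With the placeholder \(tp_{4}\) and \(tp_{6}\) values, the totals will not reproduce the $31,129 and 27.91 QALYs above; substituting Figure 9’s actual transition probabilities should.

```python
import numpy as np

P = np.array([[0.83, 0.15, 0.02],
              [0.30, 0.60, 0.10],   # tp4 and tp6 are assumed placeholders
              [0.00, 0.00, 1.00]])
state_costs = np.array([500.0, 2000.0, 0.0])
state_utils = np.array([1.0, 0.5, 0.0])

trace = np.zeros((101, 3))           # cycles 0 through 100
trace[0] = [1.0, 0.0, 0.0]
for t in range(1, 101):
    trace[t] = trace[t - 1] @ P

cost_per_cycle = trace @ state_costs   # expected cost at each cycle
qaly_per_cycle = trace @ state_utils   # expected QALYs at each cycle

print(cost_per_cycle.sum(), qaly_per_cycle.sum())   # total cost, total QALYs
```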

Step 5: Discounting

Since we are modeling 100 years or 100 cycles, we need to apply discounting to the costs and QALYs gained. You can use any value, but we will use a discount rate of 3%.

We discount both the costs and the QALYs to avoid the Keeler-Cretin paradox. If only the costs are discounted and not the benefits, a payer could always do better by delaying the program: over an infinite time horizon, the net present value of the costs approaches zero while the benefits retain their value. This doesn’t make sense, so the consensus is to apply discounting to both costs and benefits.

Think of it this way: if someone were diagnosed with cancer and life-saving treatment needed to be initiated immediately, you wouldn’t delay treatment to benefit from the effect of discounting on the costs of the intervention. The effectiveness of treatment would be attenuated (or discounted) if it were not initiated immediately. What’s the point of saving money if you’re not going to be around to enjoy it?

To apply discounting to the Markov model, we need to modify the formula in the Total Expected Costs and Total Expected QALYs.

The formula for discounting:

\(V_0 = V_{t} / (1 + r)^{t},\)

where \(V_0\) is the net present value, \(V_{t}\) is the value at time \(t\), \(r\) is the discount rate, and \(t\) is the time or cycle when the discount is applied.

For example, in cycle 1, the discounted expected cost for the “Well” state is $415 / (1.03)^1 = $402.91 (Figure 21.1). Note: We applied a 3% discount rate to the total expected costs.

Figure 21.1. Applying the discount rate for the total expected costs in the Well state at cycle 1.
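In the sketch, discounting is a per-cycle weight applied to the per-cycle totals from Step 4:

```python
import numpy as np

r = 0.03                               # 3% annual discount rate
cycles = np.arange(101)                # t = 0, 1, ..., 100
discount = 1.0 / (1.0 + r) ** cycles   # V_0 = V_t / (1 + r)^t

# Cycle-1 expected cost of the Well state, discounted:
print(415.0 * discount[1])   # ~402.91, matching Figure 21.1

# Discounted totals, given cost_per_cycle / qaly_per_cycle from Step 4:
# total_cost_disc = (cost_per_cycle * discount).sum()
# total_qaly_disc = (qaly_per_cycle * discount).sum()
```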

Once we apply discounting, we would expect to see lower total expected costs and QALYs than in the un-discounted Markov model.

We can compare the two models (un-discounted and discounted) and calculate the incremental costs, incremental QALYs, and incremental cost-effectiveness ratio (ICER). Figure 22 summarizes the total expected costs and QALYs, including the ICER, for the un-discounted and discounted Markov models.

Figure 22. Comparison between un-discounted and discounted Markov models.

We are comparing the total expected costs and QALYs between a Markov model that did and did not apply discounting. However, for real-world application of the Markov model, you will be expected to select a more appropriate comparator and apply discounting to the costs and QALYs.

Step 6: Markov trace

You can visualize the number or probability of being in the three states of the Markov model across the cycles. This is easily done using the X-Y Scatter with Smooth lines plot feature in Excel (Figure 23).

Figure 23. Creating the Markov trace in Excel.

Figure 24 contains the final Markov trace, which is presented in the Excel file under the Markov model trace sheet.

Figure 24. Markov model trace.
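Outside of Excel, the same trace can be plotted in a few lines of Python (a sketch assuming numpy and matplotlib, with the placeholder \(tp_{4}\) and \(tp_{6}\) values as before):

```python
import numpy as np
import matplotlib.pyplot as plt

P = np.array([[0.83, 0.15, 0.02],
              [0.30, 0.60, 0.10],   # tp4 and tp6 are assumed placeholders
              [0.00, 0.00, 1.00]])

trace = np.zeros((101, 3))
trace[0] = [1.0, 0.0, 0.0]
for t in range(1, 101):
    trace[t] = trace[t - 1] @ P

# Plot the probability of being in each state across all 100 cycles.
for i, label in enumerate(["Well", "Illness", "Dead"]):
    plt.plot(trace[:, i], label=label)
plt.xlabel("Cycle")
plt.ylabel("Probability of being in state")
plt.legend()
plt.show()
```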

Conclusions

The Markov model is a useful tool to simulate the disease course for a chronic disease. It can be used to introduce “re-entry” points and simplify a decision tree model. More importantly, it takes advantage of disease states that subjects will likely experience during the course of their disease, which can be used to assign costs and utilities that are in turn used to calculate the total expected costs and QALYs.

Another advantage, which we’ll discuss in future tutorials, is the application of stochastic processes, where we can characterize the uncertainty surrounding the parameters in a multivariable sensitivity analysis, also called a probabilistic sensitivity analysis.

Acknowledgement

The decision tree was built using a tool developed by the University of Pittsburgh.

I was inspired by Briggs et al.’s (2006) text “Decision Modelling for Health Economic Evaluation,” which can be acquired from Wiley’s website (link). I consider it one of the best texts on instructing users how to construct Markov models in Excel.

GitHub site

Materials for this tutorial are available on the Markov model tutorial GitHub site (link).

Disclaimer

This is a work in progress.

This is for educational purposes only.