Gallstone Diagnosis and Treatment with Discrete Markov Chain Analysis

Introduction
Gallstone Diagnosis and Treatment: A Brief Outline
Solution Approach
Marginal Transition Probabilites by Current State
Transition Matrix Dynamics
Predicting Time in Each State Before Death
Conclusion
Appendix

logo Gallstones Occur in the US every day

Introduction

This Whitepaper demonstrates the unique benefits of Discrete Markov Chain Analysis in supporting and improving medical decision making – in particular, when making treatment decisions for gallstone patients.

Discrete Markov Chain Analysis is a mathematical modeling technique derived from matrix algebra. In the case of medical decision support, it describes the transitions a cohort of patients make among a number of mutually exclusive and exhaustive health states during a series of short intervals or cycles.

Discrete Markov Chain Analysis has a array of extremely valuable use cases across a spectrum of industries and in academic research; such as e-commerce, genetics and medicine, weather forecasting, finance and economics, actuarial science, sociology, and many others.

Gallstone Diagnosis and Treatment: A Brief Outline

Below is a summary of the medical problem being analyzed. ¹ ²

The gallbladder’s job is to store bile, which is produced by the liver and helps with digestion.
Gallstones are hard lumps which form in the bile duct or gallbladder.
Obesity and diets high in cholesterol and fat can lead to gallstone formation.
Women develop gallstones with notably higher prevalence than do men.
Gallstones can be asymptomatic – many people have them and don’t know it.
Gallstones usually do not show up with regular X-rays
Nearly 90% of patients who seek treatment receive laparoscopic cholecystectomy surgery, a procedure to remove the stones and gallbladder.

Gallstones

Globally and in the US, there is great variation in gallstone prevalence across geographic and perhaps genetic and/or cultural dimensions (see below). Some populations have rates over 20%, notably Danish and Norweigian; while Japan, Thailand and Sub-Saharan Africa are at or below 5% ³.

In the USA about 15% of the adult population have gallstones, though the rate among Native Americans and Hispanics is much higher than others in the US ⁴, and Americans of European descent have slightly higher rates than those of African descent.

Gallstones

http://www.worldgastroenterology.org/publications/e-wgn/e-wgn-expert-point-of-view-articles-collection/the-growing-global-burden-of-gallstone-disease

Physicians who diagnose asymptomatic gallstones are faced with the decision to either immediately remove the gall bladder to prevent possible life-threatening complications, or to postpone surgery until complications do occur. What is the long-term trend of each strategy?

And how can we maximize the long-term safety and efficacy of the decision to operate now, or to wait?

Markov chains, the key functional tool of Discrete Markov Chain Analysis, can be used to model this scenario. And in absence of a clinical study, Markov Chain analysis may be the only effective way to evaluate the benefits and risks of various medical treatment strategies.

Solution Approach

Suppose in the “postpone surgery” strategy, a patient will continue to have asymptomatic gallstones (state A in the model below), from one 4-month period to the next, with probability 0.95.

Either of two major complications (state C), cholescystis or biliary complications, may result, and require surgery with probability of 0.04.

Because of the patient’s specific age, she will have the probability of natural death of 0.01 (state D.) If the disease progresses and becomes symptomatic, then surgery is performed with a risk of death from surgical complications of 0.005.

Otherwise, once successful surgery is performed, the patient enters the recovery state (state R.)

Ninety percent of the patients move on to the well state (state W), while 9% stay in recovery state, and 1% die of natural causes. Once a patient enters the well state, she continues there until death, with a survival probability of 0.99.

The following 5x5 matrix, called the transition matrix, represents each of these events, and will be the foundation of our Markov Chain strategy to determine if and when to postpone surgery until complications occur. We call this matrix ‘P1’, indicating it is for the 1-period transition, where the period is 4 months, a standard follow-up time for the Gallstone diagnosis and the time basis for each of the statistics described above.

The entries in each row, and in each column, add up to 1. This is because their values represent percentages of the whole population, whose distribution is spanned by the 5 different states as described above. It should be noted that a patient is in exactly one state at a time for a given period, which is why these percentages add up to 1 for any given population cohort.

Each row represents the proportion of the population currently in the given row state. For example, the first row represents only those in the population who are currently in state ‘A’ (asymptomatic’). For each such row, the values indexed by the 5 columns show the next period state, with the same sequence of states in the columns as the rows, as indicated by the following reference key:

A = Asymptomatic gallstones (no need for immediate Surgery)

C = Complications requiring immediate surgery

R = Patient in recovery from prior surgery

W = Well state (patient has fully recovered from surgery)

D = Death

Marginal Transition Probabilites by Current State

Calculation of Marginal Next State Transition Probabilites Distribution by Current State:

For any given inital state, there is also a distribution of next states, as defined in the transition matrix. If we look at the transition matrix one row at a time, we will see the next state distribution for each initial state. This is known as the marignal distribution of transitions.

For example, most folks who are initially in the asymptomatic state ‘A’ will stay there, while most in the Complications state ‘C’ will move on to the Well state ‘W’:

Notice that D (Death) is an absorbing state, i.e., a patient in that state (obviously) can never leave, unlike a patient in any other state. Furthermore, there are paths from every other state to state D with positive probability, and as we all know, the long-term chance of death is 100%.

One of the very useful results of this fact, which will be demonstrated later in this paper, is the ability to estimate time to death for each of the other 4 states, as an aid to decision-making beyond the short-term.

But the primary value of the transition matrix is to project future states for a given number of 4-month time intervals. For example, to know the chance that a patient who is currently in state A (asymtomatic) will still be in that that state in 1 year (that is, after 3 consecutive 4-month transitions.)

This can be done by multiplying the current known (or hypothesized) distribution of states for a given patient cohort, in the form of a vector ‘V’ of probailities which add up to 1; by the transition matrix ‘P1’ one time for each future future time period transition we are interested in. The result will be a vector similar to V, with a probability distribution of the future state in question of the original cohort, across the 5 health states listed above.

The power of Markov chains rests in its ability to deal with the complex and dynamic interactive effects of the movement between sates over time, as demonstrated in the directed graph of the transition matrix below.

This same complexity, however, makes it very difficult to understand the transition chains without help of a directed graph of the permutations of these transitions.

Transition Matrix Dynamics

Below is a plot of the single step Transition Matrix, with directed arrows indicating each possble transition with probability >0. Also shown is the associated probability of moving there from each state:

This is simply a visualization of the P1 transition matrix, in a slightly more informative visualization. But what we want to know, given this initial state information which is provided by the physician, is to see how the state distribution changes over time, and to come up with a method to evalauate the outcome of an arbitrary number of such steps forward in time. What happens after these transitions have change the initial distribution of the population more than once?

First, let’s look at a the transition matrix for 2 steps into the future. This is obtained by multipying the transition matrix ‘P1’ by itself: after 2 time intervals PxP= P^2 will be represent the distribution of migrations from each current state to each future state, afer two 4-month initervals.

2nd Generation Transition Matrix (=PxP): ⁵

Directed graph of of 2- step transition matrices:

3rd generation transition matrix

Then we will create the multiply by the different transition matrices to see the future effect after 3 4-month time periods (i.e., 1 year), and then 5 and 10 years.

Below is the 5-year transition matrix P15. (The 1-year transition matrix was creaated above as P3)

Below is the 10-year transition matrix P30, and its directed graph.

Initial Distribution of Population Health State

The transitions from one state to the next are defined above; but the final distribution of a given cohort or population will depend on the initial distribution.

In our example, most though not all patients begin in the asymtomatic state (‘A’). The overall distribution of population initial states is shown below.

We see there that most of the patients are asymptomatic (‘A’), and most of the rest are in the Well state (‘W’):

This starting point can be adjusted for different populations. In our case, we will use these precentages as the initial state, and then multiply the above transition matrices associated with the 1,5 and 10 year transition matrices, to get the expected distribution of our cohort for after those 3 different time intervals.

The results are below:

Predicting Time in Each State Before Death

One of the more useful functions of a Markov Transition Matrix is the ability it allows to estimate the time spent in each given state before reaching the absorbing’ state (in this case, death.) With a small amount of manipulation and linear algebra, we can derive the following table of years that patients initially in each of the 5 states are expected to spend in the other states before reaching death.

YEARS SPENT IN EACH STATE BEFORE DEATH, BY INITIAL STATE:

LIFE EXPECTANCY IN YEARS BY INITIAL STATE:

Finally, to see the very long-term pattern, we create a 30-year transition matrix, and find the proportions. As might be expected, most but not all of the older gallstone patients are expected to have died after 30 years:

##      A      C      R      W      D 
## 0.0079 0.0003 0.0004 0.3977 0.5937

Conclusion

Our original problem was to discover what is the best descision and long-term trend of each of the 2 gallstone treaetment strategies - operate now, or wait.

Our Analysis shows that over a 5-year period less than half of initially asymtpomatic Gallstone patients remain asymptomatic; and for a 10-year time span only 21% of them do. Also, patients who have surgery have slightly longer life expectancies.

Furthermore, in this analysis we have simplfied the patient state model to assume that asymptomatic patients who do not get surgery will always stay asymptomatic. In reality, there may occasionally be symptoms other than surgey-requiring complications eventually. These types of low-grade symptoms are rare and have a wide range of likelihood estimates; and since they are not major health concerns and not known precisely, they can’t be used directly in a decision model like this, so we left them out.

However, these minor events will surely add some discomfort for some patients, and will also likely add some expense to the average long-term treatment;; so this adds some wait to our conclusion: due to the high degree of safety in laparoscopic cholecystectomy surgery and this other factor, in the event that no clear-cut risk emerges between having surgery or not, the default postion should be surgery for the asymptomatic upon discovery of the asymptomatic patient state.

Although there are many factors involved in treating any disease, any physician who assumes the initial state and transition dynamics as quantified above, should seriously consider these results when determining treatment plans.

In addition, she should experiment with different initial state and transition assumptions, to understand sensitivty to the distribution assumptions. The R-language code contained in the appendix allows for changing assumptions about the initial states and the transition matrix, so these scripts can be used for such experimentation.

Given the assumptions in the hypothetical case described in this paper, the algorithm is proposed: as soon as diagnosis is made and after the evaluation of choledocholitiasis risk, laparoscopic cholecystectomy surgery should be offered to all patients excepting those with unusually high risk of morbidity or mortality due to surery. These ‘guidelines’ are hypothetical and are given for their demonstrative purposes only. They are not to be used as a clinical judgement for any actual patient.

Appendix

R Scripts used in plot and table generation:

Plot the Transition Matrix

library(diagram) par(mfrow=c(1,1)) plotmat(t( P1),relsize=0.7,arr.lcol=col_matrix,box.col=col_fill,txt.col=“yellow”,box.size=.1,main=“Transition Matrix ‘P1 (1 period forward)’”,cex=0.7)

2nd Generation Transition Matrix (=PxP):

P2<-P1%*%P1 #2nd generation transition matrix P2

Directed graph of of 2- step transition matrices:

par(mfrow=c(1,1))

plotmat(round(t(P2),3),relsize=0.75,arr.lcol=col_matrix,box.col=col_fill,txt.col=“yellow”,box.size=.05,main=“GallstoneMatrix ‘P2’(2 periods forward)”,cex=0.7,box.lwd
=1)

3rd generation transition matrix

P3<-P2%*%P1

round(P3,2) plotmat(round(t(P3),2),relsize=0.75,arr.lcol=col_matrix,box.col=col_fill,txt.col=“yellow”,box.size=.05,main=“GallstoneMatrix ‘P3’(3 periods=1 YEAR forward)”,cex=0.7,box.lwd =1)

5-year transition matrix P15.

P15<-P3%%P3%%P3%%P3%%P3

P15

Plot The Directed 5-year Directed Graph:

par(mfrow=c(1,1)) plotmat(round(t(P15),2),relsize=0.75,arr.lcol=col_matrix,box.col=col_fill,txt.col=“yellow”,box.size=.05,main=“GallstoneMatrix ‘P30’(15 periods=5 YEARS forward)”,cex=0.7,box.lwd =1)

10-year transition matrix P30, and its directed graph.

P30<-P15%*%P15

P30

P90<-P30%%P30%%P30

plotmat(round(t(P30),4),relsize=0.75,arr.lcol=col_matrix,box.col=col_fill,txt.col=“yellow”,box.size=.05,main=“GallstoneMatrix ‘P30’(30 periods=10 YEARS forward)”,cex=0.7,box.lwd
=1)

Distribution of our cohort for after those 3 different time intervals.

V3=as.vector(Initial%%P3); names(V3)<-names(Initial) V15=as.vector(Initial%%P15); names(V15)<-names(Initial) V30=as.vector(Initial%*%P30); names(V30)<-names(Initial)

par(mfrow=c(2,2)) barplot(Initial,main=“Initial Distribution”,legend.text=rownames(P1),col=col_fill,ylim=c(0,1),args.legend=list(x = “topright”)) barplot(V3, main=“1-Year Distribution”, ylim=c(0,1),col=col_fill) barplot(V15, main=“5-Year Distribution”, ylim=c(0,1),col=col_fill) barplot(V30, main=“10-Year Distribution”, ylim=c(0,1),col=col_fill)

Years Spent in each state before death:

Create the Identitity Matrix ‘I’ and (non-absorbing)

Sub-matrix of P1 ‘Q’, to calculate the fundamental matrix F=(I-Q)^-1, which will give expected time to death for each state:

I<-matrix(data=c(1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1),nrow=4,ncol=4) Q<-P1[1:4,1:4] F<-solve(I-Q) par(mfrow=c(2,2)) barplot(F[1,]/3,main=“Asymtomatic”,col=col_fill,ylim=c(0,35),args.legend=list(x = “topright”)) barplot(F[2,]/3, main=“Complications”,ylim=c(0,35), col=col_fill) barplot(F[3,]/3, main=“Recovery”,ylim=c(0,35), col=col_fill) barplot(F[4,]/3, main=“Well”,ylim=c(0,35), col=col_fill)

Life expectancy in years, by inital state:

par(mfrow=c(1,1))

VF.Years<-apply(F,1,sum)/3 VF.YearsDF<-data.frame(apply(F,1,sum)/3)

barplot(VF.Years,main=“Life Expectancy in Years Each State”,col=col_fill,ylim=c(33,34))

30-year transition matrix, and the proportions.

V90=as.vector(Initial%*%P90) names(V90)=names(Initial) round(V90,4) par(mfrow=c(2,1)) barplot(Initial,main=“Initial Distribution”,legend.text=rownames(P1),col=col_fill,ylim=c(0,1)) barplot(V90, main=“30-Year Distribution”, ylim=c(0,1),col=col_fill)