Introduction

Net Promoter Score (NPS) is a widely used market research metric based on a single survey question: “How likely are you to recommend our company, product, or service to a friend or colleague?” Developed by Fred Reichheld and popularized through his 2003 article in the Harvard Business Review, NPS has become a standard tool for measuring customer loyalty across industries.

When applying predictive analytics to NPS, the score is typically treated as a single numeric value, with the goal of forecasting its future trend. While this approach provides a general sense of customer sentiment over time, it often overlooks the underlying dynamics among the different groups that compose the score. Recognizing these internal shifts can help businesses make more informed decisions and allocate resources more effectively.

In this project, I explore an alternative approach to NPS prediction that goes beyond simply forecasting the overall score. Instead, it focuses on the composition of NPS: the proportions of Detractors, Passives, and Promoters. By analyzing how customers transition between these categories over time, we can uncover deeper insights into customer behavior and loyalty. This perspective offers a more granular view of customer dynamics and enables more targeted, data-driven strategies.

Dataset composition

We will work with the NPS for financial services dataset, which can be found on Kaggle. This dataset contains real-world NPS data from a retail bank for the year 2021. It includes 5,000 observations across 7 variables, as displayed below.

##        ID          Market           Survey date         Customer Name     
##  Min.   :1000   Length:5000        Min.   :2021-01-01   Length:5000       
##  1st Qu.:2250   Class :character   1st Qu.:2021-04-02   Class :character  
##  Median :3500   Mode  :character   Median :2021-07-03   Mode  :character  
##  Mean   :3500                      Mean   :2021-06-30                     
##  3rd Qu.:4749                      3rd Qu.:2021-09-28                     
##  Max.   :5999                      Max.   :2021-12-30                     
##      Month           Quarter           NPS        
##  Min.   : 1.000   Min.   :1.000   Min.   : 0.000  
##  1st Qu.: 4.000   1st Qu.:2.000   1st Qu.: 5.000  
##  Median : 7.000   Median :3.000   Median : 8.000  
##  Mean   : 6.496   Mean   :2.503   Mean   : 6.841  
##  3rd Qu.: 9.000   3rd Qu.:3.000   3rd Qu.:10.000  
##  Max.   :12.000   Max.   :4.000   Max.   :10.000

For our study, we will focus on the NPS and Survey date columns.

Creating the NPS metric

The Net Promoter Score (NPS) classifies respondents into three categories based on their ratings: Promoters (respondents who rate 9 or 10), Passives (respondents who rate 7 or 8) and Detractors (respondents who rate 6 or lower). To calculate the NPS at any given point in time \(t\), you subtract the percentage of Detractors from the percentage of Promoters, following the formula:

\[NPS_t = \frac{Promoters_t - Detractors_t}{Promoters_t + Passives_t + Detractors_t} \times 100\]

This simple calculation provides an overall measure of customer loyalty.

For our calculations, we will focus on monthly NPS: for each month \(t\), we apply the formula above using only the survey responses collected within that month.
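The classification and the monthly calculation described above can be sketched in base R as follows. This is a minimal illustration, not the code used for the article: the data frame `toy`, its column names, and the helper functions `nps_category()` and `nps_score()` are hypothetical, and the ratings are made up.

```r
# Toy data standing in for the survey (made-up ratings, not from the dataset)
toy <- data.frame(
  survey_date = as.Date(c("2021-01-05", "2021-01-12", "2021-01-20", "2021-01-28",
                          "2021-02-03", "2021-02-15", "2021-02-21", "2021-02-27")),
  score = c(10, 9, 3, 7, 6, 8, 10, 2)
)

# Map a 0-10 rating to its NPS class:
# 0-6 -> Detractor, 7-8 -> Passive, 9-10 -> Promoter
nps_category <- function(score) {
  cut(score,
      breaks = c(-Inf, 6, 8, Inf),
      labels = c("Detractor", "Passive", "Promoter"))
}

# NPS = percentage of Promoters minus percentage of Detractors
nps_score <- function(category) {
  100 * (mean(category == "Promoter") - mean(category == "Detractor"))
}

toy$category <- nps_category(toy$score)
toy$month <- as.integer(format(toy$survey_date, "%m"))

# One NPS value per month, using only that month's responses
monthly_nps <- sapply(split(toy$category, toy$month), nps_score)
print(monthly_nps)
```

With these made-up ratings, January has two Promoters and one Detractor out of four answers (NPS of 25), while February has one Promoter and two Detractors (NPS of -25).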

Looking at the Monthly NPS

Looking at the NPS scores alone and knowing that the score can range from -100 to 100, it may seem like the situation is concerning. However, it’s important to contextualize the NPS by comparing it to other players within the same segment. There are various interpretations of the NPS number, and while it can give a general sense of customer sentiment, the safest approach is to compare it with that of competitors in the same industry.

Since the NPS is composed of three distinct groups, the Promoters, Passives, and Detractors, it can be insightful to understand the relative volume of each group within the overall score. On its own, the NPS doesn’t provide a complete picture of how many customers are genuinely satisfied or dissatisfied with the company’s services. A deeper analysis of these groups can offer a more nuanced understanding of customer loyalty and satisfaction.

We see significant volatility across NPS groups, which may stem from new customers forming strong first impressions — either positive or negative. It could also reflect natural shifts among returning clients, such as a previously satisfied customer giving a lower score after a poor experience. Regardless of the cause, the data clearly shows that customer classifications are fluid and often shift based on recent interactions.

In the following sections, we will explore an approach to quantify these shifts between classifications and examine how this information can be integrated into strategies for forecasting and customer outreach.

Applied Markov chain

In probability theory and statistics, a Markov chain is a specific type of Markov process characterized by either a discrete state space or a discrete index set, typically representing time. It describes a sequence of possible events in which the probability of each event depends solely on the state attained in the previous event. In simpler terms, this means that “what happens next depends only on the current state.” If you’re interested in diving deeper into this topic, I recommend reading these articles from the Mathematics Library and Chapter 8: Markov Chains from the UoA Department of Statistics.

In a Markov chain, a system moves between states with certain probabilities. These movements are called transitions, and the associated transition probabilities describe the chance of moving from one state to another. The model is defined by three parts: a set of possible states (state space), a transition matrix, and an initial state. It assumes the process continues indefinitely and only the current state matters, not how it was reached.
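As a minimal sketch of these three ingredients in base R, the two-state chain below uses made-up states and probabilities (illustrative assumptions, not values from the survey). It checks that each row of the transition matrix sums to 1 and then pushes an initial state one and two steps forward.

```r
# State space (hypothetical labels)
states <- c("A", "B")

# Transition matrix: row = current state, column = next state
P <- matrix(c(0.9, 0.1,
              0.5, 0.5),
            nrow = 2, byrow = TRUE,
            dimnames = list(states, states))

# Every row holds all the probabilities out of one state, so it must sum to 1
stopifnot(all(abs(rowSums(P) - 1) < 1e-12))

# Initial state: the system starts in A with certainty
initial <- c(A = 1, B = 0)

# Distribution over states after one and two transitions
after_one <- initial %*% P
after_two <- after_one %*% P

print(after_one)
print(after_two)
```

Because only the current distribution enters each multiplication, the history of how the system got there plays no role, which is the Markov property in action.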

Transition probability of NPS scores

Applying the Markov chain idea to our case, we assume that a customer’s next NPS response depends only on their previous one. For instance, while a customer going from a score of 10 to 3 is possible, it seems unlikely. By calculating these probabilities, we not only confirm that such shifts can happen, but we can also measure how likely they are. This helps identify customers at risk of big sentiment swings so we can monitor them more closely.

In this context, the NPS states are the three categories: Detractors, Passives, and Promoters. A customer can stay in the same category or move to another in their next survey. The key is that the transition depends only on their current state — not past ones. This is exactly what transition probabilities capture.

To build this, we focus on customers who took the survey multiple times. Tracking how they move between categories allows us to calculate the percentage of each type of transition, forming our transition matrix.

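# Packages used below: dplyr for the pipeline, tidyr for pivot_wider()
library(dplyr)
library(tidyr)

# Keep only consecutive answers from the same customer: restrict to customers
# with more than one response, sort each customer's answers in chronological
# order, and pair every answer with the previous one (Lag1).
# Category is assumed to already hold the NPS class of each response.
Matrix <- NPS %>%
  add_count(`Customer Name`) %>%
  filter(n > 1) %>%
  select(`Survey date`, `Customer Name`, Category) %>%
  arrange(`Customer Name`, `Survey date`) %>%
  group_by(`Customer Name`) %>%
  mutate(Lag1 = lag(Category),
         Time_diff = as.numeric(difftime(`Survey date`, lag(`Survey date`), units = "days"))) %>%
  ungroup() %>%
  filter(!is.na(Time_diff))  # drops each customer's first answer, which has no predecessor

# Now we create our transition matrix, using the NPS classes as labels.
# Rows are the previous class (Lag1), columns the current class (Category).
# Each row needs to add up to 1, since it holds all transition probabilities out of one state.
Matrix <- Matrix %>%
  count(Lag1, Category) %>%
  pivot_wider(names_from = Category, values_from = n, values_fill = 0)

Labels <- as.character(Matrix$Lag1)
Matrix <- as.matrix(Matrix[, -1])
Matrix <- Matrix / rowSums(Matrix)
rownames(Matrix) <- Labels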

The chart and matrix below show how customers shift between Detractor, Passive, and Promoter categories over time, offering a clearer picture of how customer sentiment evolves.

##           Detractor   Passive  Promoter
## Detractor 0.3833333 0.1333333 0.4833333
## Passive   0.2692308 0.1153846 0.6153846
## Promoter  0.3176471 0.1764706 0.5058824

Each class (Detractor, Passive, Promoter) in the Markov chain is drawn with arrows: arrows pointing to the other classes show the probability of transitioning to a different state, and a circular arrow looping back to the same class shows the likelihood of staying in it. In NPS terms, this tells us how likely a customer is to maintain their satisfaction level or shift to a different one in the next survey; in other words, how consistent or variable their perception is over time.

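Now, let’s look at the numbers. Reading the matrix row by row, a Detractor has about a 48% chance of answering as a Promoter in the next survey, a Passive about 62%, and a Promoter stays a Promoter about 51% of the time. In the other direction, a Promoter drops to Detractor about 32% of the time.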

This shows a general upward trend: customers are more likely to become Promoters than to move in the opposite direction. This paints a much more promising picture than a single static NPS score, as it suggests a positive underlying dynamic in customer sentiment.

Of course, because the system evolves randomly, we can’t predict with certainty where a specific customer will fall in the next survey. But we can estimate the overall distribution of customers across categories in the future — and that’s a powerful insight.

This is where Markov Chains really shine: they help us answer key questions like, “What is the most likely next step for a customer?” or “What will the customer base look like a few cycles ahead?” That allows for more informed strategies and more proactive customer engagement.

Question: Assuming that a customer is currently Passive, what is the probability that they will be a Promoter two months from now?

Answer:

We answer that by multiplying the transition matrix by itself (in other words, raising it to the power of 2) and reading the entry in the Passive row and Promoter column, which is about 0.51.

## Unnamed Markov chain^2 
##  A  3 - dimensional discrete Markov Chain defined by the following states: 
##  Detractor, Passive, Promoter 
##  The transition matrix  (by rows)  is defined as follows: 
##           Detractor   Passive  Promoter
## Detractor 0.3363713 0.1517898 0.5118389
## Passive   0.3297453 0.1578083 0.5124463
## Promoter  0.3299681 0.1519883 0.5180437

We can see that the transition probabilities have changed from the previous values: they now show the transitions we can expect two months ahead. We can do the same for subsequent months, for example, 3 months ahead:

## Unnamed Markov chain^3 
##  A  3 - dimensional discrete Markov Chain defined by the following states: 
##  Detractor, Passive, Promoter 
##  The transition matrix  (by rows)  is defined as follows: 
##           Detractor   Passive  Promoter
## Detractor 0.3323929 0.1526882 0.5149188
## Passive   0.3316663 0.1526064 0.5157273
## Promoter  0.3319627 0.1529523 0.5150850
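The two- and three-step matrices above can be reproduced in base R by repeated matrix multiplication. This is a sketch under stated assumptions: it retypes the rounded one-step probabilities printed earlier, and `matrix_power()` is a hypothetical helper written for this illustration, not a function from the package that produced the output above.

```r
states <- c("Detractor", "Passive", "Promoter")

# One-step transition matrix, copied from the printed (rounded) values
P <- matrix(c(0.3833333, 0.1333333, 0.4833333,
              0.2692308, 0.1153846, 0.6153846,
              0.3176471, 0.1764706, 0.5058824),
            nrow = 3, byrow = TRUE,
            dimnames = list(states, states))

# n-step transition matrix: multiply the matrix by itself n times
matrix_power <- function(M, n) {
  result <- diag(nrow(M))          # identity matrix = zero steps
  for (i in seq_len(n)) {
    result <- result %*% M
  }
  dimnames(result) <- dimnames(M)
  result
}

P2 <- matrix_power(P, 2)
P3 <- matrix_power(P, 3)

# Probability that a Passive customer is a Promoter two steps later
print(round(P2["Passive", "Promoter"], 4))
```

Note that squaring the single Passive-to-Promoter probability would give a different (and wrong) number, because it ignores the paths that pass through Detractor or Passive on the way.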

If we keep increasing the number of steps, the probabilities tend to plateau at what is referred to as the stationary state. That is the long-term distribution of customers between classes implied by the current transition matrix. It can be calculated using the function steadyStates():

##      Detractor   Passive Promoter
## [1,] 0.3320603 0.1528118 0.515128
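For readers who want to see what sits behind this result, the stationary distribution can also be approximated in base R as the left eigenvector of the transition matrix associated with eigenvalue 1. This is only a sketch of one common way to obtain it, not necessarily how steadyStates() computes it internally, and it again retypes the rounded one-step probabilities printed earlier.

```r
states <- c("Detractor", "Passive", "Promoter")

# One-step transition matrix, copied from the printed (rounded) values
P <- matrix(c(0.3833333, 0.1333333, 0.4833333,
              0.2692308, 0.1153846, 0.6153846,
              0.3176471, 0.1764706, 0.5058824),
            nrow = 3, byrow = TRUE,
            dimnames = list(states, states))

# A stationary distribution s satisfies s %*% P == s, so s is a left
# eigenvector of P, i.e. an ordinary eigenvector of t(P), for eigenvalue 1
eig <- eigen(t(P))
idx <- which.min(abs(eig$values - 1))
v <- Re(eig$vectors[, idx])

# Rescale so the entries sum to 1 and can be read as proportions
stationary <- v / sum(v)
names(stationary) <- states

print(round(stationary, 4))
```

Multiplying `stationary` by `P` returns the same vector (up to rounding), which is exactly what makes it a steady state.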

From the stationary state of our Markov chain, we see that, in the long run, a customer has roughly a 52% probability of being a Promoter at any given survey. This means that if a customer stays with the company over a long period, they are more likely to be found in the Promoter group than in either of the other two, although they can still move between categories from one survey to the next.

We can also flip this perspective: if we acquire 100 new customers, the stationary distribution suggests that, over time, around 52 will be Promoters, while roughly 33 will be Detractors and 15 will be Passives. This gives us a clearer picture of future customer sentiment and helps inform smarter retention and engagement strategies.

##      Detractor  Passive Promoter
## [1,]  33.20603 15.28118  51.5128

With these proportions in hand, we can even estimate our future NPS score by considering the natural distribution and transition of customers between the classes over time. By simply plugging the stationary state proportions into the NPS formula, subtracting the percentage of Detractors from the percentage of Promoters, we arrive at a projected future NPS score of 18.
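As a worked check of this arithmetic, the rounded stationary proportions shown above can be plugged into the NPS formula. The symbol \(\pi\) for the stationary share of each class and the subscript \(\infty\) for the long-run score are notation introduced here for illustration only.

\[NPS_{\infty} = 100 \times (\pi_{Promoter} - \pi_{Detractor}) = 100 \times (0.5151 - 0.3321) \approx 18.3\]

The Passive share (about 0.1528) does not appear in the subtraction, but it is part of the total that the other two shares are measured against.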

This approach provides a data-driven forecast of long-term customer sentiment, assuming no major changes in service or customer experience.

Final remarks

As mentioned earlier, the goal is to take a predictive approach to NPS that not only forecasts future scores but also provides deeper insights into current customer behavior. When we look at the NPS formula, the ideal scenario is clear: convert all Detractors and Passives into Promoters, while retaining existing Promoters.

This article aims to equip you with tools to quantify that objective, by helping you measure how close (or far) your customer base is from that ideal state. More importantly, it offers a starting point to understand why some Promoters are turning into Detractors, which is valuable insight for identifying areas that need improvement, and why some Detractors are becoming Promoters, highlighting what parts of the business are working exceptionally well.