Net Promoter Score (NPS) is a widely used market research metric based on a single survey question: “How likely are you to recommend our company, product, or service to a friend or colleague?” Developed by Fred Reichheld and popularized through his 2003 article in the Harvard Business Review, NPS has become a standard tool for measuring customer loyalty across industries.
When applying predictive analytics to NPS, the score is typically treated as a single numeric value, with the goal of forecasting its future trend. While this approach provides a general sense of customer sentiment over time, it often overlooks the underlying dynamics among the different groups that compose the score. Recognizing these internal shifts can help businesses make more informed decisions and allocate resources more effectively.
In this project, I explore an alternative approach to NPS prediction that goes beyond simply forecasting the overall score. Instead, it focuses on the composition of NPS: the proportions of Detractors, Passives, and Promoters. By analyzing how customers transition between these categories over time, we can uncover deeper insights into customer behavior and loyalty. This perspective offers a more granular view of customer dynamics and enables more targeted, data-driven strategies.
We are going to work with the NPS for financial services dataset, which can be found on Kaggle. This dataset contains real-world NPS data from a retail bank for the year 2021. It includes 5,000 observations across 7 variables, as displayed below.
## ID Market Survey date Customer Name
## Min. :1000 Length:5000 Min. :2021-01-01 Length:5000
## 1st Qu.:2250 Class :character 1st Qu.:2021-04-02 Class :character
## Median :3500 Mode :character Median :2021-07-03 Mode :character
## Mean :3500 Mean :2021-06-30
## 3rd Qu.:4749 3rd Qu.:2021-09-28
## Max. :5999 Max. :2021-12-30
## Month Quarter NPS
## Min. : 1.000 Min. :1.000 Min. : 0.000
## 1st Qu.: 4.000 1st Qu.:2.000 1st Qu.: 5.000
## Median : 7.000 Median :3.000 Median : 8.000
## Mean : 6.496 Mean :2.503 Mean : 6.841
## 3rd Qu.: 9.000 3rd Qu.:3.000 3rd Qu.:10.000
## Max. :12.000 Max. :4.000 Max. :10.000
For our study, we will focus on the NPS and Survey date columns, with Customer Name used later to follow individual customers across surveys.
The Net Promoter Score (NPS) classifies respondents into three categories based on their ratings: Promoters (respondents who rate 9 or 10), Passives (respondents who rate 7 or 8) and Detractors (respondents who rate 6 or lower). To calculate the NPS at any given point in time \(t\), you subtract the percentage of Detractors from the percentage of Promoters, following the formula:
\[NPS_t = \frac{Promoters_t-Detractors_t}{Promoters_t+Passives_t+Detractors_t} \times 100 \]
This simple calculation provides an overall measure of customer loyalty.
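As a minimal sketch of the classification and the formula above, the base R snippet below maps ratings to categories and computes the score. The `scores` vector and the helper names `classify_nps()` and `nps_score()` are hypothetical and used only for illustration; they are not part of the original analysis.

```r
# Hypothetical ratings on the 0-10 scale; not taken from the Kaggle dataset
scores <- c(10, 9, 8, 7, 6, 3, 10, 9, 5, 8)

# Map each rating to its NPS category:
# 0-6 = Detractor, 7-8 = Passive, 9-10 = Promoter
classify_nps <- function(score) {
  cut(score,
      breaks = c(-Inf, 6, 8, 10),
      labels = c("Detractor", "Passive", "Promoter"))
}

# NPS = share of Promoters minus share of Detractors, expressed in points
nps_score <- function(score) {
  category <- classify_nps(score)
  100 * (mean(category == "Promoter") - mean(category == "Detractor"))
}

table(classify_nps(scores))
nps_score(scores)
```

With 4 Promoters, 3 Passives and 3 Detractors out of 10 responses, the score works out to 40% minus 30%, that is, 10 points.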
For our NPS calculations, we will focus on monthly NPS. For each month \(t\), we apply the formula above using only the survey responses collected within that month.
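The monthly calculation could look roughly like the sketch below. The `surveys` data frame, its column names and the helper `nps_of()` are made up for illustration; in the real dataset the month is already available in the Month column.

```r
# Hypothetical survey responses: a date and a 0-10 rating per response
surveys <- data.frame(
  survey_date = as.Date(c("2021-01-05", "2021-01-20", "2021-01-28",
                          "2021-02-03", "2021-02-14", "2021-02-25")),
  rating = c(10, 6, 9, 8, 3, 10)
)

# Month number of each response
surveys$month <- as.integer(format(surveys$survey_date, "%m"))

# NPS of one group of ratings: % Promoters (9-10) minus % Detractors (0-6)
nps_of <- function(rating) 100 * (mean(rating >= 9) - mean(rating <= 6))

# Apply the formula separately to the responses of each month
monthly_nps <- tapply(surveys$rating, surveys$month, nps_of)
monthly_nps
```

Each month is scored independently, so a month with few responses can swing sharply; that is worth keeping in mind when reading a monthly series.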
Looking at the monthly NPS values alone, and knowing that the score can range from -100 to 100, the situation may seem concerning. However, it’s important to put the NPS in context by comparing it with other players in the same segment. The number can be interpreted in various ways, and while it gives a general sense of customer sentiment, the safest approach is to compare it with that of competitors in the same industry.
Since the NPS is composed of three distinct groups, the Promoters, Passives, and Detractors, it can be insightful to understand the relative volume of each group within the overall score. On its own, the NPS doesn’t provide a complete picture of how many customers are genuinely satisfied or dissatisfied with the company’s services. A deeper analysis of these groups can offer a more nuanced understanding of customer loyalty and satisfaction.
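One way to look at the composition behind the score is to tabulate the share of each group per month. The sketch below uses made-up `month` and `category` vectors, so the numbers are illustrative only.

```r
# Hypothetical month and category labels, one entry per survey response
month <- c(1, 1, 1, 1, 2, 2, 2, 2)
category <- factor(
  c("Promoter", "Promoter", "Passive", "Detractor",
    "Promoter", "Detractor", "Detractor", "Passive"),
  levels = c("Detractor", "Passive", "Promoter")
)

# Count responses per month and category, then convert each row to shares
counts <- table(month, category)
shares <- prop.table(counts, margin = 1)
round(100 * shares, 1)
```

In this toy example both months have the same share of Passives (25%), yet the split between Promoters and Detractors flips, which is the kind of detail a single score hides.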
We see significant volatility across NPS groups, which may stem from new customers forming strong first impressions — either positive or negative. It could also reflect natural shifts among returning clients, such as a previously satisfied customer giving a lower score after a poor experience. Regardless of the cause, the data clearly shows that customer classifications are fluid and often shift based on recent interactions.
In the following sections, we will explore an approach to quantify these shifts between classifications and examine how this information can be integrated into strategies for forecasting and customer outreach.
In probability theory and statistics, a Markov chain is a specific type of Markov process characterized by either a discrete state space or a discrete index set, typically representing time. It describes a sequence of possible events in which the probability of each event depends solely on the state attained in the previous event. In simpler terms, this means that “what happens next depends only on the current state.” If you’re interested in diving deeper into this topic, I recommend reading these articles from the Mathematics Library and Chapter 8: Markov Chains from the UoA Department of Statistics.
In a Markov chain, a system moves between states with certain probabilities. These movements are called transitions, and the associated transition probabilities describe the chance of moving from one state to another. The model is defined by three parts: a set of possible states (state space), a transition matrix, and an initial state. It assumes the process continues indefinitely and only the current state matters, not how it was reached.
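To make the three components concrete, here is a small sketch of a two-state chain in base R. The states "A" and "B" and their probabilities are invented purely for illustration and have nothing to do with the NPS data.

```r
# Hypothetical two-state chain used only to illustrate the three components
states <- c("A", "B")                      # state space
P <- matrix(c(0.9, 0.1,                    # from A: stay in A, move to B
              0.4, 0.6),                   # from B: move to A, stay in B
            nrow = 2, byrow = TRUE,
            dimnames = list(from = states, to = states))
initial <- c(A = 1, B = 0)                 # the chain starts in state A

# Every row of a transition matrix must sum to 1
stopifnot(all(abs(rowSums(P) - 1) < 1e-12))

# Distribution over states after one step and after two steps
after_one <- initial %*% P
after_two <- after_one %*% P
after_one
after_two
```

Note that `after_two` is computed from `after_one` alone: the path taken to reach the current distribution plays no role, which is the Markov property in action.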
Applying the Markov chain idea to our case, we assume that a customer’s next NPS response depends only on their previous one. For instance, a customer going from a score of 10 to a 3 is possible, but it seems unlikely. By calculating these probabilities, we not only confirm that such shifts can happen, but we can also measure how likely they are. This helps identify customers at risk of big sentiment swings so we can monitor them more closely.
In this context, the NPS states are the three categories: Detractors, Passives, and Promoters. A customer can stay in the same category or move to another in their next survey. The key is that the transition depends only on their current state — not past ones. This is exactly what transition probabilities capture.
To build this, we focus on customers who took the survey multiple times. Tracking how they move between categories allows us to calculate the percentage of each type of transition, forming our transition matrix.
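# Packages used below: dplyr for the pipeline, tidyr for pivot_wider()
library(dplyr)
library(tidyr)
# Keep only consecutive answers from customers who responded more than once,
# using each customer as an id and ordering their surveys in chronological order.
# Assumes NPS already has a Category column (Detractor, Passive or Promoter).
Matrix <- NPS %>%
  add_count(`Customer Name`) %>%
  filter(n > 1) %>%
  select(`Survey date`, `Customer Name`, Category) %>%
  arrange(`Customer Name`, `Survey date`) %>%
  group_by(`Customer Name`) %>%
  mutate(Lag1 = lag(Category),
         Time_diff = as.numeric(difftime(`Survey date`, lag(`Survey date`), units = "days"))) %>%
  ungroup() %>%
  filter(!is.na(Time_diff))
# Now we create our transition matrix, using the NPS classes as labels.
# Rows are the previous category (Lag1) and columns are the current one (Category).
# Each row needs to add up to 1, since it holds all the transition probabilities out of one state.
Matrix <- Matrix %>%
  count(Lag1, Category) %>%
  pivot_wider(names_from = Category, values_from = n, values_fill = 0)
# Drop the Lag1 label column, then divide each row by its total
Matrix <- as.matrix(Matrix[, 2:4])
Matrix <- Matrix / rowSums(Matrix)
# Rows and columns are assumed to follow the same (alphabetical) order of classes
rownames(Matrix) <- colnames(Matrix)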
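As a cross-check on the pipeline above, the same counting logic can be sketched in base R on a tiny, made-up survey history. The `history` data frame and the customer names in it are hypothetical; the point is only to show how consecutive answers become rows of a transition matrix.

```r
# Hypothetical survey history: one row per response, already in date order per customer
history <- data.frame(
  customer = c("Ana", "Ana", "Ana", "Ben", "Ben", "Cleo", "Cleo", "Cleo"),
  category = c("Promoter", "Passive", "Promoter",
               "Detractor", "Promoter",
               "Passive", "Detractor", "Detractor"),
  stringsAsFactors = FALSE
)
states <- c("Detractor", "Passive", "Promoter")

# Previous category within each customer (NA for a customer's first response)
history$previous <- ave(history$category, history$customer,
                        FUN = function(x) c(NA, head(x, -1)))

# Count transitions: rows = previous category, columns = current category
pairs <- history[!is.na(history$previous), ]
counts <- table(factor(pairs$previous, levels = states),
                factor(pairs$category, levels = states))

# Divide each row by its total so that every row sums to 1
transition <- prop.table(counts, margin = 1)
transition
```

Fixing the factor levels up front keeps rows and columns in the same order even when a particular transition never occurs in the data.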
The chart and matrix below show how customers shift between Detractor, Passive, and Promoter categories over time, offering a clearer picture of how customer sentiment evolves.
## Detractor Passive Promoter
## Detractor 0.3833333 0.1333333 0.4833333
## Passive 0.2692308 0.1153846 0.6153846
## Promoter 0.3176471 0.1764706 0.5058824
In the Markov chain diagram, each class (Detractor, Passive, Promoter) has its own arrows: the arrows pointing to the other classes show the probability of transitioning to a different state, and the circular arrow looping back to the same class represents the likelihood of staying in the same state. In NPS terms, this tells us how likely a customer is to maintain their satisfaction level or shift to a different one in the next survey, that is, how consistent or variable their perception is over time.
Now, let’s look at the numbers:
A Promoter is 50.6% likely to remain a Promoter in the following month and 31.8% likely to become a Detractor.
A Detractor has a 48.3% chance of becoming a Promoter — which is higher than the 38.3% chance of remaining a Detractor.
A Passive is 61.5% likely to become a Promoter, compared to a 26.9% chance of falling to a Detractor.
This shows a general upward trend: customers are more likely to become Promoters than to move in the opposite direction. This paints a much more promising picture than a single static NPS score, as it suggests a positive underlying dynamic in customer sentiment.
Of course, because the system evolves randomly, we can’t predict with certainty where a specific customer will fall in the next survey. But we can estimate the overall distribution of customers across categories in the future — and that’s a powerful insight.
This is where Markov Chains really shine: they help us answer key questions like, “What is the most likely next step for a customer?” or “What will the customer base look like a few cycles ahead?” That allows for more informed strategies and more proactive customer engagement.
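Both questions can be sketched in a few lines of base R. The transition matrix below is copied from the printed output above, while the starting `mix` of the customer base is a hypothetical assumption chosen only for illustration.

```r
states <- c("Detractor", "Passive", "Promoter")

# Transition matrix copied from the printed output above (rows = current state)
P <- matrix(c(0.3833333, 0.1333333, 0.4833333,
              0.2692308, 0.1153846, 0.6153846,
              0.3176471, 0.1764706, 0.5058824),
            nrow = 3, byrow = TRUE, dimnames = list(states, states))

# Most likely next category for each current category
most_likely_next <- states[apply(P, 1, which.max)]
names(most_likely_next) <- states
most_likely_next

# Hypothetical starting mix of the customer base (not taken from the data)
mix <- c(Detractor = 0.40, Passive = 0.20, Promoter = 0.40)

# Push the mix forward three survey cycles, one multiplication per cycle
for (cycle in 1:3) {
  mix <- drop(mix %*% P)
  print(round(mix, 3))
}
```

With this matrix, Promoter is the single most likely next category from every starting point, which is consistent with the upward trend described earlier.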
Question: Assuming that a customer is currently Passive, what is the probability that they will be a Promoter two months from now?
Answer:
We answer that by multiplying the transition matrix by itself (in other words, raising it to the power of 2) and reading the entry in the Passive row and Promoter column.
## Unnamed Markov chain^2
## A 3 - dimensional discrete Markov Chain defined by the following states:
## Detractor, Passive, Promoter
## The transition matrix (by rows) is defined as follows:
## Detractor Passive Promoter
## Detractor 0.3363713 0.1517898 0.5118389
## Passive 0.3297453 0.1578083 0.5124463
## Promoter 0.3299681 0.1519883 0.5180437
We can see that the transition probabilities have changed from the previous values: they now describe the transitions we can expect two months ahead. Reading the Passive row and Promoter column, the answer to our question is about 51.2%. We can repeat this for later months, for example, three months ahead:
## Unnamed Markov chain^3
## A 3 - dimensional discrete Markov Chain defined by the following states:
## Detractor, Passive, Promoter
## The transition matrix (by rows) is defined as follows:
## Detractor Passive Promoter
## Detractor 0.3323929 0.1526882 0.5149188
## Passive 0.3316663 0.1526064 0.5157273
## Promoter 0.3319627 0.1529523 0.5150850
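The two-step and three-step matrices shown above can also be reproduced with plain matrix multiplication. This is a base R sketch that re-enters the one-step matrix from the earlier output by hand, so tiny differences in the last decimals are expected because of rounding.

```r
states <- c("Detractor", "Passive", "Promoter")

# One-step transition matrix copied from the earlier printed output
P <- matrix(c(0.3833333, 0.1333333, 0.4833333,
              0.2692308, 0.1153846, 0.6153846,
              0.3176471, 0.1764706, 0.5058824),
            nrow = 3, byrow = TRUE, dimnames = list(states, states))

# Two-step and three-step transition matrices via repeated multiplication
P2 <- P %*% P
P3 <- P2 %*% P

# Probability that a Passive customer is a Promoter two steps ahead
P2["Passive", "Promoter"]

round(P2, 4)
round(P3, 4)
```

Note that `%*%` is matrix multiplication; `P * P` or `P^2` on a plain R matrix would square each entry instead, which is not the same thing.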
If we keep increasing the number of steps, the transition probabilities tend to plateau at what is referred to as the stationary (or steady) state. This is the long-term distribution of customers across classes, assuming the current transition matrix keeps holding. It can be calculated using the function steadyStates() from the markovchain package.
## Detractor Passive Promoter
## [1,] 0.3320603 0.1528118 0.515128
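For readers who want to see what sits behind this result, the stationary distribution can be approximated in base R as the left eigenvector of the transition matrix for eigenvalue 1. This is an alternative sketch using `eigen()`, not the internals of steadyStates(), and it re-enters the rounded one-step matrix by hand.

```r
states <- c("Detractor", "Passive", "Promoter")

# One-step transition matrix copied from the earlier printed output
P <- matrix(c(0.3833333, 0.1333333, 0.4833333,
              0.2692308, 0.1153846, 0.6153846,
              0.3176471, 0.1764706, 0.5058824),
            nrow = 3, byrow = TRUE, dimnames = list(states, states))

# A stationary distribution s satisfies s %*% P = s,
# so s is an eigenvector of t(P) with eigenvalue 1
eig <- eigen(t(P))
lead <- which.min(abs(eig$values - 1))

# Keep the real part and rescale so the entries sum to 1
stationary <- Re(eig$vectors[, lead])
stationary <- stationary / sum(stationary)
names(stationary) <- states
round(stationary, 4)
```

Rescaling by the sum also takes care of the arbitrary sign that `eigen()` may give the eigenvector.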
From the stationary state of our Markov chain, we see that, in the long run, a customer has roughly a 51.5% probability of being a Promoter at any given survey, regardless of the category they started in. This does not mean customers settle into one group permanently: individuals keep moving between categories, but the overall proportions stabilize.
We can also flip this perspective: if we acquire 100 new customers, the stationary distribution suggests that, over time, around 52 will be Promoters at any given moment, about 33 will be Detractors, and about 15 will be Passives. This gives us a clearer picture of future customer sentiment and helps inform smarter retention and engagement strategies.
## Detractor Passive Promoter
## [1,] 33.20603 15.28118 51.5128
With these proportions in hand, we can even estimate our future NPS score by considering the natural distribution and transition of customers between the classes over time. By simply plugging the stationary state proportions into the NPS formula, subtracting the percentage of Detractors from the percentage of Promoters, we arrive at a projected future NPS score of 18.
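The arithmetic behind that projection is short enough to spell out. The sketch below copies the stationary proportions from the output above; the cohort of 100 customers is the same hypothetical example used in the text.

```r
# Stationary proportions copied from the printed output above
stationary <- c(Detractor = 0.3320603, Passive = 0.1528118, Promoter = 0.515128)

# Expected long-run split of a hypothetical cohort of 100 customers
round(100 * stationary, 1)

# Projected NPS: % Promoters minus % Detractors
projected_nps <- 100 * (stationary[["Promoter"]] - stationary[["Detractor"]])
round(projected_nps, 1)
```

This gives 51.5 minus 33.2, or about 18.3 points, which rounds to the projected score of 18. Passives do not enter the subtraction, but they still count in the denominator of each percentage.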
This approach provides a data-driven forecast of long-term customer sentiment, assuming no major changes in service or customer experience.
As mentioned earlier, the goal is to take a predictive approach to NPS that not only forecasts future scores but also provides deeper insights into current customer behavior. When we look at the NPS formula, the ideal scenario is clear: convert all Detractors and Passives into Promoters, while retaining existing Promoters.
This article aims to equip you with tools to quantify that objective, by helping you measure how close (or far) your customer base is from that ideal state. More importantly, it offers a starting point to understand why some Promoters are turning into Detractors, which is valuable insight for identifying areas that need improvement, and why some Detractors are becoming Promoters, highlighting what parts of the business are working exceptionally well.