library(tidyverse)
library(tidylog)

Recent work in the field of emotion regulation has directed increasing attention to concepts related to emotion regulation (ER) flexibility. ER flexibility refers to the practice of dynamically adapting one’s ER strategy choices according to the context of the emotion. Under the umbrella of ER flexibility-related constructs, several researchers have discussed the idea of emotion regulation diversity. Similar to ER flexibility, ER diversity entails using a variety of different ER strategies, but in contrast to ER flexibility, ER diversity is agnostic to the contextual fit of the strategies. Thus, ER diversity may be a marker of ER flexibility even if not completely equivalent.

A recent paper from Wen et al. (2021) has proposed a new ER diversity metric based on a formula from the field of ecology, known as the Shannon Diversity Index, that is used to assess biodiversity. Both measures are instantiations of the formula for entropy, a concept from the field of information theory. However, the exact implementation of the ER diversity metric differs in subtle but important ways from the measures used in ecology and information theory.

Both the ER diversity measure and the Shannon Diversity Index are said to capture the evenness and richness of the distribution (of ER strategy use and biological species respectively). Evenness refers to how similar in magnitude the different elements of the distribution are. If someone uses three different ER strategies to a similar degree, they would have a distribution high in evenness. If an environment has three species of animals and similar quantities of each species, the distribution would be said to be high in evenness.

Richness, however, is used differently in the two contexts. In the context of ER diversity as it is currently defined, richness refers to how highly an individual rates the usage of different strategies. If someone rates 5/7 on three different strategies, they are said to have more richness in strategy use than if someone rates 2/7 on three different strategies. In the context of biodiversity, richness refers to the number of different species present in the environment. An environment with only elephants and butterflies would be less rich than an environment with elephants, butterflies, and manatees. A direct mapping from the use of richness in the ecology literature to the ER literature would say richness quantifies the number of different strategies that an individual uses. A person using reappraisal, distraction, and suppression would be higher in richness than a person using reappraisal and distraction.

In the following exposition, I propose a minor alteration to the computation of ER diversity which brings it in closer alignment with the measures used in ecology and information theory.

Existing Metric - “Max Likert”

Wen et al. (2021) define ER diversity as follows:

\(H = -\sum_{i=1}^s p_i \cdot \ln(p_i)\)

In this method, we define \(p_i\) as a fraction where the numerator is the rating that someone reported for a given strategy \(i\) and the denominator is the sum of the maximum values on the Likert scale for each strategy \(i\) over a total of \(s\) strategies. Therefore, if someone rated rumination as \(2\) out of \(4\) and reappraisal as \(3\) out of \(4\), the denominator of the two \(p_i\)’s would be \(8\), i.e., \(4+4\). The value of \(p_{i}\) for rumination would be \(\frac{2}{8}\) and the value of \(p_{i}\) for reappraisal would be \(\frac{3}{8}\). To get the diversity metric \(H\), you would take each \(p_i\), (i.e., \(p_{rumination}\) and \(p_{reappraisal}\)), multiply it by its log, and then add up all of those products and reverse the sign (as indicated by the negative before the summation symbol).
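
As a quick worked instance of the computation just described, using the two ratings from the example (rumination \(2\) out of \(4\), reappraisal \(3\) out of \(4\)) and the natural log from the formula:

\[H_{maxLikert} = -\left(\tfrac{2}{8}\ln\tfrac{2}{8} + \tfrac{3}{8}\ln\tfrac{3}{8}\right) \approx 0.3466 + 0.3678 \approx 0.714\]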

As we continue, we’ll call this method of calculating \(H\) the “max Likert” method.

Modified Metric - “Sum Ratings”

Another way of calculating \(H\), which we’ll call the “sum ratings” method, could involve a different denominator of \(p_i\). In the max Likert method, we computed the denominator as being the sum of the maximum values of the Likert scale for each strategy. I propose that the denominator should be the sum of the ratings that the participant reported for each strategy. Thus, in the example in the previous section, \(p_i\) for rumination would be \(\frac{2}{5}\) and \(p_i\) for reappraisal would be \(\frac{3}{5}\) because the two ratings the participant provided, i.e., \(2\) and \(3\), sum to \(5\).
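
For comparison, the same two ratings (\(2\) and \(3\)) can be worked through with the sum ratings denominator. This is only an arithmetic illustration of the definition above:

\[H_{sumRatings} = -\left(\tfrac{2}{5}\ln\tfrac{2}{5} + \tfrac{3}{5}\ln\tfrac{3}{5}\right) \approx 0.3665 + 0.3065 \approx 0.673\]

With two strategies, the largest value this version can take is \(\ln(2) \approx 0.693\), which is reached when the two ratings are equal.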

What does this mean for the properties of H?

This difference may seem minor and inconsequential, but it gives the two formulations at least two importantly different mathematical properties.

The first difference is that the \(p_i\) values that you sum across in the sum ratings metric always sum to a total of \(1\), whereas in the max Likert metric they sum to \(1\) only if every strategy receives the maximum rating (in the running example they sum to \(\frac{5}{8}\)). Summing to \(1\) is also a property of both the original Shannon entropy formula (because the \(p_i\) values represent values of a categorical probability distribution and therefore must sum to \(1\)) and the biodiversity metric (where the \(p_i\) values represent proportions of a whole population and thus also sum to \(1\)).

The second difference is that with the sum ratings method, an increase in the reported value of one strategy results in a decrease in the \(p_i\) for the other strategies. To see this, compare a case where someone reported \(2\) on rumination versus a case where someone reported \(4\) on rumination. If someone reported \(2\) on rumination, then \(p_i\) for rumination equals \(\frac{2}{5}\) and \(p_i\) for reappraisal equals \(\frac{3}{5}\). If someone reported the same reappraisal value, (i.e., \(3\)), but reported a \(4\) on rumination instead, then \(p_i\) for rumination equals \(\frac{4}{7}\) and \(p_i\) for reappraisal equals \(\frac{3}{7}\). As you can see, \(p_i\) for reappraisal is lower when the rating on rumination increases even though the rating for reappraisal has stayed the same. This feature is not present in the max Likert method.
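
The two properties can be checked numerically with a small base-R sketch. The helper names `p_max_likert` and `p_sum_ratings` are illustrative and not part of the original analysis, and the 4-point scale is taken from the running example:

```r
# Illustrative helpers (hypothetical names) that return the p_i values only.
p_max_likert  <- function(ratings, max_likert = 4) ratings / (length(ratings) * max_likert)
p_sum_ratings <- function(ratings) ratings / sum(ratings)

# Same reappraisal rating, lower vs. higher rumination rating.
low  <- c(rumination = 2, reappraisal = 3)
high <- c(rumination = 4, reappraisal = 3)

# Property 1: only the sum ratings p_i are guaranteed to sum to 1.
sum(p_max_likert(low))    # 5/8
sum(p_sum_ratings(low))   # 1

# Property 2: raising rumination lowers reappraisal's p_i only under sum ratings.
c(p_max_likert(low)[["reappraisal"]],  p_max_likert(high)[["reappraisal"]])   # 3/8, 3/8
c(p_sum_ratings(low)[["reappraisal"]], p_sum_ratings(high)[["reappraisal"]])  # 3/5, 3/7
```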

What does this mean for interpretation of H?

There are two main implications for the interpretation of \(H\) that arise out of the differences between formulations.

First, the max Likert formulation combines information about both the amount of strategy use and the evenness of strategy use into the \(H\) metric, whereas the sum ratings metric allows you to separate the effect of evenness of strategy use from the effect of amount of strategy use. Separating these two constructs is advantageous because we already know that the degree to which someone uses an ER strategy is deeply consequential. In order to understand whether the evenness with which someone uses different strategies is important, we need to separate evenness from amount of strategy use. The sum ratings method achieves this by having a self-balancing component to the resulting value of \(H\). When one strategy rating increases, it pushes down the \(p_i\) values for the other strategies. This self-balancing component means that the participant does not obtain higher values of \(H\) just because they reported a greater magnitude of strategy use.

Second, under certain conditions the max Likert method does not behave as intended. The max Likert formulation is meant to increase as a function of increasing strategy use and evenness. However, with an \(s\) of \(2\) (i.e., when there are \(2\) strategies), the value of \(H\) does not increase monotonically with the ratings. In other words, the relationship between amount of strategy use and the value of \(H\) is positive for some values of strategy use (as expected) and negative for other values of strategy use (the opposite of what we expect). Not only does this make the max Likert metric an erroneous measure of amount of strategy use, it also violates one of the three axioms from which the formula of Shannon entropy was originally derived.
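
To see where the turning point comes from, here is a short derivation under simplifying assumptions that are mine rather than from Wen et al.: \(s = 2\), both strategies receive the same rating \(r\), \(r\) is treated as continuous, and \(M\) denotes the maximum of the Likert scale. Each \(p_i\) then equals \(\frac{r}{2M}\), so

\[
\begin{aligned}
H(r) &= -2 \cdot \frac{r}{2M}\ln\!\left(\frac{r}{2M}\right) = -\frac{r}{M}\ln\!\left(\frac{r}{2M}\right) \\
\frac{dH}{dr} &= -\frac{1}{M}\left[\ln\!\left(\frac{r}{2M}\right) + 1\right] \\
\frac{dH}{dr} = 0 \;&\Longleftrightarrow\; \frac{r}{2M} = \frac{1}{e} \;\Longleftrightarrow\; r = \frac{2M}{e}
\end{aligned}
\]

For a 7-point scale (\(M = 7\)) this gives \(r = \frac{14}{e} \approx 5.15\): \(H\) rises with the ratings below that point and falls above it.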

Now let’s turn to an illustration of the monotonicity of the two metrics under different conditions. First, I will create two R functions, one for each metric. In all simulations, we will use a 7-point Likert scale.

MAX_LIKERT = 7

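H_maxLikert <- function(...){
  ratings <- c(...)
  # p_i = rating / (number of strategies * scale maximum)
  # Note: a rating of 0 would give 0 * log(0) = NaN; ratings here start at 1
  df_ratings <- tibble(
    rating = ratings,
    denominator = length(ratings) * MAX_LIKERT,  # Key difference
    pr = rating / denominator,
    ln_pr = log(pr),
    multiplied = pr * ln_pr
  )
  # H = -sum(p_i * ln(p_i))
  H <- -sum(df_ratings$multiplied)
  return(H)
}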

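H_sumRatings <- function(...){
  ratings <- c(...)
  # p_i = rating / (sum of this participant's ratings), so the p_i sum to 1
  # Note: a rating of 0 would give 0 * log(0) = NaN; ratings here start at 1
  df_ratings <- tibble(
    rating = ratings,
    denominator = sum(ratings),  # Key difference
    pr = rating / denominator,
    ln_pr = log(pr),
    multiplied = pr * ln_pr
  )
  # H = -sum(p_i * ln(p_i))
  H <- -sum(df_ratings$multiplied)
  return(H)
}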

Below I plot the value of \(H\) using the two methods in a case where we have 2 strategies (i.e., \(s=2\)). For simplicity of plotting, I show only the cases where the 2 strategies are rated with equal values. Therefore, the x-axis represents the rating for both strategies simultaneously. The two points described above regarding the interpretation of \(H\) are visible. First, you can see that the modified \(H\) (sum ratings) is unaffected by the amount of strategy use. Second, you can see that the existing \(H\) metric (max Likert) produces higher \(H\) values only up to a certain point and then begins to produce lower \(H\) values as strategy ratings increase.

df_h <- expand.grid(s1 = seq(1, 7), 
                    s2 = seq(1, 7)) %>% 
  mutate(H_sumRatings = pmap_dbl(list(s1, s2), H_sumRatings)) %>% 
  mutate(H_maxLikert = pmap_dbl(list(s1, s2), H_maxLikert)) 

df_h %>% 
  pivot_longer(c(H_sumRatings, H_maxLikert), 
               names_to = "method",
               values_to = "H") %>% 
  filter(s1 == s2) %>% 
  ggplot(aes(x = s1, y = H)) + 
  geom_point() +
  geom_line() + 
  facet_wrap(~method) +
  labs(x = "Strategy value (equal across 2 strategies)")
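
The same pattern can be read off numerically. The sketch below re-implements both metrics in base R so that it runs on its own; the lowercase helper names are illustrative and are not the functions defined earlier. It assumes a 7-point scale and two equally rated strategies, as in the plot.

```r
# Illustrative base-R helpers (hypothetical names), 7-point scale by default.
h_from_p      <- function(p) -sum(p * log(p))
h_max_likert  <- function(ratings, max_likert = 7) h_from_p(ratings / (length(ratings) * max_likert))
h_sum_ratings <- function(ratings) h_from_p(ratings / sum(ratings))

# Two strategies rated equally, for every scale point from 1 to 7.
equal_ratings <- 1:7
check <- data.frame(
  rating       = equal_ratings,
  H_maxLikert  = sapply(equal_ratings, function(r) h_max_likert(c(r, r))),
  H_sumRatings = sapply(equal_ratings, function(r) h_sum_ratings(c(r, r)))
)
print(round(check, 3))

# Scale point at which the max Likert H is largest.
peak_rating <- check$rating[which.max(check$H_maxLikert)]
peak_rating
```

In this setup the sum ratings column stays at \(\ln(2)\) for every rating, while the max Likert column peaks at a rating of 5 and is lower at 6 and 7.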

Yet another way of calculating ER diversity

In the previous sections, I have described a method of computing ER diversity where I (1) aimed to modify the existing measure of ER diversity as little as possible, and (2) aimed to propose something that integrates well with the kinds of study designs which are commonly employed in our field. However, it is still not the most direct mapping from the Shannon biodiversity measure or Shannon’s entropy and it is possible that more faithful mappings of the construct could be more informative to the things we care about.

Both Shannon’s entropy and Shannon’s biodiversity index are formulas designed to assess the properties of a categorical distribution. Entropy is applied to many kinds of categorical distributions across many fields, from physics to computer science to statistics. At first glance, Shannon’s biodiversity index may not seem to be applied to a categorical distribution, but it is. By considering the counts of animals in an environment as proportions of the total number of animals, you can think about the \(p_i\) values as being the probability of getting animal \(i\) if you drew a random animal out of the environment. In this way, we have converted counts of different animals into a categorical distribution.

In the case of ER diversity, it is not clear what “categorical distribution” we are characterizing. Even if we calculate ER diversity using the sum ratings method, the categorical distribution framing of the problem would be akin to saying “if we draw a random rating scale point of emotion regulation strategy from a distribution of emotion regulation strategy rating points, what is the probability we get a rating scale point from strategy \(i\)?” It is not clear what it means to draw a random scale point from a distribution of scale points.

Another way of thinking about ER diversity, one that treats the formula as a characterization of a categorical distribution, is to measure strategies as binary values that represent whether someone did or did not use the strategy in a given instance. You could collect a series of observations of emotional episodes with binary ratings of ER strategy use, perhaps in the context of an experience sampling study, and then compute \(p_i\) to indicate the probability of using a strategy or not using a strategy. Note that you would want to make sure that the \(p_i\) values sum to \(1\). There are different ways you could construct the study and the computation of \(H\) to achieve this.
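
One possible construction, sketched below in base R, is purely illustrative. It assumes a hypothetical experience sampling design in which each emotional episode records exactly one strategy, so that the proportions across strategies sum to \(1\) by construction. The data are made up for the example and are not results.

```r
# Made-up data: one entry per emotional episode, naming the single strategy used.
# (Single-choice reporting is an assumption of this sketch, not a requirement.)
episodes <- c("reappraisal", "distraction", "reappraisal", "suppression",
              "reappraisal", "distraction", "reappraisal", "reappraisal")

# Count episodes per strategy and convert the counts to proportions.
counts <- table(episodes)
p <- as.numeric(counts) / sum(counts)

# p is now a categorical distribution over strategies, so the entropy formula
# applies directly, as it does to species counts in ecology.
H_episodes <- -sum(p * log(p))
H_episodes
```

If participants could report several strategies per episode, the counts would need a different normalization (for example, dividing by the total number of strategy endorsements) to keep the \(p_i\) values summing to \(1\).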

ER diversity metric

Ashish

2024-03-21
