Example and analysis borrowed from this paper. These data show aggregated responses provided by 13 women who were asked to rank their preference for spending leisure time with (1: male; 2: female; 3: both sexes).
Each column is an option that was ranked, each row is a possible rank order, and freq_ranking is the number of times that rank order appeared in the data.
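library(dplyr)  # %>%, rename(), tibble()
library(knitr)  # kable()
library(pmr)    # leisure.black, destat()
data(leisure.black)
# rename the columns to describe each option
d <- leisure.black %>%
  rename(male = X1,
         female = X2,
         both_sexes = X3,
         freq_ranking = n)
d %>% kable()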
male | female | both_sexes | freq_ranking |
---|---|---|---|
1 | 2 | 3 | 1 |
1 | 3 | 2 | 1 |
2 | 1 | 3 | 0 |
2 | 3 | 1 | 5 |
3 | 1 | 2 | 0 |
3 | 2 | 1 | 6 |
For example, the last row of the table tells us that the ranking male: 3, female: 2, both: 1 occurred six times in the dataset.
The goal is to provide a set of numbers that clearly communicates the central tendency of people’s preferences. There seem to be three common statistics presented for rank data, each answering a different question: the mean rank of each option, the pairwise frequencies, and the marginal frequencies.
The first thing to do is compute the popularity of an option using the mean rank attributed to it. We can think of this as a weighted mean: the mean of the ranks the option receives across the possible rank orders, weighted by how often each rank order occurred. A lower mean rank means a more preferred option.
# popularity of the male option
stats::weighted.mean(x = d$male, w = d$freq_ranking)
## [1] 2.307692
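The same calculation can be written out by hand for all three options. This is a minimal base R sketch, assuming the aggregated table above is typed in directly rather than loaded from a package; the names `options` and `mean_ranks` are only illustrative.

```r
# Aggregated rankings, copied by hand from the table above (an assumption:
# the real data would normally come from the original data set)
d <- data.frame(
  male         = c(1, 1, 2, 2, 3, 3),
  female       = c(2, 3, 1, 3, 1, 2),
  both_sexes   = c(3, 2, 3, 1, 2, 1),
  freq_ranking = c(1, 1, 0, 5, 0, 6)
)

options <- c("male", "female", "both_sexes")

# Weighted mean rank per option: sum(rank * frequency) / sum(frequency)
mean_ranks <- sapply(d[options], function(ranks) {
  sum(ranks * d$freq_ranking) / sum(d$freq_ranking)
})

round(mean_ranks, 2)
```

For the male option this works out to 30 / 13, which is the 2.307692 shown above.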
There’s an R function, destat(), that will compute all of the mean ranks from an aggregated data set.
tibble(
  option = c('male', 'female', 'both'),
  mean_rank = destat(d)$mean.rank
) %>%
  kable(digits = 2)
option | mean_rank |
---|---|
male | 2.31 |
female | 2.46 |
both | 1.23 |
The mean ranks tell us that the “both” option was clearly the most preferred (lowest mean rank), and that there is no strong preference between the other two options.
The pairwise frequencies tell us how many times a given option was ranked higher than each of the other options. Each cell counts the times the row option was ranked higher than the column option.
pairwise_matrix <- destat(d)$pair
# clean up table and print
colnames(pairwise_matrix) <- c("male", "female", "both")
rownames(pairwise_matrix) <- c("male", "female", "both")
pairwise_matrix %>% kable()
option | male | female | both |
---|---|---|---|
male | 0 | 7 | 2 |
female | 6 | 0 | 1 |
both | 11 | 12 | 0 |
Using this information, we can say things like:
the option both was ranked higher than male 11 times and higher than female 12 times.
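To make the counting explicit, the pairwise table can be rebuilt from the aggregated rankings without any package. This is a rough base R sketch, assuming the data are typed in by hand; `pairwise` is an illustrative name and this is not necessarily how destat() computes it internally.

```r
# Aggregated rankings, copied by hand from the first table
d <- data.frame(
  male         = c(1, 1, 2, 2, 3, 3),
  female       = c(2, 3, 1, 3, 1, 2),
  both_sexes   = c(3, 2, 3, 1, 2, 1),
  freq_ranking = c(1, 1, 0, 5, 0, 6)
)

options <- c("male", "female", "both_sexes")
ranks <- as.matrix(d[options])

# pairwise[i, j] = number of respondents who ranked option i higher
# (a smaller rank number) than option j
pairwise <- sapply(options, function(j) {
  sapply(options, function(i) {
    sum(d$freq_ranking[ranks[, i] < ranks[, j]])
  })
})

pairwise
```

Reading along the `both_sexes` row recovers the 11 and 12 quoted above.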
The marginal frequencies tell us the number of times people placed an option at a particular rank. Each row is an option and each column is a rank position.
marginal_matrix <- destat(d)$mar
# clean up table and print
colnames(marginal_matrix) <- c("1", "2", "3")
rownames(marginal_matrix) <- c("male", "female", "both")
marginal_matrix %>% kable()
option | 1 | 2 | 3 |
---|---|---|---|
male | 2 | 5 | 6 |
female | 0 | 7 | 6 |
both | 11 | 1 | 1 |
Using this information, we can say things like:
the option female was ranked first 0 times, second 7 times, and third 6 times.
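The marginal counts can also be tallied directly from the aggregated rankings. This is a small base R sketch under the same assumption as before (data typed in by hand, illustrative names), not a description of what destat() does internally.

```r
# Aggregated rankings, copied by hand from the first table
d <- data.frame(
  male         = c(1, 1, 2, 2, 3, 3),
  female       = c(2, 3, 1, 3, 1, 2),
  both_sexes   = c(3, 2, 3, 1, 2, 1),
  freq_ranking = c(1, 1, 0, 5, 0, 6)
)

options <- c("male", "female", "both_sexes")

# marginal[option, r] = number of respondents who placed `option` at rank r
marginal <- t(sapply(options, function(opt) {
  sapply(1:3, function(r) sum(d$freq_ranking[d[[opt]] == r]))
}))
colnames(marginal) <- 1:3

marginal
```

Every row sums to 13, the number of respondents, which is a quick sanity check on the table.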