1 Background

The aim of this notebook is to show when Likert scales can be used as quantitative values. I am a journal editor, and I am pedantic about not using Likert scores as quantitative variables. A single Likert item is an ordinal categorical variable, and its scores cannot meaningfully be averaged.

I was sent a paper in which a group had treated Likert scores as a quantitative variable and used the data to calculate means and other statistics. I rejected the paper. The authors replied that I had been unfair and that this was standard procedure in their field. My view was that it might well be standard, but statistically it was still wrong.

I looked at the other papers they cited in their argument, thought about how Likert scales work, and did some deeper reading. I found that there are situations in which Likert scores can be treated as quantitative data: when you combine the results from multiple questions. This is what psychologists and social scientists often do to get a composite measure of some quality such as well-being or happiness.

What I am going to do here is use simulations to show why these composites are quantitative and approximately normally distributed even if the original Likert data are not.

2 Simple Example

Imagine that I have a questionnaire that has simply yes and no answers. If I add up the total number of yes answers I can get a score. For one question the possible scores are 0 and 1. This is what statisticians call a Bernoulli trial. Assuming that 0 and 1 are equally likely the probability of 0 is 0.5 and of 1 is 0.5 and the mean score will be 0.5 x 0 + 0.5 x 1 = 0.5.

Now if there are two questions, and again the probability of answering yes or no is 0.5, there are 3 possible scores: 0, 1 and 2. Each particular sequence of two answers has probability 0.5 x 0.5 = 0.25. There is one way of getting 0 (no then no) and one way of getting 2 (yes then yes), but there are two ways of getting 1: yes then no, or no then yes. So the probability of scoring 1 is 2 x 0.25 = 0.5, and the mean score is 0.25 x 0 + 2 x 0.25 x 1 + 0.25 x 2 = 1.

The median score is also 1, and because there are two ways of scoring 1 but only one way each of scoring 0 or 2, 1 is also the mode.

Mean = median = mode is a property of the normal distribution (and of any other symmetric distribution with a single peak). What we are creating as we add more questions is the binomial distribution, which gets closer and closer to the normal distribution as the number of questions increases.
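The arithmetic for the two-question case can be checked by listing every answer pattern. The following is a minimal sketch, assuming independent and equally likely answers as in the text; the names `patterns`, `totals` and `probs` are illustrative only.

```r
# Enumerate every answer pattern for two yes/no questions (0 = no, 1 = yes).
patterns <- expand.grid(q1 = c(0, 1), q2 = c(0, 1))
patterns$total <- patterns$q1 + patterns$q2
patterns$prob  <- 0.5 * 0.5   # assumption: independent, equally likely answers

# Probability of each total: add up the patterns that give that total.
agg    <- tapply(patterns$prob, patterns$total, sum)
totals <- as.numeric(names(agg))
probs  <- as.vector(agg)

mean_total   <- sum(totals * probs)
mode_total   <- totals[which.max(probs)]
# Median: the smallest total whose cumulative probability reaches 0.5.
median_total <- totals[which(cumsum(probs) >= 0.5)[1]]

print(data.frame(total = totals, prob = probs))
print(c(mean = mean_total, median = median_total, mode = mode_total))
```

All three summaries come out at the same value because the distribution is symmetric about its single peak.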
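How quickly the binomial approaches the normal can be seen without simulation by comparing exact probabilities. This is a rough sketch using base R's `dbinom` and `dnorm`; the helper `compare_binom_normal` and the chosen numbers of questions are assumptions made for illustration.

```r
# Compare the exact binomial probabilities for the number of "yes" answers
# with a normal density that has the same mean and standard deviation.
compare_binom_normal <- function(n, p = 0.5) {
  k      <- 0:n
  exact  <- dbinom(k, size = n, prob = p)
  approx <- dnorm(k, mean = n * p, sd = sqrt(n * p * (1 - p)))
  max(abs(exact - approx))   # largest gap between the two curves
}

# Hypothetical questionnaire lengths, chosen only to show the trend.
n_questions <- c(2, 7, 30, 100)
gaps <- sapply(n_questions, compare_binom_normal)
print(data.frame(n_questions = n_questions, largest_gap = gaps))
```

The largest gap shrinks as the number of questions grows, which is the sense in which the binomial "becomes" normal.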

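The following R code simulates this for 100 questionnaires, each with two questions.

# Simple true or false: total score over 2 questions.
x <- numeric(100)                       # one total per questionnaire
for (i in 1:100) {
  # Draw the two answers (0 = no, 1 = yes); named `answers` so that it
  # does not mask the sample() function.
  answers <- sample(c(0, 1), size = 2, replace = TRUE)
  x[i] <- sum(answers)
}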
hist(x, main="Histogram of the Total for 100 random selections of size two from 0 and 1", xlab="Total", ylab="Frequency")

hist(x, main="Histogram of the Total for 100 random selections of size two from 0 and 1", xlab="Total", ylab="Density", freq=FALSE)

I have drawn the histograms twice: once with frequency on the y-axis, and once with density, which is more correct although less often used.

I can extend this example to a questionnaire with 7 yes or no answers and total the results for each of 1000 questionnaires.
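The difference between the two versions of the histogram is only one of scaling. As a small sketch (the seed and the variable names are arbitrary choices for illustration), the frequency bars add up to the number of questionnaires, while the density bars have a total area of 1:

```r
set.seed(1)  # arbitrary seed, only so that the sketch is reproducible

# 100 questionnaires, two yes/no questions each, as in the example above.
totals <- replicate(100, sum(sample(c(0, 1), size = 2, replace = TRUE)))

h <- hist(totals, plot = FALSE)               # compute the bins without drawing
freq_total <- sum(h$counts)                   # frequencies add up to the sample size
dens_area  <- sum(h$density * diff(h$breaks)) # bar height x bar width, summed

print(c(frequency_total = freq_total, density_area = dens_area))
```

Because the density version always has unit area, it can be compared directly with a probability density such as the normal curve.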

# Simple true or false: total score over 7 questions.
x <- numeric(1000)                      # one total per questionnaire
for (i in 1:1000) {
  answers <- sample(c(0, 1), size = 7, replace = TRUE)  # 0 = no, 1 = yes
  x[i] <- sum(answers)
}
hist(x, main="Total for 1000 random selections of size seven from 0 and 1", xlab="Total", ylab="Frequency")

hist(x, main="Total for 1000 random selections of size seven from 0 and 1", xlab="Total", ylab="Density", freq=FALSE)

This example used values for Yes and No of 0 and 1 but what happens if I use other values for Yes/No or True/False?

# Simple true or false: total over 7 questions, scoring F as 3 and T as 5.
x <- numeric(1000)                      # one total per questionnaire
for (i in 1:1000) {
  answers <- sample(c(3, 5), size = 7, replace = TRUE)  # 3 = false, 5 = true
  x[i] <- sum(answers)
}
hist(x, main="Total for 1000 random selections of size seven from 3 and 5", xlab="Total", ylab="Frequency")

hist(x, main="Total for 1000 random selections of size seven from 3 and 5", xlab="Total", ylab="Density", freq=FALSE)

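The shape of the distribution remains the same, but the curve shifts to the right on the number line and becomes more widely spread. This is because a questionnaire with k "true" answers now scores 3 x (7 - k) + 5 x k = 21 + 2k, so every total is shifted by 21 and the gaps between totals are doubled.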
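The shift and the change in spread can be worked out exactly from the binomial distribution rather than read off the histogram. A minimal sketch, assuming equally likely answers; `dist_mean` and `dist_sd` are small helpers written for this illustration, not library functions.

```r
n <- 7
k <- 0:n                                  # number of "true" answers
probs <- dbinom(k, size = n, prob = 0.5)  # assumption: answers equally likely

totals_01 <- k                       # scoring false = 0, true = 1
totals_35 <- 3 * (n - k) + 5 * k     # scoring false = 3, true = 5 (= 21 + 2k)

# Mean and standard deviation of a discrete distribution.
dist_mean <- function(values, p) sum(values * p)
dist_sd   <- function(values, p) sqrt(sum((values - dist_mean(values, p))^2 * p))

mean_01 <- dist_mean(totals_01, probs)
mean_35 <- dist_mean(totals_35, probs)
sd_01   <- dist_sd(totals_01, probs)
sd_35   <- dist_sd(totals_35, probs)

print(c(mean_01 = mean_01, mean_35 = mean_35, sd_ratio = sd_35 / sd_01))
```

The probabilities attached to each total are unchanged; only the labels on the x-axis move, which is why the shape stays the same.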

3 True Likert Scales

Typically Likert scales have 5 or 7 levels. With more levels than this, chi-squared analysis becomes difficult unless the sample is very large. The number of categories is normally odd so that there is a midpoint at which no preference is given to either extreme of the scale.

Again I am going to use a simulation in R to show the distribution of the total scores from a combination of questions. The first is for a 5-point Likert scale summed over 7 questions.

# 5-point Likert scale (scores 1 to 5): total over 7 questions.
x <- numeric(1000)                      # one total per questionnaire
for (i in 1:1000) {
  answers <- sample(1:5, size = 7, replace = TRUE)  # each level equally likely
  x[i] <- sum(answers)
}
hist(x, main="Sum for 1000 Random Selections of 7 questions From a 5 Value Likert", xlab="Total", ylab="Frequency")

hist(x, main="Sum for 1000 Random Selections of 7 questions From a 5 Value Likert", xlab="Total", ylab="Density", freq=FALSE)

You get a very clear bell-shaped curve. Strictly, this is a discrete distribution (the sum of seven discrete uniform scores) and not a normal curve, but it is close enough that it is reasonable to treat the sum as a quantitative score from which you can calculate means and standard deviations.

This is an example of the central limit theorem, which says that when you add up (or average) a large enough number of independent random values with finite variance, the distribution of that sum approaches a normal distribution, whatever the distribution of the individual values. The power of this theorem should not be underestimated. Much of classical statistics is based upon it.

The next example uses a 7-point Likert scale and 7 questions.

# 7-point Likert scale (scores 1 to 7): total over 7 questions.
x <- numeric(1000)                      # one total per questionnaire
for (i in 1:1000) {
  answers <- sample(1:7, size = 7, replace = TRUE)  # each level equally likely
  x[i] <- sum(answers)
}
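The closeness to the normal curve can also be measured exactly instead of judged by eye from a simulated histogram. The sketch below builds the exact distribution of the total by adding one question at a time; `sum_distribution` is a hypothetical helper written for this illustration, and it assumes independent answers with every level equally likely, as in the simulation.

```r
# Exact distribution of the total of `n_items` independent answers,
# each equally likely to be any value in `scores`.
sum_distribution <- function(scores, n_items) {
  probs <- rep(1 / length(scores), length(scores))
  totals <- 0        # before any question is answered the total is 0 ...
  total_probs <- 1   # ... with probability 1
  for (i in seq_len(n_items)) {
    # Combine every running total with every possible next answer.
    new_totals <- as.vector(outer(totals, scores, "+"))
    new_probs  <- as.vector(outer(total_probs, probs, "*"))
    agg <- tapply(new_probs, new_totals, sum)  # merge equal totals
    totals <- as.numeric(names(agg))
    total_probs <- as.vector(agg)
  }
  data.frame(total = totals, prob = total_probs)
}

dist  <- sum_distribution(1:5, 7)   # 5-point scale, 7 questions
mu    <- sum(dist$total * dist$prob)
sigma <- sqrt(sum((dist$total - mu)^2 * dist$prob))

# Normal density with the same mean and standard deviation, at each total.
normal  <- dnorm(dist$total, mean = mu, sd = sigma)
max_gap <- max(abs(dist$prob - normal))

print(c(mean = mu, sd = sigma, largest_gap = max_gap))
```

Because the possible totals are one unit apart, the exact probabilities and the normal density are on the same scale and can be compared directly.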
hist(x, main="Sum for 1000 Random Selections of 7 questions From a 7 Value Likert", xlab="Total", ylab="Frequency")

hist(x, main="Sum for 1000 Random Selections of 7 questions From a 7 Value Likert", xlab="Total", ylab="Density", freq=FALSE)

Again you see the same pattern. For a 7-point Likert scale we can even reduce the number of questions to 3 and still produce a reasonable approximation to a normal distribution.

# 7-point Likert scale (scores 1 to 7): total over 3 questions.
x <- numeric(1000)                      # one total per questionnaire
for (i in 1:1000) {
  answers <- sample(1:7, size = 3, replace = TRUE)  # each level equally likely
  x[i] <- sum(answers)
}
hist(x, main="Sum for 1000 Random Selections of 3 questions From a 7 Value Likert", xlab="Total", ylab="Frequency")

hist(x, main="Sum for 1000 Random Selections of 3 questions From a 7 Value Likert", xlab="Total", ylab="Density", freq=FALSE)
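One way to put a number on "a good approximation" for the three-question case is to compare it with the familiar normal benchmarks of roughly 68% of values within one standard deviation of the mean and roughly 95% within two. The following sketch lists all 343 equally likely answer patterns; it assumes independent, equally likely answers, and the variable names are illustrative.

```r
scores <- 1:7

# Every possible combination of answers to three 7-point questions.
grid   <- expand.grid(q1 = scores, q2 = scores, q3 = scores)
totals <- rowSums(grid)

# All patterns are equally likely, so plain averages give exact values.
mu    <- mean(totals)
sigma <- sqrt(mean((totals - mu)^2))   # population standard deviation

within_1sd <- mean(abs(totals - mu) <= sigma)
within_2sd <- mean(abs(totals - mu) <= 2 * sigma)

print(c(mean = mu, sd = sigma, within_1sd = within_1sd, within_2sd = within_2sd))
```

Both proportions come out a little below the normal benchmarks, which is what you would expect from a discrete total with only 19 possible values.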

Again we need to check what happens if we assign different scores to the Likert levels instead of the numbers 1 to 7. This time I have made it harder by using irregular spacing between the scores.

# 7-level Likert scale with irregularly spaced scores: total over 3 questions.
x <- numeric(1000)                      # one total per questionnaire
for (i in 1:1000) {
  answers <- sample(c(1, 3, 4, 6, 9, 10, 13), size = 3, replace = TRUE)
  x[i] <- sum(answers)
}
hist(x, main="Sum for 1000 Random Selections of 3 questions From a 7 Value Likert Scale with Unequal Separations", xlab="Total", ylab="Frequency")

hist(x, main="Sum for 1000 Random Selections of 3 questions From a 7 Value Likert Scale with Unequal Separations", xlab="Total", ylab="Density", freq=FALSE)

In every one of these simulations we still end up with something that approximates a normal distribution.
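Treating the composite as quantitative means its mean and standard deviation should behave predictably. As a rough check for the irregularly spaced scores, the sketch below compares simulated values with the values implied by adding three independent answers. The seed and the 10,000 repetitions are arbitrary choices for this illustration and are not the settings used in the code above.

```r
set.seed(42)  # arbitrary seed, only so that the sketch is reproducible

scores      <- c(1, 3, 4, 6, 9, 10, 13)
n_questions <- 3

# Simulated totals (more repetitions than above, to reduce sampling noise).
totals <- replicate(10000, sum(sample(scores, size = n_questions, replace = TRUE)))

# For independent answers, means add and variances add.
expected_mean <- n_questions * mean(scores)
single_var    <- mean((scores - mean(scores))^2)   # variance of one answer
expected_sd   <- sqrt(n_questions * single_var)

print(c(simulated_mean = mean(totals), expected_mean = expected_mean,
        simulated_sd = sd(totals), expected_sd = expected_sd))
```

The simulated and expected values should agree closely, even though the individual scores are unevenly spaced.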

4 Conclusion

As long as you sum or average the scores from a small number of Likert questions to produce a composite score, you can treat the composite as a quantitative variable. But you cannot treat a single Likert score as a quantitative variable, as it is not metric and it is unlikely to be normally distributed. (The simulations here assume that all Likert scores are equally likely, which is a uniform distribution, and that the answers to different questions are independent.)

Likert Scores as Quantitative Variables
