Likert scales were introduced by the social psychologist Rensis Likert. They are a way of quantifying feelings. Often this is like and dislike, but they are also widely used in a medical context to assess pain.
They can be on a continuous line, where the user picks a point anywhere along the line and the score runs from 0 to 5, 0 to 10 or even 0 to 100. Alternatively, they can have a fixed set of values to choose between.
Although they are numerical, there is a debate about whether they are quantitative data. Social scientists are more likely to treat them as quantitative, and statisticians and mathematicians as qualitative.
I prefer to treat them as qualitative because they are not metric.
Metric Measures
This does not mean following the SI system. What I mean is that the distances between measures are consistent. When you use a Likert scale, are the distances along the scale equal? Is there the same difference between liked a lot and liked as there is between no strong opinion and didn’t like? The answer is that we do not know, but most likely not. Even for pain, the way that nerve signalling works and the way the brain interprets those signals mean that the scale cannot be considered consistent either.
When I take a tape measure the distance between 1cm and 2cm on the scale is the same as between 50cm and 51cm on the scale. With Likert scales that is not true.
Combining Measures
However, I can combine multiple Likert scores to create a composite measurement that is approximately metric. I can also assess this composite to make sure that it has the properties that I want and the consistency that I expect, using Kappa scores. This is what often happens with psychometric testing. It works because of the central limit theorem, which I mentioned briefly in the section about sample distributions with the second set of data on volume measurements. The easiest way to demonstrate this is with a simulation. The alternative is a practical apparatus called a Galton Board.
A Simulation/Experiment
Imagine the simplest possible Likert scale, which is a choice between yes and no. This is binary, the same as a flip of a coin. It can be either 1 (yes) or 0 (no).
This is an example of what statisticians call a Bernoulli trial. Now imagine I have a questionnaire made up of these yes and no questions. The highest score, if I say yes to everything, is n, the number of questions. The lowest score is 0, if I say no to everything. Then there is a collection of scores in between.
If there are two questions I can have the scores 0, 1 or 2. But there are 2 ways of getting 1: I could have answered yes to only the first question or to only the second question.
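The counting argument for two questions can be checked by listing every possible set of answers. This is a minimal sketch in base R; the column names `q1` and `q2` are made up for the illustration.

```r
# Every possible combination of answers to two yes/no questions (1 = yes, 0 = no)
answers <- expand.grid(q1 = c(0, 1), q2 = c(0, 1))

# The questionnaire score is the number of yes answers
answers$score <- rowSums(answers[, c("q1", "q2")])

# Count how many answer patterns lead to each score
score_counts <- table(answers$score)

print(answers)
print(score_counts)
```

There are four answer patterns in total, and two of them give a score of 1.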
When we combine multiple Bernoulli trials like this we get a binomial distribution. This was historically very important in gambling and games of chance.
For the simple experiment we are going to say that the answers to the questions are completely random. It is the same as flipping a coin to answer each question. There is probability of 0.5 of picking yes and the same probability of picking no.
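Under this coin-flip assumption the exact probability of each score can be calculated rather than simulated. The sketch below uses the base R functions `choose()` and `dbinom()` for the two-question case; the variable names are only illustrative.

```r
n_questions <- 2    # number of yes/no questions
p_yes <- 0.5        # assumed probability of answering yes
scores <- 0:n_questions

# Number of answer patterns that give each score
ways <- choose(n_questions, scores)

# Exact binomial probability of each score
probs <- dbinom(scores, size = n_questions, prob = p_yes)

print(data.frame(score = scores, ways = ways, probability = probs))
```

With a probability of 0.5 every answer pattern is equally likely, so the probabilities are simply the number of ways divided by the total number of patterns.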
Now I can plot the distributions of the scores for different numbers of questions. We already worked it out for two questions, where the scores 0, 1 and 2 occur in the ratio 1:2:1. The simulation below repeats this for two questions and then extends it to 5, 10 and 20 questions.
# Fix the random seed so that the simulation is reproducible
set.seed(1234)

# Simulate 10,000 questionnaires for each number of yes/no questions
bin1 <- rbinom(10000, 2, 0.5)
hist(bin1, xlab="Score", main="Histogram of Questionnaire Scores for Two Questions")

bin2 <- rbinom(10000, 5, 0.5)
hist(bin2, xlab="Score", main="Histogram of Questionnaire Scores for Five Questions")

bin3 <- rbinom(10000, 10, 0.5)
hist(bin3, xlab="Score", main="Histogram of Questionnaire Scores for Ten Questions")

bin4 <- rbinom(10000, 20, 0.5)
hist(bin4, xlab="Score", main="Histogram of Questionnaire Scores for Twenty Questions")
mean(bin4)
[1] 10.0198
When you get to 20 questions you have a mean close to 10 (the exact value depends on the random draws) and the data look very similar to a normal distribution. If you were to combine those 20 yes/no questions you would get a valid score which is quantitative, and this is what psychometric tests often do. If they are used in that way then I have no issue with Likert data being treated as quantitative data. BUT YOU CANNOT DO THIS WITH A SINGLE QUESTION, which is still qualitative even if it is on a continuous scale.
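One way to check how close the 20-question score is to a normal distribution is to compare the exact binomial probabilities with a normal curve that has the same mean and standard deviation. This is a sketch under the same assumptions as the simulation (20 questions, probability 0.5 of yes); the variable names are illustrative.

```r
n <- 20    # number of yes/no questions
p <- 0.5   # assumed probability of answering yes

# Theoretical mean and standard deviation of the total score
theory_mean <- n * p
theory_sd <- sqrt(n * p * (1 - p))

# Exact binomial probabilities and the matching normal density at each score
scores <- 0:n
exact <- dbinom(scores, size = n, prob = p)
normal_approx <- dnorm(scores, mean = theory_mean, sd = theory_sd)

# Largest disagreement between the two across all possible scores
max_gap <- max(abs(exact - normal_approx))

print(c(mean = theory_mean, sd = theory_sd, max_gap = max_gap))
```

The theoretical mean is 10, which is what the simulated mean is estimating, and the largest gap between the two curves is small compared with the probabilities themselves.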
Summarising the Likert Data from a Student Survey
You can either tabulate the data or create bar charts, but you should not calculate means, standard deviations or variances for a single Likert question.
library("dplyr")
# Read in the csv file to the dataframe dfss2
dfss2 <- read.csv("Class_Survey1.csv")
# Simplify the column names
colnames(dfss2) <- c("Gender", "Height", "Col.", "Month", "Sib.", "Imp1", "Imp2", "Pos")
table(dfss2$Imp1)

  1   2   3   4   5
 31  57 334 699 492
barplot(table(dfss2$Imp1), main="Importance of Statistics to your Career")
table(dfss2$Imp2)

  1   2   3   4   5
 12  26 208 625 742
barplot(table(dfss2$Imp2), main="Importance of Statistics to your Degree")
table(dfss2$Pos)

  1   2   3   4   5
115 254 607 448 189
barplot(table(dfss2$Pos), main="How Positive do you Feel About Statistics?")
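If a single summary category is wanted alongside the tables, summaries that rely only on the order of the categories, such as the median category and the mode, are still available. The sketch below uses the Imp1 counts printed above and assumes that the codes 1 to 5 run from the lowest to the highest rating, which the survey file would need to confirm.

```r
# Counts for Imp1 copied from the table output above
imp1_counts <- c("1" = 31, "2" = 57, "3" = 334, "4" = 699, "5" = 492)

# Cumulative proportion of responses at or below each category
cum_prop <- cumsum(imp1_counts) / sum(imp1_counts)

# Median category: the first category where the cumulative proportion reaches 0.5
median_category <- names(cum_prop)[which(cum_prop >= 0.5)[1]]

# Modal category: the most frequently chosen response
modal_category <- names(imp1_counts)[which.max(imp1_counts)]

print(round(cum_prop, 3))
print(c(median = median_category, mode = modal_category))
```

Neither summary assumes that the gaps between the categories are equal, which is why they remain usable when a mean is not.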
You can also create cross-tables using multiple categorical variables, for example to see whether there is a difference between genders in how they feel about statistics.
tab1 <- table(dfss2$Gender, dfss2$Imp1)
tab1

                      1   2   3   4   5
  Female             23  43 259 549 373
  Male                7  13  73 146 119
  Prefer not to say   1   1   2   4   0

tab2 <- table(dfss2$Gender, dfss2$Imp2)
tab2

                      1   2   3   4   5
  Female              9  17 154 491 576
  Male                2   7  53 132 164
  Prefer not to say   1   2   1   2   2

tab3 <- table(dfss2$Gender, dfss2$Pos)
tab3

                      1   2   3   4   5
  Female             87 202 486 338 134
  Male               24  49 120 110  55
  Prefer not to say   4   3   1   0   0
# Row proportions: each row (gender) sums to 1
prop1 <- round(prop.table(tab1, 1), 3)
prop1

                        1     2     3     4     5
  Female            0.018 0.034 0.208 0.440 0.299
  Male              0.020 0.036 0.204 0.408 0.332
  Prefer not to say 0.125 0.125 0.250 0.500 0.000

prop2 <- round(prop.table(tab2, 1), 3)
prop2

                        1     2     3     4     5
  Female            0.007 0.014 0.123 0.394 0.462
  Male              0.006 0.020 0.148 0.369 0.458
  Prefer not to say 0.125 0.250 0.125 0.250 0.250

prop3 <- round(prop.table(tab3, 1), 3)
prop3

                        1     2     3     4     5
  Female            0.070 0.162 0.390 0.271 0.107
  Male              0.067 0.137 0.335 0.307 0.154
  Prefer not to say 0.500 0.375 0.125 0.000 0.000
With unbalanced numbers in the different genders it is difficult to compare absolute counts, so it is easier to compare the proportions. Looking at the proportions, there do not seem to be any large differences in the responses between the genders, except perhaps for the most positive response (5) to the question about how positive students feel about statistics, which is more common among men (0.154) than among women (0.107).
The numbers in the prefer not to say category are too small to draw any conclusions.
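To make the proportions less of a black box, the row proportions can be reproduced by hand: each count is divided by the total for its row. The sketch below rebuilds the gender by positivity counts from the printed output as a plain matrix (the name `tab3_counts` is made up so that it does not overwrite `tab3`) and checks the result against `prop.table()`.

```r
# Counts copied from the printed tab3 output (Gender by Pos)
tab3_counts <- matrix(
  c(87, 202, 486, 338, 134,
    24,  49, 120, 110,  55,
     4,   3,   1,   0,   0),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    Gender = c("Female", "Male", "Prefer not to say"),
    Pos = as.character(1:5)
  )
)

# Number of respondents in each gender group
row_totals <- rowSums(tab3_counts)

# Divide every count by its row total (R recycles row_totals down the columns)
by_hand <- round(tab3_counts / row_totals, 3)

# The same calculation using the built-in function
built_in <- round(prop.table(tab3_counts, 1), 3)

print(row_totals)
print(by_hand)
```

The row totals also show how small the prefer not to say group is, which is why its proportions jump around so much.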
You can also use R to make formatted tables that are easier to read.
library("kableExtra")