This tutorial is based on Field et al. (2012). Reliability analysis can be done with the alpha() function in the psych package.
Load packages and data:
library(dplyr)
#url contains the data set
url <- "http://www.uk.sagepub.com/dsur/study/DSUR%20Data%20Files/Chapter%2017/raq.dat"
dat <- read.table(url, header = TRUE)
This dataset contains responses to a 23-item questionnaire with four subscales, each measuring a different type of fear:
tbl_df(dat)
## Source: local data frame [2,571 x 23]
##
## Q01 Q02 Q03 Q04 Q05 Q06 Q07 Q08 Q09 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18
## 1 4 5 2 4 4 4 3 5 5 4 5 4 4 4 4 3 5 4
## 2 5 5 2 3 4 4 4 4 1 4 4 3 5 3 2 3 4 4
## 3 4 3 4 4 2 5 4 4 4 4 3 3 4 2 4 3 4 3
## 4 3 5 5 2 3 3 2 4 4 2 4 4 4 3 3 3 4 2
## 5 4 5 3 4 4 3 3 4 2 4 4 3 3 4 4 4 4 3
## 6 4 5 3 4 2 2 2 4 2 3 4 2 3 3 1 4 3 1
## 7 4 3 3 4 4 4 4 4 3 4 4 4 4 4 4 4 4 4
## 8 4 4 3 4 4 4 4 4 2 4 4 3 4 4 3 4 4 4
## 9 3 3 5 2 1 3 1 1 3 3 1 1 1 1 1 1 1 1
## 10 4 2 2 3 4 5 4 4 3 4 4 3 4 5 4 3 4 4
## .. ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
## Variables not shown: Q19 (int), Q20 (int), Q21 (int), Q22 (int), Q23 (int)
Create a new variable to store the items for each subscale. This makes things easier later on. I’m using select() from the dplyr package to select the variables/columns.
computerFear <- select(dat, 6, 7, 10, 13, 14, 15, 18) #each number refers to the column
statsFear <- select(dat, 1, 3, 4, 5, 12, 16, 20, 21)
mathsFear <- select(dat, 8, 11, 17)
peerFear <- select(dat, 2, 9, 19, 22, 23)
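Selecting by position works, but it silently picks the wrong items if the column order ever changes. As a minimal sketch, the same subscale can be selected by name in base R. The data frame `toy` and the names `item_names`, `computer_items` and `computerFearByName` are made up for illustration; only the item numbers come from the code above.

```r
# Toy stand-in for the questionnaire: 23 columns named Q01..Q23 (values are made up).
item_names <- sprintf("Q%02d", 1:23)
toy <- as.data.frame(matrix(3L, nrow = 4, ncol = 23,
                            dimnames = list(NULL, item_names)))

# Build the column names of the computer-fear items from their item numbers.
computer_items <- sprintf("Q%02d", c(6, 7, 10, 13, 14, 15, 18))

# Select by name instead of by position.
computerFearByName <- toy[, computer_items]
names(computerFearByName)
```

With the real data you would replace `toy` with `dat`.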
To do the reliability analysis, you’ll need to load the psych package and use the alpha() function. However, ggplot2 also has a function called alpha(). If both packages are loaded, the one loaded last masks the other, so the alpha() function in ggplot2 may be called instead. If that happens, you can specify the package explicitly using psych::alpha().
library(psych)
If your scale contains reverse-scored items, you need to specify them. The keys argument allows you to specify which items are reverse-scored:
alpha(statsFear, keys = c(1, -1, 1, 1, 1, 1, 1, 1)) #Q03, which is item 2 in the statsFear subscale is reverse-scored
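To make it concrete what reverse-scoring does to the data, here is a small base R sketch for a 1 to 5 response scale. The vector `q03` holds made-up responses, and the arithmetic is my understanding of what a key of -1 amounts to on this scale, not code taken from psych.

```r
# Hypothetical responses to one reverse-worded item on a 1-5 scale.
q03 <- c(1, 2, 3, 4, 5, 5, 2)

# Reverse-score: scale minimum plus scale maximum, minus the response,
# so 1 becomes 5, 2 becomes 4, and so on.
scale_min <- 1
scale_max <- 5
q03_reversed <- (scale_min + scale_max) - q03

q03_reversed
# The two means always add up to scale_min + scale_max.
mean(q03) + mean(q03_reversed)
```

This matches the output later in this tutorial, where the mean of Q03 is 3.4 when it is not reversed and 2.6 when it is.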
If all your items are positively scored (not reverse-scored), you don’t have to specify anything, because alpha() assumes that all items are positively scored:
alpha(computerFear)
alpha(mathsFear)
alpha(peerFear)
alpha(computerFear)
##
## Reliability analysis
## Call: alpha(x = computerFear)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd
## 0.82 0.82 0.81 0.4 4.6 0.0094 3.4 0.71
##
## lower alpha upper 95% confidence boundaries
## 0.8 0.82 0.84
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se
## Q06 0.79 0.79 0.77 0.38 3.7 0.011
## Q07 0.79 0.79 0.77 0.38 3.7 0.011
## Q10 0.82 0.82 0.80 0.44 4.7 0.010
## Q13 0.79 0.79 0.77 0.39 3.8 0.011
## Q14 0.80 0.80 0.77 0.39 3.9 0.011
## Q15 0.81 0.81 0.79 0.41 4.2 0.011
## Q18 0.79 0.78 0.76 0.38 3.6 0.011
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## Q06 2571 0.75 0.74 0.68 0.62 3.8 1.12
## Q07 2571 0.75 0.73 0.68 0.62 3.1 1.10
## Q10 2571 0.54 0.57 0.44 0.40 3.7 0.88
## Q13 2571 0.72 0.73 0.67 0.61 3.6 0.95
## Q14 2571 0.70 0.70 0.64 0.58 3.1 1.00
## Q15 2571 0.64 0.64 0.54 0.49 3.2 1.01
## Q18 2571 0.76 0.76 0.72 0.65 3.4 1.05
##
## Non missing response frequency for each item
## 1 2 3 4 5 miss
## Q06 0.06 0.10 0.13 0.44 0.27 0
## Q07 0.09 0.24 0.26 0.34 0.07 0
## Q10 0.02 0.10 0.18 0.57 0.14 0
## Q13 0.03 0.12 0.25 0.48 0.12 0
## Q14 0.07 0.18 0.38 0.31 0.06 0
## Q15 0.06 0.18 0.30 0.39 0.07 0
## Q18 0.06 0.12 0.31 0.37 0.14 0
What do the summary statistics mean?
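As a rough sketch of where raw_alpha, average_r and std.alpha come from, the textbook formulas can be computed by hand in base R. The data frame `items` is made up for illustration, and this is not the code psych uses internally.

```r
# Made-up responses of five people to three items.
items <- data.frame(
  q1 = c(1, 2, 3, 4, 5),
  q2 = c(2, 2, 3, 5, 5),
  q3 = c(1, 3, 3, 4, 4)
)

k <- ncol(items)

# Raw alpha: based on item variances and the variance of the total score.
item_vars <- sapply(items, var)
total_var <- var(rowSums(items))
raw_alpha <- (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# average_r: mean of the correlations between all pairs of items.
r <- cor(items)
average_r <- mean(r[lower.tri(r)])

# Standardized alpha: based on the average inter-item correlation.
std_alpha <- (k * average_r) / (1 + (k - 1) * average_r)

round(c(raw_alpha = raw_alpha, average_r = average_r, std_alpha = std_alpha), 2)
```

Raw alpha uses the items as they are, whereas standardized alpha treats every item as if it had the same variance.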
How to interpret ‘Reliability if an item is dropped’?
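The idea behind this table can be sketched in base R: drop each item in turn and recompute alpha on the remaining items. The data frame `items` and the helper `cronbach_alpha()` are hypothetical, written only for this illustration; item q4 is deliberately unrelated to the others.

```r
# Made-up responses; q4 does not fit with the other three items.
items <- data.frame(
  q1 = c(1, 2, 3, 4, 5),
  q2 = c(2, 2, 3, 5, 5),
  q3 = c(1, 3, 3, 4, 4),
  q4 = c(3, 5, 1, 4, 2)
)

# Hypothetical helper: raw Cronbach's alpha for a data frame of items.
cronbach_alpha <- function(df) {
  k <- ncol(df)
  (k / (k - 1)) * (1 - sum(sapply(df, var)) / var(rowSums(df)))
}

overall_alpha <- cronbach_alpha(items)

# Recompute alpha with each item left out in turn.
alpha_if_dropped <- sapply(names(items), function(item) {
  cronbach_alpha(items[, setdiff(names(items), item), drop = FALSE])
})

round(overall_alpha, 2)
round(alpha_if_dropped, 2)
```

If alpha goes up noticeably when an item is dropped, as it does for q4 here, that item is lowering the reliability of the scale.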
How to interpret ‘Item statistics’?
See ?alpha for details on each column. mean and sd: the mean and standard deviation of each item.
All items should correlate with the total score, so we’re looking for items that don’t correlate with the overall score from the scale. If r.drop values are less than about .3, it means that particular item doesn’t correlate very well with the scale overall.
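Here is a minimal base R sketch of the r.drop idea: correlate each item with the total of the remaining items, then flag the items that fall below .3. The data frame `items` is made up, and the .3 cut-off is the rule of thumb mentioned above.

```r
# Made-up responses; q4 does not fit with the other three items.
items <- data.frame(
  q1 = c(1, 2, 3, 4, 5),
  q2 = c(2, 2, 3, 5, 5),
  q3 = c(1, 3, 3, 4, 4),
  q4 = c(3, 5, 1, 4, 2)
)

# Correlate each item with the sum of all the other items.
r_drop <- sapply(names(items), function(item) {
  rest_total <- rowSums(items[, setdiff(names(items), item), drop = FALSE])
  cor(items[[item]], rest_total)
})

# Flag items below the .3 rule of thumb.
weak_items <- names(r_drop)[r_drop < 0.3]

round(r_drop, 2)
weak_items
```

The item is left out of the total so that it is not correlated with itself, which would inflate the correlation.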
How to interpret the final frequency table?
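Each row of that table shows the proportion of non-missing responses that fell in each response category, plus the proportion of missing responses. A base R sketch for a single item, using made-up responses in `responses`:

```r
# Made-up responses to one item on a 1-5 scale, with one missing value.
responses <- c(4, 5, 4, 2, 4, 3, 5, 4, 1, 4, NA)

# Proportion of non-missing responses in each category.
# Setting the levels keeps categories that nobody chose in the table.
freq <- prop.table(table(factor(responses, levels = 1:5)))

# Proportion of responses that are missing.
miss <- mean(is.na(responses))

round(freq, 2)
round(miss, 2)
```

Rows where almost everyone picked the same category point to items with very little variability.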
alpha(statsFear, keys = c(1, -1, 1, 1, 1, 1, 1, 1)) #Q03, which is item 2 in the statsFear subscale is reverse-scored
##
## Reliability analysis
## Call: alpha(x = statsFear, keys = c(1, -1, 1, 1, 1, 1, 1, 1))
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd
## 0.82 0.82 0.81 0.37 4.7 0.0089 3 0.64
##
## lower alpha upper 95% confidence boundaries
## 0.8 0.82 0.84
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se
## Q01 0.80 0.80 0.79 0.37 4.1 0.0101
## Q03- 0.80 0.80 0.79 0.37 4.1 0.0101
## Q04 0.80 0.80 0.78 0.36 4.0 0.0102
## Q05 0.81 0.81 0.80 0.38 4.2 0.0099
## Q12 0.80 0.80 0.79 0.36 4.0 0.0102
## Q16 0.79 0.80 0.78 0.36 3.9 0.0103
## Q20 0.82 0.82 0.80 0.40 4.6 0.0096
## Q21 0.79 0.80 0.78 0.36 3.9 0.0104
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## Q01 2571 0.65 0.67 0.60 0.54 3.6 0.83
## Q03- 2571 0.69 0.67 0.60 0.55 2.6 1.08
## Q04 2571 0.69 0.70 0.64 0.58 3.2 0.95
## Q05 2571 0.63 0.63 0.55 0.49 3.3 0.96
## Q12 2571 0.69 0.69 0.63 0.57 2.8 0.92
## Q16 2571 0.71 0.71 0.67 0.60 3.1 0.92
## Q20 2571 0.58 0.56 0.47 0.42 2.4 1.04
## Q21 2571 0.72 0.71 0.67 0.61 2.8 0.98
##
## Non missing response frequency for each item
## 1 2 3 4 5 miss
## Q01 0.02 0.07 0.29 0.52 0.11 0
## Q03 0.03 0.17 0.34 0.26 0.19 0
## Q04 0.05 0.17 0.36 0.37 0.05 0
## Q05 0.04 0.18 0.29 0.43 0.06 0
## Q12 0.09 0.23 0.46 0.20 0.02 0
## Q16 0.06 0.16 0.42 0.33 0.04 0
## Q20 0.22 0.37 0.25 0.15 0.02 0
## Q21 0.09 0.29 0.34 0.26 0.02 0
How to interpret?
This demonstrates what happens when you forget to reverse your items using the keys parameter:
#alpha(statsFear, keys = c(1, -1, 1, 1, 1, 1, 1, 1)) #Q03, which is item 2 in the statsFear subscale is reverse-scored
alpha(statsFear, keys = c(1, 1, 1, 1, 1, 1, 1, 1))
##
## Reliability analysis
## Call: alpha(x = statsFear, keys = c(1, 1, 1, 1, 1, 1, 1, 1))
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd
## 0.61 0.64 0.71 0.18 1.8 0.014 3.1 0.5
##
## lower alpha upper 95% confidence boundaries
## 0.58 0.61 0.63
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se
## Q01 0.52 0.56 0.64 0.15 1.3 0.017
## Q03 0.80 0.80 0.79 0.37 4.1 0.010
## Q04 0.50 0.55 0.64 0.15 1.2 0.017
## Q05 0.52 0.57 0.66 0.16 1.3 0.017
## Q12 0.52 0.56 0.65 0.15 1.3 0.017
## Q16 0.51 0.55 0.63 0.15 1.2 0.017
## Q20 0.56 0.60 0.68 0.18 1.5 0.016
## Q21 0.50 0.55 0.63 0.15 1.2 0.017
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## Q01 2571 0.65 0.68 0.62 0.51 3.6 0.83
## Q03 2571 -0.35 -0.37 -0.64 -0.55 3.4 1.08
## Q04 2571 0.69 0.69 0.65 0.53 3.2 0.95
## Q05 2571 0.65 0.65 0.57 0.47 3.3 0.96
## Q12 2571 0.66 0.67 0.62 0.50 2.8 0.92
## Q16 2571 0.69 0.70 0.66 0.53 3.1 0.92
## Q20 2571 0.57 0.55 0.45 0.35 2.4 1.04
## Q21 2571 0.70 0.70 0.66 0.54 2.8 0.98
##
## Non missing response frequency for each item
## 1 2 3 4 5 miss
## Q01 0.02 0.07 0.29 0.52 0.11 0
## Q03 0.03 0.17 0.34 0.26 0.19 0
## Q04 0.05 0.17 0.36 0.37 0.05 0
## Q05 0.04 0.18 0.29 0.43 0.06 0
## Q12 0.09 0.23 0.46 0.20 0.02 0
## Q16 0.06 0.16 0.42 0.33 0.04 0
## Q20 0.22 0.37 0.25 0.15 0.02 0
## Q21 0.09 0.29 0.34 0.26 0.02 0
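What’s wrong? Q03 now correlates negatively with the rest of the scale (r.drop = -.55), and raw_alpha would rise from .61 to .80 if it were dropped. Both are signs of an item that should have been reverse-scored.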
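The same effect can be reproduced on a tiny made-up data set in base R. In this sketch, `items`, `cronbach_alpha()` and the reverse-worded item q4 are all hypothetical. Alpha is computed once with q4 left as it is and once with q4 reverse-scored.

```r
# Made-up responses; q4 is worded in the opposite direction to q1-q3.
items <- data.frame(
  q1 = c(1, 2, 3, 4, 5),
  q2 = c(2, 2, 3, 5, 5),
  q3 = c(1, 3, 3, 4, 4),
  q4 = c(3, 5, 1, 4, 2)
)

# Hypothetical helper: raw Cronbach's alpha for a data frame of items.
cronbach_alpha <- function(df) {
  k <- ncol(df)
  (k / (k - 1)) * (1 - sum(sapply(df, var)) / var(rowSums(df)))
}

# Alpha when you forget to reverse q4.
alpha_unreversed <- cronbach_alpha(items)

# Alpha after reverse-scoring q4 on the 1-5 scale.
items_fixed <- items
items_fixed$q4 <- 6 - items_fixed$q4
alpha_reversed <- cronbach_alpha(items_fixed)

round(c(unreversed = alpha_unreversed, reversed = alpha_reversed), 2)
```

Reversing the item does not change its variance, only the sign of its relationship with the other items, and that is enough to raise alpha.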
The fear of computers subscale was reliable, α = .82, but the fear of negative peer evaluation subscale had relatively low reliability, α = .57.