install.packages("psych")Item Analyses
Tutorial Objectives
For this tutorial, we will cover the following:
Review: loading packages
Review: loading data
Calculate a total test score
Review: Calculate descriptive statistics for test scores
Compute item statistics for Norm-Referenced Tests
Item Facility
Item Discrimination (point-biserial correlation)
If time allows: Compute item statistics for Criterion-Referenced Tests (not necessary for the homework assignment)
Create a new variable for ‘mastery’
𝜙 (the phi coefficient)
B-index (advanced)
Getting Started
First, we will install a new package, psych, that we will use for item statistics:
install.packages("psych")
Now, let’s load the two packages we will use for this tutorial:
library(tidyverse)
library(psych)
Finally, we will load our test score data. This comes from a Korean Elicited Imitation Test (EIT, see Isbell & Son, 2021). It assesses overall Korean oral proficiency. For this tutorial, I converted the original item scores, which range from 0 to 4, to 0 or 1.
d <- read_csv("Korean_EIT_dichotomous.csv")Computing a Total Score
The data we just loaded only has item scores - no total score. To get started, we’ll try something new: computing a total score from a set of item scores. We will use some tidyverse functions to do this.
d <- d %>% mutate(EIT_total = rowSums(select(., I01:I30), na.rm = T))This line of code introduces three new things: the %>% (“pipe”) operator, the mutate() function, and the select() function. The pipe puts the thing on the left into the function on the right. So, we are putting our table of item score data into the mutate() function. The mutate() function creates new variables (or changes existing ones). We made a new variable called EIT_total, which is a sum of many variables in the same row (each row = 1 person).
But we can’t sum up everything in the row, because the ID variable isn’t an item score! So we used the select() function to only pick the item score variables when we calculated the total score.
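If the pipe still feels a bit mysterious, here is the same step written without it (just a sketch for comparison; it assumes the data are already loaded as d):
# Same calculation without the pipe: mutate() is called on d directly,
# and select() names d explicitly instead of using the . placeholder
d <- mutate(d, EIT_total = rowSums(select(d, I01:I30), na.rm = TRUE))
Both versions create the same EIT_total column; the piped version just reads left to right.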
Review: Describing Test Scores
To review what we learned last week, let’s calculate some descriptive statistics for the EIT total scores.
mean(d$EIT_total)
[1] 12.84906
sd(d$EIT_total)
[1] 8.829038
median(d$EIT_total)
[1] 12
min(d$EIT_total)
[1] 0
max(d$EIT_total)
[1] 30
Remember, you can clean up your numbers using the round() function. It is good to use 2 decimal places for most things.
round(mean(d$EIT_total), 2)
[1] 12.85
For additional review, try creating a histogram of EIT total scores.
As an extra challenge, try computing z-scores and T-scores for the EIT.
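If you get stuck on either challenge, one possible approach looks like this (a sketch only; it assumes ggplot2 from the tidyverse for the histogram, plus the usual z-score and T-score formulas):
# Histogram of EIT total scores (the binwidth of 2 is an arbitrary choice)
ggplot(d, aes(x = EIT_total)) + geom_histogram(binwidth = 2)

# z-scores standardize to mean 0 and SD 1; T-scores rescale to mean 50 and SD 10
d <- d %>% mutate(EIT_z = (EIT_total - mean(EIT_total)) / sd(EIT_total),
                  EIT_T = 50 + 10 * EIT_z)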
NRT Item Statistics
We will use a function from the psych package, alpha(), to quickly generate NRT item statistics. This is also a function we will use in the future to calculate reliability, but for now we will focus on item statistics.
The alpha() function only works with item score data. Every other variable (like id, or EIT_total) should be left out. We’ll use the select() function to help us pick the right variables to put in alpha(). Let’s try it:
alpha(select(d, I01:I30))
Reliability analysis
Call: alpha(x = select(d, I01:I30))
raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
0.95 0.95 0.96 0.4 20 0.0038 0.43 0.29 0.41
95% confidence boundaries
lower alpha upper
Feldt 0.94 0.95 0.96
Duhachek 0.94 0.95 0.96
Reliability if an item is dropped:
raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
I01 0.95 0.95 0.96 0.41 20 0.0038 0.0093 0.41
I02 0.95 0.95 0.96 0.40 20 0.0038 0.0110 0.41
I03 0.95 0.95 0.96 0.40 19 0.0039 0.0118 0.41
I04 0.95 0.95 0.96 0.40 20 0.0038 0.0107 0.41
I05 0.95 0.95 0.96 0.40 19 0.0039 0.0116 0.41
I06 0.95 0.95 0.96 0.39 19 0.0040 0.0118 0.41
I07 0.95 0.95 0.96 0.40 19 0.0039 0.0116 0.41
I08 0.95 0.95 0.96 0.39 19 0.0040 0.0115 0.40
I09 0.95 0.95 0.96 0.39 19 0.0039 0.0117 0.41
I10 0.95 0.95 0.96 0.40 19 0.0039 0.0117 0.41
I11 0.95 0.95 0.96 0.40 19 0.0039 0.0114 0.41
I12 0.95 0.95 0.96 0.39 19 0.0040 0.0118 0.40
I13 0.95 0.95 0.96 0.40 19 0.0039 0.0110 0.41
I14 0.95 0.95 0.96 0.39 19 0.0040 0.0116 0.40
I15 0.95 0.95 0.96 0.40 19 0.0039 0.0110 0.41
I16 0.95 0.95 0.96 0.39 19 0.0039 0.0112 0.40
I17 0.95 0.95 0.96 0.39 19 0.0039 0.0113 0.41
I18 0.95 0.95 0.96 0.39 19 0.0040 0.0112 0.40
I19 0.95 0.95 0.96 0.40 19 0.0039 0.0109 0.41
I20 0.95 0.95 0.96 0.39 19 0.0040 0.0115 0.40
I21 0.95 0.95 0.96 0.39 19 0.0040 0.0113 0.40
I22 0.95 0.95 0.96 0.40 19 0.0039 0.0111 0.41
I23 0.95 0.95 0.96 0.39 19 0.0040 0.0111 0.40
I24 0.95 0.95 0.96 0.39 19 0.0040 0.0113 0.40
I25 0.95 0.95 0.96 0.39 19 0.0040 0.0112 0.40
I26 0.95 0.95 0.96 0.39 19 0.0040 0.0113 0.40
I27 0.95 0.95 0.96 0.39 19 0.0040 0.0114 0.41
I28 0.95 0.95 0.96 0.39 19 0.0040 0.0114 0.40
I29 0.95 0.95 0.96 0.39 19 0.0040 0.0114 0.40
I30 0.95 0.95 0.96 0.40 19 0.0039 0.0117 0.41
Item statistics
n raw.r std.r r.cor r.drop mean sd
I01 318 0.39 0.38 0.35 0.34 0.74 0.44
I02 318 0.51 0.50 0.48 0.47 0.71 0.45
I03 318 0.60 0.59 0.58 0.56 0.56 0.50
I04 317 0.47 0.47 0.44 0.43 0.84 0.37
I05 318 0.61 0.60 0.59 0.58 0.67 0.47
I06 318 0.67 0.66 0.65 0.63 0.54 0.50
I07 318 0.63 0.62 0.61 0.59 0.62 0.49
I08 318 0.68 0.67 0.66 0.65 0.60 0.49
I09 318 0.67 0.66 0.65 0.63 0.60 0.49
I10 318 0.64 0.63 0.61 0.60 0.64 0.48
I11 318 0.64 0.65 0.63 0.61 0.24 0.43
I12 318 0.68 0.68 0.66 0.65 0.50 0.50
I13 318 0.53 0.54 0.52 0.50 0.12 0.32
I14 318 0.69 0.69 0.68 0.66 0.37 0.48
I15 318 0.60 0.62 0.61 0.58 0.18 0.38
I16 318 0.66 0.67 0.65 0.63 0.21 0.41
I17 318 0.66 0.67 0.66 0.63 0.21 0.41
I18 318 0.72 0.72 0.71 0.69 0.31 0.47
I19 318 0.58 0.60 0.59 0.56 0.14 0.35
I20 318 0.70 0.70 0.69 0.67 0.31 0.46
I21 318 0.73 0.73 0.72 0.70 0.46 0.50
I22 318 0.61 0.62 0.61 0.58 0.16 0.37
I23 318 0.76 0.76 0.75 0.73 0.36 0.48
I24 318 0.72 0.72 0.71 0.70 0.37 0.48
I25 318 0.72 0.72 0.72 0.69 0.29 0.45
I26 318 0.73 0.73 0.72 0.71 0.42 0.49
I27 318 0.68 0.67 0.66 0.65 0.38 0.49
I28 318 0.69 0.69 0.68 0.66 0.29 0.46
I29 318 0.74 0.73 0.73 0.71 0.41 0.49
I30 318 0.65 0.64 0.62 0.61 0.59 0.49
Non missing response frequency for each item
0 1 miss
I01 0.26 0.74 0
I02 0.29 0.71 0
I03 0.44 0.56 0
I04 0.16 0.84 0
I05 0.33 0.67 0
I06 0.46 0.54 0
I07 0.38 0.62 0
I08 0.40 0.60 0
I09 0.40 0.60 0
I10 0.36 0.64 0
I11 0.76 0.24 0
I12 0.50 0.50 0
I13 0.88 0.12 0
I14 0.63 0.37 0
I15 0.82 0.18 0
I16 0.79 0.21 0
I17 0.79 0.21 0
I18 0.69 0.31 0
I19 0.86 0.14 0
I20 0.69 0.31 0
I21 0.54 0.46 0
I22 0.84 0.16 0
I23 0.64 0.36 0
I24 0.63 0.37 0
I25 0.71 0.29 0
I26 0.58 0.42 0
I27 0.62 0.38 0
I28 0.71 0.29 0
I29 0.59 0.41 0
I30 0.41 0.59 0
The alpha() function gives us loads of output! It’s a little overwhelming. For NRT item statistics, the columns we want are the ones labelled mean and raw.r: mean is our item facility value, and raw.r is our item discrimination (the correlation between each item and the total score). (r.drop and r.cor are other suitable values for item discrimination, but we will keep things simple for now.)
To call just the output we want more efficiently, it can help to save the alpha() output to an object.
EIT_stats <- alpha(select(d, I01:I30))Then we can just look at parts of the output. We can call specific parts by using the $, as follows:
EIT_stats$item.stats
      n     raw.r     std.r     r.cor    r.drop      mean        sd
I01 318 0.3879268 0.3813498 0.3496675 0.3446018 0.7421384 0.4381469
I02 318 0.5052283 0.4997994 0.4751159 0.4654617 0.7106918 0.4541559
I03 318 0.5999113 0.5949481 0.5759155 0.5620795 0.5628931 0.4968105
I04 317 0.4664586 0.4680516 0.4403665 0.4331602 0.8391167 0.3680042
I05 318 0.6115104 0.6041397 0.5882872 0.5763897 0.6666667 0.4721475
I06 318 0.6675350 0.6611192 0.6474029 0.6343850 0.5377358 0.4993597
I07 318 0.6292054 0.6206932 0.6064286 0.5941395 0.6163522 0.4870401
I08 318 0.6805402 0.6704202 0.6593320 0.6490391 0.6037736 0.4898834
I09 318 0.6666850 0.6600908 0.6485099 0.6341321 0.6037736 0.4898834
I10 318 0.6368319 0.6294242 0.6144315 0.6027350 0.6383648 0.4812312
I11 318 0.6372642 0.6462563 0.6343493 0.6072590 0.2389937 0.4271410
I12 318 0.6786236 0.6771883 0.6645488 0.6462478 0.5000000 0.5007880
I13 318 0.5263203 0.5441254 0.5219772 0.4988819 0.1194969 0.3248835
I14 318 0.6925887 0.6926308 0.6799532 0.6623565 0.3679245 0.4830007
I15 318 0.6039602 0.6183611 0.6071139 0.5751414 0.1792453 0.3841621
I16 318 0.6550565 0.6650782 0.6544374 0.6274939 0.2075472 0.4061898
I17 318 0.6609353 0.6685751 0.6570946 0.6335929 0.2138365 0.4106589
I18 318 0.7212431 0.7214208 0.7144092 0.6943854 0.3144654 0.4650344
I19 318 0.5846748 0.6028439 0.5899500 0.5578689 0.1415094 0.3490956
I20 318 0.6992259 0.7008475 0.6910533 0.6707646 0.3113208 0.4637634
I21 318 0.7327295 0.7267723 0.7197207 0.7047519 0.4591195 0.4991114
I22 318 0.6092467 0.6241944 0.6121396 0.5817642 0.1635220 0.3704242
I23 318 0.7553758 0.7560926 0.7517900 0.7305610 0.3584906 0.4803130
I24 318 0.7240558 0.7209225 0.7146316 0.6963714 0.3742138 0.4846819
I25 318 0.7160732 0.7226575 0.7159021 0.6896376 0.2893082 0.4541559
I26 318 0.7335298 0.7280187 0.7212347 0.7060649 0.4182390 0.4940473
I27 318 0.6768688 0.6726461 0.6636672 0.6452979 0.3836478 0.4870401
I28 318 0.6883061 0.6921950 0.6821791 0.6595707 0.2924528 0.4556061
I29 318 0.7359833 0.7332488 0.7261795 0.7087153 0.4056604 0.4917933
I30 318 0.6469005 0.6383695 0.6227102 0.6127017 0.5911950 0.4923879
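Since we mostly care about item facility and discrimination, we can also pull out just those two columns from the item.stats table above (a quick sketch using base R column selection):
# Keep only item facility (mean) and item discrimination (raw.r), rounded to 2 decimals
round(EIT_stats$item.stats[, c("mean", "raw.r")], 2)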
We can even save the item stats to a .csv document - this is very helpful for preparing reports later.
write_csv(EIT_stats$item.stats, "EIT_item_stats.csv")
CRT Item Statistics
We’re going to imagine that the EIT was a criterion-referenced test. We’ll set a passing score of 20 out of 30.
Unfortunately, there are not many specialized packages for computing CRT item statistics. Luckily, we can get item facility from alpha(). We’ll have to do other things a little bit more manually.
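For example, because the items are scored 0 or 1, item facility is just the proportion of correct responses on each item, so we could also compute it directly without alpha() (a sketch):
# Item facility = mean of a 0/1 item = proportion of test takers who answered it correctly
round(colMeans(select(d, I01:I30), na.rm = TRUE), 2)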
First, we’ll need to identify who passed the test. We will use mutate() and if_else() to do this.
d <- d %>% mutate(master = if_else(EIT_total >= 20, 1, 0))Now we have a new variable where 1 indicates that someone passed. As we can see, relatively few people (83) passed the test - only about 26% of the group.
table(d$master)
0 1
235 83
mean(d$master)
[1] 0.2610063
Now, let’s try calculating 𝜙. We can do this with correlations. Let’s start with a single item:
cor(d$I01, d$master)
[1] 0.2193702
The 𝜙 for the first EIT item (I01) is .22.
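As a check, the psych package also includes a phi() function that computes 𝜙 from a 2 × 2 table, so the value below should match the correlation above (a sketch; it assumes no missing responses on I01):
# Cross-tabulate item I01 by mastery status, then compute phi from the 2 x 2 table
phi(table(d$I01, d$master), digits = 3)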
To do this efficiently for all items, we’ll use a trick with two tidyverse functions: summarise_at() and vars().
d %>% summarise_at(vars(I01:I30), ~ cor(.x, master,
use = "pairwise.complete.obs"))# A tibble: 1 × 30
I01 I02 I03 I04 I05 I06 I07 I08 I09 I10 I11 I12 I13
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 0.219 0.269 0.423 0.241 0.344 0.450 0.410 0.379 0.452 0.373 0.574 0.494 0.532
# ℹ 17 more variables: I14 <dbl>, I15 <dbl>, I16 <dbl>, I17 <dbl>, I18 <dbl>,
# I19 <dbl>, I20 <dbl>, I21 <dbl>, I22 <dbl>, I23 <dbl>, I24 <dbl>,
# I25 <dbl>, I26 <dbl>, I27 <dbl>, I28 <dbl>, I29 <dbl>, I30 <dbl>
Pretty neat! The code is a bit complicated, but this is very convenient.
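If the summarise_at() syntax is hard to follow, the same correlations can also be computed with base R’s sapply(), which loops over the item columns one at a time (a sketch):
# Correlate each item column with the mastery variable
sapply(select(d, I01:I30), function(item) cor(item, d$master, use = "pairwise.complete.obs"))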
Last, we’ll try calculating a B index. This will also be a little tricky, as we have to calculate item facility for two different groups, the people who passed and the people who didn’t, and then subtract those values for each item.
d %>% group_by(master) %>%
summarise_at(vars(I01:I30), ~ mean(.x, na.rm = T)) %>%
arrange(desc(master)) %>%
  summarise_at(vars(I01:I30), ~ .x[1] - .x[2])
# A tibble: 1 × 30
I01 I02 I03 I04 I05 I06 I07 I08 I09 I10 I11 I12 I13
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 0.219 0.277 0.477 0.202 0.370 0.511 0.454 0.422 0.504 0.408 0.557 0.562 0.393
# ℹ 17 more variables: I14 <dbl>, I15 <dbl>, I16 <dbl>, I17 <dbl>, I18 <dbl>,
# I19 <dbl>, I20 <dbl>, I21 <dbl>, I22 <dbl>, I23 <dbl>, I24 <dbl>,
# I25 <dbl>, I26 <dbl>, I27 <dbl>, I28 <dbl>, I29 <dbl>, I30 <dbl>
This got complicated! For the homework assignment, thankfully, you’ll only be using alpha() to calculate NRT item statistics.
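If you want to see what that pipeline is doing, here is the same B-index calculation written out for a single item (a sketch; it simply subtracts the non-masters’ item facility from the masters’ item facility for I01):
# B-index for I01: item facility among masters minus item facility among non-masters
mean(d$I01[d$master == 1], na.rm = TRUE) - mean(d$I01[d$master == 0], na.rm = TRUE)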