#load packages
library(tidyverse)
#get data
download.file("https://raw.githubusercontent.com/gtlaflair/ltrc-2019/gh-pages/data/placement_1.csv",
"placement_1.csv", mode = "wb")
#load data
data <- read_csv("placement_1.csv")
We often work with large(-ish) datasets that have many columns and rows. We rarely need to work with an entire dataset at once; more often we compute new variables, adjust values, or run statistics on specific parts of it. Selecting those parts is called subsetting, and R offers a few different ways to do it.
#base R
read_sub <- data[40:74]
#tidyverse options
read_tidy_select <- data %>% select(contains("read"))
read_tidy_varnames <- data %>% select(q36_read_mi:q70_read_det_an)
read_tidy_position <- data %>% select(40:74)
#for practical purposes, we'll take something with a sensible name
read <- data %>% select(contains("read"))
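To see that these approaches select the same columns, here is a minimal sketch on a made-up dataset. The tibble `toy` and its column names are hypothetical and only mimic the naming pattern of the placement data.

```r
library(dplyr)

# Hypothetical toy data: two non-item columns followed by three "read" items
toy <- tibble(
  ID = 1:3,
  group = c("a", "b", "a"),
  q1_read_mi = c(1, 0, 1),
  q2_read_det = c(1, 1, 0),
  q3_read_voc = c(0, 0, 1)
)

by_position_base <- toy[3:5]                              # base R, by position
by_name_match <- toy %>% select(contains("read"))         # by name pattern
by_name_range <- toy %>% select(q1_read_mi:q3_read_voc)   # by name range
by_position_tidy <- toy %>% select(3:5)                   # tidyverse, by position

# Each approach should return the same three columns
names(by_position_base)
identical(names(by_position_base), names(by_name_match))
```

Selecting by name tends to be safer than selecting by position, since positions shift whenever columns are added or reordered.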
Now we’ll turn to the psych package to run some other classical test theory (CTT)-based analyses. psych is a really versatile package with lots of useful functions.
#install.packages("psych") #remove the first # if you need to install the package
library(psych) #loads the package into the current session
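To calculate Cronbach’s alpha and get CTT item statistics, simply use the alpha() function. Because ggplot2 (loaded with the tidyverse) also has a function called alpha(), load psych after the tidyverse, as we did here, or call psych::alpha() explicitly. Note: the Help file for the alpha() function is super informative. raw_alpha is the standard Cronbach’s alpha calculation, based on the item covariances; std.alpha is based on the item correlations, and the two are nearly identical in most cases. The output also gives a 95% confidence interval (really neat!); the value to report is the alpha printed between the lower and upper bounds.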
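You also get CTT item stats. In this case, the frequency of ‘1’ responses is equivalent to item facility (IF); it’s also available in the column labelled “mean”. Discrimination is r.drop: the correlation between each item and the total of the remaining items (for dichotomous items, a corrected point-biserial correlation).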
alpha(read)
##
## Reliability analysis
## Call: alpha(x = read)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.83 0.83 0.9 0.12 4.8 0.026 0.6 0.17 0.12
##
## lower alpha upper 95% confidence boundaries
## 0.77 0.83 0.88
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## q36_read_mi 0.82 0.82 0.9 0.12 4.6 0.027 0.014 0.12
## q37_read_det 0.82 0.83 0.9 0.12 4.7 0.027 0.014 0.12
## q38_read_det 0.82 0.82 0.9 0.12 4.7 0.027 0.014 0.12
## q39_read_mi 0.82 0.83 0.9 0.12 4.7 0.027 0.014 0.12
## q40_read_voc 0.82 0.82 0.9 0.12 4.7 0.027 0.013 0.12
## q41_read_voc 0.82 0.83 0.9 0.12 4.8 0.026 0.013 0.13
## q42_read_mi 0.82 0.83 0.9 0.12 4.8 0.027 0.014 0.13
## q43_read_det 0.82 0.82 0.9 0.12 4.6 0.027 0.013 0.12
## q44_read_mi 0.82 0.82 0.9 0.12 4.7 0.027 0.013 0.12
## q45_read_det 0.82 0.82 0.9 0.12 4.6 0.027 0.014 0.12
## q46_read_mi 0.83 0.83 0.9 0.13 5.0 0.026 0.013 0.13
## q47_read_torg 0.82 0.83 0.9 0.12 4.8 0.026 0.013 0.12
## q48_read_det 0.82 0.83 0.9 0.12 4.8 0.027 0.014 0.12
## q49_read_det 0.82 0.83 0.9 0.12 4.8 0.027 0.014 0.12
## q50_read_voc 0.82 0.82 0.9 0.12 4.5 0.028 0.013 0.12
## q51_read_torg 0.82 0.82 0.9 0.12 4.6 0.028 0.014 0.12
## q52_read_inf 0.82 0.82 0.9 0.12 4.6 0.027 0.014 0.12
## q53_read_torg 0.83 0.83 0.9 0.13 4.9 0.026 0.013 0.13
## q54_read_det 0.82 0.82 0.9 0.12 4.6 0.028 0.013 0.12
## q55_read_torg 0.82 0.82 0.9 0.12 4.6 0.028 0.013 0.12
## q56_read_purp 0.81 0.82 0.9 0.12 4.5 0.028 0.013 0.12
## q57_read_purp 0.83 0.83 0.9 0.13 4.9 0.026 0.014 0.13
## q58_read_mi 0.82 0.83 0.9 0.12 4.7 0.027 0.014 0.12
## q59_read_inf 0.81 0.82 0.9 0.12 4.5 0.028 0.013 0.12
## q60_read_det_an 0.82 0.82 0.9 0.12 4.6 0.027 0.014 0.12
## q61_read_purp_an 0.82 0.83 0.9 0.12 4.8 0.027 0.014 0.13
## q62_read_voc_an 0.83 0.83 0.9 0.13 4.9 0.026 0.013 0.12
## q63_read_inf_an 0.82 0.82 0.9 0.12 4.6 0.027 0.014 0.12
## q64_read_mi_an 0.82 0.83 0.9 0.12 4.8 0.027 0.014 0.12
## q65_read_torg_an 0.82 0.83 0.9 0.12 4.8 0.026 0.014 0.12
## q66_read_torg_an 0.81 0.82 0.9 0.12 4.5 0.028 0.014 0.12
## q67_read_purp_an 0.82 0.82 0.9 0.12 4.6 0.027 0.014 0.12
## q68_read_mi_an 0.82 0.83 0.9 0.12 4.8 0.027 0.014 0.12
## q69_read_det_an 0.82 0.83 0.9 0.12 4.7 0.027 0.014 0.12
## q70_read_det_an 0.82 0.82 0.9 0.12 4.7 0.027 0.014 0.12
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## q36_read_mi 88 0.43 0.46 0.44 0.386 0.89 0.32
## q37_read_det 88 0.34 0.35 0.33 0.283 0.85 0.36
## q38_read_det 88 0.39 0.39 0.37 0.326 0.81 0.40
## q39_read_mi 88 0.33 0.35 0.33 0.264 0.78 0.41
## q40_read_voc 88 0.41 0.43 0.42 0.353 0.82 0.39
## q41_read_voc 88 0.19 0.25 0.24 0.141 0.92 0.27
## q42_read_mi 88 0.28 0.30 0.28 0.233 0.89 0.32
## q43_read_det 88 0.43 0.45 0.44 0.371 0.76 0.43
## q44_read_mi 88 0.35 0.41 0.41 0.319 0.95 0.21
## q45_read_det 88 0.42 0.43 0.42 0.369 0.85 0.36
## q46_read_mi 88 0.14 0.14 0.11 0.061 0.70 0.46
## q47_read_torg 88 0.30 0.28 0.26 0.216 0.40 0.49
## q48_read_det 88 0.32 0.33 0.30 0.254 0.81 0.40
## q49_read_det 88 0.33 0.34 0.31 0.260 0.75 0.44
## q50_read_voc 88 0.53 0.54 0.54 0.478 0.81 0.40
## q51_read_torg 88 0.49 0.47 0.45 0.418 0.59 0.49
## q52_read_inf 88 0.45 0.44 0.42 0.373 0.44 0.50
## q53_read_torg 88 0.24 0.22 0.19 0.154 0.45 0.50
## q54_read_det 88 0.52 0.50 0.50 0.458 0.57 0.50
## q55_read_torg 88 0.50 0.48 0.48 0.435 0.56 0.50
## q56_read_purp 88 0.56 0.54 0.53 0.501 0.51 0.50
## q57_read_purp 88 0.23 0.23 0.19 0.145 0.43 0.50
## q58_read_mi 88 0.35 0.34 0.32 0.278 0.61 0.49
## q59_read_inf 88 0.62 0.60 0.60 0.558 0.55 0.50
## q60_read_det_an 88 0.44 0.47 0.46 0.381 0.83 0.38
## q61_read_purp_an 88 0.28 0.28 0.25 0.211 0.27 0.45
## q62_read_voc_an 88 0.23 0.21 0.18 0.150 0.40 0.49
## q63_read_inf_an 88 0.47 0.47 0.45 0.404 0.36 0.48
## q64_read_mi_an 88 0.31 0.29 0.27 0.227 0.45 0.50
## q65_read_torg_an 88 0.29 0.28 0.25 0.213 0.38 0.49
## q66_read_torg_an 88 0.56 0.54 0.54 0.499 0.30 0.46
## q67_read_purp_an 88 0.47 0.47 0.44 0.401 0.34 0.48
## q68_read_mi_an 88 0.35 0.33 0.31 0.276 0.53 0.50
## q69_read_det_an 88 0.37 0.37 0.34 0.303 0.30 0.46
## q70_read_det_an 88 0.44 0.43 0.40 0.371 0.27 0.45
##
## Non missing response frequency for each item
## 0 1 miss
## q36_read_mi 0.11 0.89 0
## q37_read_det 0.15 0.85 0
## q38_read_det 0.19 0.81 0
## q39_read_mi 0.22 0.78 0
## q40_read_voc 0.18 0.82 0
## q41_read_voc 0.08 0.92 0
## q42_read_mi 0.11 0.89 0
## q43_read_det 0.24 0.76 0
## q44_read_mi 0.05 0.95 0
## q45_read_det 0.15 0.85 0
## q46_read_mi 0.30 0.70 0
## q47_read_torg 0.60 0.40 0
## q48_read_det 0.19 0.81 0
## q49_read_det 0.25 0.75 0
## q50_read_voc 0.19 0.81 0
## q51_read_torg 0.41 0.59 0
## q52_read_inf 0.56 0.44 0
## q53_read_torg 0.55 0.45 0
## q54_read_det 0.43 0.57 0
## q55_read_torg 0.44 0.56 0
## q56_read_purp 0.49 0.51 0
## q57_read_purp 0.57 0.43 0
## q58_read_mi 0.39 0.61 0
## q59_read_inf 0.45 0.55 0
## q60_read_det_an 0.17 0.83 0
## q61_read_purp_an 0.73 0.27 0
## q62_read_voc_an 0.60 0.40 0
## q63_read_inf_an 0.64 0.36 0
## q64_read_mi_an 0.55 0.45 0
## q65_read_torg_an 0.62 0.38 0
## q66_read_torg_an 0.70 0.30 0
## q67_read_purp_an 0.66 0.34 0
## q68_read_mi_an 0.47 0.53 0
## q69_read_det_an 0.70 0.30 0
## q70_read_det_an 0.73 0.27 0
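To make the main quantities in this output less of a black box, here is a hedged sketch that computes them by hand in base R. The data frame `scores` is invented for illustration (6 test takers, 4 dichotomous items); the formulas are the usual textbook definitions of raw alpha, item facility, and corrected item-total correlation, not a copy of the psych source code.

```r
# Hypothetical scored responses: 6 test takers x 4 dichotomous items
scores <- data.frame(
  item1 = c(1, 1, 1, 0, 1, 0),
  item2 = c(1, 1, 0, 0, 1, 0),
  item3 = c(1, 0, 1, 0, 1, 1),
  item4 = c(1, 1, 0, 0, 0, 0)
)

k <- ncol(scores)          # number of items
total <- rowSums(scores)   # total score per test taker

# Cronbach's alpha from the item variances and the total-score variance
raw_alpha <- (k / (k - 1)) * (1 - sum(sapply(scores, var)) / var(total))

# Item facility: proportion of test takers answering each item correctly
item_facility <- colMeans(scores)

# Corrected item-total correlation: each item against the total of the others
r_drop <- sapply(names(scores), function(item) {
  cor(scores[[item]], total - scores[[item]])
})

round(raw_alpha, 3)
round(item_facility, 2)
round(r_drop, 2)
```

Running psych's alpha() on the same toy data should give matching raw_alpha, mean, and r.drop values, which is a quick way to confirm what each column means.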
#to get some more detail, save the output as an object, which allows you to call specific info more easily
read_rel <- alpha(read)
#more reliability indices
read_rel$total
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd
## 0.8251025 0.8288168 0.904726 0.1215233 4.841696 0.0263587 0.6038961 0.1677314
## median_r
## 0.1217997
#item stats
read_rel$item.stats
## n raw.r std.r r.cor r.drop mean
## q36_read_mi 88 0.4316186 0.4608162 0.4434800 0.38582750 0.8863636
## q37_read_det 88 0.3389172 0.3498596 0.3270724 0.28349677 0.8522727
## q38_read_det 88 0.3861963 0.3866761 0.3687368 0.32643652 0.8068182
## q39_read_mi 88 0.3292686 0.3504764 0.3268457 0.26431620 0.7840909
## q40_read_voc 88 0.4097620 0.4293561 0.4205116 0.35257464 0.8181818
## q41_read_voc 88 0.1867330 0.2499418 0.2361900 0.14145335 0.9204545
## q42_read_mi 88 0.2843999 0.3026054 0.2795421 0.23331533 0.8863636
## q43_read_det 88 0.4332633 0.4493525 0.4394412 0.37115324 0.7613636
## q44_read_mi 88 0.3509029 0.4108112 0.4081349 0.31903242 0.9545455
## q45_read_det 88 0.4212150 0.4348710 0.4230232 0.36930739 0.8522727
## q46_read_mi 88 0.1388695 0.1426591 0.1050185 0.06118605 0.7045455
## q47_read_torg 88 0.2952506 0.2832201 0.2622658 0.21604208 0.3977273
## q48_read_det 88 0.3171607 0.3264651 0.2965789 0.25444975 0.8068182
## q49_read_det 88 0.3282001 0.3358172 0.3127481 0.25968779 0.7500000
## q50_read_voc 88 0.5291988 0.5392619 0.5420148 0.47785205 0.8068182
## q51_read_torg 88 0.4866652 0.4709641 0.4478221 0.41840661 0.5909091
## q52_read_inf 88 0.4455125 0.4398160 0.4187310 0.37344314 0.4431818
## q53_read_torg 88 0.2367174 0.2186012 0.1874001 0.15398440 0.4545455
## q54_read_det 88 0.5234405 0.5024657 0.4998781 0.45766072 0.5681818
## q55_read_torg 88 0.5028715 0.4824284 0.4785254 0.43516322 0.5568182
## q56_read_purp 88 0.5641805 0.5356891 0.5328640 0.50145660 0.5113636
## q57_read_purp 88 0.2272411 0.2284363 0.1880328 0.14466987 0.4318182
## q58_read_mi 88 0.3543845 0.3411942 0.3156675 0.27832447 0.6136364
## q59_read_inf 88 0.6156073 0.5982617 0.5991671 0.55829191 0.5454545
## q60_read_det_an 88 0.4351173 0.4654515 0.4589311 0.38071006 0.8295455
## q61_read_purp_an 88 0.2829377 0.2800009 0.2518572 0.21061091 0.2727273
## q62_read_voc_an 88 0.2316081 0.2146005 0.1793426 0.15016854 0.3977273
## q63_read_inf_an 88 0.4720241 0.4665676 0.4486219 0.40422910 0.3636364
## q64_read_mi_an 88 0.3070928 0.2921209 0.2694716 0.22696822 0.4545455
## q65_read_torg_an 88 0.2915405 0.2791659 0.2469691 0.21306819 0.3750000
## q66_read_torg_an 88 0.5566419 0.5406792 0.5360034 0.49909542 0.2954545
## q67_read_purp_an 88 0.4678245 0.4656779 0.4449997 0.40077190 0.3409091
## q68_read_mi_an 88 0.3535436 0.3306766 0.3080021 0.27550237 0.5340909
## q69_read_det_an 88 0.3731634 0.3675500 0.3406556 0.30301957 0.2954545
## q70_read_det_an 88 0.4359308 0.4294318 0.4030160 0.37107121 0.2727273
## sd
## q36_read_mi 0.3191878
## q37_read_det 0.3568629
## q38_read_det 0.3970568
## q39_read_mi 0.4138094
## q40_read_voc 0.3879049
## q41_read_voc 0.2721389
## q42_read_mi 0.3191878
## q43_read_det 0.4286927
## q44_read_mi 0.2094926
## q45_read_det 0.3568629
## q46_read_mi 0.4588614
## q47_read_torg 0.4922333
## q48_read_det 0.3970568
## q49_read_det 0.4354942
## q50_read_voc 0.3970568
## q51_read_torg 0.4944837
## q52_read_inf 0.4996080
## q53_read_torg 0.5007831
## q54_read_det 0.4981680
## q55_read_torg 0.4996080
## q56_read_purp 0.5027355
## q57_read_purp 0.4981680
## q58_read_mi 0.4897059
## q59_read_inf 0.5007831
## q60_read_det_an 0.3781866
## q61_read_purp_an 0.4479140
## q62_read_voc_an 0.4922333
## q63_read_inf_an 0.4838024
## q64_read_mi_an 0.5007831
## q65_read_torg_an 0.4868973
## q66_read_torg_an 0.4588614
## q67_read_purp_an 0.4767313
## q68_read_mi_an 0.5016951
## q69_read_det_an 0.4588614
## q70_read_det_an 0.4479140
#looking to improve reliability? This shows what happens to reliability estimates if you delete an item
read_rel$alpha.drop
## raw_alpha std.alpha G6(smc) average_r S/N alpha se
## q36_read_mi 0.8195960 0.8221129 0.8995782 0.1196623 4.621542 0.02718498
## q37_read_det 0.8218034 0.8257288 0.9009747 0.1223130 4.738183 0.02684768
## q38_read_det 0.8205309 0.8245430 0.9002509 0.1214335 4.699402 0.02704966
## q39_read_mi 0.8223148 0.8257090 0.9009865 0.1222982 4.737532 0.02676470
## q40_read_voc 0.8198372 0.8231510 0.8984617 0.1204139 4.654542 0.02714501
## q41_read_voc 0.8249165 0.8288791 0.8995323 0.1247000 4.843823 0.02643906
## q42_read_mi 0.8230662 0.8272309 0.9014488 0.1234419 4.788072 0.02669341
## q43_read_det 0.8190712 0.8224924 0.8983605 0.1199362 4.633561 0.02725242
## q44_read_mi 0.8221817 0.8237582 0.8968872 0.1208569 4.674022 0.02682555
## q45_read_det 0.8196132 0.8229698 0.8988036 0.1202821 4.648754 0.02718412
## q46_read_mi 0.8289707 0.8321555 0.9043327 0.1272629 4.957894 0.02582882
## q47_read_torg 0.8243093 0.8278407 0.9004743 0.1239050 4.808575 0.02646833
## q48_read_det 0.8225679 0.8264752 0.9025114 0.1228719 4.762866 0.02673179
## q49_read_det 0.8225121 0.8261775 0.9013187 0.1226484 4.752995 0.02674804
## q50_read_voc 0.8161816 0.8194783 0.8952077 0.1177883 4.539501 0.02770588
## q51_read_torg 0.8171648 0.8217758 0.9006554 0.1194199 4.610910 0.02758678
## q52_read_inf 0.8187706 0.8228070 0.9004544 0.1201640 4.643564 0.02730673
## q53_read_torg 0.8265686 0.8298474 0.9031345 0.1254487 4.877078 0.02616961
## q54_read_det 0.8157182 0.8207224 0.8968436 0.1186673 4.577940 0.02783081
## q55_read_torg 0.8165299 0.8213936 0.8973148 0.1191460 4.598905 0.02768871
## q56_read_purp 0.8140635 0.8195998 0.8970345 0.1178736 4.543230 0.02810750
## q57_read_purp 0.8268547 0.8295446 0.9044473 0.1252137 4.866636 0.02608396
## q58_read_mi 0.8221265 0.8260059 0.9011280 0.1225200 4.747322 0.02679862
## q59_read_inf 0.8119721 0.8174525 0.8958789 0.1163788 4.478026 0.02844004
## q60_read_det_an 0.8191304 0.8219590 0.8974428 0.1195516 4.616685 0.02724206
## q61_read_purp_an 0.8241126 0.8279417 0.9021909 0.1239819 4.811982 0.02650888
## q62_read_voc_an 0.8265800 0.8299703 0.9039309 0.1255442 4.881327 0.02614580
## q63_read_inf_an 0.8177190 0.8219220 0.8992088 0.1195249 4.615515 0.02749433
## q64_read_mi_an 0.8240069 0.8275612 0.9008734 0.1236923 4.799158 0.02651689
## q65_read_torg_an 0.8243614 0.8279678 0.9032330 0.1240018 4.812866 0.02648075
## q66_read_torg_an 0.8146645 0.8194301 0.8967418 0.1177544 4.538022 0.02795286
## q67_read_purp_an 0.8178713 0.8219515 0.9001292 0.1195461 4.616448 0.02745897
## q68_read_mi_an 0.8222910 0.8263412 0.9013801 0.1227713 4.758420 0.02680060
## q69_read_det_an 0.8211904 0.8251607 0.9015850 0.1218904 4.719539 0.02692666
## q70_read_det_an 0.8189878 0.8231485 0.9008826 0.1204120 4.654463 0.02728542
## var.r med.r
## q36_read_mi 0.01378056 0.1190962
## q37_read_det 0.01385751 0.1233532
## q38_read_det 0.01361015 0.1197474
## q39_read_mi 0.01368053 0.1233532
## q40_read_voc 0.01335805 0.1178511
## q41_read_voc 0.01293957 0.1259882
## q42_read_mi 0.01356305 0.1259882
## q43_read_det 0.01334949 0.1190962
## q44_read_mi 0.01322197 0.1217997
## q45_read_det 0.01351409 0.1207578
## q46_read_mi 0.01304529 0.1276097
## q47_read_torg 0.01343519 0.1242063
## q48_read_det 0.01373068 0.1240347
## q49_read_det 0.01358199 0.1217997
## q50_read_voc 0.01325173 0.1173789
## q51_read_torg 0.01371220 0.1205495
## q52_read_inf 0.01377635 0.1181249
## q53_read_torg 0.01336051 0.1276097
## q54_read_det 0.01328524 0.1181249
## q55_read_torg 0.01324041 0.1207578
## q56_read_purp 0.01319967 0.1173789
## q57_read_purp 0.01350324 0.1276097
## q58_read_mi 0.01372801 0.1242063
## q59_read_inf 0.01313777 0.1169337
## q60_read_det_an 0.01352360 0.1190962
## q61_read_purp_an 0.01360150 0.1256562
## q62_read_voc_an 0.01335863 0.1242063
## q63_read_inf_an 0.01361858 0.1169337
## q64_read_mi_an 0.01351042 0.1242063
## q65_read_torg_an 0.01355305 0.1242063
## q66_read_torg_an 0.01360184 0.1163279
## q67_read_purp_an 0.01374604 0.1178511
## q68_read_mi_an 0.01350866 0.1242063
## q69_read_det_an 0.01393102 0.1217997
## q70_read_det_an 0.01395809 0.1197474
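One way to read this table is to flag items whose removal would push alpha above the overall value. The sketch below is self-contained, so it rebuilds a three-item excerpt from the raw_alpha values printed above; the object names `overall_alpha`, `alpha_drop`, and `flagged` are made up for illustration. With the real analysis you would start from read_rel$alpha.drop and read_rel$total$raw_alpha instead.

```r
library(dplyr)
library(tibble)

# Overall raw alpha, copied from the read_rel$total output above
overall_alpha <- 0.8251025

# Three rows copied from the alpha.drop output above; items are row names,
# as they are in the psych output
alpha_drop <- data.frame(
  raw_alpha = c(0.8195960, 0.8289707, 0.8265686),
  row.names = c("q36_read_mi", "q46_read_mi", "q53_read_torg")
)

# Keep items whose removal would raise alpha, largest gain first
flagged <- alpha_drop %>%
  rownames_to_column("item") %>%
  filter(raw_alpha > overall_alpha) %>%
  arrange(desc(raw_alpha))

flagged
```

In this excerpt the gains are in the third decimal place, so a flag like this is a prompt to review the item's content rather than a reason to delete it automatically.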
It can be very helpful to connect multiple data frames. We’ll do a
little practice now with left_join(), a
tidyverse function that is the most straightforward way of
joining data.
First, we’re going to bring some person ID data to the
read dataframe.
read <- bind_cols(select(data, ID:admin_date), read)
bind_cols() attaches columns side by side in a (new) dataframe.
But it doesn’t do it in a ‘smart’ way; it just slaps them together
(they must all have the same number of rows) and assumes every row
is in the right order.
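A small made-up example shows why the row-order assumption matters. The tibbles `ids` and `scores` are hypothetical: they describe the same three people but are sorted differently.

```r
library(dplyr)

# Hypothetical data: the same three people, stored in a different row order
ids <- tibble(ID = c("p1", "p2", "p3"))
scores <- tibble(ID = c("p3", "p1", "p2"), score = c(30, 10, 20))

# bind_cols() pairs rows purely by position, so p1 ends up next to p3's score
bound <- bind_cols(ids, rename(scores, score_ID = ID))

# Rows where the two ID columns disagree were paired with the wrong person
mismatched <- filter(bound, ID != score_ID)

bound
nrow(mismatched)
```

In the tutorial's own call this is not a problem, because both sets of columns come from the same dataframe in the same row order.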
Now we’re going to compute total scores for each person.
rtot <- read %>%
  group_by(ID) %>%
  pivot_longer(q36_read_mi:q70_read_det_an, names_to = "item", values_to = "score") %>%
  summarise(read_total = sum(score))
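To see what the reshape-then-summarise step does, here is a hedged sketch on a tiny invented dataset. The tibble `toy` and its item names are hypothetical, and the steps are split apart so the intermediate long format is visible.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: three test takers, three scored items
toy <- tibble(
  ID = c("p1", "p2", "p3"),
  q1_read_mi = c(1, 0, 1),
  q2_read_det = c(1, 1, 0),
  q3_read_voc = c(1, 0, 0)
)

# Long format: one row per person-item combination
toy_long <- toy %>%
  pivot_longer(q1_read_mi:q3_read_voc, names_to = "item", values_to = "score")

# Sum the item scores within each person
toy_total <- toy_long %>%
  group_by(ID) %>%
  summarise(read_total = sum(score))

# Cross-check against a row-wise sum on the original wide data
all(toy_total$read_total == rowSums(select(toy, -ID)))
```

The cross-check relies on the IDs already being in sorted order, because summarise() returns groups sorted by ID.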
Take a look at the new rtot dataframe.
view(rtot)
We can see an ID column and a read_total column. If we want to join
that back up to our overall reading test dataframe read AND
make sure that each total score is matched to the right person, we can
do so with left_join().
read <- left_join(read, rtot, by = "ID")
The key here is having an index column that allows you to match rows
across dataframes. If the index column has a different name in each
dataframe, that’s okay - you can just use by = c("X" = "Y"), where X is
the column name in the first dataframe and Y is the name in the second.
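As an illustration of matching on differently named keys, here is a minimal sketch. The tibbles `people` and `totals` and the column name `student_id` are hypothetical.

```r
library(dplyr)

# Hypothetical data: the key is called ID in one table and student_id in the other
people <- tibble(ID = c("p1", "p2", "p3"), level = c("B1", "A2", "B2"))
totals <- tibble(student_id = c("p3", "p1"), read_total = c(28, 17))

# Match people$ID to totals$student_id; every row of people is kept
combined <- left_join(people, totals, by = c("ID" = "student_id"))

combined
```

Because left_join() keeps every row of the first dataframe, p2 stays in the result with a missing read_total, which makes unmatched people easy to spot.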