Setup

#load packages
library(tidyverse)

#get data
download.file("https://raw.githubusercontent.com/gtlaflair/ltrc-2019/gh-pages/data/placement_1.csv",
              "placement_1.csv", mode = "wb")

#load data
data <- read_csv("placement_1.csv")

R Fundamentals: Subsetting/Selecting

We often work with large(-ish) datasets that have many columns and rows. We do not always want to or need to work with an entire dataset at once, and sometimes we will compute new variables, adjust values, or run some statistics on specific parts of the dataset. Selecting these small parts is referred to as subsetting and there are a few different ways to identify subsets of data in R.

#base R: pick columns 40 to 74 by position
read_sub <- data[40:74]

#tidyverse options: pick the reading items by name pattern, by name range, or by position
read_tidy_select <- data %>% select(contains("read"))
read_tidy_varnames <- data %>% select(q36_read_mi:q70_read_det_an)
read_tidy_position <- data %>% select(40:74)

#for practical purposes, we'll take something with a sensible name
read <- data %>% select(contains("read"))
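
The examples above all pick out columns. Rows can be subset in much the same way. Here is a minimal sketch on a made-up tibble; the object toy, its column names, and the condition q1_read == 1 are illustrative assumptions and not part of the placement data.

```r
library(tidyverse)

# Hypothetical toy data: three test takers, two reading items, one listening item
toy <- tibble(
  ID = c("a", "b", "c"),
  q1_read = c(1, 0, 1),
  q2_read = c(1, 1, 0),
  q3_list = c(0, 1, 1)
)

# base R: row conditions go before the comma, column names after it
toy_base <- toy[toy$q1_read == 1, c("ID", "q1_read", "q2_read")]

# tidyverse: filter() keeps rows, select() keeps columns
toy_tidy <- toy %>%
  filter(q1_read == 1) %>%
  select(ID, contains("read"))

# Both approaches keep the same people
identical(toy_base$ID, toy_tidy$ID)
```

Which style you use is mostly a matter of taste; the tidyverse version tends to be easier to read once several steps are chained together.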

Reliability with the psych package

Now we’ll turn to the psych package to run some other classical test theory (CTT)-based analyses. psych is a really versatile package with lots of useful functions.

#install.packages("psych") #remove the first # if you need to install the package
library(psych) #loads the package into the current session

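To calculate Cronbach’s alpha and get CTT item statistics, simply use the alpha() function from psych. (ggplot2, which is loaded with the tidyverse, also has a function called alpha(); because psych was loaded later, alpha() here refers to psych’s version, and you can write psych::alpha() to be explicit.) Note: the Help file for the alpha() function is super informative. raw_alpha is the standard Cronbach’s alpha calculation, based on the item covariances; std.alpha is based on the item correlations instead, and the two are nearly identical in most cases. You can just report raw_alpha, which is also the value in the middle of the 95% CI (also really neat that a CI is provided!).

You also get CTT item stats. Because the items are scored 0/1, the frequency of ‘1’ is equivalent to item facility (IF); it’s also available in the column labelled “mean”. Discrimination is r.drop: the correlation between an item and the total score computed without that item (for 0/1 items, a corrected point-biserial correlation).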

alpha(read)
## 
## Reliability analysis   
## Call: alpha(x = read)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.83      0.83     0.9      0.12 4.8 0.026  0.6 0.17     0.12
## 
##  lower alpha upper     95% confidence boundaries
## 0.77 0.83 0.88 
## 
##  Reliability if an item is dropped:
##                  raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## q36_read_mi           0.82      0.82     0.9      0.12 4.6    0.027 0.014  0.12
## q37_read_det          0.82      0.83     0.9      0.12 4.7    0.027 0.014  0.12
## q38_read_det          0.82      0.82     0.9      0.12 4.7    0.027 0.014  0.12
## q39_read_mi           0.82      0.83     0.9      0.12 4.7    0.027 0.014  0.12
## q40_read_voc          0.82      0.82     0.9      0.12 4.7    0.027 0.013  0.12
## q41_read_voc          0.82      0.83     0.9      0.12 4.8    0.026 0.013  0.13
## q42_read_mi           0.82      0.83     0.9      0.12 4.8    0.027 0.014  0.13
## q43_read_det          0.82      0.82     0.9      0.12 4.6    0.027 0.013  0.12
## q44_read_mi           0.82      0.82     0.9      0.12 4.7    0.027 0.013  0.12
## q45_read_det          0.82      0.82     0.9      0.12 4.6    0.027 0.014  0.12
## q46_read_mi           0.83      0.83     0.9      0.13 5.0    0.026 0.013  0.13
## q47_read_torg         0.82      0.83     0.9      0.12 4.8    0.026 0.013  0.12
## q48_read_det          0.82      0.83     0.9      0.12 4.8    0.027 0.014  0.12
## q49_read_det          0.82      0.83     0.9      0.12 4.8    0.027 0.014  0.12
## q50_read_voc          0.82      0.82     0.9      0.12 4.5    0.028 0.013  0.12
## q51_read_torg         0.82      0.82     0.9      0.12 4.6    0.028 0.014  0.12
## q52_read_inf          0.82      0.82     0.9      0.12 4.6    0.027 0.014  0.12
## q53_read_torg         0.83      0.83     0.9      0.13 4.9    0.026 0.013  0.13
## q54_read_det          0.82      0.82     0.9      0.12 4.6    0.028 0.013  0.12
## q55_read_torg         0.82      0.82     0.9      0.12 4.6    0.028 0.013  0.12
## q56_read_purp         0.81      0.82     0.9      0.12 4.5    0.028 0.013  0.12
## q57_read_purp         0.83      0.83     0.9      0.13 4.9    0.026 0.014  0.13
## q58_read_mi           0.82      0.83     0.9      0.12 4.7    0.027 0.014  0.12
## q59_read_inf          0.81      0.82     0.9      0.12 4.5    0.028 0.013  0.12
## q60_read_det_an       0.82      0.82     0.9      0.12 4.6    0.027 0.014  0.12
## q61_read_purp_an      0.82      0.83     0.9      0.12 4.8    0.027 0.014  0.13
## q62_read_voc_an       0.83      0.83     0.9      0.13 4.9    0.026 0.013  0.12
## q63_read_inf_an       0.82      0.82     0.9      0.12 4.6    0.027 0.014  0.12
## q64_read_mi_an        0.82      0.83     0.9      0.12 4.8    0.027 0.014  0.12
## q65_read_torg_an      0.82      0.83     0.9      0.12 4.8    0.026 0.014  0.12
## q66_read_torg_an      0.81      0.82     0.9      0.12 4.5    0.028 0.014  0.12
## q67_read_purp_an      0.82      0.82     0.9      0.12 4.6    0.027 0.014  0.12
## q68_read_mi_an        0.82      0.83     0.9      0.12 4.8    0.027 0.014  0.12
## q69_read_det_an       0.82      0.83     0.9      0.12 4.7    0.027 0.014  0.12
## q70_read_det_an       0.82      0.82     0.9      0.12 4.7    0.027 0.014  0.12
## 
##  Item statistics 
##                   n raw.r std.r r.cor r.drop mean   sd
## q36_read_mi      88  0.43  0.46  0.44  0.386 0.89 0.32
## q37_read_det     88  0.34  0.35  0.33  0.283 0.85 0.36
## q38_read_det     88  0.39  0.39  0.37  0.326 0.81 0.40
## q39_read_mi      88  0.33  0.35  0.33  0.264 0.78 0.41
## q40_read_voc     88  0.41  0.43  0.42  0.353 0.82 0.39
## q41_read_voc     88  0.19  0.25  0.24  0.141 0.92 0.27
## q42_read_mi      88  0.28  0.30  0.28  0.233 0.89 0.32
## q43_read_det     88  0.43  0.45  0.44  0.371 0.76 0.43
## q44_read_mi      88  0.35  0.41  0.41  0.319 0.95 0.21
## q45_read_det     88  0.42  0.43  0.42  0.369 0.85 0.36
## q46_read_mi      88  0.14  0.14  0.11  0.061 0.70 0.46
## q47_read_torg    88  0.30  0.28  0.26  0.216 0.40 0.49
## q48_read_det     88  0.32  0.33  0.30  0.254 0.81 0.40
## q49_read_det     88  0.33  0.34  0.31  0.260 0.75 0.44
## q50_read_voc     88  0.53  0.54  0.54  0.478 0.81 0.40
## q51_read_torg    88  0.49  0.47  0.45  0.418 0.59 0.49
## q52_read_inf     88  0.45  0.44  0.42  0.373 0.44 0.50
## q53_read_torg    88  0.24  0.22  0.19  0.154 0.45 0.50
## q54_read_det     88  0.52  0.50  0.50  0.458 0.57 0.50
## q55_read_torg    88  0.50  0.48  0.48  0.435 0.56 0.50
## q56_read_purp    88  0.56  0.54  0.53  0.501 0.51 0.50
## q57_read_purp    88  0.23  0.23  0.19  0.145 0.43 0.50
## q58_read_mi      88  0.35  0.34  0.32  0.278 0.61 0.49
## q59_read_inf     88  0.62  0.60  0.60  0.558 0.55 0.50
## q60_read_det_an  88  0.44  0.47  0.46  0.381 0.83 0.38
## q61_read_purp_an 88  0.28  0.28  0.25  0.211 0.27 0.45
## q62_read_voc_an  88  0.23  0.21  0.18  0.150 0.40 0.49
## q63_read_inf_an  88  0.47  0.47  0.45  0.404 0.36 0.48
## q64_read_mi_an   88  0.31  0.29  0.27  0.227 0.45 0.50
## q65_read_torg_an 88  0.29  0.28  0.25  0.213 0.38 0.49
## q66_read_torg_an 88  0.56  0.54  0.54  0.499 0.30 0.46
## q67_read_purp_an 88  0.47  0.47  0.44  0.401 0.34 0.48
## q68_read_mi_an   88  0.35  0.33  0.31  0.276 0.53 0.50
## q69_read_det_an  88  0.37  0.37  0.34  0.303 0.30 0.46
## q70_read_det_an  88  0.44  0.43  0.40  0.371 0.27 0.45
## 
## Non missing response frequency for each item
##                     0    1 miss
## q36_read_mi      0.11 0.89    0
## q37_read_det     0.15 0.85    0
## q38_read_det     0.19 0.81    0
## q39_read_mi      0.22 0.78    0
## q40_read_voc     0.18 0.82    0
## q41_read_voc     0.08 0.92    0
## q42_read_mi      0.11 0.89    0
## q43_read_det     0.24 0.76    0
## q44_read_mi      0.05 0.95    0
## q45_read_det     0.15 0.85    0
## q46_read_mi      0.30 0.70    0
## q47_read_torg    0.60 0.40    0
## q48_read_det     0.19 0.81    0
## q49_read_det     0.25 0.75    0
## q50_read_voc     0.19 0.81    0
## q51_read_torg    0.41 0.59    0
## q52_read_inf     0.56 0.44    0
## q53_read_torg    0.55 0.45    0
## q54_read_det     0.43 0.57    0
## q55_read_torg    0.44 0.56    0
## q56_read_purp    0.49 0.51    0
## q57_read_purp    0.57 0.43    0
## q58_read_mi      0.39 0.61    0
## q59_read_inf     0.45 0.55    0
## q60_read_det_an  0.17 0.83    0
## q61_read_purp_an 0.73 0.27    0
## q62_read_voc_an  0.60 0.40    0
## q63_read_inf_an  0.64 0.36    0
## q64_read_mi_an   0.55 0.45    0
## q65_read_torg_an 0.62 0.38    0
## q66_read_torg_an 0.70 0.30    0
## q67_read_purp_an 0.66 0.34    0
## q68_read_mi_an   0.47 0.53    0
## q69_read_det_an  0.70 0.30    0
## q70_read_det_an  0.73 0.27    0
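
To make the item statistics less of a black box, here is a rough sketch of how item facility and a corrected item-total correlation can be computed by hand in base R. The resp data frame is made up for illustration (6 test takers, 4 items), and the calculation is meant to mirror what the mean and r.drop columns describe, not to reproduce psych's internal code.

```r
# Hypothetical toy responses: 6 test takers x 4 dichotomous items (1 = correct)
resp <- data.frame(
  i1 = c(1, 1, 1, 0, 1, 0),
  i2 = c(1, 0, 1, 0, 1, 0),
  i3 = c(1, 1, 0, 0, 1, 1),
  i4 = c(0, 1, 1, 0, 1, 0)
)

# Item facility (IF): proportion of correct answers, i.e. the column mean
item_facility <- colMeans(resp)

# Corrected item-total correlation: correlate each item with the total
# score computed from the *other* items only
total <- rowSums(resp)
r_drop <- sapply(names(resp), function(item) {
  cor(resp[[item]], total - resp[[item]])
})

round(item_facility, 2)
round(r_drop, 2)
```

Leaving the item out of the total matters: otherwise the item is partly correlated with itself, which inflates the discrimination value, especially on short tests.
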
#to get some more detail, save the output as an object, which allows you to call specific info more easily
read_rel <- alpha(read)

#more reliability indices
read_rel$total
##  raw_alpha std.alpha  G6(smc) average_r      S/N       ase      mean        sd
##  0.8251025 0.8288168 0.904726 0.1215233 4.841696 0.0263587 0.6038961 0.1677314
##   median_r
##  0.1217997
#item stats
read_rel$item.stats
##                   n     raw.r     std.r     r.cor     r.drop      mean
## q36_read_mi      88 0.4316186 0.4608162 0.4434800 0.38582750 0.8863636
## q37_read_det     88 0.3389172 0.3498596 0.3270724 0.28349677 0.8522727
## q38_read_det     88 0.3861963 0.3866761 0.3687368 0.32643652 0.8068182
## q39_read_mi      88 0.3292686 0.3504764 0.3268457 0.26431620 0.7840909
## q40_read_voc     88 0.4097620 0.4293561 0.4205116 0.35257464 0.8181818
## q41_read_voc     88 0.1867330 0.2499418 0.2361900 0.14145335 0.9204545
## q42_read_mi      88 0.2843999 0.3026054 0.2795421 0.23331533 0.8863636
## q43_read_det     88 0.4332633 0.4493525 0.4394412 0.37115324 0.7613636
## q44_read_mi      88 0.3509029 0.4108112 0.4081349 0.31903242 0.9545455
## q45_read_det     88 0.4212150 0.4348710 0.4230232 0.36930739 0.8522727
## q46_read_mi      88 0.1388695 0.1426591 0.1050185 0.06118605 0.7045455
## q47_read_torg    88 0.2952506 0.2832201 0.2622658 0.21604208 0.3977273
## q48_read_det     88 0.3171607 0.3264651 0.2965789 0.25444975 0.8068182
## q49_read_det     88 0.3282001 0.3358172 0.3127481 0.25968779 0.7500000
## q50_read_voc     88 0.5291988 0.5392619 0.5420148 0.47785205 0.8068182
## q51_read_torg    88 0.4866652 0.4709641 0.4478221 0.41840661 0.5909091
## q52_read_inf     88 0.4455125 0.4398160 0.4187310 0.37344314 0.4431818
## q53_read_torg    88 0.2367174 0.2186012 0.1874001 0.15398440 0.4545455
## q54_read_det     88 0.5234405 0.5024657 0.4998781 0.45766072 0.5681818
## q55_read_torg    88 0.5028715 0.4824284 0.4785254 0.43516322 0.5568182
## q56_read_purp    88 0.5641805 0.5356891 0.5328640 0.50145660 0.5113636
## q57_read_purp    88 0.2272411 0.2284363 0.1880328 0.14466987 0.4318182
## q58_read_mi      88 0.3543845 0.3411942 0.3156675 0.27832447 0.6136364
## q59_read_inf     88 0.6156073 0.5982617 0.5991671 0.55829191 0.5454545
## q60_read_det_an  88 0.4351173 0.4654515 0.4589311 0.38071006 0.8295455
## q61_read_purp_an 88 0.2829377 0.2800009 0.2518572 0.21061091 0.2727273
## q62_read_voc_an  88 0.2316081 0.2146005 0.1793426 0.15016854 0.3977273
## q63_read_inf_an  88 0.4720241 0.4665676 0.4486219 0.40422910 0.3636364
## q64_read_mi_an   88 0.3070928 0.2921209 0.2694716 0.22696822 0.4545455
## q65_read_torg_an 88 0.2915405 0.2791659 0.2469691 0.21306819 0.3750000
## q66_read_torg_an 88 0.5566419 0.5406792 0.5360034 0.49909542 0.2954545
## q67_read_purp_an 88 0.4678245 0.4656779 0.4449997 0.40077190 0.3409091
## q68_read_mi_an   88 0.3535436 0.3306766 0.3080021 0.27550237 0.5340909
## q69_read_det_an  88 0.3731634 0.3675500 0.3406556 0.30301957 0.2954545
## q70_read_det_an  88 0.4359308 0.4294318 0.4030160 0.37107121 0.2727273
##                         sd
## q36_read_mi      0.3191878
## q37_read_det     0.3568629
## q38_read_det     0.3970568
## q39_read_mi      0.4138094
## q40_read_voc     0.3879049
## q41_read_voc     0.2721389
## q42_read_mi      0.3191878
## q43_read_det     0.4286927
## q44_read_mi      0.2094926
## q45_read_det     0.3568629
## q46_read_mi      0.4588614
## q47_read_torg    0.4922333
## q48_read_det     0.3970568
## q49_read_det     0.4354942
## q50_read_voc     0.3970568
## q51_read_torg    0.4944837
## q52_read_inf     0.4996080
## q53_read_torg    0.5007831
## q54_read_det     0.4981680
## q55_read_torg    0.4996080
## q56_read_purp    0.5027355
## q57_read_purp    0.4981680
## q58_read_mi      0.4897059
## q59_read_inf     0.5007831
## q60_read_det_an  0.3781866
## q61_read_purp_an 0.4479140
## q62_read_voc_an  0.4922333
## q63_read_inf_an  0.4838024
## q64_read_mi_an   0.5007831
## q65_read_torg_an 0.4868973
## q66_read_torg_an 0.4588614
## q67_read_purp_an 0.4767313
## q68_read_mi_an   0.5016951
## q69_read_det_an  0.4588614
## q70_read_det_an  0.4479140
#looking to improve reliability? This shows what happens to reliability estimates if you delete an item
read_rel$alpha.drop
##                  raw_alpha std.alpha   G6(smc) average_r      S/N   alpha se
## q36_read_mi      0.8195960 0.8221129 0.8995782 0.1196623 4.621542 0.02718498
## q37_read_det     0.8218034 0.8257288 0.9009747 0.1223130 4.738183 0.02684768
## q38_read_det     0.8205309 0.8245430 0.9002509 0.1214335 4.699402 0.02704966
## q39_read_mi      0.8223148 0.8257090 0.9009865 0.1222982 4.737532 0.02676470
## q40_read_voc     0.8198372 0.8231510 0.8984617 0.1204139 4.654542 0.02714501
## q41_read_voc     0.8249165 0.8288791 0.8995323 0.1247000 4.843823 0.02643906
## q42_read_mi      0.8230662 0.8272309 0.9014488 0.1234419 4.788072 0.02669341
## q43_read_det     0.8190712 0.8224924 0.8983605 0.1199362 4.633561 0.02725242
## q44_read_mi      0.8221817 0.8237582 0.8968872 0.1208569 4.674022 0.02682555
## q45_read_det     0.8196132 0.8229698 0.8988036 0.1202821 4.648754 0.02718412
## q46_read_mi      0.8289707 0.8321555 0.9043327 0.1272629 4.957894 0.02582882
## q47_read_torg    0.8243093 0.8278407 0.9004743 0.1239050 4.808575 0.02646833
## q48_read_det     0.8225679 0.8264752 0.9025114 0.1228719 4.762866 0.02673179
## q49_read_det     0.8225121 0.8261775 0.9013187 0.1226484 4.752995 0.02674804
## q50_read_voc     0.8161816 0.8194783 0.8952077 0.1177883 4.539501 0.02770588
## q51_read_torg    0.8171648 0.8217758 0.9006554 0.1194199 4.610910 0.02758678
## q52_read_inf     0.8187706 0.8228070 0.9004544 0.1201640 4.643564 0.02730673
## q53_read_torg    0.8265686 0.8298474 0.9031345 0.1254487 4.877078 0.02616961
## q54_read_det     0.8157182 0.8207224 0.8968436 0.1186673 4.577940 0.02783081
## q55_read_torg    0.8165299 0.8213936 0.8973148 0.1191460 4.598905 0.02768871
## q56_read_purp    0.8140635 0.8195998 0.8970345 0.1178736 4.543230 0.02810750
## q57_read_purp    0.8268547 0.8295446 0.9044473 0.1252137 4.866636 0.02608396
## q58_read_mi      0.8221265 0.8260059 0.9011280 0.1225200 4.747322 0.02679862
## q59_read_inf     0.8119721 0.8174525 0.8958789 0.1163788 4.478026 0.02844004
## q60_read_det_an  0.8191304 0.8219590 0.8974428 0.1195516 4.616685 0.02724206
## q61_read_purp_an 0.8241126 0.8279417 0.9021909 0.1239819 4.811982 0.02650888
## q62_read_voc_an  0.8265800 0.8299703 0.9039309 0.1255442 4.881327 0.02614580
## q63_read_inf_an  0.8177190 0.8219220 0.8992088 0.1195249 4.615515 0.02749433
## q64_read_mi_an   0.8240069 0.8275612 0.9008734 0.1236923 4.799158 0.02651689
## q65_read_torg_an 0.8243614 0.8279678 0.9032330 0.1240018 4.812866 0.02648075
## q66_read_torg_an 0.8146645 0.8194301 0.8967418 0.1177544 4.538022 0.02795286
## q67_read_purp_an 0.8178713 0.8219515 0.9001292 0.1195461 4.616448 0.02745897
## q68_read_mi_an   0.8222910 0.8263412 0.9013801 0.1227713 4.758420 0.02680060
## q69_read_det_an  0.8211904 0.8251607 0.9015850 0.1218904 4.719539 0.02692666
## q70_read_det_an  0.8189878 0.8231485 0.9008826 0.1204120 4.654463 0.02728542
##                       var.r     med.r
## q36_read_mi      0.01378056 0.1190962
## q37_read_det     0.01385751 0.1233532
## q38_read_det     0.01361015 0.1197474
## q39_read_mi      0.01368053 0.1233532
## q40_read_voc     0.01335805 0.1178511
## q41_read_voc     0.01293957 0.1259882
## q42_read_mi      0.01356305 0.1259882
## q43_read_det     0.01334949 0.1190962
## q44_read_mi      0.01322197 0.1217997
## q45_read_det     0.01351409 0.1207578
## q46_read_mi      0.01304529 0.1276097
## q47_read_torg    0.01343519 0.1242063
## q48_read_det     0.01373068 0.1240347
## q49_read_det     0.01358199 0.1217997
## q50_read_voc     0.01325173 0.1173789
## q51_read_torg    0.01371220 0.1205495
## q52_read_inf     0.01377635 0.1181249
## q53_read_torg    0.01336051 0.1276097
## q54_read_det     0.01328524 0.1181249
## q55_read_torg    0.01324041 0.1207578
## q56_read_purp    0.01319967 0.1173789
## q57_read_purp    0.01350324 0.1276097
## q58_read_mi      0.01372801 0.1242063
## q59_read_inf     0.01313777 0.1169337
## q60_read_det_an  0.01352360 0.1190962
## q61_read_purp_an 0.01360150 0.1256562
## q62_read_voc_an  0.01335863 0.1242063
## q63_read_inf_an  0.01361858 0.1169337
## q64_read_mi_an   0.01351042 0.1242063
## q65_read_torg_an 0.01355305 0.1242063
## q66_read_torg_an 0.01360184 0.1163279
## q67_read_purp_an 0.01374604 0.1178511
## q68_read_mi_an   0.01350866 0.1242063
## q69_read_det_an  0.01393102 0.1217997
## q70_read_det_an  0.01395809 0.1197474
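
If you want to see where raw_alpha and the "if an item is dropped" values come from, the sketch below computes them by hand from the usual covariance-based formula. The resp data frame and the helper cronbach_alpha() are made up for this illustration; with complete 0/1 data the result should agree with psych's raw_alpha, but treat that as something to check rather than a guarantee.

```r
# Hypothetical toy responses: 6 test takers x 4 dichotomous items (1 = correct)
resp <- data.frame(
  i1 = c(1, 1, 1, 0, 1, 0),
  i2 = c(1, 0, 1, 0, 1, 0),
  i3 = c(1, 1, 0, 0, 1, 1),
  i4 = c(0, 1, 1, 0, 1, 0)
)

# Cronbach's alpha:
# alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
cronbach_alpha <- function(x) {
  k <- ncol(x)
  item_var <- sapply(x, var)
  total_var <- var(rowSums(x))
  k / (k - 1) * (1 - sum(item_var) / total_var)
}

alpha_all <- cronbach_alpha(resp)

# Alpha if an item is deleted: recompute alpha leaving out one column at a time
alpha_drop <- sapply(seq_along(resp), function(j) {
  cronbach_alpha(resp[, -j, drop = FALSE])
})
names(alpha_drop) <- names(resp)

alpha_all
alpha_drop
```

An item whose alpha_drop value is higher than alpha_all is a candidate for review, since the scale is estimated to be more internally consistent without it.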

Joining

It can be very helpful to connect multiple data frames. We’ll do a little practice now with left_join(), a tidyverse function (from dplyr) that is one of the most straightforward ways of joining data.

First, we’re going to bring some person ID data to the read dataframe.

read <- bind_cols(select(data, ID:admin_date), read)

bind_cols() attaches columns side by side to make a (new) dataframe. But it doesn’t do it in a ‘smart’ way: it requires every input to have the same number of rows, and it simply assumes the rows are already in the same order.
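
To see why that assumption matters, here is a small sketch with made-up data in which the same three people are stored in a different row order in each dataframe. The objects ids, scores, and glued are hypothetical and only serve to illustrate the point.

```r
library(tidyverse)

# Hypothetical toy data: the same three people, stored in different row orders
ids <- tibble(ID = c("a", "b", "c"))
scores <- tibble(ID = c("c", "a", "b"), total = c(30, 10, 20))

# bind_cols() pairs rows purely by position, so person "a" ends up next to
# the total that actually belongs to person "c"
glued <- bind_cols(ids, select(scores, total))
glued
```

In the tutorial data this is not a problem, because read was cut from data without reordering any rows, but it is worth checking whenever two dataframes come from different sources.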

Now we’re going to compute total scores for each person.

#reshape to long format (one row per person per item), then sum the scores within each person
rtot <- read %>%
  pivot_longer(q36_read_mi:q70_read_det_an, names_to = "item", values_to = "score") %>%
  group_by(ID) %>%
  summarise(read_total = sum(score))
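
If the reshaping step feels abstract, the same pattern is sketched below on a tiny made-up tibble, with a cross-check against a plain row-wise sum. The names toy, toy_long, and toy_tot are assumptions for this illustration only.

```r
library(tidyverse)

# Hypothetical toy data: three test takers, three reading items
toy <- tibble(
  ID = c("a", "b", "c"),
  q1_read = c(1, 0, 1),
  q2_read = c(1, 1, 0),
  q3_read = c(1, 0, 0)
)

# Long format: one row per person-item combination
toy_long <- toy %>%
  pivot_longer(q1_read:q3_read, names_to = "item", values_to = "score")

# Sum the item scores within each person
toy_tot <- toy_long %>%
  group_by(ID) %>%
  summarise(read_total = sum(score))

# Cross-check against a row-wise sum over the item columns
# (this comparison relies on toy already being sorted by ID)
all(toy_tot$read_total == rowSums(select(toy, q1_read:q3_read)))
```

The long format is more typing than rowSums() for a simple total, but it pays off as soon as you want totals by item type or any other grouping.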

Take a look at the new rtot dataframe.

view(rtot)

We can see an ID column and a read_total column. If we want to join that back up to our overall reading test dataframe read AND make sure that each total score is matched to the right person, we can do so with left_join().

read <- left_join(read, rtot, by = "ID")

The key here is having an index column that allows you to match rows across dataframes. If the index column is named something different in different dataframes, that’s okay: you can use by = c("X" = "Y"), where X is the column name in the first dataframe and Y is the column name in the second.
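
Here is a minimal sketch of that last case, using made-up dataframes in which the index column is called ID in one and student in the other. All object and column names below are hypothetical.

```r
library(tidyverse)

# Hypothetical toy data: the key is called ID in one dataframe, student in the other
people <- tibble(ID = c("a", "b", "c"), level = c("B1", "A2", "B2"))
totals <- tibble(student = c("c", "a"), read_total = c(30, 10))

# ID in `people` is matched to student in `totals`; rows are paired by key
# value, not by position
joined <- left_join(people, totals, by = c("ID" = "student"))
joined
```

Because this is a left join, every row of people is kept. Person "b" has no match in totals, so their read_total is NA, which is a handy way to spot missing scores.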