To: Xin Li
From: Adam Chandler (chair), Susie Cobb, Maureen Morris, Wendy Wilcox
Date: December 19, 2017
Subject: Tattle-tape Task Force Final Report
Even though members of our task force are not confident that tattle-taping actually prevents theft, we must recommend continuing it because staff, primarily selectors, clearly oppose changing the policy at this time. Feedback from access services staff was split: smaller units felt that tattle-taping was effective at preventing theft, while larger units felt that it was ineffective in protecting open stack collections. These responses make sense given the units' different approaches to responding to gate alarms. Before the CUL tattle-taping policy is changed, we recommend these steps:
- Replacement fees should be recycled back into supporting replacement of missing and lost materials.
- Centralize and streamline the decision-making process and funding for replacing missing and lost materials.
- Consider conducting an inventory of the library’s open stacks collections using the methodology (and tools, perhaps) employed in the EAST validation study to use as a baseline to inform present and future decision making on this issue.
“In order to evaluate the statistical likelihood that a retained volume exists on the shelves of any of the institutions, the EAST incorporated sample-based validation studies. The specific goals of this study were to establish and document the degree of confidence, and the possibility of error, in any EAST committed title being available for circulation. Results of the validation sample studies help predict the likelihood that titles selected for retention actually exist and can be located in the collection of a Retention Partner, and are in useable condition.” [https://eastlibraries.org/validation]
EAST Results
Overall, EAST can report a 97% availability rate.
The aggregated results from both cohorts (312,000 holdings across the 52 libraries) showed:
* 97% of monographs in the sample were accounted for: mean: 97%, median: 97.1%, high of 99.8% and low of 91%. (Note: “accounted for” includes those items previously determined to be in circulation based on an automated check of the libraries’ ILS.)
* 2.3% of titles were in circulation at the time of the study
* 90% of the titles were deemed to be in average or excellent condition with 10% marked as in poor condition. Not surprisingly, older titles were in poorer condition.
A few notable observations include:
* Items published pre-1900 were in significantly poorer condition; some 45% of these items ranked “poor” on the condition scale
* An item being in poor condition was also somewhat correlated to its subject area
* The most significant factor for an item being missing was the holding library.
The study was conducted between April and July. We sampled 6,006 monographs across campus. Wendy Wilcox led the team that collected the data in the stacks. Some notes: Annex was excluded because the stacks there are closed; Fine Arts was excluded because it is in the middle of a building transition.
AF (accounted for) = checkedout + present
Cornell accounted for rate: 96.4%
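library(dplyr)  # glimpse(), summarise(), pull(), %>%
library(infer)  # specify(), generate(), calculate()
library(broom)  # tidy()
library(knitr)  # kable()
glimpse(df)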
## Observations: 5,975
## Variables: 34
## $ present_or_not <fct> Present, Present, Present, Present, Present...
## $ bib_rec_nbr <chr> "1968678", "2249095", "5689943", "8618953",...
## $ mfhd_id <chr> "2389846", "2702959", "6199187", "8994997",...
## $ item_control_nbr <chr> "3723592", "4103508", "7620171", "9494855",...
## $ barcode <chr> "31924062968908", "31924072130184", "319241...
## $ begin_pub_date <dbl> 1960, 1993, 1971, 2013, 2010, 1971, 1994, 1...
## $ location_code <fct> afr, afr, afr, afr, afr, afr, afr, afr, afr...
## $ firstletter <fct> D, D, D, D, D, D, D, E, E, E, E, E, E, E, E...
## $ class <chr> "DT", "DT", "DT", "DT", "DT", "DT", "DT", "...
## $ classnumber <dbl> 32.000, 328.000, 356.000, 433.285, 433.545,...
## $ normalized_call_no <chr> "DT 32 R 61", "DT 328 ...
## $ display_call_no <chr> "DT32 .R61", "DT328.M53 .H3613x 1993", "DT3...
## $ call_nbr_norm_item <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ enumeration <chr> NA, NA, NA, NA, NA, NA, NA, "c.2", NA, NA, ...
## $ length_cn <dbl> 9, 22, 14, 19, 22, 20, 18, 11, 12, 18, 18, ...
## $ pagination <chr> "312 p. 22 cm.", "xv, 199 p. : ill. ; 24 cm...
## $ title <chr> "Death of Africa. By Peter Ritner.", "Victi...
## $ recorded_uses_item <dbl> 1, 0, 2, 0, 0, 10, 2, 0, 56, 1, 2, 4, 3, 2,...
## $ worldcat_oclc_nbr <chr> "412793", "59941146", "148569", "869824175"...
## $ catalog_url <chr> "https://newcatalog.library.cornell.edu/cat...
## $ us_holdings <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ row_number <dbl> 1585081, 735421, 2715241, 850681, 84661, 55...
## $ initials <chr> "mah94", "mah94", "mah94", "mah94", "mah94"...
## $ condition <fct> Acceptable, Excellent, Acceptable, Excellen...
## $ barcode_validation <chr> "yes", "yes", "yes", "yes", "yes", "no", "y...
## $ timestamp <dttm> 2018-07-06 13:56:22, 2018-07-06 13:56:22, ...
## $ item_status_desc <chr> "Not Charged", "Not Charged", "Not Charged"...
## $ is_missing <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ has_circulated <dbl> 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1...
## $ is_oversize <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ age <dbl> 58, 25, 47, 5, 8, 47, 24, 53, 28, 39, 36, 3...
## $ age_group <fct> 11plus, 11plus, 11plus, 0-10, 0-10, 11plus,...
## $ callnum <chr> "dt32 .r61", "dt328.m53 .h3613x 1993", "dt3...
## $ num_cn_chars <int> 3, 5, 3, 3, 4, 4, 4, 3, 2, 2, 3, 3, 3, 2, 3...
p_hat <- df %>%
  summarise(stat = mean(is_missing == "1")) %>%
  pull()
p_hat
## [1] 0.03531381
replimit <- 1000
boot <- df %>%
  specify(response = is_missing, success = "1") %>%
  generate(reps = replimit, type = "bootstrap") %>%
  calculate(stat = "prop")
boot
## # A tibble: 1,000 x 2
## replicate stat
## <int> <dbl>
## 1 1 0.0345
## 2 2 0.0341
## 3 3 0.0330
## 4 4 0.0377
## 5 5 0.0393
## 6 6 0.0380
## 7 7 0.0341
## 8 8 0.0378
## 9 9 0.0346
## 10 10 0.0333
## # ... with 990 more rows
se <- boot %>%
  summarize(sd(stat)) %>%
  pull()
se
## [1] 0.002373401
“The standard error is the standard deviation of the sampling distribution of the sample mean.” [Geoff Cumming, Understanding the New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis, 2012]
CUL mean and 1,000-replication bootstrap confidence interval: M = 0.035, 95% CI [0.031, 0.04].
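The bootstrap steps above can also be sketched in base R, without the infer pipeline. This is a minimal illustration rather than the report's actual code. The 0/1 vector is rebuilt from the reported figures (5,975 observations, of which 211 unaccounted for reproduces p_hat = 0.0353). The seed and the choice of a percentile interval are assumptions, since the report does not say how its interval was formed.

```r
# Rebuild the is_missing indicator from the reported counts (assumption:
# 211 of 5,975 sampled items unaccounted for, which matches p_hat above).
set.seed(2018)  # arbitrary seed, used only to make this sketch reproducible
is_missing <- rep(c(1, 0), times = c(211, 5975 - 211))

p_hat <- mean(is_missing)

# Resample the items with replacement 1,000 times and record each proportion.
boot_stats <- replicate(1000, mean(sample(is_missing, replace = TRUE)))

# Bootstrap standard error: standard deviation of the resampled proportions.
se <- sd(boot_stats)

# Percentile 95% interval (one common choice among several).
ci <- quantile(boot_stats, probs = c(0.025, 0.975))

print(round(c(p_hat = p_hat, se = se, ci), 4))
```

Because the resampling is random, the standard error and interval will differ slightly from run to run and from the figures reported above.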
nrow(population_to_draw_from)
## [1] 3079136
nrow(population_to_draw_from) * .0357 = 109,925
Therefore, our best estimate of the total number of unaccounted-for items across these CUL units is 109,925 +/- 524.
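The multiplication above, and the same scaling applied to the bounds of the reported 95% interval, can be written out in R. This is only a sketch: it assumes the sample proportion and its interval carry over directly to the full population, and it is not a reconstruction of how the +/- figure quoted above was derived.

```r
n_population <- 3079136                          # nrow(population_to_draw_from), as reported
rate_used    <- 0.0357                           # rate used in the multiplication above
ci_rate      <- c(lower = 0.031, upper = 0.040)  # reported 95% CI for the proportion

# Scale the proportion and its interval bounds up to the population.
estimate_items <- n_population * rate_used
ci_items       <- n_population * ci_rate

print(round(estimate_items))
print(round(ci_items))
```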
| location_code | total | ave_age | ave_num_uses | percent_excellent | percent_acceptable | percent_poor | percent_na |
|---|---|---|---|---|---|---|---|
| olin | 3216 | 33 | 2.25 | 11 | 81 | 1 | 7 |
| asia | 1282 | 22 | 1.15 | 62 | 31 | 1 | 6 |
| law | 450 | 49 | 1.48 | 56 | 34 | 5 | 6 |
| mann | 280 | 27 | 5.87 | 42 | 41 | 1 | 16 |
| uris | 270 | 35 | 4.32 | 15 | 75 | NA | 10 |
| hlm | 210 | 38 | 5.90 | 29 | 54 | 6 | 11 |
| math | 116 | 37 | 6.69 | 33 | 52 | 6 | 9 |
| mus | 102 | 34 | 3.91 | 47 | 47 | 1 | 5 |
| afr | 49 | 31 | 3.10 | 47 | 41 | 6 | 6 |
mod1 <- glm(is_missing ~ location_code + firstletter + recorded_uses_item +
              is_oversize + age + length_cn + num_cn_chars,
            data = df2, family = binomial)
tidy_mod1 <- tidy(mod1) %>%
  arrange(p.value)
kable(tidy_mod1)
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| recorded_uses_item | 0.0182676 | 0.0055599 | 3.2855707 | 0.0010178 |
| location_codemann | 1.0551836 | 0.3601727 | 2.9296600 | 0.0033933 |
| location_codehlm | 0.9843067 | 0.3420155 | 2.8779593 | 0.0040026 |
| location_codelaw | 1.0918400 | 0.5047513 | 2.1631245 | 0.0305316 |
| location_codeasia | 0.3533658 | 0.1971118 | 1.7927177 | 0.0730181 |
| location_codemath | 1.1232229 | 0.6563570 | 1.7112987 | 0.0870260 |
| location_codemus | -2.0436455 | 1.4227482 | -1.4364070 | 0.1508866 |
| length_cn | 0.0202631 | 0.0184296 | 1.0994828 | 0.2715575 |
| num_cn_chars | 0.0583434 | 0.0703174 | 0.8297150 | 0.4066999 |
| age | 0.0008798 | 0.0011542 | 0.7622741 | 0.4458964 |
| location_codeuris | 0.3319163 | 0.4519798 | 0.7343609 | 0.4627288 |
| is_oversize | 0.2037515 | 0.2877821 | 0.7080063 | 0.4789413 |
| location_codeafr | -0.4105901 | 1.0265235 | -0.3999812 | 0.6891704 |
| (Intercept) | -18.4297290 | 620.8715397 | -0.0296836 | 0.9763194 |
| firstletterM | 15.2509858 | 620.8722880 | 0.0245638 | 0.9804029 |
| firstletterL | 15.0742805 | 620.8715578 | 0.0242792 | 0.9806299 |
| firstletterE | 14.9261936 | 620.8715952 | 0.0240407 | 0.9808201 |
| firstletterZ | 14.8995164 | 620.8716369 | 0.0239977 | 0.9808544 |
| firstletterR | 14.7993926 | 620.8716143 | 0.0238365 | 0.9809830 |
| firstletterN | 14.7158794 | 620.8715956 | 0.0237020 | 0.9810903 |
| firstletterT | 14.5164988 | 620.8715821 | 0.0233808 | 0.9813465 |
| firstletterJ | 14.2335562 | 620.8715645 | 0.0229251 | 0.9817100 |
| firstletterC | 14.2139690 | 620.8718576 | 0.0228936 | 0.9817352 |
| firstletterF | 14.1767694 | 620.8716603 | 0.0228337 | 0.9817830 |
| firstletterP | 14.1664464 | 620.8714571 | 0.0228170 | 0.9817962 |
| firstletterH | 14.1176910 | 620.8714729 | 0.0227385 | 0.9818589 |
| firstletterG | 13.9564668 | 620.8716530 | 0.0224788 | 0.9820660 |
| firstletterD | 13.8165996 | 620.8714845 | 0.0222536 | 0.9822457 |
| firstletterB | 13.7878154 | 620.8715151 | 0.0222072 | 0.9822827 |
| firstletterQ | 13.7203047 | 620.8716235 | 0.0220985 | 0.9823694 |
| firstletterK | 13.5972138 | 620.8716554 | 0.0219002 | 0.9825276 |
| firstletterS | 12.8060778 | 620.8723236 | 0.0206259 | 0.9835440 |
| firstletterV | -0.1306172 | 1453.9564338 | -0.0000898 | 0.9999283 |
| firstletterU | -0.0602283 | 918.6032630 | -0.0000656 | 0.9999477 |
##
## Call:
## glm(formula = is_missing ~ location_code + length_cn + recorded_uses_item,
## family = binomial, data = df2)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.1570 -0.2845 -0.2398 -0.2174 2.8426
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.328734 0.261644 -16.544 < 2e-16 ***
## location_codeasia 0.347490 0.187052 1.858 0.063208 .
## location_codelaw 0.668909 0.246522 2.713 0.006660 **
## location_codemann 1.031294 0.259681 3.971 7.15e-05 ***
## location_codeuris 0.593217 0.300234 1.976 0.048172 *
## location_codehlm 1.081866 0.293362 3.688 0.000226 ***
## location_codemath 0.748466 0.438482 1.707 0.087832 .
## location_codemus -0.932390 1.011915 -0.921 0.356836
## location_codeafr -0.214804 1.017155 -0.211 0.832746
## length_cn 0.036105 0.012926 2.793 0.005218 **
## recorded_uses_item 0.017584 0.005506 3.194 0.001405 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1825.4 on 5974 degrees of freedom
## Residual deviance: 1775.5 on 5964 degrees of freedom
## AIC: 1797.5
##
## Number of Fisher Scoring iterations: 7
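The coefficients above are on the log-odds scale, so exponentiating them gives odds ratios, which are often easier to read. The sketch below copies the estimates from the summary. It takes olin to be the reference location, because olin is the only location code without its own coefficient.

```r
# Estimates copied from the model summary above (log-odds scale).
coefs <- c(
  asia = 0.347490, law = 0.668909, mann = 1.031294, uris = 0.593217,
  hlm = 1.081866, math = 0.748466, mus = -0.932390, afr = -0.214804,
  length_cn = 0.036105, recorded_uses_item = 0.017584
)

# exp(coefficient) is the multiplicative change in the odds of being
# unaccounted for: relative to olin for the location terms, and per one-unit
# increase for length_cn and recorded_uses_item.
odds_ratios <- exp(coefs)

print(round(odds_ratios, 2))
```

An odds ratio above 1 means higher odds of being unaccounted for than the baseline, and below 1 means lower odds. The p-values in the summary still govern which of these differences are distinguishable from noise.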
Items with highest probability of being unaccounted for
## # A tibble: 2 x 2
## `as.integer(is_missing)` n
## <int> <int>
## 1 1 91
## 2 2 9
| bib_rec_nbr | is_missing | location_code | length_cn | recorded_uses_item | .fitted |
|---|---|---|---|---|---|
| 7704606 | 1 | hlm | 17 | 204 | 0.7219358 |
| 2910259 | 0 | hlm | 17 | 147 | 0.4879488 |
| 6194426 | 0 | law | 16 | 145 | 0.3699469 |
| 7093874 | 0 | hlm | 18 | 108 | 0.3322796 |
| 787271 | 0 | mann | 10 | 98 | 0.2291452 |
| 2229926 | 0 | law | 18 | 94 | 0.2047236 |
| 7787849 | 1 | math | 14 | 89 | 0.1809669 |
| 4498531 | 0 | mann | 49 | 1 | 0.1808308 |
| 5347296 | 0 | mann | 49 | 0 | 0.1782406 |
| 4519494 | 0 | mann | 48 | 1 | 0.1755441 |
| 5318335 | 1 | mann | 15 | 68 | 0.1736257 |
| 1564634 | 0 | uris | 15 | 92 | 0.1713339 |
| 4070688 | 0 | uris | 15 | 84 | 0.1522739 |
| 2936554 | 1 | mann | 16 | 56 | 0.1499426 |
| 4695650 | 0 | uris | 30 | 52 | 0.1495671 |
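The .fitted column appears to be on the probability scale. Applying the inverse logit to the linear predictor built from the coefficients of the model with location_code, length_cn, and recorded_uses_item reproduces the first two rows of the table to within rounding. The helper name fitted_prob below is hypothetical.

```r
# Coefficients copied from the model summary (log-odds scale).
intercept   <- -4.328734
b_hlm       <-  1.081866   # location_codehlm
b_length_cn <-  0.036105   # length_cn
b_uses      <-  0.017584   # recorded_uses_item

# Hypothetical helper: predicted probability that an item is unaccounted for.
fitted_prob <- function(b_location, length_cn, uses) {
  eta <- intercept + b_location + b_length_cn * length_cn + b_uses * uses
  plogis(eta)  # inverse logit: 1 / (1 + exp(-eta))
}

p_row1 <- fitted_prob(b_hlm, length_cn = 17, uses = 204)  # table shows 0.7219358
p_row2 <- fitted_prob(b_hlm, length_cn = 17, uses = 147)  # table shows 0.4879488

print(round(c(p_row1, p_row2), 3))
```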
Items with highest probability of being accounted for
## # A tibble: 1 x 2
## `as.integer(is_missing)` n
## <int> <int>
## 1 1 100
| | bib_rec_nbr | is_missing | location_code | length_cn | recorded_uses_item | .fitted |
|---|---|---|---|---|---|---|
| 5961 | 2402245 | 0 | mus | 11 | 2 | 0.0079326 |
| 5962 | 1788258 | 0 | mus | 9 | 6 | 0.0079179 |
| 5963 | 61313 | 0 | mus | 11 | 1 | 0.0077955 |
| 5964 | 1175404 | 0 | mus | 10 | 3 | 0.0077882 |
| 5965 | 450158 | 0 | mus | 11 | 0 | 0.0076606 |
| 5966 | 1762746 | 0 | mus | 11 | 0 | 0.0076606 |
| 5967 | 2412818 | 0 | mus | 11 | 0 | 0.0076606 |
| 5968 | 175007 | 0 | mus | 11 | 0 | 0.0076606 |
| 5969 | 2422304 | 0 | mus | 10 | 1 | 0.0075211 |
| 5970 | 812844 | 0 | mus | 10 | 1 | 0.0075211 |
| 5971 | 749588 | 0 | mus | 10 | 0 | 0.0073910 |
| 5972 | 2028297 | 0 | mus | 10 | 0 | 0.0073910 |
| 5973 | 2420529 | 0 | mus | 10 | 0 | 0.0073910 |
| 5974 | 2099009 | 0 | mus | 10 | 0 | 0.0073910 |
| 5975 | 940063 | 0 | mus | 9 | 0 | 0.0071308 |
In this model, we first try to remove words from call numbers and then count the number of letters. The thinking here is that this might capture some of the complexity of more complicated call numbers. It was not successful: the letter count is still not a significant predictor. The simple call number length variable is more predictive.
df2 %>%
  select(display_call_no, length_cn, num_cn_chars) %>%
  sample_n(5)
## # A tibble: 5 x 3
## display_call_no length_cn num_cn_chars
## <chr> <dbl> <int>
## 1 E207.G81 G792 1871 18 3
## 2 PN1991.77.W3 W37 2013 21 4
## 3 PR6056.F54 W55x 1997 20 5
## 4 HD9000.5 .I582 1998 19 3
## 5 DS121.3 .R57 1992z 18 4
df2 %>%
  select(display_call_no, length_cn, num_cn_chars) %>%
  arrange(desc(num_cn_chars)) %>%
  top_n(5)
## Selecting by num_cn_chars
## # A tibble: 51 x 3
## display_call_no length_cn num_cn_chars
## <chr> <dbl> <int>
## 1 Oversize JN5208 .A16 ser.2 div.1 sect.2 + 41 13
## 2 PL5093.C5 B66 v.14,no.460,etc. 30 10
## 3 HD4813 .I781 3d sess.no.1 25 10
## 4 Trials KD370.N8 L38 1932 24 10
## 5 KF26 .A3 90th Apoll 19 10
## 6 KF26 .A35 92nd Tobac 20 10
## 7 KF26 .A6 92nd Agric 19 10
## 8 KF26 .C6 93rd Unive 19 10
## 9 KF26 .C6 96th Nomin 19 10
## 10 KF26 .E57 95th North 20 10
## # ... with 41 more rows
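The two call-number features can be sketched in base R. The report does not show how num_cn_chars was built, so the function below is an assumption: it lower-cases the call number, drops a hypothetical list of location words (only "oversize" here), and counts the remaining letters. With that choice it matches the values in the rows shown above, and length_cn matches a plain character count of display_call_no.

```r
# Hypothetical reconstruction of num_cn_chars: letters left in the call number
# after removing selected words. The drop_words list is an assumption.
count_cn_letters <- function(call_no, drop_words = c("oversize")) {
  cn <- tolower(call_no)
  for (w in drop_words) {
    cn <- gsub(w, "", cn, fixed = TRUE)  # remove the word wherever it occurs
  }
  nchar(gsub("[^a-z]", "", cn))          # keep letters only, then count them
}

# Call numbers copied from the output above.
examples <- c(
  "E207.G81 G792 1871",
  "PN1991.77.W3 W37 2013",
  "Oversize JN5208 .A16 ser.2 div.1 sect.2 +",
  "Trials KD370.N8 L38 1932"
)

print(data.frame(
  display_call_no = examples,
  length_cn       = nchar(examples),           # simple call number length
  num_cn_chars    = count_cn_letters(examples)
))
```

Note that "Trials" is not dropped in this sketch, which is consistent with the count of 10 shown for that row.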
mod3 <- glm(is_missing ~ location_code + recorded_uses_item + num_cn_chars, data=df2, family=binomial)
summary(mod3)
##
## Call:
## glm(formula = is_missing ~ location_code + recorded_uses_item +
## num_cn_chars, family = binomial, data = df2)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.1231 -0.2849 -0.2322 -0.2241 2.8518
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.889483 0.264240 -14.720 < 2e-16 ***
## location_codeasia 0.432762 0.184028 2.352 0.018693 *
## location_codelaw 0.644010 0.250550 2.570 0.010159 *
## location_codemann 1.012787 0.259366 3.905 9.43e-05 ***
## location_codeuris 0.715768 0.298746 2.396 0.016579 *
## location_codehlm 1.037640 0.291873 3.555 0.000378 ***
## location_codemath 0.653674 0.436420 1.498 0.134182
## location_codemus -1.006419 1.011378 -0.995 0.319689
## location_codeafr -0.239331 1.017033 -0.235 0.813959
## recorded_uses_item 0.017013 0.005454 3.119 0.001812 **
## num_cn_chars 0.055468 0.064544 0.859 0.390133
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1825.4 on 5974 degrees of freedom
## Residual deviance: 1782.1 on 5964 degrees of freedom
## AIC: 1804.1
##
## Number of Fisher Scoring iterations: 7
# olin
df_olin <- df2 %>%
  filter(location_code == "olin")
mod_olin <- glm(is_missing ~ recorded_uses_item + length_cn, data=df_olin, family=binomial)
summary(mod_olin)
##
## Call:
## glm(formula = is_missing ~ recorded_uses_item + length_cn, family = binomial,
## data = df_olin)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.4011 -0.2362 -0.2214 -0.2082 2.8683
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.463519 0.414020 -10.781 <2e-16 ***
## recorded_uses_item 0.007055 0.020287 0.348 0.7280
## length_cn 0.044925 0.021268 2.112 0.0347 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 763.64 on 3215 degrees of freedom
## Residual deviance: 759.41 on 3213 degrees of freedom
## AIC: 765.41
##
## Number of Fisher Scoring iterations: 6
# asia
df_asia <- df2 %>%
  filter(location_code == "asia")
mod_asia <- glm(is_missing ~ recorded_uses_item + length_cn, data=df_asia, family=binomial)
summary(mod_asia)
##
## Call:
## glm(formula = is_missing ~ recorded_uses_item + length_cn, family = binomial,
## data = df_asia)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.5334 -0.2822 -0.2719 -0.2667 2.6158
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.50735 0.57707 -6.078 1.22e-09 ***
## recorded_uses_item 0.04780 0.03733 1.280 0.200
## length_cn 0.01086 0.02731 0.398 0.691
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 416.01 on 1281 degrees of freedom
## Residual deviance: 414.58 on 1279 degrees of freedom
## AIC: 420.58
##
## Number of Fisher Scoring iterations: 6
# law
df_law <- df2 %>%
  filter(location_code == "law")
mod_law <- glm(is_missing ~ recorded_uses_item + length_cn, data=df_law, family=binomial)
summary(mod_law)
##
## Call:
## glm(formula = is_missing ~ recorded_uses_item + length_cn, family = binomial,
## data = df_law)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.5050 -0.3166 -0.3149 -0.3133 2.4740
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.046662 0.706540 -4.312 1.62e-05 ***
## recorded_uses_item 0.006834 0.018371 0.372 0.71
## length_cn 0.003802 0.038075 0.100 0.92
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 175.71 on 449 degrees of freedom
## Residual deviance: 175.59 on 447 degrees of freedom
## AIC: 181.59
##
## Number of Fisher Scoring iterations: 5
# mann
df_mann <- df2 %>%
  filter(location_code == "mann")
mod_mann <- glm(is_missing ~ recorded_uses_item + length_cn, data=df_mann, family=binomial)
summary(mod_mann)
##
## Call:
## glm(formula = is_missing ~ recorded_uses_item + length_cn, family = binomial,
## data = df_mann)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.3692 -0.3807 -0.3494 -0.3304 2.5107
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.49074 0.78252 -4.461 8.16e-06 ***
## recorded_uses_item 0.03621 0.01567 2.311 0.0209 *
## length_cn 0.03827 0.04073 0.940 0.3474
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 144.10 on 279 degrees of freedom
## Residual deviance: 139.08 on 277 degrees of freedom
## AIC: 145.08
##
## Number of Fisher Scoring iterations: 5
# uris
df_uris <- df2 %>%
  filter(location_code == "uris")
mod_uris <- glm(is_missing ~ recorded_uses_item + length_cn, data=df_uris, family=binomial)
summary(mod_uris)
##
## Call:
## glm(formula = is_missing ~ recorded_uses_item + length_cn, family = binomial,
## data = df_uris)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.8050 -0.3635 -0.2641 -0.2152 2.7233
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -5.33176 1.07742 -4.949 7.47e-07 ***
## recorded_uses_item 0.02336 0.01999 1.168 0.2427
## length_cn 0.10522 0.04204 2.503 0.0123 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 110.12 on 269 degrees of freedom
## Residual deviance: 103.18 on 267 degrees of freedom
## AIC: 109.18
##
## Number of Fisher Scoring iterations: 6
# hlm
df_hlm <- df2 %>%
  filter(location_code == "hlm")
mod_hlm <- glm(is_missing ~ recorded_uses_item + length_cn, data=df_hlm, family=binomial)
summary(mod_hlm)
##
## Call:
## glm(formula = is_missing ~ recorded_uses_item + length_cn, family = binomial,
## data = df_hlm)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.8395 -0.4045 -0.3860 -0.3534 2.4113
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.385662 1.367555 -2.476 0.0133 *
## recorded_uses_item 0.011548 0.007932 1.456 0.1454
## length_cn 0.048610 0.080933 0.601 0.5481
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 113.13 on 209 degrees of freedom
## Residual deviance: 111.02 on 207 degrees of freedom
## AIC: 117.02
##
## Number of Fisher Scoring iterations: 5
# math
df_math <- df2 %>%
  filter(location_code == "math")
mod_math <- glm(is_missing ~ recorded_uses_item + length_cn, data=df_math, family=binomial)
summary(mod_math)
##
## Call:
## glm(formula = is_missing ~ recorded_uses_item + length_cn, family = binomial,
## data = df_math)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.7418 -0.2950 -0.2700 -0.2600 2.6149
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.457243 1.902465 -1.817 0.0692 .
## recorded_uses_item 0.046144 0.021517 2.145 0.0320 *
## length_cn 0.004214 0.124724 0.034 0.9730
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 47.226 on 115 degrees of freedom
## Residual deviance: 42.931 on 113 degrees of freedom
## AIC: 48.931
##
## Number of Fisher Scoring iterations: 6
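The per-location fits above repeat the same call on a different subset each time. One way to express that pattern is to split the data by location_code and fit the model to each piece. The sketch below runs on simulated toy data, not the study data, and the names toy, fits, and coef_table are illustrative.

```r
set.seed(1)  # arbitrary seed for the simulated data

# Simulated stand-in for df2 with the three columns the models use.
n <- 900
toy <- data.frame(
  location_code      = sample(c("olin", "asia", "law"), n, replace = TRUE),
  recorded_uses_item = rpois(n, lambda = 3),
  length_cn          = sample(9:30, n, replace = TRUE)
)
toy$is_missing <- rbinom(
  n, size = 1,
  prob = plogis(-3 + 0.05 * toy$recorded_uses_item + 0.03 * toy$length_cn)
)

# Fit the same logistic regression separately within each location.
fits <- lapply(split(toy, toy$location_code), function(d) {
  glm(is_missing ~ recorded_uses_item + length_cn, data = d, family = binomial)
})

# One row of coefficients per location.
coef_table <- t(sapply(fits, coef))
print(round(coef_table, 3))
```

Calling summary(fits[["olin"]]) on an element of the list gives the same kind of output as the individual summaries above.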
Sometime after completing its validation study, EAST conducted a survey of participating libraries to find out about their theft deterrence practices. Thirty-two libraries responded. The library names are anonymized.
| library | tattletape_yes_no | validation_score |
|---|---|---|
| anteater | Yes | 0.016 |
| armadillo | No | 0.052 |
| axolotl | No | 0.016 |
| buffalo | Yes | 0.065 |
| camel | Yes | 0.024 |
| chameleon | Yes | 0.049 |
| cheetah | No | 0.010 |
| chipmunk | Yes | 0.047 |
| chupacabra | No | 0.027 |
| crow | Yes | 0.025 |
| dolphin | Yes | 0.011 |
| giraffe | Yes | 0.006 |
| grizzly | Yes | 0.010 |
| hedgehog | No | 0.047 |
| hippo | Yes | 0.037 |
| ifrit | Yes | 0.022 |
| iguana | No | 0.044 |
| jackal | Yes | 0.003 |
| koala | No | 0.017 |
| lemur | Yes | 0.084 |
| leopard | Yes | 0.008 |
| liger | Yes | 0.012 |
| llama | Yes | 0.018 |
| manatee | No | 0.016 |
| monkey | Yes | 0.047 |
| narwhal | No | 0.030 |
| nyan cat | Yes | 0.006 |
| otter | Yes | 0.010 |
| panda | Yes | 0.032 |
| quagga | No | 0.018 |
| squirrel | Yes | 0.005 |
| wombat | Yes | 0.033 |
In this experiment we divided the EAST libraries into two groups, the 22 libraries in the survey with security systems vs. the 10 libraries with no security systems, and generated unaccounted-for (UF) rates and standard errors using bootstrap simulation for each group. The difference in UF rates can be explained by random noise, as we see from the overlapping 95% confidence intervals. We conclude again that the effect of having a security system is statistically indistinguishable from zero.
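The group comparison can be sketched in base R from the survey table above. Two assumptions are made here that the report does not confirm: validation_score is treated as each library's UF rate, and each library counts as one observation when resampling. The helper name boot_mean_ci, the seed, and the percentile interval are likewise illustrative choices.

```r
set.seed(42)  # arbitrary seed, used only to make this sketch reproducible

# validation_score values copied from the table, grouped by tattletape_yes_no.
tape_yes <- c(0.016, 0.065, 0.024, 0.049, 0.047, 0.025, 0.011, 0.006, 0.010,
              0.037, 0.022, 0.003, 0.084, 0.008, 0.012, 0.018, 0.047, 0.006,
              0.010, 0.032, 0.005, 0.033)
tape_no  <- c(0.052, 0.016, 0.010, 0.027, 0.047, 0.044, 0.017, 0.016, 0.030,
              0.018)

# Hypothetical helper: bootstrap the group mean and return the observed mean,
# the bootstrap standard error, and a percentile 95% interval.
boot_mean_ci <- function(x, reps = 1000) {
  stats <- replicate(reps, mean(sample(x, replace = TRUE)))
  c(mean = mean(x), se = sd(stats), quantile(stats, c(0.025, 0.975)))
}

ci_yes <- boot_mean_ci(tape_yes)
ci_no  <- boot_mean_ci(tape_no)

print(round(rbind(tattletape_yes = ci_yes, tattletape_no = ci_no), 4))

# TRUE when the two percentile intervals overlap.
intervals_overlap <- ci_yes[["2.5%"]] < ci_no[["97.5%"]] &&
  ci_no[["2.5%"]] < ci_yes[["97.5%"]]
print(intervals_overlap)
```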
At Cornell, a natural experiment was already available because one unit, Law, does not use security stripping or gates. Intuition suggests that the Law UF rate should therefore be higher than that of the other units. That is not the case. In this sample, Law is right in the middle of the pack, with a confidence interval that overlaps those of units with both higher and lower UF rates.
Chandler, Adam. “Cornell Validation Study 2018 Findings 2,” September 2018. http://rpubs.com/acct4rpubs/419508.