Distinctive collections by language, defined in relation to WorldCat and Ivy Plus holdings

Cornell Monograph Acquisitions 2001 - 2018

The purpose of this experiment is to link a dataset of 1.487 million records of Cornell’s 2001 - 2018 monograph acquisitions with the corresponding OCLC WorldCat holdings, then group the holdings by the language of the material, in order to measure how distinctive the collection we are building is in each language. A great deal of effort went into compiling this dataset, and this is the first time we have used it to inform our conversation around distinctive collections.

Variables

language: language of the material as coded in the MARC record

n = number of monograph titles collected by Cornell in the 2001 - 2018 time period

cornell_only = number of titles collected that are held only by Cornell (no other OCLC WorldCat library reports holding them)

cornell_only_per1000 = (cornell_only / n) * 1000, the number of Cornell-only titles per 1000 titles collected

percent_has_circulated = percentage of titles that have circulated at least once

mean_ivy = average number of Ivy Plus libraries that hold each title. For a group of titles we take the total number of Ivy Plus libraries that hold titles in the set (for example, Khmer language books) and divide by the number of titles. An average that is close to 1 means that few other libraries hold the same titles; a high average means that many libraries hold the same titles.

mean_oclc = average number of OCLC WorldCat libraries that hold each title. For a group of titles we take the total number of OCLC WorldCat libraries that hold titles in the set (for example, Khmer language books) and divide by the number of titles. An average that is close to 1 means that few other libraries hold the same titles; a high average means that many libraries hold the same titles.
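The variable definitions above can be sketched as a single dplyr summary. This is a minimal illustration on toy data, not the code used to build TABLE 1. The column names lang_code, has_circulated, and oclc_inst_cnt follow the queries shown in this report; ivy_inst_cnt is a hypothetical name for the Ivy Plus holdings count, and the sketch assumes that a holdings count of 1 means Cornell is the only holder.

```r
library(dplyr)

# Toy stand-in for the acquisitions data: one row per title.
# ivy_inst_cnt is a hypothetical column name (assumption).
circ <- tibble(
  lang_code      = c("khm", "khm", "khm", "khm", "eng", "eng"),
  has_circulated = c(0, 0, 1, 0, 1, 1),
  oclc_inst_cnt  = c(1, 1, 3, 2, 250, 40),
  ivy_inst_cnt   = c(1, 1, 2, 1, 11, 6)
)

by_language <- circ %>%
  group_by(lang_code) %>%
  summarise(
    n                      = n(),                           # titles collected
    cornell_only           = sum(oclc_inst_cnt == 1),       # held by one library only
    cornell_only_per1000   = cornell_only / n * 1000,
    percent_has_circulated = mean(has_circulated == 1) * 100,
    mean_ivy               = mean(ivy_inst_cnt),            # total holdings / titles
    mean_oclc              = mean(oclc_inst_cnt)
  ) %>%
  arrange(desc(cornell_only_per1000))

by_language
```

Sorting by cornell_only_per1000 puts the most distinctive languages, by this measure, at the top of the table.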

Case study: Khmer

TABLE 1 shows that, during these two decades, Cornell added 1232 Khmer language books to the collection. Of these, 817 are held only by Cornell, which means that for every 1000 books we acquired in this language, 663 are unique to Cornell. Continuing with this example, 11% of them have circulated, the average number of Ivy Plus libraries holding each title is 1.12, and the average number of WorldCat libraries holding each title is 1.78.
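As a quick check of the per-1000 figure, the arithmetic can be reproduced from the two Khmer counts quoted above. The variable names here are illustrative only.

```r
n_khmer      <- 1232  # Khmer titles acquired, 2001 - 2018
cornell_only <- 817   # of those, held only by Cornell

# Cornell-only titles per 1000 titles collected
cornell_only_per1000 <- cornell_only / n_khmer * 1000
round(cornell_only_per1000)
```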

How many Khmer books have circulated?

## # A tibble: 2 x 2
## # Groups:   has_circulated [2]
##   has_circulated     n
##            <dbl> <int>
## 1              0  1093
## 2              1   140
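The code that produced the counts above is not shown. A grouped count along the following lines would give output of the same shape. This is a sketch on toy data, assuming the same circ data frame and column names used in the next query.

```r
library(dplyr)

# Toy stand-in for the acquisitions data (assumption: one row per title).
circ <- tibble(
  lang_code      = c("khm", "khm", "khm", "eng"),
  has_circulated = c(0, 0, 1, 1)
)

# Count Khmer titles by circulation status; count() keeps the grouping,
# which matches the "Groups: has_circulated" line in the printed tibble.
khmer_circ <- circ %>%
  filter(lang_code == "khm") %>%
  group_by(has_circulated) %>%
  count()

khmer_circ
```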

How many Khmer books that have not circulated are held only by Cornell?

# Khmer titles that have never circulated and are held by only one OCLC library (Cornell)
circ %>%
  filter(lang_code == "khm",
         has_circulated == 0,
         oclc_inst_cnt == 1) %>%
  count()

## # A tibble: 1 x 1
##       n
##   <int>
## 1   758

By these criteria, Khmer is indeed a distinctive Cornell collection.

Questions

How can we connect readers to these materials? Across all the languages we acquired during this time period, there are 68355 books that are held only by Cornell AND have never circulated. The implications are fascinating. There is a high chance that the only people who have ever read these materials are the authors, publishers, and the few individuals who purchased copies.
Are there strategies that archivists use to promote archival collections to local, national, and international communities that could be considered for these circulating materials?
Can we imagine a better quantitative definition of a distinctive collection that retains the simplicity of cornell_only_per1000?
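The all-language figure in the first question corresponds to dropping the language filter from the Khmer query. The sketch below runs on toy data and assumes the same column names; the per-language breakdown is an illustrative addition showing where such titles concentrate.

```r
library(dplyr)

# Toy stand-in for the acquisitions data (assumption: one row per title).
circ <- tibble(
  lang_code      = c("khm", "khm", "khm", "tha", "eng"),
  has_circulated = c(0, 0, 1, 0, 1),
  oclc_inst_cnt  = c(1, 1, 1, 1, 300)
)

# Titles held only by Cornell that have never circulated, all languages
unread_unique <- circ %>%
  filter(has_circulated == 0,
         oclc_inst_cnt == 1)

total <- nrow(unread_unique)

# The same titles broken down by language, largest groups first
by_lang <- unread_unique %>%
  count(lang_code, sort = TRUE)

total
by_lang
```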

Adam Chandler

07 August 2020 15:29

Appendix 1: Language code and LC class

Appendix 2: Language and LC class circulation by borrower category