The purpose of this experiment is to link a dataset of 1.487 million records covering Cornell’s 2001 - 2018 monograph acquisitions with the corresponding OCLC WorldCat holdings, then group the holdings by language of the material, in order to measure how distinctive the collection we are building in each language is. A great deal of effort went into compiling this dataset, and this is the first time we have used it to inform our conversation around distinctive collections.
language: language of the material as coded in the MARC record
n = number of monograph titles collected by Cornell in the 2001 - 2018 time period
cornell_only = number of titles collected that are owned only by Cornell
cornell_only_per1000 = (cornell_only / n) * 1000, i.e. the number of Cornell-only titles per 1000 titles collected
percent_has_circulated = percentage of titles that have circulated at least once
mean_ivy = average number of copies across Ivy Plus. For a group of titles (for example, Khmer language books) we sum the number of Ivy Plus libraries that hold each title in the set and divide by the number of titles. An average that is close to 1 means that few other libraries hold the same titles; a high average means that many libraries hold the same titles.
mean_oclc = average number of copies across all OCLC libraries. For a group of titles (for example, Khmer language books) we sum the number of OCLC WorldCat libraries that hold each title in the set and divide by the number of titles. An average that is close to 1 means that few other libraries hold the same titles; a high average means that many libraries hold the same titles.
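The definitions above amount to a grouped summary with one row per language. The following is a minimal sketch, assuming one row per title and the column names `lang_code`, `has_circulated`, and `oclc_inst_cnt` that appear in the query later in this report. The column `ivy_inst_cnt`, the table name `circ_demo`, and all values in it are hypothetical and exist only for illustration.

```r
library(dplyr)

# Toy data, invented for illustration: one row per title.
# ivy_inst_cnt is a hypothetical column name for the Ivy Plus holdings count.
circ_demo <- tibble(
  lang_code      = c("khm", "khm", "khm", "eng", "eng"),
  has_circulated = c(0, 1, 0, 1, 1),
  oclc_inst_cnt  = c(1, 1, 3, 250, 120),
  ivy_inst_cnt   = c(1, 1, 2, 12, 10)
)

by_language <- circ_demo %>%
  group_by(lang_code) %>%
  summarise(
    n                      = n(),                        # titles collected
    cornell_only           = sum(oclc_inst_cnt == 1),    # held by Cornell alone
    cornell_only_per1000   = cornell_only / n * 1000,
    percent_has_circulated = mean(has_circulated) * 100,
    mean_ivy               = mean(ivy_inst_cnt),
    mean_oclc              = mean(oclc_inst_cnt)
  )

print(by_language)
```

Treating `oclc_inst_cnt == 1` as "held only by Cornell" follows the filter used in the Khmer query in this report; whether the full table was built exactly this way is an assumption.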
TABLE 1 shows that between 2001 and 2018 Cornell added 1232 Khmer language books to the collection. Of these, 817 are held only by Cornell, which means that for every 1000 books we acquired in this language, 663 are unique to Cornell. Continuing with this example, 11% of the Khmer titles have circulated, the average holdings across Ivy Plus libraries is 1.12, and the average number of WorldCat libraries that hold these titles is 1.78.
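As a quick check of the per-1000 figure, the arithmetic can be reproduced directly from the two counts quoted for Khmer. The variable names here are illustrative only.

```r
# Figures quoted in the text for Khmer (from TABLE 1).
n_khm            <- 1232
cornell_only_khm <- 817

# cornell_only_per1000 = (cornell_only / n) * 1000
cornell_only_per1000 <- cornell_only_khm / n_khm * 1000
print(round(cornell_only_per1000))  # 663
```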
How many Khmer books have circulated?
## # A tibble: 2 x 2
## # Groups:   has_circulated [2]
##   has_circulated     n
##            <dbl> <int>
## 1              0  1093
## 2              1   140
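The code behind the counts above is not shown. A pipeline of roughly the following shape would yield a grouped, two-row tibble like that one. This is a sketch on invented data (`circ_demo` and its values are hypothetical), patterned on the filter used elsewhere in this report, and not necessarily the exact code that was run.

```r
library(dplyr)

# Toy data, invented for illustration: one row per title.
circ_demo <- tibble(
  lang_code      = c("khm", "khm", "khm", "khm", "eng"),
  has_circulated = c(0, 0, 1, 0, 1)
)

khm_circ_counts <- circ_demo %>%
  filter(lang_code == "khm") %>%   # keep Khmer titles only
  group_by(has_circulated) %>%     # 0 = never circulated, 1 = circulated
  count()                          # one row per group, count in column n

print(khm_circ_counts)
```

Because `count()` is called on a grouped table, the result stays grouped by `has_circulated`, which matches the `Groups:` line in the output above.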
How many Khmer books that have not circulated are held only by Cornell?
circ %>%
  filter(lang_code == "khm",
         has_circulated == 0,
         oclc_inst_cnt == 1) %>%
  count()
## # A tibble: 1 x 1
##       n
##   <int>
## 1   758
By these criteria, Khmer is indeed a distinctive Cornell collection.