Source file ⇒ small-project-books.Rmd

The dataset used to create this HTML document comes from an online library dataset. Managing the full collection in a single document would be too difficult and time-consuming, so for the purposes of this project the document uses a small subset of the larger collection.

Basics

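# Download the data file into the current working directory.
download.file("http://tiny.cc/dcf/Library-small.rda",
  destfile = "Library-small.rda", mode = "wb")
# Load it from the same location it was saved to.
load("Library-small.rda")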

The two pieces of code above run in R chunks: the first downloads the file and the second loads it. Loading “Library-small.rda” gives the two data tables shown below.

## Source: local data frame [2,000 x 18]
## Groups: Item.Type [1]
## 
##     Shelving.Location Item.Type        Call.Number
##                 (chr)     (chr)              (chr)
## 1    Stacks - Level 3    VOLUME   HD9055 .B44 1989
## 2    Stacks - Level 3    VOLUME     HD31 .C53 1961
## 3    Stacks - Level 3    VOLUME  PG3476.B2 A6 1993
## 4    Stacks - Level 3    VOLUME  JN30 .E94168 2010
## 5    Stacks - Level 4    VOLUME    B2929 .N67 1976
## 6    Stacks - Level 3    VOLUME     N 7340 H7 1954
## 7    Stacks - Level 3    VOLUME   JK1896 .C55 2003
## 8  Oversize - Level 2    VOLUME  ND553.R45 A4 1979
## 9    Stacks - Level 4    VOLUME    DC404 .S87 2007
## 10 Oversize - Level 2    VOLUME N6537.L384 A4 2000
## ..                ...       ...                ...
## Variables not shown: Author (chr), Title.or.Description (chr),
##   Textual.Holdings (chr), Material.Format (chr), OCLC.Number (int), ISBN
##   (chr), Item.Barcode (chr), Cost (chr), Current.Status (chr),
##   Loan.Date.Due (time), Issued.Count (int), Issued.Count.YTD (int),
##   Last.Issued.Date (time), Last.Inventoried.Date (time), Item.Deleted.Date
##   (time)
## Source: local data frame [3,765 x 15]
## 
##    OCLC.Number      Format                                         Subject
##          (int)       (chr)                                           (chr)
## 1         1179 Book, Print            Special Industries & Trades, General
## 2         9864 Book, Print                                History, General
## 3        10413 Book, Print               Religions, Mythology, Rationalism
## 4        10982 Book, Print                              English Literature
## 5        11900 Book, Print             Print Media, Printmaking, Engraving
## 6        23797 Book, Print History - Americas, General, Indian, N. America
## 7        32490 Book, Print       Motor Vehicles, Aeronautics, Astronautics
## 8        32490 Book, Print       Motor Vehicles, Aeronautics, Astronautics
## 9        32490 Book, Print       Motor Vehicles, Aeronautics, Astronautics
## 10       35437 Book, Print        Superintendent of Documents Publications
## ..         ...         ...                                             ...
## Variables not shown: Title (chr), Author (chr), Publication.Date (int),
##   Edition (chr), Publisher (chr), ISBN (chr), Language (chr),
##   Physical.Description (chr), Genre (chr), LC.Call.Number (chr),
##   Dewey.Call.Number (chr), Local.Call.Number (chr)
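
To check which objects a file like this adds to the session, it can help to capture the return value of `load()`, which is the character vector of object names it restored. The sketch below does not touch the real file: it saves two toy data frames to a temporary `.rda` file and loads them back. The rows are invented, and `SecondTable` is a hypothetical name, since the document only names the first table (`Inv`).

```r
# Toy stand-ins for the two tables (invented rows, not the real data).
Inv <- data.frame(Item.Type = "VOLUME", Current.Status = "AVAILABLE")
SecondTable <- data.frame(OCLC.Number = 1179L, Format = "Book, Print")  # hypothetical name

# Save both objects to a temporary .rda file, mimicking the downloaded file.
rda_path <- tempfile(fileext = ".rda")
save(Inv, SecondTable, file = rda_path)

# Load into a fresh environment; load() invisibly returns the object names.
library_env <- new.env()
loaded_names <- load(rda_path, envir = library_env)
print(loaded_names)
```

Loading into a separate environment is optional; it simply keeps the restored objects from overwriting anything already in the workspace.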

The number of cases (rows) in the first table, Inv, where each case is an individual book in the library's inventory:

## [1] 2000

The number of variables (columns) in Inv, that is, the parameters recorded for each individual book:

## [1] 18

The number of cases (rows) in the second table, where each case is a book that may or may not be available in the library’s collection:

## [1] 3765

The number of variables (columns) in the second table, that is, the parameters used to identify each individual book:

## [1] 15
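
In a data frame, cases are rows and variables are columns, so `nrow()` gives the number of cases and `ncol()` the number of variables. The chunk that produced the counts above is not shown, so the following is only a minimal sketch on a toy data frame (`inv_toy`, with invented rows standing in for Inv):

```r
# Toy stand-in for Inv: 3 invented items described by 2 variables.
inv_toy <- data.frame(
  Item.Type = c("VOLUME", "VOLUME", "VOLUME"),
  Current.Status = c("AVAILABLE", "ON_LOAN", "AVAILABLE")
)

n_cases <- nrow(inv_toy)      # rows = cases
n_variables <- ncol(inv_toy)  # columns = variables
table_dim <- dim(inv_toy)     # both at once: rows first, then columns

print(n_cases)
print(n_variables)
print(table_dim)
```

Applied to the real tables, the same calls should agree with the dimensions printed in each table header, such as `[2,000 x 18]` for Inv.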

Note that you can just as easily read the number of rows and columns from the header at the top of each table to determine the cases and variables. The next two tables are derived from columns of the Inv dataset (the first one). The first table below counts the books by current status: how many are available, missing, on loan, or withdrawn.

## Source: local data frame [4 x 2]
## 
##   Current.Status     n
##            (chr) (int)
## 1      AVAILABLE  1866
## 2        MISSING     1
## 3        ON_LOAN    18
## 4      WITHDRAWN   115
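
A per-category tally like this one can be produced with `count()` from dplyr. Which call the original chunk used is an assumption, since the code is not shown; the sketch below runs on a toy data frame (`inv_toy`, invented rows) rather than the real Inv table.

```r
library(dplyr)

# Toy stand-in for Inv with an invented mix of statuses.
inv_toy <- data.frame(
  Current.Status = c("AVAILABLE", "AVAILABLE", "ON_LOAN", "WITHDRAWN", "AVAILABLE")
)

# One row per distinct status, with the number of items in column n.
status_counts <- inv_toy %>% count(Current.Status)
print(status_counts)
```

The counts in the `n` column always add up to the number of rows in the input, which is a quick sanity check: in the table above, 1866 + 1 + 18 + 115 = 2000.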

The next table shows how many books have been issued a given number of times. For example, 996 books have never been issued and 337 have been issued exactly once.

## Source: local data frame [46 x 2]
## 
##    Issued.Count     n
##           (int) (int)
## 1             0   996
## 2             1   337
## 3             2   203
## 4             3   121
## 5             4    69
## 6             5    46
## 7             6    38
## 8             7    29
## 9             8    24
## 10            9    20
## ..          ...   ...
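
Only the first 10 of the 46 rows are displayed, but they already describe most of the collection. The sketch below copies the displayed rows into a small data frame and works out two summary figures. It assumes the `n` column sums to the 2,000 rows of Inv, which cannot be verified from the truncated output.

```r
# The ten rows displayed above, copied from the output (36 further rows are not shown).
shown <- data.frame(
  Issued.Count = 0:9,
  n = c(996, 337, 203, 121, 69, 46, 38, 29, 24, 20)
)

total_items <- 2000  # assumption: every one of the 2,000 Inv rows is counted exactly once

# Share of items that have never been issued.
never_issued_share <- shown$n[shown$Issued.Count == 0] / total_items

# Items covered by the displayed rows, and items left in the rows not shown.
items_shown <- sum(shown$n)
items_not_shown <- total_items - items_shown

print(never_issued_share)  # 0.498
print(items_shown)         # 1883
print(items_not_shown)     # 117
```

Under that assumption, just under half of the items have never been issued, and 117 items fall in the 36 rows that are not displayed.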