Install necessary packages:

install.packages("devtools", dependencies = TRUE, repos = "http://lib.stat.cmu.edu/R/CRAN/")
## Warning: dependencies 'BiocInstaller', 'lintr (>= 0.2.1)' are not available
## package 'devtools' successfully unpacked and MD5 sums checked
## 
## The downloaded binary packages are in
##  C:\Users\Karen\AppData\Local\Temp\RtmpsZCXQ7\downloaded_packages
install.packages("ggplot2", dependencies = TRUE, repos = "http://lib.stat.cmu.edu/R/CRAN/")
## package 'ggplot2' successfully unpacked and MD5 sums checked
## 
## The downloaded binary packages are in
##  C:\Users\Karen\AppData\Local\Temp\RtmpsZCXQ7\downloaded_packages
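
The install calls above run unconditionally every time the report is knit, and RCurl, which is loaded in the next step, is not among them. One possible pattern, sketched here with a hypothetical helper named `missing_packages` that is not part of the original analysis, is to check which packages are absent and install only those:

```r
# Hypothetical helper (not in the original analysis): return the names of
# packages that cannot be loaded from the current library paths.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Packages this report relies on, including RCurl.
needed <- c("devtools", "ggplot2", "RCurl")
to_install <- missing_packages(needed)
to_install

# Uncomment to install only what is missing:
# if (length(to_install) > 0) install.packages(to_install, dependencies = TRUE)
```

The install line is left commented out so that running the sketch has no side effects.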

Load RCurl so the CSV files can be retrieved, and load ggplot2 for plotting

library(RCurl)
## Loading required package: bitops
library(ggplot2)

Retrieve the data files from the GitHub repository

ten_eleven_comp <- getURL("https://raw.githubusercontent.com/karenweigandt/cuny-summer-bridge/master/Libraries_Annual_Statistics_Comparison_2010-2011.csv")
eleven_coll_stats <- getURL("https://raw.githubusercontent.com/karenweigandt/cuny-summer-bridge/master/Libraries_Collection_Statistics_2011_2.csv")
five_twelve_mat_inv <- getURL("https://raw.githubusercontent.com/karenweigandt/cuny-summer-bridge/master/Libraries_Material_Inventory_2005_-_2012.csv")

Read the data files

ten_eleven_comp_csv <- read.csv(text = ten_eleven_comp, header = TRUE, stringsAsFactors = FALSE)
eleven_coll_stats_csv <- read.csv(text = eleven_coll_stats, header = TRUE, stringsAsFactors = FALSE)
five_twelve_mat_inv_csv <- read.csv(text = five_twelve_mat_inv, header = TRUE, stringsAsFactors = FALSE)

Look at the first six rows of each data set

head(ten_eleven_comp_csv)
##   Island                Library Code Holdings.FY.2010 Holdings.FY.2011
## 1   Oahu                  Aiea   OAA            82718            78370
## 2   Oahu         Aina Haina  1/  OAH            70180            66364
## 3   Oahu Ewa Beach P/S Lib.  2/  OEW            91196            80317
## 4   Oahu        Hawaii Kai   3/  OHK            79327            78925
## 5   Oahu          HSL   4/ - 6/  OHS           571114           571271
## 6   Oahu           Kahuku Total                 49650            49521
##   Percent.Difference.Holdings Registered.Borrowers.FY.2010
## 1                         (5)                        23053
## 2                         (5)                        15374
## 3                        (12)                        20928
## 4                         (1)                        18919
## 5                           0                        89591
## 6                         (0)                         8977
##   Registered.Borrowers.FY.2011 Percent.Difference.Borrowers
## 1                        23709                            3
## 2                        15901                            3
## 3                        21057                            1
## 4                        19468                            3
## 5                        90724                            1
## 6                         9223                            3
##   Circulation.Activity.FY.2010 Circulation.Activity.FY.2011
## 1                       176469                       170657
## 2                       162701                       161298
## 3                        91858                        98047
## 4                       156970                       148851
## 5                       467636                       440722
## 6                        55318                        55914
##   Percent.Difference.Circulation Holds.FY.2010 Holds.FY.2011
## 1                            (3)          2869          3095
## 2                            (1)          2665          2788
## 3                              7          2006          2316
## 4                            (5)          3807          4197
## 5                            (6)         11387         10649
## 6                              1           885          1125
##   Percent.Difference.Holds
## 1                        8
## 2                        5
## 3                       15
## 4                       10
## 5                      (6)
## 6                       27
head(eleven_coll_stats_csv)
##   Island      LIBRARY REFERENCE   BOOK    CD  DVD VIDEO PHONOTAPE
## 1   Oahu         Aiea      1381  68590  3408 4145    71       679
## 2   Oahu   Aina Haina      1322  59650  2843 1967    24       532
## 3   Oahu    Ewa Beach      2987  74316  1153 1399     5       470
## 4   Oahu   Hawaii Kai      3464  70517  1715 1989   198      1007
## 5   Oahu Hawaii State    121403 426129 11325 6056  1674      3151
## 6   Oahu       Kahuku      2950  43377  1102 1135   272       671
##   PHONODISC MICROFORM CD.ROM SOFTWARE AV..MISC. LANGUAGE.LEARNING
## 1         0         0      0        0         1                55
## 2         0         0      2        0         8                48
## 3         0         0      3        0         5                22
## 4         0         0      6        0         1                15
## 5       338       290    223        0         0               648
## 6         0         0      1        0         0                12
##   UNCATALOGED  TOTAL
## 1          18  78348
## 2          39  66435
## 3          14  80374
## 4          13  78925
## 5          18 571255
## 6           2  49522
head(five_twelve_mat_inv_csv)
##   Year ISLAND       LIBRARY           INVENTORY.DESCRIPTOR  ITEMS
## 1 2005 Hawaii Bond Memorial        Library Materials Books  15485
## 2 2005 Hawaii Bond Memorial Library Materials Compact Disc    302
## 3 2005 Hawaii Bond Memorial          Library Materials DVD    411
## 4 2005 Hawaii Bond Memorial    Library Materials Phonotape    196
## 5 2005 Hawaii Bond Memorial    Library Materials Videotape    521
## 6 2005 Hawaii          Hilo        Library Materials Books 206026
##     SUM.PRICE
## 1  $238457.06
## 2    $6041.87
## 3   $10058.39
## 4    $4173.71
## 5   $10028.23
## 6 $3313672.62
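
The output above shows two columns that cannot be used as numbers as they stand: the percent-difference columns contain values such as `(5)`, and `SUM.PRICE` contains values such as `$238457.06`. The sketch below shows one way to convert them. It assumes the parentheses follow the accounting convention for negative values, which matches the Aiea row, where holdings fall from 82718 to 78370. The helper names `parse_accounting` and `parse_currency` are hypothetical and are not used elsewhere in this analysis.

```r
# Hypothetical helper: convert strings such as "(5)" to -5 and "3" to 3.
# Assumption: parentheses mark negative values (accounting notation).
parse_accounting <- function(x) {
  x <- trimws(x)
  negative <- grepl("^\\(.*\\)$", x)
  value <- as.numeric(gsub("[(),]", "", x))
  ifelse(negative, -value, value)
}

# Hypothetical helper: convert strings such as "$238457.06" to numeric dollars.
parse_currency <- function(x) {
  as.numeric(gsub("[$,]", "", trimws(x)))
}

# Values copied from the head() output above.
parse_accounting(c("(5)", "(12)", "0", "(0)", "7"))
parse_currency(c("$238457.06", "$6041.87"))
```

Applied to the real data, the calls would take a form such as `parse_accounting(ten_eleven_comp_csv$Percent.Difference.Holdings)` and `parse_currency(five_twelve_mat_inv_csv$SUM.PRICE)`.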

Subset material inventory data to only include 2011

eleven_mat_inv <- subset(five_twelve_mat_inv_csv, Year == 2011)
#head(eleven_mat_inv)

Subset 2011 material inventory to books on the island of Oahu

oahu_eleven_book_inv <- subset(eleven_mat_inv, ISLAND == 'Oahu' & INVENTORY.DESCRIPTOR == 'Library Materials Books')

#oahu_eleven_book_inv

Histogram for Books in Hawaii libraries

options(scipen = 3)
qplot(eleven_coll_stats_csv$BOOK, 
      geom = "histogram", 
      binwidth = 25000,
      col = I("red"),
      fill = I("blue"),
      main = "Books in Hawaii Libraries",
      xlab = "Number of Books",
      ylab = "Number of Libraries"
      )

Basic Statistics for number of books in Hawaii libraries

sprintf("Average number of books in Hawaii libraries is %s", round(mean(eleven_coll_stats_csv$BOOK)))
## [1] "Average number of books in Hawaii libraries is 60001"
sprintf("Standard deviation of the number of books in Hawaii libraries is %s", round(sd(eleven_coll_stats_csv$BOOK)))
## [1] "Standard deviation of the number of books in Hawaii libraries is 64245"
sprintf("Median number of books in Hawaii libraries is %s", round(median(eleven_coll_stats_csv$BOOK)))
## [1] "Median number of books in Hawaii libraries is 43377"

Quartiles for the number of books in Hawaii libraries

qb <- quantile(eleven_coll_stats_csv$BOOK)
qb
##       0%      25%      50%      75%     100% 
##    433.0  27572.5  43377.0  71616.0 426129.0

Boxplot for number of books in Hawaii libraries

options(scipen = 3)
boxplot(eleven_coll_stats_csv$BOOK, main = "Boxplot: Books in Hawaii Libraries")
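
By default, `boxplot()` draws as separate points any values lying more than 1.5 times the interquartile range beyond the hinges. The base R function `boxplot.stats()` applies the same rule and returns the flagged values, so it can list the outliers by value. The sketch below is illustrative only: it uses the six BOOK counts visible in the `head()` output above, so its hinges differ from those of the full data set.

```r
# BOOK counts for the six libraries shown in the head() output above.
books <- c(68590, 59650, 74316, 70517, 426129, 43377)

# boxplot.stats() uses the same default rule as boxplot() (coef = 1.5).
stats <- boxplot.stats(books)

stats$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
stats$out    # values lying beyond 1.5 * IQR from the hinges
```

On these six rows only the Hawaii State value (426129) is flagged. Running `boxplot.stats(eleven_coll_stats_csv$BOOK)$out` on the full column would list the points drawn beyond the whiskers in the boxplot.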

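Keep only the island, library code, and 2011 columns (holdings, registered borrowers, circulation), then subset by island

eleven_patron_circ <- ten_eleven_comp_csv[c(-2, -4, -6, -7, -9, -10, -12,-13, -14, -15)]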

#head(eleven_patron_circ)

eleven_oahu_patron_circ <- subset(eleven_patron_circ, Island == 'Oahu')
 
#eleven_oahu_patron_circ

eleven_kauai_patron_circ <- subset(eleven_patron_circ, Island == 'Kauai')

#head(eleven_kauai_patron_circ)

Plot holdings vs. books

qplot(TOTAL, BOOK,
      data = eleven_coll_stats_csv, 
      geom = "point", 
      main = "Year 2011: Holdings vs. Number of Books for All Hawaii Libraries",
      xlab = "Number of Holdings",
      ylab = "Number of Books"
      )

Display a scatterplot of Holdings vs. Circulation for Hawaii libraries for 2011

qplot(Holdings.FY.2011, Circulation.Activity.FY.2011,
      data = eleven_patron_circ, 
      geom = "point", 
      main = "Year 2011: Holdings vs. Circulation in Hawaii Libraries",
      xlab = "Number of Holdings",
      ylab = "Circulation Volume"
      )

Display a scatterplot of Holdings vs. Circulation for Oahu libraries for 2011

qplot(Holdings.FY.2011, Circulation.Activity.FY.2011,
      data = eleven_oahu_patron_circ, 
      geom = "point", 
      main = "Year 2011: Holdings vs. Circulation in Oahu Libraries",
      xlab = "Number of Holdings",
      ylab = "Circulation Volume"
      )

Display a scatterplot of Borrowers vs. Circulation for Oahu libraries for 2011

options(scipen = 3)
qplot(Registered.Borrowers.FY.2011, Circulation.Activity.FY.2011,
      data = eleven_oahu_patron_circ, 
      geom = "point", 
      main = "Year 2011: Borrowers vs. Circulation in Oahu Libraries",
      xlab = "Number of Borrowers",
      ylab = "Circulation Volume"
      )

Display a scatterplot of Borrowers vs. Holdings for Oahu libraries for 2011

options(scipen = 3)
qplot(Registered.Borrowers.FY.2011, Holdings.FY.2011,
      data = eleven_oahu_patron_circ, 
      geom = "point", 
      main = "Year 2011: Borrowers vs. Holdings in Oahu Libraries",
      xlab = "Number of Borrowers",
      ylab = "Number of Holdings"
      )

Display a scatterplot of Holdings vs. Circulation for Kauai libraries for 2011

qplot(Holdings.FY.2011, Circulation.Activity.FY.2011,
      data = eleven_kauai_patron_circ, 
      geom = "point", 
      main = "Year 2011: Holdings vs. Circulation in Kauai Libraries",
      xlab = "Number of Holdings",
      ylab = "Circulation Volume"
      )

Display a scatterplot of Borrowers vs. Circulation for Kauai libraries for 2011

options(scipen = 3)
qplot(Registered.Borrowers.FY.2011, Circulation.Activity.FY.2011,
      data = eleven_kauai_patron_circ, 
      geom = "point", 
      main = "Year 2011: Borrowers vs. Circulation in Kauai Libraries",
      xlab = "Number of Borrowers",
      ylab = "Circulation Volume"
      )

Display a scatterplot of Borrowers vs. Holdings for Kauai libraries for 2011

options(scipen = 3)
qplot(Registered.Borrowers.FY.2011, Holdings.FY.2011,
      data = eleven_kauai_patron_circ, 
      geom = "point", 
      main = "Year 2011: Borrowers vs. Holdings in Kauai Libraries",
      xlab = "Number of Borrowers",
      ylab = "Number of Holdings"
      )

Conclusion:

The information and visualizations presented here make it easy to ascertain some facts about the libraries in Hawaii. First, the histogram shows that the majority of Hawaiian libraries contain fewer than 100,000 books. There is one library with an unusually large number of holdings, the Big Kahuna of Hawaiian libraries. The boxplot shows three data points (libraries) that could be considered outliers. Outlier removal is beyond the scope of this analysis.

The first scatterplot shows the relationship between the number of books and the overall number of holdings for the libraries. It appears to be linear (proportional), which suggests that relationships between other library data and holdings would also hold between the number of books and those data.
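
The visual impression of linearity can be checked numerically with a correlation coefficient and a simple linear fit. The sketch below is an illustration rather than part of the original analysis: it uses only the six TOTAL and BOOK values visible in the `head()` output above, and a single very large library dominates such a small sample, so the figures it produces say little about the full data set.

```r
# TOTAL (holdings) and BOOK values for the six libraries shown by head().
holdings <- c(78348, 66435, 80374, 78925, 571255, 49522)
books    <- c(68590, 59650, 74316, 70517, 426129, 43377)

# Pearson correlation: values near 1 indicate a strong linear association.
r <- cor(holdings, books)

# Least-squares line; the slope estimates books gained per additional holding.
fit <- lm(books ~ holdings)
slope <- unname(coef(fit)["holdings"])

r
slope
```

For the full data set, the equivalent calls would be `cor(eleven_coll_stats_csv$TOTAL, eleven_coll_stats_csv$BOOK)` and `lm(BOOK ~ TOTAL, data = eleven_coll_stats_csv)`. Repeating them without the largest library would show how much that one point drives the result.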

For instance, the next scatterplots show the relationship between holdings and circulation, first for all of Hawaii and then for the island of Oahu. These appear similar, while a scatterplot of the same data for the island of Kauai exhibits a somewhat different profile. This may be because there are fewer libraries (data points) on Kauai.

We have also plotted number of borrowers vs. circulation and number of borrowers vs. holdings for both Oahu and Kauai. For Kauai, the pattern for circulation is similar whether plotted against number of borrowers or holdings, but the borrowers vs. holdings plot shows that this relationship is not directly proportional. Oahu is where our large outlier library resides. There also appears to be another library there with a high ratio of holdings to borrowers.

It seems that, in general, libraries with larger holdings (more books) have more borrowers and higher circulation, but this is not a hard and fast rule. This may be because the larger libraries are in more heavily populated areas and have a larger tax base to provide funding, but the reasons cannot be inferred from the data presented here; that would require more data.