Install necessary packages:
install.packages("devtools", dependencies = TRUE, repos = "http://lib.stat.cmu.edu/R/CRAN/")
## Warning: dependencies 'BiocInstaller', 'lintr (>= 0.2.1)' are not available
## package 'devtools' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\Karen\AppData\Local\Temp\RtmpsZCXQ7\downloaded_packages
install.packages("ggplot2", dependencies = TRUE, repos = "http://lib.stat.cmu.edu/R/CRAN/")
## package 'ggplot2' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\Karen\AppData\Local\Temp\RtmpsZCXQ7\downloaded_packages
Load RCurl so the CSV files can be retrieved, and load ggplot2 for plotting
library(RCurl)
## Loading required package: bitops
library(ggplot2)
Retrieve the data files from the GitHub repository
ten_eleven_comp <- getURL("https://raw.githubusercontent.com/karenweigandt/cuny-summer-bridge/master/Libraries_Annual_Statistics_Comparison_2010-2011.csv")
eleven_coll_stats <- getURL("https://raw.githubusercontent.com/karenweigandt/cuny-summer-bridge/master/Libraries_Collection_Statistics_2011_2.csv")
five_twelve_mat_inv <- getURL("https://raw.githubusercontent.com/karenweigandt/cuny-summer-bridge/master/Libraries_Material_Inventory_2005_-_2012.csv")
Read the data files into data frames
ten_eleven_comp_csv <- read.csv(text = ten_eleven_comp, header = TRUE, stringsAsFactors = FALSE)
eleven_coll_stats_csv <- read.csv(text = eleven_coll_stats, header = TRUE, stringsAsFactors = FALSE)
five_twelve_mat_inv_csv <- read.csv(text = five_twelve_mat_inv, header = TRUE, stringsAsFactors = FALSE)
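getURL() hands back each file as a single character string, which is why read.csv() is called with text = rather than a file name. A minimal sketch of that step, using a made-up two-row string (csv_text and sample_csv are hypothetical names, not part of the analysis):

```r
# Hypothetical two-row sample standing in for the string that getURL() returns.
csv_text <- "Island,LIBRARY,BOOK\nOahu,Aiea,68590\nOahu,Kahuku,43377\n"

# text = makes read.csv parse the string itself instead of treating it as a file path.
sample_csv <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

# Character columns stay character because stringsAsFactors = FALSE.
str(sample_csv)
```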
Look at first six rows of data
head(ten_eleven_comp_csv)
## Island Library Code Holdings.FY.2010 Holdings.FY.2011
## 1 Oahu Aiea OAA 82718 78370
## 2 Oahu Aina Haina 1/ OAH 70180 66364
## 3 Oahu Ewa Beach P/S Lib. 2/ OEW 91196 80317
## 4 Oahu Hawaii Kai 3/ OHK 79327 78925
## 5 Oahu HSL 4/ - 6/ OHS 571114 571271
## 6 Oahu Kahuku Total 49650 49521
## Percent.Difference.Holdings Registered.Borrowers.FY.2010
## 1 (5) 23053
## 2 (5) 15374
## 3 (12) 20928
## 4 (1) 18919
## 5 0 89591
## 6 (0) 8977
## Registered.Borrowers.FY.2011 Percent.Difference.Borrowers
## 1 23709 3
## 2 15901 3
## 3 21057 1
## 4 19468 3
## 5 90724 1
## 6 9223 3
## Circulation.Activity.FY.2010 Circulation.Activity.FY.2011
## 1 176469 170657
## 2 162701 161298
## 3 91858 98047
## 4 156970 148851
## 5 467636 440722
## 6 55318 55914
## Percent.Difference.Circulation Holds.FY.2010 Holds.FY.2011
## 1 (3) 2869 3095
## 2 (1) 2665 2788
## 3 7 2006 2316
## 4 (5) 3807 4197
## 5 (6) 11387 10649
## 6 1 885 1125
## Percent.Difference.Holds
## 1 8
## 2 5
## 3 15
## 4 10
## 5 (6)
## 6 27
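The percent-difference columns use accounting notation, so "(5)" means -5 and the column is read as text. If those columns were needed for arithmetic, one possible conversion is sketched below. parse_accounting is a hypothetical helper, and the sketch assumes every value is either plain digits or digits wrapped in parentheses.

```r
# Values copied from the Percent.Difference.Holdings column shown above.
pct_raw <- c("(5)", "(5)", "(12)", "(1)", "0", "(0)")

# Hypothetical helper: "(12)" becomes -12, plain digits pass through unchanged.
parse_accounting <- function(x) {
  negative <- grepl("^\\(.*\\)$", x)          # TRUE where the value is parenthesized
  value <- as.numeric(gsub("[()]", "", x))    # strip the parentheses, then convert
  ifelse(negative, -value, value)
}

parse_accounting(pct_raw)
```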
head(eleven_coll_stats_csv)
##   Island      LIBRARY REFERENCE   BOOK    CD  DVD VIDEO PHONOTAPE
## 1   Oahu         Aiea      1381  68590  3408 4145    71       679
## 2   Oahu   Aina Haina      1322  59650  2843 1967    24       532
## 3   Oahu    Ewa Beach      2987  74316  1153 1399     5       470
## 4   Oahu   Hawaii Kai      3464  70517  1715 1989   198      1007
## 5   Oahu Hawaii State    121403 426129 11325 6056  1674      3151
## 6   Oahu       Kahuku      2950  43377  1102 1135   272       671
##   PHONODISC MICROFORM CD.ROM SOFTWARE AV..MISC. LANGUAGE.LEARNING
## 1         0         0      0        0         1                55
## 2         0         0      2        0         8                48
## 3         0         0      3        0         5                22
## 4         0         0      6        0         1                15
## 5       338       290    223        0         0               648
## 6         0         0      1        0         0                12
##   UNCATALOGED  TOTAL
## 1          18  78348
## 2          39  66435
## 3          14  80374
## 4          13  78925
## 5          18 571255
## 6           2  49522
head(five_twelve_mat_inv_csv)
##   Year ISLAND       LIBRARY           INVENTORY.DESCRIPTOR  ITEMS
## 1 2005 Hawaii Bond Memorial        Library Materials Books  15485
## 2 2005 Hawaii Bond Memorial Library Materials Compact Disc    302
## 3 2005 Hawaii Bond Memorial          Library Materials DVD    411
## 4 2005 Hawaii Bond Memorial    Library Materials Phonotape    196
## 5 2005 Hawaii Bond Memorial    Library Materials Videotape    521
## 6 2005 Hawaii          Hilo        Library Materials Books 206026
##     SUM.PRICE
## 1  $238457.06
## 2    $6041.87
## 3   $10058.39
## 4    $4173.71
## 5   $10028.23
## 6 $3313672.62
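SUM.PRICE is read as text because of the leading dollar sign, so it cannot be summed or averaged as is. A minimal sketch of one way to convert it, using three of the values shown above (price_raw and price_num are hypothetical names; stripping commas as well is a precaution, not something the output above shows):

```r
# Three SUM.PRICE values copied from the head() output above.
price_raw <- c("$238457.06", "$6041.87", "$10058.39")

# Remove dollar signs (and any thousands separators), then convert to numeric.
price_num <- as.numeric(gsub("[$,]", "", price_raw))

# The converted values now support arithmetic.
sum(price_num)
```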
Subset material inventory data to only include 2011
eleven_mat_inv <- subset(five_twelve_mat_inv_csv, Year == 2011)
#head(eleven_mat_inv)
Subset 2011 material inventory to books on the island of Oahu
oahu_eleven_book_inv <- subset(eleven_mat_inv, ISLAND == 'Oahu' & INVENTORY.DESCRIPTOR == 'Library Materials Books')
#oahu_eleven_book_inv
Histogram for Books in Hawaii libraries
options(scipen = 3)
qplot(eleven_coll_stats_csv$BOOK,
geom = "histogram",
binwidth = 25000,
col = I("red"),
fill = I("blue"),
main = "Books in Hawaii Libraries",
xlab = "Number of Books",
ylab = "Number of Libraries"
)
Basic Statistics for number of books in Hawaii libraries
sprintf("Average number of books in Hawaii libraries is %s", round(mean(eleven_coll_stats_csv$BOOK)))
## [1] "Average number of books in Hawaii libraries is 60001"
sprintf("Standard deviation of the number of books in Hawaii libraries is %s", round(sd(eleven_coll_stats_csv$BOOK)))
## [1] "Standard deviation of the number of books in Hawaii libraries is 64245"
sprintf("Median number of books in Hawaii libraries is %s", round(median(eleven_coll_stats_csv$BOOK)))
## [1] "Median number of books in Hawaii libraries is 43377"
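The mean (60001) sits well above the median (43377), which is what a right-skewed distribution with a few very large libraries would produce. The sketch below illustrates the effect on only the six BOOK values printed by head() earlier, so its numbers will not match the full-data results. The 400000 cutoff is an arbitrary illustrative threshold.

```r
# BOOK values for the six libraries shown by head(eleven_coll_stats_csv).
books <- c(68590, 59650, 74316, 70517, 426129, 43377)

# The single very large library pulls the mean far above the median.
c(mean = mean(books), median = median(books), sd = sd(books))

# Dropping that one library (illustrative cutoff) brings the mean back near the median.
mean(books[books < 400000])
```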
Quantile for the number of books in Hawaii libraries
qb <- quantile(eleven_coll_stats_csv$BOOK)
qb
## 0% 25% 50% 75% 100%
## 433.0 27572.5 43377.0 71616.0 426129.0
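These quartiles also indicate roughly where a default boxplot starts flagging outliers: points more than 1.5 times the interquartile range beyond the quartiles. A worked sketch using the quartiles reported above follows. Note that boxplot() computes its hinges slightly differently from quantile(), so the exact cutoffs it uses may differ a little.

```r
# Quartiles copied from the quantile() output above.
q1 <- 27572.5
q3 <- 71616.0

# Interquartile range and the conventional 1.5 * IQR fences.
iqr <- q3 - q1
lower_fence <- q1 - 1.5 * iqr
upper_fence <- q3 + 1.5 * iqr

c(iqr = iqr, lower = lower_fence, upper = upper_fence)
```

Under this rule, any library holding more than about 137,681 books would be drawn as an outlier. The lower fence is negative, so no library can fall below it.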
Boxplot for number of books in Hawaii libraries
options(scipen = 3)
boxplot(eleven_coll_stats_csv$BOOK, main = "Boxplot: Books in Hawaii Libraries")
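Keep only the island, library code, and fiscal year 2011 holdings, borrowers, and circulation columns
eleven_patron_circ <- ten_eleven_comp_csv[c(-2, -4, -6, -7, -9, -10, -12, -13, -14, -15)]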
#head(eleven_patron_circ)
eleven_oahu_patron_circ <- subset(eleven_patron_circ, Island == 'Oahu')
#eleven_oahu_patron_circ
eleven_kauai_patron_circ <- subset(eleven_patron_circ, Island == 'Kauai')
#head(eleven_kauai_patron_circ)
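Dropping columns by negative position works, but it breaks silently if the column order of the CSV ever changes. Selecting by name is an alternative; the sketch below checks that both approaches keep the same five columns. demo is a hypothetical one-row stand-in built from the first row of the head() output, with the same 15 column names.

```r
# Hypothetical one-row stand-in with the same column names as ten_eleven_comp_csv.
demo <- data.frame(
  Island = "Oahu", Library = "Aiea", Code = "OAA",
  Holdings.FY.2010 = 82718, Holdings.FY.2011 = 78370,
  Percent.Difference.Holdings = "(5)",
  Registered.Borrowers.FY.2010 = 23053, Registered.Borrowers.FY.2011 = 23709,
  Percent.Difference.Borrowers = "3",
  Circulation.Activity.FY.2010 = 176469, Circulation.Activity.FY.2011 = 170657,
  Percent.Difference.Circulation = "(3)",
  Holds.FY.2010 = 2869, Holds.FY.2011 = 3095,
  Percent.Difference.Holds = "8",
  stringsAsFactors = FALSE
)

# Columns to keep, spelled out by name.
keep <- c("Island", "Code", "Holdings.FY.2011",
          "Registered.Borrowers.FY.2011", "Circulation.Activity.FY.2011")
by_name <- demo[keep]

# The positional version used in the analysis.
by_position <- demo[c(-2, -4, -6, -7, -9, -10, -12, -13, -14, -15)]

# Compare the column names produced by the two approaches.
identical(names(by_name), names(by_position))
```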
Plot holdings vs. books
qplot(TOTAL, BOOK,
data = eleven_coll_stats_csv,
geom = "point",
main = "Year 2011: Holdings vs. Number of Books for All Hawaii Libraries",
xlab = "Number of Holdings",
ylab = "Number of Books"
)
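To put a number on how closely BOOK tracks TOTAL, a correlation and a simple linear fit can be computed. The sketch below uses only the six rows printed by head() earlier, so it is an illustration rather than the result for all libraries. With one very large library in the sample, that single point dominates both figures.

```r
# TOTAL and BOOK for the six libraries shown by head(eleven_coll_stats_csv).
coll <- data.frame(
  TOTAL = c(78348, 66435, 80374, 78925, 571255, 49522),
  BOOK  = c(68590, 59650, 74316, 70517, 426129, 43377)
)

# Pearson correlation between total holdings and books.
r <- cor(coll$TOTAL, coll$BOOK)

# Least-squares line: the TOTAL coefficient estimates books per additional holding.
fit <- lm(BOOK ~ TOTAL, data = coll)

r
coef(fit)
```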
Display a scatterplot of Holdings vs. Circulation for Hawaii libraries for 2011
qplot(Holdings.FY.2011, Circulation.Activity.FY.2011,
data = eleven_patron_circ,
geom = "point",
main = "Year 2011: Holdings vs. Circulation in Hawaii Libraries",
xlab = "Number of Holdings",
ylab = "Circulation Volume"
)
Display a scatterplot of Holdings vs. Circulation for Oahu libraries for 2011
qplot(Holdings.FY.2011, Circulation.Activity.FY.2011,
data = eleven_oahu_patron_circ,
geom = "point",
main = "Year 2011: Holdings vs. Circulation in Oahu Libraries",
xlab = "Number of Holdings",
ylab = "Circulation Volume"
)
Display a scatterplot of Borrowers vs. Circulation for Oahu libraries for 2011
options(scipen = 3)
qplot(Registered.Borrowers.FY.2011, Circulation.Activity.FY.2011,
data = eleven_oahu_patron_circ,
geom = "point",
main = "Year 2011: Borrowers vs. Circulation in Oahu Libraries",
xlab = "Number of Borrowers",
ylab = "Circulation Volume"
)
Display a scatterplot of Borrowers vs. Holdings for Oahu libraries for 2011
options(scipen = 3)
qplot(Registered.Borrowers.FY.2011, Holdings.FY.2011,
data = eleven_oahu_patron_circ,
geom = "point",
main = "Year 2011: Borrowers vs. Holdings in Oahu Libraries",
xlab = "Number of Borrowers",
ylab = "Number of Holdings"
)
Display a scatterplot of Holdings vs. Circulation for Kauai libraries for 2011
qplot(Holdings.FY.2011, Circulation.Activity.FY.2011,
data = eleven_kauai_patron_circ,
geom = "point",
main = "Year 2011: Holdings vs. Circulation in Kauai Libraries",
xlab = "Number of Holdings",
ylab = "Circulation Volume"
)
Display a scatterplot of Borrowers vs. Circulation for Kauai libraries for 2011
options(scipen = 3)
qplot(Registered.Borrowers.FY.2011, Circulation.Activity.FY.2011,
data = eleven_kauai_patron_circ,
geom = "point",
main = "Year 2011: Borrowers vs. Circulation in Kauai Libraries",
xlab = "Number of Borrowers",
ylab = "Circulation Volume"
)
Display a scatterplot of Borrowers vs. Holdings for Kauai libraries for 2011
options(scipen = 3)
qplot(Registered.Borrowers.FY.2011, Holdings.FY.2011,
data = eleven_kauai_patron_circ,
geom = "point",
main = "Year 2011: Borrowers vs. Holdings in Kauai Libraries",
xlab = "Number of Borrowers",
ylab = "Number of Holdings"
)
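The scatterplots above differ only in the data frame, the two columns, and the labels, so they could be produced by one small function. The sketch below uses ggplot() rather than qplot(), which has been deprecated in recent ggplot2 releases. scatter_for and oahu_demo are hypothetical names, and the demo values are the first four Oahu rows from the head() output, not the full subset.

```r
library(ggplot2)

# Hypothetical helper: x and y are column names given as strings.
scatter_for <- function(data, x, y, title, xlab, ylab) {
  ggplot(data, aes(x = .data[[x]], y = .data[[y]])) +
    geom_point() +
    labs(title = title, x = xlab, y = ylab)
}

# Small stand-in for eleven_oahu_patron_circ (first four rows shown by head()).
oahu_demo <- data.frame(
  Holdings.FY.2011 = c(78370, 66364, 80317, 78925),
  Circulation.Activity.FY.2011 = c(170657, 161298, 98047, 148851)
)

p <- scatter_for(oahu_demo,
                 "Holdings.FY.2011", "Circulation.Activity.FY.2011",
                 title = "Year 2011: Holdings vs. Circulation in Oahu Libraries",
                 xlab = "Number of Holdings",
                 ylab = "Circulation Volume")
print(p)
```

Each of the plots above would then be a single call with a different data frame and column pair.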
Conclusion:
The information and visualizations presented here tell us several things about the libraries in Hawaii. First, the histogram shows that the majority of Hawaiian libraries contain fewer than 100,000 books. One library has an unusually large number of holdings, the Big Kahuna of Hawaiian libraries. The boxplot shows 3 data points (libraries) that could be considered outliers. Outlier removal is beyond the scope of this analysis.
The first scatterplot shows the relationship between the number of books and the overall number of holdings for each library. The relationship appears to be linear (proportional), which suggests that any relationship between holdings and other library data would also hold between the number of books and those data.
For instance, the next scatterplots show the relationship between holdings and circulation, first for all of Hawaii and then for the island of Oahu. These appear similar, while the scatterplot of the same data for the island of Kauai exhibits a somewhat different profile. This may be because there are fewer libraries (data points) on Kauai.
We have also plotted number of borrowers vs. circulation and number of borrowers vs. holdings for both Oahu and Kauai. For Kauai, the pattern for circulation is similar whether plotted against number of borrowers or holdings, but the borrowers vs. holdings plot shows that this relationship is not directly proportional. Oahu is where our large outlier library resides. There also appears to be another library with a high ratio of holdings to borrowers.
In general, the libraries with larger holdings (more books) seem to have more borrowers and higher circulation, but this is not a hard and fast rule. This may be because the larger libraries are in more heavily populated areas and have a larger tax base to provide funding, but the reasons cannot be inferred from the data presented here. That would require more data.