Print vs. Electronic Materials in U.S. Public Libraries
Tara Reynolds & Nicholas Meister
2025-11-12
Project Overview
Title: Collection Development Priorities: Print vs. Electronic Materials
Team:
Tara Reynolds (Data Visualization)
Nicholas Meister (Data Wrangling & Analysis)
Course: LIS 4210 Data Visualization, Fall 2025
Date: November 12, 2025
Dataset
Source: Institute of Museum and Library Services (IMLS)
Public Libraries Survey (PLS) FY2022
Size: 9,248 library systems across all 50 states + DC + territories
Variables: 192 fields including expenditures, collections, services, and demographics
Key Variables:
PRMATEXP - Print materials expenditure
ELMATEXP - Electronic materials expenditure
OTHMATEX - Other materials expenditure (audio, video, etc.)
POPU_LSA - Population of legal service area
LOCALE_ADD - Community type (urban/suburban/town/rural)
STABR - State abbreviation
Audience & Research Questions
Audience:
Collection development librarians making purchasing decisions
Library directors planning budgets and strategic priorities
State library agencies setting policy and allocating grants
Publishers and vendors understanding market trends
LIS educators preparing future librarians
Research Questions:
How do materials budget priorities vary across community types (urban vs. rural)?
Which states lead in electronic materials investment?
What is the relationship between library size, location, and digital collection strategy?
Data Wrangling: Loading & Initial Setup
Challenge 1: Loading 9,000+ observations with 192 variables
# Load essential packages
library(tidyverse)  # Data manipulation and visualization
library(janitor)    # Clean variable names
library(scales)     # Format numbers and currency

# Load data with strategic parameters
pls_2022_raw <- read_csv("data/PLS_FY22_AE_pud22i.csv",
                         guess_max = 50000,       # Examine more rows
                         show_col_types = FALSE)  # Suppress messages

# Standardize variable names
pls_2022_raw <- pls_2022_raw %>%
  clean_names()  # Converts to lowercase_with_underscores
Why guess_max = 50000?
By default, read_csv() examines only 1,000 rows to guess each column's type. Because special codes appear throughout the dataset, a guess based on so few rows can assign the wrong type to a column. Setting guess_max above the file's 9,248 rows means every row informs the guess.
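To see why the setting matters, the sketch below writes a small throwaway file in which a text code appears only in the final row, then reads it with guess_max large enough to cover every row. The file contents and the code "M" are invented for illustration and are not taken from the PLS file.

```r
library(readr)

# Hypothetical toy file: 1,500 numeric-looking rows, then one text code at the end
toy_path <- tempfile(fileext = ".csv")
writeLines(c("value", rep("1", 1500), "M"), toy_path)

# guess_max exceeds the row count, so every row informs the type guess
toy <- read_csv(toy_path, guess_max = 5000, show_col_types = FALSE)

class(toy$value)  # read as text, so the late code is kept rather than lost
nrow(toy)
```

With a smaller guess_max, the guess can rest on rows that all look numeric, in which case the late code would surface as a parsing problem. Calling problems() on the result is one way to check for that.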
Data Wrangling: Handling Missing Data Codes
Challenge 2: IMLS uses negative codes instead of NA
-1 = Not applicable
-3 = Suppressed for confidentiality
-4 = Not available
pls_2022 <- pls_2022_raw %>%
  # Convert ALL negative values to proper NA across ALL numeric columns
  mutate(across(where(is.numeric), ~ ifelse(.x < 0, NA_real_, .x)))
Breaking down the code:
mutate() - Modify columns
across() - Apply function to multiple columns
where(is.numeric) - Select only numeric columns
~ ifelse(.x < 0, NA_real_, .x) - If negative, replace with NA; otherwise keep original
Result: One line cleans 100+ numeric variables systematically
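A small made-up example shows the pattern at work, along with one caveat: the rule also blanks columns where negative numbers are legitimate, such as longitudes in the western hemisphere. The rows below are invented, and "longitud" is only an assumed name for a coordinate column; check the cleaned names in the real file before excluding anything.

```r
library(dplyr)

# Made-up rows; "longitud" is an assumed name for a coordinate column
toy <- tibble(
  prmatexp = c(1000, -1, 2500),
  elmatexp = c(-3, 400, 0),
  longitud = c(-93.2, -71.1, -118.4)
)

# Same pattern as above: every negative number becomes NA
all_cleaned <- toy %>%
  mutate(across(where(is.numeric), ~ ifelse(.x < 0, NA_real_, .x)))

# Variant that leaves the coordinate column untouched
guarded <- toy %>%
  mutate(across(where(is.numeric) & !any_of("longitud"),
                ~ ifelse(.x < 0, NA_real_, .x)))

all_cleaned  # longitud is entirely NA here
guarded      # longitud keeps its negative values
```

any_of() ignores names that are absent, so the guarded variant still runs if the file has no such column.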
Data Wrangling: Creating Calculated Variables
Challenge 3: Computing percentages and per-capita metrics
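The code for this step is not shown here, so the following is only a minimal sketch of how such metrics might be computed from the key variables. The derived column names (total_matexp, pct_print, pct_electronic, matexp_per_capita) are hypothetical, and the rows are invented.

```r
library(dplyr)

# Invented rows using the cleaned (lowercase) variable names
toy <- tibble(
  prmatexp = c(60000, 8000, NA),
  elmatexp = c(30000, 1500, 500),
  othmatex = c(10000, 500, 100),
  popu_lsa = c(50000, 2000, 0)
)

metrics <- toy %>%
  mutate(
    # Total is NA whenever any component is NA
    total_matexp      = prmatexp + elmatexp + othmatex,
    pct_print         = prmatexp / total_matexp * 100,
    pct_electronic    = elmatexp / total_matexp * 100,
    # Guard against a zero service-area population
    matexp_per_capita = if_else(popu_lsa > 0, total_matexp / popu_lsa, NA_real_)
  )

metrics
```

Letting a missing component propagate to the total keeps partial budgets from being mistaken for complete ones, at the cost of dropping those libraries from percentage-based charts.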
Key Insight: Urban and suburban libraries allocate a significantly larger share of their materials budgets to electronic resources than rural libraries do, but every community type still spends the majority of its budget on print.
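A comparison like this could be produced with a grouped summary. The sketch below assumes locale_add holds two-digit locale codes whose first digit marks the community type (1 = urban, 2 = suburban, 3 = town, 4 = rural). That coding is an assumption to verify against the data documentation. The rows are invented, and for brevity the share is computed from print and electronic spending only.

```r
library(dplyr)

# Invented rows; the locale codes follow the assumed first-digit scheme
toy <- tibble(
  locale_add = c(11, 13, 21, 32, 41, 43),
  prmatexp   = c(700, 600, 650, 800, 900, 850),
  elmatexp   = c(300, 400, 350, 200, 100, 150)
)

by_locale <- toy %>%
  mutate(
    community = case_when(
      locale_add %/% 10 == 1 ~ "urban",
      locale_add %/% 10 == 2 ~ "suburban",
      locale_add %/% 10 == 3 ~ "town",
      locale_add %/% 10 == 4 ~ "rural"
    ),
    pct_electronic = elmatexp / (prmatexp + elmatexp) * 100
  ) %>%
  group_by(community) %>%
  summarise(
    median_pct_electronic = median(pct_electronic, na.rm = TRUE),
    n_libraries = n()
  )

by_locale
```

The median is used here instead of the mean so that a few very large systems do not dominate each group's figure.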
Top States in Electronic Investment
Key Insight: Leading states show significantly higher electronic allocation, with geographic clustering suggesting regional consortia effects.
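One way to build such a ranking is to pool spending within each state and sort by electronic share. This is a sketch on invented rows with placeholder state codes ("AA", "BB", "CC"), not the computation behind the chart.

```r
library(dplyr)

# Invented rows; the state codes are placeholders, not real states
toy <- tibble(
  stabr    = c("AA", "AA", "BB", "BB", "CC"),
  prmatexp = c(600, 400, 900, 700, 600),
  elmatexp = c(400, 600, 100, 300, 400)
)

top_states <- toy %>%
  group_by(stabr) %>%
  summarise(
    print_total      = sum(prmatexp, na.rm = TRUE),
    electronic_total = sum(elmatexp, na.rm = TRUE)
  ) %>%
  mutate(pct_electronic = electronic_total / (print_total + electronic_total) * 100) %>%
  arrange(desc(pct_electronic)) %>%
  slice_head(n = 2)  # keep the two highest-ranked states

top_states
```

Pooling dollars weights each state's figure toward its largest systems. Averaging library-level percentages instead would give every library equal weight and can produce a different ranking.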
The Balance: Print vs. Electronic
Key Insight: Most libraries cluster below the diagonal (print-dominant), but large urban systems approach or exceed parity.
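The diagonal in this kind of chart is the parity line where electronic spending equals print spending. The sketch below assumes print is on the x-axis and electronic on the y-axis, so points below the line are print-dominant. The data are invented and the "side" column is a hypothetical helper.

```r
library(dplyr)
library(ggplot2)

# Invented rows spanning small to very large budgets
toy <- tibble(
  prmatexp = c(5000, 20000, 150000, 900000),
  elmatexp = c(500, 6000, 90000, 1100000)
)

# Label each library by which side of the parity line it falls on
labelled <- toy %>%
  mutate(side = if_else(elmatexp > prmatexp,
                        "electronic-dominant", "print-dominant"))

parity_plot <- ggplot(labelled, aes(x = prmatexp, y = elmatexp, colour = side)) +
  geom_point() +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed") +  # y = x parity line
  scale_x_log10() +
  scale_y_log10() +
  labs(x = "Print materials expenditure",
       y = "Electronic materials expenditure",
       colour = NULL)

parity_plot
```

Log scales keep small and large systems visible in the same panel. Libraries reporting zero spending on either axis cannot be placed on a log scale and would need separate handling.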
Per Capita Spending Patterns
Unexpected Finding: Town libraries show relatively high per-capita materials spending, possibly due to smaller populations with stable tax bases.
State-Level Comparison
Key Insight: Among the largest library systems, the share of the materials budget allocated to electronic resources varies widely from state to state.
The Story: Strategic Tensions in Collection Development
What We Found:
Universal Print Majority: Even leading states allocate 60-70% to print materials
Urban-Rural Divide: Urban libraries invest significantly more in electronic resources
Scale Matters: Large libraries can afford balanced portfolios; small libraries face either/or choices
State Variation: Significant differences in strategic priorities across states
So What?
For Librarians: Strategy must align with both user needs AND community type realities
For Administrators: Electronic transition requires sustained multi-year investment
For States: Geographic disparities suggest need for consortial purchasing power
For Vendors: Market segmentation by size and locale is essential
For Policy: Digital equity requires addressing content licensing costs, not just broadband