Revealing Hidden Networks: Mapping Black Mathematicians Across Archival Collections

Nathan Alexander, PhD

Department of Curriculum and Instruction

Program in Applied Data Science and Analytics

Howard University

nathan.alexander@howard.edu

Computational curriculum collective and community-centered teaching and learning lab

The Connections Project

AI for Recovering Networks from Fragmented Archives

Abstract

Archival collections offer important information about the intellectual and social networks of historical actors. Often, however, relationships are “hidden in plain sight” in archival materials due to their fragmented structures. In our individual investigations of Black mathematicians, our team has found important and valuable connections between different actors across time periods that extend and challenge how we think about the development of their lives and work in the mathematical sciences.

This project is joint work with Drs. John Stigall (Howard University, Philosophy), Robin Wilson (Loyola Marymount University), Terika Harris (Columbia University), Erica Walker (University of Toronto), and Edray Goins (Pomona College).

Dr. Erica Walker, expert on Black mathematicians

Motivation for this work

Dr. Euphemia Lofton Haynes

Dr. Kelly Miller

Dr. Kelly Miller, Dean at Howard University

Sigma Pi Phi, 1911 initiates

Sigma Pi Phi, 1911 initiates - Kelly Miller and Carter G. Woodson

American Negro Academy, circa 1910

Research goals

For this project, we are expanding our existing research using named entity recognition (NER) and social network analysis to develop a set of open source computational records and data tools that can map the network structure of actors across multiple archival collections.

How do archival fragments reveal overlooked relationships among Black mathematicians, and what do those relationships show about the historical development of their intellectual and social worlds?

Overview

  • Density. Archival collections contain dense information about the intellectual and social networks of historical actors, but few individual teams, as opposed to institutions, are able to map the full mass of archival content.

  • Fragmentation. Many organizations exist for archival research; however, there are still issues of coordination. As a result, many relationships remain “hidden in plain sight” because they are scattered across fragmented, multimodal sources.

  • Explicit networks. Co-authored publications, conference proceedings, and formal organizational rosters capture only a narrow share of networks.

  • Hidden networks. Many unseen relationships, such as mentorship, informal collaborations, and social infrastructures, shape intellectual life.

Timeline extraction

To support our knowledge of the intellectual worlds of Black mathematicians, we develop detailed timelines tracking their lives.

We begin with archival collections using methods from Black archival practice (Okechukwu, 2022; Prosper, 2024; Sutherland & Collier, 2022).

  • Care as Stewardship: Reimagining archival labor as an “ethic of care”

  • Refusal and Fugitivity: The “refusal” to conform to traditional, often harmful, archival standards

  • Reparative Work and Re-membering: Using the archive to repair the fragmentation caused by slavery and systemic oppression

  • Embodied and Living Archives: Recognizing that memory exists not just in paper records, but in bodies and oral traditions

  • Community-Centered Approach: Shifting power back to the community to determine what is documented and how

Timeline extraction

Timeline of Miller and Haynes and Howard University

Data structure

Our work begins with a formal modeling process of the data to be stored.

A relational database of metadata and extracted network links is built on set theory, relations, and relational algebra. In this framework, each table is a relation, each row is a tuple, and each query is an operation that returns new relations.

Core objects

Let \(D_1, D_2, \ldots, D_n\) be domains, such as names, dates, or identifiers. An \(n\)-ary relation \(R\) is a subset of the Cartesian product \(D_1 \times D_2 \times \cdots \times D_n\), and each tuple in \(R\) is one record in the database.

A database schema specifies the attributes and their domains, while the database instance is the current set of tuples stored at a given time.
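The definition above can be made concrete with a minimal sketch in Python. The two small domains and the attribute names `person` and `institution` are assumptions chosen for illustration; the single tuple reflects the caption identifying Kelly Miller as Dean at Howard University.

```python
from itertools import product

# Illustrative domains only; a real database would have far larger ones.
D_person = {"Kelly Miller", "Euphemia Lofton Haynes"}
D_institution = {"Howard University"}

# Schema: the attribute names (hypothetical) and the domain of each attribute.
schema = {"person": D_person, "institution": D_institution}

# The Cartesian product D_person x D_institution: every possible pairing.
cartesian = set(product(D_person, D_institution))

# Instance: the tuples stored right now. A binary relation is any subset
# of the Cartesian product, and each tuple is one record.
affiliated_with = {("Kelly Miller", "Howard University")}

# Check that the instance really is a relation over the schema's domains.
is_relation = affiliated_with <= cartesian
```

The schema stays fixed while the instance changes as records are added, which is the distinction drawn above.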

Kelly Miller Network-Organization Database

Metadata as relations

Our metadata records describe the database itself: tables, columns, data types, keys, constraints, and indexes. We use these records in our NER analyses.
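As one way to see metadata stored as relations, the sketch below uses Python's built-in `sqlite3` module. The `person` and `letter` tables and their columns are hypothetical and are not the project's actual schema. SQLite keeps its own catalog in the `sqlite_master` table, and `PRAGMA table_info` returns one row per column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema, for illustration only.
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE letter (
    letter_id    INTEGER PRIMARY KEY,
    sender_id    INTEGER NOT NULL REFERENCES person(person_id),
    recipient_id INTEGER NOT NULL REFERENCES person(person_id),
    sent_date    TEXT
);
""")

# The catalog is itself a relation: one tuple per table.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
# Keep name, type, the NOT NULL flag, and the primary-key flag.
columns = [
    (row[1], row[2], bool(row[3]), bool(row[5]))
    for row in conn.execute("PRAGMA table_info(letter)")
]
```

Because the catalog can be queried like any other table, the same relational operations apply to the metadata and to the archival records.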

NER and metadata interactions

In archival and network research, this metadata often becomes the structured representation of entities and sources, such as person, document, date, collection, and tie type.

If the archive is fragmented, this structure helps preserve provenance while making missingness and partial overlap explicit.
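A small sketch of how such a structure might keep provenance, missingness, and overlap visible. The `Tie` record, the collection labels, and the document identifiers are all hypothetical placeholders and do not describe real holdings; only the Miller to Du Bois correspondence tie is taken from the material shown in this talk.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Tie:
    source_person: str
    target_person: str
    tie_type: str
    collection: str              # provenance: where the evidence was found
    document_id: str             # provenance: which item supports the tie
    date: Optional[str] = None   # None means "not recorded", never a guess

# Placeholder records: collection names and document IDs are invented.
ties = [
    Tie("Kelly Miller", "W. E. B. Du Bois", "correspondence", "collection_A", "doc_001"),
    Tie("Kelly Miller", "W. E. B. Du Bois", "correspondence", "collection_B", "doc_017"),
]

# Missingness is explicit: list the ties that have no recorded date.
missing_dates = [t for t in ties if t.date is None]

# Partial overlap is explicit: find pairs attested in more than one collection.
collections_per_pair = {}
for t in ties:
    pair = (t.source_person, t.target_person)
    collections_per_pair.setdefault(pair, set()).add(t.collection)
overlap = {pair: cols for pair, cols in collections_per_pair.items() if len(cols) > 1}
```

Storing an explicit `None` rather than an estimated date keeps the record honest about what the fragment does and does not say.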

Relational algebra

The basic operations are:

  • Selection \(\sigma\): filters rows by a predicate.
  • Projection \(\pi\): keeps selected columns.
  • Join \(\bowtie\): combines tables on shared keys.
  • Union \(\cup\): merges compatible relations.
  • Difference \(-\): keeps the tuples of one relation that do not appear in another.
  • Intersection \(\cap\): keeps common tuples.

These operations are closed over relations, meaning the output of each operation is again a relation.
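The operations listed above can be sketched in plain Python by treating a relation as a set of hashable tuples. The helper names (`row`, `select`, `project`, `natural_join`) and the attribute names are assumptions made for this illustration. The toy instance is deliberately incomplete and draws only on names that appear in this talk.

```python
def row(**attrs):
    """Build one hashable tuple (record) from attribute=value pairs."""
    return frozenset(attrs.items())

def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(dict(t))}

def project(relation, attrs):
    """Projection: keep only the named attributes; duplicates collapse."""
    return {frozenset((k, v) for k, v in t if k in attrs) for t in relation}

def natural_join(r, s):
    """Natural join: merge tuples that agree on every shared attribute."""
    out = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            if all(dt[k] == du[k] for k in dt.keys() & du.keys()):
                out.add(frozenset({**dt, **du}.items()))
    return out

# Toy instance (incomplete by design, not a historical claim of absence).
membership = {
    row(person="Kelly Miller", org="Sigma Pi Phi"),
    row(person="Carter G. Woodson", org="Sigma Pi Phi"),
}
employment = {row(person="Kelly Miller", employer="Howard University")}

initiates = select(membership, lambda t: t["org"] == "Sigma Pi Phi")
joined = natural_join(membership, employment)

# Union, intersection, and difference are Python's own set operators.
members = project(membership, {"person"})
staff = project(employment, {"person"})
either, both, members_only = members | staff, members & staff, members - staff
```

Every result is again a set of tuples, so operations can be chained freely, which is the closure property in practice.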

Example

Some of the archival correspondence has information about the sender, recipient, date, and context. A join can connect those records to person authority files, a projection can keep only sender and recipient, and the resulting edge list can be turned into a graph for centrality and broader community analysis.
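The join and projection described here might look like the following sketch, again using `sqlite3`. Table and column names are hypothetical. The single row stands in for a letter from Kelly Miller to W. E. B. Du Bois, and the date and context are left as NULL rather than filled in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical authority file (person) and correspondence table (letter).
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE letter (
    letter_id    INTEGER PRIMARY KEY,
    sender_id    INTEGER REFERENCES person(person_id),
    recipient_id INTEGER REFERENCES person(person_id),
    sent_date    TEXT,
    context      TEXT
);
INSERT INTO person VALUES (1, 'Kelly Miller'), (2, 'W. E. B. Du Bois');
INSERT INTO letter VALUES (1, 1, 2, NULL, NULL);
""")

# Join letters to the authority file twice (once per role), then project
# onto the two name columns to obtain a directed edge list.
edge_list = conn.execute("""
    SELECT s.name AS sender, r.name AS recipient
    FROM letter AS l
    JOIN person AS s ON s.person_id = l.sender_id
    JOIN person AS r ON r.person_id = l.recipient_id
    ORDER BY l.letter_id
""").fetchall()
```

Resolving names through the authority file, rather than storing them in each letter record, means that spelling variants of one person map to a single node in the graph.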

Letter from Kelly Miller to W. E. B. Du Bois

From tables to networks

A network can be derived from a relational table by constructing an adjacency matrix \(A\), where

\[ A_{ij} = \begin{cases} 1, & \text{if a link exists from } i \text{ to } j \\ 0, & \text{otherwise} \end{cases} \]

Weighted networks replace the 0/1 value with counts or strengths; in directed networks \(A_{ij}\) and \(A_{ji}\) may differ, while undirected networks have a symmetric \(A\). This gives a pipeline:

\[ \text{metadata} \rightarrow \text{relation} \rightarrow \text{relational operations} \rightarrow \text{edge list or matrix} \rightarrow \text{network analysis} \]
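The last two steps of the pipeline, from edge list to matrix to a simple network measure, can be sketched without any external library. The node labels `a`, `b`, and `c` are placeholders and not historical actors. Out-degree stands in here for the fuller centrality measures a network package would provide.

```python
# Placeholder directed edge list; the repeated pair represents two records.
edges = [("a", "b"), ("a", "b"), ("b", "c")]

# Fix an ordering of the nodes so each one maps to a row/column index.
nodes = sorted({n for edge in edges for n in edge})
index = {n: i for i, n in enumerate(nodes)}
size = len(nodes)

# A is the binary adjacency matrix; W is its weighted counterpart.
A = [[0] * size for _ in range(size)]
W = [[0] * size for _ in range(size)]
for source, target in edges:
    i, j = index[source], index[target]
    A[i][j] = 1      # a link exists from i to j
    W[i][j] += 1     # count how many records support the link

# Out-degree: the number of distinct nodes each node links to.
out_degree = {n: sum(A[index[n]]) for n in nodes}
```

Keeping both `A` and `W` separates the question of whether a tie exists from how much archival evidence supports it.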

Interpretation

Mathematically, our databases are not just for storage. They are a formal system for representing entities as sets of tuples and relationships as relations, then using algebraic operations to derive the networks we analyze.

Thank you!

We welcome your suggestions!