Single-cell sequencing is an emerging technology in the field of immunology and oncology that allows researchers to couple RNA quantification and other modalities, like immune cell receptor profiling at the level of an individual cell. A number of workflows and software packages have been created to process and analyze single-cell transcriptomic data. These packages allow users to take the vast dimensionality of the data generated in single-cell-based experiments and distill the data into novel insights. Unlike the transcriptomic field, there is a lack of options for software that allow for single-cell immune receptor profiling. Enabling users to easily combine RNA and immune profiling, the scRepertoire framework supports use of 10x, single-cell clonal formats and interaction with popular R-based single-cell data pipelines.
scRepertoire is designed to take filter contig outputs from the 10x Genomics Cell Ranger pipeline, process that data to assign clonotype based on two TCR or Ig chains and analyze the clonotype dynamics. The latter can be separated into 1) clonotype-only analysis functions, such as unique clonotypes or clonal space quantification, and 2) interaction with mRNA expression data using Seurat or SingleCellExperiment packages.
References:
Prerequisite: Ensure the
All_samples_Merged Seurat object is loaded into your R
environment before running the chunks below. ## Load libraries
# Core
library(Seurat)
library(SeuratObject)
library(SeuratData)
library(patchwork)
library(dplyr)
library(ggplot2)
library(tidyverse)
library(rmarkdown)
library(tinytex)
library(grid)
library(cowplot)
library(presto)
# TCR Analysis
library(scRepertoire)
library(SingleCellExperiment)
library(circlize)
library(scales)
# Tables/exports
library(gt)
library(writexl)
# ---------- Output folders ----------
BASE_OUT <- "TCR_analysis-12-01-2026"
DIRS <- list(
base = BASE_OUT,
tables = file.path(BASE_OUT, "Tables"),
figures = file.path(BASE_OUT, "Figures"),
clonality = file.path(BASE_OUT, "Clonality_Analysis"),
cl_fig = file.path(BASE_OUT, "Clonality_Analysis", "Figures"),
cl_tab = file.path(BASE_OUT, "Clonality_Analysis", "Tables"),
phylo = file.path(BASE_OUT, "Phylogeny"),
objects = file.path(BASE_OUT, "Seurat_Objects")
)
invisible(lapply(DIRS, dir.create, showWarnings = FALSE, recursive = TRUE))
# ---------- knitr: save ALL chunk figures into Figures/knitr-* automatically ----------
knitr::opts_chunk$set(
fig.path = file.path(DIRS$figures, "knitr-"),
dpi = 300,
fig.width = 8,
fig.height = 5,
message = FALSE,
warning = FALSE
)
# Helper to save ggplots with consistent dimensions
save_plot <- function(p, filename, w = 11, h = 6.5, dpi = 300) {
ggsave(filename = filename, plot = p, width = w, height = h, dpi = dpi, limitsize = FALSE)
}
cat("Created output folders under:", BASE_OUT, "\n")
Created output folders under: TCR_analysis-12-01-2026
#Load Seurat Object merged from cell lines and a control(PBMC) after filtration
All_samples_Merged <- readRDS("../../0-Seurat_RDS_Final_OBJECT/All_samples_Merged_with_Renamed_Clusters_final-26-10-2025.rds")
All_samples_Merged
An object of class Seurat
62900 features across 49305 samples within 6 assays
Active assay: RNA (36601 features, 0 variable features)
2 layers present: data, counts
5 other assays present: ADT, prediction.score.celltype.l1, prediction.score.celltype.l2, prediction.score.celltype.l3, SCT
5 dimensional reductions calculated: integrated_dr, ref.umap, pca, umap, harmony
scRepertoire functions using the filtered_contig_annotations.csv output from the 10x Genomics Cell Ranger. This file is located in the ./outs/ directory of the VDJ alignment folder. To generate a list of contigs to use for scRepertoire:
-load the filtered_contig_annotations.csv for each of the samples.
-make a list in the R environment.
# Create a new metadata column for plotting with simplified sample names
All_samples_Merged$display_sample <- All_samples_Merged$orig.ident
# Rename L3_B and L4_B to L3 and L4
All_samples_Merged$display_sample <- gsub("^L3_B$", "L3", All_samples_Merged$display_sample)
All_samples_Merged$display_sample <- gsub("^L4_B$", "L4", All_samples_Merged$display_sample)
# Rename PBMC to CD4T_lab
All_samples_Merged$display_sample <- gsub("^PBMC$", "CD4T_lab", All_samples_Merged$display_sample)
There are varying definitions of clones in the literature. For the purposes of scRepertoire, we define a clone as cells with shared/trackable complementarity-determining region 3 (CDR3) sequences. Within this definition, one might use amino acid (aa) sequences of one or both chains to define a clone. Alternatively, we could use nucleotide (nt) or the V(D)JC genes (genes) to define a clone. The latter, genes, would be a more permissive definition of “clones,” as multiple amino acid or nucleotide sequences can result from the same gene combination. Another option to define a clone is the use of the V(D)JC and nucleotide sequence (strict). scRepertoire allows for the use of all these definitions of clones and enables users to select both or individual chains to examine.
contig_list <- list(L1, L2, L3_B, L4_B, L5, L6, L7)
contig_list_with_PBMC <- list(L1, L2, L3_B, L4_B, L5, L6, L7, PBMC)
combined.TCR <- combineTCR(
contig_list,
samples = c("L1", "L2", "L3", "L4", "L5", "L6", "L7"),
removeNA = FALSE,
removeMulti = FALSE,
filterMulti = FALSE
)
combined.TCR_with_PBMC <- combineTCR(
contig_list_with_PBMC,
samples = c("L1", "L2", "L3", "L4", "L5", "L6", "L7", "CD4T_lab"),
removeNA = FALSE,
removeMulti = FALSE,
filterMulti = FALSE
)
# Ensure output dirs exist
dir.create(BASE_OUT, showWarnings = FALSE, recursive = TRUE)
exportClones(combined.TCR,
write.file = TRUE,
dir = BASE_OUT,
file.name = "clones.csv")
exportClones(combined.TCR_with_PBMC,
write.file = TRUE,
dir = BASE_OUT,
file.name = "clones_with_PBMC.csv")
# Combine all data frames in the list into a single data frame
combined_TCR_df <- do.call(rbind, combined.TCR)
combined_TCR_with_PBMC_df <- do.call(rbind, combined.TCR_with_PBMC)
write.csv(combined_TCR_df,
file = file.path(DIRS$tables, "combined_TCR.csv"),
row.names = FALSE)
write.csv(combined_TCR_with_PBMC_df,
file = file.path(DIRS$tables, "combined_TCR_with_PBMC.csv"),
row.names = FALSE)
head(combined.TCR_with_PBMC[])[1]
$L1
NA
dir.create(DIRS$phylo, showWarnings = FALSE, recursive = TRUE)
# Alpha FASTA
alpha_fasta_file <- file.path(DIRS$phylo, "cdr3_alpha_sequences.fasta")
fa <- file(alpha_fasta_file, open = "w")
for (i in seq_len(nrow(combined_TCR_with_PBMC_df))) {
if (!is.na(combined_TCR_with_PBMC_df$cdr3_nt1[i])) {
writeLines(paste0(">", combined_TCR_with_PBMC_df$sample[i], "_alpha_", i), fa)
writeLines(combined_TCR_with_PBMC_df$cdr3_nt1[i], fa)
}
}
close(fa)
# Beta FASTA
beta_fasta_file <- file.path(DIRS$phylo, "cdr3_beta_sequences.fasta")
fb <- file(beta_fasta_file, open = "w")
for (i in seq_len(nrow(combined_TCR_with_PBMC_df))) {
if (!is.na(combined_TCR_with_PBMC_df$cdr3_nt2[i])) {
writeLines(paste0(">", combined_TCR_with_PBMC_df$sample[i], "_beta_", i), fb)
writeLines(combined_TCR_with_PBMC_df$cdr3_nt2[i], fb)
}
}
close(fb)
cat("Alpha FASTA:", alpha_fasta_file, "\n")
Alpha FASTA: TCR_analysis-12-01-2026/Phylogeny/cdr3_alpha_sequences.fasta
cat("Beta FASTA:", beta_fasta_file, "\n")
Beta FASTA: TCR_analysis-12-01-2026/Phylogeny/cdr3_beta_sequences.fasta
p_cq1 <- clonalQuant(combined.TCR, cloneCall="strict", chain="both", scale=TRUE)
p_cq2 <- clonalQuant(combined.TCR_with_PBMC, cloneCall="strict", chain="both", scale=TRUE)
p_cq3 <- clonalQuant(combined.TCR, cloneCall="strict", chain="both", scale=FALSE)
p_cq4 <- clonalQuant(combined.TCR_with_PBMC, cloneCall="strict", chain="both", scale=FALSE)
print(p_cq1); save_plot(p_cq1, file.path(DIRS$figures, "clonalQuant_combined_scaleTRUE.pdf"), 10, 5)
print(p_cq2); save_plot(p_cq2, file.path(DIRS$figures, "clonalQuant_withPBMC_scaleTRUE.pdf"), 10, 5)
print(p_cq3); save_plot(p_cq3, file.path(DIRS$figures, "clonalQuant_combined_scaleFALSE.pdf"), 10, 5)
print(p_cq4); save_plot(p_cq4, file.path(DIRS$figures, "clonalQuant_withPBMC_scaleFALSE.pdf"), 10, 5)
p_ca1 <- clonalAbundance(combined.TCR_with_PBMC, cloneCall="gene", palette="Zissou 1", scale=FALSE)
p_ca2 <- clonalAbundance(combined.TCR_with_PBMC, cloneCall="gene", palette="Zissou 1", scale=TRUE)
p_ca3 <- clonalAbundance(combined.TCR, cloneCall="gene", scale=FALSE)
p_ca4 <- clonalAbundance(combined.TCR, cloneCall="gene", scale=TRUE)
print(p_ca1); save_plot(p_ca1, file.path(DIRS$figures, "clonalAbundance_withPBMC_scaleFALSE.pdf"), 10, 5)
print(p_ca2); save_plot(p_ca2, file.path(DIRS$figures, "clonalAbundance_withPBMC_scaleTRUE.pdf"), 10, 5)
print(p_ca3); save_plot(p_ca3, file.path(DIRS$figures, "clonalAbundance_combined_scaleFALSE.pdf"), 10, 5)
print(p_ca4); save_plot(p_ca4, file.path(DIRS$figures, "clonalAbundance_combined_scaleTRUE.pdf"), 10, 5)
clonalLength(combined.TCR_with_PBMC, cloneCall="nt", chain="both")
clonalLength(combined.TCR_with_PBMC, cloneCall="aa", chain="both")
clonalLength(combined.TCR, cloneCall="nt", chain="both")
clonalLength(combined.TCR, cloneCall="aa", chain="both")
clonalLength(combined.TCR_with_PBMC, cloneCall="nt", chain="TRA", scale=TRUE)
clonalLength(combined.TCR_with_PBMC, cloneCall="aa", chain="TRA", scale=TRUE)
clonalLength(combined.TCR, cloneCall="nt", chain="TRA", scale=TRUE)
clonalLength(combined.TCR, cloneCall="aa", chain="TRA", scale=TRUE)
clonalLength(combined.TCR_with_PBMC, cloneCall="nt", chain="TRB", scale=TRUE)
clonalLength(combined.TCR_with_PBMC, cloneCall="aa", chain="TRB", scale=TRUE)
clonalLength(combined.TCR, cloneCall="nt", chain="TRB", scale=TRUE)
clonalLength(combined.TCR, cloneCall="aa", chain="TRB", scale=TRUE)
p_cc1 <- clonalCompare(combined.TCR, top.clones=10, samples=c("L1","L2"), cloneCall="aa", graph="alluvial")
p_cc2 <- clonalCompare(combined.TCR, top.clones=10, samples=c("L3","L4"), cloneCall="aa", graph="alluvial")
p_cc3 <- clonalCompare(combined.TCR, top.clones=10, samples=c("L5","L6","L7"), cloneCall="aa", graph="alluvial")
print(p_cc1); save_plot(p_cc1, file.path(DIRS$figures, "clonalCompare_L1_L2_alluvial.pdf"), 12, 5)
print(p_cc2); save_plot(p_cc2, file.path(DIRS$figures, "clonalCompare_L3_L4_alluvial.pdf"), 12, 5)
print(p_cc3); save_plot(p_cc3, file.path(DIRS$figures, "clonalCompare_L5_L6_L7_alluvial.pdf"), 12, 5)
# clonalCompare
p_cc_all1 <- clonalCompare(combined.TCR_with_PBMC,
top.clones=10,
samples=c("L1","L2","L3","L4","L5","L6","L7","CD4T_lab"),
cloneCall="aa", graph="alluvial")
p_cc_all2 <- clonalCompare(combined.TCR,
top.clones=10,
samples=c("L1","L2","L3","L4","L5","L6","L7"),
cloneCall="aa", graph="alluvial")
print(p_cc_all1); save_plot(p_cc_all1, file.path(DIRS$figures, "clonalCompare_all_withPBMC_alluvial.pdf"), 16, 5)
print(p_cc_all2); save_plot(p_cc_all2, file.path(DIRS$figures, "clonalCompare_all_celllines_alluvial.pdf"), 16, 5)
clonalScatter(combined.TCR, cloneCall="gene", x.axis="L1", y.axis="L2", dot.size="total", graph="proportion")
clonalScatter(combined.TCR, cloneCall="gene", x.axis="L3", y.axis="L4", dot.size="total", graph="proportion")
clonalScatter(combined.TCR, cloneCall="gene", x.axis="L5", y.axis="L6", dot.size="total", graph="proportion")
clonalScatter(combined.TCR, cloneCall="gene", x.axis="L5", y.axis="L7", dot.size="total", graph="proportion")
clonalScatter(combined.TCR_with_PBMC, cloneCall="gene", x.axis="L6", y.axis="L7", dot.size="total", graph="proportion")
clonalHomeostasis(combined.TCR, cloneCall="gene")
clonalHomeostasis(combined.TCR, cloneCall="gene",
cloneSize=c(Rare=0.001,Small=0.01,Medium=0.1,Large=0.3,Hyperexpanded=1))
clonalHomeostasis(combined.TCR, cloneCall="gene",
cloneSize=c(Rare=0.01, Small=0.1, Medium=1, Large=5, Hyperexpanded=20))
clonalHomeostasis(combined.TCR, cloneCall="gene",
cloneSize = c("Minor" = 1, "Major" = 10, "Dominant" = Inf))
NA
NA
clonalHomeostasis(combined.TCR, cloneCall="aa", # Changed from "gene"
cloneSize=c(Rare=0.001,Small=0.01,Medium=0.1,Large=0.3,Hyperexpanded=1))
clonalHomeostasis(combined.TCR, cloneCall="aa",
cloneSize=c("Minor"=1, "Major"=10, "Dominant"=Inf))
clonalProportion(combined.TCR, cloneCall="gene")
clonalProportion(combined.TCR, cloneCall="nt", clonalSplit=c(1,5,10,100,1000,10000))
clonalProportion(combined.TCR, cloneCall="nt", clonalSplit=c(10,100,1000,10000,30000,100000))
percentAA(combined.TCR, chain="TRB", aa.length=20)
positionalEntropy(combined.TCR, chain="TRB", aa.length=20)
positionalEntropy(combined.TCR_with_PBMC, chain="TRB", aa.length=20)
positionalProperty(combined.TCR[c(1,2)], chain="TRB", aa.length=20, method="atchleyFactors") +
scale_color_manual(values=hcl.colors(5, "inferno")[c(2,4)])
# percentGeneUsage
vizGenes(combined.TCR, x.axis="TRBV", y.axis=NULL, plot="barplot", summary.fun="proportion")
vizGenes(combined.TCR, x.axis="TRAV", y.axis=NULL, plot="barplot", summary.fun="proportion")
vizGenes(combined.TCR_with_PBMC, x.axis="TRBV", y.axis=NULL, plot="barplot", summary.fun="proportion")
vizGenes(combined.TCR_with_PBMC, x.axis="TRAV", y.axis=NULL, plot="barplot", summary.fun="proportion")
vizGenes(combined.TCR[c(1,2,3,4,5,6,7)], x.axis="TRBV", y.axis="TRBJ", plot="heatmap", summary.fun="percent")
vizGenes(combined.TCR[c(1,2,3,4,5,6,7)], x.axis="TRAV", y.axis="TRAJ", plot="heatmap", summary.fun="percent")
vizGenes(combined.TCR[c(1,2)], x.axis="TRBV", y.axis="TRAV", plot="heatmap", summary.fun="percent")
vizGenes(combined.TCR[c(3,4)], x.axis="TRBV", y.axis="TRAV", plot="heatmap", summary.fun="percent")
vizGenes(combined.TCR[c(5,6,7)], x.axis="TRBV", y.axis="TRAV", plot="heatmap", summary.fun="percent")
# percentGenes: Quantifying Single Gene Usage
percentGenes(combined.TCR, chain="TRB", gene="Vgene", summary.fun="percent")
percentGenes(combined.TCR, chain="TRA", gene="Vgene", summary.fun="percent")
percentGenes(combined.TCR, chain="TRB", gene="Jgene")
percentGenes(combined.TCR, chain="TRA", gene="Jgene", summary.fun="percent")
# percentGenes: Quantifying Single Gene Usage
percentGenes(combined.TCR, chain = "TRB", gene = "Vgene")
percentGenes(combined.TCR_with_PBMC, chain = "TRA", gene = "Vgene",summary.fun = "percent")
percentGenes(combined.TCR_with_PBMC, chain = "TRB", gene = "Jgene")
percentGenes(combined.TCR_with_PBMC, chain = "TRA", gene = "Jgene",summary.fun = "percent")
# percentVJ
# Quantify the proportion of V and J gene usage with percentVJ(). Like percentGenes(), this function will quantify the percentage of V and J paired together across individual repertoires. The output can be visualized using a heatmap or as input for further dimensional reduction.
percentVJ(combined.TCR_with_PBMC, chain="TRB", summary.fun="percent")
percentVJ(combined.TCR_with_PBMC, chain="TRA", summary.fun="percent")
# percentKmer
percentKmer(combined.TCR_with_PBMC, cloneCall="aa", chain="TRB", motif.length=3, top.motifs=25)
percentKmer(combined.TCR_with_PBMC, cloneCall="aa", chain="TRA", motif.length=3, top.motifs=25)
percentKmer(combined.TCR_with_PBMC, cloneCall="nt", chain="TRB", motif.length=3, top.motifs=25)
percentKmer(combined.TCR_with_PBMC, cloneCall="nt", chain="TRA", motif.length=3, top.motifs=25)
# clonalDiversity
clonalDiversity(combined.TCR_with_PBMC, cloneCall="gene")
clonalDiversity(combined.TCR, cloneCall="gene")
#clonalSizeDistribution
clonalSizeDistribution(combined.TCR, cloneCall="aa", method="ward.D2")
clonalSizeDistribution(combined.TCR, cloneCall="nt", method="ward.D2")
clonalSizeDistribution(combined.TCR_with_PBMC, cloneCall="aa", method="ward.D2")
clonalSizeDistribution(combined.TCR_with_PBMC, cloneCall="nt", method="ward.D2")
#clonalOverlap
clonalOverlap(combined.TCR_with_PBMC, cloneCall="strict", method="morisita")
clonalOverlap(combined.TCR_with_PBMC, cloneCall="strict", method="raw")
clonalOverlap(combined.TCR, cloneCall="strict", method="morisita")
clonalOverlap(combined.TCR, cloneCall="strict", method="raw")
clonalRarefaction(combined.TCR, plot.type=1, hill.numbers=0, n.boots=2)
clonalRarefaction(combined.TCR_with_PBMC, plot.type=1, hill.numbers=0, n.boots=2)
clonalRarefaction(combined.TCR, plot.type=2, hill.numbers=0, n.boots=2)
clonalRarefaction(combined.TCR_with_PBMC, plot.type=2, hill.numbers=0, n.boots=2)
clonalRarefaction(combined.TCR, plot.type=3, hill.numbers=0, n.boots=2)
clonalRarefaction(combined.TCR_with_PBMC, plot.type=3, hill.numbers=0, n.boots=2)
clonalRarefaction(combined.TCR, plot.type=1, hill.numbers=1, n.boots=2)
clonalRarefaction(combined.TCR_with_PBMC, plot.type=1, hill.numbers=1, n.boots=2)
clonalRarefaction(combined.TCR,
plot.type = 3,
hill.numbers = 0,
n.boots = 2)
clonalRarefaction(combined.TCR_with_PBMC,
plot.type = 3,
hill.numbers = 0,
n.boots = 2)
# Rarefaction using Shannon Diversity (q = 1)
clonalRarefaction(combined.TCR,
plot.type = 1,
hill.numbers = 1,
n.boots = 2)
clonalRarefaction(combined.TCR_with_PBMC,
plot.type = 1,
hill.numbers = 1,
n.boots = 2)
saveRDS(All_samples_Merged, file = "TCR_analysis-12-01-2026/Seurat_Objects/TCR_Seurat_with_cloneSize_20_12_2026_finalized.rds")