How to Analyze Visium HD Data with Python

Overview

Spatial transcriptomics deconvolution estimates the cell type composition at each spatial location (spot) by integrating single-cell RNA-seq reference data. This tutorial shows you how to:

Load and preprocess Visium HD spatial data
Prepare a single-cell RNA-seq reference
Run cell type deconvolution with FlashDeconv
Visualize and interpret the results

Why FlashDeconv? It is designed for large-scale spatial data such as Visium HD and handles millions of spots without requiring a GPU. Its scanpy-style API writes results into the same AnnData object your existing workflow already uses.
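
To make the idea of deconvolution concrete, here is a minimal sketch of the linear mixing model that underlies most deconvolution methods: a spot's expression is treated as a weighted sum of per-cell-type signatures, and the weights are the proportions to recover. The signature values and type names below are made up for illustration, and plain least squares is used only because it is easy to follow. This is not a description of FlashDeconv's algorithm.

```python
import numpy as np

# Toy signature matrix: rows are cell types, columns are genes.
# Values are invented mean expression levels, not real data.
signatures = np.array([
    [10.0, 1.0, 0.0, 2.0],   # hypothetical "type A"
    [0.0, 8.0, 1.0, 2.0],    # hypothetical "type B"
    [1.0, 0.0, 9.0, 2.0],    # hypothetical "type C"
])

# Simulate a spot that is 60% type A, 30% type B, 10% type C.
true_props = np.array([0.6, 0.3, 0.1])
spot_expression = true_props @ signatures

# Recover the mixing weights by least squares: signatures.T @ w ~ spot.
weights, *_ = np.linalg.lstsq(signatures.T, spot_expression, rcond=None)

# Clip any negative weights and renormalize so they sum to 1.
weights = np.clip(weights, 0.0, None)
estimated_props = weights / weights.sum()

print(np.round(estimated_props, 3))
```

With noise-free toy data the estimate matches the simulated proportions. Real spots are noisy and sparse, which is why dedicated methods add constraints and regularization.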

Installation

Install FlashDeconv and dependencies:

pip install flashdeconv scanpy

For Visium HD parquet file support:

pip install pyarrow

Step 1: Load Your Data

Load Visium HD spatial data

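Visium HD data from 10x Genomics comes as binned outputs at several resolutions (2μm, 8μm, and 16μm). The loader below expects each resolution in its own square_<bin_size> folder, for example square_008um. Here's how to load the 8μm binned data: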

import scanpy as sc
import pandas as pd
import pyarrow.parquet as pq
from pathlib import Path

def load_visium_hd(data_dir, bin_size="008um"):
    """Load Visium HD data at specified resolution."""
    bin_path = Path(data_dir) / f"square_{bin_size}"

    # Load gene expression matrix
    adata = sc.read_10x_h5(bin_path / "filtered_feature_bc_matrix.h5")
    adata.var_names_make_unique()

    # Load spatial coordinates
    positions = pq.read_table(
        bin_path / "spatial" / "tissue_positions.parquet"
    ).to_pandas()
    positions = positions.set_index("barcode")

    # Align barcodes
    common = adata.obs_names.intersection(positions.index)
    adata = adata[common].copy()
    # Store coordinates in (x, y) order, i.e. (column, row) in full-res pixels
    adata.obsm["spatial"] = positions.loc[
        common, ["pxl_col_in_fullres", "pxl_row_in_fullres"]
    ].to_numpy()

    return adata

# Load your data
adata_st = load_visium_hd("./Visium_HD_outputs", bin_size="008um")
print(f"Loaded {adata_st.n_obs:,} spots, {adata_st.n_vars:,} genes")
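
Before calling a loader like the one above, it can help to check which bin resolutions are actually present on disk. The sketch below assumes the same square_<bin_size> folder naming that the loader uses; list_bin_sizes is a hypothetical helper, not part of scanpy or FlashDeconv.

```python
from pathlib import Path

def list_bin_sizes(data_dir):
    """Return bin-size suffixes (e.g. "008um") of square_* folders in data_dir."""
    prefix = "square_"
    root = Path(data_dir)
    return sorted(
        p.name[len(prefix):]          # strip the "square_" prefix
        for p in root.glob(prefix + "*")
        if p.is_dir()                 # ignore stray files with a similar name
    )
```

For example, list_bin_sizes("./Visium_HD_outputs") returns a sorted list of suffixes that can be passed as bin_size, or an empty list if the path does not point at the folder that holds the binned outputs.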

Load single-cell reference

Your reference should be an AnnData object with cell type annotations stored in a column of adata_ref.obs (cell_type in this tutorial):

# Load reference scRNA-seq data
adata_ref = sc.read_h5ad("./reference.h5ad")

# Check cell type annotations
print(adata_ref.obs["cell_type"].value_counts())
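
Beyond counting cells per type, two quick checks are worth running on a reference: how many genes it shares with the spatial data, and whether any cells lack a label. The sketch below works on plain gene lists and labels so it can be tried without any data files; check_reference is a hypothetical helper and the gene names are toy inputs.

```python
import pandas as pd

def check_reference(st_genes, ref_genes, ref_labels):
    """Summarize gene overlap and label completeness for a reference."""
    st_genes = pd.Index(st_genes)
    ref_genes = pd.Index(ref_genes)
    labels = pd.Series(ref_labels)

    shared = st_genes.intersection(ref_genes)
    return {
        "n_shared_genes": len(shared),
        "frac_spatial_genes_shared": len(shared) / max(len(st_genes), 1),
        "n_unlabeled_cells": int(labels.isna().sum()),
        # value_counts() skips missing labels by default
        "cells_per_type": labels.value_counts().to_dict(),
    }

# Toy inputs. With real data you would pass adata_st.var_names,
# adata_ref.var_names and adata_ref.obs["cell_type"].
report = check_reference(
    st_genes=["CD3E", "MS4A1", "EPCAM", "ACTB"],
    ref_genes=["CD3E", "MS4A1", "EPCAM", "GAPDH"],
    ref_labels=["T cell", "B cell", "T cell", None],
)
print(report)
```

A low shared-gene fraction usually points to mismatched gene identifiers (for example symbols in one dataset and Ensembl IDs in the other) and is worth resolving before deconvolution.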

Step 2: Run Deconvolution

FlashDeconv provides a scanpy-style API that stores results directly in your AnnData object:

import flashdeconv as fd

# Run deconvolution (results stored in adata_st.obsm)
fd.tl.deconvolve(
    adata_st,                    # Spatial AnnData
    adata_ref,                   # Reference scRNA-seq
    cell_type_key="cell_type",   # Column with cell type labels
    # Optional parameters:
    # sketch_dim=512,            # Sketch dimension (default: 512)
    # lambda_spatial=5000,       # Spatial regularization strength
    # n_hvg=2000,                # Number of highly variable genes
)

# Access results
proportions = adata_st.obsm["flashdeconv"]
print(f"Shape: {proportions.shape}")  # (n_spots, n_cell_types)
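
It is worth sanity-checking the proportion matrix before plotting it. The sketch below reports negative entries, missing values, and the range of per-spot sums. summarize_proportions is a hypothetical helper, and whether rows should sum to exactly 1 depends on how FlashDeconv normalizes its output, which this tutorial does not state.

```python
import numpy as np

def summarize_proportions(proportions, tol=1e-6):
    """Basic sanity statistics for an (n_spots, n_cell_types) matrix."""
    p = np.asarray(proportions, dtype=float)
    row_sums = p.sum(axis=1)
    return {
        "shape": p.shape,
        "n_negative": int((p < -tol).sum()),   # entries clearly below zero
        "n_nan": int(np.isnan(p).sum()),       # missing values
        "min_row_sum": float(row_sums.min()),
        "max_row_sum": float(row_sums.max()),
    }

# Toy matrix; with real results pass adata_st.obsm["flashdeconv"].
toy = np.array([[0.7, 0.3], [0.2, 0.8], [0.5, 0.5]])
summary = summarize_proportions(toy)
print(summary)
```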

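Parameter tuning: The default parameters work well for most Visium/Visium HD data. For sparse data or small spot sizes, where each spot carries few counts, consider increasing lambda_spatial (the spatial regularization strength) so that estimates draw more on neighboring spots.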
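
When trying a different lambda_spatial, it helps to quantify how much the result actually changes. One way is to copy adata_st.obsm["flashdeconv"] after each run (in case a later run overwrites it, which is an assumption here) and compare the copies. compare_runs below is a hypothetical helper demonstrated on toy matrices.

```python
import numpy as np

def compare_runs(props_a, props_b):
    """Compare two (n_spots, n_cell_types) proportion matrices."""
    a = np.asarray(props_a, dtype=float)
    b = np.asarray(props_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("Both runs must have the same shape")
    dominant_changed = a.argmax(axis=1) != b.argmax(axis=1)
    return {
        "mean_abs_change": float(np.abs(a - b).mean()),
        "frac_dominant_changed": float(dominant_changed.mean()),
    }

# Toy stand-ins for two runs with different settings.
run_default = np.array([[0.60, 0.40], [0.30, 0.70], [0.52, 0.48]])
run_smoother = np.array([[0.55, 0.45], [0.45, 0.55], [0.40, 0.60]])
diff = compare_runs(run_default, run_smoother)
print(diff)
```

If the dominant cell type flips for a large share of spots between settings, the result is sensitive to the regularization strength and deserves a closer look.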

Step 3: Visualize Results

Plot cell type proportions spatially

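import math
import matplotlib.pyplot as plt

# Get cell type names
cell_types = adata_st.uns["flashdeconv"]["cell_types"]

# Size the grid to the number of cell types (4 panels per row)
n_cols = 4
n_rows = math.ceil(len(cell_types) / n_cols)
fig, axes = plt.subplots(
    n_rows, n_cols, figsize=(4 * n_cols, 4 * n_rows), squeeze=False
)

# Plot each cell type, colored by its estimated proportion
for i, (ax, ct) in enumerate(zip(axes.flat, cell_types)):
    props = adata_st.obsm["flashdeconv"][:, i]
    scatter = ax.scatter(
        adata_st.obsm["spatial"][:, 0],
        adata_st.obsm["spatial"][:, 1],
        c=props,
        s=1,
        cmap="Reds",
    )
    ax.invert_yaxis()       # pixel rows increase downward, as in the image
    ax.set_aspect("equal")  # keep tissue geometry undistorted
    ax.set_title(ct)
    ax.axis("off")
    fig.colorbar(scatter, ax=ax, shrink=0.7)

# Hide any unused panels in the last row
for ax in axes.ravel()[len(cell_types):]:
    ax.axis("off")

plt.tight_layout()
plt.savefig("cell_type_maps.png", dpi=150)
plt.show()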
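
With millions of spots, drawing every point in every panel can be slow. One option is to plot a random subset, which usually preserves the overall spatial pattern. subsample_indices is a hypothetical helper, and the 200,000-point cap is an arbitrary starting value rather than a recommendation from FlashDeconv.

```python
import numpy as np

def subsample_indices(n_spots, max_points=200_000, seed=0):
    """Return sorted indices of at most max_points randomly chosen spots."""
    if n_spots <= max_points:
        return np.arange(n_spots)
    rng = np.random.default_rng(seed)  # fixed seed for reproducible figures
    return np.sort(rng.choice(n_spots, size=max_points, replace=False))

idx = subsample_indices(1_000_000, max_points=200_000)
print(len(idx))
```

The indices can then select matching rows from both arrays, for example adata_st.obsm["spatial"][idx] and adata_st.obsm["flashdeconv"][idx, i].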

Identify dominant cell type per spot

import numpy as np

# Assign dominant cell type
dominant_idx = np.argmax(adata_st.obsm["flashdeconv"], axis=1)
adata_st.obs["dominant_cell_type"] = [cell_types[i] for i in dominant_idx]

# Plot
sc.pl.spatial(adata_st, color="dominant_cell_type", spot_size=1)
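
Taking the argmax assigns a label even when no cell type clearly dominates. A common refinement is to mark such spots as mixed when the top proportion falls below a threshold. The sketch below is one way to do that; label_dominant, the 0.5 threshold, and the "mixed" label are illustrative choices, not FlashDeconv settings.

```python
import numpy as np
import pandas as pd

def label_dominant(proportions, cell_types, min_prop=0.5, mixed_label="mixed"):
    """Label each spot with its top cell type, or mixed_label if unclear."""
    p = np.asarray(proportions, dtype=float)
    names = np.asarray(cell_types, dtype=object)
    labels = names[p.argmax(axis=1)]              # fancy indexing makes a copy
    labels[p.max(axis=1) < min_prop] = mixed_label
    return pd.Categorical(labels)

# Toy example with three spots and three hypothetical cell types.
toy_props = np.array([
    [0.80, 0.10, 0.10],
    [0.40, 0.35, 0.25],   # no type reaches 0.5, so this spot is "mixed"
    [0.10, 0.20, 0.70],
])
toy_labels = label_dominant(toy_props, ["T cell", "B cell", "Tumor"])
print(pd.Series(toy_labels).value_counts())
```

With real results, the returned categorical can be assigned to a new column of adata_st.obs and plotted the same way as dominant_cell_type.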

Advanced: Multi-Resolution Analysis

For Visium HD, you can analyze how cell type signals change across resolutions. This helps identify the optimal bin size for your analysis:

# See the full multi-resolution tutorial notebook:
# examples/resolution_horizon_analysis.ipynb
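
As a rough first look at resolution effects, fine-bin proportions can be pooled into coarser blocks and compared with the fine-grained map. The sketch below averages proportions over factor x factor blocks of integer grid positions. It is a simplified illustration, not the method used in the notebook, and it is not equivalent to rerunning deconvolution on coarser bins. pool_proportions is a hypothetical helper, and the grid indices are assumed inputs (the Visium HD positions table typically includes array_row and array_col columns, but this tutorial only loads the pixel columns).

```python
import numpy as np
import pandas as pd

def pool_proportions(props, grid_row, grid_col, cell_types, factor=2):
    """Average proportions over factor x factor blocks of grid positions."""
    df = pd.DataFrame(np.asarray(props, dtype=float), columns=list(cell_types))
    df["block_row"] = np.asarray(grid_row) // factor
    df["block_col"] = np.asarray(grid_col) // factor
    # Unweighted mean per block; bins with more counts are not upweighted.
    return df.groupby(["block_row", "block_col"]).mean()

# Toy example: four bins forming one 2x2 block, plus one bin in the next block.
toy_props = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [0.2, 0.8]]
rows = [0, 0, 1, 1, 0]
cols = [0, 1, 0, 1, 2]
pooled = pool_proportions(toy_props, rows, cols, ["A", "B"], factor=2)
print(pooled)
```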

Citation

If you use FlashDeconv in your research, please cite:

Yang, C., Chen, J. & Zhang, X. FlashDeconv enables atlas-scale,
multi-resolution spatial deconvolution via structure-preserving sketching.
bioRxiv (2025). https://doi.org/10.64898/2025.12.22.696108