ArXiv 2026-04-07

New arXiv preprint proposes multimodal “foundation model” to link histology and spatial transcriptomics

What the paper claims

A new arXiv preprint (arXiv:2604.03630) presents STORM (Spatial Transcriptomics and histOlogy Representation Model), a multimodal foundation model that aims to map routine hematoxylin and eosin (H&E) histology to molecular-resolution spatial transcriptomics. Spatial transcriptomics (ST) can reveal where genes are expressed inside tissue, but it remains costly and low-throughput, whereas H&E slides are cheap and ubiquitous in pathology labs. Could routine slides themselves carry enough signal to recover molecular maps? The authors say yes: according to the preprint, the model can impute gene expression and support downstream biological discovery and clinical prediction tasks across benchmark datasets.

Methods, claims and caveats

The paper frames STORM as a multimodal representation learned from paired histology and ST data, usable both for imputing missing molecular measurements and for transfer learning to prediction tasks. According to the preprint, the authors benchmark the model on multiple tissue types and report better imputation and predictive performance than baselines. This is an arXiv preprint that has not undergone peer review, so claims about robustness, generalization across cohorts and clinical readiness remain preliminary until independently validated.
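
To make "imputing gene expression from histology" concrete, here is a minimal sketch of the general task, not of STORM itself. It assumes each tissue spot has already been reduced to an image feature vector, and it swaps in plain ridge regression as a stand-in predictor. The data are synthetic and every name (fit_ridge, per_gene_pearson, the array sizes) is hypothetical. Scoring by per-gene Pearson correlation on held-out spots is a common convention, assumed here rather than taken from the preprint.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

def per_gene_pearson(Y_true, Y_pred):
    """Pearson correlation between true and predicted expression, one value per gene."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.sqrt((yt ** 2).sum(axis=0) * (yp ** 2).sum(axis=0))
    return (yt * yp).sum(axis=0) / denom

# Synthetic stand-ins (assumption): rows are tissue spots,
# X holds image-derived features, Y holds gene expression values.
rng = np.random.default_rng(0)
n_spots, n_features, n_genes = 500, 32, 10
X = rng.normal(size=(n_spots, n_features))
W_true = rng.normal(size=(n_features, n_genes))
Y = X @ W_true + 0.5 * rng.normal(size=(n_spots, n_genes))

# Fit on 400 spots, then impute expression for the 100 held-out spots.
train, test = slice(0, 400), slice(400, 500)
W = fit_ridge(X[train], Y[train], alpha=1.0)
Y_imputed = X[test] @ W

r = per_gene_pearson(Y[test], Y_imputed)
print("mean per-gene Pearson r:", float(r.mean()))
```

On real data the held-out spots would ideally come from different slides or patients than the training spots, which is the kind of cross-cohort generalization the caveats above refer to.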

Why it matters

If validated, the approach could lower barriers to spatially informed biology and accelerate translational research: hospitals and biobanks already hold vast archives of H&E slides, so a reliable imputation model would scale access to spatial molecular information without the current cost and throughput limits of ST assays. That could speed biomarker discovery and potentially augment diagnostic workflows — but clinical deployment would require rigorous clinical trials and regulatory review.

Broader context and risks

Multimodal foundation models that bridge imaging and molecular data sit at the intersection of AI, genomics and clinical care, raising questions about data governance, reproducibility and equity of access. AI-driven biology has reportedly drawn increased regulatory attention worldwide, and cross-border sharing of sensitive biomedical data and models may face export controls or privacy scrutiny amid evolving geopolitical tensions. As with other fast-moving preprints, independent validation and transparent release of code and data will be essential for the community to assess the model’s value and risks.

Link to the preprint: https://arxiv.org/abs/2604.03630
