Xeuron

AI Metadata Extraction

Extraction v1 | google/gemini-3.1-flash-lite-preview | 5/1/2026

Executive Summary

TxPert is a deep learning framework for predicting the transcriptomic responses of cells to genetic perturbations. Accurate prediction of these responses is critical for understanding disease mechanisms and for drug discovery, but it is hampered by the high cost of experimental screens and the limited generalizability of existing computational models. TxPert addresses these challenges by combining a basal state encoder with a graph neural network (GNN) that draws on diverse biological knowledge graphs (including curated databases and proprietary high-throughput screening data) to predict the outcomes of unseen single perturbations, unseen combinatorial perturbations, and perturbations in different cellular contexts. In benchmarking, TxPert consistently outperforms existing methods such as GEARS and scLAMBDA. The study also introduces a training and evaluation framework that accounts for batch-matched controls and experimental noise. Notably, TxPert's predictions for unseen single perturbations reach accuracy comparable to split-half experimental reproducibility, that is, the agreement between two halves of the same experiment. The model's success is largely attributed to the combined use of complementary biological knowledge graphs, with architectures such as the Exphormer-MG graph transformer enabling effective multi-graph integration. The model also has specific limitations, such as a reduced ability to predict the downregulation of the unseen perturbation target gene itself. Performance further depends on both the size of the perturbation effect and how well characterized the targeted gene is.
By providing a reusable, extensible framework and documenting these strengths and weaknesses, the authors offer a tool the scientific community can use to improve the design of in silico screens, with the eventual aim of accelerating the development of effective therapeutic interventions and personalized medicine.
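
The split-half reproducibility mentioned in this summary can be illustrated with a small sketch. It assumes one plausible procedure: the cells of a single perturbation are split at random into two halves, each half is averaged, and the two mean expression changes relative to the control mean are correlated. The function names, the noise level, and the toy data are hypothetical and are not taken from TxPert.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sx * sy)


def mean_profile(cells):
    """Average a list of per-cell expression vectors gene by gene."""
    return [sum(gene) / len(cells) for gene in zip(*cells)]


def split_half_score(perturbed_cells, control_mean, seed=0):
    """Correlate the mean expression change of two random halves (assumed procedure)."""
    if len(perturbed_cells) < 2:
        raise ValueError("need at least two cells to form two halves")
    cells = list(perturbed_cells)
    random.Random(seed).shuffle(cells)
    half = len(cells) // 2
    delta_a = [m - c for m, c in zip(mean_profile(cells[:half]), control_mean)]
    delta_b = [m - c for m, c in zip(mean_profile(cells[half:]), control_mean)]
    return pearson(delta_a, delta_b)


# Toy data (hypothetical): 200 cells, 5 genes, a fixed effect plus Gaussian noise.
rng = random.Random(1)
control_mean = [5.0, 2.0, 8.0, 1.0, 3.0]
true_effect = [-2.0, 0.5, 0.0, 1.0, -0.5]
cells = [
    [c + e + rng.gauss(0.0, 0.3) for c, e in zip(control_mean, true_effect)]
    for _ in range(200)
]
ceiling = split_half_score(cells, control_mean)
```

Under these assumptions, `ceiling` measures how well one half of an experiment predicts the other. A model whose score against held-out data approaches this value is limited more by experimental noise than by modeling error.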

Authors

Frederik Wenkel (First Author)

Valence Labs, Montréal, Quebec, Canada

frederik@valencelabs.com

Wilson Tu

Valence Labs, Montréal, Quebec, Canada

ali@valencelabs.com

Cassandra Masschelein

Valence Labs, Montréal, Quebec, Canada

Hamed Shirzad

Valence Labs, Montréal, Quebec, Canada

Liam Hodgson

Valence Labs, Montréal, Quebec, Canada

Ihab Bendidi

Valence Labs, Montréal, Quebec, Canada

Cian Eastwood

Valence Labs, Montréal, Quebec, Canada

Shawn T. Whitfield

Valence Labs, Montréal, Quebec, Canada

Craig Russell

Valence Labs, Montréal, Quebec, Canada

Yassir El Mesbahi

Valence Labs, Montréal, Quebec, Canada

Jiarui Ding

Computer Science, University of British Columbia, Vancouver, British Columbia, Canada

Marta M. Fay

Recursion, Salt Lake City, UT, USA

Berton Earnshaw

Valence Labs, Montréal, Quebec, Canada

Emmanuel Noutahi

Valence Labs, Montréal, Quebec, Canada

Alisandra K. Denton

Valence Labs, Montréal, Quebec, Canada

ali@valencelabs.com

Abstract

Accurately predicting cellular responses to genetic perturbations is essential for understanding disease mechanisms and designing effective therapies. Yet, exhaustively exploring the space of possible perturbations (for example, multigene perturbations or across tissues and cell types) is prohibitively expensive, motivating methods that can generalize to unseen conditions. We present TxPert, a latent-transfer-based deep learning method that uses multiple knowledge graphs of gene (product)–gene (product) relationships to predict transcriptomic perturbation effects. Different knowledge graphs encode complementary information and we show that a combination of graphs derived from biological databases and high-throughput perturbation screens yields the best performance. For predictions of single unseen perturbations, TxPert approaches the performance of split-half experimental reproducibility. For double unseen perturbations and single perturbations in a different cell line, its predictions increase Pearson Δ for unseen single perturbations by 8–25% over existing methods.
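
The abstract reports gains in the Δ-based Pearson metric without defining it. The sketch below assumes the convention common in this literature: the Pearson correlation between the predicted and the observed expression change, each measured relative to the mean control profile. The function names and toy values are hypothetical and are not taken from TxPert.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sx * sy)


def pearson_delta(predicted, observed, control_mean):
    """Correlate predicted and observed changes relative to control (assumed definition)."""
    pred_delta = [p - c for p, c in zip(predicted, control_mean)]
    obs_delta = [o - c for o, c in zip(observed, control_mean)]
    return pearson(pred_delta, obs_delta)


# Toy mean expression profiles over four genes (hypothetical values).
control_mean = [5.0, 2.0, 8.0, 1.0]
observed = [3.0, 2.5, 8.5, 1.0]
predicted = [3.5, 2.4, 8.2, 1.1]

score = pearson_delta(predicted, observed, control_mean)

# For contrast: the raw control profile already correlates strongly with the
# observed perturbed profile, which is why the change is scored instead.
raw_control_vs_observed = pearson(control_mean, observed)
```

Subtracting the control removes the large shared baseline, so a model is rewarded only for getting the perturbation-specific change right. With these toy values, predicting "no change" scores above 0.9 on raw profiles while its Δ is a constant zero vector, for which the correlation is undefined.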

Key Findings (20)

1. TxPert uses a latent-transfer-based deep learning architecture for transcriptomic perturbation prediction.
2. Incorporating multiple knowledge graphs (STRING, GO, PxMap, TxMap) consistently improves predictive performance.
3. TxPert outperforms existing methods like GEARS and scLAMBDA across various out-of-distribution (OOD) tasks.
4. Batch effects and confounding significantly impact model performance, requiring batch-matched control strategies.
5. Retrieval metrics are superior to traditional differentially expressed gene selection for evaluation.
6. TxPert approaches the performance of split-half experimental reproducibility for single unseen perturbations.
7. For double unseen perturbations, TxPert achieves a substantial performance lead over GEARS and scLAMBDA.
8. TxPert effectively generalizes to new cell lines without seen perturbations.
9. The Exphormer-MG graph transformer architecture provides optimal results for integrating multiple graphs.
10. Performance is sensitive to the degradation of biological graph structure, showing robustness only until ~60% edge removal.
11. There is a measurable correlation between perturbation target knowledge level (Pharos rank) and prediction accuracy.
12. TxPert exhibits a failure mode in predicting the downregulation of the unseen perturbation target itself.
13. Combinations of STRING and PxMap graphs perform consistently better than STRING alone across all knowledge levels.
14. Hybrid-BMP message-passing architecture demonstrated top performance for single perturbation tasks in K562 cells.
15. Baseline performance (mean baseline) is a strong predictor, especially for essential genes.
16. Gene perturbation stress responses shift cellular states from growth toward quiescence and recycling.
17. Multi-graph integration strategies (GAT-Hybrid, Exphormer-MG, GAT-MLG, Hybrid-BMP) all show strong performance gains.
18. Cross-batch control correlation is significantly lower than within-batch correlation.
19. The use of proprietary phenomics-derived graphs (PxMap/TxMap) enhances predictive capacity.
20. TxPert provides a reusable framework that sets a new standard for benchmarking transcriptomic perturbation models.
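
The retrieval metrics named in the findings above are not specified in this summary. One plausible form, sketched here as an assumption, asks whether a prediction is more similar to its own observed effect than to the observed effects of other perturbations. The scoring rule, function names, and toy vectors are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sx * sy)


def retrieval_scores(predicted, observed):
    """For each perturbation, return the fraction of other observed effects
    that are less similar to the prediction than the true observed effect."""
    if len(observed) < 2:
        raise ValueError("need at least two observed perturbations to rank")
    scores = {}
    for name, pred in predicted.items():
        target = pearson(pred, observed[name])
        others = [pearson(pred, obs) for other, obs in observed.items() if other != name]
        beaten = sum(1 for s in others if s < target)
        scores[name] = beaten / len(others)
    return scores


# Toy expression changes over four genes (hypothetical values).
observed = {
    "A": [-2.0, 0.5, 0.0, 1.0],
    "B": [0.5, -1.5, 1.0, 0.0],
    "C": [1.0, 1.0, -2.0, 0.2],
}
predicted = {
    "A": [-1.5, 0.4, 0.1, 0.8],  # close to the observed effect of A
    "B": [0.3, -1.0, 0.9, 0.1],  # close to the observed effect of B
    "C": [-1.8, 0.6, 0.0, 0.9],  # wrongly resembles A instead of C
}
scores = retrieval_scores(predicted, observed)
```

A score of 1.0 means the true effect is the best match for the prediction. In this toy case the predictions for A and B score 1.0, while the prediction for C scores 0.5 because it matches the observed effect of A better than its own. A metric of this shape rewards perturbation-specific predictions and penalizes a model that outputs nearly the same response for every perturbation.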

Discussion & Future Directions

The authors acknowledge that independent benchmarks have recently raised questions about the performance of foundation models in biology. TxPert addresses these issues through rigorous evaluation, utilizing strong baselines and showing competitive performance with split-half experimental reproducibility. Future work aims to leverage even larger perturbation datasets, transition toward few-shot or active learning, and improve generalization to primary human tissues. The authors emphasize the need for standardization in benchmarking and the development of metrics that explicitly assess the conditionality and specificity of perturbation effects.
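
The "strong baselines" mentioned in this discussion include the mean baseline listed among the key findings. A minimal sketch of that idea follows, assuming the baseline predicts the same expression change for every unseen perturbation, namely the gene-wise average over the training perturbations. The names and values are hypothetical.

```python
def mean_baseline(train_deltas):
    """Gene-wise average of the expression changes seen in training."""
    profiles = list(train_deltas.values())
    if not profiles:
        raise ValueError("need at least one training perturbation")
    return [sum(gene) / len(profiles) for gene in zip(*profiles)]


# Toy expression changes for three training perturbations over three genes.
train_deltas = {
    "G1": [-1.0, 0.2, 0.0],
    "G2": [-0.6, 0.4, 0.2],
    "G3": [-0.8, 0.0, 0.1],
}

# The same vector is returned as the prediction for any unseen perturbation.
baseline_prediction = mean_baseline(train_deltas)
```

Because this baseline ignores which gene was perturbed, any score it achieves reflects responses shared across perturbations. A model is informative about a specific perturbation only to the extent that it beats this prediction.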
