MSc Thesis Presentation - Wilson Tu
Name: Wilson Tu
Date: June 25th, 2025
Time: 11:00 AM
Location: ICCS 304
Supervisor: Jiarui Ding
Thesis title: Graph-Augmented Deep Learning Using Literature-Informed Biological Priors for Predicting Perturbations in Single-Cell RNA Sequencing
Abstract:
Single-cell perturbation experiments promise to revolutionize drug discovery by revealing how gene-level interventions change cellular states. However, modelling these complex, out-of-distribution responses remains challenging. In this thesis, we introduce the Transcriptomic Perturbation Predictor (TxPert), a graph-based deep learning model that leverages literature-informed biological priors to improve generalization. TxPert embeds curated gene–gene interaction networks and pathway annotations into its architecture, combining a basal state encoder (mapping control gene expression to a latent state) with a perturbation encoder (mapping one or more gene knockouts to a change vector). The combined latent representation is decoded into predicted gene expression. We train TxPert on diverse single-cell RNA-seq perturbation datasets to handle three key out-of-distribution tasks: unseen single-gene perturbations, transfer to novel cell types, and multi-gene perturbations. Empirically, TxPert substantially outperforms prior models, achieving state-of-the-art accuracy on single-gene perturbation prediction while remaining competitive on two-gene cases. Crucially, ablation studies demonstrate that incorporating biological priors yields meaningful gains in robustness and accuracy across datasets. We further examine commonly used metrics and datasets, providing a unifying framework on which future work can build. These findings suggest that graph-augmented models like TxPert can serve as building blocks for “virtual cells,” enabling rapid in silico hypothesis testing. By accurately predicting cell responses to genetic changes, such models could accelerate experiment design and therapeutic discovery.
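
The encoder/decoder structure described in the abstract can be illustrated with a minimal, untrained NumPy sketch. This is not the thesis implementation: the toy gene graph, the single round of neighbour averaging, the random weights, the choice to combine the two latents by addition, and all names (`adj_norm`, `encode_basal`, `encode_perturbation`, `predict`) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, latent_dim = 6, 4

# Hypothetical prior graph: symmetric gene-gene adjacency with self-loops.
adj = np.eye(n_genes)
for i, j in [(0, 1), (1, 2), (3, 4)]:
    adj[i, j] = adj[j, i] = 1.0
adj_norm = adj / adj.sum(axis=1, keepdims=True)  # row-normalise

# Randomly initialised (untrained) parameters, for illustration only.
gene_emb = rng.normal(size=(n_genes, latent_dim))
w_graph = rng.normal(size=(latent_dim, latent_dim))
w_basal = rng.normal(size=(n_genes, latent_dim))
w_dec = rng.normal(size=(latent_dim, n_genes))

def encode_perturbation(perturbed_genes):
    """Map one or more knocked-out genes to a latent change vector."""
    # One round of neighbour averaging over the prior graph (assumed design).
    node_states = np.tanh(adj_norm @ gene_emb @ w_graph)
    idx = np.asarray(perturbed_genes, dtype=int)
    return node_states[idx].sum(axis=0)

def encode_basal(control_expr):
    """Map control gene expression to a latent basal state."""
    return np.tanh(control_expr @ w_basal)

def predict(control_expr, perturbed_genes):
    """Combine both latents (here by addition) and decode to expression."""
    latent = encode_basal(control_expr) + encode_perturbation(perturbed_genes)
    return latent @ w_dec

control = rng.random(n_genes)
single = predict(control, [2])     # single-gene knockout
double = predict(control, [2, 4])  # two-gene knockout
```

In this sketch a multi-gene perturbation is simply the sum of the graph-informed states of the perturbed genes, so an empty perturbation list leaves the basal prediction unchanged. A trained model would learn these weights from perturbation data and could use a different combination rule.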