Title: Model-based approaches for detection of DNA copy number alterations in the human genome
Speaker: Sohrab Shah
Abstract: DNA copy number alterations (CNAs) are a hallmark of somatic mutations in tumor genomes and of congenital abnormalities that lead to diseases such as mental retardation. CNAs define regions on a given chromosome that exhibit deletion or amplification of the DNA within the region. Accurately identifying the locations of CNAs in an individual sample has applications in understanding the molecular mechanisms of disease as well as in the development of diagnostic and prognostic tools. Furthermore, identifying the pattern of recurrent CNAs that occur in a set of samples exhibiting a common phenotype has compelling implications for medical advances. Recent progress in array comparative genomic hybridization (aCGH) has enabled researchers to measure CNAs at high resolution across the entire human genome. Unfortunately, the observed copy number changes are often corrupted by various sources of noise, making the CNAs hard to detect.

In this talk I will explore model-based approaches to the detection of CNAs in aCGH data. Specifically, I will describe an augmented hidden Markov model that significantly improves CNA detection in a single sample over baseline models. I will then describe a proposed model for jointly analysing aCGH data from a set of samples to detect recurrent CNAs. I will report accuracy results on clinically relevant data sets as well as synthetic data with ground truth. Finally, I will discuss limitations and proposed improvements to these approaches.