|Title:||Haplotype inferring via galled-tree networks|
Department of Computer Science
University of British Columbia
The problem of determining haplotypes from genotypes has gained considerable prominence in the research community since the beginning of the HapMap project. Here the focus is on determining the sets of SNP values of individual chromosomes (haplotypes), since such information better captures the genetic causes of diseases. One of the main algorithmic tools for haplotyping is based on the assumption that the evolutionary history for the original haplotypes satisfies perfect phylogeny. The algorithm can be applied only on individual blocks of chromosomes, in which it is assumed that recombinations do not happen. However, exact determination of blocks is usually not possible. It would be desirable to develop a method for haplotyping which can account for recombinations, and thus can be applied on multiblock sections of chromosomes. A natural candidate for such a method is haplotyping via phylogenetic networks or their simplified version: galled-tree networks, which were introduced by Wang, Zhang, Zhang (2001) to model recombinations. We study the problem of haplotyping via galled-tree networks. We first characterize haplotype matrices for which there is a galled-tree network and then use this characterization to show that the problem of haplotyping via galled-tree networks is NP-complete in general. Furthermore, we show that if we restrict the number of mutations on each gall by k, the problem remains NP-complete for each k>=3, and becomes polynomial for k=2 (assuming the input genotype matrix contains 1 in each column which contains 2).
This is joint work with students Xiaohong (Sharon) Zhao, Mehdi Karimi and professors Ladislav Stacho and Arvind Gupta.