Title: Robust hidden semi-Markov modeling of array CGH data
Department of Computer Science, UBC
Hidden semi-Markov models extend hidden Markov models by allowing the time spent in a state (the state duration) to follow a general distribution. With appropriately chosen state duration distributions, they are therefore well suited to modeling sequences made up of successive homogeneous zones. Hidden semi-Markov models are generative models and are usually trained by maximum likelihood estimation. To compensate for model mis-specification and provide protection against outliers, they can instead be trained discriminatively on a labeled training set, at the expense of increased training complexity. As an alternative to discriminative training, in this paper we address model mis-specification and outliers by adopting robust methods. Specifically, we use Student's t mixture models as the emission distributions of hidden semi-Markov models. The proposed robust hidden semi-Markov models are used to model array-based comparative genomic hybridization (array CGH) data. Experiments on benchmark data from the Coriell cell lines and on glioblastoma multiforme data illustrate the reliability of the technique.
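To illustrate why a Student's t mixture emission is less sensitive to outliers than a Gaussian one, the following is a minimal sketch of a univariate t mixture log density. It is not the paper's implementation: the function names, the univariate setting, and all parameter values are assumptions chosen for illustration only.

```python
import math


def student_t_logpdf(x, mu, sigma, nu):
    """Log density of a univariate Student's t with location mu,
    scale sigma and nu degrees of freedom."""
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - (nu + 1) / 2 * math.log1p(z * z / nu))


def gaussian_logpdf(x, mu, sigma):
    """Log density of a univariate Gaussian, used here only for comparison."""
    z = (x - mu) / sigma
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * z * z


def t_mixture_logpdf(x, weights, mus, sigmas, nus):
    """Log density of a Student's t mixture, combined with log-sum-exp
    for numerical stability. Weights are assumed to sum to 1."""
    terms = [math.log(w) + student_t_logpdf(x, m, s, n)
             for w, m, s, n in zip(weights, mus, sigmas, nus)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Hypothetical emission for one state: two t components (illustrative values).
weights, mus, sigmas, nus = [0.7, 0.3], [0.0, 0.5], [1.0, 1.0], [3.0, 3.0]

outlier = 8.0  # an observation far from both component locations
t_score = t_mixture_logpdf(outlier, weights, mus, sigmas, nus)
g_score = gaussian_logpdf(outlier, 0.0, 1.0)
# The heavy tails of the t mixture penalize the outlier far less than the Gaussian.
print(t_score > g_score)
```

Because the t density decays polynomially rather than exponentially in the tails, a single extreme measurement contributes a bounded penalty to the likelihood of a state, which is the intuition behind using it as a robust emission distribution.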