Subject: RNA Secondary Structure
Presenter: Mirela Andronescu
Abstract
Algorithms for predicting the secondary structure of pairs and combinatorial
sets of nucleic acid strands
Secondary structure prediction of nucleic acid molecules is a very
important problem in computational molecular biology. In this thesis we
introduce two new algorithms for: (1) secondary structure prediction of
pairs of nucleic acid molecules (PairFold), and (2) finding which
sequences, formed from a combinatorial set of nucleic acid strands,
have the most stable secondary structures (CombFold). Our algorithms
run in time polynomial in the sequence lengths and are extensions of
the free energy minimization algorithm for secondary structure
prediction without pseudoknots, using the nearest neighbour
thermodynamic model. Predicting hybridization of pairs of molecules is
motivated by important applications such as ribozyme-mRNA target
duplexes, primer binding prediction and DNA code design. Finding the
most stable concatenations in combinatorial sets of strands is useful
for SELEX experiments and for testing whether strand sets used in DNA
computing or tag libraries concatenate without secondary structure. Our results for
PairFold predictions show over 80% accuracy for sequences of up to 100
nucleotides. Accuracy decreases as the sequences grow longer and as the
number of non-canonical base pairs, pseudoknots and tertiary
interactions increases; none of these are modeled here. The
accuracy of CombFold is similar to that of the free energy minimization
algorithm for single strands, since CombFold is simply a polynomial-time
method for structure prediction over a combinatorial set of strands. We
show that, although complex, CombFold can quickly predict large concatenations of
sets drawn from the literature. In the future, these two algorithms can
be combined to predict the most stable duplexes formed by two
combinatorial sets.
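
The two problems described above can be illustrated with a toy sketch. It is
not PairFold or CombFold: it swaps the nearest neighbour free energy model for
simple Nussinov-style base pair counting, and it enumerates the combinatorial
set by brute force (exponential in the number of sets) rather than in
polynomial time. The function names, the RNA alphabet, the minimum hairpin
loop size of 3, and the treatment of a strand pair as one concatenated
sequence whose junction-spanning pairs are exempt from the loop constraint are
all assumptions made for illustration.

```python
from itertools import product

# Watson-Crick and G-U wobble pairs (RNA alphabet assumed).
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs(seq, cut=None, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in seq.

    cut is the index of the first nucleotide of a second strand, or None
    for a single strand. Pairs that span the cut are intermolecular and
    are exempt from the minimum hairpin loop size.
    """
    n = len(seq)
    if n == 0:
        return 0

    def can_pair(i, j):
        if (seq[i], seq[j]) not in CANONICAL:
            return False
        spans_cut = cut is not None and i < cut <= j
        return spans_cut or (j - i - 1 >= min_loop)

    # best[i][j] holds the maximum number of pairs within seq[i..j].
    best = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            score = best[i + 1][j]  # case: position i is unpaired
            for k in range(i + 1, j + 1):  # case: i pairs with k
                if can_pair(i, k):
                    inner = best[i + 1][k - 1] if k - 1 > i else 0
                    outer = best[k + 1][j] if k < j else 0
                    score = max(score, 1 + inner + outer)
            best[i][j] = score
    return best[0][n - 1]


def pair_max_pairs(strand_a, strand_b):
    """Score a two-strand duplex by folding the concatenation a + b."""
    return max_pairs(strand_a + strand_b, cut=len(strand_a))


def best_concatenation(strand_sets):
    """Brute force: pick one strand per set, return the concatenation
    with the most base pairs (first one found in case of ties)."""
    candidates = ("".join(choice) for choice in product(*strand_sets))
    return max(candidates, key=max_pairs)
```

For example, pair_max_pairs("GGGG", "CCCC") scores the fully paired duplex,
while max_pairs("GGGGCCCC") scores the same letters as a single strand, where
the hairpin loop constraint rules out the innermost pairs.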
MSci thesis presentation