Subject: A new DNA word design strategy using random de Bruijn Sequences
Presenter: Christine Heitsch
Abstract:

In the nucleus, lengthy DNA molecules have a canonical double-stranded helix structure well-adapted for information storage and retrieval. In the laboratory, short single-stranded DNA sequences have many possible applications, ranging from microarrays to nanomolecular structures and DNA computation. Computer technology operates on a binary code of zeros and ones; the genetic code, however, is a four-letter alphabet with energetics driven by the Watson-Crick base pairings. Thus, the essential challenge is designing sets of short oligonucleotides, or "DNA words," whose elements are strongly differentiated from each other with respect to the biochemical energetics.

Under the model that stable (mis)hybridization begins at a small region with perfect pairing, we control problematic interactions by preventing a nucleation complex. We address the complement problem by restricting repeated substrings, adopting de Bruijn sequences as our mathematical basis for noninteracting DNA segments. We provide an algorithm for generating these sequences uniformly at random from the 1.89 x 10^20 total possibilities and analyze its performance. The program's output is first selected according to our criteria and then tested against the predicted biochemical properties. This screening solves the reverse complement and inverted repeat problems. We then experimentally verify the desired biochemical properties of our DNA words. Finally, we discuss our ability to engineer strings of nucleotide bases with specified characteristics as it pertains to current and future applications.
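As a rough illustration of the ideas above, the following Python sketch shows one standard way to draw a cyclic de Bruijn sequence over {A, C, G, T} uniformly at random: sample a uniform spanning tree of the de Bruijn graph with Wilson's algorithm, then walk an Eulerian circuit that uses each vertex's tree edge last. This is a textbook construction and is not claimed to be the algorithm presented in the talk. The count formula (k!)^(k^(n-1)) / k^n gives about 1.89 x 10^20 for k = 4 and n = 3, which suggests sequences of order 3 and length 64, although the abstract does not state the order. All function names, and the nucleation length k passed to rc_conflicts, are assumptions made for illustration.

```python
import random
from itertools import product
from math import factorial

ALPHABET = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def count_de_bruijn(k, n):
    """Number of cyclic de Bruijn sequences of order n over k letters."""
    return factorial(k) ** (k ** (n - 1)) // k ** n

def step(node, letter):
    """Follow the edge labelled `letter` out of an (n-1)-mer node."""
    return (node + letter)[1:]

def random_de_bruijn(n, rng=random):
    """Sample a cyclic de Bruijn sequence of order n (n >= 2), length 4**n."""
    nodes = ["".join(p) for p in product(ALPHABET, repeat=n - 1)]
    root = nodes[0]
    # Wilson's algorithm: loop-erased random walks build a uniform
    # spanning tree whose edges point toward the root.
    tree_edge, in_tree = {}, {root}
    for start in nodes:
        u = start
        while u not in in_tree:
            tree_edge[u] = rng.choice(ALPHABET)  # keep only the last exit
            u = step(u, tree_edge[u])
        u = start
        while u not in in_tree:
            in_tree.add(u)
            u = step(u, tree_edge[u])
    # Random exit order at each node, with the tree edge forced last
    # (the root has no tree edge, so its order is fully random).
    order = {}
    for u in nodes:
        letters = [c for c in ALPHABET if c != tree_edge.get(u)]
        rng.shuffle(letters)
        if u in tree_edge:
            letters.append(tree_edge[u])
        order[u] = letters
    # Walking the edges in that order traces an Eulerian circuit.
    seq, u = [], root
    for _ in range(4 ** n):
        c = order[u].pop(0)
        seq.append(c)
        u = step(u, c)
    return "".join(seq)

def cyclic_kmers(seq, k):
    """All length-k windows of seq, read cyclically."""
    ext = seq + seq[:k - 1]
    return [ext[i:i + k] for i in range(len(seq))]

def is_de_bruijn(seq, n):
    """True if every n-mer over ACGT occurs exactly once cyclically."""
    return len(seq) == 4 ** n and len(set(cyclic_kmers(seq, n))) == 4 ** n

def reverse_complement(s):
    return "".join(COMPLEMENT[c] for c in reversed(s))

def rc_conflicts(words, k):
    """k-mers in the word set whose reverse complement also occurs in it."""
    seen = {w[i:i + k] for w in words for i in range(len(w) - k + 1)}
    return {m for m in seen if reverse_complement(m) in seen}

if __name__ == "__main__":
    s = random_de_bruijn(3)
    print(s, is_de_bruijn(s, 3), count_de_bruijn(4, 3))
```

Because a full de Bruijn sequence contains every n-mer, it necessarily contains reverse complement pairs as well, so a check such as rc_conflicts would be applied to candidate words cut from the sequence rather than to the sequence as a whole. That corresponds only loosely to the selection step described above; the actual selection criteria are not given in the abstract.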