Title: TreeZip: A New Algorithm for Compressing Large-Scale Phylogenetic Tree Collections
Speaker: Tiffani Williams
Department of Computer Science
Texas A&M University, USA

Phylogenetic trees are family trees that represent the relationships between a group of organisms, or taxa. The most popular techniques for reconstructing phylogenetic trees intelligently navigate an exponentially sized tree space by solving NP-hard optimization problems that best hypothesize the evolutionary history for a given set of taxa. Instead of reconstructing a single tree, these heuristics often return tens of thousands to hundreds of thousands of trees that represent equally plausible hypotheses for how the taxa of interest evolved from a common ancestor. As biologists attempt to reconstruct increasingly large phylogenies, such as the Tree of Life, these tree collections will continue to grow in size.

In this talk, I will discuss TreeZip, a new compression algorithm for phylogenetic trees. Our experimental results show that, compared with standard compression algorithms, TreeZip is an effective approach for compressing large-scale phylogenetic tree collections. As the size of tree collections continues to increase, compression algorithms such as TreeZip will be critical for helping biologists manage and share their rapidly expanding phylogenetic tree collections.


Tiffani L. Williams is an Assistant Professor in the Department of Computer Science at Texas A&M University. She earned her B.S. in computer science from Marquette University and her Ph.D. in computer science from the University of Central Florida. Afterward, she was a postdoctoral fellow at the University of New Mexico. Her honors include a Radcliffe Institute Fellowship, an Alfred P. Sloan Foundation Postdoctoral Fellowship, and a McKnight Doctoral Fellowship. Her research interests are in the areas of bioinformatics and high-performance computing.