Title: Statistical Approaches to RNA Secondary Structure Prediction and Applications
Speaker: Ye Ding
Wadsworth Center, New York State Department of Health

Abstract: RNAs are versatile regulators of gene expression. RNA secondary structures are known to be important for the regulatory functions of various types of RNAs. An RNA molecule, particularly a long-chain mRNA, may exist as a population of structures in the cell. Furthermore, multiple structures have been demonstrated to play important functional roles. Thus, a representation of the ensemble of probable structures is of interest. We developed a statistical algorithm to sample rigorously and exactly from the Boltzmann ensemble of secondary structures, and introduced the notion of centroid structures as a new class of structure predictors. These approaches can overcome inherent limitations of conventional algorithms and form the basis of our Sfold RNA folding program (http://sfold.wadsworth.org).

MicroRNAs are small non-coding RNAs that repress protein synthesis by binding to target mRNAs in multicellular eukaryotes. Identification of microRNA targets is essential to fully understand this new dimension of the complex gene regulatory networks. Using a two-step model of microRNA:target hybridization, we found that target secondary structure has a major impact on target recognition by microRNAs. Based on analyses of large microRNA targeting data sets using the model parameters and other sequence and conservation features, we have recently developed a novel computational framework that offers a major improvement over established algorithms for the prediction of microRNA targets. Computational tools are available through the Sfold web server.

Short Bio for Dr. Ye Ding
Dr. Ye Ding is a statistician by training and a Principal Investigator at Wadsworth Center, New York State Department of Health. In 1985, he obtained a BS in Mathematics (Probability and Statistics major) from Beijing University, Beijing, P.R. China. In 1986 and 1990, respectively, he received an MS and a Ph.D. in Statistics from Carnegie Mellon University, Pittsburgh, PA, USA. Since the late 1990s, he has been working on novel algorithms for RNA secondary structure prediction and on applications to the rational design of RNA-targeting nucleic acids and the identification of targets for regulatory RNAs. He is the developer of the now widely used Sfold software for RNA folding and applications (http://sfold.wadsworth.org), available to the scientific community since April 2003. The Sfold web server has registered over 219,000 visits by scientists around the world to fold over 130,000 nucleotide sequences. Sfold has been featured by Science NetWatch, Nature Research Highlights, a NAR front cover, a feature article by Genome Technology, Faculty of 1000 Biology evaluations, and a four-page keynote interview article by Research Media. Sfold has been used by other scientists in the teaching of bioinformatics and biological chemistry. The original algorithmic ideas of ensemble sampling and centroid representation have been adopted by others not only for RNA problems, but also for other fundamental problems in computational biology and genomics, including sequence alignment, phylogenetic modeling, prediction of cis-regulatory sites for transcription regulation, and protein modeling.