A replica exchange Monte Carlo algorithm for protein folding in the HP model


Chris Thachuk, Alena Shmygelska and Holger H Hoos

 

Background

The ab initio protein folding problem consists of predicting protein tertiary structure from a given amino acid sequence by minimizing an energy function; it is one of the most important and challenging problems in biochemistry, molecular biology and biophysics. The problem is also computationally hard: it has been shown to be NP-hard even when conformations are restricted to a lattice. In this work, we implement and evaluate the replica exchange Monte Carlo (REMC) method, which has already been applied very successfully to more complex protein models and to other optimization problems with complex energy landscapes, in combination with the highly effective pull move neighbourhood, in two widely studied Hydrophobic Polar (HP) lattice models.
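To make the energy function concrete, here is a minimal sketch of the standard HP energy, in which every pair of hydrophobic (H) residues that are lattice neighbours but not adjacent in the chain contributes -1. The function name `hp_energy` and the representation of a conformation as a list of integer coordinate tuples are assumptions made for illustration only; they are not taken from the released source code.

```python
def hp_energy(sequence, coords):
    """Return the HP energy of a lattice conformation (illustrative sketch).

    sequence: string over {'H', 'P'}.
    coords:   one integer coordinate tuple per residue, e.g. (x, y) on the
              square lattice or (x, y, z) on the cubic lattice.
    """
    coords = [tuple(c) for c in coords]
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    index_at = {c: i for i, c in enumerate(coords)}
    energy = 0
    for i, c in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only in the positive direction along each axis so that
        # every contact is counted exactly once.
        for axis in range(len(c)):
            neighbour = c[:axis] + (c[axis] + 1,) + c[axis + 1:]
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy
```

For example, the sequence `HPPH` folded into a unit square, `[(0, 0), (1, 0), (1, 1), (0, 1)]`, brings its two H residues into contact, whereas the fully extended chain has no contacts at all.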

Results

We demonstrate that REMC is highly effective for solving instances of the square (2D) and cubic (3D) HP protein folding problem. When using the pull move neighbourhood, REMC outperforms current state-of-the-art algorithms for most benchmark instances. Additionally, we show that this new algorithm provides a larger ensemble of ground-state structures than the existing state-of-the-art methods. Furthermore, it scales well with sequence length, and it finds significantly better conformations on long biological sequences and on sequences with a provably unique ground-state structure, a property believed to be characteristic of real proteins. We also present evidence that our REMC algorithm can fold, relatively easily, sequences that exhibit significant interaction between termini in the hydrophobic core.

Conclusions

We demonstrate that REMC utilizing the pull move neighbourhood significantly outperforms current state-of-the-art methods for protein structure prediction in the HP model on 2D and 3D lattices. This is particularly noteworthy, since so far, the state-of-the-art methods for 2D and 3D HP protein folding - in particular, the pruned-enriched Rosenbluth method (PERM) and, to some extent, Ant Colony Optimisation (ACO) - were based on chain growth mechanisms. To the best of our knowledge, this is the first application of REMC to HP protein folding on the cubic lattice, and the first extension of the pull move neighbourhood to a 3D lattice.
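For readers unfamiliar with replica exchange, the following is a minimal sketch of the textbook exchange step between two replicas held at neighbouring temperatures. It uses the standard parallel tempering acceptance rule min(1, exp((1/T_i - 1/T_j)(E_i - E_j))); the function names, the `(energy, conformation)` pair representation and the temperature values are assumptions for illustration and do not reproduce the parameterisation used in the paper or the released code.

```python
import math
import random


def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Standard replica exchange acceptance probability (illustrative)."""
    delta = (1.0 / temp_i - 1.0 / temp_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(replicas, temperatures, k, rng=random):
    """Try to exchange the conformations held at temperatures k and k + 1.

    replicas: list of (energy, conformation) pairs, one per temperature.
    Returns True if the exchange was accepted (replicas is updated in place).
    """
    energy_a = replicas[k][0]
    energy_b = replicas[k + 1][0]
    p = swap_probability(energy_a, temperatures[k],
                         energy_b, temperatures[k + 1])
    if rng.random() < p:
        replicas[k], replicas[k + 1] = replicas[k + 1], replicas[k]
        return True
    return False
```

Under this rule, an exchange that moves the lower-energy conformation to the colder replica is always accepted, while the reverse exchange is accepted only with probability exp(delta) < 1. Between exchange attempts, each replica would run its own Monte Carlo search (for instance with pull moves) at its fixed temperature.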


Link to the full paper: A replica exchange Monte Carlo algorithm for protein folding in the HP model. Chris Thachuk, Alena Shmygelska and Holger H Hoos, BMC Bioinformatics 2007, 8:342 (17 Sep 2007).

Downloads:


Readme.txt file

Linux executable for folding HP sequences in 2D (square lattice) - Local Moves

Linux executable for folding HP sequences in 2D (square lattice) - Pull Moves

Linux executable for folding HP sequences in 2D (square lattice) - Local and Pull Moves

Linux executable for folding HP sequences in 3D (cubic lattice) - Local Moves

Linux executable for folding HP sequences in 3D (cubic lattice) - Pull Moves

Linux executable for folding HP sequences in 3D (cubic lattice) - Local and Pull Moves

Source Code:


The following source code is released as is, under the GNU General Public License, version 2. If you find the code useful, please reference our paper.

source code (.tar.gz)

Last updated on October 22, 2007 by Chris Thachuk: cthachuk [at] cs.ubc.ca
