Title: Improving predictive power of the replica exchange Monte Carlo algorithm for protein folding by application of more complex energy and lattice models
Department of Computer Science, University of British Columbia
The replica exchange Monte Carlo (REMC) algorithm for protein folding in the HP model is a state-of-the-art on-lattice ab initio protein structure prediction algorithm. We build on this existing framework and investigate each component of the system: an energy function that directs the search towards the native conformation, a lattice model that discretizes the representation of protein conformations in space, and a search strategy. REMC is not, in principle, restricted to lattice models, but our work focuses on lattice proteins only. In particular, we generalize REMC to support a variety of more complex lattice models (cubic, face-centered-cubic and double-tetrahedral) and energy functions (hHPNX and β-sheet potential). Our goal is to improve the quality of the structures predicted by the algorithm, measured by the resemblance between predicted and real structures.
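To make the role of the energy function concrete, the following is a minimal sketch of the basic HP energy on a cubic lattice. It assumes the usual HP convention, in which every pair of hydrophobic (H) residues that are lattice neighbours but not consecutive in the chain contributes -1. The function name `hp_energy` and the input format (a string of `H`/`P` characters plus a list of integer coordinate triples) are illustrative assumptions, not taken from the work itself, and the hHPNX and β-sheet potentials are not covered.

```python
from itertools import combinations


def manhattan(a, b):
    """Manhattan distance between two integer lattice points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hp_energy(sequence, coords):
    """Illustrative HP energy on a cubic lattice (hypothetical helper).

    sequence: string over {'H', 'P'}
    coords:   list of (x, y, z) integer tuples, one per residue
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    # Consecutive residues must occupy neighbouring lattice sites.
    for a, b in zip(coords, coords[1:]):
        if manhattan(a, b) != 1:
            raise ValueError("chain is not connected on the cubic lattice")

    energy = 0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i < 2:
            continue  # chain neighbours do not count as contacts
        if sequence[i] == "H" and sequence[j] == "H":
            if manhattan(coords[i], coords[j]) == 1:
                energy -= 1  # one H-H contact
    return energy


# A four-residue chain folded into a square: residues 0 and 3 are in contact.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
print(hp_energy("HPPH", square))
```

Supporting another lattice in this sketch would mainly mean replacing the neighbour test (Manhattan distance equal to 1) with the neighbour relation of that lattice.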
We demonstrate that REMC is highly effective for solving β-sheet proteins in the homopolymer model: the enhanced version of REMC outperforms the current state-of-the-art algorithm on all β-sheet homopolymer benchmarks. We also compare the quality of conformations predicted using different combinations of energy and lattice models. Conformations of biological sequences predicted using the HP and hHPNX energy functions on cubic, face-centered-cubic and double-tetrahedral lattices are compared to their respective real structures, and we find that the differences in quality are insignificant. It is possible that our algorithm is not fully optimized to use these energy and lattice models in the most effective way, but we believe that the energy function is more likely to be the primary cause of the low result quality. We have embedded biological protein structures onto lattices and found evidence that lattice models are, in principle, capable of representing native structures relatively closely. We have also contrasted the energy values of embedded structures with those of predicted structures. The embedded structures have higher energy values than the predicted structures, which suggests that the energy function may not be directing the search towards the correct terminal state.
REMC is a highly effective search strategy for exploring the vast conformational space of any given amino acid sequence in the on-lattice protein folding problem. As our results demonstrate, the potential of lattice models may not be fully realized without more accurate energy functions. In particular, energy models that account for both backbone and side chain interactions are worthy of further study.
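For readers unfamiliar with the search strategy, the exchange step that gives REMC its name can be sketched as below. The sketch assumes the standard Metropolis-style swap criterion from the replica exchange (parallel tempering) literature, in which two replicas with energies E_i, E_j and inverse temperatures β_i, β_j exchange conformations with probability min(1, exp((β_i − β_j)(E_i − E_j))). The abstract does not state the exact criterion or temperature schedule used, so the function names and the example values here are hypothetical.

```python
import math
import random


def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Probability of exchanging the conformations of two replicas.

    Assumes the standard replica exchange criterion
    min(1, exp((beta_i - beta_j) * (E_i - E_j))) with beta = 1 / T.
    """
    beta_i = 1.0 / temp_i
    beta_j = 1.0 / temp_j
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(energy_i, temp_i, energy_j, temp_j, rng=random):
    """Return True if the proposed exchange is accepted."""
    return rng.random() < swap_probability(energy_i, temp_i, energy_j, temp_j)


# The colder replica (T = 1.0) holds the worse conformation (E = -2), so
# handing it the better one (E = -5) is always accepted.
print(swap_probability(-2, 1.0, -5, 2.0))
# The reverse exchange is accepted only with probability exp(-1.5).
print(swap_probability(-5, 1.0, -2, 2.0))
```

Under this criterion, low-energy conformations tend to migrate to low-temperature replicas, while high-temperature replicas keep exploring the conformational space more freely.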