Module 4: Protein Structure (2)


Given an energy function (force field), we want to use a "genetic algorithm" to find its global minimum.
Genetic Algorithm:


To minimize an objective function over a search space.

Can choose:

Important points:

Historically, GA is defined to operate on binary data.

Thus, candidate solutions are bit vectors, and a good encoding scheme is needed.
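To make the bit-vector view concrete, here is a minimal sketch of a binary GA. The objective `f(x) = (x - 1)^2`, the fixed-point `decode` encoding, and all parameter values are assumptions chosen for illustration; they are not from the lecture, and a real force field would replace `objective`.

```python
import random

def decode(bits, lo=-5.0, hi=5.0):
    """Map a bit vector to a real number in [lo, hi] (simple fixed-point encoding)."""
    value = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)

def objective(bits):
    """Toy objective (hypothetical): f(x) = (x - 1)^2, minimized at x = 1."""
    x = decode(bits)
    return (x - 1.0) ** 2

def genetic_algorithm(n_bits=16, pop_size=20, generations=100,
                      mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = min(population, key=objective)
    for _ in range(generations):
        next_population = [best[:]]  # elitism: always keep the best individual
        while len(next_population) < pop_size:
            # Tournament selection (size 2) for each parent.
            p1 = min(rng.sample(population, 2), key=objective)
            p2 = min(rng.sample(population, 2), key=objective)
            # One-point crossover.
            cut = rng.randint(1, n_bits - 1)
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            next_population.append(child)
        population = next_population
        best = min(population, key=objective)
    return best, decode(best), objective(best)
```

Calling `genetic_algorithm()` returns the best bit vector, its decoded value, and its objective value. How well the search works depends heavily on the encoding: with plain binary coding, a single bit flip can change the decoded value by a very large or a very small amount.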

Evolutionary algorithm (EA): like a GA, but works directly on non-binary data.
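As a rough illustration of working directly on non-binary data, the sketch below evolves a vector of real-valued angles with Gaussian mutation and no bit encoding. The function `toy_energy` is a hypothetical stand-in for a force field, and the (mu + lambda) selection scheme and all parameters are assumptions, not the setup used in the lecture.

```python
import math
import random

def toy_energy(angles):
    """Hypothetical stand-in for a force field: 0.0 when every angle is -60 degrees."""
    return sum(1.0 - math.cos(math.radians(a + 60.0)) for a in angles)

def wrap(angle):
    """Keep an angle in [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def evolutionary_algorithm(n_angles=8, pop_size=10, generations=200,
                           sigma=15.0, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-180.0, 180.0) for _ in range(n_angles)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Each parent produces one child by Gaussian mutation of every angle.
        children = [[wrap(a + rng.gauss(0.0, sigma)) for a in parent]
                    for parent in population]
        # (mu + lambda) selection: keep the pop_size best of parents and children.
        population = sorted(population + children, key=toy_energy)[:pop_size]
    best = min(population, key=toy_energy)
    return best, toy_energy(best)
```

Because parents compete with their children, the best energy in the population never gets worse from one generation to the next.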

Outline application of EA to tertiary structure prediction.

Results (how good is this?)


Applying the EA (1000 generations, 10 individuals) gives very poor results: the structures found are quite different from native Crambin.


The energy model is not good enough: according to the model, the energy of the resulting structure is much lower than the energy of the native structure.

Note: The approach is much more successful for side-chain packing.

Note: The EA can generally be improved by using more sophisticated (problem-specific) search operators (here: "local twist").

Protein Secondary Structure Prediction

Algorithmic approaches:

Based on the probability of finding certain amino acids (AAs) in a given secondary structure (estimated from the PDB).

This method gives only about 50% prediction accuracy.

e.g. Predict a segment of 6 residues as an α-helix if:

E[Pα] > 1.03

E[Pα] > E[Pβ]

The segment does not include proline.

Accuracy ≈ 63%
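The helix rule can be sketched as a sliding-window scan. This assumes that E[P] means the average propensity over the 6-residue window. The tables `P_ALPHA` and `P_BETA` hold illustrative placeholder values for a handful of residues only; they are not the values estimated from the PDB, which would have to be supplied for real use.

```python
def predict_helix_windows(sequence, p_alpha, p_beta, window=6, threshold=1.03):
    """Return start indices of windows that satisfy all three helix conditions."""
    hits = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        mean_alpha = sum(p_alpha[aa] for aa in segment) / window
        mean_beta = sum(p_beta[aa] for aa in segment) / window
        if (mean_alpha > threshold          # E[P_alpha] > 1.03
                and mean_alpha > mean_beta  # E[P_alpha] > E[P_beta]
                and "P" not in segment):    # no proline in the segment
            hits.append(start)
    return hits

# Illustrative placeholder propensities (NOT a published table).
P_ALPHA = {"A": 1.4, "E": 1.5, "L": 1.2, "V": 1.0, "G": 0.6, "P": 0.6}
P_BETA = {"A": 0.8, "E": 0.4, "L": 1.3, "V": 1.7, "G": 0.8, "P": 0.6}

helix_starts = predict_helix_windows("AEEALAGPVVVV", P_ALPHA, P_BETA)
```

With these placeholder values, only the windows starting at positions 0 and 1 qualify; every later window contains the proline at position 7.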

Neural networks are typically organized in layers. Layers are made up of a number of interconnected 'nodes', which contain an 'activation function'. Patterns are presented to the network via the 'input layer', which communicates to one or more 'hidden layers' where the actual processing is done via a system of weighted 'connections'. The hidden layers then link to an 'output layer' where the answer is output as shown in the graph above.
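The layered structure described above can be sketched as a plain forward pass. The sigmoid activation, the layer sizes, and the random (untrained) weights are assumptions for illustration; the notes do not say which activation function or architecture is used for secondary structure prediction.

```python
import math
import random

def sigmoid(x):
    """A common choice of activation function (assumed here)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum of the inputs, then the activation."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, network):
    """Pass an input pattern through each (weights, biases) layer in turn."""
    activations = inputs
    for weights, biases in network:
        activations = layer(activations, weights, biases)
    return activations

def random_network(sizes, seed=0):
    """Build random, untrained layers, e.g. sizes=[4, 3, 2] for input, hidden, output."""
    rng = random.Random(seed)
    return [([[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
             [rng.uniform(-1, 1) for _ in range(n_out)])
            for n_in, n_out in zip(sizes, sizes[1:])]

network = random_network([4, 3, 2])
output = forward([1.0, 0.0, 0.0, 1.0], network)
```

Here `output` holds one value per output node, each between 0 and 1. Training (adjusting the weights so that the outputs match known secondary structures) is not shown.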