CPSC 445 - Assignment 3

released: 2016/03/10, thu, 15:30; due: 2016/03/18, fri, 18:00
Include your name, student ID, and all classmates who you worked with.

1. Phylogenetic Trees / Parsimony [30 marks]

(a) Calculate the parsimony score of the two (rooted) trees shown below, using Fitch's algorithm. (Show all steps of your work) [6 marks]

(b) For tree 2, show all the possible internal label combinations that produce the same parsimony score as that calculated in (a). Label which combinations can not be produced by Fitch's algorithm (if any exist). [9 marks]

(c) Calculate the parsimony score of the (rooted) tree shown below, using Fitch's algorithm (Show all your work) [10 marks]

(d) Given n species and m characters for each species, what is the time-complexity of solving the small parsimony problem using Fitch's algorithm, assuming that each set operation takes one time unit? Explain your answer. [5 marks]

(e) Large Parsimony. Briefly summarize each of the techniques given in class for solving the large parsimony problem. [5 marks]

(f) Determine the time complexity of doing one step of the best improvement algorithm. [5 marks]

(g) Describe a good heuristic for breaking out of local optima when performing Stochastic Local Search. [5 marks]

2. Constructing Phylogenetic Trees using UPGMA and Neighbour-Joining [25 marks]

Read pages 169-173 of Durbin et al. before attempting the following problems.

(a) Visit the following website http://mobyle.pasteur.fr/cgi-bin/portal.py#forms::neighbor. Load data by clicking the Use example data button. Click on Advanced Options to choose a distance method. Run the program once with UPGMA and once with Neighbour-joining. Submit both trees produced (take screenshots or draw them if you need to). [2 marks]

(b) Using the UPGMA tree from (a), show all the steps involved in producing the tree from UPGMA. [8 marks]

(c) Using the Neighbour-Joining tree from (a), show all the steps involved in producing the tree from Neighbour Joining. [10 marks]

(d) If the trees produced are different, explain where they are different and why this difference occurs. You may refer to parts (b) and (c). [5 marks]

3. Probabilistic Approaches to Phylogeny [15 marks]

Read pages 193-203 of Durbin et al. and focus on your understanding of Felsenstein's Algorithm.

Describe Felsenstein's Algorithm using plain english, illustrations, and pseudo-code if needed. The aim here is to be able to explain to a colleague, as compactly as possible, when, why and how to use this algorithm, and what each step of the algorithm is responsible for. Keep your answer as precise and concise as possible. [15 marks]

4. Searching Over Trees Using the Metropolis Algorithm [15 marks]

Read pages 206-208 of Durbin et al. and focus on your understanding using Maximum Likelihood for inference and the Metropolis Algorithm.

Consider a simplified phylogentic space consisting of two trees T and T' with probabilities P(T) and P(T'). If the proposal procedure always selects the other tree, i.e., the one that is not the current tree, show that the Metropolis Algorithm produces a sequence where the frequencies of T and T' converge to their probabilities. (Problem 8.8 from page 208 of Durbin et al.) [15 marks]

General remarks:

The assignment has to be handed in on the date it is due before 18:00. To ensure fairness, late hand-ins will generally not be accepted (exceptions can only be made for officially documented medical reasons). Please send your solution via e-mail (as plain text or - possibly scanned - PDF) to Robbie and Holger by this deadline.
This assignment should take you no longer than about 4 hours to complete, if you have good knowledge of the topics covered. However, don't wait until the last minute relying on this estimate - it might not apply to you (or anyone at all), you might need additional time to consult the literature, etc.
While cooperation between students - especially between CS and non-CS students - is encouraged, each student is expected to work out the actual solutions to the problems individually and hand in their own assignment. In other words: help each other, but do not copy solutions.
Feel always free to contact Holger or Robbie if you feel you need further help than can be provided by your fellow students.