**NOTE: Only CPSC 445 students are expected to hand in this assignment;
for them, it is part of their exam preparation.**

Consider the following set of points in two-dimensional Euclidean space, which you may imagine to be a (small) subset of results from a gene expression analysis experiment.

P1 = (3, 2)

P2 = (2, 4)

P3 = (4, 7)

P4 = (7, 1)

P5 = (7, 8)

P6 = (9, 10)

**(a)** Complete the following table of pairwise Euclidean distances
between the points, where all distances are rounded to one decimal place
(i.e., show 2.2 instead of 2.2360679...).
[**1 marks**]

P1 | P2 | P3 | P4 | P5 | P6 | |

P1 | 0 | 2.2 | 5 | 4.1 | 7.2 | ? |

P2 | ? |
3.6 | 5.8 | 6.4 | 9.2 | |

P3 | 0 | 6.7 | 3.1 | 5.8 | ||

P4 | 0 | ? |
9.2 | |||

P5 | 0 | 2.8 | ||||

P6 | 0 |

**(b)** Perform agglomerative hierarchical single linkage clustering
on the six data points and show the resulting cluster tree.
(Show all steps.)
[**3 marks**]

**(c)** Perform agglomerative hierarchical complete linkage clustering
on the six data points and show the resulting cluster tree.
(Show all steps.)
[**3 marks**]

**(d)** Perform k-means clustering with k=3
on the six data points, starting with the following
clustering: {{P1,P2,P5}, {P4}, {P3,P6}}. (Show all steps.)
[**3 marks**]

1: a b -> e_out

2: c a -> c a_out

3: c' b -> c' b_out

Consider an initial state in which membrane 2 contains 3 copies of a and one copy of c, and membrane 3 contains 2 copies of b and 2 copies of c' each.

**(a)** Show all steps of this system until the computation
stops, i.e., no rules can be applied in any of the membranes.
[**4 marks**]

**(b)** What changes if you start the computation with
only one instead of two copies of c' in membrane 3?
[**2 marks**]

- The assignment has to be handed in on the date it is due before 16:00. To ensure fairness, extensions will only be granted under exceptional circumstances (i.e., medical reasons). Handin in the form of a PDF file via e-mail to Holger and Baharak is preferred. Otherwise, hand in your assignment at the CS department office or slide it underneath Holger's office door.
- This assignment should take you about 2-3 hours of work, if you have good knowledge of the topics covered and did all reading assignments. However, don't wait until the last minute relying on this estimate - it might not apply to you (or anyone at all), you might need additional time to consult the literature, ...
- While cooperation between students - especially between CS and non-CS students - is encouraged, each student is expected to work out the actual solutions to the problems individually and hand in their own assignment. In other words: help each other, but do not copy solutions.
- Feel always free to contact Holger, Baharak or Sanja if you feel you need further help than can be provided by your fellow students.