538B Distributed Systems: Assignment 3

Due: Feb 3rd at 9PM

Winter 2015

In this assignment you will learn about leader election and membership management. You will design and implement a leader election algorithm that builds on a simple key-value service. The leader will use the key-value service to track and advertise the set of active nodes in the system. Your implementation will be resilient to a few kinds of failure.

Overview: You will be provided with an implementation of a simple key-value service: an RPC-accessible hash table that associates keys with values. Nodes in the system must not communicate with each other directly; all communication must go through this service.

Your task is to implement node logic that allows an arbitrary set of active nodes to agree on a "leader" node. If the leader fails, the remaining nodes should elect a new leader. Once elected, the leader must determine the active nodes in the system and advertise this set (through the key-value service). The active nodes set may change (as nodes may fail or join the system) and the leader must re-advertise the node set to reflect these events. Active nodes should periodically retrieve this list of active nodes.

Individual keys in the key-value service may experience permanent unavailability. Your node implementation must be robust to such unavailability and continue to elect leaders that will properly advertise the set of active nodes.

Key-value service: The key-value service associates string keys with string values. Think of it as a remote hash table, providing the same strong consistency semantics that you would expect of a local hash table. The service starts with an empty hash table: every key is mapped to the empty string "". The service supports the following three atomic operations:

  • curr-value, vstamp' ← get(key, vstamp)
    • curr-value contains the current value associated with key, or is set to "unavailable" if the key is unavailable.
  • ret-val, vstamp' ← put(key, value, vstamp)
    • Associates value with key. ret-val is either "", which indicates success, or is set to "unavailable" if the key is unavailable.
  • curr-value, vstamp' ← testset(key, test-value, new-value, vstamp)
    • Tests whether the current value associated with key is test-value. If so, it associates new-value with key and returns new-value; otherwise, it returns the value currently associated with key. curr-value is set to "unavailable" if the key is unavailable.
Note the additional vstamp argument and vstamp' return value in each of the above calls. The key-value service maintains a GoVector-based vector clock for each key: the vstamp timestamp argument updates this clock, and vstamp' is the resulting latest key-specific vector-clock timestamp. The underlying buffer wrapped by GoVector to generate vstamp' (using govec.GoLog.PrepareSend) is nil; likewise, the buffer used to generate vstamp is irrelevant and may be set to nil.
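
To make the calling convention concrete, here is a minimal sketch of invoking the service over Go's net/rpc. The RPC name ("KeyValService.Get") and the argument/reply struct layouts below are assumptions made purely for illustration; use the definitions that ship with the provided key-value service implementation.

    package main

    // Minimal sketch of calling the key-value service over Go's net/rpc.
    // The RPC name and the argument/reply struct layouts are assumptions
    // for illustration only; use the definitions from the provided
    // key-value service implementation.

    import (
        "fmt"
        "log"
        "net/rpc"
    )

    // Hypothetical argument/reply types.
    type GetArgs struct {
        Key    string
        Vstamp []byte // vector-clock timestamp; may be nil
    }

    type KVReply struct {
        Value  string // curr-value / ret-val, or "unavailable"
        Vstamp []byte // vstamp': the updated key-specific timestamp
    }

    func main() {
        client, err := rpc.Dial("tcp", "localhost:2020") // ip:port of the service
        if err != nil {
            log.Fatal("cannot reach key-value service: ", err)
        }
        defer client.Close()

        var reply KVReply
        err = client.Call("KeyValService.Get", GetArgs{Key: "leader-0", Vstamp: nil}, &reply)
        if err != nil {
            log.Fatal("get failed: ", err)
        }
        if reply.Value == "unavailable" {
            fmt.Println("key leader-0 is permanently unavailable")
        } else {
            fmt.Println("current value of leader-0:", reply.Value)
        }
    }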

One of the arguments to the key-value service implementation is a key failure probability, which controls the likelihood of a key becoming unavailable during any one of the above three operations. Initially, all keys are available. Once a key becomes unavailable, it remains unavailable permanently (i.e., until the service is restarted). A key's availability is independent of the availability of other keys in the key-value service. When a key is unavailable, the value returned by an operation on it (curr-value or ret-val) is always set to "unavailable".
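
For example, one way (a sketch only, not a required design) to keep electing leaders despite permanently unavailable keys is to have every node race for a well-known leader key with testset and deterministically fall through to the next key in a fixed sequence ("leader-0", "leader-1", ...) whenever the current key reports "unavailable". All names below are illustrative, and the testset parameter stands in for the RPC call to the service.

    package main

    import "fmt"

    // tryElect attempts to claim leadership under key "leader-<i>",
    // advancing i whenever the current key is unavailable. The testset
    // parameter stands in for the RPC call to the key-value service.
    func tryElect(myID string, testset func(key, testVal, newVal string) string) (leaderID, leaderKey string) {
        for i := 0; ; i++ {
            key := fmt.Sprintf("leader-%d", i)
            cur := testset(key, "", myID) // claim the key if it is still unset ("")
            if cur == "unavailable" {
                continue // this key is dead for every node; try the next one
            }
            return cur, key // cur == myID means this node is now the leader
        }
    }

    func main() {
        // Fake in-memory testset for demonstration; a real node issues the RPC.
        store := map[string]string{}
        fakeTestset := func(key, testVal, newVal string) string {
            if store[key] == testVal {
                store[key] = newVal
                return newVal
            }
            return store[key]
        }
        leader, key := tryElect("node-1", fakeTestset)
        fmt.Printf("leader is %s (recorded under %s)\n", leader, key)
    }

Because every node walks the same key sequence, all nodes converge on the same live leader key even after some keys fail.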

Download the key-value service implementation. You will be notified of any changes to this implementation via Piazza.

Implementation requirements:

  • All nodes run the same code and communicate only indirectly, through the key-value service.
  • Given a sufficiently long time during which failures do not occur, an active node is eventually elected as a leader.
  • Given a sufficiently long time during which failures do not occur, the elected leader will eventually advertise an accurate list of all active nodes in the system. In addition, each active node will retrieve the latest version of this list from the key-value service.
  • Your implementation must be robust to node halting failures, including leader halting failures.
  • Your implementation must be robust to nodes that restart (i.e., halt and later re-join the system with the same identity).
  • Your implementation must be robust to varying RPC times.
  • You cannot change the implementation of the key-value service.
  • You must use the GoVector library to track the partial ordering between important events (e.g., message sends/receives and synchronization events) in your distributed system and to log these events to a log file (see the sketch below).
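
For reference, here is a rough sketch of stamping an RPC with GoVector. The import path and the method names/signatures used (govec.Initialize, PrepareSend, UnpackReceive, LogLocalEvent) are assumptions based on the 2015-era GoVector API and should be checked against the version of the library you download.

    package main

    // Sketch of stamping RPCs with GoVector. The import path and the method
    // names/signatures below are assumptions; check them against the version
    // of GoVector distributed with the course.

    import "github.com/arcaneiceman/GoVector/govec"

    func main() {
        // One GoLog per process; in a real node the process id and log file
        // name come from the command-line arguments (see "Solution spec" below).
        logger := govec.Initialize("node-1", "node-1-log")

        // Before an RPC to the key-value service, stamp the outgoing call.
        // As with vstamp, the buffer being wrapped may be nil.
        vstamp := logger.PrepareSend("put(leader-0, node-1)", nil)

        // ... issue the RPC here, sending vstamp and receiving vstamp' back ...
        vstampPrime := vstamp // placeholder for the vstamp' returned by the service

        // When the reply arrives, merge the returned timestamp into our clock.
        logger.UnpackReceive("put reply for key leader-0", vstampPrime)

        // Local milestones can also be logged.
        logger.LogLocalEvent("elected self as leader")
    }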

Assumptions you can make:

  • The key-value service does not fail/restart and does not misbehave
  • No network failures
  • Each node has a unique identifier (specified on the command line)
  • The key-value service is dedicated to nodes in your system

Assumptions you cannot make:

  • Processes have synchronized clocks (e.g., running on the same physical host).

Advice:

  • Be methodical: first ignore key unavailability and implement leader election without node failures. Later, extend your code to handle failing leaders, then implement active node tracking/advertising (one possible heartbeat-based scheme is sketched after this list), and so on.
  • The GoVector library can help you debug your implementation. You may want to use the ShiViz tool to visualize the GoVector-generated logs: concatenate the individual logs into one file and use the following log-parsing regular expression: (?<host>.*)\s(?<clock>{.*})\s(?<event>.*)
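
As a concrete illustration of the tracking/advertising step mentioned above, below is one possible heartbeat-based scheme (a sketch under assumed key names, not a required design): each node periodically puts an increasing counter under its own "hb-<id>" key, and the leader decides liveness by whether that counter advanced between two reads, since wall clocks cannot be compared. The get/put parameters stand in for the RPCs to the key-value service.

    package main

    import (
        "fmt"
        "strings"
        "time"
    )

    // aliveNodes returns the subset of candidateIDs whose heartbeat counter
    // advanced during the observation window.
    func aliveNodes(candidateIDs []string, window time.Duration, get func(key string) string) []string {
        before := make(map[string]string)
        for _, id := range candidateIDs {
            before[id] = get("hb-" + id)
        }
        time.Sleep(window) // should exceed the nodes' heartbeat period

        var alive []string
        for _, id := range candidateIDs {
            after := get("hb-" + id)
            if after != "unavailable" && after != before[id] {
                alive = append(alive, id)
            }
        }
        return alive
    }

    // advertise publishes the active set under a well-known key so that every
    // active node can retrieve it with get("active-nodes").
    func advertise(alive []string, put func(key, value string)) {
        put("active-nodes", strings.Join(alive, ","))
    }

    func main() {
        // Fake in-memory store for demonstration; a real node issues the RPCs.
        store := make(map[string]string)
        hb := 0
        get := func(k string) string {
            if k == "hb-node-2" { // simulate node-2's counter advancing
                hb++
                return fmt.Sprintf("%d", hb)
            }
            return store[k] // node-1 never heartbeats, so it reads as "" twice
        }
        put := func(k, v string) { store[k] = v }

        alive := aliveNodes([]string{"node-1", "node-2"}, 100*time.Millisecond, get)
        advertise(alive, put)
        fmt.Println("advertised active nodes:", store["active-nodes"])
    }

If a heartbeat key itself becomes unavailable, the affected node would need to fall back to another key, much like the election sketch earlier.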

Solution spec: Write a Go program that implements a node in the system, as described above, and that takes the following arguments:

  • ip:port : address of the key-value service
  • id : the unique identifier of the node
  • logfile : filename to which GoVector log messages will be written
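
A minimal sketch of reading these three arguments positionally with os.Args follows (whether you use positional arguments or a flag package is up to you).

    package main

    import (
        "fmt"
        "os"
    )

    func main() {
        if len(os.Args) != 4 {
            fmt.Fprintln(os.Stderr, "usage: node [ip:port] [id] [logfile]")
            os.Exit(1)
        }
        kvAddr, nodeID, logFile := os.Args[1], os.Args[2], os.Args[3]
        fmt.Println("key-value service:", kvAddr, "node id:", nodeID, "log:", logFile)
        // ... initialize GoVector with nodeID/logFile, dial kvAddr, start the election ...
    }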

What to hand in:
0. A writeup (at most one page long) that describes your design and how it satisfies the above requirements
1. Your implementation of a node in the system
2. Five GoVector-based logs, each one a concatenation of the individual log files generated by a set of your nodes and the key-value service. Each log must correspond to one of the following scenarios:

  • log-elect-advertise.txt : Two nodes start, one is elected a leader, the leader advertises a list of the two nodes. No key unavailability.
  • log-node-join.txt : One node starts, it is elected the leader and it advertises a list of the one node (itself). A new node joins and the leader advertises a list of two nodes. A third node joins and the leader advertises a list of three nodes. No key unavailability.
  • log-leader-fail.txt : Three nodes start, one is elected a leader, the leader advertises a list of the three nodes. The elected leader fails, one of the remaining nodes is elected leader, and the new leader advertises a list of the two remaining active nodes. No key unavailability.
  • log-node-fail.txt : Three nodes start, one is elected a leader, the leader advertises a list of the three nodes. A non-leader node fails. The leader advertises a list of remaining two nodes. No key unavailability.
  • log-key-fail.txt : Two nodes start, one is elected a leader, the leader advertises a list of the two nodes. Must demonstrate how your implementation copes with key unavailability.
I recommend that you use ShiViz to inspect each of your five logs to check that they indeed capture correct behavior of your system.

Also, note that your system must handle other possible scenarios (not just the five described above).

Meta: Make sure to follow the course collaboration policy and refer to the assignment instructions that detail how to submit your solution.