Assignment 6

416 Distributed Systems: Assignment 6 [Advanced Key Value Store]

Due: April 16, at 11:59pm Vancouver time

Winter 2021

In the previous assignment, A5, you developed a key value store that had a single storage node. Most of the complexity in A5 was in (1) understanding the get/put semantics and (2) coping with storage node failures. In this assignment you will build on your KVS from A5 to develop a more distributed, and more fault tolerant, version of your KVS. In particular, we will retain most of the tracing semantics and introduce multiple storage nodes. As long as some storage node is available, your system will continue to operate and service client get/put requests.

Changes since @752, initial assignment release

Thu 15 Apr 2021 21:16:16 PDT

  • @892 : Minor fix to KvslibGetResult to refer to KvslibGetResult's corresponding FrontEndGetResult action.
  • @849 : Updates to KvslibPutResult monotonic write condition bullet points #3 and #4. Made these more precise in the context of competing put operations and made the storage node join ordering explicit.
  • @814 : Fixed: StorageGetResult must happen-after at least one instance of StorageGet
  • @809 : Added a missing condition to the KvslibGetResult spec: if Err is false, then at least one node s in S must have recorded StorageGet and StorageGetResult actions
  • @797 : In case of total failure the first restarted node will need to finish joining before any other node starts
  • @785 : KvslibGetResult spec now well defined for cases where there are other put operations on k.
  • @769 : Added FrontEndStorageFailed{storageId} must precede a FrontEndStorageJoined(S) action where storageId ∈ S
  • @763 : Slight rewordings to make it clear that the latest relevant storage action for monotonicity conditions is relative to FrontEnd*Result.

Overview

System architecture diagram

In this assignment you will build on your KVS in A5. You will add support for multiple storage nodes. Each storage node will have its own local disk that it can use for durable storage. The difficulty with supporting multiple storage nodes is that the frontend will now have to do more coordination, and the coordination will get more complex. Your KVS will support the same put/get kvslib API as in A5.

Your system will need to continue to survive storage node (process) halting failures. Unlike A5, your system will have access to multiple storage nodes. And, as in A5, each storage node will have access to a local disk that it will use to persist all key-value data. As in A5, if a storage node fails and restarts, it can re-read (recover) the key-value data into memory from persistent disk storage. However, the node can and should also recover the latest key-value state from other storage nodes that are in the system. This is because a failed node's state may be out of date. As in A5, storage nodes must continue to serve read-only requests (gets) from memory. Data mutating operations (puts) must eventually update the disk state.

We will retain the topology of A5: clients must communicate with the frontend node and the frontend node must communicate with the storage nodes using a reliable messaging protocol. We will use recorded distributed traces to evaluate the correctness of your implementation.

kvslib API

KVS lib diagram

Identical to A5.

KVS system consistency semantics

As in A5, your KVS must provide precise consistency semantics for gets and puts. However, we will have to define what these semantics mean when there are multiple storage nodes. For this, we will update the formal tracing requirements listed below. At the high level, however, the A6 semantics resemble those of A5: per key and per client constraints on gets and puts. But there will be some differences due to multiple joining/failing storage nodes. As with A5, you can achieve these semantics with several different designs and there will be a similar performance implementation bonus.

KVS distributed operation

Basic distributed dataflow

The diagram above captures the basic operation and interactions between the kvslib, frontend, and two storage nodes (edge colors denote distributed traces): a put, followed by a get. A few points of note:

  • As in A5, we do not provide any RPC description for you to follow. So, it's best to think of the diagram above, and all diagrams below, as conceptual data-flow diagrams. You get to design your RPCs on your own.
  • This diagram includes maximal communication dependency between the frontend and storage. There are designs that require less communication. For example, it may be safe for a frontend to reply to a kvslib get request once a reply from one storage node has been received (to optimize latency).
  • The storage node records the put(k,v) state update to disk before replying to the frontend. This is necessary because your system must survive storage node failures: i.e., all storage nodes might fail. More on failures below.
  • The frontend does not, and should not, implement any key-value state caching. It should forward requests to the storage node. Likewise, the kvslib does not, and should not, implement any key-value state caching. These constraints remain from A5.
  • When servicing a get request the storage node does not read the value for key k from disk. Your storage implementation must maintain a consistent in-memory copy of the KVS, which it uses to service all get requests. This constraint remains from A5.
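To make the storage-side constraints above concrete, here is a minimal sketch (with illustrative names that are not taken from the starter code) of a store that serves gets purely from memory and makes puts durable before acknowledging them. The persist step is left abstract here; one possible implementation is sketched in the failure-and-recovery section below.

```go
package storage

import "sync"

// kvStore is an illustrative structure: gets are served from the in-memory
// map only, while puts are made durable on disk before they become visible
// in memory and are acknowledged to the frontend.
type kvStore struct {
	mu    sync.RWMutex
	state map[string]string
	// persist durably writes one key/value pair (left abstract in this sketch).
	persist func(key, value string) error
}

// Get never touches the disk: it reads only the in-memory copy.
func (s *kvStore) Get(key string) (value string, ok bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	value, ok = s.state[key]
	return value, ok
}

// Put writes to disk first, and only then updates memory and returns, so the
// frontend (and client) see a response only after the update is durable.
func (s *kvStore) Put(key, value string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if err := s.persist(key, value); err != nil {
		return err
	}
	s.state[key] = value
	return nil
}
```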

Storage node failure and recovery

This assignment features storage failures. There are no kvslib or frontend failures. Only the storage node(s) will fail (you may assume here that a failure means the process is killed). The storage node(s) can fail at any time, and multiple storage nodes may fail simultaneously. Storage nodes may later restart. On restart, a storage node should always be able to recover any stored state from disk and should attempt to recover anything it might have missed from other, running storage nodes.

As in A5 you will use RPCs to detect storage failures from the frontend. If an RPC from the frontend node to a storage node fails, then the frontend must retry exactly once before considering that storage node to have failed. The frontend should retry after waiting for storageTimeout seconds, where storageTimeout is a user-defined parameter. However, unlike A5, the frontend should not let the kvslib instance know about the failure unless all the storage nodes have failed (e.g., see the put in the diagram below). The kvslib should not perform any retries.
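As an illustration of the retry rule, the sketch below assumes the frontend talks to storage over Go's net/rpc; the method name and argument types are placeholders, and a real implementation may also need to re-dial a connection that has broken.

```go
package frontend

import (
	"net/rpc"
	"time"
)

// callStorage sketches the A6 failure-detection rule: try the RPC once, wait
// storageTimeout seconds on failure, retry exactly once, and only report the
// storage node as failed if both attempts fail.
func callStorage(client *rpc.Client, storageTimeout uint8, method string, args, reply interface{}) (failed bool) {
	if err := client.Call(method, args, reply); err == nil {
		return false
	}
	time.Sleep(time.Duration(storageTimeout) * time.Second)
	if err := client.Call(method, args, reply); err == nil {
		return false
	}
	// Both attempts failed: the frontend should now consider this storage
	// node failed (e.g., record FrontEndStorageFailed for it).
	return true
}
```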

Your KVS state must be durable. This means that the storage node must keep track of put requests and save them (or the updated state) to disk before the client receives a response. When a storage node is initialized, it should check whether there is state on disk in the directory specified by the configuration (if not, it should initialize empty state) and reload the KVS state from before the failure. A diagram in A5 illustrates the required interaction between storage and disk (without retries and without the joining process detailed below). The diagram below reviews, at a high level, what storage node failures look like:

Fail not a fail diagram

Storage failures are similar to what you designed for in A5:

  • A failure may last an arbitrary amount of time. A restart of a storage node is not guaranteed.
  • A storage node may fail during disk initialization and while it is writing to the disk. The storage node should always be able to start successfully and it should not lose data.
But, there are some differences:
  • Multiple storage nodes may fail and later recover. Your system should provide the stated guarantees regardless of the storage node failure/recovery pattern.
  • In general, in your system multiple non-failed and multiple failed storage nodes will exist at the same time.
  • The maximum number of storage nodes that might be active (i.e., joined) at the same time is 256.
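To make the durability requirement concrete, below is a minimal sketch of one possible on-disk representation: the whole map is rewritten as a single JSON file under diskPath. The file name and the single-file layout are illustrative assumptions; an append-only log or per-key files are equally valid, and a real design should write atomically (e.g., write a temporary file, then rename) so that a crash in the middle of a write cannot corrupt previously saved state.

```go
package storage

import (
	"encoding/json"
	"io/ioutil"
	"os"
	"path/filepath"
)

// statePath picks an arbitrary file name inside diskPath for this sketch.
func statePath(diskPath string) string {
	return filepath.Join(diskPath, "kvstate.json")
}

// loadState recovers the persisted KV map on start/restart, or returns an
// empty map if the node is starting with an empty diskPath.
func loadState(diskPath string) (map[string]string, error) {
	data, err := ioutil.ReadFile(statePath(diskPath))
	if os.IsNotExist(err) {
		return map[string]string{}, nil
	} else if err != nil {
		return nil, err
	}
	state := map[string]string{}
	if err := json.Unmarshal(data, &state); err != nil {
		return nil, err
	}
	return state, nil
}

// saveState rewrites the persisted map; it must complete before the storage
// node acknowledges the corresponding put back to the frontend.
func saveState(diskPath string, state map[string]string) error {
	data, err := json.Marshal(state)
	if err != nil {
		return err
	}
	return ioutil.WriteFile(statePath(diskPath), data, 0644)
}
```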

Storage node joins

Because there are multiple storage nodes, the storage node join process is more complex. In A5, a storage node had to (1) read any local state and (2) connect to the frontend node, and was then considered to have joined the system. In A6, a storage node must (1) read local state, (2) connect to the frontend, and (3) catch up to the other storage nodes in the system. Only once the new, joining, storage node has caught up can it be considered to have joined the system. As part of catching up, the storage node may service some client operations. But, failure/recovery guarantees will depend on whether a node has managed to join the system (complete the above three steps) or not. To determine when a storage node has joined, we will, of course, introduce new tracing actions. The figure below illustrates this node join process:

Joining process diagram

A couple of points to note about this diagram:

  • The dotted orange line is how S2 catches up to the latest storage state that S1 knows about. The diagram proxies this communication through the frontend, but you can also have storage nodes communicate directly. However, the storage node API remains the same, so a storage node initially knows only the frontend address.
  • While a storage node is joining, kvslib requests continue. Your frontend must be able to respond to these requests while storage node joining is taking place (while at least one storage node is in the joined state). The diagram illustrates a design in which requests are sent to the joining node. This is probably the simplest way to catch up a node on requests that occur while the node is joining.
  • Failures may occur while S2 is joining. For example, S2 might fail while it is joining.
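One possible shape for the catch-up step is sketched below. The RPC method names, the joiner type, and the idea of fetching a state snapshot through the frontend are illustrative assumptions only; you are free to design a different protocol, including direct storage-to-storage transfer, and you must still handle requests that arrive while the join is in progress.

```go
package storage

import (
	"net/rpc"
	"sync"
)

// joiner holds the state a joining node needs in this sketch; all names are
// illustrative and not taken from the starter code.
type joiner struct {
	mu       sync.Mutex
	kv       map[string]string
	frontEnd *rpc.Client
	persist  func(key, value string) error // durable write, as sketched above
}

func (j *joiner) join(storageId string) error {
	// Record StorageJoining, then ask the frontend for the latest state known
	// to the already-joined nodes (hypothetical RPC name).
	var snapshot map[string]string
	if err := j.frontEnd.Call("FrontEnd.GetStateSnapshot", storageId, &snapshot); err != nil {
		return err
	}

	// Merge the snapshot over the locally recovered state and persist it,
	// since the local disk copy may be stale.
	j.mu.Lock()
	for k, v := range snapshot {
		j.kv[k] = v
		if err := j.persist(k, v); err != nil {
			j.mu.Unlock()
			return err
		}
	}
	j.mu.Unlock()

	// Record StorageJoined (with the full state) and tell the frontend, which
	// can then add this node to its joined set (FrontEndStorageJoined).
	var ack bool
	return j.frontEnd.Call("FrontEnd.NotifyJoined", storageId, &ack)
}
```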

Join-unjoin high-level

The diagram above on the left illustrates the key events for a storage node that joins, fails and unjoins, and then rejoins. The diagram above on the right illustrates a system execution with four storage nodes that join, fail and unjoin, and rejoin at different times. The diagram includes a description of the events associated with each transition across the horizontal dashed line. The diagram also lists the joined and unjoined node sets. These are defined and used in the tracing semantics below. Note that although both diagrams show storage nodes that join, fail and unjoin, and later re-join, a storage node's sequence could be shorter or longer than this triplet.

Non-kvslib API specifications

The Start API for the frontend is identical to A5; we reproduce it below for completeness. The Start API for the storage node is slightly different; we highlight the differences in bold.

  • Frontend
    • err ← Start( clientAPIListenAddr string, storageAPIListenAddr string, storageTimeout uint8, ftrace *tracing.Tracer )
      • Identical semantics to A5
  • Storage
    • err ← Start( storageId string, frontEndAddr string, storageAddr string, diskPath string, strace *tracing.Tracer )
      • When successful, this call should not return. The storage server should run until the program is terminated. This call should return an error if there are networking issues (e.g., incorrect ip:port given), storage issues (e.g., permission errors), or other unrecoverable issues.
      • storageId is a string that uniquely identifies a storage node instance. This string persists (will be reused) across storage node restarts.
      • frontEndAddr is the IP:port of the frontend node that this storage node should connect to.
      • storageAddr is the local IP:port that this storage node should use to connect to the frontend node.
      • diskPath is a file system path where the storage node can create/read/write files. You can assume that this directory is empty when the storage node is run for the first time.
      • strace is a storage tracer instance; use this to (1) create a new distributed trace instance using Tracer.CreateTrace(), and (2) record local storage actions in the created trace that are not associated with any particular client request (see below).
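Putting these pieces together, one rough outline of the storage Start flow is sketched below. It reuses the hypothetical loadState/saveState helpers and joiner type from the sketches above, and omits the tracer parameter to keep the outline short; this is one possible shape, not the required structure.

```go
package storage

import "net/rpc"

// Start outline (a sketch, not the required structure).
func Start(storageId, frontEndAddr, storageAddr, diskPath string) error {
	// (1) Recover any locally persisted KV state (or start with empty state).
	state, err := loadState(diskPath)
	if err != nil {
		return err
	}

	// (2) Connect to the frontend; storageAddr would be used as this node's
	// local address for frontend communication.
	frontEnd, err := rpc.Dial("tcp", frontEndAddr)
	if err != nil {
		return err
	}

	// (3) Catch up to the currently joined nodes (StorageJoining ...
	// StorageJoined) before this node is considered joined.
	j := &joiner{
		kv:       state,
		frontEnd: frontEnd,
		persist:  func(key, value string) error { return saveState(diskPath, state) },
	}
	if err := j.join(storageId); err != nil {
		return err
	}

	// When successful, Start does not return: the storage server runs until
	// the process is terminated.
	select {}
}
```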

Distributed tracing semantics

Tracing actions in A5

The tracing diagram above illustrates the tracing actions. Most of them are identical to A5. The bolded ones are the new/changed actions. For example, FrontEndStorageStarted and FrontEndStorageFailed must report the storageId of the storage node that started or failed, respectively. There are a few such storageId additions to various tracing actions. And, there are two new storage trace actions, StorageJoining and StorageJoined, and a new frontend trace action, FrontEndStorageJoined[s1..sn]. These new actions play the following roles:

  • StorageJoining: indicates that the storage node is actively attempting to join the system, but has not yet joined.
  • StorageJoined: indicates that the storage node has joined the system and has the latest state. A StorageJoined record includes the complete KVS state that the storage node knows about (including any state it loaded from its local disk, if any).
  • FrontEndStorageJoined[s1..sn]: records all the currently joined storage nodes s1..sn that the frontend knows about.

Below we list all of the tracing requirements for A6. We highlight the key changes from A5 in blue. As before, keep in mind that all ordering constraints are evaluated according to the happens before relation (i.e., using vector clocks), not physical clock timestamps. Below we use the term 'trace' to refer to a trace that was created by a client during a put or a get operation. We will refer to frontend trace (or ftrace) or storage trace (strace) when discussing actions recorded as part of the frontend/storage traces, which are distinct from client traces (though they are part of the same logical timespace).

The above description of the system setup indicates that the set of storage nodes in your system will fluctuate over time. So, we need to define a few storage-related terms before we can properly capture the tracing semantics of the overall system.

  • Joining storage node: a storage node whose local trace contains a StorageJoining action as the most recent action.
  • Joined storage node: a storage node whose storageId is recorded as part of the latest FrontEndStorageJoined action in the frontend trace. (This also implies that this storage node's local trace contains a StorageJoined action as the most recent action).
  • Joined storage nodes: a set of storage node ids that are recorded as part of the latest FrontEndStorageJoined action in the frontend trace. For example, if the latest frontend FrontEndStorageJoined action is FrontEndStorageJoined([s1,s2,s3]), then storage nodes associated with storageIds s1, s2, and s3 are considered to be joined storage nodes.
  • Unjoined storage node: a storage node whose storageId is not recorded as part of the latest FrontEndStorageJoined action in the frontend trace and whose storage trace contains a StorageJoined action as the most recent action. Think of such a storage node as being failed with its failure recognized by the frontend.

Actions to be recorded by kvslib

  • KvslibBegin{clientId}: signifies the start/initialization of a client's kvslib. For each unique ClientId corresponding to a running client, the tracing log as a whole should contain this action exactly once, as part of the kvslib trace. This action should happen-before all other actions recorded by the client with ClientId.
  • KvslibPut{clientId,opId,key,value}: For a given ClientId, Key and Value, kvslib should record this action just before sending a Put operation to the frontend.
    • A trace must contain this action, or a KvslibGet action, exactly once before either of the corresponding KvslibPutResult and KvslibComplete.
    • This action must happen-before a corresponding FrontEndPut and FrontEndPutResult, which must refer to the same key-value pair.
    • A trace must contain this action before any StoragePut and StorageSaveData actions.
  • KvslibGet{clientId,opId,key}: For a given ClientId and Key, kvslib should record this action just before sending a Get operation to the frontend.
    • A trace must contain this action, or a KvslibPut action, exactly once before either of the corresponding KvslibGetResult and KvslibComplete.
    • This action must happen-before a corresponding FrontEndGet and FrontEndGetResult, which must refer to the same key.
    • A trace must contain this action before any StorageGet and StorageGetResult actions.
  • KvslibPutResult{opId,err}: kvslib should record this action just after receiving the Put results from frontend. Here err is a boolean, per storageFail in the ResultStruct.
    • If KvslibPut is present, a trace must contain this action exactly once after KvslibBegin and KvslibPut.
    • A trace must contain this action after corresponding FrontEndPut and FrontEndPutResult{err'} actions. err must equal err'.
    • A trace must contain this action after any corresponding StorageSaveData actions.
    • Let S be the set of joined storage nodes at the time that frontend recorded the FrontEndPutResult action corresponding to this KvslibPutResult. If S is the empty set, then err must be set to true. If S is a non-empty set, then every node s in S must have recorded a single corresponding StorageSaveData action in the same trace as this KvslibPutResult.
    • Put results must satisfy the monotonic write condition.
      Assume t1 and t2 are two put-related traces writing to the same key k and generated by the same client. Let t1 contain KvslibPutResult(opId1,err:false) and let t2 contain KvslibPutResult(opId2,err:false) such that opId1 < opId2. Now consider all storage nodes S (that have appeared or will appear). Every storage node s in S must satisfy the following conditions:
      1. If s does not record StorageSaveData in t1, but s records some StorageSaveData' in t2, then this case is already covered above.
      2. If s records StorageSaveData in t1 and s also records some StorageSaveData' in t2, then StorageSaveData must happen-before StorageSaveData'.
      3. If s records StorageSaveData in t1, s does not record any StorageSaveData' in t2, and s records a join that happens-after the FrontEndPutResult in t2, then EITHER that StorageJoined(state) action must reflect the put corresponding to opId2, OR there must be another FrontEndPutResult affecting k that happens-before that StorageJoined. In addition, the StorageSaveData action must happen-before the StorageJoined(state) or FrontEndPutResult action.
      4. If s does not record any actions in t1 or t2 and s records a join that happens-after the FrontEndPutResult in t2, then EITHER that StorageJoined(state) action must reflect the put corresponding to opId2, OR there must be another FrontEndPutResult affecting k that happens-before that StorageJoined.
  • KvslibGetResult{opId,key,value,err}: kvslib should record this action just after receiving the Get results from frontend. Here value is the value of the key key. If Err is true (operation failed), the value must be nil. value will have the actual value of key key only when Err is false (operation succeeded).
    • If KvslibGet is present, a trace must contain this action exactly once after KvslibBegin and KvslibGet.
    • A trace must contain this action after corresponding FrontEndGet and FrontEndGetResult{err'} actions. err must equal err'.
    • A trace must contain this action after any corresponding StorageGet and StorageGetResult actions.
    • Let S be the set of joined storage nodes at the time that frontend recorded the FrontEndGetResult action corresponding to this KvslibGetResult.
      Then, every s in S must satisfy the following conditions, relative to that FrontEndGetResult action:
      • If Err is false and Value is not nil, then the most recently recorded StorageJoined{state} by s (if it exists), or the most recently recorded StorageSaveData{key,_} by s (if it exists) must associate key with value. One of these StorageJoined and StorageSaveData actions must be recorded by s. If both are recorded by s, the latest one (that happens-after the other) is considered, and shall be called the latest relevant store action. This latest relevant store action must happen before the KvslibGetResult's corresponding FrontEndGetResult action.
      • If Err is false and Value is nil, then s must neither have a record of an instance of StorageJoined{state} nor a record of an instance of StorageSaveData that happen before the KvslibGetResult's corresponding FrontEndGetResult action and associate key with any value.
      And, if Err is false, at least one node s in S must have recorded StorageGet and StorageGetResult actions that happened before the FrontEndGetResult action.
    • Get results must satisfy the monotonic read condition.
      Assume t1 and t2 are two get-related traces generated by the same client.
      Let t1 contain KvslibGetResult(opId1, key, value, err:false) and let t2 contain KvslibGetResult(opId2, key, value', err:false) such that opId1 < opId2 and value and value' are non-nil.

      Both KvslibGetResult actions must be associated with a FrontEndGetResult, which, as specified above, will have latest relevant store actions that associate key with value and key with value' respectively.

      Let S1 be the set of joined storage nodes at the time that frontend recorded the FrontEndGetResult action corresponding to KvslibGetResult(opId1...). Let S2 be the set of joined storage nodes at the time that frontend recorded the FrontEndGetResult action corresponding to KvslibGetResult(opId2...). Given the above context, every storage node s (that has appeared or will appear) must satisfy the following conditions:
      1. (s ∉ S1) ∧ (s ∈ S2) ⇒ the latest relevant store action on s, relative to KvslibGetResult(opId2,...)'s relevant FrontEndGetResult, must associate k with value', AND must happen-after the latest relevant store action (relative to KvslibGetResult(opId1, ...)'s relevant FrontEndGetResult) on any other storage node s' ∈ S1.
      2. (s ∈ S1) ∧ (s ∈ S2) ⇒ either value' equals value, or the latest relevant store action at s associated with KvslibGetResult(opId1,...)'s relevant FrontEndGetResult must happen-before the latest relevant store action at s associated with KvslibGetResult(opId2...)'s relevant FrontEndGetResult.
      3. (s ∈ S1) ∧ (s ∉ S2) ∧ (s is joined at some point after opId2's FrontEndGetResult, via some StorageJoined j) ⇒ the latest relevant store action at s associated with KvslibGetResult(opId1,...)'s relevant FrontEndGetResult must happen-before j. Additionally, j must associate k with value', OR there must be some FrontEndPutResult(err = false) associating k with some other value'' that happens-between KvslibGetResult(opId2, ...)'s relevant FrontEndGetResult and j.
      4. (s ∉ S1) ∧ (s ∉ S2) ∧ (s is joined at some point after opId2's FrontEndGetResult, via some StorageJoined(state') j') ⇒
        • (4.1) EITHER j' must include state'[k] = value', OR there must be some FrontEndPutResult(err = false) associating k with some other value'' that happens-between KvslibGetResult(opId2, ...)'s relevant FrontEndGetResult and j'.
        • (4.2) Let j be the earliest StorageJoined(state) recorded by s between KvslibGetResult(opId1, ...)'s relevant FrontEndGetResult and KvslibGetResult(opId2, ...)'s relevant FrontEndGetResult. If j exists, then by premise, j must happen before j'.

          Additionally, if j exists, EITHER j must record state[k] = value, OR there must be some FrontEndPutResult(err = false) associating k with some other value that happens-between KvslibGetResult(opId1, ...)'s relevant FrontEndGetResult and j.
  • KvslibComplete{clientId}: For a specific clientId, kvslib records this action to mark the end of a client session, corresponding to the successful completion of KVS.Close(). For each ClientId corresponding to a running client, the tracing log as a whole should contain this action exactly once, as part of the kvslib trace. This action should happen-after all other actions reported by a given clientId.
For each client, OpIds are unique: any two traces from the same client that contain opIds must have distinct opId values.

Actions to be recorded by frontend

  • FrontEndStorageStarted{storageId}: Denotes that the frontend has detected the storage node with id storageId as being started or restarted. This action should only be recorded in the frontend trace.
    • For a given storageId this action should alternate with FrontEndStorageFailed: no two FrontEndStorageStarted{storageId} should occur without an intermediate FrontEndStorageFailed{storageId}.
    • This action must precede a FrontEndStorageJoined(S) action where storageId ∈ S
    • This action should be recorded only if the frontend has successfully executed an RPC against the storage node.
  • FrontEndStorageFailed{storageId}: Denotes that the frontend has detected the storage node with id storageId as failed. This action should only be recorded in the frontend trace.
    • For a given storageId this action should alternate with FrontEndStorageStarted: no two FrontEndStorageFailed{storageId} actions should occur without an intermediate FrontEndStorageStarted{storageId}.
    • This action should be recorded if the frontend has (1) failed to successfully execute an RPC against the storage node with id storageId, and (2) has re-attempted the failed RPC after a timeout, and (3) the re-attempted RPC has also failed.
  • FrontEndStorageJoined{S}: The frontend should record this action when the set of joined nodes changes, either because a new storage node joins or because a joined node fails and becomes unjoined. The set S must contain the storageIds of all the currently joined nodes that the frontend knows about. This action should only be recorded in the frontend trace. (A sketch of one way to maintain this set appears after this list.)
    • If a storage node with storageId has started and caught up to the latest KV state, then this action must be recorded once and it must happen after the joined storage node recorded a StorageJoined action in its storage trace, and storageId must be in S.
    • If a storage node with storageId has failed, then this action must be recorded once and it must happen immediately after the frontend node recorded a FrontEndStorageFailed(storageId) action and storageId must not be in S.
  • FrontEndPut{key,value}: In a given trace and with a key & value pair, frontend records this action just after receiving a Put operation from a client.
    • A trace must contain this action exactly once after KvslibBegin and KvslibPut, should KvslibPut be present.
    • A trace must contain this action before FrontEndPutResult.
    • A trace must contain this action before any StoragePut and StorageSaveData, if they are present. If they are present, these subsequent actions must refer to the same key,value pair.
  • FrontEndPutResult{err}: The frontend must record this action after receiving Put results from storage node. Here err is a boolean, per storageFail in the ResultStruct.
    • A trace must contain this action exactly once after KvslibBegin and KvslibPut, should KvslibPut be present.
    • A trace must contain this action after a corresponding FrontEndPut.
    • A trace must contain this action after any StoragePut and StorageSaveData actions.
    • Let S be the set of joined storage nodes at the time that frontend records this FrontEndPutResult action. If err is set to True, then S must be the empty set.
  • FrontEndGet{key}: In a given trace and with a key key, frontend records this action just after receiving a Get operation from a client.
    • A trace must contain this action exactly once after KvslibBegin and KvslibGet, should KvslibGet be present.
    • A trace must contain this action before FrontEndGetResult.
    • A trace must contain this action before any instances of StorageGet and StorageGetResult. Any of these subsequent actions should refer to the same key.
  • FrontEndGetResult{key,value,err}: frontend should record this action just after receiving the Get results from storage. Here value is the value of the key key. If err is true (operation failed), the value will be nil. value will have the actual value of key key only when err is false (operation succeeded).
    • A trace must contain this action exactly once after KvslibBegin and KvslibGet, should KvslibGet be present.
    • A trace must contain this action after corresponding FrontEndGet.
    • A trace must contain this action after StorageGet and StorageGetResult, and, if err is set to false, this action should report the same key,value pair as the StorageGetResult.
    • Let S be the set of joined storage nodes at the time that frontend records this FrontEndGetResult action. If err is set to True, then S must be the empty set.
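As referenced earlier, the sketch below shows one way the frontend might maintain the joined set reported by FrontEndStorageJoined{S}. The type and the recordJoined callback (which stands in for recording the action in the frontend trace) are illustrative assumptions, not part of the starter code.

```go
package frontend

import "sync"

// joinedSet tracks which storage nodes are currently joined; recordJoined
// stands in for recording FrontEndStorageJoined{S} in the frontend trace.
type joinedSet struct {
	mu           sync.Mutex
	joined       map[string]bool
	recordJoined func(ids []string)
}

// snapshot must be called with mu held.
func (js *joinedSet) snapshot() []string {
	ids := make([]string, 0, len(js.joined))
	for id := range js.joined {
		ids = append(ids, id)
	}
	return ids
}

// markJoined is called once a storage node has caught up (after it recorded
// StorageJoined in its own trace); the new joined set is traced immediately.
func (js *joinedSet) markJoined(storageId string) {
	js.mu.Lock()
	defer js.mu.Unlock()
	js.joined[storageId] = true
	js.recordJoined(js.snapshot())
}

// markFailed is called right after FrontEndStorageFailed{storageId} is
// recorded; if the node was joined, the shrunken set is traced immediately.
func (js *joinedSet) markFailed(storageId string) {
	js.mu.Lock()
	defer js.mu.Unlock()
	if js.joined[storageId] {
		delete(js.joined, storageId)
		js.recordJoined(js.snapshot())
	}
}
```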

Actions to be recorded by storage

  • StorageLoadSuccess{StorageId, state}: The storage records this action just after successfully loading the key-value state from disk (or initializing this state, if it does not exist). It should report all key-value pairs that were loaded as state, which is a Go map. This action should be recorded only in the storage trace.
    • This should be the first action recorded by the storage node on start (or restart).
  • StorageJoining{StorageId}: A storage node records this action before receiving KVS state to catch up with the joined nodes in the system. This action should be recorded only in the storage trace.
    • If recorded, this action must occur after a locally recorded StorageLoadSuccess action.
  • StorageJoined{StorageId, state}: The storage records this action when it has finished catching up with the joined nodes in the system. The node should report all key-value pairs that were loaded from disk and received as part of the catch-up process as state, which is a Go map. This action should be recorded only in the storage trace.
    • If recorded, this action must occur after a locally recorded StorageJoining action.
  • StoragePut{StorageId, key,value}: For a given key-value pair key & value, the storage node records this action just after receiving a Put operation from frontend.
    • A storage node in a trace must record this action between zero and two times, after KvslibBegin and KvslibPut. Should KvslibPut be present, it may only be recorded zero times by a storage node if the storage node is not running or has failed during the put operation; it may only be recorded more than once if the two occurrences are separated by a StorageJoined.
    • A trace must contain this action after FrontEndPut and before FrontEndPutResult.
    • Any instances of this action in a trace must happen-before at least one instance of StorageSaveData in the same trace.
  • StorageSaveData{StorageId, key,value}: The storage records this action just after durably adding/modifying the put value to disk.
    • A storage node in a trace must record this action between zero and two times, after KvslibBegin and KvslibPut. Should KvslibPut be present, it may only be recorded zero times by a storage node if the storage node is not running or has failed during the put operation; it may only be recorded more than once if the two occurrences are separated by a StorageJoined.
    • A trace must contain this action after FrontEndPut and before respective FrontEndPutResult.
    • A trace must contain this action after some instance of StorageJoined, and this action must happen-after at least one instance of StoragePut in the same trace.
  • StorageGet{StorageId, key}: For a given key key, the storage node records this action just after receiving a Get operation from frontend.
    • A storage node in a trace must record this action between zero and two times, after KvslibBegin and KvslibGet. It may only be recorded more than once if the two occurrences are separated by a StorageJoined.
    • A trace must contain this action after FrontEndGet and before FrontEndGetResult.
    • A trace must contain this action after some instance of StorageJoined, and this action must happen-before at least one instance of StorageGetResult in the same trace.
  • StorageGetResult{StorageId, key,value}: The storage records this action just after a Get on the in-memory key-value store. Here value is the value of the key key. If key does not exist in the key-value map, then value must be nil. Otherwise, value will have the actual value of key key.
    • A storage node in a trace must record this action between zero and two times, after KvslibBegin and KvslibGet. It may only be recorded more than once if the two occurrences are separated by a StorageJoined.
    • A trace must contain this action after FrontEndGet and before FrontEndGetResult.
    • A trace must contain this action after some instance of StorageJoined, and this action must happen-after at least one instance of StorageGet in the same trace.

Assumptions you can and cannot make

  • You can assume that the frontend will start first.
  • Except for the frontend retry value of storageTimeout seconds, you should not make any timing assumptions about how long RPCs/communication may take over the network (i.e., assume an asynchronous environment).
  • If all storage nodes have failed, you can assume that we will restart the last joined storage node first. In this case, you may additionally assume that this first restarted node will finish joining (and become joined) before any other node starts.
  • We will never fail a joined storage node if (1) it is the only joined storage node in the system, and (2) another storage node is joining and has stale state. (This is because the frontend would otherwise have no choice but to expose arbitrarily stale state to the client: the only available storage node would be the joining one, and its state could be out of date.)

Implementation notes

We provide two starter files for you with updated tracing action definitions and the storage Start API signature. Your code should follow the API and code layout constraints from A5.

Starter files:

Testing

We will release some A6 smoke tests for you to test your tracing log files. As before, we recommend that you test your system across a variety of deployments. One of the best ways to test your system is to use the tracing output and assert that actions are occurring in the expected order.

Implementation requirements

  • The client code must be runnable on CS ugrad machines and be compatible with Go version 1.15.6.
  • Your code must be compatible with the given APIs, and run successfully against the provided initial code for the client, frontend, and storage nodes.
  • Your solution can only use standard library Go packages.
  • Your solution code must be Gofmt'd using gofmt.

Handin instructions

As with previous assignments, you will use your personal repository for this assignment under the CPSC416-2020W-T2 org. Your repository will look like cpsc416_a6_USERNAME1_USERNAME2.

Keep the A5-provided file layout at the top-level in your solution. You may include additional files, but we do not expect this will be necessary. Do not reference any additional libraries in your solution.

Rough grading scheme

Your code must not change the API, tracing structs, application configurations, or command-line arguments. Your code must work on ugrad servers. If any of these constraints is violated, your mark for this assignment is 0. This is true regardless of how many characters had to be changed to make your solution compile, and regardless of how well your solution works with a different API or on a different machine.

Both partners in a team will receive the same assignment mark. That is, we will not be doing any accounting of who contributed what to the team's solution.

The high-level A6 mark breakdown looks like this:

  • 20% : Concurrent put/get requests without faults with multiple storage nodes
  • 20% : Correct storage nodes join process
  • 20% : Correct failure semantics/recovery at storage node
  • 20% : Correct failure semantics in kvslib and frontend
  • 20% : Recovery in complex fault scenarios
Please make sure that your code is structured according to our instructions. We will deduct 10% if we have to move, rename, or edit files in your repository to make your project work with our autograding scripts.

Performance bonuses. Do you want to earn more points? It's easy: just make your highly concurrent, fault-tolerant, multi-node system fast. How fast? You have to be faster than 90% of the class. That's only 126 other students, or 63 other A6 groups. There are four performance bonuses:

  • Get throughput (5% on top of A6 mark): what's the highest concurrent get request rate that your system can sustain from multiple concurrent clients with multiple storage nodes? Assume a 100% get workload.
  • Put throughput (5% on top of A6 mark): what's the highest concurrent put request rate that your system can sustain from multiple concurrent clients with multiple storage nodes? Assume a 100% put workload.
  • P99 get latency (5% on top of A6 mark): what's your system's 99th percentile end-to-end get request response time?
  • P99 put latency (5% on top of A6 mark): what's your system's 99th percentile end-to-end put request response time?

Everyone's solution will be evaluated for a bonus. To qualify, your solution must be able to take nil values for tracer/trace instances in the API. That is, we will only run performance tests on your system if it can disable tracing (i.e., not crash when given nil values for tracer/trace).
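One simple way to satisfy this is to route every trace-recording call through a nil-checking helper, as in the sketch below. The import path and the RecordAction method name are assumptions based on the tracing library used in earlier assignments; adjust them to the actual starter-code API.

```go
package kvslib

// Import path assumed from earlier assignments' starter code.
import "github.com/DistributedClocks/tracing"

// maybeRecord records an action only when tracing is enabled (trace non-nil),
// so the same call sites work when nil tracer/trace values are passed in.
func maybeRecord(trace *tracing.Trace, action interface{}) {
	if trace != nil {
		trace.RecordAction(action)
	}
}
```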

To earn a bonus you must respect the other constraints: (1) no caching of key-value state in the kvslib or in the frontend, (2) no skipping of fault tolerance logic, (3) no assumptions about when the storage nodes fail or join, (4) no assumptions about network delay/timing, (5) no changes to APIs or configuration files.

We will not release the performance tests we will use. We will also not release the distribution of key/value sizes nor their pattern. We will also not release our definition of sustain for throughput, though you can think of it as the max request rate that your system achieves before hitting a knee in the performance curve.



Make sure to follow the course collaboration policy and refer to the submission instructions that detail how to submit your solution.

Note that this is a group assignment requiring two people. We strongly recommend that you work in a team. Please find a partner using post @5 on Piazza.