In the previous assignment, A5, you developed a key-value store
that had a single storage node. Most of the complexity in A5 lay
in (1) understanding the get/put semantics, and (2) coping with
storage node failures. In this assignment you will build on your
KVS from A5 to develop a more distributed, and more fault-tolerant,
version of your KVS. In particular, we will retain
most of the tracing semantics and introduce multiple storage
nodes. As long as some storage node is available, your system
will continue to operate and service client get/put requests.
Changes since @752, initial assignment release
Thu 15 Apr 2021 21:16:16 PDT
- @892 : Minor fix to KvslibGetResult to refer to KvslibGetResult's corresponding FrontEndGetResult action.
- @849 : Updates to KvslibPutResult monotonic write condition bullet points #3 and #4. Made these more precise in the context of competing put operations and made the storage node join ordering explicit.
- @814 : Fixed: StorageGetResult must happen-after at least one instance of StorageGet.
- @809 : Added a missing "if Err is false" to the KvslibGetResult spec: if Err is false, then at least one node s in S must have recorded StorageGet and StorageGetResult actions.
- @797 : In case of total failure, the first restarted node will need to finish joining before any other node starts.
- @785 : KvslibGetResult spec is now well defined for cases where there are other put operations on k.
- @769 : Added: FrontEndStorageFailed{storageId} must precede a FrontEndStorageJoined(S) action where storageId ∈ S.
- @763 : Slight rewordings to make it clear that the latest relevant storage action for monotonicity conditions is relative to FrontEnd*Result.
Overview
In this assignment you will build on your KVS in A5. You will add
support for multiple storage nodes. Each storage node will have its
own local disk that it can use for durable storage. The difficulty
with supporting multiple storage nodes is that the frontend will now
have to do more coordination, and the coordination will get more
complex. Your KVS will support the same put/get kvslib API as in A5.
Your system will need to continue to survive storage node
(process) halting failures. Unlike A5, your system will have access
to multiple storage nodes. And, as in A5, each storage node will
have access to a local disk that it will use to persist all
key-value data. As in A5, if a storage node fails and restarts, it
can re-read (recover) the key-value data into memory from persistent
disk storage. However, the node can and should also recover the
latest key-value state from the other storage nodes in the
system. This is because a failed node's state may be out of date. As
in A5, storage nodes must continue to serve read-only requests
(gets) from memory. Data-mutating operations (puts) must eventually
update the disk state.
We will retain the topology of A5: clients must communicate with the
frontend node and the frontend node must communicate with the
storage nodes using a reliable messaging protocol. We will use
recorded distributed traces to evaluate the correctness of your
implementation.
kvslib API
Identical to A5.
KVS system consistency semantics
As in A5, your KVS must provide precise consistency semantics for
gets and puts. However, we will have to define what these semantics
mean when there are multiple storage nodes. For this, we will update
the formal tracing requirements listed below. At the high level,
however, the A6 semantics resemble those of A5: per-key and
per-client constraints on gets and puts. But there will be some
differences due to multiple joining/failing storage nodes. As with
A5, you can achieve these semantics with several different designs,
and there will be a similar performance bonus.
KVS distributed operation
The diagram above captures the basic operation and interactions
between the kvslib, frontend, and two storage nodes (edge colors
denote distributed traces): a put, followed by a get. A few
points to note:
- As in A5, we do not provide any RPC description for you to
  follow. So, it's best to think of the diagram above, and all
  diagrams below, as conceptual data-flow diagrams. You get to
  design your RPCs on your own.
- This diagram includes the maximal communication dependency between
  the frontend and storage. There are designs that require less
  communication. For example, it may be safe for a frontend to
  reply to a kvslib get request once a reply from one storage node
  has been received (to optimize latency).
- The storage node records the put(k,v) state update to disk
  before replying to the frontend. This is necessary because your
  system must survive storage node failures: i.e., all storage
  nodes might fail. More on failures below.
- The frontend does not, and should not, implement any
  key-value state caching. It should forward requests to the
  storage node(s). Likewise, the kvslib does not, and should not,
  implement any key-value state caching. These constraints
  remain from A5.
- When servicing a get request the storage node does not read
  the value for key k from disk. Your storage implementation
  must maintain a consistent in-memory copy of the KVS, which it
  uses to service all get requests. This constraint remains
  from A5. A sketch of this put/get handling appears right after
  this list.
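Specifically, the following is a minimal sketch of storage-side put/get handling that satisfies the last two constraints. It writes the full state to a single JSON file before acknowledging a put; the Storage type, its field names, and the one-file layout are illustrative assumptions, not part of the required API:

    package main

    import (
        "encoding/json"
        "fmt"
        "io/ioutil" // os.WriteFile is Go 1.16+; A6 targets Go 1.15
        "sync"
    )

    // Sketch only: names and layout are assumptions, not required API.
    type Storage struct {
        mu       sync.Mutex
        kv       map[string]string // in-memory copy; services all gets
        diskPath string            // file used for durable state
    }

    // Put makes the update durable on disk before it becomes visible
    // in memory (and before the node would reply to the frontend).
    func (s *Storage) Put(key, value string) error {
        s.mu.Lock()
        defer s.mu.Unlock()
        next := make(map[string]string, len(s.kv)+1)
        for k, v := range s.kv {
            next[k] = v
        }
        next[key] = value
        b, err := json.Marshal(next)
        if err != nil {
            return err
        }
        if err := ioutil.WriteFile(s.diskPath, b, 0644); err != nil {
            return err
        }
        s.kv = next // only now does the put take effect in memory
        return nil
    }

    // Get is served entirely from memory; it never reads the disk.
    func (s *Storage) Get(key string) (string, bool) {
        s.mu.Lock()
        defer s.mu.Unlock()
        v, ok := s.kv[key]
        return v, ok
    }

    func main() {
        s := &Storage{kv: make(map[string]string), diskPath: "/tmp/kvs-state.json"}
        if err := s.Put("x", "1"); err != nil {
            fmt.Println("put failed:", err)
            return
        }
        v, ok := s.Get("x")
        fmt.Println(v, ok) // prints: 1 true
    }

Note that Put succeeds only after the disk write succeeds, while Get never touches the disk, matching the two constraints above.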
Storage node failure and recovery
This assignment features storage failures. There are no kvslib or
frontend failures. Only the storage node(s) will fail (you may
assume that a failure here means killing a process). The storage
node(s) can fail at any time. Furthermore, multiple storage nodes may
fail simultaneously. Storage nodes may later restart. On restart, a
storage node should always be able to recover any stored state
from disk and attempt to recover anything it might have missed from
other, running, storage nodes.
As in A5, you will use RPCs to detect storage failures from the
frontend. If an RPC from the frontend node to a storage node
fails, then the frontend must retry exactly once before considering
that storage node to have failed. The frontend should retry after
waiting for storageTimeout seconds,
where storageTimeout is a user-defined
parameter. However, unlike A5, the frontend should not let the kvslib
instance know about the failure unless all the storage nodes have
failed (e.g., see the put in the diagram below). The kvslib should
not perform any retries.
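A minimal sketch of this retry-once policy is below. The FrontEnd type, the markFailed helper, and the call closure are illustrative assumptions; your actual RPC plumbing will differ:

    package main

    import (
        "errors"
        "fmt"
        "time"
    )

    // Sketch only: FrontEnd and markFailed are illustrative names.
    type FrontEnd struct {
        storageTimeout uint8 // user-defined retry delay, in seconds
    }

    func (f *FrontEnd) markFailed(storageId string) {
        // A real frontend would record FrontEndStorageFailed{storageId}
        // here and remove the node from its joined set (see the tracing
        // semantics below).
        fmt.Println("storage node considered failed:", storageId)
    }

    // callWithRetry makes one RPC attempt via the call closure; on
    // error it waits storageTimeout seconds and retries exactly once.
    func (f *FrontEnd) callWithRetry(storageId string, call func() error) error {
        if err := call(); err == nil {
            return nil
        }
        time.Sleep(time.Duration(f.storageTimeout) * time.Second)
        if err := call(); err == nil {
            return nil
        }
        f.markFailed(storageId)
        return errors.New("storage node failed")
    }

    func main() {
        f := &FrontEnd{storageTimeout: 1}
        attempts := 0
        _ = f.callWithRetry("s1", func() error {
            attempts++
            return errors.New("connection refused") // simulated RPC failure
        })
        fmt.Println("attempts:", attempts) // prints 2: original + one retry
    }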
Your KVS state must be durable. This means that the storage node
must keep track of put requests and save them (or the updated state)
to disk before the client receives a response. When a storage node
is initialized, it should check if there is state on disk (if not,
it should initialize empty state) in the directory specified by the
configuration, and reload the state of the KVS from before the
failure. A sketch of this load-or-initialize logic appears at the
end of this section. A diagram in A5 illustrates the required
interaction between storage and disk (without retries and without
the joining process detailed below). The diagram below reviews, at
a high level, what storage node failures look like:
Storage failures are similar to what you designed for in A5:
- A failure may last an arbitrary amount of time. A restart of a
  storage node is not guaranteed.
- A storage node may fail during disk initialization and while it
  is writing to the disk. The storage node should always be able
  to start successfully and it should not lose data.

But, there are some differences:
- Multiple storage nodes may fail and later recover. Your system
  should provide the stated guarantees regardless of the storage
  node failure/recovery pattern.
- In general, multiple non-failed and multiple failed storage
  nodes will exist in your system at the same time.
- The maximum number of storage nodes that might be active (i.e.,
  joined) at the same time is 256.
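To close out this section, here is the minimal load-or-initialize sketch referenced above. It assumes a single state.json file under diskPath (the same one-file layout as the earlier put sketch):

    package main

    import (
        "encoding/json"
        "fmt"
        "io/ioutil"
        "os"
        "path/filepath"
    )

    // loadState reloads KVS state from diskPath, or initializes empty
    // state on a first run when no file exists yet.
    func loadState(diskPath string) (map[string]string, error) {
        file := filepath.Join(diskPath, "state.json")
        b, err := ioutil.ReadFile(file)
        if os.IsNotExist(err) {
            return make(map[string]string), nil // first run: empty state
        }
        if err != nil {
            return nil, err
        }
        kv := make(map[string]string)
        if err := json.Unmarshal(b, &kv); err != nil {
            return nil, err
        }
        return kv, nil
    }

    func main() {
        kv, err := loadState("/tmp/kvs")
        fmt.Println(len(kv), err)
    }

Because a node may fail while writing to the disk (see above), a real implementation should write to a temporary file and atomically rename it over the state file, so that a crash mid-write never leaves a half-written state file behind.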
Storage node joins
Because there are multiple storage nodes, the storage node join
process is more complex. In A5, a storage node had to (1) read any
locally stored state, and (2) connect to the frontend node, and was
then considered to have joined the system. In A6, a storage node
must (1) read local state, (2) connect to the frontend, and
(3) catch up to the other storage nodes in the system. Only once the
new, joining, storage node has caught up can it be considered to
have joined the system. As part of catching up, the storage node may
service some client operations. But, failure/recovery guarantees
will depend on whether a node has managed to join the system
(complete the above three steps) or not. To determine when a storage
node has joined, we will, of course, introduce new tracing
actions. The figure below illustrates this node join process:
A few points to note about this diagram:
- The dotted orange line is how S2 catches up to the latest
  storage state that S1 knows about. The diagram proxies this
  communication through the frontend, but you can also have
  storage nodes communicate directly. However, the storage node API
  remains the same, so a storage node initially knows only the
  frontend's address.
- While a storage node is joining, kvslib requests
  continue. Your frontend must be able to respond to these
  requests while storage node joining is taking place (while at
  least one storage node is in the joined state). The diagram
  illustrates a design in which requests are also sent to the joining
  node. This is probably the simplest way to catch up a node on
  requests that occur while the node is joining.
- Failures may occur while S2 is joining. For example, S2 might
  fail while it is joining.
The diagram above on the left illustrates the key events for a
storage node that joins, fails and unjoins, and then rejoins. The
diagram above on the right illustrates a system execution with four
storage nodes that join, fail and unjoin, and rejoin at different
times. The diagram includes a description of the events associated
with each transition across the horizontal dashed line. The diagram
also lists the joined and unjoined node sets. These are defined and
used in the tracing semantics below. Note that although both
diagrams show storage nodes that join, fail and unjoin, and later
re-join, a storage node's sequence could be shorter or longer than
this triplet.
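One possible shape for the catch-up step is sketched below: the joining node fetches the latest joined state and merges it over whatever it loaded from disk. The fetchStateViaFrontEnd helper and the merge rule (remote state wins) are assumptions for illustration; you could equally transfer state directly between storage nodes, and you must still fold in any client puts that arrive while the node is joining (e.g., by forwarding them to the joining node, as in the diagram):

    package main

    import "fmt"

    // fetchStateViaFrontEnd stands in for an RPC you design; it could
    // equally be a direct storage-to-storage transfer.
    func fetchStateViaFrontEnd() (map[string]string, error) {
        return map[string]string{"k": "v2"}, nil
    }

    // catchUp merges the remote joined state over the (possibly stale)
    // state loaded from local disk.
    func catchUp(local map[string]string) (map[string]string, error) {
        remote, err := fetchStateViaFrontEnd()
        if err != nil {
            return nil, err
        }
        for k, v := range remote {
            local[k] = v // remote joined state overrides stale disk state
        }
        return local, nil
    }

    func main() {
        state, _ := catchUp(map[string]string{"k": "v1", "stale": "x"})
        fmt.Println(state) // map[k:v2 stale:x]
    }

In a full implementation, the node would record StorageJoining before this catch-up begins and StorageJoined (with the merged state) once it completes; these actions are detailed below.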
Non-kvslib API specifications
The Start API for the frontend is identical to A5; we reproduce it
below for completeness. The Start API for the storage node is
slightly different; we highlight the differences in bold.
- Frontend
err ← Start(
clientAPIListenAddr string,
storageAPIListenAddr string,
storageTimeout uint8,
ftrace *tracing.Tracer
)
- Identical semantics to A5.
- Storage
err ← Start(
storageId string,
frontEndAddr string,
storageAddr string,
diskPath string,
strace *tracing.Tracer
)
- When successful, this call should not return. The storage
  server should run until the program is terminated. This
  call should return an error if there are networking issues
  (e.g., an incorrect ip:port is given), storage issues (e.g.,
  permission errors), or other unrecoverable issues.
- storageId is a string that uniquely
  identifies a storage node instance. This string persists
  (will be reused) across storage node restarts.
- frontEndAddr is the IP:port of the frontend node that this
  storage node should connect to.
- storageAddr is the local IP:port that this storage
  node should use to connect to the frontend node.
- diskPath is a file system path where the
  storage node can create/read/write files. You can assume
  that this directory is empty when the storage node is run
  for the first time.
- strace is a storage tracer instance; use this to
  (1) create a new distributed trace instance
  using Tracer.CreateTrace(), and (2) record local storage
  actions in the created trace that are not associated with
  any particular client request (see below).
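Since you design your own RPCs (see above), the connect step in particular is up to you. As one hypothetical example, a storage node might register itself with the frontend over net/rpc; the RegisterArgs struct and the "FrontEnd.RegisterStorage" method name are inventions for illustration only:

    package main

    import (
        "fmt"
        "net/rpc"
    )

    // RegisterArgs and the "FrontEnd.RegisterStorage" RPC name are
    // hypothetical; you design your own RPCs.
    type RegisterArgs struct {
        StorageId   string // persists across restarts of this node
        StorageAddr string // where the frontend can call this node back
    }

    func connectToFrontEnd(frontEndAddr string, args RegisterArgs) error {
        client, err := rpc.Dial("tcp", frontEndAddr)
        if err != nil {
            return fmt.Errorf("dial frontend: %w", err)
        }
        defer client.Close()
        var ok bool
        return client.Call("FrontEnd.RegisterStorage", args, &ok)
    }

    func main() {
        err := connectToFrontEnd("127.0.0.1:8090", RegisterArgs{
            StorageId:   "s1",
            StorageAddr: "127.0.0.1:9000",
        })
        fmt.Println(err)
    }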
Distributed tracing semantics
The tracing diagram above illustrates the tracing actions. Most of
them are identical to A5. The bolded ones are the new/changed
actions. For example, FrontEndStorageStarted and
FrontEndStorageFailed must report
the storageId of the storage node that started or
failed, respectively. There are a few such storageId
additions to various tracing actions. And, there are two new storage
trace actions, StorageJoining and StorageJoined, and a new frontend
trace action, FrontEndStorageJoined[s1..sn]. These new actions
play the following roles:
- StorageJoining : indicates that the storage node
  is actively attempting to join the system, but has not yet
  joined.
- StorageJoined : indicates that the storage node
  has joined the system and has the latest
  state. A StorageJoined record includes the complete
  KVS state that the storage node knows about (including any state
  it loaded from its local disk, if any).
- FrontEndStorageJoined[s1..sn] : records all the
  currently joined storage nodes s1..sn that the frontend knows
  about.
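A minimal sketch of how a storage node might record the two new storage actions around its catch-up step is shown below. It assumes the course tracing library's CreateTrace()/RecordAction() API mentioned in the Start spec above; the action struct definitions here are illustrative, and you should use the exact ones from the starter files:

    package storage

    import "github.com/DistributedClocks/tracing"

    // Illustrative action structs; use the exact definitions from the
    // starter files, which may differ in detail.
    type StorageJoining struct{ StorageId string }

    type StorageJoined struct {
        StorageId string
        State     map[string]string
    }

    // joinSystem brackets the catch-up step with the two new actions:
    // StorageJoining before state is received, StorageJoined once the
    // node has the latest state. catchUp is a hypothetical helper.
    func joinSystem(strace *tracing.Tracer, storageId string,
        catchUp func() (map[string]string, error)) error {
        trace := strace.CreateTrace()
        trace.RecordAction(StorageJoining{StorageId: storageId})
        state, err := catchUp()
        if err != nil {
            return err
        }
        trace.RecordAction(StorageJoined{StorageId: storageId, State: state})
        return nil
    }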
Below we list all of the tracing requirements for A6. We highlight
the key changes from A5 in blue. As before, keep in
mind that all ordering constraints are evaluated according to the
happens-before relation (i.e., using vector clocks), not physical
clock timestamps. Below we use the term 'trace' to refer to a
trace that was created by a client during a put or a get
operation. We will refer to the frontend trace (or ftrace) or storage
trace (strace) when discussing actions recorded as part of the
frontend/storage traces, which are distinct from client traces
(though they are part of the same logical timespace).

The above description of the system setup indicates that the set
of storage nodes in your system will fluctuate over time. So, we
need to define a few storage-related terms before we can
properly capture the tracing semantics of the overall system.
- Joining storage node: a storage node whose local trace contains
  a StorageJoining action as the most recent action.
- Joined storage node: a storage node whose storageId is recorded
  as part of the latest FrontEndStorageJoined action in the
  frontend trace. (This also implies that this storage node's
  local trace contains a StorageJoined action as the most recent
  action.)
- Joined storage nodes: the set of storage node ids that are
  recorded as part of the latest FrontEndStorageJoined action in
  the frontend trace. For example, if the latest
  frontend FrontEndStorageJoined action is
  FrontEndStorageJoined([s1,s2,s3]), then the storage nodes
  associated with storageIds s1, s2, and s3 are considered to be
  joined storage nodes.
- Unjoined storage node: a storage node whose storageId is not
  recorded as part of the latest FrontEndStorageJoined action in
  the frontend trace and whose storage trace contains
  a StorageJoined action as the most recent action. Think of such
  a storage node as having failed, with its failure recognized by
  the frontend.
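Given these definitions, the frontend needs some notion of the current joined set that it can snapshot into each FrontEndStorageJoined action. A minimal sketch follows, in which all names and the record callback are assumptions:

    package frontend

    import "sync"

    // Sketch only: one way the frontend might maintain the joined set.
    type JoinedSet struct {
        mu     sync.Mutex
        joined map[string]bool
        record func(ids []string) // records FrontEndStorageJoined{S}
    }

    func (j *JoinedSet) snapshot() []string {
        ids := make([]string, 0, len(j.joined))
        for id := range j.joined {
            ids = append(ids, id)
        }
        return ids
    }

    // markJoined is called once a node has caught up (after its
    // StorageJoined action is recorded).
    func (j *JoinedSet) markJoined(storageId string) {
        j.mu.Lock()
        defer j.mu.Unlock()
        j.joined[storageId] = true
        j.record(j.snapshot())
    }

    // markUnjoined is called after FrontEndStorageFailed is recorded
    // for a previously joined node.
    func (j *JoinedSet) markUnjoined(storageId string) {
        j.mu.Lock()
        defer j.mu.Unlock()
        delete(j.joined, storageId)
        j.record(j.snapshot())
    }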
Actions to be recorded by kvslib
KvslibBegin{clientId} : signifies the
start/initialization of a client's kvslib. For each
unique ClientId corresponding to a running client,
the tracing log as a whole should contain this action
exactly once, as part of the kvslib trace. This action
should happen-before all other actions recorded by the client
with ClientId.
KvslibPut{clientId,opId,key,value} : For a
given ClientId, Key and Value, kvslib should
record this action just before sending a Put operation to the
frontend.
- A trace must contain this action, or a KvslibGet action,
exactly once before either of the corresponding
KvslibPutResult and KvslibComplete.
- This action must happen-before a corresponding FrontEndPut
and FrontEndPutResult, which must refer to the same
key-value pair.
- A trace must contain this action before any StoragePut and
StorageSaveData actions.
KvslibGet{clientId,opId,key} : For a
given ClientId and Key, kvslib should record this
action just before sending a Get operation to the frontend.
- A trace must contain this action, or a KvslibPut action,
exactly once before either of the corresponding
KvslibGetResult and KvslibComplete.
- This action must happen-before a corresponding FrontEndGet
and FrontEndGetResult, which must refer to the same
key.
- A trace must contain this action before any
StorageGet and StorageGetResult actions.
KvslibPutResult{opId,err} : kvslib should
record this action just after receiving the Put results from the
frontend. Here err is a boolean, per storageFail in the
ResultStruct.
- If KvslibPut is present, a trace must contain this action
  exactly once after KvslibBegin and KvslibPut.
- A trace must contain this action after the corresponding
  FrontEndPut and FrontEndPutResult{err'} actions.
  err must equal err'.
- A trace must contain this action after any
  corresponding StorageSaveData actions.
- Let S be the set of joined storage nodes at the
  time that the frontend recorded the FrontEndPutResult action
  corresponding to this KvslibPutResult. If S is the empty set,
  then err must be set to true. If S is a non-empty
  set, then every node s in S must have recorded a single
  corresponding StorageSaveData action in the
  same trace as this KvslibPutResult.
- Put results must satisfy the monotonic write condition.
  Assume t1 and t2 are two put-related traces writing to the
  same key k and generated by the same client. Let t1 contain
  KvslibPutResult(opId1,err:false) and let t2 contain
  KvslibPutResult(opId2,err:false) such that opId1 < opId2.
  Now consider all storage nodes S (that have appeared or will
  appear). Every storage node s in S must satisfy the
  following conditions:
  - If s does not record StorageSaveData in t1,
    but s records some StorageSaveData' in t2, then
    this case is already covered above.
  - If s records StorageSaveData in t1 and s
    also records some StorageSaveData' in t2, then
    StorageSaveData must happen-before StorageSaveData'.
  - If s records StorageSaveData in t1, s does
    not record any StorageSaveData' in t2, and s
    records a join that happens-after the FrontEndPutResult in t2, then
    EITHER that StorageJoined(state) action must reflect the put corresponding to opId2,
    OR there must be another FrontEndPutResult affecting k that happens-before that StorageJoined.
    In addition, the StorageSaveData action must happen-before
    the StorageJoined(state) or FrontEndPutResult action.
  - If s does not record any actions in t1 or t2, and s records a join that happens-after the FrontEndPutResult in t2,
    then EITHER that StorageJoined(state) action must reflect the put corresponding to opId2,
    OR there must be another FrontEndPutResult affecting k that happens-before that StorageJoined.
KvslibGetResult{opId,key,value,err} :
kvslib should record this action just after receiving the Get
results from the frontend. Here value is the value of the
key key. If Err is true (operation failed),
then value must be nil.
value will have the actual value of key key only
when Err is false (operation succeeded).
- If KvslibGet is present, a trace must contain
  this action exactly once after KvslibBegin
  and KvslibGet.
- A trace must contain this action after the corresponding
  FrontEndGet and FrontEndGetResult{err'} actions. err must equal err'.
- A trace must contain this action after any corresponding
  StorageGet and StorageGetResult actions.
- Let S be the set of joined storage nodes at the time that the
  frontend recorded the FrontEndGetResult action corresponding to
  this KvslibGetResult. Then, every s in S must
  satisfy the following conditions, relative to that
  FrontEndGetResult action:
  - If Err is false and Value is not nil, then the
    most recently recorded StorageJoined{state} by s (if it exists),
    or the most recently recorded StorageSaveData{key,_} by s (if
    it exists), must associate key with value. One of these
    StorageJoined and StorageSaveData actions must be
    recorded by s. If both are recorded by s, the latest
    one (the one that happens-after the other) is considered, and
    shall be called the latest relevant store
    action. This latest relevant store
    action must happen-before the KvslibGetResult's
    corresponding FrontEndGetResult action.
  - If Err is false and Value is nil, then s must have
    neither a record of an instance of StorageJoined{state} nor a
    record of an instance of StorageSaveData that happens-before
    the KvslibGetResult's corresponding FrontEndGetResult action
    and associates key with any value.
  - And, if Err is false, at least one node s in S must have recorded
    StorageGet and StorageGetResult actions that happen-before
    the FrontEndGetResult action.
- Get results must satisfy the monotonic read condition.
  Assume t1 and t2 are two get-related traces generated by the same client.
  Let t1 contain KvslibGetResult(opId1, key, value, err:false) and
  let t2 contain KvslibGetResult(opId2, key, value', err:false) such
  that opId1 < opId2 and value and value' are non-nil.
  Both KvslibGetResult actions must be associated
  with a FrontEndGetResult, which, as specified above, will have
  latest relevant store actions that associate
  key with value and key with value', respectively.
  Let S1 be the set of joined storage nodes at the time that the
  frontend recorded the FrontEndGetResult action
  corresponding to KvslibGetResult(opId1...). Let S2 be the set
  of joined storage nodes at the time that the frontend recorded
  the FrontEndGetResult action corresponding to
  KvslibGetResult(opId2...). Given the above
  context, every storage node s (that has appeared or will
  appear) must satisfy the following conditions:
  - (s ∉ S1) ∧ (s ∈ S2) ⇒
    the latest relevant store action on s, relative to
    KvslibGetResult(opId2,...)'s relevant
    FrontEndGetResult, must associate k with
    value', AND must happen-after the latest relevant store
    action (relative to KvslibGetResult(opId1,...)'s
    relevant FrontEndGetResult)
    on any other storage node s' ∈ S1.
  - (s ∈ S1) ∧ (s ∈ S2) ⇒
    either value' equals value, or
    the latest relevant store action at s associated with
    KvslibGetResult(opId1,...)'s relevant
    FrontEndGetResult must
    happen-before the latest relevant store action at s
    associated with KvslibGetResult(opId2,...)'s relevant FrontEndGetResult.
  - (s ∈ S1) ∧ (s ∉ S2)
    ∧ (s is joined at some point after opId2's
    FrontEndGetResult, via some StorageJoined j) ⇒
    the latest relevant store action at s associated with
    KvslibGetResult(opId1,...)'s relevant
    FrontEndGetResult must happen-before
    j. Additionally, j must associate k with value', OR
    there must be some FrontEndPutResult(err = false)
    associating k with some other value'' that
    happens-between KvslibGetResult(opId2, ...)'s relevant
    FrontEndGetResult and j.
  - (s ∉ S1) ∧ (s ∉ S2) ∧ (s is joined
    at some point after opId2's FrontEndGetResult, via
    some StorageJoined(state') j') ⇒
    - (4.1) EITHER j' must include state'[k] = value', OR
      there must be some FrontEndPutResult(err = false)
      associating k with some other value''
      that happens-between KvslibGetResult(opId2,
      ...)'s relevant FrontEndGetResult
      and j'.
    - (4.2) Let j be the earliest
      StorageJoined(state) recorded by s
      between KvslibGetResult(opId1, ...)'s relevant
      FrontEndGetResult
      and KvslibGetResult(opId2, ...)'s
      relevant FrontEndGetResult. If j exists,
      then, by premise, j must happen-before j'.
      Additionally, if j exists, EITHER j must record state[k]
      = value, OR there must be
      some FrontEndPutResult(err = false)
      associating k with some other value that
      happens-between KvslibGetResult(opId1,
      ...)'s relevant
      FrontEndGetResult and j.
KvslibComplete{clientId} : For a
specific clientId, kvslib records this action to mark the
end of a client session, corresponding to the successful
completion of KVS.Close(). For
each ClientId corresponding to a running client, the
tracing log as a whole should contain this action exactly
once, as part of the kvslib trace. This action should
happen-after all other actions reported by a
given clientId.

For each client, OpIds are unique: any two traces from the
same client that contain opIds must have distinct opId values. One
simple scheme for this is sketched below.
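Since the monotonicity conditions above also compare opIds with <, a per-client counter both guarantees uniqueness and provides the needed ordering. A minimal sketch, in which the KVS struct and field name are assumptions:

    package kvslib

    import "sync/atomic"

    // Sketch only: a per-client KVS handle with an atomic opId counter.
    type KVS struct {
        nextOpId uint32
    }

    // newOpId returns 0, 1, 2, ...; each call yields a distinct,
    // monotonically increasing value for this client.
    func (d *KVS) newOpId() uint32 {
        return atomic.AddUint32(&d.nextOpId, 1) - 1
    }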
Actions to be recorded by frontend
FrontEndStorageStarted{storageId} : Denotes that the frontend has
detected the storage node with id storageId as being started or restarted. This
action should only be recorded in the frontend trace.
- For a given storageId this action should alternate with
  FrontEndStorageFailed: no two FrontEndStorageStarted{storageId} actions should
  occur without an intermediate FrontEndStorageFailed{storageId}.
- This action must precede a FrontEndStorageJoined(S) action where
  storageId ∈ S.
- This action should be recorded only if the frontend has
  successfully executed an RPC against the storage node.
FrontEndStorageFailed{storageId} : Denotes that the frontend has
detected the storage node with id storageId as failed. This action should only be
recorded in the frontend trace.
- For a given storageId this action should alternate with
  FrontEndStorageStarted: no two FrontEndStorageFailed{storageId} actions
  should occur without an intermediate FrontEndStorageStarted{storageId}.
- This action should be recorded if the frontend has (1)
  failed to successfully execute an RPC against the storage
  node with id storageId, (2) re-attempted the failed RPC after a
  timeout, and (3) seen the re-attempted RPC also fail.
FrontEndStorageJoined{S} : The frontend should
record this action when the set of joined nodes changes, either
because a new storage node joins or because a joined node fails and
becomes unjoined. The set S must contain the storageIds of all the
currently joined nodes that the frontend knows about. This action should only be
recorded in the frontend trace.
- If a storage node with storageId has started
  and caught up to the latest KV state, then this action must be
  recorded once; it must happen after the joined storage
  node recorded a StorageJoined action in its storage trace, and
  storageId must be in S.
- If a storage node with storageId has failed,
  then this action must be recorded once; it must happen
  immediately after the frontend node recorded a
  FrontEndStorageFailed(storageId) action,
  and storageId must not be in S.
FrontEndPut{key,value} : In a given trace, and for a given key and value pair,
the frontend records this action just after receiving a Put operation from a client.
- A trace must contain this action exactly once
  after KvslibBegin and KvslibPut, should KvslibPut be present.
- A trace must contain this action before FrontEndPutResult.
- A trace must contain this action before any
  StoragePut and StorageSaveData actions, if they are present.
  If present, these subsequent actions must refer to the same key,value pair.
FrontEndPutResult{err} : The frontend must
record this action after receiving Put results from storage
nodes. Here err is a boolean, per storageFail in the
ResultStruct.
- A trace must contain this action exactly once
  after KvslibBegin and KvslibPut, should KvslibPut be present.
- A trace must contain this action after a corresponding
  FrontEndPut.
- A trace must contain this action after
  any StoragePut and StorageSaveData actions.
- Let S be the set of joined storage nodes at the time that the
  frontend records this FrontEndPutResult
  action. If err is set to true, then S must be the
  empty set.
FrontEndGet{key} : In a given trace, and for a given key,
the frontend records this action just after receiving a Get operation from a client.
- A trace must contain this action exactly once
  after KvslibBegin and KvslibGet, should KvslibGet be present.
- A trace must contain this action before FrontEndGetResult.
- A trace must contain this action before
  any instances of StorageGet and StorageGetResult.
  Any of these subsequent actions should refer to the same key.
FrontEndGetResult{key,value,err} : The frontend
should record this action just after receiving the Get results
from storage. Here value is the value of the
key key. If err is true (operation failed),
then value will be nil.
value will have the actual value of key key only when err is false (operation succeeded).
- A trace must contain this action exactly once
  after KvslibBegin and KvslibGet, should KvslibGet be present.
- A trace must contain this action after the corresponding
  FrontEndGet.
- A trace must contain this action after
  StorageGet and StorageGetResult, and, if err is set to false,
  this action should report the same key,value pair as the StorageGetResult.
- Let S be the set of joined storage nodes at the time that the
  frontend records this FrontEndGetResult
  action. If err is set to true, then S must be the
  empty set.
Actions to be recorded by storage
StorageLoadSuccess{StorageId, state} : The storage node records
this action just after successfully loading key-value state
from disk (or initializing this state, if it does not exist). It
should report all key-value pairs that were loaded
as state, which is a Go map. This action should be
recorded only in the storage trace.
- This should be the first action recorded by the storage
  node on start (or restart).

StorageJoining{StorageId} : A storage node records
this action before receiving KVS state to catch up with the
joined nodes in the system. This action should be recorded
only in the storage trace.
- If recorded, this action must occur after a locally
  recorded StorageLoadSuccess action.

StorageJoined{StorageId, state} : The storage node records
this action when it has finished catching up with the joined
nodes in the system. The node should report all key-value
pairs that were loaded from disk and received as part
of the catch-up process as state, which is a Go
map. This action should be recorded only in the storage
trace.
- If recorded, this action must occur after a locally
  recorded StorageJoining action.
StoragePut{StorageId, key, value} : For a given key-value pair key and value, the storage node records this action
just after receiving a Put operation from the frontend.
- A storage node in a trace must record this action between zero and two times,
  after KvslibBegin and KvslibPut. Should KvslibPut be present,
  it may be recorded zero times by a storage node only if the storage node is not running or has failed during the put operation;
  it may be recorded more than once only if the two occurrences are separated by a
  StorageJoined.
- A trace must contain this action after FrontEndPut and before FrontEndPutResult.
- Any instance of this action in a trace must happen-before at least one instance of StorageSaveData in the same trace.

StorageSaveData{StorageId, key, value} : The storage node records this action just after durably adding/modifying the put value on disk.
- A storage node in a trace must record this action between zero and two times,
  after KvslibBegin and KvslibPut. Should KvslibPut be present,
  it may be recorded zero times by a storage node only if the storage node is not running or has failed during the put operation;
  it may be recorded more than once only if the two occurrences are separated by a
  StorageJoined.
- A trace must contain this action after FrontEndPut and before the respective FrontEndPutResult.
- A trace must contain this action after some instance of
  StorageJoined, and this action must happen-after
  at least one instance of StoragePut in the same trace.

StorageGet{StorageId, key} : For a given key, the storage node records this action
just after receiving a Get operation from the frontend.
- A storage node in a trace must record this action between zero and two times,
  after KvslibBegin and KvslibGet.
  It may be recorded more than once only if the two occurrences are separated by a
  StorageJoined.
- A trace must contain this action after FrontEndGet and before FrontEndGetResult.
- A trace must contain this action after some instance of
  StorageJoined, and this action must happen-before at least one instance of StorageGetResult in the same trace.

StorageGetResult{StorageId, key, value} : The storage node records this action just after performing a Get on the in-memory key-value store.
Here value is the value of the key key. If key does not exist in the key-value map, then
value must be nil. Otherwise, value will have the actual value of key key.
- A storage node in a trace must record this action between zero and two times,
  after KvslibBegin and KvslibGet.
  It may be recorded more than once only if the two occurrences are separated by a
  StorageJoined.
- A trace must contain this action after FrontEndGet and before FrontEndGetResult.
- A trace must contain this action after some instance of
  StorageJoined, and this action must happen-after at least one instance of StorageGet in the same trace.
  A sketch of this get path and its two trace actions follows.
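Specifically, here is a minimal sketch of the get path with its two trace actions in the required order. The recorder interface stands in for the tracing library's trace type, and the action structs are illustrative stand-ins for the starter-file definitions:

    package storage

    import "sync"

    // recorder stands in for the tracing library's trace type.
    type recorder interface {
        RecordAction(record interface{})
    }

    // Illustrative action structs; use the starter-file definitions.
    type StorageGet struct {
        StorageId string
        Key       string
    }

    type StorageGetResult struct {
        StorageId string
        Key       string
        Value     *string // nil when the key is absent
    }

    type Storage struct {
        mu sync.Mutex
        id string
        kv map[string]string // in-memory store; gets never read disk
    }

    // handleGet records StorageGet, reads memory, then records
    // StorageGetResult, in the order required above.
    func (s *Storage) handleGet(trace recorder, key string) *string {
        trace.RecordAction(StorageGet{StorageId: s.id, Key: key})
        s.mu.Lock()
        var value *string
        if v, ok := s.kv[key]; ok {
            value = &v
        }
        s.mu.Unlock()
        trace.RecordAction(StorageGetResult{StorageId: s.id, Key: key, Value: value})
        return value
    }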
Assumptions you can and cannot make
- You can assume that the frontend will start first.
- Except for the frontend retry value of storageTimeout
  seconds, you should not make any timing assumptions about how
  long RPCs/communication may take over the network (i.e., assume
  an asynchronous environment).
- If all storage nodes have failed, you can assume that we will
  re-start the last joined storage node first. In this
  case, you may additionally assume that this first restarted
  node must finish joining (and become joined) before any other
  node will start.
- We will never fail a joined storage node if (1) it is the only
  storage node in the system, and (2) another storage node
  is joining and has stale state. (This is because the frontend
  would have no choice but to expose arbitrarily stale state to the
  client, since a storage node would be available but out of date.)
Implementation notes
We provide two starter files for you with updated tracing action
definitions and the storage Start API signature. Your
code should follow the API and code layout constraints from A5.
Starter files:
Testing
We will release some A6 smoke tests for you to test your tracing log
files. As before, we recommend that you test your system across a
variety of deployments. One of the best ways to test your system is
to use the tracing output and assert that actions are occurring in
the expected order.
Implementation requirements
- The client code must be runnable on CS ugrad machines and be
  compatible with Go version 1.15.6.
- Your code must be compatible with the given APIs, and run
  successfully against the provided initial code for client, frontend,
  and storage nodes.
- Your solution can only use standard library Go packages.
- Your solution code must be formatted using gofmt.
Handin instructions
As with previous assignments, you will use your personal repository
for this assignment under the CPSC416-2020W-T2 org. Your repository
will look like cpsc416_a6_USERNAME1_USERNAME2.
Keep the A5-provided file layout at the top level in
your solution. You may include additional files, but we do not expect
this will be necessary. Do not reference any additional libraries in
your solution.
Rough grading scheme
Your code must not change the API, tracing structs, application
configurations, or command-line arguments. Your code must work on
ugrad servers. If any of these are violated, your mark for this
assignment is 0. This is true regardless of how many characters had
to be changed to make your solution compile, and regardless of how
well your solution works with a different API or on a different
machine.
Both partners in a team will receive the same assignment mark. That
is, we will not be doing any accounting of who contributed what to the
team's solution.
The high-level A6 mark breakdown looks like this:
- 20% : Concurrent put/get requests without faults with multiple storage nodes
- 20% : Correct storage nodes join process
- 20% : Correct failure semantics/recovery at storage node
- 20% : Correct failure semantics in kvslib and frontend
- 20% : Recovery in complex fault scenarios
Please make sure that your code is structured according to our
instructions. We will deduct 10% if we have to move, rename, or edit
files in your repository to make your project work with our
autograding scripts.
Performance bonuses. Do you want to earn more points? It's
easy: just make your highly concurrent, fault-tolerant,
multi-node system fast. How fast? You have to be faster than 90% of
the class. That's only 126 other students, or 63 other A6
groups. There are four performance bonuses:
- Get throughput (5% on top of A6 mark): what's the highest
concurrent get request rate that your system can sustain from
multiple concurrent clients with multiple storage nodes? Assume
a 100% get workload.
- Put throughput (5% on top of A6 mark): what's the highest
concurrent put request rate that your system can sustain from
multiple concurrent clients with multiple storage nodes? Assume
a 100% put workload.
- P99 get latency (5% on top of A6 mark): what's your system's 99th
  percentile end-to-end get request response time?
- P99 put latency (5% on top of A6 mark): what's your system's
  99th percentile end-to-end put request response time?
Everyone's solutions will be evaluated for a bonus. To qualify,
your solution must be able to take nil values for tracer/trace
instances in the API. That is, we will only run performance
tests on your system if your system can disable tracing (not crash
with nil values for tracer/trace).
To earn a bonus you must respect the other constraints: (1) no
caching of key-value state in the kvslib or in the frontend, (2) no
skipping of fault tolerance logic, (3) no assumptions about when the
storage nodes fail or join, (4) no assumptions about network
delay/timing, (5) no changes to APIs or configuration files.
We will not release the performance tests we will use. We will also
not release the distribution of key/value sizes nor their pattern.
We will also not release our definition of sustain for
throughput, though you can think of it as the max request rate that
your system achieves before hitting a knee in the performance curve.
Make sure to follow the
course collaboration policy and
refer to the submission
instructions that detail how to submit your solution.
Note that this is a group assignment requiring two people. We strongly
recommend that you work in a team. Please find a partner using post @5
on Piazza.