416 Distributed Systems: Project 1
Due: November 4 at 11:59pm
In this project you will work in a team of two to build a simple distributed file system that runs on top of a custom blockchain network. Your records file system (RFS) will have a global name space and no access control mechanism. Files in RFS are composed of fixed-length records that can either be read or appended (previously appended records cannot be modified). Your blockchain network will be a randomly connected peer-to-peer network that is loosely based on Bitcoin: clients submit operations on files, these operations are flooded among nodes, and nodes mine blocks to add operations to the blockchain. You'll use proof of work to deter Sybil nodes and spam, and you'll incentivize mining with record coins, which are units of storage in RFS and are necessary for clients to create files and add new records to files.
There are several extra credit options, including adding new APIs (deleting a file), and creating a rogue client to demonstrate an attack on your system.
You will deploy and test your RFS system in VMs across multiple data centers in the Azure cloud.
There are two types of nodes in your system: some number of miners and some number of clients. Together these nodes belong to an instance of the RFS system (a single blockchain and a single global name space of record-based files).
Peer-to-peer graph of miners
RFS miners form a peer-to-peer graph (based on command-line information that tells each miner who their immediate peers are). Miners only communicate with their immediate peers. A miner can only initiate connections to those miners specified as part of the initial command line arguments. To be able to communicate with the other miners in the system, a miner must use a flooding-style protocol, which you will design and implement.
Miners may join the network and they may depart (e.g., because of failure). If a miner knows of at least one peer, then it is connected. If the miner happens to lose connectivity to all of the peers that it knows about, then that miner becomes disconnected. A disconnected miner can only become connected if another miner connects to it (it has no way of learning about other miners while it is disconnected). Note that a miner must always be able to accept incoming connections from any other miner (that knows the right IP:port to connect to this miner).
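The flooding protocol itself is up to you. One common design is gossip-style flooding with duplicate suppression: every message carries a unique ID, and a miner forwards each message to all of its peers exactly once, skipping the peer it arrived from. The sketch below illustrates this idea; all names here are illustrative, not part of the required spec.

```go
package main

import "sync"

// Message is any flooded state (an operation or a block in this project).
type Message struct {
	ID      string // globally unique, e.g. a hash of the payload
	Payload []byte
}

// Flooder forwards each message to all known peers exactly once.
type Flooder struct {
	mu    sync.Mutex
	seen  map[string]bool                 // message IDs already processed
	peers []string                        // immediate peers only
	send  func(peer string, msg Message)  // network send, supplied by the caller
}

// Receive handles a message arriving from fromPeer: duplicates are
// dropped; new messages are re-flooded to every peer except the sender.
// It returns true if the message was new (and should be processed locally).
func (f *Flooder) Receive(fromPeer string, msg Message) bool {
	f.mu.Lock()
	if f.seen[msg.ID] {
		f.mu.Unlock()
		return false // already flooded; stop the wave here
	}
	f.seen[msg.ID] = true
	peers := append([]string(nil), f.peers...)
	f.mu.Unlock()

	for _, p := range peers {
		if p != fromPeer {
			f.send(p, msg)
		}
	}
	return true
}
```

Because each miner records the IDs it has seen, a message traverses each node at most once, so the flood terminates even though the graph may contain cycles.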
The diagram below illustrates a connected graph of connected miners, with each miner connected to at least three others. Miner A has three peers: nodes B, C, D. Node E is a disconnected miner as it has no peers.
An RFS miner has two roles: it mines record coins for clients to consume, and it participates in the network to help maintain the blockchain. Miners have the following responsibilities:
Clients receive service from miners
RFS clients are applications that use the RFS library (rfslib) to connect to a miner and to submit operations on files stored by RFS. RFS clients (1) are unaware of the blockchain, (2) consume the record coins mined by the miner that they connect to, and (3) are the only way for applications to access or interact with RFS.
The diagram below shows the miners network from above with several clients. Note that a miner can have a variable number of clients: some miners may have no clients, and others may have many clients. Also note that each client is only connected to exactly one miner and that clients do not communicate with each other.
The diagram below captures the design and implementation of your system in layer-form:
RFS uses the blockchain as storage and organizes this storage into a set of files that live in a global name space. This name space is accessible (readable and writeable) to any node that is part of the network. Note that there is no access control in RFS; in general, this system design is simple and extremely insecure. There are also no directories in RFS, only files. RFS files can be created, but never destroyed. An RFS file has a name that is at most 64 bytes long, can have at most 65,535 (uint16) records, and is composed of a totally-ordered sequence of fixed-size records. In your implementation you'll use a record size of 512 bytes.
RFS has a limited API that allows a client to list the existing file names, create a new file, append a record to an existing file, find out the number of records in a file, and read a record at a specific index in a file. This API is detailed further below. As in A1, you are not allowed to change this API. We also provide several test client programs that you can use to test your API.
Because RFS is built on top of a blockchain, RFS will provide eventual consistency semantics to clients.
RFS library API
The RFS library exposes an interface that is detailed in the rfslib.go file. You are not allowed to change this interface. All of the calls require connectivity and return DisconnectedError if the library is no longer connected to a miner or if the miner is disconnected (has no peers). All of the calls below are blocking calls: they do not return until they have finished successfully, or an error occurred. In particular, calls that require record coins to succeed, must block until (1) the miner can mine sufficient coins to process the operation, and (2) the operation succeeds (is confirmed by the network). Furthermore, RFS calls only return state that has been confirmed (e.g., ListFiles only returns files that have been confirmed).
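To make the API concrete, the sketch below shows a plausible shape for the library surface described above. The authoritative interface is the one in the provided rfslib.go and must not be changed; the exact names and signatures here are a best-guess illustration, not the spec.

```go
package main

// RecordSize is the fixed record size given in the spec (512 bytes).
const RecordSize = 512

type Record [RecordSize]byte

// DisconnectedError is returned when the library loses its miner, or
// when the miner itself has no peers.
type DisconnectedError string

func (e DisconnectedError) Error() string {
	return "disconnected from miner [" + string(e) + "]"
}

// RFS sketches the client-facing operations. All calls block until they
// succeed (confirmed, where confirmation applies) or an error occurs.
type RFS interface {
	CreateFile(fname string) error                                        // flooded op; requires coins
	ListFiles() (fnames []string, err error)                              // serviced locally; confirmed files only
	TotalRecs(fname string) (numRecs uint16, err error)                   // serviced locally
	ReadRec(fname string, recordNum uint16, record *Record) error         // serviced locally
	AppendRec(fname string, record *Record) (recordNum uint16, err error) // flooded op; requires coins
}

// Initialize connects the library to a miner. localAddr and minerAddr
// are IP:port strings (the two lines of the .rfs file used by the
// example clients). The body here is a stub.
func Initialize(localAddr string, minerAddr string) (rfs RFS, err error) {
	return nil, DisconnectedError(minerAddr) // stub: real code dials the miner
}
```

The key behavioural point is in the comments: CreateFile and AppendRec block on mining coins and on network confirmation, while the other three calls are answered by the connected miner alone.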
Miners maintain a tree representation of the block chain. The 'chain' refers to the longest path in this tree, starting at the genesis block whose hash is specified on the command line. A miner should only compute no-op and op blocks along the chain, and not along any shorter path in the tree. In the case that there are several (longest) chains, the miner should (1) pick the one that does not cause a validation error for the current op block it is generating, or if no op block is being generated or none cause a validation error, then (2) the miner should pick among the chains uniformly at random.
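The two-step chain-selection rule above can be sketched as follows. This is a minimal sketch over illustrative types (a real implementation would select among chains, not just leaf blocks, and `validOnChain` stands in for whatever validation your miner performs on its in-progress op block).

```go
package main

import "math/rand"

// Block is a node in the miner's block tree (illustrative fields only).
type Block struct {
	Hash  string
	Depth int // number of blocks between this block and the genesis block
}

// pickChainTip selects among the deepest leaves of the block tree:
// rule (1): prefer a longest chain that does not cause a validation
// error for the op block currently being generated;
// rule (2): otherwise pick among the longest chains uniformly at random.
func pickChainTip(leaves []*Block, validOnChain func(*Block) bool) *Block {
	maxDepth := 0
	for _, l := range leaves {
		if l.Depth > maxDepth {
			maxDepth = l.Depth
		}
	}
	var longest []*Block
	for _, l := range leaves {
		if l.Depth == maxDepth {
			longest = append(longest, l)
		}
	}
	for _, tip := range longest {
		if validOnChain(tip) {
			return tip // rule (1)
		}
	}
	return longest[rand.Intn(len(longest))] // rule (2)
}
```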
The miners flood two kinds of state: operations, and blocks. Operations can be of two types: create file (generated by a CreateFile API call) and append record (generated by an AppendRec API call). Note that all other API calls must be serviced by the miner that the client is connected to. For example, the call to ReadRec should not generate any network traffic between miners.
Block generation. Each miner must implement a mining procedure by which it can generate record coins and add a new block in the block chain. A miner can only compute one block at a time and cannot work on multiple blocks simultaneously. There are two kinds of blocks: no-op blocks and op blocks.
We are not going to constrain the size of your block, but we recommend choosing a block size that holds no more than 100 operations. Miner settings specify a GenOpBlockTimeout parameter: the minimum time in milliseconds between generating op blocks. This timeout allows the miner to aggregate multiple operations into a single op block, while also bounding how long op block generation is delayed when only a few operations arrive. The following diagram explains how your mining algorithm should behave over time and the role of GenOpBlockTimeout:
Block data structure. An op block is a data structure that contains at least the following data:
You can use sample code from 2017w2 assignment 1 as a starting point for your proof of work code (see the computeNonceSecretHash method).
Note that a block hash computed for a block by a miner A will always be different from blocks generated by other miners (for the same op/no-op) and will always differ from other blocks generated by miner A. This is because a block contains prev-hash, which uniquely identifies its position in the blockchain tree, and a block contains the miner's ID, which makes each block unique to miner A.
A no-op block is identical to an op block except that it does not include any operations. Its hash is similarly computed using a proof-of-work algorithm.
Block validation. This project assumes some amount of trust between miners. For example, miners are trusted to associate their IDs with operations of clients that are connected to them. However, it is important that the miners validate blocks. Here is a minimal set of validations that each miner should perform:
Operation confirmation. As noted in the previous section, RFS API calls that introduce new operations into the blockchain (AppendRec and CreateFile) must be confirmed before returning. Confirmation of an operation means that the block containing the operation must be followed by some fixed number of other blocks in the longest blockchain. This confirmation number of blocks is provided in command line settings to the miner and may differ for append and create file operations. Confirmation is necessary in a blockchain-based system because of concurrency: if two different blocks containing operations are generated at the same time, then only one of these blocks (and its corresponding operations) will end up on the longest chain. Confirmation blocks give the client further assurance that the block with their operation is indeed part of the longest blockchain (the probability of another chain being longer diminishes with more confirmations due to proof of work).
Servicing rfslib API calls. The rfslib must implement the rfslib interface. It does so by sending each operation to the miner and having the miner process the operation, either locally or by coordinating with the broader miner network. As noted above, only two types of operations are flooded in the miner network (and integrated into blocks by all miners): file appends and file creates. All other operations are serviced locally because they do not need to be integrated into the blockchain (e.g., there is no record of a ReadRec in the blockchain). To service an operation locally, the miner maintains a view of global RFS state for the longest chain in the blockchain that it knows about (e.g., the files that exist at this point and their contents). The miner services some client RFS operations based on this view. Note that the miner must continue to respond to client operations (and it must support multiple clients) even as the blockchain changes.
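One way to materialize this local view is to replay the create/append operations along the current longest chain, genesis first, and then answer ListFiles, TotalRecs, and ReadRec from the result. The sketch below uses illustrative types; when the longest chain changes, the view is simply rebuilt (or incrementally adjusted) from the new chain.

```go
package main

// Op is a confirmed operation extracted from a block (illustrative).
type Op struct {
	Kind   string // "create" or "append"
	Fname  string
	Record [512]byte
}

// View is the miner's materialized RFS state for one chain.
type View struct {
	files map[string][][512]byte // file name -> ordered records
}

// BuildView replays the chain's operations in order. chain[i] holds the
// ops of the i-th block on the longest chain, genesis end first.
func BuildView(chain [][]Op) *View {
	v := &View{files: map[string][][512]byte{}}
	for _, block := range chain {
		for _, op := range block {
			switch op.Kind {
			case "create":
				if _, ok := v.files[op.Fname]; !ok {
					v.files[op.Fname] = nil // file exists, zero records
				}
			case "append":
				if _, ok := v.files[op.Fname]; ok {
					v.files[op.Fname] = append(v.files[op.Fname], op.Record)
				}
			}
		}
	}
	return v
}

// TotalRecs and the other read-only calls become simple lookups.
func (v *View) TotalRecs(fname string) uint16 { return uint16(len(v.files[fname])) }
```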
Miners and clients may fail stop. If a client fails while it is blocking on an operation, the miner may abort the operation or it may commit the operation. If the miner fails, the rfslib instances of clients that are connected to this miner should return a DisconnectedError to the application.
You can assume that a miner that has failed does not ever rejoin the same RFS system instance with the same miner ID.
Miners should be able to detect peer failures by using some form of failure detection. You may want to use your A1 fdlib for this (though consider making it less aggressive by introducing a delay between heartbeat messages). You may use the same mechanism to detect miner failure at clients and client failure at miners.
Miners and clients should not store or cache any state on disk. A failure therefore completely wipes out the node's state.
As long as a miner is connected to at least one other miner, it should continue to operate in the miner network.
Example RFS clients/applications
We provide you with 6 example programs that you can use to test your system. Each of these programs assumes a file called .rfs in the current directory that contains two lines. The first line must contain the local IP:port address to use to connect to the miner, the second line must contain the IP:port address of the miner to connect to (these are exactly the two arguments to rfslib.Initialize described above).
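For instance, a .rfs file might look like the following (the addresses are placeholders; the first line is the client's local IP:port, the second is the miner's IP:port):

```
198.162.33.54:2020
198.162.33.23:8080
```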
The Azure cloud is composed of several data-centers, located world-wide. Each data center hosts many thousands of machines that can be used (i.e., rented) for a price. In this project you will use the Azure cloud to deploy and test your solution in the wide area. That is, you will deploy your RFS system across several VMs on Azure.
Although you will test and deploy your system on Azure, it will have nothing Azure-specific about it, e.g., it should be possible to run your system on the CS ugrad servers without any code modifications (though we will not test this).
Using Azure: stop VMs when not using
We have prepared a Google Slides presentation covering the basic workflow of getting a VM running on Azure for this and future assignments. To set up the Go environment in a VM you can use the azureinstall.sh script.
The default Azure subscription comes with a limit of 20 cores per region. You will likely need more than 20 nodes in this project; to work around the limit, consider using several different data centers around the world.
Use this site to check your account balance.
Access information will be posted to piazza.
A key detail is that every second your VM is running, it is draining your balance (yikes!). You should STOP your VMs when you are not using them. It's up to you to police your own account.
Write two Go programs called miner.go and rfslib.go that behave according to the description above. Hand in your code using a private UBC GitHub repository with a name of the form P1-uid1-uid2, where uid1 and uid2 are the CS student ids of your team.
The rfslib.go file is the library whose spec is described above. The miner.go file implements a miner process and must have the following command line usage: go run miner.go [settings]
The settings argument is a JSON file (see the example config.json) with the following fields. There are two types of fields: those that do not differ between miners in an instance of an RFS system, and those that do differ between miners.
Settings that do not differ between miners:
The above set of settings is required, but you may include other settings; please check with us first. You may also decide to start the miner using a different variant of the 'go run' command.
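As a concrete illustration, a settings file might look like the example below. Only GenOpBlockTimeout is named explicitly in this spec; every other field name here is a placeholder guess at the quantities described above (genesis block hash, per-operation confirmation counts, miner ID, peer addresses, and listen addresses). Use the field names from the provided example config.json, not these.

```json
{
	"GenOpBlockTimeout": 500,
	"GenesisBlockHash": "83218ac34c1834c26781fe4bde918ee4",
	"ConfirmsPerFileCreate": 5,
	"ConfirmsPerFileAppend": 6,
	"MinerID": "miner1",
	"PeerMinersAddrs": ["198.162.33.23:8081", "198.162.33.24:8082"],
	"IncomingMinersAddr": "0.0.0.0:8080",
	"IncomingClientsAddr": "0.0.0.0:7070"
}
```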
Here are several points of advice that we recommend that you follow:
Grading scheme and marking process
We will follow a demo-style grading scheme, similar to A2. At a high level, your mark for this assignment is 20% of your final mark and has these components:
Note that the demo actually exercises both the Code and the Demo portion of your mark. So, the best way of thinking about the demo rubric below is that it is 100% of your mark. Of course, we reserve the right to look at your code on our own if we want to further convince ourselves about some functional aspect or to check that you do indeed implement some particular piece (e.g., calculating the number of coins a specific miner has).
The demo has several parts and will take 25 minutes:
Each demo will be supervised by 2 instructors: either two TAs or Ivan and one of the TAs. Each team has a guaranteed slot of 25 minutes. Note that 25 minutes may be a hard cut off, especially in cases where there is another team scheduled after your team. (We give ourselves about 5-10 minutes to deliberate your mark and review the demo after you leave the room. The more time we have for this, the more likely that you will get a fair shake).
Note that your demo will be run in our environment. That is, when you enter the demo room, we will have Azure VMs up and running and ready for your demo. The VMs will have:
We will release a set of IPs (public and private) for two sets of VMs on piazza. You will use one of these VM sets to demo your system.
When you come in, we will give you time to setup the miners. The miner topology will not be revealed to you. Each VM will contain a settings.json file which will be used to configure the miners. You will be responsible for starting the miners but in parts A-D we will be using and running our own client programs to test the behaviour of your system.
Part A: Single miner, multi client
In part A, we will run several client programs that are connected to the same miner. This part will involve us testing basic functionality of your system, such as creating a file, appending records to a file, and reading records from a file.
Part B: Multi miner, multi client
In part B, we will run several client programs that are connected to different miners. Things we will be testing (with total percentage adding up to 60% for Part B):
Part C: Topology Failures and Recovery
In part C, we will test your system's ability to survive miner failures and to be able to include newly joined miners. This may include network partitions/joins.
Part D: Extra Credit Scenarios
EC1: We will delete a file and verify that the file no longer exists via a client connected to a different miner. We will also verify that reading a record from, or appending a record to, a deleted file results in an error.
EC2: We will ask you to launch a malicious miner using the settings.json provided by us and you will be responsible for demonstrating how a malicious miner could abuse the system.
EC3: We will ask you to generate and show/explain logs by using ShiViz.
Part E: Design Questions
In this part we will ask you questions from a list of questions that we have curated.
Preparing for the demo
Nice to have, or strongly recommended for the demo:
What to bring for the demo: You will need your own laptop(s) to start the miners. We recommend bringing two devices so that you can coordinate with your partner and have some redundancy; four hands can also type faster than two.
Honesty and collaboration