Distributed Systems

CPSC 416, Fall 2018

Tu/Th 8-9:30AM, HDP 310, UBC course page

Course piazza

Office hours:

Anny ...... Mon 1:30-2:30pm, X151
Adam...... Tue 1-2pm, X151
Ivan........ Thu 10-11am, ICCS 327
Vaastav... Fri 3-4pm, X151


Course description

Leslie Lamport, a computer scientist who won the 2013 ACM Turing Award, gave the following definition of a distributed system:

A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.

Yet distribution provides numerous benefits. A system becomes more fault tolerant when it has fewer single points of failure and no centralized components. Adding physical nodes improves performance and scalability, letting the system handle more load. Distribution can also reduce latency: geographic diversity places resources closer to the clients who use the system.

Achieving these benefits is not easy. As the quote above illustrates, distributed systems can fail in complex ways and these systems are more difficult to build, test, and understand than centralized systems.

This course will introduce you to a broad range of topics in distributed systems. The tentative topics are listed in the schedule below. For the most part this will be a lecture-style course. However, distributed system concepts are notoriously challenging to internalize without first-hand experience. The emphasis of this course, therefore, will be on building distributed system prototypes, small and large.

Course pre-requisites: CPSC 317 (networks) and CPSC 313 (computer hardware and operating systems).

Course staff: Ivan Beschastnikh (Instructor), Vaastav Anand (TA), Anny Gakhokidze (TA), Adam Geller (TA).

Textbooks

There are three optional books for this course:

  1. Go Programming Language
  2. Programming in Go
  3. Distributed Systems: Principles and Paradigms (2nd Edition)

Although there are many tutorials introducing Go and the online Go documentation is well developed, some of you may find the first two books on the list helpful for a step-by-step introduction to Go.

Communication

Use the course Piazza for all course-related communication. The Piazza also supports private posts that you can use to communicate with the instructor and the TAs.

Course-level learning goals

The course will provide an opportunity for students to

  • understand key principles in designing and implementing distributed systems
  • reason about problems that involve distributed components
  • become familiar with important techniques for solving problems that arise in distributed contexts
  • build distributed system prototypes using the Go programming language

Go resources

In this course we will exclusively use the Go programming language for all course work. Learning a new programming language is an important skill. You will practice it in this course. For the most part I will expect that you learn this language on your own. We will be using Go version 1.9.7 (the latest on ugrad servers). If you use a personal machine, make sure to install exactly this version.

Go is a systems language designed at Google. It is especially well suited to building distributed systems. Like with any language, the fastest way to become proficient at Go is to put in the time writing programs in Go. Here are some resources to get you started:

Amanda and Stewart led an in-class Go tutorial in the Winter 2017 version of the course. Here is the recorded version: part 1, and part 2. We will run a Go tutorial during the second week of the course (see post on Piazza).

After you install the correct version of Go, you can install an IDE of your choice. Anny prefers JetBrains' GoLand. JetBrains provides free licenses for their software to anyone currently enrolled in school. Register here with your UBC email to receive a license, which you can renew each year that you are a student. You can then download GoLand and activate it using the license. When you create a new project, select the correct SDK version by choosing the path of your Go installation. Several advantages of GoLand:

  • smart completion
  • hover over a function and see its documentation
  • built-in debugger with a nice UI
  • quick navigation between files and symbols
  • easy refactoring

Schedule (a work in progress)

Sep 6
Thu
Course overview, assignment 1 review [slides]

Read through Go resources prior to class, and practice as much Go as you can.

Sep 11
Tue
Networks review: layering, e2e, fate sharing, internet design [slides]

Sep 13
Thu
Finish net review, RPC [slides]

Go RPC client and server code examples.

Sep 18
Tue
Distributed file systems: NFS [slides]

Assignment 1 due

Sep 20
Thu
Caching in AFS, dist. FS semantics (e.g., session semantics) [slides]
Sep 25
Tue
SPDY/HTTP 2.0, CDNs, Consistent caching [slides]
Sep 27
Thu
Peer-to-Peer (P2P) part 1: file sharing [slides]
Oct 2
Tue
Peer-to-Peer (P2P) part 2: DHTs and Bitcoin [in-class notes]

Assignment 2 due

Oct 4
Thu
Bitcoin continued; Project 1 overview [in-class notes]

Project 1 released

Oct 9
Tue
Time synchronization and logical time [Lamport/vector clocks] [slides]
Oct 11
Thu
Distributed mutual exclusion [slides]
    Readings:
  • Skim Chapter 9 from Kshemkalyani and Singhal 'Distributed Computing: Principles, Algorithms, and Systems'
Oct 16
Tue
Fault Tolerance, local faults
Oct 18
Thu
RAID
Oct 23
Tue
TBD

Ivan out of town. Visiting lecture, probably on (in)security of distributed machine learning.

Oct 25
Thu
Primary backup replication, chain replication
Oct 30
Tue
Transactions, part 1: ACID semantics and 2-phase locking

Nov 1
Thu
TBD

Ivan out of town. Visiting lecture, probably on software defined networking.
Project 1 due November 1

Nov 6
Tue
Transactions, part 2: logging

Nov 8
Thu
Two phase commit (2PC)

Nov 13
Tue
Three phase commit (3PC)

Nov 15
Thu
Quorum replication; Paxos protocol 1/2
Nov 20
Tue
Quorum replication; Paxos protocol 2/2

Nov 22
Thu
CAP theorem
Nov 27
Tue
Distributed computing: MapReduce and friends

Nov 29
Thu
Security in a distributed setting

Friday, November 30 is the last day of classes.

Dec 06
Thu

Final exam December 06 2018, 08:30 AM, Room TBD.

Assignments

There are 2 assignments. All assignments must be completed in Go. The first assignment must be done individually; the second must be done in a group of 2.

Solutions must be submitted using UBC GitHub (see below) by 11:59PM on the day of the deadline. Include any special instructions for compiling or running the code in a README.txt file.

Project 1

Project 1 is a larger assignment that must be done in a group of 2 students and must be deployed on Azure. Project 1 is due on November 1st.

Based on the Piazza poll, for project 1 you will construct a distributed file system on top of a P2P blockchain. You will build both the DFS and the blockchain from scratch.

Project 2

Project 2 is an open-ended project that must be done in a team of either 3 or 4 people and must be (at least partially) deployed on Azure. Project 2 has multiple TBD deadlines.

UBC GitHub submission instructions

We will use an enterprise version of GitHub at UBC for all assignment/project code and writeup submissions.

To hand in your code you will need to precisely follow a couple of steps. First, log into github.ugrad.cs. Notice that you are part of the CPSC416-2018W-T1 org. This is the org under which you will create all of your repositories in this course. All students and staff in the course have accounts under this org.

For each course deliverable you must create a new private repository under the above org. For A1 the repository must have the name A1-[USERID], where you replace [USERID] with your CS undergraduate login. For A2 (assignment 2), P1 (project 1), and P2 (project 2) you will use a similar format. Here is a picture of what your screen should look like when you create your repo and the different formats you should use for different course deliverables. Use these formats exactly as shown:



Next, you must add the staff team as a collaborator on your repository. If you do not do this, then we cannot access your code. To do this, go to your newly created repo, click on settings in the top right, then select Collaborators & teams in the left listing, and finally choose the staff team in the Add a team drop down:



Finally, you must give the staff team Admin permission level:



Work inside your repo as you would usually. We will mark the commit that immediately precedes the deadline time.

Exam

To practice for the exam we will go over 1-3 questions at the start of each class. You can also download the complete set of practice questions we have covered thus far (updated continuously).

Final exam December 06 2018, 08:30 AM, Room TBD.

Grading

The final course mark will be based on:
Assignment 1 (individual) 10%
Assignment 2 (pairs) 15%
Project 1 (pairs) 20%
Project 2 (team of 3 or 4) 30%
Final exam 20%
Participation (piazza/class) 5%

Note that Assignment 1 is an individual effort, while Assignment 2 and the two projects must be team efforts.
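
As an illustration of how these weights combine, here is a minimal sketch that computes a weighted sum. The weights come from the list above; the scores are invented for the example, and the component type and finalMark function are hypothetical names. How the course actually rounds or adjusts marks is not specified here.

```go
package main

import "fmt"

// component pairs a weight in percent (from the grading list) with a
// score on a 0-100 scale. The scores used in main are invented examples.
type component struct {
	name   string
	weight float64
	score  float64
}

// finalMark returns the weighted sum of the component scores.
func finalMark(cs []component) float64 {
	total := 0.0
	for _, c := range cs {
		total += c.weight / 100 * c.score
	}
	return total
}

func main() {
	cs := []component{
		{"Assignment 1", 10, 80},
		{"Assignment 2", 15, 90},
		{"Project 1", 20, 70},
		{"Project 2", 30, 85},
		{"Final exam", 20, 75},
		{"Participation", 5, 100},
	}
	fmt.Printf("final mark: %.1f\n", finalMark(cs))
}
```

Because the weights sum to 100, the result stays on the same 0-100 scale as the individual scores.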

Late policy

The deadline for any assignment can be extended by one day with a 20% penalty to the mark. Assignments will not be accepted more than 24 hours past the original deadline.

Deadline for project 1 can be extended under the same terms as the assignments.

Deadlines for project 2 cannot be extended.

If you have an emergency (e.g., health) that prevents you from meeting a deadline, you must notify the instructor before the deadline.

How to do well in this course

Learn Go early and practice it regularly. Learning a new language while being time constrained is stressful and not fun. Since the assignments rapidly increase in their difficulty, it will be to your advantage to learn Go as quickly as possible and to learn it well. The posted Go resources are a great starting point, but reading is no substitute for practice, bug, debug, practice, practice, bug, coffee, debug, practice, ...

Do not skimp on software engineering. Distributed systems are hard. They are hard to understand, to build, to debug, to run, to trace, to document, etc. Do not make your life any more difficult. Use best practices from software engineering to help you in this course. Write unit and integration tests, use version control, document your code with comments, write small prototypes, refactor your code, make your code readable and easy to run and debug. If you fail to follow best practices, they will come back to bite you later on. Unfortunately, this course will not explicitly teach you these best practices, but you probably took a course that introduced you to these concepts. If you have any questions, just ask us on Piazza.

Choose your teammates wisely. Some assignments will depend critically on your ability to work effectively with one other student. You are responsible for resolving personal and technical differences among teammates on your own. Let us know as early as possible if you have team concerns, before they turn into crises.

Reach out for success. This is intended to be a challenging fourth-year course, but that does not mean that you have to work through it on your own! The course Piazza should be your first stop for all technical questions. The course has specific office hours (see top of page), but the TAs and I are flexible. Send any of us an email to schedule a time to discuss the course, the assignments, etc. University students often encounter setbacks that can impact academic performance. Discuss your situation with us or an academic advisor as early as possible. For help in addressing mental or physical health concerns, including seeing a UBC counselor or doctor, visit this link.

Academic integrity and collaboration guidelines

UBC has a detailed policy regarding academic integrity. You must familiarize yourself with this policy.

Acknowledgments

Many of the materials used in this course are derived from CMU's 15-440: Distributed Systems course from Spring 2014, and are used with permission from the content authors.