Distributed Systems

CPSC 416, Winter 2021

Tu/Th 8-9:30AM, online, UBC page, Piazza, Canvas

Office hours


Course description

Leslie Lamport, a computer scientist who won the 2013 ACM Turing Award, gave the following definition of a distributed system:

A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.

Yet distribution provides numerous benefits. A system becomes more fault tolerant when it has fewer single points of failure and no centralized components. Adding physical nodes lets the system handle more load, which improves performance and scalability. Distribution can also reduce latency: geographic diversity places resources closer to the clients who use the system.

Achieving these benefits is not easy. As the quote above illustrates, distributed systems can fail in complex ways and these systems are more difficult to build, test, and understand than centralized systems.

This course will introduce you to a broad range of topics in distributed systems. The tentative topics are listed in the schedule below. For the most part this will be a lecture-style course. However, distributed system concepts are notoriously challenging to internalize without first-hand experience. The emphasis of this course, therefore, will be on building distributed system prototypes, small and large.

Course pre-requisites: CPSC 317 (networks) and CPSC 313 (computer hardware and operating systems).

Course staff: Ivan Beschastnikh (Instructor), Mayank Tiwari (TA), Finn Hackett (TA), Shayan Hosseini (TA), Shiqi He (TA), Jaafar (PostDoc).

Textbooks

There are three optional books for this course:

  1. Go Programming Language
  2. Programming in Go
  3. Distributed Systems: Principles and Paradigms (2nd Edition)

Although there are many tutorials introducing Go and the online Go documentation is well developed, some of you may find the first two books on the list helpful for a step-by-step introduction to Go.

Communication

This course will be held online, so it is essential to have clear channels of communication.

  • Use the course Piazza for all course-related communication. The Piazza also supports private posts that you can use to communicate with the instructor and the TAs.
  • We will use Zoom for synchronous sessions during the scheduled course hours. These sessions will be recorded and made available to you. You can find the Zoom links and Zoom recordings on the course Canvas.

Course-level learning goals

The course will provide an opportunity for students to

  • understand key principles in designing and implementing distributed systems
  • reason about problems that involve distributed components
  • become familiar with important techniques for solving problems that arise in distributed contexts
  • build distributed system prototypes using the Go programming language

Go resources

In this course we will use the Go programming language exclusively for all course work. Learning a new programming language is an important skill, and you will practice it in this course. For the most part I expect you to learn the language on your own. We will be using Go version 1.15.6 (available as /cs/local/bin/go on ugrad servers). If you use a personal machine, make sure to install the same version. Note that all homework solutions will be tested on the ugrad server machines.
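Because solutions are tested against a specific Go release, it can help to confirm which toolchain built your program. The following is a minimal sketch, not an official course tool: the constant `expectedVersion` and the helper `matchesExpected` are names made up for this example, and the check assumes an exact match on the version string reported by `runtime.Version`.

```go
package main

import (
	"fmt"
	"runtime"
)

// expectedVersion is the Go release named in the course notes, in the
// "goX.Y.Z" format that runtime.Version reports for release builds.
const expectedVersion = "go1.15.6"

// matchesExpected reports whether a version string is exactly the
// release the course expects. (Hypothetical helper for this sketch.)
func matchesExpected(v string) bool {
	return v == expectedVersion
}

func main() {
	v := runtime.Version() // version of the toolchain that built this binary
	if matchesExpected(v) {
		fmt.Println("toolchain OK:", v)
	} else {
		fmt.Printf("built with %s, course expects %s\n", v, expectedVersion)
	}
}
```

Running `go version` on the command line reports the same information without writing any code.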

Go is a systems language designed at Google. It is especially well suited to building distributed systems. As with any language, the fastest way to become proficient at Go is to put in the time writing programs in Go. Here are some resources to get you started:

Amanda and Stewart led an in-class Go tutorial in the Winter 2017 version of the course. Here is the recorded version: part 1, and part 2.

After you install the correct version of Go, you can install an IDE of your choice. One option is GoLand from JetBrains, which provides free licenses for its software to anyone currently enrolled in school. You have to register here with your UBC email, and you will be given a license for the software. You can renew this license each year that you are a student. You can then download GoLand and activate it using the license. When you create a new project, you will need to select the correct SDK version, which you can do by selecting the path of your Go installation. Several advantages of GoLand:

  • smart completion
  • hover over a function and see its documentation
  • built-in debugger with a nice UI
  • quick navigation between files and symbols
  • easy refactoring

Schedule (a work in progress)

Jan 12
Tue
Course overview, assignment 1 review [slides]

Read through Go resources prior to class, and practice as much Go as you can.

Jan 14
Thu
Networks review: layering, e2e, fate sharing, internet design [slides]

Assignment 1 due Jan 17 at 6pm PST

Jan 19
Tue
Finish net review, start on RPC [slides]
Jan 21
Thu
Distributed file systems: NFS [slides]

Assignment 2 due Jan 24 at 6pm PST

Jan 26
Tue
Caching in AFS, session semantics [slides]
Jan 28
Thu
SPDY/HTTP 2.0, CDNs, Consistent caching [slides]
Feb 2
Tue
Peer-to-Peer (P2P) part 1: file sharing [slides]
Feb 4
Thu
Peer-to-Peer (P2P) part 2: Bitcoin [in-class notes]

Assignment 3 due Feb 7 at 11:59pm PST

Feb 9
Tue
Time synchronization and logical time [Lamport/vector clocks] [slides]
Feb 11
Thu
Distributed mutual exclusion [slides]
    Readings:
  • Skim Chapter 9 from Kshemkalyani and Singhal 'Distributed Computing: Principles, Algorithms, and Systems'
Feb 16
Tue
No class (UBC reading break); no office hours
Feb 18
Thu
No class (UBC reading break); no office hours
Feb 23
Tue
Failures and RAID [slides]
Feb 25
Thu
Primary backup replication [slides]
Mar 2
Tue
Transactions, part 1: ACID semantics and 2-phase locking [slides]
Mar 4
Thu
Transactions, part 2: logging [slides]
Mar 9
Tue
Two phase commit (2PC) [slides]
Mar 11
Thu
2PC in other topologies and Three phase commit (3PC) [2pc-other-topologies slides, 3PC slides]

Assignment 4 due Mar 14 at 11:59pm PST
Mar 16
Tue
Quorum replication; Paxos protocol 1/2 [slides]
Mar 18
Thu
Quorum replication; Paxos protocol 2/2 [slides]
Mar 23
Tue
CAP theorem [slides]
Mar 25
Thu
Guest lecture by M. Shayan (Korbit AI) on Distributed Machine Learning [slides]
Mar 30
Tue
Guest lecture by Jodi S. (Google) on Privacy [slides]
Apr 1
Thu
Guest lecture by Amanda L. (Google) on data center infrastructure

Assignment 5 due Apr 2 at 11:59pm PST
Apr 6
Tue
Distributed computing: MapReduce and Spark [slides]
Apr 8
Thu
Guest lecture by Peter C. (Google) on Kubernetes [slides]
Apr 13
Tue
Security in a distributed setting [slides]

Assignment 6 due Apr 16 at 11:59pm PST
Apr 18
Sun

Final exam Sunday, Apr 18 2021, at 3:30PM.

Assignments

There will be at least 6 assignments. All assignments must be completed in Go. The first three assignments must be done individually; the following assignments will probably be done in groups of 2.

Solutions must be submitted using UBC GitHub (see below) by 6:00PM on the day of the deadline. Include any special instructions for compiling or running the code in your README.txt file.

UBC GitHub submission instructions

We will use an enterprise version of GitHub at UBC for all assignment/project code and writeup submissions.

Log into github.students.cs. Notice that you are part of the CPSC416-2020W-T2 org. This is the org under which you will see repositories (that we will create for you) for all of the assignments in the course.

Work inside your assignment repo as you usually would. Don't forget to push your commits. We will mark the commit that immediately precedes the deadline time.

Exam

To practice for the exam we will go over 1-3 questions at the start of each class. You can also download the complete set of practice questions we have covered thus far (updated continuously).

The final exam will take place Sunday, Apr 18 2021, at 3:30PM.

Grading

The final course mark will be based on:
Assignment 1 (individual) 7%
Assignment 2 (individual) 8%
Assignment 3 (individual) 15%
Assignment 4 (individual) 10%
Assignment 5 (group) 15%
Assignment 6 (group) 20%
Final exam 20%
Participation (piazza/zoom) 5%
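
The weights above sum to 100%. As a rough sketch of how they combine into a final mark, the program below computes a weighted sum. The weights come from the list above; the `component` type, the `finalMark` function, and the sample scores are made up for illustration, and the sketch assumes every component is scored out of 100 with no scaling or rounding rules.

```go
package main

import "fmt"

// component pairs a graded item with its weight in percent.
type component struct {
	name   string
	weight float64
}

// components mirrors the grading breakdown listed in the syllabus.
var components = []component{
	{"Assignment 1", 7},
	{"Assignment 2", 8},
	{"Assignment 3", 15},
	{"Assignment 4", 10},
	{"Assignment 5", 15},
	{"Assignment 6", 20},
	{"Final exam", 20},
	{"Participation", 5},
}

// finalMark computes the weighted sum of scores, each out of 100.
// A component missing from the map counts as 0.
func finalMark(scores map[string]float64) float64 {
	total := 0.0
	for _, c := range components {
		total += c.weight * scores[c.name] / 100
	}
	return total
}

func main() {
	// Hypothetical scores, for illustration only.
	scores := map[string]float64{
		"Assignment 1":  90,
		"Assignment 2":  85,
		"Assignment 3":  80,
		"Assignment 4":  75,
		"Assignment 5":  70,
		"Assignment 6":  65,
		"Final exam":    60,
		"Participation": 100,
	}
	fmt.Printf("final mark: %.1f%%\n", finalMark(scores))
}
```

Note that the six assignments together account for 75% of the mark, which matches the course's emphasis on building prototypes.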

Late policy

The deadline for one assignment can be extended by one day with no penalty to the mark. Extension requests must be made explicitly, and no later than 24 hours past the deadline. Make a private Piazza post to request an extension. Assignment solutions will not be accepted more than 24 hours past the original deadline.

If you have an emergency (e.g., health) that prevents you from meeting a deadline, you must notify the instructor before the deadline.

How to do well in this course

Learn Go early and practice it regularly. Learning a new language while being time constrained is stressful and not fun. Since the assignments rapidly increase in their difficulty, it will be to your advantage to learn Go as quickly as possible and to learn it well. The posted Go resources are a great starting point, but reading is no substitute for practice, bug, debug, practice, practice, bug, coffee, debug, practice, ...

Do not skimp on software engineering. Distributed systems are hard. They are hard to understand, to build, to debug, to run, to trace, to document, etc. Do not make your life any more difficult. Use best practices from software engineering to help you in this course. Write unit and integration tests, use version control, document your code with comments, write small prototypes, refactor your code, make your code readable and easy to run and debug. If you fail to follow best practices, they will come back to bite you later on. Unfortunately, this course will not explicitly teach you these best practices, but you probably took a course that introduced you to these concepts. If you have any questions, just ask us on Piazza.

Reach out for success. This is intended to be a challenging fourth-year course, but that does not mean that you have to work through it on your own! The course Piazza should be your first stop for all technical questions. The course has specific office hours (see top of page), but the TAs and I are flexible. Send any of us an email to schedule a time to discuss the course, the assignments, etc. University students encounter setbacks from time to time that can impact academic performance. Discuss your situation with us or an academic advisor as early as possible. For help in addressing mental or physical health concerns, including seeing a UBC counselor or doctor, visit this link.

Academic integrity, collaboration guidelines, resources

UBC has a detailed policy regarding academic integrity. You must familiarize yourself with this policy.

UBC provides resources to support student learning and to maintain healthy lifestyles but recognizes that sometimes crises arise and so there are additional resources to access including those for survivors of sexual violence. UBC values respect for the person and ideas of all members of the academic community. Harassment and discrimination are not tolerated nor is suppression of academic freedom. UBC provides appropriate accommodation for students with disabilities and for religious and cultural observances. UBC values academic honesty and students are expected to acknowledge the ideas generated by others and to uphold the highest academic standards in all of their actions. Details of the policies and how to access support are available here.

Acknowledgements

Many of the materials used in this course are derived from CMU's 15-440: Distributed Systems course from Spring 2014, and are used with permission from the content authors.