Distributed Systems

CPSC 416, Winter 2022

Tu/Th 5-6:30PM online in January, then in ESB 1013

UBC page, Piazza, Canvas

Office hours


Course description

Leslie Lamport, a computer scientist who won the 2013 ACM Turing Award, gave the following definition of a distributed system:

A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.

Yet distribution provides numerous benefits. A system becomes more fault tolerant when it has fewer single points of failure and no centralized components. Adding physical nodes can improve performance and make the system more scalable, so that it can handle more load. Distribution can also reduce latency: greater geographic diversity places resources closer to the clients who use the system.

Achieving these benefits is not easy. As the quote above illustrates, distributed systems can fail in complex ways and these systems are more difficult to build, test, and understand than centralized systems.

This course will introduce you to a broad range of topics in distributed systems. The tentative topics are listed in the schedule below. For the most part this will be a lecture-style course. However, distributed system concepts are notoriously challenging to internalize without first-hand experience. The emphasis of this course, therefore, will be on building distributed system prototypes, small and large.

Course pre-requisites: CPSC 317 (networks) and CPSC 313 (computer hardware and operating systems).

Course staff: Ivan Beschastnikh (Instructor), Yanze (TA), Mishaal (TA), Mayank (TA).

Textbooks

There are three optional books for this course:

  1. Go Programming Language
  2. Programming in Go
  3. Distributed Systems: Principles and Paradigms (2nd Edition)
Although there are many tutorials introducing Go and the online Go documentation is well developed, some of you may find the first two books on the list helpful for a step-by-step introduction to Go.

Communication

At least part of this course (in January) will be held online, so it is essential to have clear channels of communication.

  • Use the course Piazza for all course-related communication. The Piazza also supports private posts that you can use to communicate with the instructor and the TAs.
  • We will use Zoom for synchronous online sessions (in January) during the scheduled course hours. These sessions will be recorded and made available to you. You can find the Zoom links and Zoom recordings on the course Canvas.

Course-level learning goals

The course will provide an opportunity for students to

  • understand key principles in designing and implementing distributed systems
  • reason about problems that involve distributed components
  • become familiar with important techniques for solving problems that arise in distributed contexts
  • build distributed system prototypes using the Go programming language

Go resources

In this course we will use the Go programming language exclusively for all course work. Learning a new programming language is an important skill, and you will practice it in this course. For the most part I will expect you to learn this language on your own. We will be using Go version 1.16.7 (available at /cs/local/bin/go on ugrad servers). If you use a personal machine, make sure to install this exact version. Note, however, that all homework solutions will be tested on the ugrad server machines.
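One way to confirm which Go release a personal machine is using is to print the toolchain version from inside a program. The sketch below is only an illustration and is not part of the course materials: the helper name matchesRequired and the constant requiredVersion are hypothetical, and it assumes a release build of Go, for which runtime.Version returns a string of the form "go1.16.7".

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// requiredVersion is the Go release named in the course text.
// (Hypothetical constant, introduced only for this sketch.)
const requiredVersion = "1.16.7"

// matchesRequired reports whether a toolchain version string such as
// "go1.16.7" names exactly the required release. It assumes the
// "go<version>" format that release builds report.
func matchesRequired(toolchain, required string) bool {
	return strings.TrimPrefix(toolchain, "go") == required
}

func main() {
	// runtime.Version returns the version of the Go toolchain
	// that built this binary.
	v := runtime.Version()
	if matchesRequired(v, requiredVersion) {
		fmt.Printf("OK: built with %s\n", v)
		return
	}
	fmt.Printf("warning: built with %s, but the course expects go%s\n", v, requiredVersion)
}
```

Running this with "go run" on both your own machine and a ugrad server should print the same version if the two installations match.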

Go is a systems language originally introduced by Google. It is especially well suited to building distributed systems. As with any language, the fastest way to become proficient at Go is to put in the time writing programs in Go. Here are some resources to get you started:

Amanda and Stewart led an in-class Go tutorial in the Winter 2017 version of the course. Here is the recorded version: part 1, and part 2.

After you install the correct version of Go, you can install an IDE of your choice. One option is GoLand from JetBrains, which provides free licenses for its software to anyone currently enrolled in school. You have to register here with your UBC email, and you will be given a license for the software. You can renew this license each year that you are a student. You can then download GoLand and activate it using the license. When you create a new project, you will need to select the correct SDK version, which you can do by selecting the path of your Go installation. Several advantages of GoLand:

  • smart completion
  • hover over a function and see its documentation
  • built-in debugger with a nice UI
  • quick navigation between files and symbols
  • easy refactoring

Schedule (a work in progress; will change)

Jan 11
Tue
Course overview, assignment 1 review [slides]

Read through Go resources prior to class, and practice as much Go as you can.

Jan 13
Thu
Networks review: layering, e2e, fate sharing, internet design [slides]
Jan 18
Tue
Remote procedure calls (RPC) [slides]
Jan 20
Thu
Distributed file systems: NFS [slides]

Assignment 1 due Jan 24 at 6pm PST

Jan 25
Tue
Caching in AFS, session semantics [slides]
Jan 27
Thu
SPDY/HTTP 2.0, CDNs, Consistent caching [slides]

Feb 1
Tue
Peer-to-Peer (P2P) part 1: file sharing [slides]
Feb 3
Thu
Peer-to-Peer (P2P) part 2: Bitcoin [slides]
Feb 8
Tue
Time synchronization and logical time [Lamport/vector clocks] [slides]
Feb 10
Thu
Distributed mutual exclusion [slides]
    Readings:
  • Skim Chapter 9 from Kshemkalyani and Singhal 'Distributed Computing: Principles, Algorithms, and Systems'
Feb 15
Tue
Failures and RAID [slides]

Assignment 2 due Feb 15 at 6pm PST

Feb 17
Thu
Primary backup replication [slides]
Feb 22
Tue
No class (UBC reading break); no office hours
Feb 24
Thu
No class (UBC reading break); no office hours
Mar 1
Tue
Transactions, part 1: ACID semantics and 2-phase locking [slides]
Mar 3
Thu
Transactions, part 2: logging [slides]
Mar 8
Tue
Two phase commit (2PC) [slides]
Mar 10
Thu
2PC in other topologies and Three phase commit (3PC) [2pc-other-topologies slides, 3PC slides]

Assignment 3 due Mar 11 at 11:59pm PST

Mar 15
Tue
Quorum replication; Paxos protocol 1/2 [slides]
Mar 17
Thu
Quorum replication; Paxos protocol 2/2 [slides]
Mar 18
Fri

Project proposal drafts due

Mar 22
Tue
CAP theorem [slides]
Mar 24
Thu
Guest lecture by Peter C. (Google) on Kubernetes [slides]
Mar 25
Fri

Project proposals due

Mar 29
Tue
Guest lecture by Jodi S. (Google) on Privacy [slides]
Mar 31
Thu
Guest lecture by M. Shayan (Korbit AI) on Distributed Machine Learning. [slides]
Apr 1
Fri

Project milestone 1 due.

Apr 5
Tue
Distributed computing: MapReduce and Spark [slides]
Apr 7
Thu
Security in a distributed setting [slides]
Apr 8
Fri

Project milestone 2 due.

Apr 13
Wed

Final exam: Wednesday Apr 13 2022, 19:00 @ LIFE 2201.

Apr 15
Fri

Project milestone 3 due.

Apr 18-22

Project demos.

Apr 19
Tue

Project code and final reports due.

Assignments

All assignments must be completed in Go. More details TBD.

Solutions must be submitted using UBC GitHub (see below) by 6:00PM on the day of the deadline. Special instructions for compiling/running the code should be included as part of your README.txt file.

UBC GitHub submission instructions

We will use an enterprise version of GitHub at UBC for all assignment/project code and writeup submissions.

Log into github.students.cs. Notice that you are part of the CPSC416-2021W-T2 org. This is the org under which you will see repositories (that we will create for you) for all of the assignments in the course.

Work inside your assignment repo as you usually would. Don't forget to push your commits. We will mark the commit that immediately precedes the deadline time.

Exam

To practice for the exam we will go over 1-3 questions at the start of each class. You can also download the complete set of practice questions we have covered thus far (updated continuously).

Final exam: Wednesday Apr 13 2022, 19:00 @ LIFE 2201.

Grading

The final course mark will be based on:
Assignment 1 (individual) 5%
Assignment 2 (individual) 10%
Assignment 3 (group) 15%
Project: open ended (group) 45%
Final exam 20%
Participation (Piazza/Zoom/in-class) 5%

Late policy

You will receive a 0 for late work unless you have an approved extension.

For individual assignments, the deadline for one assignment can be extended by 24 hours with no penalty to the mark. Extension requests must be made explicitly with a private post to Piazza, no later than 24 hours past the deadline. Assignment solutions with an extension will not be accepted more than 24 hours past the original deadline.

For group assignments/projects you can request an extension of 24 hours as long as someone in your group has an unused extension. You cannot receive an extension of more than 24 hours for a group deliverable.

If you have an emergency (e.g., health) that prevents you from meeting a deadline, you must notify the instructor before the deadline.

How to do well in this course

Learn Go early and practice it regularly. Learning a new language while being time constrained is stressful and not fun. Since the assignments rapidly increase in their difficulty, it will be to your advantage to learn Go as quickly as possible and to learn it well. The posted Go resources are a great starting point, but reading is no substitute for practice, bug, debug, practice, practice, bug, coffee, debug, practice, ...

Do not skimp on software engineering. Distributed systems are hard. They are hard to understand, to build, to debug, to run, to trace, to document, etc. Do not make your life any more difficult. Use best practices from software engineering to help you in this course. Write unit and integration tests, use version control, document your code with comments, write small prototypes, refactor your code, make your code readable and easy to run and debug. If you fail to follow best practices, they will come back to bite you later on. Unfortunately, this course will not explicitly teach you these best practices, but you probably took a course that introduced you to these concepts. If you have any questions, just ask us on Piazza.

Reach out for success. This is intended to be a challenging fourth-year course, but that does not mean that you have to work through it on your own! The course Piazza should be your first stop for all technical questions. The course has specific office hours (see top of page), but the TAs and I are flexible. Send any of us an email to schedule a time to discuss the course, the assignments, etc. University students encounter setbacks from time to time that can impact academic performance. Discuss your situation with us or an academic advisor as early as possible. For help in addressing mental or physical health concerns, including seeing a UBC counselor or doctor, visit this link.

Academic integrity, collaboration guidelines, resources

UBC has a detailed policy regarding academic integrity. You must familiarize yourself with this policy.

UBC provides resources to support student learning and to maintain healthy lifestyles but recognizes that sometimes crises arise and so there are additional resources to access including those for survivors of sexual violence. UBC values respect for the person and ideas of all members of the academic community. Harassment and discrimination are not tolerated nor is suppression of academic freedom. UBC provides appropriate accommodation for students with disabilities and for religious and cultural observances. UBC values academic honesty and students are expected to acknowledge the ideas generated by others and to uphold the highest academic standards in all of their actions. Details of the policies and how to access support are available here.

Acknowledgements

Many of the materials used in this course are derived from CMU's 15-440: Distributed Systems course from Spring 2014, and are used with permission from the content authors.