TetriSched: Space-Time Scheduling for Heterogeneous Datacenters

Date
Location: ICCS X836

SPEAKER: Alexey Tumanov, Carnegie Mellon University

HOST: Andrew Warfield 

ABSTRACT

I'm planning to talk about TetriSched, a cluster scheduler that understands jobs' spatial (which types of resources) and temporal (when to run) preferences and exploits the flexibility in placement choices those preferences afford to produce efficient allocations of heterogeneous resources. It supports complex combinatorial soft constraints in a general way, through an algebraic expression language that captures arbitrary forms of placement preference, including the commonly cited locality and anti-affinity. We recently integrated TetriSched into the Hadoop YARN framework and ran experiments on a real system, which demonstrate that TetriSched significantly outperforms the YARN Capacity Scheduler thanks to its combination of gang scheduling, plan-ahead (the ability to reason about deferred resource allocation), and soft constraints.

BIO

I'm a PhD candidate in systems at Carnegie Mellon, working under the supervision of Greg Ganger. At CMU, I've also been fortunate to collaborate closely with Onur Mutlu, Mor Harchol-Balter, and Michael Kozuch. At a high level, my research interests revolve around systems support for large-scale and data-intensive distributed computing. For more detail, please refer to my publication list: http://www.cs.toronto.edu/~atumanov/

At the University of Toronto, I worked as a full-time research assistant under the supervision of Eyal de Lara and Michael Brudno, contributing to SnowFlock and SnowFlock-related projects. I received my research-based M.Sc. in Computer Science from York University in Toronto, working with advisors affiliated with the Centre for Vision Research, Robert Allison and Wolfgang Stuerzlinger. My thesis focused on mitigating and compensating for the delay, and the variability in that delay, inherent in distributed interactive virtual reality applications.

Lastly, I have also worked on distributed cluster technology R&D in industry. I was involved in the development of cluster middleware responsible for distributed datacenter resource management, allocation, and scheduling. I was also one of the key contributors to the development of the Intel Cluster Ready Open Cluster Stack (OCS).

This event's address: https://my.cs.ubc.ca/event/2014/11/tetrisched-space-time-scheduling-heterogeneous