Running the benchmark in a scalable way

Published on 2020-03-18 by Eduard Trulls

The benchmark relies on the Slurm job scheduler, which allows users to queue jobs (with dependencies) on multi-user clusters. We made this decision for the 2019 edition of the challenge, which ran almost exclusively on Compute Canada supercomputers. This allowed us to process a large number of entries, both challenge submissions and our own baselines, for an estimated total of ~55 core-years.
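To illustrate what queueing jobs with dependencies looks like in Slurm, here is a minimal sketch that builds sbatch command lines in which one job starts only after another has finished successfully. It is not the benchmark's own submission code: the helper build_sbatch_command, the script names, and the job ID 1001 are hypothetical and used purely for illustration. The only Slurm features assumed are the standard sbatch options --parsable and --dependency=afterok.

```python
import shlex


def build_sbatch_command(script, depends_on=()):
    """Build the argv list for submitting `script` with sbatch.

    If `depends_on` holds job IDs, the new job is held until all of
    them have completed successfully (Slurm's `afterok` dependency).
    """
    # --parsable makes sbatch print the job ID in a machine-readable form,
    # which is convenient when chaining submissions.
    cmd = ["sbatch", "--parsable"]
    if depends_on:
        job_ids = ":".join(str(job_id) for job_id in depends_on)
        cmd.append("--dependency=afterok:" + job_ids)
    cmd.append(script)
    return cmd


# Hypothetical two-stage pipeline: the second stage waits for the first.
first_stage = build_sbatch_command("extract_features.sh")
# 1001 stands in for the job ID that sbatch would return for the first stage.
second_stage = build_sbatch_command("match_features.sh", depends_on=[1001])

print(shlex.join(first_stage))
print(shlex.join(second_stage))
```

The sketch only prints the commands, so it can be run without a Slurm installation. On a real cluster, each command would be executed and the job ID returned by one submission passed on as the dependency of the next.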

The benchmark can also run on a single thread (with the --run_mode=interactive flag); this is fine for small-scale experiments, but time-consuming for anything larger. Setting up and maintaining a job scheduler on a multi-computer cluster, however, requires a significant amount of work and expertise. To lower this barrier, we have released instructions for setting up a scalable, on-demand cluster on the Google Cloud Platform (GCP), which you can find here:

https://github.com/etrulls/slurm-gcp

Please note that GCP provides 300 USD in free credits to new users.