Running the benchmark in a scalable way
The benchmark relies on the Slurm job scheduler, which allows users to queue jobs (with dependencies) on multi-user clusters. We made this decision for the 2019 edition of the challenge, which ran almost exclusively on Compute Canada supercomputers. This allowed us to process a large number of entries, both challenge submissions and our own baselines, for an estimated total of ~55 core-years.
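As an illustration of what queueing jobs with dependencies looks like in Slurm, here is a minimal Python sketch that builds sbatch command lines, where a second job only starts after the first one finishes successfully. This is not the benchmark's own submission code: the helper build_sbatch_command, the job names, the payload commands, and the job ID "1001" are all hypothetical. Only the sbatch options (--parsable, --job-name, --dependency=afterok, --wrap) are standard Slurm.

```python
import shlex
from typing import List, Optional, Sequence


def build_sbatch_command(
    job_name: str,
    payload: str,
    depends_on: Optional[Sequence[str]] = None,
) -> List[str]:
    """Build an sbatch argument list. Nothing is submitted here.

    Hypothetical helper for illustration; not part of the benchmark.
    """
    # --parsable makes sbatch print the job ID, which later jobs can
    # reference in their own --dependency option.
    cmd = ["sbatch", "--parsable", f"--job-name={job_name}"]
    if depends_on:
        # afterok: start only once all listed jobs have exited with code 0.
        cmd.append("--dependency=afterok:" + ":".join(depends_on))
    # --wrap lets sbatch run a single shell command without a batch script.
    cmd.append(f"--wrap={payload}")
    return cmd


# Hypothetical two-stage pipeline. The ID "1001" stands in for the job ID
# that sbatch would print after the first stage is submitted.
extract = build_sbatch_command("extract", "echo extracting features")
match = build_sbatch_command(
    "match", "echo matching features", depends_on=["1001"]
)

print(shlex.join(extract))
print(shlex.join(match))
```

On a real cluster, each argument list would be passed to subprocess.run, and the job ID read from its standard output would feed the depends_on argument of the next stage.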
The benchmark can also run on a single thread (with the --run_mode=interactive flag); this is fine for small-scale experiments, but time-consuming for anything larger.
Setting up and maintaining a job scheduler on a multi-computer cluster requires a significant amount of work and expertise.
To overcome this, we have released instructions to set up a scalable, on-demand cluster on the Google Cloud Platform (GCP),
which you can find here:
Please note that GCP provides 300 USD in free credits to new users.