Scheduler (Fairshare & RAC)
The WestGrid job scheduler is a priority queue with a backfill mechanism. The scheduler dispatches the highest-priority job in the queue if there are sufficient resources for it to run. If there are not, the scheduler looks for the next-highest-priority job whose execution will not overlap with the approximate earliest start time of the original job. That start time is approximate because jobs can finish before their time limit, so the scheduler works with an upper bound on the earliest start time. A job's priority is a weighted sum of several components. The most important components (by weight) are requested resources and fairshare.
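The dispatch rule described above can be sketched roughly as follows. This is a simplified illustration, not the scheduler's actual implementation: the `Job` class, `earliest_start`, and `pick_next` are hypothetical names, processors are the only resource modelled, and running jobs are assumed to use their full requested time.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: float
    procs: int       # processors requested
    walltime: float  # requested time limit, in hours


def earliest_start(job, free_procs, running, now=0.0):
    """Upper bound on when `job` could start.

    `running` is a list of (end_time, procs) pairs, where end_time assumes
    each running job uses its full time limit (hypothetical model).
    """
    available = free_procs
    if available >= job.procs:
        return now
    for end_time, procs in sorted(running):
        available += procs
        if available >= job.procs:
            return end_time
    return float("inf")  # the job never fits in this model


def pick_next(queue, free_procs, running, now=0.0):
    """Return the job to dispatch now, or None if nothing can run."""
    ordered = sorted(queue, key=lambda j: j.priority, reverse=True)
    if not ordered:
        return None
    top = ordered[0]
    if top.procs <= free_procs:
        return top
    # The top job cannot run yet: find when it could start at the latest.
    reserved = earliest_start(top, free_procs, running, now)
    # Backfill: the next job that fits now and ends before that start time.
    for job in ordered[1:]:
        if job.procs <= free_procs and now + job.walltime <= reserved:
            return job
    return None
```

In this sketch a short, small job can jump ahead of a larger, higher-priority one only when it is guaranteed to finish before the larger job could have started anyway.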
Resource Usage
At the moment the only resource request that affects your job's priority is the number of processors you request. The amount of time or memory you request has no impact on your job's priority, though memory-intensive runs are harder to dispatch regardless of priority.
Somewhat counter-intuitively (at first), asking for more processors will increase your job's priority. This is done to improve overall cluster performance: multi-node jobs are far less likely to be dispatched by the backfill mechanism, so they must be given a higher priority to compensate. The current contribution to your job's priority on Glacier and Orcinus is 100*<# of requested processors>.
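As a quick illustration of the processor contribution quoted above, here is a minimal sketch. The names `PROC_WEIGHT` and `processor_component` are made up for this example; only the factor of 100 comes from the text.

```python
PROC_WEIGHT = 100  # per-processor weight quoted for Glacier and Orcinus


def processor_component(requested_procs: int) -> int:
    """Priority contribution from the number of requested processors."""
    if requested_procs < 1:
        raise ValueError("a job must request at least one processor")
    return PROC_WEIGHT * requested_procs


# A 64-processor job gets a larger contribution than a serial job.
serial = processor_component(1)
parallel = processor_component(64)
```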
Fairshare (& RAC)
A user's (or account's) fairshare value is a weighted average of cluster usage over a set of disjoint time windows. For example, Orcinus and Glacier use 7 time windows that each last 36 hours with the following weights:
Note that these are not sliding windows. There is a set time at which the current window ends and is rolled over. For example, suppose that 30 hours into the current window a user has 10% cluster usage for w1 (the current window), 30% usage for windows w2 and w3, and 0% for all others. Their current fairshare value will be 0.108. If they stop using the cluster at this point, their fairshare value will reset to 0 after 252+6 hours.
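The weighted average over windows could be computed along these lines. This is only a sketch: `fairshare_value` is a hypothetical helper, the average is assumed to be normalized by the sum of the weights, and the equal weights used in the example are placeholders rather than the real Orcinus/Glacier weights, so the result need not match the 0.108 quoted above.

```python
def fairshare_value(window_usage, window_weights):
    """Weighted average of per-window usage fractions (w1 first)."""
    if len(window_usage) != len(window_weights):
        raise ValueError("need one weight per window")
    total_weight = sum(window_weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(u * w for u, w in zip(window_usage, window_weights))
    return weighted / total_weight


# Usage fractions for w1..w7 from the example in the text.
usage = [0.10, 0.30, 0.30, 0.0, 0.0, 0.0, 0.0]
# Placeholder weights (assumption): every window counts equally.
placeholder_weights = [1.0] * 7
value = fairshare_value(usage, placeholder_weights)
```

Weighting recent windows more heavily than older ones would make recent usage count for more in the resulting value.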
One important note: a user's/account's cluster usage in a time window is a percentage of the total usage of the cluster in that window, NOT a percentage of the available resources. So if the cluster is used by only one user in a time window, that user is treated as having 100% usage for the window regardless of how many nodes they actually used.
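To make the distinction concrete, here is a small sketch of usage computed as a share of total usage. The function name `usage_shares` and the processor-hour figures are invented for illustration.

```python
def usage_shares(usage_by_user):
    """Map each user to their fraction of the total usage in one window."""
    total = sum(usage_by_user.values())
    if total == 0:
        # Nobody used the cluster in this window.
        return {user: 0.0 for user in usage_by_user}
    return {user: used / total for user, used in usage_by_user.items()}


# A lone user is charged 100% of the window, however little they ran.
alone = usage_shares({"alice": 12.0})

# With two users, shares depend on each other's usage, not on cluster size.
shared = usage_shares({"alice": 30.0, "bob": 10.0})
```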
The fairshare value is used in conjunction with a user's/account's fairshare target. Without an RAC this is set to about 1-2% of a cluster. With an RAC it is set to whatever was awarded (e.g. a 300-node RAC on Orcinus would give a target of ~10%). The fairshare component of a job's priority is then:
For accounts without an RAC:
(FS Weight) * [
  (FS User Weight) * ((FS User Target) - (FS User Value))
  + (FS Account Weight) * ((FS Account Target) - (FS Account Value))
]
For accounts with an RAC:
(FS Weight) * [
  (FS User Weight) * ((FS User Target) - (FS User Value))
  + (FS Account Weight) * MAX(0, (FS Account Target) - (FS Account Value))
]
The only difference is the MAX(0, ...) in the account term: accounts with an RAC are not penalized for going over target.
On Orcinus and Glacier:
- FS Weight = 100
- FS User Weight = 50
- FS Account Weight = 100
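The fairshare component with these weights might be computed as in the sketch below. It assumes the user and account terms are summed before being multiplied by FS Weight, and that targets and values are given in the same units (fractions here). The function name and the `has_rac` flag are hypothetical.

```python
FS_WEIGHT = 100
FS_USER_WEIGHT = 50
FS_ACCOUNT_WEIGHT = 100


def fairshare_component(user_target, user_value,
                        account_target, account_value, has_rac=False):
    """Fairshare contribution to a job's priority (illustrative only)."""
    user_term = FS_USER_WEIGHT * (user_target - user_value)
    account_delta = account_target - account_value
    if has_rac:
        # Accounts with an RAC are not penalized for exceeding their target.
        account_delta = max(0.0, account_delta)
    account_term = FS_ACCOUNT_WEIGHT * account_delta
    return FS_WEIGHT * (user_term + account_term)


# Same usage, account 5 points over its target, with and without an RAC.
without_rac = fairshare_component(0.02, 0.02, 0.10, 0.15)
with_rac = fairshare_component(0.02, 0.02, 0.10, 0.15, has_rac=True)
```

In this example the account without an RAC receives a negative contribution, while the account with an RAC receives zero.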
The last important note is that fairshare values are specific to individual clusters on WestGrid; that is, using Orcinus heavily will not affect your (or your group's) priority on Glacier or Lattice.