WestGrid - quick user guide

This page is part of the EmpiricalAlgorithmics web.

Introduction

WestGrid operates high performance computing (HPC), collaboration and visualization infrastructure across western Canada. It encompasses 14 partner institutions across four provinces.
An extensive overview of WestGrid can be found on the WestGrid website. You can also read the QuickStart Guide for New Users at http://www.westgrid.ca/support/quickstart/new_users

How to get a WestGrid account?

  1. Lead researcher: You will select this option if you are the leader of a project (research group). You will be asked to enter some information about the nature of your research project before you apply for your user account. You will be given a Project ID Number that other collaborators in your group may cite when applying for their user accounts. Note that only faculty members can be project leaders.
  2. Join an existing project (research group): This option is for researchers who are supervised by a lead researcher (for example, a student working for a professor). To join a pre-existing group, you will need to obtain the Project ID Number from the project's leader and enter the number on the form. The project leader will be asked to verify the information you submit. You can look up project ID numbers using the web page https://rsg.nic.ualberta.ca/project_lookup.php.

To apply for an account, proceed to the Account Request page https://rsg.nic.ualberta.ca/.

What will you get in the next few days?

After you submit your application, you will get a few e-mails from WestGrid.

  1. WestGrid Account Application Received: WestGrid Account Management received your application.
  2. Asking Permission from Project Leader: If you are joining an existing project (usually the case for students), WestGrid will send an e-mail to the project leader asking for confirmation.
  3. WestGrid Application Accepted: Your application for a WestGrid account has been approved.
  4. WestGrid account created: WestGrid has set up an account for you on silo.westgrid.ca and hopper.westgrid.ca. These are storage servers for medium- and long-term data storage. For more information about using Silo and Hopper, please visit http://westgrid.ca/support/quickstart/silo. Note: The shell on Silo is restricted; it can only be used for managing and downloading files. You cannot run programs or scripts on Silo.
  5. Welcome to <cluster name>: Your account on the cluster has been activated. In my case, the cluster is glacier.westgrid.ca. Note: the storage servers and the clusters have different file systems. You cannot directly access files stored on a storage server from a cluster; you will need to use gcp to copy files between WestGrid machines.

How to transfer my files to/between WestGrid?

Assume my host machine is okanagan.cs.ubc.ca, my WestGrid storage server is silo.westgrid.ca and my cluster in WestGrid is glacier.westgrid.ca. The file I want to transfer is test.txt.

  • Transfer files between WestGrid machines (from glacier.westgrid.ca to silo.westgrid.ca)
        gcp test.txt username@silo.westgrid.ca:~/
       
  • Transfer files between your local machine to WestGrid (from okanagan.cs.ubc.ca to glacier.westgrid.ca)
        okanagan:> scp test.txt username@glacier.westgrid.ca:~/
        username@glacier.westgrid.ca's password: password
       
  • If you want to write a script to transfer many files from your local machine to WestGrid, entering the password for every transfer becomes a problem. Here is the solution: first log in on okanagan.cs.ubc.ca as user username and generate a pair of authentication keys. Do not enter a passphrase:
        okanagan:~> ssh-keygen -t rsa
        Generating public/private rsa key pair.
        Enter file in which to save the key (/ubc/cs/home/username/.ssh/id_rsa): 
        Enter passphrase (empty for no passphrase): 
        Enter same passphrase again:  
        Your identification has been saved in /ubc/cs/home/username/.ssh/id_rsa.
        Your public key has been saved in /ubc/cs/home/username/.ssh/id_rsa.pub.
        The key fingerprint is:
        0e:97:88:0f:86:70:39:8f:44:13:e3:f4:5f:79:32:cd username@okanagan
        
    Go to ~/.ssh and transfer id_rsa.pub to glacier.westgrid.ca, into .ssh in your home directory.
        scp id_rsa.pub username@glacier.westgrid.ca:~/.ssh/
        
    On glacier, append id_rsa.pub to authorized_keys2 (on recent OpenSSH versions the file is named authorized_keys):
        cat id_rsa.pub >> authorized_keys2
        
    Now, try to use scp to transfer files (no password required).
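With the key installed, a batch transfer reduces to a short loop. A minimal sketch, assuming the files to copy match *.txt; "username" and the destination path are placeholders to adapt:

```shell
#!/bin/sh
# Copy every .txt file in the current directory to glacier.
# With the public key set up as above, scp does not prompt for a password.
# "username" and the destination directory are placeholder values.
dest="username@glacier.westgrid.ca:~/"

for f in *.txt; do
  [ -e "$f" ] || continue   # glob matched nothing; skip the literal pattern
  scp "$f" "$dest"
done
```

The same pattern works toward silo.westgrid.ca for archiving data.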

Running Jobs

A great majority of the computational work on WestGrid systems is carried out through non-interactive batch processing. Job scripts containing commands to be executed are submitted from a login server to a batch job handling system, which queues the requests, allocates processors and starts and manages the jobs. The system software that handles your batch jobs consists of two pieces: a resource manager (TORQUE) and a scheduler (Moab). This system is fairly similar to our SunGridEngine. For detailed information, please visit http://westgrid.ca/support/running_jobs.

A batch job script is a text file of commands for the UNIX shell to interpret, similar to what you could execute by typing directly at a keyboard. The job is submitted to a queue using the qsub command. How long a job waits in the queue depends on factors such as system load and the priority assigned to the job. When appropriate resources become available, the job is started on one or more assigned processors. A job will be terminated if it exceeds its allotted time limit or, on some systems, if it exceeds memory limits. By default, the standard output and error streams from the job are directed to files in the directory from which the job was submitted. For detailed information on how to write a job script, please visit http://westgrid.ca/support/running_jobs#directives
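As an illustration, a minimal TORQUE job script might look like the following sketch; the job name, resource limits, and ./myprogram are placeholder values, not WestGrid defaults:

```shell
#!/bin/bash
#PBS -N example_job           # job name shown in queue listings
#PBS -l walltime=01:00:00     # wall-clock limit (hh:mm:ss)
#PBS -l nodes=1:ppn=1         # one processor on one node
#PBS -l mem=2gb               # memory request

# TORQUE starts jobs in the home directory; move to the directory
# the job was submitted from.
cd "$PBS_O_WORKDIR"

./myprogram input.txt > result.txt
```

Saved as job.pbs, it would be submitted with qsub job.pbs; the standard output and error files appear in the submission directory when the job ends.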

A few useful commands:

  • qstat: Check the status of jobs and queues on the cluster
  • qsub: Submit jobs to the queue (you can also submit an array job, e.g. qsub -t 1-100)
  • qdel: Delete your own jobs if something goes wrong
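The array form mentioned above runs one script many times; on glacier-era TORQUE each task sees its index in the PBS_ARRAYID environment variable. A sketch with placeholder program and file names:

```shell
#!/bin/bash
#PBS -N array_example
#PBS -l walltime=00:30:00
#PBS -l nodes=1:ppn=1

cd "$PBS_O_WORKDIR"

# Submitted as "qsub -t 1-100 array.pbs", this queues 100 tasks;
# TORQUE sets PBS_ARRAYID to 1..100, one value per task, so each
# task selects its own input and output files.
./myprogram "input_${PBS_ARRAYID}.txt" > "output_${PBS_ARRAYID}.txt"
```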

A few notes if you are using glacier.westgrid.ca (http://guide.westgrid.ca/guide-pages/jobs.html):

  • #PBS -l qos=debug: For testing your scripts without waiting (10-15 min jobs). (This applies to orcinus as well.)
  • #PBS -l nodes=1,software=MATLAB: For submitting MATLAB jobs. glacier.westgrid.ca has 20 MATLAB licenses.
  • http://guide.westgrid.ca/guide-pages/queue_state: For cluster utilization

Scheduler (Fairshare & RAC)

The WestGrid job scheduler is a priority queue with a backfill mechanism. The scheduler will dispatch the highest-priority job in the "eligible jobs" queue if there are sufficient resources for it to run. If there are insufficient resources to start the highest-priority job, the scheduler will find the next-highest-priority job whose execution will not overlap with the approximate* earliest start of the original job (*since jobs can finish before their time cutoff, the scheduler uses an upper bound on a job's earliest start time). A job's priority is a weighted sum of processor-equivalent hours discounted over a 10-day time period.

Requested Resources

The resource that affects dispatching is processor-equivalent hours.

Processor-equivalent hours refer to the number of processors your job "takes away" from the pool of resources. With small-memory jobs, processor-equivalent hours are the same as processor hours; with high-memory jobs, however, the memory left on a node may become insufficient for the other processors to be utilized. The QDR nodes have 24 GB for 12 processors, i.e. 2 GB per processor. If you use X GB, you are counted as using max(# processors requested, X/2) processors. For example, a job that requests 1 processor and 6 GB is counted as max(1, 3) = 3 processor-equivalents.
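The max(# processors requested, X/2) rule can be sketched as a small helper; the divisor of 2 GB per core is specific to the QDR nodes described above:

```shell
# Processor-equivalents for a job: max(processors requested, mem_gb / 2).
# The 2 GB-per-core divisor matches the 24 GB / 12-processor QDR nodes.
pe() {
  awk -v p="$1" -v m="$2" 'BEGIN { x = m / 2; if (p > x) x = p; print x }'
}

pe 1 6    # 1 processor but 6 GB -> counted as 3 processor-equivalents
pe 4 4    # 4 processors, 4 GB  -> counted as 4
```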

Fairshare (& RAC)

A user's (or account's) fairshare value is a weighted average of cluster usage in a set of disjoint time windows. For example, Orcinus and Glacier use 7 time windows, each lasting 36 hours, with the following weights:

window  w1    w2    w3    w4    w5    w6    w7
weight  1.0   0.9   0.81  0.73  0.66  0.59  0.53

One important note: a user's/account's cluster usage in a time window is a percentage of the total usage of the cluster in that window, NOT a percentage of the available resources. So if the cluster is used by only one user in a time window, that user is treated as having 100% usage for the window regardless of how many nodes they actually used. In practice this is rarely a concern, since orcinus typically has very few idle cores.

The fairshare value is used in conjunction with an account's fairshare target. Without a RAC, this is set to about 2% of a cluster. With a RAC, it is set to whatever was awarded (e.g. a 200-node RAC on Orcinus would give a target of ~2%). The fairshare component of a job's priority is then:

  • Without a RAC:
          (FS Weight) 
              * (FS User Weight) * ((FS User Target) - (FS User Value))
              * (FS Account Weight) * ((FS Account Target) - (FS Account Value))
  • With a RAC:
          (FS Weight) 
              * (FS User Weight) * ((FS User Target) - (FS User Value)) 
              * (FS Account Weight) * MAX(0, (FS Account Target) - (FS Account Value))

(difference => accounts with a RAC are not penalized for going over their target)

On Orcinus and Glacier:

  • FS Weight = 100
  • FS User Weight = 50
  • FS Account Weight = 100

The last important note is that fairshare values are specific to individual clusters; i.e. using Orcinus heavily will not affect your (or your group's) priority on other WestGrid clusters.

Priority of jobs within a single user's queue

You can look at your own jobs using qstat -u <username>; the cluster-wide summary below is in the format produced by Moab's showq command.

Below is an example queueing state:

6111297                v7q8    Running     1     3:04:25  Thu Jan 14 17:46:04
6111294                v7q8    Running     1     3:04:25  Thu Jan 14 17:46:04
6111295                v7q8    Running     1     3:04:25  Thu Jan 14 17:46:04
6111293                v7q8    Running     1     3:04:25  Thu Jan 14 17:46:04
6111296                v7q8    Running     1     3:04:25  Thu Jan 14 17:46:04

5 active jobs          5 of 9616 processors in use by local jobs (0.5%)
                        919 of 931 nodes active      (98.71%)

eligible jobs----------------------
JOBID              USERNAME      STATE PROCS     WCLIMIT            QUEUETIME

6111302                v7q8       Idle     1     6:00:00  Tue Jan 12 13:10:49
6111299                v7q8       Idle     1     6:00:00  Tue Jan 12 13:10:49
6111300                v7q8       Idle     1     6:00:00  Tue Jan 12 13:10:49
6111298                v7q8       Idle     1     6:00:00  Tue Jan 12 13:10:49
6111301                v7q8       Idle     1     6:00:00  Tue Jan 12 13:10:49

5 eligible jobs

blocked jobs-----------------------
JOBID              USERNAME      STATE PROCS     WCLIMIT            QUEUETIME

6111303                v7q8       Idle     1     6:00:00  Tue Jan 12 13:10:49
6111304                v7q8       Idle     1     6:00:00  Tue Jan 12 13:10:49
6111305                v7q8       Idle     1     6:00:00  Tue Jan 12 13:10:49
6111306                v7q8       Idle     1     6:00:00  Tue Jan 12 13:10:49
6111307                v7q8       Idle     1     6:00:00  Tue Jan 12 13:10:49

5 blocked jobs

The jobs with "eligible" status are the only jobs the dispatch system considers for allocation. The dispatcher does not know about your "blocked" jobs until they are upgraded to "eligible" status. The dispatcher takes the highest-priority job among all eligible jobs and dispatches it if there are sufficient resources. As described earlier, lower-priority jobs may only be dispatched if their walltimes do not exceed the earliest possible start time of higher-priority jobs (i.e. the minimum of the remaining runtimes of currently running jobs that would free enough space for the higher-priority job). This means the user must wait for high-resource "eligible" jobs to be dispatched before lower-resource "blocked" jobs can be.

Tracking Dispatching and Priority

To see the usage of your group or individual account, first go to the directory /global/system/info/ (cd /global/system/info/). Within the ./fair_share subdirectory there are files with "fair share" information for every day in the current month, and further subdirectories containing files going back 5 years.

Each file contains the percentage usage for all accounts and users according to the time-window weighting scheme described above. If you grep the file for your account and user, you will get something like this:

FSInterval        %     Target       0       1       2       3       4       5       6
-------------
ACCT
-------------
gdx-911-ae       11.46   7.50+   12.62   15.47   12.12    7.61   10.05   11.89    8.78
gdx-911-aa*       2.11   0.50-    1.40    3.28    3.90    3.33    1.27    0.00 -------

USER
-------------
v7q8              3.90 -------    4.58    5.81    4.49    1.78    3.75    4.16    1.71

Account gdx-911-ae has 11.46% processor-equivalent usage of orcinus, weighted over the 7 36-hour time windows. The dispatch system's target usage for account gdx-911-ae is 7.50+ (at least 7.5) percent of orcinus in processor-equivalents. As of January 2016, orcinus has 10,000 processors, so a 7.5% allocation means the account should have ~750 processors allocated to it at any one time.
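The grep step itself can be sketched self-contained; the sample lines below stand in for the real daily file, which on orcinus lives under /global/system/info/fair_share/ (the account and user names match the example above):

```shell
# Filter fairshare output for one account and one user.
# "sample" is a captured fragment used here so the sketch runs anywhere;
# on orcinus you would point grep at the current day's file instead.
sample='gdx-911-ae       11.46   7.50+   12.62   15.47
v7q8              3.90 -------    4.58    5.81
other-acct        1.00 -------    0.10    0.20'

printf '%s\n' "$sample" | grep -E 'gdx-911-ae|v7q8'
```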

The ./stats subdirectory contains files with detailed usage information for all accounts. If you grep for your accounts, the file will display usage like this:

           |--------- Active ------|---------------------------- Completed -------------------------------|
acct         Jobs Procs ProcHours    Jobs    %    PHReq       %     PHDed      %   FSTgt    AvgXF   AvgQH
gdx-911-ae    107   481   13981.3     373   8.9   2.35K     0.16    8.95K    2.05  2.00+    14.5    70.2
gdx-911-aa     73   292   27376.5      -0  -0.0  ------    -0.00    7.65K    1.75  2.00     -0.0    -0.0

The number of jobs, processors (Procs), and processor-equivalent hours (ProcHours) are shown for active jobs. The "Completed" section shows statistics for completed jobs aggregated over the whole year. PHReq corresponds to processor hours requested; PHDed appears to correspond to processor hours dedicated to (i.e. actually consumed by) the group.

Disk Quota

Your disk quotas are based on the number of files, and not just the amount of disk space you use.

To check your quota on orcinus, type the following in a directory within your filesystem:

lfs quota -u v7q8 ./

A handy short script to count files in your directories recursively is shown below. It obtains the number of entries in every subdirectory below the current one and displays the 50 directories containing the most files.

find . -xdev -type d -print0 |
while IFS= read -r -d '' dir; do
  # count the entries directly inside $dir (NUL-separated, so unusual
  # file names are handled safely)
  echo "$(find "$dir" -maxdepth 1 -print0 | grep -zc .) $dir"
done |
sort -rn |
head -50

When managing disk usage, it may also be useful to identify the most space-intensive locations in the file system. The command below sorts the folders in the current directory by disk usage and prints the 10 largest. It can take substantial time to execute if the subdirectories contain a large number of files.

du -sch .[!.]* * | sort -h -r | head -n 10

Topic revision: r6 - 2016-01-21 - cchris13
 