---+ WestGrid - quick user guide

This page is part of the EmpiricalAlgorithmics web.

%TOC%

---++ Introduction

WestGrid operates high performance computing (HPC), collaboration and visualization infrastructure across western Canada. It encompasses 14 partner institutions across four provinces. %BR% An extensive overview of WestGrid can be found at the [[http://www.westgrid.ca/][WestGrid website]]. You can also read the *QuickStart Guide for New Users* at [[http://www.westgrid.ca/support/quickstart/new_users]]

---++ How to get a WestGrid account?

When you apply, you choose one of two options:

   1. *Lead researcher*: Select this option if you are the leader of a project (research group). You will be asked to enter some information about the nature of your research project before you apply for your user account. You will be given a Project ID Number that collaborators in your group can cite when applying for their own user accounts. Note that only faculty members can be project leaders.
   1. *Join an existing project (research group)*: This option is for researchers who are supervised by a lead researcher (for example, a student working for a professor). To join an existing group, obtain the Project ID Number from the project leader and enter it on the form. The project leader will be asked to verify the information you submit. You can look up project ID numbers at [[https://rsg.nic.ualberta.ca/project_lookup.php]].

To apply for an account, proceed to the Account Request page [[https://rsg.nic.ualberta.ca/]].

---++ What will you get in the next few days?

After you submit your application, you will get a few e-mails from WestGrid.

   1. *WestGrid Account Application Received*: WestGrid Account Management has received your application.
   1. *Asking Permission from Project Leader*: If you are joining an existing project (usually the case for students), WestGrid will send an e-mail to the project leader asking for confirmation.
   1. *WestGrid Application Accepted*: Your application for a WestGrid account has been approved.
   1. *WestGrid account created*: WestGrid has set up an account for you on *silo.westgrid.ca* and *hopper.westgrid.ca*. These are storage servers for medium and long-term data storage. For more information about using Silo and Hopper, please visit [[http://westgrid.ca/support/quickstart/silo]]. Note: The shell on Silo is restricted; it can only be used for managing and downloading files. You cannot run programs or scripts on Silo.
   1. *Welcome to _cluster name_*: Your account on the _cluster name_ has been activated. In my case, the _cluster name_ is *glacier.westgrid.ca*. Note: The storage servers and the cluster use different file systems, so you cannot directly access files on a storage server from the cluster. Use =gcp= to copy files between different WestGrid machines.

---++ How to transfer my files to/between WestGrid?

Assume my host machine is *okanagan.cs.ubc.ca*, my WestGrid storage server is *silo.westgrid.ca* and my WestGrid cluster is *glacier.westgrid.ca*. The file I want to transfer is =test.txt=.

   * Transfer files between WestGrid machines (from *glacier.westgrid.ca* to *silo.westgrid.ca*)
<verbatim>
gcp test.txt username@silo.westgrid.ca:~/
</verbatim>
   * Transfer files from your local machine to WestGrid (from *okanagan.cs.ubc.ca* to *glacier.westgrid.ca*)
<verbatim>
okanagan:> scp test.txt username@glacier.westgrid.ca:~/
username@glacier.westgrid.ca's password: password
</verbatim>
   * If you want to write a script that transfers many files from your local machine to WestGrid, entering the password each time will be a problem. Here is the solution: first log in on *okanagan.cs.ubc.ca* as user _username_ and generate a pair of authentication keys. Do not enter a passphrase:
<verbatim>
okanagan:~> ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/ubc/cs/home/username/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /ubc/cs/home/username/.ssh/id_rsa.
Your public key has been saved in /ubc/cs/home/username/.ssh/id_rsa.pub.
The key fingerprint is:
0e:97:88:0f:86:70:39:8f:44:13:e3:f4:5f:79:32:cd username@okanagan
</verbatim>
Go to =~/.ssh= and copy =id_rsa.pub= to the =.ssh= directory in your home directory on *glacier.westgrid.ca*:
<verbatim>
scp id_rsa.pub username@glacier.westgrid.ca:~/.ssh/
</verbatim>
Then, in =~/.ssh= on *glacier.westgrid.ca*, append =id_rsa.pub= to =authorized_keys2=:
<verbatim>
cat id_rsa.pub >> authorized_keys2
</verbatim>
Now =scp= should transfer files without asking for a password.

---++ Running Jobs

The great majority of the computational work on WestGrid systems is carried out through non-interactive batch processing. Job scripts containing the commands to be executed are submitted from a login server to a batch job handling system, which queues the requests, allocates processors, and starts and manages the jobs. The system software that handles your batch jobs consists of two pieces: a resource manager (TORQUE) and a scheduler (Moab). This system is fairly similar to our SunGridEngine. For detailed information, please visit [[http://westgrid.ca/support/running_jobs]].

A batch job script is a text file of commands for the UNIX shell to interpret, similar to what you could execute by typing directly at a keyboard. The job is submitted to a queue using the =qsub= command. How long a job waits in the queue depends on factors such as system load and the priority assigned to the job. When appropriate resources become available, the job is started on one or more assigned processors. A job will be terminated if it exceeds its allotted time limit or, on some systems, if it exceeds memory limits. By default, the standard output and error streams from the job are directed to files in the directory from which the job was submitted.
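As a rough illustration of the kind of script described above, a minimal batch job might look like the following sketch. The job name, the resource requests and the echoed message are placeholders chosen for illustration, not WestGrid recommendations. =PBS_O_WORKDIR= and =PBS_JOBID= are set by TORQUE when the job runs; the fallbacks are only there so the script can also be tried by hand.

```shell
#!/bin/bash
#PBS -N example_job
#PBS -l nodes=1
#PBS -l walltime=00:10:00

# TORQUE sets PBS_O_WORKDIR to the directory qsub was run from.
# Fall back to the current directory when the script is run by hand.
workdir="${PBS_O_WORKDIR:-$PWD}"
cd "$workdir" || exit 1

# PBS_JOBID is assigned by the batch system; "interactive" is a placeholder
# used only when the script is not running under the batch system.
job_id="${PBS_JOBID:-interactive}"

# Replace this line with the actual commands of the job.
echo "Job $job_id running on $(uname -n) in $workdir"
```

Saved under a name such as =example_job.sh= (a hypothetical file name), the script would be submitted with =qsub example_job.sh=.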
For detailed information on how to write a job script, please visit [[http://westgrid.ca/support/running_jobs#directives]]

A few useful commands:

   * =qstat=: Check the status of queued and running jobs
   * =qsub=: Submit jobs to the queue (you can also submit an array job, for example =qsub -t 1-100=)
   * =qdel=: Delete your own jobs if something goes wrong

A few notes if you are using *glacier.westgrid.ca* ([[http://guide.westgrid.ca/guide-pages/jobs.html]]):

   * =#PBS -l qos=debug=: For testing your scripts without waiting.
   * =#PBS -l nodes=1,software=MATLAB=: For submitting MATLAB jobs. *glacier.westgrid.ca* has 20 MATLAB licenses.
   * [[http://guide.westgrid.ca/guide-pages/queue_state]]: For cluster utilization
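For the array-job form of =qsub= listed among the useful commands, a script can pick its input from the task index. The sketch below assumes that TORQUE exports the index as =PBS_ARRAYID= for jobs submitted with =qsub -t= (worth checking against the cluster's own documentation); the =input_N.txt= naming scheme and the job name are hypothetical.

```shell
#!/bin/bash
#PBS -N example_array
#PBS -l nodes=1
#PBS -l walltime=00:10:00

# Assumption: TORQUE exports the array index as PBS_ARRAYID for jobs
# submitted with "qsub -t". Default to 1 so the script can be run by hand.
task_id="${PBS_ARRAYID:-1}"

# Hypothetical naming scheme: one input file per array task.
input_file="input_${task_id}.txt"

# Replace this line with the command that processes the input file.
echo "Task $task_id would process $input_file"
```

Submitted with something like =qsub -t 1-100 example_array.sh=, each of the 100 tasks would then see a different index and work on its own input file.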
Topic revision: r1 - 2009-09-29 - xulin730
Copyright © 2008-2025 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.