Difference: CLUMEQ (1 vs. 2)

Revision 2 (2006-05-21) - bleuant

Line: 1 to 1
 
META TOPICPARENT name="WESTGRID"
CLUMEQ is a supercomputer available to McGill students (though I'm sure any student attending a Canadian university may have access). See details here: http://www.clumeq.mcgill.ca/. Unfortunately, it is somewhat outdated in terms of both software and hardware. However, it is a large parallel system.
Line: 28 to 28
  8) I have made a Matlab script that deletes all of a user's queued jobs. Check it out here: qdelall.m. Use it like this under Matlab: qdelall('your_user_name');
Added:
>
>
9) Sometimes there aren't enough free licenses to execute your program. This is a pain. However, someone has solved this: see Stupid Cluster Tricks. Good luck with this.

10) To work around the point above for Matlab licenses, one may include the following lines in one's Matlab code:

	 % wait, retrying once a minute, until the image toolbox license is checked out
	 while ~license('checkout', 'image_toolbox')
		  pause(60);
	 end
Note: This can be done for every toolbox required. However, to prevent possible license-related deadlocks, I highly suggest checking out all required Matlab toolbox licenses manually within your Matlab script (most probably during your initialisation procedure), as the Matlab license() function does not allow one to release a toolbox license. But, of course, this only works if one can initiate a Matlab session.

11) To solve the above 2 problems, one can take the ultimate step and compile the M-file into a system executable using Matlab's compiler. However, one must be aware of the licensing limitations associated with a stand-alone executable produced by the Matlab compiler. Of course, if one uses a call listed in the "Can not be compiled" column, then one must revert to the solution above. Gee, this is getting complicated.

12) The ultimate Matlab licensing solution would involve a system executable which attempts to launch a Matlab engine. It monitors the standard error output stream (i.e. stderr), and if it detects an error "-4" (the error number associated with insufficient Matlab licenses), it relaunches the Matlab engine until it doesn't fail. The problem with this solution is that it races against all the other processes in a polling manner, which is quite inelegant. So really this isn't the ultimate solution.
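
The relaunch loop from item 12 can be sketched roughly as below. This is only an illustration in Python, not an existing tool: the function name run_until_licensed, the one-minute wait, and the exact stderr marker text are assumptions, and the marker should be adjusted to whatever your installation actually prints for error "-4". The run and sleep arguments exist only so the loop can be tried out without Matlab installed.

```python
import subprocess
import time

# Assumption: the license shortage appears on stderr as a message containing
# this text; change the marker to match what your site's Matlab prints.
LICENSE_ERROR_MARKER = "License Manager Error -4"


def run_until_licensed(command, wait_seconds=60, max_attempts=None,
                       run=subprocess.run, sleep=time.sleep):
    """Re-run `command` until stderr no longer reports a license shortage.

    Returns (attempts, completed_process). `run` and `sleep` are injectable
    (hypothetical hooks) so the loop can be exercised without Matlab.
    """
    attempts = 0
    while True:
        attempts += 1
        result = run(command, capture_output=True, text=True)
        if LICENSE_ERROR_MARKER not in result.stderr:
            # Either the job ran, or it failed for a reason unrelated to licenses.
            return attempts, result
        if max_attempts is not None and attempts >= max_attempts:
            raise RuntimeError("no Matlab license after %d attempts" % attempts)
        # Polling: we are still racing every other process waiting for a license.
        sleep(wait_seconds)
```

One might call it as run_until_licensed(["matlab", "-nodisplay", "-r", "my_script; exit"]), where my_script is a placeholder for your own M-file. Setting max_attempts keeps a job from polling forever, but it does nothing about the race described above.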

  Note: Please feel free to edit this page. If something isn't clear, then feel free to email Albert Law.

Revision 1 (2006-05-10) - bleuant

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="WESTGRID"
CLUMEQ is a supercomputer available to McGill students (though I'm sure any student attending a Canadian university may have access). See details here: http://www.clumeq.mcgill.ca/. Unfortunately, it is somewhat outdated in terms of both software and hardware. However, it is a large parallel system.

Here are some things I found out about CLUMEQ that might help someone who is new to it (after reading the CLUMEQ FAQ).

0) Forget about CLUMEQ. WESTGRID (http://www.westgrid.ca) has glacier (http://www.westgrid.ca/support/Facilities#head-7d986acf67ff63ef2a2e9d0553cd80bba84c2aa3), which is much bigger and better in almost every single way. See the WESTGRID wiki.

Note: 2006-05-05 CLUMEQ is about to get a $30,000,000 upgrade. It might be worth comparing the two systems in 2007.

1) Opening an account is more of a formality than anything. Everyone gets accepted.

2) Matlab is available, but only in single-processor mode.

3) I have made a Matlab script that takes an M-file (whose primary function is to run many identical test cases with different parameters after some setup code at the top of the M-file) and creates a job queue file to submit to the CLUMEQ queue. See a graphical explanation here. See the code here: clumeq.m.

Note: Albert has made a better launcher file, and it can be found in the WESTGRID wiki. However, it seems to fail to place jobs into the proper queue (single/multi/big/clumeq). This doesn't affect the processes themselves.

4) Only 16 single-processor processes will be executed at any one time, of which at most 12 may be yours.

5) The admin, Patrice, is very helpful and responsive. When troubles arise, do email him.

6) CLUMEQ is woefully underused, so there's a lot of time available to each active user. This applies more or less to the general usage of single processors.

7) CLUMEQ makes 2 files, stdout and errout, after running your job queue file (i.e. PBS file). However, the error file sometimes gets cut off if Matlab is writing to it (for unknown reasons). This sometimes makes debugging Matlab difficult. Beware....

8) I have made a Matlab script that deletes all of a user's queued jobs. Check it out here: qdelall.m. Use it like this under Matlab: qdelall('your_user_name');
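
The qdelall.m script itself is not reproduced on this page. As a rough sketch of the same idea outside Matlab, the Python below lists a user's jobs with qstat -u and removes each one with qdel. It assumes a PBS-style listing in which every job row begins with the job ID followed by the user name; the helper parse_job_ids and the injectable run argument are made up for this example and are not part of the original script.

```python
import subprocess


def parse_job_ids(qstat_output, user):
    """Return job IDs from `qstat -u <user>` style output.

    Assumption: each job row starts with the job ID followed by the user
    name; header and separator rows are skipped because their second
    column is not the user name.
    """
    job_ids = []
    for line in qstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == user:
            job_ids.append(fields[0])
    return job_ids


def qdelall(user, run=subprocess.run):
    """Delete every job listed for `user` and return the deleted job IDs."""
    listing = run(["qstat", "-u", user], capture_output=True, text=True)
    job_ids = parse_job_ids(listing.stdout, user)
    for job_id in job_ids:
        # One qdel call per job keeps a single failure from blocking the rest.
        run(["qdel", job_id], capture_output=True, text=True)
    return job_ids
```

Calling qdelall('your_user_name') mirrors the Matlab usage above. It is worth running parse_job_ids on a real listing first to check that the column layout on your system matches the assumption.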

Note: Please feel free to edit this page. If something isn't clear, then feel free to email Albert Law.

 