% qsub simple.sh
You should receive a status report that provides information about all jobs currently known to the grid engine system. For each job, the status report lists the following items:
Job ID, which is the unique number that is included in the submit confirmation
Name of the job script
Owner of the job
State indicator; for example r means running
Submit or start time
Name of the queue in which the job runs
You can check the output of finished jobs by examining their stdout and stderr redirection files. By default, these files are generated in the job owner's home directory on the host that ran the job. The file names are composed of the job script file name, followed by a .o extension for the stdout file or a .e extension for the stderr file, followed by the unique job ID. Thus, if your job was the first ever executed in a newly installed grid engine system, its stdout and stderr files can be found under the names simple.sh.o1 and simple.sh.e1 respectively.
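The naming rule above can be sketched as a small shell helper. This is only an illustration: the function name job_output_files is hypothetical and not part of the grid engine system, and the use of basename assumes that only the file name of the script, not its directory, enters the output file names.

```shell
#!/bin/sh
# Hypothetical helper (not a grid engine command): print the default
# stdout and stderr file names for a job, following the rule
# <script file name>.o<job ID> and <script file name>.e<job ID>.
job_output_files() {
    script_name=$(basename "$1")   # assumption: directory part is dropped
    job_id=$2                      # job ID from the submit confirmation
    printf '%s.o%s\n' "$script_name" "$job_id"
    printf '%s.e%s\n' "$script_name" "$job_id"
}

# First job ever submitted on a newly installed system:
job_output_files simple.sh 1
```

By default, the resulting names would be looked up in the job owner's home directory on the host that ran the job.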
This section describes how to use the commands qstat, qdel, and qmod to monitor, delete, and modify jobs from the command line.
To monitor jobs, type one of the following commands, guided by the information that is detailed in the following sections:
% qstat
% qstat -f
% qstat -ext
qstat with no options provides an overview of submitted jobs only. qstat -f additionally includes information about the currently configured queues. qstat -ext contains details such as up-to-date job usage and tickets assigned to a job.
In the first form, a header line indicates the meaning of the columns. The purpose of most of the columns should be self-explanatory. The state column, however, contains single character codes with the following meaning: r for running, s for suspended, q for queued, and w for waiting. See the qstat(1) man page for a detailed explanation of the qstat output format.
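As a rough illustration of reading the state column, the following shell sketch translates a state string one character at a time, using only the four codes listed above. The function name describe_state is hypothetical, and any character outside those four is reported as unknown rather than guessed; the qstat(1) man page remains the reference for the full set of codes.

```shell
#!/bin/sh
# Hypothetical helper (not a grid engine command): spell out the
# single-character state codes r, s, q, and w from a qstat state column.
describe_state() {
    state=$1
    result=""
    while [ -n "$state" ]; do
        rest=${state#?}          # everything after the first character
        code=${state%"$rest"}    # the first character itself
        case $code in
            r) word=running ;;
            s) word=suspended ;;
            q) word=queued ;;
            w) word=waiting ;;
            *) word="unknown($code)" ;;   # not covered in this sketch
        esac
        result="${result:+$result, }$word"
        state=$rest
    done
    printf '%s\n' "$result"
}

describe_state r     # a running job
describe_state qw    # a job that is queued and waiting
```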
The second form is divided into two sections. The first section displays the status of all available queues. The second section, titled PENDING JOBS, shows the status of the sge_qmaster job spool area. The first line of the queue section defines the meaning of the columns with respect to the queues that are listed. The queues are separated by horizontal lines. If jobs run in a queue, they are printed below the associated queue in the same format as in the qstat command in its first form. The pending jobs in the second output section are also printed as in qstat's first form.
Controlling Jobs With qdel and qmod
To control jobs from the command line, type one of the following commands with the appropriate arguments.
% qdel arguments
% qmod arguments
Use the qdel command to cancel jobs, regardless of whether the jobs are running or are spooled. Use the qmod command to suspend and resume (unsuspend) jobs already running.
For both commands, you need to know the job identification number, which is displayed in response to a successful qsub command. If you forget the number, you can retrieve it with qstat, as described in the previous section.
Here are several examples of the qdel and qmod commands:
% qdel job-id
% qdel -f job-id1, job-id2
% qmod -s job-id
% qmod -us -f job-id1, job-id2
% qmod -s job-id.task-id-range
In order to delete, suspend, or resume a job, you must be the owner of the job or a grid engine manager or operator.
You can use the -f (force) option with both commands to register a job status change at sge_qmaster without contacting sge_execd. You might want to use the force option in cases where sge_execd is unreachable, for example, due to network problems. The -f option is intended for use only by the administrator. In the case of qdel, however, users can force deletion of their own jobs if the flag ENABLE_FORCED_QDEL in the cluster configuration qmaster_params entry is set. See the sge_conf(5) man page for more information.
From the command line, type the following command with appropriate arguments.
% qsub arguments
The qsub -m command requests email to be sent to the user who submitted a job or to the email addresses specified by the -M flag if certain events occur. See the qsub(1) man page for a description of the flags. An argument to the -m option specifies the events. The following arguments are available:
b – Send email at the beginning of the job.
e – Send email at the end of the job.
a – Send email when the job is rescheduled or aborted (for example, by using the qdel command).
s – Send email when the job is suspended.
n – Do not send email. n is the default.
Use a string made up of one or more of the letter arguments to specify several of these options with a single -m option. For example, -m be sends email at the beginning and at the end of a job.
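As a sketch of how such an event string might be checked before submission, the following shell function accepts only strings built from the letters b, e, a, s, and n. The function name valid_mail_events and the address user@example.com are placeholders introduced for illustration. The check looks at the letters only and does not judge whether a combination is sensible, for example n together with other letters.

```shell
#!/bin/sh
# Hypothetical helper (not a grid engine command): succeed only if the
# argument is a non-empty string made up of the -m letters b, e, a, s, n.
valid_mail_events() {
    case $1 in
        "" | *[!beasn]*) return 1 ;;   # empty, or contains another character
        *) return 0 ;;
    esac
}

events=be
if valid_mail_events "$events"; then
    # Print the command that would be run; the address is a placeholder.
    echo "qsub -m $events -M user@example.com simple.sh"
else
    echo "invalid -m argument: $events" >&2
fi
```

With events set to be, the printed command requests email at the beginning and at the end of the job.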
The grid engine system provides a set of ancillary programs (commands) for users to do the following tasks:
Submit and delete jobs
Check job status
Suspend or enable queues and jobs
For grid users, frequently used commands include:
qdel – Provides the means for a user, operator, or manager to send signals to jobs or to subsets thereof.
qhost – Displays status information about execution hosts.
qstat – Provides a status listing of all jobs and queues associated with the cluster.
qsub – The user interface for submitting batch jobs to the grid engine system.
qhold – Holds back submitted jobs from execution.
qlogin – Initiates a telnet or similar login session with automatic selection of a low-loaded, suitable host.
qmake – A replacement for the standard UNIX make facility. qmake extends make by its ability to distribute independent make steps across a cluster of suitable machines.
qmod – Enables the owner to suspend or enable a queue. All currently active processes that are associated with this queue are also signaled.
qmon – Provides an X Windows Motif command interface and monitoring facility.
qresub – Creates new jobs by copying running or pending jobs.
qrls – Releases jobs from holds that were previously assigned to them, for example, through qhold.
qrsh – Can be used for various purposes, such as the following.
To provide remote execution of interactive applications through the grid engine system. qrsh is comparable to the standard UNIX facility rsh.
To allow for the submission of batch jobs that, upon execution, support terminal I/O and terminal control. Terminal I/O includes standard output, standard error, and standard input.
To provide a submission client that remains active until the batch job finishes.
To allow for the grid engine software-controlled remote execution of the tasks of parallel jobs.
qselect – Prints a list of queue names corresponding to specified selection criteria. The output of qselect is usually fed into other grid engine system commands to apply actions on a selected set of queues.
qsh – Opens an interactive shell in an xterm on a lightly loaded host. Any kind of interactive jobs can be run in this shell.
qtcsh – A fully compatible replacement for the widely known and used UNIX C shell (csh) derivative, tcsh. qtcsh provides a command shell with the extension of transparently distributing execution of designated applications to suitable and lightly loaded hosts through grid engine software.