IBM RS/6000 SP2 Supercomputer

2002/03/05(23:37) from 218.149.65.204
Author: Kang Jul Ki (jkkang65@hanmail.net)   Views: 1967, Lines: 194
Re: [LoadLeveler] Job Command File Description - 2
Introduction to Using LoadLeveler


Contents
Introduction
Command Interface to LoadLeveler
Classes
Nodes vs Processors
Submitting Requests
Monitoring Queues and Requests
Canceling Requests
Examples
More Information
Introduction
LoadLeveler is the batch scheduler running on the IBM SP. To run parallel jobs, you submit them to LoadLeveler and you use it to ascertain the status of jobs. Unlike interactive jobs, batch jobs are controlled via scripts (which you must write), and once you submit a job script, you can log out, and go for coffee, as LoadLeveler rounds up the resources required by the job, runs it, and handles standard I/O streams for you.

Command Interface to LoadLeveler
Here are frequently used commands ("man" pages are available); a sample session using them follows this list:

llclass

Tells you about the available classes (queues)


llsubmit

Submits a job to be dispatched by LoadLeveler


llq

Returns information about jobs that have been dispatched.


llcancel

Removes a job which you have previously submitted
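
Putting these together, a typical session might look like the following sketch. The script name (mpi_job.cmd) and job id (icehawk1.123.0) are hypothetical, and the exact confirmation messages depend on the installation:

llclass                    # list the available classes
llsubmit mpi_job.cmd       # submit a job command file
llq                        # show all running and queued jobs
llq -l icehawk1.123.0      # detailed status of one job
llcancel icehawk1.123.0    # remove the job from the queues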


Classes
The term "batch queues" in other scheduling systems (like NQS) is analagous to the term "classes" in the LoadLeveler world. NQS pipe queues don't have a LoadLeveler analog--you submit directly to the destination class, and must be sure that the specific resources (such as number of nodes and run-time hours) you request are within the range provided by the class.

The command "llclass" shows the names of the available classes. To see the resource limits of a particular class, use:

llclass -l

Nodes vs Processors
The IBM SP is a collection of symmetric multiprocessing (SMP) nodes. Each node consists of a small number of processors which share one memory (on icehawk, there are four processors per node, sharing 2 GB). LoadLeveler allocates nodes, not processors. Thus, if you need 12 processors, you should request three nodes, four processors each. Requesting more nodes would result in wasted processors and unnecessary charges to your project.
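
For instance, a hedged sketch of the node-related keywords for a 12-process MPI job on four-processor nodes (the values are illustrative only):

# @ node = 3
# @ tasks_per_node = 4

This allocates 3 x 4 = 12 processors, with none left idle on the allocated nodes.
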
Submitting Requests
The command:

llsubmit <script_file>

will submit the given script for processing. You must write a script specific to your job. It contains the information LoadLeveler needs to allocate the resources your job requires, to handle standard I/O streams, and to run the job.

The following sample script runs an MPI program on 8 CPUs (using all four processors on each of two nodes):

#!/bin/ksh
# @ output = $(Executable).$(jobid).out
# @ error = $(Executable).$(jobid).err
# @ environment = MP_SHARED_MEMORY=yes
# @ notification  = never
# @ network.mpi = css0,not_shared,US
# @ node_usage = not_shared
# @ wall_clock_limit=1800
# @ job_type = parallel
# @ node = 2
# @ tasks_per_node = 4
# @ class = Qlarge
# @ queue

cd /u1/uaf/username/Progs/MLP
./mlp

See example #1 for a complete discussion of this script.

Monitoring Queues and Requests
The command:

llq

will show all the jobs currently running or queued. For details about your particular job, issue the command:

llq -l <job_id>

where <job_id> is obtained from the "Id" field of the llq output.

Canceling Requests
The command:

llcancel <job_id>

where <job_id> is obtained from the "Id" field of the llq output, will remove the job from the queues and terminate the job if it is running.

Examples
Example #1
This is a basic LoadLeveler script to run an 8 processor MPI job.
#!/bin/ksh
#
# ---
# --- Begin LoadLeveler Job Specifications ---
# ---
# @ output = $(Executable).$(jobid).out
# @ error = $(Executable).$(jobid).err
# @ environment = MP_SHARED_MEMORY=yes
# @ notification  = never
# @ network.mpi = css0,not_shared,US
# @ node_usage = not_shared
# @ wall_clock_limit=1800
# @ job_type = parallel
# @ node = 2
# @ tasks_per_node = 4
# @ class = Qlarge
# @ queue

# ---
# --- Begin executable shell commands ---
# ---
cd /u1/uaf/username/Progs/MLP
./mlp

The script consists of two main parts. Instructions to LoadLeveler come first in the file and appear on lines which begin with:

# @

The LoadLeveler specifications end with:

# @ queue

which says to queue the request. The second part of the script contains the shell commands that will be executed when the job runs.


Here's a line-by-line discussion of the sample script:

#!/bin/ksh
Specifies the shell to be used when executing the command portion of the script. Korn shell is the default, but it doesn't hurt to specify it.



# ---
# --- Begin LoadLeveler Job Specifications ---
# ---
# @ output = $(Executable).$(jobid).out

Standard output from this job will be copied to this file. The file name is generated at run time according to the definition, and will have the name of the script file and the unique LoadLeveler job id.

# @ error = $(Executable).$(jobid).err
Standard error file.

# @ environment = MP_SHARED_MEMORY=yes
Sets the given variable in the environment of the job--you could specify a number of variables this way. (MP_SHARED_MEMORY should always be set to "yes.")
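
If your job needs several variables, they can be listed in one environment statement; the usual LoadLeveler form separates the entries with semicolons. A sketch (the second variable and its value are purely illustrative):

# @ environment = MP_SHARED_MEMORY=yes; MY_DATA_DIR=/u1/uaf/username/data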

# @ notification = never
LoadLeveler will "never" send email notification of events regarding this job. Other notification options include "always," "error," and "start."
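
If you do want mail, a sketch combining notification with the notify_user keyword (the address is a placeholder):

# @ notification = error
# @ notify_user = username@arsc.edu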

# @ network.mpi = css0,not_shared,US
Sets the network adapter and communication protocol for MPI: css0 is the SP switch adapter, "not_shared" requests dedicated use of the adapter, and US selects the low-latency User Space protocol (rather than IP). For MPI jobs, you should always use these settings.

# @ node_usage = not_shared
Specifies that once allocated, your job will not share nodes with other jobs. This is the only mode possible on icehawk.

# @ wall_clock_limit=1800
Specifies the wall clock time limit of your job, in seconds (1800 seconds is 30 minutes). (This request must fall within the limits of the class you're requesting.)
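
The limit may also be given in [[hours:]minutes:]seconds form, assuming your installation accepts it; this sketch requests the same 30 minutes:

# @ wall_clock_limit = 00:30:00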

# @ job_type = parallel
Declares that your job is a parallel job (rather than a serial one).

# @ node = 2
Tells LoadLeveler that your job requires 2 SMP nodes. (This request must fall within the limits of the class you're requesting.)

# @ tasks_per_node = 4
Together, the two specifications tasks_per_node x node determine the total number of MPI processes that will be created. Often tasks_per_node will equal the number of physical processors per node; however, other scenarios exist. For instance, a hybrid MPI/OpenMP code might run with one MPI process per node, but then spawn four OpenMP threads per node to use all four processors (see the sketch below).
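
A hedged sketch of that hybrid scenario (the OMP_NUM_THREADS setting is an illustrative assumption, not part of the sample job above):

# @ node = 2
# @ tasks_per_node = 1
# @ environment = MP_SHARED_MEMORY=yes; OMP_NUM_THREADS=4

This starts one MPI process on each of the two nodes; each process then spawns four OpenMP threads to occupy its node's four processors.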

# @ class = Qlarge
Specifies the LoadLeveler class.

# @ queue
Required as the final LoadLeveler specification! Queues your job. Any LoadLeveler keywords following queue are ignored.

 
# ---
# --- Begin executable shell commands ---
# ---
cd /u1/uaf/username/Progs/MLP
The first shell command to be run when LoadLeveler starts the job. This changes the default directory to that containing the user's executable file.
./mlp
Runs the user's executable file, mlp.
More Information
"man" pages ("man llq",  "man llsubmit",  etc.)
http://www.rs6000.ibm.com/resource/aix_resource/sp_books/loadleveler/index.html
ARSC Consulting
Arctic Region Supercomputing Center * P.O. Box 756020 * Fairbanks, Alaska * 99775-6020
voice: (907) 474-6935 * fax: (907) 474-5494 * e-mail: info@arsc.edu

