Deprecated: please see new documentation site.



This document is meant to be a condensed yet complete introduction to LoadLeveler's Job Command files and is specific to the LONI P5 environment. If a LoadLeveler keyword or option that you know from other experience with LoadLeveler cannot be found on this page, it is because that item is not currently applicable on the LONI P5s. Full documentation for Job Command files can be found in the online book, Using and Administering LoadLeveler (in pdf format).


Introduction

A Job Command file is simply a text file that may contain the following different types of information:

  • LoadLeveler keyword statements
  • Shell command statements
  • LoadLeveler variables
  • Comment Statements

The following rules dictate the form of your Job Command file:

  • Keyword statements begin with # @. There can be any number of blanks between the # and the @.
  • Comments begin with a # just as they do for shell scripts.
  • Statement components are separated by blanks.
  • The backslash \ is the line continuation character. The continued line must not begin with # @.
  • Keywords are case insensitive.

Your Job Command file may tell LoadLeveler what you want to run through the executable keyword, or it may serve as the executable itself if you either omit the executable keyword or explicitly set it to the Job Command file. Your Job Command file may therefore be a shell script that drives the programs that you want to execute.
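
As a rough illustration of the rules above, the following sketch shows a Job Command file that serves as its own executable. The job name, the class chosen, and the output file names are assumptions made for the example, and the shell commands at the bottom merely stand in for the programs you actually want to run.

```shell
#!/bin/sh
# An ordinary comment: a line starting with # that is not followed by @.
# @ job_name = first_job
#    @ class = workq
# @ output   = first_job.out
# @ error    = first_job.err
# No executable keyword is set, so this file itself is run as the job.
# @ queue

# Everything below is plain shell, executed when the job runs.
start_dir=$(pwd)
echo "Job running on $(uname -n) in $start_dir"
```

Because the keyword statements look like comments to the shell, the same file can be checked by running it directly before it is submitted.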

The Job Command file may also contain different job steps that together constitute your job. You may name each job step and you must have a queue keyword entry for each job step you want to run. Unless otherwise noted, the keywords you set for the first job step will be inherited by all subsequent job steps. By default, LoadLeveler will view each job step as an independent entity but, by using the dependency keyword, you can conditionally execute different programs depending on the return value of the previous job steps.
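
A minimal sketch of a two-step job might look like the following. The step names and the two scripts, prepare.sh and compute.sh, are hypothetical. The dependency expression assumes that the first step signals success with a return value of 0.

```shell
# Settings made for the first step are inherited by later steps.
# @ job_name   = two_step_example
# @ class      = workq
# @ output     = $(job_name).$(step_name).out
# @ error      = $(job_name).$(step_name).err

# First job step.
# @ step_name  = prepare
# @ executable = prepare.sh
# @ queue

# Second job step: considered only if the step named prepare returned 0.
# @ step_name  = compute
# @ dependency = (prepare == 0)
# @ executable = compute.sh
# @ queue
```

Each step ends with its own queue statement, and the second step overrides only the keywords that differ from the first.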

Job Command File Keywords

  • account_no: Allows you to specify an allocation to associate with a job. The user's default allocation is used if none is specified.
 # @ account_no = string
  • arguments: Specify a list of arguments to pass to the program you list as the executable. Default is no arguments.
 # @ arguments = arg1 arg2 ....
  • blocking: Blocking specifies that tasks be assigned to machines in multiples of a certain integer. This may be an integer value or unlimited. Unlimited blocking specifies that tasks be assigned to each machine until it runs out of initiators, at which time tasks will be assigned to the machine which is next in the order of priority. By default, nothing is set.
 # @ blocking = unlimited 
  • checkpoint: A value of interval, yes or no indicates if a job is able to be checkpointed. Checkpointing a job is a way of saving the state of the job so that if the job does not complete it can be restarted from the saved state rather than starting the job from the beginning. Default value is no.
 # @ checkpoint = yes
  • class: Specify the job class (queue) you want to submit your job to. The command llclass will list the available classes on LONI P5s and show how busy each class is currently. The current choices are: checkpt, workq, preempt, single, or interactive. The default is checkpt.
 # @ class = workq 
  • comment: Associate a user defined text comment with the job.
 # @ comment = this is my very first job
  • core_limit: Set the hard and/or soft limits for the size of the core file that may result from your job. The value is a comma-separated pair of integer resource values. The units are described in the Resource Limits section below.
 # @ core_limit = 100mb
  • cpu_limit: Set the hard and/or soft limit for the amount of cpu time a submitted job step can use. The value is a comma separated pair of times in the form hh:mm:ss.ff. Default is unlimited.
 # @ cpu_limit = 01:20:00,01:00:00
  • data_limit: Set the hard and/or soft limit for the size of the data segment to be used by the job step. The values are comma-separated integers in the units shown in the Resource Limits section below. The default is unlimited.
 # @ data_limit = 16mw,12mw
  • dependency: Specifies the dependencies between job steps. A job step that depends on a previous job step may evaluate logical expressions based on the previous job step's return values. For more info, see LoadLeveler_Job_Chains_and_Dependencies
 # @dependency ...
  • environment: Specify your initial environment variables when your job starts. Multiple environment specifications are separated with semicolons. Special values are as follows:
    • COPY_ALL All environment variables from your shell are copied. This is the default.
    • $var The variable var is copied into your job's environment.
    • !var The variable var is NOT to be copied.
    • var=value The variable var is set to value and copied into your job's environment.
 # @ environment = COPY_ALL; !var2; var1=up
  • error: Set the name of the file that will hold standard error from your job. If you do not specify a file, standard error goes to std.err.
 # @ error = filename
  • executable: For serial jobs, the executable keyword gives the name of the program to run. For parallel jobs, the executable should be /usr/bin/poe or a shell script that itself invokes poe. If executable is not set, the executable is assumed to be the Job Command file itself. The executable keyword statement sets the $(base_executable) to the executable filename without the directory portion as a side effect.
 # @ executable = a.out
  • file_limit: Specifies the hard and/or soft limit for the size of a file. The value is a pair of comma-separated resource values (see Resource below). Default is no set values.
 # @ file_limit = 4gb
  • hold: Specifies whether you want to place a hold on your job when you submit it. For this purpose you should select a user hold as this is the type of hold that you have permission to remove. The hold will remain in place until you release it with the llhold -r command. Holds may be of type user, system or usersys (both a user and a system hold). Default is no hold setting.
 # @ hold = user
  • image_size: Tell LoadLeveler the maximum virtual image size, in kilobytes, that your program will occupy during execution. If you specify a size that is too small, your job may be dispatched to a machine with insufficient resources and may crash. Conversely, overestimating the size may make it difficult for LoadLeveler to find a machine with sufficient resources to run your job. Default is the size of the executable file.
 # @ image_size = 200
  • initialdir: The pathname of the directory that will serve as the initial working directory for your job. File names in the Job Command file that are not absolute path names (do not begin with a /) are relative to the initialdir. Default is the current working directory at the time your job is submitted to LoadLeveler.
 # @ initialdir = pathname
  • input: Specify the name of the file to use as standard input when your job runs. If not specified, /dev/null is used.
 # @ input = filename
  • job_cpu_limit: Set the hard and/or soft limit for the CPU time to be used by all processes in a job. If a job starts a process that itself forks other processes, the sum total of CPU time consumed will be regulated by this limit. The value is a pair of comma-delimited time values. The valid units and special values for limits are described in the Resource Limits section below. The default is unlimited.
 # @ job_cpu_limit = 20:00,15:30
  • job_name: Set the name of the job. The job name may be set only once; subsequent instances are ignored. The job_name will only appear in the long reports from the llq, llstatus, and llsummary commands and in email notifications about the job. There is no default value.
 # @ job_name = my_first_job
  • job_type: Tells LoadLeveler what type of job you want to run. Valid choices are serial and parallel. If you select serial, you may not specify any of the following keywords: node, tasks_per_node, total_tasks, network.LAPI, network.MPI, or other keyword related to parallel processing. The default value is serial.
 # @ job_type = parallel
  • max_processors: Specify the maximum number of nodes for a parallel job, regardless of the number of processors in the node. You may not specify both max_processors and node and we encourage users to use the node keyword instead of max_processors. There is no default value.
 # @ max_processors = 64
  • min_processors: Specify the minimum number of nodes for a parallel job, regardless of the number of processors in the node. You may not specify both min_processors and node and we encourage users to use the node keyword instead of min_processors. There is no default value.
 # @ min_processors = 32
  • network: For parallel jobs only, tell LoadLeveler how you want your tasks to communicate with one another. The value specified is a comma-delimited triplet of network_type, usage, and mode. The full keyword name includes a protocol specifier:
    • network.MPI: Message Passing Interface
    • network.LAPI: Low-Level Application Programming Interface

Users who are not developing their own communications protocols will probably want to use the MPI protocol. The network_type may be ethernet or sn_single (sn_all). In most instances you will probably want to use sn_single (sn_all) for communications, as it offers the highest bandwidth and no other network traffic uses the switch. The switch interface is reserved exclusively for parallel applications. The peak bidirectional bandwidths of the interfaces available on the Parallel Execution Nodes are:

    • Ethernet: 1,000
    • sn_single: 4,000

The usage qualifier describes whether the network adapter can be shared by other tasks. Possible values are shared and not_shared; the default is shared. The mode qualifier specifies the communication mode you wish to use. You may select either IP (the Internet Protocol) or US (User Space). Of these, US is a lower-level communication protocol that should yield higher performance for your applications. The network keyword cannot be set if you have already set an Adapter requirement or preference. Also, the value you set for the network keyword in a job step is not inherited by subsequent job steps within the same Job Command file. As an example, a job on the Parallel Execution Nodes in LONI P5s that uses MPI, communicates over the Federation switch, shares the adapter, and uses the User Space subsystem would specify the following (which is also the default):

# @ network.MPI = sn_single,shared,US
  • node: Tells LoadLeveler the minimum and maximum number of nodes you require for a given job step. The value is not inherited by subsequent job steps. The default minimum is 1 and the default maximum is the number of nodes that service the particular job class (queue). The syntax is:
 # @ node = [min][,max]

To use the node keyword together with the total_tasks keyword, the min and max values must be equal or you must specify only one value. To request at least 6 nodes and up to 10 nodes, if they are available, the Job Command file would contain:

 # @ node = 6,10

  • node_usage: Specify whether this job step shares nodes with other job steps. Possible values are shared and not_shared. shared means that the nodes can be shared with tasks of other job steps; not_shared means that no other job steps are scheduled to run on the node. Only the Parallel Execution nodes may have more than one task executing on a node, so this option only concerns jobs submitted to the P1 and PM job classes. The default value is shared.
 # @ node_usage = not_shared
  • notification: Tells LoadLeveler when you want the user specified by the notify_user keyword to be sent email. The following options are supported:
    • always: Notify when the job begins, ends and if it has an error
    • error: Notify only if job fails
    • start: Notify only when the job begins
    • never: deafening silence from LoadLeveler
    • complete: Notify when the job ends, default value
 # @ notification = always 
  • notify_user: The email address that you wish LoadLeveler to send email notifications to. The default value for this variable is your pelican_login@ianx where ianx is LoadLeveler's name for the 2 interactive nodes (ian1 - ian2). That is to say your account on whichever interactive node you are submitting the job from. Please be advised that mail sent to your account at casper.lsu.edu will not be deliverable. Please use your PAWS email account for notifications or some other email account. Default is the user name on the submitting machine.
 # @ notify_user = user@host
  • output: The file that will hold the standard output from your job. If no file is specified, std.out is used.
 # @ output = filename
  • preferences: Specifies the characteristics that you prefer be available on the machine that executes the job steps. LoadLeveler attempts to run the job steps on machines that meet your preferences. If such a machine is not available LoadLeveler will then assign your job to machines which meet only your requirements. There are no default preferences. String values must be quoted. Examples:
 # @ preferences = (Memory >= 64) && (Arch == "Power5")

or

 # @ preferences = Machine == "l1f1n14" 
  • queue: Instructs LoadLeveler to place one copy of the job step in the queue. This statement is required. It marks the end of a job step.
 # @ queue
  • requirements: Specifies your requirements for your job. LoadLeveler will only dispatch your job to machines that meet your requirements. The possible requirements (and preferences) to check are listed below:
    • Arch: Machine architecture, Power5.
    • Disk: The amount of disk space, in kilobytes, you think are required to run a job.
    • Machine: The name of the machine you must run on.
    • Memory: The amount of physical memory, in Megabytes, that your job requires.
    • OpSys: Operating System. All LONI P5 nodes are kept at the same version of the AIX operating system to avoid conflicts between submitting and executing machines.

The following are examples of specifying requirements. You could use any of these statements with the preferences keyword instead. To specify that your serial job requires a Power5 processor with at least 64 Megabytes of memory and at least 512 kilobytes of disk space you would include the following in your Job Command file.

 # @ requirements = (Disk >= 512) && (Memory >= 64) && (Arch == "Power5")

And to specify a set of machines on which to run a parallel job

 # @ requirements = (Machine == { "l1f1n02" "l1f1n03" "l1f1n04" "l1f1n05" })
  • restart: Tells LoadLeveler if you want your job to be restarted if it can not be completed on the machine it was originally dispatched to. If you specify restart = no your job will be canceled if it can not complete. A checkpointed job is always considered to be restart-able.
 # @ restart = yes | no #Default value: yes
  • rss_limit: Set the hard and/or soft limit for the resident set size. The syntax is rss_limit = hardlimit,softlimit.
 # @ rss_limit = 120,100 #Default value: unlimited
  • shell: The name of the shell to use to run a job step. The default value is your login shell.
 # @ shell = /bin/ksh #Default value: /bin/bash
  • stack_limit: Set the hard and/or soft limit for the size of the stack segment that your job may use. The syntax is stack_limit = hardlimit,softlimit.
 # @ stack_limit = 512mb #Default value: unlimited
  • startdate: Tells LoadLeveler when you want your job to run. The default value is the current date and time. The syntax is:
 # @ startdate = date time

where date is expressed as MM/DD/YY and time is expressed as HH:mm[:ss]. If you specify a start date in the future, your job is kept in the deferred state until the start time.

 # @ startdate = 04/05/06 22:00:00
  • step_name: Gives a name to a job step within your Job Command file. You may use any combination of alphanumeric characters and underscores `_' and periods `.' with the following exceptions. You may not name a job step T or F or use a number as the first character in the name. The default name for the first job step is the character `0' and subsequent job steps in the Job Command file are named sequentially.
 # @ step_name = my_job.step
  • tasks_per_node: For a parallel job, the number of tasks you want to run on a node. This keyword is used in conjunction with the node keyword and its value is not inherited by subsequent job steps. The default value is 1 task per node.
 # @ tasks_per_node = number #Default value: The default is one task per node.
  • total_tasks: For a parallel job, specifies the total number of tasks you want to run on all available nodes. The total_tasks keyword is used in conjunction with the node keyword and the node keyword must specify a single value (minimum and maximum numbers must be identical). This keyword is not inherited by subsequent job steps. You may not specify both total_tasks and tasks_per_node.
 # @ total_tasks = number #Default value: No default is set.
  • user_priority: Set the priority of your job step relative to other job steps, in the same job class, owned by you. This keyword does not affect your job's priority relative to jobs owned by other users. The user_priority may take on values from 0 to 100, with the higher value having the higher priority for being dispatched to run by LoadLeveler. The default value is 50.
 # @ user_priority = number #Default value: The default priority is 50
  • wall_clock_limit: Set the hard and/or soft limit for the elapsed time for which a job can run. LoadLeveler reckons the wall clock time for a job from the time the job was dispatched to a machine to execute. The default value for the wall_clock_limit is 30 minutes. The syntax is wall_clock_limit = hardlimit,softlimit. The valid units and special values for limits are described in the Resource Limits section below.
 # @ wall_clock_limit = 10:00:00 #Default value: 12:00:00  (12 hours) 
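
To show how several of these keywords fit together, here is a sketch of a parallel job. The job name, node and task counts, time limit, and the program name my_mpi_program are all hypothetical. The sketch assumes that the program to be launched by poe is passed through the arguments keyword.

```shell
# @ job_name         = mpi_example
# @ job_type         = parallel
# @ class            = workq
# Four nodes with eight tasks each, 32 tasks in total.
# @ node             = 4
# @ tasks_per_node   = 8
# MPI over the switch, adapter shared, User Space mode.
# @ network.MPI      = sn_single,shared,US
# @ wall_clock_limit = 02:00:00
# @ notification     = error
# @ notify_user      = user@host
# Parallel jobs run through poe; the program name is a placeholder.
# @ executable       = /usr/bin/poe
# @ arguments        = ./my_mpi_program
# @ output           = $(job_name).$(jobid).out
# @ error            = $(job_name).$(jobid).err
# @ queue
```

Because node is given as a single value, total_tasks = 32 could be used in place of tasks_per_node, but not together with it.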

Job Command File Variables

LoadLeveler creates several variables that you may use within your Job Command File in, for example, the name of your output file. The variable names are case insensitive but you must reference them with the following syntax:

$(variable_name)

In addition some keyword statements set variables that you may reference. The following is a listing of the LoadLeveler variables.

  • $(host): The hostname of the machine from which you submitted the job. Equivalent to the $(hostname) variable.
  • $(domain): The domain of the host from which you submitted the job.
  • $(jobid): A sequential number assigned to the job by the submitting machine. Equivalent to the $(cluster) variable.
  • $(stepid): The sequential number assigned to a job step when more than one queue statement appears in the Job Command file. The $(stepid) and $(process) variables are equivalent.
  • $(executable): Contains the name of the executable if you set the executable keyword.
  • $(base_executable): Contains the name of the executable without the directory path if you set the executable keyword.
  • $(class): Contains the name of the job class that your job has been submitted to if you set the class keyword.
  • $(comment): Contains the comment text if you set the comment keyword.
  • $(job_name): Contains the job name text if you set the job_name keyword.
  • $(step_name): Contains the step name text if you set the step_name keyword.
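
As a small sketch of how these variables might be used, the following statements build output and error file names from the job name, the job number, and the step number. The job and step names are taken from the keyword examples above; placing job_name and step_name before the statements that reference them is a cautious assumption, not a documented requirement.

```shell
# @ job_name  = my_first_job
# @ step_name = my_job.step
# One output and one error file per job and step.
# @ output    = $(job_name).$(jobid).$(stepid).out
# @ error     = $(job_name).$(jobid).$(stepid).err
# @ queue
```

Including $(jobid) in the file name keeps repeated submissions of the same Job Command file from overwriting each other's output.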

Resource Limits

Resource limits, with the exception of the wall_clock_limit, are unlimited for all users by default. The time limits for the different job classes are listed in this table. If your job exceeds a soft limit that you have imposed on it, a trappable signal is sent to your processes. If your job exceeds a hard limit, a non-trappable signal is sent to terminate your processes. The units for space-related limits may be any of the following; the default unit is bytes.

  • b: bytes
  • w: words (4 bytes)
  • kb: kilobytes (2^10 bytes)
  • kw: kilowords (2^10 words)
  • mb: megabytes (2^20 bytes)
  • mw: megawords (2^20 words)
  • gb: gigabytes (2^30 bytes)
  • gw: gigawords (2^30 words)
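
The arithmetic behind these units can be sketched with a small shell helper. The function name to_bytes is made up for this illustration and is not part of LoadLeveler. It handles only whole-number values with the lowercase unit suffixes listed above.

```shell
#!/bin/sh
# Convert a size limit such as "100mb" or "16mw" to bytes.
# Hypothetical helper; only whole-number values are handled.
to_bytes() (
    value=$1
    number=${value%%[a-zA-Z]*}   # digits before the unit
    unit=${value#"$number"}      # unit suffix; empty means bytes
    [ -n "$number" ] || { echo "missing number: $value" >&2; exit 1; }
    case $unit in
        ''|b) factor=1 ;;
        w)    factor=4 ;;
        kb)   factor=1024 ;;
        kw)   factor=$((1024 * 4)) ;;
        mb)   factor=$((1024 * 1024)) ;;
        mw)   factor=$((1024 * 1024 * 4)) ;;
        gb)   factor=$((1024 * 1024 * 1024)) ;;
        gw)   factor=$((1024 * 1024 * 1024 * 4)) ;;
        *)    echo "unknown unit: $unit" >&2; exit 1 ;;
    esac
    echo $((number * factor))
)

to_bytes 100mb   # the core_limit example: prints 104857600
to_bytes 16mw    # the hard data_limit example: prints 67108864
```

A word is four bytes, so 16mw corresponds to 64 megabytes.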

For the time related limits, cpu_limit, job_cpu_limit, and wall_clock_limit the hard limit and soft limit are expressed as:

 [[hours:]minutes:]seconds[.fraction]

The default format is therefore a number of seconds, but you may also specify time limits as minutes and seconds or as hours, minutes, and seconds. The fractional number of seconds is rounded.
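
To make the time format concrete, the following sketch converts a limit to whole seconds. The function name to_seconds is hypothetical. This sketch simply drops any fractional part and does not reproduce LoadLeveler's own rounding.

```shell
#!/bin/sh
# Convert a [[hours:]minutes:]seconds[.fraction] limit to whole seconds.
# Hypothetical helper; the fractional part is dropped, not rounded.
to_seconds() {
    printf '%s\n' "$1" | awk -F: '
        NF < 1 || NF > 3 { exit 1 }     # reject empty or over-long specs
        {
            total = 0
            for (i = 1; i <= NF; i++)   # each field is worth 60 of the next
                total = total * 60 + int($i)
            print total
        }'
}

to_seconds 01:20:00   # hard cpu_limit from the keyword example: prints 4800
to_seconds 15:30      # soft job_cpu_limit from the keyword example: prints 930
```

A value with a single field, such as 90, is read as seconds. Two fields are read as minutes and seconds.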

You may also specify any of the three following strings with all the limit keywords. rlim_infinity and unlimited both set the limit to be the largest representable positive integer. copy copies the resource limit in place for your user id (as reported by ulimit -a for Korn shell and limit for C shell) at the time the job is submitted.
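
The following fragment sketches how these forms might appear together in a Job Command file. The values are illustrative only. The sketch assumes that a single value sets the hard limit.

```shell
# Hard limit of 100 megabytes and soft limit of 80 megabytes for core files.
# @ core_limit  = 100mb,80mb
# A single value, assumed here to set the hard limit only.
# @ stack_limit = 512mb
# No limit on the data segment.
# @ data_limit  = unlimited
# Use the file size limit in effect for your user id at submission time.
# @ file_limit  = copy
# Hard limit of 1 hour 20 minutes and soft limit of 1 hour of CPU time.
# @ cpu_limit   = 01:20:00,01:00:00
```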

Additional Resources


Users may direct questions to sys-help@loni.org.
