Deprecated: please see new documentation site.

LoadLeveler on all LONI p5 machines employs a backfill scheduler. The objective of backfill scheduling is to maximize the use of resources to achieve the highest system efficiency, while preventing potentially excessive delays in starting jobs with large resource requirements.

The benefit of backfill scheduling is twofold. On the one hand, large jobs are able to run, because the backfill scheduler does not allow jobs with smaller resource requirements to continuously consume resources before the larger jobs can accumulate enough to start. On the other hand, smaller jobs can be backfilled onto the resources that larger jobs are accumulating, provided they can finish before the projected start time of the larger jobs.
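
The backfill condition described above can be sketched as a small shell function. This is only an illustration of the rule, not LoadLeveler's actual scheduling algorithm: the function name can_backfill and its arguments are hypothetical, and times are given in whole hours for simplicity.

```shell
#!/bin/sh
# Hypothetical sketch of the backfill rule; not LoadLeveler's real logic.
# Usage: can_backfill IDLE_NODES NODES_NEEDED LIMIT_HOURS HOURS_TO_LARGE_JOB
# Succeeds (exit status 0) when the small job fits on the idle nodes and its
# wall clock limit ends before the larger job's projected start time.
can_backfill() {
    idle_nodes=$1
    nodes_needed=$2
    limit_hours=$3
    hours_to_large_job=$4

    [ "$nodes_needed" -le "$idle_nodes" ] && [ "$limit_hours" -lt "$hours_to_large_job" ]
}

# Illustrative numbers: 5 idle nodes, large job projected to start in 10 hours.
if can_backfill 5 2 3 10; then
    echo "3-hour limit: can be backfilled"
fi
if ! can_backfill 5 2 120 10; then
    echo "120-hour limit: must wait"
fi
```

The only job-specific input the scheduler can use for the time comparison is the wall clock limit, which is why a tight but realistic limit matters.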

For this reason, you can shorten your job's wait time by specifying wall_clock_limit, the maximum time your job will run, whenever possible. The shorter the limit, the better the chance that your job will be backfilled. Below is an example:

Suppose another user's job that requests 8 nodes is waiting in the queue and is projected to start in 10 hours (see Useful LoadLeveler Commands for information on how to check the estimated start time of a job). Now suppose that 5 of these 8 nodes are idle when you submit your own job, which will run for 2 hours on 2 nodes. If you set wall_clock_limit to a value less than 10 hours, LoadLeveler will recognize the opportunity to backfill and your job will start immediately. However, if you do not specify wall_clock_limit, the system will assume the default value, which is currently 5 days for most queues, and your job will have to wait in the queue.
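
For the scenario above, a job command file might look like the following sketch. The keywords are standard LoadLeveler job command file keywords, but the job name, output file names and payload are placeholders, and any queue (class) settings required on a particular LONI machine are not shown. The limit is set slightly above the expected 2-hour run time as a safety margin, since LoadLeveler normally terminates a job that exceeds its wall clock limit.

```shell
#!/bin/sh
# Sketch of a LoadLeveler job command file; names below are placeholders.
# @ job_name = backfill_demo
# @ job_type = parallel
# Request 2 nodes, as in the example above.
# @ node = 2
# Expected run time is 2 hours; the limit (hh:mm:ss) adds a small margin and
# is still well below the 10 hours until the larger job is projected to start.
# @ wall_clock_limit = 02:30:00
# @ output = $(job_name).$(jobid).out
# @ error = $(job_name).$(jobid).err
# @ queue

# Placeholder payload: replace with the command that launches your application.
echo "Job running on $(hostname)"
```

Such a file would be submitted with llsubmit, for example `llsubmit backfill_demo.cmd`, where the file name is again only a placeholder.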

Users may direct questions to
