Information Technology
Why Are My Jobs Not Running?
PBS Scheduling on RTC

30-Sep-2007


Introduction

RTC uses PBSPro to launch jobs and Maui to schedule them.  The scheduler is configured with a "fairshare" policy rather than a FIFO (First-In First-Out) policy.  A FIFO policy could let one or a few users dominate the job queue indefinitely.  Under a fairshare policy, an individual user's job priority drops as their job count increases.  As a result, older jobs can remain in the queue waiting to run while newer jobs run immediately.  The remainder of this document describes this effect and how you can determine whether it is happening to your jobs.


Factors that Influence Job Priority

There are several factors that determine a job's priority in the queue:
  • Resource reservations - queue resources have been reserved in advance.  This is usually for time-specific projects or demonstrations.
  • System Load - Users are restricted to 30 running jobs at any given time when the system is not busy, and 10 running jobs at any given time when the system is heavily loaded.
  • Resources Requested - Maui assigns higher priorities to jobs that request large numbers of processors as compared to small numbers of processors.  The more processors requested, the higher the priority will be.
  • Fairshare - the primary factor for most jobs.  The more jobs a user runs, the lower their priority will be on future jobs.  This policy is based on up to 14 days of historical usage.  As a result, newly submitted jobs from users with little recent usage can run ahead of jobs that are already in the queue waiting to run.


Why is My Job Stuck in the Queue?

Several situations can cause your job to appear stuck in the queue, unable to run:

Incorrect Resource Request

  • One reason a job might be stuck in the queue is a request for an invalid set of resources.  The most common error is to request too many processors per node (ppn).  Each node has two processors (except for the four quad-processor nodes).  A request for more than two processors per node cannot be satisfied, so PBS will never be able to schedule the job even though it is accepted into the queue.  The following is an invalid request:
    • nodes=2:ppn=3
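
A request can be checked before submission.  The following is a minimal sketch, not part of PBS or Maui: the function name check_request is hypothetical, it assumes the two-processor limit described above (ignoring the four quad-processor nodes), and it assumes that an omitted ppn defaults to 1.

```python
import re

# Assumption: ordinary nodes have 2 processors each; the four
# quad-processor nodes are ignored in this simple check.
MAX_PPN = 2


def check_request(spec, max_ppn=MAX_PPN):
    """Return (ok, message) for a resource string such as 'nodes=2:ppn=3'."""
    match = re.fullmatch(r"nodes=(\d+)(?::ppn=(\d+))?", spec.strip())
    if match is None:
        return False, "unrecognized request format"
    nodes = int(match.group(1))
    # Assumption: ppn defaults to 1 when it is not given.
    ppn = int(match.group(2)) if match.group(2) else 1
    if nodes < 1 or ppn < 1:
        return False, "nodes and ppn must be at least 1"
    if ppn > max_ppn:
        return False, f"ppn={ppn} exceeds {max_ppn} processors per node"
    return True, f"{nodes * ppn} processors requested"


if __name__ == "__main__":
    print(check_request("nodes=2:ppn=3"))  # the invalid request shown above
    print(check_request("nodes=2:ppn=2"))  # a request that can be satisfied
```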

Queue Limits

  • In order to guarantee that everyone gets an opportunity to run jobs, we currently have a limit of 30 active (running) jobs per user when the system is relatively idle, and 10 active jobs per user when the system is relatively busy.  When jobs are submitted beyond these limits, they will remain in the queue waiting for your active jobs to finish even if there are resources available to run your jobs.
  • There are restrictions on the number of processors that can be assigned to a queue at any given time.  If you submit a job and receive an error such as "job cannot run in partition DEFAULT. (job job.ID violates active SOFT MAXPROC limit of 176 for class medium (R: 1, U: 188))", the queue named in the message (here, medium) is full.  No more processors can be assigned to this queue even if idle processors are available.  This prevents any one queue from dominating the entire system for long periods of time.  Your job will wait until enough processors in the medium queue become available.
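
If you want to pull the queue name and the limit out of such a message, for example when scanning saved scheduler output, a regular expression is enough.  This is a minimal sketch based only on the single message quoted above; the function name parse_limit_message is hypothetical, and the meaning of the R and U fields (assumed here to be the processors requested and the processors in use) is not confirmed by this document.

```python
import re

MESSAGE = (
    "job cannot run in partition DEFAULT. (job job.ID violates active "
    "SOFT MAXPROC limit of 176 for class medium (R: 1, U: 188))"
)

# Pattern derived from the one example message; other message
# variants may need a different expression.
PATTERN = re.compile(
    r"violates active (?P<kind>\w+) (?P<policy>\w+) limit of (?P<limit>\d+) "
    r"for class (?P<queue>\w+) \(R: (?P<r>\d+), U: (?P<u>\d+)\)"
)


def parse_limit_message(text):
    """Return the fields of a queue-limit message as a dict, or None."""
    match = PATTERN.search(text)
    if match is None:
        return None
    return {
        "kind": match.group("kind"),      # e.g. SOFT
        "policy": match.group("policy"),  # e.g. MAXPROC
        "limit": int(match.group("limit")),
        "queue": match.group("queue"),
        "r": int(match.group("r")),       # assumed: processors requested
        "u": int(match.group("u")),       # assumed: processors in use
    }


if __name__ == "__main__":
    print(parse_limit_message(MESSAGE))
```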

Fairshare Policy

  • You submit a job, but jobs submitted after yours run first.  The most likely reason is that your priority is lower than that of other users because of your usage over the 14-day history that our fairshare policy considers.  To find out when a job is predicted to run, use the showstart command followed by the job ID.
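
The effect of usage history on ordering can be illustrated with a toy model.  This is only a sketch and not Maui's actual fairshare calculation: the decay weight, the use of processor-hours as the usage measure, and the function names are all assumptions made for illustration.  Only the 14-day window comes from the policy described above.

```python
WINDOW_DAYS = 14  # from the policy above: up to 14 days of history
DECAY = 0.8       # hypothetical weight; the real setting is not stated here


def weighted_usage(daily_cpu_hours):
    """Sum recent usage, counting older days less.

    daily_cpu_hours[0] is today, [1] is yesterday, and so on.
    Anything older than WINDOW_DAYS is ignored.
    """
    recent = daily_cpu_hours[:WINDOW_DAYS]
    return sum(hours * DECAY ** age for age, hours in enumerate(recent))


def rank_users(history):
    """Order user names from lightest to heaviest weighted usage.

    In this toy model the lightest recent user is served first.
    """
    return sorted(history, key=lambda user: weighted_usage(history[user]))


if __name__ == "__main__":
    history = {
        "alice": [100, 100, 100],  # heavy use over the last three days
        "bob": [0, 0, 10],         # light use two days ago
    }
    print(rank_users(history))
```

In this model a heavy user's standing recovers over time as old usage ages out of the window, which matches the behavior described above: waiting, rather than resubmitting, is what restores priority.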

Backfill Policy

  • This is a scheduling optimization which allows Maui to make better use of available resources by running jobs out of order. Using job data such as walltime and resources requested, the scheduler can start other, lower-priority jobs so long as they do not delay the highest priority jobs.  Because of the way it works, essentially filling in holes in node space, backfill tends to favor smaller and shorter running jobs more than larger and longer running ones.
NOTE:  It is important to specify an accurate walltime for your job in your PBS submission script.  Accepting the default of 4 hours for a job that is known to run for less time may delay the job, because the scheduler overestimates the time the job needs and may not find a gap long enough to backfill it.
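
The reason an accurate walltime matters can be seen in a simplified version of the backfill test.  This is a minimal sketch under stated assumptions, not Maui's algorithm: the function name can_backfill and its parameters are hypothetical, and it assumes a single reserved start time for the highest priority job.

```python
def can_backfill(job_procs, job_walltime_hours, idle_procs, hours_until_reserved_start):
    """Return True if a lower-priority job may start now.

    Simplified rule: the job must fit in the idle processors, and its
    requested walltime must end before the highest priority job is due
    to start, so that job is not delayed.
    """
    fits_in_space = job_procs <= idle_procs
    fits_in_time = job_walltime_hours <= hours_until_reserved_start
    return fits_in_space and fits_in_time


if __name__ == "__main__":
    # Two idle processors are free for the next 2 hours.
    # A job that really needs 1 hour but keeps the 4-hour default is refused.
    print(can_backfill(1, 4, idle_procs=2, hours_until_reserved_start=2))
    # The same job with an accurate 1-hour walltime can start now.
    print(can_backfill(1, 1, idle_procs=2, hours_until_reserved_start=2))
```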

Node Fragmentation/Resources Requested

  • You submit a multiprocessor parallel job that should have a high priority, but the job remains queued.  Taking an 8-node, 16-processor job as an example, the scheduler will not start your job until all eight nodes are completely empty at the same time.  On a busy system with hundreds of jobs submitted at random times, it is unlikely that running jobs will finish together so that 8 nodes with 16 processors are free at once.  Essentially, node usage has become fragmented.  Your job will remain queued, and other jobs in the queue will backfill onto individual idle processors as they become available before your job is "projected" to start.  However, this situation is unlikely to occur on RTC because the Maui scheduler assigns higher priority to jobs requesting large numbers of CPUs.  You may only have to wait until existing running jobs exit before your multiprocessor job runs, because it is automatically given a high priority.
  • You submit a single processor job and there are idle processors available, but your job remains queued.  This likely means that the system is currently waiting for enough idle processors to become available so that higher priority multiprocessor jobs can run.  In this case, the system will allow running jobs to finish but will not start any new single processor jobs until the multiprocessor jobs have enough resources to run.  
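
Fragmentation is easy to see with a small count.  The sketch below is illustrative only: the function names and the list of free processors per node are hypothetical, and two processors per node are assumed as described earlier.

```python
def idle_processors(free_per_node):
    """Total idle processors across all nodes."""
    return sum(free_per_node)


def can_start(nodes_needed, ppn, free_per_node):
    """Return True if enough nodes each have at least ppn free processors."""
    usable_nodes = sum(1 for free in free_per_node if free >= ppn)
    return usable_nodes >= nodes_needed


if __name__ == "__main__":
    # Sixteen nodes, each with one of its two processors still busy.
    fragmented = [1] * 16
    print(idle_processors(fragmented))  # 16 idle processors in total
    print(can_start(8, 2, fragmented))  # but no node is completely empty

    # Eight empty nodes and eight full ones: same idle total.
    packed = [2] * 8 + [0] * 8
    print(idle_processors(packed))
    print(can_start(8, 2, packed))
```

Both layouts have the same number of idle processors, but only the second can run an 8-node, 16-processor job.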

 

Division of Information Technology
MS-119, P.O. Box 1892, Rice University, Houston, Texas 77251-1892
713-348-HELP(4357)