Why Are My Jobs Not Running?
PBS Scheduling on RTC
30-Sep-2007
Introduction
RTC uses PBSPro to launch jobs and Maui to schedule them. The Maui
scheduler is configured with a "fairshare" policy rather than a FIFO
(First-In First-Out) policy. Under FIFO, one or a few users could
dominate the job queue indefinitely. Under fairshare, an individual
user's job priority drops as that user's job count increases. As a
result, older jobs may remain in the queue waiting to run while newer
jobs run immediately. The remainder of this document describes this
effect and how you can determine whether it is happening to your jobs.
Factors that Influence Job Priority
There are several factors that determine a job's priority in the
queue:
- Resource reservations - queue resources have been reserved in advance. This is usually
for time-specific projects or demonstrations.
- System Load - Users
are restricted to 30 running jobs at any given time when the system is
not busy, and 10 running jobs at any given time when the system is
heavily loaded.
- Resources Requested - Maui assigns higher priorities to jobs that request large numbers of
processors as compared to small numbers of processors. The more
processors requested, the higher the priority will be.
- Fairshare - the
primary factor for most jobs. The more jobs a user runs, the
lower that user's priority will be on future jobs. This policy is based
on up to 14 days of historical usage. As a result, new jobs from
lighter users can run ahead of jobs that are already in the queue
waiting to run.
Why is My Job Stuck in the Queue?
Several situations can cause your job to appear stuck in the queue,
unable to run:
Incorrect Resource Request
- One reason a job might be stuck in the queue is a request for an
invalid set of resources. The most common error is to request too many
processors per node (ppn). Each node has two processors (except for
the four quad-processor nodes). A request for more than two processors
per node cannot be satisfied, so PBS will never be able to schedule the
job even though it is accepted into the queue. On the standard nodes,
any ppn value greater than two is therefore an invalid request.
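As a rough illustration of the check described above, the following Python sketch tests a node request before submission. It assumes the request is written in the common `nodes=N:ppn=M` form and that a missing ppn means one processor per node; the helper names `parse_node_request` and `fits_standard_nodes` are hypothetical and are not part of PBS or Maui.

```python
import re

# Taken from the text: a standard node has two processors.
MAX_PPN_STANDARD = 2


def parse_node_request(spec):
    """Parse a 'nodes=N:ppn=M' string into a (nodes, ppn) tuple.

    Assumption: when ':ppn=M' is omitted, one processor per node is meant.
    """
    match = re.fullmatch(r"nodes=(\d+)(?::ppn=(\d+))?", spec.strip())
    if match is None:
        raise ValueError(f"unrecognized resource request: {spec!r}")
    nodes = int(match.group(1))
    ppn = int(match.group(2)) if match.group(2) else 1
    return nodes, ppn


def fits_standard_nodes(spec, max_ppn=MAX_PPN_STANDARD):
    """Return True if the per-node processor count fits a standard node."""
    _, ppn = parse_node_request(spec)
    return ppn <= max_ppn


if __name__ == "__main__":
    for request in ("nodes=4:ppn=2", "nodes=4:ppn=4"):
        verdict = "ok" if fits_standard_nodes(request) else "will never be scheduled"
        print(request, "->", verdict)
```

The sketch deliberately ignores the four quad-processor nodes, so it is stricter than the real system may be for small requests.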
Queue Limits
- In order to guarantee that everyone gets an opportunity to run jobs,
we currently limit each user to 30 active (running) jobs when the
system is relatively idle, and to 10 active jobs when the system is
relatively busy. Jobs submitted beyond these limits remain in the queue
until your active jobs finish, even if there are resources available
to run them.
- There are restrictions on the number of processors that can be assigned to a queue at any given time. If you submit a job and receive an error such as "job cannot run in partition DEFAULT. (job job.ID violates active SOFT MAXPROC limit of 176 for class medium (R: 1, U: 188))", the queue named in the message is full. No more processors can be assigned to that queue even if idle processors are available. This prevents any one queue from dominating the entire system for long periods of time. In this example, your job will wait until enough processors in the medium queue become available.
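If you want to pick the numbers out of such a message in a script, a small parser along the following lines may help. This is only a sketch built around the single example message quoted above. It assumes that "R" is the number of processors the job requested and "U" the number already in use by the queue, and that a HARD variant of the message has the same shape; neither point is confirmed by this document. The function name `parse_maxproc_message` is hypothetical.

```python
import re

# Pattern modeled on the one example message quoted in the text.
LIMIT_PATTERN = re.compile(
    r"violates active (?P<kind>SOFT|HARD) MAXPROC limit of (?P<limit>\d+) "
    r"for class (?P<queue>\S+)\s+\(R: (?P<requested>\d+), U: (?P<used>\d+)\)"
)


def parse_maxproc_message(message):
    """Return the limit details as a dict, or None if the text does not match."""
    match = LIMIT_PATTERN.search(message)
    if match is None:
        return None
    return {
        "kind": match.group("kind"),
        "queue": match.group("queue"),
        "limit": int(match.group("limit")),
        # Assumed meaning: processors requested by this job.
        "requested": int(match.group("requested")),
        # Assumed meaning: processors already in use by the queue.
        "used": int(match.group("used")),
    }


if __name__ == "__main__":
    example = (
        "job cannot run in partition DEFAULT. (job job.ID violates active "
        "SOFT MAXPROC limit of 176 for class medium (R: 1, U: 188))"
    )
    print(parse_maxproc_message(example))
```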
Fairshare Policy
- You submit a job, but jobs submitted after yours run
first. The most likely reason is that your priority is lower than
that of other users because of your usage over the past 14 days, which
the fairshare policy takes into account. To find out when a job is
predicted to run, use the showstart jobID command.
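To see why a light user's new job can overtake a heavy user's older job, consider the following toy model of usage-based ranking. It is not Maui's actual fairshare formula, and RTC's real fairshare settings are not given in this document. The only detail taken from the text is the 14-day history; the daily windows, the decay factor of 0.8, and the function names are all assumptions made for illustration.

```python
HISTORY_DAYS = 14  # from the text: up to 14 days of historical usage


def decayed_usage(daily_usage, decay=0.8):
    """Sum recent usage, weighting older days less.

    daily_usage[0] is today, daily_usage[1] is yesterday, and so on.
    The decay factor is an illustrative assumption, not a site setting.
    """
    window = daily_usage[:HISTORY_DAYS]
    return sum(hours * decay ** age for age, hours in enumerate(window))


def fairshare_rank(usage_by_user, decay=0.8):
    """Return user names ordered from highest to lowest priority.

    In this toy model, lower decayed usage means higher priority.
    """
    return sorted(usage_by_user, key=lambda user: decayed_usage(usage_by_user[user], decay))


if __name__ == "__main__":
    usage = {
        "heavy_user": [50] * 14,  # 50 processor-hours every day
        "light_user": [1] * 14,   # 1 processor-hour every day
    }
    print(fairshare_rank(usage))
```

In this model the light user always ranks first, no matter whose job entered the queue earlier.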
Backfill Policy
- Backfill
is a scheduling optimization that allows Maui to make better use of
available resources by running jobs out of order. Using job data such
as walltime and resources requested, the scheduler can start other,
lower-priority jobs so long as they do not delay the highest-priority
jobs. Because it works by filling holes in node space,
backfill tends to favor smaller, shorter-running jobs over
larger, longer-running ones.
NOTE: It is important to specify an accurate walltime for your job in your
PBS submission script.
Accepting the default of 4 hours for a job that is known to need
less time may cause the scheduler to delay the job, because it
overestimates how long the job will run.
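The effect of the walltime estimate on backfill can be shown with a very small model. The sketch below is not Maui's algorithm. It assumes that every idle processor is being held for the highest-priority job, which is due to start a known number of hours from now, so a waiting job may only start if it fits in the idle processors and its requested walltime ends before that start time. The function and parameter names are hypothetical.

```python
def can_backfill(job_procs, job_walltime_hours, idle_procs, hours_until_reservation):
    """Return True if a waiting job may start now without delaying the top job.

    Simplifying assumption: all idle processors are earmarked for the
    highest-priority job, which starts in hours_until_reservation hours.
    """
    fits_in_space = job_procs <= idle_procs
    finishes_in_time = job_walltime_hours <= hours_until_reservation
    return fits_in_space and finishes_in_time


if __name__ == "__main__":
    # A one-processor job that really needs about an hour, with six idle
    # processors and a large job due to start in two hours.
    print("default 4h walltime:", can_backfill(1, 4.0, 6, 2.0))
    print("accurate 1h walltime:", can_backfill(1, 1.0, 6, 2.0))
```

With the 4-hour default the job is held back, while the same job with an accurate 1-hour walltime slips into the gap.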
Node Fragmentation/Resources Requested
- You submit a multiprocessor parallel job that should have a
high priority, but the job remains queued. Take an 8-node,
16-processor job as an example: the scheduler will not start the job
until all eight nodes are completely empty at the same time. On a
busy system with hundreds of jobs submitted at random times, it is
unlikely that running jobs will finish together so that 8 nodes
with 16 processors are free at any given moment. Essentially, the node
usage has become fragmented. Your job remains queued, and other jobs
in the queue backfill onto individual idle processors as they become
available before your job is "projected" to start. However, this
situation is unlikely to occur on RTC because the Maui scheduler
assigns higher priority to jobs requesting large numbers of CPUs. You
may only have to wait until existing running jobs exit before your
multiprocessor job runs, because it automatically receives a high
priority.
- You submit a single processor job and there are idle
processors available, but your job remains queued. This likely
means that the system is currently waiting for enough idle processors
to become available so that higher priority multiprocessor jobs can
run. In this case, the system will allow running jobs to finish but
will not start any new single processor jobs until the multiprocessor
jobs have enough resources to run.
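Fragmentation is easier to see with numbers. The following sketch counts how many nodes have enough free processors for a job that needs whole nodes. It is a simplified picture, not the scheduler's logic, and the function names are hypothetical. It shows that 16 idle processors spread one per node cannot start an 8-node, 16-processor job.

```python
def nodes_available(free_procs_per_node, ppn):
    """Count the nodes that have at least ppn free processors."""
    return sum(1 for free in free_procs_per_node if free >= ppn)


def can_start(free_procs_per_node, nodes, ppn):
    """Return True if enough nodes have ppn free processors each."""
    return nodes_available(free_procs_per_node, ppn) >= nodes


if __name__ == "__main__":
    # Sixteen two-processor nodes, each with one processor still busy.
    fragmented = [1] * 16
    # Eight completely empty nodes and eight completely full ones.
    compact = [2] * 8 + [0] * 8

    for label, layout in (("fragmented", fragmented), ("compact", compact)):
        print(label, "idle processors:", sum(layout),
              "can start 8-node job:", can_start(layout, nodes=8, ppn=2))
```

Both layouts have 16 idle processors, but only the compact one can start the parallel job.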