Why Are My Jobs Not Running?
PBS Scheduling on Ada and STIC
06-Oct-2009

Introduction
Ada and STIC use Torque (PBS) to launch jobs and Moab to schedule jobs. The scheduler is configured with a "fairshare" policy rather than a FIFO (First-In First-Out) policy. A FIFO policy could result in one or a few users dominating the job queue indefinitely. A fairshare policy means that the job priority of an individual user will drop as their job count increases. As a result, old jobs may remain in the queue waiting to run while newer jobs run immediately. The remainder of this document describes this effect and how you can determine whether it is happening to your jobs.

Factors that Influence Job Priority
There are three primary factors that determine a job's priority in the queue:

Why is My Job Stuck in the Queue?
Several situations can cause your job to appear stuck in the queue, unable to run:
Incorrect Resource Request
Fairshare Policy
Backfill Policy
NOTE: It is important to specify an accurate walltime for your job in your PBS submission script. Accepting the default walltime for a job that is known to finish sooner may cause the scheduler to delay the job, because the scheduler overestimates the time the job needs to run.
Backfill Chunking
Node Fragmentation/Resources Requested
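
The walltime note under Backfill Policy can be illustrated with a small sketch. It assumes a simplified model of backfill in which a waiting job may start early only if its requested walltime and node count fit into the idle window before the next reserved job starts. The function name, the node counts, and all durations (including the 8-hour "default") are hypothetical and are not taken from the Ada or STIC configuration.

```python
from datetime import timedelta

def can_backfill(requested_walltime, nodes_needed, free_nodes, time_until_reservation):
    """Return True if a job fits in the idle window before a reserved job starts.

    Simplified model: the scheduler only knows the requested walltime,
    not how long the job will actually run.
    """
    return nodes_needed <= free_nodes and requested_walltime <= time_until_reservation

# Hypothetical idle window: 4 nodes are free for 2 hours before a reserved job starts.
gap = timedelta(hours=2)

accurate = timedelta(hours=1, minutes=30)  # job really needs about 90 minutes
default = timedelta(hours=8)               # hypothetical default walltime

print(can_backfill(accurate, nodes_needed=2, free_nodes=4, time_until_reservation=gap))  # True
print(can_backfill(default, nodes_needed=2, free_nodes=4, time_until_reservation=gap))   # False
```

Under this model the same 90-minute job is eligible to start early only when its walltime request is accurate; with the larger default request it has to wait for a longer idle window.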

When Will My Job Run?
To determine when your job is projected to start, use the showstart <jobID> command, where <jobID> is the ID number of your job. This command reports the projected start time of your job based on your priority and position in the queue relative to all other jobs. The projected start time is only a point-in-time estimate and will change as jobs enter and leave the queue.
In some cases showstart will indicate that your job should start in 00:00:00, which means it should start immediately, yet the job remains waiting in the queue. In this case, your job is most likely being blocked by Backfill Chunking as described above. Unfortunately the current version of our scheduler will not report this data to you. This will be addressed in a future release.

How Do I See My Job Priority?
There are two ways to determine your job priority. One way is to use the showq -i command. This command lists all jobs waiting to run, in order of priority with the highest priority first.
In the sample output above, jobs 50451 and 50452 have asterisks by their jobID numbers. The asterisks indicate that the scheduler is currently reserving node space for these jobs and will continue to do so until they run, so these are guaranteed to be the next jobs to run. The only exception is a higher priority job submitted before these jobs run, in which case the higher priority job will run first. If the jobs with asterisks are not the top priority jobs, then the scheduler is trying to backfill lower priority jobs.
The second way to determine job priority is to use the diagnose -p command. This command lists all jobs on the system in order of priority with the highest priority first. It also shows how the priority is calculated. Here is an example of the output:
Each column in the output for each job represents a component of the priority value. The components on each line total 100%. The most important columns above are FS (Fairshare), Serv (Service), and Res (Resources Requested), although values might appear in all columns.
For example, job 507922 above has 44% of its priority determined by the Fairshare policy described earlier in this document. The lower the historical usage of the owner of the job, the higher the Fairshare value will be. Another 11.7% is determined by its Service time: its wait time in the queue (2989 minutes) and the number of times it has been bypassed by other jobs (96 times). The longer this job waits for execution, the higher this percentage will become. The remaining 44.3% is determined by the Resources (number of CPUs) that it has requested (200, as listed in parentheses).
The scheduler does not treat each component of the priority equally, however. The current queue policy favors larger jobs, so the Resource component is weighted more heavily than the other components. The system then calculates the job priority by taking the values of all of the priority components into consideration. These priority values are dynamic and will change over time as jobs enter and exit the queue.
To see this, note that job 515236 above has a 40.2% FS component and a 49% Res component but is behind several jobs that have lower FS and Res components (jobs 497961 through 497968). Those jobs have a higher Serv component (16.3%) because they have been waiting longer. The system is therefore favoring those jobs slightly due to their wait time in the queue, even though they have requested only 60 CPUs while job 515236 has requested 200. In this case the size of the job was not enough to grant it a higher priority, because it had not been waiting in the queue very long relative to the other jobs.
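
The relationship between weighted components and the percentages in the diagnose -p output can be sketched as follows. This is an illustration only: it assumes the priority is a weighted sum of its components and that each percentage is that component's share of the total. The function name, raw values, and weights are hypothetical and do not reflect the actual queue configuration.

```python
def component_shares(raw_values, weights):
    """Return each component's share of the total priority, in percent."""
    weighted = {name: raw_values[name] * weights[name] for name in raw_values}
    total = sum(weighted.values())
    return {name: round(100.0 * value / total, 1) for name, value in weighted.items()}

# Hypothetical raw component values and weights, for illustration only.
raw_values = {"FS": 20.0, "Serv": 30.0, "Res": 10.0}
weights = {"FS": 1.0, "Serv": 1.0, "Res": 5.0}  # Res weighted most heavily

shares = component_shares(raw_values, weights)
print(shares)  # {'FS': 20.0, 'Serv': 30.0, 'Res': 50.0}
```

Here Res has the smallest raw value but the largest share of the priority because of its weight, which mirrors how a policy that favors larger jobs makes the Res column dominate.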
Also note that a job with a very low, perhaps even zero, FS value indicates that its owner has had very high usage over the last 7 days, and this lowers the user's priority. The higher the utilization for a user, the lower the FS value will be in this output. The reverse is also true: a high FS value means that the user has had low usage over the last 7 days.

How is my Fairshare Score Determined?
The Fairshare score (FS) is determined by the historical utilization of a user over a 7 day window. To see the Fairshare score, run the diagnose -f command:
The above sample output (truncated) shows the Fairshare score for two users. The first user, user1, has a score of 3.14. In this context this means that this user's utilization has been low (about 3%) over the last 7 days. The utilization columns for the last 7 days (columns 0 through 6) show low utilization on each day. In comparison, user2 has a score of 14.52, which reflects high utilization (about 14%), as shown on days 0 through 3. The user had no utilization on days 4 through 6.
NOTE: The Fairshare value displayed here has an inverse relationship to the number shown with diagnose -p. The value shown here is a utilization percentage, while diagnose -p shows a priority score. The lower the utilization percentage, the higher the score.
NOTE: It is possible for the Fairshare value shown here to continue to rise even as your utilization drops. In the case of user2 above, the utilization value for days 4 through 6 is zero. Over time these days will disappear from the calculation. Replacing days of no utilization with days of low utilization in the average will make the value go up even though your current utilization is lower than before.

Why Are My Jobs Scheduled Out of Order?
It is possible for you to submit many jobs and have the newer jobs scheduled to run ahead of your older jobs that are already in the queue. This is based on the starting priority assigned to each job when it is submitted. Your starting priority is based on all of the scheduling factors listed above in this document. These scheduling factors can change minute by minute depending on cluster activity, especially due to the Fairshare policy. Your Fairshare priority changes over time as your historical usage increases or decreases. If your Fairshare priority is increasing while you are submitting jobs, then jobs submitted last are likely to have higher priority than jobs submitted first. To see the starting priority of a job, run the checkjob <jobID> command, where <jobID> is the job you are interested in.
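
The way a 7 day window makes the Fairshare utilization value drift from day to day can be sketched with a short example. It assumes a plain unweighted average over the window (the scheduler may weight days differently), and the daily percentages are hypothetical values chosen to resemble user2: heavy use on the four most recent days and none on the three oldest.

```python
from collections import deque

def window_average(window):
    """Unweighted mean of the daily utilization percentages in the window."""
    return sum(window) / len(window)

# Index 0 is the most recent day, index 6 the oldest (hypothetical values).
window = deque([25.0, 25.0, 25.0, 25.0, 0.0, 0.0, 0.0], maxlen=7)

averages = [window_average(window)]
for new_day in [5.0, 5.0, 5.0]:   # three new days of low utilization
    window.appendleft(new_day)    # the oldest day drops off the other end
    averages.append(window_average(window))

print([round(a, 2) for a in averages])  # [14.29, 15.0, 15.71, 16.43]
```

Daily usage falls from 25% to 5%, yet the windowed average rises for three days because each new low day replaces a day with no usage at all. Once the zero days are gone, further low days begin to replace the heavy days and the average falls.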
If the order of execution for your jobs is important, please see our FAQ.

Getting Help
If you need help understanding why your job has a particular position in the queue, please submit a request to the Help Desk. It is very important that you include the output of showq -i and diagnose -p. Because job priorities change dynamically, they are likely to have changed significantly by the time we see your request. The output of these two commands gives us a snapshot of the job priorities at the time you requested help.
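
One way to capture both outputs at the same moment is a small script such as the sketch below. It only runs the two commands named above and prefixes their output with a timestamp. The function name is hypothetical, and the script assumes showq and diagnose are on your PATH on the cluster login node; if a command is missing it records that instead of failing.

```python
import subprocess
from datetime import datetime

def capture(commands):
    """Run each command and return one text report with a timestamp header."""
    lines = [f"Snapshot taken {datetime.now().isoformat(timespec='seconds')}"]
    for cmd in commands:
        lines.append(f"$ {' '.join(cmd)}")
        try:
            result = subprocess.run(cmd, capture_output=True, text=True, check=False)
            lines.append(result.stdout + result.stderr)
        except FileNotFoundError:
            lines.append("(command not found on this machine)")
    return "\n".join(lines)

# Print the snapshot; redirect it to a file and attach that to your request.
print(capture([["showq", "-i"], ["diagnose", "-p"]]))
```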