Introduction to Ada - Rice's Cray XD1 Cluster

07-Aug-2008

Introduction

Ada is Rice's newest and largest computing cluster. It is a 632-core AMD64 machine built from dual-core 2.2 GHz AMD Opteron 275 CPUs with 1 MB L2 cache. Each node has two CPUs (four cores) and a total of 8 GB of RAM, or 2 GB per core. All 8 GB are visible to every core on the node, although 4 GB is local to each dual-core CPU and therefore slightly faster to access.

The system also has three filesystems: a 5 TB Lustre filesystem (/lustre) that provides fast I/O for running user applications, 5 TB for home directories (/home), and another 5 TB for group-based allocations (/projects). The interconnect is Cray's proprietary "RapidArray", which is based on Infiniband.
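The hardware figures above imply a node count and a total memory size. The short calculation below is only a sketch: it assumes (the text does not say so) that all 632 cores sit in identical four-core nodes, some of which may be login or service nodes rather than compute nodes.

```shell
# Figures quoted in the introduction.
total_cores=632
cores_per_node=4
ram_per_node_gb=8

# Derived values (assumption: every node is an identical four-core node).
nodes=$(( total_cores / cores_per_node ))
ram_per_core_gb=$(( ram_per_node_gb / cores_per_node ))
total_ram_gb=$(( nodes * ram_per_node_gb ))

echo "nodes: ${nodes}"
echo "RAM per core: ${ram_per_core_gb} GB"
echo "total RAM: ${total_ram_gb} GB"
```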
Ada is running SuSE 9.0 Linux and the 2.6.5 kernel, with small changes made by Cray, particularly for their RapidArray interconnect. Most installed software is in /opt/apps. See the module command for information on how to use these applications. If you need any software that is not present, please let us know.

For information on module (the Unix shell configuration program), PBS, compilers, MPI, and contact information, see the remainder of this document.

A final note: be careful about changing your Unix shell's configuration (.profile, .cshrc, .bash, etc.) until you get things working. The system and the necessary shell environment are a little different from the RTC.

Logging in to Ada

Ada can be accessed from any machine on the Rice campus with SSH. If you need off-campus access, you will have to install VPN on your computer and then log in to Ada via SSH. For more information regarding off-campus access, please visit our Off-Campus Access FAQ.

To log in to Ada from a Linux or Unix machine, type:
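A minimal sketch of the login command, wrapped in a small shell function. The host name ada.example.edu and the function name ada_login are placeholders introduced here for illustration; substitute the login address and user name that came with your account.

```shell
# Placeholder address (assumption): replace with the real Ada login host.
ADA_HOST="ada.example.edu"

# Open an SSH session on Ada. Usage: ada_login <username>
ada_login() {
    ssh "$1@${ADA_HOST}"
}
```

Calling ada_login with your user name is then equivalent to typing ssh followed by username@host directly.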
To transfer files into Ada from a Linux or Unix machine, use scp:
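A minimal sketch of the copy command. The host name ada.example.edu and the function name ada_put are placeholders, not Ada's real address or an installed tool; an empty remote path after the colon means your home directory on the remote side.

```shell
# Placeholder address (assumption): replace with the real Ada login host.
ADA_HOST="ada.example.edu"

# Copy a local file to Ada.
# Usage: ada_put <username> <local-file> [remote-directory]
# Without a remote directory the file lands in your home directory.
ada_put() {
    scp "$2" "$1@${ADA_HOST}:${3:-}"
}
```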
For more information about using SSH, please see our SSH FAQ.

Login Nodes

Once you are logged in to Ada, you are on one of four login nodes. These nodes are intended for users to compile software, prepare data files, and submit jobs to the job queue. They are not intended for running compute jobs. Please run all compute jobs in one of the job queues described later in this document.

Filesystems, Quotas and Job Output

Ada currently enforces disk quotas for all users. There is a 10 GB quota for home directories (/home, also called /users) and a 50 GB quota per group (/projects). There are no quotas on /lustre. However, /lustre is for applications that need fast I/O and is not for permanent storage. Any files on /lustre that are not modified for more than two weeks will be deleted automatically!

The /home (also called /users) and /projects filesystems are intended for permanent storage only.

NOTE: Do not use /users and /projects for job I/O. Please see our FAQ for more details on job I/O.

To see your current quota and your disk usage, run this command:
To see the quota and usage for all groups that you belong to, run this command:
Customizing Your Environment with the module Command

Each user can customize their environment using the module command. This command lets you select software and sets up the appropriate paths and libraries. All the requested user applications are located under the /opt/apps/ directory.
To list what applications are available, type:
To load the module for PGI v6.1.2, type:
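The two steps above can be sketched as follows. The subcommands avail, load and list are standard for the module tool, but the module name pgi/6.1.2 is a guess at how PGI v6.1.2 is labelled on Ada; copy the exact name from the module avail listing. The guard only keeps the snippet harmless on machines where module is not defined.

```shell
# Guessed module name (assumption): check "module avail" for the real one.
PGI_MODULE="pgi/6.1.2"

if type module >/dev/null 2>&1; then
    module avail                 # list the applications that are available
    module load "$PGI_MODULE"    # set paths and libraries for PGI
    module list                  # confirm what is loaded now
else
    echo "module is not defined in this shell; run this on an Ada login node" >&2
fi
```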
For assistance with module, type man module.

Job Scheduling

The batch job scheduling system implemented on Ada consists of two packages: Torque and Moab. Torque is in charge of resource management and monitoring, while the Moab scheduler decides when and where jobs should run. Torque is an enhanced, open-source derivative of OpenPBS and implements all of the usual PBS commands as described later in this document.

Fairshare Scheduling Policy

We use the Moab fairshare feature to provide fair utilization of the available resources. It does this by incorporating historical resource utilization into job feasibility and priority decisions. Fairshare is normally the most significant component of a job's priority, which ultimately determines the job's position in the queue. We do not use a FIFO (First-In-First-Out) scheduler on Ada.
Backfill Scheduling Policy

Backfill is a scheduling optimization that allows Moab to make better use of available resources by running jobs out of order. Using job data such as walltime and resources requested, the scheduler can start other, lower-priority jobs so long as they do not delay the highest-priority jobs. Because backfill essentially fills in holes in node space, it tends to favor smaller and shorter-running jobs over larger and longer-running ones.
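The idea can be illustrated with a toy calculation. This is a deliberately simplified sketch, not Moab's actual algorithm: the job names, processor counts and walltimes are invented, and the only rule applied is that a lower-priority job may start if it fits in the free processors and its requested walltime ends before the highest-priority job is due to start.

```shell
# Hypothetical snapshot: 16 processors are free for the next 60 minutes,
# until the highest-priority job is due to start.
free_procs=16
window_min=60

# Queued lower-priority jobs as "name processors walltime_in_minutes".
queued_jobs="jobA 32 30
jobB 8 45
jobC 4 120
jobD 8 20"

backfilled=""
while read -r name procs walltime; do
    # Backfill only jobs that fit in the hole and finish inside the window.
    if [ "$procs" -le "$free_procs" ] && [ "$walltime" -le "$window_min" ]; then
        backfilled="${backfilled}${backfilled:+ }${name}"
        free_procs=$(( free_procs - procs ))
    fi
done <<EOF
$queued_jobs
EOF

echo "backfilled: ${backfilled}"
```

In this toy run jobC is small enough to fit but is skipped because its requested walltime is longer than the window, which is why an overestimated walltime can keep a short job waiting.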
NOTE: It is important to specify an accurate walltime for your job in your PBS submission script. Selecting the default of 4 hours for jobs that are known to run for less time may result in the job being delayed by the scheduler due to an overestimation of the time the job needs to run.

Available Queues and System Load

We currently provide two queues for general use, compute and interactive:

Compute is a standard-priority queue that can allocate all of the available resources (maximum of 544 processors) and has a maximum job walltime of 4 hours.

Interactive is a higher-priority queue intended for debugging sessions and interactive jobs. The maximum number of CPUs that can be accessed through this queue is 16, with a maximum job walltime of 30 minutes. This queue is available 24 hours per day. To use this queue, you must use the -I option on the qsub command line (see Batch Processing with PBS for qsub options). This will give you an interactive command line prompt on a compute node.

There may be other queues present on the system. These are normally dedicated to special projects or allocations.

A good way to obtain the status of all queues and their current usage is to run the following PBS command:
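The Walltime, Run, Que and State columns are those printed by the queue summary of qstat, so the command meant here is most likely qstat -q. A minimal sketch, with queue_summary as a hypothetical helper name:

```shell
# Print one summary line per queue: limits, running and queued job counts, state.
# Usage: queue_summary            (all queues)
#        queue_summary <queue>    (a single queue)
queue_summary() {
    qstat -q "$@"
}
```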
Here is a brief description of the relevant fields:

Walltime: Maximum walltime a job can request
Run: Number of jobs in running state
Que: Number of jobs in queued state
State: The queue is enabled ("E") and running, i.e. started ("R")

Determining Why a Job is not Running

There may be several reasons why a job is not running and appears to be stuck in the queue. Please see our PBS Job Scheduling FAQ for more information.
Batch Processing with PBS

Once you have an executable, you need to create a job script containing the following PBS options:
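As a sketch of what such a header can look like, the snippet below writes a set of commonly used PBS directives to a file. The options themselves (-N, -q, -l, -j, -o, -M, -m, -V) are standard qsub options; the file name options_example.pbs, the job name, the resource values and the e-mail address are placeholders chosen for illustration.

```shell
# Write an example directive header; the quoted 'EOF' keeps the text literal.
cat > options_example.pbs <<'EOF'
#!/bin/bash
# Job name shown by the queue listing commands.
#PBS -N myjob
# Destination queue.
#PBS -q compute
# Two nodes with four processors each, for at most one hour (hh:mm:ss).
#PBS -l nodes=2:ppn=4
#PBS -l walltime=01:00:00
# Merge the error stream into the output stream and name the output file.
#PBS -j oe
#PBS -o myjob.out
# Send mail on abort, begin and end of the job (placeholder address).
#PBS -M your_netid@rice.edu
#PBS -m abe
# Export the submitting shell's environment variables to the job.
#PBS -V
EOF
```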
See Table 1 below for PBS submission options.

Table 1. PBS Submission Options
Job Launchers (mpiexec, mpirun)

The job launcher's purpose is to spawn copies of your executable across the resources allocated to your job. We currently recommend and support mpiexec for this task. It is a cleaner, safer and faster alternative to mpirun. By default mpiexec only needs your executable; the rest of the information is extracted from PBS.

Cray also provides a special application launcher that works in conjunction with mpiexec. The xd1launcher ensures that your application takes advantage of XD1 software features such as LSS (Linux Synchronized Scheduler) and CPU affinity. This is an easy way to increase the performance of your application on the XD1 without much effort.

Examples:

Run “myprogram” as a parallel MPI code on each of the processors allocated by PBS:
Run “myprogram” on only 8 processors:
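Both launches can be sketched inside one small job script. This assumes the PBS-aware mpiexec, where -n limits the number of processes; the file name launch_example.pbs and the requested resources are placeholders, and the xd1launcher is left out because its invocation is not described here.

```shell
# Write an example job script that launches myprogram in both ways.
cat > launch_example.pbs <<'EOF'
#!/bin/bash
#PBS -q compute
#PBS -l nodes=4:ppn=4
#PBS -l walltime=00:30:00

# Start in the directory the job was submitted from.
cd "$PBS_O_WORKDIR"

# One copy of myprogram on each of the 16 processors allocated by PBS.
mpiexec ./myprogram

# The same program on only 8 of the allocated processors.
mpiexec -n 8 ./myprogram
EOF
```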
We still provide mpirun for applications that cannot use any other launcher. Note that rsh is the default communication protocol for mpirun, but Ada requires ssh. The following example is the job presented above launched using mpirun with ssh configured as the default protocol:
Make sure you have configured passwordless ssh in your account prior to running mpirun, or communication between the nodes assigned to your job will fail.

Job Scripts

A job script may consist of PBS directives, comments and executable statements. A PBS directive provides a way of specifying job attributes in addition to the command line options. For example, we could create a myjob.pbs script this way:
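A minimal sketch of such a script, written to myjob.pbs with a here-document. The job name, queue, resource request and program name are illustrative values, not a recommended configuration.

```shell
# Create myjob.pbs: directives first, then comments and executable statements.
cat > myjob.pbs <<'EOF'
#!/bin/bash
#PBS -N myjob
#PBS -q compute
#PBS -l nodes=2:ppn=4
#PBS -l walltime=00:30:00
#PBS -j oe

# Executable statements start here; run from the submission directory.
cd "$PBS_O_WORKDIR"
mpiexec ./myprogram
EOF
```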
If you need to debug your program and want to run in interactive mode, the same request could be constructed like this:
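A sketch of an interactive request, assuming a job that needs two nodes with four processors each for 30 minutes (illustrative values that stay within the interactive queue's limits of 16 CPUs and 30 minutes). The function name interactive_session is hypothetical; -I, -q and -l are standard qsub options.

```shell
# Request an interactive session: qsub returns a prompt on a compute node
# once the resources have been allocated.
interactive_session() {
    qsub -I -q interactive -l nodes=2:ppn=4 -l walltime=00:30:00
}
```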
NOTE: It is important to specify an accurate walltime for your job in your PBS submission script. Selecting the default of 4 hours for jobs that are known to run for less time may result in the job being delayed by the scheduler due to an overestimation of the time the job needs to run.

Submitting and Monitoring Jobs

Once your job script is ready, use qsub to submit it:
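A minimal sketch of the submission step. The helper name submit_job is hypothetical; it runs qsub on the given script and keeps the job ID that qsub prints, since the commands for modifying or deleting a job need it later.

```shell
# Submit a job script and remember the job ID printed by qsub.
# Usage: submit_job myjob.pbs
submit_job() {
    JOBID=$(qsub "$1") || return 1
    echo "submitted ${1} as ${JOBID}"
}
```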
This will return a job ID, while the output and error streams of the job will be saved to two files inside the directory where the job was submitted. The status of the job can be obtained using Moab commands. See Table 2 for a list of Moab commands.

Table 2. Moab commands
There are four different states that a job can be in after submission: active, idle, blocked or deferred. The showq command with no arguments will list all jobs in their current state.

Active: These are jobs that have been started.
Idle: These jobs are eligible to run, but there are not enough resources to allocate to them at this time.
Blocked: These jobs are not being considered for running, probably due to a policy violation. Jobs will eventually leave this state and go into the idle queue. For instance, when a queue has reached the maximum number of active processes assigned to it, it blocks all further jobs until resources are released by active jobs.
Deferred: Jobs in this state normally have a batch hold, which means that they requested resources of a type or amount that does not exist on the system (walltime, number of nodes, etc.). If your job is deferred, please review the resource requirements in your submission script and make sure that the destination queue can satisfy them.

Modifying and Deleting Jobs

It is possible to modify job attributes after the job has been submitted, as long as it is not in the running state. The PBS command qalter supports all of the parameters available on qsub. This example reduces the walltime originally requested for the job:
A job can also be relocated to a different queue using the qmove command:

A job can be deleted using the qdel command:
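The three operations in this section (qalter, qmove and qdel) can be sketched together. The helper names are hypothetical, and the first argument is always the job ID that qsub printed when the job was submitted.

```shell
# Change the requested walltime of a job that is not yet running.
# Usage: set_walltime <jobid> <hh:mm:ss>
set_walltime() {
    qalter -l walltime="$2" "$1"
}

# Move a job to a different queue. Usage: move_job <jobid> <queue>
move_job() {
    qmove "$2" "$1"
}

# Delete a job. Usage: delete_job <jobid>
delete_job() {
    qdel "$1"
}
```

For example, set_walltime 12345 01:00:00 would lower the request of a (made-up) job 12345 to one hour.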
Compilers and Programming

Several programming models are supported on Ada. Sequential, parallel and distributed programs can all be run. Sequential programs require one processor to run. Parallel and distributed programs utilize multiple processors concurrently. Parallel programs are a subset of distributed programs. Generally speaking, distributed computing involves parametric sweeps, task farming, etc. Message-passing and threaded applications generally fit under the scope of parallel computing.
SPMD (Single Program, Multiple Data) is one of the most popular methods of parallelism, where each copy of a single executable works on its own data. The supported compilers on Ada are PGI, GCC, and J2EE SDK. MPICH implementations for PGI and GCC are available and can be loaded on demand using the module command.

Compiling Serial Code

First, load the appropriate compiler environment. To do so, type:
Once the environment is set, you can compile your program with one of the following:
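A sketch of the serial compile step with the PGI compilers (pgcc, pgCC, pgf77 and pgf90 are the standard PGI driver names). The helper compile_serial is hypothetical and simply picks the driver from the file extension; it assumes the PGI environment has already been loaded with module load, using whatever module name module avail shows.

```shell
# Compile one serial source file with the PGI driver matching its extension.
# Usage: compile_serial <source-file> <executable>
compile_serial() {
    case "$1" in
        *.c)            pgcc  -o "$2" "$1" ;;   # C
        *.cpp|*.cc|*.C) pgCC  -o "$2" "$1" ;;   # C++
        *.f|*.f77)      pgf77 -o "$2" "$1" ;;   # Fortran 77
        *.f90)          pgf90 -o "$2" "$1" ;;   # Fortran 90
        *) echo "compile_serial: unknown source type: $1" >&2; return 1 ;;
    esac
}
```

With GCC loaded instead, the same pattern applies with gcc and g++ in place of the PGI drivers.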
Compiling Parallel Code

To compile a parallel version of your code that has MPI calls, use the appropriate MPICH library. Again, use module to load the appropriate compiler environment.

To compile your code you will have to use the MPICH scripts that are currently in your default path. The MPICH scripts are responsible for invoking the compiler, linking your program with the MPI library and setting the MPI include path (for mpi.h and mpif.h). Once the environment is set, you can compile your program with one of the following (assuming the PGI compiler as above):
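A sketch of the parallel compile step using the usual MPICH wrapper scripts (mpicc, mpiCC, mpif77 and mpif90). The helper compile_mpi is hypothetical, and it assumes the MPICH module built for your compiler is already loaded so that the wrappers are in your path.

```shell
# Compile one MPI source file with the MPICH wrapper matching its extension.
# Usage: compile_mpi <source-file> <executable>
compile_mpi() {
    case "$1" in
        *.c)            mpicc  -o "$2" "$1" ;;   # C
        *.cpp|*.cc|*.C) mpiCC  -o "$2" "$1" ;;   # C++
        *.f|*.f77)      mpif77 -o "$2" "$1" ;;   # Fortran 77
        *.f90)          mpif90 -o "$2" "$1" ;;   # Fortran 90
        *) echo "compile_mpi: unknown source type: $1" >&2; return 1 ;;
    esac
}
```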
Getting Help

If you have any further questions, please see our FAQ. If you still have questions, please let us know:

http://helpdesk.rice.edu
helpdesk@rice.edu
713-348-4357

Please follow our guidelines when contacting the Help Desk for faster problem resolution.