Introduction to SUG@R - 25-Aug-2009
Introduction

SUG@R is Rice's Intel Xeon compute cluster. SUG@R contains 134 SunFire x4150 nodes from Sun Microsystems. Each node has two quad-core Intel Xeon processors running at 2.83 GHz, yielding a system-wide total of 1072 processor cores. A maximum of 102 nodes (816 processors) is available to all users; this number is subject to change due to special projects, maintenance tasks, and so on. Each processor can access up to 16 GB of RAM. All nodes use a Gigabit Ethernet interconnect.

Storage is provided by a 9 TB Panasas filesystem for fast I/O by user applications, a 1 TB filesystem for user home directories, a 2 TB filesystem for group-based allocations, and 250 GB for software (/opt/apps). A complete system overview is online.
SUG@R is running Red Hat Enterprise Linux 5 with the 2.6.18 kernel. Most installed software is in /opt/apps. See the module command for information on how to use these applications. If you need any software that is not present, please let us know.

All jobs requiring a fast network interconnect (MPI jobs spanning multiple nodes) must be run on Ada or STIC. SUG@R is designed to support jobs that do not need a fast network interconnect. Therefore, only jobs of 8 processors (one node) or fewer should be submitted to this system. Recommended parallel job types within a node are MPI (OpenMPI), SMP (OpenMP and compiler-assisted autoparallelization) and threading (pthreads, Java threads). Exceeding one node per job will degrade performance for everyone, and such jobs are subject to termination without notice at the discretion of the systems administrators.

For information on the Unix shell configuration program called module, PBS, compilers, OpenMPI, and contact information, see the remainder of this document.

A final note: Be careful about changing your Unix shell's configuration (.profile, .cshrc, .bash, etc.) until you get things working. The system and the necessary shell environment are a little different from Ada and RTC, so use caution when trying to duplicate your environment from one of these clusters.

Logging in to SUG@R

SUG@R can be accessed from any machine on the Rice campus with SSH. If you need off-campus access, you will have to install VPN on your computer and then log in to SUG@R via SSH. For more information regarding off-campus access, please visit our Off-Campus Access FAQ.
To log in to SUG@R from a Linux or Unix machine, type:
To transfer files into SUG@R from a Linux or Unix machine, use scp:
For more information about using SSH, please see our SSH FAQ. Once you are logged in to SUG@R, you are logged into one of two login nodes as shown in the diagram below. These nodes are intended for users to compile software, prepare data files, and submit jobs to the job queue. They are not intended for running compute jobs. Please run all compute jobs in one of the job queues described later in this document.
Filesystems, Quotas, and Job Output

SUG@R currently enforces disk quotas for all users. There is a quota for home directories (accessed via $HOME) and for project directories (accessed via $PROJECTS). There are no quotas on $SHARED_SCRATCH. However, this filesystem is for applications that need fast I/O and is not for permanent storage. Any files on $SHARED_SCRATCH that are not modified for more than one month will be deleted automatically! Permanent storage is in $HOME and $PROJECTS only. A summary of all filesystems available to all users is presented in the following table:
NOTE: The $HOME filesystem is scheduled for an upgrade later this year which will result in more disk space and larger quotas.

NOTE: $HOME and $PROJECTS cannot be used for job I/O. Jobs found to be using $HOME and $PROJECTS for job I/O are subject to termination without notice. Please see our FAQ for more details on job I/O.

NOTE: The physical paths listed in the chart above are subject to change. You should always access the filesystems using environment variables. For example, to access /shared.scratch/dirname, use this command:
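A minimal sketch of this pattern, assuming a Bourne-style shell. The subdirectory name `dirname` is the placeholder from the text; the fallback to a temporary directory is an assumption added only so that the snippet also runs on a machine where `$SHARED_SCRATCH` is not defined.

```shell
# Resolve the scratch root from the environment instead of hard-coding /shared.scratch.
# The mktemp fallback is an assumption for off-cluster testing only.
scratch_root="${SHARED_SCRATCH:-$(mktemp -d)}"

# "dirname" is the placeholder directory name used in the text.
target="$scratch_root/dirname"
mkdir -p "$target"
cd "$target" || exit 1
pwd
```

Because the path is built from the variable, the same lines keep working if the physical location of the filesystem changes.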
To see your current quota and your disk usage for your home directory, run this command:
To see the quota and usage for the projects directories for all groups that you belong to, run this command:
For information on how to use $PROJECTS, please see our FAQ.
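Because files on $SHARED_SCRATCH are deleted once they have gone unmodified for more than one month, it can help to list your files that are approaching that age. The following is a sketch only: the function name `list_stale_files` and the 30-day threshold are assumptions standing in for "one month", and the actual purge on SUG@R may measure age differently.

```shell
# list_stale_files DIR [DAYS]
# Print regular files under DIR, owned by the current user, whose last
# modification is more than DAYS days old. DAYS defaults to 30, an assumed
# stand-in for the one-month purge window.
list_stale_files() {
    dir="$1"
    days="${2:-30}"
    find "$dir" -type f -user "$(id -u)" -mtime +"$days" -print
}

# Example: check the scratch filesystem, if the variable is defined on this machine.
if [ -n "${SHARED_SCRATCH:-}" ] && [ -d "$SHARED_SCRATCH" ]; then
    list_stale_files "$SHARED_SCRATCH"
fi
```

Anything the function prints should be copied to $HOME or $PROJECTS if it needs to be kept.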
Customizing Your Environment with the module Command

Each user can customize their environment using the module command. This command lets you select software and will set up the appropriate paths and libraries. All the requested user applications are located under the /opt/apps directory.
To list what applications are available, type:
To load the module for the Intel compiler, use:
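A typical session might look like the sketch below. `module avail`, `module load`, and `module list` are standard subcommands of the module tool. The module name `openmpi/1.2.6-gcc` is taken from the OpenMPI section of this document and is used here purely as an illustration; the name of the Intel compiler module should be read from the `module avail` output. The guard is an assumption added so that the snippet does nothing on a machine without the module command.

```shell
# Run the module commands only where the module tool is defined (e.g. on the login nodes).
if type module >/dev/null 2>&1; then
    module avail                    # list every application that can be loaded
    module load openmpi/1.2.6-gcc   # load one entry from that list
    module list                     # confirm what is currently loaded
    module_found=yes
else
    echo "module command not available in this shell"
    module_found=no
fi
```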
For more information on using the module command in PBS batch scripts, please see our FAQ.

Job Scheduling
| Queue Name | Maximum number of nodes per job | Maximum number of CPUs per job | Maximum number of CPUs in use by a single user at any given time | Maximum number of CPUs in this queue | Minimum Walltime | Maximum Walltime |
|---|---|---|---|---|---|---|
| commons | 1 | 8 | 32 (normal load), 128 (light load) | 768 | 00:00:00 | 24:00:00 |
| interactive | 1 | 8 | 8 per interactive session | 32 | 00:00:00 | 00:30:00 |
Commons is a standard priority queue that can allocate the maximum number of CPUs per job and currently has a maximum job walltime of 24 hours. The total number of CPUs in this queue is subject to change at any time due to special projects and system maintenance tasks. This system is designed for small, single node jobs. Therefore, jobs requiring more than 8 CPU cores or more than one node are discouraged. These jobs should be run on Ada or STIC.
Interactive is a higher priority queue with the purpose of serving interactive jobs. The maximum number of CPUs that can be accessed through this queue is 32 with a maximum job walltime of 30 minutes. This queue is available 8AM to 10PM each day. See our FAQ for more details.
NOTE: The maximum number of cores (processors) allowed to be running at one time for any user is 32 under normal load regardless of how many jobs are in the queue or how many cores per job requested. This number will be increased to 128 automatically under light system load. The maximum number of cores (processors) that may be requested in any one job is 8 and they must be within the same node (no MPI traffic between nodes).
NOTE: Do not run CPU intensive processes on SUG@R's login nodes. Use one of the queues listed above. Any CPU intensive process running on the login nodes is subject to termination without notice.
There may be other queues present on the system. These are normally dedicated to special projects/allocations.

A good way to obtain the status of all queues and their current usage is to run the following PBS command:
Once you have an executable program and are ready to run it on the compute nodes, you must create a job script containing the following PBS options:
After the job script has been constructed you must submit it to the job scheduler for execution. The remainder of this section will describe the anatomy of a PBS script and how to submit and monitor jobs.
All jobs must be submitted via a PBS batch script or by invoking qsub at the command line. See the table below for PBS submission options.
PBS Submission Options

| Option | Description |
|---|---|
| #PBS -N jobname | Assigns a job name. The default is the name of PBS job script. |
| #PBS -l nodes=1:ppn=2 | The number of nodes and processors per node. |
| #PBS -l nodes=1:ppn=1 | Using both of these options will give your job exclusive access to a node such that no other jobs can share the node. This combination of arguments will assign one processor to your job and will give it exclusive access to all of the resources (i.e. memory) of the entire node without interference from other jobs. Please see our FAQ for more details on exclusive access. |
| #PBS -l walltime=01:00:00 | The maximum wall-clock time needed for this job to run. |
| #PBS -l pmem=2000m | The maximum amount of physical memory used by any single process of the job (in megabytes). See our FAQ for more details. |
| #PBS -q queuename | Specify the name of the queue to use. |
| #PBS -o mypath | The full path for the standard output (stdout) .OU files. |
| #PBS -e mypath | The full path for the standard error (stderr) .ER files. |
| #PBS -j oe | Join option that merges the standard error stream with the standard output stream of the job. |
| #PBS -V | Exports all environment variables to the job. |
| #PBS -M username@rice.edu | Email address for job status messages. |
| #PBS -m bae | PBS will notify the user via email when the job begins, aborts or terminates. |
| #PBS -m n | Turn off all email from the job. |
The job launcher's purpose is to spawn copies of your executable across the resources allocated to your job. We currently recommend and support mpiexec for this task. It is a cleaner, safer and faster alternative to mpirun. By default mpiexec only needs your executable; the rest of the information will be extracted from PBS.

The following is an example of how to use mpiexec inside your PBS batch script. This example will run myprogram.exe as a parallel OpenMPI code on all of the processors requested in the script and allocated by PBS:
    #PBS -l nodes=1:ppn=4
    mpiexec /path/to/myprogram.exe
NOTE: The above example assumes that myprogram.exe is a program designed to be parallel (using MPI). If your program has not been parallelized, then running on more than one processor will not improve performance and will result in wasted processor time.
A job script may consist of PBS directives, comments and executable statements. A PBS directive provides a way of specifying job attributes in addition to the command line options. For example, we could create a myjob.pbs script this way:
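A minimal sketch of such a script, using only directives listed in the options table above. The job name `myjob`, the one-hour walltime, and the program path `./myprogram.exe` are placeholder assumptions; the `:-` fallbacks are there only so that the script can also be dry-run outside PBS.

```shell
#!/bin/bash
#PBS -N myjob
#PBS -q commons
#PBS -l nodes=1:ppn=4
#PBS -l walltime=01:00:00
#PBS -j oe
#PBS -V

# Move to the directory the job was submitted from (falls back to the
# current directory when the script is run outside PBS).
job_dir="${PBS_O_WORKDIR:-$PWD}"
cd "$job_dir" || exit 1

echo "Job ${PBS_JOBID:-(not under PBS)} started on $(uname -n)"

# ./myprogram.exe is a hypothetical MPI executable; under PBS, mpiexec
# takes the processor count from the allocation.
if [ -x ./myprogram.exe ] && command -v mpiexec >/dev/null 2>&1; then
    mpiexec ./myprogram.exe
else
    echo "myprogram.exe or mpiexec not found; nothing to run"
fi
```

Lines beginning with `#PBS` are read by the scheduler and are ordinary comments to the shell. The script would be submitted with `qsub myjob.pbs`.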
NOTE: It is important to specify an accurate walltime for your job in your PBS submission script. Selecting the default walltime for jobs that are known to run for less time may result in the job being delayed by the scheduler due to an overestimation of the time the job needs to run.
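One way to arrive at an accurate request is to time a short representative run and add a safety margin. The helper below is a sketch: the function name `walltime_from_seconds` and the 20% default margin are assumptions, not site policy. It converts a measured run time in seconds into the HH:MM:SS form expected by `-l walltime=`.

```shell
# walltime_from_seconds SECONDS [MARGIN_PERCENT]
# Print SECONDS padded by MARGIN_PERCENT (default 20, an assumed value) as HH:MM:SS.
walltime_from_seconds() {
    secs="$1"
    margin="${2:-20}"
    # Integer arithmetic; round the padded value up to a whole second.
    padded=$(( (secs * (100 + margin) + 99) / 100 ))
    printf '%02d:%02d:%02d\n' $(( padded / 3600 )) $(( padded % 3600 / 60 )) $(( padded % 60 ))
}

# A run measured at 50 minutes (3000 s) becomes a one-hour request: 01:00:00
walltime_from_seconds 3000
```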
If you need to debug your program and want to run in interactive mode, the same request could be constructed like this (via the qsub command):
When you submit a job, it will inherit several environment variables that are automatically set by PBS. These environment variables can be useful in your job submission scripts as seen in the examples above. A summary of the most important variables is presented in the table below.
| Variable Name | Description |
|---|---|
| $TMPDIR | Location of scratch space on each node. See our FAQ for more details. |
| $PBS_NODEFILE | Location of a file that contains a list of all nodes assigned to the job. |
| $PBS_O_WORKDIR | Path from where the job was submitted. |
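The sketch below shows how these three variables are commonly used inside a job script. The fallbacks after `:-` and the default of one processor are assumptions that let the fragment run outside PBS, and the one-line-per-processor layout of the node file is the usual PBS/Torque behavior rather than something stated in this document.

```shell
# Count the processors assigned to the job: the node file normally holds
# one line per allocated processor.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    nprocs=$(( $(wc -l < "$PBS_NODEFILE") ))
else
    nprocs=1    # assumed default when not running under PBS
fi

# Node-local scratch for temporary files, and the directory the job came from.
scratch="${TMPDIR:-/tmp}"
submit_dir="${PBS_O_WORKDIR:-$PWD}"

echo "Running with $nprocs processor(s); scratch=$scratch; submitted from $submit_dir"
```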
Table 2. Maui commands

| Command | Description |
|---|---|
| showq | Show a detailed list of all submitted jobs. |
| checkjob job.ID | Show a detailed description of the job given by job.ID. |
| showstart job.ID | Gives an estimate of the expected start time of the job given by job.ID. |
There are four different states that a job can be in after submission: active, idle, blocked or deferred. The showq command with no arguments will list all jobs in their current state.

Active (Running): These are jobs that have been started.

Idle: These jobs are eligible to run, but there are not enough resources to allocate to them at this time.

Blocked: These jobs are not being considered for running, probably due to a policy violation. Jobs will eventually leave this state and go into the idle queue. For instance, a queue may have reached the maximum number of active processors assigned to it, in which case it blocks all jobs until resources are released by active jobs.

Deferred: Jobs in this state normally have a batch hold, which means that they requested resources of a type or amount that does not exist on the system (walltime, number of nodes, etc.). If your job is deferred, please review the resource requirements in your submission script and make sure that the destination queue can satisfy them.
It is possible to modify the attributes of a job after it has been submitted, as long as it is not yet in the running state. The PBS command qalter supports all of the parameters available on qsub. This example reduces the walltime originally requested for the job:

A job can also be relocated to a different queue using the qmove command:

A job can be deleted by using the qdel command:
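Taken together, the three commands can be exercised in one session as sketched below. The script name `myjob.pbs` is the example name used earlier; the new walltime and the choice of `interactive` as the destination queue are illustrative assumptions. The guard simply skips everything on a machine without the PBS client tools.

```shell
if command -v qsub >/dev/null 2>&1 && [ -f myjob.pbs ]; then
    job_id=$(qsub myjob.pbs)                # qsub prints the new job's ID
    qalter -l walltime=00:30:00 "$job_id"   # reduce the requested walltime
    qmove interactive "$job_id"             # relocate the queued job to another queue
    qdel "$job_id"                          # delete the job
    pbs_tools=yes
else
    echo "PBS client tools or myjob.pbs not found; skipping"
    pbs_tools=no
fi
```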
Several programming models are supported on SUG@R. Both sequential programs and programs that are parallel within a node can be submitted. Sequential programs require one processor to run. Parallel programs use multiple processors concurrently. The maximum size of a parallel job on SUG@R is 8 processors. Message passing and threaded applications generally fit under the scope of parallel computing. Recommended parallel job types within a node are MPI (OpenMPI), SMP (OpenMP and compiler-assisted autoparallelization) and threading (pthreads, Java threads).
The supported compilers on SUG@R are Intel and GCC, with Intel being the preferred compiler. OpenMPI builds for both the Intel and GCC compilers are available and can be loaded on demand using the module command.
When invoked as described above, the compiler will perform the preprocessing, compilation, assembly and linking stages in a single step. The output file (or executable) is specified by executablename and the source code file is specified by sourcecode.f77, for example. Omitting the -o executablename option will result in the executable being named a.out by default. For additional instructions and advanced options, please view the online manual pages for each compiler (for example, execute the command man ifort).
| module command | Description |
|---|---|
| module load openmpi/1.2.6-gcc | For gcc compiled version |
| module load openmpi/1.2.6-intel.10.1.015 | For Intel compiled version |

    mpicc -o executablename mpi_sourcecode.c
When invoked as described above, the compiler will perform the preprocessing, compilation, assembly and linking stages in a single step. The output file (or executable) is specified by executablename and the source code file is specified by mpi_sourcecode.f77, for example. Omitting the -o executablename option will result in the executable being named a.out by default. For additional instructions and advanced options, please view the online manual pages for each compiler (for example, execute the command man mpif77).
The GNU compiler is installed as part of the Red Hat Enterprise Linux distribution. Use man gcc to view the online manual for the C and C++ compiler, and man gfortran to view the online manual for the Fortran compiler.
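A minimal sketch of the single-step compile described above, using gcc. The file names `hello.c` and `hello` are hypothetical, and the guard is an assumption so that the snippet is skipped where gcc is not installed.

```shell
# Write a small test program into a temporary working directory.
workdir=$(mktemp -d)
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>

int main(void) {
    printf("hello from gcc\n");
    return 0;
}
EOF

if command -v gcc >/dev/null 2>&1; then
    # Preprocess, compile, assemble and link in one step; -o names the executable.
    gcc -o "$workdir/hello" "$workdir/hello.c"
    "$workdir/hello"
else
    echo "gcc not found; skipping compile"
fi
```

Leaving out `-o "$workdir/hello"` would produce an executable named a.out in the current directory instead.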
If you have any further questions please see our FAQ. If you still have questions, please let us know:
http://helpdesk.rice.edu
helpdesk@rice.edu
713-348-4357
Please follow our guidelines when contacting the Help Desk for faster problem resolution.