I - HPC and Resource Management
Resources and Partitions
An HPC job is a description of the resources required, any preparatory steps such as loading modules or otherwise setting up an environment, and the commands to run the software, along with any postprocessing that may be appropriate.
The job is specified through a special form of script often called a batch script. Usually it is written in bash.
Resources include the amount of time requested, the amount of memory, the number of cores per node, and, if appropriate, the number of nodes or the number and/or architecture of GPUs.
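For illustration, the sketch below shows how such resources can be requested with Slurm's #SBATCH directives in the preamble of a batch script. The specific values are placeholders, not recommendations.

```bash
#!/bin/bash
# Illustrative resource request -- all values are placeholders.
#SBATCH --time=02:00:00        # requested wall-clock time (2 hours)
#SBATCH --mem=16G              # memory per node
#SBATCH --nodes=1              # number of nodes
#SBATCH --ntasks-per-node=4    # tasks per node (one core each by default)
#SBATCH --gres=gpu:1           # one GPU, if appropriate
```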
In the abstract, a queue is a sequence of jobs to be prioritized and handled. In a cluster, a queue, which Slurm calls a partition, is implemented with a group of compute nodes that provide a particular set of resources.
Slurm is a software package that manages the resources of the cluster and balances the demands of many competing job requests. It consists of a workload manager, often called a scheduler, and a slurmd “daemon” which runs on each node and handles the execution and monitoring of the jobs.
Cores, Nodes, and Tasks
Hardware
The Slurm model is a cluster consisting of a number of nodes. Each node is a separate server. These servers are similar to an ordinary desktop computer, but they are more reliable and usually provide more memory and cores than an ordinary desktop.
A core is a computing unit. It is part of a CPU.
Memory refers to random-access memory (RAM). It is not the same thing as storage. If a process reports running out of memory, it means RAM. Running out of disk space results in a different error.
For more details about the structure of a computational cluster, see our introduction.
Processes and Tasks
A process can be envisioned as an instance of an executable that is running on a particular computer. Most executables run only a single process. Some executables run multiple threads within the root process.
Slurm refers to the root process as a task. By default, each task is assigned to one core.
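As a minimal sketch, a single multi-threaded program could therefore be requested as one task with several cores; the values below are hypothetical.

```bash
# One task (one process), with four cores available to its threads.
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
```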
Slurm Resource Requests
Slurm refers to queues as partitions. We do not have a default partition; each job must request one explicitly.
| Queue Name | Purpose | Job Time Limit | Max Memory / Node / Job | Max Cores / Node |
|---|---|---|---|---|
| standard | For jobs on a single compute node | 7 days | 1462 GB | 96 |
| gpu | For jobs that can use general-purpose GPUs (A40, A100, A6000, V100, RTX3090) | 3 days | 1953 GB | 128 |
| parallel | For large parallel jobs on up to 50 nodes (<= 1500 CPU cores) | 3 days | 750 GB | 96 |
| interactive | For quick interactive sessions (up to two RTX2080 GPUs) | 12 hours | 216 GB | 96 |
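The partition is selected in the job script with Slurm's --partition (or -p) option. For instance, a job intended for a single compute node would request the standard partition:

```bash
#SBATCH --partition=standard    # equivalently: #SBATCH -p standard
```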
To see an online list of available partitions, from a command line type
$ qlist
A more detailed view of the partitions and their limits is available through the command
$ qlimits
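These appear to be site-specific convenience commands; on any Slurm cluster, the standard utilities sinfo and scontrol report comparable information, for example:

```bash
$ sinfo                          # list partitions, their states, and time limits
$ scontrol show partition gpu    # detailed limits for a single partition
```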
Batch Scripts
Jobs are described to the resource manager in the form of a script. Typically this is written in the bash scripting language. Bash is the default shell on most Linux-based systems, which includes the majority of HPC systems, so it is expected to be available to interpret the script. However, Slurm accepts scripts in other languages if the interpreter is available. We will consider only bash scripts in this tutorial.
To prepare a job, the user writes a script. The top of the script is a preamble that describes the resource requests. The rest of the script contains the instructions to execute the job. The script is then submitted to the Slurm system. The Slurm workload manager examines the preamble to determine the resources needed, while ignoring the rest of the script. It uses the resource request along with a fair share algorithm to set the priority of the job. The job is then placed into the requested partition to wait for the resources to become available.
Once the job starts, the slurmd daemon runs the script as an ordinary shell script. The preamble consists of comments (lines that are not executed by the interpreter), so bash ignores it. The rest of the script must be a valid bash script.
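Putting these pieces together, a minimal batch script might look like the following sketch; the module name, program, and resource values are placeholders.

```bash
#!/bin/bash
#SBATCH --job-name=example        # name shown in the queue
#SBATCH --partition=standard      # requested partition
#SBATCH --time=01:00:00           # wall-clock time limit
#SBATCH --ntasks=1                # one task (one process)
#SBATCH --cpus-per-task=4         # cores assigned to that task
#SBATCH --mem=8G                  # memory per node

# Preparatory steps: set up the software environment (module name is a placeholder).
module load mysoftware

# Run the program (placeholder executable and input file).
./my_program input.dat
```

The script is then handed to the workload manager with the sbatch command, for example `sbatch myjob.slurm` (the file name is arbitrary).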