Job Examples

Parallel Jobs

For submitting parallel jobs, a few rules need to be understood and followed. In general, they depend on the type of parallelization and on the target architecture.
OpenMP Jobs

An SMP-parallel job can only run within a single node, so it is necessary to include the options --nodes=1 and --ntasks=1. The maximum number of processors for an SMP-parallel program is 896 on the cluster Julia, as described in the section on memory limits. Using the option --cpus-per-task=<N>, Slurm will start one task, and you will have N CPUs available for your job. An example job file would look like:
Job file for OpenMP application
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --tasks-per-node=1
#SBATCH --cpus-per-task=8
#SBATCH --time=08:00:00
#SBATCH --mail-type=start,end
#SBATCH --mail-user=<your.email>@tu-dresden.de
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./path/to/binary
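Outside a batch job, SLURM_CPUS_PER_TASK is unset, so a fallback is useful when testing the export line from the job file above interactively. A minimal sketch (the fallback value of 1 is an assumption for testing, not something Slurm provides):

```shell
# Derive the OpenMP thread count from Slurm's allocation; the fallback
# of 1 (an assumption) applies only when SLURM_CPUS_PER_TASK is unset,
# i.e. outside a batch job.
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
echo "${OMP_NUM_THREADS}"
```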
MPI Jobs

For MPI-parallel jobs, one typically allocates one core per task that has to be started.

MPI libraries

There are different MPI libraries on ZIH systems for the different microarchitectures. Thus, you have to compile the binaries specifically for the target architecture of the cluster of interest. Please refer to the sections on building software and module environments for detailed information.
Job file for MPI application
#!/bin/bash
#SBATCH --ntasks=864
#SBATCH --time=08:00:00
#SBATCH --job-name=Science1
#SBATCH --mail-type=end
#SBATCH --mail-user=<your.email>@tu-dresden.de
srun ./path/to/binary
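Slurm distributes the 864 tasks from the job file above across as many nodes as needed. As a rough sanity check, you can estimate the node count yourself; a small sketch, assuming a hypothetical 96 cores per node (check the actual hardware of your cluster):

```shell
# Rough estimate: nodes needed for 864 MPI ranks at one core per rank,
# assuming (hypothetically) 96 cores per node -- not a real cluster value.
ntasks=864
cores_per_node=96
# Ceiling division: round up so partially filled nodes are counted.
nodes=$(( (ntasks + cores_per_node - 1) / cores_per_node ))
echo "${nodes}"   # prints: 9
```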
Multiple Programs Running Simultaneously in a Job

In this short example, our goal is to run four instances of a program concurrently in a single batch script. Of course, we could also start a batch script four times with sbatch, but this is not what we want to do here. However, you can also find an example of how to run GPU programs simultaneously in a single job.
#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=1
#SBATCH --time=01:00:00
#SBATCH --job-name=PseudoParallelJobs
#SBATCH --mail-type=end
#SBATCH --mail-user=<your.email>@tu-dresden.de
# The following sleep command was reported to fix warnings/errors with srun by users (feel free to uncomment).
#sleep 5
srun --exclusive --ntasks=1 ./path/to/binary &
#sleep 5
srun --exclusive --ntasks=1 ./path/to/binary &
#sleep 5
srun --exclusive --ntasks=1 ./path/to/binary &
#sleep 5
srun --exclusive --ntasks=1 ./path/to/binary &
echo "Waiting for parallel job steps to complete..."
wait
echo "All parallel job steps completed!"
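Instead of copy-pasting the srun line four times, the same four job steps can be launched in a loop. This is a sketch of an equivalent job-file body; it assumes the same Slurm allocation as the job file above and cannot run outside a job:

```shell
# Launch four concurrent job steps in a loop (same effect as the four
# copy-pasted srun lines above).
for i in 1 2 3 4; do
    # sleep 5   # optional, see the note about srun warnings above
    srun --exclusive --ntasks=1 ./path/to/binary &
done
echo "Waiting for parallel job steps to complete..."
wait
echo "All parallel job steps completed!"
```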
Request Resources for Parallel Make

From time to time, you want to build and compile software and applications on a compute node. But do you need to request tasks or CPUs from Slurm in order to provide resources for the parallel make command? The answer is "CPUs".

Interactive allocation for parallel make command
marie@login$ srun --ntasks=1 --cpus-per-task=16 --mem=16G --time=01:00:00 --pty bash --login
[...]
marie@compute$ # prepare the source code for building using configure, cmake or so
marie@compute$ make -j 16
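Inside the allocation, the granted CPU count is available in SLURM_CPUS_PER_TASK, so the job count for make does not need to be hard-coded. A small sketch (the fallback of 16 mirrors the allocation above and only matters when testing outside a job; it is an assumption):

```shell
# Use the CPU count Slurm allocated; fall back to 16 (the value requested
# in the srun example above) when running outside a job.
jobs=${SLURM_CPUS_PER_TASK:-16}
echo "make -j ${jobs}"
```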
Exclusive Jobs for Benchmarking

Jobs on ZIH systems run in shared mode by default, meaning that multiple jobs (from different users) can run at the same time on the same compute node. Sometimes, this behavior is not desired (e.g., for benchmarking purposes). You can request exclusive usage of resources using the Slurm parameter --exclusive.
Exclusive does not allocate all available resources

Setting --exclusive only makes sure that there will be no other jobs running on your nodes. It does not, however, mean that you automatically get access to all the resources which the node might provide without explicitly requesting them. E.g., you still have to request a GPU via the generic resources parameter (gres) on the GPU cluster. On the other hand, you also have to request all cores of a node if you need them.

CPU cores can either be used for a task (--ntasks) or for multi-threading within the same task (--cpus-per-task). Since those two options are semantically different (e.g., the former influences how many MPI processes will be spawned by srun, whereas the latter does not), Slurm cannot automatically determine which of the two you might want to use. Since we use cgroups for separation of jobs, your job is not allowed to use more resources than requested.
Here is a short example to ensure that a benchmark is not spoiled by other jobs, even if it doesn't use up all resources of the nodes:
Job file with exclusive resources
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=8
#SBATCH --exclusive # ensure that nobody spoils my measurement on 2 x 2 x 8 cores
#SBATCH --time=00:10:00
#SBATCH --job-name=benchmark
#SBATCH --mail-type=start,end
#SBATCH --mail-user=<your.email>@tu-dresden.de
srun ./my_benchmark
Array Jobs

Array jobs can be used to create a sequence of jobs that share the same executable and resource requirements, but have different input files, to be submitted, controlled, and monitored as a single unit. The option is -a, --array=<indexes>, where the parameter indexes specifies the array indices. The following specifications are possible:

- comma-separated list, e.g., --array=0,1,2,17
- range-based, e.g., --array=0-42
- step-based, e.g., --array=0-15:4
- mix of comma-separated list and range, e.g., --array=0,1,2,16-42

A maximum number of simultaneously running tasks from the job array may be specified using the % separator. The specification --array=0-23%8 limits the number of simultaneously running tasks from this job array to 8.
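To check what a step-based specification expands to, you can reproduce the index list with seq; e.g., --array=0-15:4 selects every fourth index from 0 through 15:

```shell
# Indices selected by --array=0-15:4 (start 0, step 4, upper bound 15)
indices=$(seq -s ' ' 0 4 15)
echo "${indices}"   # prints: 0 4 8 12
```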
Within the job you can read the environment variables SLURM_ARRAY_JOB_ID and SLURM_ARRAY_TASK_ID, which are set to the job ID of the array and to the individual index of each task, respectively.

Within an array job, you can use %a and %A in addition to %j and %N to make the output file name specific to the job:

- %A will be replaced by the value of SLURM_ARRAY_JOB_ID
- %a will be replaced by the value of SLURM_ARRAY_TASK_ID
Job file using job arrays
#!/bin/bash
#SBATCH --array=0-9
#SBATCH --output=arraytest-%A_%a.out
#SBATCH --error=arraytest-%A_%a.err
#SBATCH --ntasks=864
#SBATCH --time=08:00:00
#SBATCH --job-name=Science1
#SBATCH --mail-type=end
#SBATCH --mail-user=<your.email>@tu-dresden.de
echo "Hi, I am step $SLURM_ARRAY_TASK_ID in this array job $SLURM_ARRAY_JOB_ID"
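A common use of SLURM_ARRAY_TASK_ID is to select a different input file per array task. A hypothetical sketch (the naming scheme input_<N>.dat is invented for illustration; the fallback of 0 only applies when testing outside a job):

```shell
# Map the array task ID to a per-task input file (hypothetical naming).
SLURM_ARRAY_TASK_ID=${SLURM_ARRAY_TASK_ID:-0}   # set by Slurm inside the job
input_file="input_${SLURM_ARRAY_TASK_ID}.dat"
echo "${input_file}"
```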
Note

If you submit a large number of jobs doing heavy I/O in the Lustre filesystems, you should limit the number of your simultaneously running jobs with a second parameter like:

#SBATCH --array=1-100000%100

Please read the Slurm documentation at https://slurm.schedmd.com/sbatch.html for further details.
Chain Jobs

You can use chain jobs to create dependencies between jobs. This is often useful if a job relies on the result of one or more preceding jobs. Chain jobs can also be used to split a long-running job exceeding the batch queue limits into parts and chain these parts. Slurm has an option -d, --dependency=<dependency_list> that allows you to specify that a job is only allowed to start if another job has finished.

In the following, we provide two examples of scripts that submit chain jobs.

Scaling experiment using chain jobs

This script submits the very same job file myjob.sh four times, with each job executed only after the previous one has finished. The number of tasks is increased from job to job, making this submit script a good starting point for (strong) scaling experiments.
#!/bin/bash
task_numbers="1 2 4 8"
dependency=""
job_file="myjob.sh"
for tasks in ${task_numbers} ; do
job_cmd="sbatch --ntasks=${tasks}"
if [ -n "${dependency}" ] ; then
job_cmd="${job_cmd} --dependency=afterany:${dependency}"
fi
job_cmd="${job_cmd} ${job_file}"
echo -n "Running command: ${job_cmd} "
out="$(${job_cmd})"
echo "Result: ${out}"
dependency=$(echo "${out}" | awk '{print $4}')
done
The output looks like:

marie@login$ sh submit_scaling.sh
Running command: sbatch --ntasks=1 myjob.sh Result: Submitted batch job 2963822
Running command: sbatch --ntasks=2 --dependency=afterany:2963822 myjob.sh Result: Submitted batch job 2963823
Running command: sbatch --ntasks=4 --dependency=afterany:2963823 myjob.sh Result: Submitted batch job 2963824
Running command: sbatch --ntasks=8 --dependency=afterany:2963824 myjob.sh Result: Submitted batch job 2963825
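The dependency chaining works because sbatch prints a confirmation line of the form "Submitted batch job <id>", and awk '{print $4}' picks out the fourth whitespace-separated field, i.e. the job ID:

```shell
# Extract the job ID (4th field) from sbatch's confirmation line,
# exactly as the submit script above does.
out="Submitted batch job 2963822"
jobid=$(echo "${out}" | awk '{print $4}')
echo "${jobid}"   # prints: 2963822
```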
Example to submit job chain via script

This script submits three different job files, which will be executed one after the other. Of course, the dependency conditions can be adapted.
#!/bin/bash
declare -a job_names=("jobfile_a.sh" "jobfile_b.sh" "jobfile_c.sh")
dependency=""
arraylength=${#job_names[@]}
for (( i=0; i<arraylength; i++ )) ; do
job_nr=$((i + 1))
echo "Job ${job_nr}/${arraylength}: ${job_names[$i]}"
if [ -n "${dependency}" ] ; then
echo "Dependency: after job ${dependency}"
dependency="--dependency=afterany:${dependency}"
fi
job="sbatch ${dependency} ${job_names[$i]}"
out=$(${job})
dependency=$(echo "${out}" | awk '{print $4}')
done
The output looks like:
marie@login$ sh submit_job_chains.sh
Job 1/3: jobfile_a.sh
Job 2/3: jobfile_b.sh
Dependency: after job 2963708
Job 3/3: jobfile_c.sh
Dependency: after job 2963709
Requesting GPUs

Examples of jobs that require the use of GPUs can be found in the Job Examples with GPU section.