SLURM backends¶
NOTE: SlurmConfig objects are created internally in fractal-server, and they are not meant to be initialized by the user; the same holds for SlurmConfig attributes (e.g. mem_per_task_MB), which are not meant to be part of the FRACTAL_SLURM_CONFIG_FILE JSON file (details on the expected file content are defined in the SlurmConfigFile model).
SLURM configuration¶
The logic for setting up the SLURM configuration of a given WorkflowTask is implemented in the slurm_config.py submodule.
The different sources for SLURM configuration options (like partition, cpus_per_task, ...) are:
- All attributes that are explicitly set in the WorkflowTask.meta dictionary attribute take highest priority;
- Next priority goes to all attributes that are explicitly set in the WorkflowTask.task.meta dictionary attribute;
- Lowest-priority (that is, default) values come from the configuration in FRACTAL_SLURM_CONFIG_FILE.
Example¶
The configuration file could be the one defined here, while a certain WorkflowTask
could have
workflow_task.meta = {"cpus_per_task": 3}
workflow_task.task.meta = {"cpus_per_task": 2, "mem": "10G"}
In this case, the SLURM configuration for this WorkflowTask will correspond to
partition=main
cpus_per_task=3
mem=10G
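The priority rules above can be sketched as a simple dictionary merge. This is a minimal illustration, not the actual logic in slurm_config.py: the function name resolve_slurm_options is hypothetical, and the defaults dictionary only assumes that the configuration file sets partition to main.

```python
def resolve_slurm_options(config_file_defaults, task_meta, wftask_meta):
    """Merge option sources, from lowest to highest priority.

    Hypothetical helper: it only illustrates the override order
    described in this section.
    """
    resolved = dict(config_file_defaults)  # lowest priority: file defaults
    resolved.update(task_meta or {})       # WorkflowTask.task.meta overrides defaults
    resolved.update(wftask_meta or {})     # WorkflowTask.meta overrides everything
    return resolved


# Assumed defaults, standing in for the content of FRACTAL_SLURM_CONFIG_FILE
defaults = {"partition": "main"}
task_meta = {"cpus_per_task": 2, "mem": "10G"}
wftask_meta = {"cpus_per_task": 3}

options = resolve_slurm_options(defaults, task_meta, wftask_meta)
for key, value in options.items():
    print(f"{key}={value}")
```

Here cpus_per_task takes the value from WorkflowTask.meta, mem comes from WorkflowTask.task.meta because nothing overrides it, and partition falls back to the configuration-file default.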
Exporting environment variables¶
The fractal-server admin may need to set some global variables that need to be included in all SLURM submission scripts; this can be achieved via the extra_lines field in the SLURM configuration file, for instance as in
{
  "default_slurm_config": {
    "partition": "main",
    "extra_lines": [
      "export SOMEVARIABLE=123",
      "export ANOTHERVARIABLE=ABC"
    ]
  }
}
In another use case, the value of a variable depends on the user who runs a certain task. A relevant example is that user A (who will run the task via SLURM) needs to define the cache-directory paths for some libraries they use, and those must be paths where user A can write. This use case is also supported in the specs of the fractal-server SLURM configuration file.
If this file includes a block like
{
  ...
  "user_local_exports": {
    "LIBRARY_1_CACHE_DIR": "somewhere/library_1",
    "LIBRARY_2_FILE": "somewhere/else/library_2.json"
  }
}
then the SLURM submission script will include lines like
...
export LIBRARY_1_CACHE_DIR=/my/cache/somewhere/library_1
export LIBRARY_2_FILE=/my/cache/somewhere/else/library_2.json
...
Note that the paths in the values of user_local_exports are interpreted as relative to a base directory which is user-specific (for instance /my/cache/, in the example above), and which is defined in the User.settings.cache_dir attribute.
Also note that in this case fractal-server only compiles the configuration options into lines of the SLURM submission script, without performing any check on the validity of the given paths.
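As a rough illustration of how user_local_exports entries could be turned into export lines, the sketch below joins each relative path with a user-specific base directory. The function name build_export_lines is hypothetical and the real fractal-server implementation may differ; like the behavior described above, the sketch performs no validity check on the resulting paths.

```python
import posixpath


def build_export_lines(user_local_exports, cache_dir):
    """Return one `export NAME=path` line per entry.

    Hypothetical helper: each value is treated as a path relative to
    `cache_dir` (the user-specific base directory). No path validation
    is performed.
    """
    lines = []
    for name, relative_path in user_local_exports.items():
        full_path = posixpath.join(cache_dir, relative_path)
        lines.append(f"export {name}={full_path}")
    return lines


export_lines = build_export_lines(
    {
        "LIBRARY_1_CACHE_DIR": "somewhere/library_1",
        "LIBRARY_2_FILE": "somewhere/else/library_2.json",
    },
    cache_dir="/my/cache/",
)
print("\n".join(export_lines))
```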
SLURM batching¶
The SLURM backend in fractal-server may combine multiple tasks in the same SLURM job (also known as batching), in order to reduce the total number of SLURM jobs that are submitted. This is especially relevant for clusters with constraints on the number of jobs that a user is allowed to submit over a certain timespan.
The logic for handling the batching parameters (that is, how many tasks can be combined in the same SLURM job, and how many of them can run in parallel) is implemented in this submodule.
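To make the idea of batching concrete, here is a deliberately simplified sketch that splits a list of tasks into groups, one group per SLURM job. The function make_batches and the tasks_per_job parameter are illustrative names only; the actual heuristics in fractal-server, including how many tasks run in parallel within a job, are not modeled here.

```python
def make_batches(items, tasks_per_job):
    """Split `items` into consecutive groups of at most `tasks_per_job`.

    Hypothetical helper: each returned group stands for the tasks that
    would be combined in a single SLURM job.
    """
    if tasks_per_job < 1:
        raise ValueError("tasks_per_job must be at least 1")
    return [
        items[start:start + tasks_per_job]
        for start in range(0, len(items), tasks_per_job)
    ]


components = [f"image_{i}" for i in range(7)]
batches = make_batches(components, tasks_per_job=3)
print(f"{len(components)} tasks -> {len(batches)} SLURM jobs")
```

With seven tasks and at most three tasks per job, three jobs are submitted instead of seven.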
User impersonation¶
sudo-based impersonation¶
The user who runs fractal-server must have sufficient privileges to run some commands via sudo -u, so that it can impersonate other users of the SLURM cluster without any password. The required commands include sbatch, scancel, cat, ls and mkdir. An example of how to achieve this is to add this block to the sudoers file:
Runas_Alias FRACTAL_IMPERSONATE_USERS = fractal, user1, user2, user3
Cmnd_Alias FRACTAL_CMD = /usr/bin/sbatch, /usr/bin/scancel, /usr/bin/cat, /usr/bin/ls, /usr/bin/mkdir
fractal ALL=(FRACTAL_IMPERSONATE_USERS) NOPASSWD:FRACTAL_CMD
Here, fractal is the user running fractal-server, and {user1,user2,user3} are the users who can be impersonated. Note that one could also grant fractal the option of impersonating a whole UNIX group, instead of listing users one by one.
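For illustration, a command run on behalf of another user could be assembled as in the sketch below. The helper build_sudo_command and the script path are hypothetical, and this is not necessarily how fractal-server builds its commands. The -n flag makes sudo fail instead of prompting, which is a convenient way to detect a missing NOPASSWD rule.

```python
import shlex


def build_sudo_command(user, command):
    """Prefix `command` (a list of arguments) with `sudo -n -u <user>`.

    Hypothetical helper: the command is only built here, not executed.
    """
    return ["sudo", "-n", "-u", user, *command]


# Placeholder script path, used only for this example
cmd = build_sudo_command("user1", ["sbatch", "/path/to/submit.sh"])
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run without shell=True, which avoids shell-quoting issues for arguments that contain spaces.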
SSH-based impersonation¶
In this scenario, one or more service users exist on the SLURM cluster, and they are the ones running all jobs. The user settings for each Fractal user determine which service user will be impersonated (through SSH) when connecting to the cluster to run jobs.