The preempt queue lets you run jobs outside the usual limits
on unused computing resources, without blocking access to those resources for jobs running
on the other queues.
This means that a job can be stopped (“preempted”) suddenly, at any moment, if a regular job needs the resources.
The job will be restarted when the resources become available.
Your code must therefore regularly back up its state (“checkpoint”), make sure the backup remains valid even if the job is stopped while it is being written, and be able to restart from a previous backup.
The execution time is limited to 3 days. All nodes are reachable. To restrict execution to particular nodes, use node constraints. To list all the available constraints, run:
sinfo -o "%.100N %.12c %.20R %.90f"|grep preempt
NODELIST CPUS PARTITION AVAIL_FEATURES
miriel[044-045,048,050-053,056-058,060,062-064,067-071,073,075-076,078-079,081,083-088] 24 preempt miriel,intel,haswell,infinipath
miriel[001-006,008-043] 24 preempt miriel,intel,haswell,omnipath,infinipath
zonda[01-21] 64 preempt amd,zonda
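As an illustration, a batch script restricted to the zonda nodes might look like the sketch below. It assumes that preempt is selected with --partition (as the PARTITION column above suggests) and that the site lets jobs ask for --requeue so they are put back in the queue after preemption. The job body is a placeholder, and the options actually required on this cluster may differ.

```shell
#!/bin/bash
# Sketch only: the partition name and the need for --requeue are assumptions.
#SBATCH --partition=preempt      # assumed way to select the preempt queue
#SBATCH --constraint=zonda       # any feature listed under AVAIL_FEATURES
#SBATCH --time=3-00:00:00        # 3 days, the limit stated above
#SBATCH --requeue                # ask Slurm to requeue the job if preempted

echo "Job ${SLURM_JOB_ID:-unknown} running on ${SLURM_JOB_NODELIST:-unknown}"
# The real application would be launched here, for example with srun.
```

Features can be combined in the constraint expression, for example --constraint="miriel&omnipath" to require both.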
Advice for backing up your application state (checkpointing)
- Use one dedicated function to back up the state and another to restore it from a backup.
- Write the backup to a temporary file (or to several files in a temporary directory), then rename the file or directory to its final name to stamp the backup. The rename is an atomic operation.
- If the application is stopped during the backup, the incomplete temporary backup is ignored and the previous stamped backup is used.
- When your application starts, it should first check whether a backup is available and, if so, restart from it.
- Backing up should typically be done after an MPI barrier, to make sure all processes are synchronized.
- Adapt the backup frequency to the time a backup takes (writing the data to disk). As a rule of thumb, an application should not run for more than 30 minutes to 1 hour without making a backup.
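The write-then-rename pattern and the restart check described in this list can be sketched in a few lines of shell. All names here (CKPT, save_checkpoint, restore_checkpoint, run_steps) are hypothetical, the "state" is reduced to a step counter, and the backup interval is counted in steps rather than in elapsed time. A real application would write its own data and would usually do this in its own language.

```shell
#!/bin/bash
# Minimal checkpoint/restart sketch; every name below is illustrative.

CKPT="${CKPT:-state.ckpt}"   # final ("stamped") backup file

save_checkpoint() {
    # Write to a temporary file next to the final one, so that the
    # rename stays on the same filesystem and remains atomic.
    local tmp
    tmp="$(mktemp "${CKPT}.tmp.XXXXXX")"
    printf '%s\n' "$1" > "$tmp"
    # Stamp the backup: a job killed before this line leaves the
    # previous stamped backup untouched.
    mv -f "$tmp" "$CKPT"
}

restore_checkpoint() {
    # Print the saved step, or 0 when no stamped backup exists yet.
    if [ -f "$CKPT" ]; then
        cat "$CKPT"
    else
        echo 0
    fi
}

run_steps() {
    local total="$1" every="$2" step
    step="$(restore_checkpoint)"      # resume from the last backup, if any
    while [ "$step" -lt "$total" ]; do
        step=$((step + 1))
        # ... one unit of real work would go here ...
        if [ $((step % every)) -eq 0 ]; then
            save_checkpoint "$step"
        fi
    done
    save_checkpoint "$step"
    echo "finished at step $step"
}

run_steps 10 3
```

If the script is killed and started again, run_steps picks up from the last stamped step instead of from 0. Leftover *.tmp.* files from an interrupted backup are never read and can be deleted at startup.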