* Minor updates Nov 2019.
@@ -1,7 +1,10 @@
SLURM ACCOUNTING (sacct)
========================

CAVEAT:
Created: April 2019<br>
Updated: November 2019

**CAVEAT:**
This document was originally developed by referencing SLURM
18.08.1 used on Turing.
I also tried to consult the newer version (master branch
@@ -11,6 +14,12 @@ incompatible with this version.
Please take this document with a grain of salt, and always consult the
manual pages, source code, etc. in case of doubt.

*Update 2019-11-06*:
The SLURM man page now contains a description of the accounting fields.
Please look at
<https://slurm.schedmd.com/sacct.html#lbAF>.



UNDERSTANDING SLURM ACCOUNTING FIELDS
-------------------------------------
@@ -21,7 +30,10 @@ SLURM accounting can produce very many fields.
The "cooked" job ID. Please see the discussion below.

`JobIDRaw`:
The "raw" job ID. Please see the discussion below.
The "raw" job ID.
In the vast majority of cases, the `JobIDRaw` field is identical to `JobID`
except in the case of array jobs.
Please see the discussion below.

`TimelimitRaw`:
The raw value of the time limit, in minutes.
@@ -76,7 +88,7 @@ A `JOBSTEP` can have several subtypes:
* `SLURM_BATCH_SCRIPT`, in which case JobIDRaw will obtain the `.batch` suffix.
* `SLURM_EXTERN_CONT`, in which case JobIDRaw will obtain the `.extern` suffix.
Apparently, this is meant to indicate "external" type of job steps,
including.
described further below.
* many others; but in this case, it will print JobIDRaw in `[0-9]+\.[0-9]+`
pattern
* Other types (usually it will have index numbers like 0, 1, 2, ...)
@@ -87,7 +99,7 @@ A `JOBSTEP` can have several subtypes:
A "vanilla" job entry corresponds to a single job submitted by a user to SLURM.
This will not be a job array.

* Characteristics : `JobID ~ /^[0-9]+$/`.
* Regexp match : `JobID ~ /^[0-9]+$/`.


#### Array Job
@@ -95,7 +107,7 @@ This will not be a job array.
An "array" job entry corresponds to a single job as part of a job
array submitted by a user to SLURM.

* Characteristics : `JobID ~ /^[0-9]+_[0-9]+$/`.
* Regexp match : `JobID ~ /^[0-9]+_[0-9]+$/`.

The Job ID contains two numbers separated by an underscore.
The number before the underscore refers to the job ID as reported by
@@ -114,7 +126,7 @@ square brackets around the job suffix:
A heterogeneous job entry corresponds to a part of a heterogeneous job
submitted by a user to SLURM.

* Characteristics : `JobID ~ /^[0-9]+\+[0-9]+$/`.
* Regexp match: `JobID ~ /^[0-9]+\+[0-9]+$/`.

The Job ID contains two numbers separated by a plus sign.
The number before the plus sign refers to the job ID as reported by
@@ -130,7 +142,7 @@ sbatch) when more than one CPU cores were requested by the job.
Characteristics of SLURM_BATCH_SCRIPT accounting records:

* JobIDRaw =~ /^[0-9]+\.batch$/
* Regexp match: `JobIDRaw ~ /^[0-9]+\.batch$/`

* The record does NOT have user ID (field `User`)
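
The regular expressions quoted in this document can be collected into one small classifier. What follows is only a sketch in Python, not part of SLURM. The function name `classify_job_id` and the category labels are hypothetical, the `.extern` pattern is an assumption formed by analogy with the `.batch` one, and the patterns reflect the text as written for SLURM 18.08.1, so they may not cover every form that `sacct` can print.

```python
import re

# Patterns taken from the text; they are checked in order, first match wins.
# The labels are illustrative names, not SLURM terminology.
_PATTERNS = [
    ("vanilla", re.compile(r"^[0-9]+$")),
    ("array", re.compile(r"^[0-9]+_[0-9]+$")),
    ("heterogeneous", re.compile(r"^[0-9]+\+[0-9]+$")),
    ("batch_step", re.compile(r"^[0-9]+\.batch$")),
    # Assumed by analogy with the ".batch" pattern above.
    ("extern_step", re.compile(r"^[0-9]+\.extern$")),
    ("numbered_step", re.compile(r"^[0-9]+\.[0-9]+$")),
]


def classify_job_id(job_id: str) -> str:
    """Return a label for a JobID or JobIDRaw string, or "unknown"."""
    for label, pattern in _PATTERNS:
        if pattern.match(job_id):
            return label
    return "unknown"
```

For example, `classify_job_id("12345_7")` returns `"array"`, while a bracketed array summary such as `"12345_[1-4]"` falls through to `"unknown"` because none of the quoted patterns covers it.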
@@ -177,7 +189,7 @@ that may be only when a specific "job completion" task is specified.
#### Questions & (Possible) Answers

* Why is there a separate "NNNNN.batch" record?
It is perhaps when the job is multi-node.
Perhaps, this record was made when the job is multi-node.
It appears to me that the ".batch" record is for accounting for the batch script
itself (which will run only on node #0 of the allocated resources).
@@ -192,7 +204,7 @@ This is what I found after this exploration:
> We only need to include accounting records where the `JobIDRaw` field
> contains only whole integers (i.e. matching regex `^[0-9]+$`).

> Further,
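
The `JobIDRaw` filter quoted above can be sketched in Python. This assumes the accounting records were exported in pipe-delimited form with a header row (for example with `sacct --parsable2 --format=JobID,JobIDRaw,User,State`). The function name `select_whole_jobs` and the sample rows are made up for illustration and do not come from a real cluster.

```python
import csv
import io
import re

# The pattern quoted in the text: JobIDRaw consists of digits only.
WHOLE_JOB = re.compile(r"^[0-9]+$")


def select_whole_jobs(parsable_text: str) -> list[dict[str, str]]:
    """Keep only the rows whose JobIDRaw field is a plain integer."""
    reader = csv.DictReader(io.StringIO(parsable_text), delimiter="|")
    return [row for row in reader if WHOLE_JOB.match(row["JobIDRaw"])]


# Made-up sample in the assumed pipe-delimited layout.
SAMPLE = """JobID|JobIDRaw|User|State
1001|1001|alice|COMPLETED
1001.batch|1001.batch||COMPLETED
1001.extern|1001.extern||COMPLETED
1002_1|1003|bob|FAILED
1002_1.batch|1003.batch||FAILED
"""

kept = select_whole_jobs(SAMPLE)
print([row["JobID"] for row in kept])  # ['1001', '1002_1']
```

Filtering on `JobIDRaw` rather than `JobID` keeps the array-job entry (`1002_1` in the sample) while dropping the `.batch` and `.extern` step records.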

## References