* Minor updates Nov 2019.
@@ -1,7 +1,10 @@
|
|||||||
SLURM ACCOUNTING (sacct)
|
SLURM ACCOUNTING (sacct)
|
||||||
========================
|
========================
|
||||||
|
|
||||||
CAVEAT:
|
Created: April 2019<br>
|
||||||
|
Updated: November 2019
|
||||||
|
|
||||||
|
**CAVEAT:**
|
||||||
This document was originally developed by referencing SLURM
|
This document was originally developed by referencing SLURM
|
||||||
18.08.1 used on Turing.
|
18.08.1 used on Turing.
|
||||||
I also tried to consult the newer version (master branch
|
I also tried to consult the newer version (master branch
|
||||||
@@ -11,6 +14,12 @@ incompatible with this version.
|
|||||||
Please take this document with a grain of salt, and always consult the
|
Please take this document with a grain of salt, and always consult the
|
||||||
manual pages, source code, etc. in case of doubt.
|
manual pages, source code, etc. in case of doubt.
|
||||||
|
|
||||||
|
*Update 2019-11-06*:
|
||||||
|
The SLURM man page now contains a description of the accounting fields.
|
||||||
|
Please look at
|
||||||
|
<https://slurm.schedmd.com/sacct.html#lbAF>.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
UNDERSTANDING SLURM ACCOUNTING FIELDS
|
UNDERSTANDING SLURM ACCOUNTING FIELDS
|
||||||
-------------------------------------
|
-------------------------------------
|
||||||
@@ -21,7 +30,10 @@ SLURM accounting can produce very many fields.
|
|||||||
The "cooked" job ID. Please see the discussion below.
|
The "cooked" job ID. Please see the discussion below.
|
||||||
|
|
||||||
`JobIDRaw`:
|
`JobIDRaw`:
|
||||||
The "raw" job ID. Please see the discussion below.
|
The "raw" job ID.
|
||||||
|
In the vast majority of cases, the `JobIDRaw` field is identical to `JobID`
|
||||||
|
except in the case of array jobs.
|
||||||
|
Please see the discussion below.
|
||||||
|
|
||||||
`TimelimitRaw`:
|
`TimelimitRaw`:
|
||||||
The raw value of time limit, in minutes.
|
The raw value of time limit, in minutes.
|
||||||
@@ -76,7 +88,7 @@ A `JOBSTEP` can have several subtypes:
|
|||||||
* `SLURM_BATCH_SCRIPT`, in which case JobIDRaw will obtain the `.batch` suffix.
|
* `SLURM_BATCH_SCRIPT`, in which case JobIDRaw will obtain the `.batch` suffix.
|
||||||
* `SLURM_EXTERN_CONT`, in which case JobIDRaw will obtain the `.extern` suffix.
|
* `SLURM_EXTERN_CONT`, in which case JobIDRaw will obtain the `.extern` suffix.
|
||||||
Apparently, this is meant to indicate "external" type of job steps,
|
Apparently, this is meant to indicate "external" type of job steps,
|
||||||
including.
|
described further below.
|
||||||
* many others; but in this case, it will print JobIDRaw in `[0-9]+\.[0-9]+`
|
* many others; but in this case, it will print JobIDRaw in `[0-9]+\.[0-9]+`
|
||||||
pattern
|
pattern
|
||||||
* Other types (usually it will have index numbers like 0, 1, 2, ...)
|
* Other types (usually it will have index numbers like 0, 1, 2, ...)
|
||||||
@@ -87,7 +99,7 @@ A `JOBSTEP` can have several subtypes:
|
|||||||
A "vanilla" job entry corresponds to a single job submitted by a user to SLURM.
|
A "vanilla" job entry corresponds to a single job submitted by a user to SLURM.
|
||||||
This will not be a job array.
|
This will not be a job array.
|
||||||
|
|
||||||
* Characteristics : `JobID ~ /^[0-9]+$/`.
|
* Regexp match: `JobID ~ /^[0-9]+$/`.
|
||||||
|
|
||||||
|
|
||||||
#### Array Job
|
#### Array Job
|
||||||
@@ -95,7 +107,7 @@ This will not be a job array.
|
|||||||
An "array" job entry corresponds to a single job as part of a job
|
An "array" job entry corresponds to a single job as part of a job
|
||||||
array submitted by a user to SLURM.
|
array submitted by a user to SLURM.
|
||||||
|
|
||||||
* Characteristics : `JobID ~ /^[0-9]+_[0-9]+$/`.
|
* Regexp match: `JobID ~ /^[0-9]+_[0-9]+$/`.
|
||||||
|
|
||||||
The Job ID contains two numbers separated by an underscore.
|
The Job ID contains two numbers separated by an underscore.
|
||||||
The number before the underscore refers to the job ID as reported by
|
The number before the underscore refers to the job ID as reported by
|
||||||
@@ -114,7 +126,7 @@ square brackets around the job suffix:
|
|||||||
A heterogeneous job entry corresponds to a part of a heterogeneous job
|
A heterogeneous job entry corresponds to a part of a heterogeneous job
|
||||||
submitted by a user to SLURM.
|
submitted by a user to SLURM.
|
||||||
|
|
||||||
* Characteristics : `JobID ~ /^[0-9]+\+[0-9]+$/`.
|
* Regexp match: `JobID ~ /^[0-9]+\+[0-9]+$/`.
|
||||||
|
|
||||||
The Job ID contains two numbers separated by a plus sign.
|
The Job ID contains two numbers separated by a plus sign.
|
||||||
The number before the plus sign refers to the job ID as reported by
|
The number before the plus sign refers to the job ID as reported by
|
||||||
@@ -130,7 +142,7 @@ sbatch) when more than one CPU cores were requested by the job.
|
|||||||
|
|
||||||
Characteristics of SLURM_BATCH_SCRIPT accounting records:
|
Characteristics of SLURM_BATCH_SCRIPT accounting records:
|
||||||
|
|
||||||
* JobIDRaw =~ /^[0-9]+\.batch$/
|
* Regexp match: `JobIDRaw ~ /^[0-9]+\.batch$/`
|
||||||
|
|
||||||
* The record does NOT have user ID (field `User`)
|
* The record does NOT have user ID (field `User`)
|
||||||
|
|
||||||
@@ -177,7 +189,7 @@ that may be only when a specific "job completion" task is specified.
|
|||||||
#### Questions & (Possible) Answers
|
#### Questions & (Possible) Answers
|
||||||
|
|
||||||
* Why is there a separate "NNNNN.batch" record?
|
* Why is there a separate "NNNNN.batch" record?
|
||||||
It is perhaps when the job is multi-node.
|
Perhaps this record is created when the job is multi-node.
|
||||||
It appears to me that the ".batch" record is for accounting the batch script
|
It appears to me that the ".batch" record is for accounting the batch script
|
||||||
itself (which will run only on node #0 of the allocated resources).
|
itself (which will run only on node #0 of the allocated resources).
|
||||||
|
|
||||||
@@ -192,7 +204,7 @@ This is what I found after this exploration:
|
|||||||
|
|
||||||
> We only need to include accounting records where the `JobIDRaw` field
|
> We only need to include accounting records where the `JobIDRaw` field
|
||||||
> contains only whole integers (i.e. matching regex `^[0-9]+$`).
|
> contains only whole integers (i.e. matching regex `^[0-9]+$`).
|
||||||
|
> Further,
|
||||||
|
|
||||||
|
|
||||||
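The regular expressions quoted in this document can be collected into a small classifier. The following is a minimal Python sketch, not part of SLURM. The function names `classify_job_id` and `is_whole_job`, the category labels, and the `.extern` pattern (written by analogy with the `.batch` one) are assumptions made for illustration. The document applies some patterns to `JobID` and others to `JobIDRaw`; choosing which field to pass in is left to the caller.

```python
import re

# Patterns taken from the regexps in this document. re.fullmatch() is used
# below, so the ^ and $ anchors are implied and omitted here.
# The labels are illustrative names, not SLURM terminology.
PATTERNS = [
    ("vanilla", re.compile(r"[0-9]+")),                 # JobID: plain job
    ("array", re.compile(r"[0-9]+_[0-9]+")),            # JobID: array task
    ("heterogeneous", re.compile(r"[0-9]+\+[0-9]+")),   # JobID: part of a het job
    ("batch_step", re.compile(r"[0-9]+\.batch")),       # JobIDRaw: batch script
    ("extern_step", re.compile(r"[0-9]+\.extern")),     # JobIDRaw: assumed by analogy
    ("numbered_step", re.compile(r"[0-9]+\.[0-9]+")),   # JobIDRaw: other job steps
]


def classify_job_id(job_id):
    """Return the label of the first pattern that matches the whole string."""
    for label, pattern in PATTERNS:
        if pattern.fullmatch(job_id):
            return label
    return "unknown"


def is_whole_job(job_id_raw):
    """True when JobIDRaw contains only digits, i.e. matches ^[0-9]+$."""
    return re.fullmatch(r"[0-9]+", job_id_raw) is not None
```

A caller that only wants one record per job could keep the rows for which `is_whole_job(row_job_id_raw)` is true, where `row_job_id_raw` stands for the `JobIDRaw` column of one `sacct` output row.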
## References
|
## References
|
||||||
|
|||||||