⭐️squeue

Overview

squeue is the Slurm command for viewing the status of the job queue. It can:

- Display all pending and running jobs
- Filter, sort, and format the output in multiple ways
- Show detailed job information and state reasons

Common Options

Option	Description	Example
`-a, --all`	Show jobs in all partitions, including hidden ones	`squeue -a`
`-j, --jobs=job_id_list`	Show only the listed job IDs	`squeue -j 123,124`
`-u, --user=user_name(s)`	Filter by user	`squeue -u user1`
`-p, --partition=partition(s)`	Filter by partition	`squeue -p gpu`
`-t, --states=states`	Filter by state	`squeue -t RUNNING`
`-i, --iterate=seconds`	Refresh interval	`squeue -i 5` (refresh every 5 seconds)
`-o, --format=format`	Custom output	`squeue -o "%i %P %j %u %t %M %D %R"`
`-S, --sort=fields`	Sort output	`squeue -S "-t"` (sort by job state, descending)
`--start`	Show expected start time of pending jobs	`squeue --start`
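
The options in the table can be combined in a single call. The following is a minimal sketch, not a recommended standard: the helper name `my_pending` and the `gpu` default are assumptions made for illustration, and the sort key `-p` (priority, descending) uses the same field letters as `-o`.

```shell
# Hypothetical helper: list the current user's pending jobs in one partition,
# highest priority first. The partition name is the first argument.
my_pending() {
    partition="${1:-gpu}"   # "gpu" is only an example default
    squeue -u "${USER:-$(id -un)}" -p "$partition" -t PENDING -S "-p"
}
```

On a cluster, `my_pending compute` would then be equivalent to typing the full `squeue` command with those four options.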

Examples

Use squeue to view job status.

# squeue -a
JOBID PARTITION     NAME     USER ST       TIME            NODES NODELIST(REASON)
8     compute       sleep    root  R       8-02:07:58      2     ip-10-10-2-[70,80]

Here JOBID is the job ID, PARTITION is the partition the job was submitted to, NAME is the job name, USER is the submitting user, ST is the job state, TIME is the elapsed runtime, NODES is the number of allocated nodes, and NODELIST(REASON) is the list of nodes running the job. The -a option shows jobs in all partitions.

The -o option selects which fields squeue displays. For example, to show the submit time (%V) and working directory (%Z) in addition to the default fields:

squeue -a -o "%.18i %.9P %.8j %.8u %.2t %20V %.10M %.6D %R %Z"
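
When the output is consumed by a script, it can help to suppress the header with `-h` (`--noheader`) and put an explicit delimiter in the format string. The sketch below assumes a `|` delimiter and a hypothetical helper named `count_by_state`. It reads squeue-style lines from standard input, so it can be tried without a cluster.

```shell
# Count jobs per state from "JOBID|STATE|NODELIST(REASON)" lines on stdin.
count_by_state() {
    awk -F'|' 'NF >= 2 { n[$2]++ } END { for (s in n) print s, n[s] }' |
        LC_ALL=C sort
}

# Typical use on a cluster (requires Slurm):
#   squeue -a -h -o "%i|%t|%R" | count_by_state
```

Each output line is a compact state code followed by the number of jobs in that state.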

Output Field Details

JOBID: Job ID.
PARTITION: Partition name.
NAME: Job name.
USER: Username.
ST: State.
- PD: pending, PENDING.
- R: running, RUNNING.
- CA: cancelled, CANCELLED.
- CF: configuring, CONFIGURING.
- CG: completing, COMPLETING.
- CD: completed, COMPLETED.
- F: failed, FAILED.
- TO: timeout, TIMEOUT.
- NF: node failure, NODE_FAIL.
- SE: special exit state, SPECIAL_EXIT.
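TIME: Elapsed runtime, in days-hours:minutes:seconds.
NODES: Number of allocated nodes.
NODELIST(REASON): For pending or failed jobs, the reason for the current state, in parentheses; for all other jobs, the allocated node list. Possible reasons: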
- AssociationCpuLimit: The association's specified CPUs are in use; the job will run eventually.
- AssociationMaxJobsLimit: The association's max jobs limit has been reached; the job will run eventually.
- AssociationNodeLimit: The association's specified nodes are in use; the job will run eventually.
- AssociationJobLimit: The job's association has reached its maximum job count.
- AssociationResourceLimit: The job's association has reached a resource limit.
- AssociationTimeLimit: The job's association has reached its time limit.
- BadConstraints: The job has constraints that cannot be satisfied.
- BeginTime: The job's earliest start time has not been reached.
- Cleaning: The job was requeued and is still performing cleanup from the previous run.
- Dependency: The job is waiting for a dependent job to finish.
- FrontEndDown: No front-end node is available to run this job.
- InactiveLimit: The job has reached the system inactive limit.
- InvalidAccount: The job's account is invalid; cancel the job and resubmit it with a valid account.
- InvalidQOS: The job's QOS is invalid; cancel the job and resubmit it with a valid QOS.
- JobHeldAdmin: The job is held by the system administrator.
- JobHeldUser: The job is held by the user.
- JobLaunchFailure: The job could not be launched, possibly due to filesystem failures or invalid program names.
- Licenses: The job is waiting for the required licenses.
- NodeDown: The required nodes are down.
- NonZeroExitCode: The job ended with a non-zero exit code.
- PartitionDown: The required partition is in DOWN state.
- PartitionInactive: The required partition is in Inactive state.
- PartitionNodeLimit: The job's required nodes exceed the current partition limit.
- PartitionTimeLimit: The job's time limit exceeds the partition's current time limit.
- PartitionCpuLimit: The CPUs for the job's partition are already in use; the job will run eventually.
- PartitionMaxJobsLimit: The partition's max jobs limit has been reached; the job will run eventually.
- PartitionNodeLimit: The specified nodes in the job's partition are already in use; the job will run eventually.
- Priority: One or more higher-priority jobs exist for the partition or advanced reservation.
- Prolog: The job's PrologSlurmctld prolog is still running.
- QOSJobLimit: The job's QOS has reached its max jobs limit.
- QOSResourceLimit: The job's QOS has reached its max resource limit.
- QOSGrpCpuLimit: All CPUs for the job's QOS group are in use; the job will run eventually.
- QOSGrpMaxJobsLimit: The job's QOS group max jobs limit has been reached; the job will run eventually.
- QOSGrpNodeLimit: All nodes for the job's QOS group are in use; the job will run eventually.
- QOSTimeLimit: The job's QOS has reached its time limit.
- QOSUsageThreshold: The required QOS usage threshold was violated.
- ReqNodeNotAvail: The required nodes are not available, such as when nodes are down.
- Reservation: The job is waiting for its reserved resources to become available.
- Resources: The job will wait until the required resources are available.
- SystemFailure: Slurm system failure, such as filesystem or network failure.
- TimeLimit: The job exceeded its time limit.
- WaitingForScheduling: Waiting to be scheduled.
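
To see which of these reasons is holding back most of the queue, the reason column can be tallied. This is a rough sketch under stated assumptions: the helper name `pending_reasons` and the `|` delimiter are illustrative choices, and the input is expected in the form produced by `squeue -h -o "%t|%R"`.

```shell
# Read "STATE|NODELIST(REASON)" lines on stdin and print, for pending jobs
# only, how many jobs are waiting for each reason (most common first).
pending_reasons() {
    awk -F'|' '
        $1 == "PD" {
            r = $2
            gsub(/^\(|\)$/, "", r)   # strip the surrounding parentheses
            n[r]++
        }
        END { for (r in n) print n[r], r }
    ' | LC_ALL=C sort -k1,1nr -k2
}

# Typical use on a cluster (requires Slurm):
#   squeue -a -h -o "%t|%R" | pending_reasons
```

Running jobs are skipped because their NODELIST(REASON) column holds a node list rather than a reason.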
