
Partition Administrators

By default, only the super administrator can perform administrative operations on the cluster. In large environments, a single super administrator may not be able to handle every administrative task, so the cluster supports partition administrators. A partition administrator is a user who can perform administrative operations on a specific partition but cannot administer the cluster as a whole. Currently, a partition administrator can perform the following operations on a partition:

  • Cancel jobs in the partition.
  • Set whether the partition can accept new jobs (DRAIN/UP).

Supported Versions

10.61 and later

Usage

  • Partition administrators are configured with the Admins parameter of the partition. The parameter accepts one or more Linux usernames, separated by commas (,).
  • Users with partition administrator privileges can use scancel to cancel other users' jobs in the corresponding partition.
  • Users with partition administrator privileges can also use the bkill and qdel wrapper commands to cancel other users' jobs in the corresponding partition.
  • Users with partition administrator privileges can use scontrol update partition=<partition_name> state=<UP/DRAIN> to set whether the partition can accept new jobs.
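As a minimal sketch of the last item, the helper below builds the scontrol command for a state change and rejects anything other than UP or DRAIN. The function name set_partition_state is hypothetical and not part of the scheduler. It only prints the command (a dry run), so the output can be reviewed before being run on a real cluster.

```shell
# Dry-run sketch: prints the command a partition administrator would run.
# set_partition_state is a hypothetical helper, not part of the scheduler.
set_partition_state() {
  partition="$1"
  state="$2"
  case "$state" in
    UP|DRAIN) ;;
    *) echo "state must be UP or DRAIN, got: $state" >&2; return 1 ;;
  esac
  # Only the state is updated; no other partition options are included.
  echo "scontrol update partition=${partition} state=${state}"
}

set_partition_state compute DRAIN   # stop accepting new jobs
set_partition_state compute UP      # accept new jobs again
```

Keeping the state as the only option in the command means the helper never mixes a state change with other partition updates.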

Example

Assume there is a user named admin. To make this user the administrator of the compute partition, configure the partition as follows:

PartitionName=compute Nodes=compute[1-3] ... Admins=admin

Warning

  • Partition administrators can only perform management operations on their partitions, not on the entire cluster.
  • If a partition administrator is configured incorrectly or the username cannot be resolved, the cluster controller will fail to start.
  • When using scontrol, avoid updating other partition options at the same time; otherwise the state update will fail.
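
Because an unresolvable username prevents the controller from starting, it may be worth checking the Admins value before restarting the controller. The sketch below is one way to do that, assuming the controller host resolves users through the same name service that the standard id command uses. check_admins is a hypothetical helper, not a tool shipped with the scheduler.

```shell
# Hypothetical pre-check: verify that every name in a comma-separated
# Admins value resolves to a local or directory user on this host.
check_admins() {
  admins="$1"
  missing=0
  old_ifs="$IFS"
  IFS=','
  for user in $admins; do
    if ! id -u "$user" >/dev/null 2>&1; then
      echo "unresolvable user in Admins: $user" >&2
      missing=1
    fi
  done
  IFS="$old_ifs"
  return "$missing"
}

# Example: check the value intended for Admins= before editing the configuration.
check_admins "root" && echo "all partition administrators resolve"
```

Run the check on the host where the cluster controller runs, since that is where the usernames must resolve.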