Adaptive Scheduling

fsched (fsched-dev(FIXME)+) supports adaptive scheduling.

Purpose

Once the relevant parameters are set, fsched periodically checks the resource usage of each running job and adjusts the job's resource requests (CPU count and memory) based on the results.

warning
  • Adjusting the requested CPU count of running jobs may conflict with CPU pinning.
tip
  • Only single-node jobs are supported.
  • Adaptive scheduling takes effect only in partitions where the relevant parameters are configured.
  • Every adjustment to a job's resource requests is recorded in the OS logs on the head node (for example, /var/log/messages).
  • Currently, the memory average is derived from the job's instantaneous total memory usage.
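
As a rough illustration of the behavior described in this section, the sketch below decides whether a job's request changes on one periodic check. It is not fsched's implementation: the `Job` record, the `decide` function, and the rule that the new request simply equals the observed usage are assumptions made for illustration. Only the mode names (INC_DEC, INC, DEC, described under Partition Configuration), the single-node restriction, and the minimum-runtime condition come from this page.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical record; field names are illustrative, not fsched's.
    num_nodes: int
    elapsed_s: int
    requested: int  # current request (CPUs, or memory in MB)
    observed: int   # usage measured at this check


def decide(job: Job, mode: str, min_elapsed_s: int = 0) -> int:
    """Return the request after one check (unchanged if nothing applies)."""
    if job.num_nodes != 1:             # only single-node jobs are supported
        return job.requested
    if job.elapsed_s < min_elapsed_s:  # assumed boundary: not run long enough
        return job.requested
    grow = job.observed > job.requested and mode in ("INC", "INC_DEC")
    shrink = job.observed < job.requested and mode in ("DEC", "INC_DEC")
    # Assumption: the new request is set to the observed usage.
    return job.observed if (grow or shrink) else job.requested


# A job that requested 1 CPU but uses 4 grows under INC_DEC.
print(decide(Job(1, 120, requested=1, observed=4), "INC_DEC", 60))
# A job that requested 8 CPUs but uses 4 is left alone under INC.
print(decide(Job(1, 120, requested=8, observed=4), "INC", 60))
```

Any mode value other than the three listed (including the default of 0) leaves the request unchanged in this sketch, which matches the "no adjustment" default.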

Cluster Configuration

In the cluster configuration, set SelectType to the select/cons_tres_ex plugin, which supports changing the resource requests of running jobs.

SelectType=select/cons_tres_ex

Partition Configuration

The following partition parameters are used for adaptive scheduling.

| Parameter | Description | Value Type and Range | Default |
|---|---|---|---|
| AdaptSchedInterval | Check interval | Integer; default unit is minutes; range 1 to 180 minutes | Invalid by default, so no periodic checks run |
| AdaptMinJobElapsed | Minimum job runtime | Integer; default unit is seconds; minimum 1 second | Invalid by default, so no minimum job runtime applies |
| AdaptCpuMode | CPU adjustment mode | INC_DEC (increase or decrease), INC (increase only), DEC (decrease only) | 0, meaning no CPU adjustment |
| AdaptMemMode | Memory adjustment mode | INC_DEC (increase or decrease), INC (increase only), DEC (decrease only) | 0, meaning no memory adjustment |
| AdaptMemBasis | Memory adjustment basis | MAX (based on maximum), AVE (based on average) | MAX, meaning memory is adjusted based on the maximum |
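
The ranges in the table above can be checked before a configuration is applied. The following is a minimal sketch, not an fsched tool: the `validate` function and the dictionary input are hypothetical, and it assumes values are written as plain integers without a unit suffix.

```python
MODES = {"INC_DEC", "INC", "DEC"}
BASES = {"MAX", "AVE"}


def validate(params: dict) -> list:
    """Return a list of problems found in a partition's adaptive settings."""
    problems = []
    interval = params.get("AdaptSchedInterval")
    if interval is None:
        problems.append("AdaptSchedInterval unset: no periodic checks will run")
    elif not 1 <= int(interval) <= 180:
        problems.append("AdaptSchedInterval must be 1 to 180 (minutes)")
    elapsed = params.get("AdaptMinJobElapsed")
    if elapsed is not None and int(elapsed) < 1:
        problems.append("AdaptMinJobElapsed must be at least 1 (seconds)")
    for key in ("AdaptCpuMode", "AdaptMemMode"):
        mode = params.get(key)
        if mode is not None and mode not in MODES:
            problems.append(f"{key} must be one of {sorted(MODES)}")
    basis = params.get("AdaptMemBasis", "MAX")  # MAX is the documented default
    if basis not in BASES:
        problems.append(f"AdaptMemBasis must be one of {sorted(BASES)}")
    return problems


example = {
    "AdaptSchedInterval": "10",
    "AdaptMinJobElapsed": "60",
    "AdaptCpuMode": "INC_DEC",
    "AdaptMemMode": "INC_DEC",
    "AdaptMemBasis": "MAX",
}
print(validate(example))  # an empty list means no problems were found
```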

Example

  1. Modify the cluster configuration.

    SelectType=select/cons_tres_ex
  2. Modify the partition configuration for the partition where adaptive scheduling is enabled.

    AdaptSchedInterval=10
    AdaptMinJobElapsed=60
    AdaptCpuMode=INC_DEC
    AdaptMemMode=INC_DEC
    AdaptMemBasis=MAX
  3. After the cluster reconfiguration succeeds, run a job in a partition configured with adaptive scheduling, then use scontrol show job to view the job's current MinCPUsNode and MinMemoryNode.

    [jj@centos7-16c-1 ~]$ srun -p partition-9BWXR -w centos7-16c-2 stress --cpu 3 --vm 1 --vm-bytes 30M --vm-keep -t 1200s&
    [jj@centos7-16c-1 ~]$ squeue
    JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
    22 partition stress jj R 0:05 1 centos7-16c-2
    [jj@centos7-16c-1 ~]$ scontrol show job 22
    JobId=22 JobName=stress
    UserId=jj(2001) GroupId=jj(2004) MCS_label=N/A
    Priority=4294901759 Nice=0 Account=_fsched_all QOS=fastone-1-_fsched_all-partition-9bwxr WCKey=*
    JobState=RUNNING Reason=None Dependency=(null)
    Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
    RunTime=00:00:14 TimeLimit=UNLIMITED TimeMin=N/A
    SubmitTime=2025-07-22T16:22:18 EligibleTime=2025-07-22T16:22:18
    AccrueTime=Unknown
    StartTime=2025-07-22T16:22:18 EndTime=Unknown Deadline=N/A
    SuspendTime=None SecsPreSuspend=0 LastSchedEval=2025-07-22T16:22:18
    Partition=partition-9BWXR AllocNode:Sid=centos7-16c-1:15590
    ReqNodeList=centos7-16c-2 ExcNodeList=(null)
    NodeList=centos7-16c-2
    BatchHost=centos7-16c-2
    NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
    TRES=cpu=1,mem=1M,node=1,billing=1
    Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
    MinCPUsNode=1 MinMemoryNode=1M MinTmpDiskNode=0
    Features=(null) DelayBoot=00:00:00
    OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
    Command=stress --cpu 3 --vm 1 --vm-bytes 30M --vm-keep -t 1200s
    WorkDir=/fastone/users/jj
    Power=
  4. Once the job has run longer than AdaptMinJobElapsed and the next AdaptSchedInterval check has passed, its requested resources are adjusted automatically. View MinCPUsNode and MinMemoryNode again; in this example they change from 1 CPU and 1M to 4 CPUs and 30M.

    [jj@centos7-16c-1 ~]$ scontrol show job 22
    JobId=22 JobName=stress
    UserId=jj(2001) GroupId=jj(2004) MCS_label=N/A
    Priority=4294901759 Nice=0 Account=_fsched_all QOS=fastone-1-_fsched_all-partition-9bwxr WCKey=*
    JobState=RUNNING Reason=None Dependency=(null)
    Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
    RunTime=00:07:22 TimeLimit=UNLIMITED TimeMin=N/A
    SubmitTime=2025-07-22T16:22:18 EligibleTime=2025-07-22T16:22:18
    AccrueTime=Unknown
    StartTime=2025-07-22T16:22:18 EndTime=Unknown Deadline=N/A
    SuspendTime=None SecsPreSuspend=0 LastSchedEval=2025-07-22T16:22:18
    Partition=partition-9BWXR AllocNode:Sid=centos7-16c-1:15590
    ReqNodeList=centos7-16c-2 ExcNodeList=(null)
    NodeList=centos7-16c-2
    BatchHost=centos7-16c-2
    NumNodes=1 NumCPUs=4 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
    TRES=cpu=4,mem=30M,node=1,billing=4
    Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
    MinCPUsNode=4 MinMemoryNode=30M MinTmpDiskNode=0
    Features=(null) DelayBoot=00:00:00
    OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
    Command=stress --cpu 3 --vm 1 --vm-bytes 30M --vm-keep -t 1200s
    WorkDir=/fastone/users/jj
    Power=
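  5. View the adjustment records in the OS logs on the head node. The sample output below comes from a different run (job 3), so its values differ from the example above.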

    [root@jjtest-head ~]# grep "resource adjust" /var/log/messages
    Nov 6 18:24:17 jjtest-head statesvc[1006657]: [2025-11-06 18:24:17.142] [statesvc] [info] start update and resource adjustment threads
    Nov 6 18:24:17 jjtest-head statesvc[1006657]: [2025-11-06 18:24:17.143] [statesvc] [info] resource adjustment thread started
    Nov 6 18:26:17 jjtest-head statesvc[1006657]: [2025-11-06 18:26:17.166] [statesvc] [info] _modify_job_resources: job 3 (user: root) resource adjusted: cpu=1->7, mem=1MB->30MB
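
The before-and-after comparison in the steps above can also be scripted. The sketch below parses text in the two formats shown in this example: the KEY=value output of scontrol show job and the "resource adjusted" log line. The function names are hypothetical, and the regular expression assumes the log format is exactly as in the sample line, which may differ between versions.

```python
import re


def job_fields(scontrol_text: str) -> dict:
    """Split `scontrol show job` output into a KEY -> value mapping.

    Values that contain spaces (such as Command) are cut at the first space.
    """
    fields = {}
    for token in scontrol_text.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


# Matches e.g. "job 3 (user: root) resource adjusted: cpu=1->7, mem=1MB->30MB"
ADJUST_RE = re.compile(
    r"job (\d+) .*resource adjusted: cpu=(\d+)->(\d+), mem=(\d+)MB->(\d+)MB"
)


def parse_adjustment(log_line: str):
    """Return the job ID and old/new values from one log line, or None."""
    m = ADJUST_RE.search(log_line)
    if m is None:
        return None
    job_id, cpu_old, cpu_new, mem_old, mem_new = map(int, m.groups())
    return {"job": job_id, "cpu": (cpu_old, cpu_new), "mem_mb": (mem_old, mem_new)}


after = job_fields("MinCPUsNode=4 MinMemoryNode=30M MinTmpDiskNode=0")
print(after["MinCPUsNode"], after["MinMemoryNode"])

line = ("[statesvc] [info] _modify_job_resources: job 3 (user: root) "
        "resource adjusted: cpu=1->7, mem=1MB->30MB")
print(parse_adjustment(line))
```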