Cluster Custom Parameters

| Parameter | Description | fsched version requirement |
| --- | --- | --- |
| AccountingStoreJobComment | If set to "YES" then include the job's comment field in the job complete message sent to the Accounting Storage database. The default is "YES". | |
| AcctGatherNodeFreq | The AcctGather plugins sampling interval for node accounting. For AcctGather plugin values of none, this parameter is ignored. | |
| AllowSpecResourcesUsage | If set to 1, Slurm allows individual jobs to override node's configured CoreSpecCount value. | |
| AuthAltTypes | Comma-separated list of alternative authentication plugins that the slurmctld will permit for communication. | |
| AuthInfo | Additional information to be used for authentication of communications between the Slurm daemons (slurmctld and slurmd) and the Slurm clients. | |
| BatchStartTimeout | The maximum time (in seconds) that a batch job is permitted for launching before being considered missing and releasing the allocation. | |
| BurstBufferType | The plugin used to manage burst buffers. | |
| CheckpointType | The system-initiated checkpoint method to be used for user jobs. | |
| CliFilterPlugins | A comma-delimited list of command line interface option filter/modification plugins. The specified plugins will be executed in the order listed. | |
| CliFilterPluginsFile | The content of clifilter lua script file. | |
| CompleteWait | The time, in seconds, given for a job to remain in COMPLETING state before any additional jobs are scheduled. | |
| CoreSpecPlugin | Identifies the plugins to be used for enforcement of core specialization. | |
| CpuFreqDef | Default CPU frequency value or frequency governor to use when running a job step if it has not been explicitly set with the --cpu-freq option. | |
| CpuFreqGovernors | List of CPU frequency governors allowed to be set with the salloc, sbatch, or srun option --cpu-freq. | |
| CredType | The cryptographic signature tool to be used in the creation of job step credentials. | |
| DebugFlags | Defines specific subsystems which should provide more detailed event logging. Multiple subsystems can be specified with comma separators. | |
| DefCpuPerGPU | Default count of CPUs allocated per allocated GPU. | |
| DefMemPerCPU | Default real memory size available per allocated CPU in megabytes. Used to avoid over-subscribing memory and causing paging. | |
| DefMemPerGPU | Default real memory size available per allocated GPU in megabytes. The default value is 0 (unlimited). Also see DefMemPerCPU and DefMemPerNode. | |
| DefMemPerNode | Default real memory size available per allocated node in megabytes. Used to avoid over-subscribing memory and causing paging. | |
| DefaultStorageHost | The default name of the machine hosting the accounting storage and job completion databases. | |
| DefaultStorageLoc | The fully qualified file name where accounting records and/or job completion records are written when the DefaultStorageType is "filetxt". | |
| DefaultStoragePass | The password used to gain access to the database to store the accounting and job completion data. | |
| DefaultStoragePort | The listening port of the accounting storage and/or job completion database server. | |
| DefaultStorageType | The accounting and job completion storage mechanism type. Acceptable values at present include "filetxt", "mysql" and "none". | |
| DefaultStorageUser | The user account for accessing the accounting storage and/or job completion database. | |
| DisableRootJobs | If set to "YES" then user root will be prevented from running any jobs. The default value is "NO", meaning user root will be able to execute jobs. | |
| EioTimeout | The number of seconds srun waits for slurmstepd to close the TCP/IP connection used to relay data between the user application and srun when the. | |
| EpilogMsgTime | The number of microseconds that the slurmctld daemon requires to process an epilog completion message from the slurmd daemons. | |
| FairShareDampeningFactor | Dampen the effect of exceeding a user or group's fair share of allocated resources. | |
| FederationParameters | Used to define federation options. Multiple options may be comma separated. | |
| FirstJobId | The job id to be used for the first job submitted to Slurm without a specific requested value. | |
| GetEnvTimeout | Controls how long the job should wait (in seconds) to load the user's environment before attempting to load it from a cache file. | |
| GroupUpdateForce | If set to a non-zero value, then information about which users are members of groups allowed to use a partition will be updated periodically, even. | |
| GroupUpdateTime | Controls how frequently information about which users are members of groups allowed to use a partition will be updated, and how long user group. | |
| GpuFreqDef | Default GPU frequency to use when running a job step if it has not been explicitly set using the --gpu-freq option. | |
| HealthCheckInterval | The interval in seconds between executions of HealthCheckProgram. The default value is zero, which disables execution. | |
| HealthCheckNodeState | Identify what node states should execute the HealthCheckProgram. Multiple state values may be specified with a comma separator. | |
| HealthCheckProgram | Fully qualified pathname of a script to execute as user root periodically on all compute nodes that are not in the NOT_RESPONDING state. | |
| JobAcctGatherType | The job accounting mechanism type. | |
| JobAcctGatherFrequency | The job accounting and profiling sampling intervals. | |
| JobAcctGatherParams | Arbitrary parameters for the job account gather plugin. | |
| JobCheckpointDir | Specifies the default directory for storing or reading job checkpoint information. | |
| JobCompHost | The name of the machine hosting the job completion database. Only used for database type storage plugins, ignored otherwise. | |
| JobCompLoc | The fully qualified file name where job completion records are written when the JobCompType is "jobcomp/filetxt" or the database where job com‐. | |
| JobCompPass | The password used to gain access to the database to store the job completion data. | |
| JobCompPort | The listening port of the job completion database server. Only used for database type storage plugins, ignored otherwise. | |
| JobCompType | The job completion logging mechanism type. | |
| JobCompUser | The user account for accessing the job completion database. Only used for database type storage plugins, ignored otherwise. | |
| JobContainerType | Identifies the plugin to be used for job tracking. The slurmd daemon must be restarted for a change in JobContainerType to take effect. | |
| JobFileAppend | This option controls what to do if a job's output or error file exists when the job is started. | |
| JobRequeue | This option controls the default ability for batch jobs to be requeued. | |
| JobSubmitPlugins | A comma-delimited list of job submission plugins to be used. The specified plugins will be executed in the order listed. | |
| JobSubmitPluginsFile | A lua script file for job submission plugins. | |
| KeepAliveTime | Specifies how long socket communications used between the srun command and its slurmstepd process are kept alive after disconnect. | |
| KillOnBadExit | If set to 1, a step will be terminated immediately if any task is crashed or aborted, as indicated by a non-zero exit code. | |
| KillWait | The interval, in seconds, given to a job's processes between the SIGTERM and SIGKILL signals upon reaching its time limit. | |
| NodeFeaturesPlugins | Identifies the plugins to be used for support of node features which can change through time. | |
| LaunchParameters | Identifies options to the job launch plugin. | |
| Licenses | Specification of licenses (or other resources available on all nodes of the cluster) which can be allocated to jobs. | |
| MailDomain | Domain name to qualify usernames if email address is not explicitly given with the "--mail-user" option. | |
| MailProg | Fully qualified pathname to the program used to send email per user request. | |
| MaxArraySize | The maximum job array size. The maximum job array task index value will be one less than MaxArraySize to allow for an index value of zero. | |
| MaxJobCount | The maximum number of jobs Slurm can have in its active database at one time. | |
| MaxJobId | The maximum job id to be used for jobs submitted to Slurm without a specific requested value. | |
| MaxMemPerCPU | Maximum real memory size available per allocated CPU in megabytes. Used to avoid over-subscribing memory and causing paging. | |
| MaxMemPerNode | Maximum real memory size available per allocated node in megabytes. Used to avoid over-subscribing memory and causing paging. | |
| MaxStepCount | The maximum number of steps that any job can initiate. This parameter is intended to limit the effect of bad batch scripts. | |
| MaxTasksPerNode | Maximum number of tasks Slurm will allow a job step to spawn on a single node. The default MaxTasksPerNode is 512. May not exceed 65533. | |
| MCSParameters | MCS (Multi-Category Security) plugin parameters. The supported parameters are specific to the MCSPlugin. | |
| MCSPlugin | MCS (Multi-Category Security): associate a security label to jobs and ensure that nodes can only be shared among jobs using the same security. | |
| MessageTimeout | Time permitted for a round-trip communication to complete in seconds. Default value is 10 seconds. | |
| MinJobAge | The minimum age of a completed job before its record is purged from Slurm's active database. | |
| MpiDefault | Identifies the default type of MPI to be used. Srun may override this configuration parameter in any case. | |
| MpiParams | MPI parameters. Used to identify ports used by older versions of OpenMPI and native Cray systems. | |
| MsgAggregationParams | Message aggregation parameters. | |
| OverTimeLimit | Number of minutes by which a job can exceed its time limit before being canceled. | |
| PowerParameters | System power management parameters. The supported parameters are specific to the PowerPlugin. | |
| PreemptMode | Enables gang scheduling and/or controls the mechanism used to preempt jobs. | |
| PreemptType | This specifies the plugin used to identify which jobs can be preempted in order to start a pending job. | |
| PreemptExemptTime | Global option for minimum run time for all jobs before they can be considered for preemption. | |
| PriorityCalcPeriod | The period of time in minutes in which the half-life decay will be re-calculated. Applicable only if PriorityType=priority/multifactor. | |
| PriorityDecayHalfLife | This controls how long prior resource use is considered in determining how over- or under-serviced an association is (user, bank account and cluster). | |
| PriorityFavorSmall | Specifies that small jobs should be given preferential scheduling priority. Applicable only if PriorityType=priority/multifactor. | |
| PriorityFlags | Flags to modify priority behavior. Applicable only if PriorityType=priority/multifactor. The keywords below have no associated value. | |
| PriorityMaxAge | Specifies the job age which will be given the maximum age factor in computing priority. | |
| PriorityParameters | Arbitrary string used by the PriorityType plugin. | |
| PrioritySiteFactorParameters | Arbitrary string used by the PrioritySiteFactorPlugin plugin. | |
| PrioritySiteFactorPlugin | Specifies an optional plugin to be used alongside "priority/multifactor". | |
| PriorityType | This specifies the plugin to be used in establishing a job's scheduling priority. | |
| PriorityUsageResetPeriod | At this interval the usage of associations will be reset to 0. This is used if you want to enforce hard limits of time usage per association. | |
| PriorityWeightAge | An integer value that sets the degree to which the queue wait time component contributes to the job's priority. | |
| PriorityWeightAssoc | An integer value that sets the degree to which the association component contributes to the job's priority. | |
| PriorityWeightFairshare | An integer value that sets the degree to which the fair-share component contributes to the job's priority. | |
| PriorityWeightJobSize | An integer value that sets the degree to which the job size component contributes to the job's priority. | |
| PriorityWeightPartition | Partition factor used by priority/multifactor plugin in calculating job priority. Applicable only if PriorityType=priority/multifactor. | |
| PriorityWeightQOS | An integer value that sets the degree to which the Quality Of Service component contributes to the job's priority. | |
| PriorityWeightTRES | A comma-separated list of TRES Types and weights that sets the degree that each TRES Type contributes to the job's priority. | |
| PrivateData | This controls what type of information is hidden from regular users. By default, all information is visible to all users. | |
| ProctrackType | Identifies the plugin to be used for process tracking on a job step basis. | |
| PrologEpilogTimeout | The interval in seconds Slurm waits for Prolog and Epilog before terminating them. The default behavior is to wait indefinitely. | |
| PropagatePrioProcess | Controls the scheduling priority (nice value) of user spawned tasks. | |
| PropagateResourceLimitsExcept | A list of comma-separated resource limit names. | |
| ReconfigFlags | Flags to control various actions that may be taken when an "scontrol reconfig" command is issued. | |
| RequeueExit | Enables automatic requeue for batch jobs which exit with the specified values. | |
| RequeueExitHold | Enables automatic requeue for batch jobs which exit with the specified values, with these jobs being held until released manually by the user. | |
| ResumeFailProgram | The program that will be executed when nodes fail to resume by ResumeTimeout. | |
| ResumeProgram | Slurm supports a mechanism to reduce power consumption on nodes that remain idle for an extended period of time. | |
| ResumeTimeout | Maximum time permitted (in seconds) between when a node resume request is issued and when the node is actually available for use. | |
| ResvOverRun | Describes how long a job already running in a reservation should be permitted to execute after the end time of the reservation has been reached. | |
| RoutePlugin | Identifies the plugin to be used for defining which nodes will be used for message forwarding and message aggregation. | |
| SallocDefaultCommand | Normally, salloc(1) will run the user's default shell when a command to execute is not specified on the salloc command line. | |
| SbcastParameters | Controls sbcast command behavior. Multiple options can be specified in a comma-separated list. | |
| SchedulerParameters | The interpretation of this parameter varies by SchedulerType. Multiple options may be comma separated. | |
| SchedulerTimeSlice | Number of seconds in each time slice when gang scheduling is enabled (PreemptMode=SUSPEND,GANG). | |
| SchedulerType | Identifies the type of scheduler to be used. | |
| SelectTypeParameters | The permitted values of SelectTypeParameters depend upon the configured value of SelectType. | |
| SlurmdParameters | Parameters specific to the Slurmd. Multiple options may be comma separated. | |
| SlurmctldDebug | The level of detail to provide in the slurmctld daemon's logs. The default value is info. | |
| SlurmctldHost | The short, or long, hostname of the machine where Slurm control daemon is executed (i.e. the name returned by the command "hostname -s"). | |
| SlurmctldLogFile | Fully qualified pathname of a file into which the slurmctld daemon's logs are written. The default value is none (performs logging via syslog). | |
| SlurmctldParameters | Multiple options may be comma separated. | |
| SlurmctldPlugstack | A comma-delimited list of Slurm controller plugins to be started when the daemon begins and terminated when it ends. | |
| SlurmctldPort | The port number that the Slurm controller, slurmctld, listens to for work. The default value is SLURMCTLD_PORT as established at system build time. | |
| SlurmctldSyslogDebug | The slurmctld daemon will log events to the syslog file at the specified level of detail. | |
| SlurmctldTimeout | The interval, in seconds, that the backup controller waits for the primary controller to respond before assuming control. | |
| SlurmdDebug | The level of detail to provide in the slurmd daemon's logs. The default value is info. | |
| SlurmdLogFile | Fully qualified pathname of a file into which the slurmd daemon's logs are written. The default value is none (performs logging via syslog). | |
| SlurmdPidFile | Fully qualified pathname of a file into which the slurmd daemon may write its process id. This may be used for automated signal processing. | |
| SlurmdPort | The port number that the Slurm compute node daemon, slurmd, listens to for work. | |
| SlurmdSpoolDir | Fully qualified pathname of a directory into which the slurmd daemon's state information and batch job script information are written. | |
| SlurmdSyslogDebug | The slurmd daemon will log events to the syslog file at the specified level of detail. | |
| SlurmdTimeout | The interval, in seconds, that the Slurm controller waits for slurmd to respond before configuring that node's state to DOWN. | |
| SlurmSchedLogFile | Fully qualified pathname of the scheduling event logging file. The syntax of this parameter is the same as for SlurmctldLogFile. | |
| SlurmSchedLogLevel | The initial level of scheduling event logging, similar to the SlurmctldDebug parameter used to control the initial level of slurmctld logging. | |
| SrunPortRange | The srun command creates a set of listening ports to communicate with the controller, the slurmstepd and to handle the application I/O. | |
| StateSaveLocation | Fully qualified pathname of a directory into which the Slurm controller, slurmctld, saves its state (e.g. "/usr/local/slurm/checkpoint"). | |
| SuspendExcNodes | Specifies the nodes which are not to be placed in power save mode, even if the node remains idle for an extended period of time. | |
| SuspendExcParts | Specifies the partitions whose nodes are not to be placed in power save mode, even if the node remains idle for an extended period of time. | |
| SuspendProgram | SuspendProgram is the program that will be executed when a node remains idle for an extended period of time. | |
| SuspendTimeout | Maximum time permitted (in seconds) between when a node suspend request is issued and when the node is shut down. | |
| TaskPlugin | Identifies the type of task launch plugin, typically used to provide resource management within a node. | |
| TaskPluginParam | Optional parameters for the task plugin. Multiple options should be comma separated. | |
| TCPTimeout | Time permitted for TCP connection to be established. Default value is 2 seconds. | |
| TmpFS | Fully qualified pathname of the file system available to user jobs for temporary storage. This parameter is used in establishing a node's TmpDisk space. The default value is "/tmp". | |
| TopologyParam | Comma-separated options identifying network topology options. | |
| TopologyPlugin | Identifies the plugin to be used for determining the network topology and optimizing job allocations to minimize network contention. | |
| TrackWCKey | Boolean yes or no. Used to set display and track of the Workload Characterization Key. Must be set to track correct wckey usage. | |
| TreeWidth | Slurmd daemons use a virtual tree network for communications. TreeWidth specifies the width of the tree (i.e. the fanout). | |
| UnkillableStepProgram | If the processes in a job step are determined to be unkillable for a period of time specified by the UnkillableStepTimeout variable, the pro‐. | |
| UnkillableStepTimeout | The length of time, in seconds, that Slurm will wait before deciding that processes in a job step are unkillable after they have been signaled with. | |
| UsePAM | If set to 1, PAM (Pluggable Authentication Modules for Linux) will be enabled. PAM is used to establish the upper bounds for resource limits. | |
| VSizeFactor | Memory specifications in job requests apply to real memory size (also known as resident set size). | |
| WaitTime | Specifies how many seconds the srun command should by default wait after the first task terminates before terminating all remaining tasks. | |
| X11Parameters | For use with Slurm's built-in X11 forwarding implementation. | |
| MaxServerThreads | Maximum parallel threads to service incoming RPCs. | fsched-10.25+ |
| MaxAgentThreads | Maximum parallel threads to service outgoing RPCs. | fsched-10.25+ |
| FschedSrunPingInterval | Time interval between srun pings. | fsched-10.25+ |
| FschedSrunPingMaxFailures | Maximum number of failed srun pings before srun is killed. | fsched-10.25+ |
| FschedAgentConnectTimeout | Timeout for agent connection. | fsched-10.25+ |
| FschedPrologRetryInterval | Time interval between prolog retries. | fsched-10.25+ |
| MaxInprogressRpcCalls | Maximum number of in-progress RPC calls allowed. | fsched-10.25+ |
| SelectType | Identifies the type of resource selection algorithm to be used. Acceptable values include "select/cons_tres" or "select/cons_tres_ex". | fsched-10.61+ |
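
The parameter names listed here are Slurm configuration keys, which Slurm's own configuration file writes as `Name=Value` lines. The following is a minimal Python sketch, not fsched's implementation: `render_custom_params` and `max_array_index` are hypothetical helpers introduced for illustration, and how fsched actually accepts custom parameters is not described on this page. The sketch renders a set of parameters as `Name=Value` lines and checks two limits stated in the descriptions: MaxTasksPerNode may not exceed 65533, and the highest job array task index is one less than MaxArraySize.

```python
# Upper bound for MaxTasksPerNode, as stated in its description.
MAX_TASKS_PER_NODE_LIMIT = 65533


def max_array_index(max_array_size: int) -> int:
    """Highest usable job array task index: one less than MaxArraySize,
    because index 0 is allowed."""
    return max_array_size - 1


def render_custom_params(params: dict) -> list:
    """Hypothetical helper: render parameters as Name=Value lines.

    Raises ValueError if MaxTasksPerNode is above the documented limit.
    """
    tasks = params.get("MaxTasksPerNode")
    if tasks is not None and int(tasks) > MAX_TASKS_PER_NODE_LIMIT:
        raise ValueError(
            f"MaxTasksPerNode={tasks} exceeds {MAX_TASKS_PER_NODE_LIMIT}"
        )
    # Dicts preserve insertion order, so lines appear in the order given.
    return [f"{name}={value}" for name, value in params.items()]


if __name__ == "__main__":
    # Example values are the defaults or accepted values quoted in the
    # descriptions (MessageTimeout 10 s, TCPTimeout 2 s, TmpFS /tmp,
    # MaxTasksPerNode 512, SelectType select/cons_tres).
    example = {
        "MessageTimeout": 10,
        "TCPTimeout": 2,
        "TmpFS": "/tmp",
        "MaxTasksPerNode": 512,
        "SelectType": "select/cons_tres",
    }
    for line in render_custom_params(example):
        print(line)
```

Validating before rendering means an out-of-range value is rejected before any line is produced, rather than surfacing later when the scheduler reads the configuration.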