
Job Submission Lua Plugin

The job submit plugin runs on the head node, inside the Slurm controller daemon (slurmctld). It executes after the head node receives a user's job request and before the job is queued.

Purpose

The Job Submit Plugin can do the following:

  • Modify the job request, such as the job's partition, core count, and memory amount.
  • Inspect the job request and reject it if the job is not allowed to run.
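
As a minimal sketch of both uses, the function below fills in a QoS when none was requested and rejects jobs that ask for too many CPUs. The QoS name "normal" and the 128-CPU cap are hypothetical site values, not Slurm defaults. The global slurm table is supplied by the plugin when the script is loaded.

```lua
-- Hypothetical site policy values (assumptions, not Slurm defaults).
local DEFAULT_QOS = "normal"
local MAX_CPUS = 128

function slurm_job_submit(job_desc, part_list, submit_uid)
    -- Modify: fill in a QoS when the user did not request one.
    if job_desc.qos == nil then
        job_desc.qos = DEFAULT_QOS
    end

    -- Check: refuse jobs whose minimum CPU request exceeds the cap.
    -- An unset min_cpus may appear as nil or slurm.NO_VAL, so test for both.
    local cpus = job_desc.min_cpus
    if cpus ~= nil and cpus ~= slurm.NO_VAL and cpus > MAX_CPUS then
        slurm.log_user("Requested %d CPUs; the limit is %d", cpus, MAX_CPUS)
        return slurm.ERROR
    end

    return slurm.SUCCESS
end
```

Returning slurm.SUCCESS lets the (possibly modified) request continue to the queue; any other return value rejects it.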

Installation and Configuration

Add the following to slurm.conf:

JobSubmitPlugins=job_submit/lua

Save the Lua script as job_submit.lua in the same directory as slurm.conf, for example /etc/slurm/job_submit.lua.

Supported Functions

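-- Executed on job submission
function slurm_job_submit(job_desc, part_list, submit_uid)
    -- Return slurm.SUCCESS to accept the job, or an error code to reject it
    return slurm.SUCCESS
end

-- Executed on job modification
function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end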

Parameters:

  • job_desc: the job request (job descriptor); its fields can be read and changed
  • part_list: information about all partitions
  • submit_uid: UID of the submitting user
  • job_rec: the current job record (slurm_job_modify only)
  • modify_uid: UID of the user performing the modification (slurm_job_modify only)
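
The sketch below shows how the slurm_job_modify parameters might be used together: it compares the requested change in job_desc with the queued job in job_rec and checks who is asking. It assumes that job_rec exposes a partition field and that UID 0 is the only administrator; both are assumptions to verify against your Slurm version and site setup.

```lua
function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    -- job_desc carries the requested changes; job_rec is the job as queued.
    local requested = job_desc.partition
    local changing = requested ~= nil and requested ~= job_rec.partition

    -- Assumed policy: only root (UID 0) may move a queued job
    -- to a different partition.
    if changing and modify_uid ~= 0 then
        slurm.log_info("uid %d may not move a job to partition %s",
                       modify_uid, requested)
        return slurm.ESLURM_ACCESS_DENIED
    end

    return slurm.SUCCESS
end
```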

Data Structures

Job Descriptor (job_desc)

Field               Type    Description
account             string  Job account name
admin_comment       string  Administrator comment
array_task_cnt      number  Number of array tasks
batch_features      string  Job batch features
burst_buffer        string  Job burst buffer specification
comment             string  Job comment
cpus_per_tres       string  CPUs per TRES (trackable resource)
delay_boot          number  Delay boot time
direct_set_prio     number  Whether the priority was set directly
features            string  Job features
gres                string  Requested GRES (generic resources)
group_id            number  Job group ID
job_id              number  Job ID
job_state           number  Job state
licenses            string  Job licenses
max_cpus            number  Maximum CPUs available to the job
max_nodes           number  Maximum nodes available to the job
mem_per_tres        string  Memory allocated per TRES
min_cpus            number  Minimum CPUs requested by the job
min_mem_per_node    number  Minimum memory requirement per node
min_mem_per_cpu     number  Minimum memory requirement per CPU
min_nodes           number  Minimum nodes requested by the job
name                string  Job name
nice                number  Nice value (scheduling priority adjustment)
pack_job_id         number  Pack job ID
pack_job_id_set     string  Pack job ID set
pack_job_offset     number  Pack job offset
partition           string  Job partition; this is the requested value and may be empty
pn_min_cpus         number  Minimum CPUs per node
pn_min_memory       number  Minimum memory per node (64-bit integer)
priority            number  Job priority
qos                 string  Job Quality of Service (QoS)
reboot              number  Job reboot setting
req_switch          number  Maximum number of switches requested by the job
site_factor         number  Site-specific priority factor
spank_job_env       table   SPANK (Slurm plugin) job environment variables
spank_job_env_size  number  Number of SPANK job environment variables
time_limit          number  Job time limit (minutes)
time_min            number  Minimum job time limit (minutes)
tres_bind           string  TRES binding information
tres_freq           string  TRES frequency information
tres_per_job        string  TRES per job
tres_per_node       string  TRES per node
tres_per_socket     string  TRES per socket
tres_per_task       string  TRES per task
user_id             number  User ID
user_name           string  User name
wait4switch         number  Maximum time to wait for the requested switch count
wckey               string  Workload characterization key
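
To illustrate reading and writing job_desc fields, the sketch below routes jobs that request a GPU to a dedicated partition when the user did not name one. The partition name "gpu" and the substring match on "gpu" are assumptions for illustration; adapt both to how your site names its resources.

```lua
-- Hypothetical partition name (assumption): where GPU jobs should run.
local GPU_PARTITION = "gpu"

-- Return true when a GRES/TRES request string mentions a GPU.
local function requests_gpu(spec)
    return spec ~= nil and string.find(spec, "gpu", 1, true) ~= nil
end

function slurm_job_submit(job_desc, part_list, submit_uid)
    -- Both gres and tres_per_node appear in the table above, so check either.
    local wants_gpu = requests_gpu(job_desc.gres)
        or requests_gpu(job_desc.tres_per_node)

    -- Only fill in the partition when the user left it empty.
    if wants_gpu and job_desc.partition == nil then
        job_desc.partition = GPU_PARTITION
        slurm.log_info("job from uid %d routed to partition %s",
                       submit_uid, GPU_PARTITION)
    end

    return slurm.SUCCESS
end
```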

Partition (part_list)

Field                Type    Description
allow_accounts       string  Allowed account names
allow_alloc_nodes    string  Nodes from which jobs may be submitted to the partition
allow_groups         string  Allowed group names
allow_qos            string  Allowed QoS settings
alternate            string  Alternate partition name
billing_weights_str  string  Partition billing weight string
default_time         number  Default time limit (minutes)
def_mem_per_cpu      number  Default memory per CPU (if allocated by CPU)
def_mem_per_node     number  Default memory per node
deny_accounts        string  Denied account names
deny_qos             string  Denied QoS settings
flag_default         number  Whether it is the default partition (0: no, 1: yes)
flags                number  Partition flags (bit field)
max_cpus_per_node    number  Maximum CPUs allowed per node
max_mem_per_cpu      number  Maximum memory per CPU
max_mem_per_node     number  Maximum memory per node
max_nodes            number  Maximum nodes
max_nodes_orig       number  Original maximum nodes
max_share            number  Maximum sharing ratio
max_time             number  Maximum time limit (minutes)
min_nodes            number  Minimum nodes
min_nodes_orig       number  Original minimum nodes
name                 string  Partition name
nodes                string  Node list
priority_job_factor  number  Job priority factor
priority_tier        number  Priority tier
qos                  string  QoS settings
state_up             number  Partition state (0: inactive; other values encode the up, down, and drain states)
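
The partition fields can be inspected by walking part_list. The sketch below looks for the partition flagged as default and assigns it when the job did not request one. It assumes part_list can be iterated with pairs() and that each entry carries the fields listed above; the helper find_default_partition is a name introduced here for illustration.

```lua
-- Return the name of the partition flagged as default, or nil if none is.
local function find_default_partition(part_list)
    for _, part in pairs(part_list) do
        if part.flag_default == 1 then
            return part.name
        end
    end
    return nil
end

function slurm_job_submit(job_desc, part_list, submit_uid)
    -- job_desc.partition is the requested value and may be empty.
    if job_desc.partition == nil then
        local default = find_default_partition(part_list)
        if default ~= nil then
            job_desc.partition = default
            slurm.log_info("no partition requested; using default %s", default)
        end
    end
    return slurm.SUCCESS
end
```

Making the partition explicit early lets later checks in the same script rely on job_desc.partition being set.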

Global Object slurm

Log Functions

Field/Function  Description
log_error       Log error messages
log_info        Log informational messages (level 0)
log_verbose     Log verbose messages (level 1)
log_debug       Log debug messages (level 2)
log_debug2      Log more detailed debug messages (level 3)
log_debug3      Log even more detailed debug messages (level 4)
log_debug4      Log the most detailed debug messages (level 5)
log_user        Send a message to the user who submitted the request
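
A short sketch of how the log functions might be combined: log_info and log_debug write to the controller log, while log_user addresses the submitting user. The calls below pass printf-style format strings, which is how these functions are commonly used; the wording of the messages is illustrative only.

```lua
function slurm_job_submit(job_desc, part_list, submit_uid)
    -- Controller log: one line per submission.
    slurm.log_info("slurm_job_submit: uid=%d name=%s",
                   submit_uid, tostring(job_desc.name))

    -- Controller log, shown only at debug verbosity or higher.
    slurm.log_debug("slurm_job_submit: partition=%s",
                    tostring(job_desc.partition))

    -- Message for the user rather than the administrator.
    if job_desc.comment == nil then
        slurm.log_user("Tip: add --comment to describe this job")
    end

    return slurm.SUCCESS
end
```

Wrapping possibly empty fields in tostring() keeps the format call from failing when a field is nil.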

Error Codes

Field/Function                         Description
ERROR                                  Indicates a general error
FAILURE                                Indicates operation failure
SUCCESS                                Indicates operation success
ESLURM_ACCESS_DENIED                   Access denied
ESLURM_ACCOUNTING_POLICY               Accounting policy error
ESLURM_INVALID_ACCOUNT                 Invalid account
ESLURM_INVALID_LICENSES                Invalid licenses
ESLURM_INVALID_NODE_COUNT              Invalid node count
ESLURM_INVALID_TIME_LIMIT              Invalid time limit
ESLURM_JOB_MISSING_SIZE_SPECIFICATION  Job missing size specification
ESLURM_MISSING_TIME_LIMIT              Missing time limit
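
Returning a specific error code gives the user a more precise rejection reason than a plain slurm.ERROR. The sketch below picks a code per failed check; the policy itself (an account and a time limit are mandatory, with a hypothetical cap of 2880 minutes) is an assumption chosen for illustration.

```lua
-- Hypothetical site cap (assumption): 2880 minutes, i.e. two days.
local MAX_MINUTES = 2880

function slurm_job_submit(job_desc, part_list, submit_uid)
    -- Require an explicit account.
    if job_desc.account == nil then
        slurm.log_user("Please specify an account with --account")
        return slurm.ESLURM_INVALID_ACCOUNT
    end

    -- Require an explicit time limit; unset may be nil or slurm.NO_VAL.
    local limit = job_desc.time_limit
    if limit == nil or limit == slurm.NO_VAL then
        return slurm.ESLURM_MISSING_TIME_LIMIT
    end

    -- Reject limits above the site cap.
    if limit > MAX_MINUTES then
        return slurm.ESLURM_INVALID_TIME_LIMIT
    end

    return slurm.SUCCESS
end
```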

Other Definitions

Field/Function        Description
ALLOC_SID_ADMIN_HOLD  Allocation session ID value marking a job held by an administrator
ALLOC_SID_USER_HOLD   Allocation session ID value marking a job held by the user
INFINITE              Infinite value (32-bit)
INFINITE64            Infinite value (64-bit)
MAIL_JOB_BEGIN        Send mail when job begins
MAIL_JOB_END          Send mail when job ends
MAIL_JOB_FAIL         Send mail when job fails
MAIL_JOB_REQUEUE      Send mail when job requeues
MAIL_JOB_TIME100      Send mail when job reaches 100% of its time limit
MAIL_JOB_TIME90       Send mail when job reaches 90% of its time limit
MAIL_JOB_TIME80       Send mail when job reaches 80% of its time limit
MAIL_JOB_TIME50       Send mail when job reaches 50% of its time limit
MAIL_JOB_STAGE_OUT    Send mail when job stage-out completes
MEM_PER_CPU           Flag marking a memory value as per CPU
NICE_OFFSET           Offset applied to nice values
JOB_SHARED_NONE       Job does not share resources
JOB_SHARED_OK         Job may share resources
JOB_SHARED_USER       Job shares resources only with jobs of the same user
JOB_SHARED_MCS        Job shares resources only within the same MCS label
NO_VAL64              Value not set (64-bit)
NO_VAL                Value not set (32-bit)
NO_VAL16              Value not set (16-bit)
NO_VAL8               Value not set (8-bit)
SHARED_FORCE          Force sharing
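
The NO_VAL constants matter when deciding whether the user set a numeric field at all. The sketch below treats nil and the NO_VAL sentinels alike and applies a default memory request only when neither memory field was given. The helper is_unset and the 2048 MB default are assumptions made for this example.

```lua
-- Hypothetical default (assumption): 2048 MB of memory per CPU.
local DEFAULT_MEM_PER_CPU = 2048

-- Treat nil and the NO_VAL sentinels as "not set by the user".
local function is_unset(value)
    return value == nil
        or value == slurm.NO_VAL
        or value == slurm.NO_VAL64
end

function slurm_job_submit(job_desc, part_list, submit_uid)
    -- Apply the default only when the user gave no memory request at all.
    if is_unset(job_desc.min_mem_per_cpu)
        and is_unset(job_desc.min_mem_per_node) then
        job_desc.min_mem_per_cpu = DEFAULT_MEM_PER_CPU
    end
    return slurm.SUCCESS
end
```

Comparing against the sentinels avoids mistaking a very large "unset" number for a real request.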

Job Descriptor Bit Flags

Field/Function     Description
GRES_DISABLE_BIND  Disable GRES binding
GRES_ENFORCE_BIND  Enforce GRES binding
KILL_INV_DEP       Kill the job if its dependency is invalid
NO_KILL_INV_DEP    Do not kill the job if its dependency is invalid
SPREAD_JOB         Spread the job across nodes
USE_MIN_NODES      Use the minimum number of nodes
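
A sketch of testing one of these flags. It assumes the flags arrive in a job_desc.bitflags field, which is not listed in the job descriptor table above, so treat that field name as an assumption to verify. The helper has_flag uses plain arithmetic rather than bitwise operators so that it does not depend on the Lua version Slurm was built against; it expects a flag with a single bit set.

```lua
-- Return true when the single-bit flag is set in mask.
-- Arithmetic only, so it works without Lua 5.3 bitwise operators.
local function has_flag(mask, flag)
    return math.floor(mask / flag) % 2 == 1
end

function slurm_job_submit(job_desc, part_list, submit_uid)
    -- Assumption: the job's bit flags are exposed as job_desc.bitflags.
    local flags = job_desc.bitflags
    if flags ~= nil and flags ~= slurm.NO_VAL
        and has_flag(flags, slurm.SPREAD_JOB) then
        slurm.log_info("uid %d asked for the job to be spread across nodes",
                       submit_uid)
    end
    return slurm.SUCCESS
end
```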