Overview
Fsched is a deeply optimized derivative of the open-source Slurm scheduler (19.05 branch), designed for high-performance computing scenarios. This guide provides a quick reference for Fsched core commands to help users use cluster resources efficiently.
Intended Audience
- Basic users: quickly master common commands with this guide
- Advanced users: consult the official Slurm documentation alongside this guide for advanced features
Common Command Quick Reference
Below are some of the most commonly used commands:
- sbatch: Submit a job script to run. The script can also include one or more srun commands to start parallel tasks.
- srun: Run parallel jobs interactively in real time, usually for short tests, or combined with salloc and sbatch.
- salloc: Allocate resources for jobs that need real-time handling. A typical scenario is to allocate resources and start a shell, then use that shell to run srun commands to execute parallel tasks.
- sinfo: Show partition or node status, with many filtering, sorting, and formatting options.
- squeue: Show jobs and job-step status in the queue, with many filtering, sorting, and formatting options.
- scancel: Cancel pending or running jobs or job steps, and can also send arbitrary signals to all processes in running jobs or job steps.
- sacct: Show accounting information for active or completed jobs or job steps (corresponding to billable compute time).
- scontrol: Display or set the status of Slurm jobs, partitions, nodes, etc.
- sacctmgr: Command-line tool for managing accounting data.
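
As a rough illustration of how several of these commands fit together, the sketch below writes a small batch script and submits it only if sbatch is available. The file name hello_job.sh, the job name, the task count, and the time limit are all hypothetical values chosen for the example. The options shown are standard Slurm ones, and this guide does not confirm whether Fsched changes any of them.

```shell
#!/bin/sh
# Write a minimal batch script. The job name, task count, time limit and
# file names are illustrative assumptions, not values from this guide.
cat > hello_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --ntasks=2
#SBATCH --time=00:05:00
#SBATCH --output=hello_%j.out

# Launch one copy of hostname per allocated task.
srun hostname
EOF

# Submit only when the scheduler client is installed on this machine.
if command -v sbatch >/dev/null 2>&1; then
    sbatch hello_job.sh                   # submit the script to the queue
    squeue -u "${USER:-$(id -un)}"        # list your own jobs
    sinfo                                 # show partition and node status
else
    echo "sbatch not found; wrote hello_job.sh without submitting"
fi
```

Once a job ID has been printed by sbatch, `scancel <jobid>` cancels that job and `sacct -j <jobid>` shows its accounting record. On a machine without the scheduler client, the script only creates hello_job.sh so it can be inspected first.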