Upgrade Fsched

Scope

This document applies to the following scenarios. Follow the instructions for your scenario:

  • Existing clusters
  • FCP platform
  • FCC-E (image)

Danger: Installing the same Fsched version that the current cluster already uses may affect running jobs.

Steps

Existing Clusters

  1. Copy the installation package (fsched-*.tar.gz) to the target machine.

  2. Install fsched-{BUILDVERSION}. Replace {BUILDVERSION} in the command with the version to be installed.

    sudo tar -xvf fsched-{BUILDVERSION}.tar.gz -C /opt
    sudo /opt/fsched-{BUILDVERSION}/install.sh -t /usr/bin [-r]

    -r: Optional. Attempts to restart the services automatically after installation. Supported in fsched-10.61 and later. The square brackets in the command only mark the flag as optional; do not type them.

    When the installation succeeds, the last line of output from the install script is:

    Successfully installed fsched from ...

    If the last line is anything else, the installation has failed. Identify the step that failed, correct the issue, and run the installation script again.

  3. If step 2 does not restart services automatically, restart the relevant services manually.

    1. On the head node, restart slurmctld and fs-statesvc.
    2. On compute nodes, restart slurmd.
  4. If HA is configured, upgrade the standby node first and then the primary node.
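
The success check in step 2 and the manual restarts in step 3 can be scripted. The following is a minimal sketch and is not part of Fsched: `install_succeeded` and `restart_command` are hypothetical helper names, and the use of `systemctl` assumes the three services are systemd units named exactly as in step 3, which this document does not state.

```shell
# Hypothetical helpers; not shipped with Fsched.

# Succeeds if the last non-empty line of a saved installer log starts with
# the success message described in step 2.
install_succeeded() {
    last_line=$(grep -v '^[[:space:]]*$' "$1" | tail -n 1)
    case $last_line in
        "Successfully installed fsched from"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints (does not run) the restart command for a node role.
# Assumption: the services are systemd units with these names.
restart_command() {
    case $1 in
        head)    echo "sudo systemctl restart slurmctld fs-statesvc" ;;
        compute) echo "sudo systemctl restart slurmd" ;;
        *)       echo "unknown role: $1" >&2; return 1 ;;
    esac
}
```

One possible use is to capture the installer output with `tee` into a log file of your choice, run `install_succeeded` on that file, and only then run the command printed by `restart_command head` or `restart_command compute` on the matching node.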

FCP Platform

  1. For the FCP platform, upgrade existing clusters by following the steps in Existing Clusters.
  2. Copy the installation package (fsched-*.tar.gz) over /opt/components/fsched.tar.gz, replacing the existing file. This ensures that newly added nodes receive the updated Fsched component.
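
Step 2 can be done with a plain `cp`. The sketch below is one cautious variant, offered as an illustration only: `publish_component` is a hypothetical helper name, and writing to a temporary file before renaming is a design choice made here, not something Fsched or FCP requires.

```shell
# Hypothetical helper; not shipped with Fsched.
# Copies the package next to the destination, verifies the copy, and then
# renames it into place so the old archive is replaced in a single step.
publish_component() {
    src=$1
    dest=${2:-/opt/components/fsched.tar.gz}
    tmp="$dest.tmp.$$"
    cp "$src" "$tmp" || { rm -f "$tmp"; return 1; }
    # Compare the copy byte for byte before replacing the old archive.
    if cmp -s "$src" "$tmp"; then
        mv -f "$tmp" "$dest"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Called with only the package path (and sufficient privileges), it targets /opt/components/fsched.tar.gz. Because the rename happens within one directory, a node fetching the component should see either the old archive or the complete new one.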

Image (FCC-E Only)

  1. Create a virtual machine from the console.
  2. Copy fsched-*.tar.gz to the virtual machine.
  3. Install Fsched on the virtual machine by following the installation steps in Existing Clusters.
  4. Create an image and register it with the API according to the image update procedure.

Other Notes

  • Fsched supports mixing different versions within the same cluster, but with the following restrictions:
    • Head-node versions must be identical in HA mode.
    • If the head node and compute nodes use different versions, the head-node version must be newer than the compute-node version.
    • New features in later versions are unavailable on older nodes.
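
The first two restrictions can be checked before an upgrade. This is a rough sketch under stated assumptions: the helper names are hypothetical, version strings are assumed to be dotted numbers such as 10.61, and the comparison relies on the version ordering of GNU `sort -V`. A head node running the same version as the compute nodes is treated as acceptable.

```shell
# Hypothetical helpers; not shipped with Fsched.

# Succeeds if both head nodes of an HA pair report the same version.
ha_heads_match() {
    [ "$1" = "$2" ]
}

# Succeeds if the head-node version is the same as or newer than the
# compute-node version.
head_version_ok() {
    head_ver=$1
    compute_ver=$2
    # Pick the newer of the two versions using GNU sort's version ordering.
    newest=$(printf '%s\n%s\n' "$head_ver" "$compute_ver" | sort -V | tail -n 1)
    [ "$newest" = "$head_ver" ]
}
```

For instance, `head_version_ok 10.61 10.7` succeeds because version ordering compares 61 with 7 numerically, while `head_version_ok 10.7 10.61` fails.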

Extracting to a Non-Standard Directory

Each Fsched installation package carries its version number and normally extracts to /opt/fsched-xxx (where xxx is the version). The path actually in use is /opt/fsched, a symbolic link that points to one specific version. It is therefore safe to extract a package over a version directory that is not currently in use, but not over the version that is currently in use. To avoid touching /opt directly, extract the package to a temporary directory first.
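
Whether a version directory is the one currently in use can be determined by resolving the symbolic link. The following is a sketch, not a supported tool: `is_active_version` is a hypothetical name, and it assumes `readlink -f` as provided by GNU coreutils.

```shell
# Hypothetical helper; not shipped with Fsched.
# Succeeds if version_dir is the directory the link currently resolves to.
is_active_version() {
    version_dir=$1
    link=${2:-/opt/fsched}
    # A missing link means no version is active.
    [ -e "$link" ] || return 1
    [ "$(readlink -f "$link")" = "$(readlink -f "$version_dir")" ]
}
```

With the default link path, `is_active_version /opt/fsched-{BUILDVERSION}` succeeding means that directory is in use and should not be extracted over.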

  1. Create a temporary directory.
    mkdir -p /tmp/fsched
  2. Extract to the temporary directory.
    tar -xvf fsched-*.tar.gz -C /tmp/fsched ./opt
  3. Copy the files under /tmp/fsched/opt to the corresponding directories.
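
The three steps above can be sketched as two small functions. This is an illustration under assumptions, not the documented procedure: the helper names are hypothetical, `mktemp -d` is used in place of the fixed /tmp/fsched directory, and the destination root passed to `copy_staged` is a parameter you must choose, since this document does not name the target directories.

```shell
# Hypothetical helpers; not shipped with Fsched.

# Extracts only the ./opt tree of the archive into a fresh staging
# directory and prints that directory's path.
stage_package() {
    archive=$1
    stage=$(mktemp -d) || return 1
    tar -xf "$archive" -C "$stage" ./opt || { rm -rf "$stage"; return 1; }
    echo "$stage"
}

# Copies everything under <stage>/opt into dest_root, preserving modes,
# ownership where permitted, and symbolic links.
copy_staged() {
    stage_dir=$1
    dest_root=$2
    cp -a "$stage_dir/opt/." "$dest_root/"
}
```

A typical sequence would be `stage=$(stage_package fsched-{BUILDVERSION}.tar.gz)`, an inspection of `"$stage/opt"`, and then `copy_staged "$stage" <destination>` once you have confirmed that the destination is not the version currently in use.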