Configuration Guide for Running Bladed Jobs with PBS and an External AD Integration
Prerequisites
External AD Server
- Ensure the AD server is network-reachable from the Windows compute nodes.
- Add the AD domain administrator pbsadmin. Password: <PBSADMIN_PASSWORD>. Refer to the Password Retrieval Guide. If you cannot access it, contact the corresponding customer service manager, delivery manager, or technical support personnel.
- Make sure the AD users have both gid number and uid number configured.
- The AD user names must match the user names on the Windows compute nodes.
External Storage
- Ensure the external storage server is network-reachable from the Windows compute nodes.
- Make sure the storage supports both the NFS and CIFS protocols.
- Make sure the storage supports user authentication and AD integration.
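The reachability requirements above can be spot-checked from a compute node before installation. The following is a minimal Python sketch, not part of the platform tooling. It assumes the standard TCP ports for LDAP (389), SMB/CIFS (445), and NFS (2049); the function name is illustrative, and your site may expose different ports.

```python
import socket

# Standard TCP ports (assumption: the site uses the defaults).
DEFAULT_PORTS = {"ldap": 389, "smb": 445, "nfs": 2049}


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_targets(targets, timeout=3.0):
    """targets maps a host name to a list of ports; returns {(host, port): bool}."""
    return {
        (host, port): is_reachable(host, port, timeout)
        for host, ports in targets.items()
        for port in ports
    }
```

For example, check_targets({"fs.test.com": [445, 2049]}) reports whether the storage host used later in this guide answers on the CIFS and NFS ports. A successful TCP connection only shows that the port is open; it does not confirm authentication or AD integration.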
Platform Package Version
- Platform version: 24.11bladed
- Download location (must be this version or later): s3://fastone-artifects/fastone-package/24-11bladed/fastone-fcp-24.11bladed.228719.tgz
Procedure
Install Node Dependencies
- Refer to PBS Node Minimal Dependency Installation.
- Notes:
- If a cloud Windows image is used, modify the machine SID.
- When joining the Windows machine to the domain, set DNS to the IP address of the AD domain controller.
- Enable remote login manually on the Windows machine and join it to the AD domain.
- Add the pbsadmin user to the local Administrators group:
  net localgroup administrators "ad_domain\pbsadmin" /add
  Replace ad_domain with the domain name of the AD domain controller.
Enable the Platform PBS Scheduler
- Refer to the Knowledge Base/SOP document for enabling the PBS scheduler to run Bladed applications.
Register Nodes with the Platform
- Note: On the Host Management page, when selecting Create Host, add the node by using the fastone user as the username.
PBS Cluster
- Shared storage mount path:
  - Ensure the shared storage directory /fs/users is readable and writable for all users:
    chmod -R 777 /fs/users
    Replace /fs/users with the configured shared storage path.
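To confirm that the permission change took effect, the directory tree can be scanned for entries that still lack read or write access for all users. This is a minimal Python sketch meant to run on a Linux node that mounts the shared storage. The function names are illustrative, and it only inspects POSIX mode bits, so any ACLs enforced by the storage server are outside its view.

```python
import os
import stat


def is_world_read_write(path):
    """True if the 'other' permission bits include both read and write."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    needed = stat.S_IROTH | stat.S_IWOTH
    return mode & needed == needed


def find_restricted(root):
    """Return root and every entry below it that is not world read/write."""
    candidates = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            candidates.append(os.path.join(dirpath, name))

    restricted = []
    for path in candidates:
        try:
            if not is_world_read_write(path):
                restricted.append(path)
        except OSError:
            # Unreadable entries (for example broken links) are reported too.
            restricted.append(path)
    return restricted
```

Calling find_restricted("/fs/users") should return an empty list once the directory is readable and writable for all users.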
- Windows compute node configuration:
  - Steps:
    - Example mount script content. Log in to Windows as an AD user to perform this step:
      @echo off
      rem Customize the drive letter, for example Z:
      net use Z: /d /yes
      rem Replace with your shared storage path. Ensure the path under drive Z: corresponds to the shared storage path on Linux. For example, Z:\users and /fs/users should be at the same directory level.
      net use Z: \\fs.test.com\vol1
    - Run the script in cmd as an Administrator to mount the shared storage. Replace the path with the path to your mount script:
      psexec -s -h -c -f -accepteula "C:\Users\ad_user\mount-bladed.bat"
    - Note: If the drive is not Z:, update the symbolic link as well. Run the following commands in cmd as an Administrator:
      rem Delete the existing symbolic link
      rmdir /S /Q c:\fastone
      rem Replace D: with your own shared storage drive letter
      mklink /d c:\fastone D:
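The requirement that Z:\users and /fs/users sit at the same directory level amounts to a fixed mapping between the Windows drive root and the Linux mount root. The Python sketch below illustrates that mapping. It assumes the share mounted as Z: on Windows is the same export mounted at /fs on Linux; the function names and default values are illustrative and should be adjusted to your layout.

```python
from pathlib import PurePosixPath, PureWindowsPath


def linux_to_windows(path, linux_root="/fs", drive="Z:"):
    """Translate a path under linux_root to the same location on the drive."""
    # relative_to raises ValueError when path is not under linux_root.
    rel = PurePosixPath(path).relative_to(linux_root)
    return str(PureWindowsPath(drive + "\\").joinpath(*rel.parts))


def windows_to_linux(path, linux_root="/fs", drive="Z:"):
    """Translate a path on the mapped drive to the Linux mount path."""
    win = PureWindowsPath(path)
    if win.drive.upper() != drive.upper():
        raise ValueError(f"{path} is not on drive {drive}")
    # parts[0] is the drive anchor (for example 'Z:\\'); keep the rest.
    return str(PurePosixPath(linux_root).joinpath(*win.parts[1:]))
```

With these defaults, /fs/users/alice/run1 maps to Z:\users\alice\run1 and back. A path outside the shared storage raises ValueError, which helps catch job inputs that would not be visible on both sides.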
Job Configuration
- Configure the application and run the job:
- Prepare a Bladed CWL document.
- Copy the corresponding CWL file to the Fastone platform.
- Download the latest bladed-utils.exe as described in the document.
- Copy bladed-utils.exe to the C:\bin directory on all Windows compute nodes. Create the directory if it does not exist.
- As described in the document, enter the file path or folder path in the input section when creating a new job on the platform, then run the job.
Common Issues
- Job data
  - Ensure that the job data permissions on shared storage allow all users to read and write.
  - Confirm that the absolute paths in the *.in files match the shared storage path. If not, update them promptly.
  - Check whether the path to the Bladed executable in the job data matches the path in the *.in file. If not, correct it.
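The path check on *.in files can be partly automated. The Python sketch below scans text for Windows drive-letter paths and lists those that do not start with the expected shared storage prefix. The layout of Bladed *.in files is not described in this guide, so the sketch treats the file as plain text; the regular expression, the function name, and the default prefix Z:\users are assumptions. Flagged entries only need review, since some of them (an executable path, for instance) may be intentional.

```python
import re

# Drive-letter absolute path, for example Z:\users\alice\run1.
# Paths containing spaces are cut at the first space (a known limitation).
WIN_PATH = re.compile(r'[A-Za-z]:\\[^\s"<>|*?]*')


def paths_outside_prefix(text, prefix="Z:\\users"):
    """Return (line number, path) pairs for paths not under prefix."""
    wanted = prefix.lower().rstrip("\\")
    flagged = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for found in WIN_PATH.findall(line):
            candidate = found.lower().rstrip("\\")
            # Windows paths are case-insensitive, so compare in lower case.
            if candidate != wanted and not candidate.startswith(wanted + "\\"):
                flagged.append((lineno, found))
    return flagged
```

To check a file, read it as text and pass the content to paths_outside_prefix; an empty result means every drive-letter path found is under the prefix.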
- Compute node restart
  - Remount the shared storage manually.
- PBS_MOM service stopped
  - On the head node, run pbsnodes -a to check whether the compute node status is down.
  - On the Windows node, run sc query pbs_mom to check whether the PBS_MOM service status is STOPPED.
  - Run net start PBS_MOM to start the PBS_MOM service, then run sc query pbs_mom again to verify that the service status is RUNNING.
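On clusters with many nodes, the pbsnodes -a check can be scripted. The Python sketch below parses the command's text output and lists the nodes whose state includes down. It assumes the usual layout in which each node name starts at column 0 and is followed by indented "attribute = value" lines; the function names are illustrative, and wrapped continuation lines are not handled.

```python
def parse_pbsnodes(output):
    """Map each node name to a dict of its attributes from pbsnodes -a text."""
    nodes = {}
    current = None
    for line in output.splitlines():
        if not line.strip():
            current = None  # a blank line ends the current node record
            continue
        if not line[0].isspace():
            current = line.strip()  # an unindented line starts a new node
            nodes[current] = {}
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            nodes[current][key.strip()] = value.strip()
    return nodes


def down_nodes(output):
    """Return the names of nodes whose state list contains 'down'."""
    down = []
    for name, attrs in parse_pbsnodes(output).items():
        # The state value may hold several comma-separated states.
        states = [s.strip() for s in attrs.get("state", "").split(",")]
        if "down" in states:
            down.append(name)
    return down
```

On the head node, the output could be captured with subprocess.run(["pbsnodes", "-a"], capture_output=True, text=True).stdout and passed to down_nodes. Each node it returns is a candidate for the sc query pbs_mom check on the Windows side.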