Version: FCP 25.11

Impact of Restarting or Shutting Down the Platform and Related Nodes

The overall FCP platform consists of three major parts:

Platform management nodes
- Core node
- Monitor node (optional)
Cluster nodes
- Head node
- Compute node
- Login node
- Desktop node
External supporting service nodes
- Authentication service (optional)
- NTP service
- Storage service

If any of the nodes or services above is shut down, the impact on each platform function is as follows:

| Node Type | In-Cluster (Fsched) Jobs | Task Mode | Cluster Management | Cluster Monitoring | User Management | Data Access | Remote Access |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Management node | Long downtime may make task accounting information inaccurate; short downtime has no effect | Task submission is unavailable | Cluster management is unavailable | Cluster monitoring is unavailable | User management is unavailable | Data access is unavailable | Remote access is unavailable |
| Monitor node | None | None | None | Cluster monitoring is unavailable | None | None | None |
| Head node | New jobs cannot be submitted. Running jobs continue until completion, but resources cannot be released afterward | Tasks fail | Cluster management is unavailable | Some monitoring information cannot be collected | None | None | None |
| Compute node | Jobs running on the node fail | Tasks running on the node fail | Cluster management is unavailable | Monitoring information for that node cannot be collected | None | None | None |
| Login node | Interactive jobs running on the node fail | None | Cluster management is unavailable | Monitoring information for that node cannot be collected | None | None | None |
| Desktop node | Jobs running on the node fail | None | Cluster management is unavailable | Monitoring information for that node cannot be collected | None | None | None |
| Authentication service | Long downtime (> 1 minute) prevents task submission because submitter identity cannot be verified; short downtime has no effect | Long downtime (> 1 minute) prevents task submission because submitter identity cannot be verified | Users cannot log in | None | User management is unavailable | Authentication cannot be verified | Authentication cannot be verified |
| NTP service | Long failures cause time drift, which breaks node-to-node validation and prevents jobs from running; short failures have no effect | Long failures cause time drift, which breaks node-to-node validation and prevents jobs from running | None | None | None | None | None |
| Storage service | Task execution may fail, depending on the application | Task submission is unavailable | Cluster management is unavailable and management operations may block | None | None | Data access is unavailable | If the user home directory is on shared storage, users cannot log in |
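
When planning a maintenance window, it can help to check in advance which platform functions a planned shutdown will touch. The following is a minimal sketch that encodes the table above as a lookup. It is not part of FCP: the names `FUNCTIONS`, `IMPACT`, and `affected_functions` are hypothetical, and each flag only records whether the table lists any impact at all, so conditional impacts (for example, those that apply only to long downtime) are counted as impacts.

```python
# Illustrative encoding of the impact table; these names are assumptions, not an FCP API.
FUNCTIONS = [
    "In-Cluster (Fsched) Jobs", "Task Mode", "Cluster Management",
    "Cluster Monitoring", "User Management", "Data Access", "Remote Access",
]

# True: the table lists some impact (possibly conditional). False: the table says "None".
IMPACT = {
    "Management node":        [True,  True,  True,  True,  True,  True,  True],
    "Monitor node":           [False, False, False, True,  False, False, False],
    "Head node":              [True,  True,  True,  True,  False, False, False],
    "Compute node":           [True,  True,  True,  True,  False, False, False],
    "Login node":             [True,  False, True,  True,  False, False, False],
    "Desktop node":           [True,  False, True,  True,  False, False, False],
    "Authentication service": [True,  True,  True,  False, True,  True,  True],
    "NTP service":            [True,  True,  False, False, False, False, False],
    "Storage service":        [True,  True,  True,  False, False, True,  True],
}


def affected_functions(node_types):
    """Return the functions impacted when the given node types are down, in column order."""
    hit = set()
    for node in node_types:
        flags = IMPACT[node]  # raises KeyError for a node type not in the table
        hit.update(name for name, flag in zip(FUNCTIONS, flags) if flag)
    return [name for name in FUNCTIONS if name in hit]


if __name__ == "__main__":
    # Example: planned downtime of the monitor node and the NTP service.
    for name in affected_functions(["Monitor node", "NTP service"]):
        print(name)
```

The lookup only answers whether a function is affected; the table remains the reference for how it is affected and under what conditions.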