Job Data Storage
Background
FCP-Suite can persist job data on the Monitor node. This is done to:
- Support detailed task queries
- Provide richer job monitoring metrics across cluster, partition, and user dimensions
- Offer more stable and reliable support at a scale of 5 million jobs per day and 100,000 concurrently running jobs
- Improve monitoring data response speed
- Extend data retention time so users can query job information from months ago or even earlier
How Data Is Stored
- Job data is stored in PostgreSQL on the Monitor node.
- You can change the number of days that scheduler jobs are retained on the monitoring configuration page.
- The default retention period is 90 days. Increasing it may slow the cluster monitoring page, because each query has to cover a larger volume of data.
- Users can query all job data within the configured retention period from the cluster monitoring page.
- Data is managed at daily granularity. Once a day's data falls outside the configured retention period, it is removed from the primary query range.
- Data removed from the primary query range is not deleted immediately. Query support for that data is planned for a future release.
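
The retention rule above can be illustrated with a small sketch. It is not FCP-Suite code: the function names are hypothetical, and it assumes the retention window counts the current day as day 1, which this document does not specify.

```python
from datetime import date, timedelta

# Default retention period stated in this document.
DEFAULT_RETENTION_DAYS = 90


def primary_range_start(today: date, retention_days: int = DEFAULT_RETENTION_DAYS) -> date:
    """Return the oldest day still inside the primary query range.

    Assumption: the window is inclusive of `today`, so a retention
    period of 1 day covers only the current day.
    """
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    return today - timedelta(days=retention_days - 1)


def in_primary_range(job_day: date, today: date,
                     retention_days: int = DEFAULT_RETENTION_DAYS) -> bool:
    """Check whether jobs recorded on `job_day` are still queryable."""
    return primary_range_start(today, retention_days) <= job_day <= today
```

Because data is handled per day, a whole day's jobs enter or leave the primary query range together; individual jobs are not aged out one by one.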
Estimated Data Scale
| Job Count | Estimated Time Range | Data Size |
|---|---|---|
| 1.5 million | < 1 day | 1.4 GB |
| 5 million | 1 day | 4.37 GB |
| 150 million | 30 days | 128 GB |
| 450 million | 90 days | 384 GB |
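
For capacity planning, the table can be extrapolated to other job rates and retention periods. The sketch below is a rough estimate only: it assumes storage grows linearly with job count, takes the 30-day row (150 million jobs, 128 G) as the reference point, and reads "G" as gigabytes. The function name is hypothetical, and real usage depends on job metadata size and PostgreSQL overhead.

```python
# Reference point taken from the 30-day row of the table above.
REFERENCE_JOBS = 150_000_000
REFERENCE_SIZE_GB = 128.0


def estimate_storage_gb(jobs_per_day: int, retention_days: int) -> float:
    """Estimate job data size in gigabytes, assuming linear growth with job count."""
    if jobs_per_day < 0 or retention_days < 0:
        raise ValueError("jobs_per_day and retention_days must be non-negative")
    total_jobs = jobs_per_day * retention_days
    return total_jobs * (REFERENCE_SIZE_GB / REFERENCE_JOBS)


if __name__ == "__main__":
    # 5 million jobs per day at the default 90-day retention.
    print(f"{estimate_storage_gb(5_000_000, 90):.0f} GB")
```

At 5 million jobs per day and 90 days of retention, this reproduces the 384 G figure in the last row. The single-day rows come out slightly higher per job than this linear rate, so the estimate is best used for multi-day retention periods.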