Description
We need to predict system performance before actual execution, because a full run often takes a very long time to complete. Prediction would let us explore the search space of configurations in much less time and find a better solution within it.
As related work, see Jockey: Guaranteed Job Latency in Data Parallel Clusters (EuroSys '12).
Some of the related TODOs are as follows:
- Aggregating task metrics and historical data/traces
- A mechanism for classifying tasks and identifying the metrics and configurations that contribute to each task's resulting performance
- Utilizing our implementation of the event-based simulator (implemented as a scheduler) to integrate the task time prediction mechanism into the existing components
- Experiments to confirm the accuracy of the simulator
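The TODOs above could fit together roughly as in the following sketch. It is only an illustration under stated assumptions, not the Nemo implementation: the `(task_class, duration_ms)` record shape, the per-class mean as the prediction model, the fixed number of executor slots, and the names `build_predictor` and `simulate_makespan` are all hypothetical.

```python
import heapq
from collections import defaultdict
from statistics import mean


def build_predictor(history):
    """Aggregate historical (task_class, duration_ms) pairs into a predictor.

    Assumption: the per-class mean stands in for whatever statistical
    model is eventually chosen.
    """
    by_class = defaultdict(list)
    for task_class, duration_ms in history:
        by_class[task_class].append(duration_ms)
    averages = {c: mean(d) for c, d in by_class.items()}
    fallback = mean(averages.values())  # used for classes never seen before
    return lambda task_class: averages.get(task_class, fallback)


def simulate_makespan(task_classes, predict, num_slots):
    """Event-based estimate: each task starts on the slot that frees up first."""
    slots = [0.0] * num_slots  # min-heap of the time each slot becomes free
    heapq.heapify(slots)
    finish = 0.0
    for task_class in task_classes:
        start = heapq.heappop(slots)
        end = start + predict(task_class)
        finish = max(finish, end)
        heapq.heappush(slots, end)
    return finish


# Hypothetical trace: two past "map" tasks and one past "reduce" task.
history = [("map", 100), ("map", 140), ("reduce", 300)]
predict = build_predictor(history)
estimate = simulate_makespan(["map", "map", "reduce"], predict, num_slots=2)
print(estimate)
```

For the accuracy experiments, an estimate like this would be compared against the measured completion time of the same job.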
Issue Links
- requires
  - NEMO-477 Implement a model that represents task-level execution time with statistical analysis (Open)
  - NEMO-478 Implement an Accurate Simulator based on Functional model (Open)
  - NEMO-479 Approximate the factors that affect the stage group level execution time (Open)
  - NEMO-476 Fix and Improve metric store logic (Open)