Apache Nemo / NEMO-464

Implement an accurate task execution simulator to predict distributed data processing execution


Details

    • Type: Umbrella
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      We need to predict system performance before the actual execution, because the execution often takes a very long time to complete. Prediction will let us explore the different search spaces in a shorter period of time and find a better solution within them.

      As related work, we can refer to Jockey: Guaranteed Job Latency in Data Parallel Clusters (EuroSys ’12).

      Some of the related TODOs are as follows:

      • Aggregating task metrics and historical data/traces
      • A mechanism for classifying tasks and identifying the metrics and configurations that contribute to each task's resulting performance
      • Using our event-based simulator (implemented as a scheduler) to integrate the task time prediction mechanism into the existing components
      • Experiments to confirm the accuracy of the simulator
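
      To make the TODOs above concrete, here is a minimal sketch of an event-based simulation driven by per-task time predictions. It is not Nemo's simulator: the function names (simulate, relative_error), the input shapes, the fixed slot count, and the "mean of historical durations per stage" predictor are all assumptions chosen for illustration. It also assumes at least one slot and at least one task per stage.

```python
import heapq
from statistics import mean


def simulate(stages, parents, history, num_slots):
    """Return the simulated job completion time in seconds (hypothetical sketch).

    stages:    {stage name: number of tasks}, each stage has >= 1 task
    parents:   {stage name: [stages that must finish first]}
    history:   {stage name: [observed task durations in seconds]}
    num_slots: number of tasks that can run at once, >= 1
    """
    unfinished = dict(stages)  # tasks left to finish per stage
    waiting_on = {s: set(parents.get(s, ())) for s in stages}
    # One queue entry per runnable task, identified by its stage.
    runnable = [s for s, n in stages.items() if not waiting_on[s] for _ in range(n)]
    events = []  # min-heap of (finish_time, sequence_number, stage)
    now, free, seq = 0.0, num_slots, 0

    while runnable or events:
        # Start as many runnable tasks as there are free slots.
        while runnable and free > 0:
            stage = runnable.pop(0)
            # Assumed predictor: mean of the stage's historical durations.
            heapq.heappush(events, (now + mean(history[stage]), seq, stage))
            seq += 1
            free -= 1
        # Advance the clock to the next task-completion event.
        now, _, stage = heapq.heappop(events)
        free += 1
        unfinished[stage] -= 1
        if unfinished[stage] == 0:
            # Stage done: release children whose parents have all finished.
            for child, deps in waiting_on.items():
                if stage in deps:
                    deps.discard(stage)
                    if not deps:
                        runnable.extend([child] * stages[child])
    return now


def relative_error(predicted, actual):
    """Relative error of a predicted completion time against a measured one."""
    return abs(predicted - actual) / actual


predicted = simulate(
    stages={"map": 4, "reduce": 2},
    parents={"reduce": ["map"]},
    history={"map": [10.0, 12.0, 14.0], "reduce": [5.0, 7.0]},
    num_slots=2,
)
print(predicted)  # 30.0: two waves of 12 s map tasks, then one wave of 6 s reduce tasks
```

      Comparing such a prediction against a measured run, for example with relative_error(predicted, actual), is one simple way the accuracy experiments could be scored. A more realistic predictor would replace the per-stage mean with the classification mechanism described above.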

            People

              Assignee: Unassigned
              Reporter: Wonook
              Votes: 0
              Watchers: 2
