Details
- New Feature
- Status: Closed
- Major
- Resolution: Fixed
- None
- None
- Reviewed
Description
Previous benchmarks (HADOOP-2369, HADOOP-3770), while informed by production jobs, were principally load-generating tools used to validate stability and performance under saturation. The important dimensions of that load (submission order and rate, I/O profile, CPU usage, etc.) only accidentally match those of the real load on the cluster. Given related work that characterizes production load (MAPREDUCE-751), it would be worthwhile to use the mined data to impose a corresponding load for tuning and for guiding development of the framework.
The first version will focus on modeling task I/O, submission, and memory usage.
Attachments
Issue Links
- is blocked by
  - MAPREDUCE-751 Rumen: a tool to extract job characterization data from job tracker logs (Closed)
  - MAPREDUCE-966 Rumen interface improvement (Closed)