Details
New Feature
Status: Open
Major
Resolution: Unresolved
0.23.0
None
None
Description
Teragen is a good benchmark of raw DFS write throughput, and Terasort is a good benchmark of the whole MR system (input, shuffle, output). I've added a simple "teraread" example that reads through the Terasort input data without performing any processing. It acts as a good benchmark of a read-only workload, similar to real-life "find a needle in a haystack" MR jobs.
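To make the read-only idea concrete, here is a rough local sketch of what "read through the input without processing" amounts to. It is not the attached patch and does not use the MapReduce API. It is a plain-Java stand-in that assumes the input is a local file of 100-byte Teragen-style records. The class name `TeraReadSketch` and the method `countRecords` are hypothetical names chosen for illustration.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical stand-in for a "teraread"-style scan: pull every record
// off the stream and discard it, so only read throughput is measured.
public class TeraReadSketch {
    // Teragen writes fixed-size 100-byte records (10-byte key, 90-byte value).
    static final int RECORD_LENGTH = 100;

    // Reads the stream to the end and returns the number of complete records.
    // A trailing partial record, if any, is ignored.
    static long countRecords(InputStream in) throws IOException {
        byte[] record = new byte[RECORD_LENGTH];
        long records = 0;
        while (true) {
            int filled = 0;
            while (filled < RECORD_LENGTH) {
                int n = in.read(record, filled, RECORD_LENGTH - filled);
                if (n < 0) {
                    return records; // end of stream
                }
                filled += n;
            }
            records++; // record is intentionally not processed
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: TeraReadSketch <local-input-file>");
            return;
        }
        long start = System.nanoTime();
        long records;
        try (InputStream in =
                 new BufferedInputStream(Files.newInputStream(Paths.get(args[0])))) {
            records = countRecords(in);
        }
        // Guard against a zero elapsed time on very small inputs.
        double seconds = Math.max(System.nanoTime() - start, 1L) / 1e9;
        double megabytes = records * (double) RECORD_LENGTH / (1024 * 1024);
        System.out.printf("%d records, %.2f MB/s%n", records, megabytes / seconds);
    }
}
```

In an actual MR job the same effect would presumably come from a map-only job whose mapper emits nothing, but the exact classes used in the patch are not shown here.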