Details
- Improvement
- Status: Resolved
- Major
- Resolution: Won't Fix
- 2.5.0
- None
- None
Description
Currently, the IFile format used by the MapReduce (MR) shuffle checksums all data using the zlib CRC32 polynomial. If we allow the use of CRC32C instead, we can get a large reduction in CPU usage by leveraging the native hardware CRC32C implementation (approximately half a second of CPU time saved per GB checksummed).
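To illustrate the difference between the two polynomials, here is a minimal sketch using only the JDK's java.util.zip classes (CRC32C requires Java 9 or later). It is not the IFile code and does not use Hadoop's own checksum classes; the class name ChecksumChoice, the helper methods, and the boolean switch are hypothetical names chosen for this example. Because both JDK classes implement the Checksum interface, a writer could in principle select the polynomial behind a single switch.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CRC32C;
import java.util.zip.Checksum;

public class ChecksumChoice {
    // Hypothetical helper: choose the polynomial behind one flag.
    // CRC32 is the zlib polynomial; CRC32C is the Castagnoli polynomial.
    static Checksum newChecksum(boolean useCrc32c) {
        return useCrc32c ? new CRC32C() : new CRC32();
    }

    // Checksums a whole buffer and returns the 32-bit value as a long.
    static long checksumOf(byte[] data, boolean useCrc32c) {
        Checksum sum = newChecksum(useCrc32c);
        sum.update(data, 0, data.length);
        return sum.getValue();
    }

    public static void main(String[] args) {
        // "123456789" is the conventional check input for CRC algorithms.
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.printf("CRC32  = %08x%n", checksumOf(data, false)); // cbf43926
        System.out.printf("CRC32C = %08x%n", checksumOf(data, true));  // e3069283
    }
}
```

The two values differ for the same input, so a reader has to know which polynomial the writer used; any real change to the format would need to record or negotiate that choice.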
Attachments
Issue Links
- is related to
  - HDFS-3528 Use native CRC32 in DFS write path (Closed)
- relates to
  - MAPREDUCE-2841 Task level native optimization (Resolved)
  - HADOOP-10859 Native implementation of java Checksum interface (Resolved)