Description
There appear to be edge cases whereby CRC checks may be circumvented when requests for checksums from the source or target file system fail. In this event the CRCs could differ between the source and target, yet the DistCp copy would still succeed, even when the 'skip CRC check' option is not in use.
The code in question is in the method org.apache.hadoop.tools.util.DistCpUtils#checksumsAreEqual(...).
Specifically, this code block shows that if there is a failure when trying to read the source or target checksum, the method returns true (i.e. the checksums are equal), implying that the check succeeded. In fact we have merely failed to obtain the checksum and could not perform the check at all.
try {
  sourceChecksum = sourceChecksum != null ? sourceChecksum
      : sourceFS.getFileChecksum(source);
  targetChecksum = targetFS.getFileChecksum(target);
} catch (IOException e) {
  // The exception is logged but swallowed, so a failed retrieval
  // leaves the corresponding checksum null.
  LOG.error("Unable to retrieve checksum for " + source + " or " + target, e);
}
// A null checksum on either side makes the whole expression true,
// i.e. the comparison "passes" precisely when it could not be performed.
return (sourceChecksum == null || targetChecksum == null ||
    sourceChecksum.equals(targetChecksum));
I believe that, at the very least, the caught IOException should be re-thrown. If that is not deemed desirable, then an option (--strictCrc?) should be added to enforce a strict check, where we require that both the source and target CRCs are retrieved, are non-null, and compare equal. If either CRC retrieval fails for any reason, an exception is thrown.
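To make the proposal concrete, here is a minimal sketch of what a strict variant might look like. The method name strictChecksumsAreEqual and the exception messages are illustrative assumptions, not a proposed patch; the sketch simply re-throws the IOException and treats a null checksum as a failure (a point expanded on below).

import java.io.IOException;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StrictChecksumCheck {
  /**
   * Hypothetical strict variant of checksumsAreEqual: any failure to
   * obtain either checksum, including a null result from a FileSystem
   * that does not support checksums, aborts the comparison instead of
   * silently passing.
   */
  public static boolean strictChecksumsAreEqual(FileSystem sourceFS, Path source,
      FileSystem targetFS, Path target) throws IOException {
    FileChecksum sourceChecksum;
    FileChecksum targetChecksum;
    try {
      sourceChecksum = sourceFS.getFileChecksum(source);
      targetChecksum = targetFS.getFileChecksum(target);
    } catch (IOException e) {
      // Re-throw rather than swallow: we failed to perform the check.
      throw new IOException("Unable to retrieve checksum for " + source
          + " or " + target, e);
    }
    if (sourceChecksum == null || targetChecksum == null) {
      // A null checksum means the FileSystem could not produce one;
      // under a strict check this is a failure, not a pass.
      throw new IOException("Checksum not available for " + source
          + " or " + target + "; strict CRC check cannot be performed");
    }
    return sourceChecksum.equals(targetChecksum);
  }
}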
Clearly, some FileSystems do not support CRCs, and invocations of FileSystem.getFileChecksum(...) return null in these instances. I would suggest that these should fail a strict CRC check, to prevent users from developing a false sense of security in their copy pipeline.
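As a quick illustration of that behaviour, the base FileSystem implementation is permitted to return null from getFileChecksum(...) when no checksum algorithm is implemented, so a caller can probe a path for checksum support directly. This is a hedged, self-contained example; the class name and output format are illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.StringUtils;

public class ChecksumSupportProbe {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path(args[0]);
    FileSystem fs = path.getFileSystem(conf);
    FileChecksum checksum = fs.getFileChecksum(path);
    if (checksum == null) {
      // No checksum algorithm implemented for this FileSystem; a
      // strict check should treat this as a failure, not a pass.
      System.out.println("No checksum support for " + path);
    } else {
      System.out.println(checksum.getAlgorithmName() + ": "
          + StringUtils.byteToHexString(checksum.getBytes()));
    }
  }
}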