Details
- Type: Bug
- Status: Open
- Priority: Minor
- Resolution: Unresolved
- 2.0.2-alpha
- None
- None
Description
If distcp cannot read the checksums for the source and destination files, for any reason, it ignores the checksums and overwrites the destination file. It does produce a log message, but I think the correct behavior would be to throw an error and stop the distcp.
A user who really wants to ignore checksums can pass -skipcrccheck to do so.
The relevant code is in DistCpUtils#checksumsAreEqual:
    try {
      sourceChecksum = sourceFS.getFileChecksum(source);
      targetChecksum = targetFS.getFileChecksum(target);
    } catch (IOException e) {
      LOG.error("Unable to retrieve checksum for " + source + " or " + target, e);
    }
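To make the proposed behavior concrete, here is a minimal, self-contained sketch of a fail-fast variant. It is not the Hadoop code and not a patch: `ChecksumReader` is a hypothetical stand-in for the `FileSystem#getFileChecksum` call, checksums are modeled as plain strings, and the `skipCrcCheck` parameter is an assumed way of wiring in the -skipcrccheck option. How a missing (null) checksum should be treated is left out of scope.

```java
import java.io.IOException;
import java.util.Objects;

class StrictChecksumCheck {

    /** Hypothetical stand-in for FileSystem#getFileChecksum; not a Hadoop type. */
    interface ChecksumReader {
        String getFileChecksum(String path) throws IOException;
    }

    /**
     * Compares checksums, but propagates a read failure instead of logging it
     * and carrying on. skipCrcCheck models the user passing -skipcrccheck.
     */
    static boolean checksumsAreEqual(ChecksumReader sourceFS, String source,
                                     ChecksumReader targetFS, String target,
                                     boolean skipCrcCheck) throws IOException {
        if (skipCrcCheck) {
            // The user explicitly opted out, so no checksum is read at all.
            return true;
        }
        String sourceChecksum;
        String targetChecksum;
        try {
            sourceChecksum = sourceFS.getFileChecksum(source);
            targetChecksum = targetFS.getFileChecksum(target);
        } catch (IOException e) {
            // Fail the copy rather than silently overwriting the destination.
            throw new IOException("Unable to retrieve checksum for " + source
                    + " or " + target + "; use -skipcrccheck to ignore checksums", e);
        }
        // Assumption: plain equality; null handling is outside this sketch.
        return Objects.equals(sourceChecksum, targetChecksum);
    }

    public static void main(String[] args) {
        ChecksumReader readable = path -> "abc123";
        ChecksumReader unreadable = path -> {
            throw new IOException("checksum file not readable: " + path);
        };
        try {
            checksumsAreEqual(unreadable, "/src/file", readable, "/dst/file", false);
        } catch (IOException e) {
            System.out.println("distcp would stop here: " + e.getMessage());
        }
    }
}
```

With this shape, the only way to get past an unreadable checksum is the explicit opt-out, which matches the intent described above.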
Issue Links
- is related to: HADOOP-15788 Improve Distcp for long-haul/cloud deployments (Open)
- relates to: HDFS-3054 distcp -skipcrccheck has no effect (Closed)