Details
-
Improvement
-
Status: Open
-
Major
-
Resolution: Unresolved
-
None
-
None
-
None
Description
Goal
Reduce user friction
Background
When distcp'ing between two different file systems, distcp still uses block-level checksum by default, even though the two file systems can be very different in how they manage blocks, so that a block-level checksum no longer makes sense between these two.
e.g. distcp between HDFS and Ozone without overriding dfs.checksum.combine.mode throws IOException because the blocks of the same file on two FSes are different (as expected):
$ hadoop distcp -i -pp /test o3fs://buck-test1.vol1.ozone1/ java.lang.Exception: java.io.IOException: File copy failed: hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin --> o3fs://buck-test1.vol1.ozone1/test/test.bin at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552) Caused by: java.io.IOException: File copy failed: hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin --> o3fs://buck-test1.vol1.ozone1/test/test.bin at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262) at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:219) at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:48) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.IOException: Couldn't run retriable-command: Copying hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin to o3fs://buck-test1.vol1.ozone1/test/test.bin at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101) at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258) ... 11 more Caused by: java.io.IOException: Checksum mismatch between hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin and o3fs://buck-test1.vol1.ozone1/.distcp.tmp.attempt_local1346550241_0001_m_000000_0.Source and destination filesystems are of different types Their checksum algorithms may be incompatible You can choose file-level checksum validation via -Ddfs.checksum.combine.mode=COMPOSITE_CRC when block-sizes or filesystems are different. Or you can skip checksum-checks altogether with -skipcrccheck.
And it works when we use a file-level checksum like COMPOSITE_CRC:
$ hadoop distcp -i -pp /test o3fs://buck-test2.vol1.ozone1/ -Ddfs.checksum.combine.mode=COMPOSITE_CRC 22/10/18 19:07:42 INFO mapreduce.Job: Job job_local386071499_0001 completed successfully 22/10/18 19:07:42 INFO mapreduce.Job: Counters: 30 File System Counters FILE: Number of bytes read=219900 FILE: Number of bytes written=794129 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=0 HDFS: Number of bytes written=0 HDFS: Number of read operations=13 HDFS: Number of large read operations=0 HDFS: Number of write operations=2 HDFS: Number of bytes read erasure-coded=0 O3FS: Number of bytes read=0 O3FS: Number of bytes written=0 O3FS: Number of read operations=5 O3FS: Number of large read operations=0 O3FS: Number of write operations=0 ..
Alternative
(if changing global defaults could potentially break distcp'ing between HDFS/S3/etc. Also weichiu mentioned COMPOSITE_CRC is only added in Hadoop 3.1.1. So this might be the only way.)
Don't touch the global default, and make it a client-side config.
e.g. add a config to allow automatically usage of COMPOSITE_CRC (dfs.checksum.combine.mode) when distcp'ing between HDFS and Ozone, which would be the equivalent of specifying -Ddfs.checksum.combine.mode=COMPOSITE_CRC on the distcp command but the end user won't have to specify it every single time.