[HADOOP-18564] Use file-level checksum by default when copying between two different file systems - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels:
- DistCp

Description

Goal

Reduce user friction

Background

When distcp'ing between two different file systems, distcp still uses block-level checksum by default, even though the two file systems can be very different in how they manage blocks, so that a block-level checksum no longer makes sense between these two.

e.g. distcp between HDFS and Ozone without overriding dfs.checksum.combine.mode throws IOException because the blocks of the same file on two FSes are different (as expected):

$ hadoop distcp -i -pp /test o3fs://buck-test1.vol1.ozone1/
java.lang.Exception: java.io.IOException: File copy failed: hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin --> o3fs://buck-test1.vol1.ozone1/test/test.bin
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.io.IOException: File copy failed: hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin --> o3fs://buck-test1.vol1.ozone1/test/test.bin
	at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262)
	at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:219)
	at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:48)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin to o3fs://buck-test1.vol1.ozone1/test/test.bin
	at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
	at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258)
	... 11 more
Caused by: java.io.IOException: Checksum mismatch between hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin and o3fs://buck-test1.vol1.ozone1/.distcp.tmp.attempt_local1346550241_0001_m_000000_0.Source and destination filesystems are of different types
Their checksum algorithms may be incompatible You can choose file-level checksum validation via -Ddfs.checksum.combine.mode=COMPOSITE_CRC when block-sizes or filesystems are different. Or you can skip checksum-checks altogether  with -skipcrccheck.

And it works when we use a file-level checksum like COMPOSITE_CRC:

With -Ddfs.checksum.combine.mode=COMPOSITE_CRC

$ hadoop distcp -i -pp /test o3fs://buck-test2.vol1.ozone1/ -Ddfs.checksum.combine.mode=COMPOSITE_CRC
22/10/18 19:07:42 INFO mapreduce.Job: Job job_local386071499_0001 completed successfully
22/10/18 19:07:42 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=219900
		FILE: Number of bytes written=794129
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=0
		HDFS: Number of bytes written=0
		HDFS: Number of read operations=13
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
		HDFS: Number of bytes read erasure-coded=0
		O3FS: Number of bytes read=0
		O3FS: Number of bytes written=0
		O3FS: Number of read operations=5
		O3FS: Number of large read operations=0
		O3FS: Number of write operations=0
..

Alternative

(if changing global defaults could potentially break distcp'ing between HDFS/S3/etc. Also weichiu mentioned COMPOSITE_CRC is only added in Hadoop 3.1.1. So this might be the only way.)

Don't touch the global default, and make it a client-side config.

e.g. add a config to allow automatically usage of COMPOSITE_CRC (dfs.checksum.combine.mode) when distcp'ing between HDFS and Ozone, which would be the equivalent of specifying -Ddfs.checksum.combine.mode=COMPOSITE_CRC on the distcp command but the end user won't have to specify it every single time.

cc duongnguyen weichiu

Use file-level checksum by default when copying between two different file systems

Details

Description

Goal

Background

Alternative

Attachments

Activity

People

Dates