Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
- 3.1.0
- None
Description
When running distcp without -skipcrccheck, a checksum mismatch between source and destination store types (e.g. HDFS to S3) produces an error message that talks about block size, even when the underlying checksum protocol itself is the cause of the failure:
Source and target differ in block-size. Use -pb to preserve block-sizes during copy. Alternatively, skip checksum-checks altogether, using -skipCrc. (NOTE: By skipping checksums, one runs the risk of masking data-corruption during file-transfer.)
Update: the CRC check always takes place on a distcp upload before the file is renamed into place, and it cannot be disabled at that point.
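The distinction the description asks for can be sketched as follows. This is only an illustrative model, not the DistCp implementation: the `FileChecksum` class, the `explain_mismatch` function, the algorithm names, and the message wording other than the quoted block-size text are all hypothetical. The idea is to check whether the two stores even use the same checksum algorithm before blaming block size.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileChecksum:
    """Hypothetical stand-in for the checksum a filesystem reports for a file."""
    algorithm: str  # illustrative names only, e.g. "hdfs-crc" or "s3-etag"
    value: bytes


def explain_mismatch(src: FileChecksum, dst: FileChecksum,
                     src_block_size: int, dst_block_size: int) -> Optional[str]:
    """Return a diagnostic for a failed comparison, or None if the checksums match."""
    # Different algorithms can never be compared meaningfully, so report
    # that first instead of falling through to the block-size message.
    if src.algorithm != dst.algorithm:
        return ("Checksum algorithms differ ({} vs {}); checksums cannot be "
                "compared across these store types.").format(src.algorithm, dst.algorithm)
    if src.value == dst.value:
        return None
    # Same algorithm, different values: block size is a plausible cause
    # only if the block sizes actually differ.
    if src_block_size != dst_block_size:
        return ("Source and target differ in block-size. "
                "Use -pb to preserve block-sizes during copy.")
    return "Checksum mismatch with identical block sizes; possible data corruption."


if __name__ == "__main__":
    hdfs = FileChecksum("hdfs-crc", b"\x01\x02")
    s3 = FileChecksum("s3-etag", b"\xaa\xbb")
    print(explain_mismatch(hdfs, s3, 128 * 1024 * 1024, 32 * 1024 * 1024))
```

With this ordering, an HDFS to S3 copy reports the algorithm difference rather than suggesting -pb, which would not help in that case.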
Attachments
Issue Links
- depends upon: HADOOP-15297 Make S3A etag => checksum feature optional (Resolved)
- is depended upon by: HADOOP-14831 Über-jira: S3a phase IV: Hadoop 3.1 features (Resolved)
- is related to: HADOOP-16536 Backport HADOOP-16158 and HADOOP-15273 to branch-2 (Patch Available)
- relates to: HADOOP-13282 S3 blob etags to be made visible in S3A status/getFileChecksum() calls (Resolved)