Details
Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Duplicate
Affects Version/s: 2.6.5
Fix Version/s: None
Component/s: None
Labels: None
Environment:
zstd 1.3.3
hadoop 2.6.5
--- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zstd/TestZStandardCompressorDecompressor.java
+++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zstd/TestZStandardCompressorDecompressor.java
@@ -62,10 +62,8 @@
   @BeforeClass
   public static void beforeClass() throws Exception {
     CONFIGURATION.setInt(IO_FILE_BUFFER_SIZE_KEY, 1024 * 64);
-    uncompressedFile = new File(TestZStandardCompressorDecompressor.class
-        .getResource("/zstd/test_file.txt").toURI());
-    compressedFile = new File(TestZStandardCompressorDecompressor.class
-        .getResource("/zstd/test_file.txt.zst").toURI());
+    uncompressedFile = new File("/tmp/badcase.data");
+    compressedFile = new File("/tmp/badcase.data.zst");
Description
Problem:
In our production environment we write files to HDFS with the zstd compressor. Recently we found that a specific file can make the ZStandard compressor fail, and we can reproduce the issue with that file (attached as badcase.data).
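For reference, a minimal standalone round trip over the attached file using Hadoop's public codec API (a sketch only: the class name, /tmp paths and local-file round trip are ours, mirroring the modified test above, and a native Hadoop build with zstd support is assumed):

// Reproduction sketch, not part of the original report.
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CompressionInputStream;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.io.compress.ZStandardCodec;

public class ZstdBadCaseRepro {
  public static void main(String[] args) throws Exception {
    ZStandardCodec codec = new ZStandardCodec();
    codec.setConf(new Configuration());

    // Compress the attached bad-case file; the compressor failure described
    // above is expected to surface during this step.
    try (InputStream raw = new FileInputStream("/tmp/badcase.data");
         CompressionOutputStream out =
             codec.createOutputStream(new FileOutputStream("/tmp/badcase.data.zst"))) {
      IOUtils.copyBytes(raw, out, 64 * 1024);
    }

    // Decompress it again to complete the round trip.
    try (CompressionInputStream in =
             codec.createInputStream(new FileInputStream("/tmp/badcase.data.zst"));
         OutputStream back = new FileOutputStream("/tmp/badcase.data.roundtrip")) {
      IOUtils.copyBytes(in, back, 64 * 1024);
    }
  }
}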
Analysis:
ZStandardCompressor uses a single bufferSize (taken from zstd's recommended compressed output buffer size) for both inBufferSize and outBufferSize, but zstd actually provides two separate recommendations: one for the input buffer size and one for the output buffer size.
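A simplified illustration of that point (a sketch, not the actual ZStandardCompressor source; the names here are made up for clarity):

// Sketch only: one size driving both direct buffers vs. honouring zstd's
// two separate recommendations.
import java.nio.ByteBuffer;

class BufferSizingSketch {
  // What the analysis above describes today: a single size (the recommended
  // output size) used for both the input and output direct buffers.
  static ByteBuffer[] singleSize(int bufferSize) {
    return new ByteBuffer[] {
        ByteBuffer.allocateDirect(bufferSize),    // uncompressed (input) side
        ByteBuffer.allocateDirect(bufferSize) };  // compressed (output) side
  }

  // What zstd actually recommends: separate input and output sizes
  // (131072 and 131591 in the workaround below).
  static ByteBuffer[] separateSizes(int recommendedInSize, int recommendedOutSize) {
    return new ByteBuffer[] {
        ByteBuffer.allocateDirect(recommendedInSize),
        ByteBuffer.allocateDirect(recommendedOutSize) };
  }
}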
Workaround:
One workaround is to use the separate recommended input and output buffer sizes provided by the zstd library; this avoids the problem, although we do not yet know exactly why.
zstd recommended input buffer size: 131072 (128 * 1024)
zstd recommended output buffer size: 131591
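As a rough sanity check on these numbers (our own arithmetic, based on zstd's documented ZSTD_compressBound() formula, not on anything in this report): 131591 is the compress bound of one full 128 KiB block plus a 3-byte block header and a 4-byte checksum.

// Arithmetic check only; the constants mirror zstd's documented formula.
public class ZstdBufferSizeCheck {
  public static void main(String[] args) {
    int inSize = 128 * 1024;             // recommended input size == one full block == 131072
    int bound = inSize + (inSize >> 8);  // ZSTD_compressBound(131072) == 131584
    int outSize = bound + 3 + 4;         // + block header (3) + checksum (4) == 131591
    System.out.println(inSize + " -> " + outSize);
  }
}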
Attachments
Issue Links
- duplicates
  HADOOP-17096 Fix ZStandardCompressor input buffer offset (Resolved)
- relates to
  HADOOP-13578 Add Codec for ZStandard Compression (Resolved)