Details
Description
In Hadoop, we use native libs for the lz4 codec, which has several disadvantages:
1. It requires the native libhadoop to be installed on the system LD_LIBRARY_PATH, and it has to be installed separately on each node of the cluster, in container images, and in local test environments, which adds huge complexity from a deployment point of view. In some environments, it requires compiling the natives from source, which is non-trivial. This approach is also platform dependent; the binary may not work on a different platform, so it requires recompilation.
2. It requires extra configuration of java.library.path to load the natives, which results in higher application deployment and maintenance costs for users.
Projects such as Spark use lz4-java, a JNI-based implementation. It bundles native binaries in its jar file and can automatically load them into the JVM from the jar without any setup. If a native implementation cannot be found for the platform, it falls back to a pure-Java implementation of LZ4.
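A minimal sketch of that loading behavior, using lz4-java's public net.jpountz.lz4 API (the class name Lz4JavaDemo and the sample payload are illustrative assumptions, not part of this issue):

import java.nio.charset.StandardCharsets;
import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;
import net.jpountz.lz4.LZ4FastDecompressor;

public class Lz4JavaDemo {
    public static void main(String[] args) {
        // fastestInstance() prefers the JNI binding shipped inside the jar and
        // transparently falls back to the pure-Java implementation when no
        // native binary matches the current platform; no java.library.path
        // or LD_LIBRARY_PATH setup is needed.
        LZ4Factory factory = LZ4Factory.fastestInstance();

        byte[] data = "sample payload".getBytes(StandardCharsets.UTF_8);

        LZ4Compressor compressor = factory.fastCompressor();
        byte[] compressed = compressor.compress(data);

        // The fast decompressor needs the original length up front.
        LZ4FastDecompressor decompressor = factory.fastDecompressor();
        byte[] restored = decompressor.decompress(compressed, data.length);

        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}

The same factory object serves both directions, so a codec implementation only has to resolve it once at startup and the native-vs-Java decision is hidden from callers.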
Attachments
Issue Links
- breaks
  - HADOOP-17390 Skip license check on lz4 code files (Resolved)
- is related to
  - HADOOP-17399 lz4 sources missing for native Visual Studio project (Resolved)
  - HADOOP-17464 Create hadoop-compression module (Open)
  - HDFS-15690 Add lz4-java as hadoop-hdfs test dependency (Resolved)
- relates to
  - HADOOP-17532 Yarn Job execution get failed when LZ4 Compression Codec is used (Resolved)
  - HADOOP-17891 lz4-java and snappy-java should be excluded from relocation in shaded Hadoop libraries (Resolved)
  - HADOOP-17125 Using snappy-java in SnappyCodec (Resolved)
- links to