Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Resolved
Description
The dataflow artifact is licensed under GNU GPLv2: https://mvnrepository.com/artifact/org.checkerframework/dataflow
Apache vs. GNU GPL, according to this source:
4. What is the difference between the Apache License 2.0 and the GNU GPL?
The GNU GPL is a copyleft license. So software that uses any GPL-licensed component has to release its full source code and all rights to modify and distribute the entire code. The Apache License 2.0 doesn’t impose any such terms. You’re not forced to release your modified version. Besides, you can choose to release your modified version under a different license (however, you’re required to retain the Apache License for the unmodified parts of the code).
5. Is the Apache License compatible with the GNU GPL?
Apache License 2.0 is compatible with GPLv3, so you can freely mix the code that’s released under these two licenses. The resulting software, however, must be released under GPLv3. However, the Apache License 2.0 is incompatible with GPLv2 due to the restriction that terminates the grant of patent rights if the licensee sues over patent infringement. Previous Apache versions, being heavily based on the BSD license, are compatible.
In the current state, dataflow is included in the Tez dist package, which means anyone using Tez needs to comply with the GPL license:
grep -iRH "org.checkerframework.dataflow" --include="*.jar"
ggrep: tez-plugins/tez-yarn-timeline-cache-plugin/target/tez-yarn-timeline-cache-plugin-0.10.2-SNAPSHOT-jar-with-dependencies.jar: binary file matches
ggrep: tez-plugins/tez-history-parser/target/tez-history-parser-0.10.2-SNAPSHOT-jar-with-dependencies.jar: binary file matches
ggrep: tez-dist/target/tez-0.10.2-SNAPSHOT/lib/checker-qual-2.5.2.jar: binary file matches
ggrep: tez-dist/target/tez-0.10.2-SNAPSHOT/lib/hadoop-shaded-guava-1.1.1.jar: binary file matches
For example, looking inside the checker-qual jar:
jar -tf tez-dist/target/tez-0.10.2-SNAPSHOT/lib/checker-qual-2.5.2.jar | grep dataflow
org/checkerframework/dataflow/
org/checkerframework/dataflow/qual/
org/checkerframework/dataflow/qual/Pure$Kind.class
org/checkerframework/dataflow/qual/TerminatesExecution.class
org/checkerframework/dataflow/qual/SideEffectFree.class
org/checkerframework/dataflow/qual/Pure.class
org/checkerframework/dataflow/qual/Deterministic.class
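The grep above only reports that a jar matches somewhere in its bytes. To see which entries are actually involved in each jar, the check can be sketched with Python's standard `zipfile` module. This is a minimal sketch, not part of the Tez build: the function name `find_package_entries` and the constant `DATAFLOW_PREFIX` are hypothetical, and the prefix is simply the package path from the commands above.

```python
import zipfile
from pathlib import Path

# Assumption: the package path taken from the grep/jar commands in this issue.
DATAFLOW_PREFIX = "org/checkerframework/dataflow/"


def find_package_entries(root, prefix=DATAFLOW_PREFIX):
    """Map each jar under `root` to its entries whose names start with `prefix`."""
    hits = {}
    for jar in sorted(Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                names = [n for n in zf.namelist() if n.startswith(prefix)]
        except zipfile.BadZipFile:
            # Skip files that carry a .jar suffix but are not valid archives.
            continue
        if names:
            hits[str(jar)] = names
    return hits
```

Calling `find_package_entries("tez-dist/target")` after a build would return a dictionary keyed by jar path, which makes it easier to tell a jar that ships the classes from one that only mentions the package name.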
The problem is that it is dangerous to remove hadoop-shaded-guava altogether from the distribution, as other Hadoop jars depend on it directly.
Without a fix on the Hadoop side, we should try to manually remove the problematic package from hadoop-shaded-guava.jar and from every other *jar-with-dependencies.jar built by Tez.
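One way the manual removal could look is to rewrite each affected jar without the entries under the package path. The following is a rough sketch under stated assumptions, not the fix that was eventually applied: `strip_package` is a hypothetical helper, it uses only Python's standard library, and it does not handle signed jars or any build-system integration.

```python
import os
import tempfile
import zipfile


def strip_package(jar_path, prefix="org/checkerframework/dataflow/"):
    """Rewrite `jar_path` without entries under `prefix`; return the number removed."""
    removed = 0
    # Write the filtered copy next to the original so the final swap stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(
        suffix=".jar", dir=os.path.dirname(os.path.abspath(jar_path))
    )
    os.close(fd)
    try:
        with zipfile.ZipFile(jar_path) as src, \
                zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as dst:
            for info in src.infolist():
                if info.filename.startswith(prefix):
                    removed += 1
                    continue
                # Passing the ZipInfo keeps each entry's name, timestamp and compression type.
                dst.writestr(info, src.read(info))
        if removed:
            os.replace(tmp_path, jar_path)
    finally:
        # Clean up the temporary copy when nothing was removed or an error occurred.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
    return removed
```

A jar that contains nothing under the prefix is left untouched, so the helper can be run repeatedly over the whole distribution directory.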
Issue Links
- is fixed by TEZ-4504 Upgrade Guava to 32.0.1 due to CVE-2023-2976 (Resolved)
- is related to HADOOP-18086 Remove org.checkerframework.dataflow from hadoop-shaded-guava artifact (GNU GPLv2 license) (Resolved)
- links to