Description
We need a single Pig installation that works with both Hadoop versions. The current shim implementation assumes a separate build for each version. We can solve this either statically, through an internal build/installation system, or by making the shim dynamic so that pig.jar works on both versions with runtime detection. The attached patch converts the static shims into a shim interface with two implementations, each of which is compiled against its respective Hadoop version and included in a single pig.jar (similar to what Hive does).
The default build behavior remains unchanged: only the shim for ${hadoopversion} is compiled. Both shims can be built via: ant -Dbuild-all-shims=true
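To illustrate the dynamic-shim idea, here is a minimal sketch of runtime selection between two shim implementations. It is not the attached patch: the names HadoopShim, Hadoop20Shim, Hadoop23Shim and ShimLoader, the describe() method, and the version-prefix mapping are all assumptions made for the example. The version string is passed in as a parameter; a real implementation might obtain it from Hadoop's org.apache.hadoop.util.VersionInfo.getVersion().

```java
// Hypothetical shim interface: the only Hadoop-version-specific surface
// that common code would call. The method is illustrative only.
interface HadoopShim {
    String describe();
}

// In a real build each implementation would be compiled against its own
// Hadoop version; here they are plain stubs so the sketch is self-contained.
class Hadoop20Shim implements HadoopShim {
    public String describe() { return "shim compiled against Hadoop 0.20"; }
}

class Hadoop23Shim implements HadoopShim {
    public String describe() { return "shim compiled against Hadoop 0.23"; }
}

final class ShimLoader {
    private ShimLoader() {}

    // Maps a Hadoop version string to a shim class name (assumed mapping).
    static String shimClassFor(String hadoopVersion) {
        if (hadoopVersion.startsWith("0.20.")) return "Hadoop20Shim";
        if (hadoopVersion.startsWith("0.23.")) return "Hadoop23Shim";
        throw new IllegalArgumentException("No shim for Hadoop version " + hadoopVersion);
    }

    // Loads the shim by name, so that only the selected implementation
    // is ever linked at runtime.
    static HadoopShim load(String hadoopVersion) {
        // Look the shim up next to the interface (same package or enclosing class).
        String ifaceName = HadoopShim.class.getName();
        String prefix = ifaceName.substring(0, ifaceName.length() - "HadoopShim".length());
        try {
            Class<?> shimClass = Class.forName(prefix + shimClassFor(hadoopVersion));
            return (HadoopShim) shimClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load shim for " + hadoopVersion, e);
        }
    }
}
```

Common code would call something like ShimLoader.load(version) once at startup and go through the returned interface from then on. Loading by class name is what keeps the unused shim, and the Hadoop classes it references, from being resolved.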
Attachments
Issue Links
- is related to: PIG-2125 Make Pig work with hadoop .NEXT (Closed)
- relates to: HCATALOG-10 Shouldn't assume the secure hadoop installation (Closed)
Even though the common code (outside the shims) can be compiled against either of the Hadoop MR versions, it has to run against the version it was compiled against, because several types changed from class to interface. Wherever we have a call such as jobcontext.getConfiguration(), the bytecode for the method call differs depending on whether jobcontext is a class or an interface: it compiles fine against either version but fails at runtime against the other. Other places, such as somemethod(JobContext context), don't have that problem. It was possible to get this working for basic illustrate, but as soon as MR comes into the picture there are many places in the common code that are affected, and it is not reasonably possible to shim all of them.
Our solution will be an installer that contains a set of jar files compiled against both versions and resolves the dependency at startup/install time.
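The class-versus-interface problem described above can be seen in a small sketch. The types below are stand-ins, not Hadoop classes: ClassStyleContext plays the role of JobContext where it is a class, and InterfaceStyleContext the role of JobContext where it is an interface. All names are assumptions made for the example.

```java
// Stand-in for JobContext declared as a concrete class.
class ClassStyleContext {
    String getConfiguration() { return "conf"; }
}

// Stand-in for JobContext declared as an interface.
interface InterfaceStyleContext {
    String getConfiguration();
}

class InterfaceStyleContextImpl implements InterfaceStyleContext {
    public String getConfiguration() { return "conf"; }
}

final class CallSites {
    private CallSites() {}

    // The source text of the call is identical in both methods below,
    // but javac emits a different instruction for each.

    // Compiled as: invokevirtual ClassStyleContext.getConfiguration
    static String viaClass(ClassStyleContext ctx) {
        return ctx.getConfiguration();
    }

    // Compiled as: invokeinterface InterfaceStyleContext.getConfiguration
    static String viaInterface(InterfaceStyleContext ctx) {
        return ctx.getConfiguration();
    }

    // Only mentions the type in its signature and invokes nothing on it,
    // so no class-vs-interface-specific invoke instruction is generated.
    static boolean accepts(InterfaceStyleContext ctx) {
        return ctx != null;
    }
}
```

Running javap -c CallSites on the compiled class shows the two different invoke instructions. If the type behind an invokevirtual call site turns out to be an interface at runtime (or the reverse), the JVM raises IncompatibleClassChangeError, which is why common code compiled against one Hadoop version cannot simply be run against the other.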