Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Reviewed
Description
HBaseStorage.initializeHBaseClassLoaderResources() uses TableMapReduceUtil APIs to add dependency jars. This sets the tmpjars setting, which makes JobClient upload the jars to HDFS and use those paths in the distributed cache. That bypasses the optimizations in PIG-2672 and PIG-3861, which avoid shipping the jars to HDFS. Instead, HBaseStorage should implement the getShipFiles() API introduced in PIG-4141, so that PIG-2672 or PIG-3861 can avoid shipping the same jar to HDFS multiple times for a job.
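
To illustrate the idea, here is a minimal standalone sketch of what a getShipFiles()-style method could return: the local jar paths of the classes a loader depends on, with duplicates removed. It deliberately uses only the JDK. The class name ShipFilesSketch and the helper locationOf are hypothetical and are not part of Pig or HBase; the assumption is that the real override in HBaseStorage would return a similar List<String> of local paths and leave the upload and caching decisions to Pig.

```java
import java.net.URL;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the actual HBaseStorage code.
final class ShipFilesSketch {

    // Returns the jar or directory a class was loaded from,
    // or null when the JVM does not expose one (e.g. JDK core classes).
    static String locationOf(Class<?> clazz) {
        CodeSource source = clazz.getProtectionDomain().getCodeSource();
        if (source == null) {
            return null;
        }
        URL location = source.getLocation();
        return location == null ? null : location.getPath();
    }

    // Collects the distinct local paths for the given dependency classes.
    // Several classes from the same jar contribute that jar only once,
    // and insertion order is preserved.
    static List<String> getShipFiles(List<Class<?>> dependencyClasses) {
        Set<String> paths = new LinkedHashSet<>();
        for (Class<?> clazz : dependencyClasses) {
            String path = locationOf(clazz);
            if (path != null) {
                paths.add(path);
            }
        }
        return new ArrayList<>(paths);
    }

    public static void main(String[] args) {
        List<Class<?>> deps = new ArrayList<>();
        // In HBaseStorage these would be HBase client classes and their
        // dependencies; this class stands in so the sketch runs anywhere.
        deps.add(ShipFilesSketch.class);
        deps.add(ShipFilesSketch.class); // same location, listed once
        deps.add(String.class);          // no code source, skipped
        for (String path : getShipFiles(deps)) {
            System.out.println(path);
        }
    }
}
```

The point of returning local paths instead of setting tmpjars directly is that the caller sees the full list of jars for the job and can decide once per jar whether it needs to be copied to HDFS at all.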