Details
Description
If an MR job using HCatOutputFormat fails, and FileOutputCommitterContainer::abortJob() is called, one would expect that partitions aren't created/registered with HCatalog.
With dynamic partitions, this behaves correctly. But with static partitions, the partitions are created regardless of whether the job succeeded or failed.
(This manifests as a failure when the job is retried: the retry fails to launch because the partitions already exist from the last failed run.)
This is caused by FileOutputCommitterContainer::cleanupJob(), which seems to add partitions unconditionally. It can be fixed by checking that the output directory exists before adding partitions (in the !dynamicPartitioning case), since abortJob() removes that directory.
We'll have a patch for this shortly. As an aside, we ought to move the partition-creation into commitJob(), where it logically belongs. cleanupJob() is deprecated and common to both success and failure code paths.
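The proposed guard can be sketched as follows. This is a minimal, self-contained model and not the actual patch: the class StaticPartitionCommitterSketch, its fields, and the in-memory list standing in for the metastore are all hypothetical, and local java.nio paths stand in for the job's output location. It only illustrates the two ideas above: cleanupJob() skips registration when abortJob() has already removed the output directory, and commitJob() can register unconditionally because it runs only on success.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical stand-in for the committer; not HCatalog's real class. */
class StaticPartitionCommitterSketch {
    private final Path outputDir;
    private final boolean dynamicPartitioning;

    // Stand-in for the metastore: locations of partitions registered so far.
    final List<String> registeredPartitions = new ArrayList<>();

    StaticPartitionCommitterSketch(Path outputDir, boolean dynamicPartitioning) {
        this.outputDir = outputDir;
        this.dynamicPartitioning = dynamicPartitioning;
    }

    /** Failure path: delete the job's output directory and everything in it. */
    void abortJob() throws IOException {
        if (!Files.exists(outputDir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(outputDir)) {
            // Reverse order visits children before their parent directory.
            List<Path> deepestFirst =
                paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            for (Path p : deepestFirst) {
                Files.delete(p);
            }
        }
    }

    /** Runs on both success and failure, so it must check before registering. */
    void cleanupJob() {
        if (dynamicPartitioning) {
            return; // dynamic case is not modeled; the report says it already works
        }
        // The guard: abortJob() removed the directory, so a missing
        // directory means the job failed and nothing should be registered.
        if (Files.isDirectory(outputDir)) {
            registeredPartitions.add(outputDir.toString());
        }
    }

    /** Alternative placement: runs only on success, so no guard is needed. */
    void commitJob() {
        if (!dynamicPartitioning) {
            registeredPartitions.add(outputDir.toString());
        }
    }
}
```

In this model, calling abortJob() and then cleanupJob() leaves registeredPartitions empty, so a retried job would not find a pre-existing partition. A real fix would use the job's FileSystem and the metastore client rather than local paths and a list.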
Attachments
Issue Links
- depends upon: HCATALOG-466 Update pig version used in unit tests to 0.10.0 (Resolved)
- is related to: PIG-2712 Pig does not call OutputCommitter.abortJob() on the underlying OutputFormat (Closed)