Details
Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Version: 3.0.0
Description
Currently, bootstrap REPL LOAD expects the target database to be empty or nonexistent before it starts a bootstrap load.
This adds overhead when a failure occurs partway through a bootstrap load, because there is no way to resume from the point of failure. Checkpoints are therefore needed on tables and partitions so that a retry can skip objects that were already loaded completely.
Use the fully qualified path of the dump directory as the checkpoint identifier. A task should add it to the table or partition properties in Hive, and that task should be the last one in the DAG for table or partition creation.
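The checkpoint idea described above can be sketched with a small in-memory model. This is only an illustration, not Hive's implementation: the class `CheckpointSketch`, the `Table` type, the property name `repl.checkpoint`, and the `failAt` parameter (used to simulate a mid-load failure) are all hypothetical names introduced here. The point it shows is that the checkpoint is written as the last step for each object, so a partially loaded object never carries it and is reloaded on retry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CheckpointSketch {
    // Hypothetical property name, chosen for illustration only.
    static final String CKPT_KEY = "repl.checkpoint";

    /** In-memory stand-in for a target table and its properties. */
    static final class Table {
        final Map<String, String> properties = new HashMap<>();
        boolean dataLoaded = false;
    }

    /**
     * Loads every table listed in the dump unless it already carries the
     * checkpoint for this dump directory. Returns the names of the tables
     * that this call actually loaded. If failAt names a table, a failure
     * is simulated while that table is being loaded.
     */
    static List<String> bootstrapLoad(Map<String, Table> target,
                                      List<String> dumpTables,
                                      String dumpDir,
                                      String failAt) {
        List<String> loaded = new ArrayList<>();
        for (String name : dumpTables) {
            Table existing = target.get(name);
            if (existing != null && dumpDir.equals(existing.properties.get(CKPT_KEY))) {
                continue; // completely loaded by an earlier attempt: skip
            }
            // Missing or only partially loaded: (re)create the object.
            Table table = new Table();
            target.put(name, table);
            if (name.equals(failAt)) {
                throw new IllegalStateException("simulated failure while loading " + name);
            }
            table.dataLoaded = true;
            // Last step for this object: record the dump directory as checkpoint.
            table.properties.put(CKPT_KEY, dumpDir);
            loaded.add(name);
        }
        return loaded;
    }

    public static void main(String[] args) {
        Map<String, Table> target = new HashMap<>();
        List<String> dump = Arrays.asList("t1", "t2", "t3");
        try {
            bootstrapLoad(target, dump, "/dumps/d1", "t2");
        } catch (IllegalStateException e) {
            System.out.println("first attempt failed: " + e.getMessage());
        }
        // The retry skips t1, which already carries the checkpoint.
        System.out.println("retry loaded: " + bootstrapLoad(target, dump, "/dumps/d1", null));
    }
}
```

Because the checkpoint value is the dump directory path, a load from a different dump would not match an existing checkpoint and would not be skipped in this sketch.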
Attachments
Issue Links
- relates to: HIVE-19739 Bootstrap REPL LOAD to use checkpoints to validate and skip the loaded data/metadata. (Closed)
- links to