Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- 1.4.2
- None
- None
Description
This change adds an extensible validation framework to Sqoop. It introduces an optional CLI option: --validate
There are 3 basic interfaces:
ValidationThreshold - Determines whether the error margin between the source and the target is acceptable: Absolute, Percentage Tolerant, etc.
The default implementation is AbsoluteValidationThreshold, which ensures that the row counts from the source and the target are the same.
ValidationFailureHandler - Responsible for handling failures: log an error/warning, abort, etc. Default implementation logs a warning message to the configured logger.
Validator - Drives the validation logic by delegating the decision to ValidationThreshold and delegating failure handling to ValidationFailureHandler. The default implementation is RowCountValidator, which validates the row counts from the source and the target.
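The delegation between the three interfaces can be sketched as follows. This is a minimal illustration, not the code from the patch: the method names and signatures (`isAcceptable`, `handle`, `validate` taking two row counts) and the handler class name `WarningLoggingFailureHandler` are assumptions made for the example, and `java.util.logging` stands in for whatever logger Sqoop is configured with.

```java
import java.util.logging.Logger;

// Hypothetical, simplified signatures; the real Sqoop interfaces may differ.
interface ValidationThreshold {
    boolean isAcceptable(long sourceRows, long targetRows);
}

interface ValidationFailureHandler {
    void handle(long sourceRows, long targetRows);
}

interface Validator {
    boolean validate(long sourceRows, long targetRows);
}

// Default threshold from the description: row counts must match exactly.
final class AbsoluteValidationThreshold implements ValidationThreshold {
    @Override
    public boolean isAcceptable(long sourceRows, long targetRows) {
        return sourceRows == targetRows;
    }
}

// Default handling from the description: log a warning (class name is illustrative).
final class WarningLoggingFailureHandler implements ValidationFailureHandler {
    private static final Logger LOG =
        Logger.getLogger(WarningLoggingFailureHandler.class.getName());

    @Override
    public void handle(long sourceRows, long targetRows) {
        LOG.warning("Validation failed: source rows=" + sourceRows
            + ", target rows=" + targetRows);
    }
}

// Drives validation: the threshold decides, the handler reacts to failures.
final class RowCountValidator implements Validator {
    private final ValidationThreshold threshold;
    private final ValidationFailureHandler handler;

    RowCountValidator(ValidationThreshold threshold, ValidationFailureHandler handler) {
        this.threshold = threshold;
        this.handler = handler;
    }

    @Override
    public boolean validate(long sourceRows, long targetRows) {
        if (threshold.isAcceptable(sourceRows, targetRows)) {
            return true;
        }
        handler.handle(sourceRows, targetRows);
        return false;
    }
}
```

Keeping the decision and the failure reaction behind separate interfaces means either one can be swapped without touching the validator itself.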
You can extend these interfaces with more specific implementations and override the defaults through the Sqoop configuration.
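As an illustration of such an extension, a percentage-tolerant threshold might look like the sketch below. The interface shape, the method name `isAcceptable`, and the class `PercentageTolerantThreshold` are hypothetical and simplified for the example; how a custom class is registered in the Sqoop configuration is not shown, since the description does not specify it.

```java
// Hypothetical, simplified interface; not necessarily Sqoop's actual signature.
interface ValidationThreshold {
    boolean isAcceptable(long sourceRows, long targetRows);
}

// Accepts the transfer when the row-count difference stays within a
// given percentage of the source row count.
final class PercentageTolerantThreshold implements ValidationThreshold {
    private final double tolerancePercent;

    PercentageTolerantThreshold(double tolerancePercent) {
        if (tolerancePercent < 0) {
            throw new IllegalArgumentException("tolerancePercent must be >= 0");
        }
        this.tolerancePercent = tolerancePercent;
    }

    @Override
    public boolean isAcceptable(long sourceRows, long targetRows) {
        if (sourceRows == 0) {
            // Avoid dividing by zero: an empty source only matches an empty target.
            return targetRows == 0;
        }
        double diffPercent = 100.0 * Math.abs(sourceRows - targetRows) / sourceRows;
        return diffPercent <= tolerancePercent;
    }
}
```

With a tolerance of 5 percent, a source of 100 rows and a target of 96 rows would be accepted, while a target of 90 rows would be rejected.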