Details
- Improvement
- Status: Closed
- Minor
- Resolution: Duplicate
- None
Description
Starting with 0.13, the precombine field is optional in Spark.
Previously this was only possible in Flink. In Flink, however, COMBINE_BEFORE_UPSERT is set to false by default, so when no precombine field is provided, upserts work without any configuration changes.
In Hudi + Spark, on the other hand, users must first explicitly set the COMBINE_BEFORE_UPSERT option to false in order to upsert without a precombine field.
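For illustration, the current Spark workaround might look like the PySpark sketch below. It assumes COMBINE_BEFORE_UPSERT corresponds to the `hoodie.combine.before.upsert` write config; the helper names, table name, record key, and path are hypothetical, and `write_upsert` needs a Spark session with the Hudi bundle on the classpath.

```python
def upsert_options_without_precombine(table_name, record_key):
    """Build Hudi write options for an upsert with no precombine field.

    Assumption: COMBINE_BEFORE_UPSERT maps to 'hoodie.combine.before.upsert'.
    """
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.table.type": "COPY_ON_WRITE",
        # No precombine field is configured, so the pre-upsert
        # combine step has to be switched off by hand.
        "hoodie.combine.before.upsert": "false",
    }


def write_upsert(df, base_path, options):
    """Write a Spark DataFrame to a Hudi table with the given options."""
    df.write.format("hudi").options(**options).mode("append").save(base_path)


# Hypothetical usage:
# opts = upsert_options_without_precombine("trips", "uuid")
# write_upsert(df, "/tmp/hudi/trips", opts)
```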
As a Hudi user, when no precombine field is provided, I would like Hudi to set COMBINE_BEFORE_UPSERT to the appropriate value automatically, to provide a seamless experience.
I assume the precombine field can be optional only when the table type is CoW. For MoR, precombine is required for the table to work properly, so it is acceptable to throw an error when the precombine field is absent and the operation is upsert.
Therefore this should work only for CoW.
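The requested behaviour could be sketched roughly as follows. This is not Hudi's implementation: `resolve_combine_before_upsert` is a hypothetical helper operating on a plain options dictionary, and the defaults it assumes for missing keys (operation `upsert`, table type `COPY_ON_WRITE`) are assumptions made for the sketch.

```python
PRECOMBINE_KEY = "hoodie.datasource.write.precombine.field"
TABLE_TYPE_KEY = "hoodie.datasource.write.table.type"
OPERATION_KEY = "hoodie.datasource.write.operation"
COMBINE_KEY = "hoodie.combine.before.upsert"


def resolve_combine_before_upsert(options):
    """Return a copy of the write options with the combine flag resolved.

    Hypothetical helper: assumes a missing operation means 'upsert'
    and a missing table type means 'COPY_ON_WRITE'.
    """
    resolved = dict(options)
    operation = resolved.get(OPERATION_KEY, "upsert")
    table_type = resolved.get(TABLE_TYPE_KEY, "COPY_ON_WRITE")

    # Nothing to do when a precombine field exists or this is not an upsert.
    if resolved.get(PRECOMBINE_KEY) or operation != "upsert":
        return resolved

    # MoR needs a precombine field, so fail fast instead of guessing.
    if table_type == "MERGE_ON_READ":
        raise ValueError("precombine field is required for MoR upserts")

    # CoW without precombine: disable combining unless the user set it.
    resolved.setdefault(COMBINE_KEY, "false")
    return resolved
```

With this sketch, a CoW upsert without a precombine field gets `hoodie.combine.before.upsert` set to `false`, an explicit user setting is left untouched, and the same request against a MoR table raises an error.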