  1. Apache Hudi
  2. HUDI-5848

No PreCombineField mode - make COMBINE_BEFORE_UPSERT=false automatically

Details

    Description

      Starting with 0.13, the precombine field is optional in Spark.
      Previously this was only possible in Flink. In Flink, however, COMBINE_BEFORE_UPSERT defaults to false, so upserts without a precombine field work with no configuration changes.

      In Hudi + Spark, on the other hand, users must explicitly set the COMBINE_BEFORE_UPSERT option to false before they can upsert without a precombine field.
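For illustration, the current workaround might look like the sketch below. The option keys are Hudi's documented configuration names (`hoodie.combine.before.upsert` is the key behind COMBINE_BEFORE_UPSERT); the helper name, table name, record key, and path are hypothetical and not taken from this ticket.

```python
# Sketch of the manual workaround described above.
# Option keys are Hudi config names; the helper and all values are hypothetical.
def upsert_options_without_precombine(table_name, record_key):
    """Build Spark datasource options for an upsert with no precombine field."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.table.type": "COPY_ON_WRITE",
        # No hoodie.datasource.write.precombine.field is set, so the
        # pre-upsert combine step has to be switched off explicitly.
        "hoodie.combine.before.upsert": "false",
    }


opts = upsert_options_without_precombine("orders", "order_id")

# With a SparkSession and a DataFrame `df` available, the write would be:
# df.write.format("hudi").options(**opts).mode("append").save("/tmp/hudi/orders")
```

The last entry in the dictionary is the one this ticket asks Hudi to infer on its own.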

      As a Hudi user, when no precombine field is provided, I would like Hudi to set COMBINE_BEFORE_UPSERT to false automatically, to provide a seamless experience.

      I assume the precombine field can be optional only for CoW tables. MoR needs a precombine field to work properly, so it is fine to throw an error when the operation is upsert and no precombine field is set.
      Therefore, this change should apply only to CoW.
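The requested behavior could be sketched as a small option-resolution step. This is not Hudi's implementation: `resolve_combine_before_upsert` is a hypothetical helper, and the fallback values for table type and operation are assumptions made for this sketch.

```python
# Hypothetical sketch of the behavior requested in this ticket, not Hudi code.
PRECOMBINE_KEY = "hoodie.datasource.write.precombine.field"
COMBINE_BEFORE_UPSERT_KEY = "hoodie.combine.before.upsert"
TABLE_TYPE_KEY = "hoodie.datasource.write.table.type"
OPERATION_KEY = "hoodie.datasource.write.operation"


def resolve_combine_before_upsert(options):
    """Return a copy of `options` with COMBINE_BEFORE_UPSERT inferred."""
    resolved = dict(options)

    # A precombine field was provided: leave the configuration untouched.
    if resolved.get(PRECOMBINE_KEY):
        return resolved

    # Assumed fallbacks for this sketch when the caller sets neither option.
    table_type = resolved.get(TABLE_TYPE_KEY, "COPY_ON_WRITE")
    operation = resolved.get(OPERATION_KEY, "upsert")

    # MoR upserts without a precombine field are rejected.
    if table_type == "MERGE_ON_READ" and operation == "upsert":
        raise ValueError(
            "A precombine field is required for upserts on MERGE_ON_READ tables"
        )

    # CoW without a precombine field: turn off combine-before-upsert,
    # unless the caller already set the option explicitly.
    if table_type == "COPY_ON_WRITE":
        resolved.setdefault(COMBINE_BEFORE_UPSERT_KEY, "false")

    return resolved
```

Using `setdefault` keeps an explicit user setting intact; whether an explicit `true` should instead be overridden or rejected is a design decision the ticket leaves open.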

            People

              kazdy kazdy