Apache Hudi / HUDI-4276

Reconcile schema - inject null values for missing fields and add new fields

Details

    Description

      Improve schema reconciliation so that it is more flexible when full schema evolution is enabled.

      Desired behavior:

      1. Incoming data is missing columns that are already defined in the table -> null values are injected into the missing columns.
      2. Incoming data contains new columns not yet defined in the table -> the new columns are added to the table schema (incoming dataframe?).
      3. Incoming data is missing columns that are already defined in the table and also contains new columns not yet defined in the table -> the new columns are added to the table schema, and the missing columns are filled with null values.
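
The three cases above can be sketched in a few lines of plain Python. This is only an illustration of the desired behavior, not Hudi's implementation: the `reconcile` function, the `Schema` alias, and the field-name-to-type dictionaries are hypothetical stand-ins for real table and dataframe schemas, and type conflicts between the two schemas are deliberately not handled.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical simplified schema: field name -> type name.
Schema = Dict[str, str]


def reconcile(
    table_schema: Schema,
    incoming_schema: Schema,
    records: List[Dict[str, Any]],
) -> Tuple[Schema, List[Dict[str, Any]]]:
    """Merge the two schemas and pad records with nulls (illustrative only)."""
    # Start from the table schema so that no existing column is dropped.
    merged = dict(table_schema)
    # Case 2: append columns that only the incoming data defines.
    for name, type_name in incoming_schema.items():
        if name not in merged:
            merged[name] = type_name
    # Case 1: columns absent from a record are filled with None (null).
    padded = [{name: rec.get(name) for name in merged} for rec in records]
    return merged, padded


# Case 3: the batch lacks "price" and introduces "discount".
table = {"id": "long", "name": "string", "price": "double"}
incoming = {"id": "long", "name": "string", "discount": "double"}
rows = [{"id": 1, "name": "a", "discount": 0.1}]
schema, out = reconcile(table, incoming, rows)
```

In this sketch the merged schema keeps the table's column order and appends new columns at the end, so existing columns are never removed or reordered.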

      When schema reconciliation is enabled, the Hive sync utility should not drop any column.

      Related GH issue:
      https://github.com/apache/hudi/issues/5873

       

            People

              xiaotaotao tao meng
              kazdy kazdy