Apache Hudi / HUDI-2917

Rollback may be incorrect for canIndexLogFile index


Details

    • 0.25

    Description

      Problem:

      After a rollback, the Hudi table may still contain records that should have been rolled back.

      Root cause:

      First, recall how the rollback plan is generated for the log blocks of a deltaCommit. Hudi considers two cases.

      1. Log files that have no base file consist entirely of inserted records, so they are deleted directly. This assumes that every inserted record is covered by this case.
      2. For each fileID that was updated according to the inflight commit metadata of the instant being rolled back, a command block is appended to its log files to roll them back. This covers all updated records.
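The two cases above can be sketched as a small planning function. This is a simplified model rather than Hudi's actual rollback code: the class name, the action constants, and the map-based inputs are all hypothetical and chosen only to make the decision logic visible.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Simplified model of the two rollback cases; none of these names are Hudi APIs.
class RollbackPlanSketch {
    static final String DELETE_LOG_FILES = "DELETE_LOG_FILES";
    static final String APPEND_ROLLBACK_BLOCK = "APPEND_ROLLBACK_BLOCK";

    /**
     * @param hasBaseFile    file ID -> whether that file group has a base file
     * @param updatedFileIds file IDs listed as updated in the inflight commit metadata
     * @return file ID -> planned rollback action, only for file groups that get one
     */
    static Map<String, String> plan(Map<String, Boolean> hasBaseFile, Set<String> updatedFileIds) {
        Map<String, String> actions = new LinkedHashMap<>();
        for (Map.Entry<String, Boolean> entry : hasBaseFile.entrySet()) {
            String fileId = entry.getKey();
            if (!entry.getValue()) {
                // Case 1: no base file, so the log files are assumed to hold only inserts.
                actions.put(fileId, DELETE_LOG_FILES);
            } else if (updatedFileIds.contains(fileId)) {
                // Case 2: the metadata says this file group was updated.
                actions.put(fileId, APPEND_ROLLBACK_BLOCK);
            }
            // Any other file group is treated as untouched and gets no action.
        }
        return actions;
    }
}
```

Note that a file group with a base file that is absent from the metadata receives no action at all, which is harmless only if the commit really did not write to it.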

      However, the assumption in the first case does not always hold. An index that can index log files may insert records into an existing log file. In the current flow, the inflight hoodieCommitMeta is generated before records are assigned to specific file groups, so it does not list those file groups and neither case rolls back such inserts.
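To make the gap concrete, the following sketch computes which file groups a delta commit wrote to without being covered by either rollback case. It is an illustration under assumptions: the class, the method, and the three input sets are hypothetical stand-ins, not Hudi data structures.

```java
import java.util.Set;
import java.util.TreeSet;

// Simplified model of the gap; the names are illustrative, not Hudi APIs.
class RollbackGapSketch {
    /** Returns the file groups the delta commit wrote to that neither rollback case covers. */
    static Set<String> missedFileIds(Set<String> writtenFileIds,
                                     Set<String> fileIdsWithBaseFile,
                                     Set<String> fileIdsInInflightMeta) {
        Set<String> missed = new TreeSet<>();
        for (String fileId : writtenFileIds) {
            // Case 1 only applies to file groups without a base file.
            boolean deletedAsInsertOnly = !fileIdsWithBaseFile.contains(fileId);
            // Case 2 only applies to file groups named in the inflight metadata.
            boolean rolledBackByCommandBlock = fileIdsInInflightMeta.contains(fileId);
            if (!deletedAsInsertOnly && !rolledBackByCommandBlock) {
                missed.add(fileId);
            }
        }
        return missed;
    }
}
```

For example, if an insert is routed to the log file of an existing file group `fg-1` that has a base file, and the inflight metadata was written before that assignment, `fg-1` appears in the written set but not in the metadata set, so it is returned as missed.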

       

      Fix: 

      To fix this, hoodieCommitMeta should be generated from the result of the partitioner rather than from the workProfile. The rollback code may also need more comments to call out this case.
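A minimal sketch of the difference between the two metadata sources follows. It assumes, for illustration only, that the work profile knows a file ID only for records that already have a location, while the partitioner output knows the assigned file ID for every record. The class, methods, and map shapes are hypothetical and do not mirror Hudi's real types.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical comparison of the two sources for the file IDs recorded in commit metadata.
class CommitMetaSourceSketch {
    /**
     * Assumed pre-fix behavior: record key -> known file ID, or null for an insert
     * that has not been assigned to a file group yet.
     */
    static Set<String> fileIdsFromWorkProfile(Map<String, String> knownLocations) {
        Set<String> fileIds = new TreeSet<>();
        for (String fileId : knownLocations.values()) {
            if (fileId != null) {
                fileIds.add(fileId);
            }
        }
        return fileIds;
    }

    /**
     * Proposed behavior: record key -> file ID assigned by the partitioner, which
     * includes inserts routed to existing log files. Values are assumed non-null.
     */
    static Set<String> fileIdsFromPartitioner(Map<String, String> assignments) {
        return new TreeSet<>(assignments.values());
    }
}
```

With an update to `fg-1` and an insert that the partitioner routes to `fg-2`, the first method yields only `fg-1`, while the second yields both file IDs, so a later rollback can append a command block to `fg-2` as well.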


          People

            guanziyue ZiyueGuan
            guanziyue ZiyueGuan
            sivabalan narayanan
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: 1h
                Remaining: 1h
                Logged: Not Specified