  Kudu / KUDU-3619

The 'supplement to GC algorithm' breaks major delta compaction

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.17.0
    • Fix Version/s: 1.18.0, 1.17.1
    • Component/s: compaction, tserver
    • Labels: None

    Description

      The functionality introduced in ad920e69f doesn't handle the case where a scheduled major delta compaction produces an empty rowset. Once such a compaction has run its course, it fails with errors like the one below:

      W20240906 10:59:01.768857 189660 tablet_mm_ops.cc:364] T 64144a1d4b864aa080e6cc53056546a5 P 574954b3b13a415c83a1660e7f51ee4e: Major delta compaction failed on 64144a1d4b864aa080e6cc53056546a5: Corruption: Failed major delta compaction on RowSet(1675): No min key found: CFile base data in RowSet(1675)
      

      Similarly, mt-tablet-test fails sporadically because of the same issue when the test workload happens to create rowsets in which all rows have been deleted:

      MultiThreadedHybridClockTabletTest/5.UpdateNoMergeCompaction: src/kudu/tablet/mt-tablet-test.cc:489: Failure
      Failed
      Bad status: Corruption: Failed major delta compaction on RowSet(1): No min key found: CFile base data in RowSet(1)
      

      There is a simple test scenario that triggers the issue: https://gerrit.cloudera.org/#/c/21809/.

      As a workaround, it's possible to set the --all_delete_op_delta_file_cnt_for_compaction flag to a very high value, e.g. 1000000.

      To address the issue properly, it's necessary to update the major delta compaction code to handle situations where the result rowset is completely empty. In theory, swapping out the result rowset with an empty one should be enough: for example, see how it's done in changelist 705954872.
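
The idea in the paragraph above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Kudu's actual C++ code: `RowSet`, `Corruption`, and `major_delta_compact` are hypothetical names, and "swapping out with an empty one" is read here as replacing the fully deleted rowset with nothing (an empty set of rowsets).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


class Corruption(Exception):
    """Hypothetical stand-in for a Corruption status in this toy model."""


@dataclass
class RowSet:
    # Toy rowset: base rows keyed by primary key, plus keys with pending deletes.
    rowset_id: int
    base_rows: Dict[str, str] = field(default_factory=dict)
    deleted_keys: Set[str] = field(default_factory=set)

    def min_key(self) -> str:
        # An empty rowset has no min key; this mirrors the reported error text.
        if not self.base_rows:
            raise Corruption(
                f"No min key found: CFile base data in RowSet({self.rowset_id})")
        return min(self.base_rows)


def major_delta_compact(rowsets: List[RowSet], index: int) -> Optional[RowSet]:
    """Apply pending deletes to the base data of rowsets[index].

    If no rows survive, remove the rowset from the list (assumed reading of
    "swap it out with an empty one") and return None, so nothing ever asks
    the empty result for its min key.
    """
    old = rowsets[index]
    surviving = {k: v for k, v in old.base_rows.items()
                 if k not in old.deleted_keys}
    if not surviving:
        del rowsets[index]
        return None
    new = RowSet(old.rowset_id, surviving)
    new.min_key()  # safe: the result is known to be non-empty
    rowsets[index] = new
    return new
```

In this model, compacting a rowset whose rows are all deleted shrinks the rowset list instead of raising `Corruption`, while a rowset with surviving rows is rewritten in place as before.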

            People

              Assignee: Unassigned
              Reporter: Alexey Serbin (aserbin)