Apache Hudi / HUDI-5557

Wrong candidate files found in metadata table

Details

    Description

      Suppose a Hudi table has five fields, but only two of them are covered by the column stats index. When a SQL filter combines a predicate on an indexed field with a predicate on a non-indexed field, the candidate files returned from the metadata table are wrong.

      For example, given the following Hudi table schema:

      name: varchar(128)
      age: int
      addr: varchar(128)
      city: varchar(32)
      job: varchar(32) 

      Table properties:

      hoodie.table.type=MERGE_ON_READ
      hoodie.metadata.enable=true
      hoodie.metadata.index.column.stats.enable=true
      hoodie.metadata.index.column.stats.column.list='name,city'
      hoodie.enable.data.skipping=true 

      SQL:

      select * from hudi_table where name='tom' and age=18;  

      With hoodie.enable.data.skipping=false, the query returns the expected rows. With hoodie.enable.data.skipping=true, the expected rows are not returned.
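
      The expected behavior can be pictured with a small sketch. This is not Hudi's implementation; `ColumnRange`, `may_match`, `candidate_files`, the file names, and the min/max values are all hypothetical and only illustrate the idea that a predicate on a column without stats (here `age`) should never exclude a file, while a predicate on an indexed column (here `name`) may.

```python
from dataclasses import dataclass


@dataclass
class ColumnRange:
    """Hypothetical per-file min/max statistics for one indexed column."""
    min_value: object
    max_value: object


def may_match(file_stats, predicates):
    """Return False only if the stats prove no row in the file can match."""
    for column, value in predicates.items():
        stats = file_stats.get(column)
        if stats is None:
            # Column is not indexed: it gives no evidence, so it must not prune.
            continue
        if not (stats.min_value <= value <= stats.max_value):
            return False
    return True


def candidate_files(index, predicates):
    """Keep every file that cannot be ruled out by the available stats."""
    return [name for name, stats in index.items() if may_match(stats, predicates)]


# Assumed stats: only `name` and `city` are indexed, as in the report.
index = {
    "file_a.parquet": {
        "name": ColumnRange("alice", "zoe"),
        "city": ColumnRange("austin", "tokyo"),
    },
    "file_b.parquet": {
        "name": ColumnRange("adam", "bob"),
        "city": ColumnRange("berlin", "paris"),
    },
}

# Equality predicates from: where name='tom' and age=18
predicates = {"name": "tom", "age": 18}
print(candidate_files(index, predicates))  # ['file_a.parquet']
```

      In this sketch the `age=18` predicate is ignored during pruning and is left for the reader to evaluate on the surviving files, so a file that may contain name='tom' is never dropped because of the non-indexed column.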

          People

            Unassigned Unassigned
            rfyu ruofan