IMPALA-3741

Push bloom filters to Kudu scanners

Details

    Description

      Impala relies on Bloom filters to reduce the number of rows coming out of the scan node for selective joins.
      Queries get up to a 20x speedup from these filters, so without Bloom filter support in Kudu there will be a big performance gap between Parquet and Kudu.
      https://github.com/cloudera/Impala/blob/cdh5-trunk/be/src/util/bloom-filter.h
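
      To illustrate the idea, here is a minimal sketch of a Bloom filter built from the keys on the build side of a join and applied while scanning, so that non-matching rows never leave the scan. This is a toy under stated assumptions, not the implementation in bloom-filter.h and not Kudu's scanner API: the names `BloomFilter`, `might_contain`, and `scan_with_filter`, the SHA-256 hashing, the filter size, and the sample data are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; illustrative only, not Impala's bloom-filter.h."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def insert(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def scan_with_filter(rows, key_index, bloom):
    """Yield only rows whose join key may match the build side."""
    for row in rows:
        if bloom.might_contain(row[key_index]):
            yield row


# Build side of a selective join: a handful of keys (hypothetical data).
build_keys = [3, 17, 42]
bloom = BloomFilter()
for k in build_keys:
    bloom.insert(k)

# Probe side: the large scanned table (hypothetical data).
probe_rows = [(i, f"row-{i}") for i in range(1000)]
filtered = list(scan_with_filter(probe_rows, 0, bloom))
```

      Because a Bloom filter has no false negatives, every row that can join survives the scan. The occasional false positive is discarded later by the join itself, so correctness is unaffected and only the volume of rows leaving the scan changes.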

            People

              wzhou Wenzhe Zhou
              mjacobs Matthew Jacobs
              Votes: 2
              Watchers: 12