Apache Drill / DRILL-3846

Metadata Caching : A count(*) query took more time with the cache in place


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.16.0
    • Component/s: Metadata
    • Labels: None

    Description

      git.commit.id.abbrev=3c89b30

      I have a folder with 10k complex files. The generated cache file is around 486 MB. The numbers below show a performance regression after the metadata cache was generated: the same count(*) query ran slower with the cache in place than without it.

      0: jdbc:drill:zk=10.10.100.190:5181> select count(*) from `complex_sparse_50000files`;
      +----------+
      |  EXPR$0  |
      +----------+
      | 1000000  |
      +----------+
      1 row selected (30.835 seconds)
      0: jdbc:drill:zk=10.10.100.190:5181> refresh table metadata `complex_sparse_50000files`;
      +-------+---------------------------------------------------------------------+
      |  ok   |                               summary                               |
      +-------+---------------------------------------------------------------------+
      | true  | Successfully updated metadata for table complex_sparse_50000files.  |
      +-------+---------------------------------------------------------------------+
      1 row selected (10.69 seconds)
      0: jdbc:drill:zk=10.10.100.190:5181> select count(*) from `complex_sparse_50000files`;
      +----------+
      |  EXPR$0  |
      +----------+
      | 1000000  |
      +----------+
      1 row selected (47.614 seconds)
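
      To put a number on the regression, the timings from the session above can be compared directly. This is only a rough sketch: the variable names are illustrative, and the three values are single wall-clock readings copied from the sqlline output, not averaged measurements.

```python
# Timings copied from the sqlline session above, in seconds.
without_cache_s = 30.835  # count(*) before REFRESH TABLE METADATA
refresh_s = 10.69         # REFRESH TABLE METADATA itself
with_cache_s = 47.614     # count(*) once the cache file exists

# Absolute and relative cost of having the cache in place.
extra_s = with_cache_s - without_cache_s
slowdown = with_cache_s / without_cache_s

# Total time spent if the refresh is counted together with the cached query.
refresh_plus_query_s = refresh_s + with_cache_s

print(f"extra time with cache: {extra_s:.3f} s")
print(f"slowdown factor: {slowdown:.2f}x")
print(f"refresh + cached query: {refresh_plus_query_s:.3f} s")
```

      On these single readings the cached query takes about 16.8 seconds longer, roughly 1.54 times the uncached time. The gap may vary from run to run.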
      

            People

              Assignee: Aman Sinha (amansinha100)
              Reporter: Rahul Kumar Challapalli (rkins)
              Votes: 1
              Watchers: 4
