IMPALA-2807

During an insert operation, Impala creates too many files for a table smaller than the block size

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: Impala 2.3.0
    • Fix Version/s: None
    • Component/s: Perf Investigation
    • Labels: None

    Description

      When loading the "customer" table from a TPC-DS based schema, the total number of files created is 20, which equals the number of Impala nodes in the cluster.
      The total size of this table is 204.2 MiB, which fits in a single block, yet the table occupies 20 blocks in this case.
      When the same insert command was run with a single impalad running in the cluster, a single block held all of the table data and only one HDFS file was created.
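
      The arithmetic behind the report can be sketched in a few lines of Python. This is only an illustration, not Impala code: the helper names are hypothetical, the 256 MiB block size is an assumption (the report does not state the block size), and the model assumes that every node receiving rows writes at least one file of its own.

```python
import math

MIB = 1024 * 1024


def min_files(table_bytes, block_bytes):
    """Fewest files needed if each file fills one block."""
    return max(1, math.ceil(table_bytes / block_bytes))


def files_written(table_bytes, block_bytes, writer_nodes):
    """Files produced when rows are spread evenly over writer_nodes.

    Assumption: each writing node emits at least one file, so the
    file count can never drop below the number of writing nodes.
    """
    per_node_bytes = table_bytes / writer_nodes
    files_per_node = max(1, math.ceil(per_node_bytes / block_bytes))
    return writer_nodes * files_per_node


table_bytes = int(204.2 * MIB)   # table size from the report
block_bytes = 256 * MIB          # assumed block size, not from the report

print(min_files(table_bytes, block_bytes))          # one block is enough
print(files_written(table_bytes, block_bytes, 20))  # 20-node cluster
print(files_written(table_bytes, block_bytes, 1))   # single impalad
```

      Under these assumptions the 20-node case yields 20 files averaging about 10.2 MiB each, while the single-node case yields one file, which matches the behavior described above.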

          People

            Assignee: Unassigned
            Reporter: Dileep Kumar (dkumar@cloudera.com)
            Votes: 0
            Watchers: 3