[HIVE-4926] Queries which specify clustered-by keys as constants will still scan all buckets - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.12.0
Fix Version/s: None
Component/s: Query Processor
Labels:
- perfomance

Description

When tables are CLUSTERED BY (key) into multiple buckets, a query which specifies a key in the query predicate will still scan all buckets in the directory.

In the ideal scenario, only bucket needs to be inspected for a given key, particularly if hive.enforce.bucketing is turned on.

When a simple filter query like the following is run

select * from store_sales where ss_item_sk = 1;

The log files contain

org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://hadoop1.lxc:56565/user/hive/warehouse/hive_bucketed.db/store_sales/000005_0
 org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://hadoop1.lxc:56565/user/hive/warehouse/hive_bucketed.db/store_sales/000006_0
 org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://hadoop1.lxc:56565/user/hive/warehouse/hive_bucketed.db/store_sales/000007_0
 org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://hadoop1.lxc:56565/user/hive/warehouse/hive_bucketed.db/store_sales/000008_0
 org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://hadoop1.lxc:56565/user/hive/warehouse/hive_bucketed.db/store_sales/000009_0

This is going through 32x the amount of data, compared to the right approach of scanning only the partitions which match the predicate.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

HIVE-4926-test.tgz
24/Jul/13 19:02
513 kB
Gopal Vijayaraghavan

Issue Links

duplicates

HIVE-5831 filter input files for bucketed tables

Resolved

Activity

People

Assignee:: Unassigned

Reporter:: Gopal Vijayaraghavan

Votes:: 1 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 24/Jul/13 19:01

Updated:: 25/Apr/17 12:04