Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Labels: None
Description
For a map join, all data in the small aliases is hashed and stored in a temporary file by MapRedLocalTask. For aliases that have no filter or projection, this does not seem necessary. For example,
select a.* from src a join src b on a.key=b.key;
produces a plan like this:
STAGE PLANS:
  Stage: Stage-4
    Map Reduce Local Work
      Alias -> Map Local Tables:
        a
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        a
          TableScan
            alias: a
            HashTable Sink Operator
              condition expressions:
                0 {key} {value}
                1
              handleSkewJoin: false
              keys:
                0 [Column[key]]
                1 [Column[key]]
              Position of Big Table: 1

  Stage: Stage-3
    Map Reduce
      Alias -> Map Operator Tree:
        b
          TableScan
            alias: b
            Map Join Operator
              condition map:
                Inner Join 0 to 1
              condition expressions:
                0 {key} {value}
                1
              handleSkewJoin: false
              keys:
                0 [Column[key]]
                1 [Column[key]]
              outputColumnNames: _col0, _col1
              Position of Big Table: 1
              Select Operator
                File Output Operator
      Local Work:
        Map Reduce Local Work

  Stage: Stage-0
    Fetch Operator
Here table src (alias a) is fetched and stored as-is by MapRedLocalTask. With this patch, the plan can instead look like this:
Stage: Stage-3
  Map Reduce
    Alias -> Map Operator Tree:
      b
        TableScan
          alias: b
          Map Join Operator
            condition map:
              Inner Join 0 to 1
            condition expressions:
              0 {key} {value}
              1
            handleSkewJoin: false
            keys:
              0 [Column[key]]
              1 [Column[key]]
            outputColumnNames: _col0, _col1
            Position of Big Table: 1
            Select Operator
              File Output Operator
    Local Work:
      Map Reduce Local Work
        Alias -> Map Local Tables:
          a
            Fetch Operator
              limit: -1
        Alias -> Map Local Operator Tree:
          a
            TableScan
              alias: a
        Has Any Stage Alias: false

Stage: Stage-0
  Fetch Operator
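One way to compare the two plans side by side is sketched below. It assumes that the non-staged path is toggled by hive.auto.convert.join.use.nonstaged (the property named in the linked issue HIVE-6749) and that automatic map-join conversion is enabled through hive.auto.convert.join. Neither setting is stated in the description above, so treat both as assumptions to check against your Hive version.

```sql
-- Assumption: map-join conversion must be on for either plan to appear.
set hive.auto.convert.join=true;

-- Assumption: with the non-staged path off, the small alias (a) is expected
-- to be hashed in a separate Map Reduce Local Work stage.
set hive.auto.convert.join.use.nonstaged=false;
explain select a.* from src a join src b on a.key=b.key;

-- Assumption: with the non-staged path on, alias a has no filter or
-- projection, so it may be loaded directly inside the map join stage.
set hive.auto.convert.join.use.nonstaged=true;
explain select a.* from src a join src b on a.key=b.key;
```

If the optimization applies, the second EXPLAIN output should lack the separate local-work stage and list alias a under Local Work of the map join stage instead.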
Attachments
Issue Links
- is blocked by
  - HIVE-6229 Stats are missing sometimes (regression from HIVE-5936) (Resolved)
- is related to
  - HIVE-6731 Non-staged mapjoin optimization doesn't work with vectorization. (Open)
  - HIVE-6749 Turn hive.auto.convert.join.use.nonstaged off by default (Resolved)
- relates to
  - HIVE-6537 NullPointerException when loading hashtable for MapJoin directly (Resolved)
  - HIVE-6682 nonstaged mapjoin table memory check may be broken (Resolved)
- links to