Details
-
Task
-
Status: Resolved
-
Minor
-
Resolution: Won't Fix
-
Kudu_Impala
-
None
Description
Dan on the Kudu team suggested removing the ability to use implicit column names in the DISTRIBUTE BY clause.
For example, "DISTRIBUTE BY HASH INTO 4 BUCKETS" is currently allowed and uses all of the primary key columns as the hash columns.
Under this proposal, both RANGE and HASH would require the columns to be named explicitly. Apparently, implicit schema definitions have led to poor results in the past.
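To make the difference concrete, here is a rough sketch of the two HASH forms as clause fragments only. The first is the statement quoted above. The second is what an explicit version might look like. The column names `id` and `ts` are hypothetical and stand in for whatever the table's primary key columns are.

```sql
-- Implicit form (accepted today): no columns listed,
-- so all primary key columns are used as the hash columns.
DISTRIBUTE BY HASH INTO 4 BUCKETS

-- Explicit form (what the proposal would require).
-- `id` and `ts` are hypothetical primary key columns.
DISTRIBUTE BY HASH (id, ts) INTO 4 BUCKETS
```

With the explicit form, the hash columns are stated in the DDL itself and do not depend on how the primary key happens to be defined.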