Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Version: Impala 2.5.0
Description
Because of a quirk in the runtime filter implementation, we currently have to disable runtime filters when a join spills. This is only necessary because the filters are constructed as part of the hash table build, so rows in spilled partitions would not be added to them. There is no reason the filters have to be constructed at that point: we could instead construct them during the initial processing of the build input, where we see all build-side input rows regardless of whether they are spilled or not.
This might actually perform better since it would move some of the CPU work from the CPU-intensive hash table build to the less CPU-intensive construction of the input stream.
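To illustrate the idea, here is a minimal Python sketch, not Impala's actual code. The names `BloomFilter`, `process_build_input`, and `spilled_partitions` are hypothetical, and the set of spilled partitions is assumed to be known up front purely to keep the example short. The point it shows is that the filter is populated in the same pass that partitions the build rows, so it covers spilled rows too, while hash tables are built only for in-memory partitions.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; stands in for a runtime filter (hypothetical)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def insert(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def process_build_input(rows, num_partitions, spilled_partitions):
    """Partition build rows and populate the filter in the same pass.

    rows: iterable of (int key, payload) pairs.
    spilled_partitions: partition ids assumed to be spilled (an assumption
    of this sketch; a real engine decides this dynamically).
    """
    bloom = BloomFilter()
    in_memory = {p: [] for p in range(num_partitions)
                 if p not in spilled_partitions}
    spilled = {p: [] for p in spilled_partitions}

    for key, payload in rows:
        # Every build row passes through here, spilled or not.
        bloom.insert(key)
        p = hash(key) % num_partitions
        target = spilled if p in spilled_partitions else in_memory
        target[p].append((key, payload))

    # Hash tables are built only for partitions that stayed in memory.
    hash_tables = {}
    for p, part_rows in in_memory.items():
        table = {}
        for key, payload in part_rows:
            table.setdefault(key, []).append(payload)
        hash_tables[p] = table

    return bloom, hash_tables, spilled
```

With this split, a key that lands in a spilled partition is absent from every in-memory hash table but is still present in the filter, so the probe side can apply the filter without dropping rows that would match after the spilled partition is processed.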