[IMPALA-8687] --rpc_use_loopback may not work for runtime filter RPCs - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Won't Fix
Affects Version/s: None
Fix Version/s: None
Component/s: Distributed Exec
Labels: None
Target Version: Impala 3.3.0

Description

Following on from IMPALA-8659, there may be cases where impalads do self-RPCs via the Thrift internal service (IMPALA-7984). This JIRA is to investigate whether this is a problem and, if so, to fix it (either by intercepting self-RPCs in Thrift or by changing the code to avoid them).

Basic join where global runtime filters should apply:

select straight_join count(*)
from alltypes t1 join /*+ shuffle */ alltypes t2 on t1.id = t2.id
where t2.string_col = '1';

Interesting cases

Dedicated coordinator with distributed plan ==> expect that all joins and scans run on executors and all filter aggregation happens on coordinator.
Single node plan (num_nodes=1) ==> expect that all filters are local ==> no RPCs required
Combined coordinator/executor with distributed plan ==> may do self-RPC

So I think we're OK when coordinators and executors are dedicated. Note that IMPALA-3825 may violate the above assumptions.
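The case analysis above can be written down as a small decision function. This is only a sketch in Python for illustration: the function name, its parameters, and the returned labels are hypothetical and are not part of Impala, and it ignores IMPALA-3825, which may change where filters are aggregated.

```python
from typing import Optional


def filter_rpc_expectation(dedicated_coordinator: bool,
                           num_nodes: Optional[int] = None) -> str:
    """Return the expected filter-RPC behaviour for one of the cases above.

    dedicated_coordinator: True if the coordinator runs no joins or scans.
    num_nodes: value of the num_nodes query option, or None if unset.
    """
    if num_nodes == 1:
        # Single node plan: all filters are local, so no RPCs are required.
        return "local-only"
    if dedicated_coordinator:
        # Joins and scans run on executors, aggregation on the coordinator,
        # so every filter RPC crosses between two different impalads.
        return "remote-only"
    # Combined coordinator/executor: the coordinator may send a filter RPC
    # to itself.
    return "possible-self-rpc"
```

For example, `filter_rpc_expectation(dedicated_coordinator=False)` returns `"possible-self-rpc"`, which is the only case that needs investigating here.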

I can easily reproduce the issue on combined coordinators/executors with log verbosity level 2. This is a log excerpt from the impalad at tarmstrong-box:22000:

I0619 17:28:00.913919 25525 client-cache.cc:47] GetClient(tarmstrong-box:22000)
I0619 17:28:00.913924 25525 client-cache.cc:57] GetClient(): returning cached client for tarmstrong-box:22000
I0619 17:28:00.914047 25425 rpc-trace.cc:202] RPC call: ImpalaInternalService.PublishFilter(from ::ffff:127.0.0.1:41902)
I0619 17:28:00.914587 25425 query-exec-mgr.cc:98] QueryState: query_id=624be7fc0bc0e122:0fbdc17200000000 refcnt=6
I0619 17:28:00.914597 25425 fragment-instance-state.cc:511] PublishFilter(): instance_id=624be7fc0bc0e122:0fbdc17200000002 filter_id=0
I0619 17:28:00.915010 25425 query-exec-mgr.cc:162] ReleaseQueryState(): query_id=624be7fc0bc0e122:0fbdc17200000000 refcnt=6
I0619 17:28:00.915038 25425 rpc-trace.cc:212] RPC call: backend:ImpalaInternalService.PublishFilter from ::ffff:127.0.0.1:41902 took 1.000ms
I0619 17:28:00.915043 25525 client-cache.cc:152] Releasing client for tarmstrong-box:22000 back to cache
I0619 17:28:00.915175 25525 rpc-trace.cc:212] RPC call: backend:ImpalaInternalService.UpdateFilter from ::ffff:127.0.0.1:41930 took 5.000ms
I0619 17:28:00.922312 25437 scan-node.cc:192] 624be7fc0bc0e122:0fbdc17200000002] Filters arrived. Waited 351ms
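To check other logs for the same pattern, a rough Python sketch along these lines could help. It assumes the line formats are exactly those in the excerpt above (client-cache.cc and rpc-trace.cc messages at verbosity level 2). The function and pattern names are hypothetical, and it only counts matching lines rather than pairing each client request with the RPC it produced.

```python
import re

# GetClient(<host:port>) requests; "GetClient(): returning ..." is not matched
# because the parentheses must contain an address.
GET_CLIENT = re.compile(r"client-cache\.cc:\d+\] GetClient\((?P<addr>[^)]+)\)")

# Completed internal-service RPCs as reported by rpc-trace.cc.
SERVED_RPC = re.compile(
    r"rpc-trace\.cc:\d+\] RPC call: backend:ImpalaInternalService\."
    r"(?P<method>\w+) from \S+ took (?P<ms>[\d.]+)ms"
)


def find_self_rpcs(own_address, log_lines):
    """Scan one impalad's log for signs that it sent RPCs to itself.

    Returns (number of client requests for own_address,
             list of (method, duration_ms) for internal RPCs it served).
    """
    self_client_requests = 0
    served = []
    for line in log_lines:
        match = GET_CLIENT.search(line)
        if match:
            if match.group("addr") == own_address:
                self_client_requests += 1
            continue
        match = SERVED_RPC.search(line)
        if match:
            served.append((match.group("method"), float(match.group("ms"))))
    return self_client_requests, served
```

A non-zero first value together with PublishFilter or UpdateFilter entries in the second suggests the impalad requested a client for its own address and then served the filter RPC itself, as in the excerpt.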

Issue Links

relates to

IMPALA-3825 Distribute runtime filter aggregation across cluster (Resolved)
IMPALA-8659 Allow self-RPCs for KRPC to go via loopback (Resolved)
IMPALA-7984 Port UpdateFilter() and PublishFilter() to KRPC (Resolved)

People

Assignee: Tim Armstrong

Reporter: Tim Armstrong

Dates

Created: 19/Jun/19 23:25

Updated: 24/Jun/19 21:20

Resolved: 24/Jun/19 21:20