Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
We need to document the behaviour described in IMPALA-10811 as a limitation when using AWS NLB. Problem description:
The initial RPC that submits a query and fetches the query handle can take a long time to return, because planning and submission may involve executing catalog operations such as Rename or ALTER TABLE RECOVER PARTITIONS, which can be slow on tables with many partitions (https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92). Attached is the profile of one such DDL query.
These RPCs are:
1. Beeswax:
2. HS2:
One side effect of such an RPC taking a long time is that clients such as impala-shell that connect through an AWS NLB can get stuck forever. The reason is that the NLB tracks connections and closes them after 350 seconds of idle time, and this timeout cannot be configured. After closing the connection, however, it does not send a TCP RST to the client. Only when the client tries to send data does the NLB respond with a TCP RST to indicate that the connection is no longer alive. Documentation is here: https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout. Hence a client like impala-shell that is only waiting for the RPC to return never sends anything, never receives the RST, and stays stuck indefinitely.
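For illustration only, here is a minimal Python sketch of one general client-side mitigation for this kind of idle timeout: enabling TCP keepalive so that the client's kernel sends probes before the 350-second window expires. This is not the change made for IMPALA-10811 and it is not taken from impala-shell. The helper name `enable_keepalive` and the chosen intervals are hypothetical, and the fine-grained options are Linux-specific, so they are guarded. The AWS page linked above describes keepalive packets as a way to reset the idle timeout.

```python
import socket

# Idle timeout quoted in the description above (seconds).
NLB_IDLE_TIMEOUT_S = 350


def enable_keepalive(sock, idle_s=60, interval_s=30, probes=5):
    """Enable TCP keepalive with a first probe well before the NLB idle timeout.

    Hypothetical helper for illustration; the default values are assumptions.
    """
    if idle_s >= NLB_IDLE_TIMEOUT_S:
        raise ValueError("idle_s must be shorter than the NLB idle timeout")

    # Portable switch: ask the kernel to send keepalive probes on this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Linux-specific tuning; other platforms may lack these constants.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between subsequent probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes before the kernel reports the connection as dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock


if __name__ == "__main__":
    s = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
    print("keepalive enabled:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

With settings like these, a client blocked on a long-running RPC either keeps the flow active through the load balancer or, if the connection has already been dropped, has a probe answered with a RST and sees an error instead of waiting forever.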
Issue Links
- is a clone of IMPALA-10811: RPC to submit query getting stuck for AWS NLB forever. (Resolved)