Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
2.11.1
None
Docs Required, Release Notes Required
Description
The issue mostly concerns Kubernetes/OpenShift deployments but could also affect other scenarios that rely on external services (AWS?).
Consider the following case: multiple nodes (pods) are started simultaneously, and each of them tries to find out whether other nodes are available using TcpDiscoveryKubernetesIpFinder, which just returns the set of registered IPs. Since there is no delay or retry attempt, every node can receive an empty IP list and decide to become the coordinator, i.e. multiple independent grids are started.
Proposed changes: extend TcpDiscoveryKubernetesIpFinder with either a configurable delay or a configurable retry count, so that it checks again for a non-empty list of available IPs before returning.
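A minimal sketch of what the proposed delay/retry behaviour could look like, written as a standalone helper rather than as a patch to TcpDiscoveryKubernetesIpFinder. The class name RetryingAddressLookup and the parameters maxAttempts and delayMillis are hypothetical and do not exist in Ignite; the address source is passed in as a plain Supplier so the example stays self-contained.

```java
import java.net.InetSocketAddress;
import java.util.Collection;
import java.util.Collections;
import java.util.function.Supplier;

/** Hypothetical helper (not part of Apache Ignite) illustrating the proposal. */
public final class RetryingAddressLookup {
    private RetryingAddressLookup() {
    }

    /**
     * Polls the address source until it returns a non-empty collection
     * or maxAttempts polls have been made.
     *
     * @param source      supplies the currently registered addresses (assumed to be cheap to call)
     * @param maxAttempts hypothetical retry count; values below 1 mean the source is never polled
     * @param delayMillis hypothetical pause between two consecutive polls
     * @return the first non-empty result, or an empty list if every poll came back empty
     */
    public static Collection<InetSocketAddress> lookup(
            Supplier<Collection<InetSocketAddress>> source,
            int maxAttempts,
            long delayMillis) throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Collection<InetSocketAddress> addrs = source.get();
            if (addrs != null && !addrs.isEmpty()) {
                return addrs;
            }
            // Sleep only if another poll will follow.
            if (attempt < maxAttempts) {
                Thread.sleep(delayMillis);
            }
        }
        return Collections.emptyList();
    }
}
```

In an Ignite deployment the supplier could, for instance, wrap the finder's getRegisteredAddresses() call. A retry of this kind narrows the window in which simultaneously started pods all see an empty list; whether it closes it entirely depends on how quickly the external service publishes the new pod IPs.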