Description
The Aurora client has a built-in mechanism that automatically retries thrift API operations if the connection to the scheduler times out, a transport exception occurs, or the scheduler raises a transient exception.
Retrying thrift calls after a scheduler connection timeout or a transient exception (see AURORA-187) is safe. However, because Aurora has no concept of idempotency, the client can also retry non-idempotent operations after a transport exception, which can lead to nondeterministic outcomes.
For example, if client requests go through a proxy to reach the scheduler, the client might consider a non-idempotent request failed and automatically retry it, even though the original request was already received and processed by the scheduler.
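The distinction described above could be sketched roughly as follows. This is a minimal illustration, not the Aurora client's actual code: the exception classes, the `IDEMPOTENT_OPS` set, and `call_with_retry` are hypothetical names, and which operations count as idempotent is an assumption made only for the example.

```python
class ConnectTimeout(Exception):
    """Hypothetical: the connection to the scheduler was never established."""


class TransientError(Exception):
    """Hypothetical: the scheduler answered with a transient error."""


class TransportError(Exception):
    """Hypothetical: the transport failed mid-call, so the outcome is unknown."""


# Assumed classification, for illustration only.
IDEMPOTENT_OPS = frozenset({"getJobs", "getTasksStatus"})


def call_with_retry(op_name, call, max_attempts=3):
    """Invoke call(), retrying only when a retry cannot apply the request twice."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except (ConnectTimeout, TransientError):
            # The request was not applied, so retrying is safe for any operation.
            if attempt == max_attempts:
                raise
        except TransportError:
            # The scheduler may already have processed the request; retry only
            # operations that are harmless to repeat, and surface the rest.
            if op_name not in IDEMPOTENT_OPS or attempt == max_attempts:
                raise
```

With this shape, a transport failure during a non-idempotent call such as job creation would be raised to the caller on the first occurrence instead of being retried silently, leaving it to the caller to check the scheduler's state before trying again.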
Issue Links
- relates to: AURORA-1924 Aurora client should reconcile idempotent job creations (Resolved)