Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- None
Description
Currently, in Ratis, the "writeStateMachine" call is retried indefinitely when it times out. If the disks are slow or overloaded, or no chunk writer threads become available within 10s, the writeStateMachine call times out after 10s. In such cases the same write chunk request keeps being retried, so the same chunk of data is overwritten again and again. The idea here is to abort the request once the node failure timeout is reached.
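To illustrate the proposal, here is a minimal sketch of one way to bound the retries. It is not the Ratis or Ozone implementation: the class `BoundedWriteRetry` and both method names are hypothetical, and it assumes the bound is expressed as a retry count derived by dividing the node failure timeout by the per-call timeout.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper, for illustration only; not part of Ratis or Ozone. */
public final class BoundedWriteRetry {

    private BoundedWriteRetry() {
    }

    /**
     * Number of timed-out attempts that fit inside the node failure window.
     * Assumption: each attempt takes at most callTimeout before timing out.
     */
    public static int maxAttempts(Duration nodeFailureTimeout, Duration callTimeout) {
        long callMs = callTimeout.toMillis();
        if (callMs <= 0) {
            throw new IllegalArgumentException("callTimeout must be at least 1 ms");
        }
        // Always allow at least one attempt, even if the window is shorter than one call.
        return (int) Math.max(1, nodeFailureTimeout.toMillis() / callMs);
    }

    /**
     * Runs the write, retrying only on TimeoutException, and aborts once
     * maxAttempts attempts have timed out instead of retrying forever.
     */
    public static <T> T callWithBoundedRetry(Callable<T> write, int maxAttempts)
            throws Exception {
        TimeoutException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return write.call();
            } catch (TimeoutException e) {
                last = e; // remember the cause and try again
            }
        }
        throw new IllegalStateException(
            "write aborted after " + maxAttempts + " timed-out attempts", last);
    }
}
```

With the 10s call timeout from the description and an illustrative (assumed) node failure timeout of 300s, `maxAttempts` yields 30, so a write that keeps timing out is abandoned after 30 attempts rather than overwriting the same chunk indefinitely.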
Issue Links
- causes
  - HDDS-10717 nodeFailureTimeoutMs should be initialized before syncTimeoutRetry (Resolved)
- relates to
  - HDDS-9821 XceiverServerRatis SyncTimeoutRetry is overridden (Resolved)