Details
- Improvement
- Status: Open
- Major
- Resolution: Unresolved
Description
Original description: Rather than tracking the total number of times DFSInputStream failed to talk to a datanode for a particular block, such failures and the list of datanodes involved should be scoped to individual blocks. In particular, the "deadnode" list should be a map from each block to a list of the nodes that failed for it; that list would be reset, and its nodes retried, per the existing semantics.
[see comment below for new thinking, left this comment to give context to discussion]
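To make the original proposal concrete, here is a minimal sketch of a per-block "deadnode" map. It is not taken from DFSInputStream or from any patch on this issue: the class name `PerBlockDeadNodes`, its methods, and the use of plain strings for block IDs and datanode names are all assumptions made for illustration. It also uses a set rather than a list for the failed nodes so that repeated failures of the same node are not double-counted.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: failed datanodes scoped to individual blocks. */
class PerBlockDeadNodes {
    // Block ID -> datanodes that failed while reading that block.
    private final Map<String, Set<String>> deadNodesByBlock = new HashMap<>();

    /** Records that a read of the given block from the given datanode failed. */
    void markFailed(String blockId, String datanode) {
        deadNodesByBlock.computeIfAbsent(blockId, k -> new HashSet<>()).add(datanode);
    }

    /** True only if this datanode has failed for this particular block. */
    boolean isDead(String blockId, String datanode) {
        return deadNodesByBlock
                .getOrDefault(blockId, Collections.<String>emptySet())
                .contains(datanode);
    }

    /** Number of distinct datanodes that have failed for this block. */
    int failureCount(String blockId) {
        return deadNodesByBlock
                .getOrDefault(blockId, Collections.<String>emptySet())
                .size();
    }

    /** Clears the failed nodes for one block so they can be retried. */
    void reset(String blockId) {
        deadNodesByBlock.remove(blockId);
    }
}
```

With this shape, a failure on one block does not mark the datanode as dead for any other block, and resetting one block's entry leaves the failure history of other blocks untouched. When to call `reset` is left to the existing retry semantics, which this sketch does not model.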
Issue Links
- is related to
  - HDFS-127 DFSClient block read failures cause open DFSInputStream to become unusable (Closed)
- relates to
  - HADOOP-1911 infinite loop in dfs -cat command. (Closed)
  - HDFS-656 Clarify error handling and retry semantics for DFS read path (Open)