[HDFS-5131] Need a DEFAULT-like pipeline recovery policy that works for writers that flush

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: 2.0.6-alpha
Fix Version/s: None
Component/s: None
Labels: None
Target Version/s: 2.1.0-beta

Description

The Hadoop 2 pipeline-recovery mechanism currently has four policies: DISABLE (never do recovery), NEVER (never do recovery unless client asks for it), ALWAYS (block until we have recovered the write pipeline to minimum replication levels), and DEFAULT (try to do ALWAYS, but use a heuristic to "give up" and allow writers to continue if not enough datanodes are available to recover the pipeline).

The big problem with DEFAULT is that it falls back to ALWAYS behavior as soon as a client calls hflush(). On its face that seems like a reasonable thing to do, but in practice it means that clients like Flume (and, I assume, HBase) simply block when the cluster is low on datanodes.
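To make the hflush() interaction concrete, here is a minimal Java sketch of the decision "should the client try to add a replacement datanode?". It is a simplified model, not the HDFS client source. It assumes the DEFAULT rule as I understand it from the description of the `dfs.client.block.write.replace-datanode-on-failure.policy` client property: with replication r and n datanodes left in the pipeline, add a datanode only if r >= 3 and either floor(r/2) >= n, or r > n and the block has been hflushed or appended. The class and method names are hypothetical, and DISABLE and NEVER are both modeled simply as "do not add a datanode".

```java
// Simplified model for illustration only; this is not the HDFS client source.
final class ReplacePolicySketch {
    enum Policy { DISABLE, NEVER, DEFAULT, ALWAYS }

    /**
     * Returns true if the writer should try to add a replacement datanode
     * after a pipeline failure.
     *
     * @param replication        configured replication factor (r)
     * @param existing           datanodes still alive in the pipeline (n)
     * @param hflushedOrAppended whether the block was hflushed or appended
     */
    static boolean shouldAddDatanode(Policy policy, int replication,
                                     int existing, boolean hflushedOrAppended) {
        // A full pipeline needs no replacement, and an empty one has no
        // surviving replica to copy from.
        if (existing == 0 || existing >= replication) {
            return false;
        }
        switch (policy) {
            case DISABLE:
            case NEVER:
                return false;
            case ALWAYS:
                return true;
            case DEFAULT:
                if (replication < 3) {
                    return false;                 // small pipelines are left alone
                }
                if (existing <= replication / 2) {
                    return true;                  // too few replicas left
                }
                return hflushedOrAppended;        // the clause this issue is about
            default:
                throw new IllegalArgumentException("unknown policy: " + policy);
        }
    }

    public static void main(String[] args) {
        // r = 3, one datanode lost, writer never flushed: keep writing with 2.
        System.out.println(shouldAddDatanode(Policy.DEFAULT, 3, 2, false)); // false
        // Same pipeline, but the writer has called hflush(): replacement required.
        System.out.println(shouldAddDatanode(Policy.DEFAULT, 3, 2, true));  // true
    }
}
```

In this model a writer that has flushed gets the same answer from DEFAULT as from ALWAYS whenever the pipeline is short, which is why a long-lived flushing client ends up waiting for a datanode that a small cluster cannot supply.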

To work around this issue, the easiest thing to do today is to set the policy to NEVER when using Flume to write to the cluster, but that is obviously not ideal.

I believe what clients like Flume need is an additional policy which essentially uses the heuristic logic used by DEFAULT even in cases where long-lived writers call hflush().
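As a rough illustration of the requested behavior, the sketch below shows a hypothetical flush-tolerant rule: it keeps only the size-based part of the DEFAULT heuristic (as I understand it: replace when r >= 3 and floor(r/2) >= n) and never looks at whether the writer has called hflush(). This is not an existing HDFS policy, the class and method names are made up for the example, and the resolution in HDFS-4257 may take a different approach.

```java
// Hypothetical policy for illustration only; not part of HDFS.
final class FlushTolerantPolicySketch {
    /**
     * Decides whether to add a replacement datanode using only the
     * replication factor and the number of surviving datanodes.
     * Whether the writer has flushed is deliberately not an input.
     */
    static boolean shouldAddDatanode(int replication, int existing) {
        // Nothing to do for a full pipeline; nothing to copy from an empty one.
        if (existing == 0 || existing >= replication) {
            return false;
        }
        // Replace only when at most half of the replicas are left.
        return replication >= 3 && existing <= replication / 2;
    }

    public static void main(String[] args) {
        // A flushing writer with 2 of 3 datanodes keeps going.
        System.out.println(shouldAddDatanode(3, 2)); // false
        // With only 1 of 3 left, a replacement is still requested.
        System.out.println(shouldAddDatanode(3, 1)); // true
    }
}
```

The trade-off is that a flushed block may stay under-replicated in the pipeline for longer, in exchange for long-lived writers not stalling on a small cluster.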

Issue Links

duplicates

HDFS-4257 The ReplaceDatanodeOnFailure policies could have a forgiving option (Closed)

relates to

HDFS-4257 The ReplaceDatanodeOnFailure policies could have a forgiving option (Closed)

People

Assignee: Tsz-wo Sze

Reporter: Mike Percy

Votes: 1

Watchers: 15

Dates

Created: 26/Aug/13 09:33

Updated: 21/Oct/14 01:16

Resolved: 21/Oct/14 01:16