  1. Hadoop HDFS
  2. HDFS-14032 [libhdfs++] Phase 2 improvements
  3. HDFS-10781

libhdfs++: redefine NN timeout to be "time without a response"

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: hdfs-client
    • Labels: None

    Description

      In the find tool, we submit a zillion requests to the NameNode asynchronously. As the queue on the NameNode grows, the response time for each individual request increases. In the find tool, we were eventually getting timeouts on requests, even though the NN was responding as fast as its little feet could carry it.

      I propose that we redefine timeouts to be on a per-connection basis rather than per-request. If a client has outstanding requests to the NN but hasn't received any response on that connection within n msec, it should declare the connection dead and retry. As long as the NameNode is being responsive to the best of its ability and providing data, we will not declare the link dead.
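
      A minimal sketch of the proposed rule, to make the semantics concrete. Everything here is hypothetical: `IdleTimeoutTracker` and its methods are illustration-only names, not existing libhdfs++ classes, and the caller is assumed to pass in the current time and to handle the actual reconnect and retry. The idle clock starts when the connection goes from idle to busy and is reset only by responses, so sending more requests cannot keep a silent connection alive.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of libhdfs++): decides connection liveness
// from "time without a response" instead of per-request deadlines.
class IdleTimeoutTracker {
public:
  using Clock = std::chrono::steady_clock;
  using TimePoint = Clock::time_point;
  using Duration = std::chrono::milliseconds;

  explicit IdleTimeoutTracker(Duration idle_limit) : idle_limit_(idle_limit) {}

  // Call whenever a request is written to the connection.
  void OnRequestSent(TimePoint now) {
    // Start the idle clock only when the connection goes from idle to busy.
    // Later sends do not reset it; only responses do.
    if (outstanding_ == 0) last_activity_ = now;
    ++outstanding_;
  }

  // Call whenever any response arrives on the connection.
  void OnResponseReceived(TimePoint now) {
    if (outstanding_ > 0) --outstanding_;
    last_activity_ = now;  // the NN is alive: reset the idle clock
  }

  // True when requests are pending and nothing has come back for at least
  // idle_limit. The caller would then drop the connection and retry.
  bool IsDead(TimePoint now) const {
    return outstanding_ > 0 && (now - last_activity_) >= idle_limit_;
  }

  std::size_t outstanding() const { return outstanding_; }

private:
  Duration idle_limit_;
  std::size_t outstanding_ = 0;
  TimePoint last_activity_{};
};
```

      With this rule, a request that sits in a long NameNode queue does not time out as long as responses to earlier requests keep arriving within the limit; the connection is declared dead only after a full interval of silence while requests are pending.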

      One potential violation of the Principle of Least Astonishment here is that no particular request from a client can be depended on to get a positive or negative response within a fixed amount of time, but I think that may be a good trade to make.

          People

            Assignee: Unassigned
            Reporter: Bob Hansen (bobhansen)
            Votes: 0
            Watchers: 2