  1. Hadoop HDFS
  2. HDFS-14032 [libhdfs++] Phase 2 improvements
  3. HDFS-11266

libhdfs++: Redesign block reader with simplicity and resource management in mind


Details

    • Type: Sub-task
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: hdfs-client
    • Labels: None

    Description

      The goal here is to significantly simplify the block reader and make it much harder to introduce issues. There are plenty of examples of these issues in the subtasks of HDFS-8707; the one that finally motivated a reimplementation is HDFS-10931.

      Goals:
      - The read-side protocol of the data transfer pipeline is fundamentally simple (even when done asynchronously). The code should be equally simple.

      - Get the code into a state that is easy to reason about for someone with a solid understanding of HDFS and a basic understanding of C++, or vice versa: improve comments and avoid esoteric C++ constructs. This is a must-have in order to lower the bar to contribute.

      - Get rid of dependencies on the existing continuation code. Others and I have spent far too much time debugging both the continuation code and bugs introduced because the continuation code was hard to reason about. Notable issues:
        - It's cool from a theoretical perspective, but after 18 months of working on this it's still unclear what problem the continuation pattern solved that callbacks couldn't.
        - The continuations spend more time allocating memory than the rest of the code spends doing real work - seriously, profile it. This can't be fixed because the Pipeline takes ownership of all Continuation objects and then deletes them.
        - The way the block reader really uses them is a hybrid of a state machine, continuations, and direct use of asio callbacks to bounce between the two.

      Proposed approach:
      Still have a BlockReader class that owns a PacketReader class; the packet reader is analogous to the ReadPacketContinuation that the BlockReader builds now. The difference is that none of this will be stitched together at runtime using continuations; instead, the block reader has a member packet reader that gets allocated up front. The PacketReader can be recycled in order to avoid allocations. The block reader is only responsible for requesting block info; after that it keeps invoking the PacketReader until enough data has been read.
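
The ownership described above could look roughly like the following. This is only a sketch under stated assumptions, not the patch itself: FakePacketSource is a hypothetical stand-in for the DataNode stream, the method names are made up for illustration, and the reads are synchronous so the shape of the loop is easy to see.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the DataNode stream: hands out the block's bytes
// in fixed-size packets. A real reader would pull these off a socket.
struct FakePacketSource {
  std::string block;
  std::size_t packet_size;
  std::size_t offset;
};

// Reads one packet per call. It keeps no per-read heap state, so the same
// instance is reused (recycled) for every packet.
class PacketReader {
 public:
  explicit PacketReader(FakePacketSource *src) : src_(src), packets_read_(0) {
    assert(src_ != nullptr);
  }

  // Appends the next packet's payload to out and returns its length
  // (0 once the source is exhausted).
  std::size_t ReadPacket(std::string *out) {
    std::size_t remaining = src_->block.size() - src_->offset;
    std::size_t n = std::min(src_->packet_size, remaining);
    if (n == 0) return 0;
    out->append(src_->block, src_->offset, n);
    src_->offset += n;
    ++packets_read_;
    return n;
  }

  std::size_t packets_read() const { return packets_read_; }

 private:
  FakePacketSource *src_;
  std::size_t packets_read_;
};

class BlockReader {
 public:
  explicit BlockReader(FakePacketSource *src) : packet_reader_(src) {}

  // Keeps invoking the member PacketReader until `wanted` bytes have been
  // read or the source runs dry. Surplus bytes from the last packet are
  // simply discarded in this sketch.
  std::string Read(std::size_t wanted) {
    std::string data;
    while (data.size() < wanted) {
      if (packet_reader_.ReadPacket(&data) == 0) break;
    }
    if (data.size() > wanted) data.resize(wanted);
    return data;
  }

  std::size_t packets_read() const { return packet_reader_.packets_read(); }

 private:
  PacketReader packet_reader_;  // member, allocated up front with the BlockReader
};
```

With a 10-byte block and 4-byte packets, asking for 6 bytes takes two PacketReader invocations on the same PacketReader instance; nothing is allocated per packet apart from growth of the output string.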

      Async chaining:
      Move to a state-machine-based approach, where each state is represented as a method. This allows the readers to be pinned in memory. The asynchronous IO operations become the state transitions: a callback is supplied to the asio async call that jumps to the next state upon completion of the IO operation. Epsilon transitions will be fairly rare, but if we need them to temporarily drop a lock, as is done in the RPC code, io_service::post can be used rather than a call that actually does IO.
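
To make the epsilon-transition idea concrete, here is a small sketch that does not use asio at all. FakeIoService is a hypothetical stand-in that only models "queue a handler and run it later", which is the part of io_service::post that matters here; the state names are made up for illustration.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for asio::io_service: queues handlers and runs them
// later from Run(). Only posting and running are modelled; no real IO happens.
class FakeIoService {
 public:
  void Post(std::function<void()> fn) { queue_.push_back(std::move(fn)); }

  void Run() {
    while (!queue_.empty()) {
      std::function<void()> fn = std::move(queue_.front());
      queue_.pop_front();
      fn();
    }
  }

 private:
  std::deque<std::function<void()>> queue_;
};

// Each state is a method; each transition is a callback handed to the service.
class ExampleStateMachine {
 public:
  explicit ExampleStateMachine(FakeIoService *service) : service_(service) {}

  void Start() {
    trace_.push_back("Start");
    // Pretend an async write was issued; its completion jumps to OnSent.
    service_->Post([this]() { OnSent(); });
  }

  const std::vector<std::string> &trace() const { return trace_; }
  bool done() const { return done_; }

 private:
  void OnSent() {
    trace_.push_back("OnSent");
    // Epsilon transition: no IO is needed, but posting returns control to the
    // event loop first, so a lock held in this state could be released
    // before OnDone runs.
    service_->Post([this]() { OnDone(); });
  }

  void OnDone() {
    trace_.push_back("OnDone");
    done_ = true;
  }

  FakeIoService *service_;
  std::vector<std::string> trace_;
  bool done_ = false;
};
```

Calling Start() only records the first state and queues the transition; the remaining states run when the service's Run() loop drains the queue, which is the same shape a real io_service-driven reader would have.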

      I'm fairly confident in this approach since I used the same one to implement various hardware async bus interfaces in VHDL to good effect, i.e. high performance and easy to understand. An asio callback is roughly analogous to a signal in a sensitivity list, and the methods are roughly analogous to process blocks.

      Below is an example state machine that sends some data and then waits to get something back, like what the current BlockReader::AsyncRequestBlock does, using the approach described above.

      #include <array>
      #include <asio.hpp>

      class ExampleHandshake {
        // class would own any small buffers so they can be directly accessed
       public:
        ExampleHandshake() : socket_(service_) {}
        void SendHandshake();
       private:
        void OnHandshakeSend();
        void OnHandshakeDone();

        asio::io_service service_;
        asio::ip::tcp::socket socket_;
        // placeholder buffers; names and sizes are illustrative only
        std::array<char, 64> request_;
        std::array<char, 64> response_;
      };

      void ExampleHandshake::SendHandshake() {
        // trampoline to jump into read state once write completes
        auto trampoline = [this](asio::error_code ec, size_t sz) {
          // error checking here
          this->OnHandshakeSend();
        };
        // async_write returns immediately; trampoline runs when the write completes
        asio::async_write(socket_, asio::buffer(request_), trampoline);
      }

      void ExampleHandshake::OnHandshakeSend() {
        // when read completes bounce into handler
        auto trampoline = [this](asio::error_code ec, size_t sz) {
          // error checking here
          this->OnHandshakeDone();
        };
        asio::async_read(socket_, asio::buffer(response_), trampoline);
      }

      void ExampleHandshake::OnHandshakeDone() {
        // just finished sending request and receiving response, go do something
      }
      

      Attachments

        1. HDFS-11266.HDFS-8707.000.patch
          34 kB
          James Clampffer


          People

            Assignee: James Clampffer
            Reporter: James Clampffer
