Flink / FLINK-16428 Fine-grained network buffer management for backpressure / FLINK-16404

Avoid caching buffers for blocked input channels before barrier alignment


Details

      Note that the metric `lastCheckpointAlignmentBuffered` has been removed in this ticket, because the upstream task no longer sends any data after the barrier until alignment completes on the downstream side. The web UI still displays this value, which now always shows 0. It will be removed from the UI in a separate follow-up ticket.

    Description

      One motivation for this issue is to reduce the amount of in-flight data under back pressure and thereby speed up checkpoints. The current default number of exclusive buffers per channel is 2. If we reduce it to 0 and increase the number of floating buffers to compensate, a deadlock may occur: all floating buffers could be taken by blocked input channels and never recycled until barrier alignment completes.
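The deadlock concern can be illustrated with a toy calculation. The sketch below is not Flink code; the class, method, and parameter names are hypothetical. It assumes zero exclusive buffers, one already-blocked channel A whose sender keeps sending, one channel B that still has data queued ahead of its barrier, and a worst-case interleaving in which A takes floating buffers first.

```java
/** Toy model of the floating-buffer deadlock; all names are hypothetical. */
class FloatingBufferDeadlockSketch {

    /**
     * Returns true if channel B can still receive the data queued ahead of
     * its barrier, so that barrier alignment can eventually complete.
     *
     * @param floatingBuffers      size of the shared floating buffer pool
     * @param dataAfterBarrierOnA  buffers the sender of blocked channel A still wants to send
     * @param dataBeforeBarrierOnB buffers queued ahead of the barrier on channel B
     * @param cacheBlockedData     true models caching data of blocked channels until alignment
     */
    static boolean alignmentCompletes(
            int floatingBuffers,
            int dataAfterBarrierOnA,
            int dataBeforeBarrierOnB,
            boolean cacheBlockedData) {
        int free = floatingBuffers;
        if (cacheBlockedData) {
            // Worst case: blocked channel A fills floating buffers first, and
            // they are not recycled before alignment.
            free -= Math.min(free, dataAfterBarrierOnA);
        }
        // With zero exclusive buffers, B needs at least one free floating
        // buffer to drain its data (each buffer is recycled after processing).
        return dataBeforeBarrierOnB == 0 || free > 0;
    }

    public static void main(String[] args) {
        System.out.println(alignmentCompletes(8, 100, 3, true));
        System.out.println(alignmentCompletes(8, 100, 3, false));
    }
}
```

With caching enabled, channel A exhausts the pool and channel B can never deliver its barrier; without caching, the pool stays available to B.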

      To address this deadlock concern, we can change the logic on both the sender and receiver sides.

      • Sender side: It should revoke any previously received credit after sending a checkpoint barrier, which means it will not send any subsequent buffers until it receives new credit.
      • Receiver side: The respective channel releases its requested floating buffers once the barrier is received from the network. After barrier alignment, it requests floating buffers for the channels with a positive backlog and notifies the sender side of the available credit. The sender can then continue transporting buffers.
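The two changes above can be sketched as a small single-channel simulation. This is only an illustrative model, not Flink's implementation: `Sender`, `Receiver`, and all method names are hypothetical, the barrier is assumed to need no credit, the backlog simply counts queued elements, and a received data buffer is assumed to be processed and recycled immediately.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of one channel; class and method names are hypothetical, not Flink's. */
class CreditRevocationSketch {

    static final String BARRIER = "BARRIER";

    /** Sender side: data needs credit; sending a barrier revokes the remaining credit. */
    static final class Sender {
        private final Deque<String> queue = new ArrayDeque<>();
        private int credit;

        void enqueue(String element) { queue.add(element); }
        void addCredit(int n) { credit += n; }
        int backlog() { return queue.size(); }

        /** Returns the next element that may be sent, or null if nothing can be sent. */
        String poll() {
            String head = queue.peek();
            if (head == null) {
                return null;
            }
            if (head.equals(BARRIER)) {
                credit = 0; // revoke previously received credit
                return queue.poll();
            }
            if (credit == 0) {
                return null; // wait for new credit from the receiver
            }
            credit--;
            return queue.poll();
        }
    }

    /** Receiver side: releases its floating buffers when the barrier arrives. */
    static final class Receiver {
        private int freeFloating; // free buffers in the shared floating pool
        private int held;         // floating buffers currently assigned to this channel

        Receiver(int floatingBuffers) { this.freeFloating = floatingBuffers; }

        int freeFloating() { return freeFloating; }

        /** Requests floating buffers for the backlog; the result is announced as credit. */
        int requestFloating(int backlog) {
            int granted = Math.min(backlog, freeFloating);
            freeFloating -= granted;
            held += granted;
            return granted;
        }

        void onReceive(String element) {
            if (element.equals(BARRIER)) {
                freeFloating += held; // channel is blocked: give buffers back
                held = 0;
            } else {
                held--;               // data buffer processed and recycled
                freeFloating++;
            }
        }
    }

    public static void main(String[] args) {
        Sender sender = new Sender();
        Receiver receiver = new Receiver(4);
        for (String e : new String[] {"d1", "d2", BARRIER, "d3", "d4"}) {
            sender.enqueue(e);
        }
        sender.addCredit(receiver.requestFloating(sender.backlog()));
        String e;
        while ((e = sender.poll()) != null) {
            receiver.onReceive(e);
            System.out.println("received " + e);
        }
        System.out.println("sender stalled, free floating buffers: " + receiver.freeFloating());

        // Barrier alignment finished: request buffers for the backlog and announce credit.
        sender.addCredit(receiver.requestFloating(sender.backlog()));
        while ((e = sender.poll()) != null) {
            receiver.onReceive(e);
            System.out.println("received " + e);
        }
    }
}
```

In this model the sender stops right after the barrier even though it had unused credit, and the floating buffers return to the pool where other channels could use them until alignment completes.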

      Based on the above changes, we can also remove the `BufferStorage` component completely, because the receiver would never read buffers for blocked channels. Another possible benefit is that the floating buffers might be put to better use before barrier alignment.

      The only side effect is a somewhat cold start after barrier alignment: the sender side has to wait for credit feedback before it can transport data right after alignment, which affects latency and network throughput. However, since the checkpoint interval is generally not too short, this side effect can be ignored in practice. We can further verify this with the existing micro-benchmark.

      After this ticket is done, we still cannot set exclusive buffers to zero at the moment, because there is another deadlock issue, which will be solved separately in another ticket.

      Attachments

        1. image-2021-02-22-15-27-57-983.png
          32 kB
          super.han
        2. image-2021-02-22-15-29-55-096.png
          10 kB
          super.han
        3. image-2021-02-22-15-30-03-318.png
          10 kB
          super.han

            People

              kevin.cyj Yingjie Cao
              zjwang Zhijiang
              Votes: 0
              Watchers: 9

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 10m