Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-35299

FlinkKinesisConsumer does not respect StreamInitialPosition for new Kinesis Stream when restoring from snapshot

    XMLWordPrintableJSON

Details

    Description

      What

      The FlinkKinesisConsumer allows users to read from multiple Kinesis Streams.

      Users can also specify a STREAM_INITIAL_POSITION, which configures if the consumer starts reading the stream from TRIM_HORIZON / LATEST / AT_TIMESTAMP.

      When restoring the Kinesis Consumer from an existing snapshot, users can configure the consumer to read from additional Kinesis Streams. The expected behavior would be for the FlinkKinesisConsumer to start reading from the additional Kinesis Streams respecting the STREAM_INITIAL_POSITION configuration. However, we find that it currently reads from TRIM_HORIZON.

      This is surprising behavior and should be corrected.

      Why

      Principle of Least Astonishment

      How

      We recommend that we reconstruct the previously seen streams by iterating through the sequenceNumsStateForCheckpoint in FlinkKinesisConsumer#initializeState().

      Risks

      This might increase the state restore time. We can consider adding a feature flag for users to turn this check off.

      Attachments

        Issue Links

          Activity

            People

              antssilva96 Antonio Silva
              hong Hong Liang Teoh
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated: