Apache Gobblin / GOBBLIN-1740

Split Workunit for Kafka Streaming jobs


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • None
    • None
    • Component/s: gobblin-kafka
    • None

    Description

      When adding a task to a running Helix streaming job, we need to be able to split an existing workunit.

      We can reuse most of the attributes of the old workunit and only need to update the following properties:

      task.id

      writer.output.dir

      partition.id

      gobblin.kafka.streaming.numPartitions

      Because the partition watermark is read from the state store, we don't need to recalculate it within the workunit. All Kafka-related properties can also be reused.
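
      A minimal sketch of the idea, not Gobblin's actual implementation: it models a workunit as a plain java.util.Properties object, copies every property from the old workunit, and overrides only the four keys listed above. The class name WorkUnitSplitSketch, the method deriveWorkUnit, and the value formats passed in are all hypothetical; how the new task id, output directory, and partition id are chosen is left to the caller.

      ```java
      import java.util.Properties;

      // Hypothetical helper; a workunit is modeled as plain Properties, not a Gobblin class.
      final class WorkUnitSplitSketch {
          // Property keys taken from the issue description.
          static final String TASK_ID = "task.id";
          static final String WRITER_OUTPUT_DIR = "writer.output.dir";
          static final String PARTITION_ID = "partition.id";
          static final String NUM_PARTITIONS = "gobblin.kafka.streaming.numPartitions";

          private WorkUnitSplitSketch() {}

          // Returns a new workunit that reuses every property of the original
          // except the four that must differ per task. The original is not modified.
          static Properties deriveWorkUnit(Properties original,
                                           String taskId,
                                           String writerOutputDir,
                                           String partitionId,
                                           int numPartitions) {
              Properties derived = new Properties();
              derived.putAll(original); // Kafka settings and everything else carry over
              derived.setProperty(TASK_ID, taskId);
              derived.setProperty(WRITER_OUTPUT_DIR, writerOutputDir);
              derived.setProperty(PARTITION_ID, partitionId);
              derived.setProperty(NUM_PARTITIONS, Integer.toString(numPartitions));
              return derived;
          }
      }
      ```

      Splitting one workunit into several would then amount to calling deriveWorkUnit once per new task with distinct values. No watermark is set here, on the assumption stated above that each task reads it from the state store.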

          People

            shirshanka Shirshanka Das
            hanghangliu Hanghang Liu
