Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Until now, "processorId" has been used synonymously with the logical "containerId" assigned by Samza.
It is easy for Samza to generate a unique set of containerIds per job because the number of containers is expected to stay fixed throughout the job's lifecycle. However, with the new ZooKeeper-based model, the number of processors may change while the job is executing. In other words, we want to make a Samza job "elastic".
The proposal in SAMZA-1084 expects the user to assign a unique processorId to each StreamProcessor associated with the job. This is burdensome for the user: the processors will be distributed across one or more machines, and the user must coordinate among these machines to guarantee that each processorId is unique within the job.
The goal of this JIRA is to understand and define the semantics of processorId and to investigate a solution that does not impose this requirement on the user.
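As an illustration only, and not necessarily the approach adopted for this JIRA, one way to avoid cross-machine coordination is to let each processor derive its own identifier locally from the host name plus a random UUID. The class and method names below (ProcessorIdSketch, generateProcessorId) are hypothetical and are not part of the Samza API.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.UUID;

// Hypothetical sketch: not a Samza class. Shows how a processor could pick
// its own id without the user coordinating ids across machines.
final class ProcessorIdSketch {
    private ProcessorIdSketch() {}

    // Best-effort host name lookup; falls back to a placeholder on failure.
    static String hostName() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "unknown-host";
        }
    }

    // The host name makes the id readable in logs; the random UUID makes a
    // collision between two processors (even on the same host) very unlikely.
    static String generateProcessorId() {
        return hostName() + "-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        System.out.println(generateProcessorId());
    }
}
```

An id generated this way changes every time a processor restarts, so it would not suit a design that needs a processor to keep a stable identity across restarts. That trade-off is part of the semantics this JIRA sets out to define.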