Details
Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Versions: 0.6.1-incubating, 0.7.0-incubating
Description
InputFormats that cannot be processed in parallel implement the NonParallelInput interface.
During optimization, the optimizer checks for this interface and sets the degree of parallelism (DOP) of an operator to 1 if it is found. Other operators, such as Mappers, set their DOP during program construction to the DOP of their preceding task (unless specified otherwise). Because the optimizer considers non-splittable data sources only later, a Map operator will not have the same DOP as a preceding non-splittable data source.
The simple solution is to set the DOP of a non-splittable data source to 1 during program construction, so that subsequent operators inherit the correct value.
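The ordering problem can be illustrated with a small toy model. This is only a sketch, not the project's actual code: apart from the interface name NonParallelInput, every class, method, and parameter below (DopSketch, Operator, SingleFileFormat, source, map, optimize, fixAtConstruction) is hypothetical and exists only to show why the DOP must be fixed at construction time.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of the issue. Only the name NonParallelInput comes from the
// issue text; all other names are hypothetical.
final class DopSketch {
    /** Marker interface: the input format cannot be read in parallel. */
    interface NonParallelInput {}

    /** Hypothetical input formats. */
    static class SplittableFormat {}
    static class SingleFileFormat extends SplittableFormat implements NonParallelInput {}

    /** Hypothetical plan node that only tracks its degree of parallelism. */
    static final class Operator {
        final String name;
        final Object format; // null for operators that are not data sources
        int dop;

        Operator(String name, Object format, int dop) {
            this.name = name;
            this.format = format;
            this.dop = dop;
        }
    }

    /** Program construction: with the fix, a non-parallel source gets DOP 1 right away. */
    static Operator source(Object format, int defaultDop, boolean fixAtConstruction) {
        int dop = (fixAtConstruction && format instanceof NonParallelInput) ? 1 : defaultDop;
        return new Operator("source", format, dop);
    }

    /** Program construction: a map copies the DOP its input has at this moment. */
    static Operator map(Operator input) {
        return new Operator("map", null, input.dop);
    }

    /** Later optimizer pass: forces DOP 1 on non-parallel sources only. */
    static void optimize(List<Operator> plan) {
        for (Operator op : plan) {
            if (op.format instanceof NonParallelInput) {
                op.dop = 1;
            }
        }
    }

    public static void main(String[] args) {
        for (boolean fix : new boolean[] {false, true}) {
            Operator src = source(new SingleFileFormat(), 4, fix);
            Operator mapper = map(src);
            optimize(Arrays.asList(src, mapper));
            System.out.println("fix=" + fix + " source DOP=" + src.dop + " map DOP=" + mapper.dop);
        }
    }
}
```

Without the fix, the map keeps the default DOP of 4 while the optimizer lowers the source to 1. With the fix, both end up at 1 because the map inherits the already corrected value.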