Details
- Improvement
- Status: Closed
- Major
- Resolution: Fixed
- 1.5.0
- None
- None
Description
With Flink 1.5.0, my Apache Beam job was not runnable unless I turned off the latencyTracking feature. The job generated a huge number of latency metrics and histogram aggregates, and updating them kept the JobManager so busy that the cluster fell apart.
This was discussed on the mailing list:
The purpose of this ticket is to reason about how to improve this, and on which end. I am currently not sure what the root cause is:
a) The Beam-to-Flink translation generates too many "noise operators"
b) Flink does not handle latencyTracking well for large jobs
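To make the two hypotheses easier to compare, here is a rough back-of-the-envelope sketch of how the number of latency histograms could grow. It assumes that every operator subtask keeps one latency histogram per source subtask; that is my reading of the behaviour, not something confirmed in this ticket. The function name and the job sizes are hypothetical.

```python
def latency_histogram_count(source_parallelisms, operator_parallelisms):
    """Estimate the number of latency histograms in a job.

    Assumption (not confirmed by this ticket): each operator subtask
    tracks a separate histogram for every source subtask.
    """
    source_subtasks = sum(source_parallelisms)
    operator_subtasks = sum(operator_parallelisms)
    return source_subtasks * operator_subtasks


# Hypothetical job: 2 sources and 50 operators, all at parallelism 100.
small_graph = latency_histogram_count([100] * 2, [100] * 50)

# Same job, but the translation layer adds 150 extra pass-through operators.
noisy_graph = latency_histogram_count([100] * 2, [100] * 200)

print(small_graph, noisy_graph)
```

Under this assumption the count grows with the product of source subtasks and operator subtasks, so extra operators from (a) and the tracking scheme in (b) would amplify each other rather than being independent causes.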
Issue Links
- relates to
  - FLINK-10484 New latency tracking metrics format causes metrics cardinality explosion (Closed)
  - FLINK-10246 Harden and separate MetricQueryService (Closed)