Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Later
Description
Currently, whenever a metric is registered or unregistered, the calling thread iterates over all reporters and executes their registration logic, which may involve slow or even blocking operations depending on the reporter implementation. For scheduled reporters this also introduces concurrency, since registration can occur while a report is being created, potentially causing exceptions such as the one seen in FLINK-10035.
I propose making the registration of metrics asynchronous: when a metric is registered, it is simply put into a queue that the metrics thread (which also does the reporting) pulls from.
This further isolates jobmanager/taskmanager/task threads from user code, should speed up deployment and shutdown of tasks, and makes interactions with reporters single-threaded.
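A minimal sketch of the proposed queue-based approach, to make the idea concrete. It is not Flink's actual registry code: `AsyncMetricRegistrySketch`, `SketchReporter`, and all method names are hypothetical and chosen for illustration only. The sketch assumes that a single metrics thread both drains the queue and triggers reports.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class AsyncMetricRegistrySketch {

    /** Hypothetical stand-in for a reporter; NOT Flink's reporter interface. */
    interface SketchReporter {
        void onAdded(String name, Object metric);
        void onRemoved(String name);
        void report();
    }

    /** A pending (un)registration, recorded by the caller and applied later. */
    private static final class Event {
        final boolean added;
        final String name;
        final Object metric;

        Event(boolean added, String name, Object metric) {
            this.added = added;
            this.name = name;
            this.metric = metric;
        }
    }

    private final Queue<Event> pending = new ConcurrentLinkedQueue<>();
    private final List<SketchReporter> reporters;

    AsyncMetricRegistrySketch(List<SketchReporter> reporters) {
        this.reporters = reporters;
    }

    /** Called from task threads: only enqueues, never touches a reporter. */
    void register(String name, Object metric) {
        pending.add(new Event(true, name, metric));
    }

    /** Called from task threads: only enqueues, never touches a reporter. */
    void unregister(String name) {
        pending.add(new Event(false, name, null));
    }

    /** Intended to run on the metrics thread only: apply queued events, then report. */
    void drainAndReport() {
        Event e;
        while ((e = pending.poll()) != null) {
            for (SketchReporter r : reporters) {
                if (e.added) {
                    r.onAdded(e.name, e.metric);
                } else {
                    r.onRemoved(e.name);
                }
            }
        }
        for (SketchReporter r : reporters) {
            r.report();
        }
    }

    /** Starts a single metrics thread that drains and reports at a fixed delay. */
    ScheduledExecutorService start(long periodMillis) {
        ScheduledExecutorService metricsThread = Executors.newSingleThreadScheduledExecutor();
        metricsThread.scheduleWithFixedDelay(
                this::drainAndReport, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return metricsThread;
    }
}
```

Because reporters are only ever invoked from `drainAndReport()`, a reporter in this sketch never sees a registration while it is building a report. The trade-off is that a metric becomes visible to reporters only at the next drain, not at the moment it is registered.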
Issue Links
- relates to FLINK-10035 ConcurrentModificationException with flink-metrics-slf4j (Closed)