Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
1.2.0
None
Description
MetricsSystem construction attempts to namespace metrics from each executor using that executor's ID.
The ID is currently set at Executor construction time (not coincidentally, just before the ExecutorSource is registered), but by then the MetricsSystem has already been initialized: that happens during SparkEnv construction, which itself happens during ExecutorBackend construction, before Executor construction.
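To make the ordering concrete, here is a minimal toy model in Python, not Spark's actual code. The classes only borrow the names used above; the `executor.id` key, the fallback to "<driver>" when no ID is present, and the metric names are assumptions made for illustration.

```python
class MetricsSystem:
    """Toy stand-in: fixes its namespace prefix when constructed."""

    def __init__(self, conf):
        # Read once, at construction. Later changes to conf are never seen.
        # The "<driver>" fallback is an assumption of this sketch.
        self.prefix = conf.get("executor.id", "<driver>")

    def metric_name(self, name):
        return f"{self.prefix}.{name}"


class ExecutorSource:
    """Toy stand-in: created after the ID is set, so it sees the real ID."""

    def __init__(self, executor_id):
        self.executor_id = executor_id

    def metric_name(self, name):
        return f"{self.executor_id}.{name}"


def start_executor(executor_id):
    conf = {}
    # Step 1: backend/env construction initializes the metrics system.
    metrics = MetricsSystem(conf)
    # Step 2: executor construction sets the ID (too late for step 1)...
    conf["executor.id"] = executor_id
    # ...and then registers the executor source, which does see the ID.
    source = ExecutorSource(conf["executor.id"])
    return metrics, source


metrics, source = start_executor("7")
print(metrics.metric_name("jvm.heap.used"))        # <driver>.jvm.heap.used
print(source.metric_name("threadpool.activeTasks"))  # 7.threadpool.activeTasks
```

In this model, moving the `conf["executor.id"] = ...` assignment ahead of the `MetricsSystem(conf)` call is enough to make both names use the executor's ID.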
I noticed this problem because I wasn't seeing any JVM metrics from my executors in a Graphite dashboard I've set up. It turns out that all the executors (and the driver) were namespacing their metrics under "<driver>", and Graphite responds to such a situation by keeping only the last value it receives for each "metric" within a configurable time window (e.g. 10s). I was still seeing per-executor metrics from ExecutorSource, properly namespaced with each executor's ID, because, as mentioned above, ExecutorSource is registered after the executor ID is set.
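The effect of the shared namespace can be illustrated with a toy model of the "last value per metric per window" behavior described above. This is only a sketch of that behavior, not Graphite's implementation; the `store` function, the metric paths, and the sample values are all made up for illustration.

```python
def store(samples, window_seconds=10):
    """Keep one value per (metric path, time window); later samples overwrite earlier ones."""
    stored = {}
    for timestamp, path, value in samples:
        # Align each timestamp to the start of its window.
        bucket = timestamp - timestamp % window_seconds
        stored[(path, bucket)] = value
    return stored


# Two executors and the driver all report under "<driver>" in the same window.
collided = store([
    (100, "<driver>.jvm.heap.used", 512),
    (103, "<driver>.jvm.heap.used", 734),
    (107, "<driver>.jvm.heap.used", 298),
])
print(len(collided))  # 1: only the last sample survives

# The same samples, namespaced per reporter, are all kept.
namespaced = store([
    (100, "1.jvm.heap.used", 512),
    (103, "2.jvm.heap.used", 734),
    (107, "<driver>.jvm.heap.used", 298),
])
print(len(namespaced))  # 3
```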
I have a one-line fix for this that I will submit shortly.