Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 3.4.0
- Reviewed
Description
If an app submission results in an attempt to auto-create a leaf queue with an empty short name, the submission should be rejected without crashing the RM. Currently, the queue is created, but the RM then hits a FATAL exception caused by a metrics source name collision.
For example, if an app is placed in 'root.', the RM fails with the following exception:
2023-09-12 20:23:43,294 FATAL org.apache.hadoop.yarn.event.EventDispatcher: Error in handling event type APP_ADDED to the Event Dispatcher
org.apache.hadoop.metrics2.MetricsException: Metrics source QueueMetrics,q0=root already exists!
    at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:152)
    at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:125)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueMetrics.forQueue(CSQueueMetrics.java:309)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.<init>(AbstractCSQueue.java:147)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractLeafQueue.<init>(AbstractLeafQueue.java:148)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.<init>(LeafQueue.java:42)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.createNewQueue(ParentQueue.java:495)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.addDynamicChildQueue(ParentQueue.java:563)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.addDynamicLeafQueue(ParentQueue.java:517)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.createAutoQueue(CapacitySchedulerQueueManager.java:678)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.createQueue(CapacitySchedulerQueueManager.java:511)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getOrCreateQueueFromPlacementContext(CapacityScheduler.java:898)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.addApplication(CapacityScheduler.java:962)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1920)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:170)
    at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
    at java.base/java.lang.Thread.run(Thread.java:834)
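To illustrate the kind of check that would let the submission be rejected before any queue is created, here is a minimal sketch. It is not the actual patch: the class QueuePathCheck and the method hasEmptyPart are hypothetical names, and the only assumption about YARN is that queue paths are dot-separated, as in 'root.'.

```java
// Hypothetical helper, not part of the Hadoop code base.
final class QueuePathCheck {
    private QueuePathCheck() {}

    /**
     * Returns true if the path is null or empty, or if any dot-separated
     * part of it is empty, e.g. "root." or "root..a".
     */
    static boolean hasEmptyPart(String queuePath) {
        if (queuePath == null || queuePath.isEmpty()) {
            return true;
        }
        // A limit of -1 keeps trailing empty strings, so "root."
        // splits into ["root", ""] instead of just ["root"].
        for (String part : queuePath.split("\\.", -1)) {
            if (part.isEmpty()) {
                return true;
            }
        }
        return false;
    }
}
```

With such a check, a caller could reject the application when hasEmptyPart returns true for the placement path, instead of proceeding to auto-create a leaf queue whose short name is empty.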
Issue Links
- is related to: YARN-10635 CSMapping rule can return paths with empty parts (Resolved)
- links to