Details
Bug
Status: Resolved
Major
Resolution: Fixed
2.7.1
None
Reviewed
Description
There appears to be a bug in the FSAppAttempt#getAllowedLocalityLevelByTime method:
// default level is NODE_LOCAL
if (! allowedLocalityLevel.containsKey(priority)) {
  allowedLocalityLevel.put(priority, NodeType.NODE_LOCAL);
  return NodeType.NODE_LOCAL;
}
The first time this method is invoked for a given priority, it returns without initializing a time in lastScheduledContainer, so the next invocation executes the following code:
// check waiting time
long waitTime = currentTimeMs;
if (lastScheduledContainer.containsKey(priority)) {
  waitTime -= lastScheduledContainer.get(priority);
} else {
  waitTime -= getStartTime();
}
Because lastScheduledContainer has no entry for the priority, waitTime is computed by subtracting the FsApp startTime from currentTimeMs. The FsApp startTime is earlier than currentTimeMs, so waitTime can easily exceed the delay time, and the allowed locality level degrades. We should therefore record an initial time for the priority on the first invocation, so that waitTime is not compared against the FsApp startTime and allowedLocalityLevel does not degrade prematurely. This problem has a larger negative impact on small jobs. YARN-4399 also discusses some related locality problems.
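To make the effect concrete, here is a minimal, self-contained sketch of the behavior described above. It is a simplified model, not the actual FSAppAttempt code or the committed patch: the class name LocalityDelaySketch, the recordInitialTime flag, the integer priorities, and the degradation step after the wait-time check are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the locality-delay logic (not Hadoop code).
class LocalityDelaySketch {
    enum NodeType { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

    private final Map<Integer, NodeType> allowedLocalityLevel = new HashMap<>();
    private final Map<Integer, Long> lastScheduledContainer = new HashMap<>();
    private final long startTime;            // stands in for getStartTime()
    private final boolean recordInitialTime; // true = apply the proposed fix

    LocalityDelaySketch(long startTime, boolean recordInitialTime) {
        this.startTime = startTime;
        this.recordInitialTime = recordInitialTime;
    }

    NodeType getAllowedLocalityLevelByTime(int priority, long nodeDelayMs,
                                           long rackDelayMs, long currentTimeMs) {
        // default level is NODE_LOCAL
        if (!allowedLocalityLevel.containsKey(priority)) {
            if (recordInitialTime) {
                // proposed fix: remember when this priority was first seen
                lastScheduledContainer.put(priority, currentTimeMs);
            }
            allowedLocalityLevel.put(priority, NodeType.NODE_LOCAL);
            return NodeType.NODE_LOCAL;
        }

        NodeType allowed = allowedLocalityLevel.get(priority);
        if (allowed == NodeType.OFF_SWITCH) {
            return allowed; // cannot degrade any further
        }

        // check waiting time
        long waitTime = currentTimeMs;
        if (lastScheduledContainer.containsKey(priority)) {
            waitTime -= lastScheduledContainer.get(priority);
        } else {
            waitTime -= startTime; // falls back to the app start time
        }

        // assumed degradation step: relax one level once the delay is exceeded
        long threshold = (allowed == NodeType.NODE_LOCAL) ? nodeDelayMs : rackDelayMs;
        if (waitTime > threshold) {
            allowed = (allowed == NodeType.NODE_LOCAL)
                    ? NodeType.RACK_LOCAL : NodeType.OFF_SWITCH;
            allowedLocalityLevel.put(priority, allowed);
            lastScheduledContainer.put(priority, currentTimeMs);
        }
        return allowed;
    }

    public static void main(String[] args) {
        // App started at t=0; the first scheduling attempt happens at t=10000 ms,
        // the second 100 ms later. Both locality delays are 5000 ms.
        LocalityDelaySketch buggy = new LocalityDelaySketch(0L, false);
        buggy.getAllowedLocalityLevelByTime(1, 5000L, 5000L, 10000L);
        System.out.println("without fix: "
                + buggy.getAllowedLocalityLevelByTime(1, 5000L, 5000L, 10100L));

        LocalityDelaySketch fixed = new LocalityDelaySketch(0L, true);
        fixed.getAllowedLocalityLevelByTime(1, 5000L, 5000L, 10000L);
        System.out.println("with fix: "
                + fixed.getAllowedLocalityLevelByTime(1, 5000L, 5000L, 10100L));
    }
}
```

In this sketch, the second call without the fix measures waitTime against the app start time (10100 ms, above the 5000 ms delay) and degrades to RACK_LOCAL after only 100 ms of actual waiting. With the initial time recorded, waitTime is 100 ms and the level stays NODE_LOCAL.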