Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
Some container launch failures are user specific. For example, when LinuxContainerExecutor is enabled but the user does not exist on some node, container launch on that node fails with the following output:
2016-02-14 15:37:03,111 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1434045496283_0036_000002 State change from LAUNCHED to FAILED
2016-02-14 15:37:03,111 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1434045496283_0036 failed 2 times due to AM Container for appattempt_1434045496283_0036_000002 exited with exitCode: -1000 due to: Application application_1434045496283_0036 initialization failed (exitCode=255) with output: User jdu not found
Such a node is clearly unsuitable for launching containers for any of this user's other applications. Blacklisting should therefore be tracked per user, not only per application as it is now.
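To illustrate the idea, here is a minimal sketch of per-user blacklist tracking. It is not YARN code: the class PerUserNodeBlacklist, its method names, and the failure threshold are all hypothetical, and the sketch assumes the caller has already decided that a launch failure is user specific (for example, by matching "User ... not found" in the output).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical per-user node blacklist tracker; not part of YARN. */
final class PerUserNodeBlacklist {
    // Number of user-specific failures on a node before it is blacklisted (assumed policy).
    private final int failureThreshold;
    // user -> (node -> count of user-specific launch failures)
    private final Map<String, Map<String, Integer>> failures = new HashMap<>();

    PerUserNodeBlacklist(int failureThreshold) {
        if (failureThreshold < 1) {
            throw new IllegalArgumentException("failureThreshold must be >= 1");
        }
        this.failureThreshold = failureThreshold;
    }

    /** Records one user-specific launch failure for the given user on the given node. */
    synchronized void recordUserSpecificFailure(String user, String node) {
        failures.computeIfAbsent(user, u -> new HashMap<>()).merge(node, 1, Integer::sum);
    }

    /** Returns true if the node has reached the failure threshold for this user. */
    synchronized boolean isBlacklisted(String user, String node) {
        return failures
            .getOrDefault(user, Collections.<String, Integer>emptyMap())
            .getOrDefault(node, 0) >= failureThreshold;
    }

    /** Returns a snapshot of all nodes currently blacklisted for this user. */
    synchronized Set<String> blacklistedNodes(String user) {
        Set<String> result = new HashSet<>();
        Map<String, Integer> perNode =
            failures.getOrDefault(user, Collections.<String, Integer>emptyMap());
        for (Map.Entry<String, Integer> e : perNode.entrySet()) {
            if (e.getValue() >= failureThreshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

Because the map is keyed by user rather than by application, a failure recorded while launching one application for jdu also keeps jdu's later applications off that node, while other users can still be scheduled there.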
Issue Links
- is duplicated by: YARN-4790 Per user blacklist node for user specific error for container launch failure. (Open)