Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
2.3.0
-
None
-
Reviewed
-
Description
Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, and B trusts COMMON (one way trusts both), and both A and B run HDFS + YARN clusters.
Now if one logs in with a COMMON credential, and runs a job on A's YARN that needs to access B's HDFS (such as a DistCp), the operation fails in the RM, as it attempts a renewDelegationToken(…) synchronously during application submission (to validate the managed token before it adds it to a scheduler for automatic renewal). The call obviously fails cause B realm will not trust A's credentials (here, the RM's principal is the renewer).
In the 1.x JobTracker the same call is present, but it is done asynchronously and once the renewal attempt failed we simply ceased to schedule any further attempts of renewals, rather than fail the job immediately.
We should change the logic such that we attempt the renewal but go easy on the failure and skip the scheduling alone, rather than bubble back an error to the client, failing the app submission. This way the old behaviour is retained.
Attachments
Attachments
Issue Links
- is related to
-
HDFS-9525 hadoop utilities need to support provided delegation tokens
- Resolved
-
SPARK-34295 Allow option similar to mapreduce.job.hdfs-servers.token-renewal.exclude
- Resolved
- relates to
-
YARN-2836 RM behaviour on token renewal failures is broken
- Open