Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Fix Version: 2.3.0
- Labels: None
Description
Consider the following pyspark script:
sc = SparkContext()
# do stuff
sc.stop()
# do some other stuff
sc = SparkContext()
That code didn't work at all in 2.2 (creating the second context failed), but makes more progress in 2.3. However, it fails to create new Hive delegation tokens; you see this error in the output:
17/10/16 16:26:50 INFO security.HadoopFSDelegationTokenProvider: getting token for: DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1714191595_19, ugi=blah(auth:KERBEROS)]]
17/10/16 16:26:50 INFO hive.metastore: Trying to connect to metastore with URI blah
17/10/16 16:26:50 INFO hive.metastore: Connected to metastore.
17/10/16 16:26:50 ERROR metadata.Hive: MetaException(message:Delegation Token can be issued only with kerberos authentication. Current AuthenticationMethod: TOKEN)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result$get_delegation_token_resultStandardScheme.read(ThriftHiveMetastore.java)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result$get_delegation_token_resultStandardScheme.read(ThriftHiveMetastore.java)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result.read(ThriftHiveMetastore
The error is printed in the logs but it doesn't cause the app to fail (which might be considered wrong).
The effect is that when that old delegation token expires, the new app will fail.
But the real issue here is that Spark shouldn't be mixing delegation tokens from different apps. It should try harder to isolate a set of delegation tokens to a single app submission.
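As a user-level illustration of that isolation (not the internal fix this issue asks for), a hedged sketch is to give each SparkContext its own process, so the second submission starts from the kerberos TGT instead of inheriting the first app's tokens; the run_app helper below is made up for the example:

from multiprocessing import Process

from pyspark import SparkContext

def run_app():
    # Each process launches its own JVM/gateway, so this submission's
    # credentials start from the kerberos TGT rather than from delegation
    # tokens left behind by a previous app in the same JVM.
    sc = SparkContext()
    # ... do stuff ...
    sc.stop()

if __name__ == "__main__":
    # Run the two "apps" back to back, each in its own process.
    for _ in range(2):
        p = Process(target=run_app)
        p.start()
        p.join()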
And, in the case of Hive, there are many situations where a delegation token isn't needed at all.
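For those cases, a hedged sketch of avoiding the Hive token altogether is to turn off the Hive credential provider for the submission; this assumes the standard provider toggle (spark.security.credentials.hive.enabled in 2.3+, spark.yarn.security.credentials.hive.enabled in earlier releases) applies to the deployment:

from pyspark import SparkConf, SparkContext

# Skip Hive delegation tokens entirely when the app never talks to a
# kerberized metastore. The config name below assumes the 2.3+ provider
# toggle; older releases use spark.yarn.security.credentials.hive.enabled.
conf = SparkConf().set("spark.security.credentials.hive.enabled", "false")
sc = SparkContext(conf=conf)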
Attachments
Issue Links
- blocks
  - SPARK-11035 Launcher: allow apps to be launched in-process (Resolved)
- is related to
  - SPARK-22341 [2.3.0] cannot run Spark on Yarn when Yarn impersonation is turned off (Resolved)
- links to