Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Not A Problem
Affects Version/s: 1.2.1
Fix Version/s: None
Environment: spark on yarn cluster.
Description
I submitted a Spark application to a YARN cluster from a Spark client machine and ran into this problem. In yarn-client mode the driver runs on the client, so the YARN cluster needs to connect back to the driver remotely. However, Spark uses InetAddress to read the hostname of the client machine without checking whether that hostname is legal or usable. In my case, some client machines have hostnames like "sjs_1_2", and the application fails because the cluster cannot connect to the driver at "sjs_1_2".
I suggest that Spark check whether the hostname is legal and, if it is not, use the IP address instead.
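To make the suggestion concrete, here is a minimal sketch of such a check, written as shell functions rather than as a patch to Spark. It assumes "legal" means the usual hostname syntax (dot-separated labels made of letters, digits and hyphens, not starting or ending with a hyphen). The function names `is_legal_hostname` and `pick_advertised_host` are hypothetical and do not exist in Spark.

```shell
# Hypothetical helper, not part of Spark.
# Succeeds when the name is at most 253 characters long and every
# dot-separated label uses only letters, digits and hyphens, with no
# leading or trailing hyphen. Assumes the name is a single line.
is_legal_hostname() {
  _label='[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?'
  [ "${#1}" -le 253 ] &&
    printf '%s\n' "$1" | grep -Eq "^${_label}(\\.${_label})*\$"
}

# Hypothetical helper, not part of Spark.
# Prints the hostname when it is legal, otherwise the fallback IP address.
pick_advertised_host() {
  if is_legal_hostname "$1"; then
    printf '%s\n' "$1"
  else
    printf '%s\n' "$2"
  fi
}
```

With this sketch, a name such as "sjs_1_2" is rejected because of the underscores, so `pick_advertised_host "sjs_1_2" "$client_ip"` would print the IP address passed as the second argument (`client_ip` is a placeholder variable here).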
As a workaround, I found that the environment variable "SPARK_LOCAL_HOSTNAME" can be used: if I set "SPARK_LOCAL_HOSTNAME" to the IP address in spark-env.sh, the problem is solved. However, this setting does not seem to be covered in any documentation or reference; I found it while reading the code.
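For reference, the workaround amounts to one line in spark-env.sh on the client machine. The address below is a placeholder from the documentation range, not a value from my setup; replace it with an IP address of the client that the YARN nodes can reach.

```shell
# conf/spark-env.sh on the client machine that runs the driver.
# 192.0.2.10 is a placeholder; use the client's own routable IP address.
export SPARK_LOCAL_HOSTNAME=192.0.2.10
```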
But I still think a check for illegal hostnames is needed.